The (Astex) Fragment network

This feature is implemented in the Fragalysis back-end code. The algorithm is an open-source implementation of work done by Astex.


audience: novice scientist developer


Compound enumeration is a method in drug discovery in which you search a database of molecules that can be purchased or synthesised, using a set of criteria or hypotheses. One of the main features of Fragalysis is the ability to take hit molecules found by experiment and look for new compounds that elaborate or change a hit in some way.

In the first screenshot below, we show the 3D structure of a molecule that we know binds to our target protein. This is awesome! What we really want to do is improve how well that molecule binds, by elaborating it so that it forms more interactions with the protein.

So how do we find compounds that match those criteria? With the fragment network. The fragment network takes a big set of compounds and splits them up into parts: rings, linkers and substituents. These parts form the nodes of a graph network.

The edges between these nodes describe how the bits of molecules can be linked together to make new molecules.

From this information, we know how we can change a molecule: we search the network for new bits to add to an initial hit, following the transformations described along the edges of the graph network.
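To make the idea of nodes, edges and searching more concrete, here is a minimal sketch of a toy network in Python. It is an illustration only, not the Fragalysis implementation: the class name FragmentNetwork, its methods, and the three example molecules are all hypothetical, and the real network is built from much larger vendor databases.

```python
from collections import defaultdict


class FragmentNetwork:
    """Toy directed graph (hypothetical, for illustration only).

    Nodes are SMILES strings. Each edge records which part is added
    to turn the smaller molecule into the larger one.
    """

    def __init__(self):
        # node -> list of (neighbour, part_added) pairs
        self._edges = defaultdict(list)

    def add_edge(self, smaller, larger, part_added):
        """Record that `larger` is `smaller` plus `part_added`."""
        self._edges[smaller].append((larger, part_added))

    def elaborations(self, hit, max_hops=1):
        """Return molecules reachable from `hit` within `max_hops` edges.

        The result maps each molecule to the tuple of parts added
        along the first path found from the hit.
        """
        found = {}
        frontier = [(hit, ())]
        for _ in range(max_hops):
            next_frontier = []
            for node, path in frontier:
                for neighbour, part in self._edges.get(node, []):
                    if neighbour != hit and neighbour not in found:
                        found[neighbour] = path + (part,)
                        next_frontier.append((neighbour, found[neighbour]))
            frontier = next_frontier
        return found


# Assumed toy data: three elaborations starting from benzene.
network = FragmentNetwork()
network.add_edge("c1ccccc1", "Oc1ccccc1", "O")       # benzene -> phenol
network.add_edge("c1ccccc1", "Cc1ccccc1", "C")       # benzene -> toluene
network.add_edge("Oc1ccccc1", "Oc1ccc(C)cc1", "C")   # phenol -> p-cresol

one_hop = network.elaborations("c1ccccc1")
two_hops = network.elaborations("c1ccccc1", max_hops=2)
```

Here one_hop holds the molecules that are a single transformation away from the hit, and two_hops also includes molecules that need two transformations. Limiting the number of hops keeps the suggestions close to the original hit.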

In Fragalysis, we use vectors to show what changes or elaborations we can make to a molecule. In the screenshot below, we show the same molecule as before, but with a representation of how we can change it. The cone-shaped bits show that an elaboration can be made from the atom they are connected to, in the direction they are pointing. The cylinders show that the molecule can be changed along the bond they lie on. Green colouring means there are more than 10 compounds available; yellow means there are between 1 and 10 compounds available; and red means there are no compounds available, although the graph network knows that you could elaborate the molecule along that vector.
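The colour rule can be written as a small function. This is a sketch under stated assumptions rather than the Fragalysis code: the name vector_colour is hypothetical, and the yellow band is read as 1 to 10 compounds, since zero compounds is shown as red.

```python
def vector_colour(n_compounds):
    """Map the number of available compounds on a vector to a colour.

    Hypothetical helper; thresholds follow the description in the text.
    """
    if n_compounds < 0:
        raise ValueError("n_compounds cannot be negative")
    if n_compounds > 10:
        return "green"   # more than 10 compounds available
    if n_compounds > 0:
        return "yellow"  # between 1 and 10 compounds available
    return "red"         # a known vector, but nothing available


colours = [vector_colour(n) for n in (0, 1, 10, 11)]
```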

Also shown in the above screenshot, on the right-hand side, are molecules that the graph network has found for us along the vector at the hydroxy group, which is highlighted in red in the top right-hand corner. These compounds are available from either Enamine or MolPort; we use their databases to build our graph network.