Conformer generation and molecular minimisation¶
Note
The ccdc.conformer
module is available only to CSD-Discovery, CSD-Materials and
CSD-Enterprise users.
Introduction¶
The ccdc.conformer
module can be used to generate a diverse
ensemble of conformers that are optimised to reflect the geometrical
preferences of molecules observed in the CSD.
The methodology requires an input molecule where all atoms have 3D coordinates. The following steps are then performed:
The bond distances and angles are optimised according to data from the CSD
Conformational space is sampled using rotamer distributions and ring templates derived from the CSD and the generated conformers are scored
A diverse subset is selected
The ccdc.conformer.ConformerGenerator
is available in the
ccdc.conformer
module. Let us import this module.
>>> from ccdc import conformer
Let us also import the ccdc.io
module so that we can read in
a molecule of interest and define the path to a molecule file of
interest, in this instance a CDK2 inhibitor.
>>> from ccdc import io
>>> filepath = '2vtp-lig.mol2'
In this instance we will get the first molecule in the file.
>>> mol_reader = io.MoleculeReader(filepath)
>>> mol = mol_reader[0]
Note
For more information on the conformer generation algorithm please see: “Knowledge-Based Conformer Generation Using the Cambridge Structural Database”, J. C. Cole, O. Korb, P. McCabe. M. G. Read and R. Taylor, J. Chem. Inf. Model., 58, 615-629, 2018, DOI: 10.1021/acs.jcim.7b00697.
Conformer Generation¶
Setting Up the Calculation¶
To generate conformers we create a ccdc.conformer.ConformerGenerator
.
>>> conformer_generator = conformer.ConformerGenerator()
Note
There are optional parameters for finer control of the
ccdc.conformer.ConformerGenerator
; please see the API
documentation for further detail.
The ccdc.conformer.ConformerGenerator
has a number of settings that can be customised, for example the maximum number
of conformers to generate, that is a representative set of high probability and at the same time diverse conformers.
The ccdc.conformer.ConformerSettings.max_conformers
setting specifies the maximum number of conformers to generate. If one specifies a large number and the conformational
space is exhausted before this number is reached a smaller number of conformers will be generated.
By default the conformer generator will attempt to generate 200 conformers, but this number can be changed by the user. Let us change it to 25.
>>> conformer_generator.settings.max_conformers
200
>>> conformer_generator.settings.max_conformers = 25
Another setting that can be customised is the ccdc.conformer.ConformerSettings.max_unusual_torsions
, which represents the number of unusual rotamers allowed per conformer. By default this number is set to 2.
Finally, it is possible to superimpose the resulting set of conformers onto a common refernce frame, that is all atoms of the input molecule. To enable this feature:
>>> conformer_generator.settings.superimpose_conformers_onto_reference = True
To generate a set of conformers for a molecule, call the ccdc.conformer.ConformerGenerator.generate()
function.
>>> conformers = conformer_generator.generate(mol)
>>> len(conformers)
25
Note
The ccdc.conformer.ConformerGenerator.generate()
function can also take a list of molecules which may give some
performance improvement when working on large numbers of
molecules.
It is possible to overlay the list of conformers onto a specific set of atoms using the ccdc.descriptors.MolecularDescriptors.Overlay()
. For example, we can superimpose the 25 conformers onto the pyrazole substructure.
>>> from ccdc.search import SubstructureSearch, SMARTSSubstructure
>>> from ccdc import descriptors as d
>>> query = 'C1=CNN=C1'
>>> conformers_mols = [c.molecule for c in conformers]
>>> ss_search = SubstructureSearch()
>>> substructure = SMARTSSubstructure(query)
>>> sub_id = ss_search.add_substructure(substructure)
>>> hits = ss_search.search(conformers_mols, max_hits_per_structure=1)
>>> ref_ats = hits[0].match_atoms()
Let’s write the superimposed conformers to a Mol2 file.
>>> with io.MoleculeWriter('superimposed_conformers.mol2') as writer:
... for hit in hits:
... hit_ats = hit.match_atoms()
... atoms = zip(ref_ats, hit_ats)
... ov = d.MolecularDescriptors.Overlay(hits[0].molecule, hit.molecule, atoms)
... superimposed_hit = ov.molecule
... writer.write(superimposed_hit)
Analysing the Conformer Hit List¶
The ccdc.conformer.ConformerGenerator.generate()
function returns a ccdc.conformer.ConformerHitList
object, which behaves like a
Python list. The ccdc.conformer.ConformerHitList
contains attributes relating to the overall performance of the run. For example,
whether or not the sampling limit was reached and how many rotamers had no observations in the CSD.
>>> conformers.sampling_limit_reached
False
>>> conformers.n_rotamers_with_no_observations
0
An individual ccdc.conformer.ConformerHit
contains a ccdc.molecule.Molecule
as well as the ccdc.conformer.ConformerHit.normalised_score
- a value between 0.0 (most probable) and 1.0 (least probable).
Conformers are listed in probability order (the most probable first).
>>> most_probable_conformer = conformers[0]
>>> most_probable_conformer.molecule
<ccdc.molecule.Molecule ...>
>>> '%.2f' % most_probable_conformer.normalised_score
'0.00'
It is possible to calculate the rmsd of each conformer in the hit list with respect to the input molecule (this is the default behaviour) or with respect to the minimised molecule.
>>> print(round(most_probable_conformer.rmsd(), 3))
0.723
>>> print(round(most_probable_conformer.rmsd(wrt='minimised'), 3))
0.690
Locking Rotatable Bonds¶
Particular bonds can be locked so that they are not rotated during conformer generation. A locked bond in a ring will lock the whole ring. This can make the conformer generator create more diversity in other areas of the molecule.
>>> conformer_generator.lock_torsion(mol.bond('C13', 'N14'))
Molecular Minimisation¶
The ccdc.conformer
module also contains functionality for optimising a molecule to the geometrical preferences of bond distances and valence angles observed in the CSD.
Note
By default this is the first step in conformer generation.
The geometrical optimisation makes use of the Tripos force field functional forms and, where available, equilibrium bond distances and valence angles are parametrised using data obtained from the CSD.
>>> molecule_minimiser = conformer.MoleculeMinimiser()
>>> minimised_mol = molecule_minimiser.minimise(mol)
To compare the two molecules we can make use of the ccdc.descriptors.MolecularDescriptors.rmsd()
function.
>>> from ccdc.descriptors import MolecularDescriptors
>>> round( MolecularDescriptors.rmsd(mol, minimised_mol), 3)
0.434