Conformer generation and molecular minimisation¶
ccdc.conformer module is available only to CSD-Discovery, CSD-Materials and
ccdc.conformer module can be used to generate a diverse
ensemble of conformers that are optimised to reflect the geometrical
preferences of molecules observed in the CSD.
The methodology requires an input molecule where all atoms have 3D coordinates. The following steps are then performed:
The bond distances and angles are optimised according to data from the CSD
Conformational space is sampled using rotamer distributions and ring templates derived from the CSD and the generated conformers are scored
A diverse subset is selected
ccdc.conformer.ConformerGenerator is available in the
ccdc.conformer module. Let us import this module.
>>> from ccdc import conformer
Let us also import the
ccdc.io module so that we can read in
a molecule of interest and define the path to a molecule file of
interest, in this instance a CDK2 inhibitor.
>>> from ccdc import io >>> filepath = '2vtp-lig.mol2'
In this instance we will get the first molecule in the file.
>>> mol_reader = io.MoleculeReader(filepath) >>> mol = mol_reader
For more information on the conformer generation algorithm please see: “Knowledge-Based Conformer Generation Using the Cambridge Structural Database”, J. C. Cole, O. Korb, P. McCabe. M. G. Read and R. Taylor, J. Chem. Inf. Model., 58, 615-629, 2018, DOI: 10.1021/acs.jcim.7b00697.
Setting Up the Calculation¶
To generate conformers we create a
>>> conformer_generator = conformer.ConformerGenerator()
There are optional parameters for finer control of the
ccdc.conformer.ConformerGenerator; please see the API
documentation for further detail.
ccdc.conformer.ConformerGenerator has a number of settings that can be customised, for example the maximum number
of conformers to generate, that is a representative set of high probability and at the same time diverse conformers.
ccdc.conformer.ConformerSettings.max_conformers setting specifies the maximum number of conformers to generate. If one specifies a large number and the conformational
space is exhausted before this number is reached a smaller number of conformers will be generated.
By default the conformer generator will attempt to generate 200 conformers, but this number can be changed by the user. Let us change it to 25.
>>> conformer_generator.settings.max_conformers 200 >>> conformer_generator.settings.max_conformers = 25
Another setting that can be customised is the
ccdc.conformer.ConformerSettings.max_unusual_torsions, which represents the number of unusual rotamers allowed per conformer. By default this number is set to 2.
Finally, it is possible to superimpose the resulting set of conformers onto a common refernce frame, that is all atoms of the input molecule. To enable this feature:
>>> conformer_generator.settings.superimpose_conformers_onto_reference = True
To generate a set of conformers for a molecule, call the
ccdc.conformer.ConformerGenerator.generate() function. It
is important that all atoms in the molecules, including hydrogen atoms, have coordinates. Molecules with missing
coordinates are ignored.
>>> conformers = conformer_generator.generate(mol) >>> len(conformers) 25
function can also take a list of molecules which may give some
performance improvement when working on large numbers of
It is possible to overlay the list of conformers onto a specific set of atoms using the
ccdc.descriptors.MolecularDescriptors.Overlay(). For example, we can superimpose the 25 conformers onto the pyrazole substructure.
>>> from ccdc.search import SubstructureSearch, SMARTSSubstructure >>> from ccdc import descriptors as d >>> query = 'C1=CNN=C1' >>> conformers_mols = [c.molecule for c in conformers] >>> ss_search = SubstructureSearch() >>> substructure = SMARTSSubstructure(query) >>> sub_id = ss_search.add_substructure(substructure) >>> hits = ss_search.search(conformers_mols, max_hits_per_structure=1) >>> ref_ats = hits.match_atoms()
Let’s write the superimposed conformers to a .MOL2 file.
>>> with io.MoleculeWriter('superimposed_conformers.mol2') as writer: ... for hit in hits: ... hit_ats = hit.match_atoms() ... atoms = zip(ref_ats, hit_ats) ... ov = d.MolecularDescriptors.Overlay(hits.molecule, hit.molecule, atoms) ... superimposed_hit = ov.molecule ... writer.write(superimposed_hit)
Analysing the Conformer Hit List¶
ccdc.conformer.ConformerGenerator.generate() function returns a
ccdc.conformer.ConformerHitList object, which behaves like a
Python list. The
ccdc.conformer.ConformerHitList contains attributes relating to the overall performance of the run. For example,
whether or not the sampling limit was reached and how many rotamers had no observations in the CSD.
>>> conformers.sampling_limit_reached False >>> conformers.n_rotamers_with_no_observations 0
ccdc.conformer.ConformerHit contains a
ccdc.molecule.Molecule as well as the
ccdc.conformer.ConformerHit.normalised_score - a value between 0.0 (most probable) and 1.0 (least probable).
Conformers are listed in probability order (the most probable first).
>>> most_probable_conformer = conformers >>> most_probable_conformer.molecule <ccdc.molecule.Molecule ...> >>> '%.2f' % most_probable_conformer.normalised_score '0.00'
It is possible to calculate the rmsd of each conformer in the hit list with respect to the input molecule (this is the default behaviour) or with respect to the minimised molecule.
>>> print(round(most_probable_conformer.rmsd(), 3)) 0.723 >>> print(round(most_probable_conformer.rmsd(wrt='minimised'), 3)) 0.690
Locking Rotatable Bonds¶
Particular bonds can be locked so that they are not rotated during conformer generation. A locked bond in a ring will lock the whole ring. This can make the conformer generator create more diversity in other areas of the molecule.
>>> conformer_generator.lock_torsion(mol.bond('C13', 'N14'))
ccdc.conformer module also contains functionality for optimising a molecule to the geometrical preferences of bond distances and valence angles observed in the CSD.
By default this is the first step in conformer generation.
The geometrical optimisation makes use of the Tripos force field functional forms and, where available, equilibrium bond distances and valence angles are parametrised using data obtained from the CSD.
>>> molecule_minimiser = conformer.MoleculeMinimiser() >>> minimised_mol = molecule_minimiser.minimise(mol)
To compare the two molecules we can make use of the
>>> from ccdc.descriptors import MolecularDescriptors >>> round( MolecularDescriptors.rmsd(mol, minimised_mol), 3) 0.434