Release notes

Overview

The Cambridge Structural Database (CSD) is a highly curated and comprehensive repository of organic and organo-metallic crystal structures and is an essential resource to scientists around the world.

The Cambridge Structural Database System (CSDS) is a powerful and highly flexible suite of software tools and structural knowledge-bases. The CSDS enables exploration and application of the knowledge contained within more than 800,000 crystal structures.

The CSD Python API has been developed to make the CSD data and CSDS functionality accessible in a programmatic fashion. The aim is to facilitate integration with in-house work-flows and 3rd party applications. In addition, the CSD Python API can be used to perform activities not currently possible through the graphical interfaces. It is a platform for innovation.

Searchable documentation is available at http://www.ccdc.cam.ac.uk/docs/csd_python_api/

An open community forum focused on using the CSD Python API is available at http://www.ccdc.cam.ac.uk/forum/csd_python_api/ and you are encouraged to visit it in the first instance if you require help or inspiration. We are also grateful for feedback on your experiences with the CSD Python API, which may be posted on the forum or sent to support@ccdc.cam.ac.uk.

Citing the CSD Python API

When publishing works that benefited from the CSD Python API, please consider using the following citation:

“The Cambridge Structural Database”
C. R. Groom, I. J. Bruno, M. P. Lightfoot and S. C. Ward, Acta Crystallographica Section B, B72, 171-179, 2016

For further citation advice, refer to your download agreement and visit http://www.ccdc.cam.ac.uk/support/product_references/

Licensed Features

Some features are conditionally available depending on the user’s CSD licence.

API Feature                  All Users   CSD-Materials   CSD-Discovery
IO                           y           y               y
Entry, Crystal*1, Molecule   y           y               y
Search                       y           y               y
Interaction                  y           y               y
Descriptors*2                y           y               y
Diagram                      y           y               y
Conformer*3                  y           y               y
Protein                      y           y               y
Utilities                    y           y               y
Screening                                                y
Docking                                                  y
Cavity                                                   y
Pharmacophore                                            y

*1 The crystal packing similarity API is available to CSD-Materials users.

*2 The powder pattern simulation and comparison API, Morphology API, HBond Coordination API, and HBond Propensity API are available to CSD-Materials users.

*3 The Conformer Generation API is available to CSD-Materials and CSD-Discovery users.

CSD-Enterprise users have access to CSD-Discovery and CSD-Materials feature sets.

Change Log

2.0.0

Backwards incompatible changes

  • ASER format databases are no longer supported. If you have databases that no longer work with the CSD Python API, contact your in-house database manager or support@ccdc.cam.ac.uk for assistance in converting your files to the new format.
  • Creation of SubstructureSearch and SimilaritySearch screens for speeding up searches of large, non-CSD databases is no longer supported. Please convert your databases to csdsql format, in which the screens are built in.
  • ccdc.io.EntryReader when created with a list of identifiers will raise a RuntimeError if any of the identifiers is not present in the underlying database.
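
Code that passes a list of identifiers to ccdc.io.EntryReader may need to handle the RuntimeError described above. The following is a minimal sketch of one way to separate known from missing identifiers before a bulk read. To keep it runnable without a CSD installation, the reader factory is passed in as a parameter. The names split_known_identifiers, open_reader and fake_open_reader are hypothetical, and the stand-in only mimics the behaviour stated in the note.

```python
def split_known_identifiers(open_reader, identifiers):
    """Return (known, missing) by probing each identifier on its own.

    open_reader is any callable that accepts a list of identifiers and
    raises RuntimeError when one of them is absent from the database.
    """
    known, missing = [], []
    for identifier in identifiers:
        try:
            open_reader([identifier])
        except RuntimeError:
            missing.append(identifier)
        else:
            known.append(identifier)
    return known, missing


def fake_open_reader(identifiers, _database=frozenset({"KNOWN01", "KNOWN02"})):
    """Hypothetical stand-in for a reader; not part of the CSD Python API."""
    absent = [i for i in identifiers if i not in _database]
    if absent:
        raise RuntimeError("identifiers not in database: " + ", ".join(absent))
    return list(identifiers)


known, missing = split_known_identifiers(
    fake_open_reader, ["KNOWN01", "ABSENT1", "KNOWN02"]
)
```

With the real API, open_reader would be a callable that constructs a ccdc.io.EntryReader from the identifier list. Probing identifiers one at a time may be slow for long lists, so this is better suited to diagnosing a failure than to routine use.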

Deprecations

  • ccdc.MolecularDescriptors.overlay_rmsd_and_rmsd_tanimoto() has been deprecated and replaced with the method ccdc.MolecularDescriptors.overlay_rmsds_and_transformation(), which also returns the overlay transformation matrix.

Major new features

Minor new features

Bug fixes

  • fix uninstallation of conda package.
  • improve detection of GOLD executable at less standard locations.
  • recognise cofactor atoms as part of the protein.
  • improve compatibility with PyQt module.
  • accept unicode identifiers in GCD lists.
  • allow uppercase filenames with lower-case list of pdb codes (JIRA GOLD-1082) when creating a cavity database with an identifier list filter.

1.5.3

Backwards incompatible changes

Deprecations

None

Major new features

None

Minor new features

  • ccdc.molecule.Molecule.normalise_atom_positions() allows a molecule to reorder its atoms canonically.
  • ccdc.docking.Docker.Settings now allows the specification of a scoring parameter file and a torsion distribution file.
  • API tests that depend on a specific CSD version being present, which may not match the version the user has, may be skipped.
  • Molecule and crystal formulas of disordered structures use occupancies giving fractional element counts
  • ccdc.__build__() provides a unique build identifier.
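
The effect of occupancy-weighted formulas can be illustrated with a small, self-contained sketch. It sums site occupancies per element, so partially occupied sites contribute fractional counts. The function name, the input layout (element symbol and occupancy pairs) and the alphabetical output format are assumptions made for illustration. They are not the API's data model or its formula format.

```python
from collections import defaultdict


def occupancy_weighted_formula(sites):
    """Build a formula string from (element, occupancy) pairs.

    Each site adds its occupancy, rather than 1, to the element count.
    """
    counts = defaultdict(float)
    for element, occupancy in sites:
        counts[element] += occupancy
    # Alphabetical order is used here purely for simplicity.
    return " ".join("%s%g" % (element, counts[element]) for element in sorted(counts))


# Two fully occupied carbons, one half-occupied oxygen and one chlorine
# site at 0.75 occupancy (hypothetical values).
formula = occupancy_weighted_formula(
    [("C", 1.0), ("C", 1.0), ("O", 0.5), ("Cl", 0.75)]
)
```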

Bug fixes

  • ccdc.search.SubstructureSearch.SearchHit.match_substructures() has been revised to ensure that the order of matched atoms in the returned molecule accords with the order of substructure atoms in the query.
  • Fixed a problem on 64-bit Windows where, due to the modified registry layout, a 32-bit API could have trouble locating the database and therefore a licence.

1.5.2

Backwards incompatible changes

  • the method ccdc.molecule.add_hydrogens() will no longer add hydrogens to atoms in a polymeric bond, correcting an earlier error.

Deprecations

None

Major new features

None

Minor new features

  • there is a new method, ccdc.molecule.Atom.is_in_line_of_sight() which checks whether a pair of atoms is occluded by a third. See Line Of Sight for details.
  • ccdc.descriptors.StatisticalDescriptors.RankStatistics additionally accepts a string identifier to specify activity classifications.
  • All search classes raise exceptions when from_xml_file methods are called with a non-existent XML file.
  • the method ccdc.utilities.Timer.progress() now takes an optional argument specifying that the output should be written in place, overwriting previous output.
  • There is a new property, ccdc.search.Search.SearchHit.identifier() to access the string identifier for the hit.
  • Fail earlier (at import time) for bad installations, with helpful error messages for:
    • Python versions other than 2.7.x
    • 64-bit install into 32-bit Python environment, or vice versa
    • UCS2 install into UCS Python environment, or vice versa
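
The in-place output option mentioned for ccdc.utilities.Timer.progress() corresponds to a common terminal technique: starting each message with a carriage return so that it overwrites the previous one. The sketch below shows that general technique only. write_progress and its parameters are hypothetical and do not reflect the signature or implementation of the API method.

```python
import io
import sys


def write_progress(message, stream=sys.stdout, in_place=False):
    """Write a progress message, optionally overwriting the previous one."""
    if in_place:
        # A carriage return moves the cursor back to the start of the line.
        stream.write("\r" + message)
    else:
        stream.write(message + "\n")
    stream.flush()


# Capture the output in a buffer to show exactly what is written.
buffer = io.StringIO()
for done in (1, 2, 3):
    write_progress("processed %d of 3" % done, stream=buffer, in_place=True)
```

A message shorter than its predecessor leaves trailing characters on screen, so a fuller version might pad each message to a fixed width.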

Bug fixes

  • Fix diagram generation segmentation fault on Linux platforms with NVidia graphics drivers (BZ17616)
  • Fixed a bug where multiple covalently-bound ligands with the same chain identifier were treated as a single ligand (Bug 18737)
  • Avoid rounding errors when comparing crystal contact lengths.
  • Fix incorrect licensing restrictions for Point and Vector.
  • Throw a runtime_error instead of seg faulting when incorrect atoms are provided for a substructure search centroid
  • Throw a runtime_error instead of seg faulting when illegal substructure indices specified in search
  • Give consistent errors when search XML files are missing, irrespective of Search class.
  • Fix stretching of grid files when saved using .grd format (BZ18734)
  • Fix ccdc.search.SMARTSSubstructure hit atom indexes not corresponding to the substructure specification (Bug 18661)
  • Fix calculation of crystal density where the Z’ is less than 1 (BZ16116)
  • Fix distance parameters read from CONNSER files being erroneously VdW-corrected (BZ18736)
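
The rounding-error fix for contact lengths reflects a general point about floating-point comparison: two lengths that are equal in principle can differ in their last bits. A minimal sketch of a tolerance-based comparison follows. The function name and the default tolerance are assumptions for illustration and are not taken from the API.

```python
import math


def same_contact_length(first, second, tolerance=1e-6):
    """Compare two lengths using an absolute tolerance."""
    return math.isclose(first, second, rel_tol=0.0, abs_tol=tolerance)


# 0.1 + 0.2 is not exactly 0.3 in binary floating point, so a direct
# equality test fails where the tolerant comparison succeeds.
exact_match = (0.1 + 0.2) == 0.3
tolerant_match = same_contact_length(0.1 + 0.2, 0.3)
```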

1.5.1

Backwards incompatible changes

  • the module ccdc.cavity has been moved to ccdc_rp.cavity.
  • when reading loops from the attributes of a CIF file, values which are deemed not significant will be replaced by None. Previously only the first item of the loop was checked.
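
As an illustration of the loop handling described above, the sketch below replaces placeholder values at every position of a loop row with None, rather than checking only the first item. It assumes that the values "deemed not significant" are the CIF placeholders ? (unknown) and . (inapplicable). The release note does not say which values qualify, and clean_loop_rows is a hypothetical helper, not part of the API.

```python
def clean_loop_rows(rows, placeholders=("?", ".")):
    """Return a copy of rows with placeholder values replaced by None."""
    return [
        [None if value in placeholders else value for value in row]
        for row in rows
    ]


# Hypothetical loop rows: atom label, element symbol, occupancy.
rows = [
    ["C1", "C", "1.0"],
    ["H1", "?", "."],
]
cleaned = clean_loop_rows(rows)
```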

Deprecations

None

Major new features

None

Minor new features

Bug fixes

  • “se” to represent aromatic Selenium (as in Selenophene) is now supported by the SMILES and SMARTS parsers.
  • ccdc.descriptors.MolecularDescriptors.point_group_analysis() no longer fails if called twice on the same molecule.
  • ccdc.descriptors.MolecularDescriptors.point_group_analysis() now deduces the correct point group symmetry for BROFRM02.

1.5.0

Backwards incompatible changes

None

Deprecations

None

Major new features

Minor new features

1.4.0

Backwards incompatible changes

  • some of the names of ccdc.interaction.InteractionLibrary.CentralGroup have been changed to correct a bug in the handling of groups with distinct geometries, for example, ‘planar uncharged aromatic amino’ is now distinguishable from ‘pyramidal uncharged aromatic amino’.

Deprecations

  • The attribute ccdc.conformer.ConformerSettings.normalised_score_threshold is deprecated.

Major new features

Minor new features

Bug fixes

  • Writing a protein containing atoms with unknown element types to PDB formatted-files now generates valid PDB format.
  • Reading proteins from Mol2 files now retains residue information.
  • There was a bug where ccdc.search.SubstructureSearch.SubstructureHit.match_atoms() could return atoms not in the same molecule. This has been fixed.
  • The python Numpy module is no longer a prerequisite for the CSD Python API, leading to an easier installation experience for pip users.

1.3.0

Backwards incompatible changes

  • The ccdc.entry.Citation previously had a member, ccdc.entry.Citation.journal_name, which has been superseded by an instance of the new ccdc.entry.Journal.
  • The ccdc.descriptors.PowderPattern has been removed, after deprecation in the previous release. Please use ccdc.descriptors.CrystalDescriptors.PowderPattern instead.
  • Several methods and classes from ccdc.cavity have been removed following deprecation in the previous release.
  • When reading a CIF file, bond information, even if present, will be ignored. This is for consistency with other CCDC programs.

Deprecations

None

Major new features

Minor new features

Bug fixes

1.2.0

Backwards incompatible changes

None

Deprecations

  • ccdc.descriptors.PowderPattern has been moved into a new namespace, and now appears as ccdc.descriptors.CrystalDescriptors.PowderPattern. It is available under its old location for this release, for backwards compatibility, but will be available only in its new location for the next 1.3 release.

Major new features

  • There is now an implementation of Graph Sets in the API. See graph-sets for details.
  • ccdc.docking.Docker now allows GOLD to be invoked for rescoring. See Rescoring for details.

Minor new features

Bug fixes

  • ccdc.protein.Protein.Residue.__eq__() method now compares chain ID as well as residue sequence number.
  • ccdc.protein.Protein.Residue.__lt__() method now sorts on chain ID as well as residue sequence number.
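
The behaviour described in these two fixes, comparing and ordering on chain ID together with residue sequence number, can be sketched with a small stand-in class. ResidueKey is hypothetical and is not the API's Residue class. It only shows why residue 10 of chain A and residue 10 of chain B should differ, and how sorting groups residues by chain.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class ResidueKey:
    """Hypothetical stand-in: compared field by field, chain ID first."""

    chain_id: str
    sequence_number: int


residues = [ResidueKey("B", 1), ResidueKey("A", 20), ResidueKey("A", 3)]
ordered = sorted(residues)
```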

1.1.1

Deprecations

  • ccdc.cavity.Cavity.RapmadPocket has been deprecated in favour of ccdc.cavity.Cavity.PocketDistanceHistograms.
  • ccdc.cavity.Cavity.PocketDistanceHistograms.identifier, ccdc.cavity.Cavity.PocketDistanceHistograms.nfeatures and ccdc.cavity.Cavity.PocketDistanceHistograms.feature_coordinates have been deprecated.
  • ccdc.cavity.Cavity.rapmad_pocket() has been deprecated in favour of ccdc.cavity.Cavity.pocket_distance_histograms().
  • ccdc.cavity.CavityDB.rapmad_pocket() and ccdc.cavity.CavityDB.rapmad_pockets() have been deprecated in favour of ccdc.cavity.CavityDB.pocket_distance_histograms() and ccdc.cavity.CavityDB.pocket_distance_histogram_sets().
  • ccdc.cavity.CavityDB.search_rapmad() and ccdc.cavity.CavityDB.search_cavbase() have been deprecated in favour of ccdc.cavity.CavityDB.pocket_search() and ccdc.cavity.CavityDB.cavbase_search().

Minor new features

  • updated naming in experimental interface to protein cavities. See ccdc.cavity.

1.1.0

Backwards incompatible changes

  • molecules read from a database no longer raise an exception if there are no atoms.
  • molecules with no atoms can be written and converted to strings
  • entries and crystals can be created with no underlying atoms
  • translating molecules no longer raises an exception if there are siteless atoms.

Deprecations

None

Major new features

Molecule

  • ccdc.molecule.Molecule now provides a method to calculate partial charges for organic molecules.
  • ccdc.molecule.Atom has a property, partial_charge, to get or set the partial charge of an atom. All partial charges of a molecule will be reset if an atom’s formal charge is changed.

Crystal

Minor new features

1.0.0

Backwards incompatible changes

None

Deprecations

None

Major new features

  • updated for CSDS 2016
  • there is now an API for docking ligands into proteins, using GOLD. This is currently available only to collaborators.
  • ccdc.molecule.Molecule can now determine intramolecular hydrogen bonds and close contacts.
  • ccdc.molecule.Atom.is_chiral and ccdc.molecule.Atom.chirality have been extensively revised to give more accurate determination of R/S chirality including the determination of para-chiral centres (whose chirality is determined solely by the chirality of other atoms). Note that structures with pi-bonds will not support the determination of chirality.
  • ccdc.descriptors.MolecularDescriptors has new methods to define geometric objects from atom and ring coordinates.
  • ccdc.descriptors.GeometricDescriptors is new and provides methods to define vectors and planes from points, and to calculate geometric relationships between them. See Molecular geometry for details.
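
For readers unfamiliar with the geometry involved, the following sketch shows the kind of calculation the note refers to: defining a plane from three points and measuring the angle between two vectors. It is written in plain Python with hypothetical function names and does not use or reproduce the ccdc.descriptors.GeometricDescriptors interface.

```python
import math


def subtract(p, q):
    return tuple(a - b for a, b in zip(p, q))


def cross(u, v):
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def plane_normal(p0, p1, p2):
    """Unit normal of the plane through three non-collinear points."""
    normal = cross(subtract(p1, p0), subtract(p2, p0))
    length = math.sqrt(dot(normal, normal))
    return tuple(component / length for component in normal)


def angle_degrees(u, v):
    """Angle between two non-zero vectors, in degrees."""
    cosine = dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))


normal = plane_normal((0, 0, 0), (1, 0, 0), (0, 1, 0))
angle = angle_degrees((1, 0, 0), (0, 1, 0))
```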

Minor new features

Entry

  • ccdc.entry.Entry.cross_references gives a tuple of ccdc.entry.Entry.CrossReference instances. These provide cross-references between entries of the CSD.
  • a ccdc.io.EntryReader of a mol2 format entry will now extract SDFile-like tags from the Mol2Comments and place them in an attributes dictionary. The EntryWriter will write these attributes.
  • a ccdc.io.EntryReader of a mol2 format entry will now extract Mol2 format atom sets and place them in a dictionary attribute ccdc.entry.Entry.atom_sets where found. The EntryWriter will write atom sets if the above attribute is set.

Crystal

Molecule

IO

  • gcd files may now use an arbitrary database as the source of entries. Use the form io.EntryReader(gcd_file, source_database). This will also work with lists of identifiers.

0.7.0

Backwards incompatible changes

None

Deprecations

  • The property ccdc.entry.jds_deposition_number is deprecated. This is a historical journal deposition number and has since been superseded by CCDC numbers. The method ccdc.search.TextNumericSearch.add_jds_deposition_number() is similarly deprecated.

Major new features

Minor new features

Entry

Crystal

IO

  • res format (SHELX) files may now be read and written through the standard io classes.

Search

GeometryAnalyser

Screener

Examples

  • maximum_common_substructure.py shows a similarity search followed by maximum common substructure search.
  • filter_csd.py shows how an iteration over the entries of the CSD can omit entries on a number of criteria.
  • simple_report.py shows how python’s format() method may be used to write HTML reports on a CSD entry.
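
The approach named for simple_report.py, filling an HTML template with format(), can be sketched as follows. This is not the shipped example. The template, the render_report function and the placeholder values are assumptions, and plain strings stand in for the properties of a CSD entry.

```python
import html

REPORT_TEMPLATE = """<html>
<body>
<h1>{identifier}</h1>
<p>Formula: {formula}</p>
</body>
</html>"""


def render_report(identifier, formula):
    """Fill the template, escaping values so they are safe in HTML."""
    return REPORT_TEMPLATE.format(
        identifier=html.escape(identifier),
        formula=html.escape(formula),
    )


# Placeholder values, not data from a real entry.
report = render_report("EXAMPLE", "C2 H6 O1")
```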

Bug fixes

None

0.6.0

Backwards incompatible changes

Deprecations

Major new features

Minor new features

Geometry Analyser

Diagrams

Search

  • ccdc.search.TextNumericSearch has a new property, queries, which will give a human-readable representation of the queries added to the search instance.

Entry

  • ccdc.entry.Entry has the property radiation_source to express the experimental radiation probe used in the determination of the crystal.
  • ccdc.entry.Entry has the property is_polymeric to allow filtering of the CSD on polymeric structures.
  • ccdc.entry.Entry and ccdc.crystal.Crystal may now be compared for equality (based on their identifiers) and hashed, allowing use as a key in a dictionary or as a member of a set.
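
Identifier-based equality and hashing, as described for Entry and Crystal, follow a standard Python pattern. The sketch below uses a hypothetical stand-in class rather than the API's own classes, to show why such objects can be de-duplicated in a set or used as dictionary keys.

```python
class IdentifiedRecord:
    """Hypothetical stand-in whose identity is its string identifier."""

    def __init__(self, identifier):
        self.identifier = identifier

    def __eq__(self, other):
        if not isinstance(other, IdentifiedRecord):
            return NotImplemented
        return self.identifier == other.identifier

    def __hash__(self):
        # Equal objects must hash equally, so hash the same field.
        return hash(self.identifier)


records = {IdentifiedRecord("AAA"), IdentifiedRecord("AAA"), IdentifiedRecord("BBB")}
notes = {IdentifiedRecord("AAA"): "first"}
```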

Molecule

Crystal

  • ccdc.crystal.Crystal now has a spacegroup_number_and_setting attribute to provide details of the crystal’s space group.
  • ccdc.crystal.Crystal has a new method to calculate a packing shell of a given number of molecules.
  • ccdc.crystal.Crystal now reports its symmetry operations as a tuple of strings.
  • ccdc.crystal.Crystal will report rotational and translational components of its symmetry operators.
  • ccdc.crystal.Crystal now allows molecules to be generated by applying the crystal’s symmetry operations.
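
To make the idea of rotational and translational components concrete, the sketch below splits a symmetry operator string into a rotation matrix and a translation vector. It assumes the conventional "x,y,z" notation with fractional translations such as 1/2, which the note does not specify. parse_symmetry_operator is a hypothetical helper that does not call the API and handles neither decimal translations nor coefficients on axes.

```python
import re
from fractions import Fraction

AXES = {"x": 0, "y": 1, "z": 2}
# Each term is an optional sign followed by either a fraction or an axis.
TERM = re.compile(r"([+-]?)(?:(\d+(?:/\d+)?)|([xyz]))")


def parse_symmetry_operator(operator):
    """Return (rotation rows, translation) for an 'x,y,z'-style operator."""
    rotation, translation = [], []
    for component in operator.lower().replace(" ", "").split(","):
        row = [0, 0, 0]
        shift = Fraction(0)
        for sign, number, axis in TERM.findall(component):
            factor = -1 if sign == "-" else 1
            if axis:
                row[AXES[axis]] += factor
            else:
                shift += factor * Fraction(number)
        rotation.append(row)
        translation.append(shift)
    return rotation, translation


# A two-fold screw axis along y, written in the assumed notation.
rotation, translation = parse_symmetry_operator("-x,1/2+y,-z")
```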

0.5.0

Backwards incompatible changes

Deprecations

Major new features

Minor new features

Bug fixes

0.4.0

Backwards incompatible changes

None

Major new features

  • Access to the CCDC conformer generator and molecular minimiser. See the ccdc.conformer module. Feature under development - currently available only to associated collaborators.
  • Access to the CCDC diagram generation functionality. See the ccdc.diagram module.
  • Mogul analysis of individual fragments:
    • ccdc.mogul.Mogul.analyse_bond()
    • ccdc.mogul.Mogul.analyse_angle()
    • ccdc.mogul.Mogul.analyse_torsion()
    • ccdc.mogul.Mogul.analyse_ring()
  • The entire CSD Python API is now unicode compatible

Minor new features

Bug fixes

  • Fixed a bug which meant that the ccdc.mogul.MogulResult.histogram() function was not returning correct data.

0.3.1

Backwards incompatible changes

Major new features

  • Support for 64-bit Python on Linux
  • ccdc.search.SubstructureSearch has expanded functionality
    • Ability to measure distances, angles and torsion angles in hit structures
    • Ability to constrain distances, angles and torsion angles in hit structures
    • It now provides the ability to add more than one substructure, which can be used to set up inter-molecular contact searches
  • A number of IsoStar classes have been implemented. See ccdc.isostar for details.
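
Of the measurements listed above for SubstructureSearch, the torsion angle is the least obvious to compute. The following is a sketch of the standard four-point torsion formula in plain Python. It is offered only to illustrate the quantity being measured or constrained. The function names are hypothetical and the API's own implementation and sign convention are not documented here.

```python
import math


def _sub(p, q):
    return tuple(a - b for a, b in zip(p, q))


def _cross(u, v):
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def torsion_degrees(p0, p1, p2, p3):
    """Signed torsion angle about the p1-p2 bond, in degrees."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


gauche = torsion_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
anti = torsion_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
```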

Minor new features

Bug fixes

  • A much larger number of open ASER database instances is now supported, and a memory leak of open ASER database instances has been fixed.
  • ccdc.search.ConnserSubstructure will raise an exception for a missing or empty file name parameter.
