Cavity API¶
Introduction¶
The ccdc.cavity module contains a class ccdc.cavity.Cavity representing a putative binding site on the protein surface that has been automatically detected by the LIGSITE algorithm.
Reference article introducing the LIGSITE algorithm.
This approach uses a three-dimensional grid for the detection of surface depressions with a grid spacing of 0.5 Ångströms. As a result, the cavity is represented by a set of grid intersection points, also called surface points.
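To make the grid representation concrete, the following is a minimal sketch of how the intersection points of a regular 0.5 Å grid can be enumerated for a box. It only illustrates the data layout described above and is not the LIGSITE detection step itself; `grid_points`, its arguments and the box used in the usage note are hypothetical and not part of ccdc.cavity.

```python
import itertools
import math


def grid_points(origin, far_corner, spacing=0.5):
    """Yield the (x, y, z) intersection points of a regular grid filling a box."""
    # Number of grid points along each axis. The small epsilon guards against
    # floating-point round-off when the box edge is an exact multiple of spacing.
    counts = [
        math.floor((hi - lo) / spacing + 1e-9) + 1
        for lo, hi in zip(origin, far_corner)
    ]
    for i, j, k in itertools.product(*(range(n) for n in counts)):
        yield (
            origin[0] + i * spacing,
            origin[1] + j * spacing,
            origin[2] + k * spacing,
        )
```

With the default spacing, a 10 Å cube has 21 points along each axis, so `len(list(grid_points((0, 0, 0), (10, 10, 10))))` is 9261.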
Once a cavity has been detected, the flanking fragments of amino acids are assigned pseudocenters to represent their physicochemical properties. Currently, seven types of pseudocenters are used:
DONOR: representing a hydrogen bond donor functionality
ACCEPTOR: representing a hydrogen bond acceptor functionality
DONOR_ACCEPTOR: representing both hydrogen bond donor and acceptor functionalities
AROMATIC: representing the center of an aromatic ring system
ALIPHATIC: representing an aliphatic moiety
PI: representing the presence of a double bond
METAL: representing the position of a metal ion
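For scripts that tally or filter features by type, the seven pseudocenter types listed above can be mirrored in a small enumeration. This is an illustrative sketch only: `PseudocenterType` and `count_by_type` are hypothetical names, not part of ccdc.cavity, and the sketch assumes that type names arrive as the upper-case strings shown in the list.

```python
from collections import Counter
from enum import Enum


class PseudocenterType(Enum):
    """Hypothetical mirror of the seven pseudocenter types listed above."""

    DONOR = "DONOR"
    ACCEPTOR = "ACCEPTOR"
    DONOR_ACCEPTOR = "DONOR_ACCEPTOR"
    AROMATIC = "AROMATIC"
    ALIPHATIC = "ALIPHATIC"
    PI = "PI"
    METAL = "METAL"


def count_by_type(type_names):
    """Tally type names; an unknown name raises KeyError."""
    counts = Counter(PseudocenterType[name] for name in type_names)
    # Report every type, including those that do not occur.
    return {kind: counts.get(kind, 0) for kind in PseudocenterType}
```

Returning a count for every type, including zeros, makes the result directly comparable between cavities.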
Reference article for the introduction of the cavity graph comparison method.
Reference article for the introduction of the fast cavity graph comparison method.
Reference article for the introduction of the cavity histograms comparison method.
The class ccdc.cavity.Cavity has methods to automatically create cavities for PDB files, or read a cavity from an XML representation, to extract useful information such as bound ligands, and to compare two cavities using different comparison methods. From this it is possible to write screens to search for similar cavities across a range of proteins, a defined set of cavities, or the entire cavity database.
API¶
- class ccdc.cavity.Cavity(_cavity, _pdb)[source]¶
A cavity on the protein surface.
- class CavityDistanceHistograms(cavity, reference_points=None)[source]¶
A cavity description based on histograms of distances between reference points and cavity pseudocenters.
- property histograms¶
The tuple of histograms defined for this cavity.
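The idea of a distance histogram can be sketched in a few lines: pick a reference point, measure the distance from it to each pseudocenter, and count distances per bin. The sketch below uses the centroid as the only reference point. The function names, the 1 Å bin width and the number of bins are assumptions made for illustration; the binning used by ccdc.cavity is not stated here.

```python
import math


def centroid(points):
    """Arithmetic mean of a non-empty sequence of (x, y, z) points."""
    n = len(points)
    return tuple(sum(p[axis] for p in points) / n for axis in range(3))


def distance_histogram(points, reference, bin_width=1.0, n_bins=20):
    """Count points per distance bin, measured from `reference`.

    Distances beyond the last bin are counted in the last bin.
    """
    histogram = [0] * n_bins
    for point in points:
        index = int(math.dist(point, reference) // bin_width)
        histogram[min(index, n_bins - 1)] += 1
    return histogram
```

A typical call would be `distance_histogram(points, centroid(points))`, where `points` holds the pseudocenter coordinates.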
- class CavityGraphComparison(_comparison)[source]¶
The result of a cavity graph comparison.
- property clique_rmsd¶
The rms deviation of the matched clique points.
- property n_cliques¶
The total number of cliques detected during the comparison.
- property n_matches¶
The number of matching pseudocenters detected.
- property rmsd¶
The rms deviation for the match of all pseudocenters.
- property score¶
The similarity score of the comparison.
- property transformation_matrix¶
The transformation matrix to overlay the target onto the query cavity.
- class FastCavityGraphComparison(_comparison)[source]¶
The result of a fast cavity graph comparison.
- property largest_clique_size¶
The size of the largest clique detected.
- property product_graph_size¶
The size of the product graph generated during the comparison.
- property score¶
The similarity score of the comparison.
- class Feature(_feature=None)[source]¶
An interaction feature in a cavity.
- property amino_acid_code¶
The associated amino acid code.
This will be ‘UNK’ if no protein structure has been associated with the cavity containing the feature.
- property atom¶
The atom from which this feature is defined.
This will be None if no protein information is associated with the cavity. Aromatic and aliphatic features have no associated atom, so this property has the value None in those cases.
- property burial¶
The burial value assigned to this feature (0 to 7, where 7 means most buried).
- property chain¶
The chain from which this feature was defined.
This will be None if no protein information is associated with the cavity.
- property coordinates¶
The position of the feature.
- Returns
a named tuple of coordinates
- property protein_vector¶
Vector denoting the ideal interaction direction of the feature with another one outside the protein.
- property residue¶
The residue associated with this feature.
This will be None if no protein information is associated with the cavity.
- property surface_depths¶
The depth values assigned to the surface points of this feature (0 to 7, where 7 means most buried).
- Returns
a tuple of depth values
- property surface_points¶
The surface points associated with this feature.
These approximate the surface shape close to the feature.
- Returns
a tuple of named tuples of coordinates
- property surface_vector¶
Vector denoting the connection from the feature to the centre of its assigned surface points.
- property type¶
The type of the interaction feature.
- property bounding_box¶
The origin and far corner of the cavity.
- cavity_distance_histograms(reference_points=None)[source]¶
Create a set of feature distance histograms for this cavity based on the given reference point specification.
- Parameters
reference_points – a set of reference point measures
- Returns
a ccdc.cavity.Cavity.CavityDistanceHistograms instance
- compare(other, comparison_method=1, histogram_reference_points=None, max_product_graph_size=36000)[source]¶
Compare this cavity to another cavity.
- Parameters
other – a ccdc.cavity.Cavity instance
comparison_method – a member of ccdc.cavity.Cavity.ComparisonMethod, either Cavity.ComparisonMethod.FAST_CAVITY_GRAPH_COMPARISON, Cavity.ComparisonMethod.CAVITY_GRAPH_COMPARISON or Cavity.ComparisonMethod.CAVITY_HISTOGRAMS_COMPARISON
histogram_reference_points – an iterable of strings drawn from ‘centroid’, ‘centroid_closest’, ‘centroid_furthest’, ‘centroid_furthest_furthest’. If empty or None, ‘centroid’ and ‘centroid_closest’ will be used for the generation of distance histograms with Cavity.ComparisonMethod.CAVITY_HISTOGRAMS_COMPARISON
max_product_graph_size – the maximum allowed size of the product graph for fast cavity graph comparisons
- Returns
a ccdc.cavity.Cavity.FastCavityGraphComparison instance, a ccdc.cavity.Cavity.CavityGraphComparison instance or a similarity score for cavity histogram comparisons
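Because compare() returns either a comparison object with a score property or a bare similarity score, calling code may want to normalise the result before ranking. The helper below is a sketch under that reading of the return types. `similarity_score` and `rank_by_similarity` are hypothetical names, the cavities are duck-typed (anything with a compatible compare() method works), and the sketch assumes that a higher score means greater similarity.

```python
def similarity_score(result):
    """Return a numeric score from whatever compare() returned.

    Comparison objects expose `.score`; histogram comparisons are
    documented as returning the score directly.
    """
    return getattr(result, "score", result)


def rank_by_similarity(query, candidates, **compare_kwargs):
    """Compare `query` with each candidate and sort best-first.

    Extra keyword arguments (for example comparison_method) are passed
    straight through to compare().
    """
    scored = [
        (similarity_score(query.compare(candidate, **compare_kwargs)), candidate)
        for candidate in candidates
    ]
    # Sort on the score only, so candidates never need to be orderable.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored
```

The result is a list of (score, candidate) pairs, so the best match is `rank_by_similarity(query, candidates)[0]`.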
- property features¶
The features of the cavity.
- Returns
a tuple of ccdc.cavity.Cavity.Feature instances
- features_by_atom_distance(atoms, radius)[source]¶
The set of all features within a radius of any of the atoms.
- features_by_distance(centre, radius)[source]¶
The set of features of the cavity within radius of the centre.
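The geometric test behind a radius query can be sketched as follows. This is an illustration, not the library's implementation: `within_radius` is a hypothetical helper that works on any objects exposing a `coordinates` attribute, and treating the boundary as inclusive is an assumption.

```python
import math


def within_radius(features, centre, radius):
    """Return the features whose coordinates lie within `radius` of `centre`.

    Each feature must expose `.coordinates` as an (x, y, z) sequence.
    The boundary is treated as inclusive (an assumption of this sketch).
    """
    return [
        feature
        for feature in features
        if math.dist(feature.coordinates, centre) <= radius
    ]
```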
- features_by_residues(residues)[source]¶
The set of features associated with any of the given residues.
- static from_pdb_file(pdb_file, maximum_incomplete_residues_per_chain=0)[source]¶
Create cavities from a PDB file.
- Parameters
pdb_file – PDB file for the generation of cavities
maximum_incomplete_residues_per_chain – the maximum number of incomplete residues to allow per chain (0 by default)
- Raises
RuntimeError if the PDB file contains more than 1000 SEQRES lines
- Returns
a tuple of ccdc.cavity.Cavity instances
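Since from_pdb_file() raises RuntimeError for files with more than 1000 SEQRES lines, a batch script may prefer to check the count first and skip oversized files. The sketch below does this with plain text processing. Both function names are hypothetical, and the default limit of 1000 is taken from the Raises note above.

```python
def count_seqres_lines(lines):
    """Count PDB records whose record name (columns 1-6) is SEQRES."""
    return sum(1 for line in lines if line.startswith("SEQRES"))


def exceeds_seqres_limit(pdb_path, limit=1000):
    """Return True if the file has more SEQRES lines than `limit`."""
    with open(pdb_path) as handle:
        return count_seqres_lines(handle) > limit
```

A file for which `exceeds_seqres_limit(path)` is true can be logged and skipped rather than passed to from_pdb_file().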
- static from_xml(xml, pdb_file=None)[source]¶
Reads a cavity from an XML string.
- Parameters
xml – an XML representation of the cavity
pdb_file – an optional PDB file for the associated protein, from which additional data for the cavity may be computed
- static from_xml_file(xml_file, pdb_file=None)[source]¶
Reads a cavity from an XML file and associated PDB file.
- Parameters
xml_file – XML file representing the cavity
pdb_file – an optional PDB file for the associated protein, from which additional data for the cavity may be computed
- Raises
RuntimeError if the XML file does not exist
- Returns
a ccdc.cavity.Cavity instance
- property identifier¶
The identifier of this cavity.
- property ligand_identifiers¶
Tuple of ligand identifiers found in the cavity.
- property ligands¶
List of ligands of the cavity.
If there is no protein associated with the cavity this will be None.
- subcavity(features)[source]¶
Make a subcavity based on a set of features from this cavity.
- Parameters
features – a set of features for construction of the subcavity
- Returns
a ccdc.cavity.Cavity instance
- to_pymol_file(file_name=None, show_surface_points=False)[source]¶
Create a visualisation file of this cavity that can be run in PyMOL.
The cavity will be represented by its physicochemical features.
- Parameters
file_name – Python file containing the information for displaying the cavity. This should have a .py extension. If not defined, the file will be named using the cavity identifier
show_surface_points – additionally display the points representing the cavity’s surface shape
- to_xml_file(file_name)[source]¶
Writes the XML representing a cavity to a file.
- Parameters
file_name – the file to which to write the XML
- property volume¶
Volume of the cavity in cubic Ångströms.
- class ccdc.cavity.CavityDatabase(file_name=None)[source]¶
An SQLite database for cavities. A path to a database must be passed in when creating an instance of this class.
Please note that the schema for the database, and, indeed, the final choice of underlying database has not been finalised. Accordingly this should be treated as a prototypical implementation. The API for creation and for searching should remain valid, however.
- class Settings[source]¶
Settings appropriate to cavity searches.
- acceptor_range = None¶
minimum and maximum number of acceptor features
- aliphatic_range = None¶
minimum and maximum number of aliphatic features
- aromatic_range = None¶
minimum and maximum number of aromatic features
- donor_acceptor_range = None¶
minimum and maximum number of donor-acceptor features
- donor_range = None¶
minimum and maximum number of donor features
- histogram_reference_points = None¶
an iterable of strings drawn from ‘centroid’, ‘centroid_closest’, ‘centroid_furthest’, ‘centroid_furthest_furthest’. If empty or None, ‘centroid’ and ‘centroid_closest’ will be used for the generation of distance histograms with Cavity.ComparisonMethod.CAVITY_HISTOGRAMS_COMPARISON
- ligand_range = None¶
minimum and maximum number of ligands
- logfile = False¶
logfile of comparison scores, default False
- max_hit_structures = 0¶
maximum number of structures returned
- max_product_graph_size = 36000¶
maximum size of product graph allowed when using the fast cavity graph comparison method
- metal_range = None¶
minimum and maximum number of metal features
- pi_range = None¶
minimum and maximum number of pi features
- start = 0¶
offset starting position in database
- verbose = False¶
verbose output, default False
- volume_range = None¶
minimum and maximum cavity volume
- with_ligands = None¶
ligand identifiers
- cavity_distance_histograms(identifier)[source]¶
The distance histograms corresponding to the given cavity identifier.
- get_cavities_by_ligand(ligand_id)[source]¶
Get all the cavities containing a particular PDB ligand identifier.
- get_cavity_identifiers_by_ligand(ligand_id)[source]¶
Get the identifiers of all cavities containing a particular PDB ligand identifier.
- get_info_for_cavity(cav_name)[source]¶
Get the information for a cavity.
- Parameters
cav_name – the name of the cavity from the cav_xml table of the database
- Returns
a dictionary of values from the info table
- populate_all(directory, id_file=None, verbose=False, maximum_allowed_incomplete_residues=0)[source]¶
Create all tables from the directory of input files.
- Parameters
directory – directory containing PDB or XML cavity files
verbose – enable verbose output, default False
- search(cavity=None, comparison_method=1, settings=None)[source]¶
Searches the database and optionally performs cavity comparisons against the results.
The query can include a Cavity for comparison against the database. If a cavity is not specified, a search for cavities matching the constraints is performed.
- Parameters
cavity – a ccdc.cavity.Cavity instance
comparison_method – a member of ccdc.cavity.Cavity.ComparisonMethod, either Cavity.ComparisonMethod.FAST_CAVITY_GRAPH_COMPARISON, Cavity.ComparisonMethod.CAVITY_GRAPH_COMPARISON or Cavity.ComparisonMethod.CAVITY_HISTOGRAMS_COMPARISON
settings – a ccdc.cavity.CavityDatabase.Settings instance
- Returns
a list of tuples of comparison score and cavity identifier, sorted by comparison score, starting with the highest similarity, or alternatively returns a generator in the case where no comparisons are performed
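When a query cavity is supplied, the returned list is already sorted best-first, so post-filtering can stop at the first hit below a threshold. The helper below sketches this. `top_hits` is a hypothetical name, the threshold and the identifiers in the usage note are made up, and the scale of the comparison score is not assumed beyond "higher is more similar".

```python
import itertools


def top_hits(results, min_score=0.0, limit=None):
    """Keep the leading (score, identifier) hits scoring at least `min_score`.

    `results` must be sorted by score, highest first, as described for
    the return value above. `limit` caps the number of hits returned.
    """
    # Sorted input lets us stop at the first hit below the threshold.
    kept = itertools.takewhile(lambda hit: hit[0] >= min_score, results)
    return list(itertools.islice(kept, limit))
```

For example, `top_hits([(0.9, "cav_a"), (0.7, "cav_b"), (0.4, "cav_c")], min_score=0.5)` keeps the first two hits.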