Descriptors API¶
Introduction¶
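The ccdc.descriptors module contains classes for calculating descriptors.
The main classes in the ccdc.descriptors module are:
- ccdc.descriptors.MolecularDescriptors
- ccdc.descriptors.GeometricDescriptors
- ccdc.descriptors.CrystalDescriptors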
API¶
- class ccdc.descriptors.MolecularDescriptors[source]¶
Namespace for descriptors of a molecular nature.
- class AdjacencyMatrixDescriptorCalculator(molecule)[source]¶
Descriptor calculator for descriptors based on a molecule’s adjacency matrix.
- self_returning_walk(k)[source]¶
Return the number of walks of length k that start and end at the same atom. See Handbook of Molecular Descriptors, page 384, “self-returning walk counts”.
- Parameters:
k – the number of steps to walk.
- Returns:
float
- self_returning_walk_ln(k)[source]¶
Return the logarithm of the number of walks of length k that start and end at the same atom. See Handbook of Molecular Descriptors, page 384, “self-returning walk counts”.
- Parameters:
k – the number of steps to walk.
- Returns:
float
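As a rough illustration of the quantity described above (not the CCDC implementation), the total number of closed walks of length k in a molecular graph equals the trace of the k-th power of its adjacency matrix. The sketch below uses plain Python lists. The function names, the hydrogen-suppressed cyclobutane example and the choice of the natural logarithm are assumptions made for illustration; the real calculator may treat hydrogens or the logarithm base differently.

```python
import math

def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def self_returning_walk_count(adjacency, k):
    """Total number of walks of length k that start and end on the same vertex."""
    n = len(adjacency)
    # Start from the identity matrix, the 0th power of the adjacency matrix.
    power = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        power = mat_mul(power, adjacency)
    # Diagonal entry (i, i) counts the closed walks of length k from vertex i.
    return sum(power[i][i] for i in range(n))

# Hypothetical example: the four-membered carbon ring of cyclobutane, hydrogens omitted.
cyclobutane = [
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]

count = self_returning_walk_count(cyclobutane, 4)
log_count = math.log(count)  # natural logarithm (an assumption about the base)
```

An even ring has no closed walks of odd length, so the count for k = 3 on this graph is zero, and taking its logarithm would fail.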
- topological_charge_autocorrelation_index(k)[source]¶
Calculate the topological charge autocorrelation index. See https://pubs.acs.org/doi/pdf/10.1021/ci00019a008
- Parameters:
k – the topological distance to measure across
- Returns:
float
- class AtomDistanceSearch(molecule)[source]¶
More rapid searching for atoms within a certain distance of a point.
- class AtomPairDistanceDescriptorCalculator(molecule)[source]¶
Atom pair distance descriptor calculations.
- Parameters:
molecule – a
ccdc.molecule.Molecule
instance.
- element_pair_count(element_a, element_b, distance)[source]¶
Return a count of the number of times a pair of elements appear with a specified minimum path length. See Handbook of Molecular Descriptors, page 428, “substructure descriptors, atom pairs”.
- Parameters:
element_a – str. the first element name.
element_b – str. the second element name.
distance – int. the number of bonds between atoms of the specified elements.
- Returns:
float
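The atom-pair count described above can be sketched without the CCDC toolkit by running a breadth-first search over the bond graph. This is only an illustration of the idea: the element list, the bond list of index pairs and both function names are hypothetical, and the real calculator works on a ccdc.molecule.Molecule instead.

```python
from collections import deque

def topological_distances(bonds, n_atoms, start):
    """Breadth-first search giving the minimum number of bonds from start to each atom."""
    neighbours = {i: [] for i in range(n_atoms)}
    for a, b in bonds:
        neighbours[a].append(b)
        neighbours[b].append(a)
    distances = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in neighbours[current]:
            if nxt not in distances:
                distances[nxt] = distances[current] + 1
                queue.append(nxt)
    return distances

def element_pair_count(elements, bonds, element_a, element_b, distance):
    """Count unordered atom pairs of the given elements whose shortest path is `distance` bonds."""
    count = 0
    n = len(elements)
    for i in range(n):
        dist_from_i = topological_distances(bonds, n, i)
        for j in range(i + 1, n):
            pair = {elements[i], elements[j]}
            if pair == {element_a, element_b} and dist_from_i.get(j) == distance:
                count += 1
    return count

# Hypothetical example: the heavy atoms of ethanol, C-C-O.
ethanol_elements = ["C", "C", "O"]
ethanol_bonds = [(0, 1), (1, 2)]
```

With this input, one C/O pair is separated by one bond and one by two bonds.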
- class ConnectivityIndices(molecule)[source]¶
Connectivity index descriptor calculations.
- class InChIGenerator(include_stereo=True, add_hydrogens=True)[source]¶
Generate InChI molecular descriptors.
The original source of the InChI generation tool is available at https://www.inchi-trust.org/
See Stephen R Heller, Igor Pletnev, Stephen Stein and Dmitrii Tchekhovskoi, J. Cheminformatics, 2015, 7:23 https://doi.org/10.1186/s13321-015-0068-4
- Parameters:
include_stereo – configure the generator to include stereochemistry (True, default) or ignore stereochemistry (False)
add_hydrogens – configure the generator to add hydrogens (True, default) or not add hydrogens (False)
- class InChI(inchi_internal)[source]¶
An InChI object with the following attributes:
- Variables:
success – a boolean to indicate if the InChI generation was successful
inchi – the InChI string
key – the InChI key
errors – a tuple of InChI generation errors
warnings – a tuple of InChI generation warnings
- property add_hydrogens¶
Whether the InChI generator should add missing hydrogens
- generate(structure, include_stereo=None, add_hydrogens=None)[source]¶
Generate InChI.
- Parameters:
structure – a ccdc.crystal.Crystal or ccdc.molecule.Molecule object
include_stereo – set to True or False to override generator’s setting
add_hydrogens – set to True or False to override generator’s setting
- Returns:
a ccdc.descriptors.MolecularDescriptors.InChIGenerator.InChI instance
- Raises:
TypeError if the type of the input structure is invalid
- property include_stereo¶
Whether stereochemistry should be considered.
- class MaximumCommonSubstructure(settings=None)[source]¶
Identifies the maximum common substructure of two molecules.
- class Settings[source]¶
Settings for the MCS calculation.
- property check_bond_count¶
Whether the bond count of an atom should be checked.
- property check_bond_polymeric¶
Whether the polymeric nature of a bond should be checked.
- property check_bond_type¶
Whether the bond type should be checked.
- property check_charge¶
Whether the atom charge should be checked.
- property check_element¶
Whether the element should be checked.
- property check_hydrogen_count¶
Whether the atom’s hydrogen count should be checked.
- property connected¶
Whether the substructure should be connected.
Note that finding disconnected maximal substructures is much slower than finding connected ones.
- property ignore_hydrogens¶
Whether hydrogens should be ignored.
- search(mol1, mol2, only_edges=False, search_step_limit='unlimited')[source]¶
Calculate the maximum common substructure between two molecules.
- Parameters:
mol1, mol2 – ccdc.molecule.Molecule instances.
only_edges – bool. The search will find a maximal common substructure matching only the edges.
search_step_limit – positive integer or ‘unlimited’. Controls the maximum number of steps the algorithm takes.
- Returns:
a pair of tuples, giving matched ccdc.molecule.Atom and ccdc.molecule.Bond instances.
- Raises:
ValueError with invalid input
Note: the running time of this function grows exponentially with molecule size, so it will take a long time on large molecules.
- class Overlay(mol1, mol2, atoms=None, invert=False, rotate_torsions=False, with_symmetry=True, match_elements=True)[source]¶
Overlays two molecules
- property atom_match¶
Returns pairs of atoms from mol1 and mol2 matched in the overlay
- property max_distance¶
Returns the maximum distance between two equivalent atoms in the overlay (Angstroms)
- property molecule¶
Returns input molecule mol2 transformed to overlay onto mol1
- property rmsd¶
Returns RMSD between the two overlaid molecules
- property rmsd_tanimoto¶
Returns Tanimoto RMSD between the two overlaid molecules
- property transformation¶
Returns Molecule.Transformation object required to overlay mol2 over mol1
- class PrincipleAxesAlignedBox(molecule)[source]¶
The bounding box of the molecule aligned on its principal axes.
Each vector of the box has a length equal to the size of the box along that axis. The x_vector is the major axis of the molecule, the y_vector the minor axis and the z_vector the minimal axis of the molecule.
- property aligned_molecule¶
The molecule aligned along its principal axes, with centre at its centre of geometry.
- property volume¶
The volume of the box.
- property x_vector¶
The vector of the major axis of the box.
- property y_vector¶
The vector of the minor axis of the box.
- property z_vector¶
The vector of the minimal axis of the box.
- static atom_angle(a, b, c)[source]¶
Angle subtended by three arbitrary atoms.
- Parameters:
- Returns:
float - the angle in degrees or
None
if one of the atoms has no coordinates
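For orientation, the angle can be computed from raw coordinates with a dot product. The sketch below is not the CCDC implementation: it assumes the angle is measured at the middle point b, represents each atom as an (x, y, z) tuple, and uses None to stand in for an atom without coordinates. The function name is hypothetical.

```python
import math

def angle_degrees(a, b, c):
    """Angle a-b-c in degrees, or None if any point has no coordinates."""
    if a is None or b is None or c is None:
        return None
    ba = [a[i] - b[i] for i in range(3)]
    bc = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(ba, bc))
    norm = math.sqrt(sum(x * x for x in ba)) * math.sqrt(sum(x * x for x in bc))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))

right_angle = angle_degrees((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0))
```

Coincident points give a zero norm and would raise ZeroDivisionError in this sketch.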
- static atom_distance(a, b)[source]¶
Distance between two atoms, irrespective of their parent molecules.
- Parameters:
- Returns:
float or
None
if one of the atoms has no coordinates
- static atom_plane(*atoms)[source]¶
Define a plane from the coordinates of the atoms.
- Parameters:
atoms – there must be at least three
ccdc.molecule.Atom
in the arguments.
- static atom_torsion_angle(a, b, c, d)[source]¶
Plane angle subtended by the triples abc and bcd.
- Parameters:
- Returns:
float - the angle in degrees or
None
if one of the atoms has no coordinates
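A torsion angle can be sketched from four sets of coordinates using the usual atan2 formulation. The code below is illustrative only; the sign convention (positive for a clockwise rotation when looking from b to c) is an assumption about what the CCDC method returns, and the function names are hypothetical.

```python
import math

def _sub(u, v):
    return [u[i] - v[i] for i in range(3)]

def _cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def _dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def torsion_degrees(a, b, c, d):
    """Signed dihedral angle between the planes abc and bcd, in degrees."""
    b1 = _sub(b, a)
    b2 = _sub(c, b)
    b3 = _sub(d, c)
    n1 = _cross(b1, b2)  # normal to the plane abc
    n2 = _cross(b2, b3)  # normal to the plane bcd
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

gauche_like = torsion_degrees((1.0, 0.0, 0.0), (0.0, 0.0, 0.0),
                              (0.0, 0.0, 1.0), (0.0, 1.0, 1.0))
```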
- static atom_vector(atom0, atom1)[source]¶
Define the vector from atom0 to atom1.
- Parameters:
atom0 –
ccdc.molecule.Atom
atom1 –
ccdc.molecule.Atom
- Returns:
- Raises:
RuntimeError if either atom has no coordinates.
- static bond_length(bond)[source]¶
The length of a bond.
- Parameters:
bond –
ccdc.molecule.Bond
- Returns:
float, or
None
if an atom of the bond has no coordinates
- static overlay(mol1, mol2, atoms=None, invert=False, rotate_torsions=False, with_symmetry=True)[source]¶
Overlay mol2 on mol1. Deprecated and replaced with ccdc.descriptors.MolecularDescriptors.Overlay.
- Parameters:
mol1 – a ccdc.molecule.Molecule instance
mol2 – a ccdc.molecule.Molecule instance
atoms – a list of pairs of atoms to use in the overlay, or None for all atoms to be used
invert – allow inversions in the overlay
rotate_torsions – allow torsional rotations when overlaying
with_symmetry – take account of symmetry when overlaying atoms
- Returns:
a
ccdc.molecule.Molecule
instance which is a copy of mol2 overlaid on mol1
Note: if with_symmetry is true, and matching atoms are provided, then the matching atoms need to form a connected structure.
- static overlay_rmsds_and_transformation(mol1, mol2, atoms=None, invert=False, rotate_torsions=False, with_symmetry=True)[source]¶
Overlay mol2 on mol1 and return properties of the overlay. Deprecated and replaced with ccdc.descriptors.MolecularDescriptors.Overlay.
- Parameters:
mol1 – a ccdc.molecule.Molecule instance
mol2 – a ccdc.molecule.Molecule instance
atoms – a list of pairs of atoms to use in the overlay, or None for all atoms to be used
invert – allow inversions in the overlay
rotate_torsions – allow torsional rotations when overlaying
with_symmetry – take account of symmetry when overlaying atoms
- Returns:
a tuple containing a
ccdc.molecule.Molecule
instance which is a copy of mol2 overlaid on mol1 as entry 0, the rmsd as entry 1, the Tanimoto rmsd as entry 2 and the overlay transformation as entry 3
- static point_group_analysis(mol)[source]¶
Return Schoenflies notation of the point group symmetry of a molecule.
The point group symmetry is returned as a tuple of:
order (e.g. 1)
symbol (e.g. ‘C1’)
description (e.g. ‘Objects in this point group have no symmetry.’)
- Parameters:
mol –
ccdc.molecule.Molecule
- Returns:
(int, str, str)
- static ring_centroid(ring)[source]¶
The centroid of the ring’s atoms.
- Parameters:
ring –
ccdc.molecule.Molecule.Ring
- static ring_plane(ring)[source]¶
The plane of the ring’s atoms.
- Parameters:
ring –
ccdc.molecule.Molecule.Ring
- static rmsd(mol1, mol2, atoms=None, overlay=False, exclude_hydrogens=True, with_symmetry=True)[source]¶
Return the RMSD of two molecules.
Both molecules should have the same atoms if atoms is None.
- Parameters:
atoms – a list of pairs of ccdc.molecule.Atom instances, or None
overlay – Whether to overlay the molecules before calculating RMSD
exclude_hydrogens – Whether all-atom or heavy atom RMSD should be calculated
with_symmetry – Whether to allow symmetrical matches
- Returns:
float
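The RMSD itself is a simple quantity once the atoms have been paired. The sketch below computes it for two equally long lists of (x, y, z) tuples. It performs no overlay, no hydrogen filtering and no symmetry matching, all of which the CCDC function can do; the function name and the coordinates are made up for illustration.

```python
import math

def paired_rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two lists of paired 3D points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum((p - q) ** 2
                  for point_a, point_b in zip(coords_a, coords_b)
                  for p, q in zip(point_a, point_b))
    return math.sqrt(squared / len(coords_a))

# Hypothetical example: the second set is the first shifted by 1 Angstrom along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
value = paired_rmsd(reference, shifted)
```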
- class ccdc.descriptors.GeometricDescriptors[source]¶
A namespace to hold geometric classes and functions.
- class Plane(vector, distance, _plane=None)[source]¶
A plane in 3D.
- property distance¶
The distance of the plane from the origin.
- property normal¶
The normal to the plane.
- property plane_vector1¶
A vector in the plane, normal to the plane’s normal.
- property plane_vector2¶
A vector in the plane, normal to both the plane’s normal and the plane’s plane_vector1.
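To make the relationship between the normal, the distance and the two in-plane vectors concrete, here is a minimal sketch that builds them for the plane through three points. It is not the CCDC Plane class: the function name, the choice of the first edge as the first in-plane vector, and the sign convention for the distance are all assumptions.

```python
import math

def _sub(u, v):
    return [a - b for a, b in zip(u, v)]

def _cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def _unit(u):
    length = math.sqrt(sum(x * x for x in u))
    return [x / length for x in u]

def plane_from_points(p0, p1, p2):
    """Return (normal, distance, in_plane_1, in_plane_2) for the plane through three points."""
    edge = _sub(p1, p0)
    normal = _unit(_cross(edge, _sub(p2, p0)))
    # Signed distance of the plane from the origin, measured along the normal.
    distance = sum(n * c for n, c in zip(normal, p0))
    in_plane_1 = _unit(edge)
    # Perpendicular to both the normal and the first in-plane vector.
    in_plane_2 = _cross(normal, in_plane_1)
    return normal, distance, in_plane_1, in_plane_2

normal, distance, v1, v2 = plane_from_points((0.0, 0.0, 2.0), (1.0, 0.0, 2.0), (0.0, 1.0, 2.0))
```

Three collinear points do not define a plane and would raise ZeroDivisionError here.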
- class Quaternion(_quaternion=None)[source]¶
A normalised quaternion suitable for expressing rotations in 3D space.
Quaternions are a convenient method for expressing a complex sequence of rotations.
By default this constructs a unit quaternion.
- static from_dimensions(q0, q1, q2, q3)[source]¶
Create from 4 real numbers. The quaternion will be normalised to unit length.
- Parameters:
q0, q1, q2, q3 – the 4 dimensions of the axes 1, i, j and k
- Raises:
ValueError with invalid input (e.g. if the length of the quaternion is 0)
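The normalisation described above, and the way a unit quaternion rotates a vector, can be sketched in a few lines. This is an illustration under stated assumptions, not the CCDC Quaternion class: the function names are hypothetical, and q0 is taken as the scalar part with (q1, q2, q3) along i, j and k.

```python
import math

def normalised_quaternion(q0, q1, q2, q3):
    """Scale four real numbers to a unit quaternion; reject a zero-length input."""
    length = math.sqrt(q0 * q0 + q1 * q1 + q2 * q2 + q3 * q3)
    if length == 0.0:
        raise ValueError("cannot normalise a quaternion of length 0")
    return (q0 / length, q1 / length, q2 / length, q3 / length)

def _cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def rotate(quaternion, vector):
    """Rotate a 3D vector by a unit quaternion: v' = v + 2w(u x v) + 2u x (u x v)."""
    w = quaternion[0]
    u = list(quaternion[1:])
    uv = _cross(u, vector)
    uuv = _cross(u, uv)
    return [vector[i] + 2.0 * w * uv[i] + 2.0 * uuv[i] for i in range(3)]

# (1, 0, 0, 1) normalises to a 90 degree rotation about the k (z) axis.
quarter_turn = normalised_quaternion(1.0, 0.0, 0.0, 1.0)
rotated = rotate(quarter_turn, [1.0, 0.0, 0.0])
```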
- static from_euler_angles(alpha, beta, gamma, unit='degrees')[source]¶
Create a quaternion from a set of Euler angles.
Note
The powder pattern, morphology, hydrogen-bond coordination and graph set features are available only to CSD-Materials and CSD-Enterprise users.
- class ccdc.descriptors.CrystalDescriptors[source]¶
Namespace for crystallographic descriptors.
- class GraphSetSearch(settings=None)[source]¶
Finds the graph sets of a crystal.
- class GraphSet(_graph_set_atoms, _view)[source]¶
An individual graph set.
- property degree¶
The degree of the graph set, i.e. the number of atoms involved.
- property descriptor¶
The descriptor of the graph set.
- property edge_labels¶
The edge labels of the graph set.
The labels are arbitrary letters identifying a unique hydrogen bond, separated by ‘>’ or ‘<’ indicating the donor-acceptor direction.
- property hbonds¶
The hydrogen bonds of the graph set.
- Returns:
a tuple of
ccdc.crystal.Crystal.HBond
instances.
- property label_set¶
The set of hydrogen bond labels found in the graph set.
- property nacceptors¶
The number of acceptors involved in the graph set.
- property ndonors¶
The number of donors involved in the graph set.
- property nmolecules¶
The number of molecules involved in the graph set.
- property period¶
The period of the graph set, i.e. the number of hydrogen bonds in the repeat unit.
If the type of the graph set is not a chain or a ring, this will be -1.
- class Settings(hbond_criterion=None)[source]¶
Configurable settings for the graph set analyser.
- property angle_tolerance¶
The tolerance of the HBond angle.
- property distance_range¶
Allowable distance range for a HBond to be formed.
- property intermolecular¶
Whether HBonds should be intermolecular, intramolecular, or any.
- level = 2¶
Deepest level to search. This is the number of different HBonds involved.
- max_chain_size = 4¶
Longest chain to search.
- max_discrete_chain_size = 4¶
Longest discrete chain to search.
- max_ring_size = 6¶
Largest ring to search.
- property path_length_range¶
The shortest and longest bond-path separation for intramolecular contacts.
- property require_hydrogens¶
Whether Hydrogens are required for the HBond.
- property vdw_corrected¶
Whether the distance range is Van der Waals corrected.
- search(crystal)[source]¶
Find all graph sets for the crystal subject to the constraints of the settings.
- Parameters:
crystal – ccdc.crystal.Crystal instance.
- Returns:
a tuple of ccdc.descriptors.CrystalDescriptors.GraphSetSearch.GraphSet instances.
- class HBondCoordination(settings=None, skip_telemetry=False)[source]¶
Calculate HBond coordination predictions.
The HBondCoordination class is available only to CSD-Materials and CSD-Enterprise users.
- class Predictions(crystal, _analysis, _predictions)[source]¶
The predictions for HBond coordinations.
- class Observation(label, coordination_count, probability)¶
- coordination_count¶
Alias for field number 1
- count(value, /)¶
Return number of occurrences of value.
- index(value, start=0, stop=9223372036854775807, /)¶
Return first index of value.
Raises ValueError if the value is not present.
- label¶
Alias for field number 0
- probability¶
Alias for field number 2
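The "Alias for field number" entries and the count and index methods are the standard behaviour of a Python named tuple. The stand-in below only mirrors the documented field order so that the access patterns can be tried without the CCDC toolkit; the label and numbers are made up.

```python
from collections import namedtuple

# Stand-in with the documented field order; this is not the CCDC class itself.
Observation = namedtuple("Observation", ["label", "coordination_count", "probability"])

obs = Observation(label="O1", coordination_count=2, probability=0.75)  # hypothetical values

# Fields can be read by name, by position, or by unpacking.
by_name = obs.probability
by_position = obs[2]
label, coordination_count, probability = obs
```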
- functional_groups_of_hbond(hbond)[source]¶
The functional group pertaining to a hydrogen-bonding atom.
- property is_valid¶
Whether or not valid predictions were made.
- property observed¶
The predicted probabilities of observed HBonds.
- class HBondPropensities(settings=None)[source]¶
Calculates HBond propensities.
- class FittingData(_fitting_data=None, identifiers=None, databases=None)[source]¶
The collection of entries used for the prediction.
- class FittingDataEntry(_fitting_item)[source]¶
An individual entry with associated matching data.
- property identifier¶
The identifier of the fitting data item.
- advice_comment(functional_group=None)[source]¶
A string indicating whether or not there are enough data for propensity predictions.
Note: when first made the fitting data has not performed substructure matching, so results for particular groups will be inappropriately bad. Results will be valid after
ccdc.hbond_coordination.CrystalDescriptors.HBondPropensities.match_fitting_data()
has been called.
- class FunctionalGroup(_model_group)[source]¶
A functional group capable of hydrogen bonding.
- property identifier¶
The name of the functional group.
- class HBond(hbp, _outcome)[source]¶
A putative HBond in the propensity calculation.
- Variables:
self.donor – the CrystalDescriptors.HBondPropensities.HBondDonor involved in this hbond.
self.acceptor – the CrystalDescriptors.HBondPropensities.HBondAcceptor involved in this hbond.
self.propensity – the CrystalDescriptors.HBondPropensities.Propensity of this hbond.
- class HBondAcceptor(_analysis)[source]¶
A potential acceptor atom.
This class will be augmented with the evidence found during match_fitting_data().
- property acceptor_atom_type¶
A string representation of the atom’s acceptor type.
- property accessible_surface_area¶
The accessible surface area of the HBond atom.
- property atom¶
The
ccdc.molecule.Atom
of the HBondAtom.
- property functional_group_identifier¶
The identifier of the functional group for this atom.
- property identifier¶
The full identifier of this atom.
- property label¶
The label of the atom in the original structure.
- property nlone_pairs¶
The number of lone pairs associated with this atom.
- class HBondAtom(_analysis)[source]¶
Base class for HBondDonor and HBondAcceptor.
- property accessible_surface_area¶
The accessible surface area of the HBond atom.
- property atom¶
The
ccdc.molecule.Atom
of the HBondAtom.
- property functional_group_identifier¶
The identifier of the functional group for this atom.
- property identifier¶
The full identifier of this atom.
- property label¶
The label of the atom in the original structure.
- property nlone_pairs¶
The number of lone pairs associated with this atom.
- class HBondDonor(_analysis)[source]¶
A potential donor atom.
This class will be augmented with the evidence found during match_fitting_data().
- property accessible_surface_area¶
The accessible surface area of the HBond atom.
- property atom¶
The
ccdc.molecule.Atom
of the HBondAtom.
- property donor_atom_type¶
A string representation of the atom’s donor type.
- property functional_group_identifier¶
The identifier of the functional group for this atom.
- property identifier¶
The full identifier of this atom.
- property label¶
The label of the atom in the original structure.
- property nlone_pairs¶
The number of lone pairs associated with this atom.
- class HBondGrouping(hbond_propensities, _outcome)[source]¶
A grouping of interactions between donors and acceptors representing a possible hbond network.
This represents a point in the chart of Mercury’s HBondPropensity wizard.
- Variables:
self.donors – a tuple of CrystalDescriptors.HBondPropensities.HBondDonor with specified coordination outcomes for this grouping.
self.acceptors – a tuple of CrystalDescriptors.HBondPropensities.HBondAcceptor with specified coordination outcomes for this grouping.
self.hbonds – a tuple of CrystalDescriptors.HBondPropensities.HBond forming the hbonds in the network.
self.hbond_score – the average propensity value of hbonds.
self.coordination_score – the negative average coordination score of the donors and acceptors with these coordination outcomes.
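Read literally, the two scores are plain averages. The sketch below transcribes that reading for lists of numbers; how the CCDC code obtains the per-atom coordination scores is not shown here, and the function names and input values are hypothetical.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)

def grouping_hbond_score(hbond_propensities):
    """Average propensity over the hydrogen bonds in a grouping."""
    return mean(hbond_propensities)

def grouping_coordination_score(atom_coordination_scores):
    """Negative average coordination score over the donors and acceptors in a grouping."""
    return -mean(atom_coordination_scores)

# Hypothetical grouping with two hydrogen bonds and three donor/acceptor atoms.
example_hbond_score = grouping_hbond_score([0.5, 0.25])
example_coordination_score = grouping_coordination_score([0.5, 0.25, 0.75])
```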
- class InterPropensity(hbp, _row)[source]¶
Predicted propensity for a single HBond.
- property acceptor_component¶
The component number of the acceptor in the target structure.
- property acceptor_label¶
The label of the acceptor atom.
- property acceptor_rank¶
The rank number of the acceptor.
- property bounds¶
The lower and upper bounds of the prediction.
- property donor_component¶
The component number of the donor in the target structure.
- property donor_label¶
The label of the donor atom.
- property donor_rank¶
The rank number of the donor.
- property hbond_count¶
The number of instances of the hbond observed in the target structure.
- property is_acceptor_bifurcated¶
Whether the acceptor is bifurcated in the target structure.
- property is_donor_bifurcated¶
Whether the donor is bifurcated in the target structure.
- property is_intermolecular¶
Whether or not the predicted propensity is for an intermolecular HBond.
- property is_observed¶
Whether the hbond is observed in the target structure.
- property predictive_error¶
The error in the prediction.
- property propensity¶
The predicted value.
- property scores¶
The calculated values and statistics for the hbond prediction.
- property uncertainty¶
The uncertainty in the prediction.
- class IntraPropensity(hbp, _row)[source]¶
Predicted propensity for an intramolecular HBond.
- property acceptor_component¶
The component number of the acceptor in the target structure.
- property acceptor_label¶
The label of the acceptor atom.
- property acceptor_rank¶
The rank number of the acceptor.
- property bounds¶
The lower and upper bounds of the prediction.
- property donor_component¶
The component number of the donor in the target structure.
- property donor_label¶
The label of the donor atom.
- property donor_rank¶
The rank number of the donor.
- property hbond_count¶
The number of instances of the hbond observed in the target structure.
- property is_acceptor_bifurcated¶
Whether the acceptor is bifurcated in the target structure.
- property is_donor_bifurcated¶
Whether the donor is bifurcated in the target structure.
- property is_intermolecular¶
Whether or not the predicted propensity is for an intermolecular HBond.
- property is_observed¶
Whether the hbond is observed in the target structure.
- property predictive_error¶
The error in the prediction.
- property propensity¶
The predicted value.
- property scores¶
The calculated values and statistics for the hbond prediction.
- property uncertainty¶
The uncertainty in the prediction.
- class Model(_model)[source]¶
The logistic regression model.
- class Coefficient(_coefficient)[source]¶
A coefficient of the regression model.
- property confidence_interval¶
The upper and lower bounds of the coefficient.
- property estimate¶
The estimate of the coefficient.
- property identifier¶
The identifier of the coefficient.
- property is_baseline¶
Whether or not the coefficient is used for the baseline calculation.
- property p_value¶
P-value of the coefficient.
- property significance_code¶
A string representation of how significant the parameter is.
‘***’ for P-value < 0.001, ‘**’ < 0.01, ‘*’ < 0.05 and ‘.’ < 0.1
- property standard_error¶
Standard error of the coefficient.
- property z_value¶
Z-value of the coefficient.
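The coefficient statistics listed above are related in the usual way for a Wald test. The sketch below shows those textbook relationships; that the CCDC model reports a two-sided p-value and a 95% interval is an assumption, and the function name and numbers are illustrative.

```python
import math

def wald_statistics(estimate, standard_error, z_critical=1.959963984540054):
    """Return (z_value, two_sided_p_value, confidence_interval) for one coefficient."""
    z_value = estimate / standard_error
    # Two-sided p-value under a standard normal: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z_value) / math.sqrt(2.0))
    # The default critical value gives a 95% interval (an assumption about the level).
    half_width = z_critical * standard_error
    return z_value, p_value, (estimate - half_width, estimate + half_width)

# Hypothetical coefficient: estimate 1.96 with a standard error of 1.0.
z_value, p_value, interval = wald_statistics(1.96, 1.0)
```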
- class Parameter(_crystal_structure_property)[source]¶
A named parameter of the regression.
- calculate(donor, acceptor)[source]¶
The value of this property for the pair of atoms.
- Parameters:
donor – ccdc.hbond_coordination.CrystalDescriptors.HBondPropensities.HBondDonor instance.
acceptor – ccdc.hbond_coordination.CrystalDescriptors.HBondPropensities.HBondAcceptor instance.
- Returns:
float
- property identifier¶
The identifier of the parameter.
- property advice_comment¶
A string representing the quality of the discrimination based on the ROC.
- property akaike_information_criterion¶
The Akaike Information Criterion (AIC) of the model.
- property area_under_roc_curve¶
Area under the ROC curve.
- property coefficients¶
The coefficients of the model.
- property equation¶
The regression equation.
- property log_likelihood¶
The log likelihood of the model.
- property log_likelihood_test_p_value¶
The P-value of the log likelihood of the model.
- property null_deviance¶
The null deviance of the model.
- property null_deviance_degrees_of_freedom¶
The degrees of freedom of the null deviance of the model.
- property residual_deviance¶
The residual deviance of the model.
- property residual_deviance_degrees_of_freedom¶
The number of degrees of freedom of the residual deviance of the model.
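For readers unfamiliar with the model statistics above, the following sketch shows how a logistic regression equation turns parameter values into a probability, and how the log likelihood, AIC and deviance follow from the fitted probabilities. These are the textbook definitions, assumed rather than confirmed to match the CCDC calculation; all names and numbers are illustrative.

```python
import math

def logistic(x):
    """Map a linear predictor onto a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

def predicted_probability(intercept, coefficients, values):
    """Apply a regression equation: logistic(intercept + sum(coefficient * value))."""
    linear = intercept + sum(c * v for c, v in zip(coefficients, values))
    return logistic(linear)

def log_likelihood(probabilities, outcomes):
    """Bernoulli log likelihood of observed 0/1 outcomes given predicted probabilities."""
    return sum(math.log(p) if y else math.log(1.0 - p)
               for p, y in zip(probabilities, outcomes))

def akaike_information_criterion(log_lik, n_parameters):
    """AIC = 2k - 2 ln(L)."""
    return 2.0 * n_parameters - 2.0 * log_lik

def deviance(log_lik):
    """For ungrouped 0/1 outcomes the saturated model has log likelihood 0."""
    return -2.0 * log_lik

# Hypothetical data: two observations, both predicted with probability 0.5.
ll = log_likelihood([0.5, 0.5], [1, 0])
aic = akaike_information_criterion(ll, n_parameters=2)
```

The null deviance is the same deviance evaluated for an intercept-only model, whose predicted probability is the observed fraction of positive outcomes.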
- class Propensity(hbp, _row)[source]¶
Base class for inter- and intra-molecular propensity predictions.
- Variables:
self.donor – the
CrystalDescriptors.HBondPropensities.HBondDonor
involved in this putative hbond.self.acceptor – the
CrystalDescriptors.HBondPropensities.HBondAcceptor
involved in this putative hbond.self.hbond – the
ccdc.crystal.Crystal.HBond
from the target structure if this hbond is observed.
- property acceptor_component¶
The component number of the acceptor in the target structure.
- property acceptor_label¶
The label of the acceptor atom.
- property acceptor_rank¶
The rank number of the acceptor.
- property bounds¶
The lower and upper bounds of the prediction.
- property donor_component¶
The component number of the donor in the target structure.
- property donor_label¶
The label of the donor atom.
- property donor_rank¶
The rank number of the donor.
- property hbond_count¶
The number of instances of the hbond observed in the target structure.
- property is_acceptor_bifurcated¶
Whether the acceptor is bifurcated in the target structure.
- property is_donor_bifurcated¶
Whether the donor is bifurcated in the target structure.
- property is_observed¶
Whether the hbond is observed in the target structure.
- property predictive_error¶
The error in the prediction.
- property propensity¶
The predicted value.
- property scores¶
The calculated values and statistics for the hbond prediction.
- property uncertainty¶
The uncertainty in the prediction.
- class Settings[source]¶
Pertaining to HBond propensity calculation.
- property databases¶
The databases to be used for the prediction.
Note: the databases MUST be SQLite ASER databases for the moment.
- property limit_identifier_list¶
A list of identifiers to limit the search
- property working_directory¶
The working directory for the predictions.
- calculate_propensities(crystal=None)[source]¶
Apply the regression equation to a crystal.
- Parameters:
crystal –
ccdc.crystal.Crystal
instance or None. If None the target structure will be used.
- property fitting_data¶
The fitting data.
- generate_hbond_groupings(min_donor_prob=None, min_acceptor_prob=None)[source]¶
Generate all possible permutations of donors and acceptors to create all possible hbond groupings.
- hbond_atoms(crystal=None)[source]¶
The HBondDonor and HBondAcceptor atoms of a crystal.
- Parameters:
crystal – ccdc.crystal.Crystal instance, or None, in which case the HBondAtoms of the target will be returned.
- Returns:
a pair of tuples of ccdc.descriptors.CrystalDescriptors.HBondPropensities.HBondDonor and ccdc.descriptors.CrystalDescriptors.HBondPropensities.HBondAcceptor.
- make_fitting_data()[source]¶
Deprecated method. Please use match_fitting_data, or use CrystalDescriptors.HBondPropensities.FittingData.from_file to limit the entries that are searched.
Returns an object that will cause all of the database entries to be searched.
- match_fitting_data(count=None, verbose=False)[source]¶
Reduces fitting data down such that each functional group has at least the specified number of examples.
- property propensities¶
The inter- and intra-propensities of the prediction.
- set_target(crystal)[source]¶
Sets a single target for the propensity calculation.
- Parameters:
crystal – a
ccdc.crystal.Crystal
instance.
- class Morphology(crystal=None)[source]¶
- class Facet(_facet, _perpendicular_distance, _miller_indices)¶
One of the facets of a morphology.
- property area¶
The area of the polygon.
- property centre_of_geometry¶
The centre of geometry of the facet.
- property coordinates¶
The coordinates of the vertices of the facet.
- property edges¶
The edges making up the facet.
- property miller_indices¶
The Miller indices of the facet.
- property perpendicular_distance¶
The perpendicular distance from the origin.
- property plane¶
The plane of the facet.
This is a
ccdc.descriptors.GeometricDescriptors.Plane
instance.
- class OrientedBoundingBox(morphology)¶
The bounding box of the morphology.
This box is not necessarily axis-aligned.
- property corners¶
The eight points forming the corners of the bounding box.
- property major_length¶
The length of the major axis of the bounding box.
- property median_length¶
The length of the middle axis of the bounding box.
- property minor_length¶
The length of the minor axis of the bounding box.
- property volume¶
The volume of the bounding box.
- property bounding_box¶
The bounding box of the morphology.
A pair of
ccdc.molecule.Coordinates
representing the minimum and maximum corners of the box.
- property centre_of_geometry¶
The centroid of the morphology.
- property facets¶
The facets making up the morphology.
- static from_file(file_name)¶
Creates a Morphology instance from a CIF file.
The CIF file should be one written by this class or by Mercury, which includes a scaling for each of the perpendicular distances.
- static from_growth_rates(crystal, growth_rates)¶
Creates a morphology from an iterable of growth rates.
- Parameters:
crystal – an instance of ccdc.crystal.Crystal.
growth_rates – an iterable of pairs, ccdc.crystal.Crystal.MillerIndices and perpendicular distance, otherwise known as morphological importance.
- property oriented_bounding_box¶
The minimum volume box of the morphology.
This will not necessarily be aligned to the orthonormal cartesian axes.
- relative_area(miller_indices)¶
The relative area of the facet.
This is what is usually called the Morphological Importance of a facet.
- property scale_factor¶
The factor by which the morphology is scaled.
- property volume¶
The volume of the morphology.
This is calculated stochastically, rather than analytically, so has some error.
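A stochastic volume estimate of this kind can be illustrated with simple Monte Carlo sampling. The sketch below is not the CCDC algorithm: it assumes the morphology is a convex polyhedron described by (unit normal, perpendicular distance) pairs, samples points uniformly in a bounding cube, and counts the fraction that fall inside every facet. All names and parameter values are hypothetical.

```python
import random

def monte_carlo_volume(facets, half_extent, n_samples=20000, seed=0):
    """Estimate the volume of the region where normal . point <= distance for every facet."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    inside = 0
    for _ in range(n_samples):
        point = [rng.uniform(-half_extent, half_extent) for _ in range(3)]
        if all(sum(n * p for n, p in zip(normal, point)) <= distance
               for normal, distance in facets):
            inside += 1
    box_volume = (2.0 * half_extent) ** 3
    return box_volume * inside / n_samples

# Hypothetical morphology: a cube with six facets at perpendicular distance 1 (true volume 8).
cube_facets = [
    ((1.0, 0.0, 0.0), 1.0), ((-1.0, 0.0, 0.0), 1.0),
    ((0.0, 1.0, 0.0), 1.0), ((0.0, -1.0, 0.0), 1.0),
    ((0.0, 0.0, 1.0), 1.0), ((0.0, 0.0, -1.0), 1.0),
]
estimate = monte_carlo_volume(cube_facets, half_extent=2.0)
```

The error of such an estimate shrinks roughly with the square root of the number of samples, which is why a stochastic volume carries some uncertainty.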
- write(file_name, keep_all_indices=False)¶
Write this morphology to CIF file.
- class PoreAnalyser(crystal, settings=None)[source]¶
Performs pore analysis. crystal is a ccdc.crystal.Crystal instance.
- class Flags[source]¶
Flags for validity of cached variables
- property calculator_is_valid¶
Whether the cached calculator is valid
- class Settings[source]¶
Settings for PoreAnalyser
- property cutoff_distance¶
Cut-off distance (A)
- property grid_spacing¶
grid spacing (A)
- property he_probe_epsilon¶
UFF L-J epsilon/k for He probe (K)
- property he_probe_sigma¶
UFF L-J sigma for He probe (A)
- property n_probe_sigma¶
UFF L-J sigma for N probe (A)
- property samples_per_atom¶
Sample size for surface area calculation
- property temperature¶
Temperature (K)
- property max_pore_diameter¶
Result: Max pore diameter (A)
- property network_accessible_geometric_volume¶
Result: Network accessible geometric pore volume (A^3)
- property network_accessible_helium_volume¶
Result: Network accessible He pore volume (A^3)
- property network_accessible_surface_area¶
Result: network accessible surface area (A^2)
- property network_accessible_surface_area_per_mass¶
Result: network accessible surface area per mass (m^2/g)
- property network_accessible_surface_area_per_volume¶
Result: network accessible surface area per volume (m^2/cm^3)
- property num_percolated_dimensions¶
Result: Number of percolated dimensions
- property pore_limiting_diameter¶
Result: Pore limiting diameter (A)
- property system_density¶
Result: density (g/cm^3)
- property system_mass¶
Result: mass of unit cell (g/mol)
- property system_volume¶
Result: volume of unit cell (A^3)
- property total_geometric_volume¶
Result: geometric pore volume (A^3)
- property total_helium_volume¶
Result: He pore volume (A^3)
- property total_surface_area¶
Result: surface area (A^2)
- property total_surface_area_per_mass¶
Result: surface area per mass (m^2/g)
- property total_surface_area_per_volume¶
Result: surface area per volume (m^2/cm^3)
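The per-mass and per-volume results use different units from the raw values, so the conversions are worth spelling out. The sketch below shows the standard unit arithmetic, assuming the mass is that of one unit cell expressed in g/mol as stated above; whether the CCDC code applies exactly these conversions is an assumption, and the function names and numbers are illustrative.

```python
AVOGADRO = 6.02214076e23  # mol^-1, exact SI value

def density_g_per_cm3(mass_g_per_mol, volume_A3):
    """Density from unit cell mass (g/mol) and volume (A^3); 1 A^3 = 1e-24 cm^3."""
    return mass_g_per_mol / (AVOGADRO * volume_A3 * 1e-24)

def area_per_mass_m2_per_g(area_A2, mass_g_per_mol):
    """Surface area per mass; 1 A^2 = 1e-20 m^2 and one cell weighs mass / N_A grams."""
    return area_A2 * 1e-20 * AVOGADRO / mass_g_per_mol

def area_per_volume_m2_per_cm3(area_A2, volume_A3):
    """Surface area per volume in m^2/cm^3."""
    return (area_A2 * 1e-20) / (volume_A3 * 1e-24)

# Hypothetical cell: 100 A^2 of surface, 1000 A^3 of volume, 602.214076 g/mol.
rho = density_g_per_cm3(602.214076, 1000.0)
per_mass = area_per_mass_m2_per_g(100.0, 602.214076)
per_volume = area_per_volume_m2_per_cm3(100.0, 1000.0)
```

As a consistency check, the area per mass multiplied by the density equals the area per volume.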
- class PowderPattern(_pattern, _settings=None, _simulation=None, _crystal=None)[source]¶
Represents a powder pattern.
The powder pattern class is available only to CSD-Materials and CSD-Enterprise users.
- class PreferredOrientation(values=None, _function=None)[source]¶
A preferred orientation for PXRD simulation.
- property h¶
The miller indices h value of the preferred orientation.
- property k¶
The miller indices k value of the preferred orientation.
- property l¶
The miller indices l value of the preferred orientation.
- property r¶
The March-Dollase r value of the preferred orientation.
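For context, the commonly quoted March-Dollase correction scales a reflection's intensity according to the angle between its scattering vector and the preferred orientation direction. The sketch below evaluates that textbook expression. Whether the CCDC simulation applies exactly this form is an assumption, and the function name is hypothetical.

```python
import math

def march_dollase_factor(r, alpha_degrees):
    """March-Dollase factor (r^2 cos^2(alpha) + sin^2(alpha) / r)^(-3/2).

    alpha is the angle between the preferred orientation direction (h, k, l)
    and the scattering vector of the reflection being corrected.
    """
    alpha = math.radians(alpha_degrees)
    return (r * r * math.cos(alpha) ** 2 + math.sin(alpha) ** 2 / r) ** -1.5

no_texture = march_dollase_factor(1.0, 30.0)   # r = 1 leaves intensities unchanged
enhanced = march_dollase_factor(0.5, 0.0)      # reflection along the preferred direction
```

In this form, r below 1 enhances reflections along the preferred direction and r above 1 suppresses them.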
- class Settings(_settings=None)[source]¶
Settings which may be set for a Powder simulation.
- property deuterium_is_hydrogen¶
Whether to treat deuterium as hydrogen.
- Returns:
True or False
- property fast_peak_shape: bool¶
Whether to use a fast but less accurate peak shape calculation during simulation
The peak shape will be applied using fast fourier transform convolution of the peak shape function. This is faster but less accurate than the default convolution method. The resulting peaks will tend to be wider.
- Returns:
True or False
- property full_width_at_half_maximum¶
The full width at half maximum of peaks to use in simulation.
- Returns:
float representing the full width at half maximum of peaks (in degrees)
- property include_hydrogens¶
Whether to include hydrogens in the simulation.
- Returns:
True or False
- property march_dollase_preferred_orientation¶
Setting for march_dollase.
- Returns:
a
PXRDMatchOptimiser.Settings.PreferredOrientation
or None
The default value is None. This can be set with a tuple of (h, k, l, r).
- property second_wavelength¶
Set or get the secondary wavelength
- Parameters:
value – float or pair of floats (wavelength and scale factor) or another Wavelength object or None to remove the secondary wavelength
- Returns:
Secondary wavelength object (or None) for the simulation
- property slit_type: str¶
The type of slit to be simulated.
This may be ‘fixed’ (default) or ‘variable’.
- Returns:
string representing type of slit to be simulated
- property two_theta_maximum¶
Where to end the pattern simulation.
- Returns:
float representing the maximum 2-theta (in degrees)
- property two_theta_minimum¶
Where to start the pattern simulation.
- Returns:
float representing the minimum 2-theta (in degrees)
- property two_theta_step¶
The step size used in the pattern simulation.
- Returns:
float representing the step size (in degrees)
- property wavelength¶
Set or get the primary wavelength
- Parameters:
value – float or pair of floats (wavelength and scale factor) or another Wavelength object or None to reset to the default
- Returns:
Primary wavelength object (or None) for the simulation
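A minimal sketch of how these settings might be filled in before a simulation. It assumes every property used here is writable, which the text above states explicitly only for the wavelengths and the March-Dollase tuple. The helper name and the values are illustrative, and a stand-in object is used so the sketch runs without a ccdc installation.

```python
from types import SimpleNamespace

def configure_settings(settings):
    """Apply example values to a Settings-like object and return it."""
    settings.two_theta_minimum = 5.0      # degrees
    settings.two_theta_maximum = 50.0     # degrees
    settings.two_theta_step = 0.02        # degrees
    settings.full_width_at_half_maximum = 0.1
    settings.include_hydrogens = True
    settings.march_dollase_preferred_orientation = (0, 0, 1, 0.8)  # (h, k, l, r)
    settings.wavelength = 1.54056
    return settings

# Stand-in object; with ccdc installed the intended use would be:
#   from ccdc.descriptors import CrystalDescriptors
#   settings = configure_settings(CrystalDescriptors.PowderPattern.Settings())
#   pattern = CrystalDescriptors.PowderPattern.from_crystal(crystal, settings)
demo = configure_settings(SimpleNamespace())
```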
- class TickMark(_tick, _crystal=None)[source]¶
A tick mark in a simulated powder pattern.
- property is_systematically_absent¶
Whether this tick mark is systematically absent.
- property miller_indices¶
The Miller indices of this tick mark.
- property two_theta¶
Two theta value of this tick.
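A tick mark's two theta value can be related to an interplanar spacing through Bragg's law, lambda = 2 d sin(theta). The helpers below are a standalone sketch of that conversion. They are not part of the TickMark class, and the default wavelength of 1.54056 is only an example value.

```python
import math

def d_spacing(two_theta_degrees, wavelength=1.54056):
    """Interplanar spacing d from a 2-theta position (Bragg's law, first order)."""
    theta = math.radians(two_theta_degrees / 2.0)
    return wavelength / (2.0 * math.sin(theta))

def two_theta_from_d(d, wavelength=1.54056):
    """Inverse conversion: 2-theta in degrees for a given d-spacing."""
    return 2.0 * math.degrees(math.asin(wavelength / (2.0 * d)))
```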
- class Wavelength(wavelength=None, scale_factor=1.0, _wavelength=None)[source]¶
Represents a wavelength for powder studies.
Some standard wavelengths are provided; these are floats, not
ccdc.descriptors.CrystalDescriptors.PowderPattern.Wavelength
instances.
- property scale_factor¶
The scale factor of this Wavelength.
- property wavelength¶
The wavelength.
- property esd¶
The array of esd values (estimated standard deviations).
- static from_crystal(crystal, settings=None)[source]¶
Create a CrystalDescriptors.PowderPattern from a crystal.
- Parameters:
crystal –
ccdc.crystal.Crystal
settings –
ccdc.descriptors.CrystalDescriptors.PowderPattern.Settings
- static from_file(file_name, format=None, default_wavelength=None)[source]¶
Create a CrystalDescriptors.PowderPattern from a file.
format may take one of the following values:
- 'xy': XY format (2 columns: 2theta and intensity)
- 'xye': XYE format (3 columns: 2theta, intensity and ESD)
- 'xrdml': Panalytical XRDML format
If format is None, it will be deduced from the filename extension.
- Parameters:
file_name – path to the file
format – string indicating the format to expect; if None, it will be deduced from the filename extension
default_wavelength – Default wavelength used if no wavelength found/parsed in file
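The format deduction described above can be sketched as follows. This is a hypothetical helper and not the library's code. Lower-casing the extension and raising ValueError for unknown formats are assumptions.

```python
import os

KNOWN_FORMATS = ("xy", "xye", "xrdml")

def deduce_format(file_name, format=None):
    """Return the explicit format, or fall back to the file extension."""
    if format is None:
        format = os.path.splitext(file_name)[1].lstrip(".").lower()
    if format not in KNOWN_FORMATS:
        raise ValueError("unrecognised powder pattern format: %r" % format)
    return format
```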
- static from_xrdml_file(file_name, default_wavelength=None)[source]¶
Create a CrystalDescriptors.PowderPattern from a Panalytical XRDML file.
See https://www.malvernpanalytical.com/en/products/category/software/x-ray-diffraction-software/data-collector for details of the XRDML format.
- Parameters:
file_name – path to XRDML file
default_wavelength – Default wavelength used if no wavelength found/parsed in file
- static from_xy_file(file_name, default_wavelength=None)[source]¶
Create a CrystalDescriptors.PowderPattern from an xy file.
- Parameters:
file_name – path to xy file
default_wavelength – Default wavelength used if no wavelength found/parsed in file
- static from_xye_file(file_name, default_wavelength=None)[source]¶
Create a CrystalDescriptors.PowderPattern from an xye file.
- Parameters:
file_name – path to xye file
default_wavelength – Default wavelength used if no wavelength found/parsed in file
- integral(start=0.0, end=180.0)[source]¶
The area under the curve.
- Parameters:
start – float
end – float
- Returns:
float
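The integration scheme is not stated here. As an illustration only, the area under a sampled pattern between start and end can be approximated with the trapezoidal rule over the two_theta and intensity arrays. Only samples inside the range are used, and the function name is illustrative.

```python
def trapezoid_area(two_theta, intensity, start=0.0, end=180.0):
    """Trapezoidal area under (two_theta, intensity) samples within [start, end]."""
    points = [(x, y) for x, y in zip(two_theta, intensity) if start <= x <= end]
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )
```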
- property intensity¶
The array of intensity values.
- resetWavelength(new_wavelength=None)[source]¶
Reset the wavelength for an existing powder pattern
- Parameters:
new_wavelength – new wavelength; if omitted, the wavelength is reset to 1.54056 Angstrom
- similarity(other, width=2.0, use_esds=True)[source]¶
Measure of match between this pattern and another.
This uses the cross-correlations described in R. de Gelder, R. Wehrens, J.A. Hageman (2001) J. Comp. Chem. 22:273-289. https://doi.org/10.1002/1096-987X(200102)22:3%3C273::AID-JCC1001%3E3.0.CO;2-0
- Parameters:
width – width (in degrees) of the base of the triangle weight function
use_esds – Whether to use the powder pattern estimates of standard deviation on the counts in the calculation as weightings
- Returns:
float that represents the similarity of the two patterns
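The idea behind a triangle-weighted cross-correlation measure can be sketched in plain Python. The sketch assumes both patterns share one evenly spaced 2-theta grid, treats width as the full base of the triangle (so the weight reaches zero at a shift of width / 2), and ignores ESD weighting. It illustrates the technique and is not the library's implementation.

```python
import math

def _weighted_correlation(f, g, step, width):
    """Sum of cross-correlation values weighted by a triangle centred on zero shift."""
    half_width = width / 2.0
    max_shift = int(round(half_width / step))
    n = len(f)
    total = 0.0
    for shift in range(-max_shift, max_shift + 1):
        weight = 1.0 - abs(shift) * step / half_width
        if weight <= 0.0:
            continue
        lo, hi = max(0, -shift), min(n, n - shift)
        total += weight * sum(f[i] * g[i + shift] for i in range(lo, hi))
    return total

def pattern_similarity(f, g, step, width=2.0):
    """Normalised similarity: 1.0 for identical patterns, 0.0 for no overlap."""
    if len(f) != len(g):
        raise ValueError("patterns must be sampled on the same grid")
    norm = math.sqrt(
        _weighted_correlation(f, f, step, width)
        * _weighted_correlation(g, g, step, width)
    )
    return _weighted_correlation(f, g, step, width) / norm if norm else 0.0
```

Because of the triangle weight, a peak displaced by a small angle still contributes to the score, with a weight that falls off linearly with the displacement.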
- property tick_marks¶
The array of tick marks if this is a simulated powder pattern.
- Returns:
list of
ccdc.descriptors.CrystalDescriptors.PowderPattern.TickMark
or None
if this is not a simulated powder pattern.
- property two_theta¶
The array of two_theta values.
- write_raw_file(file_name)[source]¶
Write a Bruker .raw file.
- Parameters:
file_name – output file name
- write_xrdml_file(file_name)[source]¶
Write a Panalytical .xrdml format file.
- Parameters:
file_name – output file name
- class ccdc.descriptors.StatisticalDescriptors[source]¶
A namespace holding statistical descriptors.
- class RankStatistics(scores, activity_column=None)[source]¶
Represents a ranked collection of values for which statistics can be derived.
- ACC(fraction=0.0)[source]¶
Calculate accuracy metric (ACC) at the specified fraction.
ACC = (TP+TN) / (TP+FP+TN+FN)
- Parameters:
fraction – position within data for which accuracy metric is to be determined.
- Raises:
ValueError if fraction is not within interval [0,1]
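A worked sketch of the accuracy formula. It assumes the input is a list of booleans (True for an active) sorted best score first, and that the top fraction of the list is treated as predicted active. How RankStatistics converts the fraction to a cut-off position is not stated, so the rounding here is an assumption and the function name is illustrative.

```python
def accuracy_at_fraction(ranked_actives, fraction):
    """ACC = (TP + TN) / (TP + FP + TN + FN) with the top fraction predicted active."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be within [0, 1]")
    n_total = len(ranked_actives)
    n_selected = round(fraction * n_total)
    tp = sum(ranked_actives[:n_selected])   # actives above the cut-off
    fp = n_selected - tp                    # inactives above the cut-off
    fn = sum(ranked_actives[n_selected:])   # actives below the cut-off
    tn = n_total - n_selected - fn          # inactives below the cut-off
    return (tp + tn) / n_total
```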
- BEDROC(alpha=0.0)[source]¶
Calculate Boltzmann-Enhanced Discrimination of ROC (BEDROC) as defined in:
Truchon J., Bayly C.I., “Evaluating Virtual Screening Methods: Good and Bad Metrics for the ‘Early Recognition’ Problem” J. Chem. Inf. Model. 47:488-508 (2007).
- Parameters:
alpha – exponential weighting factor.
- Raises:
ValueError if alpha is less than or equal to 0.0.
- EF(fraction=0.0)[source]¶
Calculate enrichment factor (EF) at the specified fraction.
- Parameters:
fraction – position within data for which enrichment factor is to be determined.
- Raises:
ValueError if fraction is not within interval [0,1]
- PPV(fraction=0.0)[source]¶
Calculate precision or positive predictive value (PPV) at the specified fraction.
- Parameters:
fraction – position within data for which precision is to be determined.
- Raises:
ValueError if fraction is not within interval [0,1]
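Precision and enrichment factor are closely related: PPV is the hit rate within the selected top fraction, and EF is that hit rate divided by the hit rate of the whole data set. The sketch below uses these common definitions on a list of booleans sorted best score first. The rounding of the cut-off and the return value of 0.0 for an empty selection are assumptions, not documented behaviour.

```python
def ppv_at_fraction(ranked_actives, fraction):
    """Precision TP / (TP + FP) within the top fraction of the ranked list."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be within [0, 1]")
    n_selected = round(fraction * len(ranked_actives))
    if n_selected == 0:
        return 0.0
    return sum(ranked_actives[:n_selected]) / n_selected

def enrichment_factor(ranked_actives, fraction):
    """Hit rate in the top fraction relative to the overall hit rate."""
    base_rate = sum(ranked_actives) / len(ranked_actives)
    if base_rate == 0.0:
        return 0.0
    return ppv_at_fraction(ranked_actives, fraction) / base_rate
```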
- RIE(alpha=0.0)[source]¶
Calculate robust initial enhancement (RIE) as defined in:
Sheridan R.P., Singh S.B., Fluder E.M., Kearsley S.K., “Protocols for Bridging the Peptide to Nonpeptide Gap in Topological Similarity Searches” J. Chem. Inf. Comp. Sci. 41:1395-1406 (2001).
- Parameters:
alpha – exponential weighting factor
- Raises:
ValueError if alpha is less than or equal to 0.0
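RIE and BEDROC are linked: BEDROC rescales RIE between its minimum and maximum possible values so that the result lies in [0, 1]. The sketch below follows the formulas as commonly stated from the Truchon and Bayly paper, for a list of booleans sorted best score first. Whether RankStatistics uses exactly these expressions is an assumption, and the function names are illustrative.

```python
import math

def rie(ranked_actives, alpha):
    """Robust initial enhancement: exponentially weighted active ranks vs. random."""
    if alpha <= 0.0:
        raise ValueError("alpha must be greater than 0.0")
    n_total = len(ranked_actives)
    ranks = [i + 1 for i, active in enumerate(ranked_actives) if active]
    if not ranks:
        return 0.0
    observed = sum(math.exp(-alpha * r / n_total) for r in ranks) / len(ranks)
    expected = (1.0 - math.exp(-alpha)) / (n_total * (math.exp(alpha / n_total) - 1.0))
    return observed / expected

def bedroc(ranked_actives, alpha):
    """BEDROC as RIE rescaled to [0, 1] between its worst and best cases."""
    value = rie(ranked_actives, alpha)
    ratio = sum(ranked_actives) / len(ranked_actives)
    if ratio == 0.0:
        return 0.0
    rie_max = (1.0 - math.exp(-alpha * ratio)) / (ratio * (1.0 - math.exp(-alpha)))
    rie_min = (1.0 - math.exp(alpha * ratio)) / (ratio * (1.0 - math.exp(alpha)))
    if rie_max == rie_min:
        return 1.0
    return (value - rie_min) / (rie_max - rie_min)
```

Larger alpha values concentrate the weight on the earliest ranks, which is why alpha must be strictly positive.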
- ROC()[source]¶
Calculate receiver operating characteristic (ROC) curve.
- Returns:
list, list - True positive rate, False positive rate
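A ROC curve can be built by walking down the ranked list and recording the true and false positive rates after each entry. The sketch below returns the two lists in the same order as above (true positive rate first). Starting both lists at the origin is an assumption, and the area helper is an illustrative addition rather than a documented method.

```python
def roc_curve(ranked_actives):
    """Return (tpr, fpr) lists for a list of booleans sorted best score first."""
    n_actives = sum(ranked_actives)
    n_inactives = len(ranked_actives) - n_actives
    tpr, fpr = [0.0], [0.0]
    tp = fp = 0
    for active in ranked_actives:
        if active:
            tp += 1
        else:
            fp += 1
        tpr.append(tp / n_actives if n_actives else 0.0)
        fpr.append(fp / n_inactives if n_inactives else 0.0)
    return tpr, fpr

def area_under_curve(tpr, fpr):
    """Trapezoidal area under the ROC curve."""
    return sum(
        (fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2.0
        for i in range(1, len(fpr))
    )
```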
- property activity_column¶
Get extractor for active/inactive classification from scores data.