Package com.actelion.research.chem
Class Canonizer
- java.lang.Object
-
- com.actelion.research.chem.Canonizer
-
public class Canonizer extends java.lang.Object
-
-
Field Summary
Fields
Modifier and Type       Field
static int              ASSIGN_PARITIES_TO_TETRAHEDRAL_N
static int              cIDCodeCurrentVersion
protected static int    cIDCodeVersion2
protected static int    cIDCodeVersion3
static int              CONSIDER_DIASTEREOTOPICITY
static int              CONSIDER_ENANTIOTOPICITY
static int              CONSIDER_STEREOHETEROTOPICITY
static int              COORDS_ARE_3D
protected static int    cParity1And
protected static int    cParity1Or
protected static int    cParity2And
protected static int    cParity2Or
static int              CREATE_PSEUDO_STEREO_GROUPS
static int              CREATE_SYMMETRY_RANK
static int              DISTINGUISH_RACEMIC_OR_GROUPS
static int              ENCODE_ATOM_CUSTOM_LABELS
static int              ENCODE_ATOM_CUSTOM_LABELS_WITHOUT_RANKING
static int              ENCODE_ATOM_SELECTION
static int              TIE_BREAK_FREE_VALENCE_ATOMS
-
Constructor Summary
Constructors
Constructor
    Description
Canonizer(StereoMolecule mol)
    Runs a canonicalization procedure for the given molecule that creates unique atom ranks, taking stereo features, ESR settings and query features into account.
Canonizer(StereoMolecule mol, int mode)
    Runs a canonicalization procedure for the given molecule that creates unique atom ranks, taking stereo features, ESR settings and query features into account.
-
Method Summary
Modifier and Type    Method
    Description
StereoMolecule       getCanMolecule()
StereoMolecule       getCanMolecule(boolean includeExplicitHydrogen)
java.lang.String     getEncodedCoordinates()
    Encodes the molecule's atom coordinates into a compact String.
java.lang.String     getEncodedCoordinates(boolean keepPositionAndScale)
    Encodes the molecule's atom coordinates into a compact String.
java.lang.String     getEncodedCoordinates(boolean keepPositionAndScale, Coordinates[] atomCoordinates)
    Encodes the molecule's atom coordinates into a compact String.
java.lang.String     getEncodedMapping()
int                  getEZParity(int bond)
    Returns the absolute bond parity, which is based on priority ranks.
int[]                getFinalRank()
int[]                getGraphAtoms()
int[]                getGraphIndexes()
java.lang.String     getIDCode()
static int           getNeededBits(int maxNo)
int                  getPseudoEZGroup(int bond)
    If mMode includes CREATE_PSEUDO_STEREO_GROUPS, then this method returns this bond's relative stereo feature group number provided this bond is a pseudo stereo bond, i.e. its stereo configuration is relevant only in combination with other pseudo stereo features.
int                  getPseudoStereoGroupCount()
    If mMode includes CREATE_PSEUDO_STEREO_GROUPS, then this method returns the number of independent relative stereo feature groups.
int                  getPseudoTHGroup(int atom)
    If mMode includes CREATE_PSEUDO_STEREO_GROUPS, then this method returns this atom's relative stereo feature group number provided this atom is a pseudo stereo center, i.e. its stereo configuration is relevant only in combination with other pseudo stereo features.
int                  getSymmetryRank(int atom)
    Returns the symmetry rank before tie breaking.
int[]                getSymmetryRanks()
    Returns the symmetry ranks before tie breaking.
int                  getTHParity(int atom)
    Returns the absolute tetrahedral parity, which is based on priority ranks.
boolean              hasCIPParityDistinctionProblem()
void                 invalidateCoordinates()
boolean              normalizeEnantiomer()
    This normalizes all absolute tetrahedral-, allene- and atrop-parities within the molecule.
protected void       setCIPParities()
void                 setParities()
    Creates parities based on atom indices of original molecule and copies them back into that molecule.
boolean              setSingleUnknownAsRacemicParity()
    If the molecule contains exactly one stereo center and if that has unknown configuration, then assume that the configuration is meant to be racemic and update molecule accordingly.
protected void       setStereoCenters()
void                 setUnknownParitiesToExplicitlyUnknown()
    Sets all atoms with TH-parity 'unknown' to explicitly defined 'unknown'.
-
-
-
Field Detail
-
CREATE_SYMMETRY_RANK
public static final int CREATE_SYMMETRY_RANK
- See Also:
- Constant Field Values
-
CONSIDER_DIASTEREOTOPICITY
public static final int CONSIDER_DIASTEREOTOPICITY
- See Also:
- Constant Field Values
-
CONSIDER_ENANTIOTOPICITY
public static final int CONSIDER_ENANTIOTOPICITY
- See Also:
- Constant Field Values
-
CONSIDER_STEREOHETEROTOPICITY
public static final int CONSIDER_STEREOHETEROTOPICITY
- See Also:
- Constant Field Values
-
ENCODE_ATOM_CUSTOM_LABELS
public static final int ENCODE_ATOM_CUSTOM_LABELS
- See Also:
- Constant Field Values
-
ENCODE_ATOM_SELECTION
public static final int ENCODE_ATOM_SELECTION
- See Also:
- Constant Field Values
-
ASSIGN_PARITIES_TO_TETRAHEDRAL_N
public static final int ASSIGN_PARITIES_TO_TETRAHEDRAL_N
- See Also:
- Constant Field Values
-
COORDS_ARE_3D
public static final int COORDS_ARE_3D
- See Also:
- Constant Field Values
-
CREATE_PSEUDO_STEREO_GROUPS
public static final int CREATE_PSEUDO_STEREO_GROUPS
- See Also:
- Constant Field Values
-
DISTINGUISH_RACEMIC_OR_GROUPS
public static final int DISTINGUISH_RACEMIC_OR_GROUPS
- See Also:
- Constant Field Values
-
TIE_BREAK_FREE_VALENCE_ATOMS
public static final int TIE_BREAK_FREE_VALENCE_ATOMS
- See Also:
- Constant Field Values
-
ENCODE_ATOM_CUSTOM_LABELS_WITHOUT_RANKING
public static final int ENCODE_ATOM_CUSTOM_LABELS_WITHOUT_RANKING
- See Also:
- Constant Field Values
-
cIDCodeVersion2
protected static final int cIDCodeVersion2
- See Also:
- Constant Field Values
-
cIDCodeVersion3
protected static final int cIDCodeVersion3
- See Also:
- Constant Field Values
-
cIDCodeCurrentVersion
public static final int cIDCodeCurrentVersion
- See Also:
- Constant Field Values
-
cParity1And
protected static final int cParity1And
- See Also:
- Constant Field Values
-
cParity2And
protected static final int cParity2And
- See Also:
- Constant Field Values
-
cParity1Or
protected static final int cParity1Or
- See Also:
- Constant Field Values
-
cParity2Or
protected static final int cParity2Or
- See Also:
- Constant Field Values
-
-
Constructor Detail
-
Canonizer
public Canonizer(StereoMolecule mol)
Runs a canonicalization procedure for the given molecule that creates unique atom ranks, taking stereo features, ESR settings and query features into account.
- Parameters:
mol
-
-
Canonizer
public Canonizer(StereoMolecule mol, int mode)
Runs a canonicalization procedure for the given molecule that creates unique atom ranks, taking stereo features, ESR settings and query features into account. If mode includes ENCODE_ATOM_CUSTOM_LABELS, then custom atom labels are considered for the atom ranking and are encoded into the idcode.
If mode includes COORDS_ARE_3D, then getEncodedCoordinates() always returns a 3D-encoding even if all z-coordinates are 0.0. Otherwise coordinates are encoded in 3D only if at least one of the z-coordinates is not 0.0.
- Parameters:
mol
mode
- 0 or one or more of CONSIDER...TOPICITY, CREATE..., ENCODE_ATOM_CUSTOM_LABELS, ASSIGN_PARITIES_TO_TETRAHEDRAL_N, COORDS_ARE_3D
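The wording "0 or one or more of" suggests that the mode constants are bit flags that can be combined. The following is a minimal, library-independent sketch of that idea, assuming the flags are distinct bits combined with bitwise OR. The numeric values and the helper names includes() and encodeAs3D() are hypothetical placeholders chosen for illustration; they are not taken from the Canonizer source. The second helper mirrors the COORDS_ARE_3D rule described above.

```java
// Illustrative sketch only: flag values and helper names are hypothetical,
// not the actual constants or code of com.actelion.research.chem.Canonizer.
final class ModeFlagsSketch {
    static final int CREATE_SYMMETRY_RANK = 1;           // placeholder value
    static final int ENCODE_ATOM_CUSTOM_LABELS = 1 << 3; // placeholder value
    static final int COORDS_ARE_3D = 1 << 6;             // placeholder value

    // True if the given flag bit is set in mode.
    static boolean includes(int mode, int flag) {
        return (mode & flag) != 0;
    }

    // Rule from the constructor description: always 3D if COORDS_ARE_3D is set,
    // otherwise 3D only if at least one z-coordinate is not 0.0.
    static boolean encodeAs3D(int mode, double[] z) {
        if (includes(mode, COORDS_ARE_3D)) {
            return true;
        }
        for (double value : z) {
            if (value != 0.0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int mode = ENCODE_ATOM_CUSTOM_LABELS | COORDS_ARE_3D;
        double[] flatZ = {0.0, 0.0, 0.0};
        System.out.println("custom labels requested: " + includes(mode, ENCODE_ATOM_CUSTOM_LABELS));
        System.out.println("symmetry rank requested: " + includes(mode, CREATE_SYMMETRY_RANK));
        System.out.println("3D encoding with flag:   " + encodeAs3D(mode, flatZ));
        System.out.println("3D encoding, mode 0:     " + encodeAs3D(0, flatZ));
    }
}
```

With the real class, the equivalent call would pass the OR-combined constants as the second constructor argument.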
-
-
Method Detail
-
hasCIPParityDistinctionProblem
public boolean hasCIPParityDistinctionProblem()
-
getCanMolecule
public StereoMolecule getCanMolecule()
-
getCanMolecule
public StereoMolecule getCanMolecule(boolean includeExplicitHydrogen)
-
setUnknownParitiesToExplicitlyUnknown
public void setUnknownParitiesToExplicitlyUnknown()
Sets all atoms with TH-parity 'unknown' to explicitly defined 'unknown'. Sets all double bonds with EZ-parity 'unknown' to cross bonds. Sets the first bond atom of all BINAP type bonds with parity 'unknown' to explicitly defined 'unknown' parity.
-
setSingleUnknownAsRacemicParity
public boolean setSingleUnknownAsRacemicParity()
If the molecule contains exactly one stereo center and if that has unknown configuration, then assume that the configuration is meant to be racemic and update the molecule accordingly. If the stereo configuration is ill-defined by a stereo bond whose pointed tip is not at the stereo center, then the molecule is not touched and the stereo center is kept as undefined.
- Returns:
- whether a stereo center was converted to be racemic
-
getIDCode
public java.lang.String getIDCode()
-
getFinalRank
public int[] getFinalRank()
-
getSymmetryRank
public int getSymmetryRank(int atom)
Returns the symmetry rank before tie breaking. For this the Canonizer mode must contain the CREATE_SYMMETRY_RANK option. If ranking shall reflect atom diastereotopicity or even enantiotopicity, use mode CONSIDER_DIASTEREOTOPICITY or CONSIDER_STEREOHETEROTOPICITY, respectively.
- Parameters:
atom
- Returns:
- rank
-
getSymmetryRanks
public int[] getSymmetryRanks()
Returns the symmetry ranks before tie breaking. For this the Canonizer mode must contain the CREATE_SYMMETRY_RANK option. If ranking shall reflect atom diastereotopicity or even enantiotopicity, use mode CONSIDER_DIASTEREOTOPICITY or CONSIDER_STEREOHETEROTOPICITY, respectively.
- Returns:
- ranks
-
invalidateCoordinates
public void invalidateCoordinates()
-
getEncodedCoordinates
public java.lang.String getEncodedCoordinates()
Encodes the molecule's atom coordinates into a compact String. Together with the idcode the coordinate string can be passed to the IDCodeParser to recreate the original molecule including coordinates.
If the molecule's coordinates are 2D, then coordinate encoding will be relative, i.e. scale and absolute positions get lost during the encoding. 3D-coordinates, however, are encoded retaining scale and absolute positions.
If the molecule has 3D-coordinates and if there are no implicit hydrogen atoms, i.e. all hydrogen atoms are explicitly available with their coordinates, then hydrogen 3D-coordinates are also encoded despite the fact that the idcode itself does not contain hydrogen atoms, because it must be canonical.
- Returns:
-
getEncodedCoordinates
public java.lang.String getEncodedCoordinates(boolean keepPositionAndScale)
Encodes the molecule's atom coordinates into a compact String. Together with the idcode the coordinate string can be passed to the IDCodeParser to recreate the original molecule including coordinates.
If keepPositionAndScale==false, then coordinate encoding will be relative, i.e. scale and absolute positions get lost during the encoding. Otherwise the encoding retains scale and absolute positions.
If the molecule has 3D-coordinates and if there are no implicit hydrogen atoms, i.e. all hydrogen atoms are explicitly available with their coordinates, then hydrogen 3D-coordinates are also encoded despite the fact that the idcode itself does not contain hydrogen atoms, because it must be canonical.
- Parameters:
keepPositionAndScale
- if false, then coordinates are scaled to an average bond length of 1.5 units
- Returns:
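To illustrate what "scaled to an average bond length of 1.5 units" can mean in practice, here is a small stand-alone sketch that rescales 2D coordinates so that the mean bond length becomes 1.5. It is not the library's encoder: the class name, the array-based data layout and the helper names are assumptions made for this example, and the actual compact String encoding (including the loss of absolute position) is not reproduced.

```java
// Illustrative sketch only: rescales 2D coordinates to a mean bond length of 1.5.
// Data layout (double[atom][x,y], int[bond][atom1,atom2]) is an assumption for this example.
final class RelativeScaleSketch {
    static final double TARGET_AVG_BOND_LENGTH = 1.5;

    // Mean Euclidean distance over all bonded atom pairs; 0.0 if there are no bonds.
    static double averageBondLength(double[][] xy, int[][] bonds) {
        if (bonds.length == 0) {
            return 0.0;
        }
        double sum = 0.0;
        for (int[] bond : bonds) {
            double dx = xy[bond[0]][0] - xy[bond[1]][0];
            double dy = xy[bond[0]][1] - xy[bond[1]][1];
            sum += Math.hypot(dx, dy);
        }
        return sum / bonds.length;
    }

    // Returns a scaled copy; the input is left unchanged.
    // Without bonds (or with zero-length bonds) the coordinates are copied as-is.
    static double[][] rescale(double[][] xy, int[][] bonds) {
        double avg = averageBondLength(xy, bonds);
        double factor = avg > 0.0 ? TARGET_AVG_BOND_LENGTH / avg : 1.0;
        double[][] out = new double[xy.length][2];
        for (int i = 0; i < xy.length; i++) {
            out[i][0] = xy[i][0] * factor;
            out[i][1] = xy[i][1] * factor;
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] xy = {{0.0, 0.0}, {3.0, 0.0}, {3.0, 4.0}};
        int[][] bonds = {{0, 1}, {1, 2}};
        double[][] scaled = rescale(xy, bonds);
        System.out.println("average before: " + averageBondLength(xy, bonds));
        System.out.println("average after:  " + averageBondLength(scaled, bonds));
    }
}
```

Because only relative geometry survives such a rescaling, a caller that needs the original scale and position would pass keepPositionAndScale as true.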
-
getEncodedCoordinates
public java.lang.String getEncodedCoordinates(boolean keepPositionAndScale, Coordinates[] atomCoordinates)
Encodes the molecule's atom coordinates into a compact String. Together with the idcode the coordinate string can be passed to the IDCodeParser to recreate the original molecule including coordinates.
If keepPositionAndScale==false, then coordinate encoding will be relative, i.e. scale and absolute positions get lost during the encoding. Otherwise the encoding retains scale and absolute positions.
If the molecule has 3D-coordinates and if there are no implicit hydrogen atoms, i.e. all hydrogen atoms are explicitly available with their coordinates, then hydrogen 3D-coordinates are also encoded despite the fact that the idcode itself does not contain hydrogen atoms, because it must be canonical.
- Parameters:
keepPositionAndScale
- if false, then coordinates are scaled to an average bond length of 1.5 units
atomCoordinates
- external atom coordinate set for the same molecule, e.g. from a Conformer
- Returns:
-
getEncodedMapping
public java.lang.String getEncodedMapping()
-
getNeededBits
public static int getNeededBits(int maxNo)
- Parameters:
maxNo
- highest possible index of some kind
- Returns:
- number of bits needed to represent numbers up to maxNo
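As an illustration of the stated contract, the following stand-alone sketch counts how many bits are required to represent every value from 0 up to maxNo. It is a hedged re-implementation for explanation only; the class and method names are made up here and the library's own code may differ, for example in how it treats 0 or negative input.

```java
// Illustrative sketch only: not the Canonizer implementation.
final class NeededBitsSketch {
    // Number of bits needed to represent all values in 0..maxNo.
    // By this definition 0 needs 0 bits; non-positive input also yields 0.
    static int neededBits(int maxNo) {
        int bits = 0;
        while (maxNo > 0) {
            maxNo >>= 1; // drop the lowest bit
            bits++;
        }
        return bits;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, 2, 3, 4, 255, 256};
        for (int maxNo : samples) {
            System.out.println(maxNo + " -> " + neededBits(maxNo));
        }
    }
}
```

A typical use of such a helper is sizing fixed-width fields in a bit stream, e.g. choosing the field width for atom indexes from the highest index that can occur.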
-
getTHParity
public int getTHParity(int atom)
Returns the absolute tetrahedral parity, which is based on priority ranks.
- Parameters:
atom
- Returns:
- one of the Molecule.cAtomParityXXX constants
-
getEZParity
public int getEZParity(int bond)
Returns the absolute bond parity, which is based on priority ranks.
- Parameters:
bond
- Returns:
- one of the Molecule.cBondParityXXX constants
-
getPseudoStereoGroupCount
public int getPseudoStereoGroupCount()
If mMode includes CREATE_PSEUDO_STEREO_GROUPS, then this method returns the number of independent relative stereo feature groups. A relative stereo feature group always contains more than one pseudo stereo feature (TH or EZ), which only in combination define a certain stereo configuration.
- Returns:
-
getPseudoEZGroup
public int getPseudoEZGroup(int bond)
If mMode includes CREATE_PSEUDO_STEREO_GROUPS, then this method returns this bond's relative stereo feature group number provided this bond is a pseudo stereo bond, i.e. its stereo configuration is relevant only in combination with other pseudo stereo features. If this bond is not a pseudo stereo bond, then this method returns 0.
- Parameters:
bond
- Returns:
-
getPseudoTHGroup
public int getPseudoTHGroup(int atom)
If mMode includes CREATE_PSEUDO_STEREO_GROUPS, then this method returns this atom's relative stereo feature group number provided this atom is a pseudo stereo center, i.e. its stereo configuration is relevant only in combination with other pseudo stereo features. If this atom is not a pseudo stereo center, then this method returns 0.
- Parameters:
atom
- Returns:
-
normalizeEnantiomer
public boolean normalizeEnantiomer()
This normalizes all absolute tetrahedral-, allene- and atrop-parities within the molecule. This is done by finding the lowest atom rank that is shared by an odd number of atoms with determined parities, not counting unknown and none. If the number of parity2 atoms is higher than the number of parity1 atoms of that rank, then all parities are inverted.
You may call this method before creating the idcode from this Canonizer to convert internal parity information to the normalized enantiomer. When calling getIDCode() afterwards, the idcode represents the normalized enantiomer. Stereo information of the underlying molecule is not touched.
- Returns:
- true, if all internal parities were inverted
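The rule described for normalizeEnantiomer() can be sketched on plain arrays. This is only an illustration of the stated rule, not the Canonizer's code: the class name, the parity codes (0 = none, 1 = parity1, 2 = parity2, 3 = unknown) and the per-atom rank[] and parity[] arrays are assumptions made for this example, and the sketch modifies the parity array it is given rather than any molecule.

```java
import java.util.TreeMap;

// Illustrative sketch only: parity codes and array layout are assumptions for this example.
final class NormalizeEnantiomerSketch {
    static final int PARITY_NONE = 0;
    static final int PARITY_1 = 1;
    static final int PARITY_2 = 2;
    static final int PARITY_UNKNOWN = 3;

    // Inverts all determined parities in place if, at the lowest rank shared by an
    // odd number of atoms with determined parity, parity2 atoms outnumber parity1 atoms.
    // Returns true if the parities were inverted.
    static boolean normalize(int[] rank, int[] parity) {
        // rank -> {number of parity1 atoms, number of parity2 atoms}, sorted by rank
        TreeMap<Integer, int[]> counts = new TreeMap<>();
        for (int atom = 0; atom < rank.length; atom++) {
            if (parity[atom] == PARITY_1 || parity[atom] == PARITY_2) {
                int[] c = counts.computeIfAbsent(rank[atom], k -> new int[2]);
                c[parity[atom] == PARITY_1 ? 0 : 1]++;
            }
        }
        for (int[] c : counts.values()) { // ascending rank order
            if ((c[0] + c[1]) % 2 == 1) { // first rank with an odd number of atoms decides
                if (c[1] > c[0]) {
                    invert(parity);
                    return true;
                }
                return false;
            }
        }
        return false; // no rank with an odd count: nothing to decide
    }

    // Swaps parity1 and parity2; none and unknown stay as they are.
    private static void invert(int[] parity) {
        for (int atom = 0; atom < parity.length; atom++) {
            if (parity[atom] == PARITY_1) {
                parity[atom] = PARITY_2;
            } else if (parity[atom] == PARITY_2) {
                parity[atom] = PARITY_1;
            }
        }
    }

    public static void main(String[] args) {
        int[] rank = {1, 1, 2, 3};
        int[] parity = {PARITY_1, PARITY_2, PARITY_2, PARITY_NONE};
        System.out.println("inverted: " + normalize(rank, parity));
        System.out.println("parities: " + java.util.Arrays.toString(parity));
    }
}
```

Because the deciding rank always has an odd number of atoms, parity1 and parity2 counts can never tie there, so a molecule and its mirror image end up with the same normalized parities.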
-
setParities
public void setParities()
Creates parities based on atom indices of original molecule and copies them back into that molecule. It also sets the stereo center flag. These atom parities are not normalized (e.g. in racemic or meso groups) and reflect the original molecule's up/down bonds.
-
setStereoCenters
protected void setStereoCenters()
-
setCIPParities
protected void setCIPParities()
-
getGraphAtoms
public int[] getGraphAtoms()
- Returns:
- an int[] giving all atom indexes in the order as they appear in the graph
-
getGraphIndexes
public int[] getGraphIndexes()
- Returns:
- an int[] giving the relationship between new atom numbers and old atom numbers
-
-