Package com.actelion.research.chem.io
Class CompoundFileParser
- java.lang.Object
-
- com.actelion.research.chem.io.CompoundFileParser
-
- Direct Known Subclasses:
DWARFileParser, ODEFileParser, SDFileParser
public abstract class CompoundFileParser extends java.lang.Object
-
-
Field Summary
protected java.io.BufferedReader mReader
-
Constructor Summary
CompoundFileParser()
-
Method Summary
protected abstract boolean advanceToNext()
    Don't call this method directly.
void close()
    Closes the underlying reader.
static CompoundFileParser createParser(java.lang.String fileName)
    Creates the proper parser for the given type of compound file (currently SD or DWAR).
java.lang.String getCoordinates()
    Either getIDCode() and this method, or getMolecule(), must be overridden.
java.lang.Object getDescriptor(java.lang.String shortName)
    If the file source contains encoded descriptors, override this method to save the calculation time.
DescriptorHandlerFactory getDescriptorHandlerFactory()
abstract java.lang.String getFieldData(int column)
    Returns the cell content of the current row.
int getFieldIndex(java.lang.String fieldName)
abstract java.lang.String[] getFieldNames()
    Compiles all column names that contain alphanumeric information.
java.lang.String getIDCode()
    Either this method and getCoordinates(), or getMolecule(), must be overridden.
StereoMolecule getMolecule()
    Either this method, or getIDCode() and getCoordinates(), must be overridden.
abstract java.lang.String getMoleculeName()
abstract int getRowCount()
    Depending on the data source, returns the total row count, or -1 if unknown.
boolean isOpen()
boolean next()
    Advances the row counter to the next row.
void setDescriptorHandlerFactory(DescriptorHandlerFactory factory)
    If a requested descriptor is not available in a particular compound record, the parser can create one itself, provided its DescriptorHandlerFactory knows the descriptor name.
-
-
-
Method Detail
-
createParser
public static CompoundFileParser createParser(java.lang.String fileName)
Creates the proper parser for the given type of compound file (currently SD or DWAR).
- Parameters:
fileName
- Returns:
- parser or null, if the file doesn't exist or cannot be accessed
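A typical read loop that follows this contract might look like the sketch below. To stay self-contained it does not use the real class: `RowSource` and `InMemorySource` are hypothetical stand-ins that mirror only the documented next(), getFieldNames(), getFieldData(int) and getMoleculeName() methods, and `open` merely imitates createParser() by returning null when there is nothing to read.

```java
import java.util.ArrayList;
import java.util.List;

class ReadLoopSketch {
    /** Hypothetical stand-in for the documented row-access methods. */
    interface RowSource {
        boolean next();
        String[] getFieldNames();
        String getFieldData(int column);
        String getMoleculeName();
    }

    /** Hypothetical in-memory source; each row is {name, field values...}. */
    static class InMemorySource implements RowSource {
        private final String[] fieldNames;
        private final String[][] rows;
        private int current = -1;   // index of the current row, -1 before the first next()

        InMemorySource(String[] fieldNames, String[][] rows) {
            this.fieldNames = fieldNames;
            this.rows = rows;
        }
        public boolean next() { return ++current < rows.length; }
        public String[] getFieldNames() { return fieldNames; }
        public String getFieldData(int column) { return rows[current][column + 1]; }
        public String getMoleculeName() { return rows[current][0]; }
    }

    /** Imitates createParser(): returns null when there is nothing to open. */
    static RowSource open(String[] fieldNames, String[][] rows) {
        return rows == null ? null : new InMemorySource(fieldNames, rows);
    }

    /** Reads all rows and formats each one as "name: field=value; ...". */
    static List<String> readAll(RowSource parser) {
        List<String> lines = new ArrayList<>();
        if (parser == null)          // null means the source could not be accessed
            return lines;
        String[] names = parser.getFieldNames();
        while (parser.next()) {      // false once there is no next row
            StringBuilder sb = new StringBuilder(parser.getMoleculeName()).append(':');
            for (int i = 0; i < names.length; i++)
                sb.append(' ').append(names[i]).append('=')
                  .append(parser.getFieldData(i)).append(';');
            lines.add(sb.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        String[] names = {"MW", "Supplier"};
        String[][] rows = {{"aspirin", "180.16", "A"}, {"caffeine", "194.19", "B"}};
        for (String line : readAll(open(names, rows)))
            System.out.println(line);
    }
}
```

The point of the sketch is the null check before the loop: since createParser() is documented to return null for a missing or inaccessible file, callers should test the result before calling next().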
-
getFieldNames
public abstract java.lang.String[] getFieldNames()
Compiles all column names that contain alphanumeric information. Columns containing chemistry objects, coordinates or descriptors don't appear in the list.
- Returns:
- array of column names in the order of appearance
-
getFieldData
public abstract java.lang.String getFieldData(int column)
Returns the content of the given column's cell in the current row. The lines of a multi-line cell entry are separated by a '\n' character.
- Parameters:
column
- index that refers to alphanumeric columns only, as in getFieldNames()
- Returns:
-
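Because multi-line cell entries arrive as a single string with '\n' separators, callers usually need to split them again. The helper below is a minimal sketch of that step; `cellLines` is a hypothetical name, and treating a null or empty cell as "no lines" is an assumption, since the documentation does not say what getFieldData() returns for an empty cell.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class CellLinesSketch {
    /** Splits a cell value into its lines; a null or empty cell yields an empty list (assumed). */
    static List<String> cellLines(String cell) {
        if (cell == null || cell.isEmpty())
            return Collections.emptyList();
        // limit -1 keeps trailing empty lines instead of silently dropping them
        return Arrays.asList(cell.split("\n", -1));
    }

    public static void main(String[] args) {
        for (String line : cellLines("first entry\nsecond entry"))
            System.out.println("[" + line + "]");
    }
}
```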
getRowCount
public abstract int getRowCount()
Depending on the data source, returns the total row count, or -1 if unknown.
- Returns:
- number of rows or -1
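Code that reports progress has to cope with the -1 case. A minimal sketch, assuming a hypothetical helper `progress` that receives the number of rows read so far together with the value returned by getRowCount():

```java
class RowCountSketch {
    /** Builds a progress label; rowCount is the getRowCount() result and may be -1. */
    static String progress(int rowsRead, int rowCount) {
        if (rowCount < 0)   // -1: the data source does not know its total row count
            return rowsRead + " rows read";
        return rowsRead + " of " + rowCount + " rows read";
    }

    public static void main(String[] args) {
        System.out.println(progress(10, 250));
        System.out.println(progress(10, -1));
    }
}
```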
-
advanceToNext
protected abstract boolean advanceToNext()
Don't call this method directly. Use next() instead.
- Returns:
- false if there is no next row
-
isOpen
public boolean isOpen()
- Returns:
- whether the file was found and is open to accept next() calls
-
next
public boolean next()
Advances the row counter to the next row.
- Returns:
- false if there is no next row
-
close
public final void close()
Closes the underlying reader. Call this if you don't read all records of the file. The reader is closed automatically after the last record has been read.
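One way the interplay of next(), advanceToNext() and close() could work is a template method: next() delegates to the subclass and closes the reader once no further record is found. The sketch below illustrates that idea only. `LineSource` and `OneLinePerRecord` are hypothetical stand-ins, and the bodies of next(), isOpen() and close() are assumptions, not the library's actual code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

class AutoCloseSketch {
    /** Hypothetical stand-in for the base class; only row advancing is sketched. */
    static abstract class LineSource {
        protected BufferedReader mReader;   // same field name as in the Field Summary

        /** Subclasses read one record here; callers use next() instead. */
        protected abstract boolean advanceToNext();

        public boolean isOpen() { return mReader != null; }

        public boolean next() {
            if (mReader == null)
                return false;
            boolean hasNext = advanceToNext();
            if (!hasNext)
                close();    // assumed mechanism behind "closed automatically"
            return hasNext;
        }

        public final void close() {
            if (mReader != null) {
                try { mReader.close(); } catch (IOException e) { /* nothing useful to do */ }
                mReader = null;
            }
        }
    }

    /** Treats every line of the input as one record. */
    static class OneLinePerRecord extends LineSource {
        String current;

        OneLinePerRecord(String text) {
            mReader = new BufferedReader(new StringReader(text));
        }

        protected boolean advanceToNext() {
            try {
                current = mReader.readLine();
            } catch (IOException e) {
                current = null;
            }
            return current != null;
        }
    }

    public static void main(String[] args) {
        OneLinePerRecord all = new OneLinePerRecord("rec1\nrec2");
        while (all.next())
            System.out.println(all.current);
        System.out.println("open after last record: " + all.isOpen());

        OneLinePerRecord partial = new OneLinePerRecord("rec1\nrec2");
        partial.next();     // read only the first record...
        partial.close();    // ...so the reader must be closed explicitly
        System.out.println("open after close(): " + partial.isOpen());
    }
}
```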
-
getIDCode
public java.lang.String getIDCode()
Either this method and getCoordinates(), or getMolecule(), must be overridden.
- Returns:
- idcode of first chemical structure column of the current row
-
getCoordinates
public java.lang.String getCoordinates()
Either getIDCode() and this method, or getMolecule(), must be overridden.
- Returns:
- idcoords of first chemical structure column of the current row
-
getMoleculeName
public abstract java.lang.String getMoleculeName()
- Returns:
- name/id of (primary) chemical structure of the current row
-
setDescriptorHandlerFactory
public void setDescriptorHandlerFactory(DescriptorHandlerFactory factory)
If a requested descriptor is not available in a particular compound record, the parser can create one itself, provided its DescriptorHandlerFactory knows the descriptor name. The default DescriptorHandlerFactory is null, so one must be set before the parser can create descriptors.
- Parameters:
factory
-
-
getDescriptorHandlerFactory
public DescriptorHandlerFactory getDescriptorHandlerFactory()
- Returns:
- currently used DescriptorHandlerFactory
-
getFieldIndex
public int getFieldIndex(java.lang.String fieldName)
- Parameters:
fieldName
- Returns:
- index of the field with the given name, -1 if fieldName doesn't exist
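The documented behaviour (index of the matching name, -1 if absent) amounts to a linear search over the array returned by getFieldNames(). A minimal sketch follows; `fieldIndex` is a hypothetical helper, and the exact, case-sensitive comparison is an assumption, since the documentation does not specify how names are matched.

```java
class FieldIndexSketch {
    /** Returns the position of fieldName in fieldNames, or -1 if it does not occur. */
    static int fieldIndex(String[] fieldNames, String fieldName) {
        if (fieldNames != null)
            for (int i = 0; i < fieldNames.length; i++)
                if (fieldNames[i].equals(fieldName))   // assumed: exact, case-sensitive match
                    return i;
        return -1;
    }

    public static void main(String[] args) {
        String[] names = {"MW", "Supplier"};
        System.out.println(fieldIndex(names, "Supplier"));
        System.out.println(fieldIndex(names, "Price"));
    }
}
```

Callers should check for -1 before passing the result to getFieldData(int).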
-
getDescriptor
public java.lang.Object getDescriptor(java.lang.String shortName)
If the file source contains encoded descriptors, override this method to save the calculation time.
- Parameters:
shortName
- Returns:
- descriptor as int[] or in whatever binary format the descriptor uses
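The lookup order implied here and under setDescriptorHandlerFactory (use a stored descriptor if present, otherwise calculate one through the factory, otherwise give up) can be sketched as follows. Everything in the block is a hypothetical stand-in: `Handler` replaces the real descriptor handler type, a `Function<String, Handler>` replaces DescriptorHandlerFactory, and "Len" is an invented descriptor name that simply measures the length of a structure string.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class DescriptorFallbackSketch {
    /** Hypothetical stand-in for a descriptor handler. */
    interface Handler {
        Object createDescriptor(String structure);
    }

    private final Map<String, Object> stored = new HashMap<>(); // descriptors encoded in the record
    private final String structure;                              // structure of the current record
    private Function<String, Handler> factory;                   // stand-in factory, null by default

    DescriptorFallbackSketch(String structure) { this.structure = structure; }

    void store(String shortName, Object descriptor) { stored.put(shortName, descriptor); }

    void setFactory(Function<String, Handler> factory) { this.factory = factory; }

    Object getDescriptor(String shortName) {
        Object descriptor = stored.get(shortName);
        if (descriptor != null)
            return descriptor;      // stored in the file: no calculation needed
        if (factory == null)
            return null;            // no factory set: nothing can be created
        Handler handler = factory.apply(shortName);
        return handler == null ? null : handler.createDescriptor(structure);
    }

    /** Toy factory that only knows the invented descriptor name "Len". */
    static Handler toyFactory(String shortName) {
        if (!"Len".equals(shortName))
            return null;
        return s -> new int[] { s.length() };
    }

    public static void main(String[] args) {
        DescriptorFallbackSketch parser = new DescriptorFallbackSketch("CCO");
        System.out.println(parser.getDescriptor("Len") == null);    // no factory yet
        parser.setFactory(DescriptorFallbackSketch::toyFactory);
        System.out.println(((int[]) parser.getDescriptor("Len"))[0]);
    }
}
```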
-
getMolecule
public StereoMolecule getMolecule()
Either this method, or getIDCode() and getCoordinates(), must be overridden.
- Returns:
- the structure of the record's (primary) molecule, or null
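The either/or override rule suggests that the base class derives each representation from the other by default, so a subclass only has to supply the one its file format stores. The sketch below illustrates that assumed arrangement with hypothetical stand-ins: `Mol` replaces StereoMolecule and merely carries two strings, and the default method bodies in `Base` are an assumption, not the library's code. In this sketch, a subclass that overrides neither side would recurse endlessly, which is one plausible reason for the rule.

```java
class StructureDefaultsSketch {
    /** Hypothetical stand-in for StereoMolecule: it only carries the two encoded strings. */
    static class Mol {
        final String idcode;
        final String coords;
        Mol(String idcode, String coords) { this.idcode = idcode; this.coords = coords; }
    }

    /** Assumed shape of the base class: each default is derived from the other side. */
    static abstract class Base {
        public String getIDCode() {
            Mol mol = getMolecule();
            return mol == null ? null : mol.idcode;
        }
        public String getCoordinates() {
            Mol mol = getMolecule();
            return mol == null ? null : mol.coords;
        }
        public Mol getMolecule() {
            String idcode = getIDCode();
            return idcode == null ? null : new Mol(idcode, getCoordinates());
        }
    }

    /** Option 1: the file stores idcode and coordinates; getMolecule() is inherited. */
    static class IdcodeBacked extends Base {
        private final String idcode, coords;
        IdcodeBacked(String idcode, String coords) { this.idcode = idcode; this.coords = coords; }
        @Override public String getIDCode() { return idcode; }
        @Override public String getCoordinates() { return coords; }
    }

    /** Option 2: the file stores a full structure; the encoded strings are inherited. */
    static class MoleculeBacked extends Base {
        private final Mol mol;
        MoleculeBacked(Mol mol) { this.mol = mol; }
        @Override public Mol getMolecule() { return mol; }
    }

    public static void main(String[] args) {
        Base a = new IdcodeBacked("idcode-1", "coords-1");
        System.out.println(a.getMolecule().coords);
        Base b = new MoleculeBacked(new Mol("idcode-2", "coords-2"));
        System.out.println(b.getIDCode());
    }
}
```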
-
-