org.biojava3.genome
Class GeneFeatureHelper

java.lang.Object
  extended by org.biojava3.genome.GeneFeatureHelper

public class GeneFeatureHelper
extends Object

Author:
Scooter Willis

Constructor Summary
GeneFeatureHelper()
           
 
Method Summary
static void addGeneIDGFF2GeneFeatures(LinkedHashMap<String,ChromosomeSequence> chromosomeSequenceList, FeatureList listGenes)
          Maps gene features from GFF2 data generated by the geneid prediction algorithm onto the chromosome sequences.
static void addGeneMarkGTFGeneFeatures(LinkedHashMap<String,ChromosomeSequence> chromosomeSequenceList, FeatureList listGenes)
           
static void addGlimmerGFF3GeneFeatures(LinkedHashMap<String,ChromosomeSequence> chromosomeSequenceList, FeatureList listGenes)
           
static void addGmodGFF3GeneFeatures(LinkedHashMap<String,ChromosomeSequence> chromosomeSequenceList, FeatureList listGenes)
          Adds gene features from GFF3 data, using mRNA as the gene feature because not all GFF3 files are complete.
static LinkedHashMap<String,ChromosomeSequence> getChromosomeSequenceFromDNASequence(LinkedHashMap<String,DNASequence> dnaSequenceList)
           
static LinkedHashMap<String,GeneSequence> getGeneSequences(Collection<ChromosomeSequence> chromosomeSequences)
           
static LinkedHashMap<String,ProteinSequence> getProteinSequences(Collection<ChromosomeSequence> chromosomeSequences)
           
static LinkedHashMap<String,ChromosomeSequence> loadFastaAddGeneFeaturesFromGeneIDGFF2(File fastaSequenceFile, File gffFile)
          Loads a FASTA file and a GFF2 feature file generated by the geneid prediction algorithm.
static LinkedHashMap<String,ChromosomeSequence> loadFastaAddGeneFeaturesFromGeneMarkGTF(File fastaSequenceFile, File gffFile)
           
static LinkedHashMap<String,ChromosomeSequence> loadFastaAddGeneFeaturesFromGlimmerGFF3(File fastaSequenceFile, File gffFile)
           
static LinkedHashMap<String,ChromosomeSequence> loadFastaAddGeneFeaturesFromGmodGFF3(File fastaSequenceFile, File gffFile, boolean lazyloadsequences)
          GFF3 allows many variations in the ontology terms and descriptors, so a GFF3 file generated or used by a specific application requires a custom parser.
static LinkedHashMap<String,ChromosomeSequence> loadFastaAddGeneFeaturesFromUpperCaseExonFastaFile(File fastaSequenceFile, File uppercaseFastaFile, boolean throwExceptionGeneNotFound)
           
static void main(String[] args)
           
static void outputFastaSequenceLengthGFF3(File fastaSequenceFile, File gffFile)
          Writes a GFF3 feature file that gives the length of each scaffold/chromosome in the FASTA file.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

GeneFeatureHelper

public GeneFeatureHelper()
Method Detail

loadFastaAddGeneFeaturesFromUpperCaseExonFastaFile

public static LinkedHashMap<String,ChromosomeSequence> loadFastaAddGeneFeaturesFromUpperCaseExonFastaFile(File fastaSequenceFile,
                                                                                                          File uppercaseFastaFile,
                                                                                                          boolean throwExceptionGeneNotFound)
                                                                                                   throws Exception
Throws:
Exception

outputFastaSequenceLengthGFF3

public static void outputFastaSequenceLengthGFF3(File fastaSequenceFile,
                                                 File gffFile)
                                          throws Exception
Writes a GFF3 feature file that gives the length of each scaffold/chromosome in the FASTA file. GBrowse uses this file to learn the length of each sequence.

Parameters:
fastaSequenceFile - the FASTA file whose sequence lengths are reported
gffFile - the GFF3 file to write
Throws:
Exception
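
The idea behind this method can be sketched in plain Java without BioJava. The following is a minimal illustration, not the library's implementation: the class name SequenceLengthGff3Sketch, the "contig" feature type, the use of the file name as the source column, and the Name attribute are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: one GFF3 row per FASTA record, spanning 1..length.
final class SequenceLengthGff3Sketch {

    /** Returns the sequence length per accession ID, in file order. */
    static Map<String, Integer> sequenceLengths(List<String> fastaLines) {
        Map<String, Integer> lengths = new LinkedHashMap<>();
        String currentId = null;
        for (String line : fastaLines) {
            String trimmed = line.trim();
            if (trimmed.startsWith(">")) {
                // Assumption: the accession ID is the first token of the header.
                currentId = trimmed.substring(1).trim().split("\\s+")[0];
                lengths.put(currentId, 0);
            } else if (currentId != null && !trimmed.isEmpty()) {
                // Sequence lines accumulate onto the current record.
                lengths.merge(currentId, trimmed.length(), Integer::sum);
            }
        }
        return lengths;
    }

    /** Formats the lengths as GFF3 rows (nine tab-separated columns each). */
    static List<String> toGff3(Map<String, Integer> lengths, String source) {
        List<String> out = new ArrayList<>();
        out.add("##gff-version 3");
        for (Map.Entry<String, Integer> e : lengths.entrySet()) {
            // "contig" as the feature type is an assumption of this sketch.
            out.add(String.join("\t", e.getKey(), source, "contig", "1",
                    String.valueOf(e.getValue()), ".", ".", ".",
                    "Name=" + e.getKey()));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> fasta = Arrays.asList(
                ">scaffold_1 demo", "ACGTAC", "GT", ">scaffold_2", "GGCC");
        for (String row : toGff3(sequenceLengths(fasta), "demo.fa")) {
            System.out.println(row);
        }
    }
}
```

Each row covers the whole sequence, so a genome browser reading the file can size every scaffold without loading the FASTA itself.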

loadFastaAddGeneFeaturesFromGeneIDGFF2

public static LinkedHashMap<String,ChromosomeSequence> loadFastaAddGeneFeaturesFromGeneIDGFF2(File fastaSequenceFile,
                                                                                              File gffFile)
                                                                                       throws Exception
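Loads a FASTA file and a GFF2 feature file generated by the geneid prediction algorithm.

Parameters:
fastaSequenceFile - the FASTA file containing the chromosome sequences
gffFile - the GFF2 feature file produced by geneid
Returns:
the chromosome sequences with the gene features added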
Throws:
Exception

addGeneIDGFF2GeneFeatures

public static void addGeneIDGFF2GeneFeatures(LinkedHashMap<String,ChromosomeSequence> chromosomeSequenceList,
                                             FeatureList listGenes)
                                      throws Exception
Maps gene features from GFF2 data generated by the geneid prediction algorithm onto the chromosome sequences.

Parameters:
chromosomeSequenceList - the chromosome sequences to which the gene features are added
listGenes - the gene features read from the GFF2 file
Throws:
Exception

getChromosomeSequenceFromDNASequence

public static LinkedHashMap<String,ChromosomeSequence> getChromosomeSequenceFromDNASequence(LinkedHashMap<String,DNASequence> dnaSequenceList)

loadFastaAddGeneFeaturesFromGmodGFF3

public static LinkedHashMap<String,ChromosomeSequence> loadFastaAddGeneFeaturesFromGmodGFF3(File fastaSequenceFile,
                                                                                            File gffFile,
                                                                                            boolean lazyloadsequences)
                                                                                     throws Exception
GFF3 allows many variations in the ontology terms and descriptors, so a GFF3 file generated or used by a specific application requires a custom parser. This could probably be abstracted, but for now it is easier to use custom code to handle GFF3 elements that are not included in the file but can be extracted from other data elements.

Parameters:
fastaSequenceFile - the FASTA file containing the chromosome sequences
gffFile - the GFF3 feature file
lazyloadsequences - if true, the FASTA file is parsed for accession IDs only, and sequences are read from disk when needed, to save memory
Returns:
the chromosome sequences with the gene features added
Throws:
Exception
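
The lazy-loading behaviour described for lazyloadsequences can be illustrated with a small stand-alone sketch. It is not BioJava's implementation: the class LazyFastaIndexSketch and its methods are hypothetical, and the sketch assumes an ASCII FASTA file in which the accession ID is the first token of each header line. The index stores only the accession IDs and the byte offset where each sequence starts; the residues are read from disk on request.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of lazy sequence loading from a FASTA file.
final class LazyFastaIndexSketch {
    private final File fastaFile;
    // Accession ID -> byte offset of the first sequence line, in file order.
    private final Map<String, Long> offsets = new LinkedHashMap<>();

    LazyFastaIndexSketch(File fastaFile) throws IOException {
        this.fastaFile = fastaFile;
        try (RandomAccessFile raf = new RandomAccessFile(fastaFile, "r")) {
            String line;
            while ((line = raf.readLine()) != null) {
                if (line.startsWith(">")) {
                    String id = line.substring(1).trim().split("\\s+")[0];
                    // After readLine, the file pointer sits at the next line.
                    offsets.put(id, raf.getFilePointer());
                }
            }
        }
    }

    /** Accession IDs found while indexing; no sequence data is held. */
    Set<String> accessionIds() {
        return offsets.keySet();
    }

    /** Reads one sequence from disk, stopping at the next header or EOF. */
    String readSequence(String id) throws IOException {
        Long offset = offsets.get(id);
        if (offset == null) {
            throw new IllegalArgumentException("Unknown accession ID: " + id);
        }
        StringBuilder sequence = new StringBuilder();
        try (RandomAccessFile raf = new RandomAccessFile(fastaFile, "r")) {
            raf.seek(offset);
            String line;
            while ((line = raf.readLine()) != null && !line.startsWith(">")) {
                sequence.append(line.trim());
            }
        }
        return sequence.toString();
    }
}
```

The trade-off is memory against I/O: the index stays small even for large genomes, but every call to readSequence reopens the file and reads from disk.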

addGmodGFF3GeneFeatures

public static void addGmodGFF3GeneFeatures(LinkedHashMap<String,ChromosomeSequence> chromosomeSequenceList,
                                           FeatureList listGenes)
                                    throws Exception
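Adds gene features from GFF3 data, using mRNA as the gene feature because not all GFF3 files are complete.

Parameters:
chromosomeSequenceList - the chromosome sequences to which the gene features are added
listGenes - the gene features read from the GFF3 file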
Throws:
Exception
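
Treating mRNA rows as the gene feature can be sketched as follows. This is an illustration in plain Java rather than the BioJava code: the class MrnaAsGeneSketch and its Gene holder are hypothetical, and the sketch assumes standard nine-column GFF3 rows in which CDS rows reference their transcript through the Parent attribute.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: build gene records from mRNA rows instead of gene rows.
final class MrnaAsGeneSketch {

    /** Hypothetical holder for one mRNA that is treated as a gene. */
    static final class Gene {
        final String id;
        final String seqId;
        final int start;
        final int end;
        final String strand;
        final List<int[]> cds = new ArrayList<>(); // {start, end} pairs

        Gene(String id, String seqId, int start, int end, String strand) {
            this.id = id;
            this.seqId = seqId;
            this.start = start;
            this.end = end;
            this.strand = strand;
        }
    }

    /** Returns the value of one key in a GFF3 attribute column, or null. */
    static String attribute(String attributes, String key) {
        for (String pair : attributes.split(";")) {
            String[] kv = pair.trim().split("=", 2);
            if (kv.length == 2 && kv[0].equals(key)) {
                return kv[1];
            }
        }
        return null;
    }

    static Map<String, Gene> genesFromMrna(List<String> gffLines) {
        Map<String, Gene> genes = new LinkedHashMap<>();
        // Pass 1: every mRNA row with an ID becomes a gene record.
        for (String line : gffLines) {
            String[] c = line.split("\t");
            if (line.startsWith("#") || c.length < 9 || !c[2].equals("mRNA")) {
                continue;
            }
            String id = attribute(c[8], "ID");
            if (id != null) {
                genes.put(id, new Gene(id, c[0], Integer.parseInt(c[3]),
                        Integer.parseInt(c[4]), c[6]));
            }
        }
        // Pass 2: attach CDS rows to their parent mRNA; orphans are skipped.
        for (String line : gffLines) {
            String[] c = line.split("\t");
            if (line.startsWith("#") || c.length < 9 || !c[2].equals("CDS")) {
                continue;
            }
            String parents = attribute(c[8], "Parent");
            if (parents == null) {
                continue;
            }
            for (String parent : parents.split(",")) {
                Gene gene = genes.get(parent);
                if (gene != null) {
                    gene.cds.add(new int[] {Integer.parseInt(c[3]),
                            Integer.parseInt(c[4])});
                }
            }
        }
        return genes;
    }
}
```

Two passes are used so that CDS rows are attached correctly even when they appear before their mRNA row in the file.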

loadFastaAddGeneFeaturesFromGlimmerGFF3

public static LinkedHashMap<String,ChromosomeSequence> loadFastaAddGeneFeaturesFromGlimmerGFF3(File fastaSequenceFile,
                                                                                               File gffFile)
                                                                                        throws Exception
Throws:
Exception

addGlimmerGFF3GeneFeatures

public static void addGlimmerGFF3GeneFeatures(LinkedHashMap<String,ChromosomeSequence> chromosomeSequenceList,
                                              FeatureList listGenes)
                                       throws Exception
Throws:
Exception

loadFastaAddGeneFeaturesFromGeneMarkGTF

public static LinkedHashMap<String,ChromosomeSequence> loadFastaAddGeneFeaturesFromGeneMarkGTF(File fastaSequenceFile,
                                                                                               File gffFile)
                                                                                        throws Exception
Throws:
Exception

addGeneMarkGTFGeneFeatures

public static void addGeneMarkGTFGeneFeatures(LinkedHashMap<String,ChromosomeSequence> chromosomeSequenceList,
                                              FeatureList listGenes)
                                       throws Exception
Throws:
Exception

getProteinSequences

public static LinkedHashMap<String,ProteinSequence> getProteinSequences(Collection<ChromosomeSequence> chromosomeSequences)
                                                                 throws Exception
Throws:
Exception

getGeneSequences

public static LinkedHashMap<String,GeneSequence> getGeneSequences(Collection<ChromosomeSequence> chromosomeSequences)
                                                           throws Exception
Throws:
Exception

main

public static void main(String[] args)
                 throws Exception
Throws:
Exception