Class TranscriptSequence

  • All Implemented Interfaces:
    java.lang.Iterable<NucleotideCompound>, Accessioned, Sequence<NucleotideCompound>

    public class TranscriptSequence
    extends DNASequence
    Represents a transcript, the sequence to use when going from a gene sequence to a protein sequence. Start with a ChromosomeSequence, get a GeneSequence from it, and then get a TranscriptSequence from the gene.
    Author:
    Scooter Willis
    • Constructor Detail

      • TranscriptSequence

        public TranscriptSequence​(GeneSequence parentDNASequence,
                                  int begin,
                                  int end)
        Parameters:
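        parentDNASequence - the parent GeneSequence that contains this transcript
        begin - start position of the transcript
        end - end position of the transcript (inclusive)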
    • Method Detail

      • getStrand

        public Strand getStrand()
        Returns:
        the strand
      • removeCDS

        public CDSSequence removeCDS​(java.lang.String accession)
        Remove a CDS (coding sequence) from the transcript sequence.
        Parameters:
        accession - accession of the CDS to remove
        Returns:
        the removed CDSSequence
      • getCDSSequences

        public java.util.LinkedHashMap<java.lang.String,​CDSSequence> getCDSSequences()
        Get the CDS sequences that have been added to the TranscriptSequence.
        Returns:
        the CDS sequences, keyed by accession
      • addCDS

        public CDSSequence addCDS​(AccessionID accession,
                                  int begin,
                                  int end,
                                  int phase)
                           throws java.lang.Exception
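        Add a coding sequence (CDS) region with a phase to the transcript sequence.
        Parameters:
        accession - accession of the CDS
        begin - start position of the CDS
        end - end position of the CDS
        phase - 0, 1, or 2
        Returns:
        the CDSSequence that was added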
        Throws:
        java.lang.Exception
      • getProteinCDSSequences

        public java.util.ArrayList<ProteinSequence> getProteinCDSSequences()
        Return a list of protein sequences, one based on each CDS sequence, where a codon split by the phase shift between two CDS sequences is assigned to the CDS sequence that starts the triplet. This can be used to map a CDS/exon region of a protein sequence back to the DNA sequence. If you have a protein sequence and a predicted gene, you can take the predicted CDS protein sequences and align them back to the protein sequence. Errors in mapping the predicted protein CDS regions to the known protein sequence point to possible errors in the prediction.
        Returns:
        the protein sequences derived from the CDS sequences
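The "assigned to the CDS sequence that starts the triplet" rule can be illustrated with a small stand-alone sketch. It does not use BioJava and is not the library's implementation; the class name `SplitCodonSketch` and the method `codonsPerSegment` are hypothetical, and the sketch assumes positive-strand CDS segments given in transcript order, with the first segment starting in frame. It returns codons as DNA rather than translating them.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-alone illustration; none of these names come from BioJava.
final class SplitCodonSketch {

    /**
     * Splits the concatenated coding sequence into codons and returns, for
     * each CDS segment, the codons whose first base lies in that segment.
     * A codon that spans two segments therefore goes to the earlier one.
     */
    static List<String> codonsPerSegment(List<String> segments) {
        String coding = String.join("", segments);
        List<StringBuilder> buckets = new ArrayList<>();
        for (int i = 0; i < segments.size(); i++) {
            buckets.add(new StringBuilder());
        }

        int segment = 0; // index of the segment that contains codonStart
        // Exclusive end offset of the current segment within `coding`.
        int segmentEnd = segments.isEmpty() ? 0 : segments.get(0).length();

        for (int codonStart = 0; codonStart + 3 <= coding.length(); codonStart += 3) {
            // Advance to the segment in which this codon starts.
            while (codonStart >= segmentEnd) {
                segment++;
                segmentEnd += segments.get(segment).length();
            }
            buckets.get(segment).append(coding, codonStart, codonStart + 3);
        }

        List<String> result = new ArrayList<>();
        for (StringBuilder bucket : buckets) {
            result.add(bucket.toString());
        }
        return result;
    }
}
```

For the segments `ATGGC`, `TAAACC`, and `CTGA`, the codons `GCT` and `CCC` each span a segment boundary, and the sketch returns `ATGGCT`, `AAACCC`, and `TGA`: each split codon stays with the segment in which it begins.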
      • getDNACodingSequence

        public DNASequence getDNACodingSequence()
        Get the CDS sequences stitched together into a single sequence, which maps to the cDNA.
        Returns:
        the coding DNA sequence
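As a rough picture of what "stitched together" means, the following stand-alone sketch concatenates CDS regions cut from a parent sequence. It is an illustration only, not BioJava code: the class `CdsStitchSketch` and method `stitch` are hypothetical, regions are assumed to be 1-based with an inclusive end, and only the positive strand is handled.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Stand-alone illustration; none of these names come from BioJava.
final class CdsStitchSketch {

    /**
     * Concatenates the given regions of `parent` in order of their start
     * position. Each region is {begin, end}, 1-based, end inclusive
     * (an assumption of this sketch).
     */
    static String stitch(String parent, List<int[]> regions) {
        List<int[]> sorted = new ArrayList<>(regions);
        sorted.sort(Comparator.comparingInt(r -> r[0]));

        StringBuilder coding = new StringBuilder();
        for (int[] region : sorted) {
            // Convert 1-based inclusive coordinates to a 0-based half-open range.
            coding.append(parent, region[0] - 1, region[1]);
        }
        return coding.toString();
    }
}
```

With the parent `GGATGAAACCCTTTGGGTAACC` and the regions 3..8 and 15..20, the sketch returns `ATGAAAGGGTAA`, regardless of the order in which the regions are supplied.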
      • getProteinSequence

        public ProteinSequence getProteinSequence()
        Get the protein sequence.
        Returns:
        the protein sequence
      • getProteinSequence

        public ProteinSequence getProteinSequence​(TranscriptionEngine engine)
        Get the protein sequence using a user-defined TranscriptionEngine.
        Parameters:
        engine - the TranscriptionEngine to use
        Returns:
        the protein sequence
      • getStartCodonSequence

        public StartCodonSequence getStartCodonSequence()
        Returns:
        the startCodonSequence
      • addStartCodonSequence

        public void addStartCodonSequence​(AccessionID accession,
                                          int begin,
                                          int end)
        Parameters:
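        accession - accession of the start codon sequence to set
        begin - start position of the start codon sequence
        end - end position of the start codon sequence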
      • getStopCodonSequence

        public StopCodonSequence getStopCodonSequence()
        Returns:
        the stopCodonSequence
      • addStopCodonSequence

        public void addStopCodonSequence​(AccessionID accession,
                                         int begin,
                                         int end)
        Parameters:
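        accession - accession of the stop codon sequence to set
        begin - start position of the stop codon sequence
        end - end position of the stop codon sequence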