org.biojava3.alignment
Class SimpleProfile<S extends Sequence<C>,C extends Compound>

java.lang.Object
  extended by org.biojava3.alignment.SimpleProfile<S,C>
Type Parameters:
S - each element of the alignment Profile is of type S
C - each element of an AlignedSequence is a Compound of type C
All Implemented Interfaces:
Iterable<AlignedSequence<S,C>>, Profile<S,C>
Direct Known Subclasses:
SimpleProfilePair, SimpleSequencePair

public class SimpleProfile<S extends Sequence<C>,C extends Compound>
extends Object
implements Profile<S,C>

Implements a data structure for the results of sequence alignment. Every List returned is unmodifiable.
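The unmodifiable-List guarantee can be illustrated with a minimal sketch. It does not use BioJava: the class UnmodifiableRowsSketch and its methods are hypothetical stand-ins, and only java.util.Collections.unmodifiableList is a real API. How SimpleProfile wraps its lists internally is not stated on this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a profile that hands out read-only lists; not the BioJava class.
final class UnmodifiableRowsSketch {
    private final List<String> rows;

    UnmodifiableRowsSketch(List<String> rows) {
        // Copy defensively, then wrap so callers cannot alter the stored rows.
        this.rows = Collections.unmodifiableList(new ArrayList<String>(rows));
    }

    List<String> getRows() {
        return rows;
    }

    // Returns true if the list refuses add(), as an unmodifiable List does.
    static boolean rejectsAdd(List<String> list) {
        try {
            list.add("ACGT");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        UnmodifiableRowsSketch p = new UnmodifiableRowsSketch(Arrays.asList("AC-T", "ACGT"));
        System.out.println(rejectsAdd(p.getRows()));
    }
}
```

Callers that need a mutable collection would copy the returned List first, for example with new ArrayList<...>(list).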

Author:
Mark Chapman

Nested Class Summary
 
Nested classes/interfaces inherited from interface org.biojava3.alignment.template.Profile
Profile.StringFormat
 
Field Summary
protected static SubstitutionMatrix<AminoAcidCompound> matrix
           
 
Constructor Summary
protected SimpleProfile(AlignedSequence<S,C> query, AlignedSequence<S,C> target)
          Creates a pair profile for the given already aligned sequences.
protected SimpleProfile(Profile<S,C> query, Profile<S,C> target, List<AlignedSequence.Step> sx, List<AlignedSequence.Step> sy)
          Creates a pair profile for the given profiles.
  SimpleProfile(S sequence)
          Creates a profile from a single sequence.
protected SimpleProfile(S query, S target, List<AlignedSequence.Step> sx, int xb, int xa, List<AlignedSequence.Step> sy, int yb, int ya)
          Creates a pair profile for the given sequences.
 
Method Summary
 AlignedSequence<S,C> getAlignedSequence(int listIndex)
          Returns the AlignedSequence at the given index.
 AlignedSequence<S,C> getAlignedSequence(S sequence)
          Searches for the given Sequence within this alignment profile.
 List<AlignedSequence<S,C>> getAlignedSequences()
          Returns a List containing the individual AlignedSequences of this alignment.
 List<AlignedSequence<S,C>> getAlignedSequences(int... listIndices)
          Returns a List containing some of the individual AlignedSequences of this alignment.
 List<AlignedSequence<S,C>> getAlignedSequences(S... sequences)
          Returns a List containing some of the individual AlignedSequences of this alignment.
 C getCompoundAt(int listIndex, int alignmentIndex)
          Returns the Compound in the row of the given sequence and the column of the given alignment index.
 C getCompoundAt(S sequence, int alignmentIndex)
          Returns the Compound in the row of the given sequence and the column of the given alignment index.
 int[] getCompoundCountsAt(int alignmentIndex)
          Returns the number of each Compound in the given column for all compounds in CompoundSet.
 int[] getCompoundCountsAt(int alignmentIndex, List<C> compounds)
          Returns the number of each Compound in the given column only for compounds in the given list.
 List<C> getCompoundsAt(int alignmentIndex)
          Returns the Compound elements of the original Sequences at the given column.
 CompoundSet<C> getCompoundSet()
          Returns the CompoundSet of all AlignedSequences.
 float[] getCompoundWeightsAt(int alignmentIndex)
          Returns the fraction of each Compound in the given column for all compounds in CompoundSet.
 float[] getCompoundWeightsAt(int alignmentIndex, List<C> compounds)
          Returns the fraction of each Compound in the given column only for compounds in the given list.
 int getIndexOf(C compound)
          Searches for the given Compound within this alignment profile.
 int[] getIndicesAt(int alignmentIndex)
          Returns the indices in the original Sequences corresponding to the given column.
 int getLastIndexOf(C compound)
          Searches for the given Compound within this alignment profile.
 int getLength()
          Returns the number of columns in the alignment profile.
 List<S> getOriginalSequences()
          Returns a List containing the original Sequences used for alignment.
 int getSize()
          Returns the number of rows in this profile.
 ProfileView<S,C> getSubProfile(Location location)
          Returns a ProfileView windowed to contain only the given Location.
 boolean hasGap(int alignmentIndex)
          Returns true if any AlignedSequence has a gap at the given index.
 boolean isCircular()
          Returns true if any AlignedSequence is circular.
 Iterator<AlignedSequence<S,C>> iterator()
          Returns an iterator over the AlignedSequences of this profile.
 String toString()
          Returns a simple view of the alignment profile.
 String toString(int width)
          Returns a formatted view of the alignment profile.
 String toString(Profile.StringFormat format)
          Returns a formatted view of the alignment profile.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

matrix

protected static final SubstitutionMatrix<AminoAcidCompound> matrix

Constructor Detail

SimpleProfile

protected SimpleProfile(AlignedSequence<S,C> query,
                        AlignedSequence<S,C> target)
Creates a pair profile for the given already aligned sequences.

Parameters:
query - the first sequence of the pair
target - the second sequence of the pair
Throws:
IllegalArgumentException - if sequences differ in size

SimpleProfile

public SimpleProfile(S sequence)
Creates a profile from a single sequence.

Parameters:
sequence - sequence to seed profile

SimpleProfile

protected SimpleProfile(S query,
                        S target,
                        List<AlignedSequence.Step> sx,
                        int xb,
                        int xa,
                        List<AlignedSequence.Step> sy,
                        int yb,
                        int ya)
Creates a pair profile for the given sequences.

Parameters:
query - the first sequence of the pair
target - the second sequence of the pair
sx - lists whether the query sequence aligns a Compound or gap at each index of the alignment
xb - number of Compounds skipped in the query sequence before the aligned region
xa - number of Compounds skipped in the query sequence after the aligned region
sy - lists whether the target sequence aligns a Compound or gap at each index of the alignment
yb - number of Compounds skipped in the target sequence before the aligned region
ya - number of Compounds skipped in the target sequence after the aligned region
Throws:
IllegalArgumentException - if alignments differ in size or given sequences do not fit in alignments
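How the step lists and the skipped-Compound counts combine can be sketched without BioJava. Everything below is hypothetical: StepRowSketch and its Step enum only model the COMPOUND/GAP distinction described for sx and sy, sequences are plain Strings, and gaps are drawn as '-'. The sketch assumes that "fit" means skipped plus aligned Compounds account for the whole sequence; the actual check is not spelled out on this page.

```java
import java.util.Arrays;
import java.util.List;

// Self-contained model; Step is redeclared here and is not the library's enum.
final class StepRowSketch {
    enum Step { COMPOUND, GAP }

    // Builds one gapped row from an ungapped sequence.
    // before/after play the role of xb/xa (or yb/ya): Compounds skipped outside the aligned region.
    static String alignedRow(String sequence, List<Step> steps, int before, int after) {
        int compounds = 0;
        for (Step s : steps) {
            if (s == Step.COMPOUND) {
                compounds++;
            }
        }
        // Assumed meaning of "fit": skipped + aligned Compounds cover the whole sequence.
        if (before < 0 || after < 0 || before + compounds + after != sequence.length()) {
            throw new IllegalArgumentException("sequence does not fit in alignment");
        }
        StringBuilder row = new StringBuilder();
        int next = before; // 0-based position of the next Compound to place
        for (Step s : steps) {
            row.append(s == Step.COMPOUND ? sequence.charAt(next++) : '-');
        }
        return row.toString();
    }

    // Builds both rows of a pair; the two step lists must describe the same number of columns.
    static String[] alignedPair(String query, String target,
                                List<Step> sx, int xb, int xa,
                                List<Step> sy, int yb, int ya) {
        if (sx.size() != sy.size()) {
            throw new IllegalArgumentException("alignments differ in size");
        }
        return new String[] { alignedRow(query, sx, xb, xa), alignedRow(target, sy, yb, ya) };
    }

    public static void main(String[] args) {
        List<Step> sx = Arrays.asList(Step.COMPOUND, Step.COMPOUND, Step.GAP, Step.COMPOUND, Step.COMPOUND);
        List<Step> sy = Arrays.asList(Step.COMPOUND, Step.COMPOUND, Step.COMPOUND, Step.COMPOUND, Step.COMPOUND);
        String[] rows = alignedPair("TTACGT", "ACAGTCC", sx, 2, 0, sy, 0, 2);
        System.out.println(rows[0]);
        System.out.println(rows[1]);
    }
}
```

In this example the query skips two leading Compounds (xb = 2) and the target skips two trailing ones (ya = 2), so only the aligned region appears in each row.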

SimpleProfile

protected SimpleProfile(Profile<S,C> query,
                        Profile<S,C> target,
                        List<AlignedSequence.Step> sx,
                        List<AlignedSequence.Step> sy)
Creates a pair profile for the given profiles.

Parameters:
query - the first profile of the pair
target - the second profile of the pair
sx - lists whether the query profile aligns a Compound or gap at each index of the alignment
sy - lists whether the target profile aligns a Compound or gap at each index of the alignment
Throws:
IllegalArgumentException - if alignments differ in size or given profiles do not fit in alignments

Method Detail

getAlignedSequence

public AlignedSequence<S,C> getAlignedSequence(int listIndex)
Description copied from interface: Profile
Returns the AlignedSequence at the given index.

Specified by:
getAlignedSequence in interface Profile<S extends Sequence<C>,C extends Compound>
Parameters:
listIndex - index of sequence in profile
Returns:
desired sequence

getAlignedSequence

public AlignedSequence<S,C> getAlignedSequence(S sequence)
Description copied from interface: Profile
Searches for the given Sequence within this alignment profile. Returns the corresponding AlignedSequence.

Specified by:
getAlignedSequence in interface Profile<S extends Sequence<C>,C extends Compound>
Parameters:
sequence - an original Sequence
Returns:
the corresponding AlignedSequence

getAlignedSequences

public List<AlignedSequence<S,C>> getAlignedSequences()
Description copied from interface: Profile
Returns a List containing the individual AlignedSequences of this alignment.

Specified by:
getAlignedSequences in interface Profile<S extends Sequence<C>,C extends Compound>
Returns:
list of aligned sequences

getAlignedSequences

public List<AlignedSequence<S,C>> getAlignedSequences(int... listIndices)
Description copied from interface: Profile
Returns a List containing some of the individual AlignedSequences of this alignment.

Specified by:
getAlignedSequences in interface Profile<S extends Sequence<C>,C extends Compound>
Parameters:
listIndices - indices of sequences in profile
Returns:
list of aligned sequences

getAlignedSequences

public List<AlignedSequence<S,C>> getAlignedSequences(S... sequences)
Description copied from interface: Profile
Returns a List containing some of the individual AlignedSequences of this alignment.

Specified by:
getAlignedSequences in interface Profile<S extends Sequence<C>,C extends Compound>
Parameters:
sequences - original Sequences
Returns:
list of aligned sequences

getCompoundAt

public C getCompoundAt(int listIndex,
                       int alignmentIndex)
Description copied from interface: Profile
Returns the Compound in the row of the given sequence and the column of the given alignment index. If the given sequence has overlap, this returns the Compound from the top row of the sequence.

Specified by:
getCompoundAt in interface Profile<S extends Sequence<C>,C extends Compound>
Parameters:
listIndex - index of sequence in profile
alignmentIndex - column index within an alignment
Returns:
the sequence element

getCompoundAt

public C getCompoundAt(S sequence,
                       int alignmentIndex)
Description copied from interface: Profile
Returns the Compound in the row of the given sequence and the column of the given alignment index. If the given sequence has overlap, this returns the Compound from the top row of the sequence.

Specified by:
getCompoundAt in interface Profile<S extends Sequence<C>,C extends Compound>
Parameters:
sequence - either an AlignedSequence or an original Sequence
alignmentIndex - column index within an alignment
Returns:
the sequence element

getCompoundCountsAt

public int[] getCompoundCountsAt(int alignmentIndex)
Description copied from interface: Profile
Returns the number of each Compound in the given column for all compounds in CompoundSet.

Specified by:
getCompoundCountsAt in interface Profile<S extends Sequence<C>,C extends Compound>
Parameters:
alignmentIndex - column index within an alignment
Returns:
array of counts

getCompoundCountsAt

public int[] getCompoundCountsAt(int alignmentIndex,
                                 List<C> compounds)
Description copied from interface: Profile
Returns the number of each Compound in the given column only for compounds in the given list.

Specified by:
getCompoundCountsAt in interface Profile<S extends Sequence<C>,C extends Compound>
Parameters:
alignmentIndex - column index within an alignment
compounds - list of compounds to count
Returns:
corresponding array of counts, one per compound in the given list
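Per-column counting can be sketched on a plain-text model of an alignment. This is not the BioJava implementation: ColumnCountsSketch is a hypothetical class, rows are gapped Strings with '-' for gaps, Compounds are chars, and alignmentIndex is treated as 1-based, which is an assumption of the sketch.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model: each row is a gapped String, each Compound a char.
final class ColumnCountsSketch {
    // Counts, for one column, how often each listed compound occurs.
    // The result has one entry per element of compounds, in the same order.
    static int[] countsAt(List<String> rows, int alignmentIndex, List<Character> compounds) {
        int[] counts = new int[compounds.size()];
        for (String row : rows) {
            // alignmentIndex is assumed 1-based here, hence the -1.
            int i = compounds.indexOf(Character.valueOf(row.charAt(alignmentIndex - 1)));
            if (i >= 0) {
                counts[i]++; // symbols outside the list (including gaps) are ignored
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("ACGT", "A-GT", "TCGA");
        List<Character> dna = Arrays.asList('A', 'C', 'G', 'T');
        System.out.println(Arrays.toString(countsAt(rows, 1, dna)));
    }
}
```

Passing a shorter list restricts the count to the compounds of interest, which mirrors the difference between the two getCompoundCountsAt overloads.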

getCompoundsAt

public List<C> getCompoundsAt(int alignmentIndex)
Description copied from interface: Profile
Returns the Compound elements of the original Sequences at the given column.

Specified by:
getCompoundsAt in interface Profile<S extends Sequence<C>,C extends Compound>
Parameters:
alignmentIndex - column index within an alignment
Returns:
the sequence elements

getCompoundSet

public CompoundSet<C> getCompoundSet()
Description copied from interface: Profile
Returns the CompoundSet of all AlignedSequences.

Specified by:
getCompoundSet in interface Profile<S extends Sequence<C>,C extends Compound>
Returns:
set of Compounds in contained sequences

getCompoundWeightsAt

public float[] getCompoundWeightsAt(int alignmentIndex)
Description copied from interface: Profile
Returns the fraction of each Compound in the given column for all compounds in CompoundSet.

Specified by:
getCompoundWeightsAt in interface Profile<S extends Sequence<C>,C extends Compound>
Parameters:
alignmentIndex - column index within an alignment
Returns:
array of fractional weights

getCompoundWeightsAt

public float[] getCompoundWeightsAt(int alignmentIndex,
                                    List<C> compounds)
Description copied from interface: Profile
Returns the fraction of each Compound in the given column only for compounds in the given list.

Specified by:
getCompoundWeightsAt in interface Profile<S extends Sequence<C>,C extends Compound>
Parameters:
alignmentIndex - column index within an alignment
compounds - list of compounds to count
Returns:
corresponding array of fractional weights, one per compound in the given list
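Turning column counts into fractions can be sketched as follows. The class ColumnWeightsSketch is hypothetical and works on gapped Strings rather than BioJava types. The page does not state the denominator; this sketch assumes each fraction is taken over the listed compounds found in the column, so gaps and unlisted symbols do not contribute, and an all-gap column yields all zeros. alignmentIndex is again treated as 1-based.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model: each row is a gapped String, each Compound a char.
final class ColumnWeightsSketch {
    // Returns, for one column, the fraction of counted symbols that each listed compound represents.
    static float[] weightsAt(List<String> rows, int alignmentIndex, List<Character> compounds) {
        int[] counts = new int[compounds.size()];
        int total = 0;
        for (String row : rows) {
            int i = compounds.indexOf(Character.valueOf(row.charAt(alignmentIndex - 1)));
            if (i >= 0) {
                counts[i]++;
                total++; // assumption: only listed compounds enter the denominator
            }
        }
        float[] weights = new float[counts.length];
        if (total > 0) { // avoid dividing by zero for columns with no listed compound
            for (int i = 0; i < counts.length; i++) {
                weights[i] = (float) counts[i] / total;
            }
        }
        return weights;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("ACGT", "A-GT", "TCGA");
        List<Character> dna = Arrays.asList('A', 'C', 'G', 'T');
        System.out.println(Arrays.toString(weightsAt(rows, 1, dna)));
    }
}
```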

getIndexOf

public int getIndexOf(C compound)
Description copied from interface: Profile
Searches for the given Compound within this alignment profile. Returns the column index nearest to the start of the alignment profile, or -1 if not found.

Specified by:
getIndexOf in interface Profile<S extends Sequence<C>,C extends Compound>
Parameters:
compound - search element
Returns:
index of column containing search element nearest to the start of the alignment profile

getIndicesAt

public int[] getIndicesAt(int alignmentIndex)
Description copied from interface: Profile
Returns the indices in the original Sequences corresponding to the given column. All indices are 1-indexed and inclusive.

Specified by:
getIndicesAt in interface Profile<S extends Sequence<C>,C extends Compound>
Parameters:
alignmentIndex - column index within an alignment
Returns:
the sequence indices
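The mapping from an alignment column back to 1-indexed positions in the ungapped sequences can be sketched as below. ColumnIndicesSketch is hypothetical and models rows as gapped Strings that start at the first Compound of each sequence. This page does not say what is reported for a row that has a gap in the column; the sketch uses 0 purely as its own marker, and it treats alignmentIndex as 1-based.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model: each row is a gapped String covering its whole original sequence.
final class ColumnIndicesSketch {
    // Sketch-only marker for a gap; not a documented library value.
    static final int GAP_MARKER = 0;

    // For each row, returns the 1-indexed position in the ungapped sequence
    // of the Compound sitting in the given column.
    static int[] indicesAt(List<String> rows, int alignmentIndex) {
        int[] indices = new int[rows.size()];
        for (int r = 0; r < rows.size(); r++) {
            String row = rows.get(r);
            if (row.charAt(alignmentIndex - 1) == '-') {
                indices[r] = GAP_MARKER;
                continue;
            }
            int seen = 0; // Compounds up to and including this column
            for (int col = 0; col < alignmentIndex; col++) {
                if (row.charAt(col) != '-') {
                    seen++;
                }
            }
            indices[r] = seen;
        }
        return indices;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("AC-T", "ACGT");
        System.out.println(Arrays.toString(indicesAt(rows, 4)));
    }
}
```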

getLastIndexOf

public int getLastIndexOf(C compound)
Description copied from interface: Profile
Searches for the given Compound within this alignment profile. Returns the column index nearest to the end of the alignment profile, or -1 if not found.

Specified by:
getLastIndexOf in interface Profile<S extends Sequence<C>,C extends Compound>
Parameters:
compound - search element
Returns:
index of column containing search element nearest to the end of the alignment profile
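getLastIndexOf mirrors getIndexOf: one scans columns from the start, the other from the end, and both return -1 when no column contains the Compound. A minimal sketch of the two scans, using a hypothetical class CompoundSearchSketch over gapped Strings and assuming 1-based column indices:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model: each row is a gapped String, each Compound a char.
final class CompoundSearchSketch {
    // True if any row holds the compound in the given 1-based column.
    private static boolean columnContains(List<String> rows, int column, char compound) {
        for (String row : rows) {
            if (row.charAt(column - 1) == compound) {
                return true;
            }
        }
        return false;
    }

    // First column (nearest the start) containing the compound, or -1.
    static int indexOf(List<String> rows, char compound) {
        int length = rows.isEmpty() ? 0 : rows.get(0).length();
        for (int column = 1; column <= length; column++) {
            if (columnContains(rows, column, compound)) {
                return column;
            }
        }
        return -1;
    }

    // Last column (nearest the end) containing the compound, or -1.
    static int lastIndexOf(List<String> rows, char compound) {
        int length = rows.isEmpty() ? 0 : rows.get(0).length();
        for (int column = length; column >= 1; column--) {
            if (columnContains(rows, column, compound)) {
                return column;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("AC-T", "ACGA");
        System.out.println(indexOf(rows, 'A') + " " + lastIndexOf(rows, 'A'));
    }
}
```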

getLength

public int getLength()
Description copied from interface: Profile
Returns the number of columns in the alignment profile.

Specified by:
getLength in interface Profile<S extends Sequence<C>,C extends Compound>
Returns:
the number of columns

getOriginalSequences

public List<S> getOriginalSequences()
Description copied from interface: Profile
Returns a List containing the original Sequences used for alignment.

Specified by:
getOriginalSequences in interface Profile<S extends Sequence<C>,C extends Compound>
Returns:
list of original sequences

getSize

public int getSize()
Description copied from interface: Profile
Returns the number of rows in this profile. If any AlignedSequences are circular and overlap within the alignment, the returned size will be greater than the number of sequences; otherwise, the two numbers will be equal.

Specified by:
getSize in interface Profile<S extends Sequence<C>,C extends Compound>
Returns:
number of rows

getSubProfile

public ProfileView<S,C> getSubProfile(Location location)
Description copied from interface: Profile
Returns a ProfileView windowed to contain only the given Location. This only includes the AlignedSequences which overlap the location.

Specified by:
getSubProfile in interface Profile<S extends Sequence<C>,C extends Compound>
Parameters:
location - portion of profile to view
Returns:
a windowed view of the profile

hasGap

public boolean hasGap(int alignmentIndex)
Description copied from interface: Profile
Returns true if any AlignedSequence has a gap at the given index.

Specified by:
hasGap in interface Profile<S extends Sequence<C>,C extends Compound>
Parameters:
alignmentIndex - column index within an alignment
Returns:
true if any AlignedSequence has a gap at the given index
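The column test amounts to an "any row" check. A minimal sketch on a hypothetical model (GapCheckSketch, rows as gapped Strings, '-' as the gap symbol, alignmentIndex assumed 1-based):

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model: each row is a gapped String with '-' marking a gap.
final class GapCheckSketch {
    // True as soon as one row has a gap in the given column.
    static boolean hasGap(List<String> rows, int alignmentIndex) {
        for (String row : rows) {
            if (row.charAt(alignmentIndex - 1) == '-') {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("AC-T", "ACGT");
        System.out.println(hasGap(rows, 3));
    }
}
```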

isCircular

public boolean isCircular()
Description copied from interface: Profile
Returns true if any AlignedSequence is circular. If so, sequences may simply wrap around from the end to the start of the alignment or they may contribute multiple overlapping lines to the profile.

Specified by:
isCircular in interface Profile<S extends Sequence<C>,C extends Compound>
Returns:
true if any AlignedSequence is circular

toString

public String toString(int width)
Description copied from interface: Profile
Returns a formatted view of the alignment profile. This shows the start and end indices of the profile and each sequence for each group of lines of the given width. Each line may also be labeled.

Specified by:
toString in interface Profile<S extends Sequence<C>,C extends Compound>
Parameters:
width - limit on the line length
Returns:
a formatted view of the alignment profile
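Splitting an alignment into groups of lines can be sketched as below. This is a simplified illustration, not the library's output format: ProfileBlocksSketch is hypothetical, width is taken as the number of alignment columns per group (the parameter above is described as a limit on line length), and each line is labeled only with the start and end column of its group rather than per-sequence indices.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical formatter over gapped Strings; the layout is an assumption of this sketch.
final class ProfileBlocksSketch {
    // Emits the rows in groups of at most `width` columns, groups separated by a blank line.
    // Each line reads: <first column> <residues> <last column>, with 1-based columns.
    static String format(List<String> rows, int width) {
        if (width < 1) {
            throw new IllegalArgumentException("width must be positive");
        }
        int length = rows.isEmpty() ? 0 : rows.get(0).length();
        StringBuilder out = new StringBuilder();
        for (int start = 0; start < length; start += width) {
            int end = Math.min(start + width, length);
            if (start > 0) {
                out.append('\n'); // blank line between groups
            }
            for (String row : rows) {
                out.append(start + 1).append(' ')
                   .append(row, start, end).append(' ')
                   .append(end).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(format(Arrays.asList("ACGTAC", "A-GTAC"), 4));
    }
}
```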

toString

public String toString(Profile.StringFormat format)
Description copied from interface: Profile
Returns a formatted view of the alignment profile. Details depend on the format given.

Specified by:
toString in interface Profile<S extends Sequence<C>,C extends Compound>
Parameters:
format - output format
Returns:
a formatted view of the alignment profile

toString

public String toString()
Description copied from interface: Profile
Returns a simple view of the alignment profile. This shows each sequence on a separate line (or multiple lines, if circular) and nothing more. This should result in Profile.getSize() lines with Profile.getLength() Compounds per line.

Specified by:
toString in interface Profile<S extends Sequence<C>,C extends Compound>
Overrides:
toString in class Object
Returns:
a simple view of the alignment profile

iterator

public Iterator<AlignedSequence<S,C>> iterator()
Specified by:
iterator in interface Iterable<AlignedSequence<S extends Sequence<C>,C extends Compound>>