All Classes
Class |
Description |
AcgtTree |
ACGT tree
|
AlleleCountStats |
Count singletons and other allele counts per sample
|
ArrayUtil |
Methods for manipulating arrays.
|
AutoHashMap<K,V> |
A Hash that creates new elements if they don't exist
|
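As a rough illustration of the "create on first access" idea behind AutoHashMap, here is a minimal sketch built on `java.util.HashMap.computeIfAbsent`. The class name `AutoMapSketch`, the `Supplier`-based constructor and the `getOrCreate` method are assumptions made for this example; the real class may expose a different API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical illustration; not the library's AutoHashMap. */
final class AutoMapSketch<K, V> {
    private final Map<K, V> map = new HashMap<>();
    private final Supplier<V> factory; // builds the value for a missing key

    AutoMapSketch(Supplier<V> factory) {
        this.factory = factory;
    }

    /** Returns the value for key, creating and storing a new one if it is missing. */
    V getOrCreate(K key) {
        return map.computeIfAbsent(key, k -> factory.get());
    }

    int size() {
        return map.size();
    }
}
```

With a list factory, `getOrCreate("chr1").add(42)` works on the first call without an explicit `put`.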
Average |
A simple class that calculates averages
|
AverageInt |
A simple class that calculates average of integer numbers
|
BasesChangeCounter |
Counts how many bases changed, given an XOR between two longs
|
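One way to count changed bases from an XOR of two packed words is sketched below. It assumes the 2-bits-per-base encoding described for DnaSequence on this page; `BaseDiffSketch` and `changedBases` are hypothetical names, not the library's API.

```java
/** Hypothetical sketch: counts differing 2-bit bases from an XOR of two packed words. */
final class BaseDiffSketch {
    // Mask selecting the low bit of every 2-bit base slot.
    private static final long LOW_BITS = 0x5555555555555555L;

    /** Number of 2-bit base positions that differ, given xor = a ^ b. */
    static int changedBases(long xor) {
        // A base differs if either of its two bits is set in the XOR,
        // so fold the high bit of each slot onto the low bit and count.
        long anyBitSet = (xor | (xor >>> 1)) & LOW_BITS;
        return Long.bitCount(anyBitSet);
    }
}
```

Folding before counting matters: a plain `Long.bitCount(xor)` would count a base twice when both of its bits differ.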
BBCompressionUtils |
Created by IntelliJ IDEA.
|
BBFileHeader |
|
BBFileReader |
|
BBTotalSummaryBlock |
Created by IntelliJ IDEA.
|
BBZoomLevelFormat |
Created by IntelliJ IDEA.
|
BBZoomLevelHeader |
Created by IntelliJ IDEA.
|
BBZoomLevels |
Created by IntelliJ IDEA.
|
BedAnnotationOutputFormatter |
Formats: Show all annotations that intersect the BED input file.
|
BedFeature |
Created by IntelliJ IDEA.
|
BedFileIterator |
Opens a sequence change file and iterates over all intervals in BED format.
|
BedOutputFormatter |
Formats output as BED file
References: http://genome.ucsc.edu/FAQ/FAQformat.html#format1
|
BigBedDataBlock |
Created by IntelliJ IDEA.
|
BigBedFileIterator |
FileIterator for BigBed features
Note: I use Broad's IGV code to do all the work, this is just a wrapper
|
BigBedIterator |
Created by IntelliJ IDEA.
|
BigWigDataBlock |
Created by IntelliJ IDEA.
|
BigWigIterator |
Created by IntelliJ IDEA.
|
BigWigSection |
Created by IntelliJ IDEA.
|
BigWigSectionHeader |
Created by IntelliJ IDEA.
|
BigWigSectionHeader.WigItemType |
|
BinarySequence |
Base class for a binary 'read'.
|
Binomial |
Calculate binomial distribution
References http://en.wikipedia.org/wiki/Binomial_distribution
|
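For reference, the binomial probability mass function can be sketched as follows. This is an illustrative implementation that works in log space to avoid overflow; `BinomialSketch`, `logChoose` and `pmf` are hypothetical names and may not match how the Binomial class computes its values.

```java
/** Hypothetical sketch of the binomial probability mass function. */
final class BinomialSketch {

    /** ln(n choose k), computed as a sum of logs to avoid overflowing factorials. */
    static double logChoose(int n, int k) {
        double sum = 0.0;
        for (int i = 1; i <= k; i++) {
            sum += Math.log(n - k + i) - Math.log(i);
        }
        return sum;
    }

    /** P(X = k) for X ~ Binomial(n, p). */
    static double pmf(int k, int n, double p) {
        if (k < 0 || k > n) return 0.0;
        if (p == 0.0) return k == 0 ? 1.0 : 0.0; // avoid log(0)
        if (p == 1.0) return k == n ? 1.0 : 0.0; // avoid log(0)
        return Math.exp(logChoose(n, k) + k * Math.log(p) + (n - k) * Math.log1p(-p));
    }
}
```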
BinSeqFileIterator<T extends BinarySequence> |
Reads all sequences from a file
Warning: You should always call "close()" at the end of the iteration.
|
BioType |
BioTypes: Gene or transcript bioType annotation
References: http://vega.sanger.ac.uk/info/about/gene_and_transcript_types.html
Biotypes classifies genes and transcripts into groups including: protein coding, pseudogene
, processed pseudogene, miRNA, rRNA, scRNA, snoRNA, snRNA.
|
BitUtil |
A variety of high efficiency bit twiddling routines.
|
BlackBoxEvent |
A black box event
|
BlastResultEntry |
|
BlastResultFileIterator |
Iterate on each line of a GWAS catalog (TXT format)
|
BooleanMutable |
A mutable boolean
|
BPTree |
Created by IntelliJ IDEA.
|
BPTreeChildNode |
Created by IntelliJ IDEA.
|
BPTreeChildNodeItem |
Created by IntelliJ IDEA.
|
BPTreeHeader |
Created by IntelliJ IDEA.
|
BPTreeLeafNode |
Created by IntelliJ IDEA.
|
BPTreeLeafNodeItem |
|
BPTreeNode |
Created by IntelliJ IDEA.
|
CatalystActivity |
A catalyst activity event
|
Cds |
CDS: The coding region of a gene, also known as the coding sequence or CDS (from Coding DNA Sequence), is
that portion of a gene's DNA or RNA, composed of exons, that codes for protein.
|
Chromosome |
Interval for the whole chromosome
If a SNP has no 'ChromosomeInterval' => it is outside the chromosome => Invalid
|
ChromosomeSimpleName |
Convert chromosome names to simple names
|
ChrPosScoreList |
A list of
|
ChrPosStats |
How many changes per position do we have in a chromosome.
|
CircularCorrection |
Correct circular genomic coordinates
Nomenclature: We use coordinates at the beginning of the chromosome and negative coordinates
|
CochranArmitageTest |
Calculate a Cochran-Armitage test
Reference: http://en.wikipedia.org/wiki/Cochran-Armitage_test_for_trend
The trend test is applied when the data take the form of a 2 x k contingency
table.
|
Coder |
Class used to encode & decode sequences into binary and vice-versa
They are usually stored in 'long' words
|
CodonChange |
Analyze codon changes based on a variant and a Transcript
|
CodonChangeDel |
Calculate codon changes produced by a deletion
|
CodonChangeDup |
Calculate codon changes produced by a duplication
|
CodonChangeIns |
Calculate codon changes produced by an insertion
|
CodonChangeInterval |
Calculate codon changes produced by an Interval
Note: An interval does not produce any effect.
|
CodonChangeInv |
Calculate codon changes produced by an inversion
|
CodonChangeMixed |
Calculate codon changes produced by a 'mixed' variant
Essentially every 'mixed' variant can be represented as a concatenation of a SNP/MNP + an INS/DEL
|
CodonChangeMnp |
Calculate codon changes produced by an MNP
|
CodonChangeSnp |
Calculate codon changes produced by a SNP
|
CodonChangeStructural |
Calculate codon changes produced by a duplication
|
CodonTable |
A codon translation table
|
CodonTables |
All codon tables are stored here.
|
CombinatorialIterator |
Generate all possible 'count' combinations
|
CommandLine |
Command line and arguments
The way to run a command from 'main' is usually:
    public static void main(String[] args) {
        Command cmd = new Command();
        cmd.parseArgs(args);
        cmd.run();
    }
|
CompareByValue |
Compare two elements in a Map (e.g.
|
CompareByValue |
Compare two elements in a Map (e.g.
|
CompareEffects |
Compare effects in test cases
|
CompareToEnsembl |
Compare our results to ENSEMBL's Variant Effect Predictor output
|
CompareToVep |
Compare our results to ENSEMBL's Variant Effect Predictor output
|
Compartment |
A Reactome compartment (a part of a cell)
|
Complex |
A Reactome complex (a bunch of molecules or complexes)
|
ConcurrentActor |
|
ConcurrentFactory |
|
ConcurrentProperties |
|
Config |
|
CountByKey<T> |
Counters indexed by key.
|
CountByType |
Counters indexed by 'type' (type is a generic string that can mean anything)
|
Counter |
A simple class that counts...
|
CounterDouble |
A simple class that counts...
|
CountFragments |
Base by base coverage (one chromosome)
|
CountReads |
Count how many reads map (from many SAM/BAM files) onto markers
|
CountReadsOnMarkers |
Count how many reads map (from many SAM/BAM files) onto markers
|
Coverage |
Base by base coverage (one chromosome)
|
CoverageByType |
|
CoverageChr |
Base by base coverage (one chromosome)
|
CreateSpliceSiteTestCase |
|
Custom |
This is a custom interval (i.e.
|
CytoBands |
Cytoband definitions
E.g.: http://hgdownload.soe.ucsc.edu/goldenPath/hg38/database/cytoBand.txt.gz
|
Depolymerisation |
A depolymerization event
|
Diff |
|
DistanceResult |
|
DnaAndQualitySequence |
Binary packed DNA sequence and base calling quality
Notes:
- This is designed for short sequences (such as "short reads")
- Every base is encoded in 8 bits:
- Six bits for the base quality [0 , ..
|
DnaAndQualitySequenceWithId |
DnaAndQualitySequence with an ID
|
DnaCoder |
Class used to encode & decode sequences into binary and vice-versa
Note: This is a singleton class.
|
DnaNSequence |
Binary packed DNA sequence that allows also 'N' bases: {A, C, G, T, N}
|
DnaQualityCoder |
Class used to encode & decode sequences into binary and vice-versa
- Every base is encoded in 8 bits:
- Six bits for the base quality [0 , ..
|
DnaQualSubsequenceComparator |
Compares two subsequences of DNA (DnaAndQualitySequence)
|
DnaSeqFileIterator |
|
DnaSeqIdFileIterator |
|
DnaSeqPeFileIterator |
|
DnaSequence |
Binary packed DNA sequence
Notes:
- This is designed for short sequences (such as "short reads")
- Every base is encoded in 2 bits {a, c, g, t} <=> {0, 1, 2, 3}
- All bits are stored in an array of 'words' (integers)
- Most significant bits are the first bases in the sequence (makes comparison easier)
|
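The packing scheme listed in the DnaSequence notes can be sketched as below. This is a simplified illustration that uses 64-bit `long` words (32 bases per word); the actual word type, the class layout and the names `DnaPackSketch`, `pack` and `baseAt` are assumptions made for this example.

```java
/** Hypothetical sketch of 2-bit DNA packing, first base in the most significant bits. */
final class DnaPackSketch {
    static final int BASES_PER_WORD = 32; // 64 bits / 2 bits per base

    /** Maps {a, c, g, t} to {0, 1, 2, 3}. */
    static int code(char base) {
        switch (Character.toLowerCase(base)) {
            case 'a': return 0;
            case 'c': return 1;
            case 'g': return 2;
            case 't': return 3;
            default: throw new IllegalArgumentException("Unsupported base: " + base);
        }
    }

    /** Packs a sequence into an array of words, most significant bits first. */
    static long[] pack(String seq) {
        long[] words = new long[(seq.length() + BASES_PER_WORD - 1) / BASES_PER_WORD];
        for (int i = 0; i < seq.length(); i++) {
            int shift = 2 * (BASES_PER_WORD - 1 - (i % BASES_PER_WORD));
            words[i / BASES_PER_WORD] |= ((long) code(seq.charAt(i))) << shift;
        }
        return words;
    }

    /** Reads back the base at position i. */
    static char baseAt(long[] words, int i) {
        int shift = 2 * (BASES_PER_WORD - 1 - (i % BASES_PER_WORD));
        return "acgt".charAt((int) ((words[i / BASES_PER_WORD] >>> shift) & 3L));
    }
}
```

Because the first base occupies the most significant bits, an unsigned comparison of two words (for instance `Long.compareUnsigned`) orders equal-length sequences alphabetically, which is presumably what "makes comparison easier" refers to.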
DnaSequenceByte |
Binary packed DNA sequence.
|
DnaSequenceId |
Binary packed DNA sequence with an ID (long)
|
DnaSequencePe |
Pair end DNA sequence (binary packed)
It consists of 2 DNA sequences separated by a gap.
|
DnaSubsequenceComparator<T extends DnaSequence> |
Compares two subsequences of DNA (DnaSequence)
|
Download |
Command line program: Build database
|
Downstream |
Interval for a gene, as well as some other information: exons, utrs, cds, etc.
|
EffectType |
Effect type:
Note that effects are sorted (declared) by impact (highest to lowest putative impact).
|
EffFormatVersion |
VcfFields in SnpEff version 2.X have a different format than 3.X
As of version 4.1 we switch to a standard annotation format
|
Embl |
A class representing the same data as an EMBL file
References: http://www.ebi.ac.uk/embl/Documentation/User_manual/usrman.html
|
EmblFile |
A file containing one or more set of features (e.g.
|
EnrichmentAlgorithm |
A generic enrichment algorithm for selecting gene-sets from a collection of gene-sets
|
EnrichmentAlgorithm.EnrichmentAlgorithmType |
|
EnrichmentAlgorithmGreedy |
A generic greedy enrichment algorithm for selecting gene-sets
|
EnrichmentAlgorithmGreedyVariableSize |
A greedy enrichment algorithm for selecting gene-sets using a variable geneSet-size strategy:
i) Select only from geneSets in low-sizes e.g.
|
Entity |
A reactome basic entity (e.g.
|
Entity.TransferFunction |
|
Event |
A reactome event (any generic event, from pathways to polymerizations)
|
ExecuteOsCommand |
Launches an 'OS command' (e.g.
|
Exon |
Interval for an exon
|
Exon.ExonSpliceType |
Characterize exons based on alternative splicing
References: "Alternative splicing and evolution - diversification, exon definition and function" (see Box 1)
|
ExonSpliceCharacterizer |
Characterize exons based on alternative splicing
References: "Alternative splicing and evolution - diversification, exon definition and function" (see Box 1)
|
FastaFileIterator |
Opens a fasta file and iterates over all fasta sequences in the file
|
Fastq2Fastq |
Convert FASTQ (phred64) file to FASTQ (phred33)
|
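The phred64 to phred33 conversion amounts to shifting every quality character by a fixed offset (64 - 33 = 31). The sketch below shows that step in isolation; `PhredConvertSketch` is a hypothetical name, and the code assumes non-negative quality scores, so older Solexa-style encodings with negative values are rejected rather than handled.

```java
/** Hypothetical sketch: re-encodes a FASTQ quality string from phred64 to phred33. */
final class PhredConvertSketch {

    static String phred64ToPhred33(String quality) {
        StringBuilder sb = new StringBuilder(quality.length());
        for (int i = 0; i < quality.length(); i++) {
            int q = quality.charAt(i) - 64; // decode the phred64 character
            if (q < 0) {
                throw new IllegalArgumentException(
                        "Not a phred64 quality character: " + quality.charAt(i));
            }
            sb.append((char) (q + 33)); // encode the same score as phred33
        }
        return sb.toString();
    }
}
```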
FastqFileIterator |
Opens a fastq file and iterates over all fastq sequences in the file
Unlike BioJava's version, this one does NOT load all sequences in
memory.
|
FastqSplit |
Split a fastq into N files
|
FastqTools |
Simple manipulation of fastq sequences
|
FastqTrimmer |
Trim fastq sequence when quality drops below a threshold
The resulting sequence has to be at least 'minBases'
|
FastqTrimmerAdrian |
Trim fastq sequence when:
- Median quality drops below a threshold (mean is calculated every 2 bases instead of every base)
- Sequence length is at least 'minBases'
From Adrian Platts
...Also the sliding window was not every base.
|
FastqTrimmerMedian |
Trim fastq sequence when median quality drops below a threshold
|
Feature |
A feature in a GenBank or EMBL file
|
Feature.Type |
|
FeatureCoordinates |
A feature in a GenBank or EMBL file
|
Features |
A class representing a set of features
References: http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html
|
FeaturesFile |
A file containing one or more set of features (e.g.
|
FileIndexChrPos |
Index a file that has "chr \t pos" as the beginning of a line (e.g.
|
FileIterator<T> |
Opens a file and iterates over all objects in the file
Note: The file is not loaded in memory, which allows iterating over very large files
|
Filter<T> |
A Generic filter interface
|
FindRareAaIntervals |
Find intervals where rare amino acids occur
|
FisherExactTest |
Calculate Fisher's exact test (based on hypergeometric distribution)
|
FisherPValueAlgorithm |
|
FisherPValueGreedyAlgorithm |
|
FloatStats |
A simple class that does some basic statistics on double numbers
|
FrameType |
Type of frame calculations
Internally, we use GFF style frame calculation for Exon / Transcript
Technically, these are 'frame' and 'phase' which are calculated in different ways
UCSC type: Indicates the coding base number modulo 3.
|
GenBank |
A class representing the same data as a GenBank file (a 'GB' file)
References: http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord
|
GenBankFile |
A file containing one or more set of features (e.g.
|
Gene |
Interval for a gene, as well as transcripts
|
Gene.GeneType |
|
GeneCountByTypeTable |
Count for each 'type' and 'gene'.
|
GeneIds |
Maps different Gene IDs:
- ENSEMBL Gene ID to transcript ID
- ENSEMBL Gene ID to Gene Name
- ENSEMBL Gene ID to Refseq Gene ID
- ENSEMBL Gene ID to Refseq Protein ID
|
GenericMarker |
An interval intended as a mark
|
GenericMarkerFileIterator |
Opens a file and creates generic markers (one per line)
|
Genes |
A collection of genes (marker intervals)
Note: It is assumed that all genes belong to the same genome
|
GeneSet |
A set of genes (that belongs to a collection of gene-sets)
|
GeneSets |
A collection of GeneSets
Genes have associated "experimental values"
|
GeneSetsRanked |
A collection of GeneSets
Genes are ranked (usually by 'value')
|
GeneStats |
Some statistics about a gene
|
Genome |
This is just used for the Interval class.
|
GenomicSequences |
This class stores all "relevant" sequences in a genome
This class is able to:
i) Add all regions of interest
ii) Store genomic sequences for those regions of interest
iii) Retrieve genomic sequences by interval
|
Genotypes |
Simple test program
|
GenotypeStats |
Calculate statistics on genotype
|
GenotypeVector |
A vector of genotypes in a 'compact' structure
Note: Genotypes 0/0, 0/1, 1/0, 1/1 are stored in 2 bits.
|
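Storing one genotype in 2 bits means four genotypes fit in each byte. The sketch below illustrates that idea for GenotypeVector; the class name `GenotypePackSketch`, the code assignment (0 = 0/0, 1 = 0/1, 2 = 1/0, 3 = 1/1) and the bit order within a byte are assumptions, since this summary does not spell them out.

```java
/** Hypothetical sketch of a compact genotype vector: 2 bits per sample. */
final class GenotypePackSketch {
    private final byte[] data;
    private final int size;

    GenotypePackSketch(int size) {
        this.size = size;
        this.data = new byte[(size + 3) / 4]; // 4 genotypes per byte
    }

    /** code: 0 = 0/0, 1 = 0/1, 2 = 1/0, 3 = 1/1 (assumed mapping). */
    void set(int sample, int code) {
        if (code < 0 || code > 3) {
            throw new IllegalArgumentException("Genotype code must be in [0, 3]: " + code);
        }
        int shift = 2 * (sample % 4);
        int cleared = data[sample / 4] & ~(3 << shift); // clear the old 2 bits
        data[sample / 4] = (byte) (cleared | (code << shift));
    }

    int get(int sample) {
        return (data[sample / 4] >> (2 * (sample % 4))) & 3;
    }

    int size() {
        return size;
    }
}
```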
Gff3FileIterator |
Opens a sequence change file and iterates over all intervals in GFF3 format.
|
GffMarker |
An interval intended as a mark
|
GffType |
|
GoogleBarChart |
|
GoogleChartVenn |
A simple wrapper to Google charts API (from charts4j)
Plots integer data
|
GoogleGenePercentBar |
A simple wrapper to Google charts API (from charts4j)
|
GoogleGeneRegionChart |
|
GoogleGeneRegionNumExonsChart |
|
GoogleHistogram |
A simple wrapper to Google charts API (from charts4j)
|
GoogleLineChart |
|
GooglePlot |
A simple wrapper to Google charts API (from charts4j)
|
GooglePlotInt |
A simple wrapper to Google charts API (from charts4j)
Plots integer data
|
GoTerm |
An instance of a GO term (a node in the DAG)
|
GoTerms |
A collection of GO terms
|
Gpr |
General purpose routines
|
GprHtml |
General stuff related to HTML
|
GprSeq |
|
Gtex |
Load data from GTEx files.
|
GtexExperiment |
A 'column' in a GTEx file (values from one experiment)
|
Gtf2Marker |
An interval intended as a mark
|
GuessTableTypes |
Given a table in a TXT file, try to guess the value types for each column
|
HashLongLongArray |
A Hash using primitive types instead of wrapped objects
The idea is to be able to add many long values for each key
This could be implemented by simply doing HashMap > (but it
would consume much more memory)
Note: We call each 'long[]' a bucket
WARNING: This collection does NOT allow elements to be deleted! But you can replace values.
|
Hgvs |
HGVS notation
References: http://www.hgvs.org/
|
HgvsDna |
Coding DNA reference sequence
References http://www.hgvs.org/mutnomen/recs.html
Nucleotide numbering:
- there is no nucleotide 0
- nucleotide 1 is the A of the ATG-translation initiation codon
- the nucleotide 5' of the ATG-translation initiation codon is -1, the previous -2, etc.
|
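The "no nucleotide 0" rule in the HgvsDna entry can be illustrated with a small conversion from a zero-based offset (relative to the A of the ATG) to a coding DNA number. This is only a sketch of that rule: `HgvsNumberingSketch` and its methods are hypothetical names, and intronic offsets and positions 3' of the stop codon (written with `*`) are deliberately left out.

```java
/** Hypothetical sketch of HGVS coding DNA numbering around the ATG start codon. */
final class HgvsNumberingSketch {

    /**
     * offsetFromAtg: 0 for the A of the ATG, 1 for the T, 2 for the G,
     * and -1 for the nucleotide immediately 5' of the A.
     */
    static int toCodingNumber(int offsetFromAtg) {
        // Non-negative offsets shift by one so that the A becomes nucleotide 1;
        // negative offsets are kept as-is, which skips the value 0.
        return offsetFromAtg >= 0 ? offsetFromAtg + 1 : offsetFromAtg;
    }

    static String format(int offsetFromAtg) {
        return "c." + toCodingNumber(offsetFromAtg);
    }
}
```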
HgvsProtein |
Coding change in HGVS notation (amino acid changes)
References: http://www.hgvs.org/mutnomen/recs.html
|
HomHetStats |
Count Hom/Het per sample
From Pierre:
For multiple ALT, I suggest to count the number of REF allele
0/1 => ALT1
0/2 => ALT1
1/1 => ALT2
2/2 => ALT2
1/2 => ALT2
|
Hypergeometric |
Calculate hypergeometric distribution using an optimized algorithm
that avoids problems with big factorials.
|
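A common way to sidestep large factorials in the hypergeometric distribution is to work with logarithms. The sketch below uses that technique for illustration only; it is not necessarily the optimized algorithm the Hypergeometric class implements, and the names `HypergeometricSketch`, `logFactorial`, `logChoose` and `pmf` are hypothetical.

```java
/** Hypothetical sketch: hypergeometric probabilities via log-factorials. */
final class HypergeometricSketch {

    /** ln(n!) as a sum of logs, so the factorial itself is never formed. */
    static double logFactorial(int n) {
        double sum = 0.0;
        for (int i = 2; i <= n; i++) {
            sum += Math.log(i);
        }
        return sum;
    }

    static double logChoose(int n, int k) {
        return logFactorial(n) - logFactorial(k) - logFactorial(n - k);
    }

    /**
     * P(X = k): probability of k marked items in a sample of n drawn without
     * replacement from N items, D of which are marked.
     */
    static double pmf(int k, int N, int D, int n) {
        if (k < Math.max(0, n - (N - D)) || k > Math.min(n, D)) return 0.0;
        return Math.exp(logChoose(D, k) + logChoose(N - D, n - k) - logChoose(N, n));
    }
}
```

A production version would typically cache the log-factorials instead of recomputing the sums on every call.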
IdGenerator |
Generates Id
|
IdMap |
Maps many IDs to many Names
I.e.
|
IdMapper |
Map IDs
|
IdMapperEntry |
An entry in a ID mapping file
|
IntegrationTest |
Base class for integration tests
|
Intergenic |
Interval for an intergenic region
|
IntergenicConserved |
Interval for a conserved intergenic region
|
Interval |
A genomic interval.
|
IntervalAndSubIntervals<T extends Marker> |
Interval that contains sub intervals.
|
IntervalComparatorByEnd |
Compare intervals by end position
|
IntervalComparatorByStart |
Compare intervals by start position
|
IntervalForest |
A set of interval trees (e.g.
|
IntervalNode |
Node for interval tree structure
|
IntervalNodeOri |
The Node class contains the interval tree information for one single node
|
IntervalSetIterator |
Iterate over intervals.
|
IntervalTree |
An Interval Tree is essentially a map from intervals to objects, which
can be queried for all data associated with a particular interval or
point
|
IntervalTreeArray |
Interval tree structure using arrays
This is slightly faster than the new IntervalTree implementation
|
IntervalTreeOri |
An Interval Tree is essentially a map from intervals to objects, which
can be queried for all data associated with a particular interval or
point
|
IntHisto |
Histogram of integer numbers
|
Intron |
Intron
|
IntronConserved |
Interval for a conserved non-coding region in an intron
|
IntStats |
A simple class that does some basic statistics on integer numbers
|
Iterator2Iterable<T> |
Convert an iterator instance to a (fake) iterable
|
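Since `Iterable` has a single abstract method, an iterator can be adapted with a one-line lambda. The sketch below shows this; `IteratorAdapterSketch.once` is a hypothetical name. The result is "fake" in the sense that it hands out the same iterator every time, so it can only be traversed once.

```java
import java.util.Iterator;

/** Hypothetical sketch: wraps an Iterator so it can be used in a for-each loop. */
final class IteratorAdapterSketch {

    /** Returns a single-use Iterable backed by the given iterator. */
    static <T> Iterable<T> once(Iterator<T> iterator) {
        return () -> iterator;
    }
}
```

Usage would look like `for (String line : IteratorAdapterSketch.once(lines)) { ... }`, where `lines` is any `Iterator<String>`.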
Itree |
Interval tree interface
|
IubString |
Find all bases combinations from a string containing IUB codes
|
Jaspar |
Load PWM matrices from a Jaspar file
|
KeyValue<A,B> |
A "key = value" pair
|
LeadingEdgeFractionAlgorithm |
Leading edge fraction algorithm
References: "Common Inherited Variation in Mitochondrial Genes Is Not Enriched for Associations with Type 2 Diabetes or Related Glycemic Traits"
http://www.plosgenetics.org/article/info%3Adoi%2F10.1371%2Fjournal.pgen.1001058
See page 12, "Step 4"
|
LineChrPos |
A simple chr:pos parser
Stores using bytes instead of chars
|
LineClassFileIterator<T> |
Iterate on each line.
|
LineFileIterator |
Iterate on each line in this file
|
LineFilter |
Filter a line before processing
|
LineSeqFileIterator |
One line per sequence.
|
LittleEndianInputStream |
|
LogStats |
Log basic usage information to a server (for feedback and stats)
This information can always be suppressed (no info sent at all)
|
LogStats.RequestResult |
|
LossOfFunction |
Analyze whether a set of effects can create a "Loss Of Function"
or "Nonsense mediated decay" effect.
|
Marker |
An interval intended as a mark
|
MarkerFileIterator<M extends Marker> |
Opens a Marker file and iterates over all markers
|
MarkerParentId |
This is a marker used as a 'fake' parent during data serialization
|
Markers |
A collection of markers
|
MarkerSeq |
Marker with a DNA sequence
|
MarkerSerializer |
Serialize markers to (and from) file
Note: Marker's children are serialized first (e.g.
|
MarkerTypes |
Create a list of marker types (names or labels for markers)
|
MarkerUtil |
Generic utility methods for Markers
|
MarkerWithFrame |
A Marker that has 'frame' information (Exon and Cds)
|
Master<TI,TO> |
Master: Distributes the jobs to all workers, sends the results to 'listener'
|
MasterEff |
Master agent for SnpEff 'eff' command
|
MasterVcf<T> |
A master that processes VCF files
|
MasterVcfStr |
A simple demo of a master process
|
MatrixEntry |
A simple entry in a 'Matrix' file
|
MatrixEntryFileIterator |
Iterate on each line of a file, creating a MatrixEntry
|
MicroCosmEntry |
Entry in a MicroCosm (miRNA target prediction) file
|
MicroCosmFileIterator |
Iterate on each line of a MicroCosm predictions
References:
http://www.ebi.ac.uk/enright-srv/microcosm/
|
MicroRnaBindingSite |
miRna binding site (usually this was predicted by some algorithm)
|
MineMarkerIntervals |
Mine marker intervals: I.e.
|
MineTwoMarkerIntervals |
Mine marker intervals: I.e.
|
Monitor |
|
Motif |
Regulatory elements
|
MotifFileIterator |
Opens a regulation file and creates Motif elements.
|
MotifLogo |
Create a DNA logo for a PWM
References:
- See WebLogo http://weblogo.berkeley.edu/
- "WebLogo: A Sequence Logo Generator"
|
MultivalueHashMap<K,V> |
A Hash that can hold multiple values for each key
|
NeedlemanWunsch |
Needleman-Wunsch (global sequence alignment) algorithm for sequence alignment (short strings, since it's not memory optimized)
|
NeedlemanWunschOverlap |
Needleman-Wunsch algorithm for string alignment (short strings, since it's not memory optimized)
|
NextProt |
NextProt annotation marker
|
NextProtDb |
Parse NextProt XML file and build a database
http://www.nextprot.org/
|
NextProtParser |
Parse NextProt XML file and build a database
http://www.nextprot.org/
|
NextProtParserV2 |
Parse NextProt XML file (version 2)
http://www.nextprot.org/
|
Nmer |
Binary packed N-mer (i.e.
|
NmerCount |
Mark if an Nmer has been 'seen'
It only counts up to 255 (one byte per counter)
|
NmerCountWc |
Create a counter that can count Nmers as well as their WC complements
That means that given an Nmer, the nmer and the Watson-Crick complement are counted the same.
|
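Counting an N-mer together with its complement is usually done by mapping both to one canonical representative. The sketch below does this on plain strings and assumes that "Watson-Crick complement" means the reverse complement; NmerCountWc itself works on binary packed N-mers, and `CanonicalNmerSketch` with its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: counts each N-mer together with its reverse complement. */
final class CanonicalNmerSketch {

    /** Reverse complement of an upper-case A/C/G/T string. */
    static String reverseComplement(String seq) {
        StringBuilder sb = new StringBuilder(seq.length());
        for (int i = seq.length() - 1; i >= 0; i--) {
            switch (seq.charAt(i)) {
                case 'A': sb.append('T'); break;
                case 'C': sb.append('G'); break;
                case 'G': sb.append('C'); break;
                case 'T': sb.append('A'); break;
                default: throw new IllegalArgumentException("Unsupported base: " + seq.charAt(i));
            }
        }
        return sb.toString();
    }

    /** The alphabetically smaller of an N-mer and its reverse complement. */
    static String canonical(String nmer) {
        String rc = reverseComplement(nmer);
        return nmer.compareTo(rc) <= 0 ? nmer : rc;
    }

    /** Counts all N-mers of length n in seq, keyed by canonical form. */
    static Map<String, Integer> count(String seq, int n) {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i + n <= seq.length(); i++) {
            counts.merge(canonical(seq.substring(i, i + n)), 1, Integer::sum);
        }
        return counts;
    }
}
```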
NoneAlgorithm |
An algorithm that does nothing
|
NormalDistribution |
Calculate Normal distribution (PDF & CDF) using more precision if required
|
NullReader |
A buffered reader for a file.
|
ObservedOverExpected |
Observed over expected values (o/e) ratios
E.g.: CpG dinucleotides in a sequence
|
ObservedOverExpectedCHG |
Observed over expected values (o/e) of CHG in a sequence
|
ObservedOverExpectedCHH |
Observed over expected values (o/e) of CHH in a sequence
|
ObservedOverExpectedCpG |
Observed over expected values (o/e) of CpG in a sequence
|
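For CpG, a widely used definition of the observed over expected ratio is (number of CpG × sequence length) / (number of C × number of G). The sketch below applies that formula; whether ObservedOverExpectedCpG uses exactly this normalization is an assumption, and `CpgObsExpSketch` is a hypothetical name.

```java
/** Hypothetical sketch: CpG observed/expected ratio of a DNA sequence. */
final class CpgObsExpSketch {

    static double cpgObservedOverExpected(String seq) {
        String s = seq.toUpperCase();
        int c = 0, g = 0, cpg = 0;
        for (int i = 0; i < s.length(); i++) {
            char base = s.charAt(i);
            if (base == 'C') {
                c++;
                // A CpG is a C immediately followed by a G
                if (i + 1 < s.length() && s.charAt(i + 1) == 'G') cpg++;
            } else if (base == 'G') {
                g++;
            }
        }
        if (c == 0 || g == 0) return 0.0; // ratio undefined; report 0 by convention here
        return ((double) cpg * s.length()) / ((double) c * g);
    }
}
```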
OpenBitSet |
An "open" BitSet implementation that allows direct access to the array of words
storing the bits.
|
OsCmdQueue |
A queue of commands to be run.
|
OsCmdRunner |
Run an OS command as a thread
|
OutputFormatter |
Formats output
How is this used:
- newSection(); // Create a new 'section' on the output format (e.g.
|
Overlap<S extends BinarySequence> |
Calculates the best overlap between two sequences
Note: An overlap is a simple 'alignment' which can only contain gaps at the
beginning or at the end of the sequences.
|
OverlapDnaSeq |
|
OverlapFilter<T extends BinarySequence> |
Indicate whether an overlap between two sequences should be considered or not
|
OverlapFilterCompareAllAll |
Only allow overlaps between sequences mapped to same/different partition
|
OverlapFilterDnaId |
Only allow sequences with different IDs to be overlapped
|
OverlapRessult<T extends BinarySequence> |
An object used to store overlap parameters
|
Parser<T> |
Parse a string and return a collection of objects.
|
Pathway |
A Reactome pathway
|
Pcingola |
Author's data
|
PdbFile |
A structure that reads PDB files
This code is similar to 'PDBFileReader' from BioJava, but the BioJava version
doesn't close file descriptors and eventually produces a crash when reading
many files.
|
PedEntry |
An entry in a PED table.
|
PedFamily |
A family: A group of Tfams with the same familyId
|
PedFileIterator |
PED file iterator (PED file from PLINK)
Reference: http://pngu.mgh.harvard.edu/~purcell/plink/data.shtml
|
PedGenotype |
A Simple genotype implementation for PED files
|
PedigreeEnrty |
Pedigree entry in a VCF file header
E.g.:
##PEDIGREE=
or
##PEDIGREE=
|
PedPedigree |
A pedigree of PedEntries
|
PlinkMap |
PLINK MAP file
References: http://pngu.mgh.harvard.edu/~purcell/plink/data.shtml
|
Polymerisation |
A polymerization event
|
PosStats |
How many changes per position do we have in a chromosome.
|
Progress |
|
PromoterSequences |
Get promoter sequences from genes
|
ProteinInteractionLocus |
Protein interaction: An amino acid that is "in contact" with another amino acid.
|
ProteinProteinInteractionLocus |
Protein interaction: An amino acid that is "in contact" with another amino acid
within the same protein.
|
ProteinStructuralInteractionLocus |
Protein interaction: An amino acid that is "in contact" with another amino acid.
|
PurityChange |
Analyze purity changes in codons and amino acids
|
PvaluesList |
A list of pvalues (i.e.
|
Pwm |
Create a DNA motif count matrix
Reference: http://en.wikipedia.org/wiki/Position-specific_scoring_matrix
|
PwmAndSeqs |
Create a DNA motif count matrix and also
count the number of sequences that contribute
to this motif.
|
Qseq2Fastq |
Convert qseq file to fastq
|
Qseq2FastqSplit |
Convert qseq file to fastq
|
RandMarker |
Create random markers using a uniform distribution
|
RankSumNoReplacementPdf |
Calculate rank sum probability distribution function (pdf) and cumulative distribution function (cdf).
|
RankSumNoReplacementSimulate |
Calculate rank sum probability distribution function (pdf) and cumulative distribution function (cdf).
|
RankSumPdf |
Calculate rank sum probability distribution function (pdf) and cumulative distribution function (cdf).
|
RankSumPValueAlgorithm |
|
RankSumPValueGreedyAlgorithm |
|
RareAminoAcid |
Rare amino acid annotation:
These are amino acids that occur very rarely in an organism.
|
Reaction |
A reaction
|
Reaction.RegulationType |
Reaction regulation types
|
Reactome |
Load reactome data from TXT files
|
ReadsOnMarkersModel |
Calculate the maximum interval length by type, for all markers in a genome
Create a probability model based on binomial distribution.
|
Regulation |
Regulatory elements
|
RegulationBedFileIterator |
Opens a GFF3 file and creates regulatory elements.
|
RegulationConsensusMultipleBed |
Create a regulation consensus from multiple BED files
|
RegulationFileConsensus |
Create a regulation consensus from a regulation file.
|
RegulationFileIterator |
Opens a regulation file and creates Regulation elements.
|
RegulationFileSplitBytType |
Split regulation files into smaller files (one per 'regulation type')
Regulation files can be quite large and we cannot read them into
memory.
|
RegulationGffFileIterator |
Opens a GFF3 file and creates regulatory elements.
|
ReSampleInt |
Re-sample statistic
Statistic is a sum of a set of integer numbers (e.g.
|
ReSampleMap |
Resample statistic
|
ReSampleMapRank |
Re-sample statistic using ranks of scores (scores are double)
|
Result<T> |
A result from a work
|
Result |
Store a result from a greedy search algorithm
|
RPChromosomeRegion |
Created by IntelliJ IDEA.
|
RPTree |
Created by IntelliJ IDEA.
|
RPTreeChildNode |
Container class for R+ tree leaf or child node format
Note: RPTreeNode interface supports leaf and child node formats
|
RPTreeChildNodeItem |
Created by IntelliJ IDEA.
|
RPTreeHeader |
Created by IntelliJ IDEA.
|
RPTreeLeafNode |
Created by IntelliJ IDEA.
|
RPTreeLeafNodeItem |
Created by IntelliJ IDEA.
|
RPTreeNode |
Created by IntelliJ IDEA.
|
RPTreeNodeProxy |
|
SamEntry |
An entry in a SAM file
References: http://samtools.sourceforge.net/SAM-1.3.pdf
|
SamFileIterator |
Reads a SAM file
Note: This is a very 'rustic' reader (we should use Picard's API instead)
|
SamHeader |
Sam header
|
SamHeaderRecord |
Sam header record
|
SamHeaderRecordSq |
SQ header: Reference sequence dictionary.
|
SamplingStats<T> |
Perform stats by analyzing some samples
|
ScoreList |
A list of scores
|
ScoreList.ScoreSummary |
|
SeekableBufferedReader |
A buffered reader for a file.
|
SeekableFileStream |
|
SeekableStream |
User: jrobinso
Date: Nov 29, 2009
|
SequenceComplexity |
Measures the complexity of a sequence
Ideally we'd like to measure the Kolmogorov complexity of the sequence.
|
SequenceIndexer<T extends BinarySequence> |
A collection of sequences that are indexed using some algorithm
Note: The ID is just the position in the array.
|
SequenceReference |
A reference to a sequence.
|
SequenceRotator |
Rotates a binary packed sequence
WARNING: We only rotate up to Coder.basesPerWord() because after that the sequences are the same (with an integer offset)
NOTE: Left rotation 'n' is the same as a right rotation 'Coder.basesPerWord() - n'
|
Sex |
|
SmithWaterman |
Smith-Waterman (local sequence alignment) algorithm for sequence alignment (short strings, since it's not memory optimized)
|
SnpEff |
SnpEff's main command line program
|
SnpEff.GeneDatabaseFormat |
Available gene database formats
|
SnpEff.InputFormat |
Available input formats
|
SnpEff.OutputFormat |
Available output formats
|
SnpEffCmdAcat |
ACAT: Create ACAT score for T2D project
Note: This is just used to compile 'ACAT' score in T2D-GENES project, not useful at all for general audience.
|
SnpEffCmdBuild |
Command line program: Build database
|
SnpEffCmdBuildNextProt |
Parse NextProt XML file and build a database
http://www.nextprot.org/
|
SnpEffCmdCds |
Command line: Calculate coding sequences from a file and compare them to the ones calculated from our data structures
Note: This is done in order to see potential incompatibility
errors between genome sequence and annotation.
|
SnpEffCmdClosest |
Command line: Find closest marker to each variant
|
SnpEffCmdCount |
Count reads from a BAM file given a list of intervals
|
SnpEffCmdDatabases |
Show all databases configured in snpEff.config
Create an HTML 'download' table based on the config file
Also creates a list of genomes for Galaxy menu
|
SnpEffCmdDownload |
Command line program: Build database
|
SnpEffCmdDump |
Command line program: Build database
|
SnpEffCmdDump.DumpFormat |
|
SnpEffCmdEff |
Command line program: Predict variant effects
|
SnpEffCmdGenes2Bed |
Simple test program
|
SnpEffCmdGsa |
Command line: Gene-Sets Analysis
Perform gene set analysis
|
SnpEffCmdLen |
Calculate the maximum interval length by type, for all markers in a genome
|
SnpEffCmdPdb |
PDB distance analysis
References: http://biojava.org/wiki/BioJava:CookBook:PDB:read
|
SnpEffCmdProtein |
Command line: Read protein sequences from a file and compare them to the ones calculated from our data structures
Note: This is done in order to see potential incompatibility
errors between genome sequence and annotation.
|
SnpEffCmdSeq |
Command line program: Show a transcript or a gene
|
SnpEffCmdShow |
Command line program: Show a transcript or a gene
|
SnpEffCmdSpliceAnalysis |
Analyze sequences from splice sites
|
SnpEffCmdTranslocationsReport |
Create an SVG representation of a Marker
|
SnpEffectPredictor |
Predicts effects of SNPs
Note: Actually tries to predict any kind of SeqChange, not only SNPs.
|
SnpEffPredictorFactory |
This class creates a SnpEffectPredictor from a file (or a set of files) and a configuration
|
SnpEffPredictorFactoryEmbl |
This class creates a SnpEffectPredictor from an Embl file.
|
SnpEffPredictorFactoryFeatures |
This class creates a SnpEffectPredictor from a 'features' file.
|
SnpEffPredictorFactoryGenBank |
This class creates a SnpEffectPredictor from a GenBank file.
|
SnpEffPredictorFactoryGenesFile |
This class creates a SnpEffectPredictor from a file (or a set of files) and a configuration
The files used are:
- genes.txt : Biomart query from Ensembl (see scripts/genes_dataset.xml)
- Fasta files: One per chromosome (as described in the config file)
|
SnpEffPredictorFactoryGff |
This class creates a SnpEffectPredictor from a GFF file.
|
SnpEffPredictorFactoryGff2 |
This class creates a SnpEffectPredictor from a GFF2 file.
|
SnpEffPredictorFactoryGff3 |
This class creates a SnpEffectPredictor from a GFF3 file
References:
- http://www.sequenceontology.org/gff3.shtml
- http://gmod.org/wiki/GFF3
- http://www.eu-sol.net/science/bioinformatics/standards-documents/gff3-format-description
|
SnpEffPredictorFactoryGtf22 |
This class creates a SnpEffectPredictor from a GTF 2.2 file
References: http://mblab.wustl.edu/GTF22.html
|
SnpEffPredictorFactoryKnownGene |
This class creates a SnpEffectPredictor from a TXT file dumped using UCSC table browser
Fields in this table
Field       Example              SQL type           Info    Description
-----       -------              --------           ----    -----------
name        uc001aaa.3           varchar(255)       values  Name of gene
chrom       chr1                 varchar(255)       values  Reference sequence chromosome or scaffold
strand      +                    char(1)            values  + or - for strand
txStart     11873                int(10) unsigned   range   Transcription start position
txEnd       14409                int(10) unsigned   range   Transcription end position
cdsStart    11873                int(10) unsigned   range   Coding region start
cdsEnd      11873                int(10) unsigned   range   Coding region end
exonCount   3                    int(10) unsigned   range   Number of exons
exonStarts  11873,12612,13220,   longblob                   Exon start positions
exonEnds    12227,12721,14409,   longblob                   Exon end positions
proteinID                        varchar(40)        values  UniProt display ID for Known Genes, UniProt accession or RefSeq protein ID for UCSC Genes
alignID     uc001aaa.3           varchar(255)       values  Unique identifier for each (known gene, alignment position) pair
|
SnpEffPredictorFactoryRand |
This class creates a random set of chromosomes, genes, transcripts and exons
|
SnpEffPredictorFactoryRefSeq |
This class creates a SnpEffectPredictor from a TXT file dumped using UCSC table browser
RefSeq table schema: http://genome.ucsc.edu/cgi-bin/hgTables
field example SQL type info description
bin 585 smallint(5) range Indexing field to speed chromosome range queries.
|
SpliceSite |
Interval for a splice site
Reference: http://en.wikipedia.org/wiki/RNA_splicing
Spliceosomal introns often reside in eukaryotic protein-coding genes.
|
SpliceSiteAcceptor |
Interval for a splice site acceptor
Note: Splice site acceptors are defined as the last 2 bases of an intron
Reference: http://en.wikipedia.org/wiki/RNA_splicing
|
SpliceSiteBranch |
A (putative) branch site.
|
SpliceSiteBranchU12 |
A (putative) U12 branch site.
|
SpliceSiteDonor |
Interval for a splice site donor
Note: Splice site donors are defined as the first 2 bases of an intron
Reference: http://en.wikipedia.org/wiki/RNA_splicing
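A minimal sketch of the donor and acceptor definitions, assuming inclusive intron coordinates and that "first" and "last" are taken in transcript direction (so the two sites swap ends on the minus strand). The class below is hypothetical and is not SnpEff code.

```java
// Hypothetical sketch, not SnpEff code: core splice site intervals for one intron.
final class SpliceSiteSketch {
    static final int CORE_SIZE = 2; // number of intron bases in each core site

    /** Donor: first 2 intron bases in transcript direction. Returns {start, end}, inclusive. */
    static int[] donor(int intronStart, int intronEnd, boolean plusStrand) {
        return plusStrand
                ? new int[] { intronStart, intronStart + CORE_SIZE - 1 }
                : new int[] { intronEnd - CORE_SIZE + 1, intronEnd };
    }

    /** Acceptor: last 2 intron bases in transcript direction. Returns {start, end}, inclusive. */
    static int[] acceptor(int intronStart, int intronEnd, boolean plusStrand) {
        return plusStrand
                ? new int[] { intronEnd - CORE_SIZE + 1, intronEnd }
                : new int[] { intronStart, intronStart + CORE_SIZE - 1 };
    }
}
```

For a plus-strand intron spanning 200-299, this gives a donor at 200-201 and an acceptor at 298-299.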
|
SpliceSiteRegion |
Interval for a splice site region
From Sequence Ontology: A sequence variant in which a change has occurred
within the region of the splice site, either within 1-3 bases of the exon
or 3-8 bases of the intron.
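The Sequence Ontology wording can be turned into coordinates roughly as follows. This sketch covers only the donor side of a plus-strand exon and assumes that "1-3 bases of the exon" means the three exon bases next to the boundary and that intron bases are numbered from 1 at the boundary. The names are hypothetical, not SnpEff's.

```java
// Hypothetical sketch, not SnpEff code: splice region at the 3' end of a plus-strand exon.
final class SpliceRegionSketch {

    /** Last 3 exon bases before the boundary: {start, end}, inclusive. */
    static int[] exonPart(int exonEnd) {
        return new int[] { exonEnd - 2, exonEnd };
    }

    /** Intron bases 3 to 8, counting the first intron base as 1: {start, end}, inclusive. */
    static int[] intronPart(int exonEnd) {
        int intronStart = exonEnd + 1;
        return new int[] { intronStart + 2, intronStart + 7 };
    }
}
```

For an exon ending at 199, the exon part is 197-199 and the intron part is 202-207; intron bases 200-201 are left to the donor site itself.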
|
SpliceTypes |
Analyze sequences from splice sites
|
StartMaster |
A message telling master process to start calculating
|
StartMasterVcf |
A message telling master process to start calculating
It also sends the filename to be opened
|
StreamGobbler |
Read the contents of a stream in a separate thread
This class is used when executing OS commands in order to read STDOUT / STDERR and prevent process blocking
It can alert an AlertListener when a given string is in the stream
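The idea can be sketched as a thread that drains a stream line by line, so a child process never blocks on a full STDOUT or STDERR buffer. This is an illustrative sketch, not the StreamGobbler source: the class name is hypothetical, and a simple boolean flag stands in for the AlertListener callback.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not the StreamGobbler source: drains a stream in its own thread.
final class GobblerSketch extends Thread {
    private final InputStream in;
    private final String alertText;            // text to watch for; may be null
    private final StringBuilder captured = new StringBuilder();
    private volatile boolean alerted = false;  // stands in for an AlertListener callback

    GobblerSketch(InputStream in, String alertText) {
        this.in = in;
        this.alertText = alertText;
    }

    @Override
    public void run() {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                synchronized (captured) {
                    captured.append(line).append('\n');
                }
                if (alertText != null && line.contains(alertText)) alerted = true;
            }
        } catch (IOException e) {
            // Stream closed or process died: stop reading.
        }
    }

    /** Block until the stream has been fully consumed. */
    void waitFor() {
        try {
            join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    String output() {
        synchronized (captured) {
            return captured.toString();
        }
    }

    boolean alerted() {
        return alerted;
    }
}
```

In practice one such thread would be started for `process.getInputStream()` and another for `process.getErrorStream()` before waiting for the process to exit.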
|
SubsequenceComparator<T extends BinarySequence> |
Compare two subsequences (actually it compares two sequences from different starting points)
|
SuffixIndexerNmer<T extends BinarySequence> |
Index all suffixes of all the sequences (it indexes using Nmers).
|
Svg |
Create an SVG representation of a Marker
|
SvgBnd |
Create an SVG representation of a BND (translocation) variant
In a VCF file, there are four possible translocations (BND) entries:
         REF  ALT   Meaning
type 1:  s    t[p[  piece extending to the right of p is joined after t
type 2:  s    t]p]  reverse comp piece extending left of p is joined after t
type 3:  s    ]p]t  piece extending to the left of p is joined before t
type 4:  s    [p[t  reverse comp piece extending right of p is joined before t
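The four ALT shapes in the table can be told apart by looking at where the brackets sit. The sketch below is a hypothetical helper, not SnpEff's parser; it only inspects the first and last characters of ALT and extracts the mate position p between the two brackets.

```java
// Hypothetical sketch, not SnpEff code: classify a BND ALT string into types 1-4.
final class BndAltSketch {

    /** Returns 1..4 following the table above; throws if ALT has no BND brackets. */
    static int bndType(String alt) {
        if (alt == null || alt.length() < 3) {
            throw new IllegalArgumentException("Not a BND ALT: " + alt);
        }
        char first = alt.charAt(0);
        char last = alt.charAt(alt.length() - 1);
        if (last == '[') return 1;  // t[p[
        if (last == ']') return 2;  // t]p]
        if (first == ']') return 3; // ]p]t
        if (first == '[') return 4; // [p[t
        throw new IllegalArgumentException("Not a BND ALT: " + alt);
    }

    /** Mate position 'p': the text between the two brackets. */
    static String matePosition(String alt) {
        int open = indexOfBracket(alt, 0);
        int close = open < 0 ? -1 : indexOfBracket(alt, open + 1);
        if (open < 0 || close < 0) {
            throw new IllegalArgumentException("Not a BND ALT: " + alt);
        }
        return alt.substring(open + 1, close);
    }

    private static int indexOfBracket(String s, int from) {
        for (int i = from; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '[' || c == ']') return i;
        }
        return -1;
    }
}
```

For instance, an ALT of `G]17:198982]` is type 2 with mate position `17:198982`.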
|
SvgCds |
Create an SVG representation of a Marker
|
SvgExon |
Create an SVG representation of a Marker
|
SvgGene |
Create an SVG representation of a Marker
|
SvgIntron |
Create an SVG representation of a Marker
|
SvgNextProt |
Create an SVG representation of NextProt annotation tracks
|
SvgScale |
Create an SVG representation of the scale and chromosome labels
|
SvgSpacer |
Leave an empty vertical space
|
SvgTranscript |
Create an SVG representation of a transcript
|
SvgTranslocation |
Create an SVG representation of a BND (translocation) variant
In a VCF file, there are four possible translocations (BND) entries:
         REF  ALT   Meaning
type 1:  s    t[p[  piece extending to the right of p is joined after t
type 2:  s    t]p]  reverse comp piece extending left of p is joined after t
type 3:  s    ]p]t  piece extending to the left of p is joined before t
type 4:  s    [p[t  reverse comp piece extending right of p is joined before t
|
TabixIndex |
Tabix Index (i.e.
|
TabixInterval |
|
TabixIterator |
Iterate on a result from TabixReader.query()
|
TabixReader |
|
TableFile |
Load a table from a file.
|
TestCasesAlign |
Test cases for sequence alignment
|
TestCasesAnn |
Test case
|
TestCasesAnnParse |
Test case for parsing ANN fields
|
TestCasesApplyDel |
Test cases: apply a variant (DEL) to a transcript
|
TestCasesApplyIns |
Test cases: apply a variant (INS) to a transcript
|
TestCasesApplyMixed |
Test cases: apply a variant (MIXED) to a transcript
|
TestCasesApplyMnp |
Test cases: apply a variant (MNP) to a transcript
|
TestCasesApplySnp |
Test cases: apply a variant (SNP) to a transcript
|
TestCasesBase |
Base class for some test cases
|
TestCasesBaseApply |
Test case
Transcript:
1:0-999, strand: +, id:transcript1, Protein
Exons:
1:100-199 'exon1', rank: 1, frame: ., sequence: atgtccgcaggtgaaggcatacacgctgcgcgtatactgatgttacctcgatggattttgtcagaaatatggtgcccaggacgcgaagggcatattatgg
1:300-399 'exon2', rank: 2, frame: ., sequence: tgtttgggaattcacgggcacggttctgcagcaagctgaattggcagctcggcataaatcccgaccccatcgtcacgcacggatcaattcatcctcaacg
1:900-999 'exon3', rank: 0, frame: ., sequence: ggtagaggaaaagcacctaacccccattgagcaggatctctttcgtaatactctgtatcgattaccgatttatttgattccccacatttatttcatcggg
CDS : atgtccgcaggtgaaggcatacacgctgcgcgtatactgatgttacctcgatggattttgtcagaaatatggtgcccaggacgcgaagggcatattatggtgtttgggaattcacgggcacggttctgcagcaagctgaattggcagctcggcataaatcccgaccccatcgtcacgcacggatcaattcatcctcaacgggtagaggaaaagcacctaacccccattgagcaggatctctttcgtaatactctgtatcgattaccgatttatttgattccccacatttatttcatcggg
Protein : MSAGEGIHAARILMLPRWILSEIWCPGREGHIMVFGNSRARFCSKLNWQLGINPDPIVTHGSIHPQRVEEKHLTPIEQDLFRNTLYRLPIYLIPHIYFIG
Transcript (full coordinates, positions 0-999):
Exons occupy 100-199, 300-399 and 900-999; the remaining positions (0-99, 200-299 and 400-899) are not covered by exons.
|
TestCasesBinomial |
Test for Binomial distribution
|
TestCasesBuild |
Test case
|
TestCasesCds |
Test random SNP changes
|
TestCasesChiSquare |
Test for Hypergeometric distribution and Fisher exact test
|
TestCasesCircular |
Test cases for circular genomes
|
TestCasesCochranArmitage |
Cochran-Armitage test statistic test case
|
TestCasesCodonTable |
Codon tables
|
TestCasesCytoBands |
Test case for cytobands
|
TestCasesDel |
Test random DEL changes
|
TestCasesDels |
Test random DEL changes
|
TestCasesDnaNSequence |
|
TestCasesDnaOverlap |
|
TestCasesDnaSequence |
|
TestCasesDnaSequenceByte |
|
TestCasesEffectCollapse |
Test Splice sites variants
The sample transcript used is:
Transcript:1:751-1139, strand: +, id:transcript_0, Protein
Exons:
1:751-810 'exon_0_0', rank: 1, frame: ., sequence: cgattgacctacatagtaatgagttttgttggtccgtaagacttcgcccaaaaccgcgca
1:1013-1139 'exon_0_1', rank: 2, frame: ., sequence: cttcgactactcgggggtctaagcacgttttctgcagggaaagtaatatatgcttgtgcgcaaccatggtaacagggattcacggccccgttaatggtatgacctaagccccatacgagtcatccaa
CDS : cgattgacctacatagtaatgagttttgttggtccgtaagacttcgcccaaaaccgcgcacttcgactactcgggggtctaagcacgttttctgcagggaaagtaatatatgcttgtgcgcaaccatggtaacagggattcacggccccgttaatggtatgacctaagccccatacgagtcatccaa
Protein : RLTYIVMSFVGP*DFAQNRALRLLGGLSTFSAGKVIYACAQPW*QGFTAPLMV*PKPHTSHP?
|
TestCasesEffectCollapse2 |
Test case
|
TestCasesFasta |
Test case for FASTA file parsing
|
TestCasesFileIndexChrPos |
Test cases for file index (chr:pos index on files)
|
TestCasesFisherExactTest |
Test for Hypergeometric distribution and Fisher exact test
|
TestCasesGenePvalueList |
GenePvalueList statistics test case
|
TestCasesGenomicSequences |
Test case
|
TestCasesGenotypeVector |
Test cases for GenotypeVector class
|
TestCasesHgvs |
Test case for basic HGVS annotations
|
TestCasesHgvsBase |
Test random SNP changes
|
TestCasesHgvsDnaDup |
Test case
|
TestCasesHgvsDnaDupNegative |
Test cases for HGVS's 'dup' on the negative strand
|
TestCasesHgvsExon |
Test random SNP changes
|
TestCasesHgvsIntron |
Test random SNP changes
|
TestCasesHgvsProtDup |
Test case
|
TestCasesHypergeometric |
Test for Hypergeometric distribution and Fisher exact test
|
TestCasesIns |
Test random SNP changes
|
TestCasesIntegratioBuildPdb |
Test cases for annotation of protein interaction loci
|
TestCasesIntegrationApply |
Test 'apply' method (apply variant to marker)
|
TestCasesIntegrationBase |
Base class: Provides common methods used for testing
|
TestCasesIntegrationCancer |
Test cases for cancer effect (difference between somatic and germline tissue)
|
TestCasesIntegrationCanonical |
Test cases for canonical transcript selection
|
TestCasesIntegrationCircularGenome |
Test case
|
TestCasesIntegrationCodingTag |
Test case: Make sure VCF entries have some 'coding' (transcript biotype), even
when biotype info is not available (e.g.
|
TestCasesIntegrationConfig |
Test case
|
TestCasesIntegrationCutsomIntervals |
Test Loss of Function prediction
|
TestCasesIntegrationDelEtc |
Test cases on deletions
|
TestCasesIntegrationDup |
Test case
|
TestCasesIntegrationEff |
Test cases for other 'effect' issues
|
TestCasesIntegrationEmbl |
Test case for EMBL file parsing (database creation)
|
TestCasesIntegrationErrors |
Test cases for error reporting
|
TestCasesIntegrationExonFrame |
Test case for exon frames
|
TestCasesIntegrationFilterTranscripts |
Filter transcripts
|
TestCasesIntegrationGenBank |
Test case for GenBank file parsing (database creation)
|
TestCasesIntegrationGenomicSequences |
Test case for genomic sequences
|
TestCasesIntegrationGff3 |
Test case for GFF3 file parsing
|
TestCasesIntegrationGtf22 |
Test case for GTF22 file parsing
|
TestCasesIntegrationHgvs |
Test random SNP changes
|
TestCasesIntegrationHgvsDel |
Test cases for HGVS notation on deletions
|
TestCasesIntegrationHgvsDnaDup |
Test case
|
TestCasesIntegrationHgvsFrameShift |
Test case
|
TestCasesIntegrationHgvsHard |
Test case
|
TestCasesIntegrationHgvsIns |
Test cases for HGVS notation on insertions
|
TestCasesIntegrationHgvsLarge |
Test random SNP changes
|
TestCasesIntegrationHgvsMnps |
Test case
|
TestCasesIntegrationHgvsUpDownStream |
Test cases for HGVS notation
|
TestCasesIntegrationHugeDeletions |
Test case where VCF entries are huge (e.g.
|
TestCasesIntegrationInsEtc |
Test random SNP changes
|
TestCasesIntegrationInsVep |
Test random SNP changes
|
TestCasesIntegrationLof |
Test Loss of Function prediction
|
TestCasesIntegrationMarkerSeq |
Test case
|
TestCasesIntegrationMissenseSilentRatio |
Calculate missense over silent ratio
|
TestCasesIntegrationMixedVariants |
Test mixed variants
|
TestCasesIntegrationMnp |
Test random SNP changes
|
TestCasesIntegrationMotif |
Test Motif databases
|
TestCasesIntegrationNextProt |
Test NextProt databases
|
TestCasesIntegrationNmd |
Test Nonsense mediated decay prediction
|
TestCasesIntegrationNoChange |
Test case where VCF entries have no sequence change (either REF=ALT or ALT=".")
|
TestCasesIntegrationProtein |
Protein translation test case
|
TestCasesIntegrationProteinInteraction |
Test cases for annotation of protein interaction loci
|
TestCasesIntegrationRefSeq |
Test case for RefSeq file parsing
|
TestCasesIntegrationRegulation |
Test case
|
TestCasesIntegrationSequenceOntology |
Test case for sequence ontology
|
TestCasesIntegrationSnp |
Test SNP variants
|
TestCasesIntegrationSnpEff |
Invoke all integration test cases
|
TestCasesIntegrationSnpEffMultiThread |
Invoke multi thread integration test
WARNING: JUnit doesn't seem to work if you use multi-threading....
|
TestCasesIntegrationSnpEnsembl |
Test random SNP changes
|
TestCasesIntegrationSpliceRegion |
Test cases for variants
|
TestCasesIntegrationStructural |
Test SNP variants
|
TestCasesIntegrationTranscript |
Test random SNP changes
|
TestCasesIntegrationTranscriptError |
Test case where VCF entries hit a transcript that has errors
|
TestCasesIntegrationVariant |
Test cases for variants
|
TestCasesIntegrationVcfs |
VCF annotations test cases
|
TestCasesIntergenic |
Test intergenic markers
|
TestCasesIntervals |
|
TestCasesIntervalTree |
Test case for interval tree structure
|
TestCasesIntervalTreeArray |
Test case for interval tree structure
|
TestCasesIntervalTreeOri |
Test case for interval tree structure
|
TestCasesIntervalVariant |
Test random Interval Variants (e.g.
|
TestCasesIntStats |
|
TestCasesIubString |
|
TestCasesJaspar |
Test case for Jaspar parsing
|
TestCasesMarkerUtils |
|
TestCasesMnps |
Test random SNP changes
|
TestCasesNmers |
|
TestCasesOverlap |
|
TestCasesProteinInteraction |
Test cases for protein interaction
|
TestCasesReactome |
Test Reactome circuits
|
TestCasesSeekableReader |
Seekable file reader test case
|
TestCasesSequenceIndexer |
|
TestCasesSnps |
Test random SNP changes
|
TestCasesSpliceRegion |
Test Splice sites variants
|
TestCasesSpliceSite |
Test Splice sites variants
|
TestCasesStructuralDel |
Test case
Gene: geneId1
1:957-1157, strand: +, id:transcript_0, Protein
Exons:
1:957-988 'exon_0_0', rank: 1, frame: ., sequence: gttgcttgaatactgtatagccttgccattgt
1:1045-1057 'exon_0_1', rank: 2, frame: ., sequence: tgtgttgctaact
1:1148-1157 'exon_0_2', rank: 3, frame: ., sequence: agacatggac
CDS : gttgcttgaatactgtatagccttgccattgttgtgttgctaactagacatggac
Protein : VA*ILYSLAIVVLLTRHG?
Layout (coordinates 957-1157): exons at 957-988, 1045-1057 and 1148-1157; introns at 989-1044 and 1058-1147.
Gene: geneId2
1:2066-2141, strand: +, id:transcript_1, Protein
Exons:
1:2066-2069 'exon_1_0', rank: 1, frame: ., sequence: actt
1:2084-2089 'exon_1_1', rank: 2, frame: ., sequence: cccttt
1:2116-2126 'exon_1_2', rank: 3, frame: ., sequence: tacgcccacgt
1:2133-2141 'exon_1_3', rank: 4, frame: ., sequence: ccgccgctg
CDS : acttcccttttacgcccacgtccgccgctg
Protein : TSLLRPRPPL
Layout (coordinates 2066-2141): exons at 2066-2069, 2084-2089, 2116-2126 and 2133-2141; introns at 2070-2083, 2090-2115 and 2127-2132.
|
TestCasesStructuralDup |
Test case for structural variants: Duplications
|
TestCasesStructuralInv |
Test cases for structural variants: Inversions
Gene models used in these test cases:
Gene: Gene_1:953-1216
1:957-1157, strand: +, id:transcript_0, Protein
Exons:
1:957-988 'exon_0_0', rank: 1, frame: ., sequence: gttgcttgaatactgtatagccttgccattgt
1:1045-1057 'exon_0_1', rank: 2, frame: ., sequence: tgtgttgctaact
1:1148-1157 'exon_0_2', rank: 3, frame: ., sequence: agacatggac
CDS : gttgcttgaatactgtatagccttgccattgttgtgttgctaactagacatggac
Protein : VA*ILYSLAIVVLLTRHG?
Layout (coordinates 957-1157): exons at 957-988, 1045-1057 and 1148-1157; introns at 989-1044 and 1058-1147.
Gene: Gene_1:2057-2157
1:2066-2141, strand: +, id:transcript_1, Protein
Exons:
1:2066-2069 'exon_1_0', rank: 1, frame: ., sequence: actt
1:2084-2089 'exon_1_1', rank: 2, frame: ., sequence: cccttt
1:2116-2126 'exon_1_2', rank: 3, frame: ., sequence: tacgcccacgt
1:2133-2141 'exon_1_3', rank: 4, frame: ., sequence: ccgccgctg
CDS : acttcccttttacgcccacgtccgccgctg
Protein : TSLLRPRPPL
Layout (coordinates 2066-2141): exons at 2066-2069, 2084-2089, 2116-2126 and 2133-2141; introns at 2070-2083, 2090-2115 and 2127-2132.
|
TestCasesStructuralTranslocations |
Test case for structural variants: Translocation (fusions)
We create two genes (one transcript each).
|
TestCasesVariantDecompose |
Test cases: apply a variant (MIXED) to a transcript
|
TestCasesVariantRealignment |
Test cases for variant realignment
|
TestCasesVcf |
VCF parsing test cases
|
TestCasesZzz |
Test random SNP changes
|
TestSuiteHgvs |
Invoke all test cases for SnpEff
|
TestSuiteIntegration |
Invoke all integration test cases
|
TestSuiteUnity |
Invoke all test cases for SnpEff
|
TfamEntry |
An entry in a TFAM table.
|
Timer |
|
TPair64 |
Pair of 'long' (64 bits)
|
Transcript |
Interval for a transcript, as well as some other information: exons, utrs, cds, etc.
|
TranscriptSet |
A set of transcripts
|
TranscriptSupportLevel |
Transcript level support
Reference: http://useast.ensembl.org/Help/Glossary?id=492;redirect=no
|
TranslocationReport |
POJO for translocation reports
|
TsTvStats |
Calculate Ts/Tv ratios per sample (transitions vs transversions)
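As a small illustration of the ratio itself: a transition swaps a purine for a purine (A, G) or a pyrimidine for a pyrimidine (C, T), and every other SNP is a transversion. The sketch below computes the ratio for a single list of SNPs written as "A>G"; the class name and input format are assumptions for illustration, and per-sample bookkeeping is left out.

```java
// Hypothetical sketch, not SnpEff code: transition/transversion ratio for a list of SNPs.
final class TsTvSketch {

    static boolean isPurine(char base) {
        char b = Character.toUpperCase(base);
        return b == 'A' || b == 'G';
    }

    static boolean isPyrimidine(char base) {
        char b = Character.toUpperCase(base);
        return b == 'C' || b == 'T';
    }

    /** Transition: purine to purine, or pyrimidine to pyrimidine. */
    static boolean isTransition(char ref, char alt) {
        return (isPurine(ref) && isPurine(alt)) || (isPyrimidine(ref) && isPyrimidine(alt));
    }

    /** Ts/Tv over SNPs written as "A>G"; returns NaN when there are no transversions. */
    static double ratio(String[] snps) {
        int ts = 0;
        int tv = 0;
        for (String snp : snps) {
            char ref = snp.charAt(0);
            char alt = snp.charAt(2);
            if (Character.toUpperCase(ref) == Character.toUpperCase(alt)) continue; // no change
            if (isTransition(ref, alt)) ts++;
            else tv++;
        }
        return tv == 0 ? Double.NaN : (double) ts / tv;
    }
}
```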
|
Tuple<A,B> |
Tuple: A pair of objects
|
TxtSerializable |
|
Upstream |
Interval for a gene, as well as some other information: exons, utrs, cds, etc.
|
Utr |
Interval for a UTR (5 prime UTR and 3 prime UTR)
|
Utr3prime |
Interval for a UTR (5 prime UTR and 3 prime UTR)
|
Utr5prime |
Interval for a UTR (5 prime UTR and 3 prime UTR)
|
Variant |
A variant represents a change in a reference sequence
Notes:
This class was previously known as Variant.
|
Variant.VariantType |
|
VariantBnd |
A 'BND' variant (i.e.
|
VariantEffect |
Effect of a variant.
|
VariantEffect.Coding |
|
VariantEffect.EffectImpact |
|
VariantEffect.ErrorWarningType |
Errors for change effect
|
VariantEffect.FunctionalClass |
This class is only used for SNPs
|
VariantEffectFilter |
A Generic ChangeEffect filter
|
VariantEffectFusion |
Effect of a structural variant (fusion) affecting two genes
|
VariantEffects |
A sorted collection of variant effects
|
VariantEffectStats |
Variants effect statistics
|
VariantEffectStructural |
Effect of a structural variant affecting multiple genes
|
VariantFileIterator |
Opens a sequence change file and iterates over all sequence changes
|
VariantNonRef |
A variant respect to non-reference (e.g.
|
VariantRealign |
Re-align a variant towards the leftmost (rightmost) position
Note: We perform a 'progressive' realignment, asking for more
reference sequence as we need it
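To illustrate what realignment means for a deletion inside a repeat, the sketch below shifts a deletion left or right while the deleted sequence stays equivalent. It is a simplified stand-in, not the VariantRealign algorithm: it assumes the whole reference is already available as a string (so there is no progressive fetching), uses zero-based positions, and all names are hypothetical.

```java
// Hypothetical sketch, not SnpEff code: shift a deletion within a repeat.
final class RealignSketch {

    /** Leftmost equivalent start for a deletion of 'len' bases starting at 'start'. */
    static int leftmost(String ref, int start, int len) {
        int pos = start;
        // Moving left by one keeps the result unchanged if the base entering the
        // deletion equals the base leaving it on the right.
        while (pos > 0 && ref.charAt(pos - 1) == ref.charAt(pos + len - 1)) pos--;
        return pos;
    }

    /** Rightmost equivalent start for the same deletion. */
    static int rightmost(String ref, int start, int len) {
        int pos = start;
        while (pos + len < ref.length() && ref.charAt(pos) == ref.charAt(pos + len)) pos++;
        return pos;
    }

    /** Reference sequence after removing 'len' bases at 'start'. */
    static String applyDeletion(String ref, int start, int len) {
        return ref.substring(0, start) + ref.substring(start + len);
    }
}
```

In the reference `GACACACT`, deleting two bases at position 5 or at position 1 produces the same sequence, so 1 is the leftmost and 5 the rightmost representation.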
|
VariantStats |
Variants statistics
|
VariantTxtFileIterator |
Opens a sequence change file and iterates over all sequence changes
TXT Format: Tab-separated format, containing up to eight columns that correspond to:
chr \t position \t refSeq \t newSeq \t strand \t quality \t coverage \t id \n
Fields strand, quality, coverage and id are optional
E.g.
|
VariantTypeStats |
Count variant types (SNP, MNP, INS, DEL)
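One simple way to assign these types from REF and ALT strings is sketched below. The rules are an assumption made for illustration (equal lengths give SNP or MNP, a pure prefix extension gives INS or DEL) and do not reproduce SnpEff's own classification; the MIXED fallback and all names are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not SnpEff code: classify and count variant types.
final class VariantTypeSketch {

    /** Classify one REF/ALT pair; REF and ALT are assumed to differ. */
    static String classify(String ref, String alt) {
        if (ref.equals(alt)) throw new IllegalArgumentException("REF equals ALT: no change");
        if (ref.length() == alt.length()) return ref.length() == 1 ? "SNP" : "MNP";
        if (alt.startsWith(ref)) return "INS"; // ALT is REF plus inserted bases
        if (ref.startsWith(alt)) return "DEL"; // REF is ALT plus deleted bases
        return "MIXED";                        // assumed fallback for anything else
    }

    /** Count types over {REF, ALT} pairs. */
    static Map<String, Integer> count(String[][] refAltPairs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String[] pair : refAltPairs) {
            counts.merge(classify(pair[0], pair[1]), 1, Integer::sum);
        }
        return counts;
    }
}
```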
|
VariantVcfEntry |
Variant + VcfEntry
This is used to 'outer-join' a VcfEntry into all its constituent variants.
|
VariantWithScore |
A variant that has a numeric score.
|
VcfAnnotator |
Annotate a VCF file: E.g.
|
VcfAnnotatorChain |
Maintains a list of VcfAnnotators and applies them one by one
in the specified order
|
VcfConsequence |
A 'CSQ' entry in a VCF line ('Consequence' from ENSEMBL's VEP)
Format:
##INFO=
|
VcfConsequenceHeader |
A 'CSQ' entry in a VCF header line
|
VcfEffect |
An 'ANN' or 'EFF' entry in a VCF INFO field
Note: 'EFF' is the old version that has been replaced by the standardized 'ANN' field (2014-12)
|
VcfEntry |
A VCF entry (a line) in a VCF file
|
VcfEntry.AlleleFrequencyType |
|
VcfFileIterator |
Opens a VCF file and iterates over all entries
Format: VCF 4.1
Reference: http://www.1000genomes.org/wiki/Analysis/Variant%20Call%20Format/vcf-variant-call-format-version-41
Old 4.0 format: http://www.1000genomes.org/wiki/doku.php?id=1000_genomes:analysis:vcf4.0
1.
|
VcfGenotype |
A VCF genotype field
There is one genotype per sample in each VCF entry
|
VcfHapMapFileIterator |
Opens a Hapmap phased file and iterates over all entries, returning VcfEntries for each line
Note: Each HapMap file has one chromosome.
|
VcfHeader |
Represents the header of a vcf file.
|
VcfHeaderEntry |
Represents an info element in a VCF file's header
References:
https://samtools.github.io/hts-specs/VCFv4.3.pdf
http://www.1000genomes.org/wiki/Analysis/Variant%20Call%20Format/vcf-variant-call-format-version-41
|
VcfHeaderFormat |
|
VcfHeaderInfo |
Represents an info element in a VCF file
References: http://www.1000genomes.org/wiki/Analysis/Variant%20Call%20Format/vcf-variant-call-format-version-41
INFO fields should be described as follows (all keys are required):
##INFO=
Possible Types for INFO fields are: Integer, Float, Flag, Character, and String.
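A minimal parsing sketch, assuming the header layout given in the VCF specification, `##INFO=<ID=ID,Number=number,Type=type,Description="description">`. The class below is hypothetical and is not the VcfHeaderInfo implementation; it expects the four required keys in that order and ignores any extra keys after Description.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not SnpEff code: parse one ##INFO header line.
final class InfoHeaderSketch {
    private static final Pattern INFO = Pattern.compile(
            "##INFO=<ID=([^,]+),Number=([^,]+),Type=([^,]+),Description=\"([^\"]*)\".*>");

    private static final List<String> TYPES =
            Arrays.asList("Integer", "Float", "Flag", "Character", "String");

    /** Returns {ID, Number, Type, Description}; throws if the line does not match. */
    static String[] parse(String line) {
        Matcher m = INFO.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not an INFO header line: " + line);
        }
        return new String[] { m.group(1), m.group(2), m.group(3), m.group(4) };
    }

    /** True if 'type' is one of the INFO types listed above. */
    static boolean isKnownType(String type) {
        return TYPES.contains(type);
    }
}
```

Parsing `##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">` returns the ID `DP`, Number `1`, Type `Integer` and the description text.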
|
VcfHeaderInfo.VcfInfoNumber |
Number of values in an INFO field.
|
VcfInfoType |
|
VcfLof |
An 'LOF' entry in a vcf line
|
VcfNmd |
An 'NMD' entry in a vcf line
|
VcfOutputFormatter |
Formats output as VCF
|
VcfRefAltAlign |
Needleman-Wunsch (global sequence alignment) algorithm for sequence alignment
Only used for short strings (algorithm is not optimized)
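For reference, the core of the Needleman-Wunsch recurrence fits in a few lines. The sketch below computes only the optimal global alignment score using quadratic time and memory, which is acceptable for short strings. The scoring values (match +1, mismatch -1, gap -1) and the class name are assumptions for illustration, not the values used by VcfRefAltAlign.

```java
// Hypothetical sketch, not SnpEff code: Needleman-Wunsch global alignment score.
final class NeedlemanWunschSketch {
    static final int MATCH = 1;     // assumed score for equal bases
    static final int MISMATCH = -1; // assumed score for different bases
    static final int GAP = -1;      // assumed linear gap penalty

    /** Optimal global alignment score of a against b. */
    static int score(String a, String b) {
        int n = a.length();
        int m = b.length();
        int[][] dp = new int[n + 1][m + 1];
        // Aligning a prefix against the empty string costs one gap per base.
        for (int i = 1; i <= n; i++) dp[i][0] = i * GAP;
        for (int j = 1; j <= m; j++) dp[0][j] = j * GAP;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                int diag = dp[i - 1][j - 1] + (a.charAt(i - 1) == b.charAt(j - 1) ? MATCH : MISMATCH);
                int up = dp[i - 1][j] + GAP;   // gap in b
                int left = dp[i][j - 1] + GAP; // gap in a
                dp[i][j] = Math.max(diag, Math.max(up, left));
            }
        }
        return dp[n][m];
    }
}
```

Recovering the alignment itself would additionally require a traceback through the `dp` matrix from the bottom-right cell.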
|
VcfStats |
VCF statistics: These are usually multi-sample statistics
|
VcfWorkQueue |
A work queue that processes a VCF file
Sends batches of VcfEntries to each worker.
|
VersionCheck |
Check if a new version is available
|
WigItem |
Created by IntelliJ IDEA.
|
Work<T> |
A message in the AKKA system.
|
Worker<TI,TO> |
Worker: Performs a simple task and returns the data
TI: Data type in (input for this calculation)
TO: Data type out (result from the calculation)
|
WorkerEff |
Worker agent for SnpEff 'eff' command
|
WorkerVcf |
|
WorkerVcfStr |
A trivial calculation on a VCF that returns a String
|
ZoomDataBlock |
Created by IntelliJ IDEA.
|
ZoomDataRecord |
Created by IntelliJ IDEA.
|
ZoomLevelIterator |
Created by IntelliJ IDEA.
|
ZoomLevelIterator.EmptyIterator |
|