Package picard.util

Class AdapterMarker


  • public class AdapterMarker
    extends Object
    Stores one or more AdapterPairs used to mark adapter sequence in SAMRecords. Marking is very compute-intensive, so this class implements two heuristics to reduce computation:
    - Adapter sequences are truncated, and any adapter pairs that become identical after truncation are collapsed into a single pair.
    - After a specified number of reads with adapter sequence have been seen, the list of adapter pairs is pruned to include only the most frequently seen adapters. For a flowcell, there should only be a single adapter pair found.
    Note that the AdapterPair object returned by all the adapterTrim* methods will not be one of the original AdapterPairs passed to the constructor, but rather one of the truncated copies.
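The truncate-and-collapse heuristic described above can be sketched without Picard on the classpath. This is a minimal illustration, not Picard's implementation: plain strings stand in for AdapterPair objects, the class and method names are hypothetical, and keeping the leading bases when truncating is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Illustrative only: plain strings stand in for Picard's AdapterPair objects. */
final class AdapterTruncationSketch {

    /**
     * Truncates each adapter to at most maxLength bases (keeping the leading
     * bases, which is an assumption of this sketch) and drops entries that
     * become identical after truncation. Input order is preserved.
     */
    static List<String> truncateAndCollapse(List<String> adapters, int maxLength) {
        Set<String> unique = new LinkedHashSet<>();
        for (String adapter : adapters) {
            int end = Math.min(maxLength, adapter.length());
            unique.add(adapter.substring(0, end));
        }
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        // The first two adapters differ only after base 8, so they collapse.
        List<String> adapters = Arrays.asList("ACGTACGTAA", "ACGTACGTCC", "TTGCA");
        System.out.println(truncateAndCollapse(adapters, 8)); // prints [ACGTACGT, TTGCA]
    }
}
```

Collapsing means fewer adapters are compared against every read, which is where the saving comes from. It also shows why the returned adapter is a truncated copy and not one of the originals.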
    • Field Detail

      • DEFAULT_PRUNE_ADAPTER_LIST_AFTER_THIS_MANY_ADAPTERS_SEEN

        public static final int DEFAULT_PRUNE_ADAPTER_LIST_AFTER_THIS_MANY_ADAPTERS_SEEN
        See Also:
        Constant Field Values
      • DEFAULT_NUM_ADAPTERS_TO_KEEP

        public static final int DEFAULT_NUM_ADAPTERS_TO_KEEP
        See Also:
        Constant Field Values
    • Constructor Detail

      • AdapterMarker

        public AdapterMarker​(AdapterPair... originalAdapters)
        Truncates adapters to DEFAULT_ADAPTER_LENGTH.
        Parameters:
        originalAdapters - These should be in order from longest & most likely to shortest & least likely.
      • AdapterMarker

        public AdapterMarker​(int adapterLength,
                             AdapterPair... originalAdapters)
        Parameters:
        adapterLength - Truncate adapters to this length.
        originalAdapters - These should be in order from longest & most likely to shortest & least likely.
    • Method Detail

      • getNumAdaptersToKeep

        public int getNumAdaptersToKeep()
      • setNumAdaptersToKeep

        public AdapterMarker setNumAdaptersToKeep​(int numAdaptersToKeep)
        Once the number of matched adapters reaches the threshold (see setThresholdForSelectingAdaptersToKeep), keep up to this many of the original adapters.
      • getThresholdForSelectingAdaptersToKeep

        public int getThresholdForSelectingAdaptersToKeep()
      • setThresholdForSelectingAdaptersToKeep

        public AdapterMarker setThresholdForSelectingAdaptersToKeep​(int thresholdForSelectingAdaptersToKeep)
        When this number of adapters has been matched, discard the least-frequently matching ones.
        Parameters:
        thresholdForSelectingAdaptersToKeep - set to -1 to never discard any adapters.
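How the threshold and the number of adapters to keep interact can be sketched as follows. This is a hedged illustration only: the class, its fields, and the one-time pruning rule are assumptions made for the example (strings stand in for adapter pairs), and Picard's internal bookkeeping may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: counts matches per adapter and prunes once at a threshold. */
final class AdapterPruningSketch {
    private final Map<String, Integer> matchCounts = new LinkedHashMap<>();
    private final int threshold;  // -1 means never prune
    private final int numToKeep;
    private int totalMatches = 0;
    private boolean pruned = false;

    AdapterPruningSketch(List<String> adapters, int threshold, int numToKeep) {
        for (String adapter : adapters) {
            matchCounts.put(adapter, 0);
        }
        this.threshold = threshold;
        this.numToKeep = numToKeep;
    }

    /** Records one match and prunes the list once the threshold is reached. */
    void recordMatch(String adapter) {
        Integer count = matchCounts.get(adapter);
        if (count == null) {
            return; // unknown or already-pruned adapter
        }
        matchCounts.put(adapter, count + 1);
        totalMatches++;
        if (!pruned && threshold >= 0 && totalMatches >= threshold) {
            prune();
        }
    }

    private void prune() {
        List<String> ranked = new ArrayList<>(matchCounts.keySet());
        // Most frequently matched first; the sort is stable, so ties keep input order.
        ranked.sort((a, b) -> Integer.compare(matchCounts.get(b), matchCounts.get(a)));
        List<String> keep = ranked.subList(0, Math.min(numToKeep, ranked.size()));
        matchCounts.keySet().retainAll(new HashSet<>(keep));
        pruned = true;
    }

    List<String> remaining() {
        return new ArrayList<>(matchCounts.keySet());
    }

    public static void main(String[] args) {
        AdapterPruningSketch sketch =
                new AdapterPruningSketch(Arrays.asList("A", "B", "C"), 5, 1);
        for (String hit : new String[] {"B", "B", "A", "B", "C"}) {
            sketch.recordMatch(hit);
        }
        System.out.println(sketch.remaining()); // prints [B]
    }
}
```

With a threshold of -1 the pruning branch is never taken, so every adapter stays in the list for the whole run.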
      • getMinSingleEndMatchBases

        public int getMinSingleEndMatchBases()
      • setMinSingleEndMatchBases

        public AdapterMarker setMinSingleEndMatchBases​(int minSingleEndMatchBases)
        Parameters:
        minSingleEndMatchBases - When marking a single-end read, adapter must match at least this many bases.
      • getMinPairMatchBases

        public int getMinPairMatchBases()
      • setMinPairMatchBases

        public AdapterMarker setMinPairMatchBases​(int minPairMatchBases)
        Parameters:
        minPairMatchBases - When marking a paired-end read, adapter must match at least this many bases.
      • getMaxSingleEndErrorRate

        public double getMaxSingleEndErrorRate()
      • setMaxSingleEndErrorRate

        public AdapterMarker setMaxSingleEndErrorRate​(double maxSingleEndErrorRate)
        Parameters:
        maxSingleEndErrorRate - For a single-end read, no more than this fraction of the bases that align with the adapter can mismatch the adapter and still be considered an adapter match.
      • getMaxPairErrorRate

        public double getMaxPairErrorRate()
      • setMaxPairErrorRate

        public AdapterMarker setMaxPairErrorRate​(double maxPairErrorRate)
        Parameters:
        maxPairErrorRate - For a paired-end read, no more than this fraction of the bases that align with the adapter can mismatch the adapter and still be considered an adapter match.
      • adapterTrimIlluminaSingleRead

        public AdapterPair adapterTrimIlluminaSingleRead​(htsjdk.samtools.SAMRecord read)
      • findAdapterPairAndIndexForSingleRead

        public htsjdk.samtools.util.Tuple<AdapterPair,​Integer> findAdapterPairAndIndexForSingleRead​(byte[] read,
                                                                                                          int templateIndex)
        Returns the adapter to be trimmed from a read represented as a byte array (byte[]).
        Parameters:
        read - The byte array of read data
        templateIndex - The paired index of the reads (1 or 2, 1 for single ended reads)
        Returns:
        The adapter pair that matched the read and its index in the read.
      • adapterTrimIlluminaPairedReads

        public AdapterPair adapterTrimIlluminaPairedReads​(htsjdk.samtools.SAMRecord read1,
                                                          htsjdk.samtools.SAMRecord read2)
      • adapterTrimIlluminaSingleRead

        public AdapterPair adapterTrimIlluminaSingleRead​(htsjdk.samtools.SAMRecord read,
                                                         int minMatchBases,
                                                         double maxErrorRate)
        Overrides defaults for minMatchBases and maxErrorRate.
      • findAdapterPairAndIndexForSingleRead

        public htsjdk.samtools.util.Tuple<AdapterPair,​Integer> findAdapterPairAndIndexForSingleRead​(byte[] read,
                                                                                                          int minMatchBases,
                                                                                                          double maxErrorRate,
                                                                                                          int templateIndex)
        Returns the adapter to be trimmed from a read represented as a byte array (byte[]).
        Parameters:
        read - The byte array of read data
        minMatchBases - The minimum number of base matches required for adapter matching
        maxErrorRate - The maximum error rate allowed for adapter matching
        templateIndex - The paired index of the reads (1 or 2, 1 for single ended reads)
        Returns:
        The adapter pair that matched the read and its index in the read or null.
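To make the roles of minMatchBases and maxErrorRate concrete, here is a minimal sketch of locating an adapter in a read held as a byte array. It is not Picard's matching code: the class and method names are hypothetical, a single adapter sequence stands in for an AdapterPair, the left-to-right scan order is an assumption, and -1 is used here in place of a null result.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative only: scans a read for the first position where an adapter matches. */
final class AdapterScanSketch {
    static final int NO_MATCH = -1;

    /**
     * Returns the first index in the read at which the adapter aligns with at
     * least minMatchBases overlapping bases and a mismatch fraction no greater
     * than maxErrorRate, or NO_MATCH if there is no such index.
     */
    static int findAdapterStart(byte[] read, byte[] adapter,
                                int minMatchBases, double maxErrorRate) {
        for (int start = 0; start + minMatchBases <= read.length; start++) {
            // The adapter may run off the end of the read, so compare only the overlap.
            int overlap = Math.min(read.length - start, adapter.length);
            if (overlap < minMatchBases) {
                continue;
            }
            int allowed = (int) Math.floor(overlap * maxErrorRate);
            int mismatches = 0;
            for (int i = 0; i < overlap && mismatches <= allowed; i++) {
                if (read[start + i] != adapter[i]) {
                    mismatches++;
                }
            }
            if (mismatches <= allowed) {
                return start;
            }
        }
        return NO_MATCH;
    }

    public static void main(String[] args) {
        byte[] read = "GGGGGGACGTAC".getBytes(StandardCharsets.US_ASCII);
        byte[] adapter = "ACGTACGTAC".getBytes(StandardCharsets.US_ASCII);
        // The last six bases of the read match the start of the adapter exactly.
        System.out.println(findAdapterStart(read, adapter, 4, 0.1)); // prints 6
    }
}
```

Raising minMatchBases reduces spurious matches near the end of a read, where only a few bases overlap, while raising maxErrorRate tolerates more sequencing errors inside the adapter.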
      • adapterTrimIlluminaPairedReads

        public AdapterPair adapterTrimIlluminaPairedReads​(htsjdk.samtools.SAMRecord read1,
                                                          htsjdk.samtools.SAMRecord read2,
                                                          int minMatchBases,
                                                          double maxErrorRate)
        Overrides defaults for minMatchBases and maxErrorRate.