hit

A Hit is created when hmmsearch finds similarities between a profile and a protein of the input dataset.

../../_images/gene_obj_interaction.svg

A diagram showing the interactions between CoreGene, ModelGene, Model, Hit and ValidHit.

The diagram above represents the models, genes and hits generated from the definitions below.

<model name="A" inter_gene_max_space="2">
    <gene name="abc" presence="mandatory"/>
    <gene name="def" presence="accessory"/>
</model>

<model name="B" inter_gene_max_space="5">
    <gene name="def" presence="mandatory">
        <exchangeables>
            <gene name="abc"/>
        </exchangeables>
    </gene>
    <gene name="ghj" presence="accessory"/>
</model>

hit

class macsypy.hit.Hit(gene, hit_id, hit_seq_length, replicon_name, position_hit, i_eval, score, profile_coverage, sequence_coverage, begin_match, end_match)[source]

Handle the hits filtered from the Hmmer search. The hits are instantiated by the HMMReport.extract() method.

__eq__(other)[source]

Return True if two hits are totally equivalent, False otherwise.

Parameters

other (macsypy.hit.Hit object) – the hit to compare to the current object

Returns

the result of the comparison

Return type

boolean

__gt__(other)[source]

Compare two Hits. If the sequence identifier is the same, the comparison is done on the score. Otherwise, the sequence identifiers are compared alphabetically.

Parameters

other (macsypy.hit.Hit object) – the hit to compare to the current object

Returns

True if self is > other, False otherwise

__hash__()[source]

Makes the Hit hashable, which is required to put it in a set or to use it as a dict key.

__init__(gene, hit_id, hit_seq_length, replicon_name, position_hit, i_eval, score, profile_coverage, sequence_coverage, begin_match, end_match)[source]
Parameters
  • gene (macsypy.gene.CoreGene object) – the gene corresponding to this profile

  • hit_id (str) – the identifier of the hit

  • hit_seq_length (int) – the length of the hit sequence

  • replicon_name (str) – the name of the replicon

  • position_hit (int) – the rank of the sequence matched in the input dataset file

  • i_eval (float) – the best-domain evalue (i-evalue, “independent evalue”)

  • score (float) – the score of the hit

  • profile_coverage (float) – percentage of the profile that matches the hit sequence

  • sequence_coverage (float) – percentage of the hit sequence that matches the profile

  • begin_match (int) – the position in the sequence where the match with the profile starts

  • end_match (int) – the position in the sequence where the match with the profile ends
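As a rough illustration of how sequence_coverage could relate to begin_match, end_match and hit_seq_length, the sketch below assumes 1-based inclusive match coordinates and expresses the coverage as a fraction between 0 and 1. Both the formula and the helper name sequence_coverage_sketch are assumptions made for illustration; they are not taken from macsypy.

```python
def sequence_coverage_sketch(begin_match: int, end_match: int, hit_seq_length: int) -> float:
    """Fraction of the hit sequence spanned by the match (assumed formula)."""
    # Assumption: both coordinates are 1-based and inclusive.
    matched_residues = end_match - begin_match + 1
    return matched_residues / hit_seq_length


# A match spanning residues 11..110 of a 200-residue protein.
coverage = sequence_coverage_sketch(begin_match=11, end_match=110, hit_seq_length=200)
print(coverage)
```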

__lt__(other)[source]

Compare two Hits. If the sequence identifier is the same, the comparison is done on the score. Otherwise, the sequence identifiers are compared alphabetically.

Parameters

other (macsypy.hit.Hit object) – the hit to compare to the current object

Returns

True if self is < other, False otherwise
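The ordering rule described for __lt__ and __gt__ can be sketched with a small stand-in class. HitSketch is hypothetical and models only an identifier and a score; the attribute names id and score are assumptions, and the real Hit carries many more fields.

```python
class HitSketch:
    """Hypothetical stand-in for macsypy.hit.Hit (identifier and score only)."""

    def __init__(self, hit_id: str, score: float) -> None:
        self.id = hit_id
        self.score = score

    def __lt__(self, other: "HitSketch") -> bool:
        # Same sequence identifier: compare on the score.
        if self.id == other.id:
            return self.score < other.score
        # Different identifiers: alphabetical comparison.
        return self.id < other.id

    def __gt__(self, other: "HitSketch") -> bool:
        if self.id == other.id:
            return self.score > other.score
        return self.id > other.id


hits = [HitSketch("prot_B", 10.0), HitSketch("prot_A", 50.0), HitSketch("prot_A", 20.0)]
# sorted() relies on __lt__, so hits are grouped by identifier, then ranked by score.
ordered = [(h.id, h.score) for h in sorted(hits)]
print(ordered)
```

With such an ordering, sorting a list of hits groups the hits on the same sequence together, from the lowest to the highest score.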

__str__()[source]
Returns

Useful information about the Hit: Hmmer statistics and sequence information

Return type

str

__weakref__

list of weak references to the object (if defined)

get_position()[source]
Returns

the position of the hit (rank in the input dataset file)

Return type

integer

class macsypy.hit.HitWeight(itself: float = 1, exchangeable: float = 0.8, mandatory: float = 1, accessory: float = 0.5, neutral: float = 0, loner_multi_system: float = 0.7)[source]

The weights used to compute the cluster and system scores. See the "macsyfinder functioning" section of the user documentation for further details. By default:

  • itself = 1

  • exchangeable = 0.8

  • mandatory = 1

  • accessory = 0.5

  • neutral = 0

  • loner_multi_system = 0.7
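The signature and the generated __setattr__, __delattr__, __eq__ and __hash__ methods are consistent with an immutable dataclass. The sketch below reproduces the documented defaults under that assumption; HitWeightSketch is a hypothetical stand-in, not the macsypy class itself.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class HitWeightSketch:
    """Hypothetical stand-in mirroring the documented HitWeight defaults."""
    itself: float = 1
    exchangeable: float = 0.8
    mandatory: float = 1
    accessory: float = 0.5
    neutral: float = 0
    loner_multi_system: float = 0.7


default = HitWeightSketch()
# A frozen instance cannot be modified in place; derive a new one instead.
custom = replace(default, accessory=0.3)

try:
    default.accessory = 0.3
    mutated = True
except FrozenInstanceError:
    mutated = False

print(default, custom, mutated)
```

Because frozen instances also get value-based __eq__ and __hash__, two sets of weights with identical values compare equal and can be used as dict keys.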

__delattr__(name)

Implement delattr(self, name).

__eq__(other)

Return self==value.

__hash__()

Return hash(self).

__init__(itself: float = 1, exchangeable: float = 0.8, mandatory: float = 1, accessory: float = 0.5, neutral: float = 0, loner_multi_system: float = 0.7) → None
__repr__()

Return repr(self).

__setattr__(name, value)

Implement setattr(self, name, value).

__weakref__

list of weak references to the object (if defined)

class macsypy.hit.ValidHit(hit, gene_ref, gene_status)[source]

Encapsulates a macsypy.hit.Hit. This class stores a Hit that has been attributed to a putative system. Thus, it also stores:

  • the system,

  • the status of the gene in this system (‘mandatory’, ‘accessory’, …),

  • the gene in the model for which it’s an occurrence

__eq__(other)[source]

Return self==value.

__gt__(other)[source]

Return self>value.

__hash__ = None
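In Python, __hash__ is set to None when a class defines __eq__ without defining __hash__, which makes its instances unhashable. Unlike Hit, a ValidHit therefore cannot be put in a set or used as a dict key. The minimal sketch below shows the mechanism with a hypothetical Wrapper class that is unrelated to macsypy.

```python
class Wrapper:
    """Hypothetical class that defines __eq__ but no __hash__."""

    def __init__(self, value: int) -> None:
        self.value = value

    def __eq__(self, other: object) -> bool:
        return isinstance(other, Wrapper) and self.value == other.value


# Python implicitly sets Wrapper.__hash__ to None.
try:
    {Wrapper(1)}  # building a set requires hashable elements
    hashable = True
except TypeError:
    hashable = False

print(Wrapper.__hash__, hashable)
```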
__init__(hit, gene_ref, gene_status)[source]
Parameters
__lt__(other)[source]

Return self<value.

__weakref__

list of weak references to the object (if defined)

property loner
Returns

True if the hit represents a loner macsypy.gene.ModelGene, False otherwise.

property multi_system
Returns

True if the hit represents a multi_systems macsypy.gene.ModelGene, False otherwise.

macsypy.hit.get_best_hits(hits, key='score')[source]

If several hits match the same protein, keep only the best match, based on one of the following criteria:

  • score

  • i_evalue

  • profile_coverage

Parameters
  • hits ([ macsypy.hit.Hit object, …]) – the hits to filter, all hits must match the same protein.

  • key (str) – the criterion used to select the best hit: ‘score’, ‘i_evalue’ or ‘profile_coverage’

Returns

the list of the best hits

Return type

[ macsypy.hit.Hit object, …]
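The selection described above can be sketched as follows. This is an illustration under stated assumptions, not the macsypy implementation: FakeHit is a hypothetical stand-in for Hit, proteins are identified by the hit identifier, the ‘i_evalue’ criterion is assumed to read the i_eval attribute, and a lower i-evalue is assumed to be better while higher scores and coverages are better.

```python
from collections import namedtuple
from operator import attrgetter

# Hypothetical stand-in for macsypy.hit.Hit; only the fields used here are modelled.
FakeHit = namedtuple("FakeHit", "id score i_eval profile_coverage")

# Assumed mapping: criterion name -> (attribute to read, function picking the best value).
CRITERIA = {
    "score": ("score", max),
    "i_evalue": ("i_eval", min),
    "profile_coverage": ("profile_coverage", max),
}


def best_hits_sketch(hits, key="score"):
    """Keep one hit per protein, chosen according to `key`."""
    if key not in CRITERIA:
        raise ValueError(f"unknown criterion: {key!r}")
    attribute, pick = CRITERIA[key]
    by_protein = {}
    for hit in hits:
        # Assumption: hits sharing an identifier match the same protein.
        by_protein.setdefault(hit.id, []).append(hit)
    return [pick(group, key=attrgetter(attribute)) for group in by_protein.values()]


hits = [
    FakeHit("prot_1", 30.0, 1e-5, 0.9),
    FakeHit("prot_1", 45.0, 1e-3, 0.6),
    FakeHit("prot_2", 12.0, 1e-2, 0.7),
]
print(best_hits_sketch(hits))
print(best_hits_sketch(hits, key="i_evalue"))
```

Note that the criteria can disagree: for prot_1 the hit with the highest score is not the one with the lowest i-evalue, so the choice of key changes which hit is kept.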