pcannotate Module

annotate_peptides

proteoclade.pcannotate.annotate_peptides(file, db, pctaxa, taxon_levels=None, worker_threads=None)

Drives the taxonomic and gene annotation of peptide-containing files.

Parameters
  • file (string) – csv or txt file containing wide-form peptide entries

  • db (string) – PCDB file containing digested peptides to match against the experimental peptides

  • pctaxa (string) – PCTAXA file containing taxonomic mapping for species and above

  • taxon_levels (None, string, or tuple) – Which taxa to annotate above the organism level (default None)

  • worker_threads (None or integer) – Number of worker threads to use (default None). If None, up to 6 threads are used.

Notes

Outputs a csv or txt file containing all of the original data, with taxonomic and gene annotations appended. The output file is named:

‘annotated_’ + file
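A call to annotate_peptides might look like the following minimal sketch. The check_inputs and annotate helpers, the file names (peptides.csv, reference.pcdb, reference.pctaxa), and the rank names are hypothetical and introduced only for illustration; they are not part of ProteoClade. The import is done lazily so that the argument check can be tried without the package installed.

```python
from pathlib import Path


def check_inputs(file, taxon_levels=None, worker_threads=None):
    """Hypothetical pre-flight check mirroring the documented parameter types."""
    if Path(file).suffix.lower() not in {".csv", ".txt"}:
        raise ValueError("file must be a csv or txt file")
    if taxon_levels is not None and not isinstance(taxon_levels, (str, tuple)):
        raise TypeError("taxon_levels must be None, a string, or a tuple")
    if worker_threads is not None and not isinstance(worker_threads, int):
        raise TypeError("worker_threads must be None or an integer")
    return True


def annotate(file, db, pctaxa, taxon_levels=None, worker_threads=None):
    """Validate the arguments, then hand them to annotate_peptides."""
    check_inputs(file, taxon_levels, worker_threads)
    # Imported here so check_inputs can run without ProteoClade installed.
    from proteoclade import pcannotate
    pcannotate.annotate_peptides(
        file, db, pctaxa,
        taxon_levels=taxon_levels,
        worker_threads=worker_threads,
    )


# Example call (hypothetical file and rank names):
# annotate("peptides.csv", "reference.pcdb", "reference.pctaxa",
#          taxon_levels=("genus", "family"), worker_threads=4)
```

Leaving worker_threads as None lets the function choose its documented default of up to 6 threads.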

annotate_denovo

proteoclade.pcannotate.annotate_denovo(file, db, pctaxa, method='dbconstrain', taxon_levels=None, worker_threads=None)

Drives the annotation of de novo/PSM-containing files.

Parameters
  • file (string) – csv or txt file containing long-form PSM entries

  • db (string) – PCDB file containing digested peptides to match against the experimental peptides

  • pctaxa (string) – PCTAXA file containing taxonomic mapping for species and above

  • method (string) – How PSM candidates are matched against the PCDB (default “dbconstrain”)

    “dbconstrain”: serially checks PSM candidates against the PCDB

    “top”: only looks at the top-scoring PSM candidate

  • taxon_levels (None, string, or tuple) – Which taxa to annotate above the organism level (default None)

  • worker_threads (None or integer) – Number of worker threads to use (default None). If None, up to 6 threads are used.
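The difference between the two method options can be illustrated with a small sketch. This is one reading of the descriptions above, not ProteoClade's actual implementation: the pick_candidate function, the (peptide, score) tuple layout, the use of a Python set to stand in for the PCDB, and the assumption that a higher score is better are all hypothetical.

```python
def pick_candidate(candidates, pcdb_peptides, method="dbconstrain"):
    """Illustrative selection of one peptide from scored PSM candidates.

    candidates: list of (peptide, score) tuples (hypothetical layout).
    pcdb_peptides: set of peptides standing in for the PCDB (assumption).
    """
    if not candidates:
        return None
    # Assumption: a higher score means a better candidate.
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    if method == "top":
        # Only the best-scoring candidate is considered.
        best = ranked[0][0]
        return best if best in pcdb_peptides else None
    if method == "dbconstrain":
        # Candidates are checked one after another until one is in the database.
        for peptide, _score in ranked:
            if peptide in pcdb_peptides:
                return peptide
        return None
    raise ValueError(f"unknown method: {method!r}")


candidates = [("PEPTIDEK", 0.91), ("PEPTLDEK", 0.84), ("PEPSIDEK", 0.60)]
pcdb = {"PEPTLDEK", "PEPSIDEK"}
print(pick_candidate(candidates, pcdb, "dbconstrain"))  # PEPTLDEK
print(pick_candidate(candidates, pcdb, "top"))          # None
```

In this toy example “dbconstrain” falls through to the second-ranked candidate because the first is absent from the database, while “top” reports no match.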

Notes

Outputs csv or txt files containing all of the original data, with taxonomic and gene annotations appended. The output files are named:

‘denovo_matched_’ + file

‘annotated_’ + ‘denovo_matched_’ + file
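To locate the results afterwards, the expected output names can be derived from the input name. The sketch below is hypothetical (expected_denovo_outputs is not a ProteoClade function) and assumes that the prefixes are applied to the base file name and that the annotated file name is built by prefixing the matched file name, underscore included. It returns names only, since the output directory is not stated here.

```python
from pathlib import Path


def expected_denovo_outputs(file):
    """Return the two expected output file names for an input file (assumed scheme)."""
    name = Path(file).name
    matched = "denovo_matched_" + name
    # Assumption: the annotated file is named after the matched file.
    annotated = "annotated_" + matched
    return matched, annotated


print(expected_denovo_outputs("psms.csv"))
# ('denovo_matched_psms.csv', 'annotated_denovo_matched_psms.csv')
```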

filter_taxa

proteoclade.pcannotate.filter_taxa(file, taxon_levels, taxa, unique=False)

Filters peptide files based on desired taxa.

Parameters
  • file (string) – csv or txt file containing wide-form peptide entries

  • taxon_levels (string, list, or tuple) – Taxonomic ranks to include in the file search. These ranks must already be annotated in the file

  • taxa (string, list, or tuple) – Taxa to include in filter

  • unique (bool) – Whether specified taxa must be unique in their given taxonomic rank (default False)

Notes

Outputs a csv or txt file pared down according to the filter specifications. The output file is named:

‘filtered_’ + file name
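A filtering call might be wrapped as in the following minimal sketch. The as_tuple and filter_annotated helpers, the input file name, and the rank and taxon names are hypothetical; only filter_taxa and its parameters come from the documentation above. The wrapper returns the expected output name based on the ‘filtered_’ prefix, without assuming where the file is written.

```python
from pathlib import Path


def as_tuple(value):
    """Normalize a string, list, or tuple to a tuple (hypothetical helper)."""
    if isinstance(value, str):
        return (value,)
    return tuple(value)


def filter_annotated(file, taxon_levels, taxa, unique=False):
    """Call filter_taxa and return the expected output file name."""
    # Imported here so as_tuple can be used without ProteoClade installed.
    from proteoclade import pcannotate
    pcannotate.filter_taxa(file, as_tuple(taxon_levels), as_tuple(taxa), unique=unique)
    return "filtered_" + Path(file).name


# Example call (hypothetical file, rank, and taxon names):
# filter_annotated("annotated_peptides.csv", "genus", ["Homo", "Mus"], unique=True)
```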