ProteoClade 5 Minute Demo

Want to see ProteoClade in action? This tutorial provides a quick run-through of both the targeted and de novo workflows to demonstrate the tool’s features.

Prepare Data

  1. Install ProteoClade

  2. Download some example data and extract the files to a folder. “targeted_humouse_example.txt” is a truncated and reformatted MaxQuant search from a patient-derived xenograft data set, while “denovo_bacteria_example.csv” is a truncated de novo PEAKS search from an oral microbiome data set.

  3. Navigate to the folder with the example data, open a Python 3 shell, and import ProteoClade:

    >>> from proteoclade import *
    
  4. Download and assemble taxonomy information from the NCBI:

    >>> download_taxonomy()
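
The annotation steps in both examples below need the name of the PCTAXA file produced by this step. As a convenience, the file can be located with the standard library rather than typed by hand. This is a minimal sketch, assuming the file is written to the current working directory with a “.pctaxa” extension; `newest_pctaxa` is a hypothetical helper and not part of ProteoClade.

```python
from pathlib import Path


def newest_pctaxa(folder="."):
    """Return the most recently modified *.pctaxa file in `folder`, or None.

    Hypothetical helper (not a ProteoClade function). Assumes the taxonomy
    file carries a .pctaxa extension and sits directly in `folder`.
    """
    candidates = sorted(
        Path(folder).glob("*.pctaxa"),
        key=lambda path: path.stat().st_mtime,  # oldest first, newest last
    )
    return candidates[-1] if candidates else None


# Prints the path of the newest PCTAXA file, or None if there is none yet.
print(newest_pctaxa())
```

If a file is found, `str(newest_pctaxa())` can be passed wherever the later steps show 'XXXXXX.pctaxa'.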
    

Targeted Database Example

  1. Download protein sequence information from UniProt:

    >>> download_uniprot((9606,'sr'),(10090,'sr'), download_folder = 'pdxseq')
        # Downloads the human (9606) and mouse (10090) proteomes by Taxon ID.
    
  2. Create a PCDB for patient-derived xenografts:

    >>> create_pcdb('humouse', 'pdxseq')
    
  3. Annotate the targeted experiment. Replace XXXXXX with the date/name of the PCTAXA file you generated in step 4 of “Prepare Data”:

    >>> annotate_peptides('targeted_humouse_example.txt', 'humouse.pcdb', 'XXXXXX.pctaxa', taxon_levels = ('species','phylum'))
    
  4. Roll up peptide information to gene symbols:

    >>> roll_up('annotated_targeted_humouse_example.txt')
    

Results: “rollup_annotated_targeted_humouse_example.txt” now contains genes derived from species-specific peptides and their summed ion intensities.
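
To check the result without opening a spreadsheet, the first few rows can be read with Python's csv module. This is a sketch under one assumption: that the roll-up output is tab-delimited, like the MaxQuant-derived input. `preview` is a hypothetical helper, not a ProteoClade function; pass a different `delimiter` if the file turns out to be formatted otherwise.

```python
import csv
from itertools import islice


def preview(path, delimiter="\t", n_rows=5):
    """Return (header, rows): the header and first `n_rows` data rows.

    Hypothetical helper. The tab delimiter is an assumption about the
    output format, not something the tutorial states.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        header = next(reader, [])           # empty list for an empty file
        rows = list(islice(reader, n_rows))  # at most n_rows data rows
    return header, rows


# Example call (requires the file produced in step 4):
# header, rows = preview('rollup_annotated_targeted_humouse_example.txt')
# print(header)
# for row in rows:
#     print(row)
```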

de novo Example

  1. Download protein sequence information from UniProt:

    >>> download_uniprot((1891914,'a'),(1283313,'a'), download_folder = 'denovoseq')
        # Downloads Streptococcus oralis and Alloprevotella proteomes by Taxon ID.
    
  2. Create a PCDB for the bacterial proteomes:

    >>> create_pcdb('bacteria', 'denovoseq')
    
  3. Annotate the de novo experiment. Replace XXXXXX with the date/name of the PCTAXA file you generated in step 4 of “Prepare Data”:

    >>> annotate_denovo('denovo_bacteria_example.csv', 'bacteria.pcdb', 'XXXXXX.pctaxa', taxon_levels = ('species','phylum'))
    
  4. Roll up peptide information to gene symbols:

    >>> roll_up('annotated_denovo_matched_denovo_bacteria_example.csv')
    

Results: “annotated_denovo_matched_denovo_bacteria_example.csv” contains species and phyla annotations for the de novo data set, while “rollup_annotated_denovo_matched_denovo_bacteria_example.csv” contains peptides summed to gene symbols. Note that although this de novo data set does not contain quantitative information, spectral counts are provided in additional columns.
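
One way to summarize the annotated file is to count how many rows fall under each value of an annotation column. The sketch below uses only the standard library. The column name in the example call is hypothetical, since the exact headers are not listed here; check the first line of the CSV and substitute the real name. `tally_column` is likewise an illustrative helper, not part of ProteoClade.

```python
import csv
from collections import Counter


def tally_column(path, column, delimiter=","):
    """Count how often each value appears in `column` of a delimited file.

    Hypothetical helper. Raises KeyError if the column is not in the header,
    so a wrong guess at the column name fails loudly.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter=delimiter)
        if column not in (reader.fieldnames or []):
            raise KeyError(f"column {column!r} not found in {path}")
        return Counter(row[column] for row in reader)


# Example call; 'species' is an assumed column name, not a confirmed one:
# counts = tally_column('annotated_denovo_matched_denovo_bacteria_example.csv', 'species')
# for value, n in counts.most_common(10):
#     print(value, n)
```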