Document Type

Journal Article

Publication Date

8-4-2014

Journal

BMC Bioinformatics

Volume

Volume 15

Inclusive Pages

Article number 262

Abstract

Background

The use of sequencing technologies to investigate the microbiome of a sample can positively impact patient healthcare by providing therapeutic targets for personalized disease treatment. However, these samples contain genomic sequences from various sources that complicate the identification of pathogens.

Results

Here we present Clinical PathoScope, a pipeline to rapidly and accurately remove host contamination, isolate microbial reads, and identify potential disease-causing pathogens. We have accomplished three essential tasks in the development of Clinical PathoScope. First, we developed an optimized framework for pathogen identification using a computational subtraction methodology in concordance with read trimming and ambiguous read reassignment. Second, we have demonstrated the ability of our approach to identify multiple pathogens in a single clinical sample, accurately identify pathogens at the subspecies level, and determine the nearest phylogenetic neighbor of novel or highly mutated pathogens using real clinical sequencing data. Finally, we have shown that Clinical PathoScope outperforms previously published pathogen identification methods with regard to computational speed, sensitivity, and specificity.

Conclusions

Clinical PathoScope is the only pathogen identification method currently available that can identify multiple pathogens from mixed samples and distinguish between very closely related species and strains in samples with very few reads per pathogen. Furthermore, Clinical PathoScope does not rely on genome assembly and thus can more rapidly complete the analysis of a clinical sample when compared with current assembly-based methods. Clinical PathoScope is freely available at: http://sourceforge.net/projects/pathoscope/

Comments

Reproduced with permission of BioMed Central Bioinformatics.

Creative Commons License

This work is licensed under a Creative Commons Attribution 3.0 License.

Peer Reviewed

Yes

Open Access

Yes

Workflow employed to develop the Clinical PathoScope pipeline.pdf (19 kB)
Workflow employed to develop the Clinical PathoScope pipeline

Viral genomes with human ribosomal RNA contamination.txt (1 kB)
Viral genomes with human ribosomal RNA contamination

Simulated data summary & code.xlsx (37 kB)
Simulated data summary & code

Alignment optimization variables and methods.pdf (75 kB)
Alignment optimization variables and methods

Commands and versions of alignment algorithms evaluated.docx (21 kB)
Commands and versions of alignment algorithms evaluated

Results of all alignment runs.xlsx (40 kB)
Results of all alignment runs

Subtraction and filtration optimization methods.pdf (39 kB)
Subtraction and filtration optimization methods

Overview of clinical datasets used to evaluate Clinical PathoScope.xlsx (10 kB)
Overview of clinical datasets used to evaluate Clinical PathoScope

List of candidate primers and adapters used for quality control filtering.txt (1 kB)
List of candidate primers and adapters used for quality control filtering

Phylogeny of 16S genes for genera found in clinical samples.pdf (176 kB)
Phylogeny of 16S genes for genera found in clinical samples

Read coverage for 16S genes and nearest phylogenetic neighbors.pdf (1666 kB)
Read coverage for 16S genes and nearest phylogenetic neighbors
