Title: Automated profiling of the human virome from raw metagenomic data
Authors: Moosa Y, Vilsker M, Vanden Eyden E, Fonseca V, Nooij S, Deforche K, de Oliveira T.
Journal: Virus Evolution, doi: 10.1093/ve/vew036.033 (2017)
Viruses influence human health as conventional pathogens, as modulators of gene expression, and through their involvement in complex host-microbiome interactions. Next-generation sequencing (NGS) has enabled us to explore the role of the microbiome in human health and disease.
Metagenomic sequencing should allow us to profile all biological elements in a clinical sample in an unbiased, hypothesis-free way. However, viruses display much greater sequence variation than all other elements, and the existing tools and methods for virus identification and discovery are not effective enough.
We have developed a bioinformatics pipeline to identify and classify all known viruses present in a metagenomic sample. Viral NGS reads are identified using DIAMOND, a protein-based alignment method that is substantially faster than the standard BLAST method and more reliable for viruses. These reads are automatically assembled into contigs using SPAdes, a de novo assembler. The contigs are then used to classify the virus at the species level using a pan-viral typing tool based on all available taxonomic reference sequences from the International Committee on Taxonomy of Viruses (ICTV) database.
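The first two stages described above (protein-level read identification, then de novo assembly) can be sketched roughly as follows. This is only an illustration, not the authors' Java implementation: the helper names, file names, database name, thread count, and the e-value cutoff of 1e-5 are all assumptions, since the abstract gives no parameters. The commands are built but not executed, and the classification step is left out because the typing tool's interface is not described here.

```python
import csv
import io


def diamond_blastx_cmd(reads, db, out_tsv, threads=4):
    """Build a DIAMOND command that aligns translated reads to a protein database.

    File names, the database name, and the thread count are hypothetical.
    """
    return [
        "diamond", "blastx",
        "--db", db,            # protein reference database (assumed to be viral)
        "--query", reads,      # raw NGS reads
        "--out", out_tsv,      # alignment results
        "--outfmt", "6",       # BLAST-style tabular output
        "--threads", str(threads),
    ]


def spades_cmd(reads_1, reads_2, out_dir):
    """Build a SPAdes command that assembles paired-end reads into contigs."""
    return ["spades.py", "-1", reads_1, "-2", reads_2, "-o", out_dir]


def viral_read_ids(tabular_hits, max_evalue=1e-5):
    """Collect IDs of reads with at least one hit at or below max_evalue.

    Expects the 12 default BLAST tabular columns: column 1 is the read ID
    and column 11 is the e-value. The cutoff is an assumed value.
    """
    ids = set()
    for row in csv.reader(tabular_hits, delimiter="\t"):
        if len(row) < 12:
            continue  # skip blank or malformed lines
        if float(row[10]) <= max_evalue:
            ids.add(row[0])
    return ids


# Two made-up alignment rows: one strong hit, one weak hit.
example_hits = io.StringIO(
    "read1\tprotA\t90.0\t50\t5\t0\t1\t150\t1\t50\t1e-20\t100\n"
    "read2\tprotB\t40.0\t30\t18\t0\t1\t90\t1\t30\t0.5\t30\n"
)
print(viral_read_ids(example_hits))
print(" ".join(diamond_blastx_cmd("sample.fastq", "viral_proteins", "hits.tsv")))
print(" ".join(spades_cmd("viral_R1.fastq", "viral_R2.fastq", "assembly")))
```

In a sketch like this, only the reads whose IDs pass the filter would be written to the paired FASTQ files handed to the assembler, which keeps the assembly restricted to putatively viral sequence.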
The pipeline is implemented in Java and will include an easy-to-use web interface suited to researchers and clinicians. It can assemble viral contigs from paired-end reads generated by an Illumina MiSeq sequencer. So far, 1865 viruses can be identified at species-level resolution and 10 viruses (Chikungunya virus, Dengue virus, HBV, HCV, HHV8, HIV-1, HPV, HTLV-1, YFV, and Zika virus) at the genotype level.
A web version of the pan-viral typing tool is already available, and a web version with extended NGS functionality is currently being evaluated. Because the pipeline eliminates the need for virus-specific laboratory techniques or targeted sequence capture, a virome can be profiled in the context of its non-viral microbiome. Preliminary findings suggest our tool offers greater functionality than existing alternatives, with greater sensitivity to known viruses (including bacteriophages), automatic assembly, and good-quality phylogenetic analyses. A systematic comparison is underway.
Citation: Moosa Y, Vilsker M, Vanden Eyden E, Fonseca V, Nooij S, Deforche K, de Oliveira T. Automated profiling of the human virome from raw metagenomic data. Virus Evolution, doi: 10.1093/ve/vew036.033 (2017).