Software Packages

Department of Quantitative Health Sciences
Mayo Clinic Research
Formerly known as the Department of Health Sciences Research

Hybrid-Denovo

Hybrid-denovo is a de novo OTU-picking pipeline that integrates single- and paired-end 16S sequence tags. It takes Illumina paired-end sequencing reads as input and outputs an OTU BIOM table, together with the OTU representative sequences and a phylogenetic tree of the OTUs.

The most distinctive feature of hybrid-denovo is that it can process a mixture of paired-end and single-end reads. This is useful because Illumina paired-end reads become a mixture of paired-end and single-end reads after quality control. For more details, please read our online article.

System requirements:
Linux platform (we used CentOS 6)

Installation:

From source code:

Install Miniconda (Python 2.7).
Install hybrid-denovo (conda install hybrid-denovo -c jeffchen2000 -c conda-forge -c bioconda -c biobuilds)
Download the hybrid-denovo reference database (hybrid-denovo_database.tar.gz) and uncompress it.
Download Linux version 8 of USEARCH from http://www.drive5.com/usearch/download.html
Open config/tool.info and set the paths to USEARCH and the hybrid-denovo reference databases.
Make sure the files are installed in /your_miniconda_path/share/
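
Before running the pipeline, it can help to confirm that the two paths entered in config/tool.info actually exist. The following is a minimal sketch of such a check, not part of hybrid-denovo: the function name check_paths is hypothetical, and it takes the paths as arguments rather than parsing tool.info, whose format is not described on this page.

```shell
# Hypothetical helper (not shipped with hybrid-denovo).
# $1: path to the USEARCH binary, $2: path to the uncompressed database directory.
# Returns 0 if both look usable, 1 otherwise; problems are reported on stderr.
check_paths() {
    usearch=$1
    db_dir=$2
    status=0
    # The USEARCH download is a single binary that must be executable.
    [ -x "$usearch" ] || { echo "USEARCH binary not executable: $usearch" >&2; status=1; }
    # The reference database is expected to be an uncompressed directory.
    [ -d "$db_dir" ] || { echo "database directory not found: $db_dir" >&2; status=1; }
    return "$status"
}
```

For example, check_paths /path/to/usearch /path/to/database prints nothing and returns 0 when both paths are in place; substitute the values you entered in tool.info.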

From a VM using VirtualBox:

Alternatively, you can download our VirtualBox VM image, hybriddenovo.ova, which packages all dependencies.
Install on Windows (we installed it on Windows 7).
Install Oracle VirtualBox.
Open the OVA image you downloaded in step 1.
Ubuntu is installed in the VM, and the sudo password is ‘mayo’ (in case you want to install additional packages).

Files and directories in the package (hybrid-denovo.tar.gz):

hybrid-denovo: the main script file
config: directory that stores configuration files
1. run.info : the input parameters of the pipeline (open config/run.info for details)
2. tool.info : the paths to the external modules and packages used by the pipeline; the location of this file is set in run.info
external : external modules and packages
README : this README file
sampleV3V5: a test sample for V3V5 rDNA amplicon reads
scripts : shell script and jar files developed by us
test : our test run results

Usage:
/path/to/hybrid-denovo /path/to/run.info

Key parameters to set in run.info (open config/run.info to edit):

R1PAIRED_READ_TYPE: read type (0: single end; 1: paired end with overlap, such as V4 region amplicon; 2: paired end without overlap, such as V3-V5 region amplicon)
R1PAIRED_READ_LENGTH: input read length
R1PAIRED_INPUT_FILES: a directory containing all input fastq files (every *.fastq file in it is used as input)
R1PAIRED_WORK_DIR: your working/output directory
R1PAIRED_TOOL_INFO: absolute path to tool.info (by default, the pipeline uses /your_source_dir/config/tool.info). Remember to open tool.info and set the correct tool paths.
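
The values chosen for R1PAIRED_READ_TYPE and R1PAIRED_INPUT_FILES can be sanity-checked by hand before a run. The sketch below is illustrative only: both function names are hypothetical, and it assumes that only *.fastq files directly inside the input directory count (the page does not say whether subdirectories are searched).

```shell
# Hypothetical helpers (not part of hybrid-denovo) for checking run.info values.

# Print the documented meaning of an R1PAIRED_READ_TYPE value; fail on anything else.
describe_read_type() {
    case "$1" in
        0) echo "single end" ;;
        1) echo "paired end with overlap" ;;
        2) echo "paired end without overlap" ;;
        *) echo "invalid read type: $1" >&2; return 1 ;;
    esac
}

# Count the *.fastq files directly inside the directory given as $1
# (assumption: subdirectories are not searched).
count_fastq() {
    find "$1" -maxdepth 1 -type f -name '*.fastq' | wc -l | tr -d ' '
}
```

A count of zero from count_fastq usually means the directory path is wrong or the files are still compressed or use a different extension.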

Output Files:

mapping.txt: a mapping file that associates each sample ID with its fastq file; you can add other metadata to it for further analysis (for example, with QIIME).
workspace/imtornado/QC.log.txt: QC results showing the number of input reads and the number of reads that passed QC
workspace/imtornado/: results generated by IM-TORNADO using read 1 (R1) only
- test_R1.biom (BIOM file)
- test_R1.biom.table (converted by QIIME from BIOM file)
- test_R1.tree (a phylogenetic tree generated by FastTree)
- test_R1.otus.final.result.fasta (OTU representatives)
workspace/imtornado/: results generated by IM-TORNADO using paired-end reads
- test_paired.biom (BIOM file)
- test_paired.biom.table (converted by QIIME from BIOM file)
- test_paired.tree (a phylogenetic tree generated by FastTree)
- test_paired.otus.final.result.fasta (OTU representatives)
workspace/R1Paired/: results generated by our hybrid-denovo method
- test_PairedSingle.biom (BIOM file)
- test_PairedSingle.biom.table (converted by QIIME from BIOM file)
- test_PairedSingle.tree (a phylogenetic tree generated by FastTree)
- test_PairedSingle.otus.final.result.fasta (OTU representatives)
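
After a run, a quick way to see whether everything listed above was produced is to test for each file. The following is a minimal sketch with a hypothetical function name; it assumes the listed paths are relative to the directory passed as the argument, that mapping.txt sits at its top level, and that outputs carry the "test" prefix shown in the list.

```shell
# Hypothetical helper (not part of hybrid-denovo): print every expected output
# file that is missing under the base directory given as $1.
missing_outputs() {
    base=$1
    # Three result sets, each with the same four file types.
    for stem in imtornado/test_R1 imtornado/test_paired R1Paired/test_PairedSingle; do
        for ext in biom biom.table tree otus.final.result.fasta; do
            f="workspace/$stem.$ext"
            [ -f "$base/$f" ] || echo "$f"
        done
    done
    # The mapping file and the QC log.
    for f in mapping.txt workspace/imtornado/QC.log.txt; do
        [ -f "$base/$f" ] || echo "$f"
    done
}
```

Running missing_outputs /path/to/work_dir prints nothing when all 14 files are present.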

Test run:

Go to the unpacked directory.
mkdir mytest
cd mytest
Run the command ‘../hybrid-denovo ../config/run.info’.
Compare your results to our results (in /your_source_dir/test) to confirm that the installation is correct.
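
The comparison in the last step can be done file by file. The sketch below is one possible approach, not the authors' procedure: the function name is hypothetical, and it assumes your output directory mirrors the layout of /your_source_dir/test. A byte-for-byte difference is a prompt to inspect the file rather than proof of a bad install, since this page does not state that outputs are identical across runs.

```shell
# Hypothetical helper (not part of hybrid-denovo).
# $1: your results directory, $2: the reference results directory.
# Prints "missing: <file>" or "differs: <file>" for each reference file
# that is absent from, or not byte-identical in, your results.
compare_results() {
    mine=$1
    ref=$2
    # List reference files relative to the reference directory, in stable order.
    ( cd "$ref" && find . -type f ) | sort | while IFS= read -r f; do
        rel=${f#./}
        if [ ! -f "$mine/$rel" ]; then
            echo "missing: $rel"
        elif ! cmp -s "$mine/$rel" "$ref/$rel"; then
            echo "differs: $rel"
        fi
    done
}
```

From inside mytest, this could be called as compare_results . /your_source_dir/test; no output means every reference file has an identical counterpart.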

Notes:

For large datasets, we have a parallel-computing version. To request it, please contact us.

Prior versions:

1.0.0 – Original release

Questions:
Please contact chen.xianfeng@mayo.edu or chen.jun2@mayo.edu

Page last modified: November 17, 2023
