Bioinformatics Software Packages

The Bioinformatics Program at Mayo Clinic has created several software packages to analyze, visualize and interpret genomic data. We welcome feedback and comments.

These applications are freely available to academics. Scientific acknowledgment and authorship information can be found in the section describing each application.

BIMA

A mapping/alignment tool customized for mate-pair library next-generation sequencing.

Read more...

BioR: Rapid, Flexible System for Genomic Annotation

A toolkit and set of catalogs to retrieve genomic annotation for variants, genes, diseases, conditions, genetic tests, and drugs.

Read more...

CAP-miRSeq

miRNAs play a key role in normal physiology and in various diseases such as cancer. However, analyzing miRNA sequencing data is challenging because it requires significant computational resources and bioinformatics expertise. To address this, we present a comprehensive analysis pipeline for deep microRNA sequencing (CAP-miRSeq) that integrates read preprocessing, alignment, mature/precursor/novel miRNA quantification, variant detection in miRNA coding regions, and flexible differential expression analysis between experimental conditions. Using well-characterized data, we demonstrated the pipeline’s superior performance, flexibility, and practical use in research and biomarker discovery.

Read more...

ChIP-RNA-seqPRO

ChIP-RNA-seqPRO is a resource that provides a strategy for profiling regulatory associations between epigenomic modifications and co/post-transcriptional processes.

Read more...

CNVnator

Structural variations (SVs) and copy number variations (CNVs) are major sources of genomic variation. CNVnator is a tool for CNV discovery and genotyping from the depth of coverage of mapped reads. It accepts .bam files as input and generates CNV calls in less than 10 hours of computation. The source code and extended descriptions […]

Read more...

eSNV-Detect

eSNV-Detect v1.0: Reliable Identification of Variants Using RNA-seq Data

Read more...

Ezimputer

Ezimputer is an IMPUTE2-based genotype imputation workflow that greatly simplifies the imputation process and achieves a significant speedup by using multiple CPUs on a computer cluster.

Read more...

Fusion-sense

A tool used to calculate the estimated sensitivity of fusion finding for an RNA-seq experiment. It plots the estimated sensitivity as a function of the distance to the 3’ end and also calculates the decay rate for the sample.

Read more...

GeneSetScan

GeneSetScan is a pre-compiled binary for 64-bit Linux systems. It offers a general approach to scanning genome-wide SNP data for gene-set association analyses.

Read more...

GenomeSmasher

GenomeSmasher is a set of tools used to create diploid FASTA files containing SNPs, indels, duplications, deletions, and translocations.

Read more...

HiChIP Pipeline

HiChIP: A high-throughput pipeline for integrative analysis of ChIP-Seq data

The HiChIP pipeline is designed for comprehensive analysis of chromatin immunoprecipitation and sequencing (ChIP-Seq) data. It can be used to analyze profiles from transcription factor binding, histone modifications, histone variants, and chromatin regulators. Paired-end and single-end next-generation sequencing data from ChIP experiments with different antibodies, […]

Read more...

Hybrid-Denovo

Microbiota pipeline that utilizes and integrates information from a mix of both paired-end and single-end reads.

Read more...

ICQ-lincRNA

ICQ-lincRNA (Identification, Characterization, and Quantification of Long Intergenic Non-Coding RNAs) offers an end-to-end solution to identify and annotate expressed lincRNAs in next-generation RNA sequencing data. Specifically, ICQ-lincRNA conducts ab initio genome-wide transcript assembly by both Cufflinks and Scripture using Binary Alignment/Map (BAM) files, and conducts downstream quantitative analyses including gene count, exon count, overlap with known […]

Read more...

MAP-RSeq

The MAP-RSeq workflow integrates a suite of open source bioinformatics tools along with in-house developed methods to analyze paired-end RNA-Seq data.

Read more...

PANDA

The PANDA (Pathway AND Annotation) Explorer is a data visualization tool capable of annotating genes with any data type and graphically displaying the result within the context of pathways.

Read more...

Panoply

Precision Cancer Genomic Report: Single Sample Inventory. Given clinical and genomic data on a cancer patient and a set of matched cancer treatment responders with the same data sources, determine drug targets in driver genes that are connected by driver DNA events to outlying RNA expression events. In development. Authors: Krishna R. Kalari, Jason P. […]

Read more...

PatternCNV

A versatile tool for detecting copy number changes from exome sequencing data.

Read more...

RVboost

RVboost v0.1: RNA-seq variant prioritization approach for Illumina next-generation sequencing data.

Read more...

SAAP-RRBS

The Streamlined Analysis and Annotation Pipeline for RRBS data (SAAP-RRBS) integrates read quality assessment/clean-up, alignment, methylation data extraction, annotation, reporting, and visualization. With this package, bioinformaticians or investigators can submit sequencing reads and quickly receive a fully annotated CpG methylation report.

Read more...

SnowShoes-FTD

SnowShoes-FTD is a bioinformatics tool to identify fusion transcripts from paired-end transcriptome sequencing data.

Read more...

SNPPicker

A post-processor to optimize the selection of tag SNPs from common bin-tagging programs. SNPPicker uses a multi-step search strategy in combination with a statistical model to produce optimal genotyping panels.

Read more...

SoftSearch

SoftSearch is a sensitive structural variant (SV) detection tool for Illumina paired-end next-generation sequencing data.

Read more...

Trace-RRBS

trace-rrbs v0.1: Targeted Alignment and Artificial Cytosine Elimination for RRBS for Illumina next-generation sequencing data.

Read more...

TREAT

TREAT is a Targeted RE-sequencing Annotation Tool that offers a comprehensive, open framework, end-to-end solution for analyzing and interpreting targeted re-sequencing data.

Read more...

VCF-Miner

The Variant Call Format (VCF) is the de facto standard for storing variant information from next-generation DNA sequencing experiments.

Read more...

WANDY

Wandy: A program for CNV/aneuploidy detection from WGS data

Wandy is designed for copy number variation (CNV) and aneuploidy detection in large genomes such as human. It takes a sorted BAM file as input, reports predicted chromosome regions that have amplifications or deletions using the log2 ratio, and generates graphical reports. There are two download […]

Read more...