A toolkit and set of catalogs to retrieve genomic annotation for variants, genes, diseases, conditions, genetic tests, and drugs.
miRNAs play a key role in normal physiology and in various diseases such as cancer. However, analyzing miRNA sequencing data is challenging because it requires significant computational resources and bioinformatics expertise. To address this, we present a comprehensive analysis pipeline for deep microRNA sequencing (CAP-miRSeq) that integrates read preprocessing, alignment, mature/precursor/novel miRNA quantification, variant detection in miRNA coding regions, and flexible differential expression analysis between experimental conditions. Using well-characterized data, we demonstrated the pipeline’s superior performance, flexibility, and practical use in research and biomarker discovery.
ChIP-RNA-seqPRO is a resource that provides a strategy for profiling regulatory associations between epigenomic modifications and co-/post-transcriptional processes.
Structural Variations (SVs) and Copy Number Variations (CNVs) are major sources of genomic variation. CNVnator is a tool for CNV discovery and genotyping from the depth of coverage of mapped reads. It accepts .bam files as input and generates CNV calls in less than 10 hours of computation. The source code and extended descriptions […]
Ezimputer is an impute2-based genotype imputation workflow that greatly simplifies the process of imputation and achieves a significant speedup of imputation using multiple CPUs on a computer cluster.
A tool used to calculate the estimated sensitivity of fusion finding for an RNA-seq experiment. It plots the estimated sensitivity as a function of the distance to the 3’ end and also calculates the decay rate for the sample.
GeneSetScan is a pre-compiled binary for 64-bit Linux systems. It offers a general approach to scan genome-wide SNP data for gene-set association analyses.
GenomeSmasher is a set of tools used to create diploid FASTA files containing SNPs, indels, duplications, deletions, and translocations.
HGT-ID v1.0: An efficient and sensitive program for detecting viral insertion sequences in the genome of human cancers
HiChIP: A high-throughput pipeline for integrative analysis of ChIP-Seq data. The HiChIP pipeline is designed for performing comprehensive analysis of chromatin immunoprecipitation and sequencing (ChIP-Seq) data. It can be used to analyze profiles from transcription factor binding, histone modifications, histone variants, and chromatin regulators. Paired-end and single-end NexGen sequencing data from ChIP experiment with different antibodies, […]
Microbiota pipeline that utilizes and integrates information from a mix of both paired-end and single-end reads.
ICQ-lincRNA (Identification, Characterization, and Quantification of Long Intergenic Non-Coding RNAs) offers an end-to-end solution to identify and annotate expressed lincRNAs in next-generation RNA sequencing data. Specifically, ICQ-lincRNA conducts ab initio genome-wide transcript assembly by both Cufflinks and Scripture using Binary Alignment/Map (BAM) files; conducts downstream quantitative analyses including gene count, exon count, overlap with known […]
Produces a gplot of a continuous variable (y-axis) vs. a group variable (x-axis) in such a way that no points are hidden.
Univariate logistic regression model summaries with multiple dependent variables and predictors.
The MAP-RSeq workflow integrates a suite of open source bioinformatics tools along with in-house developed methods to analyze paired-end RNA-Seq data.
Computes Lin’s concordance correlation coefficient (CCC) for any number of raters.
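For readers unfamiliar with the statistic, the following is a minimal Python sketch of Lin's CCC for the simplest case of two raters. It is not the macro's code and does not cover the multi-rater extension; the function name `lins_ccc` is hypothetical, and the sketch assumes the 1/n (population) moments of Lin's original estimator.

```python
from statistics import fmean


def lins_ccc(x, y):
    """Lin's concordance correlation coefficient for two raters.

    Illustrative only: uses 1/n (population) variances and covariance.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x, mean_y = fmean(x), fmean(y)
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # Penalizes both lack of correlation and systematic shift between raters.
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)
```

Two identical sets of ratings give a CCC of 1, while a constant offset between raters lowers the value even though the Pearson correlation stays at 1.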
Conducts likelihood ratio tests for nested logistic and Cox proportional hazards models.
Uses Graph Template Language to create a highly customizable Kaplan-Meier curve.
This macro creates a macro variable containing the number of observations in a SAS dataset.
The PANDA (Pathway AND Annotation) Explorer is a data visualization tool capable of annotating genes with any data type and graphically displaying the result within the context of pathways.
PANOPLY is a novel computational approach that integrates both germline and somatic data obtained from multi-omics platforms for an individual of interest and analyzes those data in the context of matched-control samples.
A versatile tool for detecting copy number changes from exome sequencing data.
Create a scatterplot matrix graphically displaying the bivariate relationships between a number of variables.
Estimates the Integrated Discrimination Index (IDI) and Net Reclassification Improvement (NRI) for comparison of a new risk model to an old model.
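As a rough illustration of what these two measures capture, the sketch below computes the IDI and a category-free NRI from predicted risks under an old and a new model. It is written in Python rather than the tool's own language, the function name `idi_and_nri` is hypothetical, and the category-free form of the NRI is an assumption; the tool itself may use risk categories.

```python
from statistics import fmean


def idi_and_nri(old_risk, new_risk, event):
    """Return (IDI, category-free NRI) for two sets of predicted risks.

    old_risk, new_risk: predicted probabilities from the old and new model.
    event: truthy for subjects who had the event, falsy otherwise.
    """
    diffs_event = [n - o for o, n, e in zip(old_risk, new_risk, event) if e]
    diffs_nonevent = [n - o for o, n, e in zip(old_risk, new_risk, event) if not e]
    if not diffs_event or not diffs_nonevent:
        raise ValueError("need at least one event and one non-event")

    # IDI: mean risk increase among events minus mean increase among non-events.
    idi = fmean(diffs_event) - fmean(diffs_nonevent)

    def net_up(diffs):
        # Proportion moving up minus proportion moving down.
        up = sum(d > 0 for d in diffs)
        down = sum(d < 0 for d in diffs)
        return (up - down) / len(diffs)

    # NRI: net upward movement in events plus net downward movement in non-events.
    nri = net_up(diffs_event) - net_up(diffs_nonevent)
    return idi, nri
```

Positive values of either measure indicate that the new model tends to assign higher risk to subjects with events and lower risk to those without.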
RVboost v0.1: RNA-seq variant prioritization approach for Illumina next-generation sequencing data.
The Streamlined Analysis and Annotation Pipeline for RRBS data (SAAP-RRBS) integrates read quality assessment/clean-up, alignment, methylation data extraction, annotation, reporting, and visualization. With this package, bioinformaticians or investigators can submit sequencing reads and quickly receive a fully annotated CpG methylation report.
SnowShoes-FTD is a bioinformatics tool to identify fusion transcripts from paired-end transcriptome sequencing data.
A post-processor to optimize the selection of tag SNPs from common bin-tagging programs. SNPPicker uses a multi-step search strategy in combination with a statistical model to produce optimal genotyping panels.
SoftSearch is a sensitive structural variant (SV) detection tool for Illumina paired-end next-generation sequencing data.
Creates a table of variable summaries plus test statistics for the difference between two or more independent samples.
Complete Kaplan-Meier survival analysis with printing options and logrank statistic.
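The product-limit estimate behind a Kaplan-Meier analysis can be sketched in a few lines of Python. This is an illustration of the estimator only, not the tool's implementation; the function name `kaplan_meier` is hypothetical, and the logrank statistic and printing options are not covered.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times: follow-up time for each subject.
    events: truthy if the event was observed, falsy if censored.
    """
    data = sorted(zip(times, events), key=lambda pair: pair[0])
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            removed += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the conditional survival at t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Events and censored subjects both leave the risk set after t.
        at_risk -= removed
    return curve
```

Censored subjects contribute to the risk set up to their censoring time but never trigger a drop in the curve.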
Calculates the c-statistic (concordance, discrimination index) for survival data with time-dependent covariates.
General survival statistics, including p(t), standard error, confidence limits, and median survival time, for left-truncated survival analyses.
Checks for symmetry and suggests the best power transformation, if one exists, to make an asymmetric distribution symmetric
trace-rrbs v0.1: Targeted Alignment and Artificial Cytosine Elimination for RRBS for Illumina next-generation sequencing data.
TREAT is a Targeted RE-sequencing Annotation Tool that offers a comprehensive, open framework, end-to-end solution for analyzing and interpreting targeted re-sequencing data.
Measures agreement, precision, accuracy, total deviation index and coverage probability.
The Ultrafast and Comprehensive lncRNA detection (UClncRNA) pipeline leverages fast transcript assembly, parallel computing tools, and multi-step filters for increased specificity to provide comprehensive lncRNA characterization.
The Variant Call Format (VCF) is the de facto standard for storing variant information from next-generation DNA sequencing experiments.
Wandy: A program for CNV/aneuploidy detection from WGS data. Wandy is designed for Copy Number Variation (CNV) and aneuploidy detection in large genomes such as human. It takes a sorted BAM file as input, reports predicted chromosome regions that have amplifications or deletions using the log2 ratio, and generates graphic reports. There are […]