Biomedical Statistics and Informatics Software Packages



armitage

Armitage trend test for trait and SNP dosage
Authors: Jason Sinnwell (primary contact), Dan Schaid
Link: armitage_0.2.1.tar.gz
Language/Platform: R
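For readers unfamiliar with the test, the following is a minimal Python sketch of a Cochran-Armitage-style trend statistic. It is not taken from the armitage package. It assumes a binary trait coded 0/1, one dosage value per subject, and the identity that the trend chi-square equals N times the squared Pearson correlation between dosage and trait. The function name `armitage_trend` is hypothetical.

```python
import math


def armitage_trend(dosage, trait):
    """Trend chi-square (1 df) between per-subject dosage and a 0/1 trait.

    Illustrative only: uses chi2 = N * r^2, where r is the Pearson
    correlation between dosage and trait.
    """
    n = len(dosage)
    if n != len(trait) or n == 0:
        raise ValueError("dosage and trait must be non-empty and equal length")
    mean_x = sum(dosage) / n
    mean_y = sum(trait) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosage, trait))
    sxx = sum((x - mean_x) ** 2 for x in dosage)
    syy = sum((y - mean_y) ** 2 for y in trait)
    if sxx == 0 or syy == 0:
        raise ValueError("dosage and trait must both vary")
    chi2 = n * sxy * sxy / (sxx * syy)
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

Because the statistic is computed from per-subject values, the same sketch accepts fractional (imputed) dosages as well as 0/1/2 genotype counts.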

Read more...

arp.gee

Generalized Estimating Equations for Affected Relative Pairs
Authors: Dan Schaid, Jason Sinnwell
Link: arp.gee_0.1.1.tar.gz
Language/Platform: R

Read more...

Arsenal

An Arsenal of ‘R’ Functions for Large-Scale Statistical Summaries
An arsenal of ‘R’ functions for large-scale statistical summaries, which are streamlined to work within the latest reporting tools in ‘R’ and ‘RStudio’ and which use formulas and versatile summary statistics for summary tables and models. The primary functions include tableby(), a Table-1-like summary of multiple […]

Read more...

Attribrisk

Population Attributable Risk
Estimates population (etiological) attributable risk for unmatched, pair-matched or set-matched case-control designs and returns a list containing the estimated attributable risk, estimates of coefficients, and their standard errors, from the (conditional, if necessary) logistic regression used for estimating the relative risk.
Authors: Beth Atkinson (primary contact), Louis Schenck, Cindy Crowson, Terry Therneau […]

Read more...

bdsmatrix

Routines for Block Diagonal Symmetric Matrices
This is a special case of sparse matrices, used by coxme.
Authors: Terry Therneau
Available at: https://cran.r-project.org/web/packages/bdsmatrix/index.html
Language/Platform: R

Read more...

bilinear.fit

Bilinear regression for 16O/18O isotope label experiments
Authors: Doug Mahoney (primary contact), Jeanette Eckel-Passow
Available at: bilinear.fit.tar.gz
Language/Platform: R

Read more...

BIMA

A mapping/alignment tool customized for mate-pair library next-generation sequencing.

Read more...

BioR Toolkit – Old Versions

Warning! These versions contain a critical tabix-related bug that causes a small percentage of regions to “miss” when using bior_overlap and bior_same_variant against some catalogs. Please use one of the fixed versions HERE. These old versions are maintained here for archive and re-creation purposes only, and should NOT be used […]

Read more...

BioR: Rapid, Flexible System for Genomic Annotation

A toolkit and set of catalogs to retrieve genomic annotation for variants, genes, diseases, conditions, genetic tests, and drugs.

Read more...

bnmlci

Exact confidence intervals for a proportion.

Read more...

boot

Select bootstrap samples.

Read more...

CAP-miRSEQ

miRNAs play a key role in normal physiology and in various diseases such as cancer. However, analyzing miRNA sequencing data is challenging because it requires significant computational resources and bioinformatics expertise. To address this, we present a comprehensive analysis pipeline for deep microRNA sequencing (CAP-miRSeq) that integrates read preprocessing, alignment, mature/precursor/novel miRNA qualification, variant detection in miRNA coding regions, and flexible differential expression between experimental conditions. Using well-characterized data, we demonstrated the pipeline’s superior performance, flexibility, and practical use in research and biomarker discovery.

Read more...

ChIP-RNA-seqPRO

ChIP-RNA-seqPRO is a resource that provides a strategy for profiling regulatory associations between epigenomic modifications and co/post-transcriptional processes.

Read more...

Circ-Seq

Circ-Seq: A comprehensive bioinformatics workflow for detecting circular RNAs
Circular RNAs (circRNAs) are recently discovered members of the noncoding RNA family that range in length from a few hundred to thousands of nucleotides. In contrast to linear RNA transcripts, which are normally spliced tail-to-head, circRNAs are formed by the covalent bonding of their 3´ and […]

Read more...

CNVNator

Structural Variations (SVs) and Copy Number Variations (CNVs) are the major source of genomic variation. CNVnator is a tool for Copy Number Variation (CNV) discovery and genotyping from depth-of-coverage by mapped reads. It accepts .bam files as input and generates CNV calls in less than 10 hours of calculations. The source code and extended descriptions […]

Read more...

comprisk

Cumulative incidence in the presence of competing risks.

Read more...

coxme

Mixed Effects Cox Models
Cox proportional hazards models containing Gaussian random effects, also known as frailty models.
Authors: Terry Therneau
Available at: https://cran.r-project.org/web/packages/coxme/index.html
Language/Platform: R

Read more...

criskcox

Competing risk survival analysis with covariates.

Read more...

deming

Deming, Theil-Sen and Passing-Bablock Regression
Generalized Deming regression, Theil-Sen regression and Passing-Bablock regression functions.
Authors: Terry Therneau
Available at: https://cran.r-project.org/web/packages/deming/index.html
Language/Platform: R

Read more...

dist

Estimates the distance matrix between two groups (e.g. cases and potential controls) on the basis of a set of X’s.

Read more...

eSNV-Detect

eSNV-Detect v1.0: Reliable Identification of Variants Using RNA-seq Data

Read more...

Ezimputer

Ezimputer is an impute2-based genotype imputation workflow that greatly simplifies the process of imputation and achieves a significant speedup of imputation using multiple CPUs on a computer cluster.

Read more...

fastlo

Fast Loess
Authors: Doug Mahoney, Jeanette Eckel-Passow, Ann Oberb
Link: fastlo_1.3.tar.gz
Language/Platform: R

Read more...

findcut

Uses the method of Contal and O’Quigley (1999) to find the best cutpoint in a continuous variable with regard to a survival outcome.

Read more...

Fusion-sense

A tool used to calculate the estimated sensitivity of fusion finding for an RNA-seq experiment. It plots the estimated sensitivity as a function of the distance to the 3’ end and also calculates the decay rate for the sample.

Read more...

GeneSetScan

GeneSetScan is a pre-compiled binary for 64-bit Linux systems. It offers a general approach to scanning genome-wide SNP data for gene-set association analyses.

Read more...

GenomeSmasher

GenomeSmasher is a set of tools used to create diploid FASTA files containing SNPs, indels, duplications, deletions and translocations.

Read more...

gmatch

Computerized matching of cases to controls using the greedy matching algorithm
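As a rough illustration of greedy matching in general (not the gmatch macro itself), the Python sketch below pairs each case, in input order, with the closest still-unused control on a single numeric variable. The function name `greedy_match`, the single matching variable, and the optional `caliper` argument are assumptions made for this example.

```python
def greedy_match(case_values, control_values, caliper=None):
    """Greedy 1:1 nearest-neighbor matching on one numeric variable.

    Returns (case_index, control_index) pairs. Each control is used at
    most once; a case stays unmatched if its nearest available control
    is farther away than `caliper` (when a caliper is given).
    """
    available = set(range(len(control_values)))
    pairs = []
    for case_idx, case_val in enumerate(case_values):
        if not available:
            break
        # Closest remaining control; ties go to the lowest index.
        best = min(available, key=lambda j: (abs(control_values[j] - case_val), j))
        if caliper is None or abs(control_values[best] - case_val) <= caliper:
            pairs.append((case_idx, best))
            available.remove(best)
    return pairs
```

A greedy pass never revisits an earlier pairing, so the result can depend on the order of the cases and is not guaranteed to minimize the total distance.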

Read more...

Guide

Required elements: name of project/tool; short description of project/tool (1-3 sentences); authors, primary contact; suggested tags; link to source code & data (tarred/gzipped if large).
Other elements you may want to include: date last updated (?); longer description; user manual; links to publications for the software; system requirements; licensing information.
If you want to deploy an […]

Read more...

haplo.stats

Statistical Analysis of Haplotypes with Traits and Covariates when Linkage Phase is Ambiguous
This software offers a suite of R routines for the analysis of indirectly measured haplotypes. The statistical methods assume that all subjects are unrelated and that haplotypes are ambiguous (because of unknown linkage phase of the genetic markers). The genetic markers are […]

Read more...

HGT-ID

HGT-ID v1.0: An efficient and sensitive program for detecting viral insertion sequences in the genome of human cancers

Read more...

HiChIP Pipeline

HiChIP: A high-throughput pipeline for integrative analysis of ChIP-Seq data
The HiChIP pipeline is designed for performing comprehensive analysis of chromatin immunoprecipitation and sequencing (ChIP-Seq) data. It can be used to analyze profiles from transcription factor binding, histone modifications, histone variants, and chromatin regulators. Paired-end and single-end NexGen sequencing data from ChIP experiment with different antibodies, […]

Read more...

hwe

Hardy-Weinberg Equilibrium Tests
Test the fit of genotype frequencies to Hardy-Weinberg Equilibrium proportions for autosomes and the X chromosome. Different statistical tests are provided, as well as an option to evaluate statistical significance by either exact methods or simulations.
Authors: Jason Sinnwell (primary contact), Dan Schaid, Dan Folie
Link: hwe_0.3.1.tar.gz
Language/Platform: R
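One common way to test Hardy-Weinberg proportions is a one-degree-of-freedom chi-square goodness-of-fit test, sketched below in Python. The sketch assumes an autosomal, biallelic marker. It is a generic textbook calculation, not the hwe package's code, and it does not cover the X chromosome, exact, or simulation-based options mentioned above. The function name `hwe_chisq` is hypothetical.

```python
import math


def hwe_chisq(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg proportions from genotype counts.

    n_aa, n_ab, n_bb are the observed counts of the two homozygotes and
    the heterozygote (n_ab) at an autosomal biallelic marker.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("marker is monomorphic; the test is undefined")
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

With small counts or rare alleles the chi-square approximation is poor, which is the usual motivation for exact or simulation-based alternatives.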

Read more...

Hybrid-Denovo

Microbiota pipeline that utilizes and integrates information from a mix of both paired-end and single-end reads.

Read more...

ibdreg

Regression Methods for IBD Linkage With Covariates
A method to test genetic linkage with covariates by regression methods with response IBD sharing for relative pairs. Accounts for correlations of IBD statistics and covariates for relative pairs within the same pedigree.
Authors: Jason Sinnwell (primary contact), Dan Schaid
Available at: https://cran.r-project.org/web/packages/ibdreg/index.html
Language/Platform: R

Read more...

ICQ-lincRNA

ICQ-lincRNA (Identification, Characterization, and Quantification of Long Intergenic Non-Coding RNAs) offers an end-to-end solution to identify and annotate expressed lincRNAs in next-generation RNA sequencing data. Specifically, ICQ-lincRNA:
Conducts ab-initio genome-wide transcript assembly by both Cufflinks and Scripture using Binary Alignment/Map (BAM) files
Conducts downstream quantitative analyses including gene count, exon count, overlap with known […]

Read more...

jitplot

Produces a gplot of a continuous variable (y-axis) vs. a group variable (x-axis) in such a way that no points are hidden.

Read more...

kinship2

Pedigree Functions
Authors: Jason Sinnwell (primary contact), Terry Therneau, Beth Atkinson, Dan Schaid
Available at: https://cran.r-project.org/web/packages/kinship2/index.html
Language/Platform: R

Read more...

ld.pairs

LD Measure on Multi-Allele Variants
LD calculations on multi-allele and SNP variants, including the composite-LD measure.
Authors: Jason Sinnwell (primary contact), Dan Schaid
Available at: /data5/bsi/adhoc/s200555.R-infrastructure/localrepo/R-3.3.1/
Language/Platform:

Read more...

logisuni

Univariate logistic regression model summaries with multiple dependent variables and predictors.

Read more...

MAP-RSeq

The MAP-RSeq workflow integrates a suite of open source bioinformatics tools along with in-house developed methods to analyze paired-end RNA-Seq data.

Read more...

mccc

Computes Lin’s concordance correlation coefficient (CCC) for any number of raters.
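For orientation, here is a minimal Python sketch of Lin's concordance correlation coefficient in its basic two-rater form, using population (1/n) moments. The mccc macro handles any number of raters; that generalization is not attempted here. The function name `lin_ccc` is hypothetical.

```python
def lin_ccc(x, y):
    """Lin's concordance correlation coefficient for two raters.

    CCC = 2 * cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))^2),
    with variances and covariance computed using a 1/n divisor.
    """
    n = len(x)
    if n != len(y) or n == 0:
        raise ValueError("x and y must be non-empty and equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    denom = var_x + var_y + (mean_x - mean_y) ** 2
    if denom == 0:
        raise ValueError("both raters are constant and identical; CCC is undefined")
    return 2 * cov_xy / denom
```

Unlike the Pearson correlation, the coefficient is penalized by a constant shift between raters: ratings of 1, 2, 3 against 2, 3, 4 are perfectly correlated, yet the mean difference term in the denominator pulls the CCC below 1.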

Read more...

mend.err

Check Pedigrees for Mendelian Errors
Checks pedigrees for Mendelian errors and, when errors are found, systematically jackknifes every typed pedigree member to determine whether eliminating that member will remove all Mendelian errors from the pedigree.
Authors: Jason Sinnwell (primary contact), Dan Schaid, Dan Folie
Link: mend.err_1.3.tar.gz
Language/Platform: R

Read more...

multic

Quantitative Linkage Analysis Tools using the Variance Components Approach
Calculate the polygenic and major gene models for quantitative trait linkage analysis using the variance components approach.
Authors: Pat Votruba (primary contact), Beth Atkinson, Mariza de Andrade
Available at: https://cran.r-project.org/web/packages/multic/index.html
Language/Platform: R

Read more...

nesttest

Conducts likelihood ratio tests for nested logistic and Cox proportional hazards models.

Read more...

newsurv

Uses Graph Template Language to create a highly customizable Kaplan-Meier curve.

Read more...

nobs

This macro creates a macro variable containing the number of observations in a SAS dataset.

Read more...

outsumm

Creates a single RTF file containing multiple tables created by %SUMMARY.

Read more...

PANDA

The PANDA (Pathway AND Annotation) Explorer is a data visualization tool capable of annotating genes with any data type and graphically displaying the result within the context of pathways.

Read more...

Panoply

PANOPLY is a novel computational approach that integrates both germline and somatic data obtained from multi-omics platforms for an individual of interest and analyzes that data in the context of matched-control samples.

Read more...

PatternCNV

A versatile tool for detecting copy number changes from exome sequencing data.

Read more...

pedgene

Gene-Level Statistics for Pedigree Data
Gene-level association tests with disease status for pedigree data: kernel and burden association statistics.
Authors: Jason Sinnwell (primary contact), Dan Schaid
Available at: https://cran.r-project.org/web/packages/pedgene/index.html
Language/Platform: R

Read more...

pleio

Pleiotropy Test for Multiple Traits on a Genetic Marker
Perform tests for pleiotropy of multiple traits of various variable types on genotypes for a genetic marker.
Authors: Jason Sinnwell (primary contact), Dan Schaid
Available at: https://cran.r-project.org/web/packages/pleio/index.html
Language/Platform: R

Read more...

plotcorr

Proc Plot with correlation/regression statistics appended.

Read more...

plotmat

Create a scatterplot matrix graphically displaying the bivariate relationships between a number of variables.

Read more...

Protein Panoramic annoTation Tool (P2T2)

P2T2 is a web-based platform for the annotation of proteins using population variants; experimentally determined functional and phenotype-associated variants; literature-mined variant-phenotype relationships; and structural bioinformatics features such as linear motifs, domains and experimental structures.

Read more...

RNASeqPower

Sample size for RNAseq studies
Authors: Terry Therneau (primary contact), Steven Hart, JP Kocher
Available at: https://bioconductor.org/packages/release/bioc/html/RNASeqPower.html
Publication: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3842884/
Language/Platform: R

Read more...

rocplus

Estimates the Integrated Discrimination Index (IDI) and Net Reclassification Improvement (NRI) for comparison of a new risk model to an old model.

Read more...

rpart

Recursive Partitioning and Regression Trees
Recursive partitioning for classification, regression and survival trees. An implementation of most of the functionality of the 1984 book by Breiman, Friedman, Olshen and Stone.
Authors: Beth Atkinson, Terry Therneau, Brian Ripley (primary contact)
Available at: https://cran.r-project.org/web/packages/rpart/index.html
Language/Platform: R

Read more...

RVboost

RVboost v0.1: RNA-seq variant prioritization approach for Illumina next-generation sequencing data.

Read more...

SAAP-RRBS

The Streamlined Analysis and Annotation Pipeline for RRBS data (SAAP-RRBS) integrates read quality assessment/clean-up, alignment, methylation data extraction, annotation, reporting, and visualization. With this package, bioinformaticians or investigators can submit sequencing reads and quickly receive a fully annotated CpG methylation report.

Read more...

schoen

Schoenfeld residuals for proportional hazards model.

Read more...

SnowShoes-FTD

SnowShoes-FTD is a bioinformatics tool to identify fusion transcripts from paired-end transcriptome sequencing data.

Read more...

SNPPicker

A post-processor to optimize the selection of tag SNPs from common bin-tagging programs. SNPPicker uses a multi-step search strategy in combination with a statistical model to produce optimal genotyping panels.

Read more...

SoftSearch

SoftSearch is a sensitive structural variant (SV) detection tool for Illumina paired-end next-generation sequencing data.

Read more...

SpatialNorm6

Spatial normalization of Affymetrix SNP 6.0 CEL files, which adjusts for spatial bias based on wavelet decomposition.
Authors: Chai High Seng (primary contact)
Link: SpatialNorm6_1.1.tar.gz

Read more...

Stress.dfArray

Calculates normalization Stress and dfArray quality for a set of arrays.
Authors: Doug Mahoney (primary contact), Jeanette Eckel-Passow
Available at: Stress.dfArray_1.1.tar.gz
Language/Platform: R

Read more...

summary

Creates a table of variable summaries plus test statistics for the difference between two or more independent samples.

Read more...

SuperLearner

Super Learner Prediction
Implements the super learner prediction method and contains a library of prediction algorithms to be used in the super learner.
Authors: Eric Polley (primary contact), LeDell, van der Laan, Kennedy, Lendle
Available at: https://cran.r-project.org/web/packages/SuperLearner/index.html https://github.com/ecpolley/SuperLearner
Language/Platform: R

Read more...

surv

Complete Kaplan-Meier survival analysis with printing options and logrank statistic.
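The Kaplan-Meier (product-limit) estimate at the heart of such an analysis can be sketched in a few lines of Python. This is a generic illustration, not the surv macro's code. It omits confidence limits and the logrank statistic, and the function name `kaplan_meier` and its input format are assumptions.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up time for each subject
    events : 1 if the event was observed at that time, 0 if censored
    Returns a list of (event_time, survival_probability) pairs, one per
    distinct time at which at least one event occurred.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather all subjects sharing this time (events and censorings).
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve
```

Subjects censored at a given time are counted as still at risk for events at that same time, which is the usual convention.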

Read more...

survcstd

Calculates the c-statistic (concordance, discrimination index) for survival data with time-dependent covariates.

Read more...

survival

Survival Analysis
Contains the core survival analysis routines, including definition of Surv objects, Kaplan-Meier and Aalen-Johansen (multi-state) curves, Cox models, and parametric accelerated failure time models.
Authors: Terry Therneau (primary contact)
Available at: https://cran.r-project.org/web/packages/survival/index.html https://github.com/therneau/survival
Language/Platform: R

Read more...

survlrk

Calculates logrank statistics for the surv macro.

Read more...

survlt

General survival statistics (p(t), standard error, confidence limits, and median survival time) for left-truncated survival analyses.

Read more...

survplot

Creates high-quality and easily customized Kaplan-Meier plots.

Read more...

symmchk

Checks for symmetry and suggests the best power transformation, if one exists, to make an asymmetric distribution symmetric

Read more...

Trace-RRBS

trace-rrbs v0.1: Targeted Alignment and Artificial Cytosine Elimination for RRBS for Illumina next-generation sequencing data.

Read more...

TREAT

TREAT is a Targeted RE-sequencing Annotation Tool that offers a comprehensive, open framework, end-to-end solution for analyzing and interpreting targeted re-sequencing data.

Read more...

trex

A package that calculates a truncated exact test for two-stage case-control studies for rare genetic variants. The first stage is for screening rare variants in only cases. If the number of case-carriers of any rare variants exceeds a user-specified threshold, then additional cases and controls are genotyped for the detected variants and carrier status of […]

Read more...

uagreemt

Measures agreement, precision, accuracy, total deviation index and coverage probability.

Read more...

UCLncR Pipeline

The Ultrafast and Comprehensive lncRNA detection (UClncRNA) pipeline leverages fast transcript assembly, parallel computing tools, and multi-step filters for increased specificity to provide comprehensive lncRNA characterization.

Read more...

VCF-Miner

The Variant Call Format (VCF) is the de facto standard for storing variant information from next-generation DNA sequencing experiments.

Read more...

vmatch

Computerized matching of cases to controls using variable optimal matching.

Read more...

WANDY

Wandy: A program for CNV/Aneuploidy detection from WGS sequencing data
Introduction
Wandy is designed for Copy Number Variation (CNV) and aneuploidy detection in large genomes such as human. It takes a sorted BAM file as input, reports predicted chromosome regions that have amplifications or deletions using the log2 ratio, and generates graphic reports. There are […]

Read more...