Software Packages

Department of Quantitative Health Sciences
Mayo Clinic Research
Formerly known as the Department of Health Sciences Research

BIMA

BIMA is a Next Generation Sequencing (NGS) mapping and alignment algorithm customized to process mate-pair library sequencing. Mate-pair sequencing is a comprehensive and cost-effective method for detecting structural variants throughout the entire genome; see http://www.illumina.com/technology/mate_pair_sequencing_assay.ilmn for details. BIMA was developed to handle the sequencing artifacts inherent in mate-pair library preparation (biotin junction reads, paired-end read contamination, chimeras, etc.); see http://supportres.illumina.com/documents/myillumina/0a36163e-5fc0-4ae0-a944-a0ee51aa0eb2/matepair_v2_2-5kb_sampleprep_guide_15008135_a.pdf for details.

BIMA currently supports Linux operating systems, with CentOS being the primary distribution used for development and testing. BIMA requires a minimum of 128GB of available RAM. We strongly suggest that you do not rely on swap space for BIMA execution; if swap space is used, expect at least a 100-fold performance degradation. BIMA is multithreaded and scales linearly up to 16 cores.
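
Because the memory requirement is strict, it can help to check a host before starting a run. The following is a minimal sketch, not part of BIMA: the function name has_enough_ram is hypothetical, "128GB" is read as 128 GiB, and the kernel is assumed to expose a MemAvailable field in /proc/meminfo.

```shell
# Pre-flight check (illustrative, not shipped with BIMA).
# Assumptions: "128GB" is interpreted as 128 GiB, and /proc/meminfo
# contains a "MemAvailable:" line reported in kB.

# has_enough_ram REQUIRED_GIB [MEMINFO_FILE]
# Returns 0 when MemAvailable >= REQUIRED_GIB, 1 when it is lower
# or when the MemAvailable field is missing.
has_enough_ram() {
    required_gib=$1
    meminfo=${2:-/proc/meminfo}
    awk -v need="$required_gib" '
        # 1 GiB = 1048576 kB
        $1 == "MemAvailable:" { found = 1; ok = ($2 >= need * 1048576) }
        END { exit ((found && ok) ? 0 : 1) }
    ' "$meminfo"
}
```

A call such as `has_enough_ram 128 || echo "less than 128 GiB available" >&2` could be placed in front of the bima command in a wrapper script. The check says nothing about swap; it only reports memory the kernel considers available without swapping.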

BIMA supports individual read lengths of 64 to 512 base pairs (i.e., each read of a read pair can be 64 to 512 base pairs in length). We plan to support longer read lengths in a future version. BIMA does not currently support the mapping or alignment of single-end reads.
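
The read-length limits can be verified on a FASTQ file before alignment. This is a sketch under stated assumptions: the function name check_read_lengths is hypothetical, the file is uncompressed, and every record occupies exactly four lines (no wrapped sequence lines).

```shell
# check_read_lengths FASTQ_FILE   (illustrative helper, not part of BIMA)
# Prints the number of reads shorter than 64 bp or longer than 512 bp.
# Returns 0 only when that number is 0.
# Assumption: plain-text FASTQ with exactly 4 lines per record.
check_read_lengths() {
    awk '
        NR % 4 == 2 {                 # line 2 of each record is the sequence
            n = length($0)
            if (n < 64 || n > 512) bad++
        }
        END { print bad + 0; exit (bad > 0 ? 1 : 0) }
    ' "$1"
}
```

Running it on both files of a pair, for example `check_read_lengths ./sampleX.read1.fastq` and `check_read_lengths ./sampleX.read2.fastq`, shows how many reads fall outside the supported range. How BIMA itself treats such reads is not described here.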

Invocation Options:

BIMA V3.0.0

System Requirements: Requires at least 128GB of available RAM for execution.  We strongly recommend you do not utilize swap space for BIMA execution.

Usage:

bima options

Options:

--concordantReadPairInsertSize  maximum concordant insert size [10000]
-h, --help                      display this help text and exit
--increasedAccuracy             Evaluate additional candidate positions,
                                increases runtime by factor of 2
--isPairedEndSequencingData     Sequencing is not from Mate-pair protocol,
                                no biotin junctions should be present
                                [false]
--phredQualityOffset            ASCII encoding offset of phred score
                                (Sanger = 33, Illumina 1.0+ = 59,
                                Illumina 1.3+ = 64, Illumina 1.8+ = 33) [33]
--read1FastqFileName            FASTQ formatted file containing all 1st
                                reads of read pair
--read2FastqFileName            FASTQ formatted file containing all 2nd
                                reads of read pair
--referenceGenomeFastaFileName  FASTA formatted file containing reference
                                genome
--singleThreadAlignment         Use only 1 thread when performing alignment,
                                but multiple during index generation
--samOutputFileName             SAM formatted output file name
-T, --maximumNumberOfThreads    maximum number of threads to utilize [4]
-v, --verbose                   display execution options and progress
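
When the encoding of a FASTQ file is unknown, the value for --phredQualityOffset can be estimated from the smallest quality character in the file. The sketch below is a common heuristic, not a BIMA feature: the function name guess_phred_offset is hypothetical, only the first 10,000 records are inspected, and the thresholds simply mirror the offsets listed in the help text above. A result of 64 is only a guess, since a file with offset 33 and uniformly high quality scores looks the same.

```shell
# guess_phred_offset FASTQ_FILE   (illustrative heuristic, not part of BIMA)
# Prints 33, 59 or 64 depending on the smallest quality byte observed.
# Assumptions: uncompressed FASTQ, 4 lines per record, Unix line endings.
guess_phred_offset() {
    # Quality strings are every 4th line; od prints each byte as a decimal.
    min=$(head -n 40000 "$1" | awk 'NR % 4 == 0' | tr -d '\n' |
          od -An -v -tu1 | tr -s ' ' '\n' | grep -v '^$' |
          sort -n | head -n 1)
    if [ -z "$min" ]; then
        return 1                      # no quality data found
    elif [ "$min" -lt 59 ]; then
        echo 33                       # bytes below 59 only occur with offset 33
    elif [ "$min" -lt 64 ]; then
        echo 59                       # matches the "Illumina 1.0+" entry
    else
        echo 64                       # likely 64, but offset 33 is not excluded
    fi
}
```

The printed value could then be passed on the command line, for example `--phredQualityOffset "$(guess_phred_offset ./sampleX.read1.fastq)"`, after confirming it against what is known about the sequencing platform.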

Example Invocation:

The following example invokes bima to perform a single-threaded alignment:

bima --singleThreadAlignment true --referenceGenomeFastaFileName <reference FASTA file> --read1FastqFileName <read 1 FASTQ file> --read2FastqFileName <read 2 FASTQ file> --samOutputFileName <SAM output file>

Parameter                       Description                                                                  Value
--referenceGenomeFastaFileName  FASTA formatted reference genome, all chromosomes concatenated into 1 file  ./hg19.fa
--read1FastqFileName            FASTQ formatted sequencing file for read 1 of paired read                    ./sampleX.read1.fastq
--read2FastqFileName            FASTQ formatted sequencing file for read 2 of paired read                    ./sampleX.read2.fastq
--samOutputFileName             SAM formatted output file                                                    ./sampleX.sam

bima --singleThreadAlignment true --referenceGenomeFastaFileName ./hg19.fa --read1FastqFileName ./sampleX.read1.fastq --read2FastqFileName ./sampleX.read2.fastq --samOutputFileName ./sampleX.sam
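
For several samples, the same invocation can be generated in a loop. The sketch below only prints command lines (a dry run) and does not execute bima. It makes several assumptions: the function name bima_command and the sample prefix ./sampleY are hypothetical, file names follow the sampleX.read1.fastq / sampleX.read2.fastq pattern of the example above, --maximumNumberOfThreads takes its value as a separate argument like the file name options do, and the default of 16 threads is chosen because BIMA is stated to scale linearly up to 16 cores.

```shell
# bima_command SAMPLE_PREFIX REFERENCE_FASTA [THREADS]
# Prints, but does not run, a multithreaded bima command for one sample.
# Illustrative helper; paths containing spaces are not quoted in the output.
bima_command() {
    sample=$1
    reference=$2
    threads=${3:-16}
    printf 'bima --maximumNumberOfThreads %s --referenceGenomeFastaFileName %s --read1FastqFileName %s.read1.fastq --read2FastqFileName %s.read2.fastq --samOutputFileName %s.sam\n' \
        "$threads" "$reference" "$sample" "$sample" "$sample"
}

# Dry run over two sample prefixes (./sampleY is a made-up second sample).
for s in ./sampleX ./sampleY; do
    bima_command "$s" ./hg19.fa
done
```

Reviewing the printed commands before running them keeps a typo in a sample prefix from costing a full alignment run.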

Download

Download (Linux x64)

Primary Contact:

George Vasmatzis

Authors:

Travis M. Drucker, Sarah H. Johnson, Stephen J. Murphy, Kendall W. Cradic, Terry M. Therneau, and George Vasmatzis

Page last modified: November 17, 2023