BIMA

BIMA is a next-generation sequencing (NGS) mapping and alignment algorithm customized to process mate-pair library sequencing.  Mate-pair sequencing is a comprehensive and cost-effective method for detecting structural variants throughout the entire genome; see http://www.illumina.com/technology/mate_pair_sequencing_assay.ilmn for details.  BIMA was developed to handle the sequencing artifacts inherent in mate-pair library preparation (biotin junction reads, paired-end read contamination, chimeras, etc.); see http://supportres.illumina.com/documents/myillumina/0a36163e-5fc0-4ae0-a944-a0ee51aa0eb2/matepair_v2_2-5kb_sampleprep_guide_15008135_a.pdf for details.

BIMA currently supports Linux operating systems, with CentOS being the primary distribution used for development and testing.  BIMA requires a minimum of 128GB of available RAM.  We strongly recommend that you do not rely on swap space for BIMA execution; if swap space is used, expect at least a 100-fold performance degradation.  BIMA is multithreaded and scales linearly up to 16 cores.
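
The memory requirement above can be checked before launching a run. The following is a minimal sketch, not part of BIMA: it assumes a Linux kernel that reports MemAvailable in /proc/meminfo, and it treats "128GB" as 128 GiB, which is an assumption. The function names are hypothetical.

```python
GIB_IN_KIB = 1024 * 1024  # /proc/meminfo reports values in kB (1024-byte units)
REQUIRED_GIB = 128        # minimum stated in this document; GiB is an assumption


def available_kib(meminfo_text):
    """Return the MemAvailable value (in kB) from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1])
    raise ValueError("MemAvailable not found in meminfo text")


def has_enough_memory(meminfo_text, required_gib=REQUIRED_GIB):
    """True if the reported available memory meets the requirement."""
    return available_kib(meminfo_text) >= required_gib * GIB_IN_KIB


# Illustrative text only; on a real host, pass the contents of /proc/meminfo.
sample = "MemTotal:       264000000 kB\nMemAvailable:   140000000 kB\n"
print(has_enough_memory(sample))
```

On a real system the text would come from open("/proc/meminfo").read().  MemAvailable excludes swap, which matches the recommendation not to rely on swap space.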

BIMA supports individual read lengths of 64 to 512 base pairs (i.e., each read of a read pair can be 64 to 512 base pairs in length).  We plan to support longer read lengths in a future version.  BIMA does not currently support the mapping or alignment of single-end reads.
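
Read lengths can be screened against the 64 to 512 base pair range before alignment. The sketch below is an illustration, not part of BIMA: it assumes plain four-line FASTQ records (no wrapped sequence lines), and the function name is hypothetical.

```python
MIN_READ_LENGTH = 64   # lower bound stated in this document
MAX_READ_LENGTH = 512  # upper bound stated in this document


def out_of_range_reads(fastq_lines, lo=MIN_READ_LENGTH, hi=MAX_READ_LENGTH):
    """Yield (read_name, length) for each read whose length is outside [lo, hi].

    Assumes four lines per record: header, sequence, '+', qualities.
    """
    lines = iter(fastq_lines)
    # zip over the same iterator four times groups the lines into records.
    for header, sequence, _plus, _qualities in zip(lines, lines, lines, lines):
        length = len(sequence.strip())
        if not lo <= length <= hi:
            yield header.strip().lstrip("@"), length


# Two in-memory records: one of 64 bases (accepted), one of 63 (too short).
records = [
    "@r1\n", "A" * 64 + "\n", "+\n", "I" * 64 + "\n",
    "@r2\n", "A" * 63 + "\n", "+\n", "I" * 63 + "\n",
]
print(list(out_of_range_reads(records)))
```

For a file on disk, an open file handle can be passed directly, since it iterates line by line.  Both FASTQ files of a pair would need to be checked.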

Invocation Options:

BIMA V3.0.0

System Requirements: Requires at least 128GB of available RAM for execution.  We strongly recommend you do not utilize swap space for BIMA execution.

Usage:

bima options

Options:

           --concordantReadPairInsertSize maximum concordant insert size [10000]
       -h, --help                         display this help text and exit
           --increasedAccuracy            Evaluate additional candidate positions,
                                           increases runtime by factor of 2
           --isPairedEndSequencingData    Sequencing is not from Mate-pair protocol,
                                           no biotin junctions should be present
                                           [false]
           --phredQualityOffset           ASCII encoding offset of phred score
                                           (Sanger = 33, Illumina 1.0+ = 59,
                                           Illumina 1.3+ = 64, Illumina 1.8+ = 33) [33]
           --read1FastqFileName           FASTQ formatted file containing all 1st
                                           reads of read pair
           --read2FastqFileName           FASTQ formatted file containing all 2nd
                                           reads of read pair
           --referenceGenomeFastaFileName FASTA formatted file containing reference
                                           genome
           --singleThreadAlignment        Use only 1 thread when performing alignment,
                                           but multiple during index generation
           --samOutputFileName            SAM formatted output file name
       -T, --maximumNumberOfThreads       maximum number of threads to utilize [4]
       -v, --verbose                      display execution options and progress
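
When the encoding of a FASTQ file is unknown, a value for --phredQualityOffset can sometimes be inferred from the quality characters. The heuristic below is a rough sketch, not BIMA's own logic: it uses the offsets listed in the help text and assumes that offset-33 data never exceeds the character 'J' (ASCII 74). It returns None when the data is ambiguous. The function name is hypothetical.

```python
def guess_phred_offset(quality_strings):
    """Guess the ASCII quality offset (33, 59 or 64), or None if ambiguous."""
    codes = [ord(ch) for quality in quality_strings for ch in quality]
    if not codes:
        raise ValueError("no quality characters supplied")
    lowest, highest = min(codes), max(codes)
    if lowest < 59:
        # Characters below ';' only occur with the Sanger / Illumina 1.8+ offset.
        return 33
    if highest > 74:
        # Above 'J' (assumed ceiling for offset 33): one of the older encodings.
        return 59 if lowest < 64 else 64
    # All characters fall in a range that every encoding can produce.
    return None


print(guess_phred_offset(["!I5", "II#"]))
```

A result of None means the sampled reads do not settle the question; in that case the sequencing provider's documentation is the safer source.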

Example Invocation:

The following example invokes bima to perform a single-threaded alignment:

bima --singleThreadAlignment true --referenceGenomeFastaFileName <reference genome> --read1FastqFileName <read 1 FASTQ> --read2FastqFileName <read 2 FASTQ> --samOutputFileName <output sam>

Parameter            Description                                                                      Value
<reference genome>   FASTA formatted reference genome, with all chromosomes concatenated into 1 file  ./hg19.fa
<read 1 FASTQ>       FASTQ formatted sequencing file for read 1 of the read pair                      ./sampleX.read1.fastq
<read 2 FASTQ>       FASTQ formatted sequencing file for read 2 of the read pair                      ./sampleX.read2.fastq
<output sam>         SAM formatted output file                                                        ./sampleX.sam

bima --singleThreadAlignment true --referenceGenomeFastaFileName ./hg19.fa --read1FastqFileName ./sampleX.read1.fastq --read2FastqFileName ./sampleX.read2.fastq --samOutputFileName ./sampleX.sam
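
A multithreaded run can be assembled the same way. The sketch below only builds and prints the command line; it uses option names from the help text above, assumes that bima is on the PATH, and reuses the example file names. The helper name build_bima_command is hypothetical, and the default of 16 threads follows the scaling limit mentioned earlier.

```python
import shlex


def build_bima_command(reference_fasta, read1_fastq, read2_fastq,
                       sam_output, threads=16):
    """Return the bima argument list for a multithreaded alignment."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return [
        "bima",  # assumed to be on the PATH
        "--maximumNumberOfThreads", str(threads),
        "--referenceGenomeFastaFileName", reference_fasta,
        "--read1FastqFileName", read1_fastq,
        "--read2FastqFileName", read2_fastq,
        "--samOutputFileName", sam_output,
    ]


command = build_bima_command("./hg19.fa", "./sampleX.read1.fastq",
                             "./sampleX.read2.fastq", "./sampleX.sam")
print(shlex.join(command))
```

The list can be passed to subprocess.run(command, check=True) to start the alignment.  Building the command as a list avoids shell quoting problems with unusual file names.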

Download

Download (Linux x64)

Primary Contact:

George Vasmatzis

Authors:

Travis M. Drucker, Sarah H. Johnson, Stephen J. Murphy, Kendall W. Cradic, Terry M. Therneau, and George Vasmatzis

Page last modified: October 24, 2014