BIMA is a Next Generation Sequencing (NGS) mapping and alignment algorithm customized to process mate-pair library sequencing. Mate-pair sequencing is a comprehensive and cost-effective method for detecting structural variants throughout the entire genome; see http://www.illumina.com/technology/mate_pair_sequencing_assay.ilmn for details. BIMA was developed to handle the sequencing artifacts inherent in mate-pair library preparation (biotin junction reads, paired-end read contamination, chimeras, etc.); see http://supportres.illumina.com/documents/myillumina/0a36163e-5fc0-4ae0-a944-a0ee51aa0eb2/matepair_v2_2-5kb_sampleprep_guide_15008135_a.pdf for details.
BIMA currently supports Linux operating systems; CentOS is the primary distribution used for development and testing. BIMA requires a minimum of 128GB of available RAM. We strongly recommend against using swap space during BIMA execution: if swap space is used, expect at least a 100-fold performance degradation. BIMA is multithreaded and scales linearly up to 16 cores.
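The memory requirement can be checked before launching a run. The following is a minimal sketch, not part of BIMA: it parses text in the format of the Linux /proc/meminfo file and compares the MemAvailable value against the 128GB figure stated above. The function names are hypothetical, and treating "128GB" as 128 GiB is an assumption.

```python
import re

# Taken from the requirement stated above; reading "GB" as GiB is an assumption.
REQUIRED_GIB = 128


def available_gib(meminfo_text: str) -> float:
    """Return the MemAvailable value from /proc/meminfo-style text, in GiB."""
    match = re.search(r"^MemAvailable:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemAvailable entry not found")
    # /proc/meminfo reports values in units of 1024 bytes.
    return int(match.group(1)) / (1024 * 1024)


def has_enough_ram(meminfo_text: str, required_gib: float = REQUIRED_GIB) -> bool:
    """True if the available memory meets the stated minimum."""
    return available_gib(meminfo_text) >= required_gib
```

On a Linux host, the text can be obtained with open("/proc/meminfo").read() and passed to has_enough_ram.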
BIMA supports individual read lengths of 64 to 512 base pairs (i.e. each read of a read pair can be 64 to 512 base pairs in length). We plan to support longer read lengths in a future version. BIMA does not currently support the mapping or alignment of single-end reads.
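Because reads outside the 64 to 512 base pair range are not supported, it can be useful to screen the input files first. The sketch below is an illustration only and is not shipped with BIMA; it assumes standard four-line FASTQ records (no wrapped sequence lines), and the function name is hypothetical.

```python
from typing import Iterable, Iterator, Tuple

# Supported read length range, as stated above.
MIN_READ_LENGTH = 64
MAX_READ_LENGTH = 512


def out_of_range_reads(fastq_lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Yield (read name, length) for every read outside the supported range.

    Assumes four-line FASTQ records: header, sequence, '+', qualities.
    """
    name = ""
    for index, line in enumerate(fastq_lines):
        position = index % 4
        if position == 0:
            # Header line: drop the leading '@' and the line ending.
            name = line.rstrip("\r\n")[1:]
        elif position == 1:
            # Sequence line: its length is the read length.
            length = len(line.rstrip("\r\n"))
            if not MIN_READ_LENGTH <= length <= MAX_READ_LENGTH:
                yield name, length
```

An open file handle can be passed directly, for example list(out_of_range_reads(open("sampleX.read1.fastq"))); an empty result means every read falls within the supported range.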
System Requirements: Requires at least 128GB of available RAM for execution. We strongly recommend you do not utilize swap space for BIMA execution.
--concordantReadPairInsertSize  maximum concordant insert size
-h, --help  display this help text and exit
--increasedAccuracy  Evaluate additional candidate positions, increases runtime by factor of 2
--isPairedEndSequencingData  Sequencing is not from Mate-pair protocol, no biotin junctions should be present [false]
--phredQualityOffset  ASCII encoding offset of phred score (Sanger = 33, Illumina 1.0+ = 59, Illumina 1.3+ = 64, Illumina 1.8+ = 33)
--read1FastqFileName  FASTQ formatted file containing all 1st reads of read pair
--read2FastqFileName  FASTQ formatted file containing all 2nd reads of read pair
--referenceGenomeFastaFileName  FASTA formatted file containing reference genome
--singleThreadAlignment  Use only 1 thread when performing alignment, but multiple during index generation
--samOutputFileName  SAM formatted output file name
-T, --maximumNumberOfThreads  maximum number of threads to utilize
-v, --verbose  display execution options and progress
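When the encoding of a FASTQ file is unknown, the value for --phredQualityOffset can often be guessed from the lowest quality character in the file. The sketch below is a heuristic and not part of BIMA: the function names are hypothetical, and mapping character ranges onto the three values listed in the help text (33, 59, 64) is an assumption. A result of 64 from a small sample is ambiguous, since uniformly high-quality offset-33 reads can also contain only characters at or above ASCII 64.

```python
from itertools import islice
from typing import Iterable, Iterator


def quality_strings(fastq_lines: Iterable[str]) -> Iterator[str]:
    """Yield the quality line of each four-line FASTQ record."""
    return islice(fastq_lines, 3, None, 4)


def guess_phred_offset(quality_lines: Iterable[str]) -> int:
    """Guess a --phredQualityOffset value from the lowest quality character.

    Heuristic (assumption): characters below ASCII 59 only occur with
    offset 33; characters 59..63 suggest the value the help text lists for
    Illumina 1.0+; otherwise 64 is suggested.
    """
    lowest = None
    for line in quality_lines:
        for char in line.rstrip("\r\n"):
            code = ord(char)
            if lowest is None or code < lowest:
                lowest = code
    if lowest is None:
        raise ValueError("no quality characters found")
    if lowest < 59:
        return 33
    if lowest < 64:
        return 59
    return 64
```

For a file on disk, guess_phred_offset(quality_strings(open("sampleX.read1.fastq"))) scans every record; sampling more reads makes the guess more reliable.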
The following example invokes bima to perform a single threaded alignment:
bima --singleThreadAlignment true --referenceGenomeFastaFileName <reference genome> --read1FastqFileName <read 1 FASTQ> --read2FastqFileName <read 2 FASTQ> --samOutputFileName <output sam>
|FASTA formatted reference genome. All chromosomes concatenated into 1 file.||./hg19.fa|
|FASTQ formatted sequencing file for read 1 of paired read.||./sampleX.read1.fastq|
|FASTQ formatted sequencing file for read 2 of paired read.||./sampleX.read2.fastq|
|SAM formatted output file||./sampleX.sam|
bima --singleThreadAlignment true --referenceGenomeFastaFileName ./hg19.fa --read1FastqFileName ./sampleX.read1.fastq --read2FastqFileName ./sampleX.read2.fastq --samOutputFileName ./sampleX.sam
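For scripted runs over many samples, the same invocation can be assembled programmatically. The following is a sketch only: build_bima_command is a hypothetical helper, not something BIMA provides. It reproduces the option names and the "--singleThreadAlignment true" form shown in the example above. Passing the thread count as a separate argument after --maximumNumberOfThreads is an assumption about how that option takes its value.

```python
import shlex
from typing import List, Optional


def build_bima_command(
    reference_fasta: str,
    read1_fastq: str,
    read2_fastq: str,
    sam_output: str,
    single_thread: bool = False,
    max_threads: Optional[int] = None,
    executable: str = "bima",
) -> List[str]:
    """Return the bima argument list for one mate-pair sample."""
    command = [executable]
    if single_thread:
        # Same form as the documented example invocation.
        command += ["--singleThreadAlignment", "true"]
    elif max_threads is not None:
        # Assumption: the thread count follows the option as its own argument.
        command += ["--maximumNumberOfThreads", str(max_threads)]
    command += [
        "--referenceGenomeFastaFileName", reference_fasta,
        "--read1FastqFileName", read1_fastq,
        "--read2FastqFileName", read2_fastq,
        "--samOutputFileName", sam_output,
    ]
    return command


example = build_bima_command(
    "./hg19.fa",
    "./sampleX.read1.fastq",
    "./sampleX.read2.fastq",
    "./sampleX.sam",
    single_thread=True,
)
print(shlex.join(example))
```

The resulting list can be handed to subprocess.run(example, check=True); building a list rather than a single string avoids shell quoting problems with unusual file names.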
Travis M. Drucker, Sarah H. Johnson, Stephen J. Murphy, Kendall W. Cradic, Terry M. Therneau, and George Vasmatzis
Page last modified: October 24, 2014