[//]: # (This file is in markdown format.  See: http://dillinger.io/)

------------------------------------------------------------------
Version 5.1.0 - 2021-01-27
------------------------------------------------------------------
**Bug Fixes**
+ bior\_lookup threw an error when specifying an index path
+ Corrected ALT field to be a single string when using bior\_build\_catalog
+ Fixed bug with nested JSON
+ bior\_catalog\_stats - fixed off-by-one error when determining end of read; fixed Long comparisons; use Long instead of int for row counts
+ Added bior\_catalog\_latest command to get the latest version of a specific catalog source

------------------------------------------------------------------
Version 5.0.1 - 2019-04-23
------------------------------------------------------------------
**Bug Fixes**
+ When bior\_verify\_catalog encountered an int that was larger than Integer.MAX_VALUE, it threw an exception. 
+ When building catalogs such as OMIM, the ShortUniqueName was not correctly formatted.  Ex: "omim\_2018\_09\_05\_GRCh37p13" should have been "omim\_20180905\_GRCh37p13" 

**Changes**
+ bior\_same\_variant now uses fast string-parsing to resolve the \_landmark, \_minBP, \_maxBP, \_refAllele, \_altAlleles, and \_id fields instead of parsing the full JSON multiple times.  The speed increase should be more noticeable for catalogs with a large number of JSON fields.  This is imported with pipes 5.0.6.
+ Updated bior\_modify\_tjson command help to clear up that only one delimiter is allowed, but it can be multiple characters
+ When reporting errors in verify, only output the first 1000 chars of the JSON if it is longer than that (so log doesn't fill up with huge lines)
+ Allow bior\_build\_catalog command to skip the portion that pulls the first 3 tabix columns from the JSON (if the lines are already formatted correctly).  This can shave off several days on large catalogs


------------------------------------------------------------------
Version 5.0.0 - 2018-07-31
------------------------------------------------------------------
**Removed**
These commands were removed due to out-of-date sources, low usage, and high maintenance costs.  They can be run independently (VEP, SnpEff), with pre-5.0.0 versions of BioR, or with bior\_annotate.sh that provides similar functionality.
+ bior\_annotate  (this is the old toolkit command, and NOT bior\_annotate.sh that is in general use)
+ bior\_annotate\_blaster  (the old toolkit command that uses the grid engine for parallelization)
+ bior\_vep
+ bior\_snpeff

**Added**
+ bior\_catalog\_markdown command added to generate markdown documentation about a catalog

**Changes**
+ bior\_build\_catalog - Added STOP step so it can be executed on 1 or more steps
+ DataSourceReleaseDate, Dataset, Description added to list of build\_info.txt items that will be saved in the catalog's datasource.properties file
+ Error codes given by bior\_verify\_catalog to inform catalog build automation on whether it can automatically publish catalogs 
+ Command usage logging updated with new server settings
+ bior\_verify\_catalog now handles comparisons of "N" base-pair to other alt base-pairs
+ BioR R jar - removed fixed paths to allow simpler build of the jar for the R annotator
+ bior\_catalog\_markdown can add data (with -c flag) from re-annotation project JSON output to show comparisons between current and previous catalogs 

**Bug Fixes**
+ VCF INFO column characters are now escaped to avoid VCF parsing issues (comma (',') to %2C,   equals ('=') to %3D,  semicolon (';') to %3B).  This should fix issues with URLs and these special characters occurring in the VCF INFO column, but may require decoding if you pull out these values.  For example, the string Description='a small,quick list; status=done' is encoded into the INFO column as "Description=a small%2Cquick list%3B status%3Ddone"
+ bior\_tab\_to\_tjson - Support 4 types: Float, Integer, Boolean, String.  (Previous 'Number' type split out into 'Float' and 'Integer', but still offered for backwards compatibility) 
+ bior\_build\_catalog - Arrays threw exceptions when switching between values with JSON objects and null  ( ex: {"A":[{"x":1}]}  then  {"A":[null]} ) 
+ bior\_catalog\_stats and bior\_verify\_catalog have options to specify chunks of the catalog to be scanned with each call to enable parallelization
+ bior\_catalog\_stats attempted to write to / root dir when no path given, instead of writing to current dir
+ Fix line counts for bior\_verify\_catalog and bior\_catalog\_stats when reading huge catalogs with > 2B rows
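
The INFO-column escaping described in the first bug fix above can be sketched as follows. This is a minimal illustration and not BioR's actual implementation; the function names are hypothetical. Only the three characters named in the note are handled, so the percent sign itself is assumed not to be escaped.

```python
# Characters named in the release note and their percent-encoded forms.
INFO_ESCAPES = {",": "%2C", "=": "%3D", ";": "%3B"}


def escape_info_value(value: str) -> str:
    """Percent-encode the three characters that would break VCF INFO parsing."""
    return "".join(INFO_ESCAPES.get(ch, ch) for ch in value)


def unescape_info_value(value: str) -> str:
    """Reverse escape_info_value for values pulled back out of the INFO column."""
    for raw, encoded in INFO_ESCAPES.items():
        value = value.replace(encoded, raw)
    return value


encoded = escape_info_value("a small,quick list; status=done")
print("Description=" + encoded)
# prints: Description=a small%2Cquick list%3B status%3Ddone
print(unescape_info_value(encoded))
# prints: a small,quick list; status=done
```

Because "%" is left alone in this sketch, a value that already contained a literal "%2C" before encoding would be decoded to a comma; callers that need an exact round trip would have to account for that.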



------------------------------------------------------------------
Version 4.3.1 - 2017-09-28
------------------------------------------------------------------
**Bug Fixes**
+ Large float values were often dropped or not handled correctly by these commands:
	- bior\_vcf\_to\_tjson
	- bior\_tjson\_to\_vcf
	- bior\_tab\_to\_tjson
	- bior\_modify\_tjson
	- bior\_drill
+ Example of a float that was dropped: 123456789.123456789012345. Most floats over 16 digits were either rounded, truncated, or threw an Exception that caused the value to be dropped from the JSON string. The precision of the original value should now be preserved.
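
The general problem can be reproduced outside BioR. The sketch below uses Python's standard json module (not BioR's Java code) to show how parsing the example value as a binary double loses digits, while an arbitrary-precision decimal type keeps them. The "score" key is a made-up field name.

```python
import json
from decimal import Decimal

# Hypothetical JSON record holding the value from the release note.
raw = '{"score": 123456789.123456789012345}'

# Default parsing goes through a 64-bit double, which cannot hold all 24 digits.
as_float = json.loads(raw)["score"]

# Parsing floats as Decimal keeps every digit of the original text.
as_decimal = json.loads(raw, parse_float=Decimal)["score"]

print(repr(as_float))   # fewer digits than the input
print(as_decimal)       # prints: 123456789.123456789012345
```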
			


------------------------------------------------------------------
Version 4.3.0 - 2017-04-19
------------------------------------------------------------------
**New Commands**
+ bior\_modify\_tjson
	- Modifies existing JSON fields, allowing users to change object types, or strings to arrays, etc.
	- Ex: You can change the type of "MyBoolean" in the JSON from an integer {"MyBoolean":0} to a boolean {"MyBoolean":false} by defining the type change in a config file, which would then modify each line

+ bior\_catalog\_stats
	- Provides stats and values from all fields within a catalog, such as number of occurrences of each character (to determine the data type), the frequency of each character, and the first thousand values encountered

+ bior\_replace\_lines
	- Replaces input lines with others while streaming.
	- Ex: If a ##INFO header line in a VCF defines the Type as "String" but it should be "Float", you can define two input config files: one listing the lines to find, and another listing the lines to replace them with.
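
The bior\_modify\_tjson type change described above (integer to boolean) can be pictured with the following sketch. It is only an illustration of the transformation on one JSON string; the function name is hypothetical and the real command is driven by a config file whose format is not shown here.

```python
import json


def int_to_bool(json_text: str, key: str) -> str:
    """Rewrite an integer 0/1 stored under `key` as a JSON boolean."""
    record = json.loads(json_text)
    value = record.get(key)
    # bool is a subclass of int in Python, so skip values that are already booleans.
    if isinstance(value, int) and not isinstance(value, bool) and value in (0, 1):
        record[key] = bool(value)
    # Compact separators match the style of the JSON in these notes.
    return json.dumps(record, separators=(",", ":"))


print(int_to_bool('{"MyBoolean":0}', "MyBoolean"))
# prints: {"MyBoolean":false}
```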


			
**Bug Fixes**
+ bior\_build\_catalog
	- Allowed sort to be deactivated after make-json to avoid double-sorts in some cases, or to skip a file that is already sorted
	- All build\_info.txt keys added to environment, so users can define their own variables in the config file
	- Output of make\_json script can be plain-text, gz, or bgzip, instead of just plain-text, as using gz can save disk space.

+ bior\_create\_catalog
	- Corrected sort columns (was: chrom,min,max;  now: chrom,min,max,ref,altAllele)
	- Now uses the same logic as bior\_build\_catalog to find the preferred temp directory (preferring /local2/tmp and /local1/tmp over /tmp due to available space on those partitions)
	- H2 index lookup and writing now correctly handles catalog short names ending in ".vcf".  This had been a problem with older catalogs built on dbSNP.

+ bior\_vcf\_to\_tjson
	- POS field should be an Integer and not a String (should NOT be quoted)
	- Don't add empty values (dots) to catalogs, such as "QUAL":"."


**Updates**
+ bior\_drill - collapses array values down into a pipe-delimited string to make value parsing easier
	- Ex: {"minorAlleleFreqs":[0.23,0.34,0.11]} used to output [0.23,0.34,0.11], but will now output "0.23|0.34|0.11".
	- Separator character can be changed from pipe to a user-specified character

+ bior\_catalog\_remove\_duplicates allows streaming of data instead of just input and output files, making it more useful to pipe together with other commands

+ bior\_vcf\_to\_tjson provides flag to output "false" values for any VCF INFO flags that do not appear in the INFO field (default is to not add the "false" value to JSON)
	- Ex: If "IsMyBoolean" appears on one VCF line, it will produce JSON of {"IsMyBoolean":true}, but if that field does not appear on the next line, it will produce no key-value pair in JSON: {}, as the absence of the field signifies "false".  By providing the -f flag to the command, it will force the absent field to produce JSON {"IsMyBoolean":false}
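
Two of the updates above are easy to picture in code: the array collapsing in bior\_drill and the optional "false" values in bior\_vcf\_to\_tjson. The sketch below is an assumption-laden illustration, not the commands' implementation; both function names and the `force_false` parameter (standing in for the -f flag) are hypothetical.

```python
import json


def collapse_array(values, separator="|"):
    """Join array values into one delimited string (pipe by default)."""
    return separator.join(str(v) for v in values)


def info_flags_to_json(present_flags, declared_flags, force_false=False):
    """Build JSON for VCF INFO flags.

    present_flags:  flags that appear on the current VCF line
    declared_flags: all flags declared in the header
    force_false:    emit "false" for absent flags (mimics the -f behaviour)
    """
    result = {}
    for flag in declared_flags:
        if flag in present_flags:
            result[flag] = True
        elif force_false:
            result[flag] = False
    return json.dumps(result, separators=(",", ":"))


print(collapse_array([0.23, 0.34, 0.11]))                      # prints: 0.23|0.34|0.11
print(info_flags_to_json({"IsMyBoolean"}, ["IsMyBoolean"]))    # prints: {"IsMyBoolean":true}
print(info_flags_to_json(set(), ["IsMyBoolean"]))              # prints: {}
print(info_flags_to_json(set(), ["IsMyBoolean"], True))        # prints: {"IsMyBoolean":false}
```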



------------------------------------------------------------------
Version 4.2.0 - 2016-11-12
------------------------------------------------------------------
**New Commands**
+ bior\_build\_catalog
	- Compiles source data into JSON
	- Creates a .tsv.bgz catalog from the JSON
	- Creates columns and datasource properties files
	- Merges columns and datasource properties files with data from previous catalogs
	- Verifies the catalog

+ bior\_verify\_catalog
	- Verifies many details of the catalog

**Updates**
+ Fixed bug related to bior\_lookup path when constructing indexes where the catalog name has extra dots in it
+ bior\_annotate\_blaster takes a -jobName JOBNAME flag which allows a user to wait for the command to finish by using the "qsub -sync yes".

**Bug Fixes**
+ bior\_annotate\_blaster was calling bior\_concat with the -i flag which is no longer used.  This would throw an exception about an invalid option.
+ bior\_drill was not handling null values in catalogs correctly.  A key in the catalog such as "id":null would miss the column 
+ bior\_catalog\_remove\_duplicates command added back in for this release as we are providing a tabix index builder through Java (cmd line call)


------------------------------------------------------------------
Version 4.1.2 - 2016-09-29
------------------------------------------------------------------
**New Commands**
+ bior\_count\_catalog\_misses 
	- assesses BioR catalog for lines that bior\_overlap would miss due to bug described at http://bsiweb.mayo.edu/bior/java-tabix-bug-2016-09
	- run with no arguments for help

**Removed Commands**
+ bior\_catalog\_remove\_duplicates
	- this command used Java Tabix to create incorrect tabix indexes on the catalog output in the command
	- it is in versions 3.0.0, 4.0.0 and 4.1.1 but had not been used so it was removed from this release

**Bug Fixes**
+ To address the critical tabix issue described at http://bsiweb.mayo.edu/bior/java-tabix-bug-2016-09, we replaced our Java code that reads tabix indexes with htsjdk (https://github.com/samtools/htsjdk) version 1.143.  Most importantly, this could affect bior\_overlap and bior\_same\_variant results but can have effects on other commands like bior\_variant\_to\_tjson.
+ bior\_create\_catalog: instead of creating incorrect tabix indexes with Java Tabix code, we are creating it with command line tabix. The incorrect indexes are created in the 4.0.0 and 4.1.1 versions of bior\_create\_catalog.
		
		
------------------------------------------------------------------
Version 4.1.1 - 2016-07-18
------------------------------------------------------------------
**Updates**
+ Fixed bug with bior\_tjson\_to\_vcf throwing IndexOutOfBoundsException when number of columns exceeded 128
+ Fixed bug with bior\_tjson\_to\_vcf where ##INFO headers were not added for column when there were no values in the column until after row 200 (due to line buffer)
+ Additional bug fixes, improvements, and changes to bior\_build\_catalog
+ Memory bumped up from 128MB to 256MB for bior\_tjson\_to\_vcf, bior\_same\_variant, bior\_overlap
+ bior\_tjson\_to\_vcf outputs ##fileformat=VCFv4.1 now instead of ##fileformat=VCFv4.2, since we haven't fully transitioned BioR to v4.2 yet.


------------------------------------------------------------------
Version 4.0.0 - 2016-04-15
------------------------------------------------------------------
**Updates**
+ bior\_tab\_to\_tjson
	- config file Golden attribute validation:
	- \_minBP and \_maxBP must be of type “NUMBER”
	- \_landmark must be of type “STRING”
	- \_minBP and \_maxBP in config file will be de-quoted in JSON output
	- \_altAlleles added to JSON correctly as array
	- JsonType (data type) & InjectorType values no longer case-sensitive
	- Value checks for JsonType (data type), InjectorType, and Golden attribute columns
	- Handles empty lines and spaces
+ bior\_create\_catalog
	- Can calculate \_maxBP from \_refAllele & \_minBP and add to catalog JSON if user does not want to calculate it themselves
	- Ensure \_minBP and \_maxBP are integers & not quoted
	- Chromosome conversions  (23->X, 24->Y, 25->XY, 26->M, MT->M) in both Tabix and \_landmark fields
	- User can specify their own chromosome sort order file which takes precedence
	- Enforces .tsv.bgz extension of catalog
+ bior\_tjson\_to\_vcf
	- Default behavior changed:
		* Now does NOT unroll JSON into the INFO field.  (“new” bior\_annotate.sh had issues with repeated uses of this command)
		* Instead, provide “-j” option to unroll the input JSON into the INFO field.
		* Removes all bior added columns, the target JSON used to build the VCF first 8 columns, and any columns specified in the range.			
+ \_bior\_sort\_catalog command removed as a much better alternative is included in bior\_create\_catalog
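
The chromosome conversions and the \_maxBP calculation listed under bior\_create\_catalog above can be sketched as follows. The alias table comes straight from the note. The \_maxBP formula is an assumption (1-based, inclusive coordinates, so the span equals the reference allele length), and the function names are hypothetical.

```python
# Conversions listed in the release note: 23->X, 24->Y, 25->XY, 26->M, MT->M.
CHROM_ALIASES = {"23": "X", "24": "Y", "25": "XY", "26": "M", "MT": "M"}


def normalize_chrom(chrom: str) -> str:
    """Map numeric or MT chromosome names to their letter form; leave others as-is."""
    return CHROM_ALIASES.get(chrom, chrom)


def max_bp(min_bp: int, ref_allele: str) -> int:
    """Derive _maxBP from _minBP and _refAllele.

    Assumption: coordinates are 1-based and inclusive, so a reference allele
    of length n starting at min_bp ends at min_bp + n - 1.
    """
    return min_bp + len(ref_allele) - 1


print(normalize_chrom("23"), normalize_chrom("MT"), normalize_chrom("7"))  # prints: X M 7
print(max_bp(100, "A"))     # prints: 100
print(max_bp(100, "ACGT"))  # prints: 103
```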



------------------------------------------------------------------
Version 3.0.0 - 2016-02-23
------------------------------------------------------------------
**Updates**
+ bior\_tjson\_to\_vcf - can now create the full VCF from a JSON column and roll all BioR columns (including JSON) into the VCF INFO column
+ Support added for calling BioR annotate directly from R
+ bior\_concat and bior\_merge - multiple input files can still be specified with wildcards, but must be passed as stand-alone arguments instead of arguments to multiple -i flags
+ bior\_vep - now uses flags loaded from bior.properties, and uses an increased buffer size which will speed processing.  See $BIOR\_LITE\_HOME/conf/bior.properties for changes.
	- NOTE: If you have a bior.properties file in your user home directory, this will need to be updated to include additional properties! 
	- New keys in properties file: BiorVepBufferSize, BiorVepCmdFlags

**New Commands**
+ bior\_variant\_to\_tjson - Converts rsId or (chrom and pos) or (chrom and pos and allele1 and allele2) into a JSON field for use by other BioR commands. 
+ \_bior\_sort\_catalog - Sorts a catalog by chromosome, start, end.  This sorts chromosomes logically, for example: 1,2,3,...9,10,11,12,...19,20,21,22,X,Y,XY,M,UNKNOWN
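
The "logical" chromosome order above differs from a plain string sort, where "10" would come before "2". A minimal sketch of such a sort key is shown below; it illustrates the ordering only and is not the command's code. The helper name and the sample rows are made up, and anything not recognized (such as UNKNOWN) is assumed to sort last.

```python
# Named chromosomes in the order given in the release note, after 1..22.
NAMED_ORDER = ["X", "Y", "XY", "M"]


def chrom_sort_key(chrom: str):
    """Return a tuple that sorts numeric chromosomes first, then X, Y, XY, M, then the rest."""
    if chrom.isdigit():
        return (0, int(chrom), "")
    if chrom in NAMED_ORDER:
        return (1, NAMED_ORDER.index(chrom), "")
    return (2, 0, chrom)


# Hypothetical catalog rows: (chromosome, start, end).
rows = [("X", 5, 9), ("10", 100, 100), ("2", 50, 60),
        ("UNKNOWN", 1, 1), ("2", 7, 7), ("M", 3, 3)]

# Sort by chromosome, then start, then end.
rows.sort(key=lambda r: (chrom_sort_key(r[0]), r[1], r[2]))

print([r[0] for r in rows])
# prints: ['2', '2', '10', 'X', 'M', 'UNKNOWN']
```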

**Bug Fixes**
+ bior\_same\_variant now returns values if ANY of the alt alleles matches one of the alts in the catalog
	- (instead of the input alts needing to be a subset of the catalog alts)
	- Flag to compare by rsId
+ bior\_vep had occasional errors when using a buffer size of 1 where fields would be missing from the returned CSQ results
+ bior\_annotate fields have less reliance on VEP and OMIM, and pull fields from HGNC where available.  This should increase speed slightly and pull gene-related fields more reliably.
+ No longer throws error when running BioR script with NO log flag from a read-only directory.
+ Speed improvement in pipes parsing
+ bior\_annotate\_blaster flag for outputting uncompressed vcf was not working previously
+ bior\_vcf\_to\_tjson was outputting an error to stdout instead of stderr when one of the INFO fields could not be parsed
	- (for example, if a field was type "Integer", but the value was "0|0|0", it could not be parsed and threw a warning to StdOut)
	- Bug fixed where non-human chromosomes (or non-standard human chromosomes) were mapped incorrectly to a standard chromosome.
		* If the chromosome is not recognized, it is left alone.
+ bior\_vcf\_to\_tjson
	- When the INFO field has a dot in it, the JSON will now contain '"INFO":{}' instead of '"INFO:{".":true}'
	- Incorrectly converted odd chromosomes such as "1\_alt" to a different chromosome (such as "7") in the "\_landmark" attribute
+ Flag --logfile added to all commands to allow users to write to a specific log file
	- (rather than defaulting to bior.log each time which causes conflicts when running 2 or more BioR commands)
+ Flag --log was not working for command bior\_create\_catalog
+ Flag arguments such as "-c -2" (example: a negative column as an argument) are handled correctly now, whereas they were previously treated incorrectly as separate flags.
+ bior\_annotate help text was difficult to use in constructing example commands because of spaces and tabs around config file values
+ Commands can now override default flags such as -l (log) and -v (version), which will force the defaults to then use the long versions (--log, --version)
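
The bior\_same\_variant change at the top of this list (match on ANY shared alt allele rather than requiring the input alts to be a subset of the catalog alts) comes down to a different set comparison. The sketch below shows the two rules side by side; the function names are hypothetical and the real command also compares chromosome, position, and reference allele, which are omitted here.

```python
def matches_any(input_alts, catalog_alts) -> bool:
    """New rule: at least one input alt allele appears among the catalog alts."""
    return bool(set(input_alts) & set(catalog_alts))


def matches_subset(input_alts, catalog_alts) -> bool:
    """Old rule: every input alt allele must appear among the catalog alts."""
    return set(input_alts) <= set(catalog_alts)


input_alts = ["A", "T"]
catalog_alts = ["A", "G"]

print(matches_subset(input_alts, catalog_alts))  # prints: False
print(matches_any(input_alts, catalog_alts))     # prints: True
```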


------------------------------------------------------------------
Version 2.4.1 - 2015-04-10
------------------------------------------------------------------
**NOTE: Important fix to bior\_vep!**
	
**Updates**
+ Corrected help for bior\_tjson\_to\_vcf
+ bior\_compress - added examples about escaping special characters
		
**Bug Fixes** 
+ bior\_vep: (IMPORTANT!!!) When multiple effects were returned by the VEP tool, bior\_vep chose the most damaging one, but when no SIFT or PolyPhen scores were available it chose nothing and reported "{}" as the result instead of choosing one of the results
+ bior\_annotate\_blaster: Added status file BEFORE each chunk so that if a chunk fails, it gives a failed status at the end and stops instead of cleaning up the output files.
	- Added --noZip flag and fixed bug where it wasn't working (it gzip'd anyway, but only when wanting TJSON back instead of VCF)


------------------------------------------------------------------
Version 2.4.0 - 2014-08-15
------------------------------------------------------------------
**Updates**
+ bior\_annotate\_blaster : Splits, annotates, concatenates the files back together for you when done. 
	- Cleans up output.  Can output vcf or gzipped-vcf file.  
	- Provides arguments for log and status files.
+ bior\_tjson\_to\_vcf : Help documentation was incorrect previously
+ bior\_vep, bior\_snpeff : Updated long flags to work correctly
+ bior\_create\_catalog : Updated long flags to work correctly
+ bior\_bed\_to\_tjson : Note added that this takes 0-based coordinates
+ bior\_compress : Updated help documentation on command line to show how to escape special characters as separators
         
**New Commands**
+ bior\_gbk\_to\_tjson  : Converts Genbank gbk files to JSON
+ bior\_gff3\_to\_tjson : Converts GFF3 variant data to JSON
        
**Bug Fixes**
+ Corrected help text on several commands which were being truncated because of spaces at the ends of the lines in the properties files after the line wrap character.



------------------------------------------------------------------
Version 2.3.0 - 2014-04-22
------------------------------------------------------------------
**Updates**
+ bior\_index\_catalog : Can now create indexes on arrays.   Most command output moved to logger.
        
**New Commands**
+ bior\_chunk : Breaks up a VCF file into chunks based on start and end line
+ bior\_merge : Merges multiple VCF files (text or gzip) with or without headers into one VCF.
	- These files can have duplicate lines with overlapping chromosome ranges and can be out of order
+ bior\_concat : Concatenates multiple VCF files together - similar to merge (and twice as fast), but should only be run on files whose chromosome and start positions are sequential.
	- Useful for combining file parts from bior\_annotate\_blaster.
+ bior\_annotate\_blaster : Splits one large VCF up into multiples that are then submitted to the open grid engine for parallel processing
+ bior\_ref\_allele : Looks up the reference allele in NCBI Genome for each variant's position.  This can be used with catalogs such as:
  /data5/bsi/catalogs/user/v1/biostats/exome\_overlap/hs\_ref\_genome.fa.tsv.bgz
        
**Bug Fixes**
+ bior\_drill : Was throwing an error when using multiple drill paths with positive column numbers


------------------------------------------------------------------
Version 2.2.1 - 2014-02-13
------------------------------------------------------------------
**Updates**
+ SnpEff - users can now upgrade the SnpEff tool version and genome builds. See UserGuide for more info
+ The bior.properties file has been updated to include a SnpEffCmd property which allows the SnpEff tool to be upgraded easily. However, if users have added a bior.properties file to their home directory to override the default one in the bior\_pipeline*/conf/ directory, this will need to be updated to include the new property. See the file in the bior\_pipeline*/conf/ directory for how to specify the new property.
+ bior\_annotate: Added all columns for config file to the help output when using the --help or -h flag
  
**New Commands**
+ Added bior\_trim\_spaces command to remove spaces around columns. This command skips any metadata header lines that begin with "##".
	- bior\_trim\_spaces was added by default within bior\_vep, bior\_snpeff, and bior\_annotate commands. Spaces around columns could cause problems for tools that operate on numeric columns like POS, and would cause VEP to eat up memory at a rate of 2-3MB/sec.

**Bug Fixes**
+ bior\_same\_variant will no longer match alts when they are "." - this was a problem when there was no rsId specified in either the input or catalog file, and variants were matching on chrom, position already, and then matched incorrectly on the "." in the Id column.
+ bior\_index\_catalog - was throwing a NullPointerException when trying to create an index in the current directory.
+ LineCounterPipe can now handle either Strings or History objects. This was causing a problem with the bior\_create\_catalog command. 



------------------------------------------------------------------
Version 2.2.0 - 2013-11-27
------------------------------------------------------------------
**Updates**
+ bior\_annotate updates:
	- Resurrected single-threaded bior\_annotate with metadata functionality. This will reduce the amount of memory used by bior\_annotate.
	- Single-threaded operation is now the default, with flag to call multi-threaded version
	- Flag to output status (# lines in, out, failed, exit code)
	- Improved exit code handling so it only returns 1 if the code did not run to completion. If some lines could not be processed, but command ran to completion, then errors on those lines are shown in log, with summary to stderr, but exit code = 0 

+ Added LineCounterPipe to pipes project 


------------------------------------------------------------------
Version 2.1.0 - 2013-09-11
------------------------------------------------------------------
**Updates**
+ Metadata is now included with all catalogs - one properties file to denote the data source name, version, build; one to specify column names, types, number of occurrences, and descriptions. 
+ Metadata ##BIOR lines added for all commands when a new column is added (or one is modified). This required a significant change to the code base as all commands were affected, and the metadata handling was centralized into one location. This also affected many test cases as well that had to be updated to match the new command outputs. 
+ Reduced memory usage on SGE (Sun Grid Engine) for all commands
	- Gathered all license information from tools and data sources 

**New commands added**
+ bior\_bed\_to\_tjson - converts a bed file to a tab-delimited JSON file.
+ bior\_compress - compresses multiple lines into one. If values in a compressed column differ from one line to the next, those are combined into the same column, but separated by some delimiter.
+ bior\_create\_catalog - creates a catalog from a given file that contains a JSON column. This performs a bgzip compression on the output file, and then runs tabix to create a positional index on that bgzip file.
+ bior\_create\_catalog\_props - creates the two properties files for a catalog. To do this, the catalog is crawled to calculate data types and whether there are multiple values within a field. The user can also point to a VCF file from which the catalog was derived in order to parse the correct data types and multiplicities.
+ bior\_create\_config\_for\_tab\_to\_tjson - helps to create the config file that is used by the bior\_tab\_to\_tjson command
+ bior\_tab\_to\_tjson - for each line in a tab-delimited file it creates a JSON object from this, and appends the JSON to the end of the line. This requires a config file that can be constructed using the bior\_create\_config\_for\_tab\_to\_tjson command.
+ bior\_tjson\_to\_vcf - takes lines in a tab-delimited JSON file and converts them to VCF-compatible lines with the data going into the INFO column. The metadata header is also modified to include ##INFO lines describing these fields added to the INFO column. 
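
The idea behind bior\_compress in the list above can be sketched as follows. This is a rough illustration under stated assumptions, not the command's actual logic: consecutive rows are assumed to be grouped by a set of key columns, a column whose values agree keeps a single value, and differing values are joined with a pipe. The function name, the `key_cols` parameter, and the sample rows are hypothetical.

```python
from itertools import groupby


def compress_rows(rows, key_cols, delimiter="|"):
    """Merge consecutive rows that share the same values in key_cols."""
    compressed = []
    for _, group in groupby(rows, key=lambda r: tuple(r[i] for i in key_cols)):
        group = list(group)
        merged = []
        for col in range(len(group[0])):
            values = [row[col] for row in group]
            # Keep one value when all rows agree, otherwise join them all.
            if len(set(values)) == 1:
                merged.append(values[0])
            else:
                merged.append(delimiter.join(values))
        compressed.append(merged)
    return compressed


rows = [
    ["1", "100", "A", "geneX"],
    ["1", "100", "A", "geneY"],
    ["2", "5", "C", "geneZ"],
]
for row in compress_rows(rows, key_cols=[0, 1]):
    print("\t".join(row))
```

With the sample rows, the first two lines share chromosome and position, so they collapse into one line whose last column is "geneX|geneY".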

**Commands renamed**
+ bior\_index renamed to bior\_index\_catalog to signify that it is indexing a catalog.
+ bior\_bed\_to\_json renamed to bior\_bed\_to\_tjson
+ bior\_vcf\_to\_json renamed to bior\_vcf\_to\_tjson 

**User Guide**
+ Created a "Bior 2.1.0 User Guide" in GoogleDocs so external viewers can read it and select reviewers can edit and update it.
  https://docs.google.com/document/d/17abGGHKJU6BizDLanI58enTRXmRh7yLGt69GurObjxo/edit?pli=1 

**Bug Fixes**
+ bior\_index\_catalog now shows error if the path is a directory. Shows warning if the output index name does not have the same prefix as the catalog.
+ bior\_vcf\_to\_tjson - when processing files, if a given line contains a data type that is inconsistent with the ##INFO line in the header, it will just ignore that field now rather than crashing (example: data: "MQRankSum=..03" - the ##INFO line says the data type is Integer, but the value in the INFO column is messed up and then treated as a string)
+ bior\_annotate more gracefully handles blank or malformed lines in the config file. 

**RefData**
+ HTTP Downloader fixes - now handles symlinks and MD5sums
+ New data model - Updated refdata\_search command, download daemon 

**Support**
+ BioR is now officially using the support org (Remedy) to submit and track bugs and feature requests.
  http://helpdesk.mayo.edu/remedyessforms/remedyessincident.aspx?ext\_sys=Bioinformatics+Systems+Unit&ext\_event=BioR 



------------------------------------------------------------------
Version 0.0.3 - 2013-06-20
------------------------------------------------------------------

**Updates**
+ General:
	- Logs only written now when --log flag used (instead of every time) 
+ bior\_vep:
	- --all flag now outputs a valid JSON object (that starts with "{") that contains an array of values
	- Resolved timeouts that were slowing down VEP processing and possibly causing some out of memory errors 
+ bior\_annotate
	- Increased memory for bior\_annotate command from 512M to 2G to reduce the number of out-of-memory errors
	- In the future, we may need to have special handling of variants that are thousands of base-pairs long (which can still cause out-of-memory errors) 
+ bior\_bed\_to\_json
	- This new script is now available
      Example: 
```
$ cat /data4/bsi/BIOR/example.bed | bior_bed_to_json
#chrom	chromStart	chromEnd	name	score	strand	thickStart	thickEnd	itemRgb	blockCount	blockSizes	blockStarts	BED2JSON
chr22	1000	5000	cloneA	960	+	1000	5000	0	2	567,488,	0,3512	{"chrom":"chr22","_landmark":"22","chromStart":"1000","_minBP":1001,"chromEnd":"5000","_maxBP":5000,"name":"cloneA","score":"960","strand":"+","thickStart":"1000","thickEnd":"5000","itemRgb":"0","blockCount":"2","blockSizes":"567,488,","blockStarts":"+"}
chr22	2000	6000	cloneB	900	-	2000	6000	0	2	433,399,	0,3601	{"chrom":"chr22","_landmark":"22","chromStart":"2000","_minBP":2001,"chromEnd":"6000","_maxBP":6000,"name":"cloneB","score":"900","strand":"-","thickStart":"2000","thickEnd":"6000","itemRgb":"0","blockCount":"2","blockSizes":"433,399,","blockStarts":"-"}
```
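
The coordinate handling visible in the example output (BED start 1000 becomes \_minBP 1001, while the end stays 5000, and "chr22" becomes landmark "22") is consistent with converting BED's 0-based, half-open intervals to 1-based, inclusive ones. The sketch below reproduces just those three fields; the function name is hypothetical and stripping the "chr" prefix is inferred from the example rather than documented.

```python
def bed_to_golden(chrom: str, chrom_start: int, chrom_end: int) -> dict:
    """Derive _landmark, _minBP and _maxBP from the first three BED columns."""
    # Inferred from the example output: "chr22" is reported as landmark "22".
    landmark = chrom[3:] if chrom.startswith("chr") else chrom
    return {
        "_landmark": landmark,
        # BED starts are 0-based, so add 1 for a 1-based start.
        "_minBP": chrom_start + 1,
        # BED ends are exclusive, which equals the 1-based inclusive end.
        "_maxBP": chrom_end,
    }


print(bed_to_golden("chr22", 1000, 5000))
# prints: {'_landmark': '22', '_minBP': 1001, '_maxBP': 5000}
```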



------------------------------------------------------------------
Version 0.0.3 - 2013-05-29
------------------------------------------------------------------
**Updates**
+ bior\_annotate command updated:
	- New --config option allows a file to be used to limit the columns that are outputted, as well as avoids commands that do not need to be run if their output is not required.
	- same\_variant used instead of overlap for several catalogs (BGI, ESP, Hapmap, 1000Genomes), since overlap returned some large indels that were not the same variant
	- Passes GOLD VCF file (a vcf file where several variants have been verified against known catalog values)
	- 4 HapMap columns added to output 

+ bior\_vep command fix
	- Fix to avoid indefinite hang when variant\_effect\_predictor returns no output (now times out after 10 sec on a bad line instead of ending) 

+ bior\_compress command updated:
	- bior\_compress has new --reverse and --escape options. (Data can sometimes contain the delimiter character we use to separate values, so it is escaped when it occurs in the original data.)

+ bior\_lookup command updated:
	- bior\_lookup is no longer case sensitive. Note that the default behavior has changed. You can now search for a key with any case. To make case-sensitive, just use the appropriate flag (-s) 

+ SGE (Sun Grid Engine) fix for bior toolkit commands (avoids out of memory exceptions and using too much memory) 

+ $BIOR\_CATALOG variable updated:
	- $BIOR\_CATALOG has been switched to the new directory (from /data4/bsi/refdata-new/catalogs/ to /data4/bsi/catalogs/bior/) 

+ $USER\_CATALOG variable added:
	- There is a new $USER\_CATALOG variable that points to the user-created catalogs. It is a good idea to use this instead of hard-coded paths (especially since /data4 may move to /data5 soon). This should be used for all new catalogs that users (not the bior team) create



------------------------------------------------------------------
Version 0.0.3 - 2013-05-07
------------------------------------------------------------------
**New Commands**
+ bior\_annotate
	- New command that will replace the legacy TREAT annotation module. This is an early test version that has 56/60 of the final TREAT output columns. Documentation and validation against the "GOLD" file are not complete, but you can take it for a test-drive if you like. Please wait until the full production 2.0 release prior to replacing the legacy perl-based module. Be aware that SNPEff requires a lot of RAM and takes time to initialize prior to any data being processed.

+ bior\_compress
	- New command that compresses multiple rows to a single row. This is built-in to the bior\_annotate command, but can also be run independently after other toolkit commands such as bior\_overlap, bior\_lookup, and bior\_same\_variant that may cause multi-row output. 

**Updates**
+ bior\_vep - updated command so that it no longer is required to be the 1ST command in your pipeline. 

+ New dbSNP Catalog
	- Added new clinvar catalog as a requirement for TREAT. See http://bsiweb.mayo.edu/dbsnp for details. 

+ Updated OMIM Catalog
	- OMIM separates out the Title, Comments, and Disorders fields into multiple "continued" fields. The original rationale behind this is not clear, but the usability of the Catalog has been enhanced so that the "continued" fields have been removed and their respective content has been merged. For example, the fields "Disorders" and "Disorders, cont." have been merged to just be "Disorders".
	- For details, see: http://bsiweb.mayo.edu/omim 

+ Updated ESP Catalog
	- ESP reports allele frequencies as a percentage 0-100%. New fields have been added to the catalog that divide these percentages by 100 so that the frequencies are normalized along the same lines as HapMap and 1000 Genomes.
	- For details, see: http://bsiweb.mayo.edu/esp 

+ NEW Catalog space
	- The current space for catalogs in the RCF environment is /data4/bsi/refdata-new/catalogs/v1 . The $BIOR\_CATALOG environment variable also points to this space. In 2 weeks (May 21), this space will be going away. The new /data4/bsi/catalogs will take its place. We will update $BIOR\_CATALOG at that time to point to the new space. We highly encourage users to utilize $BIOR\_CATALOG instead of hard-coding the path to make the transition seamless.
	- The /data4/bsi/catalogs/user space is open to the BioR user community for publishing their own catalogs. It's recommended that you mirror the folder structure and naming conventions used for existing BioR built catalogs for your own catalogs. This will help everyone from a usability standpoint. 
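
As a minimal sketch of the recommendation to rely on $BIOR\_CATALOG instead of a hard-coded path, a script can resolve catalog locations from the environment. The helper name `catalog_path` is hypothetical, and the relative path is borrowed from the dbSNP index example elsewhere in these notes purely for illustration.

```python
import os


def catalog_path(relative_path, env=os.environ):
    """Build a catalog path from $BIOR_CATALOG rather than a hard-coded root.

    Hypothetical helper; `env` is injectable so the sketch runs anywhere.
    """
    root = env.get("BIOR_CATALOG")
    if not root:
        raise RuntimeError("BIOR_CATALOG is not set")
    return os.path.join(root, relative_path)


# Example with an explicit mapping standing in for the real environment.
print(catalog_path("dbSNP/137/00-All_GRCh37.tsv.bgz",
                   env={"BIOR_CATALOG": "/data4/bsi/catalogs"}))
```

A script written this way would keep working when the variable is repointed to the new space, because no catalog root is embedded in the code.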



------------------------------------------------------------------
Version 0.0.3 - 2013-03-13
------------------------------------------------------------------
**Updates - Catalogs**
+ The following 8 catalogs are based on UCSC annotation tracks required by the TREAT workflow. These are now available in the RCF environment and are fully compatible with the bior\_overlap command.
	- Regulation Catalog:
		* Track for regulatory regions from ORegAnno.
		* http://bsiweb.mayo.edu/regulation 

	- Conservation Catalog:
		* Vertebrate Multi-z alignment and conservation.
		* http://bsiweb.mayo.edu/conservation 

	- Enhancer Catalog:
		* Vista HMR-Conserved Non-coding Human Enhancers from LBNL.
		* http://bsiweb.mayo.edu/enhancer 

	- TFBS Catalog:
		* HMR Conserved Transcription Factor Binding Sites.
		* http://bsiweb.mayo.edu/tfbs 

	- TSS Catalog:
		* SwitchGear Genomics Transcription Start Sites.
		* http://bsiweb.mayo.edu/tss 

	- Blacklisted Region Catalog:
		* Blacklisted Region Browser extensible data.
		* http://bsiweb.mayo.edu/blacklisted-region 

	- Repeat Region Catalog:
		* Repeat Masker .out data.
		* http://bsiweb.mayo.edu/repeat-region 

	- Uniqueness/Alignability Catalog:
		* Duke excluded regions.
		* http://bsiweb.mayo.edu/uniquenessalignability 

+ NOTE: Although UCSC has 8,344 annotation track files, only 8 have been documented and vetted by the BioR team. These 8 catalogs are part of TREAT and are fully supported by the BioR Team. The remaining 8,336 UCSC annotation tracks have also been published as catalogs and are available "as is" with no support or documentation at this time. The upload is still running; these tracks will be available in the RCF environment at /data4/bsi/refdata-new/catalogs/beta/ucsc/hg19 by March 15.


------------------------------------------------------------------
Version 0.0.3 - 2013-03-11
------------------------------------------------------------------
**Updates - Catalogs**
+ BGI Catalog change
	- The fields calculated\_minor\_allele\_freq and calculated\_major\_allele\_freq have been removed from the catalog as of March 8th. These fields were calculated by the BioR team as a convenience and are not in the original LuCAMP\_200exomeFinal.maf.gz raw flat file.
	- Thanks to some helpful feedback from Greg Dougherty, it was brought to our attention that these calculated values are not correct due to ambiguity in the exact meaning of values in the LuCAMP\_200exomeFinal.genotype.gz file. We made the decision to pull these fields from the catalog. These changes are reflected in the rebuilt Catalog file and the Drupal page http://bsiweb.mayo.edu/bgi.
	- Please use the fields estimated\_minor\_allele\_freq and estimated\_major\_allele\_freq for allele frequencies. These fields are carried over verbatim from the original source data.


------------------------------------------------------------------
Version 0.0.3 - 2013-03-06
------------------------------------------------------------------
**New Catalogs**
+ OMIM genemap Catalog
	- This catalog is accessible with the bior\_lookup command using the pre-built index for the MIM\_Number field.
	- Please see the Drupal documentation page http://bsiweb.mayo.edu/omim for more details. 

+ HGNC Catalog:
	- This catalog is accessible with the bior\_lookup command using the pre-built indexes for either the Entrez\_Gene\_ID or UniProt\_ID fields.
	- Please see the Drupal documentation page http://bsiweb.mayo.edu/hgnc for more details. 

+ NOTE: Each of the above catalogs has numerous fields. Please let the team know at bior@mayo.edu if we are missing pre-built indexes for commonly used fields.



------------------------------------------------------------------
Version 0.0.3 - 2013-02-22
------------------------------------------------------------------
**New Commands**
+ Id Lookup commands
	- The BioR toolkit currently supports coordinate based search with the bior\_overlap and bior\_same\_variant commands. We are now introducing a new search capability to the BioR toolkit that searches for annotation by unique identifiers. This ID based search is implemented with the following 2 new commands:
		* bior\_lookup
			+ The bior\_lookup command searches for a given ID or string in a catalog. We currently have an index built for the following catalogs:
				- dbSNP RS identifier for the dbSNP catalog [$BIOR\_CATALOG/dbSNP/137/00-All\_GRCh37.tsv.bgz]
				- HGNC identifier for the NCBI Entrez Gene catalog [$BIOR\_CATALOG/NCBIGene/GRCh37\_p10/genes.tsv.bgz] 

		* bior\_index
			+ The bior\_index command builds a new index on a catalog that can be used by bior\_lookup for fast searches. This gives users the ability to create their own index if desired. 

+ The bior\_lookup and bior\_index commands are now available in the RCF environment.
	- The Drupal page http://bsiweb.mayo.edu/lookup-and-indexing has more information on getting familiar with the commands.
	- We appreciate any feedback you can provide. We would especially like to know of new indexes that we should build that would be useful to the BioR user community.

+ NOTE: Please log out of your RCF terminal and log back in to pick up the latest changes.
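
To make the relationship between the two commands concrete, here is a conceptual Python sketch of an ID index and a lookup against it. It is not how bior\_index or bior\_lookup are implemented: the function names, the in-memory dictionary, the assumption that each catalog row is tab-delimited with a JSON document in its last column, and the sample rows are all illustrative.

```python
import json
from collections import defaultdict


def build_index(lines, field):
    """Map each value of `field` to the positions of the rows that contain it.

    Hypothetical sketch: assumes tab-delimited rows with a JSON document
    in the last column, and keeps the whole index in memory.
    """
    index = defaultdict(list)
    for position, line in enumerate(lines):
        doc = json.loads(line.rstrip("\n").split("\t")[-1])
        if field in doc:
            index[str(doc[field])].append(position)
    return index


def lookup(lines, index, key):
    """Return the rows whose indexed field equals `key`, without a full scan."""
    return [lines[position] for position in index.get(key, [])]


# Made-up catalog rows; the IDs and coordinates are not real data.
catalog = [
    '1\t100\t200\t{"gene": "GENE1", "Entrez_Gene_ID": 11}',
    '2\t300\t400\t{"gene": "GENE2", "Entrez_Gene_ID": 22}',
]
idx = build_index(catalog, "Entrez_Gene_ID")
print(lookup(catalog, idx, "22"))
```

The index is built once per field, after which each lookup is a dictionary access rather than a pass over every row, which is the reason a pre-built index makes ID searches fast.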



------------------------------------------------------------------
Version 0.0.2 - 2013-02-21c
------------------------------------------------------------------
**New Catalog**
+ A new 1000 Genomes catalog is available for release 20110521. This catalog is a direct "out of the box" translation of the data in the original source files.
	- The calculated subpopulation frequencies are not available in this catalog, but are available in the older BioR 1.0 Legacy system (see http://bsiweb.mayo.edu/user-guide-bior-10-legacy ).
	- The BioR team would like to build a full-featured catalog that does include the subpopulation frequencies and retire the BioR 1.0 Legacy System. The timing of this new catalog is still up in the air and should be discussed in a future Stakeholder meeting.
	- Please visit the Drupal documentation page at http://bsiweb.mayo.edu/1000genomes for details.


------------------------------------------------------------------
Version 0.0.2 - 2013-02-21b
------------------------------------------------------------------
**New Catalog**
+ A new miRBase catalog is available for release 19.
	- Please visit the Drupal documentation page at http://bsiweb.mayo.edu/mirbase for details.


------------------------------------------------------------------
Version 0.0.2 - 2013-02-21a
------------------------------------------------------------------
**Bug fixes**
+ The bior\_overlap and bior\_same\_variant commands had an issue where a single variant with ZERO matches in a catalog would cause all subsequent variants to also show ZERO matches regardless of whether a match exists or not. See http://bsu-bugs/default.asp?1932 for details. 
+ The bior\_overlap and bior\_same\_variant commands did not correctly handle the input JSON column being a blank JSON document "{}". All variants after the one with the blank JSON column showed ZERO matches regardless of whether a match existed. See http://bsu-bugs/default.asp?1943 for details. 
+ The latest bug fixes are now available in the RCF environment.


------------------------------------------------------------------
Version 0.0.2 - 2013-02-11
------------------------------------------------------------------
**New Catalog**
+ COSMIC v63 Catalog:
	- Drupal documentation page http://bsiweb.mayo.edu/cosmic 


------------------------------------------------------------------
Version 0.0.2 - 2013-02-06
------------------------------------------------------------------
**New Catalog**
+ HapMap Allele Frequencies Catalog:
	- Drupal documentation page http://bsiweb.mayo.edu/hapmap
	- Please take special note of the LiftOver process section and how a very small percentage of variants had difficulty lifting over to GRCh37. 

+ BGI LuCAMP Allele Frequencies Catalog:
	- Drupal documentation page http://bsiweb.mayo.edu/bgi 


------------------------------------------------------------------
Version 0.0.2 - 2013-02-05b
------------------------------------------------------------------
**New Catalog**
+ A new ESP6500SI catalog is available for the NHLBI GO Exome Sequencing Project (ESP).
	- Please visit the Drupal documentation page at http://bsiweb.mayo.edu/esp for details.



------------------------------------------------------------------
Version 0.0.2 - 2013-02-05a
------------------------------------------------------------------
**New Command**
+ bior\_vep
	- Based on feedback from the Jan 21 stakeholder meeting, the team has finished development of the command named bior\_vep. The default behavior of the command has been changed to suit the needs of the TREAT/GenomeGPS workflow. Here's a snippet from the command help:
	- NOTE: By default this command will select a single transcript that has the worst possible outcome as predicted by Sift/PolyPhen. This differs from VEP's behavior of producing multiple transcripts per input variant.
	- To retain VEP's behavior of a single variant row becoming multiple rows, each row containing the variant plus transcript specific data, use the --all option.
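
The default single-transcript selection can be pictured with a short Python sketch. The notes above do not state how "worst" is ranked, so the rule used here is an assumption: the lowest SIFT score wins (lower SIFT scores indicate a more damaging prediction), with ties broken by the highest PolyPhen score. The function name, field names, and values are hypothetical.

```python
def pick_worst_transcript(transcripts):
    """Return the transcript predicted to be most damaging.

    Assumed ranking (not taken from the command help): lowest SIFT score
    first, ties broken by the highest PolyPhen score.
    """
    return min(transcripts, key=lambda t: (t["sift"], -t["polyphen"]))


# Made-up predictions for three transcripts of one variant.
transcripts = [
    {"id": "T1", "sift": 0.40, "polyphen": 0.10},
    {"id": "T2", "sift": 0.01, "polyphen": 0.95},
    {"id": "T3", "sift": 0.01, "polyphen": 0.60},
]
print(pick_worst_transcript(transcripts)["id"])  # T2
```

With the --all option the equivalent behavior would be to keep every entry in the list, one output row per transcript, instead of reducing to one.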

**Bug Fixes**
+ The bior\_vcf\_to\_json command had an issue handling VCF files with blank columns.
	- For details, see http://bsu-bugs/default.asp?1882 
+ The latest bior\_vep command and bug fix for bior\_vcf\_to\_json are now available in the RCF environment. Please take a look and give it a test drive. We appreciate any feedback you can provide.


------------------------------------------------------------------
Version 0.0.2 - 2013-01-10
------------------------------------------------------------------
**New Command**
+ bior\_vep
	- The team has finished development of a new command named bior\_vep. This command allows you to take variants from a VCF file and stream them into the Variant Effect Predictor (VEP) http://useast.ensembl.org/info/docs/variation/vep/index.html to get SIFT and PolyPhen functional prediction data. A more detailed breakdown of the command can be found in the help text included below.

**Updates**
+ One other noteworthy item is that we have removed the ".sh" suffix from all commands to improve usability. For example, "bior\_drill.sh" is now just "bior\_drill".
+ The bior\_vep command is now available in the RCF environment. Please take a look and give it a test drive. We appreciate any feedback you can provide.



------------------------------------------------------------------
Version 0.0.1 - 2012-11-28
------------------------------------------------------------------
**Notes**
+ There will be an outage of the BioR system this Friday, 11/28 from 12PM-5PM.
	- The purpose of this outage is to upgrade the server bior.mayo.edu to the newest release (1.2). Please review the Release Notes at http://bsiweb.mayo.edu/release-notes for more details on version 1.2.
+ Some bug fixes in version 1.2 have made legacy BioR client versions incompatible. Legacy BioR client versions will be required to upgrade to 1.2. 


