ABySS 2.0.0 (Sep 01, 2016)


This release introduces a new Bloom filter assembly mode that enables large genome assemblies with minimal memory (e.g. 34 GB for H. sapiens with 76X coverage bfc-corrected reads). Bloom filter assemblies are currently less contiguous than the default (MPI) assembly mode but are still of high quality (e.g. 3.5 Mbp vs. 4.8 Mbp scaffold NG50 for H. sapiens). Bloom filter assembly mode is enabled by adding three 'abyss-pe' parameters (B = *Bloom filter size*, H = *number of Bloom filter hash functions*, kc = *k-mer coverage threshold*). See 'README.md' for an example. This release also updates several 'abyss-pe' parameter defaults to be more suitable for large genome assemblies with recent Illumina data. In addition, ABySS 2.0.0 includes minor usability improvements for 'abyss-sealer' and removes an unnecessary build dependency on sqlite3.
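
As a rough illustration of the three parameters named above, the following Python sketch assembles an 'abyss-pe' command line with B, H and kc set. The helper name `bloom_mode_command` is hypothetical, and the values shown (assembly name, k, read file names, B=100M, H=3, kc=3) are placeholders chosen for the example, not recommended settings; 'README.md' remains the authoritative reference.

```python
import shlex


def bloom_mode_command(name, k, reads, bloom_size, num_hashes, kmer_cov):
    """Build an 'abyss-pe' command line that sets B, H and kc.

    All argument values are supplied by the caller; nothing here is a
    recommended default.
    """
    args = [
        "abyss-pe",
        f"name={name}",
        f"k={k}",
        f"in={' '.join(reads)}",   # space-separated list of read files
        f"B={bloom_size}",         # Bloom filter size
        f"H={num_hashes}",         # number of Bloom filter hash functions
        f"kc={kmer_cov}",          # k-mer coverage threshold
    ]
    # shlex.join quotes any argument containing spaces (Python 3.8+).
    return shlex.join(args)


# Placeholder values, for illustration only.
cmd = bloom_mode_command("ecoli", 64, ["reads1.fa", "reads2.fa"], "100M", 3, 3)
print(cmd)
# abyss-pe name=ecoli k=64 'in=reads1.fa reads2.fa' B=100M H=3 kc=3
```

The resulting string can be pasted into a shell or passed to a job scheduler. Omitting B, H and kc leaves the default (MPI) assembly mode in effect.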

For additional information about this project, please visit the overview page.

Available downloads

Source code

For all platforms (1.1 MB)

Change log

2016-08-30 Ben Vandervalk <benv@bcgsc.ca>

* Release version 2.0.0
* New Bloom filter mode for assembly => assemble large genomes
with minimal memory (e.g. 34G for H. sapiens)
* Update param defaults for modern Illumina data
* Make sqlite3 an optional dependency

* New 'compare' command for bitwise comparison of Bloom filters
(thanks to @bschiffthaler!)
* New 'kmers' command for printing k-mers that match a Bloom filter
(thanks to @bschiffthaler!)

* New preunitig assembler that uses Bloom filter
* Add 'B' param (Bloom filter size) to 'abyss-pe' command to enable
Bloom filter mode
* See README.md and '--help' for further instructions

* Mask scaftigs shorter than 50bp with 'N's (short scaftigs
were causing problems with NCBI submission)
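
The masking rule above can be sketched in a few lines of Python. This is a minimal illustration and not the code shipped with ABySS: it assumes a scaftig is a maximal run of non-'N' characters within a scaffold sequence, and the function name `mask_short_scaftigs` is hypothetical.

```python
import re


def mask_short_scaftigs(scaffold, min_len=50):
    """Replace every run of non-N bases shorter than min_len with 'N's.

    The sequence length is preserved, so scaffold coordinates stay valid.
    """
    def mask(match):
        run = match.group(0)
        return "N" * len(run) if len(run) < min_len else run

    # A scaftig is taken to be a maximal run of characters other than N/n.
    return re.sub(r"[^Nn]+", mask, scaffold)


# A 60 bp scaftig, a 4 bp scaftig (masked), and a 50 bp scaftig (kept).
scaffold = "A" * 60 + "NNN" + "ACGT" + "NN" + "C" * 50
masked = mask_short_scaftigs(scaffold)
print(masked == "A" * 60 + "N" * 9 + "C" * 50)
```

After masking, the short scaftig merges into the neighbouring gaps, so the scaffold contains only scaftigs of at least 50 bp.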

* Update default parameter values for modern Illumina data
* Change 'l=k' => 'l=40'
* Change 's=200' => 's=1000'
* Change 'S=s' => 'S=1000-10000' (do a param sweep of 'S')
* Use 'DistanceEst --mean' for scaffolding stage, instead of
the default '--mle'

* New '--max-gap-length' ('-G') option to replace unintuitive
'--max-frag'; use of '--max-frag' is now deprecated
* Require user to explicitly specify Bloom filter size (e.g.
* Report false positive rate (FPR) when building/loading Bloom
filters
* Don't require input FASTQ files when using pre-built Bloom
filter files
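
For readers unfamiliar with how a Bloom filter false positive rate relates to filter size, the Python sketch below uses the standard textbook estimates. It assumes a plain Bloom filter with independent, uniformly distributed hash functions; the exact calculation used by the ABySS tools is not described in these notes and may differ. The function names are hypothetical.

```python
import math


def expected_fpr(num_bits, num_hashes, num_items):
    """Textbook estimate of the FPR after inserting num_items distinct
    elements into a filter of num_bits bits using num_hashes hashes."""
    return (1.0 - math.exp(-num_hashes * num_items / num_bits)) ** num_hashes


def observed_fpr(bits_set, num_bits, num_hashes):
    """FPR estimated from the occupancy of an already built filter:
    a query is a false positive when all of its hashed bits are set."""
    return (bits_set / num_bits) ** num_hashes


# Illustrative numbers only: a 1 GB filter (8e9 bits), one hash function,
# and 1e9 distinct k-mers.
print(round(expected_fpr(8e9, 1, 1e9), 4))

# A filter with a quarter of its bits set and two hash functions.
print(observed_fpr(250, 1000, 2))
```

Both estimates fall as the filter grows, which is one reason an explicitly chosen filter size matters: an undersized filter yields a high FPR.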

* Fix bug causing output read 2 file to be empty
* New percent sequence identity options ('-x' and '-X')
* New '--alt-paths-mode' option to output alternate connecting
paths between read pairs

* Fix documentation of ABYSS and abyss-pe parameters
(thanks to @nsoranzo!)