ntCard
ntCard: a streaming algorithm for cardinality estimation in genomics data
Project Description
ntCard is a streaming algorithm for estimating the frequencies of k-mers in genomics datasets. At its core, ntCard uses the ntHash algorithm to efficiently compute hash values for streamed sequences. It then samples the calculated hash values to build a reduced representation multiplicity table describing the sample distribution. Finally, it uses a statistical model to reconstruct the population distribution from the sample distribution.
Visit our GitHub repository for the latest version.
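The hash, sample, and tabulate steps described above can be illustrated with a minimal Python sketch. It is not the ntCard implementation, and it rests on several assumptions: `hashlib.blake2b` stands in for ntHash, k-mers are sampled by keeping only those whose hash has its low `sample_bits` bits equal to zero, and the population histogram is obtained by a naive scale-up rather than by ntCard's statistical model. The function names (`canonical`, `hash64`, `sample_multiplicities`, `estimate_histogram`) are hypothetical and chosen for this example only.

```python
import hashlib
from collections import Counter

# Translation table for reverse-complementing DNA strings.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    rc = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, rc)


def hash64(kmer):
    """64-bit hash of a k-mer. Assumption: blake2b is a stand-in for ntHash."""
    digest = hashlib.blake2b(kmer.encode("ascii"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def sample_multiplicities(sequences, k, sample_bits):
    """Count occurrences of sampled k-mers only.

    A k-mer is kept when the low `sample_bits` bits of its hash are zero,
    so roughly 1 in 2**sample_bits distinct k-mers is tracked.
    """
    mask = (1 << sample_bits) - 1
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) - set("ACGT"):
                continue  # skip k-mers containing N or other ambiguous bases
            h = hash64(canonical(kmer))
            if h & mask == 0:
                counts[h] += 1
    return counts


def estimate_histogram(counts, sample_bits):
    """Naive scale-up of the sample histogram (not ntCard's statistical model).

    Returns {multiplicity: estimated number of distinct k-mers with that multiplicity}.
    """
    scale = 1 << sample_bits
    sample_hist = Counter(counts.values())
    return {mult: n * scale for mult, n in sorted(sample_hist.items())}


if __name__ == "__main__":
    reads = ["ACGTACGT"]
    counts = sample_multiplicities(reads, k=4, sample_bits=0)
    print(estimate_histogram(counts, sample_bits=0))
```

With `sample_bits=0` every k-mer is kept and the histogram is exact, which makes the sketch easy to check on small inputs. Larger values of `sample_bits` reduce memory at the cost of a noisier estimate. Summing the values of the returned histogram gives a rough estimate of the number of distinct k-mers.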
Publications
- Hamid Mohamadi, Hamza Khan, and Inanc Birol. ntCard: a streaming algorithm for cardinality estimation in genomics data. Bioinformatics (2017) 33 (9): 1324-1330. doi:10.1093/bioinformatics/btw832
- Hamid Mohamadi, Justin Chu, Benjamin P Vandervalk, and Inanc Birol. ntHash: recursive nucleotide hashing. Bioinformatics (2016) 32 (22): 3492-3494. doi:10.1093/bioinformatics/btw397
Current Release
ntCard 1.0.2
Released Sep 04, 2018
Higher periodicity ntHash
More about this release…
- Get ntCard for all platforms
- ntCard 1.0.2
All Releases
Version | Released | Description | Compatibility | Licenses | Status |
---|---|---|---|---|---|
1.0.2 | Sep 04, 2018 | Higher periodicity ntHash. More about this release… | | BSD | final |
1.0.1 | Jan 29, 2018 | Change license to MIT License; fix bugs and improve ops. More about this release… | | BSD | final |
1.0.0 | Jan 11, 2017 | See ntCard GitHub page for details. More about this release… | | GPLv3 for non-commercial usage | final |