Supplementary materials
Manuscript:
Griffith OL, Gao BJ, Bilenky M, Prychyna Y, Ester M and Jones SJM. KiWi: A scalable subspace clustering algorithm for the identification of coregulated genes from massive gene expression datasets. Submitted to Bioinformatics, 20 Apr 2007.
A technical paper on the KiWi algorithm was presented at the KDD conference:
Gao BJ, Griffith OL, Ester M, Jones SJ. 2006. Discovering significant OPSM subspace clusters in massive gene expression data. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (Philadelphia, PA, USA, August 20-23, 2006). KDD '06. ACM Press, New York, NY, 922-928.
Future developments of the KiWi algorithm will be made available here:
http://www.cs.sfu.ca/~bgao/personal/
- Supplementary figures
- KiWi source
- KiWi 1.0 executable
- KiWi tutorial
- Test dataset (breast cancer data from Hedenfalk et al. 2001. Gene-expression profiles in hereditary breast cancer. NEJM. 344:539–548)
Expression datasets
Dataset | # of rows (genes/features) | # of columns (conditions)
 | 12332 | 1640
 | 20113 | 1026
 | 730 | 16
*Due to its large size, expO_exp_data.txt.zip was also split into two parts, each zipped separately. To recreate the full file, unzip each part and then concatenate the parts in order (at the command line: cat expO_exp_data_pt1.txt expO_exp_data_pt2.txt > expO_exp_data.txt).
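On systems without cat, the same concatenation can be sketched in Python. This is a minimal sketch, assuming both parts have already been unzipped into the working directory under the file names given in the note above. The helper name concatenate_parts is hypothetical and is not part of the KiWi distribution.

```python
import shutil
from pathlib import Path


def concatenate_parts(part_paths, output_path):
    """Append each part, in the order given, to output_path.

    Hypothetical helper (not part of KiWi). Files are copied as raw
    bytes in chunks, so large parts are never loaded fully into memory.
    """
    with open(output_path, "wb") as out:
        for part in part_paths:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
    return Path(output_path)


if __name__ == "__main__":
    # File names taken from the note above; part 1 must come before part 2.
    parts = ["expO_exp_data_pt1.txt", "expO_exp_data_pt2.txt"]
    if all(Path(p).exists() for p in parts):
        concatenate_parts(parts, "expO_exp_data.txt")
```

The order of the parts matters: reversing them would put the second half of the expression matrix ahead of the first half.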
KiWi Results (sig files)
Dataset | k | w | Min genes | Min dimensions | # clusters
 | 30000 | 45 | 2 | 10 | 13412
 | 100000 | 18 | 2 | 10 | 23555
 | 100000 | 16 | 2 | 6 | 212532