DSHMM
Revision as of 10:49, 16 December 2010
by Michael Seifert, Marc Strickert, Alexander Schliep, and Ivo Grosse
Description
Motivation
Changes in gene expression levels play a central role in tumors. Additional information about the distribution of gene expression levels and distances between adjacent genes on chromosomes should be integrated into the analysis of tumor expression profiles.
Results
We use a Hidden Markov Model with scaled transition matrices (SHMM) to incorporate the chromosomal distances between adjacent genes into the identification of differentially expressed genes in breast cancer. We train the SHMM by integrating prior knowledge about the potential distributions of expression levels of differentially expressed and unchanged genes in tumors. We find that using this information and modeling the distances between adjacent genes substantially improves the identification of differentially expressed genes compared with existing methods. This performance benefit is accompanied by the identification of genes well known to be associated with breast cancer, which suggests that SHMMs could be applied to screen other tumor expression profiles.
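To illustrate the general idea of a distance-dependent transition matrix, the following is a minimal sketch and not the scaling scheme of the manuscript, which this page does not specify. It assumes three hypothetical states, a made-up base transition matrix, and an exponential decay with a hypothetical length scale: as the distance between two adjacent genes grows, each row is blended toward a uniform distribution, so distant neighbours are less strongly coupled than close ones.

```python
import math

# Hypothetical state order, chosen only for this illustration.
STATES = ("underexpressed", "unchanged", "overexpressed")

# Hypothetical base transition matrix for directly neighbouring genes.
# These numbers are assumptions, not values from the manuscript.
BASE = [
    [0.90, 0.08, 0.02],
    [0.05, 0.90, 0.05],
    [0.02, 0.08, 0.90],
]


def scaled_transition_matrix(base, distance_bp, length_scale_bp=1_000_000.0):
    """Blend each row of `base` toward a uniform row as distance grows.

    The exponential weighting and `length_scale_bp` are illustrative
    assumptions; the SHMM described on this page may scale differently.
    """
    if distance_bp < 0:
        raise ValueError("distance_bp must be non-negative")
    n = len(base)
    # Weight of the base matrix: 1 at distance 0, approaching 0 far away.
    weight = math.exp(-distance_bp / length_scale_bp)
    uniform = 1.0 / n
    # A convex combination of two stochastic rows is again a stochastic row.
    return [[weight * p + (1.0 - weight) * uniform for p in row] for row in base]


if __name__ == "__main__":
    # Print the self-transition probabilities for a few gene distances.
    for distance in (0, 500_000, 5_000_000):
        matrix = scaled_transition_matrix(BASE, distance)
        print(distance, [round(matrix[i][i], 3) for i in range(len(STATES))])
```

Because every scaled row is a convex combination of two probability distributions, the rows still sum to one, so the result can be used wherever an ordinary transition matrix is expected.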
Paper
The paper "Exploiting prior knowledge and gene distances in the analysis of tumor expression profiles with extended Hidden Markov Models" has been submitted to Bioinformatics.
Download
- SHMMs will be available in the next Jstacs release.
- Supplementary data (http://www.jstacs.de/downloads/supplementary_data.zip): contains the breast cancer gene expression data set and the breast cancer gene copy number data set (Pollack et al., 2002) analyzed in the manuscript, together with the predictions and scores of the compared methods.