MeDIP-HMM

Latest revision as of 08:11, 24 March 2021

by Michael Seifert, Sandra Cortijo, Maria Colome-Tatche, Frank Johannes, Francois Roudier, and Vincent Colot

Description

Motivation

Methylation of cytosines in DNA is an important epigenetic mechanism involved in transcriptional regulation and preservation of genome integrity in a wide range of eukaryotes. Immunoprecipitation of methylated DNA followed by hybridization to genomic tiling arrays (MeDIP-chip) is a cost-effective and sensitive method for methylome analyses. However, existing bioinformatics methods only enable a binary classification into unmethylated and methylated genomic regions, which limits biological interpretations. Indeed, DNA methylation levels can vary substantially within a given DNA fragment depending on the number and degree of methylated cytosines. Therefore, a method for the identification of more than two methylation states is highly desirable.

Results

Here, we present a three-state Hidden Markov Model (MeDIP-HMM) for analyzing MeDIP-chip data. MeDIP-HMM utilizes a higher-order state-transition process that improves the modeling of spatial dependencies between chromosomal regions, allows a simultaneous analysis of replicates, and enables a differentiation between unmethylated, methylated, and highly methylated genomic regions. We train MeDIP-HMM using a Bayesian Baum-Welch algorithm that integrates prior knowledge on methylation levels. We apply MeDIP-HMM to the analysis of the Arabidopsis root methylome and systematically investigate the benefit of using higher-order HMMs. We also perform an in-depth comparison with existing methods and demonstrate the value of MeDIP-HMM by comparing its predictions to current knowledge on the Arabidopsis methylome. We find that MeDIP-HMM is a fast and precise method for the analysis of DNA methylation data that enables the identification of distinct DNA methylation levels. Finally, we provide evidence for the general applicability of MeDIP-HMM by analyzing promoter DNA methylation data obtained for chicken.
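To make the idea of a three-state HMM over tiling-array probes more concrete, the following is a minimal sketch and not the MeDIP-HMM implementation shipped in the JAR file. It makes several simplifying assumptions: a first-order state-transition process instead of the higher-order one described above, Gaussian emissions, replicates treated as conditionally independent given the state, and hand-picked parameters rather than parameters trained by a Bayesian Baum-Welch algorithm. All names and numbers (`MEANS`, `SDS`, `START`, `TRANS`, `viterbi`) are hypothetical and chosen for illustration only.

```python
import math

STATES = ("unmethylated", "methylated", "highly methylated")

# Hypothetical parameters for illustration only; these are NOT trained values.
MEANS = (0.0, 1.5, 3.0)   # assumed state-specific mean signal
SDS = (0.5, 0.5, 0.5)     # assumed state-specific standard deviations
START = (0.6, 0.3, 0.1)   # assumed initial state probabilities
TRANS = (                 # assumed first-order transition probabilities
    (0.90, 0.08, 0.02),
    (0.10, 0.80, 0.10),
    (0.02, 0.08, 0.90),
)


def log_gauss(x, mean, sd):
    """Log density of a univariate Gaussian."""
    return -0.5 * math.log(2.0 * math.pi * sd * sd) - (x - mean) ** 2 / (2.0 * sd * sd)


def log_emission(values, state):
    """Joint log-likelihood of all replicate values of one probe.

    Replicates are assumed to be conditionally independent given the state.
    """
    return sum(log_gauss(v, MEANS[state], SDS[state]) for v in values)


def viterbi(probes):
    """Return the most likely state label for each probe.

    probes: list of tuples, one tuple of replicate measurements per probe,
    given in chromosomal order.
    """
    if not probes:
        return []
    n = len(STATES)
    score = [math.log(START[s]) + log_emission(probes[0], s) for s in range(n)]
    back = []
    for obs in probes[1:]:
        prev = score
        pointers = []
        score = []
        for s in range(n):
            # Best predecessor state for entering state s at this probe.
            best = max(range(n), key=lambda r: prev[r] + math.log(TRANS[r][s]))
            pointers.append(best)
            score.append(prev[best] + math.log(TRANS[best][s]) + log_emission(obs, s))
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [STATES[s] for s in path]
```

Calling `viterbi([(0.1, -0.2), (1.4, 1.6), (3.1, 2.9)])` decodes three probes with two replicates each. A higher-order model would condition each transition on more than one preceding state, which this sketch does not attempt.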

Paper

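The paper MeDIP-HMM: Genome-wide identification of distinct DNA methylation states from high-density tiling arrays (http://bioinformatics.oxfordjournals.org/content/early/2012/09/17/bioinformatics.bts562.abstract) has been published in Bioinformatics (http://bioinformatics.oxfordjournals.org/).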

Download

  • MeDIP-HMM (https://imbcloud.medizin.tu-dresden.de/sharing/iWr3Uix4A): A ZIP file containing a JAR file for analyzing methylome data sets by MeDIP-HMMs. This file also contains the Arabidopsis root methylome and the chicken data sets considered in our study.

Related Projects

  • ARHMM: integrating local chromosomal dependencies into the analysis of tumor expression profiles
  • DSHMM: exploiting prior knowledge and gene distances in the analysis of tumor expression profiles
  • SHMM: utilizing gene-pair orientations for improved analysis of ChIP-chip promoter array data
  • PHHMM: improved analysis of Array-CGH data
  • HMM Book (https://sites.google.com/site/mseifertweb/hmm-book): Hidden Markov Models with Applications in Computational Biology

Follow Me

  • Personal Homepage (https://sites.google.com/site/michaelseiferthmm/home)