InhPMM
Revision as of 19:18, 21 September 2013
by Ralf Eggeling, André Gohr, Pierre-Yves Bourguignon, Edgar Wingender, and Ivo Grosse
Paper
The paper "Inhomogeneous Parsimonious Markov Models" has been accepted at ECMLPKDD 2013 (http://www.ecmlpkdd2013.org).
Runnable JAR
The application learns an InhPMM (or InhVOMM) from a training data set, returns two representations of the model, and computes predictions on a test data set (if provided). The input files are expected to be plain text and to contain sequences over the DNA alphabet, all of identical length. The sequence length must also be the same in the training and test files. Run by calling:
java -jar InhPMM.jar modelClass order logKappa ess dataTrain [dataTest]
The arguments have the following semantics:
name | comment | type |
modelClass | Determines the model class. Either parsimonious Markov model (type PMM) or variable order Markov model (type VOMM). | String |
order | The maximal depth of the (parsimonious) context trees. | Integer |
logKappa | The logarithm of the structure prior hyperparameter kappa. | Double |
ess | The equivalent sample size of the parameter prior. | Double (positive) |
dataTrain | Path to training data set. | String |
dataTest | Optional. Path to the test data set. If omitted, no predictive probabilities are computed. | String |
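
The input requirements and the argument order above can be checked before launching the JAR. The following is a minimal Python sketch and not part of Jstacs: the helper names `read_sequences` and `build_command` are hypothetical, and it assumes one sequence per line, the four letters A, C, G, T in either case, and a non-negative order. This page only specifies plain text, the DNA alphabet, and identical sequence lengths.

```python
from pathlib import Path

DNA = set("ACGT")  # assumption: only unambiguous DNA letters are allowed


def read_sequences(path):
    """Read one sequence per line (assumed layout); check alphabet and length."""
    lines = Path(path).read_text().splitlines()
    seqs = [ln.strip() for ln in lines if ln.strip()]
    if not seqs:
        raise ValueError(f"{path}: no sequences found")
    length = len(seqs[0])
    for i, seq in enumerate(seqs, start=1):
        if len(seq) != length:
            raise ValueError(f"{path}: sequence {i} has length {len(seq)}, expected {length}")
        # assumption: lower-case letters are accepted as well
        if not set(seq.upper()) <= DNA:
            raise ValueError(f"{path}: sequence {i} contains non-DNA symbols")
    return seqs


def build_command(model_class, order, log_kappa, ess,
                  data_train, data_test=None, jar="InhPMM.jar"):
    """Assemble the argument list in the order documented on this page."""
    if model_class not in ("PMM", "VOMM"):
        raise ValueError("modelClass must be PMM or VOMM")
    if not isinstance(order, int) or order < 0:  # assumption: order >= 0
        raise ValueError("order must be a non-negative integer")
    if ess <= 0:
        raise ValueError("ess must be positive")
    train_length = len(read_sequences(data_train)[0])
    cmd = ["java", "-jar", jar, model_class, str(order),
           str(float(log_kappa)), str(float(ess)), str(data_train)]
    if data_test is not None:
        # training and test sequences must share one length
        if len(read_sequences(data_test)[0]) != train_length:
            raise ValueError("training and test sequence lengths differ")
        cmd.append(str(data_test))
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the Python standard library. Example values such as `build_command("PMM", 2, 0.0, 1.0, "train.txt", "test.txt")` are purely illustrative and are not recommended settings.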
The application produces the following output:
- model.xml: Contains an XML representation of the learned model, which can be reloaded into Jstacs.
- model.dot: Contains a Graphviz representation of the learned model. Convert it to PDF by calling
dot -Tpdf -o model.pdf model.dot
(requires a local Graphviz installation, see http://www.graphviz.org).
- prediction.txt: Contains the log predictive probabilities of all sequences in the test data set given the learned model. It is only created if dataTest is set.
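
For downstream analysis of prediction.txt, a short Python sketch follows. It assumes the file holds whitespace-separated numbers (for instance one value per line) in the order of the test sequences. This page does not specify the exact layout, so the parsing may need adjusting. `load_log_predictions` and `summarize` are hypothetical helper names.

```python
import math
from pathlib import Path


def load_log_predictions(path="prediction.txt"):
    """Parse whitespace-separated log probabilities (assumed file layout)."""
    return [float(token) for token in Path(path).read_text().split()]


def summarize(log_probs):
    """Return the count, the summed and the mean log predictive probability."""
    if not log_probs:
        raise ValueError("no log probabilities given")
    # Summing logs gives the joint log probability of the test set,
    # assuming the sequences are scored independently of each other.
    total = math.fsum(log_probs)
    return {"n": len(log_probs), "total": total, "mean": total / len(log_probs)}
```

The mean log predictive probability is convenient for comparing runs with different settings (for example PMM against VOMM) on the same test file, since a higher value indicates a better fit to the test sequences.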
Download
- Runnable Jar (http://www.jstacs.de/downloads/InhPMM.jar): Exemplary command line application.
- Sources (http://www.jstacs.de/downloads/InhPMM-sources.zip): Building requires Jstacs 2.1.