InhPMM

Latest revision as of 11:16, 19 November 2013

by Ralf Eggeling, André Gohr, Pierre-Yves Bourguignon, Edgar Wingender, and Ivo Grosse

Paper

The paper Inhomogeneous Parsimonious Markov Models has been published at ECMLPKDD 2013.

Runnable JAR

The application learns an InhPMM (or InhVOMM) from a training data set, returns two representations of the model, and, if a test data set is provided, computes predictions on it. The input files are expected to be plain text files containing sequences of identical length over the DNA alphabet. The sequence length in the training and test files should also be identical. Run by calling:

java -jar InhPMM.jar modelClass order logKappa ess dataTrain dataTest
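
Because the tool expects equal-length DNA sequences, it can be useful to check the input files before the call. The following is a minimal sketch in Python, not part of InhPMM itself. It assumes one sequence per line and the upper-case alphabet A, C, G, T; the page only states "plain text" and "DNA alphabet", so both are assumptions. The function names are hypothetical.

```python
from pathlib import Path

# Assumed alphabet: the page only says "DNA alphabet".
DNA = set("ACGT")


def check_sequences(lines):
    """Return the common sequence length, or raise ValueError.

    Assumes one sequence per line; blank lines are ignored.
    """
    seqs = [ln.strip() for ln in lines if ln.strip()]
    if not seqs:
        raise ValueError("no sequences found")
    length = len(seqs[0])
    for i, seq in enumerate(seqs, start=1):
        if len(seq) != length:
            raise ValueError(
                f"sequence {i} has length {len(seq)}, expected {length}"
            )
        bad = set(seq) - DNA
        if bad:
            raise ValueError(
                f"sequence {i} contains non-DNA symbols: {sorted(bad)}"
            )
    return length


def check_file(path):
    """Run check_sequences on the lines of a text file."""
    return check_sequences(Path(path).read_text().splitlines())
```

Calling check_file on both the training and the test file and comparing the two returned lengths covers the requirement that the sequence lengths match across files.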

The arguments have the following semantics:

  • modelClass (String): Determines the model class. Either parsimonious Markov model (specified as PMM) or variable order Markov model (specified as VOMM).
  • order (Integer): The maximal depth of the (parsimonious) context trees.
  • logKappa (Double): The logarithm of the structure prior hyperparameter kappa.
  • ess (Double, positive): The equivalent sample size of the parameter prior.
  • dataTrain (String): Path to the training data set.
  • dataTest (String): Optional. Path to the test data set. If omitted, no predictive probabilities are computed.
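
The argument order above can be sketched as a small Python helper that assembles the command line. This is an illustration only: the helper name, the example values (order 2, logKappa -5.0, ess 1.0), and the file names train.txt and test.txt are hypothetical, and the literal values PMM and VOMM follow this page's description of modelClass.

```python
def build_command(model_class, order, log_kappa, ess,
                  data_train, data_test=None, jar="InhPMM.jar"):
    """Assemble the argument list for 'java -jar InhPMM.jar ...'.

    The positional order follows the usage line on this page.
    """
    if model_class not in ("PMM", "VOMM"):
        raise ValueError("modelClass must be PMM or VOMM")
    if order < 0:
        raise ValueError("order is a tree depth and cannot be negative")
    if ess <= 0:
        raise ValueError("ess must be positive")
    cmd = ["java", "-jar", jar, model_class,
           str(order), str(log_kappa), str(ess), data_train]
    # dataTest is optional; without it no predictions are computed.
    if data_test is not None:
        cmd.append(data_test)
    return cmd


# Hypothetical example values, not recommended settings.
example = build_command("PMM", 2, -5.0, 1.0, "train.txt", "test.txt")
```

The resulting list can be passed to subprocess.run(example, check=True) from the Python standard library, which requires a local Java installation and the JAR in the working directory.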

The application produces the following output:

  • model.xml: Contains an XML representation of the learned model, which can later be reloaded into Jstacs.
  • model.dot: Contains a Graphviz representation of the learned model. Convert it to PDF by calling dot -Tpdf -o model.pdf model.dot (requires a local Graphviz installation).
  • prediction.txt: Contains a list of log predictive probabilities of all sequences in the test data set given the learned model. This file is only created if dataTest is set.
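
The log predictive probabilities in prediction.txt can be aggregated into a single score for the test set. The sketch below is hedged: the exact file layout is not documented on this page, so it assumes one numeric value per line. Summing the values gives the log of the product of the per-sequence probabilities, that is, the joint log predictive probability if the test sequences are treated as independent. The function names are hypothetical.

```python
def read_log_probs(lines):
    """Parse one log probability per non-blank line (assumed layout)."""
    return [float(ln) for ln in lines if ln.strip()]


def summarize(log_probs):
    """Return (total, mean) of the log predictive probabilities."""
    if not log_probs:
        raise ValueError("no log probabilities given")
    total = sum(log_probs)
    return total, total / len(log_probs)
```

With a real output file, the values could be read via open("prediction.txt") and passed to read_log_probs directly, since a file object iterates over its lines.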

Download