by Ralf Eggeling, Ivo Grosse, and Mikko Koivisto.
== Runnable JAR ==
[https://www.cs.helsinki.fi/u/eggeling/PCTLearn/PCTLearn.jar PCTLearn] requires a plain text file as input. The file may contain the basic Latin characters a-z and A-Z (case sensitive) and the Arabic numerals 0-9. The number of distinct characters in the input file determines the alphabet size for PCT optimization.
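The character rule above can be checked before running the tool. The following is a minimal sketch, not part of PCTLearn: the function name <code>alphabet_of</code> and the sample text are hypothetical, and skipping whitespace such as line breaks is an assumption of this sketch rather than documented behaviour.

```python
import string

# Characters the page allows in the input: a-z, A-Z (case sensitive), 0-9.
ALLOWED = set(string.ascii_letters + string.digits)


def alphabet_of(text):
    """Return the sorted list of distinct symbols found in text.

    Whitespace (for example line breaks) is skipped; this is an assumption
    of the sketch, not documented PCTLearn behaviour.
    """
    symbols = {ch for ch in text if not ch.isspace()}
    illegal = symbols - ALLOWED
    if illegal:
        raise ValueError(f"unsupported characters: {sorted(illegal)}")
    return sorted(symbols)


# Hypothetical sample: upper and lower case count as different symbols.
sample = "ACGTACGT\nacgt\n"
alphabet = alphabet_of(sample)
print(len(alphabet), "".join(alphabet))  # prints: 8 ACGTacgt
```

Because the input is case sensitive, a file that mixes "A" and "a" yields a larger alphabet than one that uses a single case throughout.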
The application has one mandatory argument and several optional arguments.
A shorter list of arguments can be provided, in which case all missing arguments take their default values.
Run with
<code>java -jar PCTLearn.jar inputFile maximalDepth scoringFunction memoization pruning fineBound memoLimit lookaheadDepth</code>
where the arguments have the following semantics:
<table border=0 cellpadding=10 align="center">
<tr>
<td>name</td>
<td>type</td>
<td>default</td>
<td>comment</td>
</tr>
<tr><td colspan=4><hr></td></tr>
<tr>
<td><font color="green">inputFile</font></td>
<td>String</td>
<td>--</td>
<td>The location of a text file containing the input data. This argument is mandatory.</td>
</tr>
<tr>
<td><font color="green">maximalDepth</font></td>
<td>Integer</td>
<td>2</td>
<td>The maximal depth of the learned PCT.</td>
</tr>
<tr>
<td><font color="green">scoringFunction</font></td>
<td>String</td>
<td>BIC</td>
<td>The scoring function to use. Permitted values are "BIC" and "AIC".</td>
</tr>
<tr>
<td><font color="green">memoization</font></td>
<td>Boolean</td>
<td>TRUE</td>
<td>Enables memoization.</td>
</tr>
<tr>
<td><font color="green">pruning</font></td>
<td>Boolean</td>
<td>TRUE</td>
<td>Enables pruning.</td>
</tr>
<tr>
<td><font color="green">fineBound</font></td>
<td>Boolean</td>
<td>TRUE</td>
<td>Uses the fine upper bound instead of the coarse one. Ignored if pruning is set to FALSE.</td>
</tr>
<tr>
<td><font color="green">memoLimit</font></td>
<td>Integer</td>
<td>1</td>
<td>Memoization limit that stops storing subtrees with the given distance from the leaves. Ignored if memoization is set to FALSE.</td>
</tr>
<tr>
<td><font color="green">lookaheadDepth</font></td>
<td>Integer</td>
<td>1</td>
<td>The lookahead depth to use. Ignored if pruning is set to FALSE.</td>
</tr>
</table>
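Since the arguments are given by position, changing a later argument means spelling out every argument before it. The sketch below builds such a command line from the defaults in the table. It is a hypothetical helper, not shipped with PCTLearn: the name <code>build_command</code> and the file name <code>data.txt</code> are illustrative, and values are passed as strings so that Booleans reach the tool exactly as written in the table.

```python
# Optional positional arguments in the order used on this page, with defaults.
# inputFile is mandatory and therefore not listed here.
OPTIONAL_ARGS = [
    ("maximalDepth", "2"),
    ("scoringFunction", "BIC"),
    ("memoization", "TRUE"),
    ("pruning", "TRUE"),
    ("fineBound", "TRUE"),
    ("memoLimit", "1"),
    ("lookaheadDepth", "1"),
]


def build_command(input_file, jar="PCTLearn.jar", **overrides):
    """Return the argument list for one PCTLearn run (hypothetical helper)."""
    names = [name for name, _ in OPTIONAL_ARGS]
    unknown = set(overrides) - set(names)
    if unknown:
        raise TypeError(f"unknown arguments: {sorted(unknown)}")
    # Everything up to the last overridden position must be written out;
    # arguments after it are omitted and fall back to their defaults.
    last = max((names.index(name) for name in overrides), default=-1)
    args = [
        str(overrides.get(name, default))
        for name, default in OPTIONAL_ARGS[: last + 1]
    ]
    return ["java", "-jar", jar, input_file] + args


# Only the scoring function changes, so maximalDepth is filled with its default.
print(" ".join(build_command("data.txt", scoringFunction="AIC")))
# prints: java -jar PCTLearn.jar data.txt 2 AIC
```

The returned list can be handed to <code>subprocess.run(cmd, check=True)</code> to start the tool, assuming <code>java</code> is on the path and the JAR is in the working directory.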
The tool writes statistics about the optimization, such as the optimal score, the number of visited nodes, and the required running time, to stdout.
In addition, it creates (i) a Graphviz file of the learned PCT structure and (ii) a file with the conditional probability parameters (MLE) for each leaf.
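For readers unfamiliar with the abbreviation, a maximum likelihood estimate (MLE) of a conditional distribution is the relative frequency of each symbol among the observations that fall into a given context. The sketch below illustrates only that arithmetic. The function name <code>mle_distribution</code> and the sample data are hypothetical, and which observations belong to which leaf is decided by the learned tree, which is not modelled here.

```python
from collections import Counter


def mle_distribution(observed):
    """Relative frequency of each symbol among the observed symbols."""
    counts = Counter(observed)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no observations for this leaf")
    return {symbol: count / total for symbol, count in counts.items()}


# Hypothetical symbols observed after the contexts covered by one leaf.
print(mle_distribution("AACG"))  # prints: {'A': 0.5, 'C': 0.25, 'G': 0.25}
```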
Revision as of 11:33, 14 December 2017