FAQs: Difference between revisions

Latest revision as of 15:04, 2 February 2012

Handling data

Also have a look at the code example Loading data.

Q: How do I create an AlphabetContainer instance for DNA sequences?
A: AlphabetContainer container = new AlphabetContainer( new DNAAlphabet() );


Q: Why should I use the AlphabetContainer and not just a simple Alphabet instance?
A: Because for some data, e.g. phenotypic data, the alphabet is not the same at every position of the sequence. Hence, we strongly recommend always using getAlphabetLengthAt(int), e.g. when setting the size of an array.
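
To illustrate why the per-position query matters, here is a minimal stand-alone sketch in plain Java. It does not use Jstacs: PositionAlphabets is a hypothetical stand-in for a container with a different alphabet at each position, and only the method name getAlphabetLengthAt(int) is taken from the answer above.

```java
import java.util.List;

// Hypothetical stand-in for a container with one alphabet per position (not the Jstacs class).
final class PositionAlphabets {
    private final List<String[]> alphabets;

    PositionAlphabets(List<String[]> alphabets) {
        this.alphabets = alphabets;
    }

    // Number of positions covered by this container.
    int getLength() {
        return alphabets.size();
    }

    // Number of symbols allowed at the given position.
    int getAlphabetLengthAt(int pos) {
        return alphabets.get(pos).length;
    }

    // Allocates one count array per position, sized by that position's alphabet.
    int[][] emptyCounts() {
        int[][] counts = new int[getLength()][];
        for (int pos = 0; pos < counts.length; pos++) {
            counts[pos] = new int[getAlphabetLengthAt(pos)];
        }
        return counts;
    }

    public static void main(String[] args) {
        PositionAlphabets container = new PositionAlphabets(List.of(
                new String[] {"A", "C", "G", "T"},   // position 0: DNA symbols
                new String[] {"yes", "no"}));        // position 1: a binary phenotype
        int[][] counts = container.emptyCounts();
        System.out.println(counts[0].length + " " + counts[1].length); // prints "4 2"
    }
}
```

Sizing every array with the length of the first alphabet would give position 1 two unused entries here, and would overflow in the opposite case.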


Q: How can I create a simple sequence?
A: Use the create method of Sequence, e.g. Sequence.create( new AlphabetContainer( new DNAAlphabet() ), "ACGTACGT" );


Q: How can I load my own data?
A: If your sequences are stored in plain text or in FASTA format, you can create a new DataSet directly from the file.


Q: I wrote some sophisticated method using BioJava to load my data from a Genbank file/a database/somewhere else. How can I do something similar in Jstacs?
A: You can still use your existing method. Jstacs has an adapter for BioJava SequenceIterators.


Using existing models

Also have a look at the Code examples.

Q: Where do I find a list of the models currently implemented in Jstacs?
A: All generative models in Jstacs implement the TrainableStatisticalModel interface.
All discriminative models in Jstacs implement the DifferentiableStatisticalModel interface. You can find all existing implementations in the list of implementing classes of these two interfaces.


Q: I have decided on two TrainableStatisticalModels. How do I train them and classify new data?
A: You can create a new TrainSMBasedClassifier from your models and use its train and classify methods. If you only want to train a model on data, e.g. to sample new sequences, you can also use the train method of the TrainableStatisticalModel directly.
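
The idea behind a classifier built from two generative models can be sketched in plain Java. This is a conceptual illustration only, not the Jstacs implementation: Order0Model and TwoModelClassifier are hypothetical classes, the sequences are assumed to contain only the symbols A, C, G and T, and both classes are assumed to be equally likely a priori.

```java
// Conceptual sketch: one generative model per class, classification by the higher log-likelihood.
final class TwoModelClassifier {
    private final Order0Model[] models = {new Order0Model(), new Order0Model()};

    // Trains each model on the data of its own class.
    void train(String[] class0, String[] class1) {
        models[0].train(class0);
        models[1].train(class1);
    }

    // Returns the index of the class whose model explains the sequence better (equal priors assumed).
    int classify(String sequence) {
        return models[0].logLikelihood(sequence) >= models[1].logLikelihood(sequence) ? 0 : 1;
    }

    public static void main(String[] args) {
        TwoModelClassifier classifier = new TwoModelClassifier();
        classifier.train(new String[] {"AAAATAAA", "ATAATAAA"},
                         new String[] {"GCGCGGCC", "CCGGGCGC"});
        System.out.println(classifier.classify("AATA"));
        System.out.println(classifier.classify("GGCC"));
    }
}

// Hypothetical homogeneous model of order 0 over DNA: every position uses the same distribution.
final class Order0Model {
    private static final String SYMBOLS = "ACGT";
    private final double[] logProb = new double[SYMBOLS.length()];

    // Estimates symbol probabilities from counts, with a pseudocount of 1 per symbol.
    void train(String... sequences) {
        double[] counts = {1, 1, 1, 1};
        double total = counts.length;
        for (String s : sequences) {
            for (char ch : s.toCharArray()) {
                counts[SYMBOLS.indexOf(ch)]++;
                total++;
            }
        }
        for (int i = 0; i < counts.length; i++) {
            logProb[i] = Math.log(counts[i] / total);
        }
    }

    // Sums the log-probabilities of all symbols of the sequence.
    double logLikelihood(String sequence) {
        double sum = 0;
        for (char ch : sequence.toCharArray()) {
            sum += logProb[SYMBOLS.indexOf(ch)];
        }
        return sum;
    }
}
```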


Q: I have decided on two DifferentiableStatisticalModels. How do I train them and classify new data?
A: You can create a new GenDisMixClassifier from your models and use its train and classify methods.


Q: I have to decide which model is best for my classification task. How do I assess different model combinations or classifiers?
A: You can use the subclasses of ClassifierAssessment, e.g. KFoldCrossValidation. All ClassifierAssessments have a constructor that accepts an array of classifiers (or TrainableStatisticalModels). You can then use the assess method to assess these classifiers on the same data using a number of pre-defined performance measures.
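
For readers unfamiliar with k-fold cross validation, the data partitioning behind it can be sketched in plain Java. This is a generic illustration under stated assumptions, not the code of KFoldCrossValidation: KFoldSplit and its parameters are hypothetical names, and a simple shuffled round-robin assignment is assumed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

final class KFoldSplit {
    // Shuffles the indices 0..n-1 with a fixed seed and deals them round-robin into k folds.
    static List<List<Integer>> folds(int n, int k, long seed) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            indices.add(i);
        }
        Collections.shuffle(indices, new Random(seed));
        List<List<Integer>> folds = new ArrayList<>();
        for (int f = 0; f < k; f++) {
            folds.add(new ArrayList<>());
        }
        for (int i = 0; i < n; i++) {
            folds.get(i % k).add(indices.get(i));
        }
        return folds;
    }

    public static void main(String[] args) {
        List<List<Integer>> folds = folds(25, 10, 42L);
        for (int f = 0; f < folds.size(); f++) {
            // In iteration f, fold f is the test set and all other folds form the training set.
            System.out.println("fold " + f + ": test size " + folds.get(f).size());
        }
    }
}
```

Every classifier is trained and tested on the same folds, which is what makes the resulting performance measures comparable.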


Q: How can I store and load my model, classifier, ...?
A: All classes that implement Storable have a method toXML() that returns a StringBuffer containing the instance as XML. Such classes should also have a constructor with a single StringBuffer argument, which creates a new instance from a StringBuffer that contains an instance as XML. In addition, the class FileManager lets you write StringBuffers to the hard drive and read them back.
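
The pattern of a toXML() method paired with a StringBuffer constructor can be sketched in plain Java. This sketch does not use Jstacs: CoinModel is a hypothetical class with a single parameter, the XML is parsed by hand rather than with the library's tools, and java.nio.file.Files stands in for the file handling that FileManager provides.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class StorableSketch {

    // Hypothetical model with one parameter, following the toXML()/StringBuffer-constructor pattern.
    static final class CoinModel {
        private final double pHeads;

        CoinModel(double pHeads) {
            this.pHeads = pHeads;
        }

        // Counterpart of toXML(): restores an instance from its XML representation.
        CoinModel(StringBuffer xml) {
            String s = xml.toString();
            int start = s.indexOf("<pHeads>") + "<pHeads>".length();
            int end = s.indexOf("</pHeads>");
            this.pHeads = Double.parseDouble(s.substring(start, end));
        }

        // Writes the instance as XML.
        StringBuffer toXML() {
            return new StringBuffer("<CoinModel><pHeads>").append(pHeads).append("</pHeads></CoinModel>");
        }

        double getPHeads() {
            return pHeads;
        }
    }

    // Saves the model to a temporary file, loads it again, and removes the file.
    static CoinModel roundTripThroughFile(CoinModel model) {
        try {
            Path file = Files.createTempFile("coin-model-", ".xml");
            try {
                Files.write(file, model.toXML().toString().getBytes(StandardCharsets.UTF_8));
                String xml = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
                return new CoinModel(new StringBuffer(xml));
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        CoinModel restored = roundTripThroughFile(new CoinModel(0.3));
        System.out.println(restored.toXML());
    }
}
```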


Q: Why does Jstacs use XML to save instances?
A: Because it is human-readable.


Q: I use Gibbs sampling in a class extending AbstractMixtureTrainSM.

  • Q1: Why does the sampling create files in Java's temporary directory?
  • Q2: Will these files be deleted automatically once they are no longer needed?

A:

  • A1: These files are created to store the sampled parameters temporarily. Java's temporary directory is used to minimize network load if you work on a cluster.
  • A2: These files will be deleted if no reference to the mixture instance exists and the garbage collector is called. Therefore, it is recommended to call the garbage collector explicitly at the end of any application.
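
As a plain-Java illustration of the mechanics involved, the following sketch creates a file in Java's temporary directory and requests a garbage collection. It does not reproduce how Jstacs ties file deletion to garbage collection: TempFileSketch, the file prefix and the written content are hypothetical, and the file is deleted explicitly here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class TempFileSketch {

    // Writes the given text to a new file in Java's temporary directory (property java.io.tmpdir).
    static Path writeSampledParameters(String parameters) {
        try {
            Path file = Files.createTempFile("sampled-parameters-", ".tmp");
            Files.write(file, parameters.getBytes(StandardCharsets.UTF_8));
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Deletes the file and reports whether it existed.
    static boolean delete(Path file) {
        try {
            return Files.deleteIfExists(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Temporary directory: " + System.getProperty("java.io.tmpdir"));
        Path file = writeSampledParameters("0.25 0.25 0.25 0.25");
        System.out.println("Created " + file);
        // System.gc() is a request to the JVM; it does not guarantee that a collection runs immediately.
        System.gc();
        System.out.println("Deleted explicitly: " + delete(file));
    }
}
```

The temporary directory is usually on the local disk of the machine running the JVM, which is why writing there avoids network traffic on a cluster with shared home directories.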


Implementing new models

Q: How do I implement a new generative TrainableStatisticalModel?
A: Write an implementation of the TrainableStatisticalModel interface. For convenience, you can use the abstract AbstractTrainableStatisticalModel class with default implementations for many methods.


Q: How do I implement a new discriminative model?
A: Write an implementation of the DifferentiableStatisticalModel interface. For convenience, you can use the abstract AbstractDifferentiableStatisticalModel class with default implementations for many methods.


Q: How do I implement a model that can be trained generatively and discriminatively?
A: You can either extend AbstractTrainableStatisticalModel and additionally implement the DifferentiableStatisticalModel interface, or you extend the AbstractDifferentiableStatisticalModel and additionally implement the TrainableStatisticalModel interface.


Reporting bugs and requesting new features

Q: How do I report bugs I found in Jstacs?
A: Before reporting bugs in Jstacs, you should be sure it's not a feature ;-) You can discuss potential issues in the Jstacs forum. You can also have a look at the bugs that have already been reported. If you are sure that you found a new bug, please submit a new ticket to the Jstacs bug tracking system.


Q: How can I request new features?
A: You may use the Jstacs forum to discuss your request with other users. Most likely, we will join the discussion, too (We are somewhere out there!).
If you are convinced that the feature you request will be useful for all users of Jstacs, you are invited to submit a new ticket with your request.


Other

Q: The class UserTime does not work! Why?
A: The class UserTime uses native code. Hence, there are at least two possible causes:

  • A1: You have forgotten to set the Java library path: -Djava.library.path=...
  • A2: You have to compile the native code on your system.
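
To check the first possibility, a small plain-Java helper can test whether a native library file is present in one of the directories of java.library.path. This is a diagnostic sketch under stated assumptions: LibraryPathCheck is a hypothetical class, and the library name "UserTime" used as the default is a placeholder that may differ from the actual name of the native library.

```java
import java.io.File;

final class LibraryPathCheck {

    // Returns true if a file with the platform-specific library name exists in one of the directories.
    static boolean isOnPath(String libraryName, String libraryPath) {
        String fileName = System.mapLibraryName(libraryName); // e.g. a lib prefix and .so suffix on Linux
        for (String dir : libraryPath.split(File.pathSeparator)) {
            if (!dir.isEmpty() && new File(dir, fileName).isFile()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // "UserTime" is an assumed placeholder; pass the real library name as the first argument.
        String name = args.length > 0 ? args[0] : "UserTime";
        String path = System.getProperty("java.library.path", "");
        System.out.println("java.library.path = " + path);
        System.out.println(System.mapLibraryName(name) + " found: " + isOnPath(name, path));
    }
}
```

If the file is reported as missing, either the path passed with -Djava.library.path=... is wrong or the native code has not been compiled for your system yet.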