# Quick start: Jstacs in a nutshell

This section is for the impatient who would like to start using Jstacs right away without reading the complete cookbook. If you do not belong to this group, you can skip this section.

Here, we provide code snippets for simple tasks that are likely to be needed frequently, including reading a data set and creating models and classifiers. In addition, some of the basic code examples in the section Recipes may also serve as a basis for a quick start into Jstacs.

For reading a FastA file, we call the constructor of DNADataSet with the (absolute or relative) path to the FastA file. Subsequently, we can determine the alphabets used by the resulting data set, here called `ds`.

AlphabetContainer con = ds.getAlphabetContainer();

For more detailed information about data sets, sequences, and alphabets, we refer to section Starter: Data handling.
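To make the data handling step concrete without relying on any Jstacs signatures, the following is a minimal plain-Java sketch of what reading FastA-formatted DNA involves. The class `FastaSketch` and its methods `parse` and `isDna` are hypothetical names chosen for illustration and are not part of Jstacs; within Jstacs, this work is done by DNADataSet and the AlphabetContainer.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class, not part of Jstacs.
class FastaSketch {

    /** Parses FastA-formatted text into upper-case sequences; header lines are dropped. */
    static List<String> parse(String text) {
        List<String> sequences = new ArrayList<>();
        StringBuilder current = null;
        for (String rawLine : text.split("\\R")) {
            String line = rawLine.trim();
            if (line.isEmpty()) {
                continue; // ignore blank lines
            }
            if (line.startsWith(">")) {
                // a header line starts a new record
                if (current != null) {
                    sequences.add(current.toString());
                }
                current = new StringBuilder();
            } else if (current != null) {
                // sequence lines of one record are concatenated
                current.append(line.toUpperCase());
            }
        }
        if (current != null) {
            sequences.add(current.toString());
        }
        return sequences;
    }

    /** Checks that a sequence only uses symbols of the DNA alphabet A, C, G, T. */
    static boolean isDna(String sequence) {
        for (int i = 0; i < sequence.length(); i++) {
            if ("ACGT".indexOf(sequence.charAt(i)) < 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> sequences = parse(">seq1\nACGT\nacgt\n>seq2\nTTGA\n");
        for (String s : sequences) {
            System.out.println(s + " is DNA: " + isDna(s));
        }
    }
}
```

For a file on disk, the text could first be loaded with `new String(Files.readAllBytes(Paths.get(path)))` and then passed to `parse`.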

## Statistical models and classifiers using generative learning principles

In Jstacs, statistical models that use generative learning principles to infer their parameters implement the interface TrainableStatisticalModel. For convenience, Jstacs provides the TrainableStatisticalModelFactory, which makes it easy to create various simple models. Creating a PWM model, for instance, takes just one line of code.

Similarly, other models including inhomogeneous Markov models, permuted Markov models, Bayesian networks, homogeneous Markov models, ZOOPS models, and hidden Markov models can be created using the TrainableStatisticalModelFactory and the HMMFactory, respectively.

Given some model `pwm`, we can directly infer the model parameters based on some data set `ds` using the `train` method.

After the model has been trained, it can be used to score sequences using the `getLogProbFor` methods. More information about the interface TrainableStatisticalModel can be found in section First main course: SequenceScores#TrainableStatisticalModels.
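What training and scoring amount to for a PWM can be sketched in a few lines of plain Java. This is an illustration under stated assumptions, not the Jstacs implementation: the class `PwmSketch` is hypothetical, sequences are plain strings over A, C, G, T, and the parameter `ess` is assumed to act as a total pseudocount that is spread uniformly over the symbols at each position. The method names `train` and `getLogProbFor` merely mirror the names used in the text.

```java
import java.util.Arrays;

// Hypothetical illustration class, not part of Jstacs.
class PwmSketch {
    static final String ALPHABET = "ACGT";

    // logProb[position][symbol index] holds the log probability of a symbol at a position
    final double[][] logProb;

    PwmSketch(int length) {
        logProb = new double[length][ALPHABET.length()];
    }

    /** Estimates position-wise symbol probabilities from counts plus pseudocounts. */
    void train(String[] data, double ess) {
        int a = ALPHABET.length();
        for (int pos = 0; pos < logProb.length; pos++) {
            double[] counts = new double[a];
            Arrays.fill(counts, ess / a); // assumption: uniform pseudocounts
            for (String seq : data) {
                counts[ALPHABET.indexOf(seq.charAt(pos))] += 1.0;
            }
            double total = data.length + ess;
            for (int s = 0; s < a; s++) {
                logProb[pos][s] = Math.log(counts[s] / total);
            }
        }
    }

    /** Returns the log probability of a sequence, assuming independent positions. */
    double getLogProbFor(String seq) {
        double sum = 0.0;
        for (int pos = 0; pos < logProb.length; pos++) {
            sum += logProb[pos][ALPHABET.indexOf(seq.charAt(pos))];
        }
        return sum;
    }

    public static void main(String[] args) {
        PwmSketch pwm = new PwmSketch(2);
        pwm.train(new String[] {"AC", "AC", "AG"}, 4.0);
        System.out.println("log P(AC) = " + pwm.getLogProbFor("AC"));
        System.out.println("log P(TT) = " + pwm.getLogProbFor("TT"));
    }
}
```

Working with log probabilities avoids numerical underflow for longer sequences, since the per-position terms are added rather than multiplied.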

Based on a set of TrainableStatisticalModels, for instance two PWM models, we can build a classifier.

## Further statistical models and classifiers

Sometimes, we would like to learn statistical models by other learning principles that require numerical optimization. For this purpose, Jstacs provides the interface DifferentiableStatisticalModel and the factory DifferentiableStatisticalModelFactory in close analogy to TrainableStatisticalModel and TrainableStatisticalModelFactory (cf. First main course: SequenceScores#DifferentiableStatisticalModels). For example, a classifier can be created from two PWM models using the maximum supervised posterior learning principle.
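To illustrate why such learning principles need numerical optimization, here is a minimal plain-Java sketch of discriminative learning for two classes. It is not the Jstacs implementation and rests on several assumptions: the class `DiscriminativeSketch` is hypothetical, the two PWMs are collapsed into one matrix of position-wise log-odds weights, the prior is assumed to be Gaussian with precision `priorPrecision`, and plain gradient ascent with a fixed step size stands in for a proper optimizer. The objective is the conditional log-likelihood of the class labels plus the log prior, which has no closed-form maximum.

```java
// Hypothetical illustration class, not part of Jstacs.
class DiscriminativeSketch {
    static final String ALPHABET = "ACGT";

    final double[][] w; // w[position][symbol]: log-odds weight in favour of class 1
    double bias;        // log-odds offset between the two classes

    DiscriminativeSketch(int length) {
        w = new double[length][ALPHABET.length()];
    }

    /** Log-odds score of class 1 versus class 0. */
    double score(String seq) {
        double s = bias;
        for (int p = 0; p < w.length; p++) {
            s += w[p][ALPHABET.indexOf(seq.charAt(p))];
        }
        return s;
    }

    /** Posterior probability of class 1 given the sequence. */
    double posterior(String seq) {
        return 1.0 / (1.0 + Math.exp(-score(seq)));
    }

    /** Gradient ascent on the conditional log-likelihood plus Gaussian log prior. */
    void train(String[] class0, String[] class1, double priorPrecision,
               double stepSize, int iterations) {
        for (int it = 0; it < iterations; it++) {
            double[][] grad = new double[w.length][ALPHABET.length()];
            double gradBias = accumulate(class0, 0, grad) + accumulate(class1, 1, grad);
            for (int p = 0; p < w.length; p++) {
                for (int s = 0; s < ALPHABET.length(); s++) {
                    // the prior pulls each weight towards zero
                    w[p][s] += stepSize * (grad[p][s] - priorPrecision * w[p][s]);
                }
            }
            bias += stepSize * gradBias;
        }
    }

    /** Adds the gradient contributions of one class and returns the bias gradient. */
    private double accumulate(String[] data, int label, double[][] grad) {
        double gradBias = 0.0;
        for (String seq : data) {
            double err = label - posterior(seq); // derivative of log P(label | seq) w.r.t. score
            gradBias += err;
            for (int p = 0; p < w.length; p++) {
                grad[p][ALPHABET.indexOf(seq.charAt(p))] += err;
            }
        }
        return gradBias;
    }

    int classify(String seq) {
        return score(seq) > 0 ? 1 : 0;
    }

    public static void main(String[] args) {
        DiscriminativeSketch cl = new DiscriminativeSketch(3);
        cl.train(new String[] {"AAT", "AAC", "AAG"},
                 new String[] {"CGT", "CGA", "CGC"}, 0.1, 0.1, 200);
        System.out.println("P(class 1 | CGT) = " + cl.posterior("CGT"));
        System.out.println("class of AAT = " + cl.classify("AAT"));
    }
}
```

In contrast to generative training, the parameters here are not estimated per class from counts; all weights are adjusted jointly so that the classes are separated well.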

## Using classifiers

Based on statistical models, we can build classifiers as we have seen in the previous subsections. The main functionality is predicting the class label of a sequence and assessing the performance of a classifier. For these tasks, Jstacs provides the methods `classify` and `evaluate`, respectively.

For classifying a sequence, we just call `classify` on a trained classifier. The method returns numerical class labels starting from 0, in the same order in which the data sets were provided for training.
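The decision rule behind a classifier built from generative models can be sketched in plain Java as follows. The class `GenerativeClassifierSketch` and the helper `iidModel` are hypothetical and not part of Jstacs; the sketch assumes that each class is represented by a function returning the log-likelihood of a sequence and that class priors are estimated from the sizes of the training data sets. The returned label is the index of the class, starting from 0, in the order in which the models were supplied.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.ToDoubleFunction;

// Hypothetical illustration class, not part of Jstacs.
class GenerativeClassifierSketch {
    private final List<ToDoubleFunction<String>> logLikelihoods;
    private final double[] logPriors;

    /** One log-likelihood function and one training set size per class, in the same order. */
    GenerativeClassifierSketch(List<ToDoubleFunction<String>> logLikelihoods,
                               int[] trainingSetSizes) {
        this.logLikelihoods = logLikelihoods;
        int total = 0;
        for (int n : trainingSetSizes) {
            total += n;
        }
        logPriors = new double[trainingSetSizes.length];
        for (int c = 0; c < logPriors.length; c++) {
            logPriors[c] = Math.log(trainingSetSizes[c] / (double) total);
        }
    }

    /** Returns the index of the class with the highest log prior plus log-likelihood. */
    int classify(String seq) {
        int best = 0;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int c = 0; c < logPriors.length; c++) {
            double score = logPriors[c] + logLikelihoods.get(c).applyAsDouble(seq);
            if (score > bestScore) {
                bestScore = score;
                best = c;
            }
        }
        return best;
    }

    /** Toy model for the demo: independent, identically distributed symbols. */
    static ToDoubleFunction<String> iidModel(double pA, double pC, double pG, double pT) {
        double[] logP = {Math.log(pA), Math.log(pC), Math.log(pG), Math.log(pT)};
        return seq -> {
            double sum = 0.0;
            for (char ch : seq.toCharArray()) {
                sum += logP["ACGT".indexOf(ch)];
            }
            return sum;
        };
    }

    public static void main(String[] args) {
        List<ToDoubleFunction<String>> models = Arrays.asList(
                iidModel(0.7, 0.1, 0.1, 0.1),       // class 0: A-rich sequences
                iidModel(0.25, 0.25, 0.25, 0.25));  // class 1: uniform background
        GenerativeClassifierSketch cl =
                new GenerativeClassifierSketch(models, new int[] {50, 50});
        System.out.println("AAAA -> class " + cl.classify("AAAA"));
        System.out.println("CGTC -> class " + cl.classify("CGTC"));
    }
}
```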

For evaluating the performance of a classifier, we need to compute some performance measures. For convenience, Jstacs provides the possibility of getting a bunch of standard measures including point measures and areas under curves (cf. Second main course: Classifiers#Performance_measures). Based on such measures, we can directly determine the performance of the classifier.

System.out.println( cl.evaluate(params, true, data) );

Here, `true` indicates that an `Exception` should be thrown if a measure could not be computed, and `data` is an array of data sets, where the index within this array encodes the class.
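As a rough illustration of the two kinds of measures mentioned above, the following plain-Java sketch computes one point measure (the classification rate) and one area under a curve (the area under the ROC curve, here via the equivalent Mann-Whitney statistic). The class `PerformanceSketch` and its method names are hypothetical and not part of Jstacs. As with the `data` array, the first index of `predicted` encodes the true class, and an exception is thrown if a measure cannot be computed.

```java
// Hypothetical illustration class, not part of Jstacs.
class PerformanceSketch {

    /**
     * Fraction of correctly classified sequences.
     * predicted[c][i] is the predicted label of the i-th sequence of true class c.
     */
    static double classificationRate(int[][] predicted) {
        int correct = 0;
        int total = 0;
        for (int c = 0; c < predicted.length; c++) {
            for (int label : predicted[c]) {
                if (label == c) {
                    correct++;
                }
                total++;
            }
        }
        if (total == 0) {
            throw new IllegalArgumentException("no predictions given");
        }
        return correct / (double) total;
    }

    /**
     * Area under the ROC curve, computed as the probability that a foreground
     * score exceeds a background score, where ties count one half.
     */
    static double aucRoc(double[] foregroundScores, double[] backgroundScores) {
        if (foregroundScores.length == 0 || backgroundScores.length == 0) {
            throw new IllegalArgumentException("both classes need at least one score");
        }
        double wins = 0.0;
        for (double fg : foregroundScores) {
            for (double bg : backgroundScores) {
                if (fg > bg) {
                    wins += 1.0;
                } else if (fg == bg) {
                    wins += 0.5;
                }
            }
        }
        return wins / ((double) foregroundScores.length * backgroundScores.length);
    }

    public static void main(String[] args) {
        int[][] predicted = {{0, 0, 1}, {1, 1, 1}};
        System.out.println("classification rate = " + classificationRate(predicted));
        System.out.println("AUC-ROC = "
                + aucRoc(new double[] {0.9, 0.8, 0.4}, new double[] {0.5, 0.3, 0.1}));
    }
}
```

The pairwise comparison is quadratic in the number of scores, which is fine for a sketch; a rank-based computation would be preferable for large data sets.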

For assessing the performance of a classifier using some repeated procedure of training and testing, Jstacs provides the class ClassifierAssessment (cf. Second main course: Classifiers#Assessment).
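One common instance of such a repeated procedure of training and testing is k-fold cross-validation. The plain-Java sketch below only shows how the sequences of a data set could be partitioned into folds; the class `KFoldSketch`, its method `folds`, and the `seed` parameter are hypothetical and not part of Jstacs, and the actual training and evaluation steps are left as comments.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration class, not part of Jstacs.
class KFoldSketch {

    /** Splits the indices 0..n-1 into k disjoint, randomly composed folds. */
    static List<List<Integer>> folds(int n, int k, long seed) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            indices.add(i);
        }
        Collections.shuffle(indices, new Random(seed)); // fixed seed for reproducibility
        List<List<Integer>> folds = new ArrayList<>();
        for (int f = 0; f < k; f++) {
            folds.add(new ArrayList<>());
        }
        for (int i = 0; i < n; i++) {
            folds.get(i % k).add(indices.get(i)); // deal indices out round-robin
        }
        return folds;
    }

    public static void main(String[] args) {
        List<List<Integer>> folds = folds(10, 3, 1L);
        for (int f = 0; f < folds.size(); f++) {
            List<Integer> test = folds.get(f);
            List<Integer> train = new ArrayList<>();
            for (int g = 0; g < folds.size(); g++) {
                if (g != f) {
                    train.addAll(folds.get(g));
                }
            }
            // here: train the classifier on 'train', evaluate it on 'test',
            // and finally average the measures over all folds
            System.out.println("fold " + f + ": train=" + train.size()
                    + " test=" + test.size());
        }
    }
}
```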