Dessert: Alignments, Utils, and goodies
In this section, we present a motley composition of interesting classes of Jstacs.
In this subsection, we present how to compute Alignments using Jstacs.
If we want to compute an alignment, we first have to define the costs for match, mismatch, and gaps. In Jstacs, we provide the interface Costs that declares all necessary methods used during the alignment. In this example, we restrict ourselves to simple costs that are 0 for a match, 1 for a mismatch, and 0.5 for a gap.
Second, we have to provide an instance of Alignment. This instance contains all information needed for an alignment and stores, for instance, the matrices used for dynamic programming. When creating an instance, we have to specify which kind of alignment we want. Jstacs supports local, global, and semi-global alignments (cf. Alignment.AlignmentType).
A second constructor additionally allows specifying the number of off-diagonals to be used in the alignment, which can lead to a speedup.
Finally, we can compute the optimal alignment between two Sequences and write the result to the standard output.
The alignment instance can be reused for aligning further sequences.
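To make the role of the costs concrete, the following is a minimal, self-contained sketch of a global alignment computed by dynamic programming with the costs from above (0 for a match, 1 for a mismatch, 0.5 for a gap). It is not the Jstacs implementation and does not use the Costs or Alignment classes; the class name GlobalAlignmentSketch and the method alignmentCost are hypothetical, and plain strings stand in for Sequence objects.

```java
// Sketch only: global alignment cost by dynamic programming.
// Hypothetical names; this is not the Jstacs Alignment class.
class GlobalAlignmentSketch {

    static final double MATCH = 0.0;
    static final double MISMATCH = 1.0;
    static final double GAP = 0.5;

    // Returns the minimal total cost of aligning a and b end to end.
    static double alignmentCost(String a, String b) {
        int n = a.length(), m = b.length();
        // d[i][j] = minimal cost of aligning the first i symbols of a
        // with the first j symbols of b
        double[][] d = new double[n + 1][m + 1];
        for (int i = 1; i <= n; i++) d[i][0] = i * GAP;
        for (int j = 1; j <= m; j++) d[0][j] = j * GAP;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                double diag = d[i - 1][j - 1]
                        + (a.charAt(i - 1) == b.charAt(j - 1) ? MATCH : MISMATCH);
                double up = d[i - 1][j] + GAP;   // symbol of a aligned to a gap
                double left = d[i][j - 1] + GAP; // symbol of b aligned to a gap
                d[i][j] = Math.min(diag, Math.min(up, left));
            }
        }
        return d[n][m];
    }

    public static void main(String[] args) {
        // ACGT vs. ACT needs exactly one gap: prints 0.5
        System.out.println(alignmentCost("ACGT", "ACT"));
    }
}
```

Restricting the inner loop to cells close to the main diagonal would correspond to the off-diagonal limit mentioned above; the sketch fills the full matrix for simplicity.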
In Jstacs, we also provide the possibility of computing optimal alignments with affine gap costs. For this purpose, we implement the class AffineCosts, which is used to specify the cost for a gap opening. The costs for gap elongation are given by the gap costs of the internally used instance of Costs.
align = new Alignment( AlignmentType.GLOBAL, costs );
System.out.println( align.getAlignment( seq1, seq2 ) );
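The effect of affine gap costs can be illustrated with a small stand-alone sketch. It assumes that a gap of length k costs open + k * extend, which is one common reading of the description above but is not confirmed here for AffineCosts; the class AffineGapSketch and its method names are hypothetical, and the code is not taken from Jstacs. It uses three dynamic programming matrices, one per type of last alignment column.

```java
// Sketch only: global alignment cost with affine gap costs.
// Assumption: a gap of length k costs open + k * extend.
class AffineGapSketch {

    static final double INF = Double.POSITIVE_INFINITY;

    static double min3(double x, double y, double z) {
        return Math.min(x, Math.min(y, z));
    }

    static double cost(String a, String b, double mismatch, double open, double extend) {
        int n = a.length(), m = b.length();
        double[][] sub = new double[n + 1][m + 1];  // last column: a[i-1] vs. b[j-1]
        double[][] gapB = new double[n + 1][m + 1]; // last column: a[i-1] vs. gap
        double[][] gapA = new double[n + 1][m + 1]; // last column: gap vs. b[j-1]
        gapB[0][0] = INF;
        gapA[0][0] = INF;
        for (int i = 1; i <= n; i++) {
            sub[i][0] = INF;
            gapB[i][0] = open + i * extend;
            gapA[i][0] = INF;
        }
        for (int j = 1; j <= m; j++) {
            sub[0][j] = INF;
            gapB[0][j] = INF;
            gapA[0][j] = open + j * extend;
        }
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                double s = a.charAt(i - 1) == b.charAt(j - 1) ? 0.0 : mismatch;
                sub[i][j] = s + min3(sub[i - 1][j - 1], gapB[i - 1][j - 1], gapA[i - 1][j - 1]);
                // extending an existing gap is cheaper than opening a new one
                gapB[i][j] = extend + min3(sub[i - 1][j] + open, gapB[i - 1][j], gapA[i - 1][j] + open);
                gapA[i][j] = extend + min3(sub[i][j - 1] + open, gapB[i][j - 1] + open, gapA[i][j - 1]);
            }
        }
        return min3(sub[n][m], gapB[n][m], gapA[n][m]);
    }

    public static void main(String[] args) {
        // one gap of length 3: 1.0 + 3 * 0.5
        System.out.println(cost("AAAA", "A", 1.0, 1.0, 0.5));
    }
}
```

With these costs, one long gap is cheaper than several short ones of the same total length, which is the behavior affine costs are meant to model.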
REnvironment: Connection to R
In this subsection, we show how to access R (cf. http://www.r-project.org/) from Jstacs. R is a project for statistical computing that allows for performing complex computations and creating nice plots.
In some cases, it is reasonable to use R from within Jstacs. To do so, we have to create a connection to R. We utilize the R package Rserve (cf. http://www.rforge.net/Rserve/), which allows communication between Java and R. Having a running instance of Rserve, we can create a connection via
However, in some cases we have to specify the login name, a password, and the port for the communication which is possible via alternative constructors.
Now, we are able to do diverse things in R. Here, we only present three methods, but REnvironment provides more functionality. First, we copy an array of doubles from Java to R, and second, we modify it.
Finally, the REnvironment allows to create plots as PDF, TeX, or
ArrayHandler: Handling arrays
In this subsection, we present a way to easily handle arrays in Java, i.e., to cast, clone, and create arrays with elements of generic type. To this end, we implement the class ArrayHandler in Jstacs.
Let's assume we have a two-dimensional array of either primitives or some Java class, and we want to create a deep clone, as is necessary for member fields in clone methods.
Traditionally, we would have to implement for-loops to do so. However, the ArrayHandler implements this functionality in a generic manner, providing one method for this purpose.
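For comparison, the traditional loop-based approach might look like the following sketch for a two-dimensional array of doubles. It does not use ArrayHandler; the class name DeepCloneSketch and the method deepClone are hypothetical and only illustrate why a shallow clone is not sufficient.

```java
// Sketch only: deep clone of a two-dimensional double array with a loop.
// Hypothetical helper; not the ArrayHandler implementation.
class DeepCloneSketch {

    static double[][] deepClone(double[][] src) {
        double[][] copy = new double[src.length][];
        for (int i = 0; i < src.length; i++) {
            // clone each row separately so that no row is shared
            copy[i] = src[i] == null ? null : src[i].clone();
        }
        return copy;
    }

    public static void main(String[] args) {
        double[][] original = { { 1.0, 2.0 }, { 3.0 } };

        double[][] shallow = original.clone(); // rows are still shared
        double[][] deep = deepClone(original); // rows are independent

        deep[0][0] = 42.0;
        System.out.println(original[0][0]);           // unchanged by the deep clone
        System.out.println(shallow[0] == original[0]); // true: same row object
    }
}
```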
A second use case is the creation of arrays where every entry is a clone of some instance.
TrainableStatisticalModel[] models = ArrayHandler.createArrayOf( pwm, 10 );
The third use case is to cast an array. Even if all elements of the array are from the same class, the component type of the array might be different (some super class). A simple cast will fail in those cases. However, the ArrayHandler provides two methods for casting arrays. Here, we present the more important method, which lets us specify the array component type and performs the cast operation.
TrainableStatisticalModelFactory.createPWM( DNAAlphabetContainer.SINGLETON, 10, 4.0 ),
TrainableStatisticalModelFactory.createHomogeneousMarkovModel( DNAAlphabetContainer.SINGLETON, 40.0, (byte)0 )
TrainableStatisticalModel[] sms = ArrayHandler.cast( TrainableStatisticalModel.class, models );
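Why a simple cast fails can be shown in plain Java, independent of Jstacs. In the sketch below, an Object[] that only holds Number instances cannot be cast to Number[], because the runtime component type of the array is still Object. The hypothetical helper castArray copies the elements into a new array of the requested type using the standard method Arrays.copyOf; whether ArrayHandler.cast works the same way internally is not claimed here.

```java
import java.util.Arrays;

// Sketch only: converting an array to a more specific component type.
// Hypothetical helper; not the ArrayHandler implementation.
class ArrayCastSketch {

    // Copies source into a new array whose runtime type is arrayType.
    // Throws ArrayStoreException if an element does not fit the new type.
    static <T> T[] castArray(Class<T[]> arrayType, Object[] source) {
        return Arrays.copyOf(source, source.length, arrayType);
    }

    public static void main(String[] args) {
        Object[] mixed = { 1, 2.5, 3L }; // all elements are Numbers

        try {
            Number[] direct = (Number[]) mixed; // compiles, but fails at runtime
            System.out.println(direct.length);
        } catch (ClassCastException e) {
            System.out.println("simple cast failed");
        }

        Number[] numbers = castArray(Number[].class, mixed);
        System.out.println(numbers.length);
    }
}
```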
The class ToolBox contains several static methods for recurring tasks.
For example, you can compute the maximum of an array, or the sum of its values, or you can obtain the index of the first maximum value in the provided array.
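The three operations just mentioned are simple to state in plain Java. The following sketch shows one possible way to write them; the class ArrayStatsSketch and its method names are hypothetical and are not the ToolBox methods themselves.

```java
// Sketch only: maximum, sum, and index of the first maximum of a double array.
// Hypothetical helpers; not the ToolBox implementation.
class ArrayStatsSketch {

    static double max(double[] values) {
        return values[firstMaxIndex(values)];
    }

    static double sum(double[] values) {
        double s = 0.0;
        for (double v : values) {
            s += v;
        }
        return s;
    }

    // Returns the smallest index holding the maximal value.
    static int firstMaxIndex(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("empty array");
        }
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] > values[best]) { // strict: keeps the first maximum
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] values = { 1.0, 3.0, 3.0, 2.0 };
        System.out.println(max(values));
        System.out.println(sum(values));
        System.out.println(firstMaxIndex(values));
    }
}
```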
Another frequently needed functionality is the handling of log-values. Assume that we have an array values containing a number of log-probabilities $l_i$. What we want to compute is the logarithm of the sum of the original probabilities, i.e., $\log\left(\sum_i \exp(l_i)\right)$. The naive computation of this sum often results in numerical problems, especially if the original probabilities differ by many orders of magnitude.
A more exact solution is provided by the static method getLogSum of the class Normalisation, which can be accessed by calling
Of course, this method does not only work for probabilities, but for general log-values.
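A common way to compute such a log-sum stably is to factor out the largest log-value $m = \max_i l_i$ and use the identity $\log\left(\sum_i \exp(l_i)\right) = m + \log\left(\sum_i \exp(l_i - m)\right)$. The sketch below implements this idea in plain Java. It is an illustration of the technique only; whether Normalisation.getLogSum is implemented in exactly this way is an assumption, and the class name LogSumSketch is hypothetical.

```java
// Sketch only: numerically stable log-sum of log-values.
// Hypothetical class; not the Normalisation implementation.
class LogSumSketch {

    static double logSum(double... logValues) {
        double max = Double.NEGATIVE_INFINITY;
        for (double l : logValues) {
            max = Math.max(max, l);
        }
        if (Double.isInfinite(max)) {
            return max; // empty input or only infinite values
        }
        double sum = 0.0;
        for (double l : logValues) {
            sum += Math.exp(l - max); // every term lies in [0, 1]
        }
        return max + Math.log(sum);
    }

    public static void main(String[] args) {
        double[] values = { -1000.0, -1000.0 };

        // naive: exp(-1000) underflows to 0, so the result is -Infinity
        double naive = Math.log(Math.exp(values[0]) + Math.exp(values[1]));
        System.out.println(naive);

        // stable: -1000 + log(2)
        System.out.println(logSum(values));
    }
}
```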
Sometimes, we also want to normalize the given probabilities. That means, given the log-probabilities $l_i$, we want to obtain normalized probabilities $p_i = \exp(l_i) / \sum_j \exp(l_j)$. This normalization is performed by calling
and after the call, the array values contains the normalized probabilities (not log-probabilities!).
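The in-place behavior described here can be sketched as follows. The code shifts all log-values by their maximum before exponentiating, so that very small log-probabilities do not underflow. The class LogNormalisationSketch and the method normaliseLogValues are hypothetical names, and the sketch assumes that at least one log-value is finite; it is not the Jstacs implementation.

```java
// Sketch only: turn log-values into normalized probabilities in place.
// Hypothetical class; not the Normalisation implementation.
class LogNormalisationSketch {

    // Overwrites values with exp(l_i) / sum_j exp(l_j) and
    // returns the logarithm of the normalization constant.
    static double normaliseLogValues(double[] values) {
        double max = Double.NEGATIVE_INFINITY;
        for (double l : values) {
            max = Math.max(max, l);
        }
        if (Double.isInfinite(max) || Double.isNaN(max)) {
            throw new IllegalArgumentException("need at least one finite log-value");
        }
        double sum = 0.0;
        for (int i = 0; i < values.length; i++) {
            values[i] = Math.exp(values[i] - max);
            sum += values[i];
        }
        for (int i = 0; i < values.length; i++) {
            values[i] /= sum;
        }
        return max + Math.log(sum);
    }

    public static void main(String[] args) {
        double[] values = { Math.log(1.0), Math.log(3.0) };
        normaliseLogValues(values);
        // values now holds probabilities close to 0.25 and 0.75
        System.out.println(values[0] + " " + values[1]);
    }
}
```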
Finally, we might want to do the same for probabilities, i.e., given probabilities $q_i$ in an array values, we want to compute $p_i = q_i / \sum_j q_j$ using
A typical application of the last two methods is the case of (log) joint probabilities that we want to use to compute conditional probabilities by dividing by a marginal probability.
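As a small worked example of this application, suppose the array holds joint probabilities P(x, y) for three values of x and one fixed y. Dividing each entry by their sum, which is the marginal P(y), yields the conditional probabilities P(x | y). The sketch below uses made-up numbers and a hypothetical helper sumNormalise; it is not the Jstacs method.

```java
// Sketch only: conditional probabilities from joint probabilities.
// Hypothetical helper and made-up example values.
class ConditionalSketch {

    // Divides every entry by the sum of all entries and returns that sum.
    static double sumNormalise(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        for (int i = 0; i < values.length; i++) {
            values[i] /= sum;
        }
        return sum;
    }

    public static void main(String[] args) {
        // joint probabilities P(x, y) for x = 0, 1, 2 and one fixed y
        double[] joint = { 0.1, 0.3, 0.1 };
        double marginal = sumNormalise(joint); // P(y), about 0.5
        System.out.println(marginal);
        // joint now holds P(x | y), about 0.2, 0.6, 0.2
        System.out.println(joint[0] + " " + joint[1] + " " + joint[2]);
    }
}
```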
The class SafeOutputStream is a simple way to switch between writing the output of a program to standard out, writing it to a file, or suppressing it completely. This class is basically a wrapper for other output streams that can handle null values. You can create a SafeOutputStream writing to standard out with
If you provided null to the factory method instead, output would be suppressed, while no modifications in code using this SafeOutputStream would be necessary.
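The idea of a null-tolerant wrapper can be sketched in a few lines of plain Java. The class NullTolerantStream below is hypothetical and much simpler than SafeOutputStream; it only illustrates that calling code stays the same regardless of whether output goes to a real stream or is suppressed.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Sketch only: a wrapper that silently ignores output if no stream is given.
// Hypothetical class; not the SafeOutputStream implementation.
class NullTolerantStream {

    private final OutputStream out; // may be null, meaning "suppress output"

    NullTolerantStream(OutputStream out) {
        this.out = out;
    }

    void writeln(String line) {
        if (out == null) {
            return; // nothing to do, output is suppressed
        }
        try {
            out.write((line + "\n").getBytes(StandardCharsets.UTF_8));
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        NullTolerantStream verbose = new NullTolerantStream(System.out);
        NullTolerantStream silent = new NullTolerantStream(null);
        verbose.writeln("written to standard out");
        silent.writeln("never shown");
    }
}
```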
Finally, the class SubclassFinder can be used to search for all subclasses of a given class in a specified package and its sub-packages. For example, if we want to find all concrete sub-classes of TrainableStatisticalModel, i.e., classes that are not abstract and can be instantiated, in all sub-packages of de.jstacs, we call
and obtain a linked list containing all such classes. Other methods in SubclassFinder allow for searching for general sub-types including interfaces and abstract classes, or for filtering the results by further required interfaces.
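The notion of a concrete sub-class used above can be expressed with standard Java reflection. The following sketch filters a given list of candidate classes; it does not scan packages, which is the part SubclassFinder takes care of. The class ConcreteSubclassFilter and its method are hypothetical, and the candidates are ordinary JDK classes chosen for illustration.

```java
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

// Sketch only: keep the candidates that are concrete sub-types of a parent.
// Hypothetical class; it does not search packages like SubclassFinder.
class ConcreteSubclassFilter {

    static <T> LinkedList<Class<? extends T>> concreteSubclasses(
            Class<T> parent, List<Class<?>> candidates) {
        LinkedList<Class<? extends T>> result = new LinkedList<>();
        for (Class<?> c : candidates) {
            boolean concrete = !c.isInterface() && !Modifier.isAbstract(c.getModifiers());
            if (concrete && parent.isAssignableFrom(c)) {
                result.add(c.asSubclass(parent));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Class<?>> candidates = Arrays.asList(
                Integer.class, Double.class, Number.class, String.class);
        // Number is abstract and String is unrelated, so only
        // Integer and Double remain
        System.out.println(concreteSubclasses(Number.class, candidates));
    }
}
```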