java.lang.Object
de.jstacs.sequenceScores.statisticalModels.trainable.AbstractTrainableStatisticalModel
de.jstacs.sequenceScores.statisticalModels.trainable.hmm.AbstractHMM
de.jstacs.sequenceScores.statisticalModels.trainable.hmm.models.HigherOrderHMM
de.jstacs.sequenceScores.statisticalModels.trainable.hmm.models.DifferentiableHigherOrderHMM
public class DifferentiableHigherOrderHMM
This class combines a HigherOrderHMM and a DifferentiableStatisticalModel by implementing some of the declared methods.
Nested Class Summary |
---|
Nested classes/interfaces inherited from class de.jstacs.sequenceScores.statisticalModels.trainable.hmm.models.HigherOrderHMM |
---|
HigherOrderHMM.Type |
Field Summary | |
---|---|
protected double |
ess
The equivalent sample size used for the prior |
protected double[][][] |
gradient
Help array for the gradient |
protected int[][] |
index
Index array used for computing the gradient |
protected IntList[] |
indicesState
Help array for the indexes of the parameters of the states |
protected IntList[] |
indicesTransition
Help array for the indexes of the parameters of the transition |
protected int |
numberOfParameters
The number of parameters of this HMM |
protected DoubleList[] |
partDerState
Help array for the derivatives of the parameters of the states |
protected DoubleList[] |
partDerTransition
Help array for the derivatives of the parameters of the transition |
protected HigherOrderHMM.Type |
score
The type of the score that is evaluated |
Fields inherited from class de.jstacs.sequenceScores.statisticalModels.trainable.hmm.models.HigherOrderHMM |
---|
backwardIntermediate, container, logEmission, numberOfSummands, skipInit, stateList |
Fields inherited from class de.jstacs.sequenceScores.statisticalModels.trainable.hmm.AbstractHMM |
---|
bwdMatrix, emission, emissionIdx, finalState, forward, fwdMatrix, name, sostream, START_NODE, states, threads, trainingParameter, transition |
Fields inherited from class de.jstacs.sequenceScores.statisticalModels.trainable.AbstractTrainableStatisticalModel |
---|
alphabets, length |
Fields inherited from interface de.jstacs.sequenceScores.differentiable.DifferentiableSequenceScore |
---|
UNKNOWN |
Constructor Summary | |
---|---|
DifferentiableHigherOrderHMM(MaxHMMTrainingParameterSet trainingParameterSet,
String[] name,
int[] emissionIdx,
boolean[] forward,
DifferentiableEmission[] emission,
boolean likelihood,
double ess,
TransitionElement... te)
This is the main constructor. |
|
DifferentiableHigherOrderHMM(StringBuffer xml)
The standard constructor for the interface Storable. |
Method Summary | |
---|---|
void |
addGradientOfLogPriorTerm(double[] grad,
int start)
This method computes the gradient of DifferentiableStatisticalModel.getLogPriorTerm() for each
parameter of this model. |
protected void |
appendFurtherInformation(StringBuffer xml)
This method appends further information to the XML representation. |
DifferentiableHigherOrderHMM |
clone()
Follows the conventions of Object's clone() method. |
protected void |
createHelperVariables()
This method instantiates all helper variables that are needed inside the model, for instance for filling the forward and backward matrices. |
protected void |
createStates()
This method creates states for the internal usage. |
protected void |
extractFurtherInformation(StringBuffer xml)
This method extracts further information from the XML representation. |
double[] |
getCurrentParameterValues()
Returns a double array of dimension
DifferentiableSequenceScore.getNumberOfParameters() containing the current parameter values. |
double |
getESS()
Returns the equivalent sample size (ess) of this model, i.e. the equivalent sample size for the class or component that is represented by this model. |
double |
getInitialClassParam(double classProb)
Returns the initial class parameter for the class this DifferentiableSequenceScore is responsible for, based on the class
probability classProb. |
String |
getInstanceName()
Should return a short instance name such as iMM(0), BN(2), ... |
double |
getLogNormalizationConstant()
Returns the logarithm of the sum of the scores over all sequences of the event space. |
double |
getLogPartialNormalizationConstant(int parameterIndex)
Returns the logarithm of the partial normalization constant for the parameter with index parameterIndex. |
double |
getLogScoreAndPartialDerivation(Sequence seq,
int startPos,
int endPos,
IntList indices,
DoubleList partialDer)
Returns the logarithmic score for a Sequence beginning at
position startPos and ending at position endPos (inclusive) in the Sequence and fills lists with
the indices and the partial derivatives. |
double |
getLogScoreAndPartialDerivation(Sequence seq,
int startPos,
IntList indices,
DoubleList partialDer)
Returns the logarithmic score for a Sequence beginning at
position startPos in the Sequence and fills lists with
the indices and the partial derivatives. |
double |
getLogScoreAndPartialDerivation(Sequence seq,
IntList indices,
DoubleList partialDer)
Returns the logarithmic score for a Sequence seq and
fills lists with the indices and the partial derivations. |
double |
getLogScoreFor(Sequence seq)
Returns the logarithmic score for the Sequence seq. |
double |
getLogScoreFor(Sequence seq,
int start)
Returns the logarithmic score for the Sequence seq
beginning at position start in the Sequence. |
double |
getLogScoreFor(Sequence seq,
int start,
int end)
Returns the logarithmic score for the Sequence seq
beginning at position start and ending at position end (inclusive) in the Sequence. |
int |
getNumberOfParameters()
Returns the number of parameters in this DifferentiableSequenceScore. |
int |
getNumberOfRecommendedStarts()
This method returns the number of recommended optimization starts. |
int[][] |
getSamplingGroups(int parameterOffset)
Returns groups of indexes of parameters that shall be drawn together in a sampling procedure |
int |
getSizeOfEventSpaceForRandomVariablesOfParameter(int index)
Returns the size of the event space of the random variables that are affected by parameter no. index. |
void |
initializeFunction(int index,
boolean freeParams,
DataSet[] data,
double[][] weights)
This method creates the underlying structure of the DifferentiableSequenceScore . |
void |
initializeFunctionRandomly(boolean freeParams)
This method initializes the DifferentiableSequenceScore randomly. |
boolean |
isInitialized()
This method can be used to determine whether the instance is initialized. |
boolean |
isNormalized()
This method indicates whether the implemented score is already normalized to 1 or not. |
protected double |
logProb(int startpos,
int endpos,
Sequence sequence)
This method computes the logarithm of the probability of the corresponding subsequences. |
void |
setParameters(double[] params,
int start)
This method sets the internal parameters to the values of params between start and
start + DifferentiableSequenceScore.getNumberOfParameters() - 1. |
void |
train(DataSet data,
double[] weights)
Trains the TrainableStatisticalModel object given the data as DataSet using
the specified weights. |
Methods inherited from class de.jstacs.sequenceScores.statisticalModels.trainable.hmm.models.HigherOrderHMM |
---|
baumWelch, estimateFromStatistics, fillBwdMatrix, fillBwdOrViterbiMatrix, fillFwdMatrix, fillLogStatePosteriorMatrix, finalize, getCharacteristics, getLogPriorTerm, getLogProbForPath, getLogScoreFor, getLogScoreFor, getMaximalMarkovOrder, getNumericalCharacteristics, getViterbiPathFor, getXMLTag, initialize, initializeRandomly, resetStatistics, samplePath, viterbi |
Methods inherited from class de.jstacs.sequenceScores.statisticalModels.trainable.hmm.AbstractHMM |
---|
createMatrixForStatePosterior, decodePath, decodeStatePosterior, determineFinalStates, fromXML, getFinalStatePosterioriMatrix, getGraphvizRepresentation, getGraphvizRepresentation, getGraphvizRepresentation, getGraphvizRepresentation, getLogProbFor, getLogStatePosteriorMatrixFor, getLogStatePosteriorMatrixFor, getNumberOfStates, getNumberOfThreads, getRunTimeException, getStatePosteriorMatrixFor, getStatePosteriorMatrixFor, getViterbiPathFor, getViterbiPathsFor, initTransition, provideMatrix, setOutputStream, toString, toXML, train |
Methods inherited from class de.jstacs.sequenceScores.statisticalModels.trainable.AbstractTrainableStatisticalModel |
---|
check, emitDataSet, getAlphabetContainer, getLength, getLogProbFor, getLogProbFor |
Methods inherited from class java.lang.Object |
---|
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Methods inherited from interface de.jstacs.sequenceScores.statisticalModels.differentiable.DifferentiableStatisticalModel |
---|
getLogPriorTerm |
Methods inherited from interface de.jstacs.sequenceScores.statisticalModels.StatisticalModel |
---|
emitDataSet, getLogProbFor, getLogProbFor, getLogProbFor, getMaximalMarkovOrder |
Methods inherited from interface de.jstacs.sequenceScores.SequenceScore |
---|
getAlphabetContainer, getCharacteristics, getLength, getLogScoreFor, getLogScoreFor, getNumericalCharacteristics |
Methods inherited from interface de.jstacs.Storable |
---|
toXML |
Field Detail |
---|
protected int numberOfParameters
protected double ess
protected HigherOrderHMM.Type score
protected int[][] index
protected double[][][] gradient
protected IntList[] indicesState
protected IntList[] indicesTransition
protected DoubleList[] partDerState
protected DoubleList[] partDerTransition
Constructor Detail |
---|
public DifferentiableHigherOrderHMM(MaxHMMTrainingParameterSet trainingParameterSet, String[] name, int[] emissionIdx, boolean[] forward, DifferentiableEmission[] emission, boolean likelihood, double ess, TransitionElement... te) throws Exception
This is the main constructor.
Parameters:
trainingParameterSet - the ParameterSet that determines the training algorithm and contains the necessary Parameters
name - the names of the states
emissionIdx - the indices of the emissions that should be used for each state; if null, state i will use emission i
forward - a boolean array that indicates whether the symbol on the forward or the reverse complementary strand should be used; if null, all states use the forward strand
emission - the emissions
likelihood - if true, the likelihood is returned by getLogScoreFor(Sequence), otherwise the Viterbi score
ess - the equivalent sample size (ess) of the model
te - the TransitionElements used for creating a Transition
Throws:
Exception - if the length of name, emissionIdx, or forward is not equal to the number of states
AlphabetContainer
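The null-handling rules for emissionIdx and forward can be pictured with a small sketch. This is plain Java and not Jstacs code: the class StateDefaults, both method names, and the choice of IllegalArgumentException are assumptions made for illustration only.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Jstacs: mirrors the documented defaults
// for the emissionIdx and forward constructor arguments.
final class StateDefaults {

    // If emissionIdx is null, state i uses emission i.
    static int[] resolveEmissionIdx(int[] emissionIdx, int numberOfStates) {
        if (emissionIdx == null) {
            int[] identity = new int[numberOfStates];
            for (int i = 0; i < numberOfStates; i++) {
                identity[i] = i;
            }
            return identity;
        }
        if (emissionIdx.length != numberOfStates) {
            throw new IllegalArgumentException("emissionIdx needs one entry per state");
        }
        return emissionIdx.clone();
    }

    // If forward is null, all states use the forward strand.
    static boolean[] resolveForward(boolean[] forward, int numberOfStates) {
        if (forward == null) {
            boolean[] allForward = new boolean[numberOfStates];
            Arrays.fill(allForward, true);
            return allForward;
        }
        if (forward.length != numberOfStates) {
            throw new IllegalArgumentException("forward needs one entry per state");
        }
        return forward.clone();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(resolveEmissionIdx(null, 3)));
        System.out.println(Arrays.toString(resolveForward(null, 2)));
    }
}
```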
public DifferentiableHigherOrderHMM(StringBuffer xml) throws NonParsableException
The standard constructor for the interface Storable. Constructs a DifferentiableHigherOrderHMM out of an XML representation.
Parameters:
xml - the XML representation as StringBuffer
Throws:
NonParsableException - if the DifferentiableHigherOrderHMM could not be reconstructed out of the StringBuffer xml
Method Detail |
---|
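protected void appendFurtherInformation(StringBuffer xml)
Description copied from class: AbstractHMM
This method appends further information to the XML representation.
appendFurtherInformation in class HigherOrderHMM
Parameters:
xml - the XML representation
protected void extractFurtherInformation(StringBuffer xml) throws NonParsableException
Description copied from class: HigherOrderHMM
This method extracts further information from the XML representation.
extractFurtherInformation in class HigherOrderHMM
Parameters:
xml - the XML representation
Throws:
NonParsableException - if the information could not be reconstructed out of the StringBuffer xml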
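protected void createHelperVariables()
Description copied from class: AbstractHMM
This method instantiates all helper variables that are needed inside the model, for instance for filling the forward and backward matrices.
createHelperVariables in class HigherOrderHMM
protected void createStates()
Description copied from class: AbstractHMM
This method creates states for the internal usage.
createStates in class HigherOrderHMM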
public DifferentiableHigherOrderHMM clone() throws CloneNotSupportedException
Description copied from class: AbstractTrainableStatisticalModel
Follows the conventions of Object's clone() method.
clone in interface DifferentiableSequenceScore
clone in interface SequenceScore
clone in interface TrainableStatisticalModel
clone in class HigherOrderHMM
Returns:
a copy of the current AbstractTrainableStatisticalModel (the member AlphabetContainer isn't deeply cloned since it is assumed to be immutable). The type of the returned object is defined by the class X directly inherited from AbstractTrainableStatisticalModel. Hence X's clone() method should work as:
1. Object o = (X)super.clone();
2. all member variables of o defined by X that are not of simple data types like int, double, ... have to be deeply copied
3. return o
Throws:
CloneNotSupportedException - if something went wrong while cloning
public double getESS()
Description copied from interface: DifferentiableStatisticalModel
Returns the equivalent sample size (ess) of this model, i.e. the equivalent sample size for the class or component that is represented by this model.
getESS in interface DifferentiableStatisticalModel
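The three-step cloning convention described for clone() (call super.clone(), deeply copy every member that is not a simple data type, return the copy) can be sketched as follows. ToyModel and its fields are hypothetical and not Jstacs types. Unlike the documented method, this sketch catches CloneNotSupportedException internally, which is an assumption made to keep the example short.

```java
// Hypothetical class for illustration only; it is not a Jstacs type.
final class ToyModel implements Cloneable {
    int length = 3;                          // simple data type: copied by super.clone()
    double[] params = {0.1, 0.2, 0.7};       // array: shared after super.clone() unless copied
    double[][] help = {{1.0}, {2.0, 3.0}};   // nested array: each row needs its own copy

    @Override
    public ToyModel clone() {
        try {
            // Step 1: shallow copy of all fields.
            ToyModel o = (ToyModel) super.clone();
            // Step 2: deep copy of every member that is not a simple data type.
            o.params = params.clone();
            o.help = new double[help.length][];
            for (int i = 0; i < help.length; i++) {
                o.help[i] = help[i].clone();
            }
            // Step 3: return the copy.
            return o;
        } catch (CloneNotSupportedException e) {
            // Not expected here because ToyModel implements Cloneable.
            throw new AssertionError("ToyModel implements Cloneable", e);
        }
    }

    public static void main(String[] args) {
        ToyModel original = new ToyModel();
        ToyModel copy = original.clone();
        copy.params[0] = 42.0;   // must not leak into the original
        System.out.println(original.params[0] + " vs " + copy.params[0]);
    }
}
```

Skipping step 2 would leave both objects pointing at the same arrays, so training one model would silently change the other.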
public void addGradientOfLogPriorTerm(double[] grad, int start) throws Exception
Description copied from interface: DifferentiableStatisticalModel
This method computes the gradient of DifferentiableStatisticalModel.getLogPriorTerm() for each parameter of this model. The results are added to the array grad beginning at index start.
addGradientOfLogPriorTerm in interface DifferentiableStatisticalModel
Parameters:
grad - the array of gradients
start - the start index in the grad array, where the partial derivatives for the parameters of this model shall be entered
Throws:
Exception - if something went wrong with the computing of the gradients
See Also:
DifferentiableStatisticalModel.getLogPriorTerm()
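The phrase "added to the array grad beginning at index start" describes accumulation into a shared gradient vector rather than overwriting it. A minimal sketch of that contract, assuming a hypothetical helper addInto and made-up derivative values (nothing here is Jstacs code):

```java
import java.util.Arrays;

// Hypothetical sketch, not Jstacs code: shows the "add at offset" contract only.
final class GradientOffsetSketch {

    // Adds this model's partial derivatives to grad[start .. start + n - 1].
    // Entries outside that range, and values already stored inside it, are kept.
    static void addInto(double[] grad, int start, double[] ownPartialDerivatives) {
        for (int i = 0; i < ownPartialDerivatives.length; i++) {
            grad[start + i] += ownPartialDerivatives[i];
        }
    }

    public static void main(String[] args) {
        double[] grad = {1.0, 1.0, 1.0, 1.0, 1.0};   // gradient shared by several models
        addInto(grad, 2, new double[]{0.5, -0.25});  // this model owns indices 2 and 3
        System.out.println(Arrays.toString(grad));
    }
}
```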
public int getNumberOfParameters()
Description copied from interface: DifferentiableSequenceScore
Returns the number of parameters in this DifferentiableSequenceScore. If the number of parameters is not known yet, the method returns DifferentiableSequenceScore.UNKNOWN.
getNumberOfParameters in interface DifferentiableSequenceScore
Returns:
the number of parameters in this DifferentiableSequenceScore
See Also:
DifferentiableSequenceScore.UNKNOWN
public int getNumberOfRecommendedStarts()
Description copied from interface: DifferentiableSequenceScore
This method returns the number of recommended optimization starts.
getNumberOfRecommendedStarts in interface DifferentiableSequenceScore
public double[] getCurrentParameterValues() throws Exception
Description copied from interface: DifferentiableSequenceScore
Returns a double array of dimension DifferentiableSequenceScore.getNumberOfParameters() containing the current parameter values. If these parameters are to be used to start an optimization, it is highly recommended to invoke DifferentiableSequenceScore.initializeFunction(int, boolean, DataSet[], double[][]) first. After an optimization this method can be used to get the current parameter values.
getCurrentParameterValues in interface DifferentiableSequenceScore
Throws:
Exception - if no parameters exist (yet)
public boolean isInitialized()
Description copied from interface: SequenceScore
This method can be used to determine whether the instance is initialized. If the instance is initialized, it should be possible to invoke SequenceScore.getLogScoreFor(Sequence).
isInitialized in interface SequenceScore
isInitialized in class HigherOrderHMM
Returns:
true if the instance is initialized, false otherwise
public void setParameters(double[] params, int start)
Description copied from interface: DifferentiableSequenceScore
This method sets the internal parameters to the values of params between start and start + DifferentiableSequenceScore.getNumberOfParameters() - 1.
setParameters in interface DifferentiableSequenceScore
Parameters:
params - the new parameters
start - the start index in params
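The index range used by setParameters (from start to start + getNumberOfParameters() - 1) lets several models read their own slice of one global parameter vector. The sketch below assumes a hypothetical class ParameterSliceSketch that only stores a double array; it borrows the documented method names but is not the Jstacs implementation.

```java
import java.util.Arrays;

// Hypothetical sketch, not Jstacs code: a "model" that is just a parameter array.
final class ParameterSliceSketch {
    private final double[] internal;

    ParameterSliceSketch(int numberOfParameters) {
        internal = new double[numberOfParameters];
    }

    int getNumberOfParameters() {
        return internal.length;
    }

    // Copies params[start .. start + getNumberOfParameters() - 1] into the model.
    void setParameters(double[] params, int start) {
        System.arraycopy(params, start, internal, 0, internal.length);
    }

    // Returns a copy so that callers cannot modify the internal state.
    double[] getCurrentParameterValues() {
        return internal.clone();
    }

    public static void main(String[] args) {
        double[] global = {1.0, 2.0, 3.0, 4.0, 5.0};
        ParameterSliceSketch first = new ParameterSliceSketch(2);
        ParameterSliceSketch second = new ParameterSliceSketch(3);
        first.setParameters(global, 0);
        // The second model starts where the first one ends.
        second.setParameters(global, first.getNumberOfParameters());
        System.out.println(Arrays.toString(second.getCurrentParameterValues()));
    }
}
```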
public void initializeFunctionRandomly(boolean freeParams) throws Exception
Description copied from interface: DifferentiableSequenceScore
This method initializes the DifferentiableSequenceScore randomly. It has to create the underlying structure of the DifferentiableSequenceScore.
initializeFunctionRandomly in interface DifferentiableSequenceScore
Parameters:
freeParams - indicates whether the (reduced) parameterization is used
Throws:
Exception - if something went wrong
public void initializeFunction(int index, boolean freeParams, DataSet[] data, double[][] weights) throws Exception
Description copied from interface: DifferentiableSequenceScore
This method creates the underlying structure of the DifferentiableSequenceScore.
initializeFunction in interface DifferentiableSequenceScore
Parameters:
index - the index of the class the DifferentiableSequenceScore models
freeParams - indicates whether the (reduced) parameterization is used
data - the samples
weights - the weights of the sequences in the samples
Throws:
Exception - if something went wrong
public void train(DataSet data, double[] weights) throws Exception
Description copied from interface: TrainableStatisticalModel
Trains the TrainableStatisticalModel object given the data as DataSet using the specified weights. The weight at position i belongs to the element at position i, so the array weights should have the number of sequences in the sample as dimension. (Optionally it is possible to use weights == null if all weights have the value one.)
Training is not incremental: the result of the series train(data1); train(data2) should be a fully trained model over data2 and not over data1+data2. All parameters of the model were given by the call of the constructor.
train in interface TrainableStatisticalModel
train in class HigherOrderHMM
Parameters:
data - the given sequences as DataSet
weights - the weights of the elements, each weight should be non-negative
Throws:
Exception - if the training did not succeed (e.g. the dimension of weights and the number of sequences in the sample do not match)
See Also:
DataSet.getElementAt(int), DataSet.ElementEnumerator
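Two parts of the train contract are easy to get wrong: a null weights array stands for all weights being one, and a second call to train replaces the earlier result instead of adding to it. The following sketch shows both with a deliberately simple stand-in, a weighted symbol-frequency estimator. ToyFrequencyModel and its int-coded data are hypothetical and unrelated to the HMM training done by Jstacs.

```java
// Hypothetical stand-in, not Jstacs code: estimates symbol frequencies over
// an alphabet of four symbols coded as 0..3.
final class ToyFrequencyModel {
    private final double[] probs = new double[4];

    // data[i] is a symbol code, weights[i] its non-negative weight;
    // weights == null means that every element has weight one.
    void train(int[] data, double[] weights) {
        if (weights != null && weights.length != data.length) {
            throw new IllegalArgumentException("one weight per element is required");
        }
        double[] counts = new double[4];   // fresh statistics on every call
        double total = 0.0;
        for (int i = 0; i < data.length; i++) {
            double w = (weights == null) ? 1.0 : weights[i];
            if (w < 0.0) {
                throw new IllegalArgumentException("weights must be non-negative");
            }
            counts[data[i]] += w;
            total += w;
        }
        if (total == 0.0) {
            throw new IllegalArgumentException("total weight must be positive");
        }
        for (int k = 0; k < probs.length; k++) {
            probs[k] = counts[k] / total;
        }
    }

    double getProb(int symbol) {
        return probs[symbol];
    }

    public static void main(String[] args) {
        ToyFrequencyModel model = new ToyFrequencyModel();
        model.train(new int[]{0, 0, 1, 3}, null);   // data1, all weights one
        model.train(new int[]{2, 2}, null);         // data2 replaces data1
        System.out.println(model.getProb(0) + " " + model.getProb(2));
    }
}
```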
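public boolean isNormalized()
Description copied from interface: DifferentiableStatisticalModel
This method indicates whether the implemented score is already normalized to 1 or not. The standard implementation returns false.
isNormalized in interface DifferentiableStatisticalModel
Returns:
true if the implemented score is already normalized to 1, false otherwise
public double getLogNormalizationConstant()
Description copied from interface: DifferentiableStatisticalModel
Returns the logarithm of the sum of the scores over all sequences of the event space.
getLogNormalizationConstant in interface DifferentiableStatisticalModel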
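"The logarithm of the sum of the scores over all sequences of the event space" can be made concrete by brute force on a tiny event space. The sketch below is an illustration under stated assumptions, not the way this class computes the constant: the class LogNormalizationSketch, the position-independent toy score, and the enumeration strategy are all hypothetical.

```java
// Hypothetical brute-force sketch, not Jstacs code.
final class LogNormalizationSketch {

    // Toy log score of a sequence: the sum of one log weight per symbol.
    // Enumerates all alphabetSize^length sequences and returns
    // log( sum over sequences of exp(logScore(sequence)) ).
    static double logNormalizationConstant(double[] logWeights, int length) {
        int alphabetSize = logWeights.length;
        int numberOfSequences = 1;
        for (int p = 0; p < length; p++) {
            numberOfSequences *= alphabetSize;
        }
        double[] logScores = new double[numberOfSequences];
        for (int code = 0; code < numberOfSequences; code++) {
            int rest = code;   // digits of code in base alphabetSize are the symbols
            double logScore = 0.0;
            for (int p = 0; p < length; p++) {
                logScore += logWeights[rest % alphabetSize];
                rest /= alphabetSize;
            }
            logScores[code] = logScore;
        }
        return logSumExp(logScores);
    }

    // Numerically stable log(sum(exp(v[i]))): the maximum is factored out first.
    static double logSumExp(double[] v) {
        double max = Double.NEGATIVE_INFINITY;
        for (double x : v) {
            max = Math.max(max, x);
        }
        if (max == Double.NEGATIVE_INFINITY) {
            return max;
        }
        double sum = 0.0;
        for (double x : v) {
            sum += Math.exp(x - max);
        }
        return max + Math.log(sum);
    }

    public static void main(String[] args) {
        double[] logWeights = {Math.log(1), Math.log(2), Math.log(3), Math.log(4)};
        System.out.println(logNormalizationConstant(logWeights, 2));
    }
}
```

For this toy score the constant factorizes by position, so with weights 1, 2, 3, 4 and length 2 the sum is 10 times 10 and the result should agree with log(100) up to rounding.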
public double getLogPartialNormalizationConstant(int parameterIndex) throws Exception
Description copied from interface: DifferentiableStatisticalModel
Returns the logarithm of the partial normalization constant for the parameter with index parameterIndex. This is the logarithm of the partial derivative of the normalization constant with respect to the parameter with index parameterIndex.
getLogPartialNormalizationConstant in interface DifferentiableStatisticalModel
Parameters:
parameterIndex - the index of the parameter
Throws:
Exception - if something went wrong with the normalization
See Also:
DifferentiableStatisticalModel.getLogNormalizationConstant()
public double getInitialClassParam(double classProb)
Description copied from interface: DifferentiableSequenceScore
Returns the initial class parameter for the class this DifferentiableSequenceScore is responsible for, based on the class probability classProb.
getInitialClassParam in interface DifferentiableSequenceScore
Parameters:
classProb - the class probability
public double getLogScoreFor(Sequence seq)
Description copied from interface: SequenceScore
Returns the logarithmic score for the Sequence seq.
getLogScoreFor in interface SequenceScore
getLogScoreFor in class AbstractTrainableStatisticalModel
Parameters:
seq - the sequence
public double getLogScoreFor(Sequence seq, int start)
Description copied from interface: SequenceScore
Returns the logarithmic score for the Sequence seq beginning at position start in the Sequence.
getLogScoreFor in interface SequenceScore
getLogScoreFor in class AbstractTrainableStatisticalModel
Parameters:
seq - the Sequence
start - the start position in the Sequence
Returns:
the logarithmic score for the Sequence
public double getLogScoreFor(Sequence seq, int start, int end)
Description copied from interface: SequenceScore
Returns the logarithmic score for the Sequence seq beginning at position start and ending at position end (inclusive) in the Sequence.
getLogScoreFor in interface SequenceScore
getLogScoreFor in class AbstractTrainableStatisticalModel
Parameters:
seq - the Sequence
start - the start position in the Sequence
end - the end position (inclusive) in the Sequence
Returns:
the logarithmic score for the Sequence
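protected double logProb(int startpos, int endpos, Sequence sequence)
Description copied from class: AbstractHMM
This method computes the logarithm of the probability of the corresponding subsequences. The method does not check the AlphabetContainer and possible further features before starting the computation.
logProb in class AbstractHMM
Parameters:
startpos - the start position (inclusive)
endpos - the end position (inclusive)
sequence - the Sequence(s)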
public double getLogScoreAndPartialDerivation(Sequence seq, IntList indices, DoubleList partialDer)
Description copied from interface: DifferentiableSequenceScore
Returns the logarithmic score for a Sequence seq and fills lists with the indices and the partial derivatives.
getLogScoreAndPartialDerivation in interface DifferentiableSequenceScore
Parameters:
seq - the Sequence
indices - an IntList of indices; after method invocation the list should contain the indices i of the parameters whose partial derivative is not zero
partialDer - a DoubleList of partial derivatives; after method invocation the list should contain the corresponding non-zero partial derivatives
Returns:
the logarithmic score for the Sequence
public double getLogScoreAndPartialDerivation(Sequence seq, int startPos, IntList indices, DoubleList partialDer)
Description copied from interface: DifferentiableSequenceScore
Returns the logarithmic score for a Sequence beginning at position startPos in the Sequence and fills lists with the indices and the partial derivatives.
getLogScoreAndPartialDerivation in interface DifferentiableSequenceScore
Parameters:
seq - the Sequence
startPos - the start position in the Sequence
indices - an IntList of indices; after method invocation the list should contain the indices i of the parameters whose partial derivative is not zero
partialDer - a DoubleList of partial derivatives; after method invocation the list should contain the corresponding non-zero partial derivatives
Returns:
the logarithmic score for the Sequence
public double getLogScoreAndPartialDerivation(Sequence seq, int startPos, int endPos, IntList indices, DoubleList partialDer)
Description copied from interface: DifferentiableSequenceScore
Returns the logarithmic score for a Sequence beginning at position startPos and ending at position endPos (inclusive) in the Sequence and fills lists with the indices and the partial derivatives.
getLogScoreAndPartialDerivation in interface DifferentiableSequenceScore
Parameters:
seq - the Sequence
startPos - the start position in the Sequence
endPos - the end position (inclusive) in the Sequence
indices - an IntList of indices; after method invocation the list should contain the indices i of the parameters whose partial derivative is not zero
partialDer - a DoubleList of partial derivatives; after method invocation the list should contain the corresponding non-zero partial derivatives
Returns:
the logarithmic score for the Sequence
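The getLogScoreAndPartialDerivation overloads return the score and report derivatives through two parallel lists: one with parameter indices, one with the matching values. The sketch below shows how a caller might consume such lists. It makes several assumptions: java.util.List stands in for the Jstacs IntList and DoubleList, sequences are int arrays, and the toy score (one parameter per symbol, summed over the positions) is invented so that the derivative is simply a symbol count.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not Jstacs code.
final class SparseDerivativeSketch {

    // Toy log score: sum of params[symbol] over positions startPos..endPos (inclusive).
    // Its partial derivative with respect to params[k] is the count of symbol k,
    // so only parameters of symbols that occur are reported.
    static double logScoreAndPartialDerivation(int[] seq, int startPos, int endPos,
            double[] params, List<Integer> indices, List<Double> partialDer) {
        double[] counts = new double[params.length];
        double logScore = 0.0;
        for (int p = startPos; p <= endPos; p++) {
            logScore += params[seq[p]];
            counts[seq[p]] += 1.0;
        }
        for (int k = 0; k < counts.length; k++) {
            if (counts[k] != 0.0) {
                indices.add(k);             // index of the parameter
                partialDer.add(counts[k]);  // its non-zero partial derivative
            }
        }
        return logScore;
    }

    // Typical consumer: scatter the sparse entries into a dense gradient.
    static void addToGradient(double[] grad, List<Integer> indices,
            List<Double> partialDer, double factor) {
        for (int j = 0; j < indices.size(); j++) {
            grad[indices.get(j)] += factor * partialDer.get(j);
        }
    }

    public static void main(String[] args) {
        List<Integer> indices = new ArrayList<>();
        List<Double> partialDer = new ArrayList<>();
        double[] params = {0.5, 1.0, 2.0, 4.0};
        double score = logScoreAndPartialDerivation(
                new int[]{0, 2, 2, 1}, 1, 3, params, indices, partialDer);
        double[] grad = new double[params.length];
        addToGradient(grad, indices, partialDer, 1.0);
        System.out.println(score + " " + indices + " " + Arrays.toString(grad));
    }
}
```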
public int getSizeOfEventSpaceForRandomVariablesOfParameter(int index)
Description copied from interface: DifferentiableStatisticalModel
Returns the size of the event space of the random variables that are affected by parameter no. index, i.e. the product of the sizes of the alphabets at the position of each random variable affected by parameter index. For DNA alphabets this corresponds to 4 for a PWM, 16 for a WAM except position 0, ...
getSizeOfEventSpaceForRandomVariablesOfParameter in interface DifferentiableStatisticalModel
Parameters:
index - the index of the parameter
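The event-space size is a plain product of alphabet sizes, which the DNA figures quoted above confirm. A minimal sketch, assuming a hypothetical helper sizeOfEventSpace that receives the alphabet size of every random variable affected by the parameter:

```java
// Hypothetical sketch, not Jstacs code.
final class EventSpaceSketch {

    // Product of the alphabet sizes of all random variables affected by a parameter.
    static int sizeOfEventSpace(int... alphabetSizes) {
        int size = 1;
        for (int alphabetSize : alphabetSizes) {
            size *= alphabetSize;
        }
        return size;
    }

    public static void main(String[] args) {
        int dna = 4;
        // PWM parameter: one position is affected.
        System.out.println(sizeOfEventSpace(dna));
        // WAM parameter beyond position 0: the position and its predecessor are affected.
        System.out.println(sizeOfEventSpace(dna, dna));
    }
}
```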
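public int[][] getSamplingGroups(int parameterOffset)
Description copied from interface: SamplingDifferentiableStatisticalModel
Returns groups of indexes of parameters that shall be drawn together in a sampling procedure.
getSamplingGroups in interface SamplingDifferentiableStatisticalModel
Parameters:
parameterOffset - a global offset on the parameter indexes
Returns:
the groups of parameter indexes, taking the global offset parameterOffset into account
public String getInstanceName()
Description copied from interface: SequenceScore
Should return a short instance name such as iMM(0), BN(2), ...
getInstanceName in interface SequenceScore
getInstanceName in class HigherOrderHMM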