public class LimitedSparseLocalInhomogeneousMixtureDiffSM_higherOrder extends AbstractDifferentiableStatisticalModel implements Mutable, QuickScanningSequenceScore
Modifier and Type | Class and Description |
---|---|
static class | LimitedSparseLocalInhomogeneousMixtureDiffSM_higherOrder.PriorType: The type of the prior used by the Slim model. |
Inherited fields: alphabets, length, r
Inherited constant: DifferentiableSequenceScore.UNKNOWN
Constructor and Description |
---|
LimitedSparseLocalInhomogeneousMixtureDiffSM_higherOrder(AlphabetContainer alphabets, int length, int order, int distance, double ess, double q, LimitedSparseLocalInhomogeneousMixtureDiffSM_higherOrder.PriorType t): Creates a new Slim model with given number of components and maximum distance. |
LimitedSparseLocalInhomogeneousMixtureDiffSM_higherOrder(StringBuffer xml): Creates a LimitedSparseLocalInhomogeneousMixtureDiffSM_higherOrder model from its XML representation. |
Modifier and Type | Method and Description |
---|---|
void | addGradientOfLogPriorTerm(double[] grad, int start): This method computes the gradient of DifferentiableStatisticalModel.getLogPriorTerm() for each parameter of this model. |
LimitedSparseLocalInhomogeneousMixtureDiffSM_higherOrder | clone(): Creates a clone (deep copy) of the current DifferentiableSequenceScore instance. |
void | fillInfixScore(int[] seq, int start, int length, double[] scores): Computes the position-wise scores of an infix of the sequence seq (which must be encoded by the same alphabet as this QuickScanningSequenceScore) beginning at start and extending for length positions. |
protected void | fromXML(StringBuffer xml): This method is called in the constructor for the Storable interface to create a scoring function from a StringBuffer. |
double[][] | getAncestorProbabilities(int component): Returns the probabilities that the preceding positions considered are used as context. |
double[][][] | getConditionalProbabilities(int component): Returns the conditional probabilities for the specified component. |
protected double[][] | getCum_Complex(int kmer) |
protected double[][] | getCum_Naive(int kmer) |
double[] | getCurrentParameterValues(): Returns a double array of dimension DifferentiableSequenceScore.getNumberOfParameters() containing the current parameter values. |
int | getDistance(): Returns the maximum distance of preceding positions considered in the LSlim model. |
double | getESS(): Returns the equivalent sample size (ess) of this model. |
String | getGraphviz(): Returns a Graphviz (dot) representation of the Slim model. |
boolean[][] | getInfixFilter(int kmer, double thresh, int... start): Computes arrays that indicate, for a given set of starting positions and a given k-mer length, if a sequence containing this k-mer may yield a score above threshold, choosing the best-scoring option among all non-specified positions (i.e., those outside the k-mer). |
protected void | getInfixScores(int s, int start, int l, int kmer, int[] seq, double[][] prefixScore, double[][] max) |
String | getInstanceName(): Should return a short instance name such as iMM(0), BN(2), ... |
double | getLogNormalizationConstant(): Returns the logarithm of the sum of the scores over all sequences of the event space. |
double | getLogPartialNormalizationConstant(int parameterIndex): Returns the logarithm of the partial normalization constant for the parameter with index parameterIndex. |
double | getLogPriorTerm(): This method computes a value that is proportional to DifferentiableStatisticalModel.getESS() * DifferentiableStatisticalModel.getLogNormalizationConstant() + Math.log( prior ). |
double | getLogScoreAndPartialDerivation(Sequence seq, int start, IntList indices, DoubleList partialDer) |
double | getLogScoreFor(Sequence seq, int start) |
double[][] | getMixtureProbabilities(): Returns the probabilities of the mixture components. |
int | getNumberOfParameters(): Returns the number of parameters in this DifferentiableSequenceScore. |
int | getNumberOfRecommendedStarts(): This method returns the number of recommended optimization starts. |
int | getOrder(): Returns the order of the Slim model. |
double[][] | getPWMParameters(): Returns the unconditional, normalized (PWM) probabilities of this Slim model. |
int | getSizeOfEventSpaceForRandomVariablesOfParameter(int index): Returns the size of the event space of the random variables that are affected by parameter no. index. |
void | initializeFunction(int index, boolean freeParams, DataSet[] data, double[][] weights): This method creates the underlying structure of the DifferentiableSequenceScore. |
void | initializeFunctionRandomly(boolean freeParams): This method initializes the DifferentiableSequenceScore randomly. |
boolean | isInitialized(): This method can be used to determine whether the instance is initialized. |
boolean | modify(int offsetLeft, int offsetRight): Manually modifies the model. |
void | set(int position, double[] pars): Sets the (conditional) probability parameters at a specific position and sets the mixture parameters (largely) to the unconditional PWM component. |
void | setParameters(double[] params, int start): This method sets the internal parameters to the values of params between start and start + DifferentiableSequenceScore.getNumberOfParameters() - 1. |
String | toString(NumberFormat nf): This method returns a String representation of the instance. |
StringBuffer | toXML(): This method returns an XML representation as StringBuffer of an instance of the implementing class. |
emitDataSet, getInitialClassParam, getLogProbFor, getLogProbFor, getLogProbFor, getLogScoreFor, getLogScoreFor, getMaximalMarkovOrder, isNormalized, isNormalized
getAlphabetContainer, getCharacteristics, getLength, getLogScoreAndPartialDerivation, getLogScoreAndPartialDerivation, getLogScoreFor, getLogScoreFor, getNumberOfStarts, getNumericalCharacteristics, toString
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
getAlphabetContainer, getCharacteristics, getLength, getLogScoreFor, getLogScoreFor, getLogScoreFor, getLogScoreFor, getNumericalCharacteristics
getLogScoreAndPartialDerivation, getLogScoreAndPartialDerivation
public LimitedSparseLocalInhomogeneousMixtureDiffSM_higherOrder(AlphabetContainer alphabets, int length, int order, int distance, double ess, double q, LimitedSparseLocalInhomogeneousMixtureDiffSM_higherOrder.PriorType t) throws IllegalArgumentException
Creates a new Slim model with given number of components and maximum distance.
Parameters:
alphabets - the alphabet of sequences the model is defined on
length - the length of the sequences that may be scored
order - the number of components, i.e., the number of preceding positions considered jointly
distance - the maximum distance of preceding positions considered
ess - the equivalent sample size
q - parameter q of the mixture prior, ignored for BDeu prior
t - the type of the prior
Throws:
IllegalArgumentException - if the ess or other parameters are not allowed
public LimitedSparseLocalInhomogeneousMixtureDiffSM_higherOrder(StringBuffer xml) throws NonParsableException
Creates a LimitedSparseLocalInhomogeneousMixtureDiffSM_higherOrder model from its XML representation.
Parameters:
xml - the XML representation
Throws:
NonParsableException - if XML could not be parsed
public int getOrder()
public int getDistance()
public LimitedSparseLocalInhomogeneousMixtureDiffSM_higherOrder clone() throws CloneNotSupportedException
Description copied from interface: DifferentiableSequenceScore
Creates a clone (deep copy) of the current DifferentiableSequenceScore instance.
Specified by: clone in interface DifferentiableSequenceScore
Specified by: clone in interface SequenceScore
Overrides: clone in class AbstractDifferentiableStatisticalModel
Throws:
CloneNotSupportedException - if something went wrong while cloning the DifferentiableSequenceScore
public int getSizeOfEventSpaceForRandomVariablesOfParameter(int index)
Description copied from interface: DifferentiableStatisticalModel
Returns the size of the event space of the random variables that are affected by parameter no. index, i.e. the product of the sizes of the alphabets at the position of each random variable affected by parameter index. For DNA alphabets this corresponds to 4 for a PWM, 16 for a WAM except position 0, ...
Specified by: getSizeOfEventSpaceForRandomVariablesOfParameter in interface DifferentiableStatisticalModel
Parameters:
index - the index of the parameter
public double getLogNormalizationConstant()
Description copied from interface: DifferentiableStatisticalModel
Returns the logarithm of the sum of the scores over all sequences of the event space.
Specified by: getLogNormalizationConstant in interface DifferentiableStatisticalModel
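To make "the logarithm of the sum of the scores over all sequences of the event space" concrete, here is a minimal brute-force sketch. It is not the Jstacs implementation: the class `LogNormalizationSketch`, the array `logScore[position][symbol]`, and the assumption that a sequence's log score is simply the sum of independent per-position log scores are all hypothetical choices made for illustration.

```java
/** Toy illustration only; not the Jstacs implementation. */
final class LogNormalizationSketch {

    /** Numerically stable log(sum_i exp(v[i])). */
    static double logSumExp(double[] v) {
        double max = Double.NEGATIVE_INFINITY;
        for (double x : v) {
            max = Math.max(max, x);
        }
        if (max == Double.NEGATIVE_INFINITY) {
            return max; // every score is zero, so the sum is zero
        }
        double sum = 0.0;
        for (double x : v) {
            sum += Math.exp(x - max);
        }
        return max + Math.log(sum);
    }

    /**
     * Enumerates every sequence of the event space and returns the log of
     * the sum of their scores. logScore[p][a] is the (assumed) log score
     * of symbol a at position p.
     */
    static double logNormalizationConstant(double[][] logScore) {
        int length = logScore.length;
        int alphabetSize = logScore[0].length;
        int numSequences = 1;
        for (int p = 0; p < length; p++) {
            numSequences *= alphabetSize;
        }
        double[] logScores = new double[numSequences];
        for (int idx = 0; idx < numSequences; idx++) {
            int rest = idx;
            double s = 0.0;
            // Decode idx into one symbol per position and add up the log scores.
            for (int p = length - 1; p >= 0; p--) {
                s += logScore[p][rest % alphabetSize];
                rest /= alphabetSize;
            }
            logScores[idx] = s;
        }
        return logSumExp(logScores);
    }

    public static void main(String[] args) {
        double[][] logScore = { { 0.0, 1.0 }, { 0.5, -0.5 } };
        System.out.println(logNormalizationConstant(logScore));
    }
}
```

Enumeration is exponential in the sequence length, so this only serves to check a faster computation on small inputs. For the position-independent toy score used here, the result must equal the sum of the per-position `logSumExp` values.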
public double getLogPartialNormalizationConstant(int parameterIndex) throws Exception
Description copied from interface: DifferentiableStatisticalModel
Returns the logarithm of the partial normalization constant for the parameter with index parameterIndex. This is the logarithm of the partial derivative of the normalization constant with respect to the parameter with index parameterIndex.
Specified by: getLogPartialNormalizationConstant in interface DifferentiableStatisticalModel
Parameters:
parameterIndex - the index of the parameter
Throws:
Exception - if something went wrong with the normalization
See Also: DifferentiableStatisticalModel.getLogNormalizationConstant()
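In symbols, the quantity described in words for getLogPartialNormalizationConstant can be written as follows. The notation is an assumption introduced here and does not come from the source: Z(λ) stands for the normalization constant (the sum of the scores over all sequences x of the event space), λ_i for the parameter with index i = parameterIndex, and s_λ(x) for the log score of x.

```latex
Z(\boldsymbol{\lambda}) = \sum_{x} \exp\bigl(s_{\boldsymbol{\lambda}}(x)\bigr),
\qquad
\texttt{getLogPartialNormalizationConstant}(i) = \log \frac{\partial Z(\boldsymbol{\lambda})}{\partial \lambda_{i}}
```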
public void initializeFunctionRandomly(boolean freeParams) throws Exception
Description copied from interface: DifferentiableSequenceScore
This method initializes the DifferentiableSequenceScore randomly. It has to create the underlying structure of the DifferentiableSequenceScore.
Specified by: initializeFunctionRandomly in interface DifferentiableSequenceScore
Parameters:
freeParams - indicates whether the (reduced) parameterization is used
Throws:
Exception - if something went wrong
public void initializeFunction(int index, boolean freeParams, DataSet[] data, double[][] weights) throws Exception
Description copied from interface: DifferentiableSequenceScore
This method creates the underlying structure of the DifferentiableSequenceScore.
Specified by: initializeFunction in interface DifferentiableSequenceScore
Parameters:
index - the index of the class the DifferentiableSequenceScore models
freeParams - indicates whether the (reduced) parameterization is used
data - the data sets
weights - the weights of the sequences in the data sets
Throws:
Exception - if something went wrong
public double getLogPriorTerm()
Description copied from interface: DifferentiableStatisticalModel
This method computes a value that is proportional to DifferentiableStatisticalModel.getESS() * DifferentiableStatisticalModel.getLogNormalizationConstant() + Math.log( prior ), where prior is the prior for the parameters of this model.
Specified by: getLogPriorTerm in interface DifferentiableStatisticalModel
Specified by: getLogPriorTerm in interface StatisticalModel
Returns: a value proportional to DifferentiableStatisticalModel.getESS() * DifferentiableStatisticalModel.getLogNormalizationConstant() + Math.log( prior )
See Also: DifferentiableStatisticalModel.getESS(), DifferentiableStatisticalModel.getLogNormalizationConstant()
public void addGradientOfLogPriorTerm(double[] grad, int start) throws Exception
Description copied from interface: DifferentiableStatisticalModel
This method computes the gradient of DifferentiableStatisticalModel.getLogPriorTerm() for each parameter of this model. The results are added to the array grad beginning at index start.
Specified by: addGradientOfLogPriorTerm in interface DifferentiableStatisticalModel
Parameters:
grad - the array of gradients
start - the start index in the grad array, where the partial derivatives for the parameters of this model shall be entered
Throws:
Exception - if something went wrong with the computing of the gradients
See Also: DifferentiableStatisticalModel.getLogPriorTerm()
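The phrase "added to the array grad beginning at index start" implies accumulation into a shared gradient array rather than overwriting it. The following sketch illustrates that contract only. `GradientOffsetSketch`, `addGradient`, and the `ownGradient` argument are hypothetical names, and the prior gradients are made-up numbers rather than anything computed by the Slim model.

```java
import java.util.Arrays;

/** Toy illustration of the "add at offset" contract; not the Jstacs implementation. */
final class GradientOffsetSketch {

    /** Adds ownGradient into grad[start .. start + ownGradient.length - 1]. */
    static void addGradient(double[] grad, int start, double[] ownGradient) {
        for (int i = 0; i < ownGradient.length; i++) {
            grad[start + i] += ownGradient[i]; // accumulate, do not overwrite
        }
    }

    public static void main(String[] args) {
        // A shared gradient that already holds likelihood contributions.
        double[] grad = { 1.0, 1.0, 1.0, 1.0, 1.0 };
        // A first model owns entries 0..1, a second model owns entries 2..4.
        addGradient(grad, 0, new double[] { 0.25, 0.25 });
        addGradient(grad, 2, new double[] { 0.5, -1.0, 2.0 });
        System.out.println(Arrays.toString(grad));
    }
}
```

Because each model only touches its own window of the array, several models can contribute to one gradient vector that an optimizer then consumes as a whole.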
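public double getESS()
Description copied from interface: DifferentiableStatisticalModel
Returns the equivalent sample size (ess) of this model.
Specified by: getESS in interface DifferentiableStatisticalModel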
public boolean[][] getInfixFilter(int kmer, double thresh, int... start)
Description copied from interface: QuickScanningSequenceScore
Computes arrays that indicate, for a given set of starting positions and a given k-mer length, if a sequence containing this k-mer may yield a score above threshold, choosing the best-scoring option among all non-specified positions (i.e., those outside the k-mer).
This method is implemented as an upper bound on the scores, i.e., there may be k-mers that are considered to score above threshold (i.e., that have entry true) although they do not, but there may not be k-mers that are considered to be below threshold (i.e., that have entry false) although there exist sequences containing this k-mer that score above it.
The returned array is indexed by the starting positions (in the same order as provided in start) in the first dimension, and in the second dimension it is indexed by an integer representation of the k-mers, assigning the highest priority to the first k-mer position (symbol codes as in DiscreteAlphabet).
Specified by: getInfixFilter in interface QuickScanningSequenceScore
Parameters:
kmer - the k-mer length
thresh - the threshold
start - the starting position(s)
protected double[][] getCum_Naive(int kmer)
protected double[][] getCum_Complex(int kmer)
protected void getInfixScores(int s, int start, int l, int kmer, int[] seq, double[][] prefixScore, double[][] max)
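The second dimension of the array returned by getInfixFilter is indexed by "an integer representation of the k-mers, assigning the highest priority to the first k-mer position". One plausible reading, sketched here, is a base-|alphabet| number in which the first symbol is the most significant digit. That reading is an assumption, as are the class `KmerIndexSketch`, its methods, and the DNA coding A=0, C=1, G=2, T=3.

```java
import java.util.Arrays;

/** Toy illustration of a k-mer index; an assumed reading, not the Jstacs implementation. */
final class KmerIndexSketch {

    /** Encodes a k-mer so that the first position is the most significant digit. */
    static int encode(int[] kmer, int alphabetSize) {
        int index = 0;
        for (int symbol : kmer) {
            index = index * alphabetSize + symbol;
        }
        return index;
    }

    /** Inverse of encode: recovers the k symbols from the index. */
    static int[] decode(int index, int k, int alphabetSize) {
        int[] kmer = new int[k];
        for (int i = k - 1; i >= 0; i--) {
            kmer[i] = index % alphabetSize;
            index /= alphabetSize;
        }
        return kmer;
    }

    public static void main(String[] args) {
        int[] acg = { 0, 1, 2 }; // "ACG" under the assumed coding A=0, C=1, G=2, T=3
        int index = encode(acg, 4);
        // Hypothetical filter for one starting position and all 4^3 = 64 3-mers.
        boolean[][] filter = new boolean[1][64];
        filter[0][index] = true;
        System.out.println(index + " " + Arrays.toString(decode(index, 3, 4)) + " " + filter[0][index]);
    }
}
```

With such an index, a scanner could look up `filter[i][encode(kmer, 4)]` for the i-th requested starting position and skip every window whose entry is false, since the filter is described as an upper bound on the achievable score.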
public double getLogScoreFor(Sequence seq, int start)
Specified by: getLogScoreFor in interface SequenceScore
Parameters:
seq - the Sequence
start - the start position in the Sequence
public double getLogScoreAndPartialDerivation(Sequence seq, int start, IntList indices, DoubleList partialDer)
Description copied from interface: DifferentiableSequenceScore
Returns the log score for the Sequence beginning at position start in the Sequence and fills lists with the indices and the partial derivatives.
Specified by: getLogScoreAndPartialDerivation in interface DifferentiableSequenceScore
Parameters:
seq - the Sequence
start - the start position in the Sequence
indices - an IntList of indices; after method invocation the list should contain the indices i of the parameters for which partialDer holds a partial derivative
partialDer - a DoubleList of partial derivatives; after method invocation the list should contain the corresponding partial derivatives
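The two lists filled by getLogScoreAndPartialDerivation form a sparse gradient: entry j of the index list names a parameter, and entry j of the derivative list holds the matching partial derivative. The sketch below shows this pattern for a toy log-linear score with one parameter per position and symbol. It is an illustration under assumptions, not the Slim model: `SparseDerivativeSketch`, the flat `lambda` layout, and the use of `java.util.List` in place of IntList and DoubleList are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy sparse-gradient illustration; not the Jstacs implementation. */
final class SparseDerivativeSketch {

    /**
     * Log score of seq[start .. start + length - 1] under a toy model where
     * lambda[p * alphabetSize + a] is the parameter for symbol a at position p.
     * Only the parameters that are actually used have a non-zero derivative
     * (here always 1.0), so only those are appended to the lists.
     */
    static double logScoreAndPartialDerivatives(double[] lambda, int alphabetSize,
            int[] seq, int start, int length,
            List<Integer> indices, List<Double> partialDer) {
        double logScore = 0.0;
        for (int p = 0; p < length; p++) {
            int idx = p * alphabetSize + seq[start + p];
            logScore += lambda[idx];
            indices.add(idx);
            partialDer.add(1.0);
        }
        return logScore;
    }

    public static void main(String[] args) {
        double[] lambda = { 0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7 }; // 2 positions x 4 symbols
        List<Integer> indices = new ArrayList<>();
        List<Double> partialDer = new ArrayList<>();
        double s = logScoreAndPartialDerivatives(lambda, 4, new int[] { 3, 2, 0 }, 1, 2, indices, partialDer);
        System.out.println(s + " " + indices + " " + partialDer);
    }
}
```

Reporting only the non-zero entries keeps the per-sequence cost proportional to the number of parameters a sequence touches rather than to the total number of parameters.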
public int getNumberOfParameters()
Description copied from interface: DifferentiableSequenceScore
Returns the number of parameters in this DifferentiableSequenceScore. If the number of parameters is not known yet, the method returns DifferentiableSequenceScore.UNKNOWN.
Specified by: getNumberOfParameters in interface DifferentiableSequenceScore
Returns: the number of parameters in this DifferentiableSequenceScore
See Also: DifferentiableSequenceScore.UNKNOWN
public double[] getCurrentParameterValues() throws Exception
Description copied from interface: DifferentiableSequenceScore
Returns a double array of dimension DifferentiableSequenceScore.getNumberOfParameters() containing the current parameter values. To use these parameters as the starting point of an optimization, it is highly recommended to invoke DifferentiableSequenceScore.initializeFunction(int, boolean, DataSet[], double[][]) first. After an optimization, this method can be used to get the current parameter values.
Specified by: getCurrentParameterValues in interface DifferentiableSequenceScore
Throws:
Exception - if no parameters exist (yet)
public void set(int position, double[] pars)
Sets the (conditional) probability parameters at a specific position and sets the mixture parameters (largely) to the unconditional PWM component.
Parameters:
position - the position
pars - the new parameters at this position
public void setParameters(double[] params, int start)
Description copied from interface: DifferentiableSequenceScore
This method sets the internal parameters to the values of params between start and start + DifferentiableSequenceScore.getNumberOfParameters() - 1.
Specified by: setParameters in interface DifferentiableSequenceScore
Parameters:
params - the new parameters
start - the start index in params
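The window semantics of setParameters (read exactly getNumberOfParameters() values, beginning at index start) pairs naturally with getCurrentParameterValues. The following is a minimal sketch of that round trip, assuming a hypothetical class `ParameterWindowSketch` with a plain `double[]` as its internal state. The real model stores structured parameters, so this only mirrors the indexing convention.

```java
import java.util.Arrays;

/** Toy illustration of the parameter window; not the Jstacs implementation. */
final class ParameterWindowSketch {

    private final double[] internal;

    ParameterWindowSketch(int numberOfParameters) {
        this.internal = new double[numberOfParameters];
    }

    int getNumberOfParameters() {
        return internal.length;
    }

    /** Copies params[start .. start + getNumberOfParameters() - 1] into the model. */
    void setParameters(double[] params, int start) {
        System.arraycopy(params, start, internal, 0, internal.length);
    }

    /** Returns a copy so that callers cannot modify the internal state. */
    double[] getCurrentParameterValues() {
        return internal.clone();
    }

    public static void main(String[] args) {
        ParameterWindowSketch model = new ParameterWindowSketch(3);
        // A joint parameter vector in which this model owns entries 2..4.
        double[] joint = { 9.0, 9.0, 1.0, 2.0, 3.0, 9.0 };
        model.setParameters(joint, 2);
        System.out.println(Arrays.toString(model.getCurrentParameterValues()));
    }
}
```

The offset lets an optimizer keep the parameters of several models in one vector and hand each model its own slice.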
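public String getInstanceName()
Description copied from interface: SequenceScore
Should return a short instance name such as iMM(0), BN(2), ...
Specified by: getInstanceName in interface SequenceScore
public boolean isInitialized()
Description copied from interface: SequenceScore
This method can be used to determine whether the instance is initialized. Once the instance is initialized, it should be possible to invoke SequenceScore.getLogScoreFor(Sequence).
Specified by: isInitialized in interface SequenceScore
Returns: true if the instance is initialized, false otherwise
public StringBuffer toXML()
Description copied from interface: Storable
This method returns an XML representation as StringBuffer of an instance of the implementing class.
protected void fromXML(StringBuffer xml) throws NonParsableException
Description copied from class: AbstractDifferentiableSequenceScore
This method is called in the constructor for the Storable interface to create a scoring function from a StringBuffer.
Specified by: fromXML in class AbstractDifferentiableSequenceScore
Parameters:
xml - the XML representation as StringBuffer
Throws:
NonParsableException - if the StringBuffer could not be parsed
See Also: AbstractDifferentiableSequenceScore.AbstractDifferentiableSequenceScore(StringBuffer)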
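public String toString(NumberFormat nf)
Description copied from interface: SequenceScore
This method returns a String representation of the instance.
Specified by: toString in interface SequenceScore
Parameters:
nf - the NumberFormat for the String representation of parameters or probabilities
Returns: a String representation of the instance
public double[][][] getConditionalProbabilities(int component) throws CloneNotSupportedException
Returns the conditional probabilities for the specified component.
Parameters:
component - the component
Throws:
CloneNotSupportedException - if the internal probabilities could not be cloned
public double[][] getPWMParameters() throws CloneNotSupportedException
Returns the unconditional, normalized (PWM) probabilities of this Slim model.
Throws:
CloneNotSupportedException - if the internal parameters could not be cloned
public double[][] getMixtureProbabilities() throws CloneNotSupportedException
Returns the probabilities of the mixture components.
Throws:
CloneNotSupportedException - if the internal parameters could not be cloned
public double[][] getAncestorProbabilities(int component)
Returns the probabilities that the preceding positions considered are used as context.
Parameters:
component - the component considered
public String getGraphviz()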
public int getNumberOfRecommendedStarts()
Description copied from interface: DifferentiableSequenceScore
This method returns the number of recommended optimization starts.
Specified by: getNumberOfRecommendedStarts in interface DifferentiableSequenceScore
Overrides: getNumberOfRecommendedStarts in class AbstractDifferentiableSequenceScore
public boolean modify(int offsetLeft, int offsetRight)
Description copied from interface: Mutable
Manually modifies the model. offsetLeft and offsetRight define how many positions the left or right border positions shall be moved. Negative numbers indicate moves to the left while positive numbers correspond to moves to the right.
public void fillInfixScore(int[] seq, int start, int length, double[] scores)
Description copied from interface: QuickScanningSequenceScore
Computes the position-wise scores of an infix of the sequence seq (which must be encoded by the same alphabet as this QuickScanningSequenceScore) beginning at start and extending for length positions. The scores are computed per position and filled into the provided array scores, which is of the same length as seq.
This must be implemented such that ToolBox.sum(double...) applied to scores computed from 0 to seq.length returns the same value as SequenceScore.getLogScoreFor(de.jstacs.data.sequences.Sequence) called on the IntSequence created from seq.
Specified by: fillInfixScore in interface QuickScanningSequenceScore
Parameters:
seq - the sequence
start - the start of the infix
length - the length of the infix
scores - the array of scores to be (partly) filled
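The contract stated for fillInfixScore (per-position scores that sum to the full log score when computed from 0 to seq.length) can be illustrated with a toy position-wise model. This is a sketch under assumptions, not the Slim scoring: `InfixScoreSketch`, its `logScore[position][symbol]` table, and the local `sum` helper (a stand-in for ToolBox.sum) are hypothetical.

```java
import java.util.Arrays;

/** Toy illustration of the fillInfixScore contract; not the Jstacs implementation. */
final class InfixScoreSketch {

    private final double[][] logScore; // assumed layout: [position][symbol]

    InfixScoreSketch(double[][] logScore) {
        this.logScore = logScore;
    }

    /** Fills scores[start .. start + length - 1]; all other entries stay untouched. */
    void fillInfixScore(int[] seq, int start, int length, double[] scores) {
        for (int p = start; p < start + length; p++) {
            scores[p] = logScore[p][seq[p]];
        }
    }

    /** Log score of the complete sequence. */
    double getLogScoreFor(int[] seq) {
        double total = 0.0;
        for (int p = 0; p < seq.length; p++) {
            total += logScore[p][seq[p]];
        }
        return total;
    }

    /** Local stand-in for a sum over a double array. */
    static double sum(double[] values) {
        double total = 0.0;
        for (double v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        InfixScoreSketch model = new InfixScoreSketch(new double[][] {
            { -1.0, -2.0 }, { -0.5, -3.0 }, { -2.5, -0.25 } });
        int[] seq = { 1, 0, 1 };
        double[] scores = new double[seq.length];
        model.fillInfixScore(seq, 0, seq.length, scores);
        System.out.println(Arrays.toString(scores) + " sums to " + sum(scores)
            + ", full score " + model.getLogScoreFor(seq));
    }
}
```

Filling only an infix lets a scanner recompute the positions affected by a change and reuse the remaining entries of `scores`.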