public class MixtureDiffSM extends AbstractMixtureDiffSM implements MutableMotifDiscoverer
Nested classes inherited: MotifDiscoverer.KindOfProfile
Inherited fields: componentScore, dList, freeParams, function, hiddenParameter, hiddenPotential, iList, logGammaSum, logHiddenNorm, logHiddenPotential, norm, optimizeHidden, paramRef, partNorm; alphabets, length, r; UNKNOWN
| Constructor and Description |
|---|
| MixtureDiffSM(int starts, boolean plugIn, DifferentiableStatisticalModel... component) - This constructor creates a new MixtureDiffSM. |
| MixtureDiffSM(StringBuffer xml) - This is the constructor for the interface Storable. |
| Modifier and Type | Method and Description |
|---|---|
| void | adjustHiddenParameters(int index, DataSet[] data, double[][] weights) - Adjusts all hidden parameters including duration and mixture parameters according to the current values of the remaining parameters. |
| MixtureDiffSM | clone() - Creates a clone (deep copy) of the current DifferentiableSequenceScore instance. |
| DataSet | emitDataSet(int numberOfSequences, int... seqLength) - This method returns a DataSet object containing artificial sequence(s). |
| protected void | fillComponentScores(Sequence seq, int start) - Fills the internal array AbstractMixtureDiffSM.componentScore with the logarithmic scores of the components given a Sequence. |
| double | getESS() - Returns the equivalent sample size (ess) of this model. |
| int | getGlobalIndexOfMotifInComponent(int component, int motif) - Returns the global index of the motif used in component. |
| double | getHyperparameterForHiddenParameter(int index) - This method returns the hyperparameter for the hidden parameter with index index. |
| int | getIndexOfMaximalComponentFor(Sequence sequence) - Returns the index of the component with the maximal score for the sequence sequence. |
| String | getInstanceName() - Should return a short instance name such as iMM(0), BN(2), ... |
| protected double | getLogNormalizationConstantForComponent(int i) - Computes the logarithm of the normalization constant for the component i. |
| double | getLogPartialNormalizationConstant(int parameterIndex) - Returns the logarithm of the partial normalization constant for the parameter with index parameterIndex. |
| double | getLogScoreAndPartialDerivation(Sequence seq, int start, IntList indices, DoubleList partialDer) |
| int | getMotifLength(int motif) - This method returns the length of the motif with index motif. |
| int | getNumberOfMotifs() - Returns the number of motifs for this MotifDiscoverer. |
| int | getNumberOfMotifsInComponent(int component) - Returns the number of motifs that are used in the component component of this MotifDiscoverer. |
| double[] | getProfileOfScoresFor(int component, int motif, Sequence sequence, int startpos, MotifDiscoverer.KindOfProfile kind) - Returns the profile of the scores for component component and motif motif at all possible start positions of the motif in the sequence sequence beginning at startpos. |
| double[] | getStrandProbabilitiesFor(int component, int motif, Sequence sequence, int startpos) - This method returns the probabilities of the strand orientations for a given subsequence if it is considered as site of the motif model in a specific component. |
| void | initializeMotif(int motifIndex, DataSet data, double[] weights) - This method allows initializing the model of a motif manually using a weighted data set. |
| void | initializeMotifRandomly(int motif) - This method initializes the motif with index motif randomly using for instance DifferentiableSequenceScore.initializeFunctionRandomly(boolean). |
| protected void | initializeUsingPlugIn(int index, boolean freeParams, DataSet[] data, double[][] weights) - This method initializes the functions using the data in some way. |
| boolean | modifyMotif(int motifIndex, int offsetLeft, int offsetRight) - Manually modifies the motif model with index motifIndex. |
| String | toString(NumberFormat nf) - This method returns a String representation of the instance. |
Inherited methods, grouped by declaring class or interface:
addGradientOfLogPriorTerm, cloneFunctions, computeHiddenParameter, computeLogGammaSum, determineIsNormalized, extractFurtherInformation, fromXML, getAPrioriMixtureProbabilities, getComponentScores, getCurrentParameterValues, getDifferentiableStatisticalModels, getFunction, getFunctions, getFurtherInformation, getIndexOfMaximalComponentFor, getIndices, getLogNormalizationConstant, getLogPriorTerm, getLogScoreFor, getNumberOfComponents, getNumberOfParameters, getNumberOfRecommendedStarts, getProbsForComponent, getSamplingGroups, getSizeOfEventSpaceForRandomVariablesOfParameter, getXMLTag, init, initializeFunction, initializeFunctionRandomly, initializeHiddenPotentialRandomly, initializeHiddenUniformly, initWithLength, isInitialized, isNormalized, precomputeNorm, setHiddenParameters, setParameters, setParametersForFunction, toXML
getInitialClassParam, getLogProbFor, getLogProbFor, getLogProbFor, getLogScoreFor, getLogScoreFor, getMaximalMarkovOrder, isNormalized
getAlphabetContainer, getCharacteristics, getLength, getLogScoreAndPartialDerivation, getLogScoreAndPartialDerivation, getLogScoreFor, getLogScoreFor, getNumberOfStarts, getNumericalCharacteristics, toString
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
getNumberOfComponents
getInitialClassParam, getLogScoreAndPartialDerivation, getLogScoreAndPartialDerivation
getLogProbFor, getLogProbFor, getLogProbFor, getMaximalMarkovOrder
getAlphabetContainer, getCharacteristics, getLength, getLogScoreFor, getLogScoreFor, getLogScoreFor, getLogScoreFor, getNumericalCharacteristics
public MixtureDiffSM(int starts, boolean plugIn, DifferentiableStatisticalModel... component) throws CloneNotSupportedException
This constructor creates a new MixtureDiffSM. The first component determines the length of the sequences that can be modeled.
Parameters:
starts - the number of starts that should be done in an optimization
plugIn - indicates whether the initial parameters for an optimization should be related to the data or randomly drawn
component - the DifferentiableStatisticalModels
Throws:
CloneNotSupportedException - if an element of component could not be cloned

public MixtureDiffSM(StringBuffer xml) throws NonParsableException
This is the constructor for the interface Storable. Creates a new MixtureDiffSM out of a StringBuffer.
Parameters:
xml - the XML representation as StringBuffer
Throws:
NonParsableException - if the XML representation could not be parsed

public MixtureDiffSM clone() throws CloneNotSupportedException
Description copied from interface: DifferentiableSequenceScore
Creates a clone (deep copy) of the current DifferentiableSequenceScore instance.
Specified by:
clone in interface MotifDiscoverer
clone in interface DifferentiableSequenceScore
clone in interface SequenceScore
Overrides:
clone in class AbstractMixtureDiffSM
Returns:
the cloned DifferentiableSequenceScore
Throws:
CloneNotSupportedException - if something went wrong while cloning the DifferentiableSequenceScore
See Also:
Cloneable
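The deep-copy contract of clone() can be illustrated with a minimal sketch. ToyComponent and ToyMixture are hypothetical stand-ins invented for this example, not Jstacs classes, and the sketch wraps CloneNotSupportedException instead of declaring it, which differs from the signature above.

```java
/** Hypothetical stand-in for a component model; not a Jstacs class. */
final class ToyComponent implements Cloneable {
    double[] params;

    ToyComponent(double... params) {
        this.params = params.clone();
    }

    @Override
    public ToyComponent clone() {
        try {
            ToyComponent copy = (ToyComponent) super.clone(); // shallow copy first
            copy.params = params.clone();                     // then copy the mutable array
            return copy;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError("unexpected: class implements Cloneable", e);
        }
    }
}

/** Hypothetical mixture that owns its components. */
final class ToyMixture implements Cloneable {
    ToyComponent[] components;

    ToyMixture(ToyComponent... components) {
        // Clone every argument so that the mixture never shares state with the caller.
        this.components = new ToyComponent[components.length];
        for (int i = 0; i < components.length; i++) {
            this.components[i] = components[i].clone();
        }
    }

    @Override
    public ToyMixture clone() {
        try {
            ToyMixture copy = (ToyMixture) super.clone();
            // A deep copy needs a new array AND new elements.
            copy.components = new ToyComponent[components.length];
            for (int i = 0; i < components.length; i++) {
                copy.components[i] = components[i].clone();
            }
            return copy;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError("unexpected: class implements Cloneable", e);
        }
    }
}
```

With this design, changing a parameter of the clone leaves the original instance untouched.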
protected double getLogNormalizationConstantForComponent(int i)
Description copied from class: AbstractMixtureDiffSM
Computes the logarithm of the normalization constant for the component i.
Specified by:
getLogNormalizationConstantForComponent in class AbstractMixtureDiffSM
Parameters:
i - the index of the component

public double getLogPartialNormalizationConstant(int parameterIndex) throws Exception
Description copied from interface: DifferentiableStatisticalModel
Returns the logarithm of the partial normalization constant for the parameter with index parameterIndex. This is the logarithm of the partial derivation of the normalization constant for the parameter with index parameterIndex.
Specified by:
getLogPartialNormalizationConstant in interface DifferentiableStatisticalModel
Parameters:
parameterIndex - the index of the parameter
Throws:
Exception - if something went wrong with the normalization
See Also:
DifferentiableStatisticalModel.getLogNormalizationConstant()
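One way to write the returned quantity as a formula is sketched below. The symbols are notation chosen for this illustration and do not come from the source: Z(λ) stands for the normalization constant whose logarithm getLogNormalizationConstant() returns, and λ_i for the parameter with index i = parameterIndex.

```latex
\log \frac{\partial Z(\lambda)}{\partial \lambda_i},
\qquad i = \texttt{parameterIndex}
```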
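public double getHyperparameterForHiddenParameter(int index)
Description copied from class: AbstractMixtureDiffSM
This method returns the hyperparameter for the hidden parameter with index index.
Specified by:
getHyperparameterForHiddenParameter in class AbstractMixtureDiffSM
Parameters:
index - the index of the hidden parameter

public double getESS()
Description copied from interface: DifferentiableStatisticalModel
Returns the equivalent sample size (ess) of this model.
Specified by:
getESS in interface DifferentiableStatisticalModel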
protected void initializeUsingPlugIn(int index, boolean freeParams, DataSet[] data, double[][] weights) throws Exception
Description copied from class: AbstractMixtureDiffSM
This method initializes the functions using the data in some way.
Specified by:
initializeUsingPlugIn in class AbstractMixtureDiffSM
Parameters:
index - the class index
freeParams - if true, the (reduced) parameterization is used
data - the data
weights - the weights for the data
Throws:
Exception - if the initialization could not be done
See Also:
DifferentiableSequenceScore.initializeFunction(int, boolean, DataSet[], double[][])
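public void adjustHiddenParameters(int index, DataSet[] data, double[][] weights) throws Exception
Adjusts all hidden parameters including duration and mixture parameters according to the current values of the remaining parameters.
Specified by:
adjustHiddenParameters in interface MutableMotifDiscoverer
Parameters:
index - the index of the class of this instance
data - the array of data for all classes
weights - the weights for all sequences in data
Throws:
Exception - thrown if the hidden parameters could not be adjusted

public String getInstanceName()
Description copied from interface: SequenceScore
Should return a short instance name such as iMM(0), BN(2), ...
Specified by:
getInstanceName in interface SequenceScore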
protected void fillComponentScores(Sequence seq, int start)
Description copied from class: AbstractMixtureDiffSM
Fills the internal array AbstractMixtureDiffSM.componentScore with the logarithmic scores of the components given a Sequence.
Specified by:
fillComponentScores in class AbstractMixtureDiffSM
Parameters:
seq - the sequence
start - the start position in seq
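How an array of logarithmic component scores might be combined into one mixture score can be sketched as follows. This is an illustration under assumptions, not the Jstacs implementation: the class MixtureScoreSketch, its method names, and the choice to add a log weight to each component's log score are all hypothetical.

```java
/** Hypothetical helper; illustrates combining log component scores, not Jstacs code. */
final class MixtureScoreSketch {

    /** Numerically stable log(sum_c exp(v[c])). */
    static double logSumExp(double[] v) {
        double max = Double.NEGATIVE_INFINITY;
        for (double x : v) {
            max = Math.max(max, x);
        }
        if (max == Double.NEGATIVE_INFINITY) {
            return max; // empty input or only zero terms in linear space
        }
        double sum = 0.0;
        for (double x : v) {
            sum += Math.exp(x - max); // every exponent is <= 0, so no overflow
        }
        return max + Math.log(sum);
    }

    /**
     * Fills componentScore[c] = logWeight[c] + logScore[c] (assumed layout)
     * and returns the log score of the whole mixture.
     */
    static double fillAndCombine(double[] logWeight, double[] logScore, double[] componentScore) {
        for (int c = 0; c < componentScore.length; c++) {
            componentScore[c] = logWeight[c] + logScore[c];
        }
        return logSumExp(componentScore);
    }

    /** Index of the largest entry, i.e. the best-scoring component. */
    static int indexOfMax(double[] v) {
        int best = 0;
        for (int c = 1; c < v.length; c++) {
            if (v[c] > v[best]) {
                best = c;
            }
        }
        return best;
    }
}
```

Subtracting the maximum before exponentiating keeps the sum finite even when all log scores are strongly negative, which is common for long sequences.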
public double getLogScoreAndPartialDerivation(Sequence seq, int start, IntList indices, DoubleList partialDer)
Description copied from interface: DifferentiableSequenceScore
Returns the logarithmic score for a Sequence beginning at position start in the Sequence and fills lists with the indices and the partial derivations.
Specified by:
getLogScoreAndPartialDerivation in interface DifferentiableSequenceScore
Parameters:
seq - the Sequence
start - the start position in the Sequence
indices - an IntList of indices; after method invocation the list should contain the indices i where the partial derivation of the log score with respect to parameter i is not zero
partialDer - a DoubleList of partial derivations; after method invocation the list should contain the corresponding partial derivations that are not zero
Returns:
the logarithmic score for the Sequence
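The pattern of returning a log score while filling two parallel lists with the non-zero partial derivations can be sketched for a mixture as follows. This sketch rests on several assumptions: it uses java.util.List in place of the Jstacs IntList and DoubleList, it ignores the derivatives with respect to the hidden (mixture) parameters, and the class name, the offset array, and the parameter layout are hypothetical. It uses the identity that the derivative of log(sum_c exp(s_c)) is the sum over components of p_c times the derivative of s_c, where p_c = exp(s_c - logScore).

```java
import java.util.List;

/** Hypothetical sketch of a sparse gradient for a log mixture score; not Jstacs code. */
final class SparseGradientSketch {

    /**
     * @param componentLogScore log score s_c of each component (weights already included)
     * @param localIndices      per component, indices of its non-zero partial derivations
     * @param localDerivatives  per component, the matching derivative values of s_c
     * @param offset            per component, position of its first parameter in the global vector
     * @param indices           output list of global parameter indices
     * @param partialDer        output list of the matching partial derivations
     * @return the log score of the mixture
     */
    static double logScoreAndPartialDerivation(
            double[] componentLogScore,
            int[][] localIndices,
            double[][] localDerivatives,
            int[] offset,
            List<Integer> indices,
            List<Double> partialDer) {
        // Stable log-sum-exp over the component scores.
        double max = Double.NEGATIVE_INFINITY;
        for (double s : componentLogScore) {
            max = Math.max(max, s);
        }
        double sum = 0.0;
        for (double s : componentLogScore) {
            sum += Math.exp(s - max);
        }
        double logScore = max + Math.log(sum);

        for (int c = 0; c < componentLogScore.length; c++) {
            // Posterior probability of component c given the sequence.
            double posterior = Math.exp(componentLogScore[c] - logScore);
            for (int k = 0; k < localIndices[c].length; k++) {
                double value = posterior * localDerivatives[c][k];
                if (value != 0.0) { // only non-zero entries are reported
                    indices.add(offset[c] + localIndices[c][k]);
                    partialDer.add(value);
                }
            }
        }
        return logScore;
    }
}
```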
public String toString(NumberFormat nf)
Description copied from interface: SequenceScore
This method returns a String representation of the instance.
Specified by:
toString in interface SequenceScore
Parameters:
nf - the NumberFormat for the String representation of parameters or probabilities
Returns:
a String representation of the instance

public void initializeMotif(int motifIndex, DataSet data, double[] weights) throws Exception
Description copied from interface: MutableMotifDiscoverer
This method allows initializing the model of a motif manually using a weighted data set.
Specified by:
initializeMotif in interface MutableMotifDiscoverer
Parameters:
motifIndex - the index of the motif in the motif discoverer
data - the data set of sequences
weights - either null or an array of length data.getNumberOfElements() with non-negative weights
Throws:
Exception - if the initialization was not possible

public void initializeMotifRandomly(int motif) throws Exception
Description copied from interface: MutableMotifDiscoverer
This method initializes the motif with index motif randomly using for instance DifferentiableSequenceScore.initializeFunctionRandomly(boolean). Furthermore, if available, it also initializes the positional distribution.
Specified by:
initializeMotifRandomly in interface MutableMotifDiscoverer
Parameters:
motif - the index of the motif
Throws:
Exception - either if the index is wrong or if it is thrown by the method DifferentiableSequenceScore.initializeFunctionRandomly(boolean)
public boolean modifyMotif(int motifIndex, int offsetLeft, int offsetRight) throws Exception
Description copied from interface: MutableMotifDiscoverer
Manually modifies the motif model with index motifIndex. The two offsets offsetLeft and offsetRight define how many positions the left or right border positions shall be moved. Negative numbers indicate moves to the left while positive numbers correspond to moves to the right. The distribution for sequences to the left and right side of the motif shall be computed internally.
Specified by:
modifyMotif in interface MutableMotifDiscoverer
Parameters:
motifIndex - the index of the motif in the motif discoverer
offsetLeft - the offset on the left side
offsetRight - the offset on the right side
Returns:
true if the motif model was modified otherwise false
Throws:
Exception - if some unexpected error occurred during the modification
See Also:
MutableMotifDiscoverer.modifyMotif(int, int, int), Mutable.modify(int, int)
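The sign convention for the two offsets can be made concrete with a small sketch. The class MotifBorderSketch and its methods are hypothetical helpers for this illustration. They only do the border arithmetic described above and do not touch any motif model; the rejection of offsets that leave an empty motif is an assumption of the sketch.

```java
/** Hypothetical helper illustrating the offset arithmetic; not Jstacs code. */
final class MotifBorderSketch {

    /**
     * Moves both borders of a motif occupying [start, end) and
     * returns the new borders as {newStart, newEnd}.
     */
    static int[] shiftBorders(int start, int end, int offsetLeft, int offsetRight) {
        int newStart = start + offsetLeft; // negative offsetLeft moves the left border to the left
        int newEnd = end + offsetRight;    // positive offsetRight moves the right border to the right
        if (newEnd - newStart <= 0) {
            throw new IllegalArgumentException("offsets would leave an empty motif");
        }
        return new int[] { newStart, newEnd };
    }

    /** Length of the motif after the borders were moved. */
    static int newLength(int oldLength, int offsetLeft, int offsetRight) {
        return oldLength - offsetLeft + offsetRight;
    }
}
```

For example, offsetLeft = -2 and offsetRight = 1 extend a motif of length 8 by two positions on the left and one on the right, giving length 11, while offsetLeft = 1 and offsetRight = -1 shrink it to length 6.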
public int getGlobalIndexOfMotifInComponent(int component, int motif)
Description copied from interface: MotifDiscoverer
Returns the global index of the motif used in component. The index returned must be at least 0 and less than MotifDiscoverer.getNumberOfMotifs().
Specified by:
getGlobalIndexOfMotifInComponent in interface MotifDiscoverer
Parameters:
component - the component index
motif - the motif index in the component
Returns:
the global index of motif in component
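One plausible numbering scheme that satisfies the stated range constraint is sketched below. It assumes that components do not share motifs, so the global index is the local index plus the number of motifs in all earlier components. Whether MixtureDiffSM numbers motifs this way is not stated in the source; MotifIndexSketch and the motifsPerComponent array are hypothetical.

```java
/** Hypothetical index mapping; assumes components do not share motifs. */
final class MotifIndexSketch {

    /** Total number of motifs over all components. */
    static int numberOfMotifs(int[] motifsPerComponent) {
        int total = 0;
        for (int n : motifsPerComponent) {
            total += n;
        }
        return total;
    }

    /** Global index = motifs in all earlier components + local motif index. */
    static int globalIndex(int[] motifsPerComponent, int component, int motif) {
        if (component < 0 || component >= motifsPerComponent.length) {
            throw new IndexOutOfBoundsException("component " + component);
        }
        if (motif < 0 || motif >= motifsPerComponent[component]) {
            throw new IndexOutOfBoundsException("motif " + motif);
        }
        int global = motif;
        for (int c = 0; c < component; c++) {
            global += motifsPerComponent[c];
        }
        return global;
    }
}
```

By construction the result is at least 0 and less than numberOfMotifs(motifsPerComponent), matching the constraint in the description.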
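public int getIndexOfMaximalComponentFor(Sequence sequence) throws Exception
Description copied from interface: MotifDiscoverer
Returns the index of the component with the maximal score for the sequence sequence.
Specified by:
getIndexOfMaximalComponentFor in interface MotifDiscoverer
Parameters:
sequence - the given sequence
Throws:
Exception - if the index could not be computed for any reason

public int getMotifLength(int motif)
Description copied from interface: MotifDiscoverer
This method returns the length of the motif with index motif.
Specified by:
getMotifLength in interface MotifDiscoverer
Parameters:
motif - the index of the motif
Returns:
the length of the motif with index motif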
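public int getNumberOfMotifs()
Description copied from interface: MotifDiscoverer
Returns the number of motifs for this MotifDiscoverer.
Specified by:
getNumberOfMotifs in interface MotifDiscoverer

public int getNumberOfMotifsInComponent(int component)
Description copied from interface: MotifDiscoverer
Returns the number of motifs that are used in the component component of this MotifDiscoverer.
Specified by:
getNumberOfMotifsInComponent in interface MotifDiscoverer
Parameters:
component - the component of the MotifDiscoverer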
public double[] getProfileOfScoresFor(int component, int motif, Sequence sequence, int startpos, MotifDiscoverer.KindOfProfile kind) throws Exception
Description copied from interface: MotifDiscoverer
Returns the profile of the scores for component component and motif motif at all possible start positions of the motif in the sequence sequence beginning at startpos. This array should be of length sequence.length() - startpos - motifs[motif].getLength() + 1.
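The length formula can be checked with a short sketch. ProfileSketch is a hypothetical helper, the scorer passed in is a placeholder for whatever score the chosen kind of profile would produce, and clamping the length at 0 for motifs that do not fit is an assumption added here.

```java
import java.util.function.IntToDoubleFunction;

/** Hypothetical helper for the profile length arithmetic; not Jstacs code. */
final class ProfileSketch {

    /** Number of start positions at which a motif of motifLength still fits. */
    static int profileLength(int sequenceLength, int startpos, int motifLength) {
        return Math.max(0, sequenceLength - startpos - motifLength + 1);
    }

    /** Evaluates a caller-supplied scorer at every admissible start position. */
    static double[] profile(int sequenceLength, int startpos, int motifLength,
                            IntToDoubleFunction scoreAt) {
        double[] result = new double[profileLength(sequenceLength, startpos, motifLength)];
        for (int i = 0; i < result.length; i++) {
            result[i] = scoreAt.applyAsDouble(startpos + i); // motif starts at startpos + i
        }
        return result;
    }
}
```

For a sequence of length 20, startpos 5, and a motif of length 8, the profile has 20 - 5 - 8 + 1 = 8 entries, covering start positions 5 through 12.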
Specified by:
getProfileOfScoresFor in interface MotifDiscoverer
Parameters:
component - the component index
motif - the index of the motif in the component
sequence - the given sequence
startpos - the start position in the sequence
kind - indicates the kind of profile
Throws:
Exception - if the score could not be computed for any reason

public double[] getStrandProbabilitiesFor(int component, int motif, Sequence sequence, int startpos) throws Exception
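Description copied from interface: MotifDiscoverer
This method returns the probabilities of the strand orientations for a given subsequence if it is considered as site of the motif model in a specific component.
Specified by:
getStrandProbabilitiesFor in interface MotifDiscoverer
Parameters:
component - the component index
motif - the index of the motif in the component
sequence - the given sequence
startpos - the start position in the sequence
Throws:
Exception - if the strand could not be computed for any reason

public DataSet emitDataSet(int numberOfSequences, int... seqLength) throws NotTrainedException, Exception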
Description copied from interface: StatisticalModel
This method returns a DataSet object containing artificial sequence(s).
emitDataSet( int n, int l ) should return a data set with n sequences of length l.
emitDataSet( int n, int[] l ) should return a data set with n sequences which have a sequence length corresponding to the entry in the given array l.
emitDataSet( int n ) and emitDataSet( int n, null ) should return a data set with n sequences of length of the model (SequenceScore.getLength()).
The standard implementation throws an Exception.
Specified by:
emitDataSet in interface StatisticalModel
Overrides:
emitDataSet in class AbstractDifferentiableStatisticalModel
Parameters:
numberOfSequences - the number of sequences that should be contained in the returned data set
seqLength - the length of the sequences for a homogeneous model; for an inhomogeneous model this parameter should be null or an array of size 0.
Returns:
a DataSet containing the artificial sequence(s)
Throws:
NotTrainedException - if the model is not trained yet
Exception - if the emission did not succeed
See Also:
DataSet
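How the varargs parameter seqLength could be resolved into one length per sequence is sketched below. EmitLengthSketch and resolveLengths are hypothetical and only cover the three call shapes listed in the description. The sketch assumes that a model length of 0 marks a homogeneous model, and it does not check explicit lengths against the model length.

```java
import java.util.Arrays;

/** Hypothetical helper that resolves the varargs lengths; not Jstacs code. */
final class EmitLengthSketch {

    /**
     * @param modelLength       length of the model; 0 is assumed to mean "homogeneous"
     * @param numberOfSequences number of sequences to emit
     * @param seqLength         nothing, null, one length, or one length per sequence
     * @return the length of each sequence to emit
     */
    static int[] resolveLengths(int modelLength, int numberOfSequences, int... seqLength) {
        int[] result = new int[numberOfSequences];
        if (seqLength == null || seqLength.length == 0) {
            // emitDataSet(n) or emitDataSet(n, null): use the length of the model.
            if (modelLength <= 0) {
                throw new IllegalArgumentException("a homogeneous model needs explicit lengths");
            }
            Arrays.fill(result, modelLength);
        } else if (seqLength.length == 1) {
            // emitDataSet(n, l): all sequences share one length.
            Arrays.fill(result, seqLength[0]);
        } else if (seqLength.length == numberOfSequences) {
            // emitDataSet(n, int[] l): one entry per sequence.
            System.arraycopy(seqLength, 0, result, 0, numberOfSequences);
        } else {
            throw new IllegalArgumentException(
                    "expected 1 or " + numberOfSequences + " lengths, got " + seqLength.length);
        }
        return result;
    }
}
```

A call such as resolveLengths(10, 3) yields three sequences of length 10, whereas resolveLengths(0, 3, 4, 5, 6) yields lengths 4, 5, and 6.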