java.lang.Object
  de.jstacs.sequenceScores.statisticalModels.trainable.AbstractTrainableStatisticalModel
    de.jstacs.sequenceScores.statisticalModels.trainable.mixture.AbstractMixtureTrainSM
      de.jstacs.sequenceScores.statisticalModels.trainable.mixture.motif.HiddenMotifMixture
public abstract class HiddenMotifMixture extends AbstractMixtureTrainSM implements MotifDiscoverer
This is the main class that every generative motif discoverer should extend.
Nested Class Summary |
---|
Nested classes/interfaces inherited from class de.jstacs.sequenceScores.statisticalModels.trainable.mixture.AbstractMixtureTrainSM |
---|
AbstractMixtureTrainSM.Algorithm, AbstractMixtureTrainSM.Parameterization |
Nested classes/interfaces inherited from interface de.jstacs.motifDiscovery.MotifDiscoverer |
---|
MotifDiscoverer.KindOfProfile |
Field Summary | |
---|---|
protected PositionPrior |
posPrior
The prior for the positions. |
Fields inherited from class de.jstacs.sequenceScores.statisticalModels.trainable.mixture.AbstractMixtureTrainSM |
---|
algorithm, algorithmHasBeenRun, alternativeModel, best, burnInTest, componentHyperParams, compProb, counter, dimension, estimateComponentProbs, file, filereader, filewriter, initialIteration, logWeights, model, optimizeModel, sample, samplingIndex, seqWeights, sostream, starts, stationaryIteration, weights |
Fields inherited from class de.jstacs.sequenceScores.statisticalModels.trainable.AbstractTrainableStatisticalModel |
---|
alphabets, length |
Constructor Summary | |
---|---|
protected |
HiddenMotifMixture(StringBuffer xml)
The standard constructor for the interface Storable. |
protected |
HiddenMotifMixture(TrainableStatisticalModel[] models,
boolean[] optimzeArray,
int components,
int starts,
boolean estimateComponentProbs,
double[] componentHyperParams,
double[] weights,
PositionPrior posPrior,
AbstractMixtureTrainSM.Algorithm algorithm,
double alpha,
TerminationCondition tc,
AbstractMixtureTrainSM.Parameterization parametrization,
int initialIteration,
int stationaryIteration,
BurnInTest burnInTest)
Creates a new HiddenMotifMixture. |
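The constructor detail lists conditions on weights, starts and componentHyperParams under which an IllegalArgumentException is thrown. A minimal, self-contained sketch of such checks might look like the following. The class ArgumentChecks and its methods are hypothetical and not part of Jstacs. The documented check compares weights.length against 2, whereas the sketch takes a dimension argument, and the rule applied to componentHyperParams (null, or one non-negative value per component) is only an assumption about what "not correct" means.

```java
// Hypothetical helper, not part of Jstacs; it mirrors the documented argument checks.
final class ArgumentChecks {

    // Throws IllegalArgumentException if one of the documented conditions is violated.
    static void check(int starts, int dimension, double[] weights, double[] componentHyperParams) {
        if (starts < 1) {
            throw new IllegalArgumentException("starts must be at least 1");
        }
        if (weights != null) {
            if (weights.length != dimension) {
                throw new IllegalArgumentException("weights must have one entry per component");
            }
            for (double w : weights) {
                if (w < 0) {
                    throw new IllegalArgumentException("weights must not be negative");
                }
            }
        }
        // Assumption: hyperparameters are null or one non-negative value per component.
        if (componentHyperParams != null) {
            if (componentHyperParams.length != dimension) {
                throw new IllegalArgumentException("one hyperparameter per component expected");
            }
            for (double h : componentHyperParams) {
                if (h < 0) {
                    throw new IllegalArgumentException("hyperparameters must not be negative");
                }
            }
        }
    }

    // Convenience wrapper that reports the outcome as a boolean instead of an exception.
    static boolean isValid(int starts, int dimension, double[] weights, double[] componentHyperParams) {
        try {
            check(starts, dimension, weights, componentHyperParams);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid(1, 2, new double[] {0.3, 0.7}, null));
        System.out.println(isValid(0, 2, null, null));
    }
}
```

Running the checks before any model is cloned or stored keeps a rejected call free of side effects.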
Method Summary | |
---|---|
protected void |
checkLength(int index,
int l)
This method checks whether the length l of the model with index
index is suitable for the current instance. |
HiddenMotifMixture |
clone()
Follows the conventions of Object's clone()-method. |
protected Sequence[] |
emitDataSetUsingCurrentParameterSet(int n,
int... lengths)
Standard implementation throwing an OperationNotSupportedException. |
protected void |
extractFurtherInformation(StringBuffer xml)
This method is used in the subclasses to extract further information from the XML representation and to set these as values of the instance. |
protected StringBuffer |
getFurtherInformation()
This method is used in the subclasses to append further information to the XML representation. |
String |
getInstanceName()
Should return a short instance name such as iMM(0), BN(2), ... |
abstract int |
getMinimalSequenceLength()
Returns the minimal length that a sequence or a data set has to have. |
protected void |
getNewParameters(int iteration,
double[][] seqWeights,
double[] w)
This method trains the internal models on the internal data set and the given weights. |
String |
toString(NumberFormat nf)
This method returns a String representation of the instance. |
void |
train(DataSet data,
double[] weights)
Trains the TrainableStatisticalModel object given the data as DataSet using
the specified weights. |
abstract void |
trainBgModel(DataSet data,
double[] weights)
This method trains the background model. |
Methods inherited from class de.jstacs.sequenceScores.statisticalModels.trainable.AbstractTrainableStatisticalModel |
---|
check, getAlphabetContainer, getLength, getLogProbFor, getLogProbFor, getLogScoreFor, getLogScoreFor, getLogScoreFor, getLogScoreFor, getMaximalMarkovOrder, toString, train |
Methods inherited from class java.lang.Object |
---|
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Methods inherited from interface de.jstacs.motifDiscovery.MotifDiscoverer |
---|
getGlobalIndexOfMotifInComponent, getIndexOfMaximalComponentFor, getMotifLength, getNumberOfComponents, getNumberOfMotifs, getNumberOfMotifsInComponent, getProfileOfScoresFor, getStrandProbabilitiesFor |
Methods inherited from interface de.jstacs.Storable |
---|
toXML |
Field Detail |
---|
protected PositionPrior posPrior
The prior for the positions.
Constructor Detail |
---|
protected HiddenMotifMixture(TrainableStatisticalModel[] models, boolean[] optimzeArray, int components, int starts, boolean estimateComponentProbs, double[] componentHyperParams, double[] weights, PositionPrior posPrior, AbstractMixtureTrainSM.Algorithm algorithm, double alpha, TerminationCondition tc, AbstractMixtureTrainSM.Parameterization parametrization, int initialIteration, int stationaryIteration, BurnInTest burnInTest) throws CloneNotSupportedException, IllegalArgumentException, WrongAlphabetException
Creates a new HiddenMotifMixture. This constructor can be used for any algorithm since it takes all necessary values as parameters.
Parameters:
models - the single models building the HiddenMotifMixture; if the model is trained using AbstractMixtureTrainSM.Algorithm.GIBBS_SAMPLING, the models that will be adjusted have to implement SamplingComponent. The models that are used for the flanking sequences have to be able to score sequences of arbitrary length.
optimzeArray - an array of switches indicating whether or not to train the corresponding model
components - the number of components (e.g. for ZOOPS this is 2)
starts - the number of times the algorithm will be started in the train-method, at least 1
estimateComponentProbs - the switch for estimating the component probabilities in the algorithm or holding them fixed; if the component parameters are fixed, the values of weights will be used, otherwise the componentHyperParams will be incorporated in the adjustment
componentHyperParams - the hyperparameters for the component assignment prior; relevant if estimateComponentProbs == true; has to be null or has to have length dimension; if null or an array with all values zero (0), then ML is used (see parametrization)
weights - null or the weights for the components (then weights.length == dimension)
posPrior - this object determines the positional distribution that shall be used
algorithm - either AbstractMixtureTrainSM.Algorithm.EM or AbstractMixtureTrainSM.Algorithm.GIBBS_SAMPLING
alpha - only for AbstractMixtureTrainSM.Algorithm.EM; used in train to initialize the gammas. It is recommended to use alpha = 1 (uniform distribution on a simplex).
tc - only for AbstractMixtureTrainSM.Algorithm.EM; the TerminationCondition for stopping the EM-algorithm, tc has to return true from TerminationCondition.isSimple()
parametrization - only for AbstractMixtureTrainSM.Algorithm.EM; either AbstractMixtureTrainSM.Parameterization.THETA or AbstractMixtureTrainSM.Parameterization.LAMBDA
initialIteration - only for AbstractMixtureTrainSM.Algorithm.GIBBS_SAMPLING (... stationaryIteration/starts)
stationaryIteration - only for AbstractMixtureTrainSM.Algorithm.GIBBS_SAMPLING
burnInTest - only for AbstractMixtureTrainSM.Algorithm.GIBBS_SAMPLING
Throws:
IllegalArgumentException - if weights != null && weights.length != 2, if weights != null and there exists an i where weights[i] < 0, if starts < 1, or if the componentHyperParams are not correct
WrongAlphabetException - if not all models work on the same simple alphabet
CloneNotSupportedException - if the models can not be cloned
protected HiddenMotifMixture(StringBuffer xml) throws NonParsableException
The standard constructor for the interface Storable. Creates a new HiddenMotifMixture out of its XML representation.
Parameters:
xml - the XML representation of the model as StringBuffer
Throws:
NonParsableException - if the StringBuffer can not be parsed
Method Detail |
---|
public HiddenMotifMixture clone() throws CloneNotSupportedException
Description copied from class: AbstractTrainableStatisticalModel
Follows the conventions of Object's clone()-method.
clone in interface MotifDiscoverer
clone in interface SequenceScore
clone in interface TrainableStatisticalModel
clone in class AbstractMixtureTrainSM
Returns:
a clone of the current AbstractTrainableStatisticalModel (the member-AlphabetContainer isn't deeply cloned since it is assumed to be immutable). The type of the returned object is defined by the class X directly inherited from AbstractTrainableStatisticalModel. Hence X's clone()-method should work as:
1. Object o = (X)super.clone();
2. all members of o defined by X that are not of simple data-types like int, double, ... have to be deeply copied
3. return o
Throws:
CloneNotSupportedException - if something went wrong while cloning
See Also:
Cloneable
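The three-step convention described for clone() can be sketched with two small stand-in classes. BaseModel, MotifModel and CloneSketch are hypothetical names chosen for illustration and do not extend any Jstacs type; MotifModel plays the role of the class X. The sketch declares the local variable with the subclass type instead of Object so that its members can be reassigned.

```java
// Hypothetical classes for illustration only; they do not extend any Jstacs type.
class BaseModel implements Cloneable {
    protected int length = 10; // simple data type, copied by Object.clone()

    @Override
    public BaseModel clone() throws CloneNotSupportedException {
        return (BaseModel) super.clone();
    }
}

// Plays the role of the class X from the clone() convention.
class MotifModel extends BaseModel {
    double[] weights = {0.25, 0.75}; // not a simple data type, needs a deep copy

    @Override
    public MotifModel clone() throws CloneNotSupportedException {
        MotifModel o = (MotifModel) super.clone(); // 1. let the superclass create the copy
        o.weights = weights.clone();               // 2. deeply copy non-simple members
        return o;                                  // 3. return the copy
    }
}

final class CloneSketch {
    // Convenience wrapper that turns the checked exception into an unchecked one.
    static MotifModel copyOf(MotifModel m) {
        try {
            return m.clone();
        } catch (CloneNotSupportedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        MotifModel original = new MotifModel();
        MotifModel copy = copyOf(original);
        copy.weights[0] = 0.9; // changes the copy only
        System.out.println(original.weights[0] + " " + copy.weights[0]);
    }
}
```

Without step 2 both instances would share the same array, so training one model would silently change the other.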
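protected StringBuffer getFurtherInformation()
Description copied from class: AbstractMixtureTrainSM
This method is used in the subclasses to append further information to the XML representation.
getFurtherInformation in class AbstractMixtureTrainSM
See Also:
AbstractMixtureTrainSM.extractFurtherInformation(StringBuffer)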
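protected void extractFurtherInformation(StringBuffer xml) throws NonParsableException
Description copied from class: AbstractMixtureTrainSM
This method is used in the subclasses to extract further information from the XML representation and to set these as values of the instance.
extractFurtherInformation in class AbstractMixtureTrainSM
Parameters:
xml - the XML representation
Throws:
NonParsableException - if the XML representation is not parsable
See Also:
AbstractMixtureTrainSM.getFurtherInformation()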
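getFurtherInformation() and extractFurtherInformation(StringBuffer) form a pair: what one writes, the other has to read back. The following sketch illustrates that symmetry under stated assumptions. The class FurtherInfoSketch, the field minLength and the tag name are hypothetical, plain string handling stands in for the library's XML utilities, and an IllegalArgumentException replaces the documented NonParsableException.

```java
// Hypothetical stand-in for a subclass; plain string handling replaces the library's XML utilities.
final class FurtherInfoSketch {
    private static final String OPEN = "<minLength>";   // hypothetical tag
    private static final String CLOSE = "</minLength>";

    int minLength;

    FurtherInfoSketch(int minLength) {
        this.minLength = minLength;
    }

    // Counterpart of getFurtherInformation(): returns the extra state as XML.
    StringBuffer getFurtherInformation() {
        StringBuffer xml = new StringBuffer();
        xml.append(OPEN).append(minLength).append(CLOSE);
        return xml;
    }

    // Counterpart of extractFurtherInformation(xml): reads the same tag and restores the field.
    void extractFurtherInformation(StringBuffer xml) {
        int start = xml.indexOf(OPEN);
        int end = xml.indexOf(CLOSE);
        if (start < 0 || end < start) {
            throw new IllegalArgumentException("tag " + OPEN + " not found");
        }
        minLength = Integer.parseInt(xml.substring(start + OPEN.length(), end).trim());
    }

    public static void main(String[] args) {
        FurtherInfoSketch written = new FurtherInfoSketch(12);
        FurtherInfoSketch restored = new FurtherInfoSketch(0);
        restored.extractFurtherInformation(written.getFurtherInformation());
        System.out.println(restored.minLength);
    }
}
```

Keeping the tag names in shared constants makes it harder for the writing and the reading side to drift apart.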
public void train(DataSet data, double[] weights) throws Exception
Description copied from interface: TrainableStatisticalModel
Trains the TrainableStatisticalModel object given the data as DataSet using the specified weights. The weight at position i belongs to the element at position i. So the array weights should have the number of sequences in the data set as dimension. (Optionally it is possible to use weights == null if all weights have the value one.)
The result of the call sequence train(data1); train(data2) should be a fully trained model over data2 and not over data1+data2. All parameters of the model were given by the call of the constructor.
train in interface TrainableStatisticalModel
train in class AbstractMixtureTrainSM
Parameters:
data - the given sequences as DataSet
weights - the weights of the elements, each weight should be non-negative
Throws:
Exception - if the training did not succeed (e.g. the dimension of weights and the number of sequences in the data set do not match)
See Also:
DataSet.getElementAt(int), DataSet.ElementEnumerator
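The contract of train can be illustrated with a toy model that counts weighted symbol frequencies. WeightedFrequencyModel is a hypothetical class that uses plain strings instead of a DataSet and throws an IllegalArgumentException where the real method declares Exception. It only mimics three documented points: null weights count as one, the number of weights has to match the number of sequences, and each call to train discards the result of the previous call.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy model; it only mimics the documented contract of train(data, weights).
final class WeightedFrequencyModel {
    private final Map<Character, Double> counts = new HashMap<>();
    private double total;

    void train(String[] data, double[] weights) {
        // Validate first so that a rejected call leaves the old state untouched.
        if (weights != null) {
            if (weights.length != data.length) {
                throw new IllegalArgumentException("one weight per sequence expected");
            }
            for (double w : weights) {
                if (w < 0) {
                    throw new IllegalArgumentException("weights must be non-negative");
                }
            }
        }
        // Non-incremental training: forget everything learned in earlier calls.
        counts.clear();
        total = 0;
        for (int i = 0; i < data.length; i++) {
            double w = (weights == null) ? 1.0 : weights[i]; // null means weight one
            for (char c : data[i].toCharArray()) {
                counts.merge(c, w, Double::sum);
                total += w;
            }
        }
    }

    // Relative weighted frequency of a symbol, or 0 for an untrained model.
    double probabilityOf(char c) {
        return total == 0 ? 0.0 : counts.getOrDefault(c, 0.0) / total;
    }

    public static void main(String[] args) {
        WeightedFrequencyModel model = new WeightedFrequencyModel();
        model.train(new String[] {"AAAA"}, null);
        model.train(new String[] {"CC", "CG"}, null);
        System.out.println(model.probabilityOf('A') + " " + model.probabilityOf('C'));
    }
}
```

After the second call the model reflects only the second data set, which is the behavior the description asks for.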
protected void getNewParameters(int iteration, double[][] seqWeights, double[] w) throws Exception
Description copied from class: AbstractMixtureTrainSM
This method trains the internal models on the internal data set and the given weights.
getNewParameters in class AbstractMixtureTrainSM
Parameters:
iteration - the number of times this method has been invoked
seqWeights - the weights for each model and sequence
w - the weights for the components
Throws:
Exception - if the training of the internal models went wrong
public abstract void trainBgModel(DataSet data, double[] weights) throws Exception
This method trains the background model.
Parameters:
data - the data set
weights - the weights
Throws:
Exception - if something went wrong
protected void checkLength(int index, int l)
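Description copied from class: AbstractMixtureTrainSM
This method checks whether the length l of the model with index index is suitable for the current instance. Otherwise an IllegalArgumentException is thrown.
checkLength in class AbstractMixtureTrainSM
Parameters:
index - the index of the model
l - the length of the model
public abstract int getMinimalSequenceLength()
Returns the minimal length that a sequence or a data set has to have.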
public String getInstanceName()
Description copied from interface: SequenceScore
Should return a short instance name such as iMM(0), BN(2), ...
getInstanceName in interface SequenceScore
getInstanceName in class AbstractMixtureTrainSM
public String toString(NumberFormat nf)
Description copied from interface: SequenceScore
This method returns a String representation of the instance.
toString in interface SequenceScore
Parameters:
nf - the NumberFormat for the String representation of parameters or probabilities
Returns:
a String representation of the instance
protected Sequence[] emitDataSetUsingCurrentParameterSet(int n, int... lengths) throws Exception
Standard implementation throwing an OperationNotSupportedException.
emitDataSetUsingCurrentParameterSet in class AbstractMixtureTrainSM
Parameters:
n - the number of sequences to be sampled
lengths - the corresponding lengths
Throws:
Exception - if it was impossible to sample the sequences
See Also:
StatisticalModel.emitDataSet(int, int...)