java.lang.Object
  de.jstacs.sequenceScores.statisticalModels.trainable.AbstractTrainableStatisticalModel
    de.jstacs.sequenceScores.statisticalModels.trainable.mixture.AbstractMixtureTrainSM
      de.jstacs.sequenceScores.statisticalModels.trainable.mixture.motif.HiddenMotifMixture
public abstract class HiddenMotifMixture extends AbstractMixtureTrainSM implements MotifDiscoverer
This is the main class that every generative motif discoverer should extend.
Nested Class Summary |
---|
Nested classes/interfaces inherited from class de.jstacs.sequenceScores.statisticalModels.trainable.mixture.AbstractMixtureTrainSM |
---|
AbstractMixtureTrainSM.Algorithm, AbstractMixtureTrainSM.Parameterization |
Nested classes/interfaces inherited from interface de.jstacs.motifDiscovery.MotifDiscoverer |
---|
MotifDiscoverer.KindOfProfile |
Field Summary | |
---|---|
protected PositionPrior | posPrior: The prior for the positions. |
Fields inherited from class de.jstacs.sequenceScores.statisticalModels.trainable.mixture.AbstractMixtureTrainSM |
---|
algorithm, algorithmHasBeenRun, alternativeModel, best, burnInTest, componentHyperParams, compProb, counter, dimension, estimateComponentProbs, file, filereader, filewriter, initialIteration, logWeights, model, optimizeModel, sample, samplingIndex, seqWeights, sostream, starts, stationaryIteration, weights |
Fields inherited from class de.jstacs.sequenceScores.statisticalModels.trainable.AbstractTrainableStatisticalModel |
---|
alphabets, length |
Constructor Summary | |
---|---|
protected | HiddenMotifMixture(StringBuffer xml): The standard constructor for the interface Storable. |
protected | HiddenMotifMixture(TrainableStatisticalModel[] models, boolean[] optimzeArray, int components, int starts, boolean estimateComponentProbs, double[] componentHyperParams, double[] weights, PositionPrior posPrior, AbstractMixtureTrainSM.Algorithm algorithm, double alpha, TerminationCondition tc, AbstractMixtureTrainSM.Parameterization parametrization, int initialIteration, int stationaryIteration, BurnInTest burnInTest): Creates a new HiddenMotifMixture. |
Method Summary | |
---|---|
protected void | checkLength(int index, int l): This method checks whether the length l of the model with index index is suitable for the current instance. |
HiddenMotifMixture | clone(): Follows the conventions of Object's clone()-method. |
protected Sequence[] | emitDataSetUsingCurrentParameterSet(int n, int... lengths): Standard implementation throwing an OperationNotSupportedException. |
protected void | extractFurtherInformation(StringBuffer xml): This method is used in the subclasses to extract further information from the XML representation and to set these as values of the instance. |
protected StringBuffer | getFurtherInformation(): This method is used in the subclasses to append further information to the XML representation. |
String | getInstanceName(): Should return a short instance name such as iMM(0), BN(2), ... |
abstract int | getMinimalSequenceLength(): Returns the minimal length a sequence, or a sample respectively, has to have. |
protected void | getNewParameters(int iteration, double[][] seqWeights, double[] w): This method trains the internal models on the internal sample and the given weights. |
String | toString(): Should give a simple representation (text) of the model as String. |
void | train(DataSet data, double[] weights): Trains the TrainableStatisticalModel object given the data as DataSet using the specified weights. |
abstract void | trainBgModel(DataSet data, double[] weights): This method trains the background model. |
Methods inherited from class de.jstacs.sequenceScores.statisticalModels.trainable.AbstractTrainableStatisticalModel |
---|
check, getAlphabetContainer, getLength, getLogProbFor, getLogProbFor, getLogScoreFor, getLogScoreFor, getLogScoreFor, getLogScoreFor, getMaximalMarkovOrder, train |
Methods inherited from class java.lang.Object |
---|
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Methods inherited from interface de.jstacs.motifDiscovery.MotifDiscoverer |
---|
getGlobalIndexOfMotifInComponent, getIndexOfMaximalComponentFor, getMotifLength, getNumberOfComponents, getNumberOfMotifs, getNumberOfMotifsInComponent, getProfileOfScoresFor, getStrandProbabilitiesFor |
Methods inherited from interface de.jstacs.Storable |
---|
toXML |
Field Detail |
---|
protected PositionPrior posPrior
The prior for the positions.
Constructor Detail |
---|
protected HiddenMotifMixture(TrainableStatisticalModel[] models, boolean[] optimzeArray, int components, int starts, boolean estimateComponentProbs, double[] componentHyperParams, double[] weights, PositionPrior posPrior, AbstractMixtureTrainSM.Algorithm algorithm, double alpha, TerminationCondition tc, AbstractMixtureTrainSM.Parameterization parametrization, int initialIteration, int stationaryIteration, BurnInTest burnInTest) throws CloneNotSupportedException, IllegalArgumentException, WrongAlphabetException
Creates a new HiddenMotifMixture. This constructor can be used for any algorithm since it takes all necessary values as parameters.
Parameters:
models - the single models building the HiddenMotifMixture; if the model is trained using AbstractMixtureTrainSM.Algorithm.GIBBS_SAMPLING, the models that will be adjusted have to implement SamplingComponent. The models that are used for the flanking sequences have to be able to score sequences of arbitrary length.
optimzeArray - an array of switches indicating whether or not to train the corresponding model
components - the number of components (e.g. for ZOOPS this is 2)
starts - the number of times the algorithm will be started in the train-method, at least 1
estimateComponentProbs - the switch for estimating the component probabilities in the algorithm or holding them fixed; if the component parameters are fixed, the values of weights will be used, otherwise the componentHyperParams will be incorporated in the adjustment
componentHyperParams - the hyperparameters for the component assignment prior
    * only used if estimateComponentProbs == true
    * null or has to have length dimension
    * null or an array with all values zero (0): then ML
    * depends on the parameterization
weights - null or the weights for the components (then weights.length == dimension)
posPrior - this object determines the positional distribution that shall be used
algorithm - either AbstractMixtureTrainSM.Algorithm.EM or AbstractMixtureTrainSM.Algorithm.GIBBS_SAMPLING
alpha - only for AbstractMixtureTrainSM.Algorithm.EM: used in train to initialize the gammas. It is recommended to use alpha = 1 (uniform distribution on a simplex).
tc - only for AbstractMixtureTrainSM.Algorithm.EM: the TerminationCondition for stopping the EM-algorithm; tc has to return true from TerminationCondition.isSimple()
parametrization - only for AbstractMixtureTrainSM.Algorithm.EM: AbstractMixtureTrainSM.Parameterization.THETA or AbstractMixtureTrainSM.Parameterization.LAMBDA
initialIteration - only for AbstractMixtureTrainSM.Algorithm.GIBBS_SAMPLING (stationaryIteration/starts)
stationaryIteration - only for AbstractMixtureTrainSM.Algorithm.GIBBS_SAMPLING
burnInTest - only for AbstractMixtureTrainSM.Algorithm.GIBBS_SAMPLING
Throws:
IllegalArgumentException - if
    * weights != null && weights.length != 2
    * weights != null and there exists an i where weights[i] < 0
    * starts < 1
    * componentHyperParams are not correct
WrongAlphabetException - if not all models work on the same simple alphabet
CloneNotSupportedException - if the models cannot be cloned

protected HiddenMotifMixture(StringBuffer xml) throws NonParsableException
The standard constructor for the interface Storable. Creates a new HiddenMotifMixture out of its XML representation.
Parameters:
xml - the XML representation of the model as StringBuffer
Throws:
NonParsableException - if the StringBuffer cannot be parsed

Method Detail |
---|
public HiddenMotifMixture clone() throws CloneNotSupportedException
Description copied from class: AbstractTrainableStatisticalModel
Follows the conventions of Object's clone()-method.
clone in interface MotifDiscoverer
clone in interface SequenceScore
clone in interface TrainableStatisticalModel
clone in class AbstractMixtureTrainSM
Returns:
a copy of the current AbstractTrainableStatisticalModel (the member-AlphabetContainer isn't deeply cloned since it is assumed to be immutable). The type of the returned object is defined by the class X directly inherited from AbstractTrainableStatisticalModel. Hence X's clone()-method should work as:
    1. Object o = (X)super.clone();
    2. all members of o defined by X that are not of simple data-types like int, double, ... have to be deeply copied
    3. return o
Throws:
CloneNotSupportedException - if something went wrong while cloning
See Also:
Cloneable
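The three-step clone() convention described above can be sketched in plain Java. This is only an illustration under stated assumptions: `Base` and `X` are hypothetical stand-ins (not Jstacs classes), and `weights` is an invented field used to show a member that needs a deep copy.

```java
// Hypothetical demo driver; none of these classes belong to Jstacs.
class CloneSketch {
    public static void main(String[] args) throws CloneNotSupportedException {
        X original = new X(3, new double[] {0.25, 0.75});
        X copy = original.clone();
        copy.weights[0] = 0.5;                        // modify the copy only
        System.out.println(original.weights[0]);      // prints 0.25
    }
}

// Stand-in for a base class whose members are simple or assumed immutable.
abstract class Base implements Cloneable {
    protected int length;                             // simple type: copied by Object.clone()

    Base(int length) {
        this.length = length;
    }

    @Override
    public Base clone() throws CloneNotSupportedException {
        return (Base) super.clone();
    }
}

// Stand-in for the class X that directly inherits from the base class.
class X extends Base {
    double[] weights;                                 // not a simple type: needs a deep copy

    X(int length, double[] weights) {
        super(length);
        this.weights = weights.clone();
    }

    @Override
    public X clone() throws CloneNotSupportedException {
        X o = (X) super.clone();                      // step 1: shallow copy via super.clone()
        o.weights = weights.clone();                  // step 2: deep copy non-simple members
        return o;                                     // step 3: return the copy
    }
}
```

Without step 2, the copy and the original would share the same array, so a change to one would silently show up in the other.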
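protected StringBuffer getFurtherInformation()
Description copied from class: AbstractMixtureTrainSM
This method is used in the subclasses to append further information to the XML representation.
getFurtherInformation in class AbstractMixtureTrainSM
See Also:
AbstractMixtureTrainSM.extractFurtherInformation(StringBuffer)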
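protected void extractFurtherInformation(StringBuffer xml) throws NonParsableException
Description copied from class: AbstractMixtureTrainSM
This method is used in the subclasses to extract further information from the XML representation and to set these as values of the instance.
extractFurtherInformation in class AbstractMixtureTrainSM
Parameters:
xml - the XML representation
Throws:
NonParsableException - if the XML representation is not parsable
See Also:
AbstractMixtureTrainSM.getFurtherInformation()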
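The pairing of getFurtherInformation() and extractFurtherInformation(StringBuffer) amounts to a write/read round trip. The sketch below illustrates that idea only: `XmlToyModel`, its `motifLength` field, and the `<motifLength>` tag are hypothetical, the parsing is done by hand rather than with any Jstacs helper, and IllegalArgumentException stands in for NonParsableException.

```java
// Hypothetical demo driver; not part of Jstacs.
class FurtherInfoSketch {
    public static void main(String[] args) {
        XmlToyModel model = new XmlToyModel(12);
        StringBuffer xml = model.getFurtherInformation();

        XmlToyModel restored = new XmlToyModel(0);
        restored.extractFurtherInformation(xml);
        System.out.println(restored.motifLength);     // prints 12
    }
}

// Toy class with one extra value that must survive the XML round trip.
class XmlToyModel {
    private static final String OPEN = "<motifLength>";
    private static final String CLOSE = "</motifLength>";

    int motifLength;

    XmlToyModel(int motifLength) {
        this.motifLength = motifLength;
    }

    // Writes the additional information as an XML fragment.
    protected StringBuffer getFurtherInformation() {
        StringBuffer xml = new StringBuffer();
        xml.append(OPEN).append(motifLength).append(CLOSE);
        return xml;
    }

    // Reads the fragment back and sets the value on this instance.
    protected void extractFurtherInformation(StringBuffer xml) {
        int start = xml.indexOf(OPEN);
        int end = xml.indexOf(CLOSE);
        if (start < 0 || end < start) {
            // Stand-in for NonParsableException in this sketch.
            throw new IllegalArgumentException("not parsable: " + xml);
        }
        motifLength = Integer.parseInt(xml.substring(start + OPEN.length(), end).trim());
    }
}
```

Whatever one method writes, the other must be able to read; keeping the tag names in shared constants is one simple way to keep the two in sync.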
public void train(DataSet data, double[] weights) throws Exception
Description copied from interface: TrainableStatisticalModel
Trains the TrainableStatisticalModel object given the data as DataSet using the specified weights. The weight at position i belongs to the element at position i, so the array weights should have the number of sequences in the sample as dimension. (Optionally it is possible to use weights == null if all weights have the value one.)
train(data1); train(data2) should be a fully trained model over data2 and not over data1+data2. All parameters of the model were given by the call of the constructor.
train in interface TrainableStatisticalModel
train in class AbstractMixtureTrainSM
Parameters:
data - the given sequences as DataSet
weights - the weights of the elements, each weight should be non-negative
Throws:
Exception - if the training did not succeed (e.g. the dimension of weights and the number of sequences in the sample do not match)
See Also:
DataSet.getElementAt(int), DataSet.ElementEnumerator
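The weight semantics of train can be illustrated with a toy model in plain Java. This sketch makes several assumptions: `ToyFrequencyModel` is hypothetical, a String[] replaces DataSet, and the "model" is just a weighted symbol frequency table. It shows only the contract stated above: one weight per sequence, null meaning all weights are one, and each call to train replacing the previous fit.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo driver; not part of Jstacs.
class WeightedTrainSketch {
    public static void main(String[] args) {
        ToyFrequencyModel model = new ToyFrequencyModel();
        model.train(new String[] {"A", "A", "C"}, new double[] {1.0, 1.0, 2.0});
        System.out.println(model.probabilityOf('A'));  // prints 0.5

        // A second call trains on the new data only, not on data1+data2.
        model.train(new String[] {"C"}, null);
        System.out.println(model.probabilityOf('A'));  // prints 0.0
    }
}

// Toy model: weighted relative frequencies of symbols.
class ToyFrequencyModel {
    private final Map<Character, Double> counts = new HashMap<>();
    private double total;

    void train(String[] data, double[] weights) {
        if (weights != null) {
            if (weights.length != data.length) {
                throw new IllegalArgumentException("one weight per sequence is required");
            }
            for (double w : weights) {
                if (w < 0) {
                    throw new IllegalArgumentException("weights must be non-negative");
                }
            }
        }
        counts.clear();                                // discard the previous fit
        total = 0;
        for (int i = 0; i < data.length; i++) {
            double w = (weights == null) ? 1.0 : weights[i];  // null means weight one
            for (char c : data[i].toCharArray()) {
                counts.merge(c, w, Double::sum);
                total += w;
            }
        }
    }

    double probabilityOf(char c) {
        return total == 0 ? 0.0 : counts.getOrDefault(c, 0.0) / total;
    }
}
```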
protected void getNewParameters(int iteration, double[][] seqWeights, double[] w) throws Exception
Description copied from class: AbstractMixtureTrainSM
This method trains the internal models on the internal sample and the given weights.
getNewParameters in class AbstractMixtureTrainSM
Parameters:
iteration - the number of times this method has been invoked
seqWeights - the weights for each model and sequence
w - the weights for the components
Throws:
Exception - if the training of the internal models went wrong

public abstract void trainBgModel(DataSet data, double[] weights) throws Exception
This method trains the background model.
Parameters:
data - the sample
weights - the weights
Throws:
Exception - if something went wrong

protected void checkLength(int index, int l)
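Description copied from class: AbstractMixtureTrainSM
This method checks whether the length l of the model with index index is suitable for the current instance. Otherwise an IllegalArgumentException is thrown.
checkLength in class AbstractMixtureTrainSM
Parameters:
index - the index of the model
l - the length of the model

public abstract int getMinimalSequenceLength()
Returns the minimal length a sequence, or a sample respectively, has to have.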
public String getInstanceName()
Description copied from interface: SequenceScore
Should return a short instance name such as iMM(0), BN(2), ...
getInstanceName in interface SequenceScore
getInstanceName in class AbstractMixtureTrainSM
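public String toString()
Description copied from interface: TrainableStatisticalModel
Should give a simple representation (text) of the model as String.
toString in interface TrainableStatisticalModel
toString in class Object
Returns:
the representation as String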
protected Sequence[] emitDataSetUsingCurrentParameterSet(int n, int... lengths) throws Exception
Standard implementation throwing an OperationNotSupportedException.
emitDataSetUsingCurrentParameterSet in class AbstractMixtureTrainSM
Parameters:
n - the number of sequences to be sampled
lengths - the corresponding lengths
Throws:
Exception - if it was impossible to sample the sequences
See Also:
StatisticalModel.emitDataSet(int, int...)