java.lang.Object
de.jstacs.sequenceScores.differentiable.AbstractDifferentiableSequenceScore
de.jstacs.sequenceScores.statisticalModels.differentiable.AbstractDifferentiableStatisticalModel
de.jstacs.sequenceScores.statisticalModels.differentiable.directedGraphicalModels.BayesianNetworkDiffSM
public class BayesianNetworkDiffSM
This class implements a scoring function that is a moral directed graphical model, i.e. a moral Bayesian network. This implementation also comprises well-known specializations of Bayesian networks like Markov models of arbitrary order (including weight array matrix models (WAM) and position weight matrices (PWM)) or Bayesian trees. Different structures can be achieved by using the corresponding Measure, e.g. InhomogeneousMarkov for Markov models of arbitrary order.
This scoring function can be used in any ScoreClassifier, e.g. in a MSPClassifier to learn the parameters of the DifferentiableStatisticalModel using maximum conditional likelihood or maximum supervised posterior.
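How a Markov model of a given order fits into the Bayesian-network view can be sketched without the library: each position depends on up to `order` directly preceding positions. The class and method names below are hypothetical and are not part of Jstacs; the sketch only illustrates the parent structure, not how a Measure computes it.

```java
import java.util.Arrays;

class MarkovStructureSketch {
    /** Returns parents[i], the indices of the positions that position i depends on. */
    static int[][] parents(int length, int order) {
        if (length <= 0 || order < 0) {
            throw new IllegalArgumentException("length must be > 0 and order >= 0");
        }
        int[][] parents = new int[length][];
        for (int i = 0; i < length; i++) {
            int k = Math.min(order, i); // the first positions have fewer predecessors
            parents[i] = new int[k];
            for (int j = 0; j < k; j++) {
                parents[i][j] = i - k + j;
            }
        }
        return parents;
    }

    public static void main(String[] args) {
        // order 0 corresponds to a PWM, order 1 to a WAM
        System.out.println(Arrays.deepToString(parents(4, 0)));
        System.out.println(Arrays.deepToString(parents(4, 1)));
        System.out.println(Arrays.deepToString(parents(4, 2)));
    }
}
```

With order 0 no position has a parent, which is the PWM case; with order 1 every position except the first has exactly one parent, which is the WAM case.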
Field Summary | |
---|---|
protected double | ess - The equivalent sample size. |
protected boolean | isTrained - Indicates if the instance has been trained. |
protected Double | logNormalizationConstant - Normalization constant to obtain normalized probabilities. |
protected Integer | numFreePars - The number of free parameters. |
protected int[] | nums - Used internally. |
protected int[][] | order - The network structure, used internally. |
protected BNDiffSMParameter[] | parameters - The parameters of the scoring function. |
protected boolean | plugInParameters - Indicates if plug-in parameters, i.e. generative (MAP) parameters, shall be used upon initialization. |
protected Measure | structureMeasure - Measure that defines the network structure. |
protected BNDiffSMParameterTree[] | trees - The trees that represent the context of the random variable (i.e. |
Fields inherited from class de.jstacs.sequenceScores.differentiable.AbstractDifferentiableSequenceScore |
---|
alphabets, length, r |
Fields inherited from interface de.jstacs.sequenceScores.differentiable.DifferentiableSequenceScore |
---|
UNKNOWN |
Constructor Summary |
---|
BayesianNetworkDiffSM(AlphabetContainer alphabet, int length, double ess, boolean plugInParameters, Measure structureMeasure) - Creates a new BayesianNetworkDiffSM that has neither been initialized nor trained. |
BayesianNetworkDiffSM(BayesianNetworkDiffSMParameterSet parameters) - Creates a new BayesianNetworkDiffSM from a BayesianNetworkDiffSMParameterSet; the instance has neither been initialized nor trained. |
BayesianNetworkDiffSM(StringBuffer xml) - The standard constructor for the interface Storable. |
Method Summary | |
---|---|
void | addGradientOfLogPriorTerm(double[] grad, int start) - This method computes the gradient of DifferentiableStatisticalModel.getLogPriorTerm() for each parameter of this model. |
BayesianNetworkDiffSM | clone() - Creates a clone (deep copy) of the current DifferentiableSequenceScore instance. |
protected void | createTrees(DataSet[] data2, double[][] weights2) - Creates the tree structures that represent the context (array trees) and the parameter objects parameters using the given Measure structureMeasure. |
DataSet | emitDataSet(int numberOfSequences, int... seqLength) - This method returns a DataSet object containing artificial sequence(s). |
protected void | fromXML(StringBuffer source) - This method is called in the constructor for the Storable interface to create a scoring function from a StringBuffer. |
InstanceParameterSet | getCurrentParameterSet() - Returns the InstanceParameterSet that has been used to instantiate the current instance of the implementing class. |
double[] | getCurrentParameterValues() - Returns a double array of dimension DifferentiableSequenceScore.getNumberOfParameters() containing the current parameter values. |
double | getESS() - Returns the equivalent sample size (ess) of this model, i.e. the equivalent sample size for the class or component that is represented by this model. |
String | getInstanceName() - Should return a short instance name such as iMM(0), BN(2), ... |
double | getLogNormalizationConstant() - Returns the logarithm of the sum of the scores over all sequences of the event space. |
double | getLogPartialNormalizationConstant(int parameterIndex) - Returns the logarithm of the partial normalization constant for the parameter with index parameterIndex. |
double | getLogPriorTerm() - This method computes a value that is proportional to DifferentiableStatisticalModel.getESS() * DifferentiableStatisticalModel.getLogNormalizationConstant() + Math.log( prior ), where prior is the prior for the parameters of this model. |
double | getLogScoreAndPartialDerivation(Sequence seq, int start, IntList indices, DoubleList partialDer) - Returns the logarithmic score for a Sequence beginning at position start in the Sequence and fills lists with the indices and the partial derivatives. |
double | getLogScoreFor(Sequence seq, int start) - Returns the logarithmic score for the Sequence seq beginning at position start in the Sequence. |
int | getNumberOfParameters() - Returns the number of parameters in this DifferentiableSequenceScore. |
double[] | getPositionDependentKMerProb(Sequence kmer) - Returns the probability of kmer for all possible positions in this BayesianNetworkDiffSM starting at position kmer.getLength()-1. |
int | getPositionForParameter(int index) - Returns the position in the sequence the parameter index is responsible for. |
double[][] | getPWM() - If this BayesianNetworkDiffSM is a PWM, i.e. structureMeasure = new InhomogeneousMarkov(0), this method returns the normalized PWM as a double array of dimension AbstractDifferentiableSequenceScore.getLength() x size-of-alphabet. |
int | getSizeOfEventSpaceForRandomVariablesOfParameter(int index) - Returns the size of the event space of the random variables that are affected by parameter no. index, i.e. the product of the sizes of the alphabets at the position of each random variable affected by parameter index. |
void | initializeFunction(int index, boolean freeParams, DataSet[] data, double[][] weights) - This method creates the underlying structure of the DifferentiableSequenceScore. |
void | initializeFunctionRandomly(boolean freeParams) - This method initializes the DifferentiableSequenceScore randomly. |
boolean | isInitialized() - This method can be used to determine whether the instance is initialized. |
protected void | precomputeNormalization() - Pre-computes all normalization constants. |
void | setParameters(double[] params, int start) - This method sets the internal parameters to the values of params between start and start + DifferentiableSequenceScore.getNumberOfParameters() - 1. |
protected void | setPlugInParameters(int index, boolean freeParameters, DataSet[] data, double[][] weights) - Computes and sets the plug-in parameters (MAP estimated parameters) from data using weights. |
String | toString() |
StringBuffer | toXML() - This method returns an XML representation as StringBuffer of an instance of the implementing class. |
Methods inherited from class de.jstacs.sequenceScores.statisticalModels.differentiable.AbstractDifferentiableStatisticalModel |
---|
getInitialClassParam, getLogProbFor, getLogProbFor, getLogProbFor, getLogScoreFor, getLogScoreFor, getMaximalMarkovOrder, isNormalized, isNormalized |
Methods inherited from class de.jstacs.sequenceScores.differentiable.AbstractDifferentiableSequenceScore |
---|
getAlphabetContainer, getCharacteristics, getLength, getLogScoreAndPartialDerivation, getLogScoreAndPartialDerivation, getLogScoreFor, getLogScoreFor, getNumberOfRecommendedStarts, getNumberOfStarts, getNumericalCharacteristics |
Methods inherited from class java.lang.Object |
---|
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Methods inherited from interface de.jstacs.sequenceScores.differentiable.DifferentiableSequenceScore |
---|
getLogScoreAndPartialDerivation, getLogScoreAndPartialDerivation, getNumberOfRecommendedStarts |
Methods inherited from interface de.jstacs.sequenceScores.SequenceScore |
---|
getAlphabetContainer, getCharacteristics, getLength, getLogScoreFor, getLogScoreFor, getNumericalCharacteristics |
Field Detail |
---|
protected BNDiffSMParameter[] parameters
protected BNDiffSMParameterTree[] trees
protected boolean isTrained
protected double ess
protected Integer numFreePars
protected int[] nums
protected Measure structureMeasure
- Measure that defines the network structure.
protected boolean plugInParameters
protected int[][] order
protected Double logNormalizationConstant
Constructor Detail |
---|
BayesianNetworkDiffSM
public BayesianNetworkDiffSM(AlphabetContainer alphabet, int length, double ess, boolean plugInParameters, Measure structureMeasure) throws Exception
- Creates a new BayesianNetworkDiffSM that has neither been initialized nor trained.
- Parameters:
alphabet
- the alphabet of the scoring function boxed in an AlphabetContainer, e.g. new AlphabetContainer(new DNAAlphabet())
length
- the length of the scoring function, i.e. the length of the sequences this scoring function can handle
ess
- the equivalent sample size
plugInParameters
- indicates if plug-in parameters, i.e. generative (MAP) parameters, shall be used upon initialization
structureMeasure
- the Measure used for the structure, e.g. InhomogeneousMarkov
- Throws:
Exception
- if the length of the scoring function is not admissible (<=0) or the alphabet is not discrete
BayesianNetworkDiffSM
public BayesianNetworkDiffSM(BayesianNetworkDiffSMParameterSet parameters) throws ParameterSetParser.NotInstantiableException, Exception
- Creates a new BayesianNetworkDiffSM from a BayesianNetworkDiffSMParameterSet; the instance has neither been initialized nor trained.
- Parameters:
parameters
- the parameter set
- Throws:
ParameterSetParser.NotInstantiableException
- if the BayesianNetworkDiffSM could not be instantiated from the BayesianNetworkDiffSMParameterSet
Exception
- if the length of the scoring function is not admissible (<=0) or the alphabet is not discrete
BayesianNetworkDiffSM
public BayesianNetworkDiffSM(StringBuffer xml) throws NonParsableException
- The standard constructor for the interface Storable. Recreates a BayesianNetworkDiffSM from its XML representation as saved by the method toXML().
- Parameters:
xml
- the XML representation as StringBuffer
- Throws:
NonParsableException
- if the XML code could not be parsed
Method Detail |
---|
clone
public BayesianNetworkDiffSM clone() throws CloneNotSupportedException
- Description copied from interface:
DifferentiableSequenceScore
- Creates a clone (deep copy) of the current DifferentiableSequenceScore instance.
- Specified by:
clone
in interface DifferentiableSequenceScore
- Specified by:
clone
in interface SequenceScore
- Overrides:
clone
in class AbstractDifferentiableStatisticalModel
- Returns:
- a clone of the current DifferentiableSequenceScore
- Throws:
CloneNotSupportedException
- if something went wrong while cloning the DifferentiableSequenceScore
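getLogPartialNormalizationConstant
public double getLogPartialNormalizationConstant(int parameterIndex) throws Exception
- Description copied from interface:
DifferentiableStatisticalModel
- Returns the logarithm of the partial normalization constant for the parameter with index parameterIndex. This is the logarithm of the partial derivative of the normalization constant with respect to the parameter with index parameterIndex.
- Specified by:
getLogPartialNormalizationConstant
in interface DifferentiableStatisticalModel
- Parameters:
parameterIndex
- the index of the parameter
- Returns:
- the logarithm of the partial normalization constant
- Throws:
Exception
- if something went wrong with the normalization
- See Also:
DifferentiableStatisticalModel.getLogNormalizationConstant()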
initializeFunction
public void initializeFunction(int index, boolean freeParams, DataSet[] data, double[][] weights) throws Exception
- Description copied from interface:
DifferentiableSequenceScore
- This method creates the underlying structure of the DifferentiableSequenceScore.
- Specified by:
initializeFunction
in interface DifferentiableSequenceScore
- Parameters:
index
- the index of the class the DifferentiableSequenceScore models
freeParams
- indicates whether the (reduced) parameterization is used
data
- the samples
weights
- the weights of the sequences in the samples
- Throws:
Exception
- if something went wrong
createTrees
protected void createTrees(DataSet[] data2, double[][] weights2) throws Exception
- Creates the tree structures that represent the context (array trees) and the parameter objects parameters using the given Measure structureMeasure.
- Parameters:
data2
- the data that is used to compute the structure
weights2
- the weights on the sequences in data2
- Throws:
Exception
- if the structure is not a moral graph, if the lengths of data and scoring function do not match, or if other problems concerning the data occur
setPlugInParameters
protected void setPlugInParameters(int index, boolean freeParameters, DataSet[] data, double[][] weights)
- Computes and sets the plug-in parameters (MAP estimated parameters) from data using weights.
- Parameters:
index
- the index of the class the scoring function is responsible for; the parameters are estimated from data[index] and weights[index]
freeParameters
- indicates if only the free parameters or all parameters should be used; this also affects the initialization
data
- the data used for initialization
weights
- the weights on the data
fromXML
protected void fromXML(StringBuffer source) throws NonParsableException
- Description copied from class:
AbstractDifferentiableSequenceScore
- This method is called in the constructor for the Storable interface to create a scoring function from a StringBuffer.
- Specified by:
fromXML
in class AbstractDifferentiableSequenceScore
- Parameters:
source
- the XML representation as StringBuffer
- Throws:
NonParsableException
- if the StringBuffer could not be parsed
- See Also:
AbstractDifferentiableSequenceScore.AbstractDifferentiableSequenceScore(StringBuffer)
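toString
public String toString()
- Overrides:
toString
in class Object
getInstanceName
public String getInstanceName()
- Description copied from interface:
SequenceScore
- Should return a short instance name such as iMM(0), BN(2), ...
- Specified by:
getInstanceName
in interface SequenceScore
- Returns:
- a short instance name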
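getLogScoreFor
public double getLogScoreFor(Sequence seq, int start)
- Description copied from interface:
SequenceScore
- Returns the logarithmic score for the Sequence seq beginning at position start in the Sequence.
- Specified by:
getLogScoreFor
in interface SequenceScore
- Parameters:
seq
- the Sequence
start
- the start position in the Sequence
- Returns:
- the logarithmic score for the Sequence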
getLogScoreAndPartialDerivation
public double getLogScoreAndPartialDerivation(Sequence seq, int start, IntList indices, DoubleList partialDer)
- Description copied from interface:
DifferentiableSequenceScore
- Returns the logarithmic score for a Sequence beginning at position start in the Sequence and fills lists with the indices and the partial derivatives.
- Specified by:
getLogScoreAndPartialDerivation
in interface DifferentiableSequenceScore
- Parameters:
seq
- the Sequence
start
- the start position in the Sequence
indices
- an IntList of indices; after method invocation the list should contain the indices i for which the partial derivative of the logarithmic score with respect to parameter i is not zero
partialDer
- a DoubleList of partial derivatives; after method invocation the list should contain the corresponding non-zero partial derivatives
- Returns:
- the logarithmic score for the Sequence
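The contract of filling an index list and a derivative list can be illustrated with a small stand-alone sketch. It assumes the PWM special case with one log-scale parameter per position and symbol, stored in a flat layout `position * alphabetSize + symbol`; this layout, the use of `java.util.List` instead of IntList and DoubleList, and all names are assumptions for illustration, not the library's implementation.

```java
import java.util.ArrayList;
import java.util.List;

class PartialDerivationSketch {
    /**
     * Returns the log score of seq starting at start and records, for every
     * parameter that contributed, its flat index and its partial derivative.
     */
    static double logScoreAndPartialDer(double[][] params, int[] seq, int start,
                                        List<Integer> indices, List<Double> partialDer) {
        int alphabetSize = params[0].length;
        double sum = 0.0;
        for (int i = 0; i < params.length; i++) {
            int symbol = seq[start + i];
            sum += params[i][symbol];
            indices.add(i * alphabetSize + symbol); // hypothetical flat parameter index
            partialDer.add(1.0); // the log score is a plain sum, so the derivative is 1
        }
        return sum;
    }

    public static void main(String[] args) {
        double[][] params = { {1.0, 2.0}, {3.0, 5.0} };
        List<Integer> indices = new ArrayList<>();
        List<Double> partialDer = new ArrayList<>();
        double score = logScoreAndPartialDer(params, new int[] {1, 0, 1}, 1, indices, partialDer);
        System.out.println(score + " " + indices + " " + partialDer);
    }
}
```

Only the parameters actually used by the sequence appear in the lists; all other partial derivatives are zero and are therefore omitted.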
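getLogNormalizationConstant
public double getLogNormalizationConstant() throws RuntimeException
- Description copied from interface:
DifferentiableStatisticalModel
- Returns the logarithm of the sum of the scores over all sequences of the event space.
- Specified by:
getLogNormalizationConstant
in interface DifferentiableStatisticalModel
- Returns:
- the logarithm of the normalization constant
- Throws:
RuntimeException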
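As a rough illustration of what "the sum of the scores over all sequences" means, the sketch below computes the log normalization constant for the PWM special case in two ways: by exploiting the factorization over positions, and by brute-force enumeration of every sequence. The log-scale parameter layout and all names are assumptions; the library's own computation for general network structures is not shown here.

```java
class LogNormalizationSketch {
    /** Numerically stable log(sum_j exp(v[j])). */
    static double logSumExp(double[] v) {
        double max = Double.NEGATIVE_INFINITY;
        for (double x : v) {
            max = Math.max(max, x);
        }
        if (Double.isInfinite(max)) {
            return max; // avoids NaN from (-inf) - (-inf)
        }
        double sum = 0.0;
        for (double x : v) {
            sum += Math.exp(x - max);
        }
        return max + Math.log(sum);
    }

    /** PWM case: log Z is the sum over positions of logSumExp over the symbols. */
    static double logNormalization(double[][] params) {
        double logZ = 0.0;
        for (double[] column : params) {
            logZ += logSumExp(column);
        }
        return logZ;
    }

    /** Reference: enumerate every sequence (same alphabet size at all positions assumed). */
    static double logNormalizationBruteForce(double[][] params) {
        int length = params.length;
        int alphabetSize = params[0].length;
        int total = 1;
        for (int i = 0; i < length; i++) {
            total *= alphabetSize;
        }
        double[] logScores = new double[total];
        for (int code = 0; code < total; code++) {
            int rest = code;
            double s = 0.0;
            for (int i = 0; i < length; i++) {
                s += params[i][rest % alphabetSize]; // decode symbol at position i
                rest /= alphabetSize;
            }
            logScores[code] = s;
        }
        return logSumExp(logScores);
    }

    public static void main(String[] args) {
        double[][] params = { {0.5, -1.0, 2.0, 0.0}, {1.0, 1.0, -3.0, 0.2} };
        System.out.println(logNormalization(params));
        System.out.println(logNormalizationBruteForce(params));
    }
}
```

The two methods agree up to rounding; the factorized version avoids the exponential number of sequences that the brute-force version has to visit.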
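getNumberOfParameters
public int getNumberOfParameters()
- Description copied from interface:
DifferentiableSequenceScore
- Returns the number of parameters in this DifferentiableSequenceScore. If the number of parameters is not known yet, the method returns DifferentiableSequenceScore.UNKNOWN.
- Specified by:
getNumberOfParameters
in interface DifferentiableSequenceScore
- Returns:
- the number of parameters in this DifferentiableSequenceScore
- See Also:
DifferentiableSequenceScore.UNKNOWN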
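setParameters
public void setParameters(double[] params, int start)
- Description copied from interface:
DifferentiableSequenceScore
- This method sets the internal parameters to the values of params between start and start + DifferentiableSequenceScore.getNumberOfParameters() - 1.
- Specified by:
setParameters
in interface DifferentiableSequenceScore
- Parameters:
params
- the new parameters
start
- the start index in params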
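The index range described for setParameters can be pictured as copying a slice out of a longer, shared parameter vector. The following is a minimal sketch with hypothetical names; it only shows the inclusive range `start` to `start + n - 1` and says nothing about how the class stores its parameters internally.

```java
import java.util.Arrays;

class ParameterSliceSketch {
    /** Copies params[start .. start + internal.length - 1] into internal. */
    static void setParameters(double[] internal, double[] params, int start) {
        System.arraycopy(params, start, internal, 0, internal.length);
    }

    public static void main(String[] args) {
        double[] shared = {9.0, 9.0, 1.0, 2.0, 3.0, 9.0}; // e.g. parameters of several models
        double[] internal = new double[3];                // this model has 3 parameters
        setParameters(internal, shared, 2);
        System.out.println(Arrays.toString(internal));
    }
}
```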
precomputeNormalization
protected void precomputeNormalization()
- Pre-computes all normalization constants.
getCurrentParameterValues
public double[] getCurrentParameterValues() throws Exception
- Description copied from interface:
DifferentiableSequenceScore
- Returns a double array of dimension DifferentiableSequenceScore.getNumberOfParameters() containing the current parameter values. If one wants to use these parameters to start an optimization, it is highly recommended to invoke DifferentiableSequenceScore.initializeFunction(int, boolean, DataSet[], double[][]) beforehand. After an optimization, this method can be used to get the current parameter values.
- Specified by:
getCurrentParameterValues
in interface DifferentiableSequenceScore
- Returns:
- the current parameter values
- Throws:
Exception
- if no parameters exist (yet)
toXML
public StringBuffer toXML()
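- Description copied from interface:
Storable
- This method returns an XML representation as StringBuffer of an instance of the implementing class.
- Specified by:
toXML
in interface Storable
- Returns:
- the XML representation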
getLogPriorTerm
public double getLogPriorTerm()
- Description copied from interface:
DifferentiableStatisticalModel
- This method computes a value that is proportional to
DifferentiableStatisticalModel.getESS() * DifferentiableStatisticalModel.getLogNormalizationConstant() + Math.log( prior )
where prior is the prior for the parameters of this model.
- Specified by:
getLogPriorTerm
in interface DifferentiableStatisticalModel
- Specified by:
getLogPriorTerm
in interface StatisticalModel
- Returns:
- a value that is proportional to DifferentiableStatisticalModel.getESS() * DifferentiableStatisticalModel.getLogNormalizationConstant() + Math.log( prior )
- See Also:
DifferentiableStatisticalModel.getESS(), DifferentiableStatisticalModel.getLogNormalizationConstant()
addGradientOfLogPriorTerm
public void addGradientOfLogPriorTerm(double[] grad, int start)
- Description copied from interface:
DifferentiableStatisticalModel
- This method computes the gradient of DifferentiableStatisticalModel.getLogPriorTerm() for each parameter of this model. The results are added to the array grad beginning at index start.
- Specified by:
addGradientOfLogPriorTerm
in interface DifferentiableStatisticalModel
- Parameters:
grad
- the array of gradients
start
- the start index in the grad array, where the partial derivatives for the parameters of this model shall be entered
- See Also:
DifferentiableStatisticalModel.getLogPriorTerm()
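getESS
public double getESS()
- Description copied from interface:
DifferentiableStatisticalModel
- Returns the equivalent sample size (ess) of this model, i.e. the equivalent sample size for the class or component that is represented by this model.
- Specified by:
getESS
in interface DifferentiableStatisticalModel
- Returns:
- the equivalent sample size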
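getPositionForParameter
public int getPositionForParameter(int index)
- Returns the position in the sequence the parameter index is responsible for.
- Parameters:
index
- the index of the parameter
- Returns:
- the position in the sequence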
getPositionDependentKMerProb
public double[] getPositionDependentKMerProb(Sequence kmer) throws Exception
- Returns the probability of kmer for all possible positions in this BayesianNetworkDiffSM starting at position kmer.getLength()-1.
- Parameters:
kmer
- the k-mer
- Returns:
- the position-dependent probabilities of this k-mer for position kmer.getLength()-1 to AbstractDifferentiableSequenceScore.getLength()-1
- Throws:
Exception
- if the method is called for non-Markov model structures
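To make the position range concrete, here is a minimal sketch for the simplest Markov structure, a PWM (order 0), where the probability of a k-mer ending at a given position is just the product of the per-position probabilities. The normalized `double[][]` PWM input and all names are assumptions; higher-order structures would additionally require marginalizing over the preceding context, which is not shown.

```java
import java.util.Arrays;

class KMerProbSketch {
    /**
     * result[j] is the probability that the k-mer occupies the window ending at
     * position (k - 1 + j), for j = 0 .. length - k.
     */
    static double[] positionDependentKMerProb(double[][] pwm, int[] kmer) {
        int k = kmer.length;
        int length = pwm.length;
        if (k == 0 || k > length) {
            throw new IllegalArgumentException("k-mer length must be in 1.." + length);
        }
        double[] result = new double[length - k + 1];
        for (int end = k - 1; end < length; end++) {
            double p = 1.0;
            for (int j = 0; j < k; j++) {
                p *= pwm[end - k + 1 + j][kmer[j]]; // independent positions in a PWM
            }
            result[end - (k - 1)] = p;
        }
        return result;
    }

    public static void main(String[] args) {
        double[][] pwm = { {0.9, 0.1}, {0.5, 0.5}, {0.2, 0.8} };
        System.out.println(Arrays.toString(positionDependentKMerProb(pwm, new int[] {0, 1})));
    }
}
```

The returned array has getLength() - k + 1 entries in this sketch, one per end position from k-1 to getLength()-1.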
getSizeOfEventSpaceForRandomVariablesOfParameter
public int getSizeOfEventSpaceForRandomVariablesOfParameter(int index)
- Description copied from interface:
DifferentiableStatisticalModel
- Returns the size of the event space of the random variables that are affected by parameter no. index, i.e. the product of the sizes of the alphabets at the position of each random variable affected by parameter index. For DNA alphabets this corresponds to 4 for a PWM, 16 for a WAM except position 0, ...
- Specified by:
getSizeOfEventSpaceForRandomVariablesOfParameter
in interface DifferentiableStatisticalModel
- Parameters:
index
- the index of the parameter
- Returns:
- the size of the event space
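The numbers quoted for DNA (4 for a PWM, 16 for a WAM except at position 0) follow directly from the product rule in the description. A small sketch, using hypothetical names and an explicit list of parent positions rather than the library's internal tree structures:

```java
class EventSpaceSketch {
    /** Product of the alphabet sizes at the position itself and at all its parents. */
    static int eventSpaceSize(int[] alphabetSizes, int position, int[] parentPositions) {
        int size = alphabetSizes[position];
        for (int parent : parentPositions) {
            size *= alphabetSizes[parent];
        }
        return size;
    }

    public static void main(String[] args) {
        int[] dna = {4, 4, 4}; // three positions over a DNA alphabet
        System.out.println(eventSpaceSize(dna, 1, new int[] {}));  // PWM: no parents
        System.out.println(eventSpaceSize(dna, 1, new int[] {0})); // WAM: one parent
        System.out.println(eventSpaceSize(dna, 0, new int[] {}));  // WAM at position 0
    }
}
```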
initializeFunctionRandomly
public void initializeFunctionRandomly(boolean freeParams)
throws Exception
- Description copied from interface:
DifferentiableSequenceScore
- This method initializes the DifferentiableSequenceScore randomly. It has to create the underlying structure of the DifferentiableSequenceScore.
- Specified by:
initializeFunctionRandomly
in interface DifferentiableSequenceScore
- Parameters:
freeParams
- indicates whether the (reduced) parameterization is used
- Throws:
Exception
- if something went wrong
isInitialized
public boolean isInitialized()
- Description copied from interface:
SequenceScore
- This method can be used to determine whether the instance is initialized. If the instance is initialized you should be able to invoke SequenceScore.getLogScoreFor(Sequence).
- Specified by:
isInitialized
in interface SequenceScore
- Returns:
- true if the instance is initialized, false otherwise
getPWM
public double[][] getPWM()
throws Exception
- If this BayesianNetworkDiffSM is a PWM, i.e. structureMeasure = new InhomogeneousMarkov(0), this method returns the normalized PWM as a double array of dimension AbstractDifferentiableSequenceScore.getLength() x size-of-alphabet.
- Returns:
- the PWM as a two-dimensional array
- Throws:
Exception
- if this method is called for a
BayesianNetworkDiffSM
that is not a PWM
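What "normalized PWM" means can be sketched under the assumption that the model holds one unnormalized log-scale parameter per position and symbol: each row is exponentiated and divided by its sum. The parameter layout and all names are hypothetical and only illustrate the normalization step, not the library's internals.

```java
import java.util.Arrays;

class PwmFromParametersSketch {
    /** Turns log-scale parameters into probabilities, row by row (one row per position). */
    static double[][] toPwm(double[][] params) {
        double[][] pwm = new double[params.length][];
        for (int i = 0; i < params.length; i++) {
            double max = Double.NEGATIVE_INFINITY;
            for (double x : params[i]) {
                max = Math.max(max, x);
            }
            pwm[i] = new double[params[i].length];
            double sum = 0.0;
            for (int a = 0; a < params[i].length; a++) {
                pwm[i][a] = Math.exp(params[i][a] - max); // shift by max for stability
                sum += pwm[i][a];
            }
            for (int a = 0; a < params[i].length; a++) {
                pwm[i][a] /= sum;
            }
        }
        return pwm;
    }

    public static void main(String[] args) {
        double[][] params = { {0.0, 0.0}, {Math.log(3.0), 0.0} };
        System.out.println(Arrays.deepToString(toPwm(params)));
    }
}
```

The result has dimension length x size-of-alphabet, and every row sums to 1.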
getCurrentParameterSet
public InstanceParameterSet getCurrentParameterSet()
throws Exception
- Description copied from interface:
InstantiableFromParameterSet
- Returns the InstanceParameterSet that has been used to instantiate the current instance of the implementing class. If the current instance was not created using an InstanceParameterSet, an equivalent InstanceParameterSet should be returned, so that an instance created using this InstanceParameterSet would be in principle equal to the current instance.
- Specified by:
getCurrentParameterSet
in interface InstantiableFromParameterSet
- Returns:
- the current
InstanceParameterSet
- Throws:
Exception
- if the InstanceParameterSet
could not be returned
emitDataSet
public DataSet emitDataSet(int numberOfSequences,
int... seqLength)
throws NotTrainedException,
Exception
- Description copied from interface:
StatisticalModel
- This method returns a DataSet object containing artificial sequence(s).
There are two different possibilities to create a sample for a model with length 0 (homogeneous models).
- emitDataSet( int n, int l ) should return a data set with n sequences of length l.
- emitDataSet( int n, int[] l ) should return a data set with n sequences which have a sequence length corresponding to the entry in the given array l.
There are two different possibilities to create a sample for a model with length greater than 0 (inhomogeneous models).
emitDataSet( int n ) and emitDataSet( int n, null ) should return a sample with n sequences of length of the model (SequenceScore.getLength()).
The standard implementation throws an Exception.
- Specified by:
emitDataSet
in interface StatisticalModel
- Overrides:
emitDataSet
in class AbstractDifferentiableStatisticalModel
- Parameters:
numberOfSequences
- the number of sequences that should be contained in the returned sample
seqLength
- the length of the sequences for a homogeneous model; for an inhomogeneous model this parameter should be null or an array of size 0.
- Returns:
- a
DataSet
containing the artificial sequence(s)
- Throws:
NotTrainedException
- if the model is not trained yet
Exception
- if the emission did not succeed
- See Also:
DataSet
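For an inhomogeneous model, emitting artificial sequences amounts to drawing one symbol per position. The sketch below does this for a normalized PWM given as a plain `double[][]` and returns integer-encoded sequences instead of a DataSet; both simplifications and all names are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.Random;

class EmitSketch {
    /** Draws n sequences of length pwm.length, one symbol per position. */
    static int[][] emit(double[][] pwm, int n, Random rnd) {
        int[][] data = new int[n][pwm.length];
        for (int s = 0; s < n; s++) {
            for (int i = 0; i < pwm.length; i++) {
                double u = rnd.nextDouble();
                double cumulative = 0.0;
                int symbol = pwm[i].length - 1; // fallback guards against rounding error
                for (int a = 0; a < pwm[i].length; a++) {
                    cumulative += pwm[i][a];
                    if (u < cumulative) {
                        symbol = a;
                        break;
                    }
                }
                data[s][i] = symbol;
            }
        }
        return data;
    }

    public static void main(String[] args) {
        double[][] pwm = { {0.7, 0.1, 0.1, 0.1}, {0.25, 0.25, 0.25, 0.25} };
        System.out.println(Arrays.deepToString(emit(pwm, 3, new Random(1L))));
    }
}
```

Every emitted sequence has the length of the model, matching the inhomogeneous case described above where seqLength is null or empty.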