java.lang.Object
  de.jstacs.utils.StatisticalModelTester
public class StatisticalModelTester
This class provides tests for (discrete) statistical models. It implements several statistics (log-likelihood, Shannon entropy, AIC, BIC, ...) that can be used to compare models.
See Also: StatisticalModel
Constructor Summary | |
---|---|
StatisticalModelTester() | |
Method Summary | |
---|---|
static double | getKLDivergence(StatisticalModel m1, StatisticalModel m2, int length) Returns the Kullback-Leibler divergence D(p_m1||p_m2). |
static double | getLogLikelihood(StatisticalModel m, DataSet data) Returns the log-likelihood of a DataSet data for a given model m. |
static double | getLogLikelihood(StatisticalModel m, DataSet data, double[] weights) Returns the log-likelihood of a DataSet data for a given model m, using one weight for each element of data. |
static double | getMarginalDistribution(StatisticalModel m, int[] constraint) Computes the marginal distribution for any discrete model m over all sequences that fulfill the constraint, if possible. |
static double | getMaxOfDeviation(StatisticalModel m1, StatisticalModel m2, int length) Computes the maximum deviation between the probabilities of all sequences of the given length for the discrete models m1 and m2. |
static Sequence | getMostProbableSequence(SequenceScore m, int length) Returns one most probable sequence for the discrete model m. |
static double | getShannonEntropy(StatisticalModel m, int length) Computes the Shannon entropy for any discrete model m over all sequences of the given length, if possible. |
static double | getShannonEntropyInBits(StatisticalModel m, int length) Computes the Shannon entropy in bits for any discrete model m over all sequences of the given length, if possible. |
static double | getSumOfDeviation(StatisticalModel m1, StatisticalModel m2, int length) Computes the sum of deviations between the probabilities of all sequences of the given length for the discrete models m1 and m2. |
static double | getSumOfDistribution(StatisticalModel m, int length) Computes the marginal distribution for any discrete model m over all sequences of the given length, i.e. the sum of their probabilities, if possible. |
static double | getSymKLDivergence(StatisticalModel m1, StatisticalModel m2, int length) Returns the symmetric Kullback-Leibler divergence, i.e. the sum D(p_m1||p_m2) + D(p_m2||p_m1). |
static double | getValueOfAIC(StatisticalModel m, DataSet s, int k) Computes the value of Akaike's Information Criterion (AIC). |
static double | getValueOfBIC(StatisticalModel m, DataSet s, int k) Computes the value of the Bayesian Information Criterion (BIC). |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public StatisticalModelTester()
Method Detail |
---|
public static double getKLDivergence(StatisticalModel m1, StatisticalModel m2, int length) throws Exception
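Returns the Kullback-Leibler divergence D(p_m1||p_m2).

The Kullback-Leibler divergence is defined as D(p_m1||p_m2) = \sum_x p(x|m1) * \log \frac{p(x|m1)}{p(x|m2)}.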
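The definition above can be illustrated with a minimal sketch. It does not use the Jstacs API: the two distributions are assumed to be given as plain probability arrays over the same enumeration of sequences, the class and method names are hypothetical, and the natural logarithm is an assumption.

```java
/** Sketch only: distributions are plain probability arrays, not Jstacs models. */
final class KlDivergenceSketch {

    /**
     * Computes D(p||q) = sum_x p[x] * ln(p[x] / q[x]) in nats.
     * Terms with p[x] == 0 contribute 0; p[x] > 0 with q[x] == 0 yields +Infinity.
     */
    static double klDivergence(double[] p, double[] q) {
        if (p.length != q.length) {
            throw new IllegalArgumentException("distributions must cover the same sequences");
        }
        double sum = 0.0;
        for (int x = 0; x < p.length; x++) {
            if (p[x] > 0.0) {
                sum += p[x] * Math.log(p[x] / q[x]);
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] p = {0.5, 0.5};
        double[] q = {0.25, 0.75};
        // The divergence is not symmetric, so the two directions differ in general.
        System.out.println(klDivergence(p, q));
        System.out.println(klDivergence(q, p));
    }
}
```

The divergence of a distribution from itself is 0, and the value depends on the order of the arguments.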
Parameters:
m1 - one discrete model
m2 - another discrete model
length - the length of the sequence (for inhomogeneous models length has to be SequenceScore.getLength())
Throws:
Exception - if something went wrong

public static double getSymKLDivergence(StatisticalModel m1, StatisticalModel m2, int length) throws Exception
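Returns the symmetric Kullback-Leibler divergence, i.e. the sum D(p_m1||p_m2) + D(p_m2||p_m1).

Expanding both terms gives D(p_m1||p_m2) + D(p_m2||p_m1) = \sum_x (p(x|m1)-p(x|m2)) * \log \frac{p(x|m1)}{p(x|m2)}.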
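The closed form \sum_x (p(x|m1)-p(x|m2)) * \log \frac{p(x|m1)}{p(x|m2)} equals D(p_m1||p_m2) + D(p_m2||p_m1), which the following sketch checks numerically. It is not the Jstacs implementation: distributions are plain arrays, all names are hypothetical, and all probabilities are assumed to be strictly positive.

```java
/** Sketch only: assumes strictly positive probabilities in both arrays. */
final class SymKlSketch {

    /** One-directional divergence D(p||q) in nats. */
    static double kl(double[] p, double[] q) {
        double sum = 0.0;
        for (int x = 0; x < p.length; x++) {
            sum += p[x] * Math.log(p[x] / q[x]);
        }
        return sum;
    }

    /** Closed form: sum_x (p[x] - q[x]) * ln(p[x] / q[x]). */
    static double symKl(double[] p, double[] q) {
        double sum = 0.0;
        for (int x = 0; x < p.length; x++) {
            sum += (p[x] - q[x]) * Math.log(p[x] / q[x]);
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] p = {0.2, 0.3, 0.5};
        double[] q = {0.4, 0.4, 0.2};
        // Both lines should print the same value up to rounding.
        System.out.println(symKl(p, q));
        System.out.println(kl(p, q) + kl(q, p));
    }
}
```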
Parameters:
m1 - one discrete model
m2 - another discrete model
length - the length of the sequence (for inhomogeneous models length has to be SequenceScore.getLength())
Throws:
Exception - if something went wrong

public static double getLogLikelihood(StatisticalModel m, DataSet data) throws Exception
Returns the log-likelihood of a DataSet data for a given model m.

Parameters:
m - the given model
data - the DataSet
Throws:
Exception - if something went wrong

public static double getLogLikelihood(StatisticalModel m, DataSet data, double[] weights) throws Exception
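Returns the log-likelihood of a DataSet data for a given model m. Each element of data is weighted by the corresponding entry of weights.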
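As a rough illustration, a weighted log-likelihood can be computed as the weighted sum of the per-sequence log-probabilities. This sketch assumes that each weight multiplies the log-probability of its sequence and that a missing weight array means weight 1 for every sequence. Neither detail is confirmed by this page, and the names are hypothetical.

```java
/** Sketch only: works on precomputed log-probabilities, not on a Jstacs DataSet. */
final class WeightedLogLikelihoodSketch {

    /**
     * Returns sum_i weights[i] * logProbs[i].
     * Assumption: weights == null is treated as weight 1.0 for every element.
     */
    static double logLikelihood(double[] logProbs, double[] weights) {
        if (weights != null && weights.length != logProbs.length) {
            throw new IllegalArgumentException("one weight per sequence is required");
        }
        double sum = 0.0;
        for (int i = 0; i < logProbs.length; i++) {
            double w = (weights == null) ? 1.0 : weights[i];
            sum += w * logProbs[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] logProbs = {Math.log(0.5), Math.log(0.25)};
        System.out.println(logLikelihood(logProbs, null));
        System.out.println(logLikelihood(logProbs, new double[] {2.0, 1.0}));
    }
}
```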
Parameters:
m - the given model
data - the DataSet
weights - the weight for each element of the DataSet data
Throws:
Exception - if something went wrong

public static double getMarginalDistribution(StatisticalModel m, int[] constraint) throws Exception
This method computes the marginal distribution for any discrete model m and all sequences that fulfill the constraint, if possible.
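One way to picture this computation is brute-force enumeration: visit every sequence, keep those that agree with the constraint at all constrained positions, and add up their probabilities. The sketch below uses a hypothetical toy model with independent positions (a table of per-position symbol probabilities) instead of a Jstacs model. A negative constraint entry marks an unconstrained position.

```java
/** Sketch only: the "model" is a table pwm[position][symbol] of independent probabilities. */
final class MarginalSketch {

    /** Probability of one encoded sequence under the toy model. */
    static double probability(double[][] pwm, int[] seq) {
        double p = 1.0;
        for (int i = 0; i < seq.length; i++) {
            p *= pwm[i][seq[i]];
        }
        return p;
    }

    /** Sums the probabilities of all sequences compatible with the constraint. */
    static double marginal(double[][] pwm, int[] constraint) {
        int n = constraint.length;
        int[] seq = new int[n];
        double sum = 0.0;
        while (true) {
            boolean fulfilled = true;
            for (int i = 0; i < n; i++) {
                // constraint[i] < 0 means position i is irrelevant
                if (constraint[i] >= 0 && constraint[i] != seq[i]) {
                    fulfilled = false;
                    break;
                }
            }
            if (fulfilled) {
                sum += probability(pwm, seq);
            }
            // advance to the next sequence like an odometer
            int pos = n - 1;
            while (pos >= 0 && ++seq[pos] == pwm[pos].length) {
                seq[pos] = 0;
                pos--;
            }
            if (pos < 0) {
                return sum;
            }
        }
    }

    public static void main(String[] args) {
        double[][] pwm = {{0.1, 0.9}, {0.3, 0.7}, {0.5, 0.5}};
        System.out.println(marginal(pwm, new int[] {1, -1, 0}));
    }
}
```

For this toy model the result is simply the product of the probabilities at the constrained positions, which makes the enumeration easy to verify. The cost grows exponentially with the sequence length, so the approach is only practical for short sequences.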
Parameters:
m - a discrete model
constraint - constraint[i] < 0 stands for an irrelevant position, constraint[i] = c with 0 <= c < m.getAlphabets()[(m.getLength()==0)?0:i].getAlphabetLength() is the encoded character of position i
Throws:
Exception - if something went wrong

public static double getMaxOfDeviation(StatisticalModel m1, StatisticalModel m2, int length) throws Exception
This method computes the maximum deviation between the probabilities for all sequences of the given length for the discrete models m1 and m2.

Parameters:
m1 - one discrete model
m2 - another discrete model
length - the length of the sequence (for inhomogeneous models length has to be SequenceScore.getLength())
Throws:
Exception - if something went wrong

public static Sequence getMostProbableSequence(SequenceScore m, int length) throws Exception
Returns one most probable sequence for the discrete model m. (There may be more than one most probable sequence. In this case only one of them is returned.)
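The tie behavior described above can be sketched with an exhaustive search. This is only an illustration and may differ from the actual search strategy: the scoring function, the toy model, and all names are hypothetical, and ties are resolved in favor of the first sequence in enumeration order.

```java
import java.util.Arrays;
import java.util.function.ToDoubleFunction;

/** Sketch only: exhaustive argmax over all sequences of a fixed length. */
final class MostProbableSketch {

    /** Returns one sequence with maximal score; on ties the first one visited wins. */
    static int[] mostProbable(ToDoubleFunction<int[]> score, int alphabetSize, int length) {
        int[] seq = new int[length];
        int[] best = seq.clone();
        double bestScore = score.applyAsDouble(seq);
        while (next(seq, alphabetSize)) {
            double s = score.applyAsDouble(seq);
            if (s > bestScore) { // strict comparison keeps the earlier sequence on ties
                bestScore = s;
                best = seq.clone();
            }
        }
        return best;
    }

    /** Advances seq like an odometer; returns false once every sequence was visited. */
    static boolean next(int[] seq, int alphabetSize) {
        for (int pos = seq.length - 1; pos >= 0; pos--) {
            if (++seq[pos] < alphabetSize) {
                return true;
            }
            seq[pos] = 0;
        }
        return false;
    }

    /** Hypothetical toy model: independent positions with probabilities pwm[position][symbol]. */
    static ToDoubleFunction<int[]> pwmScore(double[][] pwm) {
        return seq -> {
            double p = 1.0;
            for (int i = 0; i < seq.length; i++) {
                p *= pwm[i][seq[i]];
            }
            return p;
        };
    }

    public static void main(String[] args) {
        double[][] pwm = {{0.1, 0.9}, {0.7, 0.3}, {0.5, 0.5}};
        System.out.println(Arrays.toString(mostProbable(pwmScore(pwm), 2, 3)));
    }
}
```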
Parameters:
m - the discrete model
length - the length of the sequence (for inhomogeneous models length has to be SequenceScore.getLength())
Throws:
Exception - if something went wrong

public static double getShannonEntropy(StatisticalModel m, int length) throws Exception
This method computes the Shannon entropy for any discrete model m and all sequences of the given length, if possible.

Parameters:
m - the discrete model
length - the length of the sequence (for inhomogeneous models length has to be SequenceScore.getLength())
Throws:
Exception - if something went wrong

public static double getShannonEntropyInBits(StatisticalModel m, int length) throws Exception
This method computes the Shannon entropy in bits for any discrete model m and all sequences of the given length, if possible.
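The relation between the two entropy methods can be sketched as a change of logarithm base. The sketch assumes that getShannonEntropy uses the natural logarithm, which this page does not state. It operates on a plain probability array, and its names are hypothetical.

```java
/** Sketch only: entropy of a distribution given as a plain probability array. */
final class EntropySketch {

    /** H(p) = -sum_x p[x] * ln(p[x]) in nats; zero-probability entries contribute 0. */
    static double entropyNats(double[] p) {
        double h = 0.0;
        for (double px : p) {
            if (px > 0.0) {
                h -= px * Math.log(px);
            }
        }
        return h;
    }

    /** Converts nats to bits by dividing by ln(2). */
    static double entropyBits(double[] p) {
        return entropyNats(p) / Math.log(2.0);
    }

    public static void main(String[] args) {
        double[] uniform = {0.25, 0.25, 0.25, 0.25};
        System.out.println(entropyNats(uniform));
        System.out.println(entropyBits(uniform));
    }
}
```

A uniform distribution over four outcomes has an entropy of ln 4 nats, which corresponds to 2 bits.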
Parameters:
m - the discrete model
length - the length of the sequence (for inhomogeneous models length has to be SequenceScore.getLength())
Throws:
Exception - if something went wrong

public static double getSumOfDeviation(StatisticalModel m1, StatisticalModel m2, int length) throws Exception
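This method computes the sum of deviations between the probabilities for all sequences of the given length for the discrete models m1 and m2.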
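This page does not define "deviation". The sketch below assumes it means the absolute difference |p(x|m1) - p(x|m2)| for each sequence x, and shows both the sum and the maximum over all sequences. Distributions are plain arrays and the names are hypothetical.

```java
/** Sketch only: assumes "deviation" is the absolute difference of two probabilities. */
final class DeviationSketch {

    /** Sum over all sequences of |p1[x] - p2[x]|. */
    static double sumOfDeviation(double[] p1, double[] p2) {
        double sum = 0.0;
        for (int x = 0; x < p1.length; x++) {
            sum += Math.abs(p1[x] - p2[x]);
        }
        return sum;
    }

    /** Largest |p1[x] - p2[x]| over all sequences. */
    static double maxOfDeviation(double[] p1, double[] p2) {
        double max = 0.0;
        for (int x = 0; x < p1.length; x++) {
            max = Math.max(max, Math.abs(p1[x] - p2[x]));
        }
        return max;
    }

    public static void main(String[] args) {
        double[] p1 = {0.5, 0.3, 0.2};
        double[] p2 = {0.4, 0.4, 0.2};
        System.out.println(sumOfDeviation(p1, p2));
        System.out.println(maxOfDeviation(p1, p2));
    }
}
```

Under this assumption, both values are 0 exactly when the two models assign the same probability to every sequence.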
Parameters:
m1 - one discrete model
m2 - another discrete model
length - the length of the sequence (for inhomogeneous models length has to be SequenceScore.getLength())
Throws:
Exception - if something went wrong

public static double getSumOfDistribution(StatisticalModel m, int length) throws Exception
This method computes the marginal distribution for any discrete model m and all sequences of the given length, if possible. So this method can be used to give a hint whether a model is a distribution or whether there are mistakes in the implementation. Math.abs( 1.0d - getSumOfDistribution( m, length ) ) should be smaller than 1E-10.
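The normalization check can be sketched by enumerating all sequences and summing their probabilities. The example uses a hypothetical toy model (a homogeneous first-order Markov chain given by a start distribution and a transition matrix) rather than a Jstacs model, and it assumes a length of at least 1.

```java
import java.util.function.ToDoubleFunction;

/** Sketch only: sums the probabilities of all sequences of a fixed length (length >= 1). */
final class SumOfDistributionSketch {

    static double sumOfDistribution(ToDoubleFunction<int[]> prob, int alphabetSize, int length) {
        int[] seq = new int[length];
        double sum = 0.0;
        do {
            sum += prob.applyAsDouble(seq);
        } while (next(seq, alphabetSize));
        return sum;
    }

    /** Advances seq like an odometer; returns false once every sequence was visited. */
    static boolean next(int[] seq, int alphabetSize) {
        for (int pos = seq.length - 1; pos >= 0; pos--) {
            if (++seq[pos] < alphabetSize) {
                return true;
            }
            seq[pos] = 0;
        }
        return false;
    }

    /** Hypothetical toy model: P(seq) = start[s0] * product of trans[s(i-1)][s(i)]. */
    static double markovProbability(double[] start, double[][] trans, int[] seq) {
        double p = start[seq[0]];
        for (int i = 1; i < seq.length; i++) {
            p *= trans[seq[i - 1]][seq[i]];
        }
        return p;
    }

    public static void main(String[] args) {
        double[] start = {0.2, 0.8};
        double[][] trans = {{0.6, 0.4}, {0.3, 0.7}};
        double sum = sumOfDistribution(seq -> markovProbability(start, trans, seq), 2, 4);
        // A properly normalized model should pass this check.
        System.out.println(Math.abs(1.0d - sum) < 1E-10);
    }
}
```

If a row of the transition matrix does not sum to 1, the total deviates from 1 and the check fails, which is the kind of implementation mistake this test is meant to reveal.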
Parameters:
m - the discrete model
length - the length of the sequence (for inhomogeneous models length has to be SequenceScore.getLength())
Throws:
Exception - if something went wrong

public static double getValueOfAIC(StatisticalModel m, DataSet s, int k) throws Exception
This method computes the value of Akaike's Information Criterion (AIC). It is defined as 2 * log L(t,x) - 2*k, where L(t,x) is the likelihood of the DataSet and k is the number of parameters in the model.

Parameters:
m - a trained model
s - the DataSet for the test
k - the number of parameters of the model m
Throws:
Exception - if something went wrong

public static double getValueOfBIC(StatisticalModel m, DataSet s, int k) throws Exception
This method computes the value of the Bayesian Information Criterion (BIC). It is defined as 2 * log L(t,x) - k * log n, where L(t,x) is the likelihood of the DataSet, k is the number of parameters in the model and n is the number of sequences in the DataSet.
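Both criteria can be written directly from the formulas given on this page. The sketch takes a precomputed log-likelihood instead of a model and a DataSet, assumes the natural logarithm for log n, and uses hypothetical names. With this sign convention (the negative of the common 2k - 2 log L form), larger values indicate the preferable model.

```java
/** Sketch only: information criteria from a precomputed log-likelihood. */
final class InformationCriteriaSketch {

    /** AIC as defined on this page: 2 * log L - 2 * k. */
    static double aic(double logLikelihood, int k) {
        return 2.0 * logLikelihood - 2.0 * k;
    }

    /** BIC as defined on this page: 2 * log L - k * log n (natural logarithm assumed). */
    static double bic(double logLikelihood, int k, int n) {
        return 2.0 * logLikelihood - k * Math.log(n);
    }

    public static void main(String[] args) {
        double logLikelihood = -100.0; // hypothetical value
        int k = 5;                     // number of parameters
        int n = 50;                    // number of sequences
        System.out.println(aic(logLikelihood, k));
        System.out.println(bic(logLikelihood, k, n));
    }
}
```

For data sets with more than 7 sequences, log n exceeds 2, so the BIC penalizes each parameter more strongly than the AIC.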
Parameters:
m - a trained model
s - the DataSet for the test
k - the number of parameters of the model m
Throws:
Exception - if something went wrong