java.lang.Object
de.jstacs.classifiers.AbstractClassifier
de.jstacs.classifiers.AbstractScoreBasedClassifier
de.jstacs.classifiers.differentiableSequenceScoreBased.sampling.SamplingScoreBasedClassifier
public abstract class SamplingScoreBasedClassifier
A classifier that samples the parameters of SamplingDifferentiableStatisticalModels by the Metropolis-Hastings algorithm.
The distribution the parameters are sampled from is the distribution represented by the DiffSSBasedOptimizableFunction returned by getFunction(DataSet[], double[][]). As proposal distribution, a Gaussian distribution with given sampling variance is used for each parameter.
Specifically, a new set of parameters is drawn from a proposal distribution […], where […] DifferentiableStatisticalModel.getSizeOfEventSpaceForRandomVariablesOfParameter(int). Let […] SamplingDifferentiableStatisticalModel […] that […] Random.nextDouble(). Otherwise, the parameters are rejected and […] SamplingComponent.
The contents of these files are stored together with the remaining representation of the SamplingScoreBasedClassifier if AbstractClassifier.toXML() is called and, hence, can be stored to a monolithic file containing all information for, e.g., later classification procedures.
For determining the length of the burn-in phase and, as a consequence, the beginning of the stationary phase, a BurnInTest can be provided to the constructor of the classifier.
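The sampling idea described above can be illustrated with a minimal, self-contained sketch. It is not the Jstacs implementation: the class MetropolisHastingsSketch, the interface LogDensity and the method sampleMean are hypothetical names, the proposal is assumed to be a Gaussian centred at the current parameter value, a single variance is used for all parameters (no class-specific variances and no adaptation to the size of the event space), and the burn-in length is fixed instead of being determined by a BurnInTest.

```java
import java.util.Random;

/** Hypothetical, self-contained sketch; not part of Jstacs. */
final class MetropolisHastingsSketch {

    /** Log-density (up to an additive constant) of the distribution to sample from. */
    interface LogDensity {
        double value(double[] parameters);
    }

    /**
     * Runs a random-walk Metropolis sampler and returns the mean of the
     * parameters over the steps after a fixed burn-in phase.
     */
    static double[] sampleMean(LogDensity target, double[] start, double variance,
                               int burnInSteps, int stationarySteps, Random random) {
        double sd = Math.sqrt(variance);
        double[] current = start.clone();
        double currentScore = target.value(current);
        double[] mean = new double[current.length];

        for (int step = 0; step < burnInSteps + stationarySteps; step++) {
            // Propose: an independent Gaussian step for each parameter.
            double[] proposal = new double[current.length];
            for (int i = 0; i < current.length; i++) {
                proposal[i] = current[i] + sd * random.nextGaussian();
            }
            double proposalScore = target.value(proposal);

            // Accept with probability min(1, p(proposal) / p(current)).
            // The Gaussian proposal is symmetric, so no correction term is needed.
            if (Math.log(random.nextDouble()) < proposalScore - currentScore) {
                current = proposal;
                currentScore = proposalScore;
            }
            // Otherwise the proposal is rejected and the current parameters are kept.

            if (step >= burnInSteps) {
                for (int i = 0; i < current.length; i++) {
                    mean[i] += current[i] / stationarySteps;
                }
            }
        }
        return mean;
    }

    public static void main(String[] args) {
        // Toy target: independent unit-variance Gaussians centred at 1 and -2.
        LogDensity target = p -> -0.5 * ((p[0] - 1) * (p[0] - 1) + (p[1] + 2) * (p[1] + 2));
        double[] mean = sampleMean(target, new double[] {0, 0}, 1.0, 1000, 20000, new Random(42));
        System.out.println(mean[0] + " " + mean[1]);
    }
}
```

Working with log-densities avoids numerical underflow of the acceptance ratio; whether the classifier does the same internally is not stated in this documentation.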
Nested Class Summary | |
---|---|
protected class | SamplingScoreBasedClassifier.DiffSMSamplingComponent: The SamplingComponent that handles storing and loading sampled parameter values to and from files. |
static class | SamplingScoreBasedClassifier.SamplingScheme: Sampling scheme for sampling the parameters of the scoring functions. |
Nested classes/interfaces inherited from class de.jstacs.classifiers.AbstractScoreBasedClassifier |
---|
AbstractScoreBasedClassifier.DoubleTableResult |
Field Summary | |
---|---|
protected BurnInTest | burnInTest: The BurnInTest, may be null for no test |
protected double[] | currentParameters: The currently accepted parameters |
protected double | currentScore: The score achieved using currentParameters |
protected double[] | initParameters: The initial parameters if set by setInitParameters(double[]), null otherwise |
protected double[][] | lastParameters: The last accepted parameters for all samplings, backup for iterative sampling when checking for BurnInTest |
protected double[] | lastScore: The scores yielded for the parameters in lastParameters |
protected SamplingScoreBasedClassifierParameterSet | params: Parameters |
protected double[] | previousParameters: The previously accepted parameters, backup for rollbacks |
protected SamplingDifferentiableStatisticalModel[] | scoringFunctions: SamplingDifferentiableStatisticalModels |
Constructor Summary | |
---|---|
protected | SamplingScoreBasedClassifier(SamplingScoreBasedClassifierParameterSet params, BurnInTest burnInTest, double[] classVariances, SamplingDifferentiableStatisticalModel... scoringFunctions): Creates a new SamplingScoreBasedClassifier using the parameters in params, a specified BurnInTest (or null for no burn-in test), a set of sampling variances, which may be different for each of the classes (in analogy to equivalent sample size for the Dirichlet distribution), and a set of SamplingDifferentiableStatisticalModels for each of the classes. |
 | SamplingScoreBasedClassifier(StringBuffer xml): This is the constructor for Storable. |
Method Summary | |
---|---|
protected double | doOneSamplingStep(DiffSSBasedOptimizableFunction function, SamplingScoreBasedClassifier.SamplingScheme scheme, double previousValue): Performs one sampling step, i.e., one sampling of all parameter values. |
void | doSingleSampling(DataSet[] s, double[][] weights, int numSteps, String outfilePrefix): Does a single sampling run for a predefined number of steps. |
protected void | extractFurtherClassifierInfosFromXML(StringBuffer xml): Extracts further information of a classifier from an XML representation. |
protected double[] | getBestParameters(): Returns the sampled parameter values with the maximum value of the objective function |
CategoricalResult[] | getClassifierAnnotation(): Returns an array of Results of dimension AbstractClassifier.getNumberOfClasses() that contains information about the classifier and for each class. |
boolean | getDeleteOnExit(): Returns true if the temporary parameter files shall be deleted on exit of the program. |
protected abstract DiffSSBasedOptimizableFunction | getFunction(DataSet[] data, double[][] weights): Returns the function that should be sampled from. |
protected StringBuffer | getFurtherClassifierInfos(): This method returns further information of a classifier as a StringBuffer. |
String | getInstanceName(): Returns a short description of the classifier. |
protected double[] | getMeanParameters(boolean testBurnIn, int minBurnInSteps): Returns the mean parameters over all samplings of all stationary phases. |
NumericalResultSet | getNumericalCharacteristics(): Returns the subset of numerical values that are also returned by AbstractClassifier.getCharacteristics(). |
protected SamplingScoreBasedClassifier.DiffSMSamplingComponent | getSamplingComponent(): Returns a sampling component suited for this SamplingScoreBasedClassifier |
protected double | getScore(Sequence seq, int cls, boolean check): This method returns the score for a given Sequence and a given class. |
double[] | getScores(DataSet s): This method returns the scores of the classifier for any Sequence in the DataSet. |
File | getTempDir(): Returns the directory for parameter files set in this SamplingScoreBasedClassifier. |
protected void | init(int starts, boolean adaptVariance, String outfilePrefix): Initializes all internal fields and initializes the scoringFunctions randomly |
boolean | isInitialized(): This method gives information about the state of the classifier. |
void | joinAndSetParameterFiles(boolean add, File... files): Combines parameter files such that they are accepted as parameter files of this SamplingScoreBasedClassifier |
protected double | modifyFunctionValue(double value): Allows for a modification of the value returned by the function obtained by getFunction(DataSet[], double[][]). |
protected void | precomputeBurnInLength(SamplingScoreBasedClassifier.DiffSMSamplingComponent sfsc): Precomputes the length of the burn-in phase, e.g. useful for computing scores of multiple sequences |
protected void | sample(SamplingScoreBasedClassifier.DiffSMSamplingComponent sfsc, DiffSSBasedOptimizableFunction function): Samples as many steps as needed to get into the stationary phase according to burnInTest and then samples the number of stationary steps as set in params. |
protected double | sampleNSteps(DiffSSBasedOptimizableFunction function, SamplingScoreBasedClassifier.DiffSMSamplingComponent component, BurnInTest test, int numSteps, SamplingScoreBasedClassifier.SamplingScheme scheme): Samples a predefined number of steps appended to the current sampling |
void | setDeleteOnExit(boolean deleteOnExit): If set to true (which is the default), the temporary files for storing sampled parameter values are deleted on exit of the program. |
void | setInitParameters(double[] parameters): Sets the initial parameters of the sampling to parameters. |
void | setTempDir(File tempDir): Sets the directory for parameter files set in this SamplingScoreBasedClassifier. |
void | train(DataSet[] s, double[][] weights): This method trains a classifier over an array of weighted DataSets. |
Methods inherited from class de.jstacs.classifiers.AbstractScoreBasedClassifier |
---|
check, check, classify, classify, clone, createDefaultClassWeights, getClassWeight, getClassWeights, getMultiClassScores, getNumberOfClasses, getPValue, getPValue, getResults, getScore, setClassWeights, setClassWeights, setThresholdClassWeights |
Methods inherited from class de.jstacs.classifiers.AbstractClassifier |
---|
classify, evaluate, getAlphabetContainer, getCharacteristics, getLength, getXMLTag, toXML, train |
Methods inherited from class java.lang.Object |
---|
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
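protected SamplingScoreBasedClassifierParameterSet params
Parameters

protected SamplingDifferentiableStatisticalModel[] scoringFunctions
SamplingDifferentiableStatisticalModels

protected double[] currentParameters
The currently accepted parameters

protected double[] initParameters
The initial parameters if set by setInitParameters(double[]), null otherwise

protected double currentScore
The score achieved using currentParameters

protected double[] previousParameters
The previously accepted parameters, backup for rollbacks

protected double[][] lastParameters
The last accepted parameters for all samplings, backup for iterative sampling when checking for BurnInTest

protected double[] lastScore
The scores yielded for the parameters in lastParameters

protected BurnInTest burnInTest
The BurnInTest, may be null for no test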
Constructor Detail |
---|
public SamplingScoreBasedClassifier(StringBuffer xml) throws NonParsableException
This is the constructor for Storable.
Parameters:
xml - the XML representation
Throws:
NonParsableException - if the representation could not be parsed.

protected SamplingScoreBasedClassifier(SamplingScoreBasedClassifierParameterSet params, BurnInTest burnInTest, double[] classVariances, SamplingDifferentiableStatisticalModel... scoringFunctions) throws CloneNotSupportedException
Creates a new SamplingScoreBasedClassifier using the parameters in params, a specified BurnInTest (or null for no burn-in test), a set of sampling variances, which may be different for each of the classes (in analogy to equivalent sample size for the Dirichlet distribution), and a set of SamplingDifferentiableStatisticalModels for each of the classes.
Parameters:
params - the external parameters of this classifier
burnInTest - the burn-in test (or null for no burn-in test)
classVariances - the variances used for sampling for the parameters of each class
scoringFunctions - the scoring functions for each of the classes
Throws:
CloneNotSupportedException - if the scoring functions or the burn-in test could not be cloned
See Also:
VarianceRatioBurnInTest
Method Detail |
---|
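protected StringBuffer getFurtherClassifierInfos()
Description copied from class: AbstractClassifier
This method returns further information of a classifier as a StringBuffer. This method is used by the method AbstractClassifier.toXML() and should not be made public.
getFurtherClassifierInfos in class AbstractScoreBasedClassifier
Returns:
further information of a classifier as a StringBuffer
See Also:
AbstractClassifier.toXML()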

protected void extractFurtherClassifierInfosFromXML(StringBuffer xml) throws NonParsableException
Description copied from class: AbstractClassifier
Extracts further information of a classifier from an XML representation. This method is used by the method AbstractClassifier.fromXML(StringBuffer) and should not be made public.
extractFurtherClassifierInfosFromXML in class AbstractScoreBasedClassifier
Parameters:
xml - the XML representation as StringBuffer
Throws:
NonParsableException - if the information could not be parsed out of the XML representation (the StringBuffer could not be parsed)
See Also:
AbstractClassifier.fromXML(StringBuffer)

public CategoricalResult[] getClassifierAnnotation()
Description copied from class: AbstractClassifier
Returns an array of Results of dimension AbstractClassifier.getNumberOfClasses() that contains information about the classifier and for each class.
res[0] = new CategoricalResult( "classifier", "the kind of classifier", getInstanceName() );
res[1] = new CategoricalResult( "class info 0", "some information about the class", "info0" );
res[2] = new CategoricalResult( "class info 1", "some information about the class", "info1" );
...
getClassifierAnnotation in class AbstractClassifier
Returns:
an array of Results that contains information about the classifier

public NumericalResultSet getNumericalCharacteristics() throws Exception
Description copied from class: AbstractClassifier
Returns the subset of numerical values that are also returned by AbstractClassifier.getCharacteristics().
getNumericalCharacteristics in class AbstractClassifier
Throws:
Exception - if some of the characteristics could not be defined

public String getInstanceName()
Description copied from class: AbstractClassifier
Returns a short description of the classifier.
getInstanceName in class AbstractClassifier

protected abstract DiffSSBasedOptimizableFunction getFunction(DataSet[] data, double[][] weights) throws Exception
Returns the function that should be sampled from.
Parameters:
data - the samples
weights - the weights of the sequences of the samples
Throws:
Exception - if the function could not be created

protected double modifyFunctionValue(double value)
Allows for a modification of the value returned by the function obtained by getFunction(DataSet[], double[][]). This is for instance necessary in case of LogGenDisMixFunction to obtain a proper posterior or supervised posterior.
Parameters:
value - the original value

protected SamplingScoreBasedClassifier.DiffSMSamplingComponent getSamplingComponent()
Returns a sampling component suited for this SamplingScoreBasedClassifier

public File getTempDir()
Returns the directory for parameter files set in this SamplingScoreBasedClassifier. If this value is null, the default directory of the executing OS is used for the parameter files.

public void setTempDir(File tempDir)
Sets the directory for parameter files set in this SamplingScoreBasedClassifier. If tempDir is null, the default directory of the executing OS is used for the parameter files. If this value is reset after training, all sampled parameters will be lost. The value set by this method is not stored in the XML-representation.
Parameters:
tempDir - the temp directory

public boolean getDeleteOnExit()
Returns true if the temporary parameter files shall be deleted on exit of the program.
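
The behaviour of the temporary directory and of deletion on exit follows from the standard java.io.File API. The following sketch shows only that API and is not how this class manages its parameter files internally; the class TempParameterFileSketch, the method createParameterFile and the ".dat" suffix are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical helper; illustrates java.io.File behaviour only, not Jstacs internals. */
final class TempParameterFileSketch {

    /**
     * Creates a temporary file in tempDir, or in the default temporary
     * directory if tempDir is null, and optionally registers it for deletion
     * when the JVM exits.
     */
    static File createParameterFile(String prefix, File tempDir, boolean deleteOnExit) {
        try {
            // A null directory selects the system default temporary directory.
            File file = File.createTempFile(prefix, ".dat", tempDir);
            if (deleteOnExit) {
                // java.io.File offers no way to undo this registration later.
                file.deleteOnExit();
            }
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File file = createParameterFile("samplingSketch", null, true);
        System.out.println(file.getAbsolutePath());
    }
}
```

Because a registration made by File.deleteOnExit() cannot be cancelled, a flag of this kind can only be switched off before the first file has been registered.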

public void setDeleteOnExit(boolean deleteOnExit) throws Exception
If set to true (which is the default), the temporary files for storing sampled parameter values are deleted on exit of the program. Once this value has been set to true, it cannot be reset to false after sampling has started, due to the restrictions of File.deleteOnExit(). If you want to retain those parameters nonetheless, you can call AbstractClassifier.toXML() and save the resulting StringBuffer, which also contains the sampled parameter values. The value set by this method is not stored in the XML-representation.
Parameters:
deleteOnExit - if temp files shall be deleted on exit
Throws:
Exception - if set to false after sampling started

protected void init(int starts, boolean adaptVariance, String outfilePrefix) throws Exception
Initializes all internal fields and initializes the scoringFunctions randomly
Parameters:
starts - number of starts
adaptVariance - if true, variance is adapted to size of event space
outfilePrefix - the prefix of the outfiles
Throws:
Exception - if the scoring functions could not be initialized

protected double sampleNSteps(DiffSSBasedOptimizableFunction function, SamplingScoreBasedClassifier.DiffSMSamplingComponent component, BurnInTest test, int numSteps, SamplingScoreBasedClassifier.SamplingScheme scheme) throws Exception
Samples a predefined number of steps appended to the current sampling
Parameters:
function - the objective function
component - the sampling component with selected sampling
test - the burn-in test
numSteps - the number of steps
scheme - the SamplingScoreBasedClassifier.SamplingScheme
Throws:
Exception - if either the function could not be evaluated on the current parameters or the sampled parameters could not be stored

protected void sample(SamplingScoreBasedClassifier.DiffSMSamplingComponent sfsc, DiffSSBasedOptimizableFunction function) throws Exception
Samples as many steps as needed to get into the stationary phase according to burnInTest and then samples the number of stationary steps as set in params.
Parameters:
sfsc - the current sampling component
function - the objective function
Throws:
Exception - if the sampling could not be extended, e.g. due to evaluation errors

protected double doOneSamplingStep(DiffSSBasedOptimizableFunction function, SamplingScoreBasedClassifier.SamplingScheme scheme, double previousValue) throws Exception
Performs one sampling step, i.e., one sampling of all parameter values.
Parameters:
function - the objective function
scheme - the SamplingScoreBasedClassifier.SamplingScheme
previousValue - the value of the last sampling or minus infinity for the first sampling run
Returns:
Double.NaN if none of the sampled parameters were accepted
Throws:
Exception - if the function could not be evaluated or an unknown SamplingScoreBasedClassifier.SamplingScheme was provided

protected double getScore(Sequence seq, int cls, boolean check) throws IllegalArgumentException, NotTrainedException, Exception
Description copied from class: AbstractScoreBasedClassifier
This method returns the score for a given Sequence and a given class.
getScore in class AbstractScoreBasedClassifier
Parameters:
seq - the Sequence
cls - the index of the class
check - the switch to decide whether to check AlphabetContainer and the length of the Sequence or not
Returns:
the score for a given Sequence and a given class
Throws:
IllegalArgumentException - if something is wrong with the Sequence seq
NotTrainedException - if the classifier is not trained
Exception - if something went wrong

public double[] getScores(DataSet s) throws Exception
Description copied from class: AbstractScoreBasedClassifier
This method returns the scores of the classifier for any Sequence in the DataSet. The scores are stored in the array according to the index of the Sequence in the DataSet.
getScores in class AbstractScoreBasedClassifier
Parameters:
s - the DataSet
Throws:
Exception - if something went wrong

public void setInitParameters(double[] parameters)
Sets the initial parameters of the sampling to parameters.
Parameters:
parameters - the initial parameters

public boolean isInitialized()
Description copied from class: AbstractClassifier
This method gives information about the state of the classifier.
isInitialized in class AbstractClassifier
Returns:
true if the classifier is initialized and therefore able to classify sequences, otherwise false

public void doSingleSampling(DataSet[] s, double[][] weights, int numSteps, String outfilePrefix) throws Exception
Does a single sampling run for a predefined number of steps.
Parameters:
s - the data
weights - the weights for the data
numSteps - the number of sampling steps
outfilePrefix - the prefix of the outfile where the parameter values are stored
Throws:
Exception - if the scoring functions could not be initialized or the sampling could not be extended, e.g. due to evaluation errors

public void train(DataSet[] s, double[][] weights) throws Exception
Description copied from class: AbstractClassifier
This method trains a classifier over an array of weighted DataSets. That is why the following has to be fulfilled:
s.length == weights.length
weights[i] == null || s[i].getNumberOfElements() == weights[i].length
[…] AbstractClassifier.train(DataSet...). […] DataSets are defined over the underlying alphabet and length.
train in class AbstractClassifier
Parameters:
s - an array of DataSets
weights - the weights for the DataSets
Throws:
Exception - if the weights are incorrect or the training did not succeed
See Also:
AbstractClassifier.train(DataSet...)
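
The two conditions on s and weights stated for train(DataSet[], double[][]) can be checked before training. The sketch below does so on plain arrays so that it runs without Jstacs: dataSetSizes[i] stands in for s[i].getNumberOfElements(), and the class TrainingWeightsCheckSketch and its method weightsMatch are hypothetical. Treating a null outer weights array as a mismatch is an assumption of this sketch, not a documented rule.

```java
/** Hypothetical stand-alone check; not part of Jstacs. */
final class TrainingWeightsCheckSketch {

    /**
     * Checks the two conditions stated for train(DataSet[], double[][]).
     * dataSetSizes[i] stands in for s[i].getNumberOfElements().
     */
    static boolean weightsMatch(int[] dataSetSizes, double[][] weights) {
        // Assumption of this sketch: a missing outer array counts as a mismatch.
        if (weights == null || dataSetSizes.length != weights.length) {
            return false;
        }
        for (int i = 0; i < weights.length; i++) {
            // A null entry is allowed by the stated condition.
            if (weights[i] != null && weights[i].length != dataSetSizes[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] sizes = {3, 2};
        double[][] weights = {null, {0.5, 1.5}};
        System.out.println(weightsMatch(sizes, weights));
    }
}
```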

protected void precomputeBurnInLength(SamplingScoreBasedClassifier.DiffSMSamplingComponent sfsc) throws Exception
Precomputes the length of the burn-in phase, e.g. useful for computing scores of multiple sequences
Parameters:
sfsc - the current sampling component
Throws:
Exception - if the parameter values could not be parsed

protected double[] getBestParameters() throws Exception
Returns the sampled parameter values with the maximum value of the objective function
Throws:
Exception - if the parameter values could not be parsed

protected double[] getMeanParameters(boolean testBurnIn, int minBurnInSteps) throws Exception
Returns the mean parameters over all samplings of all stationary phases.
Parameters:
testBurnIn - true if the length of the burn-in phase shall be computed
minBurnInSteps - minimum number of steps considered as burn-in
Throws:
Exception - if the parameter values could not be parsed

public void joinAndSetParameterFiles(boolean add, File... files) throws Exception
Combines parameter files such that they are accepted as parameter files of this SamplingScoreBasedClassifier
Parameters:
add - if true, parameter files are appended to the current ones, i.e., the number of samplings is augmented by these files
files - the parameter files
Throws:
Exception - if the parameter files could not be joined
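
The summaries computed by getBestParameters() and getMeanParameters(boolean, int) can be illustrated on an in-memory sampling run. This is a sketch under simplifying assumptions, not the class's implementation: it handles a single sampling with a fixed burn-in length, whereas the class reads sampled values from parameter files and may combine several samplings. The class SampledParameterSummarySketch and its methods are hypothetical.

```java
/** Hypothetical summaries over one in-memory sampling run; not part of Jstacs. */
final class SampledParameterSummarySketch {

    /** Returns a copy of the sampled parameters with the maximum score. */
    static double[] best(double[][] samples, double[] scores) {
        int bestIndex = 0;
        for (int i = 1; i < scores.length; i++) {
            if (scores[i] > scores[bestIndex]) {
                bestIndex = i;
            }
        }
        return samples[bestIndex].clone();
    }

    /** Returns the mean of the parameters over all steps after the burn-in phase. */
    static double[] meanAfterBurnIn(double[][] samples, int burnInLength) {
        int stationarySteps = samples.length - burnInLength;
        if (burnInLength < 0 || stationarySteps <= 0) {
            throw new IllegalArgumentException("no stationary samples available");
        }
        double[] mean = new double[samples[0].length];
        for (int step = burnInLength; step < samples.length; step++) {
            for (int i = 0; i < mean.length; i++) {
                mean[i] += samples[step][i] / stationarySteps;
            }
        }
        return mean;
    }

    public static void main(String[] args) {
        double[][] samples = {{0, 0}, {2, 4}, {4, 8}};
        double[] scores = {-3, -1, -2};
        System.out.println(java.util.Arrays.toString(best(samples, scores)));
        System.out.println(java.util.Arrays.toString(meanAfterBurnIn(samples, 1)));
    }
}
```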