public abstract class AbstractClassifier extends Object implements Storable, Cloneable
The data sets passed to methods such as train, test and evaluate
should always be in the same order that was used
when the object was instantiated.
Constructor | Description |
---|---|
AbstractClassifier(AlphabetContainer abc) | The constructor for a homogeneous classifier. |
AbstractClassifier(AlphabetContainer abc, int length) | The constructor for an inhomogeneous classifier. |
AbstractClassifier(StringBuffer xml) | The standard constructor for the interface Storable. |
Modifier and Type | Method | Description |
---|---|---|
byte[] | classify(DataSet s) | This method classifies all sequences of a data set and returns an array of indices of the classes to which the respective sequences are assigned, where each index i in the array satisfies 0 <= i < getNumberOfClasses(). |
abstract byte | classify(Sequence seq) | This method classifies a sequence and returns the index i of the class to which the sequence is assigned, with 0 <= i < getNumberOfClasses(). |
AbstractClassifier | clone() | |
ResultSet | evaluate(AbstractPerformanceMeasureParameterSet<? extends PerformanceMeasure> params, boolean exceptionIfNotComputeable, DataSet... s) | This method evaluates the classifier and computes, for instance, the sensitivity for a given specificity, the area under the ROC curve and so on. |
ResultSet | evaluate(AbstractPerformanceMeasureParameterSet<? extends PerformanceMeasure> params, boolean exceptionIfNotComputeable, DataSet[] s, double[][] weights) | This method evaluates the classifier and computes, for instance, the sensitivity for a given specificity, the area under the ROC curve and so on. |
protected abstract void | extractFurtherClassifierInfosFromXML(StringBuffer xml) | Extracts further information of a classifier from an XML representation. |
AlphabetContainer | getAlphabetContainer() | This method returns the container of alphabets that is used in the classifier. |
ResultSet | getCharacteristics() | Returns some information characterizing or describing the current instance of the classifier. |
abstract CategoricalResult[] | getClassifierAnnotation() | Returns an array of Results of dimension getNumberOfClasses() that contains information about the classifier and for each class. For example: res[0] = new CategoricalResult( "classifier", "the kind of classifier", getInstanceName() ); |
protected abstract StringBuffer | getFurtherClassifierInfos() | This method returns further information of a classifier as a StringBuffer. |
abstract String | getInstanceName() | Returns a short description of the classifier. |
int | getLength() | Returns the length of the sequences this classifier can handle or 0 for sequences of arbitrary length. |
protected double[][][] | getMultiClassScores(DataSet[] s) | This method returns a multidimensional array with class specific scores. |
abstract int | getNumberOfClasses() | Returns the number of classes that can be distinguished. |
abstract NumericalResultSet | getNumericalCharacteristics() | Returns the subset of numerical values that are also returned by getCharacteristics(). |
protected boolean | getResults(LinkedList list, DataSet[] s, double[][] weights, AbstractPerformanceMeasureParameterSet<? extends PerformanceMeasure> params, boolean exceptionIfNotComputeable) | This method computes the results for any evaluation of the classifier. |
protected abstract String | getXMLTag() | Returns the String that is used as tag for the XML representation of the classifier. |
abstract boolean | isInitialized() | This method gives information about the state of the classifier. |
StringBuffer | toXML() | This method returns an XML representation as StringBuffer of an instance of the implementing class. |
void | train(DataSet... s) | Trains the AbstractClassifier object given the data as DataSets. This method should work non-incrementally. |
abstract void | train(DataSet[] s, double[][] weights) | This method trains a classifier over an array of weighted DataSets. |
public AbstractClassifier(AlphabetContainer abc)
Parameters:
abc - the alphabets that are used
See Also:
AbstractClassifier(AlphabetContainer, int)
public AbstractClassifier(AlphabetContainer abc, int length) throws IllegalArgumentException
Parameters:
abc - the alphabets that are used
length - the length of the sequences that can be classified
Throws:
IllegalArgumentException - if the length and the possible length of the AlphabetContainer do not match
public AbstractClassifier(StringBuffer xml) throws NonParsableException
The standard constructor for the interface Storable. Creates a new AbstractClassifier out of its XML representation.
Parameters:
xml - the XML representation as StringBuffer
Throws:
NonParsableException - if the AbstractClassifier could not be reconstructed out of the XML representation (the StringBuffer could not be parsed)
See Also:
Storable
public abstract byte classify(Sequence seq) throws Exception
This method classifies a sequence and returns the index i of the class to which the sequence is assigned, with 0 <= i < getNumberOfClasses().
Parameters:
seq - the sequence to be classified
Throws:
Exception - if the classifier is not trained or something is wrong with the sequence
public byte[] classify(DataSet s) throws Exception
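This method classifies all sequences of a data set and returns an array of indices of the classes to which the respective sequences are assigned, where each index i in the array satisfies 0 <= i < getNumberOfClasses().
Parameters:
s - the data set to be classified
Throws:
Exception - if something went wrong during the classification
public AbstractClassifier clone() throws CloneNotSupportedException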
Overrides:
clone in class Object
Throws:
CloneNotSupportedException
public final ResultSet evaluate(AbstractPerformanceMeasureParameterSet<? extends PerformanceMeasure> params, boolean exceptionIfNotComputeable, DataSet... s) throws Exception
This method evaluates the classifier and computes, for instance, the sensitivity for a given specificity, the area under the ROC curve and so on. It is meant for use within a ClassifierAssessment such as cross-validation, hold-out sampling, and so on.
Parameters:
params - the current parameters defining the set of AbstractPerformanceMeasures to be evaluated
exceptionIfNotComputeable - indicates that the method throws an Exception if a measure could not be computed
s - the array of DataSets
Returns:
NumericalResultSet, otherwise ResultSet
Throws:
Exception - if something went wrong
See Also:
evaluate(AbstractPerformanceMeasureParameterSet, boolean, DataSet[], double[][])
public final ResultSet evaluate(AbstractPerformanceMeasureParameterSet<? extends PerformanceMeasure> params, boolean exceptionIfNotComputeable, DataSet[] s, double[][] weights) throws Exception
This method evaluates the classifier and computes, for instance, the sensitivity for a given specificity, the area under the ROC curve and so on. It is meant for use within a ClassifierAssessment such as cross-validation, hold-out sampling, and so on.
Parameters:
params - the current parameters defining the set of AbstractPerformanceMeasures to be evaluated
exceptionIfNotComputeable - indicates that the method throws an Exception if a measure could not be computed
s - the array of DataSets
weights - the weights of the sequences for each data set
Returns:
NumericalResultSet, otherwise ResultSet
Throws:
Exception - if something went wrong
See Also:
NumericalResultSet
ResultSet
getResults(LinkedList, DataSet[], double[][], AbstractPerformanceMeasureParameterSet, boolean)
ClassifierAssessment
ClassifierAssessment.assess(de.jstacs.classifiers.performanceMeasures.NumericalPerformanceMeasureParameterSet, de.jstacs.classifiers.assessment.ClassifierAssessmentAssessParameterSet, DataSet...)
protected boolean getResults(LinkedList list, DataSet[] s, double[][] weights, AbstractPerformanceMeasureParameterSet<? extends PerformanceMeasure> params, boolean exceptionIfNotComputeable) throws Exception
Parameters:
list - a list to which the results are added
s - the array of DataSets
weights - the weights of the sequences for each data set
params - the current parameters
exceptionIfNotComputeable - indicates that the method throws an Exception if a measure could not be computed
Throws:
Exception - if something went wrong
See Also:
evaluate(AbstractPerformanceMeasureParameterSet, boolean, DataSet[], double[][])
NumericalResult
Result
protected double[][][] getMultiClassScores(DataSet[] s) throws Exception
This method returns a multidimensional array with class specific scores. The entry result[d][n][c] is the score of class c for sequence n of the data set s[d]. The class with the maximum score for any sequence is the predicted class of the sequence.
Parameters:
s - the data sets
Throws:
Exception - if the scores cannot be computed
See Also:
getResults(LinkedList, DataSet[], double[][], AbstractPerformanceMeasureParameterSet, boolean)
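The rule that the class with the maximum score is the predicted class can be sketched on a plain score array. This is only an illustration: ScoreArgMax and predict are hypothetical names that are not part of the documented API, and resolving ties in favour of the lowest class index is an assumption of this sketch.

```java
import java.util.Arrays;

// Illustrative sketch only: ScoreArgMax and predict are hypothetical names,
// not members of AbstractClassifier.
final class ScoreArgMax {

    /**
     * Maps scores[d][n][c] (data set d, sequence n, class c) to the index of
     * the best scoring class for every sequence of every data set.
     * Assumption: ties go to the lowest class index.
     */
    static byte[][] predict(double[][][] scores) {
        byte[][] predicted = new byte[scores.length][];
        for (int d = 0; d < scores.length; d++) {
            predicted[d] = new byte[scores[d].length];
            for (int n = 0; n < scores[d].length; n++) {
                int best = 0;
                for (int c = 1; c < scores[d][n].length; c++) {
                    if (scores[d][n][c] > scores[d][n][best]) {
                        best = c;
                    }
                }
                predicted[d][n] = (byte) best;
            }
        }
        return predicted;
    }

    public static void main(String[] args) {
        // Two data sets, two classes; the values are made up for the example.
        double[][][] scores = {
            { { -1.2, -0.3 }, { -0.1, -2.0 } },
            { { -0.7, -0.7 } }
        };
        System.out.println(Arrays.deepToString(predict(scores))); // prints [[1, 0], [0]]
    }
}
```

Every returned index lies in the range from 0 up to, but not including, the number of classes, which matches the byte indices returned by the classify methods.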
public final AlphabetContainer getAlphabetContainer()
public ResultSet getCharacteristics() throws Exception
Returns some information characterizing or describing the current instance of the classifier.
Throws:
Exception - if some of the characteristics could not be defined
See Also:
StorableResult
getNumericalCharacteristics()
ResultSet.ResultSet(de.jstacs.results.Result[][])
public abstract String getInstanceName()
public abstract CategoricalResult[] getClassifierAnnotation()
Returns an array of Results of dimension getNumberOfClasses() that contains information about the classifier and for each class.
res[0] = new CategoricalResult( "classifier", "the kind of classifier", getInstanceName() );
res[1] = new CategoricalResult( "class info 0", "some information about the class", "info0" );
res[2] = new CategoricalResult( "class info 1", "some information about the class", "info1" );
...
Returns:
an array of Results that contains information about the classifier
public final int getLength()
Returns the length of the sequences this classifier can handle or 0 for sequences of arbitrary length.
public abstract NumericalResultSet getNumericalCharacteristics() throws Exception
Returns the subset of numerical values that are also returned by getCharacteristics().
Throws:
Exception - if some of the characteristics could not be defined
public abstract int getNumberOfClasses()
public abstract boolean isInitialized()
Returns:
true if the classifier is initialized and therefore able to classify sequences, otherwise false
public void train(DataSet... s) throws Exception
Trains the AbstractClassifier object given the data as DataSets. This method should work non-incrementally: the result of
train(data1); train(data2);
should be a fully trained model over data2 and not over data1, data2.
The DataSets are expected to be defined over the underlying alphabet and length.
Parameters:
s - the data
Throws:
Exception - if the training did not succeed
See Also:
train(DataSet[], double[][])
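The non-incremental contract can be illustrated with a toy model that is not part of the library. MeanClassifier below is a hypothetical stand-in that keeps one mean value per class and takes plain double arrays instead of DataSets. The point of the sketch is only that each call to train replaces the previous state instead of merging with it.

```java
import java.util.Arrays;

// Hypothetical stand-in, not a library class: stores one mean per class.
final class MeanClassifier {

    private double[] means; // null until train has been called

    /** Trains from scratch; any state left by an earlier call is discarded. */
    void train(double[]... classData) {
        double[] fresh = new double[classData.length];
        for (int c = 0; c < classData.length; c++) {
            // An empty class falls back to 0.0 (an arbitrary choice for this sketch).
            fresh[c] = Arrays.stream(classData[c]).average().orElse(0.0);
        }
        means = fresh; // replace the old parameters, never merge with them
    }

    /** Mirrors the idea of isInitialized(): true once the model can be used. */
    boolean isInitialized() {
        return means != null;
    }

    double[] getMeans() {
        if (means == null) {
            throw new IllegalStateException("classifier is not trained");
        }
        return means.clone();
    }
}
```

After train(data1) followed by train(data2) on such an object, the stored means depend on data2 only, which is the behaviour the description asks for.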
public abstract void train(DataSet[] s, double[][] weights) throws Exception
This method trains a classifier over an array of weighted DataSets. That is why the following has to be fulfilled:
s.length == weights.length
weights[i] == null || s[i].getNumberOfElements() == weights[i].length
As with train(DataSet...), the DataSets are expected to be defined over the underlying alphabet and length.
Parameters:
s - an array of DataSets
weights - the weights for the DataSets
Throws:
Exception - if the weights are incorrect or the training did not succeed
See Also:
train(DataSet...)
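The two conditions on the weights can be checked before any training starts. The following is a minimal sketch under stated assumptions: WeightCheck is a hypothetical helper, the int array sizes stands in for the values of s[i].getNumberOfElements(), and the weights array itself is assumed to be non-null.

```java
// Hypothetical helper, not part of the documented API.
final class WeightCheck {

    /**
     * Verifies the two documented conditions.
     * sizes[i] stands in for s[i].getNumberOfElements();
     * the outer weights array is assumed to be non-null here.
     */
    static void check(int[] sizes, double[][] weights) {
        // Condition 1: s.length == weights.length
        if (sizes.length != weights.length) {
            throw new IllegalArgumentException(
                "expected " + sizes.length + " weight arrays but got " + weights.length);
        }
        // Condition 2: weights[i] == null || sizes[i] == weights[i].length
        for (int i = 0; i < sizes.length; i++) {
            if (weights[i] != null && weights[i].length != sizes[i]) {
                throw new IllegalArgumentException(
                    "data set " + i + " has " + sizes[i] + " elements but "
                        + weights[i].length + " weights");
            }
        }
    }
}
```

A null entry in weights passes the check, in line with the second condition; how an implementation interprets a null entry is not stated in this section.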
protected abstract String getXMLTag()
Returns the String that is used as tag for the XML representation of the classifier. This method is used by the methods fromXML(StringBuffer) and toXML().
Returns:
the String that is used as tag for the XML representation of the classifier
protected abstract void extractFurtherClassifierInfosFromXML(StringBuffer xml) throws NonParsableException
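Extracts further information of a classifier from an XML representation. This method is used by the method fromXML(StringBuffer) and should not be made public.
Parameters:
xml - the XML representation as StringBuffer
Throws:
NonParsableException - if the information could not be parsed out of the XML representation (the StringBuffer could not be parsed)
See Also:
fromXML(StringBuffer)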
public final StringBuffer toXML()
This method returns an XML representation as StringBuffer of an instance of the implementing class.
Specified by:
toXML in interface Storable
protected abstract StringBuffer getFurtherClassifierInfos()
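This method returns further information of a classifier as a StringBuffer. This method is used by the method toXML() and should not be made public.
Returns:
further information of a classifier as a StringBuffer
See Also:
toXML()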
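The division of labour between the final toXML() and the protected hooks getXMLTag() and getFurtherClassifierInfos() follows the template method pattern. The sketch below shows that pattern in isolation. XmlSketch and ToyClassifierXml are hypothetical classes, and the XML layout they produce is an assumption made for illustration; the actual layout written by AbstractClassifier is not described in this section and very likely contains more than shown here.

```java
// Hypothetical base class illustrating the pattern only; not the library's implementation.
abstract class XmlSketch {

    /** Hook: the tag that encloses the XML representation. */
    protected abstract String getXMLTag();

    /** Hook: subclass specific content, already rendered as XML. */
    protected abstract StringBuffer getFurtherClassifierInfos();

    /** Fixed skeleton: subclasses contribute through the hooks but cannot override it. */
    public final StringBuffer toXML() {
        String tag = getXMLTag();
        StringBuffer xml = new StringBuffer();
        xml.append('<').append(tag).append('>');
        xml.append(getFurtherClassifierInfos());
        xml.append("</").append(tag).append('>');
        return xml;
    }
}

// Hypothetical subclass with made-up content.
final class ToyClassifierXml extends XmlSketch {

    @Override
    protected String getXMLTag() {
        return "ToyClassifier";
    }

    @Override
    protected StringBuffer getFurtherClassifierInfos() {
        return new StringBuffer("<threshold>0.5</threshold>");
    }
}
```

Keeping the hooks protected means that callers only ever see the complete representation produced by toXML(), which is consistent with the note that these methods should not be made public.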