Package weka.classifiers.mi
Class MISMO
java.lang.Object
  weka.classifiers.Classifier
    weka.classifiers.mi.MISMO
All Implemented Interfaces:
java.io.Serializable, java.lang.Cloneable, CapabilitiesHandler, MultiInstanceCapabilitiesHandler, OptionHandler, RevisionHandler, TechnicalInformationHandler, WeightedInstancesHandler
public class MISMO extends Classifier implements WeightedInstancesHandler, MultiInstanceCapabilitiesHandler, TechnicalInformationHandler
Implements John Platt's sequential minimal optimization algorithm for training a support vector classifier.
This implementation globally replaces all missing values and transforms nominal attributes into binary ones. It also normalizes all attributes by default. In that case the coefficients in the output are based on the normalized data, not the original data, which matters when interpreting the classifier.
Multi-class problems are solved using pairwise classification.
To obtain proper probability estimates, use the -M option, which fits logistic regression models to the outputs of the support vector machine. In the multi-class case the predicted probabilities are coupled using Hastie and Tibshirani's pairwise coupling method.
Note: for improved speed, normalization should be turned off when operating on SparseInstances.
For more information on the SMO algorithm, see:
J. Platt (1998). Fast Training of Support Vector Machines using Sequential Minimal Optimization. In B. Schoelkopf, C. Burges and A. Smola, editors, Advances in Kernel Methods - Support Vector Learning. MIT Press.
S.S. Keerthi, S.K. Shevade, C. Bhattacharyya, K.R.K. Murthy (2001). Improvements to Platt's SMO Algorithm for SVM Classifier Design. Neural Computation. 13(3):637-649.
BibTeX:
@incollection{Platt1998,
  author = {J. Platt},
  booktitle = {Advances in Kernel Methods - Support Vector Learning},
  editor = {B. Schoelkopf and C. Burges and A. Smola},
  publisher = {MIT Press},
  title = {Fast Training of Support Vector Machines using Sequential Minimal Optimization},
  year = {1998}
}
@article{Keerthi2001,
  author = {S.S. Keerthi and S.K. Shevade and C. Bhattacharyya and K.R.K. Murthy},
  journal = {Neural Computation},
  number = {3},
  pages = {637-649},
  title = {Improvements to Platt's SMO Algorithm for SVM Classifier Design},
  volume = {13},
  year = {2001}
}
Valid options are:
-D If set, classifier is run in debug mode and may output additional info to the console
-no-checks Turns off all checks - use with caution! Turning them off assumes that data is purely numeric, doesn't contain any missing values, and has a nominal class. Turning them off also means that no header information will be stored if the machine is linear. Finally, it also assumes that no instance has a weight equal to 0. (default: checks on)
-C <double> The complexity constant C. (default 1)
-N Whether to 0=normalize/1=standardize/2=neither. (default 0=normalize)
-I Use MIminimax feature space.
-L <double> The tolerance parameter. (default 1.0e-3)
-P <double> The epsilon for round-off error. (default 1.0e-12)
-M Fit logistic models to SVM outputs.
-V <int> The number of folds for the internal cross-validation. (default -1, use training data)
-W <int> The random number seed. (default 1)
-K <classname and parameters> The Kernel to use. (default: weka.classifiers.functions.supportVector.PolyKernel)
Options specific to kernel weka.classifiers.mi.supportVector.MIPolyKernel:
-D Enables debugging output (if available) to be printed. (default: off)
-no-checks Turns off all checks - use with caution! (default: checks on)
-C <num> The size of the cache (a prime number), 0 for full cache and -1 to turn it off. (default: 250007)
-E <num> The Exponent to use. (default: 1.0)
-L Use lower-order terms. (default: no)
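As a rough illustration of how an option line like the ones above breaks down into the string array that setOptions(java.lang.String[]) expects, the standalone sketch below tokenizes a command line while keeping the quoted -K kernel specification together as one argument. The class OptionStringSketch and its split method are hypothetical helpers written for this example and are not part of Weka. Weka ships its own option-splitting utility (weka.core.Utils.splitOptions), whose behaviour may differ in details such as escape handling.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Weka: splits an option line into tokens.
final class OptionStringSketch {

    /** Splits on whitespace; double-quoted segments stay together and the quotes are dropped. */
    static String[] split(String line) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        boolean hasToken = false;
        for (int i = 0; i < line.length(); i++) {
            char ch = line.charAt(i);
            if (ch == '"') {
                inQuotes = !inQuotes;      // toggle quoted mode, do not keep the quote itself
                hasToken = true;
            } else if (Character.isWhitespace(ch) && !inQuotes) {
                if (hasToken) {            // whitespace outside quotes ends the current token
                    tokens.add(current.toString());
                    current.setLength(0);
                    hasToken = false;
                }
            } else {
                current.append(ch);
                hasToken = true;
            }
        }
        if (hasToken) {
            tokens.add(current.toString());
        }
        return tokens.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Option values taken from the defaults listed in the documentation above.
        String line = "-C 1.0 -N 0 -L 1.0e-3 -P 1.0e-12 -M "
                + "-K \"weka.classifiers.mi.supportVector.MIPolyKernel -C 250007 -E 1.0\"";
        for (String token : split(line)) {
            System.out.println(token);
        }
    }
}
```

The kernel class name and its own options arrive as a single token after -K, which is why the kernel-specific -C (cache size) does not clash with the classifier-level -C (complexity constant).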
- Version:
- $Revision: 9144 $
- Author:
- Eibe Frank (eibe@cs.waikato.ac.nz), Shane Legg (shane@intelligenesis.net) (sparse vector code), Stuart Inglis (stuart@reeltwo.com) (sparse vector code), Lin Dong (ld21@cs.waikato.ac.nz) (code for adapting to MI data)
- See Also:
- Serialized Form
-
-
Field Summary
Fields (modifier and type, field: description)
static int FILTER_NONE: No normalization/standardization
static int FILTER_NORMALIZE: Normalize training data
static int FILTER_STANDARDIZE: Standardize training data
static Tag[] TAGS_FILTER: The filter to apply to the training data
-
Constructor Summary
MISMO()
-
Method Summary
Methods (modifier and type, method: description)
java.lang.String[][][] attributeNames(): Returns the attribute names.
double[][] bias(): Returns the bias of each binary SMO.
void buildClassifier(Instances insts): Method for building the classifier.
java.lang.String buildLogisticModelsTipText(): Returns the tip text for this property.
java.lang.String checksTurnedOffTipText(): Returns the tip text for this property.
java.lang.String[] classAttributeNames(): Returns the names of the class attributes.
java.lang.String cTipText(): Returns the tip text for this property.
double[] distributionForInstance(Instance inst): Estimates class probabilities for given instance.
java.lang.String epsilonTipText(): Returns the tip text for this property.
java.lang.String filterTypeTipText(): Returns the tip text for this property.
boolean getBuildLogisticModels(): Get the value of buildLogisticModels.
double getC(): Get the value of C.
Capabilities getCapabilities(): Returns default capabilities of the classifier.
boolean getChecksTurnedOff(): Returns whether the checks are turned off or not.
double getEpsilon(): Get the value of epsilon.
SelectedTag getFilterType(): Gets how the training data will be transformed.
Kernel getKernel(): Gets the kernel to use.
boolean getMinimax(): Check if the MIMinimax feature space is to be used.
Capabilities getMultiInstanceCapabilities(): Returns the capabilities of this multi-instance classifier for the relational data.
int getNumFolds(): Get the value of numFolds.
java.lang.String[] getOptions(): Gets the current settings of the classifier.
int getRandomSeed(): Get the value of randomSeed.
java.lang.String getRevision(): Returns the revision string.
TechnicalInformation getTechnicalInformation(): Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
double getToleranceParameter(): Get the value of tolerance parameter.
java.lang.String globalInfo(): Returns a string describing the classifier.
java.lang.String kernelTipText(): Returns the tip text for this property.
java.util.Enumeration listOptions(): Returns an enumeration describing the available options.
static void main(java.lang.String[] argv): Main method for testing this class.
java.lang.String minimaxTipText(): Returns the tip text for this property.
int numClassAttributeValues(): Returns the number of values of the class attribute.
java.lang.String numFoldsTipText(): Returns the tip text for this property.
double[] pairwiseCoupling(double[][] n, double[][] r): Implements pairwise coupling.
java.lang.String randomSeedTipText(): Returns the tip text for this property.
void setBuildLogisticModels(boolean newbuildLogisticModels): Set the value of buildLogisticModels.
void setC(double v): Set the value of C.
void setChecksTurnedOff(boolean value): Disables or enables the checks (which could be time-consuming).
void setEpsilon(double v): Set the value of epsilon.
void setFilterType(SelectedTag newType): Sets how the training data will be transformed.
void setKernel(Kernel value): Sets the kernel to use.
void setMinimax(boolean v): Set if the MIMinimax feature space is to be used.
void setNumFolds(int newnumFolds): Set the value of numFolds.
void setOptions(java.lang.String[] options): Parses a given list of options.
void setRandomSeed(int newrandomSeed): Set the value of randomSeed.
void setToleranceParameter(double v): Set the value of tolerance parameter.
int[][][] sparseIndices(): Returns the indices in sparse format.
double[][][] sparseWeights(): Returns the weights in sparse format.
java.lang.String toleranceParameterTipText(): Returns the tip text for this property.
java.lang.String toString(): Prints out the classifier.
void turnChecksOff(): Turns off checks for missing values, etc.
void turnChecksOn(): Turns on checks for missing values, etc.
-
Methods inherited from class weka.classifiers.Classifier
classifyInstance, debugTipText, forName, getDebug, makeCopies, makeCopy, setDebug
-
-
-
-
Field Detail
-
FILTER_NORMALIZE
public static final int FILTER_NORMALIZE
Normalize training data
- See Also:
- Constant Field Values
-
FILTER_STANDARDIZE
public static final int FILTER_STANDARDIZE
Standardize training data
- See Also:
- Constant Field Values
-
FILTER_NONE
public static final int FILTER_NONE
No normalization/standardization
- See Also:
- Constant Field Values
-
TAGS_FILTER
public static final Tag[] TAGS_FILTER
The filter to apply to the training data
-
-
Method Detail
-
globalInfo
public java.lang.String globalInfo()
Returns a string describing the classifier.
- Returns:
- a description suitable for displaying in the explorer/experimenter gui
-
getTechnicalInformation
public TechnicalInformation getTechnicalInformation()
Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
- Specified by:
getTechnicalInformation in interface TechnicalInformationHandler
- Returns:
- the technical information about this class
-
turnChecksOff
public void turnChecksOff()
Turns off checks for missing values, etc. Use with caution.
-
turnChecksOn
public void turnChecksOn()
Turns on checks for missing values, etc.
-
getCapabilities
public Capabilities getCapabilities()
Returns default capabilities of the classifier.
- Specified by:
getCapabilities in interface CapabilitiesHandler
- Overrides:
getCapabilities in class Classifier
- Returns:
- the capabilities of this classifier
- See Also:
Capabilities
-
getMultiInstanceCapabilities
public Capabilities getMultiInstanceCapabilities()
Returns the capabilities of this multi-instance classifier for the relational data.
- Specified by:
getMultiInstanceCapabilities in interface MultiInstanceCapabilitiesHandler
- Returns:
- the capabilities of this object
- See Also:
Capabilities
-
buildClassifier
public void buildClassifier(Instances insts) throws java.lang.Exception
Method for building the classifier. Implements a one-against-one wrapper for multi-class problems.
- Specified by:
buildClassifier in class Classifier
- Parameters:
insts - the set of training instances
- Throws:
java.lang.Exception - if the classifier can't be built successfully
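The one-against-one wrapper mentioned above can be pictured with a small standalone sketch: for k classes there are k(k-1)/2 binary machines, and without logistic models a prediction can be formed by letting each pair cast one vote. The class OneVsOneSketch, the output[i][j] layout (only i < j is read) and the sign convention (a positive output favours class j) are assumptions made for this example, not details confirmed by this page.

```java
// Hypothetical illustration of one-against-one voting; not Weka's code.
final class OneVsOneSketch {

    /** Number of binary classifiers needed for k classes. */
    static int numPairs(int k) {
        return k * (k - 1) / 2;
    }

    /**
     * Turns pairwise decision values into a vote distribution.
     * output[i][j] (i < j) is the decision value of the machine trained on
     * classes i and j; by assumption a positive value votes for class j.
     */
    static double[] voteDistribution(double[][] output) {
        int k = output.length;
        if (k < 2) {
            throw new IllegalArgumentException("need at least two classes");
        }
        double[] votes = new double[k];
        for (int i = 0; i < k; i++) {
            for (int j = i + 1; j < k; j++) {
                if (output[i][j] > 0) {
                    votes[j] += 1.0;
                } else {
                    votes[i] += 1.0;
                }
            }
        }
        double total = numPairs(k);
        for (int c = 0; c < k; c++) {
            votes[c] /= total;   // normalize so the votes sum to 1
        }
        return votes;
    }

    public static void main(String[] args) {
        double[][] output = new double[3][3];
        output[0][1] = -1.0;  // class 0 beats class 1
        output[0][2] = -0.5;  // class 0 beats class 2
        output[1][2] = 2.0;   // class 2 beats class 1
        System.out.println(java.util.Arrays.toString(voteDistribution(output)));
    }
}
```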
-
distributionForInstance
public double[] distributionForInstance(Instance inst) throws java.lang.Exception
Estimates class probabilities for given instance.
- Overrides:
distributionForInstance in class Classifier
- Parameters:
inst - the instance to compute the distribution for
- Returns:
- the class probabilities
- Throws:
java.lang.Exception - if computation fails
-
pairwiseCoupling
public double[] pairwiseCoupling(double[][] n, double[][] r)
Implements pairwise coupling.
- Parameters:
n - the sum of weights used to train each model
r - the probability estimate from each model
- Returns:
- the coupled estimates
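To make the idea of pairwise coupling more concrete, here is a minimal standalone sketch in the spirit of Hastie and Tibshirani's iterative procedure. It is not the code behind this method. The class PairwiseCouplingSketch, the convention that only entries with i < j are read, the reading of r[i][j] as the estimated probability of class i in the (i, j) pair, the simultaneous update of all classes, and the iteration cap and tolerance are all assumptions chosen for the example.

```java
import java.util.Arrays;

// Hypothetical illustration of pairwise coupling; not Weka's implementation.
final class PairwiseCouplingSketch {

    /**
     * n[i][j] (i < j): weight of the model trained on classes i and j.
     * r[i][j] (i < j): that model's estimate of P(class i | class i or j).
     */
    static double[] couple(double[][] n, double[][] r) {
        int k = r.length;
        double[] p = new double[k];
        Arrays.fill(p, 1.0 / k);               // start from a uniform distribution

        // Weighted evidence observed in favour of each class; fixed across iterations.
        double[] observed = new double[k];
        for (int i = 0; i < k; i++) {
            for (int j = i + 1; j < k; j++) {
                observed[i] += n[i][j] * r[i][j];
                observed[j] += n[i][j] * (1.0 - r[i][j]);
            }
        }

        for (int iter = 0; iter < 1000; iter++) {
            // Evidence the current p would predict for each class.
            double[] expected = new double[k];
            for (int i = 0; i < k; i++) {
                for (int j = i + 1; j < k; j++) {
                    double denom = p[i] + p[j];
                    double mu = denom > 0 ? p[i] / denom : 0.5;
                    expected[i] += n[i][j] * mu;
                    expected[j] += n[i][j] * (1.0 - mu);
                }
            }
            // Rescale each class by observed/expected, then renormalize.
            double[] next = new double[k];
            double sum = 0.0;
            for (int i = 0; i < k; i++) {
                next[i] = (observed[i] == 0 || expected[i] == 0)
                        ? 0.0 : p[i] * observed[i] / expected[i];
                sum += next[i];
            }
            if (sum == 0) {
                return p;                      // degenerate input, keep last estimate
            }
            double maxChange = 0.0;
            for (int i = 0; i < k; i++) {
                next[i] /= sum;
                maxChange = Math.max(maxChange, Math.abs(next[i] - p[i]));
            }
            p = next;
            if (maxChange < 1e-9) {
                break;                         // converged
            }
        }
        return p;
    }

    public static void main(String[] args) {
        double[][] n = {{0, 10}, {0, 0}};
        double[][] r = {{0, 0.8}, {0, 0}};
        System.out.println(Arrays.toString(couple(n, r)));
    }
}
```

With only two classes the coupled estimate simply reproduces the single pairwise probability; the iteration matters once three or more classes have to be reconciled.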
-
sparseWeights
public double[][][] sparseWeights()
Returns the weights in sparse format.
- Returns:
- the weights in sparse format
-
sparseIndices
public int[][][] sparseIndices()
Returns the indices in sparse format.
- Returns:
- the indices in sparse format
-
bias
public double[][] bias()
Returns the bias of each binary SMO.
- Returns:
- the bias of each binary SMO
-
numClassAttributeValues
public int numClassAttributeValues()
Returns the number of values of the class attribute.
- Returns:
- the number of values of the class attribute
-
classAttributeNames
public java.lang.String[] classAttributeNames()
Returns the names of the class attributes.
- Returns:
- the names of the class attributes
-
attributeNames
public java.lang.String[][][] attributeNames()
Returns the attribute names.
- Returns:
- the attribute names
-
listOptions
public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.
- Specified by:
listOptions in interface OptionHandler
- Overrides:
listOptions in class Classifier
- Returns:
- an enumeration of all the available options.
-
setOptions
public void setOptions(java.lang.String[] options) throws java.lang.Exception
Parses a given list of options. The valid options are the ones listed in the class description.
- Specified by:
setOptions in interface OptionHandler
- Overrides:
setOptions in class Classifier
- Parameters:
options - the list of options as an array of strings
- Throws:
java.lang.Exception - if an option is not supported
-
getOptions
public java.lang.String[] getOptions()
Gets the current settings of the classifier.
- Specified by:
getOptions in interface OptionHandler
- Overrides:
getOptions in class Classifier
- Returns:
- an array of strings suitable for passing to setOptions
-
setChecksTurnedOff
public void setChecksTurnedOff(boolean value)
Disables or enables the checks (which could be time-consuming). Use with caution!
- Parameters:
value - if true turns off all checks
-
getChecksTurnedOff
public boolean getChecksTurnedOff()
Returns whether the checks are turned off or not.
- Returns:
- true if the checks are turned off
-
checksTurnedOffTipText
public java.lang.String checksTurnedOffTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
kernelTipText
public java.lang.String kernelTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
getKernel
public Kernel getKernel()
Gets the kernel to use.
- Returns:
- the kernel
-
setKernel
public void setKernel(Kernel value)
Sets the kernel to use.
- Parameters:
value - the kernel
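The kernel options documented for MIPolyKernel (-E for the exponent, -L for lower-order terms) can be illustrated with a small standalone sketch. It assumes, as is common for multi-instance set kernels, that the kernel between two bags is the sum of an ordinary polynomial kernel over all pairs of instances drawn from the two bags, and that enabling lower-order terms adds 1 to the dot product before exponentiation. Both points are assumptions not stated on this page, and MiPolyKernelSketch is a hypothetical class rather than Weka code.

```java
// Hypothetical illustration of a multi-instance polynomial kernel; not Weka's code.
final class MiPolyKernelSketch {

    static double dot(double[] x, double[] y) {
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            sum += x[i] * y[i];
        }
        return sum;
    }

    /** Polynomial kernel on two instances: (x.y)^E, or (x.y + 1)^E with lower-order terms. */
    static double poly(double[] x, double[] y, double exponent, boolean lowerOrder) {
        double base = lowerOrder ? dot(x, y) + 1.0 : dot(x, y);
        return Math.pow(base, exponent);
    }

    /** Assumed bag-level kernel: sum of the instance kernel over all cross-bag pairs. */
    static double bagKernel(double[][] bagA, double[][] bagB,
                            double exponent, boolean lowerOrder) {
        double sum = 0.0;
        for (double[] a : bagA) {
            for (double[] b : bagB) {
                sum += poly(a, b, exponent, lowerOrder);
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        double[][] bagA = {{1, 0}, {0, 1}};
        double[][] bagB = {{1, 1}};
        System.out.println(bagKernel(bagA, bagB, 1.0, false));
        System.out.println(bagKernel(bagA, bagB, 2.0, true));
    }
}
```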
-
cTipText
public java.lang.String cTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
getC
public double getC()
Get the value of C.
- Returns:
- Value of C.
-
setC
public void setC(double v)
Set the value of C.
- Parameters:
v - Value to assign to C.
-
toleranceParameterTipText
public java.lang.String toleranceParameterTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
getToleranceParameter
public double getToleranceParameter()
Get the value of tolerance parameter.
- Returns:
- Value of tolerance parameter.
-
setToleranceParameter
public void setToleranceParameter(double v)
Set the value of tolerance parameter.
- Parameters:
v - Value to assign to tolerance parameter.
-
epsilonTipText
public java.lang.String epsilonTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
getEpsilon
public double getEpsilon()
Get the value of epsilon.
- Returns:
- Value of epsilon.
-
setEpsilon
public void setEpsilon(double v)
Set the value of epsilon.
- Parameters:
v - Value to assign to epsilon.
-
filterTypeTipText
public java.lang.String filterTypeTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
getFilterType
public SelectedTag getFilterType()
Gets how the training data will be transformed. Will be one of FILTER_NORMALIZE, FILTER_STANDARDIZE, FILTER_NONE.
- Returns:
- the filtering mode
-
setFilterType
public void setFilterType(SelectedTag newType)
Sets how the training data will be transformed. Should be one of FILTER_NORMALIZE, FILTER_STANDARDIZE, FILTER_NONE.
- Parameters:
newType - the new filtering mode
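What the three filtering modes do to a single numeric attribute can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the filter code used by the classifier: FilterSketch and its mode constants are hypothetical (they mirror the 0/1/2 values of the -N option), normalization is taken to mean rescaling to the range [0, 1], and standardization is taken to use the sample standard deviation (dividing by n - 1). In Weka itself the mode would typically be selected with something like setFilterType(new SelectedTag(MISMO.FILTER_STANDARDIZE, MISMO.TAGS_FILTER)).

```java
// Hypothetical illustration of the three filtering modes; not Weka's filters.
final class FilterSketch {
    static final int NORMALIZE = 0;    // mirrors -N 0
    static final int STANDARDIZE = 1;  // mirrors -N 1
    static final int NONE = 2;         // mirrors -N 2

    /** Returns a transformed copy of one attribute column. */
    static double[] apply(double[] column, int mode) {
        double[] out = column.clone();
        if (mode == NONE || column.length == 0) {
            return out;
        }
        if (mode == NORMALIZE) {
            double min = Double.POSITIVE_INFINITY;
            double max = Double.NEGATIVE_INFINITY;
            for (double v : column) {
                min = Math.min(min, v);
                max = Math.max(max, v);
            }
            double range = max - min;
            for (int i = 0; i < out.length; i++) {
                // A constant column has no range; map it to 0 to avoid dividing by zero.
                out[i] = range == 0 ? 0.0 : (column[i] - min) / range;
            }
            return out;
        }
        if (mode == STANDARDIZE) {
            double mean = 0.0;
            for (double v : column) {
                mean += v;
            }
            mean /= column.length;
            double squares = 0.0;
            for (double v : column) {
                squares += (v - mean) * (v - mean);
            }
            // Sample standard deviation (assumption): divide by n - 1.
            double sd = column.length > 1 ? Math.sqrt(squares / (column.length - 1)) : 0.0;
            for (int i = 0; i < out.length; i++) {
                out[i] = sd == 0 ? 0.0 : (column[i] - mean) / sd;
            }
            return out;
        }
        throw new IllegalArgumentException("unknown mode " + mode);
    }

    public static void main(String[] args) {
        double[] column = {2, 4, 6};
        System.out.println(java.util.Arrays.toString(apply(column, NORMALIZE)));
        System.out.println(java.util.Arrays.toString(apply(column, STANDARDIZE)));
    }
}
```

Because the model is trained on the transformed values, the weights reported by the classifier refer to this rescaled attribute space rather than to the raw data.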
-
minimaxTipText
public java.lang.String minimaxTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
getMinimax
public boolean getMinimax()
Check if the MIMinimax feature space is to be used.
- Returns:
- true if minimax
-
setMinimax
public void setMinimax(boolean v)
Set if the MIMinimax feature space is to be used.
- Parameters:
v - true if the MIMinimax feature space is to be used
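For readers unfamiliar with the minimax feature space, the sketch below shows the usual construction: each bag of instances is summarized by the per-attribute minimum and maximum over its instances, giving one fixed-length vector per bag. MinimaxSketch is a hypothetical class written for this example, and the ordering of the output (all minima first, then all maxima) is an assumption rather than something this page specifies.

```java
// Hypothetical illustration of a minimax bag representation; not Weka's code.
final class MinimaxSketch {

    /**
     * Maps a bag (rows = instances, columns = attributes) to a vector of
     * length 2 * d: the d attribute minima followed by the d attribute maxima.
     */
    static double[] minimax(double[][] bag) {
        if (bag.length == 0) {
            throw new IllegalArgumentException("empty bag");
        }
        int d = bag[0].length;
        double[] out = new double[2 * d];
        for (int a = 0; a < d; a++) {
            out[a] = Double.POSITIVE_INFINITY;
            out[d + a] = Double.NEGATIVE_INFINITY;
        }
        for (double[] instance : bag) {
            for (int a = 0; a < d; a++) {
                out[a] = Math.min(out[a], instance[a]);
                out[d + a] = Math.max(out[d + a], instance[a]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] bag = {{1, 5}, {3, 2}};
        System.out.println(java.util.Arrays.toString(minimax(bag)));
    }
}
```

Once every bag is reduced to such a vector, a standard single-instance kernel can be applied to the vectors directly.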
-
buildLogisticModelsTipText
public java.lang.String buildLogisticModelsTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
getBuildLogisticModels
public boolean getBuildLogisticModels()
Get the value of buildLogisticModels.
- Returns:
- Value of buildLogisticModels.
-
setBuildLogisticModels
public void setBuildLogisticModels(boolean newbuildLogisticModels)
Set the value of buildLogisticModels.
- Parameters:
newbuildLogisticModels - Value to assign to buildLogisticModels.
-
numFoldsTipText
public java.lang.String numFoldsTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
getNumFolds
public int getNumFolds()
Get the value of numFolds.
- Returns:
- Value of numFolds.
-
setNumFolds
public void setNumFolds(int newnumFolds)
Set the value of numFolds.
- Parameters:
newnumFolds - Value to assign to numFolds.
-
randomSeedTipText
public java.lang.String randomSeedTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
getRandomSeed
public int getRandomSeed()
Get the value of randomSeed.
- Returns:
- Value of randomSeed.
-
setRandomSeed
public void setRandomSeed(int newrandomSeed)
Set the value of randomSeed.
- Parameters:
newrandomSeed - Value to assign to randomSeed.
-
toString
public java.lang.String toString()
Prints out the classifier.
- Overrides:
toString in class java.lang.Object
- Returns:
- a description of the classifier as a string
-
getRevision
public java.lang.String getRevision()
Returns the revision string.
- Specified by:
getRevision in interface RevisionHandler
- Overrides:
getRevision in class Classifier
- Returns:
- the revision
-
main
public static void main(java.lang.String[] argv)
Main method for testing this class.
- Parameters:
argv - the commandline parameters
-
-