Package weka.classifiers.trees
Class BFTree
- java.lang.Object
  - weka.classifiers.Classifier
    - weka.classifiers.RandomizableClassifier
      - weka.classifiers.trees.BFTree
- All Implemented Interfaces:
java.io.Serializable, java.lang.Cloneable, AdditionalMeasureProducer, CapabilitiesHandler, OptionHandler, Randomizable, RevisionHandler, TechnicalInformationHandler
public class BFTree extends RandomizableClassifier implements AdditionalMeasureProducer, TechnicalInformationHandler
Class for building a best-first decision tree classifier. This class uses binary splits for both nominal and numeric attributes. For missing values, the method of 'fractional' instances is used.
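To make the term "best-first" concrete: instead of expanding nodes in a fixed order, a best-first learner repeatedly expands whichever open node offers the largest impurity reduction. The sketch below shows only that ordering, using a priority queue. It is not taken from the BFTree source; the Candidate class, the node names, the gain values, and the maxExpansions cap are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class BestFirstOrderSketch {

    /** Hypothetical open node: the gain of its best split and the children that split would create. */
    static final class Candidate {
        final String name;
        final double gain;
        final Candidate left;   // null if there is no left child
        final Candidate right;  // null if there is no right child

        Candidate(String name, double gain, Candidate left, Candidate right) {
            this.name = name;
            this.gain = gain;
            this.left = left;
            this.right = right;
        }
    }

    /** Returns node names in the order a best-first strategy would expand them. */
    static List<String> expansionOrder(Candidate root, int maxExpansions) {
        // Largest gain first.
        PriorityQueue<Candidate> open = new PriorityQueue<>(
            Comparator.comparingDouble((Candidate c) -> c.gain).reversed());
        open.add(root);
        List<String> order = new ArrayList<>();
        while (!open.isEmpty() && order.size() < maxExpansions) {
            Candidate best = open.poll();
            if (best.gain <= 0.0) {
                break; // no remaining split improves the tree
            }
            order.add(best.name);
            if (best.left != null) {
                open.add(best.left);
            }
            if (best.right != null) {
                open.add(best.right);
            }
        }
        return order;
    }

    public static void main(String[] args) {
        Candidate ll = new Candidate("LL", 0.02, null, null);
        Candidate lr = new Candidate("LR", 0.00, null, null);
        Candidate l = new Candidate("L", 0.10, ll, lr);
        Candidate rl = new Candidate("RL", 0.05, null, null);
        Candidate r = new Candidate("R", 0.30, rl, null);
        Candidate root = new Candidate("root", 0.40, l, r);
        System.out.println(expansionOrder(root, 10)); // [root, R, L, RL, LL]
    }
}
```

Because expansions are ordered by gain, stopping after any number of expansions leaves a tree made of the most useful splits found so far. Limiting the number of expansions is therefore one natural way to control tree size.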
For more information, see:
Haijian Shi (2007). Best-first decision tree learning. Hamilton, NZ.
Jerome Friedman, Trevor Hastie, Robert Tibshirani (2000). Additive logistic regression : A statistical view of boosting. Annals of statistics. 28(2):337-407.
BibTeX:
@mastersthesis{Shi2007,
   address = {Hamilton, NZ},
   author = {Haijian Shi},
   note = {COMP594},
   school = {University of Waikato},
   title = {Best-first decision tree learning},
   year = {2007}
}
@article{Friedman2000,
   author = {Jerome Friedman and Trevor Hastie and Robert Tibshirani},
   journal = {Annals of statistics},
   number = {2},
   pages = {337-407},
   title = {Additive logistic regression : A statistical view of boosting},
   volume = {28},
   year = {2000},
   ISSN = {0090-5364}
}
Valid options are:
-S <num> Random number seed. (default 1)
-D If set, the classifier is run in debug mode and may output additional info to the console.
-P <UNPRUNED|POSTPRUNED|PREPRUNED> The pruning strategy. (default: POSTPRUNED)
-M <min no> The minimal number of instances at the terminal nodes. (default 2)
-N <num folds> The number of folds used in the pruning. (default 5)
-H Don't use heuristic search for nominal attributes in multi-class problems. (default: heuristic search is used)
-G Don't use the Gini index for splitting; information is used instead. (default: Gini index is used)
-R Don't use error rate in internal cross-validation; root mean squared error is used instead. (default: error rate is used)
-A Use the 1 SE rule to make the pruning decision. (default no)
-C Percentage of training data size, in the range (0-1]. (default 1)
- Version:
- $Revision: 6947 $
- Author:
- Haijian Shi (hs69@cs.waikato.ac.nz)
- See Also:
- Serialized Form
-
-
Field Summary
Fields
Modifier and Type | Field | Description
static int | PRUNING_POSTPRUNING | pruning strategy: post-pruning
static int | PRUNING_PREPRUNING | pruning strategy: pre-pruning
static int | PRUNING_UNPRUNED | pruning strategy: un-pruned
static Tag[] | TAGS_PRUNING | pruning strategy
-
Constructor Summary
Constructors
Constructor | Description
BFTree() |
-
Method Summary
Modifier and Type | Method | Description
void | buildClassifier(Instances data) | Builds a best-first decision tree classifier.
double[] | distributionForInstance(Instance instance) | Computes class probabilities for instance using the decision tree.
java.util.Enumeration | enumerateMeasures() | Returns an enumeration of the measure names.
Capabilities | getCapabilities() | Returns default capabilities of the classifier.
boolean | getHeuristic() | Gets whether heuristic search is used for nominal attributes in multi-class problems.
double | getMeasure(java.lang.String additionalMeasureName) | Returns the value of the named measure.
int | getMinNumObj() | Gets the minimal number of instances at the terminal nodes.
int | getNumFoldsPruning() | Gets the number of folds in internal cross-validation.
java.lang.String[] | getOptions() | Gets the current settings of the Classifier.
SelectedTag | getPruningStrategy() | Gets the pruning strategy.
java.lang.String | getRevision() | Returns the revision string.
double | getSizePer() | Gets the training set size.
TechnicalInformation | getTechnicalInformation() | Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
boolean | getUseErrorRate() | Gets whether error rate is used in internal cross-validation.
boolean | getUseGini() | Gets whether the Gini index is used as the splitting criterion.
boolean | getUseOneSE() | Gets whether the 1SE rule is used to choose the final model.
java.lang.String | globalInfo() | Returns a string describing the classifier.
java.lang.String | heuristicTipText() | Returns the tip text for this property.
java.util.Enumeration | listOptions() | Returns an enumeration describing the available options.
static void | main(java.lang.String[] args) | Main method.
double | measureTreeSize() | Returns the size of the tree.
java.lang.String | minNumObjTipText() | Returns the tip text for this property.
java.lang.String | numFoldsPruningTipText() | Returns the tip text for this property.
int | numLeaves() | Computes the number of leaf nodes.
int | numNodes() | Computes the size of the tree.
java.lang.String | pruningStrategyTipText() | Returns the tip text for this property.
void | setHeuristic(boolean value) | Sets whether heuristic search is used for nominal attributes in multi-class problems.
void | setMinNumObj(int value) | Sets the minimal number of instances at the terminal nodes.
void | setNumFoldsPruning(int value) | Sets the number of folds in internal cross-validation.
void | setOptions(java.lang.String[] options) | Parses the options for this object.
void | setPruningStrategy(SelectedTag value) | Sets the pruning strategy.
void | setSizePer(double value) | Sets the training set size.
void | setUseErrorRate(boolean value) | Sets whether error rate is used in internal cross-validation.
void | setUseGini(boolean value) | Sets whether the Gini index is used as the splitting criterion.
void | setUseOneSE(boolean value) | Sets whether the 1SE rule is used to choose the final model.
java.lang.String | sizePerTipText() | Returns the tip text for this property.
java.lang.String | toString() | Prints the decision tree using the protected toString method from below.
java.lang.String | useErrorRateTipText() | Returns the tip text for this property.
java.lang.String | useGiniTipText() | Returns the tip text for this property.
java.lang.String | useOneSETipText() | Returns the tip text for this property.
-
Methods inherited from class weka.classifiers.RandomizableClassifier
getSeed, seedTipText, setSeed
-
Methods inherited from class weka.classifiers.Classifier
classifyInstance, debugTipText, forName, getDebug, makeCopies, makeCopy, setDebug
-
-
-
-
Field Detail
-
PRUNING_UNPRUNED
public static final int PRUNING_UNPRUNED
pruning strategy: un-pruned
- See Also:
- Constant Field Values
-
PRUNING_POSTPRUNING
public static final int PRUNING_POSTPRUNING
pruning strategy: post-pruning
- See Also:
- Constant Field Values
-
PRUNING_PREPRUNING
public static final int PRUNING_PREPRUNING
pruning strategy: pre-pruning
- See Also:
- Constant Field Values
-
TAGS_PRUNING
public static final Tag[] TAGS_PRUNING
pruning strategy
-
-
Method Detail
-
globalInfo
public java.lang.String globalInfo()
Returns a string describing the classifier.
- Returns:
- a description suitable for displaying in the explorer/experimenter gui
-
getTechnicalInformation
public TechnicalInformation getTechnicalInformation()
Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
- Specified by:
getTechnicalInformation in interface TechnicalInformationHandler
- Returns:
- the technical information about this class
-
getCapabilities
public Capabilities getCapabilities()
Returns default capabilities of the classifier.
- Specified by:
getCapabilities in interface CapabilitiesHandler
- Overrides:
getCapabilities in class Classifier
- Returns:
- the capabilities of this classifier
- See Also:
Capabilities
-
buildClassifier
public void buildClassifier(Instances data) throws java.lang.Exception
Builds a best-first decision tree classifier.
- Specified by:
buildClassifier in class Classifier
- Parameters:
data - set of instances serving as training data
- Throws:
java.lang.Exception - if decision tree cannot be built successfully
-
distributionForInstance
public double[] distributionForInstance(Instance instance) throws java.lang.Exception
Computes class probabilities for instance using the decision tree.
- Overrides:
distributionForInstance in class Classifier
- Parameters:
instance - the instance for which class probabilities are to be computed
- Returns:
- the class probabilities for the given instance
- Throws:
java.lang.Exception - if something goes wrong
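The class description says missing values are handled with 'fractional' instances. One common reading of that idea at prediction time is that an instance whose split attribute is missing is sent down both branches, and the branch distributions are mixed according to the share of training weight each branch received. The sketch below illustrates only that mixing step. It is an assumption about the general technique rather than a copy of the BFTree code, and the method and parameter names are hypothetical.

```java
class FractionalInstanceSketch {

    /**
     * Mixes the class distributions of the two branches of a binary split.
     * leftProportion is the (assumed) fraction of training weight that went left.
     */
    static double[] mixBranches(double[] leftDist, double[] rightDist, double leftProportion) {
        if (leftDist.length != rightDist.length) {
            throw new IllegalArgumentException("distributions must have the same number of classes");
        }
        if (leftProportion < 0.0 || leftProportion > 1.0) {
            throw new IllegalArgumentException("leftProportion must lie in [0, 1]");
        }
        double[] mixed = new double[leftDist.length];
        for (int k = 0; k < mixed.length; k++) {
            mixed[k] = leftProportion * leftDist[k] + (1.0 - leftProportion) * rightDist[k];
        }
        return mixed;
    }

    public static void main(String[] args) {
        double[] left = {0.9, 0.1};   // class probabilities in the left branch
        double[] right = {0.2, 0.8};  // class probabilities in the right branch
        double[] mixed = mixBranches(left, right, 0.25);
        System.out.println(mixed[0] + " " + mixed[1]);
    }
}
```

If both branch distributions sum to 1, the mixed distribution does too, so the result can be returned directly as class probabilities.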
-
toString
public java.lang.String toString()
Prints the decision tree using the protected toString method from below.
- Overrides:
toString in class java.lang.Object
- Returns:
- a textual description of the classifier
-
numNodes
public int numNodes()
Computes the size of the tree.
- Returns:
- size of the tree
-
numLeaves
public int numLeaves()
Computes the number of leaf nodes.
- Returns:
- number of leaf nodes
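As an illustration of how numNodes() and numLeaves() relate for a tree built from binary splits, the sketch below counts both on a small hand-built tree. The Node class is hypothetical and stands in for whatever node structure the classifier uses internally.

```java
class TreeSizeSketch {

    /** Hypothetical binary tree node; a leaf has no children. */
    static final class Node {
        final Node left;
        final Node right;

        Node(Node left, Node right) {
            this.left = left;
            this.right = right;
        }

        boolean isLeaf() {
            return left == null && right == null;
        }
    }

    /** Counts all nodes, internal and leaf. */
    static int numNodes(Node node) {
        if (node == null) {
            return 0;
        }
        return 1 + numNodes(node.left) + numNodes(node.right);
    }

    /** Counts only the leaf nodes. */
    static int numLeaves(Node node) {
        if (node == null) {
            return 0;
        }
        if (node.isLeaf()) {
            return 1;
        }
        return numLeaves(node.left) + numLeaves(node.right);
    }

    public static void main(String[] args) {
        // root splits into a leaf and an internal node, which splits into two leaves
        Node inner = new Node(new Node(null, null), new Node(null, null));
        Node root = new Node(new Node(null, null), inner);
        System.out.println(numNodes(root) + " nodes, " + numLeaves(root) + " leaves"); // 5 nodes, 3 leaves
    }
}
```

When every internal node has exactly two children, the node count is always twice the leaf count minus one, which is a quick sanity check on reported tree sizes.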
-
listOptions
public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.
- Specified by:
listOptions in interface OptionHandler
- Overrides:
listOptions in class RandomizableClassifier
- Returns:
- an enumeration describing the available options.
-
setOptions
public void setOptions(java.lang.String[] options) throws java.lang.Exception
Parses the options for this object. Valid options are:
-S <num> Random number seed. (default 1)
-D If set, the classifier is run in debug mode and may output additional info to the console.
-P <UNPRUNED|POSTPRUNED|PREPRUNED> The pruning strategy. (default: POSTPRUNED)
-M <min no> The minimal number of instances at the terminal nodes. (default 2)
-N <num folds> The number of folds used in the pruning. (default 5)
-H Don't use heuristic search for nominal attributes in multi-class problems. (default: heuristic search is used)
-G Don't use the Gini index for splitting; information is used instead. (default: Gini index is used)
-R Don't use error rate in internal cross-validation; root mean squared error is used instead. (default: error rate is used)
-A Use the 1 SE rule to make the pruning decision. (default no)
-C Percentage of training data size, in the range (0-1]. (default 1)
- Specified by:
setOptions in interface OptionHandler
- Overrides:
setOptions in class RandomizableClassifier
- Parameters:
options - the options to use
- Throws:
java.lang.Exception - if setting of options fails
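The option flags above are passed to setOptions as a String array. The sketch below assembles such an array from typed values, using only flags listed in the option table. The buildOptions helper and its parameters are hypothetical, and the call to setOptions is left as a comment so the sketch compiles without Weka on the classpath.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BFTreeOptionsSketch {

    private static final List<String> PRUNING = Arrays.asList("UNPRUNED", "POSTPRUNED", "PREPRUNED");

    /** Builds an option array; boolean flags are emitted only when they differ from the documented defaults. */
    static String[] buildOptions(int seed, String pruning, int minNumObj, int numFolds,
                                 boolean useGini, boolean useOneSE, double sizePer) {
        if (!PRUNING.contains(pruning)) {
            throw new IllegalArgumentException("unknown pruning strategy: " + pruning);
        }
        if (sizePer <= 0.0 || sizePer > 1.0) {
            throw new IllegalArgumentException("sizePer must lie in (0, 1]");
        }
        List<String> opts = new ArrayList<>();
        opts.add("-S");
        opts.add(Integer.toString(seed));
        opts.add("-P");
        opts.add(pruning);
        opts.add("-M");
        opts.add(Integer.toString(minNumObj));
        opts.add("-N");
        opts.add(Integer.toString(numFolds));
        if (!useGini) {
            opts.add("-G"); // -G switches the Gini index off
        }
        if (useOneSE) {
            opts.add("-A"); // -A switches the 1 SE rule on
        }
        opts.add("-C");
        opts.add(Double.toString(sizePer));
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] options = buildOptions(1, "POSTPRUNED", 2, 5, true, false, 1.0);
        System.out.println(String.join(" ", options)); // -S 1 -P POSTPRUNED -M 2 -N 5 -C 1.0
        // With Weka available, the array would be handed to the classifier:
        // new BFTree().setOptions(options);
    }
}
```

Note the inverted sense of -G: the flag is present only when the Gini index should not be used.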
-
getOptions
public java.lang.String[] getOptions()
Gets the current settings of the Classifier.
- Specified by:
getOptions in interface OptionHandler
- Overrides:
getOptions in class RandomizableClassifier
- Returns:
- the current settings of the Classifier
-
enumerateMeasures
public java.util.Enumeration enumerateMeasures()
Returns an enumeration of the measure names.
- Specified by:
enumerateMeasures in interface AdditionalMeasureProducer
- Returns:
- an enumeration of the measure names
-
measureTreeSize
public double measureTreeSize()
Returns the size of the tree.
- Returns:
- the size of the tree
-
getMeasure
public double getMeasure(java.lang.String additionalMeasureName)
Returns the value of the named measure.
- Specified by:
getMeasure in interface AdditionalMeasureProducer
- Parameters:
additionalMeasureName - the name of the measure to query for its value
- Returns:
- the value of the named measure
- Throws:
java.lang.IllegalArgumentException - if the named measure is not supported
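The contract of enumerateMeasures(), measureTreeSize() and getMeasure() can be pictured as a name-based lookup that rejects unknown names. The stand-alone sketch below mimics that contract without depending on Weka. The class is hypothetical, and the measure name "measureTreeSize" and the case-insensitive match are assumptions rather than documented behaviour.

```java
import java.util.Collections;
import java.util.Enumeration;

class MeasureLookupSketch {

    private final int treeSize;

    MeasureLookupSketch(int treeSize) {
        this.treeSize = treeSize;
    }

    /** Lists the measure names this object can report (a single assumed name here). */
    Enumeration<String> enumerateMeasures() {
        return Collections.enumeration(Collections.singletonList("measureTreeSize"));
    }

    double measureTreeSize() {
        return treeSize;
    }

    /** Returns the named measure, or throws if the name is not supported. */
    double getMeasure(String additionalMeasureName) {
        if ("measureTreeSize".equalsIgnoreCase(additionalMeasureName)) {
            return measureTreeSize();
        }
        throw new IllegalArgumentException(additionalMeasureName + " not supported");
    }

    public static void main(String[] args) {
        MeasureLookupSketch sketch = new MeasureLookupSketch(7);
        Enumeration<String> names = sketch.enumerateMeasures();
        while (names.hasMoreElements()) {
            String name = names.nextElement();
            System.out.println(name + " = " + sketch.getMeasure(name));
        }
    }
}
```

Iterating over the enumerated names, as main does, avoids hard-coding a measure name that might not be supported.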
-
pruningStrategyTipText
public java.lang.String pruningStrategyTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setPruningStrategy
public void setPruningStrategy(SelectedTag value)
Sets the pruning strategy.
- Parameters:
value - the strategy
-
getPruningStrategy
public SelectedTag getPruningStrategy()
Gets the pruning strategy.
- Returns:
- the current strategy.
-
minNumObjTipText
public java.lang.String minNumObjTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setMinNumObj
public void setMinNumObj(int value)
Sets the minimal number of instances at the terminal nodes.
- Parameters:
value - minimal number of instances at the terminal nodes
-
getMinNumObj
public int getMinNumObj()
Gets the minimal number of instances at the terminal nodes.
- Returns:
- minimal number of instances at the terminal nodes
-
numFoldsPruningTipText
public java.lang.String numFoldsPruningTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setNumFoldsPruning
public void setNumFoldsPruning(int value)
Sets the number of folds in internal cross-validation.
- Parameters:
value - the number of folds
-
getNumFoldsPruning
public int getNumFoldsPruning()
Gets the number of folds in internal cross-validation.
- Returns:
- number of folds in internal cross-validation
-
heuristicTipText
public java.lang.String heuristicTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui.
-
setHeuristic
public void setHeuristic(boolean value)
Sets whether heuristic search is used for nominal attributes in multi-class problems.
- Parameters:
value - true if heuristic search is used for nominal attributes in multi-class problems
-
getHeuristic
public boolean getHeuristic()
Gets whether heuristic search is used for nominal attributes in multi-class problems.
- Returns:
- true if heuristic search is used for nominal attributes in multi-class problems
-
useGiniTipText
public java.lang.String useGiniTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui.
-
setUseGini
public void setUseGini(boolean value)
Sets whether the Gini index is used as the splitting criterion.
- Parameters:
value - true if the Gini index is used as the splitting criterion
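The two splitting criteria differ in how they score the impurity of a node. The sketch below uses the standard textbook definitions of the Gini index and of entropy (the basis of the information criterion) on class counts, plus the Gini gain of a binary split. It is an illustration of those definitions rather than the BFTree source, and the class and method names are hypothetical.

```java
class SplitCriterionSketch {

    static double sum(double[] counts) {
        double total = 0.0;
        for (double c : counts) {
            total += c;
        }
        return total;
    }

    /** Gini index: 1 minus the sum of squared class proportions. */
    static double gini(double[] counts) {
        double total = sum(counts);
        if (total == 0.0) {
            return 0.0;
        }
        double sumSquares = 0.0;
        for (double c : counts) {
            double p = c / total;
            sumSquares += p * p;
        }
        return 1.0 - sumSquares;
    }

    /** Entropy in bits: minus the sum of p * log2(p) over non-empty classes. */
    static double entropy(double[] counts) {
        double total = sum(counts);
        if (total == 0.0) {
            return 0.0;
        }
        double h = 0.0;
        for (double c : counts) {
            if (c > 0.0) {
                double p = c / total;
                h -= p * Math.log(p) / Math.log(2.0);
            }
        }
        return h;
    }

    /** Reduction in Gini index achieved by splitting a node into left and right class counts. */
    static double giniGain(double[] left, double[] right) {
        double nLeft = sum(left);
        double nRight = sum(right);
        double n = nLeft + nRight;
        if (n == 0.0) {
            return 0.0;
        }
        double[] parent = new double[left.length];
        for (int k = 0; k < parent.length; k++) {
            parent[k] = left[k] + right[k];
        }
        return gini(parent) - (nLeft / n) * gini(left) - (nRight / n) * gini(right);
    }

    public static void main(String[] args) {
        System.out.println("gini    = " + gini(new double[]{5, 5}));
        System.out.println("entropy = " + entropy(new double[]{5, 5}));
        System.out.println("gain    = " + giniGain(new double[]{4, 0}, new double[]{1, 5}));
    }
}
```

Both measures are zero for a pure node and largest for an even class mix, so they often rank candidate splits similarly.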
-
getUseGini
public boolean getUseGini()
Gets whether the Gini index is used as the splitting criterion.
- Returns:
- true if the Gini index is used as the splitting criterion
-
useErrorRateTipText
public java.lang.String useErrorRateTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui.
-
setUseErrorRate
public void setUseErrorRate(boolean value)
Sets whether error rate is used in internal cross-validation.
- Parameters:
value - true if error rate is used in internal cross-validation
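According to the option table, the alternative to error rate in internal cross-validation is root mean squared error. The sketch below contrasts the two on predicted class distributions. The RMSE here is computed against 0/1 class indicators and averaged over instances and classes. That averaging convention is an assumption, and the class and method names are hypothetical.

```java
class PruningErrorSketch {

    /** Index of the largest entry; ties go to the first. */
    static int argMax(double[] values) {
        int best = 0;
        for (int k = 1; k < values.length; k++) {
            if (values[k] > values[best]) {
                best = k;
            }
        }
        return best;
    }

    /** Fraction of instances whose most probable class differs from the actual class. */
    static double errorRate(double[][] dists, int[] actual) {
        int wrong = 0;
        for (int i = 0; i < dists.length; i++) {
            if (argMax(dists[i]) != actual[i]) {
                wrong++;
            }
        }
        return (double) wrong / dists.length;
    }

    /** Root mean squared difference between predicted probabilities and 0/1 class indicators. */
    static double rootMeanSquaredError(double[][] dists, int[] actual) {
        double sumSquares = 0.0;
        int terms = 0;
        for (int i = 0; i < dists.length; i++) {
            for (int k = 0; k < dists[i].length; k++) {
                double target = (k == actual[i]) ? 1.0 : 0.0;
                double diff = dists[i][k] - target;
                sumSquares += diff * diff;
                terms++;
            }
        }
        return Math.sqrt(sumSquares / terms);
    }

    public static void main(String[] args) {
        int[] actual = {0, 1};
        double[][] hesitant = {{0.6, 0.4}, {0.6, 0.4}};
        double[][] confident = {{0.9, 0.1}, {0.9, 0.1}};
        // Both sets of predictions misclassify the second instance only.
        System.out.println(errorRate(hesitant, actual) + " vs " + errorRate(confident, actual));
        // RMSE also reflects how far off the probabilities are.
        System.out.println(rootMeanSquaredError(hesitant, actual)
            + " vs " + rootMeanSquaredError(confident, actual));
    }
}
```

Error rate only looks at the predicted label, while RMSE also penalises over-confident wrong probabilities, so the two can prefer different tree sizes.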
-
getUseErrorRate
public boolean getUseErrorRate()
Gets whether error rate is used in internal cross-validation.
- Returns:
- true if error rate is used in internal cross-validation
-
useOneSETipText
public java.lang.String useOneSETipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui.
-
setUseOneSE
public void setUseOneSE(boolean value)
Sets whether the 1SE rule is used to choose the final model.
- Parameters:
value - true if the 1SE rule is used to choose the final model
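The 1SE (one standard error) rule is commonly stated as: among candidate models ordered from simplest to most complex, pick the simplest one whose estimated error is within one standard error of the best estimate. The sketch below applies that general rule to arrays of cross-validated errors and their standard errors. It is an illustration of the rule, not the BFTree implementation, and the method names and numbers are hypothetical.

```java
class OneSERuleSketch {

    /** Index of the candidate with the lowest estimated error (first one on ties). */
    static int chooseMinimumError(double[] cvError) {
        int best = 0;
        for (int i = 1; i < cvError.length; i++) {
            if (cvError[i] < cvError[best]) {
                best = i;
            }
        }
        return best;
    }

    /**
     * Candidates are assumed to be ordered from smallest to largest tree.
     * Returns the first candidate whose error is at most (minimum error + its standard error).
     */
    static int chooseWithOneSE(double[] cvError, double[] stdError) {
        int best = chooseMinimumError(cvError);
        double threshold = cvError[best] + stdError[best];
        for (int i = 0; i < cvError.length; i++) {
            if (cvError[i] <= threshold) {
                return i;
            }
        }
        return best; // not reached: the best candidate always meets the threshold
    }

    public static void main(String[] args) {
        double[] cvError = {0.30, 0.22, 0.20, 0.21};   // one entry per candidate tree size
        double[] stdError = {0.03, 0.03, 0.03, 0.03};
        System.out.println("minimum error picks candidate " + chooseMinimumError(cvError)); // 2
        System.out.println("1SE rule picks candidate " + chooseWithOneSE(cvError, stdError)); // 1
    }
}
```

The rule trades a statistically insignificant increase in estimated error for a smaller tree.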
-
getUseOneSE
public boolean getUseOneSE()
Gets whether the 1SE rule is used to choose the final model.
- Returns:
- true if the 1SE rule is used to choose the final model
-
sizePerTipText
public java.lang.String sizePerTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui.
-
setSizePer
public void setSizePer(double value)
Sets the training set size.
- Parameters:
value - training set size
-
getSizePer
public double getSizePer()
Gets the training set size.
- Returns:
- training set size
-
getRevision
public java.lang.String getRevision()
Returns the revision string.
- Specified by:
getRevision in interface RevisionHandler
- Overrides:
getRevision in class Classifier
- Returns:
- the revision
-
main
public static void main(java.lang.String[] args)
Main method.
- Parameters:
args - the options for the classifier
-
-