Package weka.core.converters
Class ConverterUtils.DataSource
- java.lang.Object
  - weka.core.converters.ConverterUtils.DataSource
- All Implemented Interfaces:
java.io.Serializable, RevisionHandler
- Enclosing class:
ConverterUtils
public static class ConverterUtils.DataSource extends java.lang.Object implements java.io.Serializable, RevisionHandler
Helper class for loading data from files and URLs. Via the ConverterUtils class it determines which converter to use for loading the data into memory. If the chosen converter is an incremental one, the data is loaded incrementally; otherwise it is loaded in batch mode. In both cases the same interface is used (hasMoreElements, nextElement). Before the data can be read again, the reset method has to be called. The data source can also be initialized with an Instances object, in order to provide a unified interface to files and already loaded datasets.
- Version:
$Revision: 6416 $
- Author:
FracPete (fracpete at waikato dot ac dot nz)
- See Also:
hasMoreElements(Instances), nextElement(Instances), reset(), ConverterUtils.DataSink, Serialized Form
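The read-then-reset cycle described above can be sketched as follows. This is a minimal illustration, not taken from the Weka sources: the class name DataSourceIterationSketch, the helper names, and the tiny ARFF dataset are made up for the example, and it assumes a Weka jar is on the classpath. The source is initialized with an Instances object so that no file is needed.

```java
import java.io.StringReader;

import weka.core.Instance;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class DataSourceIterationSketch {

    // Tiny ARFF dataset, invented for this example.
    static final String ARFF =
        "@relation demo\n"
        + "@attribute x numeric\n"
        + "@attribute label {yes,no}\n"
        + "@data\n"
        + "1.0,yes\n"
        + "2.0,no\n"
        + "3.0,yes\n";

    /** Walks the source once via hasMoreElements/nextElement and counts the rows. */
    static int countInstances(DataSource source) throws Exception {
        Instances structure = source.getStructure();
        int count = 0;
        while (source.hasMoreElements(structure)) {
            Instance inst = source.nextElement(structure);
            if (inst != null) {
                count++;
            }
        }
        return count;
    }

    /** Reads the data twice, calling reset() before the second pass. */
    static int[] twoPasses() throws Exception {
        Instances data = new Instances(new StringReader(ARFF));
        DataSource source = new DataSource(data);
        int first = countInstances(source);
        source.reset(); // required before the data can be read again
        int second = countInstances(source);
        return new int[] {first, second};
    }

    public static void main(String[] args) throws Exception {
        int[] counts = twoPasses();
        System.out.println("first pass: " + counts[0] + ", second pass: " + counts[1]);
    }
}
```

The same loop should apply unchanged to a source created from a file or URL, since the class exposes one interface for incremental and batch loading.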
-
-
Constructor Summary
Constructor | Description
DataSource(java.io.InputStream stream) | Initializes the datasource with the given input stream.
DataSource(java.lang.String location) | Tries to load the data from the file.
DataSource(Loader loader) | Initializes the datasource with the given Loader.
DataSource(Instances inst) | Initializes the datasource with the given dataset.
-
Method Summary
Modifier and Type | Method | Description
Instances | getDataSet() | Returns the full dataset; can be null in case of an error.
Instances | getDataSet(int classIndex) | Returns the full dataset with the specified class index set; can be null in case of an error.
Loader | getLoader() | Returns the determined loader; null if the DataSource was initialized with data alone and not a file/URL.
java.lang.String | getRevision() | Returns the revision string.
Instances | getStructure() | Returns the structure of the data.
Instances | getStructure(int classIndex) | Returns the structure of the data, with the defined class index.
boolean | hasMoreElements(Instances structure) | Returns whether there are more Instance objects in the data.
static boolean | isArff(java.lang.String location) | Returns whether the extension of the location is likely to be of ARFF format, i.e., ending in ".arff" or ".arff.gz" (case-insensitive).
boolean | isIncremental() | Returns whether the loader is an incremental one.
static void | main(java.lang.String[] args) | For testing only; takes a data file as input.
Instance | nextElement(Instances dataset) | Returns the next element and sets the specified dataset; null if none available.
static Instances | read(java.io.InputStream stream) | Convenience method for loading a dataset in batch mode from a stream.
static Instances | read(java.lang.String location) | Convenience method for loading a dataset in batch mode.
static Instances | read(Loader loader) | Convenience method for loading a dataset in batch mode.
void | reset() | Resets the loader.
-
-
-
Constructor Detail
-
DataSource
public DataSource(java.lang.String location) throws java.lang.Exception
Tries to load the data from the given location, which can be either a regular file or a web location (http://, https://, ftp:// or file://).
- Parameters:
location - the name of the file to load
- Throws:
java.lang.Exception - if initialization fails
-
DataSource
public DataSource(Instances inst)
Initializes the datasource with the given dataset.
- Parameters:
inst - the dataset to use
-
DataSource
public DataSource(Loader loader)
Initializes the datasource with the given Loader.
- Parameters:
loader - the Loader to use
-
DataSource
public DataSource(java.io.InputStream stream)
Initializes the datasource with the given input stream. This stream is always interpreted as ARFF.
- Parameters:
stream - the stream to use
-
-
Method Detail
-
isArff
public static boolean isArff(java.lang.String location)
Returns whether the extension of the location is likely to be of ARFF format, i.e., ending in ".arff" or ".arff.gz" (case-insensitive).
- Parameters:
location - the file location to check
- Returns:
true if the location seems to be of ARFF format
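To make the rule concrete, the extension check described above could look roughly like the following. This is a standalone sketch that uses only the Java standard library; looksLikeArff and ArffExtensionSketch are hypothetical names, and the body is an assumption based on the description, not Weka's actual implementation.

```java
import java.util.Locale;

public class ArffExtensionSketch {

    /**
     * Hypothetical stand-in for the documented check: true if the location
     * ends in ".arff" or ".arff.gz", ignoring case.
     */
    static boolean looksLikeArff(String location) {
        String lower = location.toLowerCase(Locale.ROOT);
        return lower.endsWith(".arff") || lower.endsWith(".arff.gz");
    }

    public static void main(String[] args) {
        System.out.println(looksLikeArff("data/iris.ARFF"));
        System.out.println(looksLikeArff("http://example.org/iris.arff.gz"));
        System.out.println(looksLikeArff("data/iris.csv"));
    }
}
```

Note that a check of this kind looks only at the name, not at the content, which is why the description says "likely".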
-
isIncremental
public boolean isIncremental()
Returns whether the loader is an incremental one.
- Returns:
true if the loader is a true incremental one
-
getLoader
public Loader getLoader()
Returns the determined loader; null if the DataSource was initialized with data alone and not a file/URL.
- Returns:
the loader used for retrieving the data
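A caller that needs the loader should therefore guard against null. The following sketch shows the case of a source initialized with data alone. The class name LoaderCheckSketch, the describe helper, and the ARFF text are invented for illustration; a Weka jar on the classpath is assumed. That such a source reports itself as non-incremental is an assumption based on it having no loader.

```java
import java.io.StringReader;

import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;
import weka.core.converters.Loader;

public class LoaderCheckSketch {

    // Tiny ARFF dataset, invented for this example.
    static final String ARFF =
        "@relation demo\n"
        + "@attribute x numeric\n"
        + "@data\n"
        + "1.0\n"
        + "2.0\n";

    /** Builds a DataSource from an in-memory dataset, i.e. without a file or URL. */
    static DataSource inMemorySource() throws Exception {
        return new DataSource(new Instances(new StringReader(ARFF)));
    }

    /** Describes where the data of a source comes from, handling the null loader case. */
    static String describe(DataSource source) {
        Loader loader = source.getLoader();
        if (loader == null) {
            return "in-memory dataset (no loader)";
        }
        return loader.getClass().getName()
            + (source.isIncremental() ? " (incremental)" : " (batch)");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(describe(inMemorySource()));
    }
}
```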
-
getDataSet
public Instances getDataSet() throws java.lang.Exception
Returns the full dataset; can be null in case of an error.
- Returns:
the full dataset
- Throws:
java.lang.Exception - if resetting of loader fails
-
getDataSet
public Instances getDataSet(int classIndex) throws java.lang.Exception
Returns the full dataset with the specified class index set; can be null in case of an error.
- Parameters:
classIndex - the class index for the dataset
- Returns:
the full dataset
- Throws:
java.lang.Exception - if resetting of loader fails
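Because the result can be null, callers may want an explicit check before using it. A minimal sketch, assuming a Weka jar on the classpath and treating the last attribute as the class (a common convention, not something this page prescribes). ClassIndexSketch, the helper name, and the ARFF text are made up for the example.

```java
import java.io.StringReader;

import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class ClassIndexSketch {

    // Tiny ARFF dataset, invented for this example.
    static final String ARFF =
        "@relation demo\n"
        + "@attribute x numeric\n"
        + "@attribute label {yes,no}\n"
        + "@data\n"
        + "1.0,yes\n"
        + "2.0,no\n";

    /** Returns the full dataset with the last attribute set as the class. */
    static Instances loadWithLastAttributeAsClass() throws Exception {
        Instances raw = new Instances(new StringReader(ARFF));
        DataSource source = new DataSource(raw);
        Instances data = source.getDataSet(raw.numAttributes() - 1);
        if (data == null) {
            // getDataSet is documented to return null in case of an error
            throw new IllegalStateException("dataset could not be loaded");
        }
        return data;
    }

    public static void main(String[] args) throws Exception {
        Instances data = loadWithLastAttributeAsClass();
        System.out.println("class attribute: " + data.classAttribute().name());
    }
}
```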
-
reset
public void reset() throws java.lang.Exception
Resets the loader.
- Throws:
java.lang.Exception - if resetting fails
-
getStructure
public Instances getStructure() throws java.lang.Exception
Returns the structure of the data.
- Returns:
the structure of the data
- Throws:
java.lang.Exception - if something goes wrong
-
getStructure
public Instances getStructure(int classIndex) throws java.lang.Exception
Returns the structure of the data, with the defined class index.
- Parameters:
classIndex - the class index for the dataset
- Returns:
the structure of the data
- Throws:
java.lang.Exception - if something goes wrong
-
hasMoreElements
public boolean hasMoreElements(Instances structure)
Returns whether there are more Instance objects in the data.
- Parameters:
structure - the structure of the dataset
- Returns:
true if there are more Instance objects available
- See Also:
nextElement(Instances)
-
nextElement
public Instance nextElement(Instances dataset)
Returns the next element and sets the specified dataset; null if none available.
- Parameters:
dataset - the dataset to set for the instance
- Returns:
the next Instance
-
read
public static Instances read(java.lang.String location) throws java.lang.Exception
Convenience method for loading a dataset in batch mode.
- Parameters:
location - the dataset to load
- Returns:
the dataset
- Throws:
java.lang.Exception - if loading fails
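As an illustration of the one-call batch load, the sketch below writes a small ARFF file to a temporary location and reads it back with read(String). The class name ReadFileSketch, the helper name, and the ARFF text are invented for the example; it assumes a Weka jar on the classpath and a writable temp directory.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class ReadFileSketch {

    // Tiny ARFF dataset, invented for this example.
    static final String ARFF =
        "@relation demo\n"
        + "@attribute x numeric\n"
        + "@attribute label {yes,no}\n"
        + "@data\n"
        + "1.0,yes\n"
        + "2.0,no\n"
        + "3.0,yes\n";

    /** Writes the ARFF text to a temp file and loads it in batch mode. */
    static Instances readFromTempFile() throws Exception {
        // The ".arff" suffix lets the location be recognized as ARFF.
        Path file = Files.createTempFile("datasource-demo", ".arff");
        try {
            Files.write(file, ARFF.getBytes(StandardCharsets.UTF_8));
            return DataSource.read(file.toString());
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws Exception {
        Instances data = readFromTempFile();
        System.out.println(data.numInstances() + " instances, "
            + data.numAttributes() + " attributes");
    }
}
```

read(String) does not set a class index; callers that need one can set it on the returned dataset or use getDataSet(int classIndex) on a DataSource instance instead.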
-
read
public static Instances read(java.io.InputStream stream) throws java.lang.Exception
Convenience method for loading a dataset in batch mode from a stream.
- Parameters:
stream - the stream to load the dataset from
- Returns:
the dataset
- Throws:
java.lang.Exception - if loading fails
-
read
public static Instances read(Loader loader) throws java.lang.Exception
Convenience method for loading a dataset in batch mode.
- Parameters:
loader - the loader to get the dataset from
- Returns:
the dataset
- Throws:
java.lang.Exception - if loading fails
-
main
public static void main(java.lang.String[] args) throws java.lang.Exception
For testing only; takes a data file as input.
- Parameters:
args - the command-line arguments
- Throws:
java.lang.Exception - if something goes wrong
-
getRevision
public java.lang.String getRevision()
Returns the revision string.
- Specified by:
getRevision in interface RevisionHandler
- Returns:
the revision
-
-