Class Dataset<E>
- java.lang.Object
-
- smile.data.Dataset<E>
-
- Type Parameters:
E
- the type of data objects.
- All Implemented Interfaces:
java.lang.Iterable<Datum<E>>
- Direct Known Subclasses:
AttributeDataset
public class Dataset<E> extends java.lang.Object implements java.lang.Iterable<Datum<E>>
A set of data objects.
- Author:
- Haifeng Li
-
-
Field Summary
protected java.util.List<Datum<E>> data
The data objects.
protected static java.lang.String DATASET_HAS_NO_RESPONSE
protected java.lang.String description
The optional detailed description of the dataset.
protected java.lang.String name
The name of the dataset.
protected Attribute response
The attribute of the response variable.
protected static java.lang.String RESPONSE_NOT_NOMINAL
protected static java.lang.String RESPONSE_NOT_NUMERIC
-
Method Summary
Datum<E> add(E x)
Adds a datum item to the dataset.
Datum<E> add(E x, double y)
Adds a datum item to the dataset.
Datum<E> add(E x, double y, double weight)
Adds a datum item to the dataset.
Datum<E> add(E x, int y)
Adds a datum item to the dataset.
Datum<E> add(E x, int y, double weight)
Adds a datum item to the dataset.
Datum<E> add(Datum<E> x)
Adds a datum item to the dataset.
java.util.List<Datum<E>> data()
Returns the data set.
Datum<E> get(int i)
Returns the element at the specified position in this dataset.
java.lang.String getDescription()
Returns the detailed dataset description.
java.lang.String getName()
Returns the dataset name.
java.util.Iterator<Datum<E>> iterator()
Returns an iterator over the elements in this dataset in proper sequence.
int[] labels()
Returns the class labels.
Datum<E> remove(int i)
Removes the element at the specified position in this dataset.
AttributeVector response()
Returns the response attribute vector.
Attribute responseAttribute()
Returns the attribute of the response variable.
void setDescription(java.lang.String description)
Sets the detailed dataset description.
void setName(java.lang.String name)
Sets the dataset name.
int size()
Returns the size of the dataset.
double[] toArray(double[] a)
Returns an array containing the response values of the elements in this dataset in proper sequence (from first to last element).
int[] toArray(int[] a)
Returns an array containing the class labels of the elements in this dataset in proper sequence (from first to last element).
E[] toArray(E[] a)
Returns an array containing all of the elements in this dataset in proper sequence (from first to last element); the runtime type of the returned array is that of the specified array.
java.lang.String[] toArray(java.lang.String[] a)
Returns an array containing the string names of the elements in this dataset in proper sequence (from first to last element).
java.sql.Timestamp[] toArray(java.sql.Timestamp[] a)
Returns an array containing the timestamps of the elements in this dataset in proper sequence (from first to last element).
double[] y()
Returns the response values.
-
-
-
Field Detail
-
DATASET_HAS_NO_RESPONSE
protected static final java.lang.String DATASET_HAS_NO_RESPONSE
- See Also:
- Constant Field Values
-
RESPONSE_NOT_NOMINAL
protected static final java.lang.String RESPONSE_NOT_NOMINAL
- See Also:
- Constant Field Values
-
RESPONSE_NOT_NUMERIC
protected static final java.lang.String RESPONSE_NOT_NUMERIC
- See Also:
- Constant Field Values
-
name
protected java.lang.String name
The name of the dataset.
-
description
protected java.lang.String description
The optional detailed description of the dataset.
-
response
protected Attribute response
The attribute of the response variable. A null value means the dataset has no response variable.
-
-
Constructor Detail
-
Dataset
public Dataset()
Constructor.
-
Dataset
public Dataset(java.lang.String name)
Constructor.
- Parameters:
name
- the name of the dataset.
-
Dataset
public Dataset(Attribute response)
Constructor.
- Parameters:
response
- the attribute type of the response variable.
-
Dataset
public Dataset(java.lang.String name, Attribute response)
Constructor.
- Parameters:
name
- the name of the dataset.
response
- the attribute type of the response variable.
-
-
Method Detail
-
getName
public java.lang.String getName()
Returns the dataset name.
-
setName
public void setName(java.lang.String name)
Sets the dataset name.
-
setDescription
public void setDescription(java.lang.String description)
Sets the detailed dataset description.
-
getDescription
public java.lang.String getDescription()
Returns the detailed dataset description.
-
responseAttribute
public Attribute responseAttribute()
Returns the attribute of the response variable.
- Returns:
- the attribute of the response variable, or null if this dataset has no response variable.
-
response
public AttributeVector response()
Returns the response attribute vector.
- Returns:
- the response attribute vector, or null if this dataset has no response variable.
-
size
public int size()
Returns the size of the dataset.
-
add
public Datum<E> add(Datum<E> x)
Adds a datum item to the dataset.
- Parameters:
x
- a datum item.
- Returns:
- the added datum item.
-
add
public Datum<E> add(E x)
Adds a datum item to the dataset.
- Parameters:
x
- the data object to add.
- Returns:
- the added datum item.
-
add
public Datum<E> add(E x, int y)
Adds a datum item to the dataset.
- Parameters:
x
- the data object to add.
y
- the class label of the datum.
- Returns:
- the added datum item.
-
add
public Datum<E> add(E x, int y, double weight)
Adds a datum item to the dataset.
- Parameters:
x
- the data object to add.
y
- the class label of the datum.
weight
- the weight of the datum. The meaning of the weight depends on the application and the machine learning algorithm. Although there are no explicit requirements on the weights, in general they should be positive.
- Returns:
- the added datum item.
-
add
public Datum<E> add(E x, double y)
Adds a datum item to the dataset.
- Parameters:
x
- the data object to add.
y
- the real-valued response for regression.
- Returns:
- the added datum item.
-
add
public Datum<E> add(E x, double y, double weight)
Adds a datum item to the dataset.
- Parameters:
x
- the data object to add.
y
- the real-valued response for regression.
weight
- the weight of the datum. The meaning of the weight depends on the application and the machine learning algorithm. Although there are no explicit requirements on the weights, in general they should be positive.
- Returns:
- the added datum item.
-
remove
public Datum<E> remove(int i)
Removes the element at the specified position in this dataset.
- Parameters:
i
- the index of the element to be removed.
- Returns:
- the element previously at the specified position.
-
get
public Datum<E> get(int i)
Returns the element at the specified position in this dataset.
- Parameters:
i
- the index of the element to be returned.
-
iterator
public java.util.Iterator<Datum<E>> iterator()
Returns an iterator over the elements in this dataset in proper sequence.
- Specified by:
iterator in interface java.lang.Iterable<Datum<E>>
- Returns:
- an iterator over the elements in this dataset in proper sequence.
-
y
public double[] y()
Returns the response values.
-
labels
public int[] labels()
Returns the class labels.
-
toArray
public E[] toArray(E[] a)
Returns an array containing all of the elements in this dataset in proper sequence (from first to last element); the runtime type of the returned array is that of the specified array. If the dataset fits in the specified array, it is returned therein. Otherwise, a new array is allocated with the runtime type of the specified array and the size of this dataset.

If the dataset fits in the specified array with room to spare (i.e., the array has more elements than the dataset), the element in the array immediately following the end of the dataset is set to null.
- Parameters:
a
- the array into which the elements of this dataset are to be stored, if it is big enough; otherwise, a new array of the same runtime type is allocated for this purpose.
- Returns:
- an array containing the elements of this dataset.
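The contract described for toArray(E[] a) reads much like that of java.util.List.toArray(T[]). As a minimal sketch of that contract, the example below uses a plain java.util.ArrayList in place of Dataset, so no Smile classes are assumed. The class name ToArrayContractSketch and the helper copyInto are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in: shows the documented contract with java.util.List,
// not with smile.data.Dataset itself.
class ToArrayContractSketch {
    // Copies the items into `a` when it is large enough, otherwise into a new array.
    static String[] copyInto(List<String> items, String[] a) {
        return items.toArray(a);
    }

    public static void main(String[] args) {
        List<String> items = new ArrayList<>(Arrays.asList("a", "b"));

        // Too small: a new array of the same runtime type and of size 2 is allocated.
        String[] small = new String[0];
        String[] fresh = copyInto(items, small);
        System.out.println(fresh != small);          // true
        System.out.println(Arrays.toString(fresh));  // [a, b]

        // Room to spare: the same array comes back, and the slot right after
        // the last element is set to null. Later slots are left untouched.
        String[] big = {"x", "x", "x", "x"};
        String[] same = copyInto(items, big);
        System.out.println(same == big);             // true
        System.out.println(Arrays.toString(same));   // [a, b, null, x]
    }
}
```

Passing a zero-length array is a common way to request a freshly allocated result of exactly the right size.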
-
toArray
public int[] toArray(int[] a)
Returns an array containing the class labels of the elements in this dataset in proper sequence (from first to last element). Unknown labels will be saved as Integer.MIN_VALUE. If the dataset fits in the specified array, it is returned therein. Otherwise, a new array is allocated with the size of this dataset.

If the dataset fits in the specified array with room to spare (i.e., the array has more elements than the dataset), the element in the array immediately following the end of the dataset is set to Integer.MIN_VALUE.
- Parameters:
a
- the array into which the class labels of this dataset are to be stored, if it is big enough; otherwise, a new array is allocated for this purpose.
- Returns:
- an array containing the class labels of this dataset.
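The reuse-or-allocate rule and the Integer.MIN_VALUE sentinel described above can be sketched as follows. This is an illustration of the documented contract only, not Smile's actual implementation: the class LabelArraySketch is hypothetical, and a plain int[] named labels stands in for the labels held by a dataset.

```java
import java.util.Arrays;

// Hypothetical sketch of the documented contract; not Smile's implementation.
class LabelArraySketch {
    // `labels` stands in for the class labels stored in a dataset.
    static int[] toArray(int[] labels, int[] a) {
        int n = labels.length;
        if (a.length < n) {
            a = new int[n];  // too small: allocate an array of the dataset's size
        }
        System.arraycopy(labels, 0, a, 0, n);
        if (a.length > n) {
            a[n] = Integer.MIN_VALUE;  // sentinel right after the last label
        }
        return a;
    }

    public static void main(String[] args) {
        int[] labels = {1, 0, 1};
        // Too small: a new array of length 3 is returned.
        System.out.println(Arrays.toString(toArray(labels, new int[0])));
        // prints [1, 0, 1]

        // Room to spare: index 3 holds the sentinel, index 4 is left as it was.
        System.out.println(Arrays.toString(toArray(labels, new int[5])));
        // prints [1, 0, 1, -2147483648, 0]
    }
}
```

Because unknown labels are also stored as Integer.MIN_VALUE, the sentinel alone cannot distinguish the end of the data from an unknown label; size() remains the reliable way to find the element count.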
-
toArray
public double[] toArray(double[] a)
Returns an array containing the response values of the elements in this dataset in proper sequence (from first to last element). If the dataset fits in the specified array, it is returned therein. Otherwise, a new array is allocated with the size of this dataset.

If the dataset fits in the specified array with room to spare (i.e., the array has more elements than the dataset), the element in the array immediately following the end of the dataset is set to Double.NaN.
- Parameters:
a
- the array into which the response values of this dataset are to be stored, if it is big enough; otherwise, a new array is allocated for this purpose.
- Returns:
- an array containing the response values of this dataset.
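A caller that reuses an oversized buffer has to detect the Double.NaN marker with Double.isNaN, because NaN never compares equal to anything with ==. The sketch below illustrates this under stated assumptions: ResponseArraySketch, its toArray, and countFilled are hypothetical stand-ins that follow the documented contract, and the scan is only meaningful when no stored response is itself NaN.

```java
// Hypothetical sketch of the documented contract; not Smile's implementation.
class ResponseArraySketch {
    // `y` stands in for the response values stored in a dataset.
    static double[] toArray(double[] y, double[] a) {
        int n = y.length;
        if (a.length < n) {
            a = new double[n];  // too small: allocate an array of the dataset's size
        }
        System.arraycopy(y, 0, a, 0, n);
        if (a.length > n) {
            a[n] = Double.NaN;  // marker right after the last response value
        }
        return a;
    }

    // Counts the leading entries before the first NaN (or the end of the array).
    static int countFilled(double[] a) {
        int n = 0;
        while (n < a.length && !Double.isNaN(a[n])) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        double[] buffer = new double[4];
        double[] out = toArray(new double[]{1.5, 2.0}, buffer);
        System.out.println(out == buffer);        // true
        System.out.println(countFilled(out));     // 2
        System.out.println(out[2] == Double.NaN); // false: == cannot detect NaN
    }
}
```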
-
toArray
public java.lang.String[] toArray(java.lang.String[] a)
Returns an array containing the string names of the elements in this dataset in proper sequence (from first to last element). If the dataset fits in the specified array, it is returned therein. Otherwise, a new array is allocated with the size of this dataset.

If the dataset fits in the specified array with room to spare (i.e., the array has more elements than the dataset), the element in the array immediately following the end of the dataset is set to null.
- Parameters:
a
- the array into which the string names of the elements in this dataset are to be stored, if it is big enough; otherwise, a new array is allocated for this purpose.
- Returns:
- an array containing the string names of the elements in this dataset.
-
toArray
public java.sql.Timestamp[] toArray(java.sql.Timestamp[] a)
Returns an array containing the timestamps of the elements in this dataset in proper sequence (from first to last element). If the dataset fits in the specified array, it is returned therein. Otherwise, a new array is allocated with the size of this dataset.

If the dataset fits in the specified array with room to spare (i.e., the array has more elements than the dataset), the element in the array immediately following the end of the dataset is set to null.
- Parameters:
a
- the array into which the timestamps of the elements in this dataset are to be stored, if it is big enough; otherwise, a new array is allocated for this purpose.
- Returns:
- an array containing the timestamps of the elements in this dataset.
-
-