Package smile.data

Class AttributeDataset

  • All Implemented Interfaces:
    java.lang.Iterable<Datum<double[]>>

    public class AttributeDataset
    extends Dataset<double[]>
    A dataset with a fixed number of attributes. All attribute values are stored as double, even if the attribute is nominal, ordinal, string, or date. The dataset is stored row-wise internally, which makes frequent access to individual instances fast.
    Author:
    Haifeng Li
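
The class description above says that nominal values are stored as double and that rows are kept together. The sketch below illustrates one way such a layout can work. It is a stand-in written against plain Java only: `RowWiseSketch`, `OUTLOOK`, `encode`, `decode`, and `sampleRows` are hypothetical names, not part of smile.data, and the actual encoding used by the library may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in, not part of smile.data: shows how a nominal value
// can share a double[] row with numeric values.
class RowWiseSketch {
    // Levels of an assumed nominal attribute; a level's index is its stored value.
    static final List<String> OUTLOOK = Arrays.asList("sunny", "overcast", "rainy");

    // Encode a nominal level as a double by using its position in the level list.
    static double encode(String level) {
        int index = OUTLOOK.indexOf(level);
        if (index < 0) {
            throw new IllegalArgumentException("Unknown level: " + level);
        }
        return index;
    }

    // Decode a stored double back to its nominal level.
    static String decode(double value) {
        return OUTLOOK.get((int) value);
    }

    // Build a tiny row-wise table: each double[] holds one instance.
    static List<double[]> sampleRows() {
        List<double[]> rows = new ArrayList<>();
        rows.add(new double[] {encode("sunny"), 85.0});
        rows.add(new double[] {encode("rainy"), 70.0});
        return rows;
    }

    public static void main(String[] args) {
        List<double[]> rows = sampleRows();
        // With row-wise storage, one lookup yields the whole instance.
        double[] second = rows.get(1);
        System.out.println(decode(second[0]) + ", " + second[1]);
    }
}
```

Reading a full instance is a single list lookup here, whereas reading one attribute across all instances requires visiting every row.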
    • Constructor Detail

      • AttributeDataset

        public AttributeDataset(java.lang.String name,
                                Attribute[] attributes)
        Constructor.
        Parameters:
        name - the name of the dataset.
        attributes - the list of attributes in this dataset.
      • AttributeDataset

        public AttributeDataset(java.lang.String name,
                                Attribute[] attributes,
                                Attribute response)
        Constructor.
        Parameters:
        name - the name of the dataset.
        attributes - the list of attributes in this dataset.
        response - the attribute of the response variable.
      • AttributeDataset

        public AttributeDataset(java.lang.String name,
                                double[][] x,
                                double[] y)
        Constructor.
        Parameters:
        name - the name of the dataset.
        x - the data in this dataset.
        y - the response data.
      • AttributeDataset

        public AttributeDataset(java.lang.String name,
                                Attribute[] attributes,
                                double[][] x,
                                Attribute response,
                                double[] y)
        Constructor.
        Parameters:
        name - the name of the dataset.
        attributes - the list of attributes in this dataset.
        x - the data in this dataset.
        response - the attribute of the response variable.
        y - the response data.
    • Method Detail

      • attributes

        public Attribute[] attributes()
        Returns the list of attributes in this dataset.
      • x

        public double[][] x()
        Returns the array of data items.
      • add

        public Datum<double[]> add(Datum<double[]> x)
        Description copied from class: Dataset
        Add a datum item into the dataset.
        Overrides:
        add in class Dataset<double[]>
        Parameters:
        x - a datum item.
        Returns:
        the added datum item.
      • add

        public AttributeDataset.Row add(double[] x)
        Description copied from class: Dataset
        Add a datum item into the dataset.
        Overrides:
        add in class Dataset<double[]>
        Parameters:
        x - a datum item.
        Returns:
        the added datum item.
      • add

        public AttributeDataset.Row add(double[] x,
                                        int y)
        Description copied from class: Dataset
        Add a datum item into the dataset.
        Overrides:
        add in class Dataset<double[]>
        Parameters:
        x - a datum item.
        y - the class label of the datum.
        Returns:
        the added datum item.
      • add

        public AttributeDataset.Row add(double[] x,
                                        int y,
                                        double weight)
        Description copied from class: Dataset
        Add a datum item into the dataset.
        Overrides:
        add in class Dataset<double[]>
        Parameters:
        x - a datum item.
        y - the class label of the datum.
        weight - the weight of the datum. The particular meaning of the weight depends on the application and the machine learning algorithm. Although there are no explicit requirements on the weights, in general they should be positive.
        Returns:
        the added datum item.
      • add

        public AttributeDataset.Row add(double[] x,
                                        double y)
        Description copied from class: Dataset
        Add a datum item into the dataset.
        Overrides:
        add in class Dataset<double[]>
        Parameters:
        x - a datum item.
        y - the real-valued response for regression.
        Returns:
        the added datum item.
      • add

        public AttributeDataset.Row add(double[] x,
                                        double y,
                                        double weight)
        Description copied from class: Dataset
        Add a datum item into the dataset.
        Overrides:
        add in class Dataset<double[]>
        Parameters:
        x - a datum item.
        y - the real-valued response for regression.
        weight - the weight of the datum. The particular meaning of the weight depends on the application and the machine learning algorithm. Although there are no explicit requirements on the weights, in general they should be positive.
        Returns:
        the added datum item.
      • toString

        public java.lang.String toString()
        Overrides:
        toString in class java.lang.Object
      • range

        public AttributeDataset range(int from,
                                      int to)
        Returns the rows in the given range [from, to).
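
The half-open interval [from, to) means that the row at index from is included and the row at index to is not, so the result holds to - from rows. A minimal sketch of that convention on a plain double[][] follows. `RangeSketch` and its `range` method are hypothetical names used only for illustration; they are not the library's implementation, and whether the real method copies or shares rows is not stated in this page.

```java
import java.util.Arrays;

// Hypothetical stand-in, not part of smile.data: half-open row slicing.
class RangeSketch {
    // Return rows from index `from` (inclusive) up to index `to` (exclusive).
    // The outer array is new, but the row arrays themselves are shared.
    static double[][] range(double[][] rows, int from, int to) {
        if (from < 0 || to > rows.length || from > to) {
            throw new IndexOutOfBoundsException("Invalid range [" + from + ", " + to + ")");
        }
        return Arrays.copyOfRange(rows, from, to);
    }

    public static void main(String[] args) {
        double[][] rows = {{0.0}, {1.0}, {2.0}, {3.0}};
        double[][] slice = range(rows, 1, 3); // rows 1 and 2; row 3 is excluded
        System.out.println(slice.length + " rows, first value " + slice[0][0]);
    }
}
```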
      • toString

        public java.lang.String toString(int from,
                                         int to)
        Returns a string representation of the rows in the given range [from, to).
        Parameters:
        from - starting row (inclusive)
        to - ending row (exclusive)
      • column

        public AttributeVector column(java.lang.String col)
        Returns the column with the given name.
      • columns

        public AttributeDataset columns(java.lang.String... cols)
        Returns a dataset with the selected columns.
      • remove

        public AttributeDataset remove(java.lang.String... cols)
        Returns a new dataset without the given columns.
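
Selecting columns and removing columns are complementary operations on a row-wise table: removal can be expressed as selecting every column that was not named. The sketch below shows that relationship on a plain double[][] with a parallel array of column names. `ColumnSketch`, `select`, and `without` are hypothetical helpers, not part of smile.data, and the error handling for unknown names is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in, not part of smile.data: column selection by name.
class ColumnSketch {
    // Copy the named columns, in the order requested, into a new row-wise table.
    static double[][] select(String[] names, double[][] rows, String... cols) {
        List<String> all = Arrays.asList(names);
        int[] index = new int[cols.length];
        for (int j = 0; j < cols.length; j++) {
            index[j] = all.indexOf(cols[j]);
            if (index[j] < 0) {
                throw new IllegalArgumentException("Unknown column: " + cols[j]);
            }
        }
        double[][] out = new double[rows.length][cols.length];
        for (int i = 0; i < rows.length; i++) {
            for (int j = 0; j < cols.length; j++) {
                out[i][j] = rows[i][index[j]];
            }
        }
        return out;
    }

    // Return the column names that remain after dropping `cols`, in original order.
    static String[] without(String[] names, String... cols) {
        List<String> drop = Arrays.asList(cols);
        List<String> kept = new ArrayList<>();
        for (String name : names) {
            if (!drop.contains(name)) {
                kept.add(name);
            }
        }
        return kept.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] names = {"a", "b", "c"};
        double[][] rows = {{1.0, 2.0, 3.0}, {4.0, 5.0, 6.0}};
        // Removing "b" is the same as selecting the remaining names.
        double[][] reduced = select(names, rows, without(names, "b"));
        System.out.println(Arrays.deepToString(reduced));
    }
}
```

Because the table is row-wise, both operations touch every row; each output row is a fresh array, so the original data is left unchanged.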