One-Hot Encoder

Encode categorical integer features using a one-hot (also known as one-of-K) scheme.

Each categorical input feature is expanded into one output feature per category, of which exactly one is marked as true and the rest as false. This encoding is needed for feeding categorical data to many scikit-learn estimators, notably linear models and SVMs with the standard kernels. Note: to one-hot encode y labels, use a LabelBinarizer instead. Also note: the categories of the input data are determined automatically (as with the categories='auto' keyword in scikit-learn).
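
To make the scheme concrete, here is a minimal pure-Python sketch of one-hot encoding. It is not the node's implementation; the helper name one_hot_encode is hypothetical, and the sorted ordering of categories is an assumption chosen to resemble automatic category detection.

```python
def one_hot_encode(rows):
    """One-hot encode rows of categorical values (illustrative sketch only)."""
    n_features = len(rows[0])
    # Collect the sorted unique values of each column; these play the role
    # of the automatically determined categories.
    categories = [sorted({row[j] for row in rows}) for j in range(n_features)]
    encoded = []
    for row in rows:
        out = []
        for j, value in enumerate(row):
            # Exactly one position per input feature is set to 1.
            out.extend(1 if value == cat else 0 for cat in categories[j])
        encoded.append(out)
    return categories, encoded


rows = [[0, 2], [1, 0], [0, 1]]
categories, encoded = one_hot_encode(rows)
print(categories)  # [[0, 1], [0, 1, 2]]
print(encoded)     # [[1, 0, 0, 0, 1], [0, 1, 1, 0, 0], [1, 0, 0, 1, 0]]
```

Each output row contains one 1 per input feature, so two input features always yield exactly two 1s per row.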

Documentation

Configuration:

  • handle_unknown

    How to handle categories that were not seen during fit when they are encountered during transform.

  • sparse

    Generates a sparse matrix if true. Warning: sparse matrices are not handled by all Sympathy nodes and may be silently converted to non-sparse arrays.
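
The effect of these two options can be sketched directly with scikit-learn's OneHotEncoder, assuming the node passes them through to it unchanged (an assumption, not confirmed by this page). Because the keyword controlling sparse output has been renamed between scikit-learn releases, the sketch relies on the default, which is sparse output.

```python
import numpy as np
from scipy import sparse
from sklearn.preprocessing import OneHotEncoder

X_train = np.array([[0], [1], [2]])
X_new = np.array([[1], [5]])  # 5 was never seen during fit

# handle_unknown="ignore": an unseen category is encoded as an all-zero row.
enc = OneHotEncoder(handle_unknown="ignore")
enc.fit(X_train)
encoded = enc.transform(X_new)

# By default scikit-learn returns a SciPy sparse matrix.
print(sparse.issparse(encoded))
dense = encoded.toarray()  # explicit conversion to a dense array
print(dense)

# handle_unknown="error" raises ValueError on an unseen category instead.
strict = OneHotEncoder(handle_unknown="error").fit(X_train)
try:
    strict.transform(X_new)
    raised = False
except ValueError:
    raised = True
print(raised)
```

Converting explicitly with toarray() before passing data on avoids depending on a silent conversion elsewhere in the workflow.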

Attributes:

  • active_features_

  • feature_indices_

  • n_values_

  • categories_

    The categories of each feature determined during fitting (in order of the features in X and corresponding with the output of transform). This includes the category specified in drop (if any).
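
As an illustration of categories_, the following sketch fits scikit-learn's OneHotEncoder on a small made-up array and inspects the attribute. The data is invented for the example, and how the node exposes the attribute in Sympathy is not shown here.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

X = np.array([[0, 10], [1, 30], [0, 20]])
enc = OneHotEncoder().fit(X)

# One array per input feature, in the column order of X.
categories = [c.tolist() for c in enc.categories_]
print(categories)  # [[0, 1], [10, 20, 30]]

# Output columns follow the same order: two columns for the first
# feature, then three columns for the second.
encoded = enc.transform(X).toarray()
print(encoded.shape)  # (3, 5)
```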

Input ports:

Output ports:

  • model (model)

    Model

Definition

class node_preprocessing.OneHotEncoder