One-Hot Encoder

Encode categorical integer features using a one-hot (also known as one-of-K) scheme.

For each categorical input feature, the encoder produces one output feature per category; exactly one of them is marked as true and the rest as false. This encoding is needed to feed categorical data to many scikit-learn estimators, notably linear models and SVMs with the standard kernels. Note: to one-hot encode y labels, use a LabelBinarizer instead.
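
The scheme described above can be illustrated with a minimal pure-Python sketch. This is not the node's or scikit-learn's implementation; the helper name `one_hot` is hypothetical and only shows how one column of integer categories maps to rows with a single 1.

```python
def one_hot(column):
    """Encode one categorical column; returns (categories, encoded rows).

    Hypothetical helper for illustration only.
    """
    # Sorted unique values define the order of the output features.
    categories = sorted(set(column))
    index = {cat: i for i, cat in enumerate(categories)}

    rows = []
    for value in column:
        row = [0] * len(categories)
        row[index[value]] = 1  # exactly one position is "hot"
        rows.append(row)
    return categories, rows


categories, encoded = one_hot([2, 0, 1, 2])
```

Each row in `encoded` has one entry per category, and the position of the 1 identifies the category of the corresponding input value.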

Documentation

Configuration:

handle_unknown

How to handle unknown categories encountered during transform (that is, when transforming data other than the data used for fitting).
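
As a rough illustration of this setting, the sketch below uses scikit-learn's `OneHotEncoder` directly, on the assumption that the node passes `handle_unknown` through to it unchanged. The values `"error"` and `"ignore"` are scikit-learn's; which values the node's configuration exposes is not stated here.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

X_train = np.array([[0], [1], [2]])
X_new = np.array([[1], [3]])  # 3 was never seen during fit

# "error": an unknown category makes transform raise a ValueError.
strict = OneHotEncoder(handle_unknown="error").fit(X_train)
try:
    strict.transform(X_new)
    strict_failed = False
except ValueError:
    strict_failed = True

# "ignore": an unknown category is encoded as a row of all zeros.
lenient = OneHotEncoder(handle_unknown="ignore").fit(X_train)
encoded = lenient.transform(X_new).toarray()
```

With `"ignore"`, the second row of `encoded` contains only zeros because the value 3 matches none of the fitted categories.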

categories

Categories (unique values) per feature:

‘auto’ : Determine categories automatically from the training data.

list : categories[i] holds the categories expected in the ith column. The passed categories should not mix strings and numeric values within a single feature, and should be sorted in case of numeric values.

The used categories can be found in the categories_ attribute.
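
A short sketch of the list form, again calling scikit-learn's `OneHotEncoder` directly (an assumption about what the node wraps). The sample data and category lists are made up for illustration; note that each numeric list is sorted and that categories absent from the data still get an output column.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

X = np.array([[0, 10],
              [1, 30]])

# categories[i] lists the values expected in column i (numeric lists sorted).
encoder = OneHotEncoder(categories=[[0, 1, 2], [10, 20, 30]])
encoded = encoder.fit_transform(X).toarray()

# The categories actually used are exposed after fitting.
used = [cats.tolist() for cats in encoder.categories_]
```

Here `encoded` has six columns, three per input feature, even though the values 2 and 20 never occur in `X`.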

Attributes:

categories_

The categories of each feature determined during fitting (in order of the features in X and corresponding with the output of transform). This includes the category specified in drop (if any).
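
To show how `categories_` lines up with the output of transform, here is a minimal sketch using scikit-learn's `OneHotEncoder` with automatically determined categories. The variable `column_labels` is a hypothetical helper for illustration and not part of the node or the library.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

X = np.array([[2, 7],
              [0, 5],
              [2, 5]])

encoder = OneHotEncoder(categories="auto").fit(X)

# One array of learned categories per input feature, in feature order.
learned = [cats.tolist() for cats in encoder.categories_]

# Output columns follow categories_ feature by feature.
column_labels = [(feature, cat)
                 for feature, cats in enumerate(learned)
                 for cat in cats]

encoded = encoder.transform(X).toarray()
```

Column `j` of `encoded` corresponds to `column_labels[j]`, a (feature index, category) pair.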

Input ports:

Output ports:

model: Model

Definition

Some of the docstrings for this module have been automatically extracted from the scikit-learn library and are covered by their respective licenses.

class node_preprocessing.OneHotEncoder
