Learning Curve¶

Generates a learning curve by training the model multiple times on incrementally larger subsets of the data, using cross-validation for scoring. Plots the mean training score and the mean test score for each subset size.

Documentation¶

A learning curve shows the validation and training score of an estimator for varying numbers of training samples. It is a tool to find out how much we benefit from adding more training data and whether the estimator suffers more from a variance error or a bias error.

A cross-validation generator splits the whole dataset k times into training and test data. Subsets of the training set with varying sizes are used to train the estimator, and a score is computed for each training subset size and for the test set. The scores are then averaged over all k runs for each training subset size.
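The procedure described above can be sketched in plain Python. This is a minimal illustration, not the node's implementation: the function names (fit_line, neg_mse, learning_curve_sketch), the least-squares line used as a stand-in estimator, the interleaved fold assignment, and the negative mean squared error score are all assumptions made for the example.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b (hypothetical stand-in estimator)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return a, my - a * mx


def neg_mse(model, xs, ys):
    """Negative mean squared error, so that a higher score is better."""
    a, b = model
    return -mean((a * x + b - y) ** 2 for x, y in zip(xs, ys))


def learning_curve_sketch(xs, ys, k=5, fractions=(0.2, 0.6, 1.0)):
    """Return (fraction, mean train score, mean test score) per subset size."""
    n = len(xs)
    # k disjoint test folds; every sample is held out exactly once.
    folds = [list(range(i, n, k)) for i in range(k)]
    train_scores = {f: [] for f in fractions}
    test_scores = {f: [] for f in fractions}
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(n) if i not in held_out]
        x_test = [xs[i] for i in test_idx]
        y_test = [ys[i] for i in test_idx]
        for f in fractions:
            # Train on a subset of the training split (at least 2 samples).
            subset = train_idx[: max(2, round(f * len(train_idx)))]
            x_sub = [xs[i] for i in subset]
            y_sub = [ys[i] for i in subset]
            model = fit_line(x_sub, y_sub)
            train_scores[f].append(neg_mse(model, x_sub, y_sub))
            test_scores[f].append(neg_mse(model, x_test, y_test))
    # Average the scores over all k runs for each training subset size.
    return [(f, mean(train_scores[f]), mean(test_scores[f])) for f in fractions]


# Synthetic data: a noisy line, generated with a fixed seed.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(100)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 1) for x in xs]

curve = learning_curve_sketch(xs, ys)
for fraction, train_mean, test_mean in curve:
    print(f"{fraction:.1f}  train={train_mean:.3f}  test={test_mean:.3f}")
```

Each row holds one training subset size with its averaged train and test score; plotting the two score columns against the size gives the learning curve.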

Definition¶

Input ports¶

model

Type: model

Description: Model

X

Type: table

Description: X

Y

Type: table

Description: Y

Output ports¶

results

Type: table

Description: results

statistics

Type: table

Description: statistics

Configuration¶

Cross validation folds (cv)
Number of cross-validation folds (minimum 2)

Shuffle (shuffle)
Shuffles the input dataset before it is passed to the internal cross-validation

Smallest fraction (smallest)
Size of the smallest dataset, as a fraction of the total

Steps (steps)
Number of different training/test data sizes to measure
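As a rough illustration of how Smallest fraction and Steps could translate into dataset sizes, the sketch below spaces the fractions evenly between smallest and 1.0. The even spacing and the helper name training_fractions are assumptions for the example; the page does not state how the node spaces the sizes.

```python
def training_fractions(smallest=0.1, steps=5):
    """Evenly spaced fractions from `smallest` up to 1.0 (assumed spacing)."""
    if not 0.0 < smallest <= 1.0:
        raise ValueError("smallest must be in (0, 1]")
    if steps < 1:
        raise ValueError("steps must be at least 1")
    if steps == 1:
        return [1.0]
    stride = (1.0 - smallest) / (steps - 1)
    return [smallest + i * stride for i in range(steps)]


# Hypothetical settings: smallest=0.2 and steps=5.
fractions = training_fractions(smallest=0.2, steps=5)
print([round(f, 2) for f in fractions])
```

Under this assumption, a larger Steps value gives a smoother curve at the cost of one extra round of cross-validated training per additional size.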

Implementation¶

class node_metrics.LearningCurve