Learning Curve
Generates a learning curve by training the model multiple times on incrementally larger subsets of the data, using cross-validation for scoring. The curve plots the mean training score against the mean test score.
Documentation
A learning curve shows the validation and training score of an estimator for varying numbers of training samples. It is a tool to find out how much we benefit from adding more training data and whether the estimator suffers more from a variance error or a bias error.
A cross-validation generator splits the whole dataset k times into training and test data. Subsets of the training set with varying sizes are used to train the estimator, and a score is computed for each training subset size on both the training subset and the test set. Afterwards, the scores are averaged over all k runs for each training subset size.
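The procedure described above can be sketched in plain Python. This is a minimal illustration, not the node's implementation: the helper names (`k_fold_indices`, `fit_mean`, `neg_mse`, `learning_curve`), the toy model that always predicts the mean of its training targets, and the negative mean squared error score are all assumptions chosen to keep the example self-contained.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def fit_mean(ys):
    """Hypothetical toy 'model': always predicts the mean of the training targets."""
    return mean(ys)


def neg_mse(prediction, ys):
    """Score: negative mean squared error (higher is better)."""
    return -mean((y - prediction) ** 2 for y in ys)


def learning_curve(y, fractions, k):
    """Return (fraction, mean train score, mean test score) per subset size."""
    curve = []
    for frac in fractions:
        train_scores, test_scores = [], []
        for train_idx, test_idx in k_fold_indices(len(y), k):
            # Train on only the first `frac` of this fold's training data.
            n_sub = max(1, int(frac * len(train_idx)))
            subset = [y[i] for i in train_idx[:n_sub]]
            model = fit_mean(subset)
            train_scores.append(neg_mse(model, subset))
            test_scores.append(neg_mse(model, [y[i] for i in test_idx]))
        # Average over all k runs for this training subset size.
        curve.append((frac, mean(train_scores), mean(test_scores)))
    return curve


rng = random.Random(0)
y = [rng.gauss(5.0, 1.0) for _ in range(100)]
curve = learning_curve(y, fractions=[0.1, 0.5, 1.0], k=5)
for frac, train_mean, test_mean in curve:
    print(f"{frac:.1f}  train={train_mean:.3f}  test={test_mean:.3f}")
```

Each row of `curve` is one point on the learning curve. A large gap between the two means suggests a variance problem, while two similar but poor scores suggest a bias problem.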
Definition
Input ports
- model
Type: model
Description: Model
- X
Type: table
Description: X
- Y
Type: table
Description: Y
Output ports
- results
Type: table
Description: results
- statistics
Type: table
Description: statistics
Configuration
- Cross validation folds (cv)
Number of cross-validation folds (minimum 2)
- Shuffle (shuffle)
Randomizes the input dataset before it is passed to the internal cross-validation
- Smallest fraction (smallest)
Size of the smallest dataset as a fraction of the total
- Steps (steps)
Number of different training/test data sizes at which scores are measured
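To make the interplay of `smallest` and `steps` concrete, the sketch below turns the two settings into a list of dataset fractions. The even spacing from `smallest` up to 1.0 is an assumption made for illustration; the documentation above does not state how the node spaces the sizes, and `training_fractions` is a hypothetical helper, not part of the node.

```python
def training_fractions(smallest, steps):
    """Return `steps` fractions evenly spaced from `smallest` to 1.0.

    Assumption: sizes are spaced linearly; the node may space them differently.
    """
    if not 0.0 < smallest <= 1.0:
        raise ValueError("smallest must be in the interval (0, 1]")
    if steps < 1:
        raise ValueError("steps must be at least 1")
    if steps == 1:
        return [1.0]
    step = (1.0 - smallest) / (steps - 1)
    fractions = [smallest + i * step for i in range(steps)]
    fractions[-1] = 1.0  # avoid floating-point drift on the final value
    return fractions


fractions = training_fractions(smallest=0.2, steps=5)
print([round(f, 2) for f in fractions])
```

Under this assumption, a larger `steps` value gives a smoother curve at the cost of more model fits: each fraction is trained once per cross-validation fold.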
Implementation
- class node_metrics.LearningCurve