3. Model selection and evaluation
- 3.1. Cross-validation: evaluating estimator performance
- 3.1.1. Computing cross-validated metrics
- 3.1.2. Cross validation iterators
- 3.1.2.1. K-fold
- 3.1.2.2. Stratified k-fold
- 3.1.2.3. Label k-fold
- 3.1.2.4. Leave-One-Out - LOO
- 3.1.2.5. Leave-P-Out - LPO
- 3.1.2.6. Leave-One-Label-Out - LOLO
- 3.1.2.7. Leave-P-Label-Out
- 3.1.2.8. Random permutations cross-validation a.k.a. Shuffle & Split
- 3.1.2.9. Label-Shuffle-Split
- 3.1.2.10. Predefined Fold-Splits / Validation-Sets
- 3.1.2.11. See also
- 3.1.3. A note on shuffling
- 3.1.4. Cross validation and model selection
- 3.2. Grid Search: Searching for estimator parameters
- 3.2.1. Exhaustive Grid Search
- 3.2.2. Randomized Parameter Optimization
- 3.2.3. Tips for parameter search
- 3.2.4. Alternatives to brute force parameter search
- 3.2.4.1. Model specific cross-validation
- 3.2.4.1.1. sklearn.linear_model.ElasticNetCV
- 3.2.4.1.2. sklearn.linear_model.LarsCV
- 3.2.4.1.3. sklearn.linear_model.LassoCV
- 3.2.4.1.4. sklearn.linear_model.LassoLarsCV
- 3.2.4.1.5. sklearn.linear_model.LogisticRegressionCV
- 3.2.4.1.6. sklearn.linear_model.MultiTaskElasticNetCV
- 3.2.4.1.7. sklearn.linear_model.MultiTaskLassoCV
- 3.2.4.1.8. sklearn.linear_model.OrthogonalMatchingPursuitCV
- 3.2.4.1.9. sklearn.linear_model.RidgeCV
- 3.2.4.1.10. sklearn.linear_model.RidgeClassifierCV
- 3.2.4.2. Information Criterion
- 3.2.4.3. Out of Bag Estimates
- 3.2.4.3.1. sklearn.ensemble.RandomForestClassifier
- 3.2.4.3.2. sklearn.ensemble.RandomForestRegressor
- 3.2.4.3.3. sklearn.ensemble.ExtraTreesClassifier
- 3.2.4.3.4. sklearn.ensemble.ExtraTreesRegressor
- 3.2.4.3.5. sklearn.ensemble.GradientBoostingClassifier
- 3.2.4.3.6. sklearn.ensemble.GradientBoostingRegressor
- 3.3. Model evaluation: quantifying the quality of predictions
- 3.3.1. The scoring parameter: defining model evaluation rules
- 3.3.2. Classification metrics
- 3.3.2.1. From binary to multiclass and multilabel
- 3.3.2.2. Accuracy score
- 3.3.2.3. Cohen’s kappa
- 3.3.2.4. Confusion matrix
- 3.3.2.5. Classification report
- 3.3.2.6. Hamming loss
- 3.3.2.7. Jaccard similarity coefficient score
- 3.3.2.8. Precision, recall and F-measures
- 3.3.2.9. Hinge loss
- 3.3.2.10. Log loss
- 3.3.2.11. Matthews correlation coefficient
- 3.3.2.12. Receiver operating characteristic (ROC)
- 3.3.2.13. Zero one loss
- 3.3.3. Multilabel ranking metrics
- 3.3.4. Regression metrics
- 3.3.5. Clustering metrics
- 3.3.6. Dummy estimators
- 3.4. Model persistence
- 3.5. Validation curves: plotting scores to evaluate models
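
For orientation, here is a minimal sketch tying together the three main threads of this chapter: cross-validation (3.1), grid search over estimator parameters (3.2), and scoring metrics (3.3). It assumes a scikit-learn release (0.18 or later) in which these utilities live in sklearn.model_selection; older releases ship them in sklearn.cross_validation and sklearn.grid_search instead.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# 3.1: evaluate one estimator with stratified k-fold cross-validation
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(SVC(kernel="linear"), X, y, cv=cv, scoring="accuracy")
print("mean CV accuracy:", scores.mean())

# 3.2 + 3.3: exhaustive grid search over C, scored with a classification metric
grid = GridSearchCV(SVC(), param_grid={"C": [0.1, 1, 10]}, cv=cv, scoring="f1_macro")
grid.fit(X, y)
print("best params:", grid.best_params_, "best score:", grid.best_score_)
```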