In statistics and machine learning, cross-validation is the process of estimating how well a trained model generalizes by assessing its performance on data it was not trained on.

Cross-validation is often used to aid in model choice. Typically, several different models are fit to the training dataset and then compared using cross-validation on the (appropriately named) validation dataset. The best-performing model is then chosen and evaluated once against the held-out test dataset.

A common approach is k-fold cross-validation, in which we define k different splits of the data into training and validation portions, assess the model on each split, and combine (typically average) the resulting scores.
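As a minimal sketch of how the k splits can be generated (the function name `k_fold_splits` is illustrative; in practice a library routine such as scikit-learn's `KFold` would be used):

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, val_indices) pairs for k folds.

    Each of the k folds serves as the validation set exactly once,
    while the remaining k-1 folds form the training set.
    """
    indices = list(range(n_samples))
    # Distribute samples as evenly as possible across the k folds.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size
```

Each index appears in exactly one validation fold, so every data point contributes to the validation score exactly once across the k runs.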
