machine learning

Wrong feature preprocessing is a source of train-test leakage

Feature selection should happen after the train-test split, so that no information from the test set leaks into the training pipeline. For the same reason, it belongs inside each fold of cross-validation, not before the folds are made. This sounds obvious, but it goes wrong easily and often. Especially when the feature …
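To make the point concrete, here is a minimal sketch of the two setups, assuming scikit-learn and NumPy are available. The data is pure noise, so an honest estimate should hover around chance. The dataset sizes, the choice of `SelectKBest` with `f_classif`, and the logistic regression model are illustrative assumptions, not taken from the post.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Synthetic data (assumption for illustration): features and labels are
# independent, so no model can truly do better than chance.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5000))
y = rng.integers(0, 2, size=100)

# Leaky: features are selected using ALL rows, including the rows that
# later serve as validation folds.
X_leaky = SelectKBest(f_classif, k=20).fit_transform(X, y)
leaky_scores = cross_val_score(
    LogisticRegression(max_iter=1000), X_leaky, y, cv=5
)

# Leak-free: the selector is part of the pipeline, so cross_val_score
# refits it on the training portion of every fold.
pipeline = make_pipeline(
    SelectKBest(f_classif, k=20),
    LogisticRegression(max_iter=1000),
)
clean_scores = cross_val_score(pipeline, X, y, cv=5)

leaky_mean = leaky_scores.mean()
clean_mean = clean_scores.mean()
print(f"selection before CV: {leaky_mean:.2f}")
print(f"selection inside CV: {clean_mean:.2f}")
```

With many noise features and few samples, the first number typically lands well above chance even though there is nothing to learn, while the second stays near 0.5. Wrapping the selector in a pipeline is one common way to guarantee that each fold only ever sees its own training rows.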


