As we saw in a previous post, feature selection attempts to remove unnecessary features from the prediction model. When we reduce the number of irrelevant/redundant features, we hope to improve the accuracy of the model and to reduce its computational cost.
In this post, we learn how to select features through a model-based approach. This type of approach uses a machine learning model fitted to the data, judging each feature's usefulness by how much it contributes to predicting the target variable.
To perform model-based selection in scikit-learn, you can use the meta-transformer SelectFromModel in conjunction with different models (e.g. L1 penalized regression models or tree-based estimators).
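As a minimal sketch of this idea (the synthetic dataset and the choice of a random forest here are illustrative assumptions, not taken from the post's notebook), SelectFromModel can wrap a tree-based estimator and keep only the features whose importance exceeds a threshold:

```python
# Illustrative sketch: model-based feature selection with SelectFromModel.
# The dataset and estimator are assumptions for demonstration purposes.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel

# Synthetic data: 10 features, only 3 of which are informative.
X, y = make_classification(
    n_samples=200, n_features=10, n_informative=3,
    n_redundant=2, random_state=0,
)

# Fit a tree-based estimator; SelectFromModel keeps the features whose
# feature_importances_ exceed the threshold (by default, the mean importance).
selector = SelectFromModel(
    RandomForestClassifier(n_estimators=100, random_state=0)
)
selector.fit(X, y)

X_reduced = selector.transform(X)
print(X.shape, "->", X_reduced.shape)  # fewer columns after selection
```

An L1-penalized model works the same way: pass, for example, LogisticRegression(penalty="l1", solver="liblinear") as the estimator, and SelectFromModel drops the features whose coefficients are shrunk to zero.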
The following notebook gives you a practical example of the application of feature selection through a model-based approach. You can access all the data and the GitHub version here.