How to do feature selection via a model-based approach

As we saw in a previous post, feature selection attempts to remove unnecessary features from the prediction model. When we reduce the number of irrelevant/redundant features, we hope to improve the accuracy of the model and to reduce its computational cost.

In this post, we learn how to select features through a model-based approach. This type of approach fits a machine learning model to the data and judges the usefulness of each feature by how much it contributes to predicting the target variable.
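To make the idea of "relative importance" concrete, here is a minimal sketch, assuming a synthetic dataset from make_classification and a random forest as the model (neither choice comes from the notebook; they are only for illustration). It fits the model and ranks the features by the importance scores the forest assigns to them.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data (an assumption for illustration): 15 features, 4 of them informative.
X_demo, y_demo = make_classification(
    n_samples=400, n_features=15, n_informative=4, n_redundant=2, random_state=0
)

# Fit the model; tree ensembles expose one importance score per feature.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_demo, y_demo)
importances = forest.feature_importances_

# Rank the feature indices from most to least important.
ranking = np.argsort(importances)[::-1]
for idx in ranking[:5]:
    print(f"feature {idx}: importance {importances[idx]:.3f}")
```

Features near the bottom of this ranking are the candidates for removal; where to draw the cut-off is a judgment call that depends on the data.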

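To perform model-based selection in scikit-learn, you can use the meta-transformer SelectFromModel in conjunction with different models (e.g., L1-penalized regression models or tree-based estimators). SelectFromModel fits the wrapped estimator and keeps the features whose importance, taken from the estimator's coefficients or feature importances, reaches a threshold.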
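As a rough sketch of how this can look in practice, the example below wraps two estimators in SelectFromModel: a Lasso (an L1-penalized regression) and a random forest. The synthetic data, the alpha value, and the "median" threshold are assumptions chosen for illustration, not settings taken from the notebook.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso

# Synthetic regression data (an assumption): 20 features, 5 of them informative.
X, y = make_regression(
    n_samples=500, n_features=20, n_informative=5, noise=10.0, random_state=0
)

# L1-penalized model: the penalty drives some coefficients to zero,
# and SelectFromModel drops the corresponding features.
l1_selector = SelectFromModel(Lasso(alpha=1.0))
X_l1 = l1_selector.fit_transform(X, y)

# Tree-based model: keep the features whose importance is at least the median.
tree_selector = SelectFromModel(
    RandomForestRegressor(n_estimators=100, random_state=0),
    threshold="median",
)
X_tree = tree_selector.fit_transform(X, y)

print("original:", X.shape)
print("after L1 selection:", X_l1.shape)
print("after tree-based selection:", X_tree.shape)
print("columns kept by the L1 selector:", l1_selector.get_support(indices=True))
```

The get_support method returns the mask (or, with indices=True, the column indices) of the retained features, which is handy for mapping the reduced matrix back to the original column names.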

The following notebook walks through a practical example of feature selection through a model-based approach. You can access all the data and the GitHub version here.

Notebook on Feature Selection Through a Model-Based Approach
