Feature Selection for Machine Learning: 3 Categories and 12 Methods

1 minute read

The project is available online on Towards Data Science.

Most of the content of this article comes from my recent paper, “An Evaluation of Feature Selection Methods for Environmental Data”, available here for anyone interested.

The Two Approaches to Dimensionality Reduction

There are two ways to reduce the number of features, a process known as dimensionality reduction.

The first way is called feature extraction, and it aims to transform the features and create entirely new ones based on combinations of the raw/given ones. The most popular approaches are Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), and Multidimensional Scaling (MDS). However, the new feature space can hardly provide us with useful information about the original features. The new higher-level features are not easily understood by humans, because we cannot link them directly to the initial ones, making it difficult to draw conclusions and explain the variables.
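To make the interpretability issue concrete, here is a minimal sketch of feature extraction with PCA. The library choice (scikit-learn) and the random data are illustrative assumptions, not part of the article:

```python
# Illustrative sketch: PCA builds new features as linear mixes of the raw ones
# (scikit-learn is an assumed toolkit; the data here is random).
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))   # 100 samples, 5 raw features

pca = PCA(n_components=2)
X_new = pca.fit_transform(X)    # 2 extracted features replace the 5 raw ones

print(X_new.shape)              # (100, 2)
# Each row of components_ blends all 5 raw features, which is why the new
# axes cannot be linked directly back to any single original variable.
print(pca.components_.shape)    # (2, 5)
```

Inspecting `pca.components_` shows that every extracted component is a weighted combination of all raw features, which is exactly why the new space is hard to explain in terms of the originals.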

The second way to reduce dimensionality is feature selection. It can be considered a pre-processing step: it does not create any new features, but instead selects a subset of the raw ones, providing better interpretability. Finding the best features out of a large initial number can help us extract valuable information and discover new knowledge. In classification problems, the significance of a feature is evaluated by its ability to separate the distinct classes. The property that estimates how useful each feature is in discriminating between the classes is called feature relevance.
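As a sketch of this idea, a simple filter method scores each raw feature's relevance to the class labels and keeps only the top-scoring subset. The ANOVA F-test scorer and scikit-learn are assumed choices for illustration; they are not the specific methods compared in the paper:

```python
# Illustrative sketch: filter-style feature selection on the iris dataset,
# ranking raw features by an ANOVA F-score against the class labels
# (scikit-learn is an assumed toolkit).
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_iris(return_X_y=True)       # 150 samples, 4 raw features

selector = SelectKBest(score_func=f_classif, k=2)
X_sel = selector.fit_transform(X, y)    # keep the 2 most relevant features

print(X_sel.shape)                      # (150, 2)
# Unlike PCA, the result is a boolean mask over the ORIGINAL features,
# so the selected columns keep their original meaning.
print(selector.get_support())
```

Because the output is a mask over the original columns, each retained feature still carries its original physical meaning, which is the interpretability advantage described above.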

Continue reading on Towards Data Science.
