A comparison between two feature selection algorithms
This article compares two feature selection algorithms, Information Gain Thresholding and Koller and Sahami's algorithm, in the context of text document classification on the Reuters Corpus Volume 1 dataset. The algorithms were evaluated by measuring the performance of classifiers trained on the features each algorithm selects from a given dataset. Results show that Koller and Sahami's algorithm consistently outperforms Information Gain Thresholding by capturing interactions between features and avoiding redundancy among them, although these gains come at the cost of increased complexity and longer running time.
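To make the simpler of the two methods concrete, the following is a minimal sketch of information gain thresholding on binary term-presence features. It is not the implementation evaluated in this article: the function names, the toy documents, and the threshold value of 0.5 are all assumptions chosen for illustration. Each term is scored independently by how much knowing its presence or absence reduces the entropy of the class label, and terms scoring at or above the threshold are kept.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(docs, labels, term):
    """IG(C; term) = H(C) - H(C | term present or absent)."""
    present = [y for d, y in zip(docs, labels) if term in d]
    absent = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (len(present) / n) * entropy(present) \
        + (len(absent) / n) * entropy(absent)
    return entropy(labels) - conditional


def select_by_threshold(docs, labels, threshold):
    """Keep every term whose information gain is at least `threshold`."""
    vocabulary = set().union(*docs)
    return {t for t in vocabulary
            if information_gain(docs, labels, t) >= threshold}


# Hypothetical toy corpus: each document is a set of terms.
docs = [{"oil", "price"}, {"oil", "barrel"}, {"bank", "price"}, {"bank", "loan"}]
labels = ["energy", "energy", "finance", "finance"]

selected = select_by_threshold(docs, labels, threshold=0.5)
print(sorted(selected))
```

On this toy corpus the terms "oil" and "bank" each separate the two classes perfectly, so both are selected, even though either one alone would suffice. Because every term is scored in isolation, the method has no way to notice that kind of redundancy, which is the limitation the comparison above attributes to Information Gain Thresholding.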