Proceedings of
8th International Conference On Advances in Computing, Electronics and Electrical Technology CEET 2018
"AN ANALYSIS OF FEATURE SELECTION METHODS FOR MULTICLASS TEXT CLASSIFICATION"
Abstract: “Features play a vital role in classifying objects into different classes, so identifying the best features is the backbone of the classification process. In text classification the features are simply words, and the feature space is very high-dimensional, so finding the most appropriate feature set is a major challenge. This paper analyzes several feature selection methods for multiclass text classification and evaluates their results with different classifiers on an email classification task. We run our experiments on the 20 Newsgroups and PU corpora datasets. The experiments cover well-known feature selection methods such as Term Selection, Document Frequency, Mutual Information, Odds Ratio, and Chi-square. This paper concludes that Mutual Information and Chi-square are the most appropriate for text classification.”
Keywords: Feature Selection, Text Classification, Multiclass Classification
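
As an illustration of two of the feature selection methods named in the abstract, the following minimal Python sketch scores terms by Chi-square and Mutual Information on a toy corpus. It is not the authors' implementation: the corpus, the function names, and the choice to take the maximum score over classes for the multiclass case are all assumptions made for illustration, and Mutual Information is computed here in its expected (information-gain-like) form over term presence and class membership, which may differ from the variant used in the paper.

```python
import math

# Toy labelled corpus (hypothetical data, for illustration only).
DOCS = [
    ("cheap pills offer", "spam"),
    ("cheap offer now", "spam"),
    ("meeting agenda attached", "ham"),
    ("project meeting now", "ham"),
]

def contingency(term, label, docs):
    """Document counts for a term/class pair.

    a: term present, in class      b: term present, other class
    c: term absent, in class       d: term absent, other class
    """
    a = b = c = d = 0
    for text, y in docs:
        present = term in text.split()
        if present and y == label:
            a += 1
        elif present:
            b += 1
        elif y == label:
            c += 1
        else:
            d += 1
    return a, b, c, d

def chi_square(a, b, c, d):
    """Chi-square statistic of the 2x2 term/class contingency table."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0

def mutual_information(a, b, c, d):
    """Expected mutual information (in bits) between term presence and class."""
    n = a + b + c + d
    cells = (
        (a, a + b, a + c),
        (b, a + b, b + d),
        (c, c + d, a + c),
        (d, c + d, b + d),
    )
    # Empty cells contribute nothing (0 * log 0 is taken as 0).
    return sum(j / n * math.log2(j * n / (row * col)) for j, row, col in cells if j)

def rank_terms(docs, score, k):
    """Top-k terms by the maximum per-class score (an assumed multiclass rule)."""
    vocab = sorted({w for text, _ in docs for w in text.split()})
    labels = sorted({y for _, y in docs})
    scored = {t: max(score(*contingency(t, y, docs)) for y in labels) for t in vocab}
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(vocab, key=lambda t: (-scored[t], t))[:k]

if __name__ == "__main__":
    print(rank_terms(DOCS, chi_square, 3))
    print(rank_terms(DOCS, mutual_information, 3))
```

On this toy corpus, a term such as "cheap" that appears only in one class receives a high score under both measures, while "now", which is spread evenly across classes, scores zero. A real experiment would replace the toy corpus with tokenized documents from the datasets and feed the selected terms to a classifier.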