Proceedings of

8th International Conference On Advances in Computing, Electronics and Electrical Technology CEET 2018

"AN ANALYSIS OF FEATURE SELECTION METHODS FOR MULTICLASS TEXT CLASSIFICATION"

MAYANK KALBHOR, SANJAY AGARWAL
DOI: 10.15224/978-1-63248-144-3-26
Pages: 32 - 36
Authors: 2
ISBN: 978-1-63248-144-3

Abstract: “Features play a vital role in classifying objects into different classes, so identifying the best features is the backbone of the classification process. In text classification, the features are simply words and the feature space has very high dimensionality, so finding the most appropriate feature set is a major challenge. This paper analyzes several feature selection methods for multiclass text classification and compares their results across different classifiers on an email classification task. We run our experiments on the 20 Newsgroups and PU corpora datasets. The experiments cover well-known feature selection methods such as Term Selection, Document Frequency, Mutual Information, Odds Ratio, and Chi-square. The paper concludes that Mutual Information and Chi-square are the most appropriate for text classification.”

Keywords: Feature Selection, Text Classification, MultiClass Classification
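
As a rough illustration of three of the methods named in the abstract (Document Frequency, Chi-square, and Mutual Information), the sketch below scores terms on a tiny made-up corpus. It is not the authors' implementation: the corpus, the function names, and the choice to take the maximum score over classes are all assumptions made for this example, and the paper's own definitions may differ.

```python
from collections import Counter
from math import log2

# Hypothetical toy corpus (text, class label); not data from the paper.
DOCS = [
    ("cheap pills buy now", "spam"),
    ("buy cheap watches now", "spam"),
    ("meeting agenda for monday", "work"),
    ("project meeting notes", "work"),
    ("hockey game tonight", "sport"),
    ("hockey team wins game", "sport"),
]

def contingency(term, label, docs):
    """2x2 document counts: a = term & label, b = term & other classes,
    c = no term & label, d = no term & other classes."""
    a = b = c = d = 0
    for text, y in docs:
        has_term = term in text.split()
        if has_term and y == label:
            a += 1
        elif has_term:
            b += 1
        elif y == label:
            c += 1
        else:
            d += 1
    return a, b, c, d

def document_frequency(docs):
    """Number of documents each term appears in."""
    df = Counter()
    for text, _ in docs:
        df.update(set(text.split()))
    return df

def chi_square(term, label, docs):
    """Chi-square statistic of the term/class 2x2 table."""
    a, b, c, d = contingency(term, label, docs)
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - c * b) ** 2 / denom if denom else 0.0

def mutual_information(term, label, docs):
    """Mutual information (in bits) between term presence and class membership."""
    a, b, c, d = contingency(term, label, docs)
    n = a + b + c + d
    cells = [
        (a, a + b, a + c),  # term present, in class
        (b, a + b, b + d),  # term present, other classes
        (c, c + d, a + c),  # term absent, in class
        (d, c + d, b + d),  # term absent, other classes
    ]
    # Empty cells contribute nothing (limit of x * log x as x -> 0).
    return sum((k / n) * log2(n * k / (row * col)) for k, row, col in cells if k)

def select_top_k(docs, score_fn, k):
    """Rank terms by their best score over all classes; ties break alphabetically."""
    labels = sorted({y for _, y in docs})
    vocab = sorted({t for text, _ in docs for t in text.split()})
    scored = [(max(score_fn(t, y, docs) for y in labels), t) for t in vocab]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [t for _, t in scored[:k]]

if __name__ == "__main__":
    print(document_frequency(DOCS).most_common(5))
    print(select_top_k(DOCS, chi_square, 5))
    print(select_top_k(DOCS, mutual_information, 5))
```

Taking the maximum over classes is only one way to extend a per-class score to the multiclass setting; averaging the scores weighted by class prior is a common alternative.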
