Proceedings of
5th International Conference on Advances in Computing, Communication and Information Technology CCIT 2017
"USING BIPARTITE GRAPHS PROJECTED ONTO TWO DIMENSIONS FOR TEXT CLASSIFICATION"
Abstract: “In our Big Data world, the amount of text being gathered is ever expanding. For many years, data curators have sought ways to group these documents and identify common topics. As the size of the problem increases, solutions that will scale are needed. The purpose of this work is to present a novel text classifier that can be used for text-mining and interactive information access. The model that is demonstrated can be used to extract hierarchical relations between topics, as well as to conduct unsupervised clustering of documents and keywords. The approach that is taken with this model is the use of a graph-of-words key term extraction and a dimensional projection of the bipartite graph of documents and key terms. This projection makes it possible for terms to be co-clustered in an efficient manner in relation to their documents and the documents in relation to their terms. Furthermore, the key term extraction process that is outlined can be scaled on a large corpus us”
Keywords: text mining, classification, clustering, bipartite graph, Apache Spark
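
As a rough illustration of the graph-of-words key term extraction and the document/key-term bipartite graph named in the abstract, the following Python sketch may help. It is not the authors' implementation. The abstract does not say how key terms are scored, so ranking terms by their degree in the co-occurrence graph is an assumption made here for simplicity, as are the sliding-window size and all function names (`graph_of_words`, `key_terms`, `bipartite_edges`), which are hypothetical. The two-dimensional projection and the Apache Spark scaling are not shown.

```python
from collections import defaultdict
from itertools import combinations


def graph_of_words(tokens, window=3):
    """Build an undirected co-occurrence graph for one document.

    Terms are nodes; two distinct terms share an edge if they appear
    together inside any run of `window` consecutive tokens.
    """
    adj = defaultdict(set)
    for i in range(len(tokens)):
        span = set(tokens[i:i + window])
        for a, b in combinations(sorted(span), 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def key_terms(tokens, k=2, window=3):
    """Return the k terms with the highest degree in the graph-of-words.

    Degree ranking is an assumption of this sketch; ties are broken
    alphabetically so the result is deterministic.
    """
    adj = graph_of_words(tokens, window)
    ranked = sorted(adj, key=lambda term: (-len(adj[term]), term))
    return ranked[:k]


def bipartite_edges(docs, k=2, window=3):
    """Map each document id to its key terms, and each key term to its documents.

    Together the two mappings describe the bipartite document/key-term graph.
    """
    doc_to_terms = {d: key_terms(toks, k, window) for d, toks in docs.items()}
    term_to_docs = defaultdict(set)
    for d, terms in doc_to_terms.items():
        for t in terms:
            term_to_docs[t].add(d)
    return doc_to_terms, dict(term_to_docs)


docs = {
    "d1": "spark scales graph processing on spark clusters".split(),
    "d2": "graph clustering groups documents by graph structure".split(),
}
doc_to_terms, term_to_docs = bipartite_edges(docs)
```

The two returned mappings are the two sides of the bipartite graph: a document's neighbours are its key terms, and a term's neighbours are the documents it was extracted from. A co-clustering step would operate on this structure.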