Proceedings of the 2nd International Conference on Advances in Computing, Communication and Information Technology (CCIT 2014)
"TOWARDS REQUIREMENTS REUSE: IDENTIFYING SIMILAR REQUIREMENTS WITH LATENT SEMANTIC ANALYSIS AND CLUSTERING ALGORITHMS"
Abstract: “Software requirements written in natural language are easily understood by a wide range of stakeholders. However, manually extracting common requirements from natural language requirement documents for reuse can be arduous, expensive, and error-prone. In this paper, we describe a process for identifying similar requirement documents for reuse in Software Product Lines. Online product reviews were extracted and used as the input, mainly because of the scarcity of publicly available requirement documents. After the text was pre-processed, Latent Semantic Analysis, a technique from Information Retrieval, was used to identify similar requirement documents and filter out unrelated ones. Similar documents were then clustered using the K-means and Hierarchical Agglomerative Clustering algorithms. The output of the clustering process will be used to recommend groups of related requirement documents to be used in requir”
Keywords: Software Engineering, Requirements Similarities, Requirements Reuse, Software Product Lines, Latent Semantic Analysis, Text Clustering
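
Illustrative sketch (not the authors' implementation): the pipeline summarized in the abstract, that is, pre-processing, Latent Semantic Analysis, then clustering, might look roughly like the Python below. Everything specific here is an assumption made for illustration: the six toy reviews, the stopword list, the smoothed TF-IDF weighting, the choice of two latent dimensions, and the helper names tokenize, tfidf_matrix, lsa_embed and kmeans. Only K-means is sketched, with a simple deterministic farthest-first initialization; the Hierarchical Agglomerative Clustering step mentioned in the abstract is omitted.

```python
import math
import re
from collections import Counter

import numpy as np

# Hypothetical stopword list; a real pre-processing step would be richer.
STOPWORDS = {"the", "a", "is", "and", "on", "of", "to", "it"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def tfidf_matrix(docs):
    """Build a documents x terms matrix with smoothed TF-IDF weights."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({w for toks in tokenized for w in toks})
    index = {w: i for i, w in enumerate(vocab)}
    n = len(docs)
    df = Counter(w for toks in tokenized for w in set(toks))
    X = np.zeros((n, len(vocab)))
    for row, toks in enumerate(tokenized):
        for w, tf in Counter(toks).items():
            idf = math.log((1 + n) / (1 + df[w])) + 1.0
            X[row, index[w]] = tf * idf
    return X, vocab


def lsa_embed(X, k):
    """LSA: project documents onto the top-k singular directions, then
    scale each row to unit length so dot products are cosine similarities."""
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    Z = U[:, :k] * s[:k]
    norms = np.linalg.norm(Z, axis=1, keepdims=True)
    return Z / np.where(norms == 0, 1.0, norms)


def kmeans(Z, k, iters=50):
    """Plain K-means with deterministic farthest-first initialization."""
    centers = [Z[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(Z - c, axis=1) for c in centers], axis=0)
        centers.append(Z[int(np.argmax(d))])
    centers = np.array(centers)
    for _ in range(iters):
        dists = np.linalg.norm(Z[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Made-up product reviews standing in for requirement documents.
docs = [
    "The battery life is great and the battery charges fast",
    "Battery life is poor and the battery drains fast",
    "Great battery life and the battery charges quickly",
    "The screen is bright and the display is sharp",
    "The screen display is dull and the screen cracked",
    "Bright sharp display on a large screen",
]

X, vocab = tfidf_matrix(docs)
Z = lsa_embed(X, k=2)          # latent-space document vectors
similarity = Z @ Z.T           # pairwise cosine similarity in the latent space
labels = kmeans(Z, k=2)        # cluster assignment per document
print(labels.tolist())
```

In this toy setting the similarity matrix could be thresholded to filter out unrelated documents before clustering; the threshold itself would be a tuning choice, not something the abstract specifies.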