Proceedings of International Conference on Advanced Computing, Communication and Networks CCN 2011

"CONCEPT-BASED MINING MODEL FOR WEB DOCUMENT CLUSTERING"

B ESWARA REDDY, K MUNIVELU REDDY
DOI: 10.15224/978-981-07-1847-3-1027
Pages: 768 - 772
Authors: 2
ISBN: 978-981-07-1847-3-1027

Abstract: “Most document clustering techniques are based on statistical analysis of a term, either a word or a phrase. Statistical analysis of term frequency captures the importance of a term within a document only. The underlying mining model should therefore identify terms that capture the semantics of the text. In that case, the mining model can capture terms that represent the concepts of a sentence, which leads to the discovery of the topic of the document. A new concept-based mining model focuses on web document clustering; the model consists of three components: concept-based statistical analyzer, COG, and concept extractor. The statistical analyzer analyzes terms at the sentence and document levels. The COG extracts the most important terms with respect to the meaning of the text. The concept extractor selects the concepts that have the maximum weights. The similarity between documents is calculated with the concept-based document similarity measure.”

Keywords: Concept-based mining model, COG, web mining, clustering, document similarity.
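
The abstract gives no formulas, so the following is only a minimal Python sketch of the kind of pipeline it describes: counting terms at the sentence and document levels, keeping the highest-weighted terms, and comparing documents on those terms. The function names, the weighting rule (document count multiplied by the number of sentences containing the term), and the use of cosine similarity are all assumptions made for illustration, not the authors' method. The COG component is not modeled here, and plain word tokens stand in for concepts.

```python
import math
import re
from collections import Counter


def sentence_term_counts(document):
    """Split a document into sentences and count the terms in each one."""
    sentences = [s for s in re.split(r"[.!?]+", document) if s.strip()]
    return [Counter(re.findall(r"[a-z]+", s.lower())) for s in sentences]


def concept_weights(document):
    """Combine sentence-level and document-level statistics per term.

    Assumed weighting (hypothetical, not from the paper):
    weight = occurrences in the document * sentences containing the term.
    """
    per_sentence = sentence_term_counts(document)
    doc_counts = Counter()      # document-level term frequency
    sentence_hits = Counter()   # number of sentences that contain the term
    for counts in per_sentence:
        doc_counts.update(counts)
        sentence_hits.update(counts.keys())
    return {term: doc_counts[term] * sentence_hits[term] for term in doc_counts}


def top_concepts(weights, k):
    """Keep the k highest-weighted terms (ties broken alphabetically)."""
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return dict(ranked[:k])


def cosine_similarity(a, b):
    """Cosine similarity between two term-weight dictionaries."""
    shared = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

A clustering step could then use `cosine_similarity(top_concepts(concept_weights(d1), k), top_concepts(concept_weights(d2), k))` as its pairwise document similarity; the choice of `k` and of the clustering algorithm is left open here because the abstract does not specify either.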
