Proceedings of the International Conference on Advances in Information Processing and Communication Technology (IPCT 2014)
"OUTLIER DOCUMENT FILTERING APPLIED TO THE EXTRACTIVE SUMMARIZATION"
Abstract: “Summarization requires selecting the most informative sentences from a set of documents. The process generally assumes that every document in the set covers topics related to a common subject. However, some documents may be outliers, and an outlier document can reduce the quality of the extractive summary. This research focuses on filtering out outlier documents at the extraction stage. The extraction step identifies outliers as the documents that lie far from the representative document set word vector (DSWV). The DUC 2006 data set is used for the tests, and the resulting system summaries are compared with those generated by the systems of the DUC participants. The results indicate that filtering outlier documents outperforms all of those systems by a fair margin.”
Keywords: Document Processing, Extractive Summarization, Outlier Detection, Similarity Measure
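
The filtering idea described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the DSWV is built, which similarity measure is used, or how "far" is decided, so everything below is an assumption made for illustration: the DSWV is taken to be the summed term-frequency vector of the whole set, distance is measured with cosine similarity, and a document is flagged as an outlier when its similarity falls more than k standard deviations below the mean. The names tokenize, term_vector, cosine, and filter_outliers are hypothetical and do not come from the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_vector(text):
    """Bag-of-words term-frequency vector for one document."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_outliers(documents, k=1.0):
    """Split document indices into kept and outlier lists.

    Assumptions (not taken from the paper):
      * the DSWV is the sum of all per-document term vectors;
      * a document is an outlier when its cosine similarity to the
        DSWV is below mean - k * standard deviation.
    Returns (kept_indices, outlier_indices, similarities).
    """
    if not documents:
        return [], [], []

    vectors = [term_vector(d) for d in documents]

    # Assumed representative vector: summed term frequencies of the set.
    dswv = Counter()
    for v in vectors:
        dswv.update(v)

    sims = [cosine(v, dswv) for v in vectors]
    mean = sum(sims) / len(sims)
    std = math.sqrt(sum((s - mean) ** 2 for s in sims) / len(sims))
    threshold = mean - k * std

    kept = [i for i, s in enumerate(sims) if s >= threshold]
    outliers = [i for i, s in enumerate(sims) if s < threshold]
    return kept, outliers, sims
```

In such a sketch, only the documents whose indices are returned as kept would be passed on to sentence extraction. The threshold parameter k is a tuning choice: a smaller value filters more aggressively, and a real system would likely also remove stop words and apply term weighting before comparing vectors.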