This paper describes an approach to digesting threads of archived discussion lists by clustering messages into approximate topical groups and then extracting shorter overviews and longer summaries for each group.

Categories and Subject Descriptors
H.3.1 [Information Systems]: Content Analysis and Indexing -- Abstracting Methods; H.3.3 [Information Systems]: Information Search and Retrieval -- Clustering; H.4.3 [Information Systems Applications]: Communications Applications -- bulletin boards

General Terms
Algorithms, Human Factors, Experimentation

Keywords
Discussion lists, newsgroups, digests, clustering, summarization, persistent conversations.

Paula S. Newman, John C. Blitzer