Discovery of numerous specific topics via term co-occurrence analysis

We describe efficient techniques for construction of large term co-occurrence graphs, and investigate an application to the discovery of numerous fine-grained (specific) topics. A topic is a small dense subgraph discovered by a random walk initiated at a term (node) in the graph. We observe that the discovered topics are highly interpretable, and reveal the different meanings of terms in the corpus. We show the information-theoretic utility of the topics when they are used as features in supervised learning. Such features lead to consistent improvements in classification accuracy over the standard bag-of-words representation, even at high training proportions. We explain how a layered pyramidal view of the term distribution helps in understanding the algorithms and in visualizing and interpreting the topics.

Categories and Subject Descriptors: H.3.3 [Information Systems]: Information Search and Retrieval--Clustering

General Terms: Algorithms

Keywords: unsupervised learning, text mining, ...
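
The idea in the abstract, a term co-occurrence graph explored by a short walk from a seed term, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `build_cooccurrence_graph` and `walk_topic`, the document-level co-occurrence weighting, the `steps` and `size` parameters, and the use of deterministic probability propagation in place of a sampled random walk are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence_graph(docs):
    """Build a weighted term graph (hypothetical helper).

    Assumption: edge weight = number of documents in which both terms occur.
    """
    graph = defaultdict(Counter)
    for doc in docs:
        terms = sorted(set(doc.lower().split()))
        for a, b in combinations(terms, 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def walk_topic(graph, seed, steps=3, size=3):
    """Return the `size` terms that accumulate the most walk probability.

    Assumption: instead of sampling a random walk, we propagate its
    probability mass for `steps` steps, moving along each edge in
    proportion to its weight, and sum the mass seen at every node.
    """
    mass = {seed: 1.0}
    score = defaultdict(float)
    for _ in range(steps):
        nxt = defaultdict(float)
        for node, p in mass.items():
            neighbours = graph.get(node, {})
            total = sum(neighbours.values())
            if total == 0:
                nxt[node] += p  # isolated term: the walk stays put
                continue
            for nb, w in neighbours.items():
                nxt[nb] += p * w / total
        mass = dict(nxt)
        for node, p in mass.items():
            score[node] += p
    # Sort by descending score, then alphabetically to break ties.
    ranked = sorted(score, key=lambda t: (-score[t], t))
    return ranked[:size]


docs = [
    "apple banana fruit",
    "apple banana juice",
    "python code bug",
    "python code test",
]
graph = build_cooccurrence_graph(docs)
print(walk_topic(graph, "apple"))
print(walk_topic(graph, "python"))
```

In this toy corpus the two groups of terms never co-occur, so a walk seeded in one group cannot reach the other. On a real corpus, a term with several meanings would sit in several dense neighbourhoods, which is the effect the abstract describes.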

Omid Madani, Jiye Yu

CIKM 2010 | Information Technology | Large Term Co-occurrence | Layered Pyramidal View | Small Dense Subgraph

Added	10 Feb 2011
Updated	10 Feb 2011
Type	Conference
Year	2010
Where	CIKM
Authors	Omid Madani, Jiye Yu
