Scalable community discovery on textual data with relations



Every piece of textual data is generated as a means of conveying its authors' opinions on specific topics. Authors deliberately organize their writings and create links (i.e., references and acknowledgments) to express themselves better. It is therefore of interest to study texts together with their relations in order to understand the underlying topics and communities. Although the literature contains many efforts in data clustering and topic mining, they are not applicable to community discovery on large document corpora, for several reasons. First, few of them consider both textual attributes and relations. Second, scalability remains a significant issue for large-scale datasets. Third, most algorithms rely on a set of initial parameters that are hard to determine and tune. Motivated by these observations, the paper proposes a hierarchical community model that distinguishes community cores from affiliated members. We present our efforts to develop a scalable...

Huajing Li, Zaiqing Nie, Wang-Chien Lee, C. Lee Giles, Ji-Rong Wen


CIKM 2008 | Community Discovery | Document Corpus | Information Management | Initial Parameters |


» Towards discovering criminal communities from textual data

» Extracting community structure through relational hypergraphs

» Scalable graph clustering using stochastic flows: applications to community discovery

» Mining Soft-Matching Rules from Textual Data

» Knowledge Discovery in Textual Databases (KDT)

» A framework for semantic link discovery over relational data

» HCDF: A Hybrid Community Discovery Framework

» Scalable relevance feedback using clickthrough data for web image retrieval

Post Info

Added	12 Oct 2010
Updated	12 Oct 2010
Type	Conference
Year	2008
Where	CIKM
Authors	Huajing Li, Zaiqing Nie, Wang-Chien Lee, C. Lee Giles, Ji-Rong Wen
