Web forums have become an important data resource for many web applications, but extracting structured data from unstructured web forum pages is still a challenging task due to bo...
Jiang-Ming Yang, Rui Cai, Yida Wang, Jun Zhu, Lei ...
BreakingStory is an interactive system for visualizing change in online news. The system regularly collects the text from the front pages of international daily news web sites. It...
Jean Anne Fitzpatrick, James Reffell, Moryma Aydel...
This paper addresses the issue of query refinement, which involves reformulating ill-formed search queries in order to enhance relevance of search results. Query refinement typica...
Presence of duplicate documents in the World Wide Web adversely affects crawling, indexing and relevance, which are the core building blocks of web search. In this paper, we pres...
Hema Swetha Koppula, Krishna P. Leela, Amit Agarwa...
A large amount of research, technical and professional documents are available today in digital formats. Digital libraries are created to facilitate search and retrieval of inform...