Sciweavers

37 search results - page 6 / 8
WWW
2005
ACM
14 years 8 months ago
Browsing fatigue in handhelds: semantic bookmarking spells relief
Focused Web browsing activities such as periodically looking up headline news, weather reports, etc., which require only selective fragments of particular Web pages, can be made m...
Saikat Mukherjee, I. V. Ramakrishnan
DOCENG
2004
ACM
14 years 23 days ago
Querying XML documents by dynamic shredding
With the wide adoption of XML as a standard data representation and exchange format, querying XML documents becomes increasingly important. However, relational database systems co...
Hui Zhang 0003, Frank Wm. Tompa
SIGIR
2003
ACM
14 years 18 days ago
Domain-independent text segmentation using anisotropic diffusion and dynamic programming
This paper presents a novel domain-independent text segmentation method, which identifies the boundaries of topic changes in long text documents and/or text streams. The method c...
Xiang Ji, Hongyuan Zha
WWW
2009
ACM
14 years 7 hours ago
Extracting data records from the web using tag path clustering
Fully automatic methods that extract lists of objects from the Web have been studied extensively. Record extraction, the first step of this object extraction process, identifies...
Gengxin Miao, Jun'ichi Tatemura, Wang-Pin Hsiung, ...
NAACL
2003
13 years 8 months ago
TIPS: A Translingual Information Processing System
Searching online information is increasingly a daily activity for many people. The multilinguality of online content is also increasing (e.g. the proportion of English web users, ...
Yaser Al-Onaizan, Radu Florian, Martin Franz, Hany...