Sciweavers

289 search results - page 9 / 58
» Postal Address Detection from Web Documents
SIGIR
2010
ACM
13 years 2 months ago
Efficient partial-duplicate detection based on sequence matching
With the ever-increasing growth of the Internet, numerous copies of documents have become a serious problem for search engines, opinion mining, and many other web applications. Since parti...
Qi Zhang, Yue Zhang, Haomin Yu, Xuanjing Huang
ACSC
2002
IEEE
14 years 27 days ago
Signature Extraction for Overlap Detection in Documents
Easy access to the Web has led to increased potential for students cheating on assignments by plagiarising others’ work. By the same token, Web-based tools offer the potential f...
Raphael A. Finkel, Arkady B. Zaslavsky, Krisztiá...
WWW
2004
ACM
14 years 8 months ago
Web page summarization using dynamic content
Summarizing web pages has recently gained much attention from researchers. Until now two main types of approaches have been proposed for this task: content- and context-based met...
Adam Jatowt
WWW
2003
ACM
14 years 8 months ago
DOM-based content extraction of HTML documents
Web pages often contain clutter (such as pop-up ads, unnecessary images and extraneous links) around the body of an article that distracts a user from actual content. Extraction o...
Suhit Gupta, Gail E. Kaiser, David Neistadt, Peter...
ICWE
2004
Springer
14 years 1 months ago
Accelerating Dynamic Web Content Delivery Using Keyword-Based Fragment Detection
The recent trend in Internet traffic is an increase in requests for dynamic and personalized content. To efficiently serve this trend, several server-side and cache-side fragme...
Daniel Brodie, Amrish Gupta, Weisong Shi