Sciweavers

» Extending Page Segmentation Algorithms for Mixed-Layout Docu...
WWW
2006
ACM
14 years 8 months ago
A comparison of implicit and explicit links for web page classification
It is well known that Web-page classification can be enhanced by using hyperlinks that provide linkages between Web pages. However, in the Web space, hyperlinks are usually sparse...
Dou Shen, Jian-Tao Sun, Qiang Yang, Zheng Chen
WWW
2009
ACM
14 years 8 months ago
Extracting article text from the web with maximum subsequence segmentation
Much of the information on the Web is found in articles from online news outlets, magazines, encyclopedias, review collections, and other sources. However, extracting this content...
Jeff Pasternack, Dan Roth
PAMI
2010
13 years 2 months ago
Document Ink Bleed-Through Removal with Two Hidden Markov Random Fields and a Single Observation Field
We present a new method for blind document bleed-through removal based on separate Markov Random Field (MRF) regularization for the recto and for the verso side, where separate pri...
Christian Wolf
IPM
2008
13 years 7 months ago
Towards a unified approach to document similarity search using manifold-ranking of blocks
Document similarity search (i.e. query by example) aims to retrieve a ranked list of documents similar to a query document in a text corpus or on the Web. Most existing approaches...
Xiaojun Wan, Jianwu Yang, Jianguo Xiao
ICDAR
2005
IEEE
14 years 29 days ago
Grouping Text Lines in Freeform Handwritten Notes
Handwritten text lines are prominent structures in freeform digital ink notes and their reliable detection is the foundation to a natural and intelligent interface for note editin...
Ming Ye, Herry Sutanto, Sashi Raghupathy, Chengyan...