Sciweavers

708 search results - page 51 / 142
» Identifying Content Blocks from Web Documents
ICDAR
2003
IEEE
14 years 2 months ago
Rejection Algorithm for Mis-segmented Characters In Multilingual Document Recognition
In OCR systems the character segmentation algorithm may generate mis-segmented blocks. Feedback information from character classifier is indispensable to achieve higher character ...
Zhengang Chen, Xiaoqing Ding
ICAIL
2005
ACM
14 years 2 months ago
Constructing a Semantic Network for Legal Content
The Dutch Tax and Customs Administration (DTCA) is one of many organizations that deal with a multitude of electronic legal data, from various sources and in different formats. In...
Radboud Winkels, Alexander Boer, Emile de Maat, To...
VLDB
2002
ACM
13 years 8 months ago
Distributed Search over the Hidden Web: Hierarchical Database Sampling and Selection
Many valuable text databases on the web have non-crawlable contents that are "hidden" behind search interfaces. Metasearchers are helpful tools for searching over many s...
Panagiotis G. Ipeirotis, Luis Gravano
MEDIAFORENSICS
2010
13 years 10 months ago
A framework for theoretical analysis of content fingerprinting
The popularity of video sharing platforms such as YouTube has prompted the need for the development of efficient techniques for multimedia identification. Content fingerprinting i...
Avinash L. Varna, Wei-Hong Chuang, Min Wu
ICDE
2010
IEEE
14 years 8 months ago
Viewing a World of Annotations through AnnoVIP
The proliferation of electronic content has notably led to the emergence of large corpora of interrelated structured documents (such as HTML and XML Web pages) and semantic annot...
Konstantinos Karanasos, Spyros Zoupanos