Sciweavers

1947 search results - page 68 / 390
» On the Automatic Extraction of Data from the Hidden Web
SIBGRAPI
2000
IEEE
14 years 1 month ago
An Off-Line Signature Verification System using Hidden Markov Model and Cross-Validation
The main objective of this work is to present an off-line signature verification system. It is divided into three parts. The first one demonstrates a pre-processing process,...
Edson J. R. Justino, Abdenaim El Yacoubi, Flá...
PVLDB
2008
141 views
13 years 8 months ago
WebTables: exploring the power of tables on the web
The World-Wide Web consists of a huge number of unstructured documents, but it also contains structured data in the form of HTML tables. We extracted 14.1 billion HTML tables from...
Michael J. Cafarella, Alon Y. Halevy, Daisy Zhe Wa...
GFKL
2007
Springer
152 views, Data Mining
14 years 2 months ago
Supporting Web-based Address Extraction with Unsupervised Tagging
The manual acquisition and modeling of tourist information, e.g. addresses of points of interest, is time-intensive and therefore cost-intensive. Furthermore, the encoded inform...
Berenike Loos, Chris Biemann
ICADL
2007
Springer
129 views, Education
14 years 2 months ago
Using Automatic Metadata Extraction to Build a Structured Syllabus Repository
Syllabi are important documents created by instructors for students. Students use syllabi to find information and to prepare for class. Instructors often need to find similar syl...
Xiaoyan Yu, Manas Tungare, Weiguo Fan, Manuel A. P...
ECIR
2006
Springer
13 years 10 months ago
Automatic Acquisition of Chinese-English Parallel Corpus from the Web
Parallel corpora are a valuable resource for tasks such as cross-language information retrieval and data-driven natural language processing systems. Previously only small scale cor...
Ying Zhang, Ke Wu, Jianfeng Gao, Phil Vines