Sciweavers

19 search results - page 3 / 4
WIDM
2003
ACM
14 years 1 month ago
Schema-guided wrapper maintenance for web-data extraction
Extracting data from Web pages using wrappers is a fundamental problem arising in a large variety of applications of vast practical interest. There are two main issues relevant t...
Xiaofeng Meng, Dongdong Hu, Chen Li
HT
2005
ACM
14 years 2 months ago
As we may perceive: inferring logical documents from hypertext
In recent years, many algorithms for the Web have been developed that work with information units distinct from individual web pages. These include segments of web pages or aggreg...
Pavel Dmitriev, Carl Lagoze, Boris Suchkov
ACL
2006
13 years 10 months ago
A Collaborative Framework for Collecting Thai Unknown Words from the Web
We propose a collaborative framework for collecting Thai unknown words found on Web pages over the Internet. Our main goal is to design and construct a Web-based system which allow...
Choochart Haruechaiyasak, Chatchawal Sangkeettraka...
CLEF
2009
Springer
13 years 6 months ago
Overview of VideoCLEF 2009: New Perspectives on Speech-Based Multimedia Content Enrichment
VideoCLEF 2009 offered three tasks related to enriching video content for improved multimedia access in a multilingual environment. For each task, video data (Dutch-language telev...
Martha Larson, Eamonn Newman, Gareth J. F. Jones
KCAP
2005
ACM
14 years 2 months ago
AutoFeed: an unsupervised learning system for generating webfeeds
The AutoFeed system automatically extracts data from semistructured web sites. Previously, researchers have developed two types of supervised learning approaches for extracting we...
Bora Gazen, Steven Minton