Sciweavers

» Exploiting content redundancy for web information extraction
WEBI
2005
Springer
14 years 1 months ago
A Semi-Supervised Document Clustering Algorithm Based on EM
Document clustering is a very hard task in Automatic Text Processing since it requires extracting regular patterns from a document collection without a priori knowledge on the cat...
Leonardo Rigutini, Marco Maggini
IADIS
2004
13 years 9 months ago
A conceptual modeling of multimedia documents
Our research concerns the identification and representation of the semantic structures of multimedia documents. The semantic structure of a multimedia document ...
Mohamed Mbarki, Chantal Soulé-Dupuy
ITCC
2000
IEEE
13 years 12 months ago
Towards Knowledge Discovery from WWW Log Data
As the result of interactions between visitors and a web site, an HTTP log file contains very rich knowledge about users' on-site behaviors, which, if fully exploited, can better c...
Feng Tao, Fionn Murtagh
ICDE
2010
IEEE
14 years 7 months ago
Viewing a World of Annotations through AnnoVIP
The proliferation of electronic content has notably led to the emergence of large corpora of interrelated structured documents (such as HTML and XML Web pages) and semantic annot...
Konstantinos Karanasos, Spyros Zoupanos
JCDL
2004
ACM
14 years 1 months ago
Translating unknown cross-lingual queries in digital libraries using a web-based approach
Users’ cross-lingual queries to a digital library system might be short and not included in a common translation dictionary (unknown terms). In this paper, we investigate the fe...
Jenq-Haur Wang, Jei-Wen Teng, Pu-Jen Cheng, Wen-Hs...