Sciweavers

» On the Automatic Extraction of Data from the Hidden Web
FLAIRS
2007
13 years 11 months ago
Indexing Documents by Discourse and Semantic Contents from Automatic Annotations of Texts
The basic aim of the model proposed here is to automatically build semantic metatext structure for texts that would allow us to search and extract discourse and semantic informati...
Brahim Djioua, Jean-Pierre Desclés
ICASSP
2011
IEEE
13 years 16 days ago
Bilingual audio-subtitle extraction using automatic segmentation of movie audio
Extraction of bilingual audio and text data is crucial for designing Speech to Speech (S2S) systems. In this work, we propose an automatic method to segment multilingual audio str...
Andreas Tsiartas, Prasanta Kumar Ghosh, Panayiotis...
LREC
2010
13 years 10 months ago
BlogBuster: A Tool for Extracting Corpora from the Blogosphere
This paper presents BlogBuster, a tool for extracting a corpus from the blogosphere. The topic of cleaning arbitrary web pages with the goal of extracting a corpus from web data, ...
Georgios Petasis, Dimitrios Petasis
CBMS
2006
IEEE
14 years 16 days ago
A Generic Framework: From Clinical Notes to Electronic Medical Records
Electronic Medical Records are important to manage health data and save lives to improve the quality of service in hospitals. Clinical medical records contain a wealth of informat...
Hyoil Han, Yoori Choi, Yoo Myung Choi, Xiaohua Zho...
SIGMOD
2000
ACM
14 years 1 month ago
XTRACT: A System for Extracting Document Type Descriptors from XML Documents
XML is rapidly emerging as the new standard for data representation and exchange on the Web. An XML document can be accompanied by a Document Type Descriptor (DTD) which plays the...
Minos N. Garofalakis, Aristides Gionis, Rajeev Ras...