Sciweavers

311 search results - page 32 / 63
» XTRACT: A System for Extracting Document Type Descriptors fr...
WIDM
2004
ACM
14 years 1 month ago
Measuring similarity between collection of values
In this paper, we propose a set of similarity metrics for manipulating collections of values occurring in XML documents. Following the data model presented in TAX algebra, we treat...
Carina F. Dorneles, Carlos A. Heuser, Andrei E. N....
WWW
2002
ACM
14 years 8 months ago
Using web structure for classifying and describing web pages
The structure of the web is increasingly being used to improve organization, search, and analysis of information on the web. For example, Google uses the text in citing documents ...
Eric J. Glover, Kostas Tsioutsiouliklis, Steve Law...
ECIR
1998
Springer
13 years 9 months ago
Coupled Hierarchical IR and Stochastic Models for Surface Information Extraction
We present in this paper a combination of Machine Learning based Information Retrieval (IR) techniques and stochastic language modelling in a hierarchical system that extracts sur...
Hugo Zaragoza, Patrick Gallinari
BMCBI
2006
13 years 7 months ago
Automatic document classification of biological literature
Background: Document classification is a widespread problem with many applications, from organizing search engine snippets to spam filtering. We previously described Textpresso, ...
David Chen, Hans-Michael Müller, Paul W. Ster...
WWW
2004
ACM
14 years 8 months ago
Testbed for information extraction from deep web
Search results generated by searchable databases are served dynamically and far larger than the static documents on the Web. These results pages have been referred to as the Deep ...
Yasuhiro Yamada, Nick Craswell, Tetsuya Nakatoh, S...