Sciweavers

96 search results - page 17 / 20
» Detecting Near-replicas on the Web by Content and Hyperlink ...
SIGMOD
2004
ACM
119 views, Database
14 years 7 months ago
Lazy Query Evaluation for Active XML
In this paper, we study query evaluation on Active XML documents (AXML for short), a new generation of XML documents that has recently gained popularity. AXML documents are XML do...
Serge Abiteboul, Omar Benjelloun, Bogdan Cautis, I...
DGO
2006
134 views, Education
13 years 9 months ago
Next steps in near-duplicate detection for eRulemaking
Large volume public comment campaigns and web portals that encourage the public to customize form letters produce many near-duplicate documents, which increases processing and sto...
Hui Yang, Jamie Callan, Stuart W. Shulman
LREC
2008
110 views, Education
13 years 9 months ago
Unsupervised and Domain Independent Ontology Learning: Combining Heterogeneous Sources of Evidence
Acquiring knowledge from the Web to build domain ontologies has become a common practice in the Ontological Engineering field. The vast amount of freely available information allo...
David Manzano-Macho, Asunción Gómez-...
VLDB
2003
ACM
125 views, Database
14 years 7 months ago
THESUS: Organizing Web document collections based on link semantics
The requirements for effective search and management of the WWW are stronger than ever. Currently Web documents are classified based on their content not taking into acco...
Maria Halkidi, Benjamin Nguyen, Iraklis Varlamis, ...
WWW
2004
ACM
14 years 8 months ago
Mining models of human activities from the web
The ability to determine what day-to-day activity (such as cooking pasta, taking a pill, or watching a video) a person is performing is of interest in many application domains. A ...
Mike Perkowitz, Matthai Philipose, Kenneth P. Fish...