Sciweavers

SIGIR
2010
ACM
13 years 2 months ago
The 8th workshop on large-scale distributed systems for information retrieval (LSDS-IR'10)
The size of the Web as well as user bases of search systems continue to grow exponentially. Consequently, providing subsecond query response times and high query throughput become...
Roi Blanco, Berkant Barla Cambazoglu, Claudio Lucc...
WWW
2008
ACM
14 years 8 months ago
Extracting XML schema from multiple implicit xml documents based on inductive reasoning
We propose a method of classifying XML documents and extracting XML schema from XML by inductive inference based on constraint logic programming. The goal of this work is to type ...
Masaya Eki, Tadachika Ozono, Toramatsu Shintani
IPM
2007
13 years 7 months ago
Using structural contexts to compress semistructured text collections
We describe a compression model for semistructured documents, called Structural Contexts Model (SCM), which takes advantage of the context information usually implicit in the stru...
Joaquín Adiego, Gonzalo Navarro, Pablo de l...
ACL
2009
13 years 5 months ago
Profile Based Cross-Document Coreference Using Kernelized Fuzzy Relational Clustering
Coreferencing entities across documents in a large corpus enables advanced document understanding tasks such as question answering. This paper presents a novel cross document core...
Jian Huang 0002, Sarah M. Taylor, Jonathan L. Smit...
ERCIMDL
2006
Springer
13 years 11 months ago
Design and Selection Criteria for a National Web Archive
Web archives and Digital Libraries are conceptually similar, as they both store and provide access to digital contents. The process of loading documents into a Digital Library usua...
Daniel Gomes, Sérgio Freitas, Mário ...