Sciweavers

97 search results - page 12 / 20
» Highly Scalable Algorithms for Robust String Barcoding
WSDM
2010
ACM
14 years 2 months ago
Learning URL patterns for webpage de-duplication
Presence of duplicate documents in the World Wide Web adversely affects crawling, indexing and relevance, which are the core building blocks of web search. In this paper, we pres...
Hema Swetha Koppula, Krishna P. Leela, Amit Agarwa...
EISWT
2007
13 years 9 months ago
Schema Based XML Compression
XML has grown into a widely used and highly developed technology, due in part to the subcomponents built around the technology (advanced parsers, frameworks, libraries, etc.). The ...
Naphtali Rishe, Ouri Wolfson, Ben Wongsaroj, Damia...
COLCOM
2005
IEEE
14 years 1 months ago
On-demand overlay networking of collaborative applications
We propose a new overlay network, called Generic Identifier Network (GIN), for collaborative nodes to share objects with transactions across affiliated organizations by merging th...
Cheng-Jia Lai, Richard R. Muntz
CIKM
2009
Springer
14 years 2 months ago
Combining labeled and unlabeled data with word-class distribution learning
We describe a novel, simple, and highly scalable semi-supervised method called Word-Class Distribution Learning (WCDL), and apply it to the task of information extraction (IE) by utili...
Yanjun Qi, Ronan Collobert, Pavel Kuksa, Koray Kav...
ISCC
2006
IEEE
14 years 1 months ago
A Semantic Overlay Network for P2P Schema-Based Data Integration
Today, data sources are pervasive and their number is growing tremendously. Current tools are not prepared to exploit this unprecedented amount of information and to cop...
Carmela Comito, Simon Patarin, Domenico Talia