Sciweavers

2504 search results - page 35 / 501
» Using Probabilistic Information in Data Integration
CIKM
2011
Springer
12 years 7 months ago
Probabilistic near-duplicate detection using simhash
This paper offers a novel look at using a dimensionality-reduction technique called simhash [8] to detect similar document pairs in large-scale collections. We show that this algo...
Sadhan Sood, Dmitri Loguinov
ICDIM
2008
IEEE
14 years 2 months ago
A geo-temporal Web gazetteer integrating data from multiple sources
This paper presents a geo-temporal gazetteer Web service that provides access to names of places and historical periods, together with the associated geo-temporal information. With...
Hugo Manguinhas, Bruno Martins, José Luis B...
PKDD
2007
Springer
14 years 1 month ago
Using the Web to Reduce Data Sparseness in Pattern-Based Information Extraction
Textual patterns have been used effectively to extract information from large text collections. However, they rely heavily on textual redundancy in the sense that facts have to be m...
Sebastian Blohm, Philipp Cimiano
ERCIMDL
2009
Springer
14 years 2 months ago
Recollection: Integrating Data through Access
The National Digital Information Infrastructure and Preservation Program will demonstrate a pilot tools platform called Recollection that supports access to distributed NDIIPP coll...
Laura E. Campbell
BMCBI
2010
13 years 7 months ago
Integration of multiple data sources to prioritize candidate genes using discounted rating system
Background: Identifying disease genes from a list of candidate genes is an important task in bioinformatics. The main strategy is to prioritize candidate genes based on their simil...
Yongjin Li, Jagdish Chandra Patra