Sciweavers

629 search results - page 77 / 126
» The Generalized Quantum Database Search Algorithm
CIKM
2009
Springer
14 years 3 months ago
Feature selection for ranking using boosted trees
Modern search engines have to be fast to satisfy users, so there are hard back-end latency requirements. The set of features useful for search ranking functions, though, continues...
Feng Pan, Tim Converse, David Ahn, Franco Salvetti...
WEBDB
2004
Springer
14 years 2 months ago
Spam, Damn Spam, and Statistics: Using Statistical Analysis to Locate Spam Web Pages
The increasing importance of search engines to commercial web sites has given rise to a phenomenon we call “web spam”, that is, web pages that exist only to mislead search eng...
Dennis Fetterly, Mark Manasse, Marc Najork
CIKM
2005
Springer
14 years 2 months ago
Joint deduplication of multiple record types in relational data
Record deduplication is the task of merging database records that refer to the same underlying entity. In relational databases, accurate deduplication for records of one type is o...
Aron Culotta, Andrew McCallum
WWW
2005
ACM
14 years 9 months ago
Partitioning of Web graphs by community topology
We introduce a stricter Web community definition to overcome boundary ambiguity of a Web community defined by Flake, Lawrence and Giles [2], and consider the problem of finding co...
Hidehiko Ino, Mineichi Kudo, Atsuyoshi Nakamura
KDD
2004
ACM
14 years 9 months ago
Systematic data selection to mine concept-drifting data streams
One major problem of existing methods to mine data streams is that they make ad hoc choices to combine the most recent data with some amount of old data to search for the new hypothesis. T...
Wei Fan