Sciweavers

» Indexing Large Trajectory Data Sets With SETI
WWW
2010
ACM
14 years 2 months ago
A pattern tree-based approach to learning URL normalization rules
Duplicate URLs have brought serious troubles to the whole pipeline of a search engine, from crawling, indexing, to result serving. URL normalization is to transform duplicate URLs...
Tao Lei, Rui Cai, Jiang-Ming Yang, Yan Ke, Xiaodon...
SIGMOD
2012
ACM
11 years 10 months ago
Skew-aware automatic database partitioning in shared-nothing, parallel OLTP systems
The advent of affordable, shared-nothing computing systems portends a new class of parallel database management systems (DBMS) for on-line transaction processing (OLTP) applicatio...
Andrew Pavlo, Carlo Curino, Stanley B. Zdonik
WSDM
2010
ACM
14 years 2 months ago
Learning URL patterns for webpage de-duplication
Presence of duplicate documents in the World Wide Web adversely affects crawling, indexing and relevance, which are the core building blocks of web search. In this paper, we pres...
Hema Swetha Koppula, Krishna P. Leela, Amit Agarwa...
BMCBI
2010
13 years 7 months ago
Challenges in microarray class discovery: a comprehensive examination of normalization, gene selection and clustering
Background: Cluster analysis, and in particular hierarchical clustering, is widely used to extract information from gene expression data. The aim is to discover new classes, or su...
Eva Freyhult, Mattias Landfors, Jenny Önskog,...
WWW
2009
ACM
14 years 8 months ago
Latent space domain transfer between high dimensional overlapping distributions
Transferring knowledge from one domain to another is challenging due to a number of reasons. Since both conditional and marginal distribution of the training data and test data ar...
Sihong Xie, Wei Fan, Jing Peng, Olivier Verscheure...