Sciweavers

323 search results - page 39 / 65
KDD
2004
ACM
14 years 9 months ago
Probabilistic author-topic models for information discovery
We propose a new unsupervised learning technique for extracting information from large text collections. We model documents as if they were generated by a two-stage stochastic pro...
Mark Steyvers, Padhraic Smyth, Michal Rosen-Zvi, T...
SIGIR
2006
ACM
14 years 3 months ago
Feature diversity in cluster ensembles for robust document clustering
The performance of document clustering systems depends on employing optimal text representations, which are not only difficult to determine beforehand, but also may vary from one ...
Xavier Sevillano, Germán Cobo, Francesc Al...
CHI
1996
ACM
14 years 1 month ago
Silk from a Sow's Ear: Extracting Usable Structures from the Web
In its current implementation, the World-Wide Web lacks much of the explicit structure and strong typing found in many closed hypertext systems. While this property has directly f...
Peter Pirolli, James E. Pitkow, Ramana Rao
SIGIR
2010
ACM
14 years 27 days ago
Inferring user intent in web search by exploiting social annotations
In this paper, we present a folksonomy-based approach for implicit user intent extraction during a Web search process. We present a number of result re-ranking techniques based on...
Jose M. Conde, David Vallet, Pablo Castells
ICDE
2007
IEEE
14 years 10 months ago
Organizing Hidden-Web Databases by Clustering Visible Web Documents
In this paper we address the problem of organizing hidden-Web databases. Given a heterogeneous set of Web forms that serve as entry points to hidden-Web databases, our goal is to ...
Luciano Barbosa, Juliana Freire, Altigran Soares d...