Sciweavers

2827 search results - page 528 / 566
» Marking Text Documents
WWW
2005
ACM
14 years 9 months ago
Ranking definitions with supervised learning methods
This paper is concerned with the problem of definition search. Specifically, given a term, we are to retrieve definitional excerpts of the term and rank the extracted excerpts acc...
Jun Xu, Yunbo Cao, Hang Li, Min Zhao
KDD
2008
ACM
14 years 9 months ago
De-duping URLs via rewrite rules
A large fraction of the URLs on the web contain duplicate (or near-duplicate) content. De-duping URLs is an extremely important problem for search engines, since all the principal...
Anirban Dasgupta, Ravi Kumar, Amit Sasturkar
SIGMOD
2009
ACM
14 years 9 months ago
Efficient approximate entity extraction with edit distance constraints
Named entity recognition aims at extracting named entities from unstructured text. A recent trend of named entity recognition is finding approximate matches in the text with respe...
Wei Wang 0011, Chuan Xiao, Xuemin Lin, Chengqi Zha...
SIGMOD
2007
ACM
14 years 9 months ago
Spark: top-k keyword query in relational databases
With the increasing amount of text data stored in relational databases, there is a demand for RDBMS to support keyword queries over text data. As a search result is often assemble...
Yi Luo, Xuemin Lin, Wei Wang 0011, Xiaofang Zhou
SAC
2009
ACM
14 years 4 months ago
Applying latent Dirichlet allocation to group discovery in large graphs
This paper introduces LDA-G, a scalable Bayesian approach to finding latent group structures in large real-world graph data. Existing Bayesian approaches for group discovery (suc...
Keith Henderson, Tina Eliassi-Rad