Sciweavers

37 search results - page 3 / 8
» Language Models and Smoothing Methods for Collections with L...
CICLING
2010
Springer
13 years 11 months ago
Word Length n-Grams for Text Re-use Detection
Abstract. The automatic detection of shared content in written documents, which includes text reuse and its unacknowledged commitment, plagiarism, has become an important probl...
Alberto Barrón-Cedeño, Chiara Basile...
SDM
2008
SIAM
13 years 9 months ago
Graph Mining with Variational Dirichlet Process Mixture Models
Graph data such as chemical compounds and XML documents are getting more common in many application domains. A main difficulty of graph data processing lies in the intrinsic high ...
Koji Tsuda, Kenichi Kurihara
CIKM
2005
Springer
14 years 1 months ago
Predicting accuracy of extracting information from unstructured text collections
Exploiting lexical and semantic relationships in large unstructured text collections can significantly enhance managing, integrating, and querying information locked in unstructur...
Eugene Agichtein, Silviu Cucerzan
ECIR
2008
Springer
13 years 9 months ago
A Document-Centered Approach to a Natural Language Music Search Engine
We propose a new approach to a music search engine that can be accessed via natural language queries. As with existing approaches, we try to gather as much contextual information a...
Peter Knees, Tim Pohle, Markus Schedl, Dominik Sch...
SIGIR
2004
ACM
14 years 1 months ago
Dependence language model for information retrieval
This paper presents a new dependence language modeling approach to information retrieval. The approach extends the basic language modeling approach based on unigram by relaxing th...
Jianfeng Gao, Jian-Yun Nie, Guangyuan Wu, Guihong ...