Sciweavers

SIGIR
2008
ACM
14 years 10 days ago
Term clouds as surrogates for user generated speech
User generated spoken audio remains a challenge for Automatic Speech Recognition (ASR) technology and content-based audio surrogates derived from ASR-transcripts must be error rob...
Manos Tsagkias, Martha Larson, Maarten de Rijke
SIGIR
2008
ACM
Analyzing web text association to disambiguate abbreviation in queries
We introduce a statistical model for abbreviation disambiguation in Web search, based on analysis of Web data resources, including anchor text, click log and query log. By combini...
Xing Wei, Fuchun Peng, Benoît Dumoulin
SIGIR
2008
ACM
Optical character recognition errors and their effects on natural language processing
Errors are unavoidable in advanced computer vision applications such as optical character recognition, and the noise induced by these errors presents a serious challenge to downstr...
Daniel P. Lopresti
SIGIR
2008
ACM
Learning from labeled features using generalized expectation criteria
It is difficult to apply machine learning to new domains because often we lack labeled problem instances. In this paper, we provide a solution to this problem that leverages domai...
Gregory Druck, Gideon S. Mann, Andrew McCallum
SIGIR
2008
ACM
Learning to rank at query-time using association rules
Some applications have to present their results in the form of ranked lists. This is the case of many information retrieval applications, in which documents must be sorted accordi...
Adriano Veloso, Humberto Mossri de Almeida, Marcos...
SIGIR
2008
ACM
Topic based language models for OCR correction
Anurag Bhardwaj, Faisal Farooq, Huaigu Cao, Venu G...
SIGIR
2008
ACM
Uncovering deep user context from blogs
People's utterances are fundamentally different to other documents because they are more immediate and less thought through. While this makes them more natural...
Robert McArthur
SIGIR
2008
ACM
On profiling blogs with representative entries
With an explosive growth of blogs, information seeking in blogosphere becomes more and more challenging. One example task is to find the most relevant topical blogs against a give...
Jinfeng Zhuang, Steven C. H. Hoi, Aixin Sun
SIGIR
2008
ACM
Latent Dirichlet allocation based multi-document summarization
Extraction based Multi-Document Summarization Algorithms consist of choosing sentences from the documents using some weighting mechanism and combining them into a summary. In this...
Rachit Arora, Balaraman Ravindran