Sciweavers

» Robust Text Processing in Automated Information Retrieval
WWW
2006
ACM
14 years 9 months ago
Visually guided bottom-up table detection and segmentation in web documents
In the AllRight project, we are developing an algorithm for unsupervised table detection and segmentation that uses the visual rendition of a Web page rather than the HTML code. O...
Bernhard Krüpl, Marcus Herzog
ICWSM
2008
13 years 10 months ago
Predicting Success and Failure in Weight Loss Blogs through Natural Language Use
We explore the emerging phenomenon of blogging about personal goals, and demonstrate how natural language processing tools can be used to uncover psychologically meaningful constr...
Cindy K. Chung, Clinton Jones, Alexander Liu, Jame...
SIGIR
2009
ACM
14 years 3 months ago
Addressing morphological variation in alphabetic languages
The selection of indexing terms for representing documents is a key decision that limits how effective subsequent retrieval can be. Often stemming algorithms are used to normaliz...
Paul McNamee, Charles K. Nicholas, James Mayfield
DOCENG
2009
ACM
14 years 3 months ago
Object-level document analysis of PDF files
The PDF format is commonly used for the exchange of documents on the Web and there is a growing need to understand and extract or repurpose data held in PDF documents. Many system...
Tamir Hassan
ACL
2010
13 years 6 months ago
Event-Based Hyperspace Analogue to Language for Query Expansion
Bag-of-words approaches to information retrieval (IR) are effective but assume independence between words. The Hyperspace Analogue to Language (HAL) is a cognitively motivated and...
Tingxu Yan, Tamsin Maxwell, Dawei Song, Yuexian Ho...