Sciweavers

708 search results - page 100 / 142
» Identifying Content Blocks from Web Documents
CIKM
2008
Springer
13 years 9 months ago
Semi-supervised text categorization by active search
In automated text categorization, given a small number of labeled documents, it is very challenging, if not impossible, to build a reliable classifier that is able to achieve high...
Zenglin Xu, Rong Jin, Kaizhu Huang, Michael R. Lyu...
CIKM
2009
Springer
14 years 2 months ago
Vetting the links of the web
Many web links mislead human surfers and automated crawlers because they point to changed content, out-of-date information, or invalid URLs. It is a particular problem for large, ...
Na Dai, Brian D. Davison
WWW
2005
ACM
14 years 8 months ago
Web-assisted annotation, semantic indexing and search of television and radio news
The Rich News system, which can automatically annotate radio and television news with the aid of resources retrieved from the World Wide Web, is described. Automatic speech recogni...
Mike Dowman, Valentin Tablan, Hamish Cunningham, B...
AND
2010
13 years 5 months ago
Statement map: reducing web information credibility noise through opinion classification
On the Internet, users often encounter noise in the form of spelling errors or unknown words; however, dishonest, unreliable, or biased information also acts as noise that makes i...
Koji Murakami, Eric Nichols, Junta Mizuno, Yotaro ...
DRR
2008
13 years 9 months ago
Whole-book recognition using mutual-entropy-driven model adaptation
We describe an approach to unsupervised high-accuracy recognition of the textual contents of an entire book using fully automatic mutual-entropy-based model adaptation. Given imag...
Pingping Xiu, Henry S. Baird