Sciweavers

Search results for: An information extraction engine for web discussion forums
WWW
2010
ACM
14 years 2 months ago
New-web search with microblog annotations
Web search engines discover indexable documents by recursively ‘crawling’ from a seed URL. Their rankings take into account link popularity. While this works well, it introduc...
Tom Rowlands, David Hawking, Ramesh Sankaranarayan...
SPIRE
2009
Springer
14 years 2 months ago
A Two-Level Structure for Compressing Aligned Bitexts
A bitext, or bilingual parallel corpus, consists of two texts, each one in a different language, that are mutual translations. Bitexts are very useful in linguistic engineering bec...
Joaquín Adiego, Nieves R. Brisaboa, Miguel ...
WWW
2008
ACM
14 years 8 months ago
A larger scale study of robots.txt
A website can regulate search engine crawler access to its content using the robots exclusion protocol, specified in its robots.txt file. The rules in the protocol enable the site...
Santanu Kolay
SIGMOD
2008
ACM
14 years 8 months ago
Query biased snippet generation in XML search
Snippets are used by almost every text search engine to complement the ranking scheme in order to effectively handle user searches, which are inherently ambiguous and whose relevance ...
Yu Huang, Ziyang Liu, Yi Chen
WWW
2010
ACM
14 years 2 months ago
Sampling high-quality clicks from noisy click data
Click data captures many users’ document preferences for a query and has been shown to help significantly improve search engine ranking. However, most click data is noisy and of...
Adish Singla, Ryen W. White