A soft error redirection is a URL redirection to a page that returns the HTTP status code 200 (OK) but has actually no relevant content to the client request. Since such redirecti...
Taehyung Lee, Jinil Kim, Jin Wook Kim, Sung-Ryul K...
Recent work on incremental crawling has enabled the indexed document collection of a search engine to be more synchronized with the changing World Wide Web. However, this synchron...
Lipyeow Lim, Min Wang, Sriram Padmanabhan, Jeffrey...
Information extraction (IE) from semi-structured Web documents is a critical issue for information integration systems on the Internet. Previous work in wrapper induction aim to so...
We propose a novel HMM-based framework to accurately transliterate unseen named entities. The framework leverages features in letteralignment and letter n-gram pairs learned from ...
Bing Zhao, Nguyen Bach, Ian R. Lane, Stephan Vogel
According to a recent survey made by Nielsen NetRatings, searching on news articles is one of the most important activity online. Indeed, Google, Yahoo, MSN and many others have p...
Gianna M. Del Corso, Antonio Gulli, Francesco Roma...