The World Wide Web, initially intended as a way to publish static hypertexts on the Internet, is moving toward complex applications. Static Web sites are being gradually replaced ...
Many machine learning technologies such as support vector machines, boosting, and neural networks have been applied to the ranking problem in information retrieval. However, since...
Tao Qin, Xu-Dong Zhang, Ming-Feng Tsai, De-Sheng W...
A lot of future-related information is available in news articles or Web pages. This information can however differ to large extent and may fluctuate over time. It is therefore di...
Adam Jatowt, Kensuke Kanazawa, Satoshi Oyama, Kats...
In this poster, we present an information extraction engine for web-based forums. The engine analyzes the HTML files crawled from web forums, deduces the wrapper (template) of the...
Hanny Yulius Limanto, Nguyen Ngoc Giang, Vo Tan Tr...
In this paper, we continue our investigations of "web spam": the injection of artificially-created pages into the web in order to influence the results from search engin...
Alexandros Ntoulas, Marc Najork, Mark Manasse, Den...