Often scientists seek to search for articles on the Web related to a particular chemical. When a scientist searches for a chemical formula using a search engine today, she gets ar...
Bingjun Sun, Qingzhao Tan, Prasenjit Mitra, C. Lee...
With the proliferation of online distribution methods for videos, content owners require easier and more effective methods for monetization through advertising. Matching advertis...
With the prosperity of tourism and Web 2.0 technologies, more and more people have willingness to share their travel experiences on the Web (e.g., weblogs, forums, or Web 2.0 comm...
Qiang Hao, Rui Cai, Changhu Wang, Rong Xiao, Jiang...
This paper is a comparative study of feature selection methods in statistical learning of text categorization. The focus is on aggressive dimensionality reduction. Five methods we...
We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...