Combating Web spam has become one of the top challenges for Web search engines. State-of-the-art spam detection techniques are usually designed for specific known types of Web spa...
In this paper, an agent is defined as a triple (S, RS, LS), where S is a multi-hierarchical decision system, RS is a set of rules extracted from S defining values of its decision a...
Large scale learning is often realistic only in a semi-supervised setting where a small set of labeled examples is available together with a large collection of unlabeled data. In...
The explosion of Web opinion data has made automatic tools essential for analyzing and understanding people’s sentiments toward different topics. In most sentiment analy...
Recent work has shown the feasibility and promise of template-independent Web data extraction. However, existing approaches use decoupled strategies, attempting to do data record ...
Jun Zhu, Zaiqing Nie, Ji-Rong Wen, Bo Zhang, Wei-Y...