The detection of new information in a document stream is an important component of many potential applications. In this work, a new novelty detection approach based on the identif...
In political speeches, the audience tends to react to or resonate with signals of persuasive communication, including an expected theme, a name or an expression. Automatically predicti...
With the overwhelming number of reports on similar events originating from different sources on the web, it is often hard, using existing web search paradigms, to find the origi...
In Web 2.0, users have generated and shared massive amounts of resources in various media formats, such as news, blogs, audio, photos and videos. The abundance and diversity of t...
Chen Liu, Beng Chin Ooi, Anthony K. H. Tung, Dongx...
The web crawler space is often delimited into two general areas: full-web crawling and focused crawling. We present netSifter, a crawler system which integrates features from thes...