Sciweavers

ICWSM
2010
13 years 9 months ago
Coping With Noise in a Real-World Weblog Crawler and Retrieval System
In this paper we examine the effects of noise when creating a real-world weblog corpus for information retrieval. We focus on the DiffPost (Lee et al. 2008) approach to noise remo...
James Lanagan, Paul Ferguson, Neil O'Hare, Alan F....
FINTAL
2006
14 years 2 months ago
Language Model Mixtures for Contextual Ad Placement in Personal Blogs
We introduce a method for content-based advertisement selection for personal blog pages, based on combining multiple representations of the blog. The core idea behind the...
Gilad Mishne, Maarten de Rijke
ACMSE
2007
ACM
14 years 3 months ago
Enhancing clustering blog documents by utilizing author/reader comments
Blogs are a new form of internet phenomenon and a vast, ever-increasing information resource. Mining blog files for information is a very new research direction in data mining. We p...
Beibei Li, Shuting Xu, Jun Zhang
WWW
2004
ACM
14 years 11 months ago
Automatically collecting, monitoring, and mining Japanese weblogs
We present a system that tries to automatically collect and monitor Japanese blog collections that include not only ones made with blog software but also ones written as normal w...
Tomoyuki Nanno, Toshiaki Fujiki, Yasuhiro Suzuki, ...