Improving data quality is a time-consuming, labor-intensive and often domain-specific operation. Existing data repair approaches are either fully automated or not efficient in int...
Mohamed Yakout, Ahmed K. Elmagarmid, Jennifer Nevi...
We present CiteSeer: an autonomous citation indexing system which indexes academic literature in electronic format (e.g. PostScript files on the Web). CiteSeer understands how to ...
Clickthrough data has been the subject of increasing popularity as an implicit indicator of user feedback. Previous analysis has suggested that user click behaviour is su...
Falk Scholer, Milad Shokouhi, Bodo Billerbeck, And...
Web search is challenging partly due to the fact that search queries and Web documents use different language styles and vocabularies. This paper provides a quantitative analysis ...
When attempting to annotate music, it is important to consider both acoustic content and social context. This paper explores techniques for collecting and combining multiple sourc...
Douglas Turnbull, Luke Barrington, Gert R. G. Lanc...