Government regulations are semi-structured text documents that are often voluminous, heavily cross-referenced across provisions, and at times ambiguous. Multiple sources of regulation...
We address the problem of measuring global quality metrics of search engines, like corpus size, index freshness, and density of duplicates in the corpus. The recently proposed est...
Social tagging is becoming increasingly popular in many Web 2.0 applications where users can annotate resources (e.g. Web pages) with arbitrary keywords (i.e. tags). A tag recomme...
Ziyu Guan, Jiajun Bu, Qiaozhu Mei, Chun Chen, Can ...
As the number of available Web pages grows, users experience increasing difficulty finding documents relevant to their interests. One of the underlying reasons for this is that mo...
The "Institut National de l'Audiovisuel" (INA) is in charge of keeping records of national TV broadcasts. Its main function is to provide TV producers with authenti...
Marc Nanard, Jocelyne Nanard, David Genest, Michel...