Conversational dialogue systems cannot be evaluated in a fully formal manner, because dialogue is heavily dependent on context and current dialogue theory is not precise enough to ...
Ron Artstein, Sudeep Gandhe, Jillian Gerten, Anton...
The need to evaluate a large number of topics (queries) makes IR evaluation a difficult task. In this paper, we study a topic selection problem for IR evaluation. The selection cr...
Jianhan Zhu, Jun Wang, Vishwa Vinay, Ingemar J. Co...
We describe an experiment that measures the pedagogical usefulness of the results returned by the National Science Digital Library (NSDL) and Google. Eleven public school teachers ...
Large music collections require new ways to let users interact with their music. The concept of finding ‘similar’ songs, albums, or artists provides handles to users for easy ...
We present an evaluation strategy for clock synchronization algorithms. It is based on a combination of measured traces, which provide for realistic performance estimation, and of...