The Wikipedia XML collection turned out to be rich of marked-up phrases as we carried out our INEX 2007 experiments. Assuming that a phrase occurs at the inline level of the markup...
We present an adaptive distributed query-sampling framework that is quality-conscious for extracting high-quality text database samples. The framework divides the query-based samp...
Large software projects often require a programmer to make changes to unfamiliar source code. This paper presents the results of a formative observational study of seven professio...
Robert DeLine, Amir Khella, Mary Czerwinski, Georg...
This paper describes a framework for defining domain specific Feature Functions in a user friendly form to be used in a Maximum Entropy Markov Model (MEMM) for the Named Entity Re...
Text similarity spans a spectrum, with broad topical similarity near one extreme and document identity at the other. Intermediate levels of similarity – resulting from summariza...
Donald Metzler, Yaniv Bernstein, W. Bruce Croft, A...