Abstract. In many supervised learning tasks it can be costly or infeasible to obtain objective, reliable labels. We may, however, be able to obtain a large number of subjective, po...
Professional manual transcription of speech is an expensive and time-consuming process. This paper focuses on the problem of combining noisy transcriptions from multiple non-exper...
Kartik Audhkhasi, Panayiotis G. Georgiou, Shrikant...
Sentence Clustering is often used as a first step in Multi-Document Summarization (MDS) to find redundant information. Nevertheless, no gold standard is available. This paper...
This paper proposes a dependency parsing method that uses bilingual constraints to improve the accuracy of parsing bilingual texts (bitexts). In our method, a target-side tree frag...
The comparison of the accuracy of two binary diagnostic tests has traditionally required knowledge of the real state of the disease in all of the patients in the sample via the ap...
Background: A key challenge in systems biology is the reconstruction of an organism's metabolic network from its genome sequence. One strategy for addressing this problem is ...
We propose a gold standard for evaluating two types of information extraction output -- noun phrase (NP) chunks (Abney 1991; Ramshaw and Marcus 1995) and technical terms (Justeson...
We present a new approach to intrinsic summary evaluation, based on initial experiments in van Halteren and Teufel (2003), which combines two novel aspects: comparison of informat...
Terms, term relevances, and sentence relevances are concepts that figure in many NLP applications, such as Text Summarization. These concepts are implemented in various ways, thou...
The availability of a huge mass of textual data in electronic format has increased the need for fast and accurate techniques for textual data processing. Machine learning and stat...