We present a syntactic and lexically based discourse segmenter (SLSeg) that is designed to avoid the common problem of over-segmenting text. Segmentation is the first step in a di...
In this paper, we propose a new comprehensive methodology in order to evaluate the performance of noisy historical document recognition techniques. We aim to evaluate not only the...
Authorship analysis of electronic texts assists digital forensics and anti-terror investigation. Author identification can be seen as a single-label multi-class text categorizatio...
A method is presented for segmenting text into subtopic areas. The proportion of related pairwise words is calculated between adjacent windows of text to determine their lexical s...
In this paper, we introduce and evaluate two novel approaches, one using video stream and the other using close-caption text stream, for segmenting TV news into stories. The segmen...
Hemant Misra, Frank Hopfgartner, Anuj Goyal, P. Pu...