Information retrieval systems, based on keyword match, are evolving to question answering systems that return short passages or direct answers to questions, rather than URLs point...
This paper considers the use of computational stylistics for performing authorship attribution of electronic messages, addressing categorization problems with as many as 20 differ...
Shlomo Argamon, Marin Saric, Sterling Stuart Stein
Government regulations are semi-structured text documents that are often voluminous, heavily cross-referenced between provisions and even ambiguous. Multiple sources of regulation...
Large 0-1 datasets arise in various applications, such as market basket analysis and information retrieval. We concentrate on the study of topic models, aiming at results which in...
Random projections have recently emerged as a powerful method for dimensionality reduction. Theoretical results indicate that the method preserves distances quite nicely; however,...