Abstract. Studies of different term extractors on a corpus of the biomedical domain revealed decreasing performance when applied to highly technical texts. Facing the difficulty o...
Social media has become a major source of information for many applications. Numerous techniques have been proposed to analyze network structures and text contents. In this paper,...
In this paper, we show that stylistic text features can be exploited to determine an anonymous author's native language with high accuracy. Specifically, we first use automat...
Software document repositories store artifacts produced in the course of developing software products. But most repositories are simply archives of documents. It is not unusual to ...
Yan Wu, Harvey P. Siy, Mansour Zand, Victor L. Win...
Finding definitions in huge text collections is a challenging problem, not only because of the many ways in which definitions can be conveyed in natural language texts but also be...