Background: Document classification is a widespread problem with many applications, from organizing search engine snippets to spam filtering. We previously described Textpresso, ...
Citation matching, or the automatic grouping of bibliographic references that refer to the same document, is a data management problem faced by automatic digital libraries for sci...
Isaac G. Councill, Huajing Li, Ziming Zhuang, Sand...
Research on linear text segmentation has been an ongoing focus in NLP for the last decade, and it has great potential for a wide range of applications such as document summarizati...
Jingbo Zhu, Na Ye, Xinzhi Chang, Wenliang Chen, Be...
For most English words, dictionaries give various senses: e.g., “bank” can stand for a financial institution, shore, set, etc. Automatic selection of the sense intended in a gi...
Alexander F. Gelbukh, Grigori Sidorov, Sang-Yong H...
The problem of dividing a sequence of values into segments occurs in database systems, information retrieval, and knowledge management. The challenge is to select a finite number ...