To address the challenges of information integration and retrieval, the computational genomics community increasingly has come to rely on the methodology of creating annotations o...
David P. Hill, Barry Smith, Monica S. McAndrews-Hi...
We have previously reported on ProPOSEL, a purpose-built Prosody and PoS English Lexicon compatible with the Python Natural Language ToolKit. ProPOSEC is a new corpus research res...
The use of NLP techniques for document classification has not produced significant improvements in performance within the standard term weighting statistical assignment paradigm (...
We report on the construction of the PAN Wikipedia vandalism corpus, PAN-WVC-10, using Amazon’s Mechanical Turk. The corpus compiles 32 452 edits on 28 468 Wikipedia articles, a...
Social tagging is becoming increasingly popular in many Web 2.0 applications where users can annotate resources (e.g. Web pages) with arbitrary keywords (i.e. tags). A tag recomme...
Ziyu Guan, Jiajun Bu, Qiaozhu Mei, Chun Chen, Can ...