Vector-based models of lexical semantics retrieve semantically related words automatically from large corpora by exploiting the property that words with a similar meaning tend to ...
: We describe methods for automatically identifying signature blocks and reply lines in plaintext email messages. This analysis has many potential applications, such as preprocessi...
A major difficulty of supervised approaches for text classification is that they require a great number of training instances in order to construct an accurate classifier. This pap...
The real di culty in development of practical NLP systems comes from the fact that we do not have e ective means for gathering \knowledge". In this paper, we propose an algor...
Satoshi Sekine, Jeremy J. Carroll, Sophia Ananiado...
Previous studies evaluate simulated dialog corpora using evaluation measures which can be automatically extracted from the dialog systems' logs. However, the validity of thes...