Named Entity Recognition in Tweets: An Experimental Study

People tweet more than 100 million times daily, yielding a noisy, informal, but sometimes informative corpus of 140-character messages that mirrors the zeitgeist in an unprecedented manner. The performance of standard NLP tools is severely degraded on tweets. This paper addresses this issue by rebuilding the NLP pipeline, beginning with part-of-speech tagging, through chunking, to named-entity recognition. Our novel T-NER system doubles F1 score compared with the Stanford NER system. T-NER leverages the redundancy inherent in tweets to achieve this performance, using LabeledLDA to exploit Freebase dictionaries as a source of distant supervision. LabeledLDA outperforms co-training, increasing F1 by 25% over ten common entity types. Our NLP tools are available at: http://github.com/aritter/twitter_nlp
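
The abstract reports its results as F1 scores. As a reminder of what that metric measures, here is a minimal Python sketch of exact-match, entity-level precision, recall, and F1. The span representation, the function name entity_f1, the type labels, and the toy data are all illustrative assumptions; this is not the paper's evaluation script and the numbers are not its results.

```python
from typing import Set, Tuple

# Assumed representation: (first_token_index, last_token_index, entity_type).
Entity = Tuple[int, int, str]


def entity_f1(gold: Set[Entity], predicted: Set[Entity]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under exact-match scoring.

    A predicted entity counts as correct only if both its span and its
    type match a gold entity exactly.
    """
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical toy annotations for one tweet (not data from the paper).
gold = {(0, 1, "PERSON"), (4, 5, "COMPANY"), (7, 7, "LOCATION"), (9, 10, "BAND")}
predicted = {(0, 1, "PERSON"), (4, 5, "LOCATION"), (7, 7, "LOCATION")}

precision, recall, f1 = entity_f1(gold, predicted)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")  # P=0.67 R=0.50 F1=0.57
```

In this toy case the span (4, 5) is found but labeled with the wrong type, so it counts as both a false positive and a false negative. Exact-match scoring therefore penalizes segmentation and classification errors alike.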

Alan Ritter, Sam Clark, Mausam, Oren Etzioni

EMNLP 2011 | Entity Recognition | Natural Language Processing | NLP Tools | Unprecedented Manner |

Post Info

Added	20 Dec 2011
Updated	20 Dec 2011
Type	Conference
Year	2011
Where	EMNLP
Authors	Alan Ritter, Sam Clark, Mausam, Oren Etzioni
