NAACL
2010

Minimally-Supervised Extraction of Entities from Text Advertisements

Extraction of entities from ad creatives is an important problem that can benefit many computational advertising tasks. Supervised and semi-supervised solutions rely on labeled data, which is expensive, time-consuming, and difficult to procure for ad creatives. A small set of manually derived constraints on feature expectations over unlabeled data can be used to partially and probabilistically label large amounts of data. Building on recent work in constraint-based semi-supervised learning, this paper injects lightweight supervision, specified as these "constraints", into a semi-Markov conditional random field model of entity extraction in ad creatives. Relying solely on the constraints, the model is trained on a set of unlabeled ads using an online learning algorithm. We demonstrate significant accuracy improvements on a manually labeled test set as compared to a baseline dictionary approach. We also achieve accuracy that approaches that of a fully supervised classifier.
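To make the "baseline dictionary approach" mentioned in the abstract more concrete, the following is a minimal sketch of one way such a baseline could work: greedy longest-match lookup of token spans against a phrase dictionary, which yields labeled segments much as a segment-level (semi-Markov) model would. The abstract does not describe the authors' actual baseline, so the dictionary entries, the entity labels (`BRAND`, `PRODUCT`, `PROMOTION`), the function name `dictionary_tag`, and the `max_len` parameter are all illustrative assumptions.

```python
from typing import Dict, List, Tuple

# Hypothetical phrase dictionary: lowercase token tuple -> entity label.
# Both the phrases and the label set are invented for illustration.
DICTIONARY: Dict[Tuple[str, ...], str] = {
    ("acme",): "BRAND",
    ("running", "shoes"): "PRODUCT",
    ("free", "shipping"): "PROMOTION",
}


def dictionary_tag(
    tokens: List[str],
    dictionary: Dict[Tuple[str, ...], str] = DICTIONARY,
    max_len: int = 3,
) -> List[Tuple[int, int, str]]:
    """Label segments by greedy longest-match dictionary lookup.

    Returns (start, end, label) triples with `end` exclusive.
    Tokens that match no dictionary phrase are left unlabeled.
    """
    lowered = [t.lower() for t in tokens]
    segments: List[Tuple[int, int, str]] = []
    i = 0
    while i < len(lowered):
        match = None
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(max_len, len(lowered) - i), 0, -1):
            label = dictionary.get(tuple(lowered[i:i + n]))
            if label is not None:
                match = (i, i + n, label)
                break
        if match is not None:
            segments.append(match)
            i = match[1]  # skip past the matched segment
        else:
            i += 1  # no match starting here; move on one token
    return segments


ad_tokens = "Acme running shoes with free shipping".split()
print(dictionary_tag(ad_tokens))
```

A lookup like this can only label phrases already in the dictionary and ignores context, which is one plausible reason a learned model trained with constraints might improve on it.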
Sameer Singh, Dustin Hillard, Chris Leggetter
Added 14 Feb 2011
Updated 14 Feb 2011
Type Journal
Year 2010
Where NAACL