ACL
2006

Automatic Construction of Polarity-Tagged Corpus from HTML Documents

This paper proposes a novel method of building a polarity-tagged corpus from HTML documents. The key characteristic of this method is that it is fully automatic and can be applied to arbitrary HTML documents. The idea behind our method is to exploit certain layout structures and linguistic patterns. Using them, we can automatically extract sentences that express opinions. In our experiment, the method constructed a corpus of 126,610 sentences.
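To make the idea of exploiting layout structure concrete, the following is a minimal sketch and not the authors' implementation. It assumes one hypothetical layout pattern (a heading followed by list items) and two hypothetical cue words, "Pros" and "Cons"; the abstract does not specify which structures or patterns the paper actually uses. The class name `PolarityListExtractor` and the function `extract_polarity_sentences` are illustrative only.

```python
from html.parser import HTMLParser

# Hypothetical cue words; the abstract does not list the actual cues.
CUES = {"pros": "positive", "cons": "negative"}
HEADINGS = ("h1", "h2", "h3", "h4")


class PolarityListExtractor(HTMLParser):
    """Collect <li> texts that follow a heading whose text is a cue word."""

    def __init__(self):
        super().__init__()
        self.polarity = None    # polarity implied by the most recent heading
        self.in_heading = False
        self.in_item = False
        self.buffer = []        # text fragments of the current heading or item
        self.tagged = []        # collected (polarity, sentence) pairs

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            self.in_heading = True
            self.buffer = []
        elif tag == "li":
            self.in_item = True
            self.buffer = []

    def handle_data(self, data):
        if self.in_heading or self.in_item:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        # Join fragments and normalize whitespace.
        text = " ".join("".join(self.buffer).split())
        if tag in HEADINGS and self.in_heading:
            self.in_heading = False
            # A non-cue heading resets the polarity to None.
            self.polarity = CUES.get(text.lower().rstrip(":"))
        elif tag == "li" and self.in_item:
            self.in_item = False
            if self.polarity and text:
                self.tagged.append((self.polarity, text))


def extract_polarity_sentences(html):
    """Return (polarity, sentence) pairs found under cue-word headings."""
    parser = PolarityListExtractor()
    parser.feed(html)
    parser.close()
    return parser.tagged


if __name__ == "__main__":
    page = (
        "<h3>Pros</h3><ul><li>Battery lasts long.</li></ul>"
        "<h3>Cons</h3><ul><li>The screen is dim.</li></ul>"
    )
    for polarity, sentence in extract_polarity_sentences(page):
        print(polarity, sentence)
```

Because the sketch relies only on markup and cue words, it needs no manual labeling, which is the property the abstract emphasizes. A real system would presumably need more patterns and some filtering of non-opinion items.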
Nobuhiro Kaji, Masaru Kitsuregawa
Added 30 Oct 2010
Updated 30 Oct 2010
Type Conference
Year 2006
Where ACL
Authors Nobuhiro Kaji, Masaru Kitsuregawa