Examining the Content Load of Part of Speech Blocks for Information Retrieval



We investigate the connection between part of speech (POS) distribution and content in language. We define POS blocks to be groups of parts of speech. We hypothesise that there exists a directly proportional relation between the frequency of POS blocks and their content salience. We also hypothesise that the class membership of the parts of speech within such blocks reflects the content load of the blocks, on the basis that open class parts of speech are more content-bearing than closed class parts of speech. We test these hypotheses in the context of Information Retrieval, by syntactically representing queries, and removing from them content-poor blocks, in line with the aforementioned hypotheses. For our first hypothesis, we induce POS distribution information from a corpus, and approximate the probability of occurrence of POS blocks as per two statistical estimators separately. For our second hypothesis, we use simple heuristics to estimate the content load within POS blocks. We us...
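The query-reduction idea described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' method: a POS block is taken to be a contiguous window of `n` tags, block probability is a plain relative-frequency estimate (the abstract does not name its two estimators), the open-class test uses Penn Treebank-style tag prefixes, and the function names and the `min_prob` threshold are hypothetical.

```python
from collections import Counter

# Assumption: Penn Treebank-style prefixes for nouns, verbs, adjectives, adverbs.
OPEN_CLASS_PREFIXES = ("NN", "VB", "JJ", "RB")


def pos_blocks(tags, n):
    """Return every contiguous block of n POS tags in a tag sequence."""
    return [tuple(tags[i:i + n]) for i in range(len(tags) - n + 1)]


def block_probabilities(tagged_corpus, n):
    """Relative-frequency estimate of each block's probability of occurrence."""
    counts = Counter()
    for sentence_tags in tagged_corpus:
        counts.update(pos_blocks(sentence_tags, n))
    total = sum(counts.values())
    return {block: count / total for block, count in counts.items()}


def open_class_ratio(block):
    """Simple heuristic: fraction of tags in the block that are open class."""
    return sum(tag.startswith(OPEN_CLASS_PREFIXES) for tag in block) / len(block)


def reduce_query(tagged_query, probs, n, min_prob):
    """Keep only the words covered by at least one block with probability >= min_prob."""
    words = [word for word, _ in tagged_query]
    tags = [tag for _, tag in tagged_query]
    keep = set()
    for start, block in enumerate(pos_blocks(tags, n)):
        if probs.get(block, 0.0) >= min_prob:
            keep.update(range(start, start + n))
    return [words[i] for i in sorted(keep)]
```

Under the first hypothesis, rare blocks are treated as content-poor, so `reduce_query` drops words that appear only in blocks below the threshold. `open_class_ratio` could be used in the same way as an alternative score for the second hypothesis.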

Christina Lioma, Iadh Ounis


ACL 2006 | ACL 2007 | Class Parts | Content Load | POS Blocks


Post Info

Added	30 Oct 2010
Updated	30 Oct 2010
Type	Conference
Year	2006
Where	ACL
Authors	Christina Lioma, Iadh Ounis
