Keyphrase Extraction in Scientific Publications



Abstract. We present a keyphrase extraction algorithm for scientific publications. Unlike previous work, we introduce features that capture the positions of phrases in a document with respect to the logical sections found in scientific discourse. We also introduce features that capture salient morphological phenomena found in scientific keyphrases, such as whether a candidate keyphrase is an acronym or uses specific terminologically productive suffixes. We have implemented these features on top of a baseline feature set used by Kea [1]. In our evaluation using a corpus of 120 scientific publications multiply annotated for keyphrases, our system significantly outperformed Kea at the p < .05 level. As we know of no other existing multiply annotated keyphrase document collections, we have also made our evaluation corpus publicly available. We hope that this contribution will spur future comparative research.
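
To make the two feature families in the abstract more concrete, here is a minimal Python sketch of how a candidate phrase might be mapped to section-position and morphological features. It is an illustration under stated assumptions, not the authors' implementation: the suffix list, the section labels, the acronym pattern, and every function name below are hypothetical, since the abstract does not enumerate them.

```python
import re

# Hypothetical suffix list and section labels, chosen for illustration only.
# The abstract does not say which suffixes or sections the authors used.
PRODUCTIVE_SUFFIXES = ("ion", "ics", "ment", "ity")
SECTIONS = ("title", "abstract", "introduction", "conclusion")


def is_acronym(phrase):
    """Crude check: one token, starting with an uppercase letter,
    followed by one or more uppercase letters or digits."""
    return bool(re.fullmatch(r"[A-Z][A-Z0-9]+", phrase))


def has_productive_suffix(phrase):
    """True if the last word of the phrase ends with an assumed suffix."""
    words = phrase.lower().split()
    return bool(words) and words[-1].endswith(PRODUCTIVE_SUFFIXES)


def section_occurrence(phrase, sections):
    """One 0/1 flag per assumed section label: does the phrase occur
    (case-insensitive substring match) in that section's text?"""
    needle = phrase.lower()
    return [int(needle in sections.get(name, "").lower()) for name in SECTIONS]


def extra_features(phrase, sections):
    """Bundle the sketched features for one candidate phrase."""
    return {
        "is_acronym": is_acronym(phrase),
        "has_suffix": has_productive_suffix(phrase),
        "section_vector": section_occurrence(phrase, sections),
    }


if __name__ == "__main__":
    doc = {
        "title": "Keyphrase Extraction in Scientific Publications",
        "abstract": "We present a keyphrase extraction algorithm.",
    }
    print(extra_features("keyphrase extraction", doc))
```

In a setup like the one the abstract describes, features of this kind would be appended to a baseline feature set and passed to a classifier; how the sections are detected and how the features are weighted is not specified in the abstract and is left out here.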

Thuy Dung Nguyen, Min-Yen Kan


Education | ICADL 2007 | Keyphrase Extraction Algorithm | Multiply Annotated Keyphrase | Scientific Publications


Post Info

Added	16 Aug 2010
Updated	16 Aug 2010
Type	Conference
Year	2007
Where	ICADL
Authors	Thuy Dung Nguyen, Min-Yen Kan
