Automated extraction of bibliographic information from journal articles is key to the affordable creation and maintenance of citation databases, such as MEDLINE
Xiaoli Zhang, Jie Zou, Daniel X. Le, George R. Tho...
In this paper, we propose a new user interface to interactively specify Web wrappers to extract relational information from Web documents. In this study, we focused on improving u...
Challenging the implicit reliance on document collections, this paper discusses the pros and cons of using query logs rather than document collections, as self-contained sources o...
In this paper, we propose a machine learning approach to title extraction from general documents. By general documents, we mean documents that can belong to any one of a number of...
Yunhua Hu, Hang Li, Yunbo Cao, Dmitriy Meyerzon, Q...
In this paper, we address a unique problem in Chinese language processing and report on our study on extending a Chinese thesaurus with region-specific words, mostly from the fina...