Review assignment is a common task that many people such as conference organizers, journal editors, and grant administrators would have to do routinely. As a computational problem...
Maryam Karimzadehgan, ChengXiang Zhai, Geneva G. B...
Finding latent patterns in high dimensional data is an important research problem with numerous applications. The most well known approaches for high dimensional data analysis are...
Web Page segmentation is a crucial step for many applications in Information Retrieval, such as text classification, de-duplication and full-text search. In this paper we describe...
The two most important tasks in information extraction from the Web are webpage structure understanding and natural language sentences processing. However, little work has been don...
Chunyu Yang, Yong Cao, Zaiqing Nie, Jie Zhou, Ji-R...
Finding biological entities (such as genes or proteins) that satisfy certain conditions from texts is an important and challenging task in biomedical information retrieval and tex...
We developed and tested a heuristic technique for extracting the main article from news site Web pages. We construct the DOM tree of the page and score every node based on the amo...
MetaMap is an online application that allows mapping text to UMLS Metathesaurus concepts, which is very useful interoperability among different languages and systems within the bi...
Search engine logs are an emerging new type of data that offers interesting opportunities for data mining. Existing work on mining such data has mostly attempted to discover knowl...