Most databases contain “name constants” like course numbers, personal names, and place names that correspond to entities in the real world. Previous work in integration of het...
It has long been recognized that capturing term relationships is an important aspect of information retrieval. Even with large amounts of data, we usually only have significant ev...
Duplicate URLs have brought serious troubles to the whole pipeline of a search engine, from crawling, indexing, to result serving. URL normalization is to transform duplicate URLs...
Tao Lei, Rui Cai, Jiang-Ming Yang, Yan Ke, Xiaodon...
Background: Significant parts of biological knowledge are available only as unstructured text in articles of biomedical journals. By automatically identifying gene and gene produc...
Many testing and analysis techniques use finite state models to validate and verify the quality of software systems. Since the specification of such models is complex and timecons...