A model of web search with double checking is proposed to explore the web as a live corpus. Five association measures, including variants of Dice, Overlap Ratio, Jaccard, and Cosine, ...
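For orientation, the four named measures can be sketched in their standard set-based forms. This is only an illustration under stated assumptions: the abstract refers to variants of these measures, and the truncated text does not say how those variants are defined, so the code below is not the paper's method. The function name `association_measures` and the use of Python sets of document identifiers are hypothetical choices made for this sketch.

```python
import math


def association_measures(x: set, y: set) -> dict:
    """Standard set-based association measures between two sets.

    Assumption: x and y are sets of items (for example, identifiers of
    documents in which each term occurs). These are the textbook
    definitions, not the variants proposed in the paper.
    """
    # If either set is empty, every measure is defined here as 0.0
    # to avoid division by zero.
    if not x or not y:
        return {"dice": 0.0, "overlap": 0.0, "jaccard": 0.0, "cosine": 0.0}

    inter = len(x & y)  # number of shared items
    return {
        # Dice: twice the intersection over the sum of the set sizes
        "dice": 2 * inter / (len(x) + len(y)),
        # Overlap ratio: intersection over the smaller set size
        "overlap": inter / min(len(x), len(y)),
        # Jaccard: intersection over union
        "jaccard": inter / len(x | y),
        # Cosine: intersection over the geometric mean of the set sizes
        "cosine": inter / math.sqrt(len(x) * len(y)),
    }


scores = association_measures({1, 2, 3, 4}, {3, 4, 5, 6})
```

With two four-element sets sharing two items, Dice, Overlap, and Cosine all come out to 0.5, while Jaccard is lower (2/6) because it divides by the size of the union.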
To conduct content analysis over text data, one may look out for important named objects and entities that refer to real world instances, synthesizing them into knowledge relevant ...
The ever-increasing size of the biomedical literature makes tapping into the implicit knowledge it contains a necessity for knowledge discovery. In this paper, a semantic par...
Gene and protein names follow few, if any, true naming conventions and are subject to great variation in different occurrences of the same name. This gives rise to two important p...
The entity resolution (ER) problem, which identifies duplicate entities that refer to the same real world entity, is essential in many applications. In this paper, in particular,...
Byung-Won On, Ergin Elmacioglu, Dongwon Lee, Jaewo...