In bibliographies like DBLP and Citeseer, there are three kinds of entity-name problems that need to be solved. First, multiple entities share one name, which is called the name sh...
Recently, the concept of a species containing both core and distributed genes, known as the supra- or pangenome theory, has been introduced. In this paper, we aim to develop a new ...
Name ambiguity is a special case of identity uncertainty where one person can be referenced by multiple name variations in different situations or even share the same name with ot...
Yang Song, Jian Huang 0002, Isaac G. Councill, Jia...
This paper proposes a nonparametric Bayesian method for exploratory data analysis and feature construction in continuous time series. Our method focuses on understanding shared fe...
The aim of query-based sampling is to obtain a sufficient, representative sample of an underlying (text) collection. Current measures for assessing sample quality are too coarse gr...