Effective and scalable solutions for mixed and split citation problems in digital libraries


In this paper, we consider two important problems that commonly occur in bibliographic digital libraries and seriously degrade their data quality: the Mixed Citation (MC) problem (i.e., citations of different scholars whose names are homonyms are mixed together) and the Split Citation (SC) problem (i.e., citations of the same author appear under different name variants). In particular, we investigate an effective yet scalable solution, since the citation collections in such digital libraries tend to be large. After formally defining the problems and the accompanying challenges, we present an effective solution based on a state-of-the-art sampling-based approximate join algorithm. Our claim is verified through preliminary experimental results.
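
To make the Split Citation problem concrete, the following is a minimal sketch of how candidate name variants might be flagged. It is not the paper's sampling-based approximate join; it swaps in a simple blocking step (last name plus first initial) followed by Jaccard similarity over coauthor sets. The function names, the blocking key, the 0.5 threshold, and the toy records are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def blocking_key(name):
    """Hypothetical blocking key: last token plus first initial, lowercased."""
    tokens = name.replace(".", " ").split()
    return (tokens[-1].lower(), tokens[0][0].lower())


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def split_citation_candidates(coauthors_by_name, threshold=0.5):
    """Return name pairs that share a block and have similar coauthor sets."""
    # Group names so that only plausible variants are compared pairwise.
    blocks = defaultdict(list)
    for name in coauthors_by_name:
        blocks[blocking_key(name)].append(name)

    pairs = []
    for names in blocks.values():
        for x, y in combinations(sorted(names), 2):
            if jaccard(coauthors_by_name[x], coauthors_by_name[y]) >= threshold:
                pairs.append((x, y))
    return sorted(pairs)


# Toy data, invented for illustration only (assumed, not from the paper).
records = {
    "Jane Q. Doe": {"A. Smith", "B. Jones", "C. Wu"},
    "J. Doe": {"A. Smith", "B. Jones"},
    "John Doe": {"X. Yang", "Y. Kim"},
}
candidates = split_citation_candidates(records)
```

In this toy setting all three names fall into the same block, but only the two entries with overlapping coauthors are paired, which is the kind of evidence a split-citation resolver could use. Comparing every pair inside a block is quadratic in block size, which is presumably why a scalable approach would rely on an approximate join rather than exhaustive comparison.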

Dongwon Lee, Byung-Won On, Jaewoo Kang, Sanghyun Park


Approximate Join Algorithm | Bibliographic Digital Libraries | Digital Libraries | Information System | IQIS 2005


Post Info

Added	26 Jun 2010
Updated	26 Jun 2010
Type	Conference
Year	2005
Where	IQIS
Authors	Dongwon Lee, Byung-Won On, Jaewoo Kang, Sanghyun Park
