Dirty data is a serious problem for businesses, leading to incorrect decision making, inefficient daily operations, and ultimately wasted time and money. Dirty data often ari...
The Whitehead Institute/MIT Center for Genome Research is responsible for a number of large genome mapping efforts, the scale of which creates problems of data and workflow managem...
Lincoln Stein, Andre Marquis, Robert Dredge, Mary ...
Background: Manual curation of biological databases, an expensive and labor-intensive process, is essential for high-quality integrated data. In this paper we report the implement...
Data-intensive e-science applications often rely on third-party data found in public repositories, whose quality is largely unknown. Although scientists are aware that this uncert...
Alun D. Preece, Binling Jin, Paolo Missier, R. Mar...
Accurate and efficient integration of geospatial data is an important problem with applications in areas such as emergency response and urban planning. Some of the key challenges ...