XML is becoming a prevalent format for data exchange. Many XML documents have complex schemas that are not always known, and can vary widely between information sources and applica...
Eugene Agichtein, C. T. Howard Ho, Vanja Josifovsk...
Developing models and methods to manage data vagueness is a current effervescent research field. Some work has been done with supervised problems but unsupervised problems and unce...
The goal of input-output modeling is to apply a test input to a system, analyze the results, and learn something useful from the causeeffect pair. Any automated modeling tool that...
Machine learning often relies on costly labeled data, and this impedes its application to new classification and information extraction problems. This has motivated the developme...
The emergence of online labor markets makes it far easier to use individual human raters to evaluate materials for data collection and analysis in the social sciences. In this pap...