Sciweavers

833 search results - page 105 / 167
» mc 2007
DMIN
2007
186 views, Data Mining
13 years 11 months ago
Cost-Sensitive Learning vs. Sampling: Which is Best for Handling Unbalanced Classes with Unequal Error Costs?
The classifier built from a data set with a highly skewed class distribution generally predicts the more frequently occurring classes much more often than the infrequently occurr...
Gary M. Weiss, Kate McCarthy, Bibi Zabar
IJCAI
2007
13 years 11 months ago
Improving Author Coreference by Resource-Bounded Information Gathering from the Web
Accurate entity resolution is sometimes impossible simply due to insufficient information. For example, in research paper author name resolution, even clever use of venue, title ...
Pallika Kanani, Andrew McCallum, Chris Pal
JDM
2007
99 views
13 years 10 months ago
Semantic Integration and Knowledge Discovery for Environmental Research
Environmental research and knowledge discovery both require extensive use of data stored in various sources and created in different ways for diverse purposes. We describe a new m...
Zhiyuan Chen, Aryya Gangopadhyay, George Karabatis...
IQ
2007
13 years 11 months ago
A Flexible And Generic Data Quality Metamodel
DQ metadata can be stored in a Metadata Repository (MDR). The structure of the MDR should be carefully defined to ensure a maximum amount of flexibility, generality and ease of u...
David Becker, William McMullen, Kevin Hetherington...
ACSW
2007
13 years 11 months ago
Storage and Data Management in EGEE
Distributed management of data is one of the most important problems facing grids. Within the Enabling Grids for E-sciencE (EGEE) project, currently the world’s largest ...
Graeme A. Stewart, David G. Cameron, Greig A. Cowa...