Type-Independent Correction of Sample Selection Bias via Structural Discovery and Re-balancing

Sample selection bias is a common problem in many real-world applications, where training data are collected under practical constraints that make them follow a different distribution from future test data. For example, in hospital clinical studies, it is common practice to build models using eligible volunteers as the training data and then apply those models to the entire population. Because these volunteers are usually not selected at random, the training set may not be drawn from the same distribution as the test set. Such a dataset is said to suffer from "sample selection bias" or "covariate shift". In the past few years, many methods have been proposed to reduce sample selection bias, mainly by statically matching the distribution between training set and test set. In this paper, however, we do not explore the different distributions directly. Instead, we propose to discover the natural structure of the target distribution, by which different ty...
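To make the notion of non-random selection concrete, the following minimal Python sketch simulates a population and a biased training sample drawn from it. It is only an illustration of the problem setting, not the method proposed in the paper. The function names, the uniform feature, and the rule "keep a record with probability equal to its feature value" are all assumptions chosen for simplicity.

```python
import random


def draw_population(n, seed=0):
    """Hypothetical population: one feature x per record, uniform on [0, 1)."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


def select_biased(population, seed=1):
    """Assumed selection rule: a record with feature x volunteers with probability x.

    Records with larger x are over-represented, so the selected sample
    no longer follows the population distribution.
    """
    rng = random.Random(seed)
    return [x for x in population if rng.random() < x]


def mean(values):
    return sum(values) / len(values)


population = draw_population(10_000)
training = select_biased(population)

population_mean = mean(population)
training_mean = mean(training)

print(f"population size: {len(population)}, mean x: {population_mean:.3f}")
print(f"training size:   {len(training)}, mean x: {training_mean:.3f}")
```

Under this assumed rule the selected sample has density proportional to x, so its mean feature value sits near 2/3 while the population mean sits near 1/2. A model fitted only on the selected records therefore sees a shifted feature distribution, which is the situation the abstract describes.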

Jiangtao Ren, Xiaoxiao Shi, Wei Fan, Philip S. Yu

Data Mining | Sample Selection | Sample Selection Bias | Sample Selection Biases | SDM 2008 |

Post Info

Added	30 Oct 2010
Updated	30 Oct 2010
Type	Conference
Year	2008
Where	SDM
Authors	Jiangtao Ren, Xiaoxiao Shi, Wei Fan, Philip S. Yu
