Sciweavers

» Genetic Process Mining
KDD
2009
ACM
Data Mining
Efficiently learning the accuracy of labeling sources for selective sampling
Many scalable data mining tasks rely on active learning to provide the most useful accurately labeled instances. However, what if there are multiple labeling sources (`oracles...
Pinar Donmez, Jaime G. Carbonell, Jeff Schneider
KDD
2008
ACM
Data Mining
De-duping URLs via rewrite rules
A large fraction of the URLs on the web contain duplicate (or near-duplicate) content. De-duping URLs is an extremely important problem for search engines, since all the principal...
Anirban Dasgupta, Ravi Kumar, Amit Sasturkar
KDD
2007
ACM
Data Mining
Cleaning disguised missing data: a heuristic approach
In some applications such as filling in a customer information form on the web, some missing values may not be explicitly represented as such, but instead appear as potentially va...
Ming Hua, Jian Pei
KDD
2006
ACM
Data Mining
Adaptive Website Design Using Caching Algorithms
Visitors enter a website through a variety of means, including web searches, links from other sites, and personal bookmarks. In some cases the first page loaded satisfies the visi...
Justin Brickell, Inderjit S. Dhillon, Dharmendra S...
KDD
2005
ACM
Data Mining
Improving discriminative sequential learning with rare-but-important associations
Discriminative sequential learning models like Conditional Random Fields (CRFs) have achieved significant success in several areas such as natural language processing, information...
Xuan Hieu Phan, Minh Le Nguyen, Tu Bao Ho, Susumu ...