Sciweavers

1413 search results - page 190 / 283
» Efficient Learning of Semi-structured Data from Queries
SDM
2008
SIAM
15 years 5 months ago
Semantic Smoothing for Bayesian Text Classification with Small Training Data
Bayesian text classifiers commonly suffer from the data sparsity problem, especially when the training set is very small. The frequently used Laplacian...
Xiaohua Zhou, Xiaodan Zhang, Xiaohua Hu
CASCON
2001
15 years 5 months ago
A Pareto model for OLAP view size estimation
On-Line Analytical Processing (OLAP) aims to extract useful information quickly from large amounts of data residing in a data warehouse. To improve the quickness of response to qu...
Thomas P. Nadeau, Toby J. Teorey
SDM
2010
SIAM
15 years 5 months ago
A Robust Decision Tree Algorithm for Imbalanced Data Sets
We propose a new decision tree algorithm, Class Confidence Proportion Decision Tree (CCPDT), which is robust and insensitive to class distribution and generates rules which are st...
Wei Liu, Sanjay Chawla, David A. Cieslak, Nitesh V...
ICDIM
2010
IEEE
15 years 2 months ago
Data mining and automatic OLAP schema generation
Data mining aims at the extraction of previously unidentified information from large databases. It can be viewed as an automated application of algorithms to discover hidden patterns a...
Muhammad Usman, Sohail Asghar, Simon Fong
TKDE
2008
15 years 4 months ago
Text Clustering with Feature Selection by Using Statistical Data
Feature selection is an important method for improving the efficiency and accuracy of text categorization algorithms by removing redundant and irrelevant terms from the ...
Yanjun Li, Congnan Luo, Soon M. Chung