Term signal is an existing text representation that depicts a term as a vector of frequencies of occurrences in a number of user-defined partitions of a document. Although term si...
Supphachai Thaicharoen, Tom Altman, Krzysztof J. C...
The area of imbalanced datasets is still relatively new, and it is known that the use of overall accuracy is not an appropriate evaluation measure for imbalanced datasets, because...
LBR is a highly accurate classification algorithm, which lazily constructs a single Bayesian rule for each test instance at classification time. However, its computational complex...
Complex network analysis involves the study of the properties of various real world networks. In this broad field, research on community structures forms an important sub area. Th...
Growing up is in large measure learning about the world and our social and linguistic environment. We might call this data mining, although it is far more multimodal and immersive...
Predicting stock market movements is always difficult. Investors try to guess a stock's behavior, but it often backfires. Thumb rules and intuition seems to be the major indi...
Estimating the rate at which events happen has been studied under various guises and in different settings. We are interested in the specific case of consumerinitiated events or t...
This paper proposes a clustering approach that explores both the content and the structure of XML documents for determining similarity among them. Assuming that the content and th...
Exploratory data mining is fundamental to fostering an appreciation of complex datasets. For large and continuously growing datasets, such as obtained by regular sampling of an or...