We consider the the problem of tracking heavy hitters and quantiles in the distributed streaming model. The heavy hitters and quantiles are two important statistics for characteri...
SimPoint is a technique used to pick what parts of the program’s execution to simulate in order to have a complete picture of execution. SimPoint uses data clustering algorithms...
Segmentation of continuous audio is important for speaker indexing. The reliability of models in speaker indexing depends much on segmentation. Commonly used method is based on th...
Real-world data is never perfect and can often suffer from corruptions (noise) that may impact interpretations of the data, models created from the data and decisions made based on...
Tree models are valuable tools for predictive modeling and data mining. Traditional tree-growing methodologies such as CART are known to suffer from problems including greediness,...