We present a new method to estimate the intrinsic dimensionality of a submanifold M in Rd from random samples. The method is based on the convergence rates of a certain U-statisti...
In dealing with large datasets the reduced support vector machine (RSVM) was proposed for the practical objective to overcome the computational difficulties as well as to reduce t...
— Monitoring the traffic in high-speed networks is a data intensive problem. Uniform packet sampling is the most popular technique for reducing the amount of data the network mo...
Random sampling is a well-known technique for approximate processing of large datasets. We introduce a set of algorithms for incremental maintenance of large random samples on seco...
Databases are a key technology for molecular biology which is a very data intensive discipline. Since molecular biological databases are rather heterogeneous, unification and data...