Gradient Boosted Regression Trees (GBRT) are the current state-of-the-art learning paradigm for machine learned websearch ranking — a domain notorious for very large data sets. ...
Stephen Tyree, Kilian Q. Weinberger, Kunal Agrawal...
Astronomy increasingly faces the issue of massive datasets. For instance, the Sloan Digital Sky Survey (SDSS) has so far generated tens of millions of images of distant galaxies, ...
Brigham Anderson, Andrew W. Moore, Andrew Connolly...
In this paper we address the problem of selecting variables or features in a regression model in the presence of both additive (vertical) and leverage outliers. Since variable sel...
This paper offers a local distributed algorithm for multivariate regression in large peer-to-peer environments. The algorithm is designed for distributed inferencing, data compact...
Abstract. The much-publicized Netflix competition has put the spotlight on the application domain of collaborative filtering and has sparked interest in machine learning algorithms...