Kernel descriptors provide a unified way to generate rich visual feature sets by turning pixel attributes into patch-level features, and yield impressive results on many object recognition tasks. However, the best results with kernel descriptors are achieved using efficient match kernels in conjunction with nonlinear SVMs, which makes them impractical for large-scale problems. In this paper, we propose hierarchical kernel descriptors, which apply kernel descriptors recursively to form image-level features and thus provide a conceptually simple and consistent way to generate image-level features from pixel attributes. More importantly, hierarchical kernel descriptors allow linear SVMs to yield state-of-the-art accuracy while remaining scalable to large datasets. They can also be naturally extended to extract features from depth images. We evaluate hierarchical kernel descriptors on both the CIFAR10 dataset and the new RGB-D Object Dataset consisting of segmented RGB and depth images of 300 everyday...