Object detection in cluttered, natural scenes has a high
complexity since many local observations compete for object
hypotheses. Voting methods provide an efficient solution
to this problem. When Hough voting is extended to location
and scale, votes naturally become lines through scale
space due to the local scale-location-ambiguity. In contrast
to this, current voting methods stick to the location-only setting
and cast point votes, which require local estimates of
scale. Rather than searching for object hypotheses in the
Hough accumulator, we propose a weighted, pairwise clustering
of voting lines to obtain globally consistent hypotheses
directly. In essence, we propose a hierarchical approach
that is based on a sparse representation of object boundary
shape. Clustering of voting lines (CVL) condenses the information
from these edge points in few, globally consistent
candidate hypotheses. A final verification stage concludes
by refining the candidates. Experiments on t...