Stackless traversal techniques are often used to circumvent memory bottlenecks: they avoid a stack and replace the return traversal with extra computation. This paper addresses whether stackless traversal approaches are useful on newer hardware and technology (such as CUDA). To this end, we present a novel stackless approach for implicit kd-trees that exploits the benefits of index-based node traversal. This approach, which we term Kd-Jump, enables the traversal to return immediately to the next valid node, like a stack, without incurring the extra node visitation of kd-restart. Kd-Jump also requires no global memory (stack) at all, only a small matrix in fast constant memory. We report that Kd-Jump outperforms a stack by 10 to 20% and kd-restart by 100%. We also present a Hybrid Kd-Jump, which utilizes a volume stepper for leaf testing and a run-time depth threshold to define where kd-tree traversal stops and volume-stepping oc...
David M. Hughes, Ik Soo Lim
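To make the idea of index-based node traversal concrete, the following minimal sketch shows how a complete binary tree with implicit, heap-style node indices can be walked depth-first without a stack. It is an illustration only and not the Kd-Jump algorithm itself: the indexing scheme (root at 1, children of node i at 2i and 2i + 1) and the helper names `next_subtree` and `leaves_in_order` are assumptions introduced here, and the sketch visits every leaf rather than pruning subtrees against a ray.

```python
def next_subtree(i: int) -> int:
    """Return the root index of the next unvisited subtree after node i, or 0 if none.

    Assumes heap-style indexing (hypothetical for this sketch): root = 1,
    children of i are 2*i and 2*i + 1, parent is i >> 1.
    """
    # Isolate the lowest zero bit of i. Its position equals the number of
    # trailing one bits, i.e. how many consecutive right-child links to climb.
    lowest_zero = ~i & (i + 1)
    up = lowest_zero.bit_length() - 1
    i >>= up  # jump past every ancestor that was reached via a right-child link
    # Step to the right sibling, or report completion once we climb past the root.
    return i + 1 if i else 0


def leaves_in_order(depth: int) -> list[int]:
    """Enumerate leaf indices of a complete tree depth-first, left to right, without a stack."""
    first_leaf = 1 << depth
    visited = []
    i = 1
    while i:
        while i < first_leaf:  # descend to the leftmost leaf of this subtree
            i *= 2
        visited.append(i)
        i = next_subtree(i)
    return visited


if __name__ == "__main__":
    print(leaves_in_order(2))
```

Because the return step is a single shift and increment on the node index, no per-ray stack is needed in this toy setting. A ray traversal would additionally have to record which far children are still pending, which this sketch does not attempt.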