The wide availability of commodity multi-core systems presents an opportunity to address the latency issues that have plagued XML query processing. However, simply executing multiple XML queries over multiple cores addresses only throughput: intra-query parallelization is needed to exploit multiple processing cores for better latency. Toward this end, this paper investigates the parallelization of individual XPath queries over shared-address-space multi-core processors. Much previous work on parallelizing XPath targeted a distributed setting and therefore did not exploit the shared-memory parallelism of multi-core systems. We propose a novel, end-to-end parallelization framework that determines the optimal way of parallelizing an XML query. This decision is statistics-based, relying on both the query specifics and the data statistics. At each stage of the parallelization process, we evaluate three alternative approaches, namely, data-, query-, and hybridpartition...