Thread-level speculative execution is a technique that makes it possible for a wider range of single-threaded applications to make use of the processing resources in a chip multiprocessor. We consider module-level speculation, i.e., speculative threads executing the code after a module (i.e., a procedure, function, or method) call. Unfortunately, previous studies have shown that indiscriminate module-level speculation results in significant overheads, mainly due to frequent misspeculations. In addition to hurting performance, excessive overhead is harmful from a resource usage and energy efficiency standpoint. We show that the overhead when spawning speculative threads for all module continuations is on average three times as big as the time spent on useful execution on our baseline 8-way chip multiprocessor. In this paper, we present and make a detailed evaluation of a technique that aims at reducing the overhead associated with misspeculations. History-based prediction is used in ...