The classical direct product theorem for circuits says that if a Boolean function f : {0,1}^n → {0,1} is somewhat hard to compute on average by small circuits, then the corresponding k-wise direct product function f^k(x_1, ..., x_k) = (f(x_1), ..., f(x_k)) (where each x_i ∈ {0,1}^n) is significantly harder to compute on average by slightly smaller circuits. We prove a fully uniform version of the direct product theorem with information-theoretically optimal parameters, up to constant factors. Namely, we show that for given k and ε, there is an efficient randomized algorithm A with the following property. Given a circuit C that computes f^k on at least an ε fraction of inputs, the algorithm A outputs, with probability at least 3/4, a list of O(1/ε) circuits such that at least one of the circuits on the list computes f on more than a 1 − δ fraction of inputs, for δ = O((log 1/ε)/k); moreover, each output circuit is an AC^0 circuit (of size poly(n, k, log 1/δ, 1/ε)), with oracle access to ...
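To make the definition of f^k and the relation between ε and δ concrete, here is a minimal Python sketch on a toy instance. It is not the algorithm A from the abstract. The choice of f as parity on n = 4 bits, the approximating function g, and the helper names `direct_product` and `agreement` are all assumptions made for illustration. The sketch only checks the easy direction: if g agrees with f on a 1 − δ fraction of inputs, then applying g coordinate-wise agrees with f^k on a (1 − δ)^k ≤ e^(−δk) fraction of k-tuples, so δ ≤ ln(1/ε)/k.

```python
import itertools
import math


def direct_product(f, k):
    """Return f^k, mapping (x_1, ..., x_k) to (f(x_1), ..., f(x_k))."""
    def fk(xs):
        assert len(xs) == k
        return tuple(f(x) for x in xs)
    return fk


def agreement(g, h, domain):
    """Fraction of points in `domain` on which g and h give the same value."""
    domain = list(domain)
    return sum(g(x) == h(x) for x in domain) / len(domain)


# Hypothetical toy instance (not from the paper): n = 4 bits, f = parity.
n, k = 4, 3
inputs = list(itertools.product((0, 1), repeat=n))


def f(x):
    return sum(x) % 2


def g(x):
    # An approximation to f that errs on exactly two of the 16 inputs.
    return 1 - f(x) if x in (inputs[0], inputs[1]) else f(x)


# delta: fraction of single inputs on which g disagrees with f.
delta = 1 - agreement(f, g, inputs)

# eps: fraction of k-tuples on which g^k agrees with f^k in every coordinate.
fk, gk = direct_product(f, k), direct_product(g, k)
tuples = list(itertools.product(inputs, repeat=k))
eps = agreement(fk, gk, tuples)

print(f"delta = {delta}, eps = {eps}, ln(1/eps)/k = {math.log(1 / eps) / k}")
```

In this toy instance the coordinates are independent, so eps equals (1 − delta)^k exactly. The theorem in the abstract concerns the much harder converse: recovering a good circuit for f from an arbitrary circuit C that succeeds on an ε fraction of f^k.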