While ITG has many desirable properties for word alignment, it still suffers from the limitation of one-to-one matching. While existing approaches relax this limitation using phrase pairs, we propose a ITG formalism, which even handles units of non-contiguous words, using both simple and hierarchical phrase pairs. We also propose a parameter estimation method, which combines the merits of both supervised and unsupervised learning, for the ITG formalism. The ITG alignment system achieves significant improvement in both word alignment quality and translation performance.