Hierarchical phrase-based translation (Hiero, (Chiang, 2005)) provides an attractive framework within which both short- and longdistance reorderings can be addressed consistently and ef ciently. However, Hiero is generally implemented with a constraint preventing the creation of rules with adjacent nonterminals, because such rules introduce computational and modeling challenges. We introduce methods to address these challenges, and demonstrate that rules with adjacent nonterminals can improve Hiero's generalization power and lead to signi cant performance gains in Chinese-English translation.