Digital signal processors provide specialized SIMD (single instruction multiple data) operations designed to dramatically increase performance in embedded systems. While these operations are simple to understand, their unusual functions and their parallelism make it difficult for automatic code generation algorithms to use them effectively. In this paper, we present a new optimizing code generation method that can deploy these operations successfully while also verifying that the generated code is a correct translation of the input program.