In this paper we address the question: Is water-filling appropriate for watermarking? The water-filling paradigm is the traditional solution to maximizing the capacity of parallel zero-mean additive white Gaussian noise channels subject to a signal energy constraint. In this work, we take an information-theoretic approach to analyze the watermark communication problem in the presence of perceptual coding. Our effective watermark channel is modeled as a set of parallel, independent additive noise channels whose noise is zero-mean and uniformly distributed. We identify energy allocation principles that maximize capacity. Our findings are compared with the traditional water-filling solution and shed light on strategies for maximizing the data hiding rate in the presence of compression.
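For readers unfamiliar with the baseline, the following is a minimal sketch of classical water-filling for parallel Gaussian channels only. It does not implement the uniform-noise allocation studied in this work. The function name `water_filling` and the parameters `noise_var` (per-channel noise variances, assumed strictly positive) and `total_energy` (the signal energy budget) are illustrative assumptions, not notation from the paper.

```python
import numpy as np

def water_filling(noise_var, total_energy):
    """Classical water-filling over parallel Gaussian channels (illustrative).

    Returns the per-channel energy allocation and the resulting capacity
    in bits per channel use, summed over channels.
    """
    noise_var = np.asarray(noise_var, dtype=float)
    ordered = np.sort(noise_var)  # quietest channels first

    # Try using the k quietest channels, starting from all of them, and
    # keep the first k whose water level exceeds the noisiest active channel.
    for k in range(len(ordered), 0, -1):
        level = (total_energy + ordered[:k].sum()) / k
        if level > ordered[k - 1]:
            break

    # Channels whose noise lies above the water level receive no energy.
    energy = np.maximum(level - noise_var, 0.0)
    capacity = 0.5 * np.sum(np.log2(1.0 + energy / noise_var))
    return energy, capacity

energy, capacity = water_filling([1.0, 2.0, 5.0], total_energy=3.0)
```

In this example the water level settles at 3, so the two quieter channels receive energies 2 and 1 and the noisiest channel is left unused. Whether the same "favor the quiet channels" behavior carries over to uniformly distributed noise is the question the paper examines.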