Written documents created through dictation differ significantly from a true verbatim transcript of the recorded speech. This poses an obstacle for automatic dictation systems, because speech recognition output must undergo a fair amount of editing to turn it into a document that complies with customary standards. We present an approach that attempts to perform this editing step, from recognized words to final document, automatically by learning the appropriate transformations from example documents. The approach addresses, in an integrated way, a number of problems that have so far been studied independently, in particular automatic punctuation, text segmentation, error correction, and disfluency repair. We study two learning methods, one based on rule induction and one based on a probabilistic sequence model. Quantitative evaluation shows that the probabilistic method is the more accurate of the two.