This paper describes recent advances in hidden Markov model (HMM) based OCR for machine-printed Arabic documents. A combination of script-independent and script-specific techniques is applied to the glyph models and language models (LMs). The script-independent techniques are higher-order n-gram LMs for N-best rescoring and discriminative estimation of glyph HMMs. The Arabic-specific techniques include context-dependent HMMs for glyph modeling and Parts-of-Arabic-Words in language modeling. We present experimental results that demonstrate a 40% relative reduction in word error rate over the baseline configuration on a corpus of machine-printed Arabic documents.
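As an illustration of N-best rescoring with a higher-order n-gram LM, the following minimal sketch re-ranks decoder hypotheses by adding a weighted trigram LM log-probability to each hypothesis's glyph-model score. It is not the system described in this paper: the function names (`make_toy_trigram_lm`, `rescore_nbest`), the `lm_weight` and `word_penalty` parameters, the fixed floor for unseen trigrams, and all scores are hypothetical and chosen only for the example.

```python
def make_toy_trigram_lm(trigram_logprobs, floor=-10.0):
    """Build a toy trigram LM from a dict of log-probabilities.

    Assumption: unseen trigrams get a fixed floor log-probability
    instead of a real back-off or smoothing scheme.
    """
    def lm_logprob(words):
        padded = ["<s>", "<s>"] + list(words) + ["</s>"]
        total = 0.0
        for i in range(2, len(padded)):
            trigram = tuple(padded[i - 2:i + 1])
            total += trigram_logprobs.get(trigram, floor)
        return total
    return lm_logprob


def rescore_nbest(hypotheses, lm_logprob, lm_weight=1.0, word_penalty=0.0):
    """Re-rank N-best hypotheses by a combined log-score.

    hypotheses: list of (words, glyph_score) pairs, where glyph_score is
    the (hypothetical) log-likelihood assigned by the glyph HMM decoder.
    Returns a list of (words, combined_score), best first.
    """
    scored = []
    for words, glyph_score in hypotheses:
        combined = (glyph_score
                    + lm_weight * lm_logprob(words)
                    + word_penalty * len(words))
        scored.append((words, combined))
    # Higher log-score is better.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored


# Toy example: the decoder slightly prefers the second hypothesis,
# but the trigram LM strongly prefers the first.
toy_lm = make_toy_trigram_lm({
    ("<s>", "<s>", "the"): -1.0,
    ("<s>", "the", "cat"): -1.0,
    ("the", "cat", "</s>"): -1.0,
})
nbest = [(["the", "cat"], -5.0), (["the", "cap"], -4.8)]
ranked = rescore_nbest(nbest, toy_lm)
```

In this toy case the rescored list places `["the", "cat"]` first even though its glyph score is lower, which is the intended effect of rescoring: the first-pass decoder can use a cheaper LM, and the more expensive higher-order LM is applied only to the short N-best list.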