This paper proposes a three-dimensional (3D) body scanning system that uses passive stereo vision with a robot arm. The 3D body scanning systems reported so far employ active 3D measurement methods. However, active methods rely on structured illumination or laser scanning, which is undesirable in many systems applied to humans. A major problem with using passive stereo vision for 3D measurement is its low accuracy. In addition, multiple stereo images captured from different viewpoints are necessary to cover the whole body at an appropriate distance. To address these problems, we have developed an eye-in-hand system based on passive stereo vision, in which a phase-based image matching technique is employed for sub-pixel disparity estimation. Through a set of experiments, we demonstrate that the proposed system can capture the 3D shape of the human body with high quality.
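
To illustrate what phase-based matching with sub-pixel disparity estimation can look like, the following is a minimal sketch and not the system's actual implementation. It assumes a one-dimensional phase-only correlation between two corresponding scanline segments, a Hann window to reduce boundary effects, and parabolic interpolation of the correlation peak. The function name `poc_shift_1d`, the window choice, and the interpolation scheme are all assumptions made for illustration.

```python
import numpy as np


def poc_shift_1d(f, g, eps=1e-12):
    """Estimate d such that g[n] ~ f[n - d] using 1D phase-only correlation.

    Hypothetical helper for illustration; window and peak-fitting choices
    are assumptions, not taken from the paper.
    """
    f = np.asarray(f, dtype=float)
    g = np.asarray(g, dtype=float)
    n = f.size

    # Hann window reduces the discontinuity at the segment boundaries.
    window = np.hanning(n)
    F = np.fft.fft(f * window)
    G = np.fft.fft(g * window)

    # Normalized cross-power spectrum: keep the phase, discard the magnitude.
    cross = G * np.conj(F)
    cross /= np.abs(cross) + eps

    # The correlation function peaks at the displacement between f and g.
    r = np.real(np.fft.ifft(cross))
    peak = int(np.argmax(r))

    # Parabolic fit through the peak and its two (circular) neighbours
    # gives a sub-sample refinement of the peak location.
    y0, y1, y2 = r[(peak - 1) % n], r[peak], r[(peak + 1) % n]
    denom = y0 - 2.0 * y1 + y2
    offset = 0.5 * (y0 - y2) / denom if denom != 0 else 0.0

    shift = peak + offset
    # Map shifts beyond half the segment length to negative displacements.
    if shift > n / 2:
        shift -= n
    return shift


# Usage: a synthetic signal displaced by 5 samples.
rng = np.random.default_rng(0)
left = rng.standard_normal(128)
right = np.roll(left, 5)
estimated = poc_shift_1d(left, right)
```

In a rectified stereo pair, such a displacement estimated along corresponding scanlines would play the role of the disparity. The parabolic fit is only a simple stand-in for the peak model; a fit matched to the true shape of the correlation peak would be expected to give less biased sub-pixel estimates.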