A new method for finding people in video images is presented. Detection is based on a novel background modeling and subtraction approach that uses both color and edge information. We introduce confidence maps (gray-scale images whose intensity is a function of our confidence that a pixel has changed) to fuse intermediate results and to represent the results of background subtraction. The background subtraction result is then used to delineate a person's body by guiding contour collection to segment the person from the background. The method is tolerant of scene clutter, slow illumination changes, and camera noise, and runs in near real time on a standard platform.
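
To make the notion of a confidence map concrete, the following is a minimal sketch and not the implementation described in this paper. It assumes a per-pixel background model given by a mean and a standard deviation, maps the normalized deviation of each pixel linearly onto the range 0 to 255 between two hypothetical thresholds `lo` and `hi`, and fuses a color map with an edge map by a pixelwise maximum. The function names, the thresholds, the linear ramp, and the max fusion rule are all illustrative assumptions.

```python
def confidence_map(frame, bg_mean, bg_std, lo=2.0, hi=10.0):
    """Return a gray-scale map (0..255) of confidence that each pixel changed.

    frame, bg_mean, bg_std are equally sized 2-D lists of numbers.
    lo and hi are hypothetical thresholds, in units of standard deviations:
    deviations at or below lo map to 0, at or above hi map to 255.
    """
    result = []
    for row, mean_row, std_row in zip(frame, bg_mean, bg_std):
        out_row = []
        for value, mean, std in zip(row, mean_row, std_row):
            # Deviation from the background model, normalized by its spread.
            z = abs(value - mean) / max(std, 1e-6)
            # Linear ramp between the two thresholds, clamped to [0, 1].
            c = min(1.0, max(0.0, (z - lo) / (hi - lo)))
            out_row.append(int(255 * c + 0.5))
        result.append(out_row)
    return result


def fuse(map_a, map_b):
    """Fuse two confidence maps by a pixelwise maximum (an assumed rule)."""
    return [[max(a, b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(map_a, map_b)]


if __name__ == "__main__":
    # One-row toy image: an unchanged, a moderately changed, and a strongly
    # changed pixel against a background of mean 100 and deviation 5.
    frame = [[100, 130, 160]]
    mean = [[100, 100, 100]]
    std = [[5, 5, 5]]
    color_conf = confidence_map(frame, mean, std)
    edge_conf = [[50, 0, 10]]  # stand-in for an edge-based confidence map
    print(fuse(color_conf, edge_conf))
```

Under these assumptions, keeping graded confidence values rather than a binary mask defers the thresholding decision until after the color and edge evidence has been combined.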