Visual Hull (VH) construction from silhouette images is a popular method for shape estimation. The method, also known as Shape-From-Silhouette (SFS), is used in many applications such as non-invasive 3D model acquisition, obstacle avoidance, and, more recently, human motion tracking and analysis. One limitation of SFS, however, is that the approximated shape can be very coarse when only a few cameras are available. In this paper, we propose an algorithm that improves the shape approximation by combining multiple silhouette images captured across time. The improvement is achieved by first estimating the rigid motion between the visual hulls formed at different time instants (visual hull alignment) and then combining them (visual hull refinement) to obtain a tighter bound on the object's shape. Our algorithm first constructs a representation of the VHs called the bounding edge representation. Utilizing a fundamental property of visual hulls, which states that each bounding edge must touc...
German K. M. Cheung, Simon Baker, Takeo Kanade
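
The refinement idea in the abstract can be illustrated with a toy 2-D sketch. It is not the bounding edge algorithm of the paper: it assumes a single hypothetical orthographic camera, a small integer grid, and a rigid motion (a 90-degree rotation) that is already known rather than estimated. All function names are made up for illustration. The point it shows is only that silhouettes from two time instants, once expressed in a common object frame, carve a tighter hull than one silhouette alone.

```python
# Toy 2-D shape-from-silhouette with one orthographic camera (illustrative only).

def rotate90(p):
    """Assumed known rigid motion: rotate a point 90 degrees about the origin."""
    x, y = p
    return (-y, x)

def camera(p):
    """Hypothetical orthographic camera looking along the y-axis: keeps only x."""
    return p[0]

def silhouette(points, project):
    """1-D silhouette: the set of image coordinates covered by the object."""
    return {project(p) for p in points}

def visual_hull(grid, views):
    """Keep the grid cells whose projection falls inside every silhouette."""
    return {p for p in grid if all(project(p) in sil for project, sil in views)}

def demo():
    grid = {(x, y) for x in range(-3, 4) for y in range(-3, 4)}
    obj_t1 = {(0, 0), (1, 0), (2, 0), (0, 1)}      # object at time 1
    obj_t2 = {rotate90(p) for p in obj_t1}         # same object after the motion

    sil_t1 = silhouette(obj_t1, camera)
    sil_t2 = silhouette(obj_t2, camera)

    # Alignment step (motion assumed known here): express the time-2 view
    # in the time-1 object frame by composing the camera with the motion.
    def camera_t2_in_t1(p):
        return camera(rotate90(p))

    hull_single = visual_hull(grid, [(camera, sil_t1)])
    hull_refined = visual_hull(
        grid, [(camera, sil_t1), (camera_t2_in_t1, sil_t2)]
    )
    return obj_t1, hull_single, hull_refined

if __name__ == "__main__":
    obj, single, refined = demo()
    print(len(obj), len(single), len(refined))
```

In this sketch the refined hull still contains the object but is a subset of the single-view hull, which is the "tighter bound" the abstract refers to. In the paper the motion must be estimated from the silhouettes themselves, which is the harder part and is not attempted here.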