In recent years, digital images and videos have become increasingly widespread on the Internet and have had a substantial social impact on a wide audience. Meanwhile, advances in technology allow people to easily alter the content of digital multimedia, raising serious concerns about the trustworthiness of online multimedia information. A forensic hash is a short signature that is attached to an image before transmission and serves as side information for analyzing the processing history and trustworthiness of the received image. In this paper, we propose a new construction of forensic hash based on a visual words representation. We encode SIFT features into a compact visual words representation for robust estimation of geometric transformations, and we propose a hybrid construction that uses both SIFT and block-based features to detect and localize image tampering. The proposed hash construction achieves more robust and accurate forensic analysis than prior work.