In this work we present an approach to capture the total semantics in multimedia-multimodal web pages. Our research improves upon the state of the art with two key features: (1) capturing the semantics of text and image-based media for static and dynamic web content; and (2) recognizing that information goals are defined by emergent user behavior and not statically declared by web design alone. Given a user session, the proposed method accurately predicts user information goals and presents them as a list of the most relevant words and images. Conversely, given a set of information goals, the technique predicts possible user navigation patterns as a network flow with a semantically derived flow distribution. In the latter case, differences between the predicted optimal and the observed user navigation patterns highlight points of suboptimal website design. We compare this approach to other content-based techniques for modeling web usage and demonstrate its effectiveness.