The problem of multimodal data mining in a multimedia database can be addressed as a structured prediction problem where we learn the mapping from an input to the structured and interdependent output variables. In this paper, built upon the existing literature on the max margin based learning, we develop a new max margin learning approach called Enhanced Max Margin Learning (EMML) framework. In addition, we apply EMML framework to developing an effective and efficient solution to the multimodal data mining problem in a multimedia database. The main contributions include: (1) we have developed a new max margin learning approach -- the enhanced max margin learning framework that is much more efficient in learning with a much faster convergence rate, which is verified in empirical evaluations; (2) we have applied this EMML approach to developing an effective and efficient solution to the multimodal data mining problem that is highly scalable in the sense that the query response time is i...
Zhen Guo, Zhongfei Zhang, Eric P. Xing, Christos F