We propose a visual event recognition framework for consumer domain videos by leveraging a large amount of loosely labeled web videos (e.g., from YouTube). First, we propose a new...
Based on the local keypoints extracted as salient image patches, an image can be described as a "bag-of-visual-words (BoW)" and this representation has appeared promising ...
Yu-Gang Jiang, Jun Yang, Chong-Wah Ngo, Alexa...
When assessing reported classification results based on selection of members from a database (e.g. a face database), one would like to know what is an achievable classification ra...
Recent work in the field of machine translation (MT) evaluation suggests that sentence level evaluation based on machine learning (ML) can outperform the standard metrics such as B...
Antoine Veillard, Elvina Melissa, Cassandra Theodo...
The detection of faces in images is fundamentally a rare event detection problem. Cascade classifiers provide an efficient computational solution, by leveraging the asymmetry in t...