Movie/Script: Alignment and Parsing of Video and Text Transcription

Download www.cis.upenn.edu

Abstract. Movies and TV are a rich source of diverse and complex video of people, objects, actions and locales "in the wild". Harvesting automatically labeled sequences of actions from video would enable creation of large-scale and highly varied datasets. To enable such collection, we focus on the task of recovering scene structure in movies and TV series for object tracking and action retrieval. We present a weakly supervised algorithm that uses the screenplay and closed captions to parse a movie into a hierarchy of shots and scenes. Scene boundaries in the movie are aligned with screenplay scene labels, and shots are reordered into a sequence of long continuous tracks or threads, which allow for more accurate tracking of people, actions and objects. Scene segmentation, alignment, and shot threading are formulated as inference in a unified generative model and a novel hierarchical dynamic programming algorithm that can handle alignment and jump-limited reorderings in linear tim...
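The abstract describes aligning screenplay text with closed captions. As a rough illustration only, the sketch below aligns screenplay dialogue lines to caption lines with a plain monotonic dynamic program over word-overlap scores. This is not the paper's hierarchical algorithm or generative model; the function names `jaccard` and `align`, the similarity measure, and the sample lines are all assumptions made for the example.

```python
def jaccard(a, b):
    """Word-overlap similarity between two strings, in [0, 1]."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def align(script_lines, captions):
    """Monotonic alignment that maximizes total word-overlap similarity.

    Hypothetical helper: returns (script_index, caption_index) pairs.
    Lines on either side may be left unmatched.
    """
    n, m = len(script_lines), len(captions)
    # score[i][j]: best total similarity using the first i script lines
    # and the first j captions.
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sim = jaccard(script_lines[i - 1], captions[j - 1])
            score[i][j] = max(
                score[i - 1][j - 1] + sim,  # match line i-1 with caption j-1
                score[i - 1][j],            # skip a script line
                score[i][j - 1],            # skip a caption
            )

    # Trace back from the bottom-right corner to recover matched pairs.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        sim = jaccard(script_lines[i - 1], captions[j - 1])
        if sim > 0 and score[i][j] == score[i - 1][j - 1] + sim:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j]:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return pairs


# Made-up sample data: captions carry timing in a real system, which is
# what lets matched screenplay lines inherit timestamps.
script = ["where are you going", "to the station", "wait for me"]
captions = ["Where are you going", "[door slams]", "To the station", "Wait for me"]
print(align(script, captions))
```

This table-filling version takes time proportional to the product of the two sequence lengths and does not handle the shot reorderings the abstract mentions.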

Timothee Cour, Chris Jordan, Eleni Miltsakaki, Ben Taskar

Computer Vision | ECCV 2008 | Popular Tv Series | Scene Boundaries | Scene Segmentation | Scene Structure | Screenplay Scene Labels |

Related Content

» Actions in context

» Generating Hypermedia Documents from Transcriptions of Television Programs Using Parallel ...

» Alignment of Speech to Highly Imperfect Text Transcriptions

» Enhanced exploration of oral history archives through processed video and synchronized tex...

» Automated closedcaptioning using text alignment

» TV Commercial Classification by using MultiModal Textual Information

» Story Segmentation and Detection of Commercials in Broadcast News Video

Post Info

Added	15 Oct 2009
Updated	15 Oct 2009
Type	Conference
Year	2008
Where	ECCV
Authors	Timothee Cour, Chris Jordan, Eleni Miltsakaki, Ben Taskar
