This paper presents a novel content-based, emotion-oriented music video (MV) generation system, called EMV-matchmaker, which uses the emotional temporal phase sequence of the multimedia content as a bridge between music and video. Specifically, we adopt an emotional temporal course model (ETCM) to learn, from an emotion-annotated MV corpus, both the relationship between music and its emotional temporal phase sequence and the relationship between video and its emotional temporal phase sequence. Then, given a video clip (or a music clip), the visual (or acoustic) ETCM is applied to predict its emotional temporal phase sequence in a valence-arousal (VA) emotional space from the corresponding low-level visual (or acoustic) features. For MV generation, string matching is applied to measure the similarity between the emotional temporal phase sequences of the video and the music. The results of objective and subjective experiments demonstrate that EMV-matchmaker performs well and can generate MVs in which the emotions expressed by the music and the video are well matched.
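
To make the matching step concrete, the following is a minimal Python sketch, assuming each clip's emotional temporal phase sequence is encoded as a string of discrete VA-quadrant labels and that similarity is a normalized Levenshtein (edit-distance) score. Both the label encoding and the specific string-matching scheme here are illustrative assumptions, not the paper's exact method.

```python
# A minimal sketch of matching emotional temporal phase sequences.
# Assumptions (not from the paper): each clip's sequence is encoded as a
# string of VA-quadrant labels, and similarity is 1 - normalized edit distance.

def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance via dynamic programming."""
    m, n = len(a), len(b)
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            curr[j] = min(prev[j] + 1,         # deletion
                          curr[j - 1] + 1,     # insertion
                          prev[j - 1] + cost)  # substitution
        prev = curr
    return prev[n]

def phase_similarity(video_seq: str, music_seq: str) -> float:
    """Similarity in [0, 1]; 1.0 means identical phase sequences."""
    if not video_seq and not music_seq:
        return 1.0
    d = edit_distance(video_seq, music_seq)
    return 1.0 - d / max(len(video_seq), len(music_seq))

# Hypothetical VA-quadrant labels: E = exuberant (+V, +A),
# A = anxious (-V, +A), D = depressed (-V, -A), C = content (+V, -A).
video = "EEACCD"   # phase sequence predicted by the visual ETCM
music = "EEACDD"   # phase sequence predicted by the acoustic ETCM
print(phase_similarity(video, music))  # 0.833...: a close emotional match
```

In a generation pipeline of this kind, a query clip's predicted phase string would be compared against each candidate in the corpus and the highest-scoring pairing selected; other string-matching measures (e.g., longest common subsequence) could be substituted in `phase_similarity` without changing the overall scheme.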