Music conveys both local and long-term temporal information. However, for genre classification, most text-categorization-based approaches capture only local temporal dependencies (e.g., statistics of unigrams and bigrams). In our previous work, we used sequential patterns to capture long-term temporal information from the tokenized sequences of music pieces. In this paper, we propose the use of time-constrained sequential patterns (TSPs) to refine the mined long-term temporal structures so that they match human perception more closely. Experimental results show that the proposed method discovers more temporal structures than statistical language modeling approaches and achieves higher recognition accuracy.
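To make the idea concrete, the sketch below illustrates one plausible reading of a time constraint on sequential patterns: a pattern matches a tokenized piece only if consecutive pattern elements occur within a maximum time gap of each other. This is an illustrative assumption, not the paper's exact formulation; the names `matches_tsp`, `tsp_support`, and `max_gap` are hypothetical, and timestamps are assumed strictly increasing (one token per time step).

```python
# Sketch of time-constrained sequential pattern (TSP) matching, assuming a
# music piece is tokenized into (timestamp, token) pairs sorted by time.
# The time constraint is modeled as a maximum gap between consecutive
# matched pattern elements (an assumption; the paper may define it otherwise).

def matches_tsp(sequence, pattern, max_gap):
    """Return True if `pattern` occurs as a subsequence of `sequence`
    with every pair of consecutive matches at most `max_gap` apart."""
    # latest[j] = latest time at which pattern[:j+1] can be embedded so far
    latest = [None] * len(pattern)
    for time, token in sequence:
        # Scan prefixes longest-first so one event never serves two
        # consecutive pattern positions in the same update.
        for j in range(len(pattern) - 1, -1, -1):
            if token != pattern[j]:
                continue
            if j == 0:
                latest[0] = time
            elif latest[j - 1] is not None and time - latest[j - 1] <= max_gap:
                latest[j] = time
        if latest[-1] is not None:
            return True
    return False


def tsp_support(sequences, pattern, max_gap):
    """Fraction of pieces whose token sequence contains the TSP."""
    hits = sum(matches_tsp(seq, pattern, max_gap) for seq in sequences)
    return hits / len(sequences)


# Toy usage: the second piece contains "A" then "C", but nine time steps
# apart, so the time constraint rejects it and support is 0.5.
pieces = [
    [(0, "A"), (1, "C"), (2, "G"), (3, "C")],
    [(0, "A"), (9, "C"), (10, "G")],
]
print(tsp_support(pieces, ["A", "C"], max_gap=2))  # 0.5
```

Keeping only the latest feasible end time per pattern prefix suffices here because the constraint is an upper bound on the gap: a later-ending embedding of a prefix is always at least as extendable as an earlier one.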