We propose a new kind of sequential pattern, which we call a Generalized Sequential Pattern, and we introduce the problem of mining generalized sequential patterns over temporal databases. A classical sequential pattern consists of a sequence of itemsets. Such patterns can be discovered in a database of customer transactions, where each transaction consists of a transaction-id, a transaction time, and the items bought in the transaction. In contrast, a generalized sequential pattern consists of a sequence of SQL expressions and can be discovered in a large temporal database. We present SEG-GEN, a genetic algorithm for mining generalized sequential patterns. We show that SEG-GEN performs better than the classical algorithm AprioriAll for mining simple sequential patterns when the minimum support threshold is low.
Sandra de Amo, Ary dos Santos Rocha Jr.
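
To make the classical notion concrete, the following minimal sketch counts the support of a sequence of itemsets in a small transaction database. It is an illustration only, not SEG-GEN or AprioriAll. The function names, the sample data, and the use of a customer identifier to group transactions into per-customer sequences are assumptions introduced here; the abstract itself mentions only a transaction-id, a transaction time, and the items.

```python
from collections import defaultdict


def customer_sequences(transactions):
    """Group (customer_id, time, items) rows into time-ordered itemset lists.

    The customer_id grouping key is an assumption made for this sketch.
    """
    by_customer = defaultdict(list)
    for customer_id, time, items in transactions:
        by_customer[customer_id].append((time, frozenset(items)))
    return {
        customer: [items for _, items in sorted(rows, key=lambda row: row[0])]
        for customer, rows in by_customer.items()
    }


def contains(sequence, pattern):
    """Return True if each itemset of `pattern` is a subset of a distinct
    transaction in `sequence`, in the same order (greedy left-to-right match)."""
    position = 0
    for itemset in sequence:
        if position < len(pattern) and pattern[position] <= itemset:
            position += 1
    return position == len(pattern)


def support(transactions, pattern):
    """Fraction of customers whose sequence contains `pattern`."""
    sequences = customer_sequences(transactions)
    if not sequences:
        return 0.0
    pattern = [frozenset(itemset) for itemset in pattern]
    hits = sum(contains(seq, pattern) for seq in sequences.values())
    return hits / len(sequences)


# Hypothetical sample data: (customer_id, transaction_time, items)
transactions = [
    (1, 1, {"a"}), (1, 2, {"b", "c"}),
    (2, 1, {"a", "b"}), (2, 2, {"c"}),
    (3, 1, {"c"}), (3, 2, {"a"}),
]

print(support(transactions, [{"a"}, {"c"}]))
```

In this sample, the pattern "a, then later c" is contained in the sequences of customers 1 and 2 but not customer 3, so its support is two out of three. A pattern is usually called frequent when this fraction meets the minimum support threshold mentioned in the abstract.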