Recurrent neural networks possess universal approximation capabilities, which makes them good candidates for time series modeling. Unfortunately, long-term dependencies are difficult to learn when gradient descent algorithms are employed. We support the view that it is easier for these algorithms to find good solutions if connections with time delays are included in the recurrent networks. The algorithm presented here selects appropriate locations and delays for such connections. As we show on two benchmark problems, it produces very good results while keeping the total number of connections in the recurrent network to a minimum.
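To make the idea of time-delayed connections concrete, here is a minimal sketch of a recurrent hidden layer whose state at time t depends not only on the state at t-1 but also on states further in the past. All names, shapes, and the specific delays are illustrative assumptions; the abstract's algorithm for choosing the locations and delays of such connections is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

n_in, n_hidden = 1, 8
# Hypothetical delays: d=1 is the standard recurrent connection;
# the larger delays stand in for connections the algorithm would add.
delays = [1, 4, 10]

W_in = rng.normal(scale=0.1, size=(n_hidden, n_in))
W_rec = {d: rng.normal(scale=0.1, size=(n_hidden, n_hidden)) for d in delays}
b = np.zeros(n_hidden)

def run(inputs):
    """Run the delayed recurrent layer over a sequence of input vectors."""
    T = len(inputs)
    offset = max(delays)
    # Zero-padded buffer so that h(t - d) is defined for the earliest steps.
    h = np.zeros((T + offset, n_hidden))
    for t in range(T):
        pre = W_in @ inputs[t] + b
        # Each delayed connection feeds h(t - d) into the current state.
        for d, W in W_rec.items():
            pre += W @ h[offset + t - d]
        h[offset + t] = np.tanh(pre)
    return h[offset:]

x = rng.normal(size=(20, n_in))
states = run(x)
print(states.shape)  # (20, 8)
```

The delayed connections give gradients a shorter path through time (a signal d steps back is one connection away rather than d), which is one intuition for why such networks can be easier to train on long-term dependencies.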