This paper studies the solution of the large-scale traveling salesman problem (TSP) by neurodynamic programming. Two methods, temporal difference learning and approximate Sarsa, are presented in detail. In essence, both try to learn an appropriate evaluation function from a finite amount of experience. To evaluate their performance, computational experiments are conducted on both Euclidean and asymmetric TSP instances. Only a few training sets, small in comparison with the size of the state space, were used to obtain the initial results. In view of this, the results are acceptable and encouraging in comparison with some classical algorithms, and further study of this kind of method, as well as of its application to combinatorial optimization problems, is worth pursuing.

Keywords: Neurodynamic programming
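
As an illustration of what learning an evaluation function from a finite amount of experience can look like, the following is a minimal sketch of approximate Sarsa with a linear evaluation function for tour construction on a small random Euclidean instance. It is not the implementation used in the paper: the three distance features, the step size `alpha`, the exploration rate `eps`, and all function names are assumptions chosen for illustration only.

```python
import math
import random


def make_instance(n, rng):
    """Random Euclidean instance: n cities in the unit square."""
    return [(rng.random(), rng.random()) for _ in range(n)]


def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])


def features(cities, current, candidate, unvisited, start):
    """Hypothetical hand-picked features of (state, action); not from the paper."""
    d_step = dist(cities[current], cities[candidate])
    d_home = dist(cities[candidate], cities[start])
    rest = [c for c in unvisited if c != candidate]
    d_near = min((dist(cities[candidate], cities[c]) for c in rest), default=0.0)
    return [1.0, d_step, d_home, d_near]


def q_value(w, phi):
    """Linear estimate of the remaining tour cost after taking the action."""
    return sum(wi * pi for wi, pi in zip(w, phi))


def choose(cities, w, current, unvisited, start, eps, rng):
    """Epsilon-greedy choice of the next city (lower estimated cost is better)."""
    options = sorted(unvisited)
    if rng.random() < eps:
        action = rng.choice(options)
    else:
        action = min(
            options,
            key=lambda c: q_value(w, features(cities, current, c, unvisited, start)),
        )
    return action, features(cities, current, action, unvisited, start)


def run_episode(cities, w, alpha, eps, rng, learn=True):
    """Build one tour; if learn is True, apply the Sarsa update after each step."""
    start, current = 0, 0
    unvisited = set(range(1, len(cities)))
    tour, length = [start], 0.0
    action, phi = choose(cities, w, current, unvisited, start, eps, rng)
    while True:
        cost = dist(cities[current], cities[action])
        unvisited.remove(action)
        tour.append(action)
        current = action
        if unvisited:
            next_action, next_phi = choose(cities, w, current, unvisited, start, eps, rng)
            target = cost + q_value(w, next_phi)  # on-policy (Sarsa) target
        else:
            cost += dist(cities[current], cities[start])  # close the tour
            next_action, next_phi = None, None
            target = cost  # terminal state has zero remaining cost
        length += cost
        if learn:
            td_error = target - q_value(w, phi)
            for i in range(len(w)):
                w[i] += alpha * td_error * phi[i]
        if next_action is None:
            return tour, length
        action, phi = next_action, next_phi


if __name__ == "__main__":
    rng = random.Random(0)
    cities = make_instance(10, rng)
    weights = [0.0] * 4
    for _ in range(300):  # a finite amount of training experience
        run_episode(cities, weights, alpha=0.01, eps=0.2, rng=rng)
    tour, length = run_episode(cities, weights, alpha=0.0, eps=0.0, rng=rng, learn=False)
    print(tour, round(length, 3))
```

The weights are trained on a limited number of episodes and then used greedily to construct a tour. A sketch of this kind makes no claim about solution quality; it only shows how the evaluation function, the update rule, and the tour construction fit together.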