This paper presents a distributed and adaptive approach to resource discovery in peer-to-peer networks. The approach combines the mobile agent paradigm with random walks and reinforcement learning to enable dynamic, self-adaptive resource discovery. More precisely, it augments random walks with a reinforcement learning technique in which mobile agents backtrack along the path they have walked through the network. An affinity value that incorporates knowledge from past and present searches is maintained between nodes and is used during a search to influence the selection of the next hop. The approach is evaluated with the network simulator ns-2.
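
To make the idea concrete, the following is a minimal sketch of an affinity-biased random walk with reinforcement on backtracking. It is an illustration under stated assumptions, not the implementation evaluated in this paper: the class name `AffinityWalk`, the parameters `alpha`, `discount`, and `base`, the exponential-smoothing update, the distance-discounted reward, and the preference for unvisited neighbours are all hypothetical choices made for the example.

```python
import random
from collections import defaultdict


class AffinityWalk:
    """Illustrative sketch only; names and update rule are assumptions."""

    def __init__(self, graph, resources, alpha=0.5, discount=0.5, base=1.0, seed=None):
        self.graph = graph            # node -> list of neighbouring nodes
        self.resources = resources    # node -> set of resource names held there
        self.alpha = alpha            # weight given to the present search (assumed)
        self.discount = discount      # reward decay per hop away from the hit (assumed)
        self.base = base              # constant weight so every neighbour stays reachable
        self.affinity = defaultdict(float)  # (node, neighbour, resource) -> affinity
        self.rng = random.Random(seed)

    def next_hop(self, node, resource, visited):
        """Pick a neighbour at random, biased towards links with higher affinity."""
        neighbours = self.graph[node]
        # Prefer unvisited neighbours; fall back to all neighbours at a dead end.
        candidates = [n for n in neighbours if n not in visited] or list(neighbours)
        weights = [self.base + self.affinity[(node, n, resource)] for n in candidates]
        return self.rng.choices(candidates, weights=weights, k=1)[0]

    def search(self, start, resource, ttl=20):
        """Walk at most `ttl` hops; return the walked path on a hit, else None."""
        path = [start]
        while True:
            node = path[-1]
            if resource in self.resources.get(node, ()):
                self._reinforce(path, resource)
                return path
            if len(path) - 1 >= ttl or not self.graph.get(node):
                return None
            path.append(self.next_hop(node, resource, set(path)))

    def _reinforce(self, path, resource):
        """Backtrack from the hit to the origin, updating affinity on each link."""
        for dist, i in enumerate(range(len(path) - 1, 0, -1)):
            key = (path[i - 1], path[i], resource)
            reward = self.discount ** dist  # links closer to the resource earn more
            # Blend past knowledge (old value) with the present search (reward).
            self.affinity[key] = (1 - self.alpha) * self.affinity[key] + self.alpha * reward


if __name__ == "__main__":
    overlay = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b"], "d": ["a"]}
    walker = AffinityWalk(overlay, {"c": {"file.txt"}}, seed=42)
    for _ in range(5):
        print(walker.search("a", "file.txt", ttl=6))
```

In this sketch, each successful search raises the affinity of the links on its path, so later searches for the same resource are more likely to follow them, while the constant `base` weight keeps the walk random enough to explore other links.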