This paper investigates reinforcement learning (RL) in XCS. First, it formally shows that XCS implements a method of generalized RL based on linear approximators, in which the usual input mapping function translates the state-action space into a niche-relative fitness space. Then, it shows that, although XCS has always been related to standard RL, it is actually a method of averaging RL. More precisely, XCS with gradient descent can actually be derived from the typical update of averaging RL. The use of averaging RL in XCS introduces an intrinsic preference toward classifiers with a smaller fitness in the niche. It is argued that, because of the accuracy pressure in XCS, this results in an additional preference toward specificity. A very simple experiment is presented to support this hypothesis. The same approach is applied to XCS with computed prediction (XCSF), and similar conclusions are drawn.

Categories and Subject Descriptors