Cognitive agents must be able to decide their actions based on the states they recognize. In general, such agents are equipped with learning mechanisms in order to realize intelligent behaviors. In this paper, we propose a new Estimation of Distribution Algorithm (EDA) that can acquire effective rules for cognitive agents. The basic calculation procedure of EDAs is to 1) select better individuals, 2) estimate a probabilistic model from them, and 3) sample new individuals from that model. In the proposed method, instead of individuals, the input-output records in episodes are used directly to estimate the probabilistic model with Conditional Random Fields. The estimated probabilistic model can therefore be regarded as a policy, and new input-output records are generated by the interaction between the policy and environments. Computer simulations on Probabilistic Transition Problems show the effectiveness of the proposed method.
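
To make the basic three-step EDA procedure concrete, the following is a minimal sketch of a generic EDA loop. It is not the proposed method: it uses a simple univariate marginal model over bit strings on the OneMax toy problem rather than Conditional Random Fields over input-output records. The names `onemax` and `umda`, all parameter values, and the clamping of probabilities are assumptions introduced only for illustration.

```python
import random


def onemax(bits):
    """Toy fitness (illustrative assumption): number of ones in the bit string."""
    return sum(bits)


def umda(n_bits=20, pop_size=100, n_select=30, generations=30, seed=0):
    """Generic EDA loop with a univariate marginal model (hypothetical sketch)."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits  # initial model: each bit is 1 with probability 0.5
    best = None
    for _ in range(generations):
        # 3) Sample new individuals from the current probabilistic model.
        population = [
            [1 if rng.random() < p else 0 for p in probs]
            for _ in range(pop_size)
        ]
        # 1) Select the better individuals by fitness.
        population.sort(key=onemax, reverse=True)
        selected = population[:n_select]
        if best is None or onemax(selected[0]) > onemax(best):
            best = selected[0]
        # 2) Estimate the model: per-bit frequency of ones among the selected.
        probs = [sum(ind[i] for ind in selected) / n_select for i in range(n_bits)]
        # Clamp the probabilities so the model never becomes fully deterministic.
        probs = [min(max(p, 0.05), 0.95) for p in probs]
    return best, probs


if __name__ == "__main__":
    best, probs = umda()
    print("best fitness:", onemax(best))
```

In the proposed method, the role of the selected individuals would be played by input-output records from episodes, and sampling would correspond to running the estimated policy in the environment; the sketch above only mirrors the select, estimate, and sample structure.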