One of the main problems in probabilistic grammatical inference consists in inferring a stochastic language, i.e. a probability distribution, in some class of probabilistic models, from a sample of strings independently drawn according to a fixed unknown target distribution $p$. Here, we consider the class of rational stochastic languages composed of stochastic languages that can be computed by multiplicity automata, which can be viewed as a generalization of probabilistic automata. Rational stochastic languages $p$ have a useful algebraic characterization: all the mappings $\dot{u}p : v \mapsto p(uv)$ lie in a finite dimensional vector subspace $V_p^*$ of the vector space $\mathbb{R}\langle\langle\Sigma\rangle\rangle$ composed of all real-valued functions defined over $\Sigma^*$. Hence, a first step in the grammatical inference process can consist in identifying the subspace $V_p^*$. In this paper, we study the possibility of using Principal Component Analysis to achieve this task. We provide an inference algorithm which computes an esti...
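
The finite-dimensionality property stated above can be checked on a toy case. The sketch below is an illustration only, not the inference algorithm of the paper: it assumes a hypothetical two-state probabilistic automaton over the alphabet {a, b} (the weights `INIT`, `FINAL` and `TRANS` are made up for the example), builds the matrix whose entry at row u and column v is p(uv) for all words of length at most 2, and computes its rank exactly with rational arithmetic. Each row is a finite restriction of one mapping v → p(uv), so the rank cannot exceed the number of states.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical 2-state probabilistic automaton (illustrative values only).
# For each state, outgoing transition weights plus the stopping weight sum to 1.
ALPHABET = "ab"
INIT = [F(1), F(0)]                # start in state 0
FINAL = [F(1, 2), F(1, 3)]         # stopping probabilities
TRANS = {
    "a": [[F(1, 4), F(0)], [F(0), F(1, 3)]],
    "b": [[F(0), F(1, 4)], [F(1, 3), F(0)]],
}


def prob(word):
    """p(word) = INIT^T * M_{w1} * ... * M_{wn} * FINAL."""
    vec = INIT[:]
    n = len(vec)
    for sym in word:
        m = TRANS[sym]
        vec = [sum(vec[i] * m[i][j] for i in range(n)) for j in range(n)]
    return sum(x * t for x, t in zip(vec, FINAL))


def words_up_to(max_len):
    """All words over ALPHABET of length 0..max_len."""
    return ["".join(t) for k in range(max_len + 1)
            for t in product(ALPHABET, repeat=k)]


def hankel(prefixes, suffixes):
    """Matrix whose row u is the mapping v -> p(uv) restricted to `suffixes`."""
    return [[prob(u + v) for v in suffixes] for u in prefixes]


def rank(matrix):
    """Exact rank by Gaussian elimination over the rationals."""
    rows = [row[:] for row in matrix]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


basis = words_up_to(2)             # 7 words: "", a, b, aa, ab, ba, bb
H = hankel(basis, basis)
print(rank(H))                     # bounded above by the number of states
```

Here the matrix is computed from the true distribution, so its rank is exact. With a finite sample, the entries would be replaced by empirical frequencies and the matrix would generally have full rank because of noise, which is where a technique such as Principal Component Analysis becomes relevant for recovering a low-dimensional subspace.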