Hitherto, the major challenge to sign language recognition is how to develop approaches that scale well with increasing vocabulary size. In this paper we present an approach to large vocabulary, continuous Chinese Sign Language (CSL) recognition that uses phonemes instead of whole signs as the basic units. Since the number of phonemes is limited, HMM-based training and recognition of the CSL signal becomes more tractable and has the potential to recognize enlarged vocabularies. Furthermore, the proposed method facilitates the CSL recognition when finger-alphabet is blended with gestures. About 2400 phonemes are defined for CSL. One HMM is built for each phoneme, and then the signs are encoded based on these phonemes. A decoder that uses treestructured network is presented. Clustering of the Gaussians on the states, Language model and N-best-pass is used to improve the performance of the system. Experiments on a 5119 sign vocabulary are carried out, and the result is exciting.