Compilation of a 100 million words balanced corpus called the Balanced Corpus of Contemporary Written Japanese (or BCCWJ) is underway at the National Institute for Japanese Language and Linguistics. The corpus covers a wide range of text genres including books, magazines, newspapers, governmental white papers, textbooks, minutes of the National Diet, internet text (bulletin board and blogs) and so forth, and when possible, samples are drawn from the rigidly defined statistical populations by means of random sampling. All texts are dually POS-analyzed based upontwo different, but mutually related, definitions of `word.'Currently, more than 90 million words have been sampled and XML annotated with respect to text-structure and lexical and character information. A preliminary linear discriminant analysis of text genres using the data of POS frequencies and sentence length revealed it was possible to classifythe text genres with a correct identification rate of 88% as far as the samp...