Psychological measures of word concreteness are generally estimated by having humans rate words on a concreteness scale. Because such rating studies can cover only limited word samples, the concreteness ratings in current word databases (e.g., MRC) are incomplete. In this study, we use available linguistic databases to build a computational model that simulates human ratings of word concreteness. The model includes Lexical Type, Latent Semantic Analysis Dimensions, Hypernymy Levels, Word Frequency, and Word Length. Our results indicate that the model accounts for 64% of the variance in human ratings.
Shi Feng, Zhiqiang Cai, Scott A. Crossley, Daniell
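
As a minimal sketch of what "accounts for 64% of the variance" means, the proportion of variance explained can be computed as the coefficient of determination, R^2 = 1 - SS_residual / SS_total, between human ratings and model-simulated ratings. The abstract does not state how the model was fit or scored, so the use of R^2 here is an assumption, and the function name `r_squared` and the toy numbers are hypothetical.

```python
from statistics import fmean


def r_squared(observed, predicted):
    """Proportion of variance in `observed` explained by `predicted`.

    Assumption: variance explained is measured as the coefficient of
    determination; the abstract does not specify the exact statistic.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_obs = fmean(observed)
    # Total variation of the human ratings around their mean.
    ss_total = sum((y - mean_obs) ** 2 for y in observed)
    # Variation left unexplained by the model's simulated ratings.
    ss_resid = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    if ss_total == 0:
        raise ValueError("observed ratings have zero variance")
    return 1.0 - ss_resid / ss_total


# Toy values invented for illustration only (not data from the study).
human_ratings = [2.0, 4.0, 6.0, 8.0]
model_ratings = [3.0, 4.0, 6.0, 7.0]
score = r_squared(human_ratings, model_ratings)
print(f"variance explained: {score:.2f}")
```

Under this reading, a score of 0.64 would correspond to the 64% figure reported above.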