Abstract. This paper suggests a novel representation for documents that is intended to improve precision. This representation is generated by combining two central techniques: Rand...
This paper shows how to formally characterize language learning in a finite parameter space as a Markov structure, hnportant new language learning results follow directly: explici...
We address the problem of academic conference homepage understanding for the Semantic Web. This problem consists of three labeling tasks - labeling conference function pages, func...
— One of the critical issues in search engines is the size of search indexes: as the number of documents handled by an engine increases, the search must preserve its efficiency,...
Topic modeling has been a key problem for document analysis. One of the canonical approaches for topic modeling is Probabilistic Latent Semantic Indexing, which maximizes the join...
Deng Cai, Qiaozhu Mei, Jiawei Han, Chengxiang Zhai