In this paper we introduce a statistical Named Entity recognizer (NER) system for the Hungarian language. We examined three methods for identifying and disambiguating proper nouns...
An approach to simultaneous document classification and word clustering is developed using a two-way mixture model of Poisson distributions. Each document is represented by a vect...
Recently, manifold learning has been widely exploited in pattern recognition, data analysis, and machine learning. This paper presents a novel framework, called Riemannian manifold...
In many pattern recognition applications, high-dimensional feature vectors impose a high computational cost as well as the risk of "overfitting". Feature Selection addre...
We propose a new method for using anaphoric information in Latent Semantic Analysis (lsa), and discuss its application to develop an lsa-based summarizer which achieves a signifi...
Josef Steinberger, Massimo Poesio, Mijail Alexandr...