In this work we discuss author identification for documents written in Portuguese. Two different approaches were compared. The first is the writer-independent model which reduces ...
Daniel Pavelec, Edson J. R. Justino, Leonardo Vida...
In this work, automatic recognition of Arabic dialects is proposed. An acoustic survey of the proportion of vocalic intervals and the standard deviation of consonantal intervals i...
Mohamed Belgacem, Georges Antoniadis, Laurent Besa...
—This paper presents a new method for localization of digit strings with a specific syntax in Farsi/ Arabic document images. First, some features are extracted from all connected...
Identifying the most influential documents in a corpus is an important problem in many fields, from information science and historiography to text summarization and news aggregati...
Low-dimensional topic models have been proven very useful for modeling a large corpus of documents that share a relatively small number of topics. Dimensionality reduction tools s...