Sciweavers

210 search results - page 6 / 42
» Distributional Clustering of English Words
CLEF
2011
Springer
12 years 7 months ago
A Language-Independent Approach to Identify the Named Entities in Under-Resourced Languages and Clustering Multilingual Documents
Abstract. This paper presents a language-independent Multilingual Document Clustering (MDC) approach on comparable corpora. Named entities (NEs) such as persons, locations, organiza...
N. Kiran Kumar, G. S. K. Santosh, Vasudeva Varma
LREC
2008
13 years 8 months ago
ProPOSEL: A Prosody and POS English Lexicon for Language Engineering
ProPOSEL is a prototype prosody and PoS (part-of-speech) English lexicon for Language Engineering, derived from the following language resources: the computer-usable dictionary CU...
Claire Brierley, Eric Atwell
NIPS
2004
13 years 8 months ago
A Probabilistic Model for Online Document Clustering with Application to Novelty Detection
In this paper we propose a probabilistic model for online document clustering. We use a non-parametric Dirichlet process prior to model the growing number of clusters, and use a pri...
Jian Zhang 0003, Zoubin Ghahramani, Yiming Yang
ICPR
2000
IEEE
14 years 8 months ago
OCR with No Shape Training
We present a document-specific OCR system and apply it to a corpus of faxed business letters. Unsupervised classification of the segmented character bitmaps on each page, using a ...
Tin Kam Ho, George Nagy
CORR
1998
Springer
13 years 7 months ago
Word Clustering and Disambiguation Based on Co-occurrence Data
We address the problem of clustering words (or constructing a thesaurus) based on co-occurrence data, and using the acquired word classes to improve the accuracy of syntactic disa...
Hang Li, Naoki Abe