Sciweavers

187 search results - page 3 / 38
Entity categorization over large document collections
EMNLP
2007
13 years 9 months ago
Large-Scale Named Entity Disambiguation Based on Wikipedia Data
This paper presents a large-scale system for the recognition and semantic disambiguation of named entities based on information extracted from a large encyclopedic collection and ...
Silviu Cucerzan
CIKM
2000
Springer
13 years 12 months ago
Scalable association-based text classification
Naïve Bayes (NB) classifier has long been considered a core methodology in text classification mainly due to its simplicity and computational efficiency. There is an increasing n...
Dimitris Meretakis, Dimitris Fragoudis, Hongjun Lu...
ISI
2006
Springer
13 years 7 months ago
Analyzing Entities and Topics in News Articles Using Statistical Topic Models
Statistical language models can learn relationships between topics discussed in a document collection and persons, organizations and places mentioned in each document. We present a...
David Newman, Chaitanya Chemudugunta, Padhraic Smy...
KDD
2006
ACM
14 years 8 months ago
Reducing the human overhead in text categorization
Many applications in text processing require significant human effort for either labeling large document collections (when learning statistical models) or extrapolating rules from...
Arnd Christian König, Eric Brill
ICDE
2004
IEEE
14 years 9 months ago
Improved File Synchronization Techniques for Maintaining Large Replicated Collections over Slow Networks
We study the problem of maintaining large replicated collections of files or documents in a distributed environment with limited bandwidth. This problem arises in a number of impo...
Torsten Suel, Patrick Noel, Dimitre Trendafilov