Sciweavers

1353 search results - page 123 / 271
» Text Indexing with Errors
TOIS
2002
15 years 3 months ago
Burst tries: a fast, efficient data structure for string keys
Many applications depend on efficient management of large sets of distinct strings in memory. For example, during index construction for text databases a record is held for each d...
Steffen Heinz, Justin Zobel, Hugh E. Williams
ICML
2002
IEEE
16 years 4 months ago
Combining Labeled and Unlabeled Data for MultiClass Text Categorization
Supervised learning techniques for text classification often require a large number of labeled examples to learn accurately. One way to reduce the amount of labeled data required is t...
Rayid Ghani
ICASSP
2009
IEEE
15 years 10 months ago
Data hiding in hard-copy text documents robust to print, scan and photocopy operations
This paper describes a method for hiding data inside printed text documents that is resilient to print/scan and photocopying operations. Using the principle of channel coding with...
Avinash L. Varna, Shantanu Rane, Anthony Vetro
ML
2000
ACM
15 years 3 months ago
Text Classification from Labeled and Unlabeled Documents using EM
This paper shows that the accuracy of learned text classifiers can be improved by augmenting a small number of labeled training documents with a large pool of unlabeled documents. ...
Kamal Nigam, Andrew McCallum, Sebastian Thrun, Tom...
NLE
2008
15 years 3 months ago
Part-of-speech tagging of Modern Hebrew text
Words in Semitic texts often consist of a concatenation of word segments, each corresponding to a Part-of-Speech (POS) category. Semitic words may be ambiguous with regard to thei...
Roy Bar-Haim, Khalil Sima'an, Yoad Winter