Sciweavers

» Text Indexing with Errors
TOIS
2002
13 years 7 months ago
Burst tries: a fast, efficient data structure for string keys
Many applications depend on efficient management of large sets of distinct strings in memory. For example, during index construction for text databases a record is held for each d...
Steffen Heinz, Justin Zobel, Hugh E. Williams
ICML
2002
IEEE
14 years 8 months ago
Combining Labeled and Unlabeled Data for MultiClass Text Categorization
Supervised learning techniques for text classification often require a large number of labeled examples to learn accurately. One way to reduce the amount of labeled data required is t...
Rayid Ghani
ICASSP
2009
IEEE
14 years 2 months ago
Data hiding in hard-copy text documents robust to print, scan and photocopy operations
This paper describes a method for hiding data inside printed text documents that is resilient to print/scan and photocopying operations. Using the principle of channel coding with...
Avinash L. Varna, Shantanu Rane, Anthony Vetro
ML
2000
ACM
13 years 7 months ago
Text Classification from Labeled and Unlabeled Documents using EM
This paper shows that the accuracy of learned text classifiers can be improved by augmenting a small number of labeled training documents with a large pool of unlabeled documents. ...
Kamal Nigam, Andrew McCallum, Sebastian Thrun, Tom...
NLE
2008
13 years 7 months ago
Part-of-speech tagging of Modern Hebrew text
Words in Semitic texts often consist of a concatenation of word segments, each corresponding to a Part-of-Speech (POS) category. Semitic words may be ambiguous with regard to thei...
Roy Bar-Haim, Khalil Sima'an, Yoad Winter