Sciweavers

144 search results - page 20 / 29
» Improved Source-Channel Models for Chinese Word Segmentation
CVPR
2010
IEEE
14 years 4 months ago
Improving State-of-the-Art OCR through High-Precision Document-Specific Modeling
Optical character recognition (OCR) remains a difficult problem for noisy documents or documents not scanned at high resolution. Many current approaches rely on stored font models...
Andrew Kae, Gary Huang, Erik Learned-Miller, Carl ...
LREC
2010
13 years 9 months ago
Linguistically Motivated Unsupervised Segmentation for Machine Translation
In this paper we use statistical machine translation and morphology information from two different morphological analyzers to try to improve translation quality by linguistically ...
Mark Fishel, Harri Kirik
EMNLP
2008
13 years 9 months ago
Bayesian Unsupervised Topic Segmentation
This paper describes a novel Bayesian approach to unsupervised topic segmentation. Unsupervised systems for this task are driven by lexical cohesion: the tendency of wellformed se...
Jacob Eisenstein, Regina Barzilay
CIKM
2010
Springer
13 years 5 months ago
Modeling reformulation using passage analysis
Query reformulation modifies the original query with the aim of better matching the vocabulary of the relevant documents, and consequently improving ranking effectiveness. Previou...
Xiaobing Xue, W. Bruce Croft, David A. Smith
COLING
2010
13 years 2 months ago
Discriminative Induction of Sub-Tree Alignment using Limited Labeled Data
We employ a Maximum Entropy model to conduct sub-tree alignment between bilingual phrasal structure trees. Various lexical and structural knowledge is explored to measure the syntac...
Jun Sun, Min Zhang, Chew Lim Tan