Search Sciweavers | Sciweavers

466 search results - page 12 / 94

» Scalable Feature Extraction from Noisy Documents

COLING
2010

Computational Linguistics

EM-based Hybrid Model for Bilingual Terminology Extraction from Comparable Corpora

15 years 2 months ago

Download acl.eldoc.ub.rug.nl

In this paper, we present an unsupervised hybrid model which combines statistical, lexical, linguistic, contextual, and temporal features in a generic EM-based framework to harvest...

Lianhau Lee, AiTi Aw, Min Zhang, Haizhou Li

ACSC
2003
IEEE

Theoretical Computer Science

A Comparative Study for Domain Ontology Guided Feature Extraction

16 years 22 days ago

Download crpit.com

We introduced a novel method employing a hierarchical domain ontology structure to extract features representing documents in our previous publication (Wang 2002). All raw words i...

Bill B. Wang, Robert I. McKay, Hussein A. Abbass, ...

KDD
2007
ACM

Data Mining

Content-based document routing and index partitioning for scalable similarity-based searches in a large corpus

16 years 7 months ago

Download www.ssrc.ucsc.edu

We present a document routing and index partitioning scheme for scalable similarity-based search of documents in a large corpus. We consider the case when similarity-based search ...

Deepavali Bhagwat, Kave Eshghi, Pankaj Mehra

ICDAR
2003
IEEE

Document Analysis

A Model-based Line Detection Algorithm in Documents

16 years 22 days ago

Download www.cse.salford.ac.uk

In this paper we present a novel model-based approach to detect severely broken parallel lines in noisy textual documents. It is important to detect and remove these lines so the ...

Yefeng Zheng, Huiping Li, David S. Doermann

ICDAR
2011
IEEE

Document Analysis

Localization of Digit Strings in Farsi/Arabic Document Images Using Structural Features and Syntactical Analysis

14 years 7 months ago

Download www.icdar2011.org

This paper presents a new method for localization of digit strings with a specific syntax in Farsi/Arabic document images. First, some features are extracted from all connected...

Ali Abedi, Karim Faez
