large corpora | Sciweavers

COLING
2010

Computational Linguistics

Large Scale Parallel Document Mining for Machine Translation

15 years 1 month ago

A distributed system is described that reliably mines parallel text from large corpora. The approach can be regarded as cross-language near-duplicate detection, enabled by an init...

Jakob Uszkoreit, Jay Ponte, Ashok C. Popat, Moshe ...

CORR
2010
Springer

Education

LiquidXML: Adaptive XML Content Redistribution

15 years 6 months ago

We propose to demonstrate LiquidXML, a platform for managing large corpora of XML documents in large-scale P2P networks. All LiquidXML peers may publish XML documents to be shared...

Jesús Camacho-Rodríguez, Asterios Ka...

FLAIRS
2006

Artificial Intelligence

Corpus Based Unsupervised Labeling of Documents

15 years 8 months ago

Text categorization involves mapping of documents to a fixed set of labels. A similar but equally important problem is that of assigning labels to large corpora. With a deluge of ...

Delip Rao, Deepak P, Deepak Khemani

LREC
2010

Education

A Corpus Factory for Many Languages

15 years 8 months ago

For many languages there are no large, general-language corpora available. Until the web, all but the richest institutions could do little but shake their heads in dismay as corpu...

Adam Kilgarriff, Siva Reddy, Jan Pomikálek,...

ACL
2008

Computational Linguistics

A Subcategorization Acquisition System for French Verbs

15 years 8 months ago

This paper presents a system capable of automatically acquiring subcategorization frames (SCFs) for French verbs from the analysis of large corpora. We applied the system to a lar...

Cédric Messiant

COLCOM
2005
IEEE

Distributed And Parallel Com...

An experimental evaluation of spam filter performance and robustness against attack

16 years 7 days ago

In this paper, we show experimentally that learning filters are able to classify large corpora of spam and legitimate email messages with a high degree of accuracy. The corpor...

Steve Webb, Subramanyam Chitti, Calton Pu
