Search Sciweavers | Sciweavers

63 search results - page 1 / 13

EACL
2006
ACL Anthology

Large Linguistically-Processed Web Corpora for Multiple Languages

15 years 3 months ago

Download acl.ldc.upenn.edu

The Web contains vast amounts of linguistic data. One key issue for linguists and language technologists is how to access it. Commercial search engines give highly compromised acc...

Marco Baroni, Adam Kilgarriff

LREC
2010

Building a Web Corpus of Czech

15 years 3 months ago

Download ufal.mff.cuni.cz

Large corpora are essential to modern methods of computational linguistics and natural language processing. In this paper, we describe an ongoing project whose aim is to build a l...

Drahomíra "johanka" Spoustová, Miros...

LREC
2010

A Corpus Factory for Many Languages

15 years 3 months ago

Download web2py.iiit.ac.in

For many languages there are no large, general-language corpora available. Until the web, all but the richest institutions could do little but shake their heads in dismay as corpu...

Adam Kilgarriff, Siva Reddy, Jan Pomikálek,...

WWW
2006
ACM

WebKhoj: Indian language IR from multiple character encodings

16 years 3 months ago

Download www.iiit.net

Today web search engines provide the easiest way to reach information on the web. In this scenario, more than 95% of Indian language content on the web is not searchable due to mu...

Prasad Pingali, Jagadeesh Jagarlamudi, Vasudeva Va...

CIKM
2001
Springer

Mining the Web to Create Minority Language Corpora

15 years 6 months ago

Download www.accenture.com

The Web is a valuable source of language specific resources but the process of collecting, organizing and utilizing these resources is difficult. We describe CorpusBuilder, an approa...

Rayid Ghani, Rosie Jones, Dunja Mladenic
