word n-grams | Sciweavers

LREC
2010

The Web Library of Babel: evaluating genre collections

15 years 5 months ago

We present experiments in automatic genre classification on web corpora, comparing a wide variety of features on several different genre-annotated datasets (HGC, I-EN, KI-04, KRYS...

Serge Sharoff, Zhili Wu, Katja Markert

COLING
2008

Source Language Markers in EUROPARL Translations

15 years 8 months ago

Download www.aclweb.org

This paper shows that it is very often possible to identify the source language of medium-length speeches in the EUROPARL corpus on the basis of frequency counts of word n-grams (...

Hans van Halteren
