texts | Sciweavers

194

TCS
2010

Theoretical Computer Science

Definable transductions and weighted logics for texts

15 years 1 months ago

A text is a word together with an additional linear order on it. We study quantitative models for texts, i.e. text series which assign to texts elements of a semiring. We introduc...

Christian Mathissen

163

EMNLP
2009

Natural Language Processing

Learning Term-weighting Functions for Similarity Measures

15 years 4 months ago

Download research.microsoft.com

Measuring the similarity between two texts is a fundamental problem in many NLP and IR applications. Among the existing approaches, the cosine measure of the term vectors represen...

Wen-tau Yih

188

EMNLP
2010

Natural Language Processing

Enhancing Domain Portability of Chinese Segmentation Model Using Chi-Square Statistics and Bootstrapping

15 years 4 months ago

Download www.aclweb.org

Almost all Chinese language processing tasks involve word segmentation of the language input as their first steps, thus robust and reliable segmentation techniques are always requ...

Baobao Chang, Dongxu Han

216

MT
2007

Automatic extraction of translations from web-based bilingual materials

15 years 6 months ago

Download www.site.uottawa.ca

This paper describes the framework of the StatCan Daily Translation Extraction System (SDTES), a computer system that maps and compares web-based translation texts of Statistics Can...

Qibo Zhu, Diana Zaiu Inkpen, Ash Asudeh

150

JQL
2007

Experiments on authorship attribution by intertextual distance in English

15 years 6 months ago

Download halshs.archives-ouvertes.fr

How can it be said that texts are "near" or "distant" from one another? Are different texts by a single author more similar than texts by different authors? To...

Dominique Labbé

196

IPM
2008

Author identification: Using text sampling to handle the class imbalance problem

15 years 6 months ago

Download www.icsd.aegean.gr

Authorship analysis of electronic texts assists digital forensics and anti-terror investigation. Author identification can be seen as a single-label multi-class text categorizatio...

Efstathios Stamatatos

183

NAACL
1994

Computational Linguistics

Principles of Template Design

15 years 7 months ago

Download acl.ldc.upenn.edu

The functionality of systems that extract information from texts can be specified quite simply: the input is a stream of texts and the output is some representation of the informa...

Jerry R. Hobbs, David J. Israel

133

COLING
1994

Computational Linguistics

A Part-of-Speech-Based Alignment Algorithm

15 years 7 months ago

Download www.mt-archive.info

To align bilingual texts becomes a crucial issue recently. Rather than using length-based or translation-based criterion, a part-of-speech-based criterion is proposed. We postulat...

Kuang-Hua Chen, Hsin-Hsi Chen

170

ECIR
2003
Springer

Information Technology

Corpus-Based Thesaurus Construction for Image Retrieval in Specialist Domains

15 years 8 months ago

Download www.computing.surrey.ac.uk

This paper explores the use of texts that are related to an image collection, also known as collateral texts, for building thesauri in specialist domains to aid in image retrieval....

Khurshid Ahmad, Mariam Tariq, Bogdan Vrusias, Chri...

185

AAAI
2008

Intelligent Agents

An Effective and Robust Method for Short Text Classification

15 years 9 months ago

Download www.aaai.org

Classification of texts potentially containing a complex and specific terminology requires the use of learning methods that do not rely on extensive feature engineering. In this w...

Victoria Bobicev, Marina Sokolova
