Sciweavers

147 search results - page 15 / 30
ACL
2003
13 years 9 months ago
tRuEcasIng
Truecasing is the process of restoring case information to badly-cased or noncased text. This paper explores truecasing issues and proposes a statistical, language modeling based ...
Lucian Vlad Lita, Abraham Ittycheriah, Salim Rouko...
ICDIM
2008
IEEE
14 years 1 month ago
Unsupervised key-phrases extraction from scientific papers using domain and linguistic knowledge
The domain of Digital Libraries presents specific challenges for unsupervised information extraction to support both the automatic classification of documents and the enhancement ...
Mikalai Krapivin, Maurizio Marchese, Andrei Yadran...
TSD
1999
Springer
13 years 11 months ago
Handling Word Order in a Multilingual System for Generation of Instructions
Slavic languages are characterized by their relatively high degree of word order freedom. In the process of automatic generation from an underlying representation of the content, ...
Ivana Kruijff-Korbayová, Geert-Jan M. Kruij...
DOCENG
2009
ACM
14 years 2 months ago
Object-level document analysis of PDF files
The PDF format is commonly used for the exchange of documents on the Web and there is a growing need to understand and extract or repurpose data held in PDF documents. Many system...
Tamir Hassan
ACSC
2009
IEEE
14 years 2 months ago
A ConceptLink Graph for Text Structure Mining
Most text mining methods are based on representing documents using a vector space model, commonly known as a bag-of-words model, where each document is modeled as a linear vector r...
Rowena Chau, Ah Chung Tsoi, Markus Hagenbuchner, V...