In this paper, we describe an SVM classification framework for the session detection task on both Chinese and English query logs. With eight features on the aspects of temporal and cont...
With OWL (Web Ontology Language) established as a standard for encoding ontologies on the Semantic Web, interest has begun to focus on the task of verbalising OWL code in controll...
Unknown words are a major issue for large-scale grammars of natural language. We propose a machine learning based algorithm for acquiring lexical entries for all forms in the para...
The Varro toolkit is a system for identifying and counting a major class of regularity in treebanks and annotated natural language data in the form of tree structures: frequently r...
Word co-occurrence networks are one of the most common linguistic networks studied in the past and they are known to exhibit several interesting topological characteristics. In th...
In this paper, we propose a novel algorithm for opinion summarization that takes account of content and coherence simultaneously. We consider a summary as a sequence of sentences ...
Many existing methods for bilingual lexicon learning from comparable corpora are based on similarity of context vectors. These methods suffer from noisy vectors that greatly affec...
Previous research in cross-document entity coreference has generally been restricted to the offline scenario where the set of documents is provided in advance. As a consequence, t...
In this paper, an automatic method for Persian WordNet construction based on Princeton WordNet 2.1 (PWN) is introduced. The proposed approach uses Persian and English corpora as w...
Sentence-level aligned parallel texts are important resources for a number of natural language processing (NLP) tasks and applications such as statistical machine translation and ...