In this paper we present a Japanese-English bilingual lexicon of technical terms. The lexicon was derived from the first and second NTCIR evaluation collections for research into ...
In this paper we present two experiments conducted to compare different language identification algorithms. Short-word-, frequent-word-, and n-gram-based approaches are co...
Lena Grothe, Ernesto William De Luca, Andreas Nü...
This paper presents a series of tools for extracting specialized corpora from the web and subsequently analyzing them, mainly with statistical techniques. It is an integrated sy...
In this paper, we explore the discriminating subsequence-based clustering problem. First, several effective optimization techniques are proposed to accelerate the sequence mining p...
Jianyong Wang, Yuzhou Zhang, Lizhu Zhou, George Ka...
This paper presents a dependency language model (DLM) that captures linguistic constraints via a dependency structure, i.e., a set of probabilistic dependencies that express the r...