Sciweavers


IRAL
2000
ACM


On the use of words and n-grams for Chinese information retrieval


Abstract: When processing Chinese documents and queries for information retrieval (IR), one has to identify the units that are used as indexes. Words and n-grams have both been used as indexes in several previous studies, which showed that the two kinds of indexes lead to comparable IR performance. In this study, we carry out further experiments on different ways to segment documents and queries, and to combine words with n-grams. Our experiments show that combining the longest-matching algorithm with single characters is the best choice.
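As a rough illustration of the indexing units the abstract mentions, the following is a minimal sketch of forward longest-matching segmentation combined with single characters, plus plain character n-grams for comparison. The function names, the toy lexicon, the `max_len` cutoff, and the way words and characters are merged are all assumptions made for illustration; the abstract does not give the authors' exact procedure.

```python
from typing import List, Set


def longest_match_segment(text: str, lexicon: Set[str], max_len: int = 4) -> List[str]:
    """Forward longest-matching segmentation (illustrative, not the authors' code).

    At each position, take the longest lexicon word that starts there;
    if none matches, fall back to a single character.
    """
    tokens: List[str] = []
    i = 0
    while i < len(text):
        longest = min(max_len, len(text) - i)
        # Try the longest candidate first and shrink until something matches.
        for length in range(longest, 0, -1):
            candidate = text[i:i + length]
            if length == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += length
                break
    return tokens


def char_ngrams(text: str, n: int) -> List[str]:
    """All overlapping character n-grams of the text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def index_units(text: str, lexicon: Set[str], max_len: int = 4) -> List[str]:
    """Assumed combination: multi-character words plus every single character."""
    words = [w for w in longest_match_segment(text, lexicon, max_len) if len(w) > 1]
    return words + list(text)


# Hypothetical toy lexicon, for demonstration only.
lexicon = {"中文", "信息", "检索", "信息检索"}
print(longest_match_segment("中文信息检索", lexicon))  # ['中文', '信息检索']
print(index_units("中文信息检索", lexicon))
print(char_ngrams("中文信息检索", 2))
```

In this sketch the single characters act as a fallback: a query segmented differently from the document (for example into 信息 and 检索 rather than 信息检索) can still match on the shared characters.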

Jian-Yun Nie, Jianfeng Gao, Jian Zhang, Ming Zhou


Chinese Documents | Comparable IR Performances | Information Management | Information Retrieval | IRAL 2000


Related Content

» A First Approach to CLIR Using Character N-Grams Alignment

» Word Length n-Grams for Text Reuse Detection

» Webpage Genre Identification Using Variable-Length Character n-Grams

» Character n-Gram Spotting in Document Images

» Construction of a Chinese-English WordNet and its application to CLIR

» English-Chinese Cross-Language IR Using Bilingual Dictionaries

» Using self-supervised word segmentation in Chinese information retrieval

» Multilingual Information Retrieval Using English and Chinese Queries

» Investigating the Relationship between Word Segmentation Performance and Retrieval Perform...

Post Info

Added	01 Aug 2010
Updated	01 Aug 2010
Type	Conference
Year	2000
Where	IRAL
Authors	Jian-Yun Nie, Jianfeng Gao, Jian Zhang, Ming Zhou
