Sciweavers

804 search results - page 24 / 161
Text Segmentation Based on Similarity between Words
EMNLP
2006
13 years 8 months ago
Graph-based Word Clustering using a Web Search Engine
Word clustering is important for automatic thesaurus construction, text classification, and word sense disambiguation. Recently, several studies have reported using the web as a c...
Yutaka Matsuo, Takeshi Sakaki, Koki Uchiyama, Mits...
SIGIR
2008
ACM
13 years 7 months ago
Enhancing text clustering by leveraging Wikipedia semantics
Most traditional text clustering methods are based on "bag of words" (BOW) representation based on frequency statistics in a set of documents. BOW, however, ignores the ...
Jian Hu, Lujun Fang, Yang Cao, Hua-Jun Zeng, Hua L...
FLAIRS
2007
13 years 9 months ago
Combining Machine Learning with Linguistic Heuristics for Chinese Word Segmentation
This paper describes a hybrid model that combines machine learning with linguistic heuristics for integrating unknown word identification with Chinese word segmentation. The model...
Xiaofei Lu
GFKL
2006
Springer
13 years 11 months ago
The Relationship of Word Length and Sentence Length: The Inter-Textual Perspective
The present study concentrates on the relation between sentence length (SL) and word length (WL) as a possible factor in text classification. The dependence of WL and SL is discuss...
Peter Grzybek, Ernst Stadlober, Emmerich Kelih
SPIESR
2003
13 years 8 months ago
Media segmentation using self-similarity decomposition
We present a framework for analyzing the structure of digital media streams. Though our methods work for video, text, and audio, we concentrate on detecting the structure of digit...
Jonathan Foote, Matthew L. Cooper