Sciweavers

ICML
2002
IEEE

Syllables and other String Kernel Extensions

15 years 8 days ago
Syllables and other String Kernel Extensions
During the last years, the use of string kernels that compare documents has been shown to achieve good results on text classification problems. In this paper we introduce the application of the string kernel in conjunction with syllables. Using syllables shortens the representation of documents compared to a character based representation and as a result reduces computation time. Moreover syllables provide a more natural representation of text; rather than the traditional coarse representation given by the bag-of-words, or the too fine one resulting from considering individual letters only. We give some experimental results which show that syllables can be effectively used in text-categorisation problems. In this paper we also propose two extensions to the string kernel. The first introduces a lambda-weighting scheme, where different symbols can be given differing decay weightings. This may be useful in text and other applications where the insertion of certain symbols may be known to...
Craig Saunders, Hauke Tschach, John Shawe-Taylor
Added 17 Nov 2009
Updated 17 Nov 2009
Type Conference
Year 2002
Where ICML
Authors Craig Saunders, Hauke Tschach, John Shawe-Taylor
Comments (0)