Syllables and other String Kernel Extensions

In recent years, string kernels that compare documents have been shown to achieve good results on text classification problems. In this paper we introduce the application of the string kernel in conjunction with syllables. Using syllables shortens the representation of documents compared to a character-based representation and, as a result, reduces computation time. Moreover, syllables provide a more natural representation of text than either the traditional coarse representation given by the bag-of-words or the overly fine one that results from considering individual letters only. We give some experimental results which show that syllables can be used effectively in text-categorisation problems. We also propose two extensions to the string kernel. The first introduces a lambda-weighting scheme, in which different symbols can be given different decay weightings. This may be useful in text and other applications where the insertion of certain symbols may be known to...
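The ideas in the abstract can be illustrated with a minimal sketch of a gap-weighted subsequence kernel that works on any sequence of symbols, so the same function accepts a string of characters or a list of syllables. Everything here is an assumption made for illustration, not the authors' implementation: the names `feature_map` and `subsequence_kernel`, the default decay of 0.5, the brute-force enumeration (practical kernels use dynamic programming), and in particular the reading of the lambda-weighting scheme as "multiply the decay of every symbol inside the span of a match". The syllable split in the usage note is done by hand; no syllabification algorithm is implied.

```python
from collections import defaultdict
from itertools import combinations
from math import prod


def feature_map(seq, n, decay, default):
    """Map a symbol sequence to weighted counts of its length-n subsequences.

    Each occurrence is weighted by the product of the decay factors of all
    symbols in the span it covers (matched symbols and gaps alike). This
    per-symbol weighting is an assumed interpretation of the abstract.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    feats = defaultdict(float)
    for idx in combinations(range(len(seq)), n):
        span = seq[idx[0]: idx[-1] + 1]  # first matched symbol to last
        weight = prod(decay.get(sym, default) for sym in span)
        feats[tuple(seq[i] for i in idx)] += weight
    return feats


def subsequence_kernel(s, t, n, decay=None, default=0.5):
    """Inner product of the two feature maps (brute force, small inputs only)."""
    decay = decay or {}
    fs = feature_map(s, n, decay, default)
    ft = feature_map(t, n, decay, default)
    return sum(w * ft[u] for u, w in fs.items() if u in ft)
```

With a uniform decay this reduces to the usual gap-weighted subsequence kernel: `subsequence_kernel("cat", "car", 2)` shares only the subsequence `c-a`, which spans two symbols in each word. Passing hand-split syllable lists such as `["ker", "nel"]` instead of strings compares documents at the syllable level with a much shorter sequence, and a `decay` dictionary such as `{"a": 1.0}` makes a chosen symbol free to skip or match.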

Craig Saunders, Hauke Tschach, John Shawe-Taylor

ICML 2002 | Machine Learning | String Kernel | String Kernels | String Subsequence Kernel

Added	17 Nov 2009
Updated	17 Nov 2009
Type	Conference
Year	2002
Where	ICML
Authors	Craig Saunders, Hauke Tschach, John Shawe-Taylor
