Sciweavers

DIS
2008
Springer

String Kernels Based on Variable-Length-Don't-Care Patterns

Abstract. We propose a new string kernel based on variable-length-don't-care patterns (VLDC patterns). A VLDC pattern is an element of $(\Sigma \cup \{\star\})^*$, where $\Sigma$ is an alphabet and $\star$ is the variable-length-don't-care symbol that matches any string in $\Sigma^*$. The number of VLDC patterns matching a given string $s$ of length $n$ is $O(2^{2n})$. We present an $O(n^5)$ algorithm for computing the kernel value. We also propose variations of the kernel which modify the relative weights of each pattern. We evaluate our kernels using a support vector machine to classify spam data.
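To make the matching semantics in the abstract concrete, here is a minimal Python sketch of testing whether a single VLDC pattern matches a whole string. It is not the paper's O(n^5) kernel algorithm, only an illustration of the definition. The function name `vldc_matches` and the use of `*` as a stand-in for the variable-length-don't-care symbol are assumptions made for this example; the symbol is allowed to match the empty string, since the abstract says it matches any string over the alphabet.

```python
def vldc_matches(pattern: str, text: str, star: str = "*") -> bool:
    """Return True if the VLDC pattern matches the whole of `text`.

    `star` is a hypothetical stand-in for the variable-length-don't-care
    symbol; every other character must match itself exactly.
    """
    n = len(text)
    # reachable[j] is True when the pattern symbols processed so far
    # can match the prefix text[:j].
    reachable = [True] + [False] * n
    for symbol in pattern:
        if symbol == star:
            # The don't-care symbol absorbs any (possibly empty) substring,
            # so every position at or after a reachable one becomes reachable.
            seen = False
            for j in range(n + 1):
                seen = seen or reachable[j]
                reachable[j] = seen
        else:
            # An ordinary symbol consumes exactly one matching character.
            reachable = [False] + [
                reachable[j] and text[j] == symbol for j in range(n)
            ]
    return reachable[n]


if __name__ == "__main__":
    for pat in ["a*b", "*", "ab", "*a*"]:
        print(pat, vldc_matches(pat, "aab"))
```

The dynamic program runs in time proportional to the pattern length times the string length. Because so many distinct patterns match even a short string, enumerating them with a checker like this is only practical for toy inputs, which is why a dedicated algorithm for the kernel value matters.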
Added 19 Oct 2010
Updated 19 Oct 2010
Type Conference
Year 2008
Where DIS
Authors Kazuyuki Narisawa, Hideo Bannai, Kohei Hatano, Shunsuke Inenaga, Masayuki Takeda