Feature Generation for Sequence Categorization


The problem of sequence categorization is to generalize from a corpus of labeled sequences procedures for accurately labeling future unlabeled sequences. The choice of representation of sequences can have a major impact on this task, and in the absence of background knowledge a good representation is often not known and straightforward representations are often far from optimal. We propose a feature generation method (called FGEN) that creates Boolean features that check for the presence or absence of heuristically selected collections of subsequences. We show empirically that the representation computed by FGEN improves the accuracy of two commonly used learning systems (C4.5 and Ripper) when the new features are added to existing representations of sequence data. We show the superiority of FGEN across a range of tasks selected from three domains: DNA sequences, Unix command sequences, and English text.
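As a rough illustration of the idea of Boolean subsequence-presence features, the sketch below may help. It is not the FGEN algorithm: the abstract does not specify the selection heuristic, so the scoring rule here (difference in presence rate between two classes), the restriction to single contiguous n-grams rather than collections of subsequences, and all function names (`ngrams`, `select_subsequences`, `boolean_features`) are assumptions made only for this example.

```python
from collections import Counter


def ngrams(seq, n):
    """Return the set of contiguous length-n subsequences of seq."""
    return {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}


def select_subsequences(sequences, labels, n=2, k=2):
    """Pick the k n-grams whose presence rate differs most between two classes.

    Assumption: this scoring rule is an illustrative stand-in,
    not the heuristic used by FGEN.
    """
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    totals = Counter(labels)
    counts = {c: Counter() for c in classes}
    for seq, label in zip(sequences, labels):
        # Count each n-gram at most once per sequence (presence, not frequency).
        counts[label].update(ngrams(seq, n))

    def score(gram):
        rate_a = counts[classes[0]][gram] / totals[classes[0]]
        rate_b = counts[classes[1]][gram] / totals[classes[1]]
        return abs(rate_a - rate_b)

    candidates = set(counts[classes[0]]) | set(counts[classes[1]])
    # Sort by descending score, breaking ties alphabetically for determinism.
    return sorted(candidates, key=lambda g: (-score(g), g))[:k]


def boolean_features(seq, selected):
    """Map a sequence to one Boolean per selected n-gram (present or absent)."""
    return [gram in ngrams(seq, len(gram)) for gram in selected]


# Toy Unix-command sequences with made-up labels.
train = [
    "ls cd ls".split(),
    "ls cd vi".split(),
    "gcc make gcc".split(),
    "make gcc ld".split(),
]
train_labels = ["browse", "browse", "build", "build"]

selected = select_subsequences(train, train_labels, n=2, k=2)
features = boolean_features("ls cd pwd".split(), selected)
```

In the setting the abstract describes, Boolean columns of this kind would be appended to an existing representation before training a learner such as C4.5 or Ripper.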

Daniel Kudenko, Haym Hirsh


AAAI 1998 | Future Unlabeled Sequences | Intelligent Agents | Sequence Categorization | Sequences Procedures |


Post Info

Added	01 Nov 2010
Updated	01 Nov 2010
Type	Conference
Year	1998
Where	AAAI
Authors	Daniel Kudenko, Haym Hirsh

