Customer care in technical domains is increasingly based on e-mail communication, allowing for the reproduction of approved solutions. Identifying the customer's problem is o...
Stephan Busemann, Sven Schmeier, Roman Georg Arens
We propose a novel approach for categorizing text documents based on the use of a special kernel. The kernel is an inner product in the feature space generated by all subsequences...
Huma Lodhi, John Shawe-Taylor, Nello Cristianini, ...
Abstract. Existing methods to text plagiarism analysis mainly base on “chunking”, a process of grouping a text into meaningful units each of which gets encoded by an integer nu...
Software tools are used to compare multiple versions of a textual document to help a reader understand the evolution of that document over time. These tools generally support the ...
Many approaches to Information Extraction (IE) have been proposed in literature capable of finding and extract specific facts in relatively unstructured documents. Their applicatio...