Wrapping is the process of navigating a data source, semiautomatically extracting data and transforming it into a form suitable for data processing applications. There are current...
We present a generalization of frequent itemsets that allows for errors in the itemset definition. We motivate the problem and present an efficient algorithm that identifie...
This paper presents a part-of-speech tagger and chunker for Tamil built using machine learning techniques. Part-of-speech tagging and chunking are the fundamental processing steps for...
V. Dhanalakshmi, P. Padmavathy, M. Anand Kumar, K....
The PDF format is commonly used for the exchange of documents on the Web and there is a growing need to understand and extract or repurpose data held in PDF documents. Many system...
CAPTCHAs are automated Turing tests used to determine whether the end user is a human rather than an automated program. Users are asked to read and answer visual CAPTCHAs, which often appear...