Classification of documents by genre is typically done using either linguistic analysis or term-frequency-based techniques. The former provides better classification accuracy than...
A major difficulty in designing a document image segmentation methodology is selecting proper values for all the parameters involved. This is usually done after experimentation o...
Recent research in e-contracts is concerned with the development of frameworks and tools to support contracts. The EREC framework is one that enables modelling and deployment of e-con...
Anushree Khandekar, P. Radha Krishna, Kamalakar Ka...
XPath [3, 5] is a powerful and quite successful language able to perform complex node selection in trees through compact specifications. As such, it plays a growing role in many ...
The performance of document clustering systems depends on employing optimal text representations, which are not only difficult to determine beforehand but may also vary from one ...