DTDs and their instances have been considered the standard format for data representation and information exchange on the current web. However, when it comes to the next generation of w...
This paper presents our experiments in question answering for speech corpora. These experiments focus on improving the answer extraction step of the QA process. We present two app...
In this paper we address the problem of organizing hidden-Web databases. Given a heterogeneous set of Web forms that serve as entry points to hidden-Web databases, our goal is to ...
Automatically generated content is ubiquitous on the web: dynamic sites built using the three-tier paradigm are good examples (e.g. commercial sites, blogs and other sites powered...
Text classification, document clustering and similar document analysis tasks are currently the subject of significant global research, since such areas underpin web intelligence,...