We consider the coverage testing problem where we are given a document and a corpus with a limited query interface and asked to find if the corpus contains a near-duplicate of th...
Ali Dasdan, Paolo D'Alberto, Santanu Kolay, Chris ...
Discovering correspondences between schema elements is a crucial task for data integration. Most schema matching tools are semiautomatic, e.g. an expert must tune some parameters ...
Mining frequent itemsets in data streams is beneficial to many real-world applications but is also a challenging task since data streams are unbounded and have high arrival rates...
Well-designed indices can dramatically improve query performance. Including query workload information can produce indices that yield better overall throughput while balancing the...
No search engine is perfect. A typical type of imperfection is the preference misalignment between search engines and end users, e.g., from time to time, web users skip higher-rank...
Ontology-Based Information Extraction (OBIE) has recently emerged as a subfield of Information Extraction (IE). Here, ontologies - which provide formal and explicit specificatio...
To receive personalized web services, the user has to provide personal information and preferences, in addition to the query itself, to the web service. However, detailed personal...
Collaborative tagging systems allow users to use tags to describe their favourite online documents. Two documents that are maintained in the collection of the same user and/or ass...
Ching-man Au Yeung, Nicholas Gibbins, Nigel Shadbo...