Background: Biologists are often interested in performing a simple database search to identify proteins or genes that contain a well-defined sequence pattern. Many databases do no...
We propose an unsupervised method for detecting spam documents from Web page data, based on equivalence relations on strings. We propose 3 measures for quantifying the alienness (...
We address the problem of integrating web taxonomies from different real Internet applications. Integrating web taxonomies is to transfer instances from a source to target taxonom...
Chia-Wei Wu, Richard Tzong-Han Tsai, Cheng-Wei Lee...
In this paper we propose a scaling-up method that is applicable to essentially any induction algorithm based on discrete search. The result of applying the method to an algorithm ...
Findings from a data mapping and extraction exercise undertaken as part of the STAR project are described and related to recent work in the area. The exercise was undertaken in con...