Abstract. One usage scenario of bio-ontologies is hypothesis testing, such as finding relationships or new subconcepts in the data linked to the ontology. Whilst a hypothesis is being validated, the knowledge involved is uncertain or vague and the data is often incomplete, which DL knowledge bases do not take into account. In addition, hypothesis testing must scale to large amounts of data. To address these requirements, we augment SROIQ(D) and the DL-Lite family of languages, together with their application infrastructures, with notions of rough sets. Although only a small part of rough concepts can be represented in DL-Lite, useful aspects can be dealt with in the mapping layer that links the concepts in the ontology to queries over the data source. We discuss the trade-offs and validate the theoretical assessment with the HGT application ontology about horizontal gene transfer and its 17 GB database, taking advantage of the Ontology-Based Data Access framework. However, the prospects for comprehe...
C. Maria Keet