City Research Online

Federated ontology-based queries over cancer data

Gonzalez Beltran, A. N., Tagger, B. and Finkelstein, A. ORCID: 0000-0003-2167-9844 (2012). Federated ontology-based queries over cancer data. BMC Bioinformatics, 13, S9. doi: 10.1186/1471-2105-13-S1-S9

Abstract

Background
Personalised medicine provides patients with treatments that are specific to their genetic profiles. It requires efficient data sharing of disparate data types across a variety of scientific disciplines, such as molecular biology, pathology, radiology and clinical practice. Personalised medicine aims to offer the safest and most effective therapeutic strategy based on the gene variations of each subject. This applies in particular to oncology, where knowledge about genetic mutations has already led to new therapies. Current molecular biology techniques (microarrays, proteomics, epigenetic technology and improved DNA sequencing technology) enable better characterisation of tumours. The vast amounts of data, however, coupled with the use of different terms - or semantic heterogeneity - in each discipline, make the retrieval and integration of information difficult.

Results
Existing software infrastructures for data-sharing in the cancer domain, such as caGrid, support access to distributed information. caGrid follows a service-oriented model-driven architecture. Each data source in caGrid is associated with metadata at increasing levels of abstraction, including syntactic, structural, reference and domain metadata. The domain metadata consists of ontology-based annotations associated with the structural information of each data source. However, caGrid's current querying functionality is given at the structural metadata level, without capitalising on the ontology-based annotations. This paper presents the design of and theoretical foundations for distributed ontology-based queries over cancer research data. Concept-based queries are reformulated to the target query language, where join conditions between multiple data sources are found by exploiting the semantic annotations. The system has been implemented, as a proof of concept, over the caGrid infrastructure. The approach is applicable to other model-driven architectures. A graphical user interface has been developed, supporting ontology-based queries over caGrid data sources. An extensive evaluation of the query reformulation technique is included.
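
As an illustration only, the idea of finding join conditions through semantic annotations can be sketched in a few lines of Python. This is not the implementation described in the paper and does not use any caGrid API: the service names, class names, attribute names and concept codes below are all hypothetical, and the "plan" returned is a plain dictionary rather than a query in any real target language. The sketch assumes that each attribute of a data source's structural model carries one ontology concept, and that two attributes from different sources annotated with the same concept are candidates for a join.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Attribute:
    source: str   # hypothetical data service name
    cls: str      # class in that service's structural model
    name: str     # attribute name
    concept: str  # hypothetical ontology concept code annotating the attribute


def find_join_conditions(attributes):
    """Pair attributes from different sources that share an ontology concept."""
    by_concept = {}
    for attr in attributes:
        by_concept.setdefault(attr.concept, []).append(attr)
    joins = []
    for attrs in by_concept.values():
        for left, right in combinations(attrs, 2):
            if left.source != right.source:
                joins.append((left, right))
    return joins


def reformulate(concepts, attributes):
    """Map requested concepts to annotated attributes and cross-source joins."""
    targets = [a for a in attributes if a.concept in concepts]
    sources = {a.source for a in targets}
    joins = [
        (left, right)
        for left, right in find_join_conditions(attributes)
        if left.source in sources and right.source in sources
    ]
    return {"select": targets, "join": joins}


# Hypothetical annotated models for two data services.
ATTRIBUTES = [
    Attribute("PathologyService", "Specimen", "patientId", "C_PatientIdentifier"),
    Attribute("PathologyService", "Specimen", "diagnosis", "C_Diagnosis"),
    Attribute("GenomicsService", "Sample", "subjectId", "C_PatientIdentifier"),
    Attribute("GenomicsService", "Sample", "geneSymbol", "C_Gene"),
]

plan = reformulate({"C_Diagnosis", "C_Gene"}, ATTRIBUTES)
for left, right in plan["join"]:
    print(f"{left.source}.{left.cls}.{left.name} = "
          f"{right.source}.{right.cls}.{right.name}")
```

In this toy setting, a query phrased over the concepts for diagnosis and gene is resolved to one attribute in each service, and the shared patient-identifier concept yields the single join condition between them, even though the two attributes have different names. A real system would also need to handle attributes annotated with several concepts, subsumption between concepts, and translation into the target query language.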

Conclusions
To support personalised medicine in oncology, it is crucial to retrieve and integrate molecular, pathology, radiology and clinical data in an efficient manner. The semantic heterogeneity of the data makes this a challenging task. Ontologies provide a formal framework to support querying and integration. This paper provides an ontology-based solution for querying distributed databases over service-oriented, model-driven infrastructures.

Publication Type: Article
Additional Information: This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Publisher Keywords: Unified Modeling Language, Semantic Annotation, Unified Modeling Language Model, Unified Modeling Language Class Diagram, Query Reformulation
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
R Medicine > RC Internal medicine > RC0254 Neoplasms. Tumors. Oncology (including Cancer)
Z Bibliography. Library Science. Information Resources > Z665 Library Science. Information Science
Departments: President's Portfolio
Date available in CRO: 28 Jul 2021 10:47
Date deposited: 28 July 2021
Date of first online publication: 25 January 2012
URI: https://openaccess.city.ac.uk/id/eprint/26415
Text - Published Version
Available under License Creative Commons: Attribution 2.0.
