Towards the semantic web: The automation of knowledge acquisition from the medical web
Eljinini, M. A. H. S. (2007). Towards the semantic web: The automation of knowledge acquisition from the medical web. (Unpublished Doctoral thesis, City, University of London)
Abstract
The current web contains a wealth of information in the form of natural-language text. In the medical domain, the number of documents related to healthcare is already large and continues to grow at an exponential rate. Today’s desktops can retrieve millions of web documents but can understand none. HTML documents are made to be read and understood by humans, not by machines. In recent years, researchers have been working on the development of new languages for the semantic web. Annotating web documents with semantic metadata will enable content-guided searching and reasoning, which will lead the web to its full potential. Despite all the advances in this area, the web at large is still not semantic. It is impractical to go back and annotate the current web with semantic tags manually. Such a process is labour intensive, prone to errors, and requires expertise in complex new technologies.
The objective of this work is the development of a novel methodology for extracting useful information from the medical web and structuring it so that it is ready for the semantic web. To accomplish this task, three sets of chronic disease-related websites have been downloaded, analysed and studied in depth. The study has revealed a common set of concepts along with their attributes, which were used in the construction of the ontology. An information extraction system has been developed that utilises the ontology for extracting common structures from unseen chronic disease-related websites.
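The general idea of ontology-guided extraction can be illustrated with a minimal Python sketch. It is not the system developed in the thesis: the concept names, the cue phrases, and the `extract` function are all hypothetical, and simple phrase matching stands in for whatever extraction rules the actual system uses.

```python
import re

# Hypothetical toy ontology: each concept maps to cue phrases that may
# signal it in text. These names and phrases are illustrative assumptions,
# not the concepts or attributes identified in the thesis.
ONTOLOGY = {
    "Symptom": ["symptoms include", "signs include"],
    "Treatment": ["is treated with", "treatment includes"],
}


def split_sentences(text):
    """Naively split text into sentences at ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract(text, ontology=ONTOLOGY):
    """Group sentences by the ontology concept whose cue phrase they contain."""
    found = {concept: [] for concept in ontology}
    for sentence in split_sentences(text):
        lowered = sentence.lower()
        for concept, cues in ontology.items():
            if any(cue in lowered for cue in cues):
                found[concept].append(sentence)
    return found


if __name__ == "__main__":
    page_text = (
        "Diabetes is a chronic disease. "
        "Common symptoms include thirst and fatigue. "
        "Type 1 diabetes is treated with insulin."
    )
    for concept, sentences in extract(page_text).items():
        print(concept, "->", sentences)
```

In this sketch the ontology serves as the extraction template, so handling a new concept would mean adding an entry to the ontology rather than changing the extraction code.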