Introduction to OWL Web Ontology Language for Medical and Biosciences Applications (Continued)

Creating an Ontology

This section walks through the creation of an ontology, using human physiology as an example.  The example is deliberately simplified, since a realistic ontology would be too complex to serve as a good illustration.  The OWL document is physiology.owl, and it imports the document biochemistry.owl.  An ontology may be created by editing an OWL document with a text editor or by using one of a number of available tools (see [10, 11, 12, 13]).  These particular OWL documents were created using Protégé.

The OWL document starts as follows:


<rdf:RDF
    xmlns="http://www.medicalcomputing.net/owl/physiology.owl#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
    xmlns:biochemistry="http://www.medicalcomputing.net/biochemistry.owl#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:daml="http://www.daml.org/2001/03/daml+oil#"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xml:base="http://www.medicalcomputing.net/owl/physiology.owl">
  <owl:Ontology rdf:about="">
    <rdfs:comment rdf:datatype="http://www.w3.org/2001/XMLSchema#string">A model for human physiology</rdfs:comment>
    <owl:imports rdf:resource="http://www.medicalcomputing.net/biochemistry.owl#"/>
  </owl:Ontology>

The rdf:RDF element is the top-level tag described previously.  The XML namespace attributes set the default namespace (xmlns) to http://www.medicalcomputing.net/owl/physiology.owl# and the base URI (xml:base) to the same address without the trailing #.  The biochemistry namespace, discussed below, is defined with the xmlns:biochemistry attribute.  A number of other namespaces are also defined, including RDF (xmlns:rdf), RDF Schema (xmlns:rdfs), XML Schema (xmlns:xsd), OWL (xmlns:owl), DAML+OIL (xmlns:daml), and Dublin Core (xmlns:dc).  The owl:Ontology element contains a comment describing the purpose of the document and imports the biochemistry OWL document.

A physiological system is defined with the following class definition:


<owl:Class rdf:about="#PhysiologicalSystem">
  <rdfs:comment rdf:datatype="http://www.w3.org/2001/XMLSchema#string">
  A physiological system is a system within the body that functions as a unit. The human physiological
  system includes the circulatory, endocrine, gastrointestinal, immune, cardio-vascular, integumentary,
  lymphatic, musculoskeletal, reproductive, respiratory, and urinary systems.
  </rdfs:comment>
</owl:Class>


The various kinds of physiological system (circulatory, endocrine, gastrointestinal, etc.) will be defined as subclasses of the class PhysiologicalSystem.  This is appropriate because the circulatory system is a physiological system, the endocrine system is a physiological system, and so on.  Subclassing defines an is-a relation, also known as an inheritance, parent-child, or type specialization relation.  The article Ontology Development 101: A Guide to Creating Your First Ontology [8] discusses in more detail when to use subclassing versus properties (a has-a relation) when authoring ontologies.

I have included a lengthy description of the class PhysiologicalSystem, even though this is just an example, for a reason: to emphasize that this is where a subject matter expert (a physician in this case) should pour out their knowledge.  That knowledge is what users of the ontology are looking for and hoping to benefit from.  What matters is the information created by subject matter experts and making it readily available to users.  The XML, the software, and all that goes with it exist to support human users.

The class NervousSystem is defined to be a subclass of PhysiologicalSystem:


<owl:Class rdf:ID="NervousSystem">
  <rdfs:comment rdf:datatype="http://www.w3.org/2001/XMLSchema#string">
  The nervous system senses the outside environment and the state of the body itself,
  and initiates movement of the musculoskeletal system.</rdfs:comment>
  <rdfs:subClassOf>
    <owl:Class rdf:ID="PhysiologicalSystem"/>
  </rdfs:subClassOf>
</owl:Class>

The same relationship can be shown with a Unified Modeling Language (UML) class diagram, which also includes the other PhysiologicalSystem subclasses.

UML Class Diagram for PhysiologicalSystem and Subclasses

The open arrow in the diagram represents inheritance.
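
The other subclasses in the diagram follow the same pattern.  As a minimal sketch, assuming the class is named CirculatorySystem (the name used later in this section) and that PhysiologicalSystem is declared elsewhere in the same document, the parent can be referenced with an rdf:resource attribute instead of a nested owl:Class element.  The comment text is illustrative only and is not taken from physiology.owl.

```xml
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xml:base="http://www.medicalcomputing.net/owl/physiology.owl">
  <!-- Sketch only: CirculatorySystem is a PhysiologicalSystem. -->
  <owl:Class rdf:ID="CirculatorySystem">
    <!-- Illustrative comment text, not from the original ontology. -->
    <rdfs:comment rdf:datatype="http://www.w3.org/2001/XMLSchema#string">The circulatory system moves blood through the body.</rdfs:comment>
    <!-- "#PhysiologicalSystem" resolves against xml:base. -->
    <rdfs:subClassOf rdf:resource="#PhysiologicalSystem"/>
  </owl:Class>
</rdf:RDF>
```

Both forms state the same subclass relation; the attribute form is simply shorter when the parent class needs no further description at that point.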

Next I will introduce a model to represent measurements and observations that a physician may use in determining the health of the physiological systems described above.  I will create a base class called Measurement and from that derive classes for pulse, blood pressure, body temperature, and the blood concentration of relevant chemical substances, such as glucose, calcium, sodium, and so on.  The list is incomplete but it illustrates the approach.

UML Class Diagram to Model Physiological Measurements
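
The measurement hierarchy might be written in OWL roughly as follows.  This is a sketch under stated assumptions: Measurement and BloodConcentration are the names used in this section, while Pulse, BloodPressure, and BodyTemperature are hypothetical class names chosen to match the prose.

```xml
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xml:base="http://www.medicalcomputing.net/owl/physiology.owl">
  <!-- Base class for all measurements and observations. -->
  <owl:Class rdf:ID="Measurement"/>

  <!-- Derived classes. Pulse, BloodPressure, and BodyTemperature
       are assumed names, not taken from physiology.owl. -->
  <owl:Class rdf:ID="Pulse">
    <rdfs:subClassOf rdf:resource="#Measurement"/>
  </owl:Class>
  <owl:Class rdf:ID="BloodPressure">
    <rdfs:subClassOf rdf:resource="#Measurement"/>
  </owl:Class>
  <owl:Class rdf:ID="BodyTemperature">
    <rdfs:subClassOf rdf:resource="#Measurement"/>
  </owl:Class>

  <!-- Blood concentration of a chemical substance such as glucose. -->
  <owl:Class rdf:ID="BloodConcentration">
    <rdfs:subClassOf rdf:resource="#Measurement"/>
  </owl:Class>
</rdf:RDF>
```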

The reader might think at this point that the physiology ontology will become very large and that separate ontologies should be created for each specialization.  That is a good argument.  Should BloodConcentration not belong to a pathology ontology?  Should CirculatorySystem not belong to a cardiovascular ontology?  I did not do this for two reasons:

  1. Doing something basic, such as describing a blood pressure measurement, would require importing specialty ontologies.
  2. The specialty ontology should depend on the base physiology ontology, not the other way around.  Certainly, there should not be a two-way dependency.

To describe the substances measured in the blood I have created an example biochemistry ontology.  The relationship between the physiology ontology and the biochemistry ontology is shown below.

Ontology Relationships

This new ontology illustrates three points:

  1. It is good practice to partition ontologies to keep them relevant and manageable.  Chemical definitions do not belong in a physiology ontology.
  2. Separating classes into multiple ontologies promotes reuse.  A biochemistry ontology can potentially be used in many other biology-related ontologies.
  3. Minimization of dependencies is important.  It is reasonable for physiology to depend on biochemistry but not the other way around.
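
To make the one-way dependency concrete, the skeleton of a biochemistry document might look like the sketch below.  It is an assumption rather than the actual content of biochemistry.owl: the base URI is taken from the range of the substanceMeasured property defined later in this section, the comment text is illustrative, and only the Chemical class named in this section is declared.

```xml
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xml:base="http://www.medicalcomputing.net/owl/biochemistry.owl">
  <!-- The ontology header has no owl:imports of physiology.owl,
       so the dependency runs from physiology to biochemistry only. -->
  <owl:Ontology rdf:about="">
    <rdfs:comment rdf:datatype="http://www.w3.org/2001/XMLSchema#string">An example biochemistry ontology</rdfs:comment>
  </owl:Ontology>

  <!-- Class that other ontologies can reference for chemical substances. -->
  <owl:Class rdf:ID="Chemical">
    <rdfs:comment rdf:datatype="http://www.w3.org/2001/XMLSchema#string">A chemical substance.</rdfs:comment>
  </owl:Class>
</rdf:RDF>
```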

A blood concentration measures a particular chemical substance.  This can be described as a property of the BloodConcentration class:


<owl:ObjectProperty rdf:ID="substanceMeasured">
  <rdfs:comment xml:lang="en">The substance measured in the laboratory test.</rdfs:comment>
  <rdfs:label xml:lang="en">Substance Measured</rdfs:label>
  <rdfs:range rdf:resource="http://www.medicalcomputing.net/owl/biochemistry.owl#Chemical"/>
  <rdfs:domain rdf:resource="#BloodConcentration"/>
</owl:ObjectProperty>

A property is a mapping from a domain to a range.  Here the domain is the set of BloodConcentration entities and the range is the set of Chemical entities, which are shown in the class diagram below.

Class Diagram for Chemicals
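
To illustrate the mapping, the sketch below describes a single blood concentration measurement that refers to a chemical through substanceMeasured.  The individual names exampleGlucoseConcentration and Glucose are hypothetical, and the sketch assumes that Glucose is declared in the biochemistry ontology as an individual of class Chemical.

```xml
<rdf:RDF
    xmlns="http://www.medicalcomputing.net/owl/physiology.owl#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xml:base="http://www.medicalcomputing.net/owl/physiology.owl">
  <!-- An individual in the domain of substanceMeasured
       (hypothetical name). -->
  <BloodConcentration rdf:ID="exampleGlucoseConcentration">
    <!-- The value must be in the range of the property, that is,
         a Chemical. Glucose is an assumed individual. -->
    <substanceMeasured rdf:resource="http://www.medicalcomputing.net/owl/biochemistry.owl#Glucose"/>
  </BloodConcentration>
</rdf:RDF>
```

Because the element names carry no prefix, they resolve against the default namespace, so BloodConcentration and substanceMeasured refer to the terms defined in the physiology ontology.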

To define a benchmark for interpreting results, create a reference element for measurements in general and, for blood concentration in particular, define a healthy range.  This is shown in the class diagram below.

Class Diagram for Healthy Range

In this diagram the dashed line represents a dependency, or has-a relation.  A measurement has a reference; a blood concentration has a healthy range.
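
One way these has-a relations could be expressed in OWL is with object properties, as in the sketch below.  All of the new names here (Reference, HealthyRange, hasReference, hasHealthyRange) are hypothetical, and treating HealthyRange as a subclass of Reference is an assumption; the actual ontology may model this differently.

```xml
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xml:base="http://www.medicalcomputing.net/owl/physiology.owl">
  <!-- Benchmark against which a measurement is interpreted (assumed name). -->
  <owl:Class rdf:ID="Reference"/>

  <!-- Assumption: a healthy range is a specialized kind of reference. -->
  <owl:Class rdf:ID="HealthyRange">
    <rdfs:subClassOf rdf:resource="#Reference"/>
  </owl:Class>

  <!-- A measurement has a reference. -->
  <owl:ObjectProperty rdf:ID="hasReference">
    <rdfs:domain rdf:resource="#Measurement"/>
    <rdfs:range rdf:resource="#Reference"/>
  </owl:ObjectProperty>

  <!-- A blood concentration has a healthy range. -->
  <owl:ObjectProperty rdf:ID="hasHealthyRange">
    <rdfs:subPropertyOf rdf:resource="#hasReference"/>
    <rdfs:domain rdf:resource="#BloodConcentration"/>
    <rdfs:range rdf:resource="#HealthyRange"/>
  </owl:ObjectProperty>
</rdf:RDF>
```

Declaring hasHealthyRange as a subproperty of hasReference means that any healthy range attached to a blood concentration also counts as its reference.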

© Alex Amies 2006