Introduction to Data Modeling for Physiology: The Human Endocrine System

Alex Amies   March 19, 2006

Summary

This document is a tutorial-style introduction to data modeling.  It is intended to help people working in the biomedical sciences learn about data modeling, and to prompt others to consider the special problems faced in modeling data for physiological systems.  First I will consider the goals of the data model.  This will be followed by the creation of a logical data model, which will then be developed into a physical model using Extensible Markup Language (XML).  I outline a number of considerations in choosing a technology for the physical representation of the data model.  Finally, I will give sample code for manipulating the data using the Java Architecture for XML Binding (JAXB).

Motivation

The motivation behind this article is to provide an accessible introduction to data modeling for physiology.  How would this help anybody?  Constructing a data model is a first step in building software systems, and we hope to have computing systems that help us understand human physiology and treat disease.  Disseminating data and data models over the Internet makes that knowledge widely available.  My goal is to investigate principles for making the data as widely accessible as possible.


Research in data modeling of human physiology is at an early stage.  The BioPAX [11] initiative, which first met in 2002, is a collaborative effort to create a data exchange format for biological pathway data.  The Open Biomedical Ontologies [10] project is a focal point for modeling information for shared use across different biological and medical domains.  This includes the OBO Ontology Browser, which lists a number of different vocabularies.  There is a vocabulary for human developmental anatomy and another for human disease.  Systems Biology Markup Language [12] (SBML) is another project, focused on a computer-readable format for representing models of biochemical reaction networks.  In a project closely related to the example in this paper, Van Durme et al. have created a Glycoprotein-hormone Receptors Information System [22].

See the page Resources for Research in Medical Computing on this site for more discussion on research in this area.

Goals for the Data Model

It is important to understand your goals before creating a data model.  Otherwise, you are likely to go in circles and have contradictions emerge.  My goals for this data model are:

  1. It should be capable of being a simplified representation of the human endocrine system, including most of the known hormones that make up this system.
  2. It should help people learn about the human endocrine system.  There is an increasing trend to make basic medical data freely available over the Internet, so a wide range of people may want to make use of this data.  The data should not only be viewable over the Internet; other software should also be able to access it over the Internet.
  3. It should be able to reference other data models, in particular chemistry data models for chemical descriptions of hormones, because problems in the biomedical sciences do not exist in isolation from one another.

Logical Model

The first step in developing a data model is to look at the data in isolation from any particular technology.  A starting point is to browse over the data to be modeled.  From Purves et al. [3] you can see that there are a number of basic components of the human endocrine system.  One set of components is the secreting tissues or glands.

The only properties I will consider for these are the name and a description.  For the hormones there are several general properties, including the chemical nature, the targets, and the important properties or actions.
It is important to describe the general chemical nature of the hormone to understand how it may be transported, but we would also want to know the detailed chemical formula.  The targets are a list of components that overlaps with the secreting tissues and glands.  For example, the anterior pituitary is a secreting gland for luteinizing hormone and a target for releasing and release-inhibiting hormones.  The 'important properties or actions' item is a summary of the role of the hormone.

The next step in the logical data model is to determine what the entities are.  According to Patrick and Elizabeth O'Neil in Database Principles, Programming, and Performance [4], an entity is 'a collection of distinguishable real-world objects with common properties.'  I will divide our entities up into hormones, anatomical components, and chemical natures.

Chemical nature is an interesting choice.  Only a handful of chemical natures are of interest in this problem.  I will refer to Chemical Markup Language (CML) [5] to describe the details of the hormones, but CML does not offer a coarse description that categorizes chemicals in the terms needed for this particular problem.
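
One way to picture such a coarse classification is a small Java enumeration.  This is only a sketch: the five category names are an assumption based on common textbook groupings of hormones, not something defined by CML or fixed by this article, and the helper methods fromLabel and label are hypothetical names introduced for illustration.

```java
import java.util.Locale;

/** Demo entry point: parses a label and prints the matching category. */
class ChemicalNatureDemo {
    public static void main(String[] args) {
        ChemicalNature nature = ChemicalNature.fromLabel("Modified amino acid");
        System.out.println(nature + " -> " + nature.label());
    }
}

/** Coarse chemical categories for hormones (assumed list, for illustration only). */
enum ChemicalNature {
    PEPTIDE, PROTEIN, GLYCOPROTEIN, STEROID, MODIFIED_AMINO_ACID;

    /** Parses a human-readable label such as "modified amino acid". */
    static ChemicalNature fromLabel(String label) {
        String key = label.trim().toUpperCase(Locale.ROOT).replace(' ', '_');
        return valueOf(key); // throws IllegalArgumentException for unknown labels
    }

    /** Returns a lower-case label suitable for display. */
    String label() {
        return name().toLowerCase(Locale.ROOT).replace('_', ' ');
    }
}
```

Because the set of categories is small and closed, an enumeration keeps every hormone pointing at exactly one of a few shared values, while the detailed chemistry can stay in a separate CML description.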

To define the data model we need to define attributes of the entities.  An attribute is a data item that describes a property of an entity or a relationship between two entities.  The diagram below shows the entities and the attributes.

Endocrine Data Model


The names of the entities are shown at the top of the boxes and the attributes are shown underneath.  There are several relationships here:

Entity References in the Data Model

Entity and Attribute | Entity Referenced | Description
Hormone.secretingTissue | AnatomicalComponent | Specifies which anatomical component secretes a given hormone.  Some anatomical components secrete many hormones.
Hormone.target | AnatomicalComponent | Specifies which anatomical components are targeted by the hormone.  A hormone can target many anatomical components, and a component can be targeted by many hormones, so this is a many-to-many relation between hormone and target.  For example, growth hormone targets bones, liver, and muscles.  Bones (as a single entity, the skeleton) are themselves the targets of growth hormone, calcitonin, and parathyroid hormone.
Hormone.chemicalNature | ChemicalNature | A hormone can have only one chemical nature, but many hormones can have the same chemical nature.
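
The many-to-many target relation can be read in either direction.  As a minimal sketch, assuming hormones and targets are identified simply by name, the following Java code inverts a hormone-to-targets map into a target-to-hormones map, using the growth hormone and bone example from the text.  The class and method names are hypothetical, and the hormone secreted by the parathyroids is written here as parathyroid hormone.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TargetIndexDemo {
    /** Inverts a hormone -> targets map into a target -> hormones map. */
    static Map<String, List<String>> hormonesByTarget(Map<String, List<String>> targetsByHormone) {
        Map<String, List<String>> index = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : targetsByHormone.entrySet()) {
            for (String target : entry.getValue()) {
                // Create the list for this target on first use, then record the hormone.
                index.computeIfAbsent(target, key -> new ArrayList<>()).add(entry.getKey());
            }
        }
        return index;
    }

    public static void main(String[] args) {
        // Insertion order is preserved so the output is predictable.
        Map<String, List<String>> targetsByHormone = new LinkedHashMap<>();
        targetsByHormone.put("growth hormone", List.of("bones", "liver", "muscles"));
        targetsByHormone.put("calcitonin", List.of("bones"));
        targetsByHormone.put("parathyroid hormone", List.of("bones"));

        // prints [growth hormone, calcitonin, parathyroid hormone]
        System.out.println(hormonesByTarget(targetsByHormone).get("bones"));
    }
}
```

Storing the relation once and deriving the reverse view avoids keeping two lists that could drift out of step.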

There are several types of model.  Two common types are the entity-relationship (ER) model and the Unified Modeling Language (UML) model.  O'Neil and O'Neil [4] describe entity-relationship models, and Booch, Rumbaugh, and Jacobson [7] describe UML, if you wish to read further about these modeling techniques.  The model here is simple enough to be described by either type.  I created the diagram above as a UML model using the freely available Omondo UML Eclipse plug-in [6].  It could equally have been drawn with a number of other tools, including Microsoft Visio or IBM Rational Software Architect.  The logical model should be considered in isolation from the technology used, but the tools tend to be developed around specific technologies.  That can be an advantage when changing the model later and keeping it synchronized with the implementation.
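
To illustrate what keeping the model synchronized with an implementation might look like, here is a rough sketch of the three entities as plain Java classes.  The attribute names secretingTissue, target, and chemicalNature follow the entity reference table; the name, description, and actions fields, the category values, and the sample data are assumptions added for illustration.  This is not the output of any modeling tool.

```java
import java.util.List;

/** Demo entry point: builds one hormone and prints where it is secreted. */
class EndocrineEntitiesDemo {
    public static void main(String[] args) {
        AnatomicalComponent pituitary =
            new AnatomicalComponent("anterior pituitary", "secreting gland");
        AnatomicalComponent gonads =
            new AnatomicalComponent("gonads", "target tissue");
        Hormone lh = new Hormone("luteinizing hormone", ChemicalNature.GLYCOPROTEIN,
            pituitary, List.of(gonads), "stimulates secretion of sex hormones");
        System.out.println(lh.name + " is secreted by " + lh.secretingTissue.name);
    }
}

/** Coarse chemical category (assumed values, for illustration). */
enum ChemicalNature { PEPTIDE, PROTEIN, GLYCOPROTEIN, STEROID, MODIFIED_AMINO_ACID }

/** A gland or tissue that secretes a hormone, is targeted by one, or both. */
final class AnatomicalComponent {
    final String name;
    final String description;

    AnatomicalComponent(String name, String description) {
        this.name = name;
        this.description = description;
    }
}

/** A hormone with references to the other two entities. */
final class Hormone {
    final String name;
    final ChemicalNature chemicalNature;        // exactly one per hormone
    final AnatomicalComponent secretingTissue;  // the component that secretes it
    final List<AnatomicalComponent> target;     // possibly many targeted components
    final String actions;                       // summary of the hormone's role

    Hormone(String name, ChemicalNature chemicalNature,
            AnatomicalComponent secretingTissue,
            List<AnatomicalComponent> target, String actions) {
        this.name = name;
        this.chemicalNature = chemicalNature;
        this.secretingTissue = secretingTissue;
        this.target = List.copyOf(target);      // defensive, unmodifiable copy
        this.actions = actions;
    }
}
```

The same AnatomicalComponent object can appear as the secretingTissue of one hormone and in the target list of another, which mirrors the overlap between secreting glands and targets described earlier.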

There are several important points to note about the model:


Please send me ideas and opinions by email at webmaster@medicalcomputing.net or add comments to my blog.  The content may become part of the web site.

© 2006 Alex Amies