Summary
This document is a tutorial-style introduction to data modeling. It is
intended to help people working in the biomedical sciences learn about
data modeling, and to let others consider the special problems faced in
modeling data in physiological systems. First I will consider the goals
of the data model. This will be followed by the creation of a logical
data model. The logical data model will then be developed into a
physical model using Extensible Markup Language (XML). I outline a
number of considerations in choosing a technology for the physical
representation of the data model. Finally, I will give sample code for
manipulating the data using the Java Architecture for XML Binding
(JAXB).

Motivation
The motivation behind this article is to provide an accessible
introduction to data modeling for physiology. How would this
help anybody? Constructing a data model is a first step in
building software systems, and we hope to have computing systems that
help us understand human physiology and treat disease. Disseminating
data and data models over the Internet makes the knowledge and data
widely available. My goal will be to investigate principles for
making the data as widely accessible as possible.
Research in data modeling of human physiology is at an early
stage. The BioPAX group [11], which first met in 2002, is a
collaborative effort to create a data exchange format for biological
pathway data. The Open Biomedical Ontologies [10] project is a focal
point for modeling information for shared use across different
biological and medical domains. This includes the OBO Ontology Browser,
which lists a number of different vocabularies. There is a vocabulary
for human developmental anatomy and another for human disease.
Systems Biology Markup Language [12] (SBML) is another project, focused
on a computer-readable format for representing models of biochemical
reaction networks. In a project closely related to the example in this
paper, Van Durme et al. have created a Glycoprotein-hormone Receptors
Information System [22].
See the page Resources for Research in Medical Computing on this site for more discussion on research in this area.
It is important to understand your goals before creating a data
model. Otherwise, you are likely to go in circles and find that
contradictions emerge. My goals for this data model are
The first step in developing a data model is to look at the data in
isolation from any particular technology. A starting point is to
browse over the data to be modeled. From Purves et al. [3] you can see
that there are a number of basic components of the human endocrine
system. A list of the secreting tissues or glands is
The next step in the logical data model is to determine what the
entities are. According to Patrick and Elizabeth O'Neil in
Database Principles, Programming, and Performance [4],
an entity is 'a collection of distinguishable real-world objects with
common properties.' I will divide our entities up into
To define the data model we need to define attributes of the
entities. An attribute is a data item that describes a property
of an entity or a relationship between two entities. The diagram
below shows the entities and the attributes.

The names of the entities are shown at the top of the boxes and the
attributes are shown underneath. There are several relationships
here:
| Entity and Attribute | Entity Referenced | Description |
|---|---|---|
| Hormone.secretingTissue | AnatomicalComponent | Specifies which anatomical component secretes a given hormone. Some anatomical components secrete many hormones. |
| Hormone.target | AnatomicalComponent | Specifies which anatomical components are targeted by the hormone. A hormone can target many anatomical components, and an anatomical component can be the target of many hormones, so the target relation between hormone and anatomical component is many-to-many. For example, growth hormone targets bones, liver, and muscles. Bones (as a single entity, the skeleton) are themselves the targets of growth hormone, calcitonin, and parathyroid hormone. |
| Hormone.chemicalNature | ChemicalNature | A hormone can have only one chemical nature, but many hormones can have the same chemical nature. |
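
The relationships in the table can also be sketched in code. The following is a minimal Java illustration, not the implementation developed in this article: the record layout, the plural field name `targets`, the helper class `EndocrineModel`, and its methods are all hypothetical names chosen for the example. The sample targets come from the growth hormone example above; the secreting tissues and chemical natures are illustrative values only. Records require Java 16 or later.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

/** A tissue, gland, or organ; used both as secreting tissue and as target. */
record AnatomicalComponent(String name) {}

/** The chemical class of a hormone; many hormones may share one. */
record ChemicalNature(String name) {}

/** A hormone has one secreting tissue, one chemical nature, and many targets. */
record Hormone(String name,
               AnatomicalComponent secretingTissue,
               ChemicalNature chemicalNature,
               Set<AnatomicalComponent> targets) {
    Hormone {
        targets = Set.copyOf(targets); // immutable defensive copy
    }
}

/** Hypothetical helper class for this sketch only. */
final class EndocrineModel {
    private EndocrineModel() {}

    /** Walks the many-to-many target relation from the target side. */
    static Set<String> hormonesTargeting(List<Hormone> hormones,
                                         AnatomicalComponent target) {
        return hormones.stream()
                .filter(h -> h.targets().contains(target))
                .map(Hormone::name)
                .collect(Collectors.toCollection(TreeSet::new));
    }

    /** Sample data; only the targets named in the text are listed. */
    static List<Hormone> sampleHormones() {
        AnatomicalComponent bones = new AnatomicalComponent("bones");
        AnatomicalComponent liver = new AnatomicalComponent("liver");
        AnatomicalComponent muscles = new AnatomicalComponent("muscles");
        // Chemical natures below are illustrative assumptions.
        ChemicalNature protein = new ChemicalNature("protein");
        ChemicalNature peptide = new ChemicalNature("peptide");
        return List.of(
                new Hormone("growth hormone",
                        new AnatomicalComponent("anterior pituitary"),
                        protein, Set.of(bones, liver, muscles)),
                new Hormone("calcitonin",
                        new AnatomicalComponent("thyroid"),
                        peptide, Set.of(bones)),
                new Hormone("parathyroid hormone",
                        new AnatomicalComponent("parathyroid glands"),
                        protein, Set.of(bones)));
    }

    public static void main(String[] args) {
        AnatomicalComponent bones = new AnatomicalComponent("bones");
        // prints [calcitonin, growth hormone, parathyroid hormone]
        System.out.println(hormonesTargeting(sampleHormones(), bones));
    }
}
```

Each `Hormone` holds a single `secretingTissue` and a single `chemicalNature`, which reflects the many-to-one relations, while the `targets` set together with the reverse lookup in `hormonesTargeting` reflects the many-to-many target relation.
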
There are several types of model. Two common types are the entity-relationship (ER) model and the Unified Modeling Language (UML) model. O'Neil and O'Neil describe entity-relationship models [4] and Booch, Rumbaugh, and Jacobson [7] describe UML, if you wish to read further about these modeling techniques. The model here is simple enough to be described by either type. I created the diagram above as a UML model using the freely available Omondo UML Eclipse plug-in [6]. You could draw it with a number of other tools, including Microsoft Visio or IBM Rational Software Architect. The logical model should be considered in isolation from the technology used, but the tools tend to be developed around specific technologies. That can be an advantage when changing the model later and keeping it synchronized with the implementation.
There are several important points to note about the model: