Chemical, Biological, and Medical Databases

Literature

PubMed is a database of medical and biological literature developed by the National Center for Biotechnology Information (NCBI) at the National Library of Medicine (NLM). It contains abstracts from more than 4,800 biomedical journals.

Chemistry

The National Library of Medicine's PubChem web site at pubchem.ncbi.nlm.nih.gov provides information on the biological properties of small molecules. You can use the structure search facility to find the properties and structure of many chemicals. It is a component of the US National Institutes of Health's (NIH) Molecular Libraries Roadmap Initiative.

Argonne National Laboratory hosts the WIT (What is There?) database, which contains comparative analysis of sequenced genomes. Argonne also hosts EMP (Enzymes and Metabolic Pathways) and other databases.

Protein Structure and Sequence

The US National Center for Biotechnology Information (NCBI) Entrez Protein Database is compiled from a variety of sources, including SwissProt, PIR, PRF, and the Protein Data Bank (PDB). A copy of the PDB hosted by the Research Collaboratory for Structural Bioinformatics (RCSB) can be found at www.rcsb.org/pdb. The latest additions to the PDB can also be browsed and downloaded at www.rcsb.org/pdb/smartSubquery.do?smartSearchSubtype=LastLoadQuery. The Basic Local Alignment Search Tool (BLAST) is a program that compares nucleotide or protein sequences to sequences in these and other similar databases.

The Swiss Institute of Bioinformatics (SIB) hosts the ExPASy (Expert Protein Analysis System) proteomics server, which includes a protein knowledge base and tools.

The Sanger Institute maintains the Protein Family (Pfam) database at www.sanger.ac.uk/Software/Pfam. Seventy-four percent of protein sequences have at least one match in Pfam.

Genetics

Completed in 2003, the Human Genome Project required many advances in computing. Among the goals of the project that relate directly to computing are storing the DNA sequence information in databases openly accessible from the Internet and improving tools for data analysis. The human genome can be browsed with the Human Chromosome Launchpad.

The GenBank database hosted by the NCBI is a DNA sequence database with genetic data from humans and other organisms. GenBank can be most easily browsed with MapViewer. ENTREZ is the NCBI's search engine, which searches GenBank and other NCBI databases. The NCBI has a number of tools available for analysis, including BLAST.
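To give a feel for what "comparing sequences" means in a tool such as BLAST, here is a minimal sketch of local alignment using the classic Smith-Waterman dynamic programming method. This is not BLAST's own algorithm: BLAST uses faster heuristics to approximate this kind of result over whole databases. The function name and the scoring values (match, mismatch, gap) are assumptions chosen for illustration only.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for the best local alignment.

    Illustrative scoring only: real tools use substitution matrices
    and more elaborate gap penalties.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the score matrix; a cell never drops below zero, which is
    # what makes the alignment local rather than global.
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + step,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        step = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + step:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    # Two made-up sequences that share the fragment ACGT.
    print(smith_waterman("TTACGTAA", "GGACGTCC"))
```

The exhaustive matrix fill costs time proportional to the product of the two sequence lengths, which is why database-scale tools rely on heuristics instead.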
The European Bioinformatics Institute has similar genetic databases and tools for working with the data.

The Institute for Genomic Research (TIGR) hosts several genome databases, including the Expressed Gene Anatomy Database (EGAD) at www.tigr.org/tdb/egad/egad.shtml.

The Kyoto Encyclopedia of Genes and Genomes (KEGG) is another genetic information database. It also has pathway, ligand, and drug information, and offers a web services API for accessing the database. Genome.net is a bioinformatics gateway hosted by the Bioinformatics Center at the Institute of Chemical Research, Kyoto University. It includes KEGG and other bioinformatics databases.

The Weizmann Institute of Science GeneCards database at www.genecards.org is an integrated database of human genes that includes genomic, proteomic, and transcriptomic information.

Tools

EMBOSS is an open source tool for working with data from GenBank.

GrailEXP, developed by the Genome Analysis and System Modeling Group of the Life Sciences Division of Oak Ridge National Laboratory, is an experimental gene discovery suite that can be used to predict gene locations in genome data. Oak Ridge National Laboratory has also developed PROSPECT (PROtein Structure Prediction and Evaluation Computer Toolkit), a protein structure prediction system.

GENSCAN is a gene finding program developed by Chris Burge and Samuel Karlin. A web interface to the program is available at genes.mit.edu/GENSCAN.html.

PROCRUSTES is a gene recognition program that uses spliced alignment to explore all possible exon assemblies within DNA sequences.

GeneWise is a program that searches for genes by comparing a protein sequence to a genomic DNA sequence, allowing for introns and frameshifting errors. A web interface for the program is hosted by the European Bioinformatics Institute at www.ebi.ac.uk/Wise2/.

Biojava is an open source project initiated by Great Britain's Sanger Institute and hosted by the Open Bioinformatics Foundation. It focusses on genetic analysis. The introductory article BioJava -- Java Technology Powers Toolkit for Deciphering Genomic Codes describes the project.

Interesting Projects and Organizations in Medical Computing Research

There are computer systems that aid medical research, and there is also research into medical computer systems themselves. Since the goal of both is the same, to advance our ability to help people live longer and healthier lives, I will discuss both. Researchers are trying to achieve many different things, and these are two of the hottest areas in science and technology at present. There is some exciting work in this area, but the best I can hope to do is to sample what is out there and gather people's opinions on it and on the barriers that researchers face.

World Community Grid

World Community Grid (www.worldcommunitygrid.org) is a software system with a lofty goal. I have it installed on all three of my systems (two at work and one at home) and run it night and day. It uses a grid of computers to solve problems that are very computationally expensive. The problem that it is working on now is FIGHTAIDS@HOME, which analyses how potential drug molecules fit into the HIV protease. According to the project, the best candidates will be lab tested. The software executing in the grid at present was written by the Molecular Graphics Laboratory at the Scripps Research Institute. The grid software was made and is operated by IBM along with a very worthy collection of partners listed on the site. The World Community Grid has also run the computations for the Human Proteome Folding Project.
BrainMaps.org is an interactive high-resolution digital brain atlas and virtual microscope. It features scanned images of serial sections of both primate and non-primate brains, which are integrated with a database and a Flash interactive user interface.
Neuroscientific.net is a portal for neurosciences, especially those relating to bioinformatics.
The U.S. Department of Energy Office of Science Genomes to Life project studies the proteins encoded by the genomes of different organisms to explore natural capabilities in microbes.
The Human Physiology in Space Outline project is managed by the National Space Biomedical Research Institute to measure the effects of life in space on physiology.
The W3C Semantic Web Health
Care and Life Sciences Interest Group aims to improve
collaboration, research and development, and innovation adoption in the
health care and life science industries.
There are a number of eXtensible Markup Language (XML) projects underway. These are essential for the interchange of data between groups of people and software systems. Some of these are listed at xml.com. There are a number of protein databases in existence, some of which are listed at the European Bioinformatics Institute's web site. One project is the Protein eXtensible Markup Language (PROXIML). Another is the Molecular Interaction XML format, run by the Proteomics Standards Initiative.
The National Center for
Biomedical Ontology is a consortium of leading biologists,
clinicians, informaticians, and ontologists who develop innovative
technology and methods that allow scientists to create, disseminate,
and manage biomedical information and knowledge in machine-processable
form. They sponsor the Open
Biomedical Ontologies SourceForge project, which is a focal point
for modelling information for shared use across different biological
and medical domains. This includes the OBO Ontology Browser,
which lists a number of different vocabularies, including one for
human developmental anatomy and another for human disease.
Systems Biology Markup Language
(SBML) is a computer-readable format for representing models of
biochemical reaction networks. The current stable version, SBML Level 2, describes
structures and facilities for model definitions using XML Schema.
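As a rough illustration of what "computer-readable" means here, the sketch below reads a tiny SBML-style document with Python's standard XML parser and lists its species and reactions. The toy model, the function name, and the choice of the Level 2 Version 1 namespace are assumptions made for this example. The document is a simplified fragment, not a complete or validated SBML model, and real projects would normally use a dedicated SBML library instead.

```python
import xml.etree.ElementTree as ET

# Namespace assumed for this sketch (SBML Level 2 Version 1).
SBML_NS = "http://www.sbml.org/sbml/level2"

# A made-up toy model: one reaction converting S1 into S2.
EXAMPLE_MODEL = """<sbml xmlns="http://www.sbml.org/sbml/level2" level="2" version="1">
  <model id="toy_model">
    <listOfCompartments>
      <compartment id="cell"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="S1" compartment="cell" initialAmount="10"/>
      <species id="S2" compartment="cell" initialAmount="0"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="R1" reversible="false">
        <listOfReactants><speciesReference species="S1"/></listOfReactants>
        <listOfProducts><speciesReference species="S2"/></listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""


def summarize(sbml_text):
    """Return (model_id, species_ids, {reaction_id: (reactants, products)})."""
    ns = {"s": SBML_NS}
    root = ET.fromstring(sbml_text)
    model = root.find("s:model", ns)

    species = [sp.get("id")
               for sp in model.findall("s:listOfSpecies/s:species", ns)]

    reactions = {}
    for rx in model.findall("s:listOfReactions/s:reaction", ns):
        reactants = [r.get("species") for r in
                     rx.findall("s:listOfReactants/s:speciesReference", ns)]
        products = [p.get("species") for p in
                    rx.findall("s:listOfProducts/s:speciesReference", ns)]
        reactions[rx.get("id")] = (reactants, products)

    return model.get("id"), species, reactions


if __name__ == "__main__":
    print(summarize(EXAMPLE_MODEL))
```

Because the network is expressed as plain XML elements, any tool that can parse XML can recover the same species and reaction lists, which is the point of a shared exchange format.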
The Physiome Project seeks to describe the human organism quantitatively to understand its physiology and pathophysiology using a collection of models. Much of this work focusses on biophysics.
BioPAX is a collaborative effort to create a data exchange format for biological pathway data. The BioPAX group first met in 2002. The project uses Resource Description Framework (RDF), which builds on URI and XML technologies. RDF specifications are developed by the World Wide Web Consortium (W3C) Semantic Web Group. The W3C's RDF home is at www.w3.org/RDF.
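RDF describes data as subject-predicate-object triples whose parts are identified by URIs. The sketch below models that idea with plain Python tuples and a simple pattern-matching query. The `http://example.org/pathway#` namespace and the `Pathway`, `Reaction`, `hasStep`, and `hasParticipant` names are hypothetical and are not taken from the BioPAX vocabulary. Only the `rdf:type` URI is a real W3C identifier.

```python
# The standard RDF "type" predicate.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical namespace, invented for this sketch.
EX = "http://example.org/pathway#"

# A tiny graph: each entry is a (subject, predicate, object) triple.
TRIPLES = {
    (EX + "glycolysis", RDF_TYPE, EX + "Pathway"),
    (EX + "glycolysis", EX + "hasStep", EX + "hexokinase_reaction"),
    (EX + "hexokinase_reaction", RDF_TYPE, EX + "Reaction"),
    (EX + "hexokinase_reaction", EX + "hasParticipant", EX + "glucose"),
}


def match(graph, s=None, p=None, o=None):
    """Return the sorted triples matching a pattern; None is a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


if __name__ == "__main__":
    # Which resources are declared to be pathways?
    for subject, _, _ in match(TRIPLES, p=RDF_TYPE, o=EX + "Pathway"):
        print(subject)
    # What are the steps of glycolysis in this toy graph?
    for _, _, step in match(TRIPLES, s=EX + "glycolysis", p=EX + "hasStep"):
        print(step)
```

Because every statement has the same three-part shape, graphs from different sources can be merged by taking the union of their triples, which is part of what makes RDF attractive for data exchange.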
Cell Markup Language (CellML) is an XML language for modelling cells, in particular for storing and exchanging computer-based mathematical models. CellML is being developed by the Bioengineering Institute at the University of Auckland and affiliated research groups.
The University of Washington Genome Center provides links to projects within the University, in addition to publications, technology, and other resources. There is also a short tutorial on Analyzing Genome Sequences.
Bioinformatics.net
is a hub that includes links to bioinformatics web sites, companies,
tools, news articles, and forums.
The site www.bio.net/bionet/
is a gateway for biology-related newsgroups. Many of these
include RSS feeds, and there is a BIO-SOFTWARE list of information about software
for biology at www.bio.net/bionet/mm/bio-soft/.