Chapter 10

After studying this chapter, you should be able to:

• Describe the major features of genomics, proteomics, and bioinformatics.
• Summarize the principal features and medical relevance of the ENCODE project.
• Describe the functions served by HapMap, Entrez Gene, BLAST, and the dbGaP databases.
• Describe the major features of computer-aided drug design and discovery.
• Describe possible future applications of computational models of individual pathways and pathway networks.
• Outline the possible medical utility of “virtual cells.”

The first scientific models of pathogenesis, such as Louis Pasteur's seminal germ theory of disease, were binary in nature: each disease possessed a single, definable causal agent. Malaria was caused by the protozoan Plasmodium falciparum, tuberculosis by the bacterium Mycobacterium tuberculosis, sickle cell disease by a mutation in a gene encoding one of the subunits of hemoglobin, poliomyelitis by poliovirus, and scurvy by a deficiency in ascorbic acid. The strategy for treating or preventing disease thus could be reduced to a straightforward process of tracing the causal agent, and then devising some means of eliminating it, neutralizing its effects, or blocking its route of transmission. This approach has been successfully employed to understand and treat a wide range of infectious and genetic diseases. However, it has become clear that the determinants of many pathologies, including cancer, coronary heart disease, type 2 diabetes, and Alzheimer's disease, are multifactorial in nature. Rather than having a specific causal agent or agents whose presence is both necessary and sufficient, the appearance and progression of these diseases reflect the complex interplay between each individual's genetic makeup, other inherited or epigenetic factors, and environmental factors such as diet, lifestyle, toxins, viruses, or bacteria.

The challenge posed by multifactorial diseases demands a dramatic increase in the breadth and depth of our knowledge of living organisms, one that matches their sophistication and complexity. We must identify the many as yet unknown proteins encoded within the genomes of humans and of the organisms with which they interact, and determine their cellular functions and interactions. We must be able to trace the factors, both external and internal, that compromise human health and well-being by analyzing the impact of dietary, genetic, and environmental factors across entire communities or populations. The sheer mass of information that must be processed lies well beyond the ability of the human mind to review and analyze unaided. To understand, as completely and comprehensively as possible, the molecular mechanisms that underlie the behavior of living organisms, the manner in which perturbations can lead to disease or dysfunction, and how such perturbing factors spread throughout a population, biomedical scientists have turned to sophisticated computational tools to collect and evaluate biologic information on a mass scale.

Physicians and scientists have long understood that the genome, the complete complement of genetic information of a living organism, represents a rich source of information concerning topics ranging from basic metabolism to evolution to aging. However, the massive size of the human genome, 3 × 10...

