Figure 1.11
Chromosomal walking from flanking genetic markers to the gene
responsible for cystic fibrosis. The distance covered totaled
more than 1 million DNA bases.
into and out of the cell (Figure 1.12A). By analogy, it was even possible to infer a likely three-dimensional shape for the CF protein (Figure 1.12B). In this way, computer-based sequence analysis shed substantial light on the structure and function of this important disease gene.
With the recent advent of gene therapy (the ability to use a virus as a shuttle to deliver a working copy of a gene into cells carrying a defective version), clinical trials have been started to try to cure the disease in the lung cells of CF patients. The path from the initial discovery of the gene to potential therapies has been stunningly short in this case.
With the identification of the CF gene as well as a number of other successes, it has become clear that molecular genetics has developed a powerful general paradigm that can be applied to many inherited diseases and will have a profound impact on our understanding of human health. Unfortunately, the paradigm involves many tedious laboratory steps: genetic mapping (finding a polymorphism closely linked to the disease gene), physical mapping (isolating the consecutive fragments of DNA along the chromosome), and DNA sequencing (typically performed in pieces of only 300 to 500 letters at a time). It would be inefficient to repeat these steps for each of the more than 4,000 genetic traits and diseases already known. To accelerate progress, molecular geneticists have seen the value of building infrastructure (a common set of maps, tools, and information) that can be applied to all genetic problems. This recognition led to the creation of the Human Genome Project (National Research Council, 1988), an international effort to analyze the structure of the human genome (as well as the genomes of certain key experimental model systems, such as E. coli, yeast, nematodes, fruit flies, and mice).
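To give a sense of scale for the sequencing step, the following is a minimal Python sketch that counts how many reads of 300 to 500 letters would be needed, at the very least, to cover the region of roughly 1 million bases walked in Figure 1.11. The function name `min_reads` is illustrative, and the calculation assumes reads that abut perfectly with no overlap, which real sequencing does not achieve, so the true number would be higher.

```python
def min_reads(region_length: int, read_length: int) -> int:
    """Smallest number of non-overlapping reads that cover a region.

    Uses integer ceiling division so a partial final read still counts.
    """
    if region_length < 0 or read_length <= 0:
        raise ValueError("region_length must be >= 0 and read_length > 0")
    return (region_length + read_length - 1) // read_length


if __name__ == "__main__":
    region = 1_000_000  # bases; lower bound for the CF chromosomal walk
    for read_length in (300, 500):  # typical read lengths quoted in the text
        print(f"{read_length}-letter reads: at least {min_reads(region, read_length)}")
```

Even under this idealized assumption, a single disease region requires thousands of separate sequencing reactions, which illustrates why repeating the process for every known trait would be inefficient.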
Because most molecular biological methods are applicable only to small fragments of DNA, it is not practical to sequence the human genome by simply starting at one end and proceeding sequentially. Moreover, because the current cost of sequencing is about $1 per base, sequencing the 3 × 10⁹ bases of the human chromosomes by conventional methods would cost on the order of $3 billion. Instead, it is more sensible to construct maps of increasing resolution and to develop more efficient sequencing technology. The current goals of the Human Genome Project include development of the following tools:
· Genetic maps. The goal is to produce a genetic map showing the location of 5,000 polymorphisms that can be used to trace inheritance of diseases in families. As of this writing, the goal is nearly complete.
· Physical maps. The goal is to produce a collection of overlapping pieces of DNA that cover all the human chromosomes. This goal is not completed yet but should be by 1996.
· DNA sequence. The ultimate goal is to sequence the entire genome, but the intermediate steps include sequencing particular regions, generating more efficient and automated technology, and developing better analytical methods for handling DNA information.
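The physical-map goal above can be pictured as an interval-covering problem: each cloned piece of DNA spans a stretch of the chromosome, and the map is complete when no position is left uncovered. The Python sketch below is only an illustration of that idea, not a method used by the project. The function `find_gaps`, the half-open coordinate convention, and the example clone positions are all assumptions introduced here.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) in base positions


def find_gaps(clones: List[Interval], region_length: int) -> List[Interval]:
    """Return the stretches of [0, region_length) not covered by any clone."""
    gaps: List[Interval] = []
    covered_to = 0  # everything left of this position is already covered
    for start, end in sorted(clones):
        if start > covered_to:
            # The next clone begins beyond the covered stretch: record a gap.
            gaps.append((covered_to, min(start, region_length)))
        covered_to = max(covered_to, end)
        if covered_to >= region_length:
            break
    if covered_to < region_length:
        gaps.append((covered_to, region_length))
    return gaps


if __name__ == "__main__":
    # Hypothetical clone positions on a 1,200-base toy region.
    clones = [(0, 400), (300, 700), (900, 1200)]
    print(find_gaps(clones, 1200))
```

In this toy example the first two clones overlap and the third stands apart, so the map has one gap still to be closed between positions 700 and 900.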
With the vast quantities of information being generated, the Human Genome Project is one of the driving forces behind the expanding role

Figure 1.12
(A) The protein sequence of the cystic fibrosis gene showed striking similarities
to a variety of proteins known to transport molecules across cell membranes.
(B) Based on these similarities, it was possible to construct a basic molecular
model of the architecture of the CF protein. Reprinted, by permission, from
Riordan et al. (1989). Copyright © 1989 by the American Association for
the Advancement of Science.