Firepower in the Lab: Automation in the Fight Against Infectious Diseases and Bioterrorism (2001)

Chapter: Integration of New Technologies in the Future of the Biological Sciences

Previous Chapter: Biological Warfare Scenarios
Suggested Citation: "Integration of New Technologies in the Future of the Biological Sciences." Scott P. Layne, et al. 2001. Firepower in the Lab: Automation in the Fight Against Infectious Diseases and Bioterrorism. Washington, DC: Joseph Henry Press. doi: 10.17226/9749.

PART IV

FURTHER APPLICATIONS AND TECHNOLOGIES


20

Integration of New Technologies in the Future of the Biological Sciences

David J. Galas and T. Gregory Dewey

INTRODUCTION

Integration of technologies into the life sciences, which is critical to their advance, will take two forms. The first is the integration of existing technologies to produce more powerful tools for discovery. The second is the integration of technological change with discovery as an integral part of the process of scientific advancement. This paper discusses and illustrates both of these meanings, taking our lead from the biology of the past few decades, which have seen science progress from the foundations of molecular biology to the high-throughput parallel data acquisition of genomic and other data that characterize the biology of the past few years. The applications that are focused on the understanding and control of infectious diseases and biological aggression are specific instances of a much broader area of application of these ideas. The principal conclusions of this discussion are that advances in physical technologies, in particular automation and miniaturization, and the increasingly interdisciplinary nature of technological development are vital to the future of the life sciences. This is largely because the inherent complexity of biological systems requires a massive increase in an already high rate of data acquisition. Analysis of these data and their synthesis into new scientific knowledge will dominate the foreseeable future of the biological sciences. The essential scientific and technological activity required to meet this goal of data assimilation is the development of a modeling capability for complex biological systems that greatly exceeds our present capabilities. We discuss prospects and constraints on the development of this interface
between theory and experiment in the biological sciences. Finally, since the interfaces between disciplines and the combinations of scientific and engineering points of view in the biological sciences are new, the cross-disciplinary education and training of a new breed of professionals in the applied life sciences are crucial to achieving this future.

DISCOVERY AND TECHNOLOGICAL CHANGE

The life sciences are unique among the sciences in several ways. One of these is that the history of the biological sciences has been almost entirely driven by the invention of new technologies, from van Leeuwenhoek's first microscope, which opened the world of the very small to human observation, to the invention of DNA cloning technology in the 1970s, which opened the world of the biomacromolecule to human experimentation. Access to realms of the hitherto unknown provided by each successive invention has marked the advance of biological discovery. Discovery has marched relentlessly in the footsteps of technological innovation. The pattern has been a repeated cycle of new technologies opening the door to discovery and then new discovery in its turn enabling or motivating the development of newer technologies. The history of this inexorable technology-discovery cycle in biology is a long one that includes a wide range of advances. Perhaps most relevant to our subject are the technological inventions that can be credited with creating modern biology in the last half of this century. Our list of these is an arguable one by all accounts, but many of our entries would likely be included by most.

First, most would agree that modern biology has been ushered in by that bastard science born of chemistry, physics, and biology called molecular biology—a classically feeble term that attempts to describe the field. It was roundly criticized at its inception by eminent scientists such as biochemist Erwin Chargaff as a superficial integration of sciences and a shallow discipline, one not sufficiently biochemical or biological to be taken seriously (Cairns et al., 1966; Watson, 1968). Yet it proved to be the key to unlocking the advances in biological sciences that marked the last half of the twentieth century. The list must include Cohen's and Boyer's methods for cloning DNA molecules (Cohen et al., 1973; Boyer, 1971) and discovery of the tools that make such cloning possible, restriction enzymes (Nathans and Smith, 1975) and DNA ligase; the sequence-specific hybridization of DNA sequences as a way of identifying specific sequences (Southern, 1974); monoclonal antibody methods (Köhler and Milstein, 1975); methods for deciphering the three-dimensional structure of macromolecules; invention of DNA sequencing methods and the subsequent technical elevation of one of these methods to the automated DNA sequencing machine (Sanger and Coulson, 1975; Maxam and Gilbert, 1977);
methods for direct chemical synthesis of DNA (Itakura et al., 1984); and the method for the amplification of nucleic acid sequences by more than a millionfold—the polymerase chain reaction (Arnheim and Erlich, 1992).

These technologies and their descendants have enabled the partially completed dissection of the molecular components and interaction of the living cell and continue to transform modern biology. It is becoming clear, however, that we are on the brink of another revolution that is characterized by the convergence of several technical and scientific developments. We argue here that the present change will have as great an impact as the change that brought forth molecular biology almost 50 years ago.

FUTURE REALM OF TECHNOLOGY AND DISCOVERY

At the threshold of the twenty-first century, the pace of technological innovation in the life sciences seems to be increasing and promises to carry the science of biology into a new realm in a very few years. In the past few years a technological phenomenon has been at work in another realm: computer and communications technologies have transformed many areas of everyday life and business. Inevitably, the life sciences have also been affected. The combination of high-throughput data acquisition, epitomized perhaps by the current final push to sequence the human genome, and the relentless surge of computing and data storage power is about to bring forth a new realm in the science of biology. To underestimate the effect of what some would consider a protean sector on the fundamental understanding of biology would be a grave error.

The realm we are thinking of is one that can only be projected from the promise of developments in the present state of the science and the technology. This future is very different from even the past decade in biology and will be bristling with new technologies for measurement and manipulation, flooded with data and information about complex systems operating in biological cells and organisms, and empowered with methods for directly modifying cells and organisms. But most of all this data-rich realm of molecular detail has become completely quantitative and is now described in terms of mathematical equations and complex computational models. The genome sequencing projects of the past have provided the essential raw data that have finally enabled the compiling of accurate molecular “parts lists” for living cells that are the central information resources for biological research.

The methods of acquiring information and data are now very advanced and continue to develop rapidly. They have become fully automated and high throughput in character, with many tasks being carried out simultaneously. Much of the acceleration and enhancement of data acquisition rates in these new technologies have been achieved through
miniaturization to micron scales and a high degree of intelligent automation—experimental protocols in which many of the measurement decisions, sample tracking, and data storage and analysis are now entirely computer directed.

Needless to say, laboratory instrumentation has finally been organized into local networks of flexibly controlled tools that can be guided by a central interface that aids researchers with intelligent assessments of the tasks and commands for a project. Laboratory automation is now very sophisticated and well advanced. Data flow directly to the software agents for analysis and perusal and final presentation to the experimenter. The deciphering of complex biological systems— understanding how they work—has now taken on an entirely new character, unexpected to some. Hypotheses about how a system works are no longer described by biologists in words (e.g., protein A represses the function and expression of the gene for protein B, or protein A interacts with receptor B and stimulates the cell to divide), but rather they must be embodied in mathematical models defining and integrating the interactions and dynamics of the whole system in a quantitative and prediction-generating fashion.

This transition to mathematical models has occurred largely because of the complete inadequacy of the descriptive mode in handling complex systems that have multiple nonlinear interactions and responses. The accumulation of overwhelming evidence of the inherently complex nature of the systems has forced entirely new methods of analysis. In the year 2000 we are just beginning to face the fact that the process of constructing large quantitative models from large complex datasets was something we did not know how to do. Conversely, we did not know how to critically compare large datasets with existing models. We had no idea how to obtain values for all the molecular parameters or how to measure the rate constants and interaction strengths that went into the models. We also did not know how accurate they had to be to produce a useful model. Somehow these problems were solved with the usual combination of surprises, disappointments, and triumphs.

The transition of molecular biology from molecular natural history—descriptive lists of components and classifications—to real biological science with predictive models of behavior has been largely completed. Researchers have now come to grips with the hard lessons learned from the 1970s through the 1990s, a period during which most experimental studies were interpreted in a strictly descriptive mode. This necessary phase of scientific evolution has passed. Predictions were then made on the basis of these qualitative, poorly formed models, which often resembled a conceptual Rube Goldberg machine. These linear chains of causation were weakened by many assumptions and often had poor or limited predictive value, giving predictions that in some cases were found to be different from expectations—even the opposite of the experimental findings. Since the molecular systems in question likely have significant nonlinear characteristics, this was not a surprising outcome.

There is now an enormous spectrum of models and classes of models that are based on cooperative collective phenomena. Although these models treat the system as a whole, with the new computational power they can be relatively easily manipulated, used on past data, and tested and validated using the software available. In short, the transition of biology into a mature science, with a theoretical component to guide its development and a robust experimental infrastructure to provide data, has finally occurred, and the science and its applications are thriving.

The lessons of the past decades in constructing and maintaining massive and complex databases have also been incorporated into the present network of interconnected and relatively transparent sets of databases distributed throughout the reach of computer networks. Publicly and privately funded databases that are centrally assembled and maintained, dispersed private researcher-assembled special databases, and company databases, open and closed, are all now connected and accessible. These databases can be seamlessly connected in complex queries and transposed from one operating system to another (Liu et al., 2000).

Principal among these lessons is that the way data or information is displayed and the way the data structures within the databases are defined have a strong influence on how the data can and will be used by scientists. This, in turn, influences the inferences that can be made from the data. To perceive all the regularities and see the patterns and trends, however subtle, the data must be appropriately displayed so that the right questions are asked. Thus, there is a need for flexible and easily reconfigured data structures that can be redesigned on the fly by users to fit their specific viewpoints and new evolving models. Such data structures are in the ascendant.

We have gone well beyond relational databases and are now in an era of higher-order, object-oriented databases. These databases store information in the form of “objects” as well as functions specific to the databases. The functions are tailored to the potential applications of the objects, and the entire database has qualities resembling higher-level simulations of the previous era. The databases, in fact, mimic both the components and the functionality of the biological subsystems they represent. With the new functionality of these databases, a variety of problems are easily handled; even the aggravating and persistent problem of inconsistent data is mitigated by “self-proofing” software that seeks out ever more subtle inconsistencies between datasets and calls them to the attention of the researcher. Database languages and the logic of simulations have converged to a large extent.
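As a concrete illustration of an “object” that bundles data with its own functions and a self-proofing check, consider the following minimal sketch. All class and field names here are our own hypothetical inventions, not features of any real database system.

```python
# Illustrative sketch of an object-database record that stores data together
# with behavior, plus a simple "self-proofing" consistency check.
# All names are hypothetical, invented for this example only.

class GeneRecord:
    DNA_BASES = {"A", "C", "G", "T"}

    def __init__(self, name, sequence, reported_length):
        self.name = name
        self.sequence = sequence.upper()
        self.reported_length = reported_length

    def transcribe(self):
        # Behavior stored with the data: DNA coding strand -> mRNA.
        return self.sequence.replace("T", "U")

    def inconsistencies(self):
        # "Self-proofing": flag disagreements among the stored fields.
        problems = []
        if len(self.sequence) != self.reported_length:
            problems.append("length field disagrees with sequence")
        if any(base not in self.DNA_BASES for base in self.sequence):
            problems.append("sequence contains non-DNA characters")
        return problems

rec = GeneRecord("hypothetical_gene", "ATGGCC", reported_length=7)
print(rec.transcribe())       # the mRNA form of the stored sequence
print(rec.inconsistencies())  # the deliberately wrong length is flagged
```

A query against such a database can invoke the record's own methods, so the distinction between retrieving data and running a small simulation begins to blur, which is the convergence the text describes.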


It is now only very rarely that one can capture a precise and useful hypothesis sufficiently in a brief description. Most working models of the cell and organism have become complex computer codes whose simulations are compared with experiments to refine parameters such as binding constants and reaction rates, to define new interactions or new components, or to test the response of the system to a new set of stimuli. These codes are further tested by experimentally directed modification of the cellular components, through genomic modifications or highly specific chemical modifications.

These biocomputing codes are similar to the large physical codes that model the behavior of the atmosphere and oceans of our planet and provide a strong, if somewhat short-range, predictive capability. While it took more than 30 years to move from the first crude climate modeling codes to the first realistic and predictive codes, development of the biological equivalents has gone much more quickly, partly as a result of this experience and partly from the much more powerful computer technology. The complexity of these codes is now built up using recursive object-oriented designs. We have learned which subsystems in an organism can be modularized in this manner. Higher-order organizations are then constructed by assembly of computational modules. This allows a stepwise self-consistent increase in complexity of the collective model. This fits well with real biological systems, metaphorically recapitulating the possible evolutionary pathways of the biological systems.

It is now possible to assemble sets of genes from a variety of organisms into actual cellular enclosures (stocked with the essential subcellular components), to observe the operation of these cellular (and multicellular) systems, and to compare their functioning with that predicted by the appropriate computer code. These “cell-sims” are teaching us new things every day about the emergent behavior inherent in a sufficiently complex set of genes and about the complex computer codes we now use to simulate them.

Real engineering in biology has now emerged as a powerful computationally driven technology, and many of the lessons of the past transitions of disciplines into the “engineering phase” are being applied to biology. In this realm of the future we have been able to devise completely unexpected mechanisms of cellular response by observing this kind of collective behavior and modifying it according to the predictions of an iteratively modified computer code. We are now beginning to devise methods for efficiently modifying these modeling codes so that precisely defined cellular behavior can be predicted and built into the cells. Engineering at this level has begun to resemble in some ways the computational design of integrated circuits at the end of the twentieth century. It was possible in the latter part of the century to redirect the designer's
attention from the individual transistor, diode, or gate to design on a larger scale of thousands of these circuit elements at once using computer tools. The development of “cell-sims” has a great advantage over integrated circuits. They have the potential to evolve. High-throughput screening coupled with directed evolution allows a massive exploration of functional motifs on a molecular and systems level. Molecular evolution in the test tube, initially explored in the 1980s (Beaudry and Joyce, 1992), has become a powerful molecular technology at the boundary of synthetic chemistry and computing. Molecular structures can be evolved in vitro in a few hours to meet the requirements of very complex criteria of properties and behavior.
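The mutate-screen-select cycle of directed evolution can be illustrated with a toy computational analogue. The target sequence, fitness function, and all parameters below are invented for illustration; real in vitro selections screen molecular function, not string matches.

```python
# Toy sketch of a directed-evolution loop (mutate, screen, select),
# in the spirit of in vitro molecular evolution. The target sequence,
# fitness function, and parameters are hypothetical, for illustration.
import random

random.seed(0)
BASES = "ACGT"
TARGET = "ATGCGTACGTTA"  # stand-in for an "optimal" molecule

def fitness(seq):
    # Screening step: score a variant by positions matching the target.
    return sum(a == b for a, b in zip(seq, TARGET))

def mutate(seq, rate=0.1):
    # Error-prone copying: each base may be replaced at random.
    return "".join(random.choice(BASES) if random.random() < rate else b
                   for b in seq)

def evolve(generations=50, pop_size=100, keep=10):
    # Start from a random population, then iterate selection rounds.
    pop = ["".join(random.choice(BASES) for _ in TARGET)
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:keep]                   # select the best variants
        pop = [mutate(random.choice(parents))  # amplify with errors
               for _ in range(pop_size)]
    return max(pop, key=fitness)

best = evolve()
print(best, fitness(best))
```

Even this crude loop climbs rapidly toward high-fitness sequences, hinting at why coupling high-throughput screening to iterated variation explores functional space so efficiently.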

New professions have arisen in the area of design, construction, use, and maintenance of these computer code sims, analogous to the chip designers and developers of the computer industry at the end of the twentieth century. A major difference between integrated circuits and cellular design is that the fixes to errors in integrated circuit design and manufacture that could be (and usually were) made by firmware and software changes are not available to the cellular system designers. Simulations of the cellular design are now possible because of new high-speed parallel computers (gigaflop/microsecond), so the design and testing cycle can be closed in a relatively short period.

Computational power, which is now finally matched to the complexity of biological systems, has increased sufficiently to also allow simulations of integrated circuits for more than just a few microseconds, so that circuit design errors can now largely be corrected before manufacture and firmware and software fixes are less prevalent. Biology has entered a new mature phase because of the convergence of high-throughput data acquisition and computing technology. Engineering in biology has finally emerged as a sophisticated technical enterprise. The era when computing power finally reached the threshold necessary to compute biological systems of real complexity is now recognized as a major milestone in human scientific capability. In recognition and importance, this revolution has eclipsed the sequencing of the human genome, completed in 2000, by orders of magnitude.

Some of the things that are being done with engineered cells would have been unexpected in 2000. It was anticipated that the use of specifically engineered cells would find wide application in medicine to correct a defect or repair a damaged organ or joint. These applications have taken an unexpected turn. Medical scientists are now designing organs that were not produced by evolution and that do not exist in normal humans to correct a pathology, deflect a degeneration, or enhance a biochemical capacity of the body.

Other broader uses were not anticipated at all. For example, we have
learned how to control the processes of biodirected crystallization that many organisms use to make exquisitely precise nanostructures of silicon, calcium carbonate, and composite materials of these with proteins (Cha et al., 2000). The genetically controlled architectures of the structures of oceanic coccolithophorids and of marine diatoms and sponges are now controllable at the DNA level, and engineered organisms can produce precise and well-defined structures for use in a variety of micro- and nanotechnological, mechanical, and electronic applications.

Another application derives from the Turing machine-like design for a DNA computer proposed in 1999 (Liu et al., 2000). The device is now engineered into an artificial cell that signals the input and output to its internal DNA computer through the cell's surface receptors. In a very short time the advance of the life sciences has been transformed into advanced applications well beyond the medical or agricultural. This future realm of converged technologies and sciences has brought with it a powerful reminder of both the immense power of understanding and the responsibility for societal control of its applications.

BIOLOGY AS AN INFORMATION SCIENCE

The above extrapolation to a brave and very new world of the life sciences is based on only slightly more advanced technologies than we have today, albeit with some significant license and no assurances of accuracy. However, these extrapolated technologies, when combined with some fundamental insights of a new biology, can clearly make an enormous difference. The realm is actually very plausible and is one that may not be as far away as we might imagine. One basic idea about biology that underlies our thinking is that it is essentially an information science. This statement should not be taken in the trivial sense by which any science can be described as an information science in that descriptions of phenomena and prediction all involve handling information, but rather it should be considered as an assertion of the fundamentally different properties of biological systems. We would argue that it is fundamental to the way in which we should think about biological problems as distinct from, for example, physics, chemistry, geophysics, astronomy, and cosmology. A full discussion of this assertion is beyond the scope of this paper and will be addressed elsewhere, but because there are important lessons to be derived from the conclusions, the argument will be summarized here. The main arguments for the assertion that biology is an information science are as follows:

  • All biological processes can be described as information transactions.

    • Inheritance, recombination, transcription, translation, signaling, etc., all involve specific processing and transfer of encoded information.

  • Biological entities are often temporally stable even though their physical components turn over. Organisms are transient and species are longer lived. Similarly, the molecular constituents are transient, yet the information they encode is long lived.

    • The information encoded in the genomes of the members of a species population is the essential definition of that species, and the information changes significantly only as evolution produces new or different species.

    • The fidelity of biological processes is linked to the lifetime of the information-carrying species. For example, genes are long lived, so replication must be extremely accurate. Messenger RNA carries information for briefer periods and in multiple copies, so transcription need not be quite as accurate. Because of protein turnover, translation is not as stringent in its proofreading capabilities.

  • Evolution changes the information content of genomes.

    • The ways in which variation and selection act to change the information in the genome are at the heart of the evolutionary process—the loss of information through random variation balanced against the addition of information by selection based on phenotype.

  • Biological systems are characterized by an irreducible complexity.

    • The encoded instructions for cellular life involve the direct and indirect interaction of a large number of molecular components behaving as a complex system with many emergent properties.

When presenting biology as an information science, one must ask: Is this just a new expression of old ideas or are there underlying principles to be asserted from this new perspective? It has been argued that biology is just an extension to a higher level of organization of physics and chemistry, albeit in systems far from equilibrium. So where does information science come in? Biological systems represent complex, nonlinear, and far-from-equilibrium systems. They also represent systems that are stably replicated over long stretches of time from encoded information.

Until recently, such systems have been deemed too “messy” for proper exploration by the mainstream of physics and chemistry. The key to putting order to the wealth of data in the biological sciences is through an understanding of the dynamics of complex systems. One of the crucial characteristics of species of living organisms is the ability to evolve in response to external environmental pressures. At the same time, a distinguishing feature of most organisms is their ability to buffer their internal components and structures against external fluctuations. Biological systems are adaptive yet stable. Any understanding of biology must, by necessity, deal with the stability conditions for complicated nonlinear systems. Currently, these problems are too complex for traditional physicochemical approaches.

To assess the stability of a dynamic system, one can use what are called Lyapunov functions. The Gibbs entropy is an example of a Lyapunov function used to describe the stability of thermodynamic systems. Recent work suggests that for some biological applications the Shannon information can also serve as a Lyapunov function (Lehman et al., 2000). This allows the stability of the evolution of the system to be described in a direct and simple manner. In such cases we do not need to return to thermodynamic arguments but can instead use approaches from information theory and the dynamics of complex systems. This provides a theoretical foundation for presenting biology as an information science. The analysis of such fundamental issues is, perhaps, the biggest challenge in developing a quantitative biology and in understanding what our models mean, what constraints are in effect in biological systems, and how best to shape theory to reflect experiment in biology.
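For readers unfamiliar with the term, the standard textbook definitions (not specific to the cited work) are as follows. For a dynamical system with equilibrium x*, a function V certifies stability when it is minimized at the equilibrium and non-increasing along trajectories; the claim above is that the Shannon information of the relevant distribution can play the role of V for some evolutionary dynamics.

```latex
% Stability of \dot{x} = f(x) at equilibrium x^{*}:
% V is a Lyapunov function if
V(x^{*}) = 0, \qquad V(x) > 0 \ \ \text{for } x \neq x^{*}, \qquad
\frac{dV}{dt} = \nabla V \cdot f(x) \le 0 ,
% and the Shannon information of a distribution p = (p_1, \dots, p_n) is
H(p) = -\sum_{i=1}^{n} p_i \log p_i .
```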

These arguments may seem too theoretical to have practical consequences, but the consequences of the central role of information in the biological sciences are indeed practical and far reaching. They have significant implications for how we approach overcoming the many technical obstacles that remain to achieving the vision of the future of the life sciences described above. They have significant implications as well for the essential bridging of the disciplinary gaps between biology and the other sciences. For example, as an educational tool, the formulation of biology in these terms represents an exciting challenge that could help greatly in building robust permanent bridges between biology and the more mathematically based physical sciences and engineering. These gaps are significant impediments that need to be aggressively closed. Finally, the approach to biology as an information science should have important implications in harnessing the power of the biological sciences for practical applications.

THE DATA AVALANCHE

It has become commonplace to cite the tremendous increase in biological information available today and to point out the increasing rate of its acquisition. This accelerating avalanche is, in fact, a discipline-altering change that deserves acknowledgment. Yet it is also an enormous problem. At the moment, there are more than 100 databases available on the World
Wide Web that hold useful information about the genetic contents and biochemical components and systems of various cell types. More than 25 full microbial genome sequences have been completed and are available in public databases, and probably more than 50 are currently in the process of being fully sequenced. Full genomic sequencing has become a powerful tool in approaching very practical environmental and therapeutic problems related to microorganisms. We expect that in the next few years at least a few hundred complete microbial genomic sequences will be available.

The Human Genome Project provides a powerful illustration of the accelerating rate of information acquisition in the past few years. Before 1993 there were a few million base pairs of human sequence data in the databases (some of them redundant), and the 5-year goals for the project, as set out in 1993 (Collins and Galas, 1993), targeted a rate of 50 million base pairs per year by the end of 1998. By then, more than 100 million base pairs had been entered into the database. The year 2000 is an historic one, as during this year the entire human genome sequence will be essentially (90 percent) complete.

In March 2000 the completed sequence of the Drosophila genome was published (Adams et al., 2000). In April three more human chromosome sequences were completed, bringing the completed count to four, and the remaining sequences are expected by year's end. The millennium turns with the full sequence of our own genome available for our scrutiny and understanding, a truly historic and transforming event for both the basic and applied sciences.

It is well within reasonable limits of projection to suppose that hundreds of microorganism genomes will be fully sequenced each year in the first decade of the new millennium. Indeed, the capacities of the major sequencing centers are such that the equivalent of one full microbial genome project can be completed each day. Genomic sequence is, however, only the very simplest kind of data: digital and one-dimensional. The much more complex data on protein interactions, gene expression levels, cell surface receptor populations, the spectrum of genetic variations in populations, and other subtle biological complexities are the all-important informational fount for the biology of the future. This information will occupy the basic and applied life sciences for decades to come.

NEW TECHNOLOGIES: THE GENOTYPING EXAMPLE

As argued above, the life sciences are unique among the sciences in being driven almost entirely by technological change. The field of human genetics is one of the areas undergoing major shifts now as a result of the recent discoveries of the extent of single nucleotide polymorphisms (SNPs) in the human genome and the development of methods for identifying and scoring (genotyping) these genetic variations. It is useful to describe a specific example of the changes in genotyping technology in order to explore both the role of interdisciplinary science and the scale of change in data acquisition rates now upon us.

We cite here one example of genotyping technology: the use of cleavable small-molecule tags for DNA, developed for genotyping large numbers of samples automatically while simultaneously scoring variations at many sites per sample. This technology is one instance of the rapid acceleration of throughput and automation being seen in many areas of biological technology; other examples include automated enzyme-based assays with fluorescent readouts and various mass spectrometry methods. The next generation of technologies will be cheaper, faster, and more miniaturized than these.

The underlying idea of this mass-tagging technology is that detecting and discriminating among small molecules is an easier problem than detecting and discriminating among large nucleic acid molecules such as DNA fragments. Thus, a tagging strategy is used in which a large number (hundreds) of small molecules are attached via a photocleavable linker to specific DNA oligonucleotides in a scheme that encodes the identity of each of them. The assay, which selects whichever oligonucleotide corresponds to the genetic variant (SNP allele) present at each site, is completed when the tags are cleaved from the DNA with intense light and detected in a mass spectrometer. The genotypes are then reconstructed with a simple computer algorithm, and the data are stored for analysis.
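
The reconstruction step can be sketched as a simple lookup from observed tag masses back to the SNP alleles they encode. The sketch below is purely illustrative: the tag masses, SNP identifiers, and mass tolerance are invented for the example, not taken from the actual system described here.

```python
# Hypothetical sketch of the genotype-reconstruction step. Each SNP site has
# two allele-specific oligonucleotide probes, each carrying a small-molecule
# tag of known mass. After photocleavage, the mass spectrometer reports the
# masses of the tags that survived the allele-selection assay; decoding is a
# lookup from observed mass back to (site, allele).

TAG_TABLE = {
    # tag mass (Da) -> (SNP site, allele encoded by that tag); values invented
    301.1: ("rs0001", "A"),
    315.2: ("rs0001", "G"),
    329.2: ("rs0002", "C"),
    343.3: ("rs0002", "T"),
}

MASS_TOLERANCE = 0.3  # Da; how far an observed peak may sit from a known tag


def decode_genotypes(observed_masses):
    """Map observed tag masses to a genotype call per SNP site."""
    calls = {}  # site -> set of alleles whose tags were detected
    for mass in observed_masses:
        for tag_mass, (site, allele) in TAG_TABLE.items():
            if abs(mass - tag_mass) <= MASS_TOLERANCE:
                calls.setdefault(site, set()).add(allele)
    # One allele detected at a site means homozygous; two means heterozygous.
    return {site: "/".join(sorted(alleles)) for site, alleles in calls.items()}


print(decode_genotypes([301.2, 315.1, 329.3]))
# -> {'rs0001': 'A/G', 'rs0002': 'C'}  (rs0001 heterozygous, rs0002 only C)
```

In the real system the table would hold hundreds of tags, allowing many sites in many pooled samples to be scored from a single spectrometer run.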

The technology has been adapted to a fully automated scheme that, in the first-generation system, can read more than 40,000 genotypes per day per machine. In addition, it is clear that this technology, like many others of a similar age, will be amenable to substantial improvements in efficiency, speed, and cost in the near future. This kind of automated, parallelized data gathering is a good quantitative indicator of things to come in this area. Large amounts of genotype data for human populations at well-characterized loci are already being used to reconstruct prehistoric human population migrations and to track down genetic disease loci and loci that determine susceptibilities to a variety of conditions and therapies. The realization that it may be possible to find genetic loci, and to specify variations at certain of these loci, that determine differential responses to therapeutic drugs has triggered an extensive exploration of the field now dubbed "pharmacogenomics." While it is very early in this endeavor, it is already clear that medical treatments in the future will have to take account of the genetic variations among patients to maximize therapeutic effectiveness and avoid patient-specific toxicity and lack of efficacy. Health care in the future will be dependent in many areas on this type of technology for essential patient information.

KEY ROLE OF INTERDISCIPLINARY EDUCATION AND TRAINING

It is clear from the above discussion that, if our view of the future is at all accurate, we will need an abundance of people trained in the integration of several disciplines who have adopted the data, technology, and modeling philosophies for the life sciences described here. Without effective integration of these disciplinary elements, and without the ability to function well outside the usual boundaries of the biological sciences, this type of training will fail.

A new academic institution is being developed specifically to provide this training. It is the goal of the newest Claremont College, the Keck Graduate Institute of Applied Life Sciences (KGI), to play a key leadership role in the development of the new area of applied biology and to train professional leaders in this area. This goal is distinct from that of training the next generation of researchers, but it has some features in common with that important goal. The professional training provided by KGI's two-year master of biosciences degree is directed toward broad-based technical training, integrated with management and ethics—an objective that is increasingly recognized as key (Rayl, 2000).

The technical training must be fundamentally integrated, not just a collection of conventional discipline-specific courses, and intensive enough to enable an understanding of the key science and many of the important questions posed by real problems. As planned, the program will not attempt to train researchers, in either depth or orientation, but the integrity of the advanced scientific training it provides must be paramount.

The effective training of a new breed of professionals in the applied life sciences is an important way to fulfill this leadership role, since people are the key to this new era. As such, it will be important for KGI to nurture close ties to various industrial, academic, and government institutions in order to track the burgeoning development of applied biology and to collaborate with researchers and managers to understand the changes afoot in the applied arena. These professionals are intended to play a role in the applied life sciences similar to that played by master's-level engineers in industries based on the physical sciences. The KGI training program will require significant changes in the ways in which training is conducted. Among these changes will be problem-centered training—that is, an educational experience that has as its unifying theme the solution of practical, real-world technical problems requiring the application of diverse disciplinary elements, close teamwork, and concentrated effort for a limited time period. The training will attempt to illustrate such modes of working by example. Both academic and industrial involvement in this mode of training will clearly be essential to its success. Industry internships and collaborations with both industrial and academic colleagues will provide important occasions for this training. In many ways the model of engineering training is most illustrative of this training philosophy, but the information sciences are perhaps closest to our intellectual philosophy.

Cooperative interdisciplinary problem solving will necessarily introduce the important topic of the behavior of organizations, as well as project and people management. In addition, some of the significant social issues related to applied biology, particularly in areas such as human genetics, health care resource issues, clinical trials, and environmental impacts, will arise naturally in these endeavors. All of the above-mentioned areas will be treated more fully by some focused program elements; by project-based training; by using affiliated institutions, adjunct faculty, and external industry associates; and by direct industrial experience for students.

The intellectual foundations of the KGI, the axis around which we can orient our development in applied biology, derive directly from new developments in fundamental biology to which a wide variety of other disciplines contribute. The challenges for this new institution, representative of all who dedicate themselves to these goals, are to focus on applications and innovative professional training while tracking new developments in the life sciences in all sectors and establishing a strong interdisciplinary applied research program to complement and invigorate the training program. It is a challenge matched to the changes taking place in the biological sciences.

CONCLUSIONS

The hallmarks of the future of the applied life sciences are high-throughput technology for data acquisition, massive data acquisition and storage, and the use of multiple technical disciplines in both the development of these technologies and the analysis and application of the data generated. Advances in automation and miniaturization and the increasingly interdisciplinary nature of technological developments are key to the future of the life sciences, largely because the inherent complexity of biological systems requires a massive increase in an already high rate of data acquisition. Analysis of these data and their synthesis into new scientific knowledge will dominate the foreseeable future of the basic biological sciences.

An essential scientific and technological activity is the development of a modeling capability for complex biological systems that greatly exceeds anything we now have. The move toward integration, in all the senses above, is a key to the future. We have ventured to speculate about what the future holds in these areas of the biological sciences and their applications. These speculations may be flawed in detail, but we suggest that some aspects are almost certain to be realized. They are intended to stimulate current thinking about this future, its implications, and the needs it implies.

We anticipate that the near future will bring major shifts in how the life sciences are conducted in both their basic and applied forms. The applications that are at center stage in this volume—those focused on the understanding and control of infectious diseases and related issues in human health, environment, and agriculture—are specific instances of a much broader area of application of these ideas. People are key to the future. Cross-disciplinary education and training of a new breed of professionals in the applied life sciences are crucial to achieving this future. Programs such as the one at the Keck Graduate Institute of Applied Life Sciences have taken on this challenge, one that must be met in order to realize the promise of the convergence discussed here.

REFERENCES

Adams, M. D., S. E. Celniker, R. A. Holt, et al. 2000. The genome sequence of Drosophila melanogaster. Science, 287:2185-2195.

Arnheim, N., and H. Erlich. 1992. Polymerase chain reaction strategy. Annual Review of Biochemistry, 61:131-156.

Beaudry, D. P., and G. F. Joyce. 1992. Directed evolution of an RNA enzyme. Science, 257:635-641.

Boyer, H. W. 1971. DNA restriction and modification mechanisms in bacteria. Annual Review of Microbiology, 25:153-176.

Cairns, J., G. Stent, and J. D. Watson, eds. 1966. Phage and the Origins of Molecular Biology. New York: Cold Spring Harbor Laboratory.

Cha, J. N., G. D. Stucky, D. E. Morse, and T. J. Deming. 2000. Biomimetic synthesis of ordered silica structures mediated by block copolypeptides. Nature, 403(6767):289-292.

Cohen, S. N., A. C. Y. Chang, and H. W. Boyer. 1973. Construction of biologically functional bacterial plasmids in vitro. Proceedings of the National Academy of Sciences USA, 72:3240-3245.

Collins, F., and D. J. Galas. 1993. A new five-year plan for the U.S. Human Genome Project. Science, 262(5130):43-46.

Itakura, K., J. Rossi, and R. B. Wallace. 1984. Synthesis and use of synthetic oligonucleotides. Annual Review of Biochemistry, 53:323-356.

Köhler, G., and C. Milstein. 1975. Continuous cultures of fused cells secreting antibody of predefined specificity. Nature, 256(5517):495-497.

Lehman, N., M. D. Donne, M. West, and T. G. Dewey. 2000. The genotypic landscape during in vitro evolution of a catalytic RNA: Implications for phenotypic buffering. Journal of Molecular Evolution, 50:481-490.

Liu, Q., L. Wang, A. G. Frutos, A. E. Condon, R. M. Corn, and L. M. Smith. 2000. DNA computing on surfaces. Nature, 403(6766):175-179.

Maxam, A. M., and W. Gilbert. 1977. A new method for sequencing DNA. Proceedings of the National Academy of Sciences USA, 74:560-564.

Nathans, D., and H. O. Smith. 1975. Restriction endonucleases in the analysis and restructuring of DNA molecules. Annual Review of Biochemistry, 44:273-293.

Rayl, A. J. S. 2000. From implants to explants, and beyond: Multidisciplinary panel emphasizes follow-up research. The Scientist, 14(5):16.

Sanger, F., and A. R. Coulson. 1975. A rapid method for determining sequences in DNA by primed synthesis with DNA polymerase. Journal of Molecular Biology, 94:441-449.

Southern, E. M. 1974. An improved method for transferring nucleotides from electrophoresis strips to thin layers of ion-exchange cellulose. Analytical Biochemistry, 62(1):317-318.

Watson, J. D. 1968. The Double Helix. London: Weidenfeld and Nicolson.
