Suggested Citation: "Sequence Alignment." National Research Council. 1995. Calculating the Secrets of Life: Contributions of the Mathematical Sciences to Molecular Biology. Washington, DC: The National Academies Press. doi: 10.17226/2121.

Page 94

importance, it is a good indicator and can lead to the formulation of important biological hypotheses, as noted above. Conversely, lack of statistical significance is an important clue in considering whether to reject a relationship that may seem interesting to the human eye. With over 70,000 sequences in modern databases, molecular biologists require an automatic way to reject all but the most interesting results from a database search. Comparing one sequence to the database involves 70,000 comparisons. Comparing all pairs of sequences involves $\binom{70{,}000}{2}$, or about $2.4 \times 10^{9}$, comparisons. As we will see with the tRNA and rRNA comparison, even a small number of comparisons can raise subtle questions.
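The comparison counts quoted above can be checked with a few lines of arithmetic. This is only a sketch: the variable names are illustrative, and the database size of 70,000 is the round figure used in the text.

```python
import math

n_sequences = 70_000  # database size quoted in the text

# One query against the whole database: one comparison per stored sequence.
one_vs_all = n_sequences

# Every unordered pair of distinct sequences: "n choose 2" = n * (n - 1) / 2.
all_pairs = math.comb(n_sequences, 2)

print(one_vs_all)          # 70000
print(all_pairs)           # 2449965000
print(f"{all_pairs:.1e}")  # 2.4e+09
```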

Global Sequence Comparisons

We will now discuss a number of situations for sequence comparisons and some probability and statistics that can be applied to these problems. Some powerful and elegant mathematics has been developed to treat this class of problems. Our discussion will naturally break into two parts, global comparisons and local comparisons.

Sequence Alignment

In this section we study the comparison of two sequences. For simplicity, the two sequences $A_1 A_2 \cdots A_n$ and $B_1 B_2 \cdots B_m$ will consist of letters drawn independently with identical distribution from a common alphabet.
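The independence model just described can be simulated directly. The sketch below is a minimal illustration under stated assumptions: the alphabet `"ACGT"`, the uniform default letter probabilities, the lengths, and the helper name `random_sequence` are all hypothetical choices, not taken from the text.

```python
import random

def random_sequence(length, alphabet="ACGT", weights=None, rng=None):
    """Draw `length` letters independently from one fixed distribution.

    `weights` gives the letter probabilities (uniform when None).
    """
    rng = rng or random.Random()
    return "".join(rng.choices(alphabet, weights=weights, k=length))

# Both sequences use the same alphabet and the same letter distribution,
# and every letter is an independent draw.
rng = random.Random(0)  # fixed seed only so the run is repeatable
n, m = 12, 10           # assumed lengths for illustration
seq_a = random_sequence(n, rng=rng)
seq_b = random_sequence(m, rng=rng)
print(seq_a)
print(seq_b)
```

Passing non-uniform `weights` models an alphabet whose letters are not equally likely, while keeping the draws independent and identically distributed.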

Sequences evolve at the molecular level by several mechanisms. One letter, A for example, can be substituted for another, G for example. These events are called substitutions. Letters can be removed from or added to a sequence, and these events are called deletions or insertions. Given two sequences such as ATTGCC and ACGGC, it is usually not clear how they should be related. The possible relationships are often written as alignments such as:
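As an illustration of one such alignment (not necessarily one the authors show), the sketch below aligns the two example sequences with a standard dynamic-programming global alignment in the Needleman-Wunsch style. The scoring values (match +1, mismatch -1, gap -1) and the function name `global_align` are assumptions made for this example; the text has not yet specified any scoring scheme. A `-` marks an insertion or deletion, and a column with two different letters marks a substitution.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Return (aligned_a, aligned_b, score) for one best global alignment."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap          # a[:i] against nothing: i deletions
    for j in range(1, m + 1):
        score[0][j] = j * gap          # nothing against b[:j]: j insertions
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,  # (mis)match column
                              score[i - 1][j] + gap,      # letter of a over gap
                              score[i][j - 1] + gap)      # gap over letter of b
    # Trace back from the corner to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                top.append(a[i - 1])
                bottom.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom)), score[n][m]

aligned_a, aligned_b, best = global_align("ATTGCC", "ACGGC")
print(aligned_a)
print(aligned_b)
print("score:", best)
```

Several alignments can share the same best score, so the one printed depends on the tie-breaking order in the traceback; other scoring values would generally favor different alignments.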
