To illustrate the methods, we use a set of North American Indian mitochondrial sequences described in Ward et al. (1991). These authors sequenced the first 360 base pairs of the mitochondrial control region for a sample of 63 Nuu-Chah-Nulth (Nootka) Indians from Vancouver Island. The sample comprises individuals who were maternally unrelated for four generations, chosen from 13 of the 14 tribal bands. As a consequence the sample deviates from a truly random sample, although it will be treated as such for the purposes of this chapter. An important parameter in the analysis is the effective population size of the group. This is approximated by the number of reproducing females, giving a value of about 600 for the long-term effective population size N.
The most common DNA changes seen in mitochondria are transitions (changes from one pyrimidine base to the other or one purine base to the other, that is, C ↔ T or A ↔ G) rather than transversions (changes from a pyrimidine to a purine or vice versa). Indeed, the sequenced region shows no transversions, so that each site in the sequences has one of just two possible nucleotides. We focus on the pyrimidine (C or T) sites in the region. There are 201 such sites, in which 21 variable (or segregating) sites define 24 distinct sequences (called alleles or lineages). The details of the data, including the allele frequencies, are given in Table 5.1.
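The distinction between transitions and transversions can be made concrete with a short sketch. The function name `classify_substitution` and the return labels are illustrative choices, not part of the original analysis.

```python
# Minimal sketch: classify a single-base change as a transition or a
# transversion. Names and labels here are illustrative assumptions.
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def classify_substitution(old: str, new: str) -> str:
    """Return 'transition', 'transversion', or 'none' for a base change."""
    old, new = old.upper(), new.upper()
    for base in (old, new):
        if base not in PURINES | PYRIMIDINES:
            raise ValueError(f"not a DNA base: {base!r}")
    if old == new:
        return "none"
    # A transition stays within one chemical class (purine or pyrimidine);
    # a transversion crosses between the two classes.
    pair = {old, new}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


if __name__ == "__main__":
    for old, new in [("C", "T"), ("A", "G"), ("A", "C")]:
        print(old, "->", new, ":", classify_substitution(old, new))
```

Because the sequenced region shows no transversions, every change at a pyrimidine site would be classified as a C/T transition by such a function.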
The parameter of particular interest here is θ, the population geneticist's stock in trade. The variable θ is a measure of the mutation rate in the region, and it figures in many important theoretical formulas in population genetics. For mitochondrial data, it is defined by
θ = 2Nu,
where N is the effective population size referred to earlier, and u is the mutation rate per gene per generation. Once θ is estimated, we can estimate u if N is known or N if u is known. In what follows, we estimate the compound parameter θ rather than its components.
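As a small illustration of how the compound parameter defined above (called `theta` in the code) can be unpacked once one of its components is known, consider the following sketch. The function names and the value `theta_example = 3.0` are hypothetical placeholders, not estimates from the data; only N = 600 comes from the text.

```python
# Minimal sketch, assuming the mitochondrial definition theta = 2 * N * u.
# The numerical value of theta below is a hypothetical placeholder.


def mutation_rate(theta: float, n_eff: float) -> float:
    """Solve theta = 2 * N * u for u, given the effective size N."""
    if n_eff <= 0:
        raise ValueError("effective population size must be positive")
    return theta / (2.0 * n_eff)


def effective_size(theta: float, u: float) -> float:
    """Solve theta = 2 * N * u for N, given the mutation rate u."""
    if u <= 0:
        raise ValueError("mutation rate must be positive")
    return theta / (2.0 * u)


if __name__ == "__main__":
    N_EFF = 600            # long-term effective size quoted in the text
    theta_example = 3.0    # hypothetical value, for illustration only
    u = mutation_rate(theta_example, N_EFF)
    print(f"u per gene per generation: {u:.6f}")
    print(f"N recovered from theta and u: {effective_size(theta_example, u):.1f}")
```

The two functions are inverses of one another, which reflects the point made in the text: the data inform only the product of N and u, so one component must be supplied from outside to recover the other.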
In the section immediately following, we begin by outlining the structure of the coalescent, a robust description of the genealogy of samples taken from large populations. The effects of mutation are superimposed on this genealogy in several ways. The classical case, which
Table 5.1 Nucleotide Position in Control Region
| Position | 69 | 88 | 91 | 124 | 149 | 162 | 166 | 194 | 200 | 219 | 233 | 247 | 255 | 267 | 271 | 275 | 301 | 302 | 304 | 319 | 339 | Allele frequency |
| Site | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | |
| ID ref | T | C | C | C | T | C | T | T | C | C | C | C | C | C | C | T | T | T | C | T | T | |
| 1 | · | · | · | · | · | · | C | · | T | · | · | · | T | · | · | · | · | · | · | · | · | |
| 2&3 | · | · | · | · | · | · | · | · | T | · | · | · | T | · | · | · | · | · | · | · | · | 3 |
| 4 | · | · | · | · | · | · | · | · | T | · | · | · | T | · | · | · | · | · | · | · | C | |
| 5 | · | T | · | · | · | T | · | · | T | · | · | · | · | T | · | · | · | · | · | · | C | 3 |
| 6 | · | T | · | · | · | · | · | · | T | · | · | · | · | · | · | · | · | · | · | · | C | 2 |
| 7 | C | T | · | · | · | · | · | · | T | · | · | T | · | · | · | · | · | · | · | · | C | 1 |
| 8,10&11 | · | T | · | · | · | · | · | · | T | · | · | · | · | T | · | · | · | · | · | · | C | 8 |
| 9 | C | T | · | · | · | · | · | · | T | · | · | · | · | T | · | · | · | · | · | · | C | 2 |
| 12&13 | · | T | · | · | · | · | · | · | · | · | · | · | · | T | · | · | · | · | · | · | C | 10 |
| 14 | · | T | · | · | · | · | · | · | T | · | · | · | T | T | · | · | · | · | · | · | C | 1 |
| 15 | · | T | · | · | · | · | · | · | T | · | · | · | T | T | · | · | C | · | · | · | C | 2 |
| 16 | · | · | · | · | · | · | · | · | T | T | · | · | · | · | · | · | · | · | T | · | C | 1 |
| 17 | · | · | · | T | · | · | · | · | T | · | · | · | · | · | · | · | C | · | · | · | C | 1 |
| 18 | · | · | · | T | · | · | · | · | T | · | · | · | · | · | · | · | · | C | · | · | C | 2 |
| 19 | · | · | T | · | C | · | · | · | T | · | · | · | · | · | T | · | · | C | · | · | C | 1 |
| 20 | · | · | · | · | · | · | · | · | T | · | · | · | · | · | · | · | · | C | · | · | C | 3 |
| 21 | · | · | · | · | · | · | · | · | T | · | · | · | · | · | · | · | · | C | · | · | C | 3 |
| 22 | C | · | · | · | · | · | · | · | T | · | · | · | · | · | · | · | · | C | · | · | · | 3 |
| 23 | · | · | · | · | · | · | · | · | T | T | · | · | · | · | · | C | · | C | · | · | · | 1 |
| 24 | · | · | · | · | · | · | · | · | T | · | · | · | · | · | · | C | · | C | T | · | · | 7 |
| 25 | · | · | · | · | · | · | · | · | T | T | · | · | · | · | · | C | · | C | T | C | · | 3 |
| 26 | · | · | · | · | · | · | · | · | · | T | · | · | · | · | · | C | · | C | T | C | · | 1 |
| 27 | · | · | · | · | · | · | C | C | · | · | · | · | · | · | · | · | · | · | · | · | · | 1 |
| 28 | · | · | · | · | · | · | C | C | · | · | T | · | · | · | · | · | · | · | · | · | · | 1 |
NOTE: These mitochondrial data from Ward et al. (1991, Figure 1) are the variable pyrimidine positions in the control region. Position 69 corresponds to position 16,092 in the human reference sequence published by Anderson et al. (1981). The ID numbers correspond to those given in Ward et al. (1991, Figure 1).
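The dot notation in Table 5.1 (a dot means "same base as the reference row") can be expanded mechanically. The sketch below does so for the reference row and three of the table's rows (IDs 2&3, 27, and 28), and then lists the sites that segregate among those rows. The function names are illustrative assumptions, and the ASCII "." stands in for the table's "·".

```python
# Minimal sketch: expand dot-notation rows against the reference row and
# find segregating sites. Only a few rows of Table 5.1 are transcribed;
# "." replaces the table's centered dot.
REFERENCE = "TCCCTCTTCCCCCCCTTTCTT"  # "ID ref" row, sites 1..21

ROWS = {
    "2&3": "........T...T........",
    "27": "......CC.............",
    "28": "......CC..T..........",
}


def expand(row: str, reference: str = REFERENCE) -> str:
    """Replace each '.' with the reference base at the same site."""
    if len(row) != len(reference):
        raise ValueError("row and reference must have the same length")
    return "".join(ref if base == "." else base
                   for base, ref in zip(row, reference))


def segregating_sites(sequences: list[str]) -> list[int]:
    """Return 1-based site numbers at which more than one base occurs."""
    return [site for site, column in enumerate(zip(*sequences), start=1)
            if len(set(column)) > 1]


if __name__ == "__main__":
    expanded = {name: expand(row) for name, row in ROWS.items()}
    for name, seq in expanded.items():
        print(f"{name:>4}  {seq}")
    print("segregating sites:", segregating_sites(list(expanded.values())))
```

Applied to all rows of the table rather than this three-row subset, the same procedure would recover the full set of segregating sites and the number of distinct alleles described in the text.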