Page 96
therefore cannot be optimized. We give the statistical distribution of the alignment score for completeness, however. Let s(A,B) be a real valued random variable. Define the score S by
![]()
and let E(S) denote the expectation of S and Var(S) denote the variance. Clearly, E(S) = nE(s(A,B)) and
Var(S) = nVar(s(A, B)).
Since S is the sum of independent, identically distributed random variables s(A,B), the central limit theorem implies that for large n
S » Normal(nE(s(A, B)), nVar(s(A, B))).
If s(A, B) Î {0, 1}, then S is binomial (n, p), where p = P{s(A, B) = 1}. Even when the letters are not identically distributed, the limiting distribution is normal under mild assumptions (Chung, 1974).
The assumptions of the last section are carried over: AlA2. . . An and B1B2. . . Bm are composed of independent and identically distributed letters and s(A, B) is a real valued random variable on pairs of letters. We extend s(·,·) to s(A,-) and s(-,B) so that deletions are included. We assume that the value of s for all deletion scores is smaller than maxs(A,B). An alignment score S is the maximum over all possible alignments

Page 97
The optimization destroys the classical normal distribution of alignment score, but an easy application of a beautiful theorem known as Kingman's subadditive ergodic theorem gives an interesting result:
Theorem 4.1 (Kingman, 1973) For s,t nonnegative integers with 0 £ s £ t, let Xs,t be a collection of random variables that satisfy
(i) Whenever s t u, Xs,u £ Xs,t + Xt,u,
(ii The joint distribution of Xs,t} is the same as that of {Xs+l, t+1},
iii The expectation gt = EX0,t exists and satisfies gt ³ - Kt for
some constant
Then the finite limt®¥ X0,t / t = l exists with probability 1 and in the mean.
The essential assumption is (ii), the subadditivity condition. To motivate and illustrate this theorem, recall the strong law of large numbers (SLLN), which treats independent, identically distributed (iid) random variables W1,W2,. . . with µ = E(Wi). The SLLN asserts that
![]()
with probability 1.
It is easy to see that additivity holds. Set
![]()
Of course (i) is satisfied:

Page 98
Since the Wi are iid, (ii) is evidently true. Finally, g(t) = E(U0,t) = tµ, so that (iii) holds with µ = -K. Therefore the limit
![]()
exists and is constant with probability 1. Notice that this setup does not allow us to conclude that the limit is µ. This is a price of relaxing the assumption of additivity.
Returning to the statistical distribution of alignment score, recall that an alignment score S is the maximum over all possible alignments
![]()
Define X,, by
![]()
Then evidently,
-Xs,u ³ (-Xs,t) + (-Xt,u)
and
Xs,u £ Xs,t + Xt,u.
We have that gt = E(X0,t) exists since the expectation of a single alignment exists and -X0,t is the maximum of a finite number of alignment scores. The final hypothesis to check is gt ³ -Kt for some constant K and all t > 1. Clearly,
E(-X0,t) £ t maxs(A, B)
so that
gt ³ -(maxs(A, B))t = -Kt.