Page 277
approximate pattern matching, 78-79
difference measures, 72-73
dynamic programming, 60-64, 78, 82, 84, 85, 86, 109
in evolutionary analysis, 106, 110-112
gap cost penalties, 70-72
in genetic mapping, 35-36
global alignment, 58-64, 94-99
heuristic, 82-84
K-best alignments, 76-78
local alignment, 65-70, 99-106
multiple alignments, 73-76
in physical mapping, 46-51
Alleles, 6
see also DNA; Protein folding; Sequence similarity and comparison
Amplification, see Polymerase chain reaction
Ancestry, see Evolutionary analysis
ANREP systems, 87
Approximate pattern matching, 78-79, 86
Approximate repeats, 87
ARIADNE systems, 87
Assay techniques, 2-3
Autosomes, 26
Base pairs, 8, 26, 48, 153, 154, 163, 179, 185, 188, 189, 191, 194, 204, 249
see also Adenine, Thymine, Cytosine, Guanine, Uracil
Bayesian statistics, 35
Bernoulli random variables, 102, 125
Biochemistry, 2-5
Biosequences, see Databases of DNA sequences; Sequence similarity and comparison; Sequencing methods and technology
BLASTA algorithm, 82-84
Booth-Lueker algorithm, 50-51
BRCA1 (breast cancer) gene, 33
Cancer, 33, 34, 37-42, 58, 91, 183, 196
Cauchy's formula, 136
Cellular structures, 9
Chaperonins, 238-239
Chen-Stein method, 102, 106, 110
Chimeras, 51
Page 278
Chirality, 213-215
Chromosomal walking, 17, 18, 42, 43
Clones and cloning, 13, 14, 26, 42-43, 209
Closed circular DNA, 153-154, 155, 156, 157, 181, 204
combinatorial structures, 119, 136-148
Ewens sampling formula, 119, 122-124, 136-139
K-allele model, 130-132
likelihood methods, 146-148
tree construction and movement, 124-127
see also Finitely-many-sites model; Infinitely-many-sites model
Combinatorics, 119, 136-148, 185
Computing time and memory capacity
algorithmic efficiencies, 35-36, 84-86, 87
approximate pattern matching, 79, 87
dynamic programming algorithms, 62-63, 64, 68, 83, 84
gap cost functions, 72
heuristic algorithms, 83-84
K-best paths, 77
multiple alignments, 75
parallel processing, 79-81, 84
sublinear similarity searches, 84-85
Consecutive ones property, 50
Consensus scores, 76
Contigs, 47-50
Crick and Watson model, 153, 204-205
Crossovers, 27-29
Cruciforms, 154
Crystallography, 202, 203, 240
Cystic fibrosis (CF), 16-18, 20-21, 26
Databases of DNA sequences, 13, 17, 56, 81, 87
similarity searches in, 78-79, 82-86, 87, 91-92, 94
see also FASTA, BLASTA
Diagnostics, see Genetic diagnostics
Difference measures, 72-73
Diffusion processes, 37-42, 148
Dimers, 212
DNA (deoxyribonucleic acid), 8-9, 92
protein binding, 166-167, 168, 170-171, 181
transcription, 9-12, 154, 179, 196-198, 204-205
see also DNA polymorphisms and mutations; Protein folding; Sequence similarity and comparison; Sequencing methods and technology; Strand separation and unwinding; Supercoiling
Page 279
DNA polymorphisms and mutations, 8-9, 16-17, 26, 30, 34, 57, 106
in evolutionary analysis, 114-135
minimal cost alignments, 72-73
in mitochondria, 115-116, 117, 118, 148-149
rates of, 66, 67, 116, 117, 124-125
see also Genetic maps and mapping
Duplex unwinding elements (DUEs), 183, 194, 195
Dynamic programming algorithm, 60-64, 78, 82, 84, 85, 86, 109, 251
Effective population size, 117
Efficient algorithms, 35-36, 84-86, 87
Electron microscopy, 202, 211, 227
Electrostatic interactions, 251
Energetics, 154, 180, 182, 186-195
see also under names of specific types
Eve hypothesis, 116
Evolutionary analysis, 57-58, 90-94
coalescent structures, 117, 119-135, 148-149
extremal statistical methods, 106-112
minimal cost alignments, 72-73
random combinatorial structures, 136-148
use of mitochondrial DNA, 57-58, 90-94, 115-116, 117, 148-149
trees, 73, 76, 87, 124, 129, 132, 266
see also Eve hypothesis
Ewens sampling formula (ESF), 119, 122-124, 136-139
Extremal statistical methods, 106-112
global sequence comparisons, 94-99
local sequence comparisons, 99-106
False negatives and positives, 51
Familial adenomatous polyposis (FAP), 37-38
Fingerprinting methods, 42-47
Finitely-many-sites model, 132-135
Fleming-Viot process, 148
Page 280
Foldases, 237-238
Fourier transforms, coefficient, 240
4-plat knots, 215-216, 220, 222
Fractionation, 2-3
Free energy, 154, 180, 182, 186-195
Gaussian processes, 41
Gel electrophoresis, 210-211, 227
GENBANK database, 81
Generalized Levenshtein measure, 73, 87
Gene splicing, see Recombinant DNA technology
Gene therapy, 18
Genetic distance, 28-29
Genetic heterogeneity, 34
Genetic maps and mapping, 16, 18-19, 26, 27-30, 51
and incomplete pedigree information, 30, 31, 34-35
markers in, 31
and maximum likelihood estimation, 34-42
and non-Mendelian genetics, 30, 31, 33-34
Genetics, 5-7
Geometry, 166, 203, 210, 211, 220, 223
descriptors and methods, 155-163
see also Topology
Global alignment, 5, 58-64, 94-99
maximum-scoring, 63
Haldane mapping function, 29, 41
Hierarchical condensation methods, 248-251
destabilization, 184, 188, 196
Helical periodicity, 154
Heuristic algorithms, 82-84
HIV protease structure, 254-255
Homeomorphisms, 212-213
Homology modeling, 252
Human Genome Project, 18-22, 26
Hydrophilic side chains, 244, 253, 263
Hydrophobic side chains, 244, 245, 253
Hydrophobicity, 4
Incomplete penetrance, 31, 33, 34
Independent assortment, 29
Indexing, of databases, 87
Infinitely-many-sites/alleles
Page 281
In vitro assays, 3
Isomerases, 238
K-allele model, 130-132
K-best alignments, 76-78
kDNA (kinetoplast DNA), 231
Kingman's subadditive ergodic theorem, 97
Knot theory, 212
see also Tangles and knots
Large Deviation Theory of Diffusion Processes, 37-42
LexA binding sites, 198-199
Ligases, 13
Likelihood methods, 34-42, 146-148
Linking number (Lk), 155, 157-158, 163-164, 173-174, 181
topoisomerase reactions, 164-166
Local alignment, 5, 65-70, 99-106
Longest common subsequence, 99
Macromolecules, 3
Mapping, see Genetic maps and mapping; Physical maps and mapping; Restriction maps; Sequencing methods and technology
Markers, see Genetic markers
Markov models, processes, 36, 146-147, 249
Maximum likelihood estimation, 34-35
and efficient algorithms, 35-36
and statistical significance, 37-42
Measure-valued diffusions, 148
Membrane-bound transporters, 17-18, 20
Mendelian genetics, 5-7, 27, 31
Minichromosomes, 174-177
Min (multiple intestinal neoplasia) trait, 38-39
Mirror images, 213-215
Mismatch ratio, 86
Mitochondrial DNA (mtDNA), 115-116, 117, 118, 135, 148-149, 204
Molecular biology, overview, 7-12
Monte Carlo methods, 146-147, 149, 241
Morgans, 28
mRNA (messenger RNA), 9, 12, 92
Multiple alignments, 73-76
Multiple minima problem, 241
Mutation, see DNA polymorphisms and mutations
Myoglobin, 265-266
Page 282
Native American population studies, 116, 117
Neighborhood concept, 83
Neural networks, 259-263
Nonadditive scoring schemes, 87
Nuclear magnetic resonance (NMR), 203, 240
Nucleic acids, 3
Nucleosomes, 154, 166, 174-177
Ornstein-Uhlenbeck process, 41
Overwinding, 154
Packing density, 252
Palindromes, 87
Parallel computing, 79-81, 84, 87
Phenocopy, 34
see also evolutionary trees
Physical maps and mapping, 17, 19, 26, 29
fingerprinting methods, 42-47
PIR database, 81
PLANS (Pattern Language for Amino and Nucleic Acids Sequences), 263-264
Platelet-derived growth factor (PDGF), 91
Plectonemic forms, 154, 156, 169, 170, 215-216
Poisson distributions, 144
see also Boltzmann equation, 254; Dirichlet distribution, 144
in coalescent trees, 121, 124-127
in sequence comparisons, 29, 100-104, 108-110
Poly-adenylation, 196
Polygenic inheritance, 34
Polymerase chain reaction (PCR), 13, 15, 16, 46
Polymorphism, see DNA polymorphisms and mutations
Polyoma virus, 196
Principle of optimality, 63
Probabilistic combinatorics, 136
Processing time, see Computing time and memory capacity
Protein folding, 5, 12, 236-248
hierarchical condensation methods, 248-251, 256-265
prediction of, 5, 254-255, 265-266
threading methods, 248-254
see also Amino acids; Protein folding; Sequence similarity and comparison
Public databases, see Databases of DNA sequences
Pure breeding, 5
Page 283
Pyrimidines (Y), 99, 117, 118, 123, 128, 200
QUEST systems, 87
Rational tangles, 218-221, 228-229
RecA binding, 198-199, 211, 227
Recessive traits, 16
Recombinant DNA technology, 13-16, 17
Recombination, 27-28, 205, 213, 225-230
site-specific, 207-212, 222-225
Replication processes, 92, 154, 179-180, 183, 204
Restriction enzymes, 13
Restriction fragment lists, 45-46
R-group, 237
RNA (ribonucleic acid), 9, 179, 196, 237
evolutionary analysis, 92-93, 106-107, 110-112
polymerase, 9
rRNA, 92, 93, 106, 107, 110, 112
see also mRNA, tRNA, 11
Rule-based methods, 263-264
Scoring schemes
gap cost penalties, 70-72
global alignments, 59-64
K-best alignments, 76-78
local alignments, 65-68
minimal cost alignments, 72-73
multiple alignments, 74-76
nonadditive, 87
Sedimentation rate, 100
Self-replication, 92
Sequence similarity and comparison, 56-58, 86-87, 91, 199
approximate pattern matching, 78-79, 86
database searches, 78-79, 82-86, 87, 91-92, 94
difference measures, 72-73
in evolutionary analysis, 57-58, 72-73, 76, 90-94, 106-112, 115
gap cost penalties, 70-72
global alignment, 5, 58-64, 94-99
heuristic algorithms, 82-84
K-best alignments, 76-78
local alignment, 5, 65-70, 99-106
multiple alignments, 73-76
parallel computing, 79-81, 84, 87
sublinear, 84-86
Page 284
Sequencing methods and technology, 13, 17, 19, 26, 81
error detection and correction, 73
shotgun method, 43-44
Sex chromosomes, 26
Shotgun method, 43-44
SIMD (single-instruction, multiple-data) computers, 80
Site-specific recombination, 207-212, 222-225
Smith-Waterman algorithm, 66, 68, 83, 84, 109
Solvent-accessible contact areas, 252-253
SOS genes, response, 183, 198-199, 200
Statistics of coverage, 46-51
coalescent structures, 119-135, 148-149
combinatorial structures, 119, 136-148
likelihood methods, 146-148
Storage capacities, see Computing time and memory capacity
Strand separation and unwinding, 8, 179-180, 181-184, 219-220
energy states, 154, 180, 182, 186-195
site prediction, 184-186, 196-200
Stress responses, 183, 198-199, 200, 204
Strong law of large numbers (SLLN), 97, 98, 100
Sublinear similarity searches, 84-86
Sum-of-pairs scores, 76
Supercoiling processes, 153-163
closed curves, 153-154, 155, 156, 157, 181, 204
topoisomerase reactions, 163-166
see also Strand separation and unwinding; Superhelicity
Superhelicity, 162, 181-183, 193
Surface linking number (Slk), 167-171, 173-174
Synapsis, 207-209, 223-225, 226-227
Tangles and knots, 204-207, 211, 212-222
gel mobility, 231-232
recognition, 230-231
site-specific recombination models, 222-225
Threading methods, 248-254
Topology, 155, 166-167, 168, 170-171, 203-204, 205, 207, 244, 247
Page 285
surface linking number, 167-171, 173-174
tangles and knots, 204-207, 211, 212-225, 230-231
see also Geometry
Toroidal surfaces, 155, 168, 170, 182, 228-229, 231
Traceback procedures, 64
Transcription processes, 9-12, 154, 179, 196-198, 204-205
Transitions, 117, 185-186, 188, 190, 194-195
Trivial tangles, 218-219
tRNA (transfer RNA), 92, 93, 106, 107, 110-112
t-test, 40-41
Twist (Tw), 157, 159-160, 162, 164, 173-174
topoisomerase reactions, 164-166
Unit-cost scoring scheme, 58-59, 86
Unwinding, see Strand separation and unwinding
Variable population size processes, 148-149
Virtual surfaces, 17-171
Vitalism, 3
VLSI (very large scale integration) chips, 80-81
Winding number, 167, 171-172, 173-174
Writhe (Wr), 157, 159, 160, 161, 162, 164
topoisomerase reactions, 164-166
X-ray crystallography, 203, 240
YAC (yeast artificial chromosomes) libraries, 46-47, 53
Z-DNA, 154