Table of contents:

Bioinformatics and Molecular Evolution
Full Contents
Preface

1.1 DATA EXPLOSIONS
1.2 GENOMICS AND HIGH-THROUGHPUT TECHNIQUES
1.3 WHAT IS BIOINFORMATICS?
1.4 THE RELATIONSHIP BETWEEN POPULATION GENETICS, MOLECULAR EVOLUTION, AND BIOINFORMATICS
SUMMARY

2.1 NUCLEIC ACID STRUCTURE
2.2 PROTEIN STRUCTURE
2.3 THE CENTRAL DOGMA
2.4 PHYSICO-CHEMICAL PROPERTIES OF THE AMINO ACIDS AND THEIR IMPORTANCE IN PROTEIN FOLDING
BOX 2.1 Polymerase chain reaction (PCR)
2.5 VISUALIZATION OF AMINO ACID PROPERTIES USING PRINCIPAL COMPONENT ANALYSIS
2.6 CLUSTERING AMINO ACIDS ACCORDING TO THEIR PROPERTIES
BOX 2.2 Principal component analysis in more detail
SUMMARY

3.1 WHAT IS EVOLUTION?
3.2 MUTATIONS
3.3 SEQUENCE VARIATION WITHIN AND BETWEEN SPECIES
3.4 GENEALOGICAL TREES AND COALESCENCE
3.5 THE SPREAD OF NEW MUTATIONS
3.6 NEUTRAL EVOLUTION AND ADAPTATION
BOX 3.1 The influence of selection on the fixation probability
BOX 3.2 A deterministic theory for the spread of mutations
SUMMARY

4.1 MODELS OF NUCLEIC ACID SEQUENCE EVOLUTION
BOX 4.1 Solution of the Jukes–Cantor model
4.2 THE PAM MODEL OF PROTEIN SEQUENCE EVOLUTION
BOX 4.2 PAM distances
4.3 LOG-ODDS SCORING MATRICES FOR AMINO ACIDS
SUMMARY

5.1 WHY BUILD A DATABASE?
5.2 DATABASE FILE FORMATS
5.3 NUCLEIC ACID SEQUENCE DATABASES
5.4 PROTEIN SEQUENCE DATABASES
5.5 PROTEIN FAMILY DATABASES
5.6 COMPOSITE PROTEIN PATTERN DATABASES
5.7 PROTEIN STRUCTURE DATABASES
5.8 OTHER TYPES OF BIOLOGICAL DATABASE
SUMMARY

6.1 WHAT IS AN ALGORITHM?
6.2 PAIRWISE SEQUENCE ALIGNMENT – THE PROBLEM
6.3 PAIRWISE SEQUENCE ALIGNMENT – DYNAMIC PROGRAMMING METHODS
6.4 THE EFFECT OF SCORING PARAMETERS ON THE ALIGNMENT
6.5 MULTIPLE SEQUENCE ALIGNMENT
SUMMARY

7.1 SIMILARITY SEARCH TOOLS
7.2 ALIGNMENT STATISTICS (IN THEORY)
BOX 7.1 Extreme value distributions
BOX 7.2 Derivation of the extreme value distribution in the word-matching example
7.3 ALIGNMENT STATISTICS (IN PRACTICE)
SUMMARY

8.1 UNDERSTANDING PHYLOGENETIC TREES
8.2 CHOOSING SEQUENCES
8.3 DISTANCE MATRICES AND CLUSTERING METHODS
BOX 8.1 Calculation of distances in the neighbor-joining method
8.4 BOOTSTRAPPING
8.5 TREE OPTIMIZATION CRITERIA AND TREE SEARCH METHODS
8.6 THE MAXIMUM-LIKELIHOOD CRITERION
BOX 8.2 Calculating the likelihood of the data on a given tree
8.7 THE PARSIMONY CRITERION
8.8 OTHER METHODS RELATED TO MAXIMUM LIKELIHOOD
BOX 8.3 Calculating posterior probabilities
SUMMARY

9.1 GOING BEYOND PAIRWISE ALIGNMENT METHODS FOR DATABASE SEARCHES
9.2 REGULAR EXPRESSIONS
9.3 FINGERPRINTS
9.4 PROFILES AND PSSMS
9.5 BIOLOGICAL APPLICATIONS – G PROTEIN-COUPLED RECEPTORS
SUMMARY

10.1 USING MACHINE LEARNING FOR PATTERN RECOGNITION IN BIOINFORMATICS
10.2 PROBABILISTIC MODELS OF SEQUENCES – BASIC INGREDIENTS
BOX 10.1 Dirichlet prior distributions
10.3 INTRODUCING HIDDEN MARKOV MODELS
BOX 10.2 The Viterbi algorithm
BOX 10.3 The forward and backward algorithms
10.4 PROFILE HIDDEN MARKOV MODELS
10.5 NEURAL NETWORKS
BOX 10.4 The back-propagation algorithm
10.6 NEURAL NETWORKS AND PROTEIN SECONDARY STRUCTURE PREDICTION
SUMMARY

11.1 RNA STRUCTURE AND EVOLUTION
11.2 FITTING EVOLUTIONARY MODELS TO SEQUENCE DATA
11.3 APPLICATIONS OF MOLECULAR PHYLOGENETICS
SUMMARY

12.1 PROKARYOTIC GENOMES
BOX 12.1 Web resources for bacterial genomes
12.2 ORGANELLAR GENOMES
SUMMARY

13.1 ’OMES AND ’OMICS
13.2 HOW DO MICROARRAYS WORK?
13.3 NORMALIZATION OF MICROARRAY DATA
13.4 PATTERNS IN MICROARRAY DATA
13.5 PROTEOMICS
13.6 INFORMATION MANAGEMENT FOR THE ’OMES
BOX 13.1 Examples from the Gene Ontology
SUMMARY

M.1 EXPONENTIALS AND LOGARITHMS
M.3 SUMMATIONS
M.5 PERMUTATIONS AND COMBINATIONS
M.6 DIFFERENTIATION
M.8 DIFFERENTIAL EQUATIONS
M.10 NORMAL DISTRIBUTIONS
M.11 POISSON DISTRIBUTIONS
M.12 CHI-SQUARED DISTRIBUTIONS
M.13 GAMMA FUNCTIONS AND GAMMA DISTRIBUTIONS
PROBLEMS

List of Web addresses
Glossary
Index