English Pages [386] Year 2023
Certainty by Construction Software & Mathematics in Agda
Sandy Maguire
Cofree Press
First published 2023
Copyright © 2023, Sandy Maguire
All rights reserved.
Version 1.0.2 / 2023-11-08
To Erin Jackes, who knows the true meaning of equality
In science if you know what you are doing you should not be doing it. In engineering if you do not know what you are doing you should not be doing it.
RICHARD HAMMING
Contents

Preface
The Co-blub Paradox
    A World Without Execution?
1 A Gentle Introduction to Agda
    1.1 The Longevity of Knowledge
    1.2 Modules and Imports
    1.3 A Note on Interaction
    1.4 Importing Code
    1.5 Semantic Highlighting
    1.6 Types and Values
    1.7 Your First Function
    1.8 Normalization
    1.9 Unit Testing
    1.10 Dealing with Unicode
    1.11 Expressions and Functions
    1.12 Operators
    1.13 Agda’s Computational Model
    1.14 Stuckness
    1.15 Records and Tuples
    1.16 Copatterns and Constructors
    1.17 Fixities
    1.18 Coproduct Types
    1.19 Function Types
    1.20 The Curry/Uncurry Isomorphism
    1.21 Implicit Arguments
    1.22 Wrapping Up
2 An Exploration of Numbers
    2.1 Natural Numbers
    2.2 Brief Notes on Data and Record Types
    2.3 Playing with Naturals
    2.4 Induction
    2.5 Two Notions of Evenness
    2.6 Constructing Evidence
    2.7 Addition
    2.8 Termination Checking
    2.9 Multiplication and Exponentiation
    2.10 Semi-subtraction
    2.11 Inconvenient Integers
    2.12 Difference Integers
    2.13 Unique Integer Representations
    2.14 Pattern Synonyms
    2.15 Integer Addition
    2.16 Wrapping Up
3 Proof Objects
    3.1 Constructivism
    3.2 Statements are Types; Programs are Proofs
    3.3 Hard to Prove or Simply False?
    3.4 The Equality Type
    3.5 Congruence
    3.6 Identity and Zero Elements
    3.7 Symmetry and Involutivity
    3.8 Transitivity
    3.9 Mixfix Parsing
    3.10 Equational Reasoning
    3.11 Ergonomics, Associativity and Commutativity
    3.12 Exercises in Proof
    3.13 Wrapping Up
4 Relations
    4.1 Universe Levels
    4.2 Dependent Pairs
    4.3 Heterogeneous Binary Relations
    4.4 The Relationship Between Functions and Relations
    4.5 Homogeneous Relations
    4.6 Standard Properties of Relations
    4.7 Attempting to Order the Naturals
    4.8 Substitution
    4.9 Unification
    4.10 Overconstrained by Dot Patterns
    4.11 Ordering the Natural Numbers
    4.12 Preorders
    4.13 Preorder Reasoning
    4.14 Reasoning over ≤
    4.15 Graph Reachability
    4.16 Free Preorders in the Wild
    4.17 Antisymmetry
    4.18 Equivalence Relations and Posets
    4.19 Strictly Less Than
    4.20 Wrapping Up
5 Modular Arithmetic
    5.1 Instance Arguments
    5.2 The Ring of Natural Numbers Modulo N
    5.3 Deriving Transitivity
    5.4 Congruence of Addition
    5.5 Congruence of Multiplication
    5.6 Automating Proofs
    5.7 Wrapping Up
6 Decidability
    6.1 Negation
    6.2 Bottom
    6.3 Inequality
    6.4 Negation Considered as a Callback
    6.5 Intransitivity of Inequality
    6.6 No Monus Left-Identity Exists
    6.7 Decidability
    6.8 Transforming Decisions
    6.9 Binary Trees
    6.10 Proving Things about Binary Trees
    6.11 Decidability of Tree Membership
    6.12 The All Predicate
    6.13 Binary Search Trees
    6.14 Trichotomy
    6.15 Insertion into BSTs
    6.16 Intrinsic vs Extrinsic Proofs
    6.17 An Intrinsic BST
    6.18 Wrapping Up
7 Monoids and Setoids
    7.1 Structured Sets
    7.2 Monoids
    7.3 Examples of Monoids
    7.4 Monoids as Queries
    7.5 More Monoids
    7.6 Monoidal Origami
    7.7 Composition of Monoids
    7.8 Function Extensionality
    7.9 Setoid Hell
    7.10 A Setoid for Extensionality
    7.11 The Pointwise Monoid
    7.12 Monoid Homomorphisms
    7.13 Finding Equivalent Computations
    7.14 Wrapping Up
8 Isomorphisms
    8.1 Finite Numbers
    8.2 Vectors and Finite Bounds
    8.3 Characteristic Functions
    8.4 Isomorphisms
    8.5 Equivalence on Isomorphisms
    8.6 Finite Types
    8.7 Algebraic Data Types
    8.8 The Algebra of Algebraic Data Types
    8.9 Monoids on Types
    8.10 Functions as Exponents
    8.11 Wrapping Up
9 Program Optimization
    9.1 Why Can This Be Done?
    9.2 Shaping the Cache
    9.3 Building the Tries
    9.4 Memoizing Functions
    9.5 Inspecting Memoized Tries
    9.6 Updating the Trie
    9.7 Wrapping It All Up
Appendix: Ring Solving
    9.8 Rings
    9.9 Agda’s Ring Solver
    9.10 Tactical Solving
    9.11 The Pen and Paper Algorithm
    9.12 Horner Normal Form
    9.13 Multivariate Polynomials
    9.14 Building a Semiring over HNF
    9.15 Semantics
    9.16 Syntax
    9.17 Solving the Ring
    9.18 Ergonomics
Bibliography
Acknowledgments
About the Author
Books by Sandy Maguire
Preface
It was almost ten years ago when I stumbled across the idea that in order to truly understand something, one should write themselves a textbook to really flesh out the concepts. The reasoning goes that when you’re forced to articulate something from the ground up, the holes in your understanding become immediately obvious. As Richard Feynman says, “the first principle is that you must not fool yourself and you are the easiest person to fool.”

The first textbook project I ever attempted in earnest was one on category theory, an alternative foundation for mathematics, as opposed to its more traditional set-theoretic foundation. Category theory has much to recommend it; while set theory is very good at get-the-answer-by-any-means-necessary sorts of approaches, category theory instead gives good theoretical underpinnings to otherwise-nebulous concepts like “abstraction” and “composition.” The argument I’d heard somewhere was that doing math in sets was like writing programs in assembly code, while doing it in categories was comparable to writing them in a modern programming language.

While writing a textbook was indeed helpful at identifying holes in my understanding, it was never a particularly good tool for building that understanding in the first place. My mathematics education is rather spotty: I was good at “running the maze” in school, which is to say, I could reliably compute answers. At university I received an engineering degree, which required lots more running of the maze, but I did horrendously in all of my actual math courses.

I had grown up writing software, and it felt extraordinarily vulnerable to need to write the technical solutions required by mathematics, without having tooling to help me. I was looking for some sort of compiler or runtime to help troubleshoot my proofs. As a self-taught programmer, I had developed a bad habit of brute-forcing my way to working programs. My algorithm for this was as follows:
1. write a program that seems to make sense
2. run it and pray that it works
3. insert some print statements to try to observe the failure
4. make random changes
5. go back to 2
Depending on the programming language involved, this can be an extremely tight feedback loop. But when it came to mathematics, the algorithm became much less effective. Not only is it meaningless to “run” a proof, but also as a non-mathematician thrust into the domain, I found it unclear what even constituted a proof. Which steps did I need to justify? How could I know when I was done? When is a proof sufficiently convincing?

At least in university, the answer to that last question is “when you get 100% on the assignment.” In a very real sense, the feedback algorithm for mathematics is this:

1. write the proof
2. submit the assignment
3. wait for it to be marked
4. tweak the bits that have red underlines
5. go back to 2

Of course, this algorithm requires some sort of mythical teaching assistant with infinite time and understanding, who will continually mark your homework until you’re satisfied with it. You might not be done in time to get the grade, but with perseverance, you’ll eventually find enlightenment, intuiting the decision procedure that allows a theorem to pass through without any red underlines. I suppose this is how anyone learns anything, but the feedback cycle is excruciatingly slow.

Perhaps out of tenacity and math-envy more than anything else, I managed to keep at my category theory goal for seven years. I would pick up a new textbook every few years, push through it, get slightly further than the last time, and then bounce off anew. The process was extremely slow-going, but I did seem to be making sense of it. Things sped up immensely when I made friends with kind,
patient people who knew category theory, who would reliably answer my questions and help when I got stuck. What a godsend! Having kind, patient friends sped up the feedback algorithm for mathematics by several orders of magnitude, from “wait until the next time I pick up a category theory textbook and identify some mistakes in my understanding elucidated by time” to “ask a friend and get a response back in less than an hour.” Amazing stuff. At some point, one of those kind, patient friends introduced me to proof assistants. Proof assistants are essentially programming languages meant for doing mathematics, and stumbling across them gave me a taste of what was possible. The selling point is that these languages come with a standard library of mathematical theorems. As a software guy, I know how to push programming knowledge into my brain. You bite off one module of the standard library at a time, learning what’s there, inlining definitions, and reading code. If you ever need to check your understanding, you just code up something that seems to make sense and see if the compiler accepts it. Now, as they say, I was cooking with gas. I spent about a year bouncing around between different proof assistants. There are several options in this space, each a descendant from a different family of programming languages. During that year, I came across Agda—a language firmly in the functional programming camp, with a type-system so powerful that several years later, I have still only scratched the surface of what’s possible. Better yet, Agda comes with a massive standard library of mathematics. Once you can wrap your head around Agda programming (no small task), there is a delectable buffet of ideas waiting to be feasted upon. Learning Agda has been slow going, and I chipped away at it for a year in the context of trying to prove things not about mathematics, but about my own programs. 
It’s nice, for example, to prove your algorithm always gets the right answer, and to not need to rely on a bevy of tests that hopefully cover the input space (but how would you ever know if you’d missed a case?). In the meantime, I had also been experimenting with formalizing my category theory notes in Agda. That is, I picked up a new textbook, and this time, coded up all of the definitions and theorems in Agda. Better, I wrote my answers to the book’s exercises as programs, and Agda’s compiler would yell at me if there was a flaw in my reasoning. WOW! What a difference this made! All of a sudden I had the feedback mechanism for mathematics that I’d always wanted. I
could get the red underlines on unconvincing (or wrong) parts of my argument automatically—all on the order of seconds! Truly this is a marvelous technology. When feedback becomes instant, tedious chores turn into games. I found myself learning mathematics into the late evenings on weekends. I found myself redoing theorems I’d already proved, trying to find ways of making them prettier, or more elegant, or more concise, or what have you. I’ve made more progress in category theory in the last six months than I had in a decade. I feel now that proof assistants are the best video game ever invented. The idea to write this book came a few months later, when some friends of mine wanted to generalize some well-established applied mathematics and see what happened. I didn’t know anything about the domain, but thought it might be fun to tag along. Despite not knowing anything, my experience with proving things in Agda meant I was able to help more than any of us expected. I came away with the insight that there are a lot more people out there like me: people who want to be better at mathematics but don’t quite know how to get there. People who are technically minded and have keen domain knowledge, but are lacking the proof side of things. And after all, what is mathematics without proof? So we come now to this book. This book is the textbook I ended up writing—not to teach myself category theory as originally intended, but instead to teach myself mathematics at large. In the process of explaining the techniques, and the necessity of linearizing the narrative, I’ve been forced to grapple with my understanding of math, and it has become very clear in the places I was fooling myself. This book is itself a series of Agda modules, meaning it is a fully type-checked library. That is, it comes with the guarantee that I have told no lies; at least, none mathematically. 
I am not an expert in this field, but have stumbled across a fantastic method of learning and teaching mathematics. This textbook is the book I wish I had found ten years ago. It would have saved me a great deal of wasted effort and false starts. I hope it can save you from the same. Good luck, godspeed, and welcome aboard.
The Co-blub Paradox
It is widely acknowledged that the languages you speak shape the thoughts you can think; while this is true for natural language, it is doubly so in the case of programming languages. And it’s not hard to see why; while humans have dedicated neural circuitry for natural language, it would be absurd to suggest that we also have dedicated neural circuitry for fiddling with arcane, abstract symbols, usually encoded arbitrarily as electrical potentials on a conductive metal. Programming—and mathematics more generally—does not come easily to us humans, and for this reason it can be hard to see the forest for the trees. We have no built-in intuition as to what should be possible, and thus, this intuition is built only by observing the work of more-established practitioners. In the more artificial human endeavors like programming, newcomers to the field must be constructivists—their art is shaped only by the patterns they have previously observed. Because different programming languages support different features and idioms, the imaginable shape of what programming is must be shaped by those languages we understand deeply. In a famous essay, “Beating the Averages,” Graham (2001) points out the so-called Blub paradox. This, Graham says, is the ordering of programming languages by powerfulness; a programmer who thinks in a middle-of-the-road language along this ordering (call it Blub) can identify less powerful languages, but not those which are more powerful. The idea rings true; one can arrange languages in power by the features they support, and subsequently check to see if a language supports all the features we feel to be important. If it doesn’t, it must be less powerful. However, this technique doesn’t work to identify more powerful languages—at best, you will see that the compared language supports all the features you’re looking for, but you don’t know enough to ask for more. Quasi-formally, we can describe the Blub paradox as a semidecision procedure. 
That is, given an ordering over programming languages (here, by their relative “power”), we can determine whether a language is less than our comparison language, but not whether it is more than. We can determine when the answer is definitely “yes,” but not when it is “no!”

Over two decades of climbing this lattice of powerful languages, I have come to understand a lesser-known corollary of the Blub paradox, which I call the Co-Blub paradox. This is: knowledge of lesser languages is actively harmful when transposed into the context of a more powerful language. The hoops you unwittingly jumped through in Blub due to lacking feature X are anti-patterns in the presence of feature X. This is obviously true when stated abstractly, but insidiously hard to see when we are the ones writing the anti-patterns. Let’s look at a few examples over the ages, to help motivate the problem before we get into our introspection proper.

In the beginning, people programmed directly in machine code. Not assembly, mind you, but in raw binary-encoded opcodes. They had a book somewhere showing them what bits needed to be set in order to cajole the machine into performing any given instruction. Presumably if this were your job, you’d eventually memorize the bit patterns for common operations, and it wouldn’t be nearly as tedious as it seems today.

Then came assembly languages, which provided human-meaningful mnemonics to the computer’s opcodes. No longer did we need to encode a jump as the number 1018892 (11111000110000001100 in binary); now it was simply jl 16. Still mysterious to be sure, but you must admit such a thing is infinitely more legible. When encoded directly in machine code, programs were, for the most part, write-only.

But assembly languages don’t come for free; first you need to write an assembler: a program that reads the mnemonics and outputs the raw machine code.
If you were already proficient writing machine code directly, you can imagine the task of implementing an assembler to feel like make-work: a tool to automate a problem you don’t have. In the context of the Co-Blub paradox, knowing the direct encodings of your opcodes is an anti-pattern when you have an assembly language, as it makes your contributions inscrutable to your peers.

Programming directly in assembly eventually hit its limits. Every computer had a different assembly language, which meant if you wanted to run the same program on a different computer you’d have to completely rewrite the whole thing, often needing to translate between extremely different concepts and limitations. Ignoring a lot of
history, C came around with the big innovation that software should be portable between different computers: the same C program should work regardless of the underlying machine architecture—more or less. If you were an assembly programmer, you ran into the anti-pattern that while you could squeeze more performance and perform clever optimizations if you were aware of the underlying architecture, this fundamentally limited you to that platform. By virtue of being in many ways a unifying assembly language, C runs very close to what we think of as “the metal.” Although different computer architectures have minor differences in registers and ways of doing things, they are all extremely similar variations on a theme. They all expose storable memory indexed by a number, operations for performing basic logic and arithmetic tasks, and means of jumping around to what the computer should consider to be the next instruction. As a result, C exposes this abstraction of what a computer is to its programmers, who are thus required to think about mutable memory and about how to encode complicated objects as sequences of bytes in that memory. But then (skipping much history) came along Java, whose contribution to mainstream programming was to popularize the idea that memory is cheap and abundant. Thus, Java teaches its adherents that it’s OK to waste some bytes in order to alleviate the headache of needing to wrangle it all on your own. A C programmer coming to Java must unlearn the idea that memory is sacred and scarce, that one can do a better job of keeping track of it than the compiler can. The hardest thing to unlearn is that memory is an important thing to think about in the first place. There is a clear line of progression here; as we move up the lattice of powerful languages, we notice that more and more details of what we thought were integral parts of programming turn out to be not particularly relevant to the actual task at hand. 
However, the examples thus discussed are already known to the modern programmer. Let’s take a few steps further, into languages deemed esoteric in the present day. It’s easy to see and internalize examples from the past, but those staring us in the face are much more difficult to spot. Compare Java then to Lisp, which—among many things—makes the argument that functions, and even programs themselves, are just as meaningful objects as are numbers and records. Where Java requires the executable pieces to be packaged up and moved around with explicit dependencies on the data it requires, Lisp just lets you write and pass around functions, which automatically carry around
all the data they reference. Java has a design pattern for this called the “command pattern,” about which much ink has been spilled. In Lisp, however, you can just pass functions around as first-class values and everything works properly. It can be hard to grok why exactly if you are used to thinking about computer programs as static sequences of instructions. Indeed, the command pattern is bloated and ultimately unnecessary in Lisp, and practitioners must first unlearn it before they can begin to see the way of Lisp.

Haskell takes a step further than Lisp, in that it restricts when and where side-effects are allowed to occur in a program. This sounds like heresy (and feels like it for the first six months of programming in Haskell) until you come to appreciate that almost none of a program needs to perform side-effects. As it happens, side-effects are the only salient observation of the computer’s execution model, and by restricting their use, Haskell frees its programmers from needing to think about how the computer will execute their code, promising only that it will. As a result, Haskell code looks much more like mathematics than it looks like a traditional computer program. Furthermore, by abstracting away the execution model, the runtime is free to parallelize and reorder code, often even eliding unnecessary execution altogether. The programmer who refuses to acknowledge this reality and insists on coding with side-effects pays a great price in the amount of code they need to write, in its long-term reusability, and, most importantly, in the correctness of their computations.

All of this brings us to Agda, which is as far as I’ve personally come along the power lattice of programming languages. Agda’s powerful type system allows us to articulate many invariants that are impossible to write down in other languages.
In fact, its type system is so precise we can prove that our solutions are correct, which alleviates the need to actually run the subsequent programs. In essence, programming in Agda abstracts away the notion of execution entirely. Following our argument about Co-Blub programmers, they will come to Agda with the anti-pattern of thinking that their hard-earned, battle-proven programming techniques for wrangling runtime performance will come in handy. But this is not the case; most of the techniques we have learned and consider “computer science” are in fact implementation ideas: that is, specific realizations from infinite classes of solutions, chosen not for their simplicity or clarity, but for their efficiency.

Thus, the process of learning Agda, in many ways, is learning to separate the beautiful aspects of problem solving from the multitude
of clever hacks we have accumulated over the years. Much like the fish who is unable to recognize the ubiquitous water around him, as classically-trained programmers, it is nigh-impossible to differentiate the salient points from the implementation details until we find ourselves in a domain where they do not overlap. Indeed, in Agda, you will often feel the pain of having accidentally conflated the two, when your proofs end up being much more difficult than you feel they deserve. Despite the pain and the frustration, this is in fact a feature, and not a bug. It is a necessary struggle, akin to the type-checker informing you that your program is wrong. While it can be tempting to blame the tool, the real fault is in the craftsmanship.
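To make the claim about proving solutions correct without running them a little more concrete, here is a minimal sketch. It assumes the Agda standard library is installed, and the name `n+0≡n` is chosen only for this illustration; it is not taken from the book's own code.

```agda
-- A minimal sketch, assuming the Agda standard library is available.
-- The name n+0≡n is hypothetical and used for illustration only.
open import Data.Nat using (ℕ; zero; suc; _+_)
open import Relation.Binary.PropositionalEquality using (_≡_; refl; cong)

-- Claim: adding zero on the right leaves any natural number unchanged.
-- The proof is by induction on n. Agda checks both cases when the file
-- is type-checked, so no particular input ever needs to be executed.
n+0≡n : (n : ℕ) → n + zero ≡ n
n+0≡n zero    = refl                  -- zero + zero computes to zero
n+0≡n (suc n) = cong suc (n+0≡n n)    -- apply suc to both sides of the inductive hypothesis
```

If the file type-checks, the property holds for every natural number, which is a stronger guarantee than any finite set of test runs could give.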
A World Without Execution?

It’s worth stopping and asking ourselves in what way a non-executable programming language might be useful. If the point of a coding endeavor is the eventual result, whether that be an answer (as in a computation) or series of side-effects (as in most real-world programs), non-execution seems useless at best and masturbatory at worst.

Consider the case of rewriting a program from scratch. Even though we reuse none of the original source code, nor the compiled artifacts, the second time through writing a program is always much easier. Why should this be so? Writing the program has an effect on the world, completely apart from the digital artifacts that result: namely, the way in which your brain changes from having written the program. Programming is a creative endeavor, and every program leaves its mark on its creator. It is through these mental battle scars, accumulated from struggling with and conquering problems, that we become better programmers. Considering a program to be only a digital artifact ignores the very real groove it made on its author’s brain.

It is for this reason that programmers, in their spare time, will also write other sorts of programs: code that is not necessarily useful, but code that allows its author to grapple with problems. Many open-source projects got started as a hobbyist project that someone created in order to learn more about the internals of databases, or to try their hand at implementing a new programming language. For programmers, code is a very accessible means of exploring new ideas, which acts as a forcing function to prevent us from fooling ourselves into thinking we understand when we don’t. After all, it’s much easier to fool ourselves than it is to fool the computer.
So, programmers are familiar with writing programs, not for running the eventual code, but for the process of having built it in the first place. In effect, the real output of this process is the neural pathways it engenders. Agda fits very naturally into this niche; the purpose of writing Agda is not to be able to run the code, but because Agda is so strict with what it allows. Having programmed it in Agda will teach you much more about the subject than you thought there was to know. Agda forces you to grapple with the hard, conceptual parts of the problem, without worrying very much about how you’re going to make the whole thing go fast later. After all, there’s nothing to go fast if you don’t know what you’re building in the first place. Consider the universal programmer experience of spending a week implementing a tricky algorithm or data structure, only to realize upon completion that you don’t need it—either that it doesn’t do what you hoped, or it doesn’t actually solve the problem you thought you had. Unfortunately, this is the rule in software, not the exception. Without total conceptual clarity, our software can never possibly be correct, if for no other reason than we don’t know what correct means. Maybe the program will still give you an answer, but it is nothing but willful ignorance to assume this output corresponds at all to reality. The reality is, conceptual understanding is the difficult part of programming—the rest is just coloring by numbers. The way we have all learned how to do programming is to attempt to solve both problems at once: we write code and try to decipher the problem as we go; it is the rare coder who stops to think on the whiteboard, and the rarer-still engineer who starts there. But again recall the case of rewriting a program. Once you have worked through the difficult pieces, the rest is just going through the motions. 
There exists some order in which we must wiggle our fingers to produce the necessary syntax for the computer to solve our problem, and finding this order is trivial once we have conceptual clarity of what the solution is. This, in short, is the value of learning and writing Agda. It’s less a programming language than it is a tool for thought; one in which we can express extremely precise ideas, propagate constraints around, and be informed loudly whenever we fail to live up to the exacting rigor that Agda demands of us. While traditional programming models are precise only about the “how,” Agda allows us to instead think about the “what,” without worrying very much about the “how.” After all, we’re all very good at the “how”—it’s been drilled into us for our entire careers.
Chapter 1. A Gentle Introduction to Agda
This book is no ordinary prose. It is not just a book, but it is also a piece of software. Literate programming is a technique for interspersing text and computer programs in the same document. The result is a single artifact that can be interpreted simultaneously as a book written in English, or a series of modules written in Agda. Most technical books work only by dint of having been read by a diligent copy-editor. But technical editors are fallible, and can nevertheless miss some details. The compiler, however, is superhuman in its level of nitpicking pedantry. By virtue of being a literate program, this book is capable of being typeset only when all of the code actually works. The book simply will not compile if any of the Agda code presented in it doesn’t typecheck, or if any of its tests fail. This last point is particularly important. As we will see shortly, Agda programs come with very extensive tests. In this chapter we will get acquainted with Agda’s syntax, core ideas, and interactive tooling. By the end, you will be able to parse Agda code mentally, be able to write simple functions, and type the many funny Unicode characters which are ubiquitous in real Agda code. Despite being written in Agda, this book is not about Agda, and so the goal is to get you to a minimum degree of competency as quickly as possible. The technical content of this chapter is slightly heavier than desirable if you do not already have a strong programming background. Should that be the case, the best option might be to skim this chapter to get a feeling for what Agda can do, and return here as the need arises.
1.1 The Longevity of Knowledge
This book was written against Agda version 2.6.3, and later chapters assume that Agda’s standard library agda-stdlib is present at version 1.7.2. In order to assure longevity, I will not include any instructions for how to get your hands on these dependencies, instead trusting fully and completely in your agency. It is probable that, like all knowledge, the information in this book will slowly decay. Thankfully, mathematics seems to decay slower than most of humanity’s learning, but the engineering side can (and should) eventually fall down. The Agda language might not survive the test of time, or it might evolve into something unrecognizable. Its standard library almost certainly will. To combat this bit-rot, the structure of this book is to introduce concepts by defining them for ourselves. Only later will we import those same ideas from the standard library. This is done in order to maintain compatibility with the wider world as it stands today. Should the standard library change, you can still make progress by using the definitions provided here. Should the language itself die, well, this book has always been much more about mathematics than about Agda.
1.2 Modules and Imports
When code is presented in this book, it will be shown with a thick left rule, as below: 0
module Chapter1-Agda where
This one-line example is not merely an illustration to the reader. By virtue of being a literate document, every chapter in this book must be a valid Agda file, and every Agda file begins with a module declaration like above. This declaration must preface every chapter, and we’re not allowed to write any code until we’ve done the module ritual. The module is Agda’s simplest unit of compilation. Every Agda source file must begin with a module declaration which matches the name of the file. Since this module is called Chapter1-Agda, if you’d like to follow along at home, you must save your file as Chapter1-Agda.agda. Failure to do so will result in a helpful error message:
i INFO WINDOW The name of the top level module does not match the file name.
Whenever feedback from the compiler is presented in this book, we will typeset it as above. You’ll learn how to interact with the compiler momentarily. Agda modules act as namespaces: limiting access and visibility to definitions within. Modules can be nested inside of one another, as in the following example: 0
module example where
  -- This is a comment
This introduces a new module example inside of Chapter1-Agda. You will notice a lack of curly braces here. Unlike many programming languages, Agda doesn’t use delimiters to indicate blocks; instead, it uses significant whitespace. That is, in order for the comment to be inside the module example, it must be indented relative to the module keyword. Because leading whitespace is meaningful in Agda, we must be careful to get our indentation right. This nested indentation can get very deep in real programs. It’s not a problem in a text editor, but in print, like you are reading now, real estate is very expensive. Therefore, we will try to elide as much indentation as possible. You might have noticed a little number 0 in the left margin gutter of each code block. This indicates how many spaces should precede each line of the code block. In this case, there should be no preceding indentation. If the first line of a block of code is at the same relative indent level as the last line of the previous one, we’ll just mark the column depth in the gutter. However, if the indentation has increased, we’ll draw attention to it with a ⇥ symbol. Likewise, if the indentation has decreased, we’ll use a ⇤ symbol. Thus, you might see ⇥ 2, meaning that the indentation of this block has increased such that the leftmost character should be 2 spaces away from the beginning of the line. Note that this is an absolute positioning scheme; ⇥ 8 means that you should begin in column 8, not that you should add 8 additional spaces relative to the previous line. To illustrate this convention, we can look at four code blocks presented separately. The first is a new module foo.
⇤ 0
module foo where
The second contains a doubly-nested submodule, first bar and then qux: ⇥ 2
module bar where
  module qux where
Our third code block introduces yet another module—this time at the same relative indentation: 4
module zaz where
And finally, we come to our last code block illustrating the indentation convention of the book: ⇤ 2
module ram where
If we wanted to lay out these four preceding blocks into our Agda file, the actual indentation for everything should look like this: ⇤ 0
module foo where
  module bar where
    module qux where
    module zaz where
  module ram where
Don’t worry; this indentation convention will be much less tedious throughout the book than it seems. The illustration here is merely to get you paying attention to the indicators. Our actual code will require dramatically less changing of indentation. The important point here is that you should indent when you see a ⇥, and likewise de-dent when you see a ⇤. If you (or Agda) are ever confused about where your indentation should be, use a number of spaces equal to the number indicated. Getting your indentation wrong is a serious error that the compiler will complain about, so if you get mysterious errors when working through the code presented here, the first diagnostic step is to ensure your indentation is correct. I said earlier that modules act as namespaces. In the case of multiple modules in a single file, the modules are given fully-qualified names, in which they inherit the names of all of their parent modules as well. For example, in the code block above, we have defined five submodules, which have the fully-qualified names:
• Chapter1-Agda.foo
• Chapter1-Agda.foo.bar
• Chapter1-Agda.foo.bar.qux
• Chapter1-Agda.foo.bar.zaz
• Chapter1-Agda.foo.ram

The module structure of an Agda program always forms a tree. We will use many modules throughout this book—albeit ones much more interesting than presented thus far. A common pattern we will take is to introduce a new module whenever we’d like to explore a new line of reasoning. The idea being that learning abstract things like math requires lots of specific examples, from which we will generalize. Thus, we need a mechanism to work out the gory details of a specific example without “polluting” our eventual results.

One distinct advantage of organizing chapters into modules is that chapters thus become conceptual units in our program. If a later chapter depends on an earlier one, this fact must be explicitly documented in the prose by way of an import. If later chapters require code or extend concepts from earlier ones, they can simply import the necessary pieces. We will also assume the presence of the Agda standard library (The Agda Community (2023)), but will first derive the pieces we use from scratch before importing it. Agda’s flexibility is outstanding, and common “magical” constructions in other languages—like numbers, booleans, and if..then..else syntax—are all defined by the standard library. Getting a chance to define them for ourselves will help build an appreciation that none of this stuff needs to be special.
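To make the naming scheme from this section concrete, here is a minimal sketch. The module names Sketch-Modules, shapes, circles, squares, and colors are hypothetical, invented only for illustration. The comments spell out the fully-qualified name each submodule would receive, assuming the file is saved as Sketch-Modules.agda.

```agda
module Sketch-Modules where

-- Fully-qualified name: Sketch-Modules.shapes
module shapes where
  -- Fully-qualified name: Sketch-Modules.shapes.circles
  module circles where
  -- Fully-qualified name: Sketch-Modules.shapes.squares
  module squares where

-- Back at the outer indentation level, so this module is a
-- sibling of shapes: Sketch-Modules.colors
module colors where
```

The only thing distinguishing a child module from a sibling here is indentation, which is why the gutter markers in this book deserve your attention.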
1.3 A Note on Interaction
Agda is an interactive programming language. That means at any point in time, you can load an Agda file via the Load ( C-c C-l in Emacs and VS Code) command. This will cause Agda to parse, typecheck, and highlight the program. It’s a good habit to reload your file every time you make a syntactically valid change. Doing so will alert you to problems as soon as they might occur. While working through this book, you are encouraged to reload your file after every code block.
1.4 Importing Code
We will break our define-it-before-you-import-it rule just once to illustrate how Agda’s module importing system works. Because imports are scoped to the currently open module, we will first open a new module: ⇤ 0
module Example-Imports where
Inside this module we are free to import and define anything we like, without fear that it will leak into Chapter1-Agda where we will do the majority of our work. We can import the booleans from the Data.Bool module as follows: ⇥ 2
import Data.Bool
This line tells Agda to go find a module called Data.Bool somewhere (which it will do by looking in the current project and any globally-installed libraries for a file called Data/Bool.agda). Just importing it, however, is rarely what we want, as all the identifiers have come in fully-qualified. Ignoring the syntax for a moment, you will notice the following code example is much more verbose than we’d like: 2
_ : Data.Bool.Bool
_ = Data.Bool.false
We will dive into the exact syntax here more fully in a moment, but first, it’s worth learning how to avoid the fully-qualified names. After importing a module, we can also open it, in which case, all of its contents get dumped into the current environment. Thus, we can rewrite the previous two code blocks as: 2
import Data.Bool
open Data.Bool

_ : Bool
_ = false
Of course, it’s rather annoying to need to import and open a module every time we’d like to use it. Thankfully, Agda provides us some syntactic sugar here, via open import. Rewriting the code again, we get:
2
open import Data.Bool

_ : Bool
_ = false
There is significantly more to say about Agda’s module system, but this is enough to get you up and running. We will cover the more advanced topics when we come to them.
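As a small taste of those later topics, the sketch below shows two import forms that go slightly beyond what this section covers: restricting an import with using, and importing under an alias with as. It assumes the standard library is installed; the module name Sketch-Imports and the alias B are hypothetical choices made for illustration.

```agda
module Sketch-Imports where

-- Bring only these three names into scope, unqualified.
open import Data.Bool using (Bool; true; false)

-- Import the same module again, reachable under the short alias B.
import Data.Bool as B

-- Uses the unqualified names from the first import.
_ : Bool
_ = true

-- B.Bool is the very same type as Bool, so false is a valid value here.
_ : B.Bool
_ = false
```

Restricting the import keeps the current environment small, which matters in this book because we will soon define our own Bool and would rather not have two of them in scope at once.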
1.5 Semantic Highlighting
Unlike most programming languages, syntax highlighting in Agda is performed by the compiler, rather than some hodgepodge regular expressions that do their best to parse the program. Therefore, it’s more correct to call Agda’s highlighting semantic rather than syntactic. Since Agda’s grammar is so flexible, getting trustworthy highlighting information from the compiler is a necessity for quickly parsing what’s going on. The lack of semantic highlighting outside of text editors makes Agda a much harder language to read casually. If you’re ever snooping through Agda code, do yourself a favor and load it into an editor to ensure you get the semantic highlighting. It makes a world of difference. The highlighting in this book was generated directly from the Agda compiler. The default colors that Agda chooses when doing highlighting are helpful for quickly conveying information, but they are by no means beautiful. For the sake of the reader’s enjoyment, I have chosen my own color-scheme for this book. It is presented below, alongside the standard Agda colors, if you’d like a guide for translating between the book and your editor:

Element                      Book                 Editor
Numbers                      grey                 purple
Strings                      grey                 red
Comments                     red                  red
Keywords                     yellow               orange
Constructors                 red                  green
Bound Variables              grey                 black
Record Fields                forest green         pink
Module Names                 black                purple
Functions                    blue                 even more blue
Interactive Holes            papaya background    green background
Underspecified Elaboration   saffron background   bright yellow background
We haven’t yet discussed most of these ideas, but perhaps you can see why we have not followed the standard color-scheme in this book; its high information density comes at the cost of a frenetic, psychedelic experience. Don’t feel like you need to memorize this table. Whenever a new concept is introduced, I’ll share the relevant highlighting information, both in the book and in your editor. And with a little bit of experience, you’ll internalize it all just from exposure. But feel free to return to this section if you’re ever having a hard time mentally parsing what’s going on.
1.6 Types and Values
Since this is a book about using programming to do mathematics, it bears discussing data at some length, a topic of the utmost importance to programmers. On a physical machine, all data is stored as small valences of electrical charge, arranged into a matrix of cells laid out onto physical wafers, mapped by the operating system into a logical view that we pretend is laid out linearly into neat chunks of eight or 64 pieces, that we then interpret as numbers in binary, and which we then shuffle into higher order assemblages, building domain- and application-specific logical structure on top. This is a physical fact, caused by the path dependency that computation has taken over history. Programmers are often split into two camps: those who worship and count bits as if they are precious, and those of the opinion that we have lots of memory, and thus room to ignore it. Regardless of what camp you find yourself in, thinking about data in terms of this hierarchy of abstraction will not be conducive to our
purposes. A great deal of this book involves crafting data; that is, designing the shapes that constrain the values we are interested in discussing. Most problems in mathematics and in programming reduce to finding the right set of constraints, and rigorously pushing them from one form of data to another. Data is constrained by types, which are rigorous means of constructing and deconstructing data. You likely already have a notion of what types are, what they do, and whether or not you like them, but the following section will nevertheless be informative and elucidating. Agda takes its types extremely seriously; it is highly unlikely you have ever used a language with a type system one tenth as powerful as Agda’s. This is true even if you’re intimately familiar with strongly-typed languages like Rust, TypeScript, Haskell, C++, or Scala. Agda is a dependently-typed programming language, which means its types can be computed. For example, you might make a function that returns a String if the 10th Fibonacci number is 56, and a Boolean otherwise. At first blush, this seems impractical—if not useless—but it is in fact the defining feature which makes Agda suitable for doing mathematics. But let’s not get ahead of ourselves. Of utmost importance in Agda is the notion of a typing judgment: the static assertion that a particular value has a particular type. A typing judgment is the fancy, academic name for a type declaration. For example, let’s consider the booleans, of which we have two values: true and false. Because true is a Bool, we would write the judgment true : Bool, where the colon can be read aloud as “has the type,” or “is of type.” We can’t yet write this judgment down, since we are in a new module and thus have lost our imports that brought true and Bool into scope. In Agda, we can assert the existence of things without having to give them a definition by using the postulate keyword.
As we will see later, this can be a very powerful tool, which must be used with great caution since it is an excellent foot-gun. For now, we will be reckless, and use a postulate to explicitly write down some typing judgments. First, we assert that the type Bool exists: ⇤ 0
module Example-TypingJudgments where
  postulate
    Bool : Set
and then, at the same level of indentation, we can postulate the existence of our two booleans by also giving them typing judgments:
4
false : Bool
true  : Bool
You will have noticed that Bool : Set itself looks a great deal like a typing judgment. And in fact, it is. Set is one of the few built-in things in Agda, and it corresponds as a first approximation to “the type of all types.” That is, the judgment Bool : Set says “Bool is a type.” And therefore, since Bool is a type, we are thus justified in saying that false and true are of type Bool. But we can’t just put any old thing on the right side of the typing colon! Try, for example, adding the following judgment to our postulate block: 6
illegal : false
If you attempt to load this definition into Agda via Load ( C-c C-l in Emacs and VS Code), you’ll get an angry error message stating: i INFO WINDOW Bool should be a sort, but it isn't when checking that the expression false has type _4
This is not the easiest thing to decipher, but what Agda is trying to tell you is that false is not a type, and therefore that it has no business being on the right-hand side of a colon. The general rule here is that you can only put Foo on the right side of a colon if you have earlier put it on the left of Set. In code, we can say: 4
Foo : Set
-- ...
bar : Foo
As a note on terminology, anything that belongs in Set we will call a type. Anything which belongs to a type is called a value. Thus in the previous example, we say Foo is a type, and bar is a value. As a matter of convention, types’ names will always begin with capital letters, while values will be given lowercase names. This is not required by Agda; it’s merely for the sake of our respective sanities when the types inevitably get hairy.
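The rule above, that a name may appear on the right of a colon only after it has appeared on the left of Set, can be sketched as a complete postulate block. The names Sketch-Judgments, Color, red, and blue are hypothetical, chosen only to show the pattern with something other than booleans.

```agda
module Sketch-Judgments where

postulate
  Color : Set     -- Color is on the left of Set, so Color is a type
  red   : Color   -- Color may now appear on the right: red is a value
  blue  : Color   -- likewise, blue is a value of type Color
```

Swapping the last line for something like oops : red would be rejected in the same way as the illegal example, since red is a value rather than a type.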
It’s important to note that while types may have many values, every value has exactly one type. Since we know that bar : Foo, we know for a fact that bar is not of type Qux (unless Foo and Qux happen to be the same type.) Postulating types and values like we have been is a helpful piece of pedagogy, but it’s not how things usually get done. Just as Dijkstra popularized the role of structured programming by arguing programmers should not be writing jumps directly, but instead using if and while and so forth, we will note that real programming does not get done by writing typing judgments directly (nor does mathematics, for that matter.) Why is this? One problem is, we’d like to say that false and true are the only booleans. But of course, nothing stops us from further postulating another Bool, perhaps: 4
file-not-found : Bool
You can imagine the chaos that might occur if someone added such a judgment after the fact. All of a sudden, our programs, carefully designed to handle only the binary case of booleans, would now need to be retrofitted with extra logic to handle the case of file-not-found. This possibility is anathema to the concept of modular and composable programs—those that we can write and prove correct once, without needing to worry about what the future will bring. In short, working directly with postulates is dangerous and, in general, an anti-pattern. Instead, we will investigate a tool that allows us to simultaneously define Bool, false and true into a closed theory. That is, we’d like to say that these are the only two booleans, allowing us and Agda to reason about that fact. To do this, we can use a data declaration: ⇤ 0
module Booleans where
  data Bool : Set where
    false : Bool
    true  : Bool
which simultaneously asserts the three typing judgments Bool : Set, false : Bool, true : Bool, and further, states that this is an exhaustive list of all the booleans. There are, and can be, no others. When written like this, we often call false and true the data constructors or the introductory forms of Bool.
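The same recipe works for any finite collection of values. As a sketch, here is a closed type with four constructors; the names Sketch-Data, Direction, north, east, south, and west are hypothetical and appear nowhere else in this book.

```agda
module Sketch-Data where

-- One declaration asserts five typing judgments at once:
-- Direction : Set, plus one judgment per constructor.
data Direction : Set where
  north : Direction
  east  : Direction
  south : Direction
  west  : Direction
```

Unlike the postulate version, nobody can later sneak in a fifth Direction; the list of constructors is fixed at the point of definition.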
1.7 Your First Function
After all of this preamble, you are probably itching to write a program in Agda. As a first step, let’s write the not function, which transforms false into true and vice-versa. Functions in Agda begin with a typing judgment using a function arrow (which you can type in your editor via \to ), and are immediately followed by a definition of the function: ⇤ 2
not : Bool → Bool   -- 1
not = ?             -- 2
Line 1 should be read aloud as “not is a function that takes a Bool argument and returns a Bool,” or, alternatively, as “not has type Bool to Bool.” The question mark on line 2 says to Agda “we’re not sure how to implement this function.” Agda acknowledges that most of the time you’re writing a program, it’s an incomplete program. Agda is an interactive tool, meaning it can help you refine incomplete programs into slightly less-incomplete programs. Writing an Agda program thus feels much more like a conversation with a compiler than it does being left alone in your text editor, typing away until the program is done. Incomplete programs are programs that contain one or more holes in them, where a hole is part of the program that you haven’t written yet. Thanks to Agda’s exceptionally strong type system, it knows a great deal about what shape your hole must have, and what sorts of program-fragments would successfully fill the hole. In the process of filling a hole, perhaps by calling a function that returns the correct type, you will create new holes, in this case corresponding to the arguments of that function call. Thus the model is to slowly refine a hole by filling in more and more details, possibly creating new, smaller holes in the process. As you continue solving holes, eventually there will be no more left, and then your program will be complete. The question mark above on line 2 is one of these holes. After invoking Load ( C-c C-l in Emacs and VS Code) on our file, we can ask Agda for help in implementing not. Position your cursor on the hole and invoke MakeCase ( C-c C-c in Emacs and VS Code), which will replace our definition with: 2
not : Bool → Bool
not x = {! !}
You will notice two things have now happened; Agda wrote x on the left side of the equals sign, and it replaced our ? with {! !} . This latter change is a no-op; ? and {! !} are different syntax for the same thing—a hole. As a reader playing at home, you will also have noticed Agda’s info panel has changed, updating our “visible” goal from i INFO WINDOW ?1 : Bool → Bool
to i INFO WINDOW ?1 : Bool
Believe it or not, these changes engendered by invoking MakeCase ( C-c C-c in Emacs and VS Code) have a lot to teach us about how Agda works. Our first hole, way back at 1 had type Bool → Bool, because we had written not = ? . But we already knew what type not had, because of the type signature we gave it on the line immediately above (Bool → Bool). After running MakeCase ( C-c C-c in Emacs and VS Code) however, our code is now not x = {! !} , and our hole has changed types, now bearing only Bool. Somehow we lost the Bool → part of the type— but where did it go? As it happens, this first Bool → in the type corresponded to the function’s parameter. Saying not : Bool → Bool is the same as saying “not is a function that takes a Bool and returns a Bool.” We can verify this interpretation by asking Agda another question. By moving your cursor over the {! !} and running TypeContext ( C-c C-, in Emacs and VS Code), Agda will respond in the info window with: i INFO WINDOW Goal: Bool ——————————————————— x : Bool
We can see now that the hole itself (called Goal in the info window) is a missing expression whose type should be Bool. But, more interestingly, Agda is also telling us that we now have a variable x in scope, whose type is Bool. In order to pull the Bool → off of the type signature, we were forced to introduce a binding x of type Bool, which corresponds exactly to not’s function argument. There is an important lesson to be learned here, more than just about how Agda’s type system works. And that’s that you can invoke TypeContext ( C-c C-, in Emacs and VS Code) at any time, on any hole, in order to get a better sense of the big picture. In doing so, you can “drill down” into a hole, and see everything you have in scope, as well as what type of thing you need to build in order to fill the hole. We can ask Agda for more help to continue. This time, if we put our cursor in the hole and invoke MakeCase with argument x ( C-c C-c in Emacs and VS Code), Agda will enumerate every possible constructor for x, asking us to fill in the result of the function for each case. Here, that looks like: 2
not : Bool → Bool
not false = {! !}
not true  = {! !}
You’ll notice where once there was one hole there are now two, one for every possible value that x could have been. Since x is a Bool, and we know there are exactly two booleans, Agda gives us two cases—one for false and another for true. From here, we know how to complete the function without any help from Agda; we’d like false to map to true, and vice versa. We can write this by hand; after all, we are not required to program interactively! 2
not : Bool → Bool
not false = true
not true  = false
Congratulations, you’ve just written your first Agda function! Take a moment to reflect on this model of programming. The most salient aspect is that the compiler wrote most of this code for us, prompted only by our invocations of interactivity. Agda supports a great deal of interactive commands, and you will learn more as you progress through this book. The amount of work that Agda can do for you is
astounding, and it is a good idea to remember what the commands do and how to quickly invoke them for yourself. In particular, you will get a huge amount of value out of MakeCase ( C-c C-c in Emacs and VS Code) and TypeContext ( C-c C-, in Emacs and VS Code); use the former to get Agda to split variables into their constituent values, and the latter to help you keep track of what’s left to do in a given hole. The other important point to reflect upon is the declarative style that an Agda program takes on. Rather than attempting to give an algorithm that transforms some bit pattern corresponding to false into one corresponding to true, we simply give a list of definitions and trust the compiler to work out the details. It doesn’t matter how Agda decides to map not false into true, so long as it manages to!
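The same workflow scales to types with more than two constructors. The following sketch is what you might end up with after writing a type signature, splitting the argument with MakeCase, and filling in each hole by hand. The names Sketch-Functions, TrafficLight, red, yellow, green, and next are hypothetical, and the cycling order is just one choice among many.

```agda
module Sketch-Functions where

data TrafficLight : Set where
  red    : TrafficLight
  yellow : TrafficLight
  green  : TrafficLight

-- One equation per constructor, exactly the cases that
-- splitting on the argument would generate.
next : TrafficLight → TrafficLight
next red    = green
next green  = yellow
next yellow = red
```

Because TrafficLight has three constructors, a case split produces three holes rather than two. Deleting any one of the three equations should make Agda complain that the definition does not cover every case.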
1.8 Normalization
At any point in time, we can ask Agda to normalize an expression for us, meaning we’d like to replace as many left-hand sides of equations with their right-hand sides as possible. This is done via the Normalise ( C-c C-n in Emacs and VS Code) command, which takes an argument asking us what exactly we’d like to normalize. But before running it for yourself, let’s work out the details by hand to get a sense of what the machine is doing behind the scenes. Let’s say we’d like to normalize the expression not (not false)—that is, figure out what it computes to. When we look at what “equations” we have to work with, we see that we have the two lines which define not. Inspecting our expression again, we see that there is no rule which matches the outermost not. The expression not false is neither of the form false, nor is it true, and so the outermost not can’t expand to its definition. However, the innermost call to not is not false, which matches one of our equations exactly. Thus, we can rewrite not false to true, which means we can rewrite the whole expression to not true. At this point, we now have an expression that matches one of our rules, and so the other rule kicks in, rewriting this to false. Because every “rewrite” equation comes from a function definition, and there are no more function calls in our expression, no more rewrite rules can match, and therefore we are done. Thus, the expression not (not false) normalizes to false. Of course, rather than do all of this work, we can just ask Agda to evaluate the whole expression for us. Try now invoking Normalise
with argument not (not false) ( C-c C-n in Emacs and VS Code). Agda will respond in the info window: i INFO WINDOW false
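The hand-worked rewriting from this section extends to deeper expressions in the same way. The sketch below repeats the definitions of Bool and not so that it stands alone, and traces a three-deep expression in comments. The module name Sketch-Normalization is hypothetical, and the trace is written out by hand rather than being captured output.

```agda
module Sketch-Normalization where

data Bool : Set where
  false : Bool
  true  : Bool

not : Bool → Bool
not false = true
not true  = false

-- Normalizing not (not (not true)), innermost call first:
--
--   not (not (not true))
--   not (not false)        by the equation  not true  = false
--   not true               by the equation  not false = true
--   false                  by the equation  not true  = false
--
-- No function calls remain, so no further equation can apply.
```

Running Normalise on not (not (not true)) in this module is a good way to check the trace against what Agda actually computes.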
1.9 Unit Testing
While Normalise ( C-c C-n in Emacs and VS Code) is a nice little tool for interactively computing small results, we can instead write a small unit test. Breaking our “don’t import it before you define it” rule again, we can bring some necessary machinery into scope: 2
open import Relation.Binary.PropositionalEquality
Now, we can write a unit test that asserts the fact that not (not false) is false, just as we’d expect. Using \== , we can type the ≡ symbol, which is necessary in the following snippet: 2
_ : not (not false) ≡ false
_ = refl
Whenever we’d like to show that two expressions normalize to the same thing, we can write a little unit test of this form. Here, we’ve defined a value called _ (which is the name we use for things we don’t care about), and have given it a strange type and a stranger definition. We will work through all the details later in sec. 3.4. The important bits here are only the two expressions on either side of _≡_, namely not (not false) and false, which we’d like to show normalize to the same form. Attempting to Load ( C-c C-l in Emacs and VS Code) the file at this point will be rather underwhelming, as nothing will happen. But that’s both OK and to be expected; that means our unit test passed. Instead, we can try telling Agda a lie: 6
_ : not (not false) ≡ true
_ = refl
Try running Load ( C-c C-l in Emacs and VS Code) again, which will cause Agda to loudly proclaim a problem in the info window:
i INFO WINDOW false != true of type Bool when checking that the expression refl has type not (not false) ≡ true
This error is telling us that our unit test failed; that not (not false) is actually false, but we said it was true. These unit tests only yell when they fail. In fact, it’s worse than that; these unit tests prevent compilation if they fail. Thus when dealing with tests in Agda, it’s just like the old proverb says—no news is good news!
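Several such tests can sit side by side in one module. The sketch below is one way to lay them out, assuming the standard library is installed; it uses the library's own Bool and not rather than the ones we defined by hand, and the module name Sketch-UnitTests is hypothetical.

```agda
module Sketch-UnitTests where

open import Data.Bool using (Bool; true; false; not)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

-- Each test is an anonymous definition: the type states the claim,
-- and refl succeeds only if both sides normalize to the same thing.
_ : not true ≡ false
_ = refl

_ : not false ≡ true
_ = refl

_ : not (not true) ≡ true
_ = refl
```

Loading this file should produce no output at all, which is exactly what a passing test suite looks like in Agda.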
1.10 Dealing with Unicode
In the last section, we got our first taste of how Agda uses Unicode, when we were required to use the character ≡. This is the first of much, much more Unicode in your future as an Agda programmer. The field of programming has a peculiar affinity for the standard ASCII character set, which is somewhat odd when you think about it. What symbol is == supposed to be, anyway? Is it merely a standard equals sign? If so, why not just use a single =, which would be much more reasonable. Maybe instead it’s supposed to be an elongated equals sign? Does that correspond to anything historically, or was it just a new symbol invented for the purpose? If we’re inventing symbols anyway, surely =? would have been a better name for the test-for-equality operator. In fact, Agda follows this line of reasoning, but decides that, since we have a perfectly good Unicode symbol anyway, we ought to just use ≟ instead! Unicode, more than the weird lack of computation and advanced type system, is many programmers’ biggest challenge when coming to Agda. Learning to wrangle all of the Unicode characters can be daunting for at least three reasons—how can we distinguish all of these characters, what should we call them, and how in heck do we input them? The first problem—that of identification—is that there are great swathes of Unicode characters, many of which look identical. For example, when I was getting started, I was often flummoxed by the difference between ⊎ and ⨄. Believe it or not, the former symbol is extremely important in Agda, while the latter won’t play nicely with the
standard library. As it happens, the former symbol is the “multiset union”, while the latter is the “n-ary union operator with plus.” As far as Unicode (and Agda) is concerned, these are completely different characters, but, as far as you and I can tell, they are indistinguishable.

Unfortunately, there is no real solution to this problem, other than putting in the time, and feeling the pain while you’re sorting things out in your head. However, this is less of a problem in practice than it might seem. When you’re first getting started, you’re not going to dream up the ⊎ operator on your own. Instead, you’ll just read some code that happens to use this operator, and thus we can apply the imitable monkey-see-monkey-do strategy. Whenever you encounter a symbol you aren’t familiar with, simply invoke Agda’s DescribeChar ( C-u C-x = in Emacs and VS Code) command. When invoked with the cursor over ⊎, the info window will respond with the following output:
INFO WINDOW
character: ⊎ (displayed as ⊎)
code point in charset: 0x228E
to input: type "\u+" with Agda input method
Character code properties:
name: MULTISET UNION
The real output from Agda has been truncated here, but there are two important pieces of information here. The first is under the name heading, which tells you what to call this particular symbol. And the other is under to input, which is how exactly you can type this character in your editor. Try typing \u+ in your editor, and you should indeed get a ⊎ on screen. Of course, this hunt-and-peck approach works only when you’re trying to learn how to input a particular symbol that you have access to in source code. What can you do instead if you have a particular symbol in mind? Thankfully we don’t need to resort to skimming code, hoping to find what we’re looking for. Instead, we can attempt to compose the symbol, by guessing the necessary keystrokes. When writing Agda, you can input Unicode characters by typing a backslash, and then a mnemonic for the character you’d like. There are a few different naming schemes used by Agda, but for abstract symbols like ⊎ a good bet is to try to press a series of characters that
together build the one you want. To illustrate this composition of abstract symbols, we can take a look at some examples. None of these should be surprising to you.
• ∙ is input by typing \.
• ∘ is input by typing \o
• × is input by typing \x
• ¿ is input by typing \?
• ⊎ is input by typing \u+
• ⊚ is input by typing \oo
• ⊗ is input by typing \ox
• ⊙ is input by typing \o.
• → is input by typing \->
• ← is input by typing \<-
The right arrow → can also be written in fewer keystrokes as \to . Typesetting aficionados might be delighted that this is the mnemonic that LaTeX uses to insert the arrow symbol. If you’re familiar with LaTeX, many bindings you know from there will also work in Agda:
• typing \to produces →
• typing \in produces ∈
• typing \inn produces ∉
• typing \sub produces ⊂
• typing \neg produces ¬
• typing \top produces ⊤
• typing \bot produces ⊥
Similarly to LaTeX, we can prefix bindings with _ or ^ to make subscript or superscript versions of characters, as in:
• \_1 for ₁
• \_2 for ₂
• \_x for ₓ
• \^1 for ¹
• \^f for ᶠ
• \^z for ᶻ
All numbers have sub- and superscript versions, but only some letters do. This is not Agda’s fault; address your complaints to the Unicode Consortium regarding the unfortunate lack of a subscript f.

Mathematicians and Agda-users alike are very fond of Greek letters, which you can type by prefixing the Latin-equivalent letter with \G .
• type \Ga for α, which is the Greek letter for a
• type \Gd for δ, which is the Greek letter for d
• type \GD for Δ, which is the Greek letter for D
• type \Gl for λ, which is the Greek letter for l
• type \GL for Λ, which is the Greek letter for L
As you can see, the input system is quite well organized when you wrap your head around it, assuming you know Greek!

There is one other block of symbols you will probably need at first: the so-called blackboard bold letters. These are often used in mathematics to describe sets of numbers: the natural numbers being ℕ, the reals being ℝ, the rationals being ℚ (short for “quotients”), and so on. You can type blackboard bold letters with the \b prefix. The three examples here are input as \bN , \bR and \bQ respectively.

Suspend your disbelief; programming in Unicode really is quite delightful if you can push through the first few hours. Having orders of magnitude more symbols under your fingers is remarkably powerful, meaning you can shorten your identifiers without throwing away
information. In addition, you will notice a common vocabulary for how these symbols get used. Being able to recognize more symbols means you can recognize more concepts at a distance. For example, we will often use the floor brackets ⌊⌋ (given by \clL and \clR ) as a name for an “evaluation” function. As a bonus, when transcribing mathematics, your program can look exceptionally close to the source material. This is nice, as it minimizes the cognitive load required to keep everything in your head. You’re already dealing with big ideas, without having to add a layer of naming indirection into the mix.
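To see a few of these bindings in context, here is a small sketch that uses only symbols covered above. The names first-of and second-of are made up for this illustration, and the imports come from Agda’s builtin modules rather than from the Booleans module used elsewhere in this chapter. Each comment notes how to type the symbols on the lines below it.

```agda
open import Agda.Builtin.Bool using (Bool; true; false)
open import Agda.Builtin.Equality using (_≡_; refl)

-- → is typed \to (or \->); x₁ and x₂ are typed x\_1 and x\_2.
-- first-of is a hypothetical helper that returns its first argument.
first-of : Bool → Bool → Bool
first-of x₁ x₂ = x₁

-- λ is typed \Gl.
-- second-of is a hypothetical helper written as an anonymous function.
second-of : Bool → Bool → Bool
second-of = λ x₁ x₂ → x₂

-- ≡ is typed \==. These unit tests typecheck only if both sides agree.
_ : first-of true false ≡ true
_ = refl

_ : second-of true false ≡ false
_ = refl
```

If the file loads without complaint, every symbol was entered correctly; a mistyped look-alike character would instead show up as a name that is not in scope.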
1.11
Expressions and Functions
Agda descends from the ML family of languages, making it related to Elm, F#, Haskell, OCaml, and PureScript, among many, many other cousins. This section will give you a brief overview of how to conceptualize Agda as a programming language, including some sense of how to mentally parse it.

Agda is an expression-based language, meaning that every unit of code must produce a value. Therefore, there are no control-flow statements in this language, because control flow doesn’t directly produce a value. Agda has no support for common constructions like for or while loops, and while it does support if..then..else, its prevalence is very subdued. In Agda, we have much better tools for branching, as we will soon see.

This ascetic dedication to expressions precludes many other features that you might think to be ubiquitous in programming languages. The most substantial of these elided features is that Agda does not support any form of mutable variables. While at first blush this feels like an insane limitation, mutable variables don’t end up being a thing you miss after a little time apart. The problem with mutable variables is that code with access to them can no longer be isolated and studied on its own. In the presence of mutable variables, the behavior of a piece of code depends not only on the program snippet, but also implicitly on the historical context under which this snippet is being run. By the Law of Demeter (which states that code should assume as little as possible about anything), mutable variables are quite a taxing feature, in that they effectively limit our ability to confidently reuse code.

Writing real code without mutable variables is surprisingly comfortable, once you’ve wrapped your head around how. The trick is
to perform a series of syntactic transformations which eliminate the mutation. For example, whenever you need to read from a mutable variable, instead, just pass in a function argument. That is, instead of:
int foo;
function bar() {
  // ...
  int a = foo;
  // ...
}
you can write:
function bar(int a) {
  // ...
}
If you’d like to change a mutable variable, instead, just return it as an additional result of the function! While this seems (and is) clunky in C-like languages, it’s much less painful in Agda. We will not discuss the matter further, instead hoping the motivated reader will pick up the necessary tricks as we go.

There is one further point about functions in Agda that I’d like to make, however. The syntax can be rather weird. You are likely familiar with C-style function calls which look like this:
foo(bar, 5, true)
Rather than these parentheses and commas, Agda uses juxtaposition as its syntax for function calls. The above call would look, in Agda, like this:
foo bar 5 true
Note that the arguments are separated only by whitespace. If disambiguation is necessary (which it will be whenever we have nested function calls) we surround the entire nested expression in parentheses:
baz 0 (f false) (foo bar 5 true)
This would be written in the ALGOL style as
baz(0, f(false), foo(bar, 5, true))
While it might feel like an unnecessarily annoying break in conventional syntax, there are mightily-good theoretical reasons for it, addressed in sec. 1.20. Given this new lens on the syntax of function calls, it’s informative to look back at our definition of not; recall:
not : Bool → Bool
not false = true
not true = false
We can now mentally parse these definitions differently; that is, we can read them literally. The left side of each equation is just a function call! Therefore, the first equation says “the function not called with argument false is equal to true”. The equals sign is really and truly an equals sign; it denotes that we have defined not false to be true. And the funny thing about equalities is that they go both directions, so it’s also correct to say that true is equal to not false. This equality is very deep. While Agda will simplify the left side to the right whenever possible, the two sides are exactly equivalent in all computational contexts, and we can pick whichever is the most helpful for our proof. Indeed, many proofs depend on finding two complicated expressions that are exactly equal in this way. Much of the work is in figuring out which two complicated expressions we need to find.
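The claim that these equations go in both directions can be checked with the unit-testing trick from sec. 1.9. The following is a minimal sketch, assuming the builtin Bool and equality modules that ship with Agda in place of this chapter’s own Booleans module; it also shows a nested call, which needs parentheses under juxtaposition.

```agda
open import Agda.Builtin.Bool using (Bool; true; false)
open import Agda.Builtin.Equality using (_≡_; refl)

not : Bool → Bool
not false = true
not true  = false

-- The defining equation, read left to right...
_ : not false ≡ true
_ = refl

-- ...and the same equation, read right to left.
_ : true ≡ not false
_ = refl

-- A nested call: the inner call is wrapped in parentheses.
-- Without them, Agda would try to apply not to two arguments.
_ : not (not false) ≡ false
_ = refl
```

All three tests are accepted with refl, because in each case Agda can compute both sides down to the same value.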
1.12
Operators
Armed with a better understanding of how Agda works, we can return to writing actual code. One simple operation over booleans is logical OR; that is, the result is true if at least one of its arguments is true. Mathematicians often use the symbol ∨ (pronounced “vel”) for this operation, which we will follow. Note that this is not the Latin letter that comes before w, but instead the Unicode character produced by typing \or . This odd choice for a function name is justified because that’s what the standard library calls it, and we’d like to reimplement the standard library as a pedagogical exercise. Incidentally, the standard library didn’t just make up this name; ∨ is the mathematical symbol used for joins in a semilattice, and OR is exactly that join on the boolean
semilattice. Don’t worry if you don’t yet know what this means; the purpose of this book is to get you familiar with many mathematical concepts. We can start our implementation for _∨_ by writing out a function signature, and then giving a hole on the next line, as in: 2
_∨_ : Bool → Bool → Bool
_∨_ = ?
The function signature here is a little strange, as we’d expect it to be something more akin to (Bool , Bool) → Bool—that is, a function from two booleans to one. For the time being, just imagine that is the type signature. The point is rather subtle, and we will address the point of confusion soon, in sec. 1.20. In order to implement _∨_, we will again interactively ask for Agda’s help. Place your cursor in the hole and run MakeCase ( C-c C-c in Emacs and VS Code). Agda will respond with: 2
_∨_ : Bool → Bool → Bool
x ∨ x₁ = {! !}
You will notice that the left-hand side of our equation has changed. Where before we had two underscores, we now have x and x₁. As it happens, those underscores were not literal underscores, but instead marked placeholders in the operator’s syntax for its arguments. One advantage to the {! !} form for Agda’s holes is that we can type inside of them. If you fill the hole with {! x x₁ !} , as in:
_∨_ : Bool → Bool → Bool
x ∨ x₁ = {! x x₁ !}
you can now invoke MakeCase ( C-c C-c in Emacs and VS Code), and rather than prompting you like usual, Agda will just use the arguments you wrote inside the hole. Thus, we receive a very satisfying four lines of code that we didn’t have to write for ourselves: 2
_∨_ : Bool → Bool → Bool
false ∨ false = {! !}
false ∨ true = {! !}
true ∨ false = {! !}
true ∨ true = {! !}
We can finish the definition of _∨_ by filling in the desired answers in each hole: 2
_∨_ : Bool → Bool → Bool
false ∨ false = false
false ∨ true = true
true ∨ false = true
true ∨ true = true
Here we have taken the same approach as in not: for each argument, we enumerate every possibility, giving the answer on the right side of the equals sign. You will quickly notice that this strategy grows exponentially fast; a function of five booleans would require 32 clauses to enumerate every possibility. Fortunately, this is not the only way to define _∨_. We can instead throw some thought at the problem, and realize the goal is to identify whether or not one of the arguments is true. This doesn’t require pattern matching on both parameters; some clever insight indicates we can get away with matching only on one. If the argument we matched on is true, we’re done, without needing to inspect the other. Otherwise, if our matched argument is false, it doesn’t affect the answer, because the result is true only when the other argument is true. Thus, in neither the true nor false case do we need to inspect the second argument. We can take advantage of this fact by using a variable to abstract over the second parameter. Instead, let us define _∨_ in this way:
_∨_ : Bool → Bool → Bool
false ∨ other = other
true ∨ other = true
Pay attention to the syntax highlighting here, as this is the first time it is taking advantage of the compiler’s knowledge. Here, we’ve written other on the right side of _∨_, and Agda has colored it black rather than the usual red. This is because Agda reserves the red for constructors of the type, of which there are only two (true and false.) By writing anything other than the name of a constructor, Agda assumes we’d like to treat such a thing as a variable binding.
In both cases, we now have a new variable other : Bool in scope, although we only end up using it when the first argument was false. As our Agda examples get more complicated, being able to quickly read the information conveyed in the syntax highlighting will dramatically improve your life. Red identifiers are reserved for constructors of a type. Blue is for types and functions, and black is for locally-bound constant variables. All three are visible in this last example.

Note that we call other a variable, but it is a variable in the mathematical sense rather than in the usual programming sense. This variable is a way of talking about something whose value we don’t know, like the 𝑥 in the expression 𝑓(𝑥) = 2𝑥 (but not like the 𝑥 in 𝑥² = 9.) Here, 𝑥 exists, but its value is set once and for all by the user of 𝑓. When we are talking about 𝑓(5), 𝑥 = 5, and it is never the case that 𝑥 changes to 7 while still in the context of 𝑓(5). In any given context, 𝑥 is always bound to a specific value, even if we don’t know what that value is. For this reason, we sometimes call variables bindings.

Exercise
The _∨_ function corresponds to the boolean operation OR, which is true if either of its arguments is. There is an analogous function _∧_ (input via \and ), which returns true if both of its arguments are. Define this function.

Solution
_∧_ : Bool → Bool → Bool
true ∧ other = other
false ∧ other = false
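Since the two-clause definitions no longer spell out the full truth table, it can be reassuring to check the table with unit tests. This is a sketch under one assumption: it imports Bool and _≡_ from Agda’s builtin modules so that it stands on its own, whereas in the chapter’s running file you would use the Booleans module instead.

```agda
open import Agda.Builtin.Bool using (Bool; true; false)
open import Agda.Builtin.Equality using (_≡_; refl)

_∨_ : Bool → Bool → Bool
false ∨ other = other
true  ∨ other = true

_∧_ : Bool → Bool → Bool
true  ∧ other = other
false ∧ other = false

-- Truth table for OR: false only when both arguments are false.
_ : false ∨ false ≡ false
_ = refl

_ : false ∨ true ≡ true
_ = refl

_ : true ∨ false ≡ true
_ = refl

-- Truth table for AND: true only when both arguments are true.
_ : true ∧ true ≡ true
_ = refl

_ : true ∧ false ≡ false
_ = refl

_ : false ∧ true ≡ false
_ = refl
```

Flipping any right-hand side makes the file fail to load, which is exactly the “no news is good news” behavior described in sec. 1.9.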
1.13
Agda's Computational Model
Let’s compare our two definitions of _∨_, reproduced here with slightly different names: 2
_∨₁_ : Bool → Bool → Bool  -- (1)
false ∨₁ false = false
false ∨₁ true = true
true ∨₁ false = true
true ∨₁ true = true
and
_∨₂_ : Bool → Bool → Bool  -- (2)
false ∨₂ other = other
true ∨₂ other = true
Besides the amount of code we needed to write, is there a reason to prefer (2) over (1)? These two implementations are equivalent, but have very different computational properties as Agda programs. Let’s explore why that is.

While (1) is perhaps slightly clearer if you’re coming from a truth-table background, (2) is a better program because it branches less often. It’s easy to read the branching factor directly off the definition: _∨₁_ takes four lines to define, while _∨₂_ requires only two. Every time we must inspect which particular boolean value a parameter has, we are required to bifurcate our program. We will return to this idea when we discuss writing proofs about functions like _∨_. But for the time being, we can directly observe the difference in how Agda computes between _∨₁_ and _∨₂_.

At its root, (2) is a better program because it needs to inspect less data in order to make a decision. _∨₂_ is able to make meaningful progress towards an answer, even when the second argument isn’t yet known, while _∨₁_ is required to wait for both arguments. Agda supports partial application of functions, which means we’re allowed to see what would happen if we were to fill in some function arguments, while leaving others blank. The syntax for doing this on operators is to reintroduce an underscore on the side we’d like to leave indeterminate. For example, if we’d like to partially apply the first argument of _∨₁_, we could write true ∨₁_, while we can partially apply the second argument via _∨₁ true. You will sometimes hear this called “taking a section of _∨₁_.”

We can observe the difference between _∨₁_ and _∨₂_ by looking at how Agda computes a section over each of their first arguments. We can evaluate arbitrary snippets of Agda code with the Normalise ( C-c C-n in Emacs and VS Code) command. First, try normalizing true ∨₂_, to which Agda will respond:
INFO WINDOW
λ other → true
The lambda notation here is Agda’s syntax for an anonymous function. Thus, Agda tells us that true ∨₂_ is equal to a function which
ignores its argument and always returns true. This is what we would expect semantically, based on the definitions we gave above for _∨₂_. It’s informative to compare these results against running Normalise with argument true ∨₁_ ( C-c C-n in Emacs and VS Code):
INFO WINDOW
true ∨₁_
Here, Agda simply returns what we gave it, because it is unable to make any progress towards evaluating this value. There’s simply no way to make any reductions until we know what the second argument is!
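Sections are ordinary functions, so they can be given names and applied like anything else. The sketch below is self-contained (it uses the builtin Bool rather than this chapter’s Booleans module), and the names always-true and or-true are hypothetical, chosen only for this illustration.

```agda
open import Agda.Builtin.Bool using (Bool; true; false)
open import Agda.Builtin.Equality using (_≡_; refl)

_∨₂_ : Bool → Bool → Bool
false ∨₂ other = other
true  ∨₂ other = true

-- A section over the first argument: the second is left blank.
always-true : Bool → Bool
always-true = true ∨₂_

-- A section over the second argument: the first is left blank.
or-true : Bool → Bool
or-true = _∨₂ true

-- Supplying the missing argument completes the call.
_ : always-true false ≡ true
_ = refl

_ : or-true false ≡ true
_ = refl
```

Once the missing argument arrives, both sections compute to a value, so the difference between the two definitions only shows up while an argument is still unknown.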
1.14
Stuckness
Terms which are unable to reduce further are called stuck, and can make no progress in evaluation until something unsticks them. The usual reason behind something being stuck is that it’s waiting to inspect a value which hasn’t yet been provided. Stuckness can be quite a challenge to debug if you’re unaware of it, so it bears some discussion. In our example trying to normalize true ∨₁_, because we don’t yet know the value of the second argument, our pattern match inside of _∨₁_ is stuck, and thus the value of the call to true ∨₁_ is also stuck. As soon as the second argument is provided, the pattern match will unstick, and so too will the final result.

Another way by which stuckness can occur is through use of postulated values. Recall that the postulate keyword allows us to bring into scope a variable of any type we’d like. However, we have no guarantee that such a thing necessarily exists. And so Agda offers us a compromise. We’re allowed to work with postulated values, but they are always stuck, and our program will not be able to proceed if it ever needs to inspect a stuck value. To demonstrate this, we can postulate a boolean value aptly named always-stuck:
postulate
  always-stuck : Bool
Our new always-stuck binding is, as its name suggests, always stuck. For example, we can learn nothing more about it by running Normalise
with argument always-stuck ( C-c C-n in Emacs and VS Code):
INFO WINDOW
always-stuck
Nor can we reduce not always-stuck to a value, because not must inspect its value in order to make progress:
INFO WINDOW
not always-stuck
Don’t believe the response from Normalise ( C-c C-n in Emacs and VS Code); not always-stuck is indeed always stuck (although it is distinct from always-stuck.) Rather, the entire call to not with argument always-stuck is stuck. And, as you might expect, Normalise with argument true ∨₁ always-stuck ( C-c C-n in Emacs and VS Code) is also stuck:
INFO WINDOW
true ∨₁ always-stuck
Fascinatingly however, attempting to normalize true ∨₂ always-stuck computes just fine:
INFO WINDOW
true
It is exactly because of this computational progress, even when the second argument is stuck ( always-stuck or otherwise), that we prefer _∨₂_ over _∨₁_. While this example might seem somewhat contrived, you would be amazed at how often it comes up in practice. Avoiding a pattern match in an implementation means you can avoid a pattern match in every subsequent proof about the implementation, and can be the difference between a three line proof and an 81 line proof.
We will return to this point when we discuss proof techniques, but for now, try not to get into the habit of “bashing” your way through every implementation if you can help it.
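As a small preview of how this plays out in proofs, here is a sketch comparing the two definitions. It assumes the builtin Bool and equality modules, and the names ∨₂-true and ∨₁-true are made up for this example. The type (b : Bool) → ... reads “for any boolean b”, a piece of notation that has not been introduced yet at this point in the chapter.

```agda
open import Agda.Builtin.Bool using (Bool; true; false)
open import Agda.Builtin.Equality using (_≡_; refl)

_∨₁_ : Bool → Bool → Bool
false ∨₁ false = false
false ∨₁ true  = true
true  ∨₁ false = true
true  ∨₁ true  = true

_∨₂_ : Bool → Bool → Bool
false ∨₂ other = other
true  ∨₂ other = true

postulate
  always-stuck : Bool

-- _∨₂_ never inspects its second argument, so this computes to true
-- even though always-stuck can never be inspected.
_ : true ∨₂ always-stuck ≡ true
_ = refl

-- For the same reason, one clause covers every possible b.
∨₂-true : (b : Bool) → true ∨₂ b ≡ true
∨₂-true b = refl

-- _∨₁_ is stuck on an unknown b, so writing `∨₁-true b = refl` is
-- rejected. The proof has to repeat the pattern match instead.
∨₁-true : (b : Bool) → true ∨₁ b ≡ true
∨₁-true false = refl
∨₁-true true  = refl
```

The extra clause here is only one line, but the doubling compounds with every additional argument that a definition inspects unnecessarily.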
1.15
Records and Tuples
In sec. 1.6, we saw how the data keyword could be used to create types with distinct values. Most programming languages do not admit any features analogous to the data type former, which is why booleans and numbers (two types that are all about distinct values) are usually baked directly into most languages. We have already looked at defining our own booleans; in sec. 2 we will focus on defining numbers for ourselves. To tide us over in the meantime, we will look at the more-familiar record types: those built by sticking a value of this type, and a value of that type, and maybe even one of the other type together, all at the same time. Let’s put together a little example. Imagine we’re modeling the employee database at a company. First, let’s start a new module, bring our Booleans into scope, and import Strings:
module Example-Employees where
  open Booleans
  open import Data.String using (String)
Our company has five departments. Every employee must belong to one of these. Whenever you hear the words “one of” used to describe a piece of information, you should think about modeling it using a data type: ⇤ 2
data Department : Set where
  administrative : Department
  engineering : Department
  finance : Department
  marketing : Department
  sales : Department
Let’s say employees at our company have three relevant pieces of information: their name, which department they’re in, and whether they’ve been hired recently. Whenever you hear “and” when describing a type, you should think about using a record, as in:
record Employee : Set where
  field
    name : String
    department : Department
    is-new-hire : Bool
We can build a value of Employee also by using the record keyword, as in: ⇤ 2
tillman : Employee
tillman = record
  { name = "Tillman"
  ; department = engineering
  ; is-new-hire = false
  }
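Putting the pieces of this example together, a standalone version might look like the sketch below. It swaps in Agda’s builtin Bool and String modules so that it does not depend on the Booleans module or the standard library, and the employee ada is a hypothetical second record added purely for illustration. The qualified name Employee.is-new-hire is the function that reads a field back out of a record.

```agda
open import Agda.Builtin.Bool using (Bool; true; false)
open import Agda.Builtin.String using (String)
open import Agda.Builtin.Equality using (_≡_; refl)

-- "One of" five departments: a data type.
data Department : Set where
  administrative : Department
  engineering    : Department
  finance        : Department
  marketing      : Department
  sales          : Department

-- A name "and" a department "and" a flag: a record.
record Employee : Set where
  field
    name        : String
    department  : Department
    is-new-hire : Bool

-- A hypothetical employee, used only in this sketch.
ada : Employee
ada = record
  { name        = "Ada"
  ; department  = finance
  ; is-new-hire = true
  }

-- Reading a field back out of the record.
_ : Employee.is-new-hire ada ≡ true
_ = refl

_ : Employee.department ada ≡ finance
_ = refl
```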
Sometimes we’d like to just stick two pieces of information together, without going through the rigmarole of needing to make a custom type for the occasion. In these cases, we’ll need a generic record type, capable of sticking any two values together. In essence, then, the goal is to build a record type with two generic, independently-typed fields. We can’t hard-code the types we’d like for the fields, because then it wouldn’t be generic. So instead, we do what we always do in situations like this, and parameterize things:
module Sandbox-Tuples where
  record _×_ (A : Set) (B : Set) : Set where  -- (1)
    field
      proj₁ : A
      proj₂ : B
There is quite a lot going on here. First, note that the name _×_ here is not the Latin letter x, but is instead the times symbol, input as \x . At (1) we parameterize our type _×_ by two other types, called A and B. You can see from the black syntax highlighting on A and B that these are not types in their own right, but locally-bound variables which are later instantiated as types. The entire situation is analogous to functions; think of _×_ as a function which takes two types and returns a third type. Inside the record, we’ve given two fields, proj₁ and proj₂. These names are intentionally vague, since we have no idea how real people will end up using _×_ in practice. Incidentally, “proj” is short
for projection—which is the mathy word for reducing a complicated structure down along some axis. In this case we have two different “axes”: the first and the second elements of our tuple. We can try out our new tuple type, as usual by plunking a hole down on the right hand side of the equals sign: ⇤ 2
open Booleans

my-tuple : Bool × Bool
my-tuple = ?
This time we will use the Refine ( C-c C-r in Emacs and VS Code) command, which asks Agda to build us a value of the correct type, leaving holes for every argument. The result is: 2
my-tuple : Bool × Bool
my-tuple = record { proj₁ = ? ; proj₂ = ? }
Note that these two holes both have type Bool, corresponding exactly to the type of the tuple (Bool × Bool.) We can fill them in with arbitrary boolean expressions: 2
my-tuple : Bool × Bool
my-tuple = record { proj₁ = true ∨ true ; proj₂ = not true }
Congratulations! We’ve made a reusable (if not very syntactically beautiful) tuple type! We’d now like to extract the two fields from my-tuple. How can we do that? Agda provides three means of projecting fields out of records. The first is a regular old pattern match:
first : Bool × Bool → Bool
first record { proj₁ = proj₁ ; proj₂ = proj₂ } = proj₁
The syntax here brings pleasure to no one’s heart, but it does work. There are some things to note. First, I didn’t write this definition out by hand. I instead wrote down the type, stuck in a hole, and then invoked MakeCase ( C-c C-c in Emacs and VS Code) twice. Nobody has time to write out this ugly syntax; just get the computer to do it for you. Second, once again, Agda’s syntax highlighting has done us a great favor. Look at the unpacking happening inside the record constructor. We have a green proj₁ being equal to a black proj₁. Agda highlights
field names in green, and bindings in black, which means we immediately know what this syntax must be doing. The green part is which field we’re looking at, and the black piece is the name we’d like to bind it to. Thus, the following definition is equivalent:
first : Bool × Bool → Bool
first record { proj₁ = x ; proj₂ = y } = x
Even better, we don’t need to bind fields that we don’t intend to use, so we can write first more tersely again: 2
first : Bool × Bool → Bool
first record { proj₁ = x } = x
I said there were three ways to project a field out of a record. If we don’t want to do a gnarly pattern match like this, what are our other options? One other means is via record access syntax, where the field name is prepended with a dot and given after the tuple: 2
my-tuple-first : Bool
my-tuple-first = my-tuple ._×_.proj₁
You will notice that we needed to give a fully-qualified field name here. Rather than just writing proj₁ we needed to give _×_.proj₁ in full. But don’t fret, this is still the field name. We’ll see momentarily how to clean things up. Our other means for projecting fields out of records is via the record selector syntax. Under this syntax, we use the field name as if we were making a function call: 2
my-tuple-second : Bool
my-tuple-second = _×_.proj₂ my-tuple
The reason that record selector syntax looks like a function call is because it is a function call. Every record field f of type F in record R gives rise to a function f : R → F. Record access and record selectors are just different syntax for the exact same functionality, and it’s a matter of personal preference as to which you pick. Personally, I like using record selectors, because it means I can forget the fact that I’m working with records and think only about functions.
In reading the above, it cannot have escaped your attention that these two call sites are ugly. What is this _×_.proj₁ nonsense? Do we really need to use a fully-qualified name every time we want to access a field? Fortunately, we do not. Believe it or not, every record creates a new module with the same name. Thus, we can bring proj₁ and proj₂ into the top-level scope by opening our new module, allowing us to rewrite the previous two definitions as: 2
open _×_

my-tuple-first : Bool
my-tuple-first = my-tuple .proj₁

my-tuple-second : Bool
my-tuple-second = proj₂ my-tuple
Much nicer, isn’t it?
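To convince ourselves that the three means of projection really are interchangeable, we can unit test them against one another. This is a self-contained sketch: it redefines _×_ as above, uses the builtin Bool in place of the Booleans module, and introduces a hypothetical value named pair.

```agda
open import Agda.Builtin.Bool using (Bool; true; false)
open import Agda.Builtin.Equality using (_≡_; refl)

record _×_ (A : Set) (B : Set) : Set where
  field
    proj₁ : A
    proj₂ : B

open _×_

-- A hypothetical tuple to project from.
pair : Bool × Bool
pair = record { proj₁ = true ; proj₂ = false }

-- 1. Projection via pattern matching on the record.
first : Bool × Bool → Bool
first record { proj₁ = x } = x

_ : first pair ≡ true
_ = refl

-- 2. Projection via record access syntax.
_ : pair .proj₁ ≡ true
_ = refl

-- 3. Projection via record selector syntax.
_ : proj₁ pair ≡ true
_ = refl

_ : proj₂ pair ≡ false
_ = refl
```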
1.16
Copatterns and Constructors
We now have nice syntax for projecting out of records. But can we do anything to improve the syntax involved in building them? It would be really nice to be able to avoid the record { proj₁ = ... } boilerplate every time we wanted to make a tuple. If we don’t mind giving a name to our tuple, we can use copattern syntax to build one. The idea is rather than define the record itself, we need only give definitions for each of its fields. Agda can help us with this. Start as usual with a type and a hole: 2
my-copattern : Bool × Bool
my-copattern = ?
If we now attempt to perform a MakeCase without any argument ( C-c C-c in Emacs and VS Code) inside the hole, we will be rewarded with a copattern match: 2
my-copattern : Bool × Bool
proj₁ my-copattern = {! !}
proj₂ my-copattern = {! !}
Copatterns can be nested, for example, in the case when we have a nested tuple:
nested-copattern : Bool × (Bool × Bool)
proj₁ nested-copattern = {! !}
proj₁ (proj₂ nested-copattern) = {! !}
proj₂ (proj₂ nested-copattern) = {! !}
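For completeness, here is what a nested copattern definition might look like once the holes are filled in. This sketch is self-contained (it redefines _×_ and uses the builtin Bool), and the values chosen for each field are arbitrary.

```agda
open import Agda.Builtin.Bool using (Bool; true; false)
open import Agda.Builtin.Equality using (_≡_; refl)

record _×_ (A : Set) (B : Set) : Set where
  field
    proj₁ : A
    proj₂ : B

open _×_

-- Each clause defines one field; nested fields get nested copatterns.
nested : Bool × (Bool × Bool)
proj₁ nested         = true
proj₁ (proj₂ nested) = false
proj₂ (proj₂ nested) = true

-- Projecting a field computes to the right-hand side of its clause.
_ : proj₁ nested ≡ true
_ = refl

_ : proj₁ (proj₂ nested) ≡ false
_ = refl

_ : proj₂ (proj₂ nested) ≡ true
_ = refl
```

Notice that each clause reads just like the unit test that checks it: the copattern on the left is literally the projection being defined.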
We will make extensive use of copatterns later in this book, and will discuss them in much more depth in sec. 1.16. For the time being, it’s nice to know that this is an option. Suppose however, we’d like to not use copatterns—perhaps because we’d like to build an anonymous value of a record type. For that, we can instead write a helper function that will clean up the syntax for us. 2
_,_ : {A B : Set} → A → B → A × B  -- (1)
_,_ = ?
The type of _,_ should really be A → B → A × B. However, recall that A and B are variables standing in for whatever type the user wants. Unfortunately for us, we don’t know what those types are yet, but we need them in order to give a proper type to _,_. Since those variables are not in scope, we must bind them ourselves. This binding is what’s happening in the {A B : Set} syntax that prefixes the type at (1). It’s responsible for bringing A and B both into scope, and letting Agda know they are both of type Set. We will discuss what exactly the curly braces mean momentarily, in sec. 1.21. Implementing _,_ isn’t hard to do by hand; but we can be lazy and ask Agda to do it for us. Begin as usual by getting Agda to bind our arguments, via MakeCase ( C-c C-c in Emacs and VS Code) in the hole:
_,_ : {A B : Set} → A → B → A × B
x , x₁ = {! !}
and follow up by invoking Auto ( C-c C-a in Emacs and VS Code), which asks Agda to just write the function for you. Of course, this doesn’t always work, but it’s surprisingly good for little functions like _,_. The result is exactly what we’d expect it to be: 2
_,_ : {A B : Set} → A → B → A × B
x , x₁ = record { proj₁ = x ; proj₂ = x₁ }
The _,_ is now shorthand for writing out a record value. We can reimplement my-tuple thus: 2
my-tuple’ : Bool × Bool
my-tuple’ = (true ∨ true) , not true
The parentheses here are necessary because Agda doesn’t know if it should parse the expression true ∨ true , not true as true ∨ (true , not true) or as the expression intended above. Of course, you and I know that the other parse doesn’t even typecheck, so it must be the unintended one. You can, however, imagine a much larger expression could take an exponential amount of time in order to find a unique way of adding parentheses to make the types work out properly. We will fix this limitation in the next section.

As it happens, we can get Agda to automatically create _,_, rather than needing to define it ourselves. Doing so, however, requires changing the definition of _×_, which we are now unable to do, since we have defined things after the fact. Let’s start a new module, and redefine _×_ in order to get _,_ for free.
module Sandbox-Tuples₂ where
  open Booleans

  record _×_ (A : Set) (B : Set) : Set where
    constructor _,_  -- 1
    field
      proj₁ : A
      proj₂ : B

  open _×_
There is one small change compared to our previous definition, and that’s the constructor keyword at 1 . Adding a constructor definition tells Agda that we’d like to avoid the whole record { ... } nonsense. Instead, we automatically get _,_ for free, which you will notice is now colored red, to let us know that it is a constructor.
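To see that the constructor really is only shorthand for the record syntax, here is a minimal sketch. It assumes the standard library is available, and the names Pair, mkPair, fst and snd are invented for this example so that it can stand alone in its own file; they are not definitions from this chapter.

```agda
open import Data.Bool using (Bool; true; false)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

-- Pair, mkPair, fst and snd are hypothetical names for this sketch.
record Pair (A : Set) (B : Set) : Set where
  constructor mkPair
  field
    fst : A
    snd : B

open Pair

-- Built with record syntax, which is always available.
verbose : Pair Bool Bool
verbose = record { fst = true ; snd = false }

-- Built with the constructor named in the record declaration.
concise : Pair Bool Bool
concise = mkPair true false

-- Both spellings denote the same value.
_ : verbose ≡ concise
_ = refl

-- The projections are unaffected by the choice of spelling.
_ : snd concise ≡ false
_ = refl
```

If the file typechecks, both unit tests have passed, which suggests the constructor adds convenience without changing what a record value is.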
1.17 Fixities
We return now to the problem of getting Agda to correctly parse the expression true ∨ true , not true with the implied parentheses on the left.
Infix operators like _∨_ and _,_ are parsed in every language according to rules of precedence and associativity. Together, these two concepts are known as fixities.
The precedence of an operator lets the parser know how “tightly” an operator should bind with respect to other operators. In this case, because we’d like the parentheses to go around _∨_, we would like _∨_ to have a higher precedence than _,_. Agda assumes a precedence of 20 (on a scale from 0 to 100) by default for any operator, so we must give _,_ a lower precedence than 20. By convention, we give _,_ a precedence of 4, which makes it play nicely with the conventions for the more usual mathematical operators like addition and subtraction.
The associativity of an operator describes how to insert parentheses into repeated applications of the operator. That is, should we parse x , y , z as (x , y) , z or as x , (y , z)? The former here is said to be left-associative, while the latter is, appropriately, right-associative. For reasons that will make sense later, we’d like _,_ to be right-associative.
We can tell Agda’s parser about our preferences, that _,_ be right-associative with precedence 4, via the following declaration:
infixr 4 _,_
Here, the r at the end of infixr tells Agda that our preference is for associativity to be to the right. The analogous keyword infixl informs Agda of the opposite decision. With this fixity in place, Agda will now automatically insert parentheses, parsing true , false , true , false as true , (false , (true , false)). Of course, if you ever do want an explicit left-nested tuple, you are free to insert the parentheses on the left yourself.
While _,_ is the operator for building values of tuple types, _×_ is the operator for building the tuple type itself. The values and their types should be in a one-to-one correspondence. That is to say, if we have a : A, b : B and c : C, we’d like that a , b , c have type A × B × C. By this reasoning, we must also choose right-associativity for _×_. Traditionally, _×_ is given a precedence of 2.
infixr 2 _×_
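The effect of these declarations can be checked with a few unit tests. The sketch below is an illustration under assumptions: it uses the standard library's Data.Bool and Data.Product rather than our hand-rolled modules, relying on the library giving _∨_ a higher precedence than _,_ and declaring _,_ and _×_ right-associative. The definition names are invented for the example.

```agda
open import Data.Bool using (Bool; true; false; not; _∨_)
open import Data.Product using (_×_; _,_)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

-- Precedence: _∨_ binds more tightly than _,_, so no parentheses are needed.
implicit-parens : Bool × Bool
implicit-parens = true ∨ true , not true

explicit-parens : Bool × Bool
explicit-parens = (true ∨ true) , (not true)

_ : implicit-parens ≡ explicit-parens
_ = refl

-- Associativity: both _,_ and _×_ nest to the right.
flat : Bool × Bool × Bool
flat = true , false , true

nested : Bool × (Bool × Bool)
nested = true , (false , true)

_ : flat ≡ nested
_ = refl
```

Note that the last test would not even be well-typed if _×_ and _,_ associated in different directions, which is the one-to-one correspondence between values and types at work.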
1.18 Coproduct Types
Closely related to the tuple type is the coproduct type, which also “joins together” two disparate types into one. A coproduct of A and B
is either a value of A, or a value of B. But unlike our everyday notion of “either/or”, a coproduct cannot simultaneously be both an A and a B. The coproduct of A and B is written symbolically as A ⊎ B, where ⊎ is input as \u+ . While the tuple type has two projections and one constructor, the coproduct type conversely has two constructors, and one projection. These two constructors correspond to the two ways of building a _⊎_, either out of an A, or out of a B. Thus, we can give the definition of coproducts as: 2
  data _⊎_ (A : Set) (B : Set) : Set where
    inj₁ : A → A ⊎ B
    inj₂ : B → A ⊎ B

  infixr 1 _⊎_
We won’t deal much with coproducts in this much generality until sec. 7. Don’t worry—in the interim, we will see many, many special cases. But coproducts are a nice thing to know about if you’re going out and writing Agda on your own.
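As a small illustration of the two ways in and the one way out, consider the following sketch. It assumes the standard library's Data.Sum, which exports the same _⊎_, inj₁ and inj₂. The type PrimaryColor and the functions is-warm and either are hypothetical names made up for this example; one reasonable reading of the coproduct's single "projection" is a case analysis like either.

```agda
open import Data.Bool using (Bool; true; false; not)
open import Data.Sum using (_⊎_; inj₁; inj₂)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

-- A throwaway type for the example.
data PrimaryColor : Set where
  red green blue : PrimaryColor

is-warm : PrimaryColor → Bool
is-warm red   = true
is-warm green = false
is-warm blue  = false

-- Two ways of building a coproduct: from the left type or from the right.
a-bool : Bool ⊎ PrimaryColor
a-bool = inj₁ true

a-color : Bool ⊎ PrimaryColor
a-color = inj₂ red

-- One way of taking it apart: say what to do in each case.
either : {A B C : Set} → (A → C) → (B → C) → A ⊎ B → C
either f g (inj₁ a) = f a
either f g (inj₂ b) = g b

_ : either not is-warm a-bool ≡ false
_ = refl

_ : either not is-warm a-color ≡ true
_ = refl
```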
1.19 Function Types
It is now time to tackle the question of what’s up with the funny syntax for functions with multiple arguments. Most programming languages assign a type of the form (A × B) → C to functions with two arguments, while in Agda we instead write A → B → C. Why is this? In order to help understand this type, we can stop for a moment to think about the issue of parsing again. Although the function arrow is intrinsically magical and built-in to Agda, let’s ask ourselves how it ought to work. Spiritually, _→_ is a binary operator, meaning we can ask about its precedence and associativity. In Agda, the typing judgment _:_ binds with the lowest precedence of all, with _→_ coming in as a close second. What this means is that in practice, _→_ always acts as a separator between types, and we don’t need to worry ourselves about where exactly the parentheses should go. The associativity for _→_, on the other hand, is to the right. That means, given the type A → B → C, we must read it as A → (B → C). A literal interpretation of such a thing is a function that takes an A argument and returns a function. That returned function itself takes
a B argument and then returns a C. At the end of the day, by the time we get a C, the function did indeed take two arguments: both an A and a B. What’s nice about this encoding is that, unlike in most programming languages, we are not required to give every argument at once. In fact, we can specialize a function call by slowly filling in its parameters, one at a time. Let’s take an example. The following models a family pet with a record. The Pet has a species, temperament, and a name. 2
  module Example-Pets where
    open import Data.String
      using (String)

    data Species : Set where
      bird cat dog reptile : Species

    data Temperament : Set where
      anxious   : Temperament
      chill     : Temperament
      excitable : Temperament
      grumpy    : Temperament

    record Pet : Set where
      constructor makePet
      field
        species     : Species
        temperament : Temperament
        name        : String
Imagine now we’d like to specialize the makePet constructor, so we can make a series of grumpy cats in rapid succession. Our first attempt is to write a helper function makeGrumpyCat: ⇤ 4
    makeGrumpyCat : String → Pet
    makeGrumpyCat name = makePet cat grumpy name
This definition absolutely would work, but it doesn’t demonstrate the cool part. Let’s consider makePet as a function. It takes three arguments, before returning a Pet, and so its type must be Species → Temperament → String → Pet. If we were to insert the implicit parentheses, we’d instead get Species → (Temperament → (String → Pet)). You will note that the
innermost parentheses here are String → Pet, which just so happens to be the type of makeGrumpyCat. Thus, we can define makeGrumpyCat as being makePet applied to cat and grumpy, as in: 4
    makeGrumpyCat : String → Pet
    makeGrumpyCat = makePet cat grumpy
I like to think of this a lot like “canceling” in grade-school algebra. Because, in our original equation, both sides of the equality ended in name, those arguments on either side cancel one another out, and we’re left with this simpler definition for makeGrumpyCat. This ability to partially apply functions isn’t a language feature of Agda. It arises simply from the fact that we write our functions’ types as a big series of arrows, one for each argument.
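The same idea can be seen in miniature without any records. The following is a sketch only, assuming the standard library's Data.Bool; the functions choose, choose′ and choose-first are hypothetical and exist purely to show that the parenthesized and unparenthesized types coincide, and that supplying fewer arguments leaves a function of the rest.

```agda
open import Data.Bool using (Bool; true; false)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

-- A hypothetical three-argument function: pick one of two values.
choose : Bool → Bool → Bool → Bool
choose true  t e = t
choose false t e = e

-- The same type with its implied parentheses written out.
-- Agda accepts this because the two types are identical.
choose′ : Bool → (Bool → (Bool → Bool))
choose′ = choose

-- Partial application: give one argument now, the other two later.
choose-first : Bool → Bool → Bool
choose-first = choose true

_ : choose-first false true ≡ false
_ = refl

_ : choose′ false false true ≡ true
_ = refl
```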
1.20 The Curry/Uncurry Isomorphism
Of course, nothing stops us from encoding our functions in the more standard form, where we are required to give all arguments at once. The transformation is merely syntactic. We could write _∨₂_ instead as a function or: ⇤ 2
  or : Bool × Bool → Bool
  or (false , y) = y
  or (true  , y) = true
Rather amazingly, when we encode functions this way, we get back the same function-call notation that other languages use: 2
  _ : Bool
  _ = or (true , false)
From this result, we can conclude that other languages’ functional calling mechanisms are similar to Agda’s. The only difference is that Agda lets you elide parentheses around a function call which requires only one argument. It feels “correct” that the difference between Agda and other languages’ functions should be entirely syntactic; after all, presumably at the end of the day we’re all talking about the same functions, regardless of the language used. But can we make this analogy more formal?
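A couple of unit tests make the comparison concrete. This sketch assumes the standard library's Data.Bool and Data.Product in place of our own modules, and repeats the definition of or so that it stands alone; it checks that the tupled call agrees with the curried _∨_ on sample inputs.

```agda
open import Data.Bool using (Bool; true; false; _∨_)
open import Data.Product using (_×_; _,_)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

-- The all-at-once encoding: one argument, which happens to be a tuple.
or : Bool × Bool → Bool
or (false , y) = y
or (true  , y) = true

-- Tupled and curried calls give the same answers.
_ : or (true , false) ≡ true ∨ false
_ = refl

_ : or (false , false) ≡ false ∨ false
_ = refl
```

These are only spot checks on particular inputs, of course; they are not yet an argument that the two encodings are equivalent in general.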
A usual tool in mathematics to show two things are equivalent is the isomorphism. We will study this technique in much more detail in sec. 8, but for now, you can think of an isomorphism as a pair of functions which transform back and forth between two types. Whenever you have an isomorphism around, it means the two types you’re transforming between are equivalent to one another. Of course, not just any two functions will do the trick; the two functions must “undo” one another, in a very particular sense that we will explore in a moment.
So as to convince any potential doubters that our one-at-a-time function encoding (also known as curried functions) is equivalent to the usual “take all your arguments at once as a big tuple,” we can show an isomorphism between the two. That is, we’d like to be able to transform functions of the shape A × B → C into A → B → C and vice versa. We’ll work through the first half of this isomorphism together, and leave the other direction as an exercise to try your hand at writing some Agda for yourself (or talking Agda into writing it for you!)
In particular, the first function we’d like to write has this type:
  curry : {A B C : Set} → (A × B → C) → (A → B → C)
  curry = ?
That’s quite the intimidating type; avoid the temptation to panic, and we’ll break it down together. Let’s ignore the {A B C : Set} → part, which you’ll recall exists only to bring the type variables into scope. Of course, these variables are necessary for our code to compile, so the next few code blocks won’t actually work. Nevertheless, it will be instructive to study them, compiling or no. We begin with the “simplified” type of curry: 6
curry : (A × B → C) → (A → B → C)
Because we know that _→_ has lower precedence than _×_, we can add some implicit parentheses: 6
curry : ((A × B) → C) → (A → B → C)
Furthermore, we know that _→_ is right associative, so a chain of multiple function arrows nests its parentheses to the right, as in: 6
curry : ((A × B) → C) → (A → (B → C))
Now, we know that a chain of rightward-nested function arrows is the same as a function taking one argument at a time, so we can drop the outermost pair of parentheses on the right side of the outermost arrow. This results in: 6
curry : ((A × B) → C) → A → (B → C)
Doing the same trick, we can eliminate another pair of parentheses: 6
curry : ((A × B) → C) → A → B → C
This type is now as simple as we can make it. Although its first parameter is still rather scary, we can ignore the remaining set of parentheses for a moment, to see that curry is a function of three arguments, which eventually returns a C. The second and third arguments are easy enough, they are just A and B. Because the first set of parentheses themselves contain a function arrow, this means that the first parameter to curry is a function. What function is it? We’re not sure, all we know is that it’s a function which takes a tuple of A × B, and produces a C. Thought about in this fashion, it’s not too hard to see how to go about implementing curry. Because C could be any type at all, we can’t just build one for ourselves. Our only means of getting a C is to call our function which produces one, and in order to do that, we must construct a pair of A × B. Since we have both an A and a B as arguments, this is not an onerous requirement. Thus, all we need to do is to call the given function after tupling our arguments: 2
  curry : {A B C : Set} → (A × B → C) → (A → B → C)
  curry f a b = f (a , b)
For all its complicated type signature, curry turns out to be a remarkably simple function. And this makes a great deal of sense when you recall why we wanted to write curry in the first place. Remember that we would like to show the equivalence of function calls that receive their arguments all at once, vs those which receive them one at a time. But at the end of the day, it’s the same function! Now for an exercise to the reader. Our function curry forms one side of the isomorphism between the two ways of calling functions. Your task is to implement the other half of this isomorphism, using all of the tools you’ve now learned. The type you’re looking for is:
  uncurry : {A B C : Set} → (A → B → C) → (A × B → C)
  uncurry = ?
Exercise (Easy) Implement uncurry. Remember that TypeContext ( C-c C-, in Emacs and VS Code) is an invaluable tool if you don’t know how to make progress.
Solution
  uncurry : {A B C : Set} → (A → B → C) → (A × B → C)
  uncurry f (a , b) = f a b
Because we were able to implement curry and uncurry, we have shown that curried functions (used in Agda) are equivalent in power to uncurried functions (used in most programming languages.) But the oddity of our choice leads to our ability to “cancel” arguments that are duplicated on either side of a function definition, and this happens to be extremely useful for “massaging” functions. Often, we have a very general function that we will need to specialize to solve a particular task, and we can do exactly that by partially filling in its arguments.
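What it means for the two functions to "undo" one another can be sketched as a pair of small lemmas. This is an illustration rather than the formal treatment promised for later: it assumes the standard library's Data.Product for _×_ and _,_, repeats our definitions of curry and uncurry so the file stands alone, and the lemma names curry∘uncurry and uncurry∘curry are invented here.

```agda
open import Data.Product using (_×_; _,_)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

curry : {A B C : Set} → (A × B → C) → (A → B → C)
curry f a b = f (a , b)

uncurry : {A B C : Set} → (A → B → C) → (A × B → C)
uncurry f (a , b) = f a b

-- Going to the tupled form and back changes nothing on any arguments.
curry∘uncurry
  : {A B C : Set} (g : A → B → C) (a : A) (b : B)
  → curry (uncurry g) a b ≡ g a b
curry∘uncurry g a b = refl

-- Likewise for the round trip in the other direction.
uncurry∘curry
  : {A B C : Set} (f : A × B → C) (p : A × B)
  → uncurry (curry f) p ≡ f p
uncurry∘curry f (a , b) = refl
```

Both proofs are refl, meaning Agda can see each equality just by computing; no cleverness is required on our part.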
1.21 Implicit Arguments
There is one final concept we must tackle before finishing this chapter. It’s been a long slog, but there is light at the end of the tunnel. As our last push, we will investigate what exactly those curly braces mean in type signatures. As a motivating example, let’s play around with our new uncurry function. In particular, let’s try applying it to _∨_. What type must this thing have? If we don’t want to do the thought-work ourselves, we can just leave a hole in the type signature: 2
  _ : ?
  _ = uncurry _∨_
If Agda has enough information to work out the hole for itself, we can command it to do so via Solve ( C-c C-s in Emacs and VS Code). The result is:
  _ : Bool × Bool → Bool
  _ = uncurry _∨_
The Solve ( C-c C-s in Emacs and VS Code) command asks Agda to infer the contents of a hole based on information it already knows from somewhere else. In this case, Agda knows the type of _∨_ (that is, Bool → Bool → Bool,) and so it can infer the type of uncurry _∨_ as Bool × Bool → Bool. Since this is the entire expression, the type of our definition is fully known to Agda, and it will happily solve it for us. As you can see, Agda is quite clever! The constraint solving exhibited here is a generally useful tool when coding. For example, you can state a proof as being trivial, and then work backwards, asking Agda to synthesize the solution for you! It sounds absolutely bonkers, but somehow this actually works.
Let’s explore this concept further. But first, we will make a new module. The booleans we implemented by hand in the previous section exist in the standard library, under the module Data.Bool. Better yet, they are defined identically to how we’ve built them, so there will be no surprises when bringing them in. Data.Bool is quite a big module, so we will take only the pieces we need via the using modifier:
module Sandbox-Implicits where
  open import Data.Bool
    using (Bool; false; true; not; _∨_)
Additionally, tuples are also defined in the standard library, under Data.Product. ⇤ 2
  open import Data.Product
    using (_×_; proj₁; proj₂)
Data.Product also supplies _,_, curry and uncurry, but they are implemented in more generality than we’ve presented. Rather than get bogged down in the details, we can instead just import the specialized versions which do correspond to our implementations. By using the renaming modifier on this same import of Data.Product, we can ask Agda to shuffle some identifiers around for us:
    renaming ( _,′_     to _,_
             ; curry′   to curry
             ; uncurry′ to uncurry
             )
Note that these tick marks at the end of curry and uncurry are primes, not apostrophes. Primes can be input via \' . When you import Data.Product for yourself in the future, you won’t need this renaming. It’s necessary here only to simplify some details that we don’t usually care about (or even notice.)
Our sandbox is now equivalent to our last environment, where we defined everything by hand. In Agda, like any other programming language, it’s desirable to use existing machinery rather than build your own copy, although admittedly building it for yourself leads to better understanding.
Let’s now look again at the types of _,_, curry, and uncurry, in their full curly-braced glory:
  _,_     : {A B : Set}   → A → B → A × B
  curry   : {A B C : Set} → (A × B → C) → (A → B → C)
  uncurry : {A B C : Set} → (A → B → C) → (A × B → C)
Each of these types is preceded by curly braces containing a list of types, which are brought into scope for the remainder of the type signature. But what exactly is going on here? The first thing to realize is that the notation {A B : Set} is syntactic sugar for {A : Set} → {B : Set}, and so on for more variables. We can therefore rewrite the type of _,_ more explicitly: 2
_,_ : {A : Set} → {B : Set} → A → B → A × B
In this form, it looks a lot like A : Set and B : Set are arguments to _,_. Rather amazingly, they are! The curly braces around them make these invisible, or implicit, arguments. Something interesting happens if we replace them with regular parentheses instead of braces. Let’s make a new function called mk-tuple using regular, visible arguments:
  mk-tuple : (A : Set) → (B : Set) → A → B → A × B
  mk-tuple = ?
We will do the usual ceremony to bind our arguments via MakeCase without any argument ( C-c C-c in Emacs and VS Code): 2
  mk-tuple : (A : Set) → (B : Set) → A → B → A × B
  mk-tuple A B x x₁ = {! !}
And then run Auto ( C-c C-a in Emacs and VS Code) to implement the function for us.
  mk-tuple : (A : Set) → (B : Set) → A → B → A × B
  mk-tuple A B x x₁ = x , x₁
Here you can see that the implementation of mk-tuple completely ignores its A and B arguments. That’s peculiar, isn’t it? We can try using mk-tuple to build ourselves a tuple. Starting from a delimited hole: 2
  _ : Bool × Bool
  _ = {! !}
we can write mk-tuple inside the hole: 2
  _ : Bool × Bool
  _ = {! mk-tuple !}
and then invoke Refine ( C-c C-r in Emacs and VS Code), asking Agda to use the given function to try to fill the hole: 2
  _ : Bool × Bool
  _ = mk-tuple {! !} {! !} {! !} {! !}
This expression now has four holes for the four arguments to mk-tuple. The first two are the previously-implicit type parameters of the tuple, while the last two are the actual values we’d like to fill our tuple with. Thankfully, Agda can Solve ( C-c C-s in Emacs and VS Code) the first two holes for us: 2
  _ : Bool × Bool
  _ = mk-tuple Bool Bool {! !} {! !}
and we are free to fill in the latter two to our heart’s content. What’s more interesting is if we fill in one of these types incorrectly; that is to say, with a type that isn’t Bool. This is not an onerous task, as it’s very easy to spin up new types at will: 2
  data PrimaryColor : Set where
    red green blue : PrimaryColor
We can now see what happens when we fill in one of those Bools with PrimaryColor instead:
  bad-tuple : Bool × Bool
  bad-tuple = mk-tuple PrimaryColor Bool {! !} {! !}
The response from Agda is immediate and serious:
INFO WINDOW
PrimaryColor != Bool of type Set
when checking that the expression mk-tuple PrimaryColor Bool ? ?
has type Bool × Bool
Agda is telling us off for writing PrimaryColor when we should have written Bool. Amazingly, Agda knows that this type must be Bool, and all it’s doing is checking if we wrote down the correct thing. Which we didn’t. How does Agda know this? Because we wrote the type of bad-tuple as Bool × Bool.
You will notice this situation is all a bit stupid. If Agda knows exactly what we should write into this hole, and yells at us if we don’t do it properly, why do we have to do it at all? As it happens, we don’t. Instead, in any expression, we can leave behind an underscore, asking Agda to make an informed decision and fill it in for us. Thus, we can write the following:
  color-bool-tuple : PrimaryColor × Bool
  color-bool-tuple = mk-tuple _ _ red false
and Agda will (silently, without changing our file) fill in the two underscores as PrimaryColor and Bool, respectively. Filling in arguments in this way is known as elaboration, as it offloads the work of figuring out exactly what your program should be to the compiler. No human input necessary. It is exactly this elaboration that is happening behind the scenes of our invisible parameters. Whenever you mark a parameter invisible by ensconcing it in curly braces, you’re really just asking Agda to elaborate that argument for you by means of inserting an underscore. We can make the invisible visible again by explicitly filling in implicit arguments for ourselves. The syntax for this is to give our implicit arguments as regular arguments, themselves in curly braces.
We can also use the explicit names of these implicits, so that we need not fill them all in just to fill only one:
  mk-color-bool-tuple : PrimaryColor → Bool → PrimaryColor × Bool
  mk-color-bool-tuple = _,_ {A = PrimaryColor} {B = Bool}
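The different ways of supplying (or not supplying) these arguments can be laid side by side. The sketch below is hedged accordingly: it assumes the standard library's Data.Bool and Data.Product, repeats PrimaryColor and mk-tuple so it stands alone, and introduces a hypothetical function pair with implicit type arguments, since the library's own _,_ carries extra implicit arguments that would get in the way of the positional form.

```agda
open import Data.Bool using (Bool; true; false)
open import Data.Product using (_×_; _,_)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

data PrimaryColor : Set where
  red green blue : PrimaryColor

-- Visible type arguments, as in the text.
mk-tuple : (A : Set) → (B : Set) → A → B → A × B
mk-tuple A B x y = x , y

-- The same function with invisible type arguments (hypothetical name).
pair : {A B : Set} → A → B → A × B
pair x y = x , y

-- Visible arguments written out by hand.
spelled-out : PrimaryColor × Bool
spelled-out = mk-tuple PrimaryColor Bool red false

-- Visible arguments left for Agda to elaborate.
elaborated : PrimaryColor × Bool
elaborated = mk-tuple _ _ red false

-- Implicit arguments given positionally, in curly braces.
positional : PrimaryColor × Bool
positional = pair {PrimaryColor} {Bool} red false

-- Only one implicit argument given, by name.
named : PrimaryColor × Bool
named = pair {B = Bool} red false

-- Implicit arguments left entirely to Agda.
hidden : PrimaryColor × Bool
hidden = pair red false

-- All five spellings produce the same tuple.
_ : spelled-out ≡ elaborated
_ = refl

_ : positional ≡ named
_ = refl

_ : elaborated ≡ hidden
_ = refl
```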
Of course, implicit elaboration is not magic. It cannot write your entire program for you; it can only elucidate specific details that are already true, but which you would prefer not to write out. To illustrate, Agda can’t solve the following, because it doesn’t know whether you want to use false or true—there is no unambiguous answer! 2
  ambiguous : Bool
  ambiguous = _
You’ll notice the syntax highlighting for this implicit has gone yellow; that’s Agda informing us that it doesn’t have enough information to elaborate. In addition, you’ll also see a warning message like this in the info window:
INFO WINDOW
Invisible Goals:
_236 : Agda.Builtin.Bool.Bool [at 1518,15-16]
Agda refers to problems like these as unsolved metas. Whenever you see this yellow background, something has gone wrong, and it’s worth fixing before powering on. Ambiguities have a habit of propagating themselves forwards, and so what is one unsolved meta now might turn into ten a few functions down the line.
1.22
Wrapping Up
You have managed to survive an extremely whirlwind tour of Agda. While you likely are not yet the world’s best Agda programmer, you now know much more than the vast majority of programmers. The attentive reader has been exposed to the majority of this gentle language’s most “astronautic” features.
What we have seen here are Agda’s fundamental building blocks. While they are interesting in their own right, the fun parts come when we start putting them together into funky shapes. Throughout our progression we will learn that there was more to learn about these simple pieces all along. Indeed, perhaps these primitive elements are much more sophisticated than they look. As a convention in this book, we will end each chapter by exporting what we’ve made, so we can use it in future chapters. Since we built a lot of things by hand, made several examples, and generally went down the garden path, we will take the time to flesh out exactly what the takeaways should be—both in mind and in code. For the best portability, we will not use our own definitions, but rather reuse those which come from the standard library. In sec. 1.6 we built Bool alongside its constructors false and true. In sec. 1.7 we implemented not, and in sec. 1.12 we defined the boolean OR function _∨_. As an exercise, you were tasked with defining the AND function _∧_. ⇤ 0
open import Data.Bool using (Bool; false; true; not; _∨_; _∧_) public
Note the public modifier on this import. By default, Agda won’t export anything you imported, but the public keyword changes this behavior, allowing us to re-export definitions that we didn’t write for ourselves.
The other things we built in this chapter are around tuples and functions. In sec. 1.15 we made the _×_ type, its constructor _,_, and its projections proj₁ and proj₂. The functions curry and uncurry, which form an isomorphism between curried functions and functions which take all of their arguments at once, were defined in sec. 1.20.
open import Data.Product using (_×_; _,_; proj₁; proj₂; curry; uncurry) public
Also, in sec. 1.18, we quickly looked at the coproduct type _⊎_: ⇤ 0
open import Data.Sum using (_⊎_; inj₁; inj₂) public
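The effect of the public modifier on imports like these can be seen in a small, self-contained sketch. The module names Exports and Client are hypothetical, and the example assumes the standard library; the point is only that Client never imports Data.Bool itself, yet can use Bool and not because Exports re-exports them.

```agda
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

-- Exports and Client are hypothetical module names for this sketch.
module Exports where
  -- `public` makes these names part of what Exports itself provides.
  open import Data.Bool using (Bool; true; false; not) public

module Client where
  -- Client opens only Exports, never Data.Bool directly.
  open Exports

  flipped : Bool
  flipped = not true

  _ : flipped ≡ false
  _ = refl
```

Without the public modifier, opening Exports would bring nothing from Data.Bool into scope, and Client would fail to find Bool.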
We have not even begun to scratch the surface of what interesting things are possible in Agda, but we now have enough background that we can earnestly get into the meat of this book. Prepare yourself.
UNICODE IN THIS CHAPTER
¬  U+00AC  NOT SIGN (\neg)
¹  U+00B9  SUPERSCRIPT ONE (\^1)
¿  U+00BF  INVERTED QUESTION MARK (\?)
×  U+00D7  MULTIPLICATION SIGN (\x)
Δ  U+0394  GREEK CAPITAL LETTER DELTA (\GD)
Λ  U+039B  GREEK CAPITAL LETTER LAMDA (\GL)
α  U+03B1  GREEK SMALL LETTER ALPHA (\Ga)
δ  U+03B4  GREEK SMALL LETTER DELTA (\Gd)
λ  U+03BB  GREEK SMALL LETTER LAMDA (\Gl)
ᶠ  U+1DA0  MODIFIER LETTER SMALL F (\^f)
ᶻ  U+1DBB  MODIFIER LETTER SMALL Z (\^z)
′  U+2032  PRIME (\')
₁  U+2081  SUBSCRIPT ONE (\_1)
₂  U+2082  SUBSCRIPT TWO (\_2)
ₓ  U+2093  LATIN SUBSCRIPT SMALL LETTER X (\_x)
ℕ  U+2115  DOUBLE-STRUCK CAPITAL N (\bN)
ℚ  U+211A  DOUBLE-STRUCK CAPITAL Q (\bQ)
ℝ  U+211D  DOUBLE-STRUCK CAPITAL R (\bR)
←  U+2190  LEFTWARDS ARROW (\l-)
→  U+2192  RIGHTWARDS ARROW (\to)
∈  U+2208  ELEMENT OF (\in)
∉  U+2209  NOT AN ELEMENT OF (\inn)
∘  U+2218  RING OPERATOR (\o)
∙  U+2219  BULLET OPERATOR (\.)
∧  U+2227  LOGICAL AND (\and)
∨  U+2228  LOGICAL OR (\or)
≋  U+224B  TRIPLE TILDE (\~~~)
≗  U+2257  RING EQUAL TO (\=o)
≟  U+225F  QUESTIONED EQUAL TO (\?=)
≡  U+2261  IDENTICAL TO (\ )
⊂  U+2282  SUBSET OF (\sub)
⊎  U+228E  MULTISET UNION (\u+)
⊗  U+2297  CIRCLED TIMES (\ox)
⊙  U+2299  CIRCLED DOT OPERATOR (\o.)
⊚  U+229A  CIRCLED RING OPERATOR (\oo)
⊤  U+22A4  DOWN TACK (\top)
⊥  U+22A5  UP TACK (\bot)
⌊  U+230A  LEFT FLOOR (\clL)
⌋  U+230B  RIGHT FLOOR (\clR)
CHAPTER 2
An Exploration of Numbers
In this chapter, we will get our hands dirty, implementing several different number systems in Agda. The goal is threefold: to get some experience thinking about how to model problems in Agda, to practice seeing familiar objects with fresh eyes, and to get familiar with many of the mathematical objects we’ll need for the remainder of the book. Before we start, note that this chapter has prerequisite knowledge from sec. 1. And, as always, every new chapter must start a new module: 0
module Chapter2-Numbers where
Prerequisites 0
import Chapter1-Agda
As you might expect, Agda already has support for numbers, and thus everything we do here is purely to enhance our understanding. That being said, it’s important to get an intuition for how we can use Agda to solve problems. Numbers are simultaneously a domain you already understand, and, in most programming languages, they usually come as pre-built, magical primitives. This is not true in Agda: numbers are defined in library code.
Our approach will be to build the same number system exported by the standard library so we can peek at how it’s done. Again, this is just an exercise; after this chapter, we will just use the standard library’s implementation, since it will be more complete, and allow us better interoperability when doing real work.
2.1 Natural Numbers
It is one thing to say we will “construct the numbers,” but doing so is much more involved. The first question to ask is which numbers? As it happens, we will build all of them. But that is just passing the buck. What do we mean by “all” the numbers? There are many different sets of numbers. For example, there are the numbers we use to count in the real world (which start at 1.) There are also the numbers we use to index in computer science (which begin at 0.) There are the integers, which contain negatives. And then there are the rationals which contain fractions, and happen to be all the numbers you have ever encountered in real life. But somehow, not even that is all the numbers. Beyond the rationals are the reals, which are somehow bigger than all the numbers you have actually experienced, and in fact are so big that the crushing majority of them are completely inaccessible to us. The party doesn’t stop there. After the reals come the complex numbers which have an “imaginary” part—whatever that means. Beyond those are the quaternions, which come with three different varieties of imaginary parts, and beyond those, the octonions (which have seven different imaginaries!) In order to construct “the numbers,” we must choose between these (and many other) distinct sets. And those are just some of the number systems mathematicians talk about. But worse, there are the number systems that computer scientists use, like the bits, the bytes, the words, and by far the worst of all, the IEEE 754 “floating point” numbers known as floats and doubles. You, gentle reader, are probably a programmer, and it is probably in number systems such as these that you feel more at home. We will not, however, be working with number systems of the computer science variety, as these are extremely non-standard systems of numbers, with all sorts of technical difficulties that you have likely been burned so badly by that you have lost your pain receptors. 
Who among us hasn’t been bitten by an integer overflow, where adding two positive numbers somehow results in a negative one? Or by the fact that, when working with floats, we get different answers depending on which two of three numbers we multiply together first? One might make a successful argument that these are necessary limitations of our computing hardware. As a retort, I will only point to the co-Blub paradox (sec. ), and remind you that our goal here is to learn how things can be, rather than limit our minds to the way we
perceive things must be. After all, we cannot hope to reach paradise if we cannot imagine it. And so we return to our original question of which number system we’d like to build. As a natural starting point, we will pick the simplest system that it seems fair to call “numbers”: the naturals. These are the numbers you learn as a child, in a simpler time, before you needed to worry about things like negative numbers, decimal points, or fractions. The natural numbers start at 0, and proceed upwards exactly one at a time, to 1, then 2, then 3, and so on and so forth. Importantly to the computer scientist, there are infinitely many natural numbers, and we intend to somehow construct every single one of them. We will not placate ourselves with arbitrary upper limits, or with arguments of the form “X ought to be enough for anyone.” How can we hope to generate an infinite set of numbers? The trick isn’t very impressive—in fact, I’ve already pointed it out. You start at zero, and then you go up one at a time, forever. In Agda, we can encode this by saying zero is a natural number, and that, given some number n, we can construct the next number up—its successor—via suc n. Such an encoding gives rise to a rather elegant (if inefficient, but, remember, we don’t care) specification of the natural numbers. Under such a scheme, we would write the number 3 as suc (suc (suc zero)). It is important to stress that this is a unary encoding, rather than the traditional binary encoding familiar to computer scientists. There is nothing intrinsically special about binary; it just happens to be an easy thing to build machines that can distinguish between two states: whether they be magnetic forces, electric potentials, or the presence or absence of a bead on the wire of an abacus. Do not be distraught; working in unary dramatically simplifies math, and if you are not yet sold on the approach, you will be before the end of this chapter. But enough talk. 
It’s time to conjure up the natural numbers. In the mathematical literature, the naturals are denoted by the blackboard bold symbol ℕ—a convention we too will adopt. You can input this symbol via \bN . 0
module Definition-Naturals where
  data ℕ : Set where
    zero : ℕ
    suc  : ℕ → ℕ  -- (1)
Here we use the data keyword to construct a type consisting of
several different constructors. In this case, a natural is either a zero or it is a suc of some other natural number. You will notice that we must give explicit types to constructors of a data type, and at (1) we give the type of suc as ℕ → ℕ. This is the precise meaning that a suc is “of some other natural number.” You can think of suc as the mathematical function

$$x \mapsto x + 1$$

although this is just a mental shortcut, since we do not yet have formal definitions for addition or the number 1.
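To make the unary encoding concrete, here is a minimal sketch that builds the number three by hand and checks it against a decimal literal. It assumes the standard library's Data.Nat (whose ℕ has the same two constructors) rather than the Definition-Naturals module above, because only the library version supports literals out of the box.

```agda
open import Data.Nat using (ℕ; zero; suc)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

-- Three applications of suc to zero.
three : ℕ
three = suc (suc (suc zero))

-- The decimal literal 3 is just notation for the same value,
-- so the type checker accepts refl as a proof.
_ : three ≡ 3
_ = refl
```

If the definition of three had one suc too many or too few, the second declaration would fail to type check.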
2.2 Brief Notes on Data and Record Types
Before we play around with our new numeric toys, I’d like to take a moment to discuss some of the subtler points around modeling data in Agda. The data keyword also came up when we defined the booleans in sec. 1.6, as well as for other toy examples in sec. 1. Indeed, data will arise whenever we’d like to build a type whose values are apart—that is, new symbols whose purpose is in their distinctiveness from one another. The boolean values false and true are just arbitrary symbols, which we assign meaning to only by convention. This meaning is justified exactly because false and true are distinct symbols. This distinctness is also of the utmost importance when it comes to numbers. The numbers are interesting to us only because we can differentiate one from two, and two from three. Numbers are a collection of symbols, all distinct from one another, and it is from their apartness that we derive importance. As a counterexample, imagine a number system in which there is only one number. Not very useful, is it? Contrast this apartness to the tuple type, which you’ll recall was a type defined via record instead of data. In some sense, tuples exist only for bookkeeping. The tuple type doesn’t build new things, it just lets you simultaneously move around two things that already exist. Another way to think about this is that records are made up of things that already exist, while data types create new things ex nihilo. Most programming languages have a concept of record types (whether they be called structures, tuples, or classes), but very few support data types. Booleans and numbers are the canonical
examples of data types, and the lack of support for them is exactly why these two types are usually baked into a language. It can be tempting to think of types defined by data as enums, but this is subtly misleading. While enums are indeed apart from one another, this comes from the fact that enums are just special names given to particular values of ints. This is an amazingly restrictive limitation. Note that in Agda, data types are strictly more powerful than enums, because they don’t come with this implicit conversion to ints. As a quick demonstration, note that suc is apart from zero, but suc can accept any ℕ as an argument! While there are only $2^{64}$ ints, there are infinitely many ℕs, and thus types defined by data in Agda must be more powerful than those defined as enums in other languages. More generally, constructors in data types can take arbitrary arguments, and we will often use this capability moving forwards.
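As a small illustration of constructors that carry arguments, consider the sketch below. The Token type and its constructor names are hypothetical, invented here for the example; they do not appear elsewhere in the text.

```agda
open import Data.Nat using (ℕ; zero; suc)

-- Hypothetical type: two plain symbols, as in an enum,
-- plus one constructor that carries a natural number.
data Token : Set where
  open-paren  : Token
  close-paren : Token
  number      : ℕ → Token

-- Each natural gives a different Token, so Token has
-- infinitely many values, which no enum could offer.
example : Token
example = number (suc (suc zero))
```

The first two constructors behave like enum members, while number shows what an enum cannot express.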
2.3 Playing with Naturals
Let’s return now to our discussion of the naturals. Since we’d like to reuse the things we build in future chapters, let’s first import the natural numbers from the standard library.

module Sandbox-Naturals where
  open import Data.Nat using (ℕ; zero; suc)

By repeated application of suc, we can build an infinite tower of natural numbers, the first four of which are built like this:

  one : ℕ
  one = suc zero

  two : ℕ
  two = suc one

  three : ℕ
  three = suc two

  four : ℕ
  four = suc three
Of course, these names are just for syntactic convenience; we could have instead defined four thusly:

  four : ℕ
  four = suc (suc (suc (suc zero)))
It is tempting to use the traditional base-ten symbols for numbers, and of course, Agda supports this (although setting it up will require a little more effort on our part). However, we will persevere with our explicit unary encoding for the time being, to really hammer in that there is no magic happening behind the scenes here.

The simplest function we can write over the naturals is to determine whether or not the argument is equal to 0. For the sake of simplicity, this function will return a boolean, but note that this is a bad habit in Agda. There are much better techniques that don’t lead to boolean blindness that we will explore in sec. 6. This function therefore is only provided to help us get a feel for pattern matching over natural numbers. We can get access to the booleans by importing them from our exports from sec. 1:

  open Chapter1-Agda using (Bool; true; false)
The function we’d like to write determines if a given ℕ is equal to zero, so we can begin with a name, a type signature, and a hole:

  n=0? : ℕ → Bool
  n=0? = ?
After MakeCase (C-c C-c in Emacs and VS Code), our argument is bound for us:

  n=0? : ℕ → Bool
  n=0? n = {! !}
and, like when writing functions over the booleans, we can immediately MakeCase with argument n (C-c C-c in Emacs and VS Code) to split n apart into its distinct possible constructors:

  n=0? : ℕ → Bool
  n=0? zero    = {! !}
  n=0? (suc x) = {! !}  -- (1)
Interestingly, at (1), Agda has given us a new form, something we didn’t see when considering the booleans. We now have a pattern match of the form suc x, which after some mental type-checking, makes sense. We said n was a ℕ, but suc has type ℕ → ℕ. That means n can only be a natural number of the suc form if that function has already been applied to some other number. And x is that other number. The interpretation you should give to this expression is that if $n$ is of the form suc x, then $x = n - 1$. Note that zero is not of the form suc x, and thus we don’t accidentally construct any negative numbers under this interpretation.

Returning to n=0?, we care only if our original argument n is zero, which we can immediately solve from here, without needing to do anything with x:

  n=0? : ℕ → Bool
  n=0? zero    = true
  n=0? (suc x) = false
It will be informative to compare this against a function that computes whether a given natural is equal to 2.

Exercise (Easy) Implement n=2? : ℕ → Bool

Solution

  n=2? : ℕ → Bool
  n=2? zero                = false
  n=2? (suc zero)          = false
  n=2? (suc (suc zero))    = true
  n=2? (suc (suc (suc x))) = false

or, alternatively:

  n=2? : ℕ → Bool
  n=2? (suc (suc zero)) = true
  n=2? _                = false
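Both functions can be exercised with unit tests in the style of sec. 1.9. The sketch below is self-contained and, as an assumption, takes Bool from the standard library's Data.Bool instead of the Chapter1-Agda module, and uses decimal literals for brevity.

```agda
open import Data.Nat using (ℕ; zero; suc)
open import Data.Bool using (Bool; true; false)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

n=0? : ℕ → Bool
n=0? zero    = true
n=0? (suc x) = false

n=2? : ℕ → Bool
n=2? (suc (suc zero)) = true
n=2? _                = false

-- Each test type checks only if both sides compute to the same value.
_ : n=0? zero ≡ true
_ = refl

_ : n=0? 7 ≡ false
_ = refl

_ : n=2? 2 ≡ true
_ = refl

_ : n=2? 3 ≡ false
_ = refl
```

Changing any expected value on the right of ≡ turns the corresponding test into a type error, which is exactly the feedback we want.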
2.4 Induction
Unlike functions out of the booleans, where we only had two possibilities to worry about, functions out of the naturals have (in principle) an infinite number of possibilities. But our program can only ever be a finite number of lines, which leads to a discrepancy. How can we possibly reconcile an infinite number of possibilities with a finite number of cases?

Our only option is to give a finite number of interesting cases, and then a default case which describes what to do for everything else. Of course, nothing states that this default case must be constant, or even simple. That is to say, we are not required to give the same answer for every number above a certain threshold.

To illustrate this, we can write a function which determines whether its argument is even. We need only make the (mathematical) argument that “a number $n + 2$ is even if and only if the number $n$ is.” This is expressed naturally by recursion:

  even? : ℕ → Bool
  even? zero          = true
  even? (suc zero)    = false
  even? (suc (suc x)) = even? x
Here, we’ve said that zero is even, one is not, and for every other number, you should subtract two (indicated by having removed two suc constructors to get x) and then try again. This general technique (giving some explicit answers for specific inputs, and recursing after refining) is known as induction.

It is impossible to overstate how important induction is. Induction is the fundamental mathematical technique. It is the primary workhorse of all mathematics. Which makes sense; if you need to make some argument about an infinite number of things, you can neither physically nor theoretically analyze each and every possible case. You instead must give a few (usually very simple) answers, and otherwise show how to reduce a complicated problem into a simpler one. This moves the burden of effort from the theorem prover (you) to whoever wants an answer, since they are the ones who become responsible for carrying out the repetitive task of reduction to simpler forms. However, this is not so bad, since the end-user is the one who wants the answer, and they necessarily have a particular, finite problem that they’d like to solve.
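The reduction to simpler forms can be watched in action. In this sketch (assuming Bool from the standard library's Data.Bool, and decimal literals for brevity), the comments trace how each call peels off two suc constructors at a time until a base case answers.

```agda
open import Data.Nat using (ℕ; zero; suc)
open import Data.Bool using (Bool; true; false)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

even? : ℕ → Bool
even? zero          = true
even? (suc zero)    = false
even? (suc (suc x)) = even? x

-- even? 5 reduces to even? 3, then to even? 1,
-- which is answered by the second base case.
_ : even? 5 ≡ false
_ = refl

-- even? 4 reduces to even? 2, then to even? 0,
-- which is answered by the first base case.
_ : even? 4 ≡ true
_ = refl
```

Three equations are enough to answer the question for every natural number, because every input eventually lands on one of the two base cases.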
Not being very creative, mathematicians often define the principle of induction as being a property of the natural numbers. They say all of mathematics comes from:

1. a base case, that is, proving something in the case that $n = 0$
2. an inductive case, that is, showing something holds in the case of $n$ under the assumption that it holds under $n - 1$.

However, the exact same technique can be used for any sort of recursively-defined type, such as lists, trees, graphs, matrices, etc. While perhaps you could shoe-horn these to fit into the natural numbers, it would be wasted effort in order to satisfy nothing but a silly definition. That notwithstanding, the terminology itself is good, and so we will sometimes refer to recursive steps as “induction” and non-recursive steps as “base cases.”
2.5 Two Notions of Evenness
We have now defined even?, a function which determines whether a given natural number is even. A related question is whether we can define a type for only the even numbers. That is, we’d like a type which contains 0, 2, 4, and so on, but neither 1, nor 3, nor any of the odd numbers.

In a monkey-see-monkey-do fashion, we could try to define a new type called Evenℕ with a constructor for zero, but unlike ℕ, no suc. Instead, we will give a constructor called suc-suc, intending to be suggestive of taking two successors simultaneously:

  data Evenℕ : Set where
    zero    : Evenℕ
    suc-suc : Evenℕ → Evenℕ
We can transform an Evenℕ into a ℕ by induction:

  toℕ : Evenℕ → ℕ
  toℕ zero        = zero
  toℕ (suc-suc x) = suc (toℕ x)
This approach, however, feels slightly underwhelming. The reflective reader will recall that in a data type, the meaning of the constructors comes only from their types and the suggestive names we give them.
A slight renaming of suc-suc to suc makes the definition of Evenℕ look very similar indeed to that of ℕ. In fact, the two types are completely equivalent, modulo the names we picked. As such, there is nothing stopping us from writing an incorrect (but not obviously wrong) version of the toℕ function. On that note, did you notice that the definition given above was wrong? Oops! Instead, the correct implementation should be this:

  toℕ : Evenℕ → ℕ
  toℕ zero        = zero
  toℕ (suc-suc x) = suc (suc (toℕ x))
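A unit test is one lightweight way to catch this kind of slip. The sketch below is self-contained and assumes the standard library's ℕ so that decimal literals are available; it tells the two versions of toℕ apart, although it checks only the specific inputs listed.

```agda
open import Data.Nat using (ℕ; zero; suc)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

data Evenℕ : Set where
  zero    : Evenℕ
  suc-suc : Evenℕ → Evenℕ

toℕ : Evenℕ → ℕ
toℕ zero        = zero
toℕ (suc-suc x) = suc (suc (toℕ x))

-- One suc-suc should count as two. With the earlier, incorrect
-- definition the left side computes to 1 and this test is rejected.
_ : toℕ (suc-suc zero) ≡ 2
_ = refl

_ : toℕ (suc-suc (suc-suc zero)) ≡ 4
_ = refl
```

Tests like these raise confidence without ruling out every mistake, which is part of the motivation for the type-level approach that follows.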
You might want to double check this new definition, just to make sure I haven’t pulled another fast one on you. Double checking, however, is tedious and error prone, and in an ideal world, we’d prefer to find a way to get the computer to double check on our behalf.

Rather than trying to construct a completely new type for the even naturals, perhaps we can instead look for a way to filter for only the naturals we want. A mathematician would look at this problem and immediately think to build a subset, that is, a restricted collection of the objects at study. In this particular case, we’d like to build a subset of the natural numbers which contains only those that are even.

The high-level construction here is we’d like to build IsEven : ℕ → Set, which, like you’d think, is a function that takes a natural and returns a type. The idea that we can compute types in this way is rare in programming languages, but is very natural in Agda.

In order to use IsEven as a subset, it must return some sort of “usable” type when its argument is even, and an “unusable” type otherwise. We can take this function idea literally if we please, and postulate two appropriate types:

  module Sandbox-Usable where
    postulate
      Usable   : Set
      Unusable : Set

    IsEven : ℕ → Set
    IsEven zero          = Usable
    IsEven (suc zero)    = Unusable
    IsEven (suc (suc x)) = IsEven x
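The two postulated types can be swapped for concrete ones. Here is a minimal sketch, assuming the standard library's unit type ⊤ (which has exactly one value, tt) as the usable type and the empty type ⊥ (which has no values) as the unusable one. These particular choices are an assumption made for illustration, not something the text has committed to.

```agda
open import Data.Nat using (ℕ; zero; suc)
open import Data.Unit using (⊤; tt)
open import Data.Empty using (⊥)

IsEven : ℕ → Set
IsEven zero          = ⊤
IsEven (suc zero)    = ⊥
IsEven (suc (suc x)) = IsEven x

-- IsEven 4 computes to ⊤, so tt is a value of that type.
four-is-even : IsEven 4
four-is-even = tt

-- IsEven 3 computes to ⊥, which has no values, so no
-- definition of type IsEven 3 can be written down.
```

Because IsEven is an ordinary function here, Agda evaluates it during type checking, just as it would evaluate even?.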
You will notice the definition of IsEven is identical to that of even? except that we replaced Bool with Set, true with Usable, and false with Unusable. This is what you should expect, as even? was already a function that computed whether a given number is even!

While we could flesh this idea out in full by finding specific (non-postulated) types to use for Usable and Unusable, constructing subsets in this way isn’t often fruitful. It does occasionally come in handy, though, and it’s nice to know you can compute types directly in this way.

Let’s drop out of the Sandbox-Usable module, and try defining IsEven in a different way. The situation here is analogous to our first venture into typing judgments in sec. 1.6. While it’s possible to do all of our work directly with postulated judgments, Agda doesn’t give us any help in doing so. Instead, things became much easier when we used a more principled structure, namely the data type.

Amazingly, here too we can use a data type to solve our problem. The trick is to add an index to our type, which you can think of as a “return value” that comes from our choice of constructor. Don’t worry, the idea will become much clearer in a moment after we look at an example. Let’s begin just with the data declaration:
  data IsEven : ℕ → Set where  -- (1)
Every type we have seen so far has been of the form data X : Set, but at (1) we have ℕ → Set on the right side of the colon. Reading this as a type declaration directly, it says that this type IsEven we’re currently defining is exactly the function we were looking for earlier, the one with type ℕ → Set. Because of this ℕ argument, we say that IsEven is an indexed type, and that the ℕ in question is its index.

Every constructor of an indexed type must fill in each index. To a first approximation, constructors of an indexed type are assertions about the index. For example, it is an axiom that zero is an even number, which we can reflect directly as a constructor:

    zero-even : IsEven zero
Notice that this constructor is equivalent to the base case even? zero = true. We would like to exclude odd numbers from IsEven, so we can ignore the suc zero case for the moment. In the inductive case, we’d like to say that if $n$ is even, then so too is $n + 2$:

    suc-suc-even : {n : ℕ} → IsEven n → IsEven (suc (suc n))
In a very real sense, our indexed type IsEven is the “opposite” of our original decision function even?. Where before we removed two calls to suc before recursing, we now recurse first, and then add two calls to suc. This is not a coincidence, but is in fact a deep principle of mathematics that we will discuss later.

The concept of indexed types is so foreign to mainstream programming that it is prudent to spend some time here and work through several examples of what-the-hell-is-happening. Let’s begin by showing that four is even. Begin with the type and a hole:

  four-is-even : IsEven four
  four-is-even = ?
Here’s where things get cool. We can ask Agda to refine this hole via Refine (C-c C-r in Emacs and VS Code). Recall that refine asks Agda to fill in the hole with the only constructor that matches. Rather amazingly, the result of this invocation is:

  four-is-even : IsEven four
  four-is-even = suc-suc-even {! !}
Even more impressive is that the new goal has type IsEven two, which is to say, we need to show that two is even in order to show that four is even. Thankfully we can ask Agda to do the heavy lifting for us, and again request a Refine (C-c C-r in Emacs and VS Code):

  four-is-even : IsEven four
  four-is-even = suc-suc-even (suc-suc-even {! !})
Our new hole has type IsEven zero, which again Agda can refine for us:

  four-is-even : IsEven four
  four-is-even = suc-suc-even (suc-suc-even zero-even)
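The same pattern scales to any even number: the proof contains one suc-suc-even for every two suc constructors in the index, capped off by zero-even. As a self-contained sketch (assuming the standard library's ℕ and a decimal literal for six):

```agda
open import Data.Nat using (ℕ; zero; suc)

data IsEven : ℕ → Set where
  zero-even    : IsEven zero
  suc-suc-even : {n : ℕ} → IsEven n → IsEven (suc (suc n))

-- Reading outside in, the intermediate goals are
-- IsEven 6, IsEven 4, IsEven 2, and finally IsEven 0.
six-is-even : IsEven 6
six-is-even = suc-suc-even (suc-suc-even (suc-suc-even zero-even))
```

Note how the shape of the proof term mirrors the recursive calls even? would make on the same input.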
With all the holes filled, we have now successfully proven that four is in fact even. But can we trust that this works as intended? Let’s see what happens when we go down a less-happy path. Can we also prove IsEven three?
  three-is-even : IsEven three
  three-is-even = ?
Let’s play the same refinement game. Invoking Refine (C-c C-r in Emacs and VS Code) results in:

  three-is-even : IsEven three
  three-is-even = suc-suc-even {! !}
Our new goal is IsEven one. But if we try to refine again, Agda gives us an error:

INFO WINDOW
  No introduction forms found.
What’s (correctly) going wrong here is that Agda is trying to find a constructor for IsEven (suc zero), but no such thing exists. We have zero-even for IsEven zero, and we have suc-suc-even for IsEven (suc (suc n)). But there is no such constructor when we have only one suc! Thus neither zero-even nor suc-suc-even will typecheck in our hole. Since these are the only constructors, and neither fits, it’s fair to say that nothing can possibly fill this hole. There is simply no way to give an implementation for three-is-even: it’s impossible to construct an IsEven n whenever n is odd.

This is truly a miraculous result, and might give you a glimpse at why we do mathematics in Agda. The idea is to carefully construct types whose values are possible only when our desired property is actually true. We will explore this topic more deeply in sec. 3.

Exercise (Easy) Build an indexed type for IsOdd.

Solution

  data IsOdd : ℕ → Set where
    one-odd     : IsOdd one
    suc-suc-odd : {n : ℕ} → IsOdd n → IsOdd (suc (suc n))

or, alternatively,

  data IsOdd' : ℕ → Set where
    is-odd : {n : ℕ} → IsEven n → IsOdd' (suc n)
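Returning briefly to three-is-even: the claim that no such value exists can itself be stated and checked. The sketch below leans on two features the text has not introduced yet, so treat it as a preview under assumptions: the standard library's empty type ⊥, and Agda's absurd pattern (), which tells the checker that no constructor could possibly match.

```agda
open import Data.Nat using (ℕ; zero; suc)
open import Data.Empty using (⊥)

data IsEven : ℕ → Set where
  zero-even    : IsEven zero
  suc-suc-even : {n : ℕ} → IsEven n → IsEven (suc (suc n))

-- Any proof of IsEven 3 must be suc-suc-even applied to a proof
-- of IsEven 1. No constructor produces IsEven 1, so the inner
-- pattern is absurd and the function needs no right-hand side.
three-is-not-even : IsEven (suc (suc (suc zero))) → ⊥
three-is-not-even (suc-suc-even ())
```

A function into ⊥ can only exist if its input type has no values, which is one way of making "impossible to construct" precise.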
Exercise (Easy) Write an inductive function which witnesses the fact that every even number is followed by an odd number. This function should have type {n : ℕ} → IsEven n → IsOdd (suc n).

Solution

  evenOdd : {n : ℕ} → IsEven n → IsOdd (suc n)
  evenOdd zero-even        = one-odd
  evenOdd (suc-suc-even x) = suc-suc-odd (evenOdd x)

or, if you took the alternative approach in the previous exercise,

  evenOdd' : {n : ℕ} → IsEven n → IsOdd' (suc n)
  evenOdd' = is-odd
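To see evenOdd doing useful work, we can feed it a proof that four is even and get back a proof that five is odd. This sketch restates the definitions so that it stands alone, and assumes the standard library's ℕ with decimal literals.

```agda
open import Data.Nat using (ℕ; zero; suc)

data IsEven : ℕ → Set where
  zero-even    : IsEven zero
  suc-suc-even : {n : ℕ} → IsEven n → IsEven (suc (suc n))

data IsOdd : ℕ → Set where
  one-odd     : IsOdd (suc zero)
  suc-suc-odd : {n : ℕ} → IsOdd n → IsOdd (suc (suc n))

evenOdd : {n : ℕ} → IsEven n → IsOdd (suc n)
evenOdd zero-even        = one-odd
evenOdd (suc-suc-even x) = suc-suc-odd (evenOdd x)

four-is-even : IsEven 4
four-is-even = suc-suc-even (suc-suc-even zero-even)

-- evenOdd turns evidence about 4 into evidence about 5.
five-is-odd : IsOdd 5
five-is-odd = evenOdd four-is-even
```

The implicit argument n is inferred from the type of four-is-even, so it never needs to be written out.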
2.6 Constructing Evidence

When we originally implemented even?, I mentioned that functions which return booleans are generally a bad habit in Agda. You’ve done a lot of computation in order to get the answer, and then throw away all of that work just to say merely “yes” or “no.” Instead of returning a Bool, we could instead have even? return an IsEven, proving the number really is even!

However, not all numbers are even, so we will first need some notion of failure. This is an excellent use for the Maybe type, which is a container that contains exactly zero or one element of some type A. We can define it as:

  data Maybe (A : Set) : Set where
    just    : A → Maybe A
    nothing : Maybe A
Here, just is the constructor for when the Maybe does contain an element, and nothing is for when it doesn’t. Maybe is a good type for representing partial functions—those which don’t always give back a result. Our desired improvement to even? is one such function, since there are naturals in the input which do not have a corresponding value in the output. Our new function is called evenEv, to be suggestive of the fact that it returns evidence of the number’s evenness. The first thing to study here is the type:
  evenEv : (n : ℕ) → Maybe (IsEven n)
  evenEv = ?
The type signature of evenEv says “for any n : ℕ, I can maybe provide a proof that it is an even number.” The implementation will look very reminiscent of even?. First, we can do MakeCase (C-c C-c in Emacs and VS Code) a few times:

  evenEv : (n : ℕ) → Maybe (IsEven n)
  evenEv zero          = {! !}
  evenEv (suc zero)    = {! !}
  evenEv (suc (suc n)) = {! !}
Then, in the suc zero case where we know there is not an answer, we can give back nothing:

  evenEv : (n : ℕ) → Maybe (IsEven n)
  evenEv zero          = {! !}
  evenEv (suc zero)    = nothing
  evenEv (suc (suc n)) = {! !}
In the case of zero, there definitely is an answer, so we refine our hole with just:

  evenEv : (n : ℕ) → Maybe (IsEven n)
  evenEv zero          = just {! !}
  evenEv (suc zero)    = nothing
  evenEv (suc (suc n)) = {! !}
…but a just of what? The type IsEven zero of the goal tells us, but we can also elicit an answer from Agda by invoking Refine (C-c C-r in Emacs and VS Code) on our hole:

  evenEv : (n : ℕ) → Maybe (IsEven n)
  evenEv zero          = just zero-even
  evenEv (suc zero)    = nothing
  evenEv (suc (suc n)) = {! !}
At this step in even? we just recursed and we were done. However, that can’t quite work here. The problem is that if we were to recurse,
we’d get a result of type Maybe (IsEven n), but we need a result of type Maybe (IsEven (suc (suc n))). What needs to happen then is for us to recurse, inspect the answer, and then, if it’s just, insert a suc-suc-even on the inside. It all seems a little convoluted, but the types are always there to guide you should you ever lose the forest for the trees.

Agda does allow us to pattern match on the result of a recursive call. This is known as a with abstraction, and the syntax is as follows:

  evenEv : (n : ℕ) → Maybe (IsEven n)
  evenEv zero          = just zero-even
  evenEv (suc zero)    = nothing
  evenEv (suc (suc n)) with evenEv n  -- (1)
  ... | result = {! !}                -- (2)
At (1), which you will note is on the left side of the equals sign, we add the word with and the expression we’d like to pattern match on. Here, it’s evenEv n, which is the recursive call we’d like to make. At (2), we put three dots, a vertical bar, and a name for the resulting value of the call we made, and then the equals sign. The important thing to note here is that result is a binding that corresponds to the result of having called evenEv n.

This seems like quite a lot of ceremony, but what’s cool is that we can now run MakeCase with argument result (C-c C-c in Emacs and VS Code) in the hole to pattern match on result:

  evenEv : (n : ℕ) → Maybe (IsEven n)
  evenEv zero          = just zero-even
  evenEv (suc zero)    = nothing
  evenEv (suc (suc n)) with evenEv n
  ... | just x  = {! !}
  ... | nothing = {! !}
In the case that result is nothing, we know that our recursive call failed, and thus that $n - 2$ is not even. Therefore, we too should return nothing. Similarly for the just case:

  evenEv : (n : ℕ) → Maybe (IsEven n)
  evenEv zero          = just zero-even
  evenEv (suc zero)    = nothing
  evenEv (suc (suc n)) with evenEv n
  ... | just x  = just {! !}
  ... | nothing = nothing
We’re close to the end. Now we know that x : IsEven n and that our hole requires an IsEven (suc (suc n)). We can fill in the rest by hand, or invoke Auto (C-c C-a in Emacs and VS Code) to do it on our behalf.

  evenEv : (n : ℕ) → Maybe (IsEven n)
  evenEv zero          = just zero-even
  evenEv (suc zero)    = nothing
  evenEv (suc (suc n)) with evenEv n
  ... | just x  = just (suc-suc-even x)
  ... | nothing = nothing
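It is worth running evenEv on a couple of inputs to see the evidence it produces. The sketch below is self-contained; as an assumption it uses Maybe from the standard library's Data.Maybe, which has the same two constructors as the definition given earlier.

```agda
open import Data.Nat using (ℕ; zero; suc)
open import Data.Maybe using (Maybe; just; nothing)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

data IsEven : ℕ → Set where
  zero-even    : IsEven zero
  suc-suc-even : {n : ℕ} → IsEven n → IsEven (suc (suc n))

evenEv : (n : ℕ) → Maybe (IsEven n)
evenEv zero          = just zero-even
evenEv (suc zero)    = nothing
evenEv (suc (suc n)) with evenEv n
... | just x  = just (suc-suc-even x)
... | nothing = nothing

-- For an even input we get back the whole proof, not merely "yes".
_ : evenEv 4 ≡ just (suc-suc-even (suc-suc-even zero-even))
_ = refl

-- For an odd input the recursion bottoms out at suc zero.
_ : evenEv 3 ≡ nothing
_ = refl
```

Compare this with even?, which would have returned only true and false for the same two inputs.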
2.7 Addition
With the concept of induction firmly in our collective tool-belt, we are now ready to tackle a much more interesting function: addition over the naturals. Begin with the type, and bind the variables:

  _+_ : ℕ → ℕ → ℕ
  x + y = ?
At first blush, it’s not obvious how we might go about implementing this. Perhaps we could mess about at random and see what comes out, but while such a thing might be fun, it is rarely productive. Instead, we can go at this with a more structured approach, seeing what happens if we throw induction at the problem.

Doing induction requires something to do induction on, meaning we can choose either x, y or both simultaneously. In fact, all three cases will work, but, as a general rule, if you have no reason to pick any parameter in particular, choose the first one.

In practice, doing induction means calling MakeCase (C-c C-c in Emacs and VS Code) on your chosen parameter, and then analyzing if a base case or an inductive case will help in each resulting equation. Usually, the values which are recursively-defined will naturally require recursion on their constituent parts. Let’s now invoke MakeCase with argument x (C-c C-c in Emacs and VS Code):
  _+_ : ℕ → ℕ → ℕ
  zero  + y = {! !}
  suc x + y = {! !}
Immediately a base case is clear to us; adding zero to something doesn’t change it. In fact, that’s the definition of zero. Thus, we have:

  _+_ : ℕ → ℕ → ℕ
  zero  + y = y
  suc x + y = {! !}
The second case here clearly requires recursion, but it might not immediately be clear what that recursion should be. The answer is to squint and reinterpret suc x as $1 + x$, which allows us to write our left hand side as

$$(1 + x) + y$$

If we were to reshuffle the parentheses here, we’d get an $x + y$ term on its own, which is exactly what we need in order to do recursion. In symbols, this inductive case is thus written as:

$$(1 + x) + y = 1 + (x + y)$$

which translates back to Agda as our final definition of addition:

  _+_ : ℕ → ℕ → ℕ
  zero  + y = y
  suc x + y = suc (x + y)
With a little thought, it’s clear that this function really does implement addition. By induction, the first argument might be of the form zero, in which case it adds nothing to the result. Otherwise, the first argument must be of the form suc x, in which case we assume x + y properly implements addition. Then, we observe the fact that (1+𝑥)+𝑦 = 1+(𝑥+𝑦). This is our first mathematical proof, although it is a rather “loose” one: argued out in words, rather than being checked by the computer. Nevertheless, it is a great achievement on our path towards mathematical fluency and finesse. To wrap things up, we will add a fixity declaration for _+_ so that it behaves nicely as an infix operator. We must choose a direction for
repeated additions to associate. In fact, it doesn’t matter one way or another (and we used the fact that it doesn’t matter in the inductive case of _+_). But, looking forwards, we realize that subtraction must be left-associative in order to get the right answer, and therefore it makes sense that addition should also be left-associative. As a matter of convention, we will pick precedence 6 for this operator.

  infixl 6 _+_
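With the definition and its fixity in place, we can watch addition compute. This sketch defines _+_ locally (deliberately not importing the standard library's version, to avoid a name clash) and assumes decimal literals from the library's ℕ.

```agda
open import Data.Nat using (ℕ; zero; suc)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

infixl 6 _+_

_+_ : ℕ → ℕ → ℕ
zero  + y = y
suc x + y = suc (x + y)

-- 2 + 1 unfolds to suc (1 + 1), then to suc (suc (0 + 1)),
-- and the base case finishes the job: suc (suc 1).
_ : 2 + 1 ≡ 3
_ = refl

-- Thanks to infixl, this parses as (1 + 2) + 3.
_ : 1 + 2 + 3 ≡ 6
_ = refl
```

Each suc on the first argument is moved to the outside of the result, one at a time, until only the second argument is left.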
2.8 Termination Checking

There is a subtle point to be made about our implementation of _+_, namely that the parentheses are extremely important. Our last line is written as suc x + y = suc (x + y), but if you were to omit the parentheses, the last line becomes suc x + y = suc x + y. Such a statement is unequivocally true, but is also computationally unhelpful. Since both sides of the equation are syntactically identical, Agda has no ability to make computational progress by rewriting one side as the other. In fact, if such a thing were allowed, it would let you prove anything at all! The only caveat would be that if you tried to inspect the proof, your computer would fall into an infinite loop, rewriting the left side of the equation into the right, forever.

Fortunately, Agda is smart enough to identify this case, and will holler, complaining about “termination checking” if you attempt to do it:

  _+_ : ℕ → ℕ → ℕ
  zero  + y = y
  suc x + y = suc x + y

INFO WINDOW
  Termination checking failed for the following functions:
    Sandbox-Naturals._+_
  Problematic calls:
    suc x + y
By putting in the parentheses, suc (x + y) is now recursive, and, importantly, it is recursive on structurally smaller inputs than it was
given. Since the recursive call must be smaller (in the sense that there is one fewer suc constructor to worry about), eventually this recursion must terminate, and thus Agda is happy.

Agda’s termination checker can help keep you out of trouble, but it’s not the smartest computer program around. The termination checker will only let you recurse on bindings that came from a pattern match, and, importantly, you’re not allowed to fiddle with them first. As a quick, silly illustration, we could imagine an alternative ℕ', which comes with an additional 2suc constructor corresponding to sucing twice:

  module Example-Silly where
    open Chapter1-Agda using (not)

    data ℕ' : Set where
      zero : ℕ'
      suc  : ℕ' → ℕ'
      2suc : ℕ' → ℕ'
We can now write even?', whose 2suc case is not of even?' (suc n):

    even?' : ℕ' → Bool
    even?' zero     = true
    even?' (suc n)  = not (even?' n)
    even?' (2suc n) = not (even?' (suc n))

Tracing the logic here, it’s clear to us as programmers that this function will indeed terminate. Unfortunately, Agda is not as smart as we are, and rejects the program:

INFO WINDOW
  Termination checking failed for the following functions:
    Sandbox-Naturals.Example-Silly.even?'
  Problematic calls:
    even?' (suc n)
The solution to termination problems like these is just to unwrap another layer of constructors:

    even?' : ℕ' → Bool
    even?' zero            = true
    even?' (suc n)         = not (even?' n)
    even?' (2suc zero)     = true
    even?' (2suc (suc n))  = not (even?' n)
    even?' (2suc (2suc n)) = even?' n
It’s not the nicest solution, but it gets the job done.
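As a sanity check that the unwrapped version still computes parity correctly, here is a self-contained sketch with a few unit tests. It assumes not and Bool from the standard library's Data.Bool in place of the Chapter1-Agda versions.

```agda
open import Data.Bool using (Bool; true; false; not)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

data ℕ' : Set where
  zero : ℕ'
  suc  : ℕ' → ℕ'
  2suc : ℕ' → ℕ'

-- Every recursive call is on n itself, a piece of the pattern,
-- so the termination checker has nothing to complain about.
even?' : ℕ' → Bool
even?' zero            = true
even?' (suc n)         = not (even?' n)
even?' (2suc zero)     = true
even?' (2suc (suc n))  = not (even?' n)
even?' (2suc (2suc n)) = even?' n

-- 2suc (suc zero) represents 3.
_ : even?' (2suc (suc zero)) ≡ false
_ = refl

-- 2suc (2suc zero) represents 4.
_ : even?' (2suc (2suc zero)) ≡ true
_ = refl

-- suc (2suc zero) is another way of writing 3.
_ : even?' (suc (2suc zero)) ≡ false
_ = refl
```

The price of satisfying the checker is two extra clauses; the function's behavior is unchanged.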
2.9 Multiplication and Exponentiation

With addition happily under our belt, we will try our hand at multiplication. The approach is the same as with addition: write down the type of the operation, bind the variables, do induction, and use algebraic identities we remember from school to help figure out the actual logic. The whole procedure is really quite underwhelming once you get the hang of it!

After writing down the type and binding some variables, we’re left with the following:

  _*_ : ℕ → ℕ → ℕ
  a * b = {! !}
We need to do induction on one of these bindings; because we have no reason to pick one or the other, we default to a:

  _*_ : ℕ → ℕ → ℕ
  zero  * b = {! !}
  suc a * b = {! !}

The first case is immediately obvious; zero times anything is zero:

  _*_ : ℕ → ℕ → ℕ
  zero  * b = zero
  suc a * b = {! !}
In order to solve what’s left, we can dig into our mental cache of algebra facts. Recall that suc a is how we write $1 + a$ in Agda, thus:

$$(1 + a) \times b = 1 \times b + a \times b = b + a \times b$$
Therefore, our final implementation of multiplication is just:
  _*_ : ℕ → ℕ → ℕ
  zero  * b = zero
  suc a * b = b + a * b
Of course, we need to add a fixity declaration for multiplication to play nicely with the addition operator. Since _*_ is just a series of additions (as you can see from our implementation), it makes sense to make multiplication also associate left. However, we’d like the expression y + x * y to parse as y + (x * y), and so we must give _*_ a higher precedence. Thus we settle on

  infixl 7 _*_
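Here is a short, self-contained sketch of multiplication at work, including a test of the precedence choice. It defines both operators locally and assumes only ℕ and decimal literals from the standard library.

```agda
open import Data.Nat using (ℕ; zero; suc)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

infixl 6 _+_
infixl 7 _*_

_+_ : ℕ → ℕ → ℕ
zero  + y = y
suc x + y = suc (x + y)

_*_ : ℕ → ℕ → ℕ
zero  * b = zero
suc a * b = b + a * b

-- 2 * 3 unfolds to 3 + 1 * 3, then to 3 + (3 + 0 * 3),
-- and finally to 3 + (3 + 0).
_ : 2 * 3 ≡ 6
_ = refl

-- Precedence 7 binds tighter than 6, so this is 1 + (2 * 3).
_ : 1 + 2 * 3 ≡ 7
_ = refl
```

Had the two operators shared a precedence, the second test would have parsed as (1 + 2) * 3 and computed 9 instead.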
Multiplication is just repeated addition, and addition is just repeated counting, as is made abundantly clear when working in our unary representation. We can repeat this pattern, moving upwards and building something that is “just repeated multiplication”: exponentiation. Begin as always, with the type and the bound variables:

  _^_ : ℕ → ℕ → ℕ
  a ^ b = {! !}
We’d again like to do induction, but we must be careful here. Unlike addition and multiplication, exponentiation is not commutative. Symbolically, in general:

$$x^y \neq y^x$$

It’s always a good habit to test claims like these. Because we’re computer scientists we can pick $a = 2$, and because we’re humans, $b = 10$. Doing some quick math, we see that this is indeed an inequality:

$$2^{10} = 1024 \neq 100 = 10^2$$
Due to this lack of commutativity, we must be careful when doing induction on _^_. Unlike in all of our examples so far, getting the right answer in exponentiation strongly depends on picking the right variable to do induction on. Think of what would happen if we were to do induction on a—we would somehow need to multiply smaller numbers together, each to the power of b. Alternatively, doing induction on b means we get to multiply the same number together, each to a smaller power. That sounds much more correct, so let’s pattern match on b:
    _^_ : ℕ → ℕ → ℕ
    a ^ zero  = {! !}
    a ^ suc b = {! !}
The first case is a usual identity, namely that $a^0 = 1$, while the second case requires an application of the exponent law:

$$a^{b+c} = a^b \times a^c$$

Instantiating this gives us:

$$a^{1+b} = a^1 \times a^b = a \times a^b$$

and thus:

    _^_ : ℕ → ℕ → ℕ
    a ^ zero  = one
    a ^ suc b = a * a ^ b
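As a quick sanity check on the choice to recurse on the exponent, here is a minimal sketch that replays the 2 and 10 example. It assumes the standard library’s Data.Nat, whose _^_ is (to the best of my knowledge) defined by the same two equations, and it is not the chapter’s own module.

```agda
open import Data.Nat using (ℕ; _*_; _^_)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

-- Induction on the exponent unfolds into repeated multiplication
-- of the same base, ending in the a ^ zero case.
_ : 2 ^ 3 ≡ 2 * (2 * (2 * 1))
_ = refl

-- The two sides of the non-commutativity example.
_ : 2 ^ 10 ≡ 1024
_ = refl

_ : 10 ^ 2 ≡ 100
_ = refl
```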
As you can see, a judicious application of grade-school facts goes a long way when reasoning through these implementations. It makes sense why: algebraic identities are all about when two expressions are equal, and our Agda programs really and truly are defined in terms of equations.
2.10 Semi-subtraction
The natural numbers don’t support subtraction, because we might try to take too much away, being forced to subtract what we don’t have. Recall that there is no way to construct any negative naturals, and so this is not an operation we can implement in general. However, we have an operation closely related to subtraction, which instead truncates at zero. That is, if the result would have gone negative, we just return zero instead. This operation is called “monus”, and is given the symbol _∸_, input as \.-.

Exercise (Easy) Define _∸_ : ℕ → ℕ → ℕ

Solution
    _∸_ : ℕ → ℕ → ℕ
    x     ∸ zero  = x
    zero  ∸ suc y = zero
    suc x ∸ suc y = x ∸ y
Just to convince ourselves everything works, let’s write a few unit tests:

    module Natural-Tests where
      open import Relation.Binary.PropositionalEquality

      _ : one + two ≡ three
      _ = refl

      _ : three ∸ one ≡ two
      _ = refl

      _ : one ∸ three ≡ zero
      _ = refl

      _ : two * two ≡ four
      _ = refl
Looks good to me! You can find all of these goodies, and significantly more, in the standard library’s Data.Nat module. Additionally, you also get support for natural literals. No more four : ℕ; just use 4 : ℕ instead! By this point, you should be starting to get a good handle on the basics of Agda—both syntactically, as well as how we think about modeling and solving problems. Let’s therefore ramp up the difficulty and put that understanding to the test.
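For comparison, the same unit tests can be written against the standard library with natural literals. This is a small sketch assuming only Data.Nat and Relation.Binary.PropositionalEquality from the standard library, in a file of its own.

```agda
open import Data.Nat using (ℕ; _+_; _*_; _∸_)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

_ : 1 + 2 ≡ 3
_ = refl

_ : 3 ∸ 1 ≡ 2
_ = refl

-- Monus truncates at zero instead of going negative.
_ : 1 ∸ 3 ≡ 0
_ = refl

_ : 2 * 2 ≡ 4
_ = refl
```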
2.11 Inconvenient Integers
In this section we will tackle the integers, which have much more interesting mathematical structure than the naturals, and subsequently, present many more challenges. The integers extend the natural numbers by reflecting themselves onto the negative side of the axis. The number line now goes off to infinity in two directions simultaneously, both towards infinity and towards negative infinity.
Some of the integers are … , −1000, … , −2, −1, 0, 1, … , 47, …, but of course, there are many, many more. The set of integers is often written in blackboard bold, with the symbol ℤ, input as \bZ. ℤ might seem like a strange choice for the integers, but it makes much more sense in German, where the word for “number” is Zahl.

Mathematically, the integers are an extension of the natural numbers. That is, every natural number can be thought of as an integer, but there are some (infinitely many) integers that do not correspond to any natural. When modeling this problem in Agda, it would be nice if we could reuse the machinery we just built for natural numbers, rather than needing to build everything again from scratch.

But before building integers the right way, we will first take an intentional wrong turn, in order to highlight some issues when data modeling in Agda. Rather than pollute our global module with this intentional dead end, we’ll start a new module which we can later use to “roll back” the idea. By analogy with ℕ, which contains zero and suc, perhaps ℤ also has a constructor pred which we will interpret as “one less than”:
    module Misstep-Integers₁ where
      data ℤ : Set where
        zero : ℤ
        suc  : ℤ → ℤ
        pred : ℤ → ℤ
Perhaps we could make an honest go with this definition for ℤ, but it has a major problem: numbers no longer have a unique representation. For example, there are now infinitely many ways of representing the number zero, the first few of which are:

• zero
• pred (suc zero)
• suc (pred zero)
• pred (suc (pred (suc zero)))

This is not just a problem for zero; in fact, every number has infinitely many encodings in this definition of ℤ. We could plausibly try to fix this problem by writing a function normalize, whose job it is to cancel out sucs with preds, and vice versa. An honest attempt at such a function might look like this:
    normalize : ℤ → ℤ
    normalize zero            = zero
    normalize (suc zero)      = suc zero
    normalize (suc (suc x))   = suc (normalize (suc x))
    normalize (suc (pred x))  = normalize x
    normalize (pred zero)     = pred zero
    normalize (pred (suc x))  = normalize x
    normalize (pred (pred x)) = pred (normalize (pred x))
It’s unclear prima facie whether this function correctly normalizes all integers. As it happens, it doesn’t:

    module Counterexample where
      open import Relation.Binary.PropositionalEquality

      _ : normalize (suc (suc (pred (pred zero)))) ≡ suc (pred zero)
      _ = refl
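The failure is not specific to that one term. As a self-contained sketch (repeating the data type and normalize from above in a standalone file, with only propositional equality imported), the mirror-image term also gets stuck one step short of zero:

```agda
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

data ℤ : Set where
  zero : ℤ
  suc  : ℤ → ℤ
  pred : ℤ → ℤ

normalize : ℤ → ℤ
normalize zero            = zero
normalize (suc zero)      = suc zero
normalize (suc (suc x))   = suc (normalize (suc x))
normalize (suc (pred x))  = normalize x
normalize (pred zero)     = pred zero
normalize (pred (suc x))  = normalize x
normalize (pred (pred x)) = pred (normalize (pred x))

-- pred (pred (suc (suc zero))) denotes 0, but the outer pred is
-- emitted before the inner pred/suc pair has been cancelled.
_ : normalize (pred (pred (suc (suc zero)))) ≡ pred (suc zero)
_ = refl
```

In both counterexamples, normalize commits to an outer constructor before it knows what the inner term will normalize to.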
I’m sure there is a way to make normalize work correctly, but I suspect that the resulting ergonomics would be too atrocious to use in the real world. The problem seems to be that we can’t be sure that the sucs and preds are beside one another in order to cancel out. Perhaps we can try a different type to model integers which doesn’t have this limitation.
2.12 Difference Integers
Instead, this time let’s see what happens if we model integers as a pair of two natural numbers: one for the positive count, and another for the negative count. The actual integer in question is thus the difference between these two naturals.

Because we’d like to use the natural numbers, we must import them. But we anticipate a problem: addition over both the natural numbers and the integers is called _+_, but in Agda, there can only be one definition in scope with a given name. Our solution will be to import Data.Nat, but not to open it:

    module Misstep-Integers₂ where
      import Data.Nat as ℕ
This syntax gives us access to all of Data.Nat, but allows us to use ℕ as the name of the module, rather than typing out Data.Nat every time.
However, not every definition in ℕ will conflict with things we’d like to define about the integers, so we can also open ℕ in order to bring out the definitions we’d like to use unqualified:
open ℕ using (ℕ; zero; suc)
We are now ready to take our second attempt at defining the integers:

    record ℤ : Set where
      constructor mkℤ
      field
        pos : ℕ
        neg : ℕ
This new definition of ℤ also has the problem that there are infinitely many representations for each number, but we no longer need to worry about the interleaving problem. To illustrate this, the first five representations of zero are now:

• mkℤ 0 0
• mkℤ 1 1
• mkℤ 2 2
• mkℤ 3 3
• mkℤ 4 4

Because the positive and negative sides are tracked independently, we can now write normalize and be confident that it works as expected:

    normalize : ℤ → ℤ
    normalize (mkℤ zero neg)            = mkℤ zero neg
    normalize (mkℤ (suc pos) zero)      = mkℤ (suc pos) zero
    normalize (mkℤ (suc pos) (suc neg)) = normalize (mkℤ pos neg)
Given normalize, we can give an easy definition for _+_ over our “difference integers,” based on the fact that addition distributes over subtraction:

$$(p_1 - n_1) + (p_2 - n_2) = (p_1 + p_2) - (n_1 + n_2).$$
In Agda, this fact looks equivalent, after replacing 𝑎 − 𝑏 with mkℤ a b:
    _+_ : ℤ → ℤ → ℤ
    mkℤ p₁ n₁ + mkℤ p₂ n₂
      = normalize (mkℤ (p₁ ℕ.+ p₂) (n₁ ℕ.+ n₂))

    infixl 5 _+_
Subtraction is similar, but is based instead on the fact that subtraction distributes over addition, that is:

$$\begin{aligned}
(p_1 - n_1) - (p_2 - n_2) &= p_1 - n_1 - p_2 + n_2 \\
&= p_1 + n_2 - n_1 - p_2 \\
&= (p_1 + n_2) - (n_1 + p_2).
\end{aligned}$$

This identity is exactly what’s necessary to implement subtraction in Agda:

    _-_ : ℤ → ℤ → ℤ
    mkℤ p₁ n₁ - mkℤ p₂ n₂
      = normalize (mkℤ (p₁ ℕ.+ n₂) (n₁ ℕ.+ p₂))

    infixl 5 _-_
Finally we come to multiplication, which continues to be implemented by way of straightforward algebraic manipulation. This time we need to multiply two binomials, which we can do by distributing our multiplication across addition twice. In symbols, the relevant equation is:

$$\begin{aligned}
(p_1 - n_1) \times (p_2 - n_2) &= p_1 p_2 - p_1 n_2 - n_1 p_2 + n_1 n_2 \\
&= p_1 p_2 + n_1 n_2 - p_1 n_2 - n_1 p_2 \\
&= (p_1 p_2 + n_1 n_2) - (p_1 n_2 + n_1 p_2).
\end{aligned}$$

Again, in Agda:

    _*_ : ℤ → ℤ → ℤ
    mkℤ p₁ n₁ * mkℤ p₂ n₂
      = normalize
          (mkℤ (p₁ ℕ.* p₂ ℕ.+ n₁ ℕ.* n₂)
               (p₁ ℕ.* n₂ ℕ.+ p₂ ℕ.* n₁))

    infixl 6 _*_
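To see the three identities compute, here is a standalone sketch that repeats the difference-integer definitions in one file and checks them on −3 (encoded as mkℤ 2 5) and 4 (encoded as mkℤ 4 0). The sample values are my own choice for illustration; everything else is copied from the definitions above.

```agda
import Data.Nat as ℕ
open ℕ using (ℕ; zero; suc)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

record ℤ : Set where
  constructor mkℤ
  field
    pos : ℕ
    neg : ℕ

normalize : ℤ → ℤ
normalize (mkℤ zero neg)            = mkℤ zero neg
normalize (mkℤ (suc pos) zero)      = mkℤ (suc pos) zero
normalize (mkℤ (suc pos) (suc neg)) = normalize (mkℤ pos neg)

infixl 5 _+_ _-_
infixl 6 _*_

_+_ : ℤ → ℤ → ℤ
mkℤ p₁ n₁ + mkℤ p₂ n₂ = normalize (mkℤ (p₁ ℕ.+ p₂) (n₁ ℕ.+ n₂))

_-_ : ℤ → ℤ → ℤ
mkℤ p₁ n₁ - mkℤ p₂ n₂ = normalize (mkℤ (p₁ ℕ.+ n₂) (n₁ ℕ.+ p₂))

_*_ : ℤ → ℤ → ℤ
mkℤ p₁ n₁ * mkℤ p₂ n₂ =
  normalize (mkℤ (p₁ ℕ.* p₂ ℕ.+ n₁ ℕ.* n₂) (p₁ ℕ.* n₂ ℕ.+ p₂ ℕ.* n₁))

-- (-3) + 4 = 1: the raw pair is mkℤ 6 5 before normalization.
_ : mkℤ 2 5 + mkℤ 4 0 ≡ mkℤ 1 0
_ = refl

-- (-3) - 4 = -7: the raw pair is mkℤ 2 9.
_ : mkℤ 2 5 - mkℤ 4 0 ≡ mkℤ 0 7
_ = refl

-- (-3) * 4 = -12: the raw pair is mkℤ 8 20.
_ : mkℤ 2 5 * mkℤ 4 0 ≡ mkℤ 0 12
_ = refl
```

Note that every test has to state its expected result in normalized form; mkℤ 6 5 and mkℤ 1 0 denote the same integer but are not equal as values.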
While each and every one of our operations here do in fact work, there is nevertheless something dissatisfying about them—namely, our
requirement that each of _+_, _-_, and _*_ end in a call to normalize. This is by no means the end of the world, but it is inelegant. Ideally, we would like each of these elementary operations to just get the right answer, without needing to run a final pass over each result.

In many ways, what we have built with our difference integers comes from having “computer science brain” as opposed to “math brain.” We built something that gets the right answer, but does it by way of an intermediary computation which doesn’t correspond to anything in the problem domain. There are all these calls to normalize, but it’s unclear what exactly normalize actually means, as opposed to what it computes.

Where this problem will really bite us is when we’d like to start doing proofs. What we’d really like to be able to say is that “these two numbers are the same,” but, given our implementation, all we can say is “these two numbers are the same after a call to normalize.” It is possible to work around this problem, as we will see in sec. 7.9, but the solution is messier than the problem, and is best avoided whenever we are able.

The crux of the matter is that we know what sorts of rules addition, subtraction and multiplication ought to abide by, but it’s much less clear what we should expect of normalize. This function is a computational crutch: nothing more, and nothing less. If we rewind to the point at which we introduced normalize, we realize that this crutch was designed to work around the problem that there are non-unique representations for each number. If we could fix that problem directly, we could avoid normalize and all the issues that arise because of it.
2.13 Unique Integer Representations
The important takeaway from our last two wrong turns is that we should strive for unique representations of our data whenever possible. Let’s take one last misstep in attempting to model the integers before we get to the right tack. Our difference integers went wrong because they were built from two different naturals, which we implicitly subtracted. Perhaps we were on the right track using naturals, and would be more successful if we had only one at a time. So in this attempt, we will again reuse the natural numbers, but now build integers merely by tagging whether that natural is positive or negative:
    module Misstep-Integers₃ where
      open import Data.Nat

      data ℤ : Set where
        +_ : ℕ → ℤ
        -_ : ℕ → ℤ
This approach is much more satisfying than our previous attempt; it allows us to reuse the machinery we wrote for natural numbers, and requires us only to wrap them with a tag. The syntax is a little weird, but recall that the underscores correspond to syntactic “holes,” meaning the following are both acceptable integers:

    _ : ℤ
    _ = - 2

    _ : ℤ
    _ = + 6
Note that the spaces separating - from 2, and + from 6 are necessary. Agda will complain very loudly (and rather incoherently) if you forget them. While this approach dramatically improves on the syntax of integers and eliminates most problems from Misstep-Integers₂, there is still one small issue: zero still has a non-unique representation, as we can encode it as being either positive or negative:

    _ : ℤ
    _ = + 0

    _ : ℤ
    _ = - 0
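Agda really does regard these as two different values. The following standalone sketch repeats the tagged definition and proves the two zeroes unequal; the name two-zeroes is hypothetical, and _≢_ is the standard library’s “not equal” from Relation.Binary.PropositionalEquality, which the text has not introduced at this point.

```agda
open import Data.Nat using (ℕ)
open import Relation.Binary.PropositionalEquality using (_≢_)

data ℤ : Set where
  +_ : ℕ → ℤ
  -_ : ℕ → ℤ

-- The two sides are built from different constructors, so no proof
-- of equality can exist; the absurd pattern () records exactly that.
two-zeroes : + 0 ≢ - 0
two-zeroes ()
```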
Perhaps there are some number systems in which it’s desirable to have (distinct) positive and negative zeroes, but this is not one of them. We are stuck with two uncomfortable options—keep the two zeroes and insist that they are in fact two different numbers, or duplicate all of our proof effort and somehow work in the fact that the two zeroes are different encodings of the same thing. Such a thing can work, but it’s inelegant and pushes our poor decisions down the line to every subsequent user of our numbers.
There really is no good solution here, so we must conclude that this attempt is flawed too. However, it points us in the right direction. Really, the only problem here is our interpretation of the syntax. Recall that the symbols induced by constructors are just names, and so we can rename our constructors in order to change their semantics. This brings us to our final (and correct) implementation for the integers:

    module Sandbox-Integers where
      import Data.Nat as ℕ
      open ℕ using (ℕ)

      data ℤ : Set where
        +_     : ℕ → ℤ
        -[1+_] : ℕ → ℤ
You’ll notice this definition of ℤ is identical to the one from Misstep-Integers₃; the only difference being that we’ve renamed -_ to -[1+_]. This new name suggests that -[1+ n ] corresponds to the number $-(1 + n) = -n - 1$. By subtracting this 1 from all negative numbers, we have removed the possibility of a negative zero. Given this machinery, we can now name three particularly interesting integers:

    0ℤ : ℤ
    0ℤ = + 0

    1ℤ : ℤ
    1ℤ = + 1

    -1ℤ : ℤ
    -1ℤ = -[1+ 0 ]
Of course, we’d still like our suc and pred functions that we postulated our first time around. The constructors are already decided on our behalf, so we’ll have to settle for functions instead:

    suc : ℤ → ℤ
    suc (+ x)          = + ℕ.suc x
    suc -[1+ ℕ.zero  ] = 0ℤ
    suc -[1+ ℕ.suc x ] = -[1+ x ]
If suc’s argument is positive, it makes it more positive. If it’s negative, it makes it less negative, possibly producing zero in the process. Dually, we can define pred which makes its argument more negative:

    pred : ℤ → ℤ
    pred (+ ℕ.zero)  = -1ℤ
    pred (+ ℕ.suc x) = + x
    pred -[1+ x ]    = -[1+ ℕ.suc x ]

2.14 Pattern Synonyms
It might be desirable to negate an integer; turning it negative if it’s positive, and vice versa. -_ is a natural name for this operation, but its implementation is not particularly natural:

    -_ : ℤ → ℤ
    - (+ ℕ.zero)  = 0ℤ
    - (+ ℕ.suc x) = -[1+ x ]
    - -[1+ x ]    = + ℕ.suc x
When converting back and forth from positive to negative, there’s an annoying ℕ.suc that we need to be careful to not forget. This irritant is an artifact of our encoding. We now have the benefit of unique representations for all numbers, but at the cost of the definition not being symmetric between positive and negative numbers. Thankfully, Agda has a feature that can help us work around this problem.

Pattern synonyms allow us to define new constructor syntax for types. While ℤ is and always will be made up of +_ and -[1+_], we can use pattern synonyms to induce other ways of thinking about our data. For example, it would be nice if we could also talk about +[1+_]. This doesn’t give us any new power, as it would always be equivalent to + ℕ.suc x. Nevertheless, our definition of -_ above does include a + (ℕ.suc x) case, so this pattern does seem like it might be useful.

We can define a pattern synonym with the pattern keyword. Patterns look exactly like function definitions, except that they build constructors (highlighted red, and usable in pattern matches) rather than (blue) function definitions.
pattern +[1+_] n = + ℕ.suc n
Let’s also define a pattern for zero:
pattern +0 = + ℕ.zero
These two patterns give us the option to define functions symmetrically with respect to the sign of an integer. Where before in -_ we had to pattern match on two cases, +_ and -[1+_], we can now instead choose to match into three cases: +0, +[1+_] and -[1+_]. Let’s use our new patterns to rewrite -_, leading to a significantly more elegant implementation:

    -_ : ℤ → ℤ
    - +0       = +0
    - +[1+ x ] = -[1+ x ]
    - -[1+ x ] = +[1+ x ]
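The following standalone sketch gathers the integer type, both pattern synonyms, and the rewritten negation into one file, then checks a few facts by refl. The test values are my own; the definitions are repeated from the text.

```agda
import Data.Nat as ℕ
open ℕ using (ℕ)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

data ℤ : Set where
  +_     : ℕ → ℤ
  -[1+_] : ℕ → ℤ

pattern +0       = + ℕ.zero
pattern +[1+_] n = + ℕ.suc n

-_ : ℤ → ℤ
- +0       = +0
- +[1+ x ] = -[1+ x ]
- -[1+ x ] = +[1+ x ]

-- A pattern synonym is usable in expressions too, where it simply
-- expands to the constructors it abbreviates.
_ : +[1+ 2 ] ≡ + 3
_ = refl

-- Negating 3 gives -(1 + 2).
_ : - (+ 3) ≡ -[1+ 2 ]
_ = refl

-- Negating twice gets us back where we started.
_ : - (- (+ 3)) ≡ + 3
_ = refl
```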
What exactly is going on with these pattern synonyms? We haven’t actually changed the constructors of ℤ; we’ve merely extended our type with different ways of thinking about its construction. Behind the scenes, what’s really happening when we write +[1+ n ] is that Agda simply rewrites it by the pattern equation, in this case resulting in + (ℕ.suc n). It’s nothing magic, but it does a lot in terms of ergonomics.

When should we use a pattern instead of a function definition? On the surface, they might seem quite similar. You could imagine we might define +[1+_] not as a pattern, but as a function:

    +[1+_] : ℕ → ℤ
    +[1+_] n = + ℕ.suc n
The problem is that such a definition creates +[1+_] as a function; note its blue color. In Agda, we’re allowed to do pattern matching only on red values, which correspond to constructors and pattern synonyms. On the left side of the equality, Agda changes its namespace, and will only recognize known red identifiers. Anything it doesn’t recognize in this namespace becomes a black binding, rather than a reference to a blue function.

This is a reasonable limitation. In general, function definitions can do arbitrary computation, which would obscure (if not render uncomputable) Agda’s ability to pattern match on the left side of an equality. Thus, blue bindings are strictly disallowed on the pattern matching side. This is the reason behind the existence of pattern synonyms. Pattern definitions are required to be made up of only red constructors
and black bindings on both sides. In doing so, we limit their expressiveness, but because of that limitation, we have restricted them in such a way as to be usable in a pattern context. As a rule of thumb, you should define a pattern synonym whenever you notice yourself frequently using the same series of constructors together. Pattern synonyms are valuable in providing a different lens into how you put your data together.
2.15 Integer Addition
With a satisfactory definition for the integers, and having completed our discussion of pattern synonyms, it is now time to implement addition over ℤ. As usual, we will intentionally go down the wrong (but obvious) path in order to help you develop antibodies to antipatterns.

Our particular misstep this time around will be to “bash” our way through the definition of addition; that is, match on all three of +0, +[1+_] and -[1+_] for both arguments of _+_. There are a few easy cases, like when one side is zero, or when the signs line up on both sides. After filling in the obvious details, we are left with:

    module Naive-Addition where
      _+_ : ℤ → ℤ → ℤ
      +0       + y        = y
      +[1+ x ] + +0       = +[1+ x ]
      +[1+ x ] + +[1+ y ] = +[1+ 1 ℕ.+ x ℕ.+ y ]
      +[1+ x ] + -[1+ y ] = {! !}
      -[1+ x ] + +0       = -[1+ x ]
      -[1+ x ] + +[1+ y ] = {! !}
      -[1+ x ] + -[1+ y ] = -[1+ 1 ℕ.+ x ℕ.+ y ]
It’s not clear exactly how to fill in the remaining holes, however. We must commit to a constructor of ℤ, which mathematically means committing to the sign of the result—but in both cases, the sign depends on whether x or y is bigger. Of course, we could now do a pattern match on each of x and y, and implement integer addition inductively on the size of these natural numbers. That feels unsatisfactory for two reasons—the first is that this function is already doing a lot, and induction on the naturals feels like a different sort of thing than the function is already doing. Our second dissatisfaction here is that the two remaining holes are
symmetric to one another; since we know that $x + y = y + x$, we know that the two holes must be filled in with equivalent implementations. Both of these reasons point to the fact that we need a helper function.

Recall in sec. 2.10 when we implemented the monus operator, which performed truncated subtraction of natural numbers. The only reason it was required to truncate results was that we didn’t have a satisfactory type in which we could encode the result if it went negative. With the introduction of ℤ, we now have room for all of those negatives. Thus, we can implement a version of subtraction whose inputs are the naturals, but whose output is an integer. We’ll call this operation _⊖_, input, as you’d expect, as \o-:

    _⊖_ : ℕ → ℕ → ℤ
    ℕ.zero  ⊖ ℕ.zero  = +0
    ℕ.zero  ⊖ ℕ.suc n = -[1+ n ]
    ℕ.suc m ⊖ ℕ.zero  = +[1+ m ]
    ℕ.suc m ⊖ ℕ.suc n = m ⊖ n
By implementing _+_ in terms of _⊖_, we can factor out a significant portion of the logic. Note that all we care about is whether the signs of the arguments are the same or different, meaning we can avoid the pattern matches on +0 and +[1+_], instead matching only on +_:

    infixl 5 _+_

    _+_ : ℤ → ℤ → ℤ
    + x      + + y      = + (x ℕ.+ y)
    + x      + -[1+ y ] = x ⊖ ℕ.suc y
    -[1+ x ] + + y      = y ⊖ ℕ.suc x
    -[1+ x ] + -[1+ y ] = -[1+ x ℕ.+ ℕ.suc y ]
This new definition of _+_ shows off the flexibility of Agda’s parser. Notice how we’re working with +_ and _+_ simultaneously, and that Agda isn’t getting confused between the two. In fact, Agda is less confused here than we are, as the syntax highlighting on the first line of the definition gives us enough to mentally parse what’s going on. The blue identifier on the left of the equals sign is always the thing being defined, and its arguments must always be red constructors or black bindings. Practice your mental parsing of these definitions, as they will only get harder as we move deeper into abstract mathematics. Having implemented addition is the hard part. We can implement subtraction trivially, via addition of the negative:
    infixl 5 _-_

    _-_ : ℤ → ℤ → ℤ
    x - y = x + (- y)
Last but not least, we can define multiplication, again as repeated addition. It’s a little trickier this time around, since we need to recurse on positive and negative multiplicands, but the cases are rather simple. Multiplication by zero is zero:

    infixl 6 _*_

    _*_ : ℤ → ℤ → ℤ
    x * +0 = +0

Multiplication by either 1 or −1 merely transfers the sign:

    x * +[1+ ℕ.zero ] = x
    x * -[1+ ℕ.zero ] = - x

and otherwise, multiplication can just perform repeated addition or subtraction on one argument, moving towards zero:

    x * +[1+ ℕ.suc y ] = (+[1+ y ] * x) + x
    x * -[1+ ℕ.suc y ] = (-[1+ y ] * x) - x

Thankfully, our hard work is rewarded when the unit tests agree that we got the right answers:

    module Tests where
      open import Relation.Binary.PropositionalEquality

      _ : - (+ 2) * - (+ 6) ≡ + 12
      _ = refl

      _ : (+ 3) - (+ 10) ≡ - (+ 7)
      _ = refl
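The tests above exercise multiplication and subtraction. As a complement, here is a standalone sketch that repeats the definitions of _⊖_ and _+_ in one file and checks the mixed-sign cases that motivated the helper. The sample numbers are my own choice.

```agda
import Data.Nat as ℕ
open ℕ using (ℕ)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

data ℤ : Set where
  +_     : ℕ → ℤ
  -[1+_] : ℕ → ℤ

pattern +0       = + ℕ.zero
pattern +[1+_] n = + ℕ.suc n

_⊖_ : ℕ → ℕ → ℤ
ℕ.zero  ⊖ ℕ.zero  = +0
ℕ.zero  ⊖ ℕ.suc n = -[1+ n ]
ℕ.suc m ⊖ ℕ.zero  = +[1+ m ]
ℕ.suc m ⊖ ℕ.suc n = m ⊖ n

infixl 5 _+_

_+_ : ℤ → ℤ → ℤ
+ x      + + y      = + (x ℕ.+ y)
+ x      + -[1+ y ] = x ⊖ ℕ.suc y
-[1+ x ] + + y      = y ⊖ ℕ.suc x
-[1+ x ] + -[1+ y ] = -[1+ x ℕ.+ ℕ.suc y ]

-- Subtraction of naturals that lands in the integers.
_ : 5 ⊖ 3 ≡ + 2
_ = refl

_ : 3 ⊖ 5 ≡ -[1+ 1 ]
_ = refl

-- 3 + (-5) = -2, in either argument order.
_ : + 3 + -[1+ 4 ] ≡ -[1+ 1 ]
_ = refl

_ : -[1+ 4 ] + + 3 ≡ -[1+ 1 ]
_ = refl

-- (-1) + (-1) = -2
_ : -[1+ 0 ] + -[1+ 0 ] ≡ -[1+ 1 ]
_ = refl
```

Both mixed-sign clauses delegate to _⊖_, which is why the two holes from the naive attempt no longer need separate inductions.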
2.16 Wrapping Up
Our achievements in this chapter are quite marvelous. Not only have we defined the natural numbers and the integers, but we’ve given
their everyday operations, and learned a great deal about Agda in the process. Rather famously, in the Principia Mathematica, Whitehead and Russell took a whopping 379 pages to prove that $1 + 1 = 2$. While we haven’t yet proven this fact, we are well on our way, and will do so in the next chapter when we reflect on the deep nature of proof.

Before closing, we will explicitly list out our accomplishments from the chapter, and export them from the standard library for use in the future. In sec. 2.1 we constructed the natural numbers ℕ and their constructors zero and suc. Addition comes from sec. 2.7, while multiplication and exponentiation come from sec. 2.9. The monus operator _∸_ is from sec. 2.10.
open import Data.Nat using (ℕ; zero; suc; _+_; _*_; _^_; _∸_) public
We also gave definitions for the first four positive naturals:
open Sandbox-Naturals using (one; two; three; four) public
While discussing the natural numbers, we looked at two notions of evenness in sec. 2.5. We’d like to export IsEven and its constructors zero-even and suc-suc-even. For succinctness, however, we’ll rename those constructors to z-even and ss-even by way of a renaming import modifier:

    open Sandbox-Naturals
      using (IsEven)
      renaming ( zero-even    to z-even
               ; suc-suc-even to ss-even
               )
      public
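The same combination of using and renaming modifiers works on any module. As a small sketch against the standard library (the short names z and s are hypothetical, chosen only for this illustration), in a file of its own:

```agda
open import Data.Nat using (ℕ; _+_) renaming (zero to z; suc to s)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

-- Only ℕ, _+_, z and s are in scope from Data.Nat; the original
-- constructor names zero and suc are not.
two : ℕ
two = s (s z)

_ : two + two ≡ 4
_ = refl
```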
In sec. 2.6, we constructed the Maybe type, which we used to wrap functions’ return types in case there is no possible answer. If the function can’t return anything, we use the constructor nothing, but if it was successful, it can use just:

    open import Data.Maybe
      using (Maybe; just; nothing)
      public
Discussing the integers made for an interesting exercise, but we will not need them again in this book, and therefore will not export them. Nevertheless, if you’d like to use them in your own code, you can find all of our definitions in the standard library under Data.Integer.
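For readers who do want the library version, here is a minimal sketch of the earlier unit tests restated against the standard library’s Data.Integer. It assumes the library exports the constructors +_ and -[1+_] and the operations -_, _-_ and _*_ under those names; the extra parentheses are only there to avoid relying on the library’s fixity declarations.

```agda
open import Data.Integer using (+_; -[1+_]; -_; _-_; _*_)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

-- (-2) * (-6) = 12
_ : (- (+ 2)) * (- (+ 6)) ≡ (+ 12)
_ = refl

-- 3 - 10 = -7, which is encoded as -(1 + 6)
_ : (+ 3) - (+ 10) ≡ -[1+ 6 ]
_ = refl
```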
UNICODE IN THIS CHAPTER

₀  U+2080  SUBSCRIPT ZERO (\_0)
₁  U+2081  SUBSCRIPT ONE (\_1)
₂  U+2082  SUBSCRIPT TWO (\_2)
₃  U+2083  SUBSCRIPT THREE (\_3)
ℕ  U+2115  DOUBLE-STRUCK CAPITAL N (\bN)
ℤ  U+2124  DOUBLE-STRUCK CAPITAL Z (\bZ)
→  U+2192  RIGHTWARDS ARROW (\to)
∸  U+2238  DOT MINUS (\.-)
≡  U+2261  IDENTICAL TO (\ )
⊖  U+2296  CIRCLED MINUS (\o-)
CHAPTER 3
Proof Objects
module Chapter3-Proofs where
My first encounter with mathematical proofs was in a first-year university algebra course, where I immediately realized I had no idea what was going on. The reasoning that seemed perfectly convincing to me was much less so to whomever was in charge of assigning my grade. I didn’t do well in that course, or any subsequent ones.

The problem was that I didn’t know what steps needed to be justified, and which could be taken for granted. Thankfully, doing proofs in Agda makes this exceptionally clear: either your proof typechecks, or it doesn’t. In either case, the feedback cycle is extremely quick, and it’s easy to iterate until you’re done.

In this chapter we will take our first looks at what constitutes a proof and see how we can articulate them in Agda. In the process, we will need to learn a little more about Agda’s execution model and begin exploring the exciting world of dependent types.

Prerequisites
open import Chapter1-Agda using (Bool; true; false; _∨_; _∧_; not)
open import Chapter2-Numbers using (ℕ; zero; suc)
3.1 Constructivism
It is worth noting that the mathematics in this book are not the “whole story” of the field. You see, there are two camps in the world of math: the classicists and the constructivists. Much like many religious sects, these two groups have much more in common than they have distinct. In fact, the only distinction between these two groups of truth-seekers is their opinion on the nature of falsities.

The classicists, the vast majority, believe all mathematical statements are partitioned between those which are true, and those which are false. There is simply no middle ground. This opinion certainly doesn’t sound controversial, but it does admit odd tactics for proving things. One common proof technique in the classical camp is to show that something can’t not exist, and therefore to deduce that it does.

Contrast the classicists with the constructivists, who trust their eyes more than they trust logical arguments. Constructivists aren’t happy knowing something merely doesn’t not exist; they’d like to see the thing for themselves. Thus, the constructivists insist that a proof actually build the object in question, rather than just show it must be there with no clues towards actually finding the thing.

In general, there are two ways to mathematically show something exists. The first way is to just build the thing, in a sense “proof by doing.” The other is to show that a world without the thing would be meaningless, and thus show its existence, in some sense, by sheer force of will, because we really don’t want to believe our world is meaningless.

To illustrate this difference, suppose we’d like to prove that there exists a prime number greater than 10. Under a classical worldview, a perfectly acceptable proof would go something like this:

1. Suppose there does not exist any prime number greater than 10.
2. Therefore, the prime factorization of every number must consist only of 2, 3, 5, and 7.
3. If a number $n$ has a prime factor $d$, then $n + 1$ does not have $d$ as a prime factor.
4. The number $2 \times 3 \times 5 \times 7 = 210$ has prime factors of 2, 3, 5, and 7.
5. Therefore, $210 + 1 = 211$ does not have prime factors of 2, 3, 5, or 7.
6. Therefore, 211 has no prime factors.
7. This is a contradiction, because all numbers have prime factors.
8. Therefore, there does exist a prime number greater than 10. ∎

Contrast this against a constructive proof of the same proposition:

1. 11 is divisible by no number between 2 and 10.
2. Therefore, 11 is a prime number.
3. 11 is a number greater than 10.
4. Therefore, there exists a prime number greater than 10. ∎

Classical arguments are allowed to assume the negation, show that it leads to absurdity, and therefore refute the original negation. But constructive arguments are required to build the object in question, and furthermore to take on the burden to show that it satisfies the necessary properties. The classicists will accept a constructive argument, while the constructivists insist on one.

Under a computational setting, constructive arguments are much more compelling than classical ones. This is because constructive arguments correspond to objects we can hold in our hands (or, at least, in memory), while classical arguments can come from counterfactual observations. To put it another way, constructive arguments correspond directly to algorithms.
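The shape of a constructive existence proof, a witness together with evidence about it, can be sketched in Agda. Primality needs machinery not built here, so the sketch below proves a deliberately simpler statement of my own choosing: that some natural number doubles to 22. It assumes the standard library’s dependent pair type Σ from Data.Product, which the text has not introduced at this point; the name half-of-22 is hypothetical.

```agda
open import Data.Nat using (ℕ; _+_)
open import Data.Product using (Σ; _,_)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

-- Statement: "there exists a natural number n such that n + n ≡ 22".
-- A constructive proof is a pair: the witness itself, followed by
-- evidence that the witness has the required property.
half-of-22 : Σ ℕ (λ n → n + n ≡ 22)
half-of-22 = 11 , refl
```

As with the constructive argument about 11, the proof hands over the number itself, and anyone holding the proof can extract it.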
3.2 Statements are Types; Programs are Proofs
Having studied our programming language in sec. 1 and looked at some simple mathematical objects in sec. 2, let’s now turn our focus towards more fundamental mathematical ideas. When most people think of math, their minds go immediately to numbers. But of course, mathematics is a field significantly larger than numbers, and we will touch upon them only briefly in the remainder of this chapter.

But if numbers are not the focus of mathematics, then what is? In the opinion of this author, it’s the process of clear, logical deduction around precise ideas. Numbers are one such precise idea, but they are by no means the only one. Some other common examples are the booleans, graphs, and geometry. Some examples less often considered math are digital circuits and computer programs. Anything you can define precisely and manipulate symbolically can fall under the purview of mathematics when done right.
In math, it’s common to differentiate between statements and theorems. Statements are claims you can make, while theorems are claims you can prove. For example, it’s a perfectly valid statement that $2 = 3$, but such a claim isn’t a theorem under any usual definitions for 2, 3, or equality. Occasionally, you’ll also hear statements called propositions, but this word is amazingly overloaded, and we will not adopt such usage.

Of particular interest to us are theorems, which by necessity are made of two parts: a statement and a proof of that statement. While two mathematicians might disagree on whether they think a statement is true, they must both agree that a theorem is true. That is the nature of a theorem: it comes with a proof, and a proof must be convincing in order to warrant such a name.

There is a very apt analogy to software here. It’s very easy to believe a problem can’t be solved. That is, of course, until someone hands you the algorithm that does it. The algorithm itself is a proof artifact that shows the problem can be done, and it would be insanity to disagree in the presence of such evidence.

It is exactly this analogy that we will exploit for the remainder of this book in order to show the relationship between mathematics and programming. In doing so, we will help programmers use the tools they already have, in order to start being productive in mathematics. But let’s make the connection more formal.

To be very explicit, our analogy equates mathematical statements and types. That is to say, any mathematical statement can be encoded as a type, and every type can be interpreted as a mathematical statement. Furthermore, every proof of a statement corresponds to a program with that type, and every program of that type is a proof of the statement. As an extremely simple example, we can say that the type Bool corresponds to the statement “there exists a boolean.” This is not a particularly strong claim.

Under a constructive lens, we can prove this statement merely by providing a boolean, thus proving at least one exists. Therefore, the term true is a proof of Bool.

Such a simple example doesn’t provide much illumination. Let’s try something more complicated. Recall our IsEven type from sec. 2.5, which we can bring back into scope:

    module Example-ProofsAsPrograms where
      open Chapter2-Numbers
        using (ℕ; IsEven)
3.2. STATEMENTS ARE TYPES; PROGRAMS ARE PROOFS
Every type forms a module containing its constructors and fields, so we can open both ℕ and IsEven to get the relevant constructors out:

  open ℕ
  open IsEven

We can then form a statement asking whether zero is even by constructing the type:

  zero-is-even : IsEven zero

Of course, zero is even, the proof of which we have seen before:

  zero-is-even = zero-even

Because we have successfully implemented zero-is-even, we say that zero-is-even is a theorem, and that it is a proof of IsEven zero. To drive the point home, we can also try asking whether one is an even number:

  one-is-even : IsEven (suc zero)
  one-is-even = ?

However, as we saw in sec. 2.5, there is no way to fill this hole. Therefore, one-is-even cannot be implemented, and so it is not a theorem, even though IsEven (suc zero) is a perfectly acceptable statement.

In the context of values (programs) and types, we will adopt some extra terminology. We say that a type is inhabited if there exists at least one value of that type. Therefore, Bool and IsEven zero are both inhabited, while IsEven (suc zero) is not.

Under a constructive lens, it is exactly those statements for which we have a proof that can be said to be true. In other words, truth is synonymous with being inhabited.

These examples all illustrate the point: while we can always write down the type of something we’d like to prove, we cannot always find a value with that type. Therefore, we say that types correspond to statements, while values are proofs of those statements. In the literature, this concept is known by the name propositions as types, and as the Curry–Howard correspondence.

The Curry–Howard correspondence thus gives us a guiding principle for doing constructive mathematics in a programming language.
We “simply” write down the problem, encoding the statement as a type, and then we work hard to construct a value of that type. In doing so, we show the truth of the original problem statement. Keeping this perspective in mind is the secret to success.
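As a small illustration of this workflow, here is a minimal sketch that states "two is even" as a type and then proves it with a value. It is self-contained, so it redefines stand-ins for ℕ and IsEven rather than importing the chapter 2 module; the constructor name suc-suc-even is an assumption about how that definition might look, not something taken from the text above.

```agda
-- Stand-in definitions so the sketch type checks on its own.
data ℕ : Set where
  zero : ℕ
  suc  : ℕ → ℕ

-- Hypothetical reconstruction of IsEven; only zero-even is named in
-- this chapter, suc-suc-even is an assumed constructor name.
data IsEven : ℕ → Set where
  zero-even    : IsEven zero
  suc-suc-even : {n : ℕ} → IsEven n → IsEven (suc (suc n))

-- Step 1: write the statement down as a type.
two-is-even : IsEven (suc (suc zero))
-- Step 2: construct a value of that type, which is the proof.
two-is-even = suc-suc-even zero-even
```

Since Agda accepts the definition, the type IsEven (suc (suc zero)) is inhabited, and so the statement is a theorem in the sense described above.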
3.3 Hard to Prove or Simply False?
Of course, this abstract discussion around the Curry–Howard isomorphism makes the whole business seem much less messy than it is in practice. It’s one thing to discuss whether a type is inhabited, and a very different thing to actually produce a value for a given type. Every example we’ve seen so far has made the job seem trivial, but attempting to produce an inhabitant has driven many to the brink of madness.

What’s so hard here, you might wonder. The problem is, when you’re not making much progress, it’s hard to tell whether you’re merely taking the wrong approach, or whether the task at hand is literally impossible.

Of the absolute utmost importance in mathematics is the principle of consistency. This is a fancy way of saying “there should be no proof of false things.” Math is a tool for exploring truths about platonic abstractions, and being able to derive a proof of false would be devastating to the entire endeavor.

The reason we care so much about this is that falsities beget falsities. If you ever get your hands on one, you can use it to produce a second. You’ve probably seen the classic “proof” that 1 = 2. It goes like this:

Let 𝑎 = 𝑏, then

  𝑎𝑏 = 𝑎²
  ∴ 𝑎𝑏 − 𝑏² = 𝑎² − 𝑏²
  = (𝑎 + 𝑏)(𝑎 − 𝑏)

However, we can also factor 𝑎𝑏 − 𝑏² as follows:

  𝑎𝑏 − 𝑏² = (𝑎 − 𝑏)𝑏
  = 𝑏(𝑎 − 𝑏)

in which case we know:

  𝑏(𝑎 − 𝑏) = (𝑎 + 𝑏)(𝑎 − 𝑏)
  ∴ 𝑏 = 𝑎 + 𝑏
  = 𝑏 + 𝑏
  = 2𝑏
  ∴ 1 = 2

The actual flaw in reasoning here is when we cancel 𝑎 − 𝑏 from both sides of the equation. Recall that 𝑎 = 𝑏, so 𝑎 − 𝑏 = 0, and thus this is an implicit division by zero.

To see how we can use one false proof to get another, consider now Pythagoras’ famous theorem about the side lengths of right triangles:

  𝑎² + 𝑏² = 𝑐²

But since we have already “proven” that 1 = 2, we can rewrite each exponent 2 as 1, and therefore “derive” the fact that:

  𝑎 + 𝑏 = 𝑐

Whoops! As you can see, as soon as we manage to prove something false, all bets are off. In English, this property is known as the principle of explosion, but you can also call it ex falso quodlibet if you’re feeling particularly regal. All this means is that, given a proof of false, you can subsequently provide a proof of anything. Therefore, contradictions are really, really bad, and a huge chunk of logical development (including computation itself) has arisen from scholars discovering contradictions in less rigorous mathematics than what we use today.

All of this is to say: it’s desirable that it be very difficult to prove something that is false. From afar, this sounds like a very good and righteous desideratum. But when you’re deep in the proof mines, having difficulties eliciting the sought-after proof, it’s often unclear whether you haven’t tried hard enough or whether the problem is impossible outright.

I myself have spent weeks of my life trying to prove a false statement without realizing it. I suspect this is a necessary rite of passage. Nevertheless, I hope to spare you some of the toil wasted on a false proposition. If you’re ever finding a proof to be exceptionally hard, it’s worth taking some time out to prove the proposition for extremely simple, concrete values. For example, when you’re working with numbers, see if it holds when everything is zero or one. Working through the easiest cases by hand will usually point out a fundamental problem if there is one, or might alert you to the fact that you haven’t yet built enough machinery (that is, library code around your particular problem) to make proving things easy.

Remember, you can always prove something the stupid way first, and come back with a better proof later on if you deem it necessary. In proofs as in life, “done” is better than “perfect.”
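To make the "try zero and one first" advice concrete, here is a hedged sketch of what such a sanity check might look like. It assumes the Agda standard library's Data.Nat in place of this book's own number module, and the claim being tested, x ∸ y + y ≡ x, is a made-up example of a plausible-looking statement that happens to be false.

```agda
open import Data.Nat using (ℕ; _+_; _∸_)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

-- Tempting (hypothetical) claim: (x y : ℕ) → x ∸ y + y ≡ x

-- With x = 1 and y = 0 the claim holds:
_ : 1 ∸ 0 + 0 ≡ 1
_ = refl

-- With x = 0 and y = 1 the left side computes to 1 rather than 0,
-- so the general claim cannot be a theorem:
_ : 0 ∸ 1 + 1 ≡ 1
_ = refl
```

Two one-line checks like these cost almost nothing, and the second one shows that no amount of effort would ever have produced a proof of the general statement.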
3.4 The Equality Type
All of this discussion about encoding statements as types is fine and dandy, but how do we go about actually building these types? Usually the technique is to construct an indexed type whose indices constrain what we can do, much like we did with IsEven.

One of the most common mathematical statements (indeed, often synonymous with math in school) is the equation. Equality is the statement that two different objects are, and always were, just one object. There is a wide and raging debate about exactly what equality means, but for the time being we will limit ourselves to the case that the two expressions will eventually evaluate to the exact same tree of constructors.

This particular notion of equality is known as propositional equality and is the basic notion of equality in Agda. As I said, the word “proposition” is extremely overloaded with meanings, and this usage has absolutely nothing to do with the idea of propositions as types discussed earlier. Instead, here and everywhere in the context of Agda, proposition means “inhabited by at most one value.” That is, Bool is not a proposition, because it can be constructed by either of the two booleans. On the other hand, IsEven zero is a proposition, because its only proof is zero-even.

We can define propositional equality by making a type for it. The type should relate two objects, stating that they are equal. Thus it must be indexed by two values. These indices correspond to the two values being related. In order for two things to evaluate to the same constructors, they must have the same type. And because we’d like to define propositional equality once and for all, we will parameterize this equality type by the type of things it relates. Solving all these constraints simultaneously gives us the following data type:

module Definition where
  data _≡_ {A : Set} : A → A → Set where
    refl : {x : A} → x ≡ x

Recall that the ≡ symbol is input as \==. The type of refl here, {x : A} → x ≡ x, says that for any value x we’d like, we know only that x is equal to itself. The name refl is short for reflexivity, which is technical jargon for the property that all things are equal to themselves. We shorten reflexivity to refl because we end up writing this constructor a lot.

That’s the type of refl, which happens to be the only constructor of this data type. But consider the type x ≡ y: it’s a statement that x and y in fact evaluate to the same tree of constructors. Whether or not this is actually true depends on whether the type is actually inhabited, which it is only when x and y both compute to the same thing. It is only in this case that we can convince Agda that refl is an inhabitant, because refl requires that both of the type indices be x.

We’ll play with this type momentarily to get a feeling for it. But first we have a little more bookkeeping to do. In order to play nicely with standard mathematical notation, we’d like _≡_ to bind very loosely, that is to say, to have a low precedence. Furthermore, we do not want _≡_ to associate at all, so we can use infix without a left or right suffix to ensure the syntax behaves as desired.

  infix 4 _≡_
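To see what this fixity declaration buys us, consider the following small sketch. It uses the standard library's _+_ and _≡_ rather than the definitions from this chapter, on the assumption that they carry the fixities infixl 6 and infix 4 respectively.

```agda
open import Data.Nat using (ℕ; _+_)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

-- _+_ binds more tightly (level 6) than _≡_ (level 4),
-- so this is read as (1 + 2) ≡ 3, with no parentheses needed.
_ : 1 + 2 ≡ 3
_ = refl

-- Because _≡_ is declared with plain infix, it does not associate;
-- a chain such as  1 ≡ 1 ≡ 1  would be rejected by the parser.
```

Had _≡_ bound more tightly than _+_, the first type would instead be read as 1 + (2 ≡ 3), which makes no sense at all.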
We have already encountered _≡_ and refl in sec. 1 where we called them “unit tests.” This was a little white lie. In fact, what we were doing before with our “unit tests” was proposing the equality of two terms, and giving refl as a proof that they were in fact equal. Because Agda will automatically do as much computation and simplification as it can, for any two concrete expressions that result in equal constructors, Agda will convince itself of this fact. As a practical technique, we often can (and do) write little unit tests of this form. But, as we will see in a moment, we can use propositional equality to assert much stronger claims than unit tests are capable of determining.

Let’s play around with our equality type to get a feel for what it can do.

module Playground where
  open import Relation.Binary.PropositionalEquality
    using (_≡_; refl)
  open Chapter2-Numbers

We should not be surprised that Agda can determine that two syntactically-identical terms are equal:

  _ : suc (suc (suc zero)) ≡ suc (suc (suc zero))
  _ = refl

As we saw before, Agda will also expand definitions, meaning we can trivially show that:

  _ : three ≡ suc (suc (suc zero))
  _ = refl

Agda can also do this if the definitions require computation, as is the case for _+_:

  _ : three ≡ one + two
  _ = refl

Each of these examples is of the “unit test” variety. But perhaps you’ll be delighted to learn that we can also use propositional equality to automatically show some algebraic identities, that is, two mathematical expressions with variables that are always equal.

For starters, we’d like to prove the following simple identity:

  0 + 𝑥 = 𝑥

Our familiarity with math notation in situations like these can actually be a burden to understanding. While we will readily admit the truth of this statement, it’s less clear what exactly it’s saying, as what variables are is often fuzzy. I like this example, because working it through helped me conceptualize things I’d been confused about for decades.

What 0 + 𝑥 = 𝑥 is really saying is that for any 𝑥, it is the case that 0 + 𝑥 = 𝑥. Mathematicians are infuriatingly implicit about what and when they are quantifying over, and a big chunk of internalizing math is just getting a feel for how the quantification works.

Phrased in this way, we can think of the identity 0 + 𝑥 = 𝑥 instead as a function which takes a parameter x and returns a proof that, for that exact argument, 0 + x ≡ x. Thus:
  0+x≡x : (x : ℕ) → zero + x ≡ x
  0+x≡x = ?

In order to give a proof of this fact, we must bind the parameter on the left side of the equals (in fact, we don’t even need to give it a name), and can simply give refl on the right side:

  0+x≡x : (x : ℕ) → zero + x ≡ x
  0+x≡x _ = refl

Our examples thus far seem to indicate that _≡_ can automatically show all of the equalities we’d like. But this is due only to careful planning on my part. Try as we might, Agda will simply refuse to typecheck the analogous identity 𝑥 + 0 = 𝑥:

  x+0≡x : (x : ℕ) → x + zero ≡ x
  x+0≡x _ = refl

complaining that:

  x + zero != x of type ℕ
  when checking that the expression refl has type x + zero ≡ x

Inspecting the error message here is quite informative; Agda tells us that x + zero is not the same thing as x. What exactly does it mean by that? In sec. 1.14 we discussed what happens when an expression gets stuck. Recall that Agda computes by way of matching expressions on which constructors they evaluate to. But we defined _+_ by induction on its first argument, and in this case, the first argument is simply x. Thus the expression x + zero is stuck, which is why Agda can’t work out whether refl is an acceptable constructor to use here.

We can solve this, like most other problems of stuckness, simply by pattern matching on the stuck variable:

  x+0≡x : (x : ℕ) → x + zero ≡ x
  x+0≡x zero    = {! !}
  x+0≡x (suc x) = {! !}

Immediately, Agda gets unstuck. Our first hole here now has type zero ≡ zero, which is trivially solved by refl:

  x+0≡x : (x : ℕ) → x + zero ≡ x
  x+0≡x zero    = refl
  x+0≡x (suc x) = {! !}

This second goal here is harder, however. Its type suc (x + zero) ≡ suc x has arisen from instantiating the original parameter at suc x. Thus we are trying to show suc x + zero ≡ suc x, which Agda has reduced to suc (x + zero) ≡ suc x by noticing the leftmost argument to _+_ is a suc constructor.

Looking closely, this goal is almost exactly the type of x+0≡x itself, albeit with a suc tacked onto either side. If we were to recurse, we could get a proof of x + zero ≡ x, which it then seems plausible that we could massage into the right shape.

Let’s pause on our definition of x+0≡x for a moment, in order to work out this problem of fitting a suc into a proof-shaped hole.
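The reductions described in this section can be replayed against a self-contained definition of addition. The sketch below redefines ℕ, _≡_ and _+_ from scratch; the two clauses of _+_ are an assumption about how chapter 2 wrote them, consistent with the remark that addition is defined by induction on its first argument. The names zero-clause and suc-clause are made up for this illustration.

```agda
data ℕ : Set where
  zero : ℕ
  suc  : ℕ → ℕ

data _≡_ {A : Set} : A → A → Set where
  refl : {x : A} → x ≡ x

infixl 6 _+_
infix  4 _≡_

-- Assumed definition: recursion on the FIRST argument only.
_+_ : ℕ → ℕ → ℕ
zero  + y = y
suc x + y = suc (x + y)

-- The first clause fires, so zero + x reduces to x and refl suffices.
zero-clause : (x : ℕ) → zero + x ≡ x
zero-clause _ = refl

-- The second clause fires once, peeling off the suc. What remains,
-- x + zero, is stuck because x is not yet known to be zero or suc.
suc-clause : (x : ℕ) → suc x + zero ≡ suc (x + zero)
suc-clause _ = refl
```

Both proofs are refl because Agda only has to follow a clause of _+_. Anything that needs to look inside the stuck x + zero requires the pattern match discussed above.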
3.5 Congruence
At first blush, we are trying to solve the following problem:

  postulate
    _ : (x : ℕ) → x + zero ≡ x → suc (x + zero) ≡ suc x

which we read as “for any number x : ℕ, we can transform a proof of x + zero ≡ x into a proof of suc (x + zero) ≡ suc x.” While such a thing is perfectly reasonable, it feels like setting the bar too low. Surely we should be able to show the more general solution that:

  postulate
    _ : {x y : ℕ} → x ≡ y → suc x ≡ suc y

read informally as “if x and y are equal, then so too are suc x and suc y.”

Notice that while x was an explicit parameter to the previous formulation of this idea, we here have made it implicit. Since there is no arithmetic required, Agda is therefore able to unambiguously determine which two things we’re trying to show are equal. And why do something explicitly if the computer can figure it out on our behalf?

Phrased this way, perhaps our goals are still too narrow. Recall that propositional equality means “these two values evaluate to identical forms,” which is to say that, at the end of the day, they are indistinguishable. If two things are indistinguishable, then there must not be any way that we can distinguish between them, including looking at the result of a function call. Therefore, we can make the much stronger claim that “if x and y are equal, then so too are f x and f y for any function f!”

Now we’re really cooking with gas. This property is known as congruence, which again gets shortened to cong due to its frequency. The type of cong is rather involved, but most of the work involved is binding the relevant variables.

  cong
      : {A B : Set}  -- 1
      → {x y : A}    -- 2
      → (f : A → B)  -- 3
      → x ≡ y        -- 4
      → f x ≡ f y    -- 5
  cong f x≡y = ?

The proper way to read this type is, from top to bottom:

1. For any types A and B,
2. and for any values x and y, both of type A, then
3. for any function f : A → B,
4. given a proof that x ≡ y,
5. it is the case that f x ≡ f y.

Another way of reading the type of congruence is that it allows us to “transport” a proof from the input-side of a function over to the output-side.

Actually proving cong is surprisingly straightforward. We already have a proof that x ≡ y. When we pattern match on this value, Agda is smart enough to replace every y in scope with x, since we have already learned that x and y are exactly the same thing. Thus, after a MakeCase with argument x≡y (C-c C-c in Emacs and VS Code):

  cong
      : {A B : Set}
      → {x y : A}
      → (f : A → B)
      → x ≡ y
      → f x ≡ f y
  cong f refl = {! !}

our new goal has type f x ≡ f x, which is filled trivially by a call to refl.

  cong
      : {A B : Set}
      → {x y : A}
      → (f : A → B)
      → x ≡ y
      → f x ≡ f y
  cong f refl = refl
Popping the stack, recall that we were looking for a means of completing the following proof:

  x+0≡x : (x : ℕ) → x + zero ≡ x
  x+0≡x zero    = refl
  x+0≡x (suc x) = {! !}

The hole here has type suc (x + zero) ≡ suc x, which we can use cong to help with. Congruence requires a function f that is on both sides of the equality, which in this case means we must use suc. Therefore, we can fill our hole with:

  x+0≡x : (x : ℕ) → x + zero ≡ x
  x+0≡x zero    = refl
  x+0≡x (suc x) = {! cong suc !}

and ask Agda to Refine (C-c C-r in Emacs and VS Code), which will result in:

  x+0≡x : (x : ℕ) → x + zero ≡ x
  x+0≡x zero    = refl
  x+0≡x (suc x) = cong suc {! !}

Notice how Agda has taken our suggestion for the hole, applied it, and left a new hole for the remaining argument to cong. This new hole has type x + zero ≡ x, which is exactly the type of x+0≡x itself. We can ask Agda to fill in the rest of the definition for us by invoking Auto (C-c C-a in Emacs and VS Code):

  x+0≡x : (x : ℕ) → x + zero ≡ x
  x+0≡x zero    = refl
  x+0≡x (suc x) = cong suc (x+0≡x x)

Congruence is an excellent tool for doing induction in proofs. You can do induction as normal, but the resulting proof from the recursive step is usually not quite what you need. Luckily, the solution is often just a cong away.
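The same "recurse, then wrap in cong suc" shape shows up in many other proofs. As a hedged sketch, here is a second lemma proven the same way, using the standard library's naturals and cong instead of this chapter's own definitions. The name +-suc′ is invented for this example.

```agda
open import Data.Nat using (ℕ; zero; suc; _+_)
open import Relation.Binary.PropositionalEquality using (_≡_; refl; cong)

-- A suc on the right-hand argument can be pulled out of an addition.
+-suc′ : (x y : ℕ) → x + suc y ≡ suc (x + y)
-- Base case: both sides reduce to suc y.
+-suc′ zero    y = refl
-- Inductive case: the goal reduces to
--   suc (x + suc y) ≡ suc (suc (x + y)),
-- which is the recursive call with a suc on either side.
+-suc′ (suc x) y = cong suc (+-suc′ x y)
```

Just as with x+0≡x, the recursive call gives us something almost right, and cong suc supplies the missing constructor on both sides.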
3.6 Identity and Zero Elements

A common algebraic structure is the idea of an identity element (annoyingly, “identity” in a different sense than in algebraic identity). An identity element is a value which doesn’t change the answer when applied as a function argument. That’s a very abstract sentence, so let’s dive into it in more detail.

Consider addition. As we saw in the previous section, whenever you add zero, you don’t change the result. That is to say that zero is an identity element for the addition function. Since 𝑥 + 0 = 𝑥, zero is a right identity for addition, and because 0 + 𝑥 = 𝑥, zero is a left identity too.

Identity values are tied to specific functions. Notice that multiplication by zero definitely changes the result, and so zero is not an identity for multiplication. We do, however, have an identity for multiplication: it’s just one instead.

Identities are extremely important in algebra, because spotting one means we can simplify an expression.

In Agda, proofs about identities are often given standard names, with the -identityˡ and -identityʳ suffixes (input as \^l and \^r respectively). We prepend the function name to these, so the proof that 0 is a left identity for _+_ should be named +-identityˡ. Therefore, let’s give better names to our functions from earlier:

  +-identityˡ : (x : ℕ) → zero + x ≡ x
  +-identityˡ = 0+x≡x

  +-identityʳ : (x : ℕ) → x + zero ≡ x
  +-identityʳ = x+0≡x

The attentive reader might question why exactly we need +-identityˡ, since its fully-normalized definition is just refl, which is to say that it’s something Agda can work out for itself without explicitly using +-identityˡ. While that is true, it is an implementation detail. If we were to not expose +-identityˡ, the user of our proof library would be required to understand for themselves exactly how addition is implemented. It doesn’t seem too onerous, but in the wild, we’re dealing with much more complicated objects.

Instead, we content ourselves in exposing “trivial” proofs like +-identityˡ with the understanding that it is the name of this proof that is important. Throughout your exposure to the Agda standard library, you will find many such-named functions, and the conventions will help you find the theorems you need without needing to dig deeply into each implementation.

In addition to addition, multiplication also enjoys both left and right identities as we have seen. A good exercise is to prove both.
Exercise (Easy) Prove that 1 × 𝑎 = 𝑎.

Solution

  *-identityˡ : (x : ℕ) → 1 * x ≡ x
  *-identityˡ zero    = refl
  *-identityˡ (suc x) = cong suc (+-identityʳ x)

Exercise (Easy) Prove that 𝑎 × 1 = 𝑎.

Solution

  *-identityʳ : (x : ℕ) → x * 1 ≡ x
  *-identityʳ zero    = refl
  *-identityʳ (suc x) = cong suc (*-identityʳ x)

Addition and multiplication aren’t the only operations we’ve seen that have identities. Both monus and exponentiation also have identities, but they are not two-sided. For example, zero is a right identity for monus:

  ∸-identityʳ : (x : ℕ) → x ∸ 0 ≡ x
  ∸-identityʳ _ = refl

but it is not a left identity. As it happens, the monus operation does not have a left identity, a fact we will prove in sec. 6.6.

Exercise (Easy) Find and prove an identity element for exponentiation.

Solution

  ^-identityʳ : (x : ℕ) → x ^ 1 ≡ x
  ^-identityʳ zero    = refl
  ^-identityʳ (suc x) = cong suc (*-identityʳ x)
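The one-sidedness of these two identities is easy to check on concrete numbers. The following sketch assumes the standard library's _∸_ and _^_, which I expect to behave like the chapter 2 definitions but have not compared against them.

```agda
open import Data.Nat using (ℕ; _∸_; _^_)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

-- Zero on the LEFT of monus does not give back the other argument:
-- the result is 0, not 3.
_ : 0 ∸ 3 ≡ 0
_ = refl

-- One on the LEFT of exponentiation does not give back 3 either.
_ : 1 ^ 3 ≡ 1
_ = refl
```

These checks only show that 0 and 1 fail as left identities for monus and exponentiation respectively; they say nothing about whether some other value might work.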
Identities are not limited to numeric operations. For example, false is both a left and right identity for _∨_, as we can show:

  ∨-identityˡ : (x : Bool) → false ∨ x ≡ x
  ∨-identityˡ _ = refl

  ∨-identityʳ : (x : Bool) → x ∨ false ≡ x
  ∨-identityʳ false = refl
  ∨-identityʳ true  = refl

Exercise Prove analogous facts about the boolean AND function _∧_.

Solution

  ∧-identityˡ : (x : Bool) → true ∧ x ≡ x
  ∧-identityˡ _ = refl

  ∧-identityʳ : (x : Bool) → x ∧ true ≡ x
  ∧-identityʳ false = refl
  ∧-identityʳ true  = refl

While identity elements might seem unexciting and pointless right now, they are an integral part of a rich computational structure that we will study in sec. 7.2. For the time being, we will remark only that the discovery of the number zero was a marvelous technological achievement in its day.

Beyond identities, some operations also have the notion of a zero element, or annihilator. An annihilator is an element which dominates the computation, forcing the return value to also be the annihilator. The most familiar example of a zero element is literally zero for multiplication: whenever you multiply by zero you get back zero! Like identities, zero elements can have a chirality and apply to one or both sides of a binary operator. Zero is both a left and right zero for multiplication:

  *-zeroˡ : (x : ℕ) → zero * x ≡ zero
  *-zeroˡ _ = refl

  *-zeroʳ : (x : ℕ) → x * zero ≡ zero
  *-zeroʳ zero    = refl
  *-zeroʳ (suc x) = *-zeroʳ x

The name “zero element” can be misleading. Zero elements can exist for non-numeric functions, but the potential confusion doesn’t end there. Many less type-safe languages have a notion of falsey values, that is, values which can be implicitly converted to a boolean, and elicit false when doing so. The number 0 is a prototypical example of a falsey value, which unfortunately causes people to equivocate between zero and false. At risk of stating the obvious, falsey values do not exist in Agda, and more generally should be considered a poison for the mind.

I bring up falsey values only to disassociate zero from false in your mind. In the context of the _∨_ function, it is true that is the zero element:

  ∨-zeroˡ : (x : Bool) → true ∨ x ≡ true
  ∨-zeroˡ _ = refl

  ∨-zeroʳ : (x : Bool) → x ∨ true ≡ true
  ∨-zeroʳ false = refl
  ∨-zeroʳ true  = refl

Annihilators can dramatically simplify a proof. If you can spot one lurking, you know its existence must immediately trivialize a subexpression, reducing it to a zero. Often recursively.
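Here is a small sketch of an annihilator trivializing a subexpression. It leans on the standard library's *-zeroʳ from Data.Nat.Properties (which has the same statement as the proof above) and on cong; the lemma name drop-zero is made up for the occasion.

```agda
open import Data.Nat using (ℕ; _+_; _*_)
open import Data.Nat.Properties using (*-zeroʳ)
open import Relation.Binary.PropositionalEquality using (_≡_; cong)

-- The subexpression x * 0 collapses to 0, after which 0 + y
-- reduces to y on its own.
drop-zero : (x y : ℕ) → x * 0 + y ≡ y
drop-zero x y = cong (_+ y) (*-zeroʳ x)
```

Strictly speaking the cong call produces a proof of x * 0 + y ≡ 0 + y, but Agda normalizes 0 + y to y when checking it against the stated type.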
Exercise Find an analogous annihilator for _∧_.

Solution

  ∧-zeroˡ : (x : Bool) → false ∧ x ≡ false
  ∧-zeroˡ _ = refl

  ∧-zeroʳ : (x : Bool) → x ∧ false ≡ false
  ∧-zeroʳ false = refl
  ∧-zeroʳ true  = refl
3.7 Symmetry and Involutivity

In the previous section, we proved the elementary fact *-identityˡ, stating that 1 × 𝑎 = 𝑎. Given that we now have that proof under our belts, how challenging do you expect it to be to prove 𝑎 = 1 × 𝑎?

The obvious idea is to try simply to reuse our *-identityˡ proof, as in:

  *-identityˡ′ : (x : ℕ) → x ≡ 1 * x
  *-identityˡ′ = *-identityˡ

Unfortunately, Agda is unhappy with this definition, and it complains:

  x + 0 * x != x of type ℕ
  when checking that the expression *-identityˡ has type
  (x : ℕ) → x ≡ 1 * x

Something has gone wrong here, but the error message isn’t particularly elucidating. Behind the scenes, Agda is trying to massage the type of *-identityˡ into the type of *-identityˡ′. Let’s work it through for ourselves to see where exactly the problem arises.

Remember that we defined _≡_ for ourselves, and therefore that it can’t have any special support from the compiler. As far as Agda is concerned _≡_ is just some type, and has nothing to do with equality. Anything we’d expect to hold true of equality is therefore something we have to prove for ourselves, rather than expect Agda to do on our behalf.

So to see the problem, we begin with the type 1 * x ≡ x from *-identityˡ. Then, we try to assign a value with this type to the definition of *-identityˡ′, which we’ve said has type x ≡ 1 * x. Agda notices that these are not the same type, and kicks off its unification algorithm in an attempt to line up the types.

During unification, Agda is attempting to combine these two types:

• 1 * x ≡ x, and
• x ≡ 1 * x

which it does by attempting to show that both left-hand sides of _≡_ compute to the same thing, and similarly for both right-hand sides. More generally, if Agda is trying to unify a ≡ b and c ≡ d, it will try to show that a ~ c and b ~ d, where ~ means “simplifies down to identical syntactic forms.”

Perhaps you already see where things are going wrong. Agda attempts to unify our two propositional equality types, and in doing so, reduces down to two unification problems. From the left-hand sides, it gets 1 * x ~ x, and from the right-hand sides, x ~ 1 * x. Of course, the two sides of each unification problem are not syntactically identical, which is exactly why we wanted to prove their equality in the first place.

Unfortunately, there is no way we can add *-identityˡ to some sort of “global proof bank” and have Agda automatically solve the equality on our behalf. Instead, we resign ourselves to the fact that we will need a different approach to implement *-identityˡ′.

The next obvious solution is to just write out our proof of 𝑎 = 1 × 𝑎 again, pattern match and all. The original implementation of *-identityˡ was, if you will recall:

  *-identityˡ : (x : ℕ) → 1 * x ≡ x
  *-identityˡ zero    = refl
  *-identityˡ (suc x) = cong suc (+-identityʳ x)

If we wanted just to rewrite this proof with the propositional equality flipped around, we notice something goes wrong:

  *-identityˡ′ : (x : ℕ) → x ≡ 1 * x
  *-identityˡ′ zero    = refl
  *-identityˡ′ (suc x) = cong suc (+-identityʳ x)

  x + zero != x of type ℕ
  when checking that the expression +-identityʳ x has type
  x ≡ x + 0 * suc x

It’s the same problem we had before, except now the error comes from our use of +-identityʳ! This puts us in an annoyingly recursive bind; in order to flip the arguments on either side of _≡_, must we really reimplement *-identityˡ, +-identityʳ, and every proof in their transitive call graph?

By Newton’s good grace, thankfully the answer is a resounding no! What we are missing here is a conceptual piece of the puzzle. Recall that propositional equality itself proves that the two things on either side of _≡_ are in fact just one thing. That is, once we’ve pattern matched on refl : x ≡ y, there is no longer a distinction between x and y! We can exploit this fact to flip any propositional equality proof, via a new combinator sym:

  sym : {A : Set} → {x y : A} → x ≡ y → y ≡ x
  sym refl = refl

Rather underwhelming once you see it, isn’t it? After we pattern match on refl, we learn that x and y are the same thing, so our goal becomes x ≡ x, which we can solve with refl. From there, Agda is happy to rewrite the left side as y, since it knows that’s just a different name for x anyway. Thank goodness.

Wondering what strange word sym is short for? Symmetry is the idea that a relation doesn’t distinguish between its left and right arguments. We’ll discuss relations in more generality in sec. 4, but all you need to know for now is that equality is a relation. As usual, we shorten “symmetry” to sym due to its overwhelming ubiquity in proofs.

Returning to the problem of *-identityˡ′, sym now gives us a satisfying, general-purpose tool for its implementation:

  *-identityˡ′ : (x : ℕ) → x ≡ 1 * x
  *-identityˡ′ x = sym (*-identityˡ x)
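The same trick rescues any other proof that is stated "the wrong way around." As a hedged sketch, here is the flipped version of the right identity for addition, written against the standard library's +-identityʳ and sym rather than the ones defined in this chapter. The primed name is invented for this example.

```agda
open import Data.Nat using (ℕ; _+_)
open import Data.Nat.Properties using (+-identityʳ)
open import Relation.Binary.PropositionalEquality using (_≡_; sym)

-- +-identityʳ x proves x + 0 ≡ x; sym turns it around.
+-identityʳ′ : (x : ℕ) → x ≡ x + 0
+-identityʳ′ x = sym (+-identityʳ x)
```

Note that we never had to look at how +-identityʳ is implemented, which is exactly the recursive bind that sym was meant to get us out of.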
Because sym swaps which of its arguments is on the left and which is on the right, we should expect that applying sym twice should get us back to where we started. Is this so? We could try to ponder the question deeply, but instead we remember that we’re now capable of doing computer-aided mathematics, and the more interesting question is whether we can prove it.

In fact we can! The hardest part is laying down the type, which we’d like to work for any propositional equality term, regardless of the specific types involved. Thus we must bind A : Set to quantify over the type of the proof, and then we must bind x : A and y : A for the particular arguments on either side of the equals sign:

  sym-involutive
      : {A : Set} → {x y : A}
      → (p : x ≡ y) → sym (sym p) ≡ p
  sym-involutive = ?

The proof here is simple and satisfying, and is left as an exercise to the reader.

Exercise (Trivial) Prove sym-involutive.

Solution

  sym-involutive
      : {A : Set} {x y : A}
      → (p : x ≡ y) → sym (sym p) ≡ p
  sym-involutive refl = refl

An involution is any operation that gets you back to where you started after two invocations. In other words, it’s a self-canceling operation. Another involution we’ve already run into is not:

  not-involutive : (x : Bool) → not (not x) ≡ x
  not-involutive false = refl
  not-involutive true  = refl

Throughout this book, we will encounter more and more algebraic properties like involutivity, symmetry, and identity elements. In fact, I would strongly recommend jotting them down somewhere to keep as a handy cheat-sheet. The road to success as a new mathematician is simply to not get crushed under all of the jargon. The ideas are often easy enough, but there are an awful lot of things you need to simultaneously keep in your head. Discovering new abstractions like these allows you to reuse your entire existing vocabulary and understanding, transplanting those ideas into the new area, which means you can hit the ground running. Indeed, much to the surprise of traditionally-educated people, mathematics is much more about things like identity elements and involutivity than it ever was about numbers.
3.8 Transitivity

Proofs, much like computer programs, are usually too big to build all in one go. Just like in software, it’s preferable to build small, reusable pieces, which we can later combine together into the desired product.

Blatant in its absence, therefore, is a means of actually composing these proofs together. Much like how there are many ways of combining software, we also have many ways of gluing together proofs. The most salient however is analogous to the implicit semicolon in many procedural languages, allowing us to tack one proof immediately onto the tail of another.

This is something you already know, even if you don’t know that you know it. For example, consider the following symbolic proof:

(𝑎 + 𝑏) × 𝑐 = 𝑎𝑐 + 𝑏𝑐
𝑎𝑐 + 𝑏𝑐 = 𝑎𝑐 + 𝑐𝑏
𝑎𝑐 + 𝑐𝑏 = 𝑐𝑏 + 𝑎𝑐

This series of equations has the property that the right hand of each equation is the same as the left hand of the subsequent line. As a notational convenience, we therefore usually omit all but the first left hand side, as in:

(𝑎 + 𝑏) × 𝑐 = 𝑎𝑐 + 𝑏𝑐
            = 𝑎𝑐 + 𝑐𝑏
            = 𝑐𝑏 + 𝑎𝑐

Finally, it’s implied that only the first and last expressions in this equality chain are relevant, with everything in between being “accounting” of a sort. Therefore, having done the work, we can omit all of the intermediary computation, and simply write:

(𝑎 + 𝑏) × 𝑐 = 𝑐𝑏 + 𝑎𝑐
Notice how we have now constructed an equality of two rather disparate expressions, simply by chaining together smaller equalities end on end, like dominoes. This property of equality—that we’re allowed to do such a thing in the first place—is called transitivity, and can be stated as:

trans
  : {A : Set} {x y z : A}
  → x ≡ y
  → y ≡ z
  → x ≡ z

In other words, trans takes a proof that x ≡ y and a proof that y ≡ z, and gives us back a proof that x ≡ z. In order to prove such a thing, we take a page out of the sym book, and pattern match on both proofs, allowing Agda to unify z and y, before subsequently unifying y and x:
trans refl refl = refl
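As a small sanity check of the definition above, here is a self-contained sketch of trans used as a building block. It assumes Agda's builtin equality module rather than this book's own _≡_, and the names trans₃ and trans-reflˡ are hypothetical helpers introduced purely for illustration:

```agda
open import Agda.Builtin.Equality using (_≡_; refl)

trans : {A : Set} {x y z : A} → x ≡ y → y ≡ z → x ≡ z
trans refl refl = refl

-- Hypothetical helper: glue three equalities end on end.
trans₃ : {A : Set} {w x y z : A} → w ≡ x → x ≡ y → y ≡ z → w ≡ z
trans₃ p q r = trans p (trans q r)

-- Hypothetical lemma: putting refl on the left of a chain changes nothing.
trans-reflˡ : {A : Set} {x y : A} (p : x ≡ y) → trans refl p ≡ p
trans-reflˡ refl = refl
```

Because trans matches on both of its arguments, trans refl p stays stuck until p itself is matched against refl, which is why the last lemma needs a pattern match rather than a bare refl.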
We can use transitivity to help us prove less-fundamental properties about things. For example, we might like to show 𝑎¹ = 𝑎 + (𝑏 × 0). This isn’t too tricky to do with pen and paper:

𝑎¹ = 𝑎
   = 𝑎 + 0
   = 𝑎 + (𝑏 × 0)

Let’s write this as a proposition:

a^1≡a+b*0 : (a b : ℕ) → a ^ 1 ≡ a + (b * zero)
a^1≡a+b*0 a b = ?
Of course, we can always prove something by doing the manual work of pattern matching on our inputs. But we’d prefer not to whenever possible, as pattern matching leaves you deep in the weeds of implementation details. Proof by pattern matching is much akin to programming in assembly—you can get the job done, but it requires paying attention to much more detail than we’d like.

Instead, we’d like to prove the above proposition out of reusable pieces. In fact, we’ve already proven each of the individual steps—^-identityʳ, +-identityʳ, and *-zeroʳ correspond to the salient steps on each line of the proof. So let’s do it. Because we’d like to glue together some existing proofs, we begin with a call to trans:

a^1≡a+b*0 : (a b : ℕ) → a ^ 1 ≡ a + b * 0
a^1≡a+b*0 a b
  = trans ? ?
This call to trans shows up in a deep saffron background. Despite being new, this is nothing to worry about; it’s just Agda’s way of telling us it doesn’t yet have enough information to infer all of the invisible arguments—but our next move will sort everything out. We will follow our “pen and paper” proof above, where our first step was that 𝑎¹ = 𝑎, which we called ^-identityʳ a:

a^1≡a+b*0 : (a b : ℕ) → a ^ 1 ≡ a + b * 0
a^1≡a+b*0 a b
  = trans (^-identityʳ a) ?
Our goal now has the type a ≡ a + b * zero, which we’d like to simplify and implement in two steps. Thus, we use another call to trans—this time to assert the fact that 𝑎 = 𝑎 + 0. We don’t have a proof of this directly, but we do have the opposite direction via +-identityʳ a. Symmetry will help us massage our sub-proof into the right shape:

a^1≡a+b*0 : (a b : ℕ) → a ^ 1 ≡ a + b * 0
a^1≡a+b*0 a b
  = trans (^-identityʳ a)
  ( trans (sym (+-identityʳ a))
          ?
  )
We are left with a goal whose type is a + zero ≡ a + b * zero. While we know that *-zeroʳ b shows 𝑏 × 0 = 0, and thus that sym (*-zeroʳ b) gives 0 = 𝑏 × 0, we are left with the problem of getting this evidence into the right place. Whenever you have a proof for a subexpression, you should think cong—dropping it in place with a hole for its first argument and your subexpression proof as its second:

a^1≡a+b*0 : (a b : ℕ) → a ^ 1 ≡ a + b * 0
a^1≡a+b*0 a b
  = trans (^-identityʳ a)
  ( trans (sym (+-identityʳ a))
  ( cong ? (sym (*-zeroʳ b))
  ) )
Congruence is highly-parameterized, and therefore often introduces unconstrained metas while working interactively. As before, it’s nothing to worry about. Our final hole in this implementation is a function responsible for “targeting” a particular subexpression in the bigger hole. Recall that here we have a + zero, and we would like to rewrite zero as b * zero. Thus, our function should target the zero in the expression a + zero. In order to do so, we must give a function that changes the zero, leaving the remainder of our expression alone. We can introduce a function via a lambda:

a^1≡a+b*0 : (a b : ℕ) → a ^ 1 ≡ a + b * 0
a^1≡a+b*0 a b
  = trans (^-identityʳ a)
  ( trans (sym (+-identityʳ a))
  ( cong (λ φ → ?) (sym (*-zeroʳ b))
  ) )

The lambda (λ) here is input as \Gl, while the phi (φ) is \Gf. We are required to use the lambda symbol, as it’s Agda syntax, but we chose phi only out of convention—feel free to pick any identifier here that you’d instead prefer. A useful trick for filling in the body of cong’s targeting function is to copy the expression you had before, and replace the bit you’d like to change with the function’s input—φ in our case. Thus:

a^1≡a+b*0 : (a b : ℕ) → a ^ 1 ≡ a + b * 0
a^1≡a+b*0 a b
  = trans (^-identityʳ a)
  ( trans (sym (+-identityʳ a))
          (cong (λ φ → a + φ) (sym (*-zeroʳ b)))
  )
Like always, we can rewrite our lambda λ φ → a + φ by “canceling” the φ on both sides. By writing this as a section, we get the slightly terser form a +_, giving rise to a shorter implementation:

a^1≡a+b*0′ : (a b : ℕ) → a ^ 1 ≡ a + b * 0
a^1≡a+b*0′ a b
  = trans (^-identityʳ a)
  ( trans (sym (+-identityʳ a))
          (cong (a +_) (sym (*-zeroʳ b)))
  )

Throughout this book, we will use this second notation whenever the subexpression is easy to target, choosing an explicit lambda if the subexpression is particularly nested. Using an explicit lambda always works, but we can’t always get away with using the shorter form. That being said, both forms are equivalent, and you may choose whichever you prefer in your own code. However, by virtue of this presentation being a book, we are limited by physical page widths, and thus will opt for the terser form whenever it will simplify the presentation.

Composing proofs directly via trans does indeed work, but it leaves a lot to be desired. Namely, the proof we wrote out “by hand” looks nothing like the pile of trans calls we ended up using to implement a^1≡a+b*0. Thankfully, Agda’s syntax is sufficiently versatile that we can build a miniature domain specific language in order to get more natural looking proofs. We will explore this idea in the next section.
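To make the two ways of writing a targeting function for cong concrete, the following self-contained sketch puts them side by side. It assumes Agda's builtin Nat and equality in place of this book's definitions, and the names target-right, target-right′ and target-nested are hypothetical:

```agda
open import Agda.Builtin.Equality using (_≡_; refl)
open import Agda.Builtin.Nat using (Nat; suc; _+_)

cong : {A B : Set} {x y : A} (f : A → B) → x ≡ y → f x ≡ f y
cong f refl = refl

-- The target is the right argument of _+_, so a section suffices.
target-right : (a : Nat) {x y : Nat} → x ≡ y → a + x ≡ a + y
target-right a p = cong (a +_) p

-- The same targeting function, spelled out as an explicit lambda.
target-right′ : (a : Nat) {x y : Nat} → x ≡ y → a + x ≡ a + y
target-right′ a p = cong (λ φ → a + φ) p

-- A nested target cannot be written as a single section,
-- so here the explicit lambda is the natural choice.
target-nested : (a : Nat) {x y : Nat} → x ≡ y → suc (x + a) ≡ suc (y + a)
target-nested a p = cong (λ φ → suc (φ + a)) p
```

In each case the body of the lambda is just the left-hand side of the goal with the part being rewritten replaced by φ.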
3.9 Mixfix Parsing

As we saw when defining binary operators like _∨_ in sec. 1.12, and again when representing negative integers via -[1+_] in sec. 2.13, we can define interesting syntax in Agda by leaving underscores around the place. These underscores are a very general feature for interacting with Agda’s parser—an underscore corresponds to a syntactic hole that Agda interprets as a reasonable place for an expression.

To illustrate this idea we can make a postfix operator by prefixing our operator with an underscore, as in the factorial function:

_! : ℕ → ℕ
zero  ! = 1
suc n ! = suc n * n !

which works as we’d expect:

_ : 5 ! ≡ 120
_ = refl
Figuring out exactly how Agda’s parser manages to parse _! takes some thought. The first thing to note is that function application binds more tightly than anything else in the language; thus, our definition of _! is implicitly parsed as:

_! : ℕ → ℕ
zero    ! = 1
(suc n) ! = (suc n) * n !

From here, _! behaves normally with respect to the rules of operator precedence. By default, if you haven’t given an operator an explicit fixity declaration (see sec. 1.17 for a reminder), Agda will assign it a precedence of 20. But _*_ has a precedence of 7, which means that _! binds more tightly than _*_ does, giving us our final parse as:

_! : ℕ → ℕ
zero    ! = 1
(suc n) ! = (suc n) * (n !)
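The precedence rules just described can be checked mechanically. The sketch below is self-contained and assumes Agda's builtin Nat (whose _+_ has precedence 6) instead of this book's ℕ; the unit tests are illustrative additions:

```agda
open import Agda.Builtin.Equality using (_≡_; refl)
open import Agda.Builtin.Nat using (Nat; zero; suc; _+_; _*_)

_! : Nat → Nat
zero  ! = 1
suc n ! = suc n * n !

-- _! has the default precedence of 20, so it binds before _+_:
-- this parses as 2 + (3 !), that is, 2 + 6.
_ : 2 + 3 ! ≡ 8
_ = refl

-- Writing the parse out explicitly gives the same value.
_ : 2 + (3 !) ≡ 8
_ = refl
```

Had _! been given a precedence below that of _+_, the first test would instead be read as (2 + 3) ! and fail to typecheck.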
Sometimes it’s desirable to make prefix operators, where the symbol comes before the argument. While Agda parses regular functions as prefix operators, writing an explicit underscore on the end of an identifier means we can fiddle with its associativity. For example, it’s tedious to write five out of sucs:

five : ℕ
five = suc (suc (suc (suc (suc zero))))

where each of these sets of parentheses is mandatory. We can instead embrace the nature of counting in unary and define a right-associative prefix “tick mark” (input as \|):

∣_ : ℕ → ℕ
∣_ = suc

infixr 20 ∣_

five : ℕ
five = ∣ ∣ ∣ ∣ ∣ zero
The presence of zero here is unfortunate, but necessary. When nesting operators like this, we always need some sort of terminal in order to tell Agda we’re done with this expression. Therefore, we will never be able to write “true” tick marks which are merely to be counted. However, we can assuage the ugliness by introducing some new syntax for zero, as in:

□ : ℕ
□ = zero

five : ℕ
five = ∣ ∣ ∣ ∣ ∣ □

The square □ can be input as \sq. Whether or not this syntax is better than our previous attempt is in the eye of the beholder. Suffice it to say that we will always need some sort of terminal value when doing this style of associativity to build values.

Mixfix parsing gets even more interesting, however. We can make delimited operators like in sec. 2.13 by enclosing an underscore with syntax on either side. For example, the mathematical notation for the floor function (integer part) is ⌊𝑥⌋, which we can replicate in Agda:
postulate
  ℝ : Set
  π : ℝ
  ⌊_⌋ : ℝ → ℕ

three′ : ℕ
three′ = ⌊ π ⌋

The floor bars are input via \clL and \clR, while ℝ is written as \bR and π is \Gp. We don’t dare define the real numbers here, as they are a tricky construction and would distract from the point.

Agda’s profoundly flexible syntax means we are capable of defining many built-in language features for ourselves. To illustrate, many ALGOL-style languages come with the so-called “ternary operator”¹ which does if..else in an expression context. Mixfix parsing means we can define true ternary operators, with whatever semantics we’d like. But let’s humor ourselves and define the conditional ternary operator for ourselves. Since both ? and : (the traditional syntax of the “ternary operator”) have special meaning in Agda, we must get a little creative with our syntax.

¹ Of course, the word “ternary” means only “three-ary”, and has nothing to do with conditional evaluation.
Thankfully, we’ve got all of Unicode at our fingertips, and it’s not hard to track down some alternative glyphs. Instead, we will use ‽ (\?!) and ⦂ (\z:):

_‽_⦂_ : {A : Set} → Bool → A → A → A
false ‽ t ⦂ f = f
true  ‽ t ⦂ f = t

infixr 20 _‽_⦂_

_ : ℕ
_ = not true ‽ 4 ⦂ 1

In addition, since Agda doesn’t come with any if..else.. construct, we can also trivially define such a thing:

if_then_else_ : {A : Set} → Bool → A → A → A
if_then_else_ = _‽_⦂_

infixr 20 if_then_else_

which we can immediately use:

_ : ℕ
_ = if not true then 4 else 1

Due to our use of infixr, we can also nest if_then_else_ with itself:

_ : ℕ
_ = if not true then 4
    else if true then 1
    else 0
As another example, languages from the ML family come with a case..of expression, capable of doing pattern matching on the right-hand side of an equals sign (as opposed to Agda, where we can only do it on the left side!) However, it’s easy to replicate this syntax for ourselves:

case_of_ : {A B : Set} → A → (A → B) → B
case e of f = f e

This definition takes advantage of Agda’s pattern-matching lambda, as in:

_ : ℕ
_ = case not true of λ
  { false → 1
  ; true  → 4
  }
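As a further illustration, case_of_ lets us pattern match in the body of a definition instead of splitting it into clauses. This self-contained sketch assumes Agda's builtin Nat in place of ℕ; pred′ is a hypothetical name chosen for the example:

```agda
open import Agda.Builtin.Equality using (_≡_; refl)
open import Agda.Builtin.Nat using (Nat; zero; suc)

case_of_ : {A B : Set} → A → (A → B) → B
case e of f = f e

-- Hypothetical example: predecessor, matching on the right of the equals sign.
pred′ : Nat → Nat
pred′ n = case n of λ
  { zero    → zero
  ; (suc m) → m
  }

_ : pred′ 3 ≡ 2
_ = refl
```

Note that constructor patterns with arguments, like suc m, must be parenthesized inside a pattern-matching lambda.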
There is one small problem when doing mixfix parsing; unfortunately, we cannot put two non-underscore tokens beside one another. For example, it might be nice to make a boolean operator _is equal to_, but this is illegal in Agda. A simple fix is to intersperse our tokens with hyphens, as in:

_is-equal-to_ : {A : Set} → A → A → Set
x is-equal-to y = x ≡ y
which is nearly as good. As you can see, Agda’s parser offers us a great deal of flexibility, and we can use this to great advantage when defining domain specific languages. Returning to our problem of making trans-style proofs easier to think about, we can explore how to use mixfix parsing to construct valid syntax more amenable to equational reasoning.
3.10 Equational Reasoning

Recall, we’d like to develop some syntax amenable to doing “pen and paper” style proofs. That is, we’d like to write something in Agda equivalent to:

(𝑎 + 𝑏) × 𝑐 = 𝑎𝑐 + 𝑏𝑐
            = 𝑎𝑐 + 𝑐𝑏
            = 𝑐𝑏 + 𝑎𝑐

Each side of an equals sign in this notation is equivalent to either x or y in a type x ≡ y. In the pen and paper proof, we see what is equal, but not why. Recall that in Agda, each sub-proof has its own name that we must explicitly trans together, and these are the sorts of “why”s we want to track. In other words, proofs written in the above style are missing justification as to why exactly we’re claiming each step of the proof follows. In order to make room for these justifications, we will use Bird notation, which attaches them to the equals sign:

(𝑎 + 𝑏) × 𝑐
= (distributivity)
𝑎𝑐 + 𝑏𝑐
= (commutativity of ×)
𝑎𝑐 + 𝑐𝑏
= (commutativity of +)
𝑐𝑏 + 𝑎𝑐

This is the syntax we will emulate in Agda, although doing so is a little finicky and will require much thought. To pique your interest, after this section is complete we will be able to structure the above proof in Agda as:
ex : (a b c : ℕ) → (a + b) * c ≡ c * b + a * c
ex a b c = begin
  (a + b) * c    ≡⟨ *-distribʳ-+ c a b ⟩
  a * c + b * c  ≡⟨ cong (a * c +_) (*-comm b c) ⟩
  a * c + c * b  ≡⟨ +-comm (a * c) (c * b) ⟩
  c * b + a * c  ∎

We will begin with a new module:
module ≡-Reasoning where
The idea behind building this custom syntax is that we will make a series of right-associative syntax operators, in the style of our tick marks in the previous section. This syntax must eventually be terminated, analogously to how ∣_ had to be terminated by □. In this case, we will terminate our syntax using refl, that is, showing we’ve proven what we set out to.

You’ll often see a formal proof ended with a black square (∎, input as \qed), called a tombstone marker. Since proofs already end with this piece of syntax, it’s a great choice to terminate our right-associative chain of equalities.

  _∎ : {A : Set} → (x : A) → x ≡ x
  _∎ x = refl

  infix 3 _∎
Note that the x parameter here is unused in the definition, and exists only to point out exactly for which object we’d like to show reflexivity. Having sorted out the “end” of our syntax, let’s now work backwards. The simplest piece of reasoning we can do is an equality that requires no justification—think different expressions which automatically compute to the same value, like suc zero and one. Again x exists only to give a type to our proof, so we ignore it, choosing to return the proof we’ve already got:

  _≡⟨⟩_
    : {A : Set} {y : A}
    → (x : A)
    → x ≡ y
    → x ≡ y
  x ≡⟨⟩ p = p

  infixr 2 _≡⟨⟩_

These long brackets are input as \< and \>, respectively. It’s easy to lose the forest for the trees here, so let’s work through an example. We can write a trivial little proof, showing the equality of several different ways of writing the number 4:

  _ : 4 ≡ suc (one + two)
  _ =
    4                ≡⟨⟩
    two + two        ≡⟨⟩
    suc one + two    ≡⟨⟩
    suc (one + two)  ∎
In this case, since everything is fully concrete, Agda can just work out the fact that each of these expressions is propositionally equal to one another, which is why we need no justifications. But you’ll notice where once we had x and ys in our types, we now have a human-legible argument about which things are equal!

Agda successfully parses the above, but it can be helpful for our own sanity to make the parse tree explicit. Rather than use infix notation, we’ll use the full unsectioned names for both _≡⟨⟩_ and _∎, and then insert all of the parentheses:

  _ : 4 ≡ suc (one + two)
  _ =
    _≡⟨⟩_ 4
      ( _≡⟨⟩_ (two + two)
      ( _≡⟨⟩_ (suc one + two)
      ( _∎ (suc (one + two)))))
Recall that the implementation of _≡⟨⟩_ merely returns its second argument, so we can simplify the above to:

  _ : 4 ≡ suc (one + two)
  _ =
    _≡⟨⟩_ (two + two)
      ( _≡⟨⟩_ (suc one + two)
      ( _∎ (suc (one + two))))

Our resulting expression begins with another call to _≡⟨⟩_, so we can make the same move again. And a third time, resulting in:

  _ : 4 ≡ suc (one + two)
  _ = _∎ (suc (one + two))

Replacing _∎ now with its definition, we finally eliminate all of our function calls, and are left with the rather underwhelming proof:

  _ : 4 ≡ suc (one + two)
  _ = refl

Have I pulled a fast one on you? Did we do all of this syntactic manipulation merely as a jape? While it seems like we’ve done nothing of value here, notice what happens if we try writing down an invalid proof—as in:

  _ : 4 ≡ suc (one + two)
  _ =
    4              ≡⟨⟩
    one + two      ≡⟨⟩  -- 1
    suc one + two  ∎
At 1 we accidentally wrote one instead of suc one. But, Agda is smart enough to catch the mistake, warning us:

zero != suc zero of type ℕ
when checking that the inferred type of an application
  one + two ≡ _y_379
matches the expected type
  4 ≡ suc (one + two)

So whatever it is that we’ve built, it’s doing something interesting. Despite ignoring every argument, somehow Agda is still noticing flaws in our proof. How can it do such a thing? Let’s look at the definition of _≡⟨⟩_ again:

  _≡⟨⟩_
    : {A : Set} {y : A}
    → (x : A)
    → x ≡ y
    → x ≡ y
  x ≡⟨⟩ p = p

Despite the fact that x is completely ignored in the implementation of this function, it does happen to be used in the type! The reason our last example failed to compile is because when we fill in x, we’re changing the type of the proof required in the second argument. But the second argument is already refl. Thus, we’re asking Agda to assign a type of 3 ≡ 4 to refl, which it just can’t do. That’s where the error comes from, and that’s why _≡⟨⟩_ is less trivial than it seems.

While all of this syntax construction itself is rather clever, there is nothing magical going on here. It’s all just smoke and mirrors abusing Agda’s mixfix parsing and typechecker in order to get nice notation for what we want.

Of course, _≡⟨⟩_ is no good for providing justifications. Instead, we will use the same idea, but this time leave a hole for the justification.

  _≡⟨_⟩_
    : {A : Set}
    → (x : A)
    → {y z : A}
    → x ≡ y
    → y ≡ z
    → x ≡ z
  x ≡⟨ j ⟩ p = trans j p

  infixr 2 _≡⟨_⟩_
Our new function _≡⟨_⟩_ works exactly in the same way as _≡⟨⟩_, except that it takes a proof justification j as its middle argument, and glues it together with its last argument p as per trans. We’ll look at an example momentarily.

We have one piece of syntax left to introduce, and will then play with this machinery in full. By way of poetic symmetry (rather than by way of sym) and to top things off, we will add a piece of syntax to indicate the beginning of a proof block. This is not strictly necessary, but makes for nice introductory syntax to let the reader know that an equational reasoning proof is coming up:

  begin_
    : {A : Set}
    → {x y : A}
    → x ≡ y
    → x ≡ y
  begin_ x=y = x=y

  infix 1 begin_
The begin_ function does nothing; it merely returns the proof given. And since its precedence is lower than any of our other ≡-Reasoning pieces, it binds after any of our other syntax, ensuring the proof is already complete by the time we get here. The purpose really is just for decoration, but does serve a purpose when we define analogous machinery in the context of preorders (sec. 4.13).

Let’s now put all of our hard work to good use. Recall the proof that originally set us off on a hunt for better syntax:

a^1≡a+b*0′ : (a b : ℕ) → a ^ 1 ≡ a + b * 0
a^1≡a+b*0′ a b
  = trans (^-identityʳ a)
  ( trans (sym (+-identityʳ a))
          (cong (a +_) (sym (*-zeroʳ b)))
  )

The equational reasoning syntax we’ve built gives us a much nicer story for implementing this. Rather than work with the big explicit pile of calls to trans, after popping out of the ≡-Reasoning module, we can just open a new reasoning block:

a^1≡a+b*0′ : (a b : ℕ) → a ^ 1 ≡ a + b * 0
a^1≡a+b*0′ a b =
  begin
    a ^ 1
  ≡⟨ ^-identityʳ a ⟩
    a
  ≡⟨ sym (+-identityʳ a) ⟩
    a + 0
  ≡⟨ cong (a +_) (sym (*-zeroʳ b)) ⟩
    a + b * 0
  ∎
  where open ≡-Reasoning  -- 1

Note that at 1 we open the ≡-Reasoning module. This is a local binding, which brings our machinery into scope only for the current definition. While it is possible to open ≡-Reasoning at the top level, this is generally frowned upon, as there will eventually be many other sorts of reasoning we might want to perform.

For the purposes of this book’s aesthetics, whenever we have the available line-width, we will choose to format equational reasoning blocks as:

a^1≡a+b*0′ : (a b : ℕ) → a ^ 1 ≡ a + b * 0
a^1≡a+b*0′ a b = begin
  a ^ 1      ≡⟨ ^-identityʳ a ⟩
  a          ≡⟨ sym (+-identityʳ a) ⟩
  a + 0      ≡⟨ cong (a +_) (sym (*-zeroʳ b)) ⟩
  a + b * 0  ∎
  where open ≡-Reasoning
You are welcome to pick whichever style you prefer; the former is easier to type out and work with, but the latter looks prettier once the proof is all sorted. As you can see, this is a marked improvement over our original definition. The original implementation emphasized the proof justifications—which are important to the computer—while this one emphasizes the actual steps taken—which are much more important to the human. Whenever you find yourself doing non-ergonomic things for the sake of the computer, it’s time to take a step back as we have done here. This is an important lesson, inside Agda and out.
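For reference, the pieces built throughout this section can be gathered into a single file. The listing below is a minimal sketch that assumes Agda's builtin Nat and equality rather than this book's own definitions; the lemma double-zero is a hypothetical example added only to exercise the syntax:

```agda
open import Agda.Builtin.Equality using (_≡_; refl)
open import Agda.Builtin.Nat using (Nat; zero; suc; _+_)

trans : {A : Set} {x y z : A} → x ≡ y → y ≡ z → x ≡ z
trans refl refl = refl

cong : {A B : Set} {x y : A} (f : A → B) → x ≡ y → f x ≡ f y
cong f refl = refl

module ≡-Reasoning where
  infix  1 begin_
  infixr 2 _≡⟨⟩_ _≡⟨_⟩_
  infix  3 _∎

  -- Decoration only: returns the finished proof unchanged.
  begin_ : {A : Set} {x y : A} → x ≡ y → x ≡ y
  begin_ x=y = x=y

  -- A step that holds by computation, needing no justification.
  _≡⟨⟩_ : {A : Set} {y : A} (x : A) → x ≡ y → x ≡ y
  x ≡⟨⟩ p = p

  -- A step justified by j, glued onto the rest of the chain p.
  _≡⟨_⟩_ : {A : Set} (x : A) {y z : A} → x ≡ y → y ≡ z → x ≡ z
  x ≡⟨ j ⟩ p = trans j p

  -- The terminal of the chain.
  _∎ : {A : Set} (x : A) → x ≡ x
  x ∎ = refl

+-identityʳ : (n : Nat) → n + zero ≡ n
+-identityʳ zero    = refl
+-identityʳ (suc n) = cong suc (+-identityʳ n)

-- Hypothetical lemma: two right identities in a row.
double-zero : (n : Nat) → (n + zero) + zero ≡ n
double-zero n = begin
  (n + zero) + zero  ≡⟨ +-identityʳ (n + zero) ⟩
  n + zero           ≡⟨ +-identityʳ n ⟩
  n                  ∎
  where open ≡-Reasoning
```

The fixity declarations are collected at the top of the module here purely for readability; Agda accepts them anywhere in the same scope as the operators they describe.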
3.11 Ergonomics, Associativity and Commutativity

If you tried writing out the new definition of a^1≡a+b*0′ by hand, you likely didn’t have fun. It’s a huge number of keystrokes in order to produce all of the necessary Unicode, let alone what the expression looks like between each proof rewrite. Thankfully, Agda’s interactive support can help us write out the mechanical parts of the above proof, allowing us to focus more on the navigation than the driving.

The first thing you’ll want to do is to write a macro or snippet for your editor of choice. We’re going to be typing out a lot of the following two things, and it will save you an innumerable amount of time to write it down once and have your text editor do it evermore. The two snippets you’ll need are:

≡⟨ ? ⟩
  ?

and

begin
  ?
≡⟨ ? ⟩
  ?
∎
where open ≡-Reasoning
I have bound the first to \step, and the latter to \begin. Let’s put these to work writing something more useful. We’d like to prove that _∨_ is associative, which is to say, that it satisfies the following law:

(𝑎 ∨ 𝑏) ∨ 𝑐 = 𝑎 ∨ (𝑏 ∨ 𝑐)

We can write this in Agda with the type:

∨-assoc : (a b c : Bool) → (a ∨ b) ∨ c ≡ a ∨ (b ∨ c)
∨-assoc = ?

You should be able to prove this one for yourself:

∨-assoc : (a b c : Bool) → (a ∨ b) ∨ c ≡ a ∨ (b ∨ c)
∨-assoc false b c = refl
∨-assoc true  b c = refl
Exercise: Also prove ∧-assoc.

Solution:

∧-assoc : (a b c : Bool) → (a ∧ b) ∧ c ≡ a ∧ (b ∧ c)
∧-assoc false b c = refl
∧-assoc true  b c = refl
Not too hard at all, is it? Let’s now try the same, except this time showing that it’s addition that’s associative:

+-assoc : (x y z : ℕ) → (x + y) + z ≡ x + (y + z)
+-assoc = ?

A quick binding of variables, induction on x, and obvious use of refl gets us to this step:

+-assoc : (x y z : ℕ) → (x + y) + z ≡ x + (y + z)
+-assoc zero    y z = refl
+-assoc (suc x) y z = ?

We’re ready to start a reasoning block, and thus we can use our \begin snippet:

+-assoc : (x y z : ℕ) → (x + y) + z ≡ x + (y + z)
+-assoc zero    y z = refl
+-assoc (suc x) y z = begin
  ?  ≡⟨ ? ⟩
  ?  ∎
  where open ≡-Reasoning
Note that I have opted to format this lemma more horizontally than the vertical alignment you have. This is merely to make the most of our page width and save some paper, but feel free to format however you’d like. I find the horizontal layout to be more aesthetically pleasing, but much harder to write. Thus, when I am proving things, I’ll do them in the vertical layout, and do a second pass after the fact to make it look prettier.

Regardless of my artisanal formatting decisions, we can now start getting help from Agda. Using Solve (C-c C-s in Emacs and VS Code) at the first and last holes will get Agda to fill in the terms—the two things that eventually need to be equal:

+-assoc : (x y z : ℕ) → (x + y) + z ≡ x + (y + z)
+-assoc zero    y z = refl
+-assoc (suc x) y z = begin
  suc x + y + z    ≡⟨ ? ⟩
  suc x + (y + z)  ∎
  where open ≡-Reasoning

I always like to subsequently extend the top and bottom of the equality with _≡⟨⟩_, like this:

+-assoc : (x y z : ℕ) → (x + y) + z ≡ x + (y + z)
+-assoc zero    y z = refl
+-assoc (suc x) y z = begin
  suc x + y + z    ≡⟨⟩
  ?                ≡⟨ ? ⟩
  ?                ≡⟨⟩
  suc x + (y + z)  ∎
  where open ≡-Reasoning

which recall says that the newly added lines are already equal to the other side of the _≡⟨⟩_ operator. We can fill in these holes with Solve (C-u C-u C-c C-s in Emacs and VS Code), which asks Agda to fully-evaluate both holes, expanding as many definitions as it can while still making progress. Sometimes it goes too far, but for our simple examples here, this will always be helpful. The result looks like this:

+-assoc : (x y z : ℕ) → (x + y) + z ≡ x + (y + z)
+-assoc zero    y z = refl
+-assoc (suc x) y z = begin
  suc x + y + z      ≡⟨⟩
  suc (x + y + z)    ≡⟨ ? ⟩
  suc (x + (y + z))  ≡⟨⟩
  suc x + (y + z)    ∎
  where open ≡-Reasoning
This new hole is clearly a cong suc, which we can partially fill in:

+-assoc : (x y z : ℕ) → (x + y) + z ≡ x + (y + z)
+-assoc zero    y z = refl
+-assoc (suc x) y z = begin
  suc x + y + z      ≡⟨⟩
  suc (x + y + z)    ≡⟨ cong suc ? ⟩
  suc (x + (y + z))  ≡⟨⟩
  suc x + (y + z)    ∎
  where open ≡-Reasoning

and then invoke Auto (C-c C-a in Emacs and VS Code) to search for the remainder of the proof:

+-assoc : (x y z : ℕ) → (x + y) + z ≡ x + (y + z)
+-assoc zero    y z = refl
+-assoc (suc x) y z = begin
  suc x + y + z      ≡⟨⟩
  suc (x + y + z)    ≡⟨ cong suc (+-assoc x y z) ⟩
  suc (x + (y + z))  ≡⟨⟩
  suc x + (y + z)    ∎
  where open ≡-Reasoning
I quite like this workflow when tackling proofs. I introduce a \begin snippet, use Solve (C-c C-s in Emacs and VS Code) to fill in either side. Then, I add new calls to _≡⟨⟩_ on both the top and bottom and fill those in via Solve (C-u C-u C-c C-s in Emacs and VS Code). Finally, I like to add \step in the middle, and look for obvious techniques to help fill in the rest.

Let’s do another proof together, this time one less trivial. First, we will dash out a quick lemma²:

Exercise (Easy): Implement +-suc : (x y : ℕ) → x + suc y ≡ suc (x + y)

Solution:

+-suc : (x y : ℕ) → x + suc y ≡ suc (x + y)
+-suc zero    y = refl
+-suc (suc x) y = cong suc (+-suc x y)
Given +-suc, we would now like to show the commutativity of addition, which is the idea that the order of the arguments doesn’t matter. Symbolically, the commutativity property of addition is written as:

𝑎 + 𝑏 = 𝑏 + 𝑎

² A lemma is a “boring” theorem: one that we prove only because it’s on the path to something we care more about proving. There is no technical distinction between lemmas and theorems; the difference is only in the mind of the original mathematician.
By this point you should be able to put together the type, and show the zero case.

Exercise (Easy): State the type of, perform induction on the first argument, and solve the zero case for +-comm.

Solution:

+-comm : (x y : ℕ) → x + y ≡ y + x
+-comm zero    y = sym (+-identityʳ y)
+-comm (suc x) y = ?
Let’s start with a \begin snippet, this time filling the top and bottom holes via Solve (C-u C-u C-c C-s in Emacs and VS Code) directly:

+-comm : (x y : ℕ) → x + y ≡ y + x
+-comm zero    y = sym (+-identityʳ y)
+-comm (suc x) y = begin
  suc (x + y)  ≡⟨ ? ⟩
  y + suc x    ∎
  where open ≡-Reasoning
Here we have our choice of working top-down, or bottom-up. Let’s work bottom-up, for fun. Add a \step, which will make things go saffron temporarily, since Agda now has too many degrees of freedom to work out what you mean:

+-comm : (x y : ℕ) → x + y ≡ y + x
+-comm zero    y = sym (+-identityʳ y)
+-comm (suc x) y = begin
  suc (x + y)  ≡⟨ ? ⟩
  ?            ≡⟨ ? ⟩
  y + suc x    ∎
  where open ≡-Reasoning

Nevertheless, we can persevere and fill in the bottom hole using our +-suc lemma from just now:

+-comm : (x y : ℕ) → x + y ≡ y + x
+-comm zero    y = sym (+-identityʳ y)
+-comm (suc x) y = begin
  suc (x + y)  ≡⟨ ? ⟩
  ?            ≡⟨ sym (+-suc y x) ⟩
  y + suc x    ∎
  where open ≡-Reasoning

With this justification in place, we can now ask Agda to fill the remaining term-level hole, again via Solve (C-u C-u C-c C-s in Emacs and VS Code):

+-comm : (x y : ℕ) → x + y ≡ y + x
+-comm zero    y = sym (+-identityʳ y)
+-comm (suc x) y = begin
  suc (x + y)  ≡⟨ ? ⟩
  suc (y + x)  ≡⟨ sym (+-suc y x) ⟩
  y + suc x    ∎
  where open ≡-Reasoning
Since the outermost function call (suc) is the same on both lines, we can invoke cong:

+-comm : (x y : ℕ) → x + y ≡ y + x
+-comm zero    y = sym (+-identityʳ y)
+-comm (suc x) y = begin
  suc (x + y)  ≡⟨ cong suc ? ⟩
  suc (y + x)  ≡⟨ sym (+-suc y x) ⟩
  y + suc x    ∎
  where open ≡-Reasoning

and then finish the proof by noticing that recursion will happily fill this hole.

+-comm : (x y : ℕ) → x + y ≡ y + x
+-comm zero    y = sym (+-identityʳ y)
+-comm (suc x) y = begin
  suc (x + y)  ≡⟨ cong suc (+-comm x y) ⟩
  suc (y + x)  ≡⟨ sym (+-suc y x) ⟩
  y + suc x    ∎
  where open ≡-Reasoning
As you can see, equational reasoning makes proofs much more legible, and using Agda interactively assuages most of the pain of writing equational reasoning proofs.
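To see what the reasoning syntax buys us, it can help to compare against the same proof written with bare trans. The following self-contained sketch assumes Agda's builtin Nat and equality in place of this book's definitions, and uses the hypothetical name +-comm′ for the trans-only variant:

```agda
open import Agda.Builtin.Equality using (_≡_; refl)
open import Agda.Builtin.Nat using (Nat; zero; suc; _+_)

sym : {A : Set} {x y : A} → x ≡ y → y ≡ x
sym refl = refl

trans : {A : Set} {x y z : A} → x ≡ y → y ≡ z → x ≡ z
trans refl refl = refl

cong : {A B : Set} {x y : A} (f : A → B) → x ≡ y → f x ≡ f y
cong f refl = refl

+-identityʳ : (x : Nat) → x + zero ≡ x
+-identityʳ zero    = refl
+-identityʳ (suc x) = cong suc (+-identityʳ x)

+-suc : (x y : Nat) → x + suc y ≡ suc (x + y)
+-suc zero    y = refl
+-suc (suc x) y = cong suc (+-suc x y)

-- The same two steps as the reasoning block, glued with a single trans.
-- The intermediate term suc (y + x) is no longer visible anywhere.
+-comm′ : (x y : Nat) → x + y ≡ y + x
+-comm′ zero    y = sym (+-identityʳ y)
+-comm′ (suc x) y =
  trans (cong suc (+-comm′ x y)) (sym (+-suc y x))
```

Both versions typecheck and denote the same proof; the difference is only in how much of the argument the reader has to reconstruct in their head.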
3.12 Exercises in Proof

That covers everything we’d like to say about proof in this chapter. However, there are a few more properties about the natural numbers we’d like to show for future chapters, and this is the most obvious place to do it. These proofs are too hard to do simply by stacking calls to trans, and therefore gain a lot of tractability when done with equational reasoning.

The diligent reader is encouraged to spend some time proving the results in this section for themselves; doing so will be an excellent opportunity to practice working with Agda and to brandish new tools.

For our first exercise, suc-injective states that we can cancel outermost suc constructors in equality over the naturals:

Exercise (Trivial): Prove suc-injective : {x y : ℕ} → suc x ≡ suc y → x ≡ y.

Solution:

suc-injective : {x y : ℕ} → suc x ≡ suc y → x ≡ y
suc-injective refl = refl
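One place suc-injective earns its keep is in cancellation lemmas. As a sketch, assuming Agda's builtin Nat and equality instead of this book's definitions, and using the hypothetical name +-cancelˡ, we can peel a common left summand off both sides of an equation one suc at a time:

```agda
open import Agda.Builtin.Equality using (_≡_; refl)
open import Agda.Builtin.Nat using (Nat; zero; suc; _+_)

suc-injective : {x y : Nat} → suc x ≡ suc y → x ≡ y
suc-injective refl = refl

-- Hypothetical lemma: cancel a on the left of both sides.
-- In the suc case, the hypothesis computes to suc (a + x) ≡ suc (a + y),
-- so suc-injective strips one constructor before we recurse.
+-cancelˡ : (a : Nat) {x y : Nat} → a + x ≡ a + y → x ≡ y
+-cancelˡ zero    p = p
+-cancelˡ (suc a) p = +-cancelˡ a (suc-injective p)
```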
Often, a huge amount of the work to prove something is simply in manipulating the expression to be of the right form so that you can apply the relevant lemma. This is the case in *-suc, which allows us to expand a suc on the right side of a multiplication term.

*-suc : (x y : ℕ) → x * suc y ≡ x + x * y
*-suc = ?

Exercise (Challenge): Prove *-suc.

Hint: Proving *-suc requires two applications of +-assoc, as well as one of +-comm.

Solution:

*-suc : (x y : ℕ) → x * suc y ≡ x + x * y
*-suc zero    y = refl
*-suc (suc x) y = begin
  suc x * suc y
    ≡⟨⟩
  suc y + x * suc y
    ≡⟨ cong (λ φ → suc y + φ) (*-suc x y) ⟩
  suc y + (x + x * y)
    ≡⟨⟩
  suc (y + (x + x * y))
    ≡⟨ cong suc (sym (+-assoc y x (x * y))) ⟩
  suc ((y + x) + x * y)
    ≡⟨ cong (λ φ → suc (φ + x * y)) (+-comm y x) ⟩
  suc ((x + y) + x * y)
    ≡⟨ cong suc (+-assoc x y (x * y)) ⟩
  suc (x + (y + x * y))
    ≡⟨⟩
  suc x + (y + x * y)
    ≡⟨⟩
  suc x + (suc x * y)
    ∎
  where open ≡-Reasoning
You will not be surprised to learn that multiplication is also commutative, that is, that:

𝑎 × 𝑏 = 𝑏 × 𝑎

Exercise (Medium): Prove *-comm : (x y : ℕ) → x * y ≡ y * x.

Solution:

*-comm : (x y : ℕ) → x * y ≡ y * x
*-comm zero    y = sym (*-zeroʳ y)
*-comm (suc x) y = begin
  suc x * y  ≡⟨⟩
  y + x * y  ≡⟨ cong (y +_) (*-comm x y) ⟩
  y + y * x  ≡⟨ sym (*-suc y x) ⟩
  y * suc x  ∎
  where open ≡-Reasoning
Throughout this book, we will require a slew of additional mathematical facts. They are not particularly interesting facts, and none of them should surprise you in the slightest. Nevertheless, our insistence that we build everything by hand requires this expenditure of energy. The propositions are thus given as exercises, in order that you might find some value in the tedium.

Exercise (Medium) Prove *-distribʳ-+ : (x y z : ℕ) → (y + z) * x ≡ y * x + z * x.

Solution

    *-distribʳ-+ : (x y z : ℕ) → (y + z) * x ≡ y * x + z * x
    *-distribʳ-+ x zero z = refl
    *-distribʳ-+ x (suc y) z =
      begin
        (suc y + z) * x
      ≡⟨⟩
        x + (y + z) * x
      ≡⟨ cong (x +_) (*-distribʳ-+ x y z) ⟩
        x + (y * x + z * x)
      ≡⟨ sym (+-assoc x (y * x) (z * x)) ⟩
        (x + y * x) + z * x
      ≡⟨⟩
        suc y * x + z * x
      ∎
      where open ≡-Reasoning
Exercise (Hard) Prove *-distribˡ-+ : (x y z : ℕ) → x * (y + z) ≡ x * y + x * z.

Solution

    *-distribˡ-+ : (x y z : ℕ) → x * (y + z) ≡ x * y + x * z
    *-distribˡ-+ x y z =
      begin
        x * (y + z)
      ≡⟨ *-comm x _ ⟩
        (y + z) * x
      ≡⟨ *-distribʳ-+ x y z ⟩
        y * x + z * x
      ≡⟨ cong (_+ z * x) (*-comm y x) ⟩
        x * y + z * x
      ≡⟨ cong (x * y +_) (*-comm z x) ⟩
        x * y + x * z
      ∎
      where open ≡-Reasoning
Exercise (Medium) Prove *-assoc : (x y z : ℕ) → (x * y) * z ≡ x * (y * z).

Solution

    *-assoc : (x y z : ℕ) → (x * y) * z ≡ x * (y * z)
    *-assoc zero y z = refl
    *-assoc (suc x) y z =
      begin
        suc x * y * z
      ≡⟨⟩
        (y + x * y) * z
      ≡⟨ *-distribʳ-+ z y (x * y) ⟩
        y * z + (x * y) * z
      ≡⟨ cong (y * z +_) (*-assoc x y z) ⟩
        y * z + x * (y * z)
      ≡⟨⟩
        suc x * (y * z)
      ∎
      where open ≡-Reasoning
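To see lemmas of this kind put to work, here is a small sketch of two derived facts. It assumes the standard library's +-identityʳ and *-distribʳ-+ (whose statements match the ones in this chapter); the names double, double-distrib, and the module name are invented for this example.

```agda
module DoubleSketch where  -- hypothetical module name

open import Data.Nat using (ℕ; _+_; _*_)
open import Data.Nat.Properties using (+-identityʳ; *-distribʳ-+)
open import Relation.Binary.PropositionalEquality
  using (_≡_; cong; module ≡-Reasoning)

-- 2 * x computes to x + (x + 0); only the trailing + 0 needs a lemma.
double : (x : ℕ) → 2 * x ≡ x + x
double x =
  begin
    2 * x
  ≡⟨⟩
    x + (x + 0)
  ≡⟨ cong (x +_) (+-identityʳ x) ⟩
    x + x
  ∎
  where open ≡-Reasoning

-- Right distributivity, instantiated with both summands equal to x.
double-distrib : (x y : ℕ) → (x + x) * y ≡ x * y + x * y
double-distrib x y = *-distribʳ-+ y x x
```

Note the argument order in the last line: the factor being distributed comes first, exactly as in the signature of *-distribʳ-+.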
Congratulations! You made it through all of the tedious proofs about numbers. Go and celebrate with a snack of your choice; you’ve earned it!
3.13 Wrapping Up

In sec. 3.4, we defined propositional equality, represented by _≡_, which we used to prove to Agda that two values are syntactically equal. We proved that equality is reflexive, symmetric, and transitive, as well as showing that it is congruent (preserved by functions). Furthermore, in sec. 3.10, we built a little domain-specific language in Agda for doing equational reasoning. All of this machinery can be found in the standard library under Relation.Binary.PropositionalEquality. For now, however, we will only publicly re-export _≡_ and ≡-Reasoning:

    open import Relation.Binary.PropositionalEquality
      using (_≡_; module ≡-Reasoning)
      public
For reasons that will become clear in sec. 5, we will export the rest of our propositional equality tools under a new module PropEq, which will need to be opened separately:

    module PropEq where
      open Relation.Binary.PropositionalEquality
        using (refl; cong; sym; trans)
        public
In sec. 3.9 we discussed mixfix parsing, where we leave underscores in the names of identifiers in order to enable more interesting syntactic forms. As examples of mixfix identifiers, we created if_then_else_ and case_of_, which can be found in the standard library here:

    open import Data.Bool
      using (if_then_else_)
      public

    open import Function
      using (case_of_)
      public
While discussing common flavors of proofs in sec. 3.6 and sec. 3.7, we proved many facts about _∨_ and _∧_. These can all be found under Data.Bool.Properties:

    open import Data.Bool.Properties
      using ( ∨-identityˡ ; ∨-identityʳ
            ; ∨-zeroˡ ; ∨-zeroʳ
            ; ∨-assoc ; ∧-assoc
            ; ∧-identityˡ ; ∧-identityʳ
            ; ∧-zeroˡ ; ∧-zeroʳ
            ; not-involutive
            )
      public
Additionally, we tried our hand at defining many facts about the natural numbers—all of which can be found in the standard library under Data.Nat.Properties:

    open import Data.Nat.Properties
      using ( +-identityˡ ; +-identityʳ
            ; *-identityˡ ; *-identityʳ
            ; *-zeroˡ ; *-zeroʳ
            ; +-assoc ; *-assoc
            ; +-comm ; *-comm
            ; ^-identityʳ
            ; +-suc ; suc-injective
            ; *-distribˡ-+ ; *-distribʳ-+
            )
      public
UNICODE IN THIS CHAPTER

    ʳ  U+02B3  MODIFIER LETTER SMALL R (\^r)
    ˡ  U+02E1  MODIFIER LETTER SMALL L (\^l)
    λ  U+03BB  GREEK SMALL LETTER LAMDA (\Gl)
    π  U+03C0  GREEK SMALL LETTER PI (\Gp)
    φ  U+03C6  GREEK SMALL LETTER PHI (\Gf)
    ′  U+2032  PRIME (\')
    ‽  U+203D  INTERROBANG (\ )
    ℕ  U+2115  DOUBLE-STRUCK CAPITAL N (\bN)
    ℝ  U+211D  DOUBLE-STRUCK CAPITAL R (\bR)
    →  U+2192  RIGHTWARDS ARROW (\to)
    ∎  U+220E  END OF PROOF (\qed)
    ∣  U+2223  DIVIDES (\|)
    ∧  U+2227  LOGICAL AND (\and)
    ∨  U+2228  LOGICAL OR (\or)
    ∸  U+2238  DOT MINUS (\.-)
    ≡  U+2261  IDENTICAL TO (\ )
    ⌊  U+230A  LEFT FLOOR (\clL)
    ⌋  U+230B  RIGHT FLOOR (\clR)
    □  U+25A1  WHITE SQUARE (\sq)
    ⟨  U+27E8  MATHEMATICAL LEFT ANGLE BRACKET (\)
    ⦂  U+2982  Z NOTATION TYPE COLON (\z )
CHAPTER 4
Relations

    module Chapter4-Relations where
In the last chapter we explored what equality is, and what sorts of things we could prove with it. We turn our discussion now to relations more generally, of which equality is only one example. In the process, we will learn about universe polymorphism, pre-orders, partially ordered sets, and touch briefly on graphs—all while learning much more about working with Agda interactively.

Prerequisites

    open import Chapter1-Agda
      using (Bool; false; true; not; _×_)

    open import Chapter2-Numbers
      using (ℕ; zero; suc; _+_)

    open import Chapter3-Proofs
4.1 Universe Levels

Perhaps you have heard of Bertrand Russell’s “barber paradox”—if there is a barber who shaves exactly those people who do not shave themselves, does he shave himself? The paradox is that the barber does shave himself if he doesn’t, and doesn’t if he does. The truth value of this proposition seems to flip-flop back and forth, from yes to no and back to yes again, forever, never settling down, never converging on an answer.

Of course, Russell wasn’t actually wondering about barbers; the question was posed to underline a problem with the now-called “naive set theory” that was in vogue at the time. We call it naive set theory these days because it allowed for paradoxes like the one above, and paradoxes are anathema to mathematics. Once you have a contradiction, the entire mathematical system falls apart and it’s possible to prove anything whatsoever. We will look at how to do exactly this in sec. 6.1.

Modern mathematics’ solution to the barber paradox is the realization that not all collections are sets—some are simply “too big” to be sets. There is no “set of all sets” because such a thing is too big. Therefore, the question of the barber who doesn’t cut his own hair is swept under the rug, much like the hair at actual barbershops.

But this only punts on the problem. What is the corresponding mathematical object here? If the set of all sets is not a set, then what exactly is it? The trick is to build a hierarchy of set-like things, no one of which contains itself, but each subsequent object contains the previous one.

The usual, everyday sets and types that exist in other programming languages are Set₀. Set₀ can be written in Agda as Set like we have been doing so far. But, what is the type of Set itself? Why, it’s Set₁!

    _ : Set₁
    _ = Set
We can play the same game, asking what’s the type of Set₁. Unsurprisingly, it’s Set₂:

    _ : Set₂
    _ = Set₁
This collection of sets is an infinite hierarchy, and we refer to each as a sort or a universe. You’re welcome to use an arbitrarily large universe if you’d like:

    _ = Set₉₈₆₀₂₅₀
Why do these universes matter? As we have seen, Agda makes no distinction between values and types. So far, we’ve used this functionality primarily when working on indexed types, where we’ve seen values used as indices on types. But as we become more sophisticated Agda programmers (e.g. later in this chapter), we will start going the other direction: using types as values.

When we write a proof about values, such a thing lives in Set. But proofs about types must necessarily exist in a higher universe so as to not taunt the barber and his dreadful paradox. Of course, proofs about those proofs must again live in a higher universe.

You can imagine we might want to duplicate some proofs in different universes. For example, we might want to say the tuple 1 , (2 , 3) is “pretty much the same thing”¹ as (1 , 2) , 3. But then we might want to say that a type-level tuple of ℕ , (Bool , ℤ)—not ℕ × (Bool × ℤ), mind you—is also “pretty much the same thing” as (ℕ , Bool) , ℤ.

Thankfully, Agda allows us to write this sort of proof once and for all, by abstracting over universe levels in the form of universe polymorphism. The necessary type is Level from Agda.Primitive:

    open import Agda.Primitive
      using (Level; _⊔_; lzero; lsuc)
Before we play with our new Levels, let’s force Agda to give us an error. Recall our old definition of Maybe from sec. 2.6:

    module Playground-Level where
      data Maybe₀ (A : Set) : Set where
        just₀ : A → Maybe₀ A
        nothing₀ : Maybe₀ A
We might try to generate a term of type Maybe₀ Set, as in:

    _ = just₀ ℕ

which Agda doesn’t like, and isn’t shy about telling us:

INFO WINDOW
    Set₁ != Set
    when checking that the solution Set of metavariable _A_8 has the expected type Set

¹ For some definition of “pretty much the same thing” that we will make precise in sec. 7.9.
The problem, of course, is that we said A was of type Set, but then tried to instantiate this with A = Set. But since Set : Set₁, this is a universe error, and the thing fails to typecheck. Instead, we can use universe polymorphism in the definition of Maybe, by binding a Level named ℓ (\ell), and parameterizing both the set A and Maybe by this Level:

    data Maybe₁ {ℓ : Level} (A : Set ℓ) : Set ℓ where
      just₁ : A → Maybe₁ A
      nothing₁ : Maybe₁ A
Agda is now happy to accept our previous definition:

    _ = just₁ ℕ
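It can help to spell out which level Agda picks in each case. The following is a self-contained sketch that repeats the Maybe₁ definition and pins the implicit level down by hand; the module name is made up, and the explicit {lsuc lzero} and {lzero} arguments are only there for illustration, since Agda infers them on its own.

```agda
module MaybeLevelSketch where  -- hypothetical module name

open import Agda.Primitive using (Level; lzero; lsuc)
open import Data.Nat using (ℕ)

data Maybe₁ {ℓ : Level} (A : Set ℓ) : Set ℓ where
  just₁ : A → Maybe₁ A
  nothing₁ : Maybe₁ A

-- Wrapping a type: A = Set lives in Set₁, so ℓ is lsuc lzero.
_ : Maybe₁ {lsuc lzero} Set
_ = just₁ ℕ

-- Wrapping an ordinary value: A = ℕ lives in Set, so ℓ is lzero.
_ : Maybe₁ {lzero} ℕ
_ = just₁ 5

-- The resulting type sits in the same universe as its parameter.
_ : Set₁
_ = Maybe₁ Set
```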
In the real world, it happens to be quite a lot of work to bind every level every time, so we often use a variable block to define levels:

    private variable
      ℓ : Level

    data Maybe₂ (A : Set ℓ) : Set ℓ where
      just₂ : A → Maybe₂ A
      nothing₂ : Maybe₂ A
Variable blocks are a lovely feature in Agda; whenever you reference a binding defined in a variable block, Agda will automatically insert an implicit variable for you. Thus, behind the scenes, Agda is just turning our definition for Maybe₂ into exactly the same thing as we wrote by hand for Maybe₁.

Although we didn’t define Maybe this way when we built it in sec. 2.6, we don’t need to make any changes. This is because Chapter2-Numbers re-exports Maybe from the standard library, which as a principle is always as universe-polymorphic as possible.

These variable bindings are life-saving when working with highly polymorphic structures, and so let’s pop the module stack and introduce a few in the top level for our future definitions:

    private variable
      ℓ ℓ₁ ℓ₂ a b c : Level
      A : Set a
      B : Set b
      C : Set c
There’s quite a preponderance of levels we’ve defined here! As you can see, variables can depend on one another; using one in a definition will automatically pull it and any variables it depends on into scope. Therefore, don’t be surprised throughout the rest of this chapter if you see variables that don’t seem to be bound anywhere—this is where they come from!

There are three other introduction forms for Level that bear discussion. The first is lzero, which is the universe level of Set₀. In other words, Set₀ is an alias for Set lzero. This brings us to our second introduction form, lsuc, which acts exactly as suc does, but over the Levels. That is, Set₁ is an alias for Set (lsuc lzero). As you’d expect, we can generate any Level by subsequent applications of lsuc to lzero.

Our third and final introduction form for Level is _⊔_ (input via \lub), which takes the maximum of two levels. It’s unnecessary to use _⊔_ when working with concrete levels, but it comes in handy when you have multiple Level variables in scope. For example, you might have two different sets with two different levels, let’s say ℓ₁ and ℓ₂. The appropriate level for something built from both would be either ℓ₁ ⊔ ℓ₂ or lsuc (ℓ₁ ⊔ ℓ₂), depending on the exact circumstances. Don’t worry; it’s not as onerous in practice as it might seem.

As a beginner writing Agda, getting all of the universes right can be rather daunting. I’d recommend you start by putting everything in Set, and generalizing from there only when the typechecker insists that your levels are wrong. When that happens, simply introduce a new implicit Level for each Set you’re binding, and then follow the type errors until everything compiles again. Sometimes the errors might be incomplete, complaining that the level you gave it is not the level it should be. Just make the change and try again; sometimes Agda will further complain, giving you an even higher bound that you must respect in your level algebra. It can be frustrating, but keep playing along, and Agda will eventually stop complaining.

As you gain more proficiency in Agda, you’ll often find yourself trying to do interesting things with Sets, like putting them inside of data structures. If you wrote the data structures naively over Set, this will invoke the ire of the universe checker, and Agda will refuse your program. After running into this problem a few times, you will begin making all of your programs universe-polymorphic. The result is being able to reuse code you wrote to operate over values when you later decide you also need to be able to operate over types. A little discipline in advance goes a long way.
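The three introduction forms can be seen side by side in a short sketch. Everything here other than Level, lzero, lsuc, and _⊔_ is invented for illustration (the module name and the Pair and Pointed records), and the level annotations are one reasonable choice rather than the only one Agda would accept.

```agda
module LevelSketch where  -- hypothetical module name

open import Agda.Primitive using (Level; lzero; lsuc; _⊔_)
open import Data.Nat using (ℕ)

-- Set₁ is just Set (lsuc lzero).
_ : Set (lsuc lzero)
_ = Set

-- Two parameters at different levels: the record lives at their maximum.
record Pair {ℓ₁ ℓ₂ : Level} (A : Set ℓ₁) (B : Set ℓ₂) : Set (ℓ₁ ⊔ ℓ₂) where
  field
    fst : A
    snd : B

-- lzero ⊔ lsuc lzero normalizes to lsuc lzero, so this pair type is in Set₁.
_ : Set₁
_ = Pair ℕ Set

-- A record that stores a Set ℓ as a field must live one level higher.
record Pointed (ℓ : Level) : Set (lsuc ℓ) where
  field
    Carrier : Set ℓ
    point   : Carrier

_ : Pointed lzero
_ = record { Carrier = ℕ ; point = 0 }
```

The rule of thumb this suggests: merely taking sets as parameters costs you a ⊔, while storing a Set inside the structure costs you an lsuc.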
4.2 Dependent Pairs

As we have seen, function arguments act as the for-all quantifier when encoding mathematical statements as types. For example, we can read the definition for Transitive (sec. 4.6) as “for all x y z : A, and for all proofs x ≈ y and y ≈ z, it is the case that x ≈ z.” The for-all quantifier states that something holds true, regardless of the specific details.

Closely related to this is the there-exists quantifier, aptly stating that there is at least one object which satisfies a given property. As an illustration, there exists a number n : ℕ such that 𝑛 + 1 = 5, namely 𝑛 = 4. But it is certainly not the case for all n : ℕ that 𝑛 + 1 = 5 holds!

True existential values like this don’t exist in Agda since we are restricted to a constructive world, in which we must actually build the thing we are claiming to exist. This sometimes gets in the way of encoding non-constructive mathematics, but it’s not usually a problem.

How do we actually build such a there-exists type in Agda? The construction is usually known as a sigma type, written Σ and input as \GS. Its definition is given by:

    module Definition-DependentPair where
      open Chapter3-Proofs

      record Σ (A : Set ℓ₁) (B : A → Set ℓ₂)
          : Set (lsuc (ℓ₁ ⊔ ℓ₂)) where
        constructor _,_
        field
          proj₁ : A
          proj₂ : B proj₁  -- 1
This definition should feel reminiscent of the tuple type we built in sec. 1.15. Despite the gnarly universe polymorphism—which the typechecker can help you write for yourself, even if you don’t know what it should be—the only difference between Σ and _×_ from earlier is in the second parameter. To jog your memory, we can redefine tuples here:

    record _×_ (A : Set ℓ₁) (B : Set ℓ₂)
        : Set (lsuc (ℓ₁ ⊔ ℓ₂)) where
      constructor _,_
      field
        proj₁ : A
        proj₂ : B
Contrasting the two, we see that in Σ, the B parameter is indexed by A, while in _×_, B exists all by itself. This indexing is extremely useful, as it allows us to encode a property about the A inside of B. As you can see at 1, proj₂ has type B proj₁—in other words, the type of proj₂ depends on the value we put in for proj₁! As an example, we can give our earlier example again—typing ∃ as \ex:

    ∃n,n+1≡5 : Σ ℕ (λ n → n + 1 ≡ 5)
    ∃n,n+1≡5 = 4 , PropEq.refl
In the type of ∃n,n+1≡5, we give two arguments to Σ, the first indicating that there exists a ℕ, and the second being a function that describes which property holds for that particular ℕ. In the value, we need to give both the actual ℕ, and a proof that the property holds for it.

The Σ type is one of the most useful constructions in Agda, and we will see many examples of it in the upcoming sections. We will import it from the standard library for the remainder of this module:

    open import Data.Product
      using (Σ; _,_)
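Because a Σ value carries its witness around, we can always project it back out. The sketch below shows this with the standard library's Σ; the names ∃-half-of-ten, witness, and evidence are made up for this example, as is the module name.

```agda
module SigmaSketch where  -- hypothetical module name

open import Data.Nat using (ℕ; _+_)
open import Data.Product using (Σ; _,_; proj₁; proj₂)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

-- "There exists an n such that n + n = 10": the number, then the proof.
∃-half-of-ten : Σ ℕ (λ n → n + n ≡ 10)
∃-half-of-ten = 5 , refl

-- The first projection recovers the number we chose...
witness : ℕ
witness = proj₁ ∃-half-of-ten

_ : witness ≡ 5
_ = refl

-- ...and the second is a proof about exactly that number.
evidence : witness + witness ≡ 10
evidence = proj₂ ∃-half-of-ten
```

This is the constructive reading of "there exists" in action: anyone holding the proof can ask which number it was about.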
4.3 Heterogeneous Binary Relations

One extremely common mathematical construction is the relation, which, in the canon, is used to categorize disparate things like functions, equalities, orderings, and set membership, to name a few. Let’s begin with the mathematical definition, and decompile it into something more sensible for our purposes.

A binary relation _R_ over sets 𝑋 and 𝑌 is a subset of the Cartesian product 𝑋 × 𝑌.
As we saw in sec. 2.5 when discussing IsEven, subsets are best encoded in Agda as functions into Set. Taken at face value, this would make a relation have type {A B : Set} → A × B → Set. We can do slightly better, however, by recalling the curry/uncurry isomorphism (sec. 1.20) and splitting the explicit Cartesian product into two arguments. Such a transformation results then in {A B : Set} → A → B → Set.

A fully-fledged solution here must be level polymorphic, since many of the relations we’d like to be able to encode will be over higher-level sets. There are actually three levels required here, one for A, another for B, and a third for the resulting Set. Thus, we come up with our final definition as REL:

    module Sandbox-Relations where
      REL : Set a → Set b → (ℓ : Level) → Set (a ⊔ b ⊔ lsuc ℓ)
      REL A B ℓ = A → B → Set ℓ
Notice that the ℓ parameter is not implicit, as is usually the case with Levels. This is because REL is almost always used as a type, in which case we still need to give a level for the resulting set. Thankfully, we can still infer levels a and b from the Set parameters to REL.

This REL is the type of heterogeneous relations, that is, relationships between two distinct sets. Relations are required to satisfy no laws whatsoever, meaning anything that can typecheck as a REL is in fact a relation. In a very real way, this means that relations are themselves meaningless and completely devoid of any semantics; they exist solely as a method of organization.

To get a feel for how loosey-goosey relations are, we can define a few for ourselves. There is the vacuous relation which relates no values:

    data Unrelated : REL A B lzero where

and the trivial relation which relates all values:

    data Related : REL A B lzero where
      related : {a : A} {b : B} → Related a b
While being “boring” relations, at least these two are principled. We can also define arbitrary relations:

    data Foo : Set where
      f1 f2 f3 : Foo

    data Bar : Set where
      b1 b2 b3 : Bar

    data FooBar : REL Foo Bar lzero where
      f2-b2 : FooBar f2 b2
      f2-b2′ : FooBar f2 b2
      f2-b : (b : Bar) → FooBar f2 b
      f3-b2 : FooBar f3 b2
Don’t try to make sense of FooBar, I just made something up. This relation does illustrate, however, that two values can be related in many different ways. That is, we have three different ways of constructing a FooBar f2 b2—via f2-b2, f2-b2′, or f2-b b2. But there is only one Bar that f3 is related to, and poor f1 has no relations at all! Our conclusion is that relations are too unconstrained to be interesting. Of course, there do exist interesting particular relations, but as a class of thing they are “too big.” Much like how you can represent any value of any type as a string if you are willing to suffer hard enough, many things in math can be considered relations. But it’s the constraints that necessarily make things interesting.
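The claims in the paragraph above can each be checked by Agda. Here is a self-contained sketch that repeats the FooBar definitions (using the standard library's REL and an empty type ⊥ for "no proof exists"); the names via-f2-b2, f3-only-b2, f1-lonely and the module name are invented for illustration.

```agda
module FooBarSketch where  -- hypothetical module name

open import Agda.Primitive using (lzero)
open import Data.Empty using (⊥)
open import Relation.Binary using (REL)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

data Foo : Set where
  f1 f2 f3 : Foo

data Bar : Set where
  b1 b2 b3 : Bar

data FooBar : REL Foo Bar lzero where
  f2-b2  : FooBar f2 b2
  f2-b2′ : FooBar f2 b2
  f2-b   : (b : Bar) → FooBar f2 b
  f3-b2  : FooBar f3 b2

-- Three different proofs that f2 is related to b2.
via-f2-b2 : FooBar f2 b2
via-f2-b2 = f2-b2

via-f2-b2′ : FooBar f2 b2
via-f2-b2′ = f2-b2′

via-f2-b : FooBar f2 b2
via-f2-b = f2-b b2

-- f3 is related only to b2: any proof forces b to be b2.
f3-only-b2 : {b : Bar} → FooBar f3 b → b ≡ b2
f3-only-b2 f3-b2 = refl

-- f1 is related to nothing: no constructor mentions it,
-- so an absurd pattern discharges every case.
f1-lonely : {b : Bar} → FooBar f1 b → ⊥
f1-lonely ()
```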
4.4 The Relationship Between Functions and Relations

The most salient heterogeneous relationship is the “function.” I’ve added the scare quotes here because while classical mathematicians will define a function in terms of a relation when pressed, this is categorically the wrong way to think about things. This is in stark contrast to constructivists and computer scientists, who take the function as a fundamental building block and use it to define relations—insofar as they care about relations at all.

Nevertheless, we will see how to think about functions as relations, and vice versa. Doing so requires us to think of a function as a relation between the inputs on one side and the outputs on the other. We can transform a function f into a relation FnToRel f such that x is in relation to y only when 𝑓 𝑥 = 𝑦. The definition is a bit silly, but it works out nevertheless.

To build such a thing, we can use a data type that maps from f into REL A B lzero. If you’ll excuse the cute notation,² we can call such a thing _maps_↦_ using \r-| for the ↦ symbol.

    data _maps_↦_ (f : A → B) : REL A B lzero where
      app : {x : A} → f maps x ↦ f x
Believe it or not, this is everything we need. We can now show that the not function relates false and true in both directions:

    _ : not maps false ↦ true
    _ = app

    _ : not maps true ↦ false
    _ = app
but it doesn’t relate false to itself:

    _ : not maps false ↦ false
    _ = app
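Rather than merely watching that definition fail, we can state the negative fact and have Agda accept it. This is a sketch under a few assumptions: it re-declares _maps_↦_ over plain Sets so that the block stands alone, it uses the standard library's empty type ⊥ to express "there is no such proof", and the name not-false↦false is made up.

```agda
module MapsSketch where  -- hypothetical module name

open import Agda.Primitive using (lzero)
open import Data.Bool using (Bool; true; false; not)
open import Data.Empty using (⊥)
open import Relation.Binary using (REL)

data _maps_↦_ {A B : Set} (f : A → B) : REL A B lzero where
  app : {x : A} → f maps x ↦ f x

-- The only constructor would need x = false and not x = false at once;
-- since not false computes to true, Agda sees the clash and lets us
-- dismiss the argument with an absurd pattern.
not-false↦false : not maps false ↦ false → ⊥
not-false↦false ()
```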
Transforming a relation back into a function is harder, as functions have significantly more structure than relations do. In particular, we require two constraints on any relation we’d like to transform into a function:

1. for every value on the left of the relation, there is at most one value on the right, and
2. every value in the left type is related to something in the right.

These properties are called functionality and totality respectively. In order to produce a function out of a relation, we must first show the relation has both properties, and thus it will do for us to define them.

More formally, the functional property states that if 𝑥 ∼ 𝑦 and 𝑥 ∼ 𝑧, then it must be the case that 𝑦 = 𝑧. We can encode this in Agda by mapping from REL into Set, which you can think of as a function which takes a relation and produces the necessary constraint it must satisfy.

    Functional : REL A B ℓ → Set _  -- 1
    Functional {A = A} {B = B} _~_  -- 2
      = {x : A} {y z : B}
      → x ~ y → x ~ z
      → y ≡ z

² Frankly, half of the fun in writing Agda is coming up with good notation like this. I’ve tried to restrain myself throughout the book, but this one was too delightful to ignore.

CAUTION
Notice that at 1 in the definition of Functional we have given the resulting Level as _—asking Agda to do the work of inferring it for us. It gets correctly inferred as a ⊔ b ⊔ ℓ, but due to a misfeature in how Agda handles variables, we are unable to write this for ourselves! In brief, the problem arises because variables get freshly instantiated every time they are used, meaning the a in the definition of A is not the same as the a we’d like to write here. Furthermore, there is no way to directly get our hands on the proper a. It’s stupid.

At 2, pay attention to how we can bind the A and B variables as if they were just regular implicit parameters to Functional. That’s because they are just regular implicit parameters—as long as they are mentioned directly in the type. Modulo the footgun above, usage of variables can dramatically improve code’s readability.

The total property says that for every 𝑥, there must exist some 𝑦 such that 𝑥 ∼ 𝑦. As before in sec. 4.2, we can turn this “there exists” into a Σ type:

    Total : REL A B ℓ → Set _
    Total {A = A} {B = B} _~_
      = (x : A) → Σ B (λ y → x ~ y)
Given Functional and Total, we’re now ready to turn our relation back into a function:

    relToFn : (_~_ : REL A B ℓ)
            → Functional _~_
            → Total _~_
            → A → B
    relToFn _~_ _ total x
      with total x
    ... | y , _ = y
As it happens, this implementation doesn’t actually use the Functional _~_ argument, but its existence in the type is necessary to ensure we didn’t just pick an arbitrary output from the Total property. Notice how cool it is that we can define relToFn without ever giving any actual implementations of Functional or Total. As we get deeper into doing math in Agda, most of the work we do will be of this form: put together some definitions, and assume we have something that satisfies the definition, and use that to show what we intend to. Very rarely do we actually need to get our hands dirty and give any implementations.
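To close the loop, we can feed the relation built from a function back through relToFn and check that we recover the function we started with. The sketch below is self-contained and makes some simplifying assumptions: everything is restricted to plain Set (no level polymorphism), relToFn is written with proj₁ instead of a with abstraction, and the names maps-functional, maps-total, and roundtrip are invented here.

```agda
module RelToFnSketch where  -- hypothetical module name

open import Agda.Primitive using (lzero)
open import Data.Bool using (Bool; true; false; not)
open import Data.Product using (Σ; _,_; proj₁)
open import Relation.Binary using (REL)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

-- Level-monomorphic copies of the definitions from this section.
data _maps_↦_ {A B : Set} (f : A → B) : REL A B lzero where
  app : {x : A} → f maps x ↦ f x

Functional : {A B : Set} → REL A B lzero → Set
Functional {A} {B} _~_ = {x : A} {y z : B} → x ~ y → x ~ z → y ≡ z

Total : {A B : Set} → REL A B lzero → Set
Total {A} {B} _~_ = (x : A) → Σ B (λ y → x ~ y)

relToFn : {A B : Set} (_~_ : REL A B lzero)
        → Functional _~_ → Total _~_ → A → B
relToFn _~_ _ total x = proj₁ (total x)

-- The relation induced by any function is functional...
maps-functional : {A B : Set} (f : A → B) → Functional (_maps_↦_ f)
maps-functional f app app = refl

-- ...and total, with f x itself as the witness.
maps-total : {A B : Set} (f : A → B) → Total (_maps_↦_ f)
maps-total f x = f x , app

-- Going from a function to a relation and back changes nothing.
roundtrip : {A B : Set} (f : A → B) (x : A)
          → relToFn (_maps_↦_ f) (maps-functional f) (maps-total f) x ≡ f x
roundtrip f x = refl

_ : relToFn (_maps_↦_ not) (maps-functional not) (maps-total not) true ≡ false
_ = refl
```

Here, at last, are two actual implementations of Functional and Total, which relToFn itself never needed in order to be defined.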
4.5 Homogeneous Relations

The relations we’re much more familiar with are homogeneous—those which relate two elements of the same type. It is under this category that things like equality and orderings fall. You will not be surprised to learn that homogeneous relations are a special case of heterogeneous ones. We will name such a thing Rel, which comes with one fewer parameter:

    Rel : Set a → (ℓ : Level) → Set (a ⊔ lsuc ℓ)
    Rel A ℓ = REL A A ℓ
As an illustration of Rel, we previously defined propositional equality in this way:

    module Example₂ where
      data _≡_ {A : Set a} : A → A → Set a where
        refl : {x : A} → x ≡ x

but we could have instead given it this type—stressing the fact that it is a homogeneous relation:

    data _≡_ {A : Set a} : Rel A a where
      refl : {x : A} → x ≡ x
We will study more constrained (read: interesting) examples of homogeneous relations in the remainder of this chapter, alongside useful applications of, and constructions over, them.

4.6 Standard Properties of Relations
It’s a good habit to look for what generalizes whenever you notice a connection to something you already understand. In this case, how much of our understanding of propositional equality lifts to relations in general?

Recall the three properties we showed about propositional equality: reflexivity, symmetry, and transitivity. Reflexivity is the notion that every element is equal to itself. Symmetry states that the left and right sides of equality are equivalent, and therefore that we can swap between them at will. Transitivity gives us a notion of composition on equality, saying that we can combine two proofs of equality into one, if they share an identical member between them.

In order to generalize these properties, we need only replace the phrase “is equal to” with “is in relation with.” Not every relation satisfies each of these properties of course, but having some shared vocabulary gives us things to look out for when designing our own relations.

The first step is to formalize each of these notions in the flavor of Functional and Total above. We can encode reflexivity as a proposition stating that all elements are related to themselves:

    Reflexive : Rel A ℓ → Set _
    Reflexive {A = A} _~_
      = {x : A} → x ~ x
Similarly, symmetry is nothing other than a function which swaps the two sides of the relation:

    Symmetric : Rel A ℓ → Set _
    Symmetric {A = A} _~_
      = {x y : A} → x ~ y → y ~ x

and transitivity merely glues two related terms together if they share one side in common:

    Transitive : Rel A ℓ → Set _
    Transitive {A = A} _~_
      = {x y z : A} → x ~ y → y ~ z → x ~ z
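As a first test of these definitions, propositional equality itself should satisfy all three. The sketch below checks this using the standard library's versions of Reflexive, Symmetric, and Transitive (which say the same thing as the ones above) at the type ℕ; the three names on the left are made up for this example.

```agda
module EqualityPropertiesSketch where  -- hypothetical module name

open import Data.Nat using (ℕ)
open import Relation.Binary using (Reflexive; Symmetric; Transitive)
open import Relation.Binary.PropositionalEquality
  using (_≡_; refl; sym; trans)

-- Each property is witnessed directly by a familiar proof term.
≡-is-reflexive : Reflexive (_≡_ {A = ℕ})
≡-is-reflexive = refl

≡-is-symmetric : Symmetric (_≡_ {A = ℕ})
≡-is-symmetric = sym

≡-is-transitive : Transitive (_≡_ {A = ℕ})
≡-is-transitive = trans
```

Nothing about ℕ is used here; it merely fixes the implicit type argument of _≡_ so that each signature is unambiguous.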
Now that we have some common things to look for, let’s dive into designing some new relations and see what shakes out.
4.7 Attempting to Order the Naturals

We have now spent several chapters discussing numbers and equality, but what about concepts like “less than or equal to?” Orderings like these are relations in their own regard, and as you might expect, they are just as amenable to formalization in Agda as their more exact counterparts.

The first thing to notice is that this is not a general notion—it is very much tied to the natural numbers. We can’t build generic machinery that would allow us to say a value of some arbitrary type is less than some other value of the same. While there are many types that do admit the notion of an ordering relationship, the nature of that relationship must be specialized for each type. Besides, we don’t even have a guarantee such an ordering would be unique—for example, we might choose to order strings lexicographically or by length. One might be the more familiar choice, but it’s hard to argue that one is more correct than the other.

A surprising amount of care is required in order to implement an ordering on the natural numbers. There are many gotchas here that serve to illustrate a valuable lesson in designing types in Agda, and so it is worthwhile to go slowly, take our time, and learn what can go wrong.

How can we prove that one number is less than or equal to another? Recall that there do not exist any negative natural numbers, so one possible means is to say that 𝑥 ≤ 𝑦 if there exists some 𝑧 such that 𝑥 + 𝑧 = 𝑦. We can set this up, first by importing our previously-defined machinery directly from the standard library:

    open import Relation.Binary
      using (Rel; Reflexive; Transitive; Symmetric)
With surprising prescience, I can tell you that our first attempt at implementing _≤_ (\le) is going to fail, so let’s make a new module and define our type:

    module Naive-≤₁ where
      data _≤_ : Rel ℕ lzero where
        lte : (a b : ℕ) → a ≤ a + b
      infix 4 _≤_
To a first approximation, it seems to work:

    _ : 2 ≤ 5
    _ = lte 2 3
Indeed, Agda can even solve the above definition for us via Auto (C-c C-a in Emacs and VS Code). One of the few things we can prove about _≤_ defined in this way is that suc is monotonic—that is, that if x ≤ y, then suc x ≤ suc y:

    suc-mono : {x y : ℕ} → x ≤ y → suc x ≤ suc y
    suc-mono (lte x y) = lte (suc x) y
If you attempted to write this for yourself, you might have been surprised that Refine (C-c C-r in Emacs and VS Code) refused to introduce the lte constructor, instead complaining about “no introduction forms found.” This is a little surprising, since the above definition does in fact work. Let’s agree to scratch our collective heads and hope nothing else weird happens.

Something else weird does in fact happen when we try to show ≤-refl—which we should be able to do by picking 𝑦 = 0:

    ≤-refl : Reflexive _≤_
    ≤-refl {x} = lte x 0

Giving this definition results in an error from Agda:

INFO WINDOW
    x + 0 != x of type ℕ
    when checking that the expression lte x 0 has type x ≤ x
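The asymmetry behind this error is easy to miss: _+_ recurses on its first argument, so 0 + x computes to x while x + 0 is stuck when x is a variable. The sketch below contrasts the two; it repeats the naive _≤_ so that it stands alone, and the names 0≤n and n≤n+0 (and the module name) are invented for illustration.

```agda
module NaiveLeSketch where  -- hypothetical module name

open import Agda.Primitive using (lzero)
open import Data.Nat using (ℕ; zero; suc; _+_)
open import Relation.Binary using (Rel)

data _≤_ : Rel ℕ lzero where
  lte : (a b : ℕ) → a ≤ a + b
infix 4 _≤_

-- Accepted: lte 0 n has type 0 ≤ 0 + n, and 0 + n reduces to n.
0≤n : (n : ℕ) → 0 ≤ n
0≤n n = lte 0 n

-- Also accepted, but only because the type says n + 0 rather than n;
-- Agda will not silently rewrite n + 0 to n for us.
n≤n+0 : (n : ℕ) → n ≤ n + 0
n≤n+0 n = lte n 0
```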
Unperturbed, we can try hitting ≤-refl with some of our other proof techniques, and see if we can make progress on it in that way. Let’s proceed with naught but brute force and ignorance, seeing if we can nevertheless bend Agda to our will. Try running MakeCase with argument x (C-c C-c in Emacs and VS Code):

    ≤-refl : Reflexive _≤_
    ≤-refl {zero} = {! !}
    ≤-refl {suc x} = {! !}
It’s easy to fill the first hole:

    ≤-refl : Reflexive _≤_
    ≤-refl {zero} = lte zero zero
    ≤-refl {suc x} = {! !}
This remaining goal has type suc x ≤ suc x, which sounds like the sort of thing we need recursion to solve. So we can introduce a with abstraction:

    ≤-refl : Reflexive _≤_
    ≤-refl {zero} = lte zero zero
    ≤-refl {suc x} with ≤-refl {x}
    ... | x≤x = ?
giving us x≤x whose type is, appropriately, x ≤ x. The usual move here would be to pattern match on x≤x to open up its lte constructor, insert a suc, and be on our merry way. Putting that plan into action, however, immediately goes awry when we run MakeCase with argument x≤x (C-c C-c in Emacs and VS Code):

INFO WINDOW
    I'm not sure if there should be a case for the constructor lte,
    because I get stuck when trying to solve the following unification
    problems (inferred index ≟ expected index):
      x₁ ≟ x₂
      x₁ + y ≟ x₂
    Possible reason why unification failed:
      Cannot solve variable x₁ of type ℕ with solution x₁ + y because
      the variable occurs in the solution, or in the type of one of
      the variables in the solution.
    when checking that the expression ? has type suc x ≤ suc x
Yikes! Something has gone horribly, horribly wrong. Let’s turn our attention to this problem momentarily, but out of sheer cheekiness, we can complete the proof nevertheless. Spotting that x≤x has a satisfactory type for us to invoke suc-mono is sufficient to make progress and fill our final hole:

    ≤-refl : Reflexive _≤_
    ≤-refl {zero} = lte zero zero
    ≤-refl {suc x} with ≤-refl {x}
    ... | x≤x = suc-mono x≤x
4.8 Substitution

A surprising number of things went wrong when putting together such a simple proof! Let’s together analyze each of them in order to see what exactly happened. Recall our original implementation which we assumed would work:

    ≤-refl : Reflexive _≤_
    ≤-refl {x} = lte x 0

However, Agda gave us this error instead:

INFO WINDOW
    x + 0 != x of type ℕ
    when checking that the expression lte x 0 has type x ≤ x
The problem here is that lte x 0 has type x ≤ x + 0. From our discussion in sec. 3.5, we saw just how much work it was to convince Agda that 𝑥 = 𝑥 + 0—we had to go through all the work of proving +-identityʳ! Thankfully, that work is not lost to us, and we can reuse it here by way of some standard (if heavy-handed) machinery for rewriting propositional equalities at the level of types. This machinery is called subst, short for substitution:

    open Chapter3-Proofs
      using (+-identityʳ)

    subst
        : {x y : A}
        → (P : A → Set ℓ)  -- 1
        → x ≡ y
        → P x → P y
    subst _ PropEq.refl px = px
You can think of subst as a type-level cong, as it serves the same purpose. At marker 1, it takes an argument P which is responsible for pointing out where you’d like the substitution to happen, completely analogous to the function we gave to cong for targeting where the rewrite should occur. To illustrate the use of subst, we can reimplement ≤-refl in terms of it, though the experience is decidedly less than wholesome:

    ≤-refl′ : Reflexive _≤_
    ≤-refl′ {x} = subst (λ φ → x ≤ φ) (+-identityʳ x) (lte x 0)
It’s nice to know that subst exists, but as a good rule of thumb, it’s usually the wrong tool for the job. When you find yourself reaching for subst over and over again, it’s indicative that you’ve painted yourself into a corner and written a bad definition somewhere. Requiring substitution is usually a symptom of an upstream problem.
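To see the same transport in a setting closer to everyday code, here is a minimal sketch that assumes the Agda standard library rather than this book's own modules. The name fix-length is hypothetical and exists only for illustration; subst and +-identityʳ are the standard library's counterparts of the definitions above.

```agda
open import Data.Nat using (ℕ; _+_)
open import Data.Nat.Properties using (+-identityʳ)
open import Data.Vec using (Vec)
open import Relation.Binary.PropositionalEquality using (subst)

-- Hypothetical helper: a vector of length n + 0 is not definitionally a
-- vector of length n, so we transport it along the proof n + 0 ≡ n.
-- The motive (Vec A) says where in the type the rewrite should happen.
fix-length : {A : Set} {n : ℕ} → Vec A (n + 0) → Vec A n
fix-length {A} {n} xs = subst (Vec A) (+-identityʳ n) xs
```

As in the text, needing fix-length at all suggests the index n + 0 was a poor choice upstream; a definition that produced Vec A n directly would need no transport.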
4.9 Unification
But not every problem we saw when implementing ≤-refl for the first time can be solved via subst. Recall our attempt to pattern match on x≤x in the following:

    ≤-refl : Reflexive _≤_
    ≤-refl {zero} = lte zero zero
    ≤-refl {suc x} with ≤-refl {x}
    ... | x≤x = ?
to which Agda replied:

INFO WINDOW
    I'm not sure if there should be a case for the constructor lte
For goodness’ sake, of course there should be a case for the constructor lte; it’s the only constructor after all! Our indignation is well deserved, but it’s more instructive to think about what has gone wrong here, and what we can do about it. Agda is usually really good at pattern matching, eliding impossible patterns whenever the constructor can’t possibly match. In this case, however, Agda somehow can’t decide whether the lte constructor should definitely be there, or whether it definitely shouldn’t be. How can this be so?
Internally, Agda implements this functionality by attempting to unify (that is, to match syntactically) the indices on the type’s constructors with the indices of your expression. In this case, we have x≤x : x ≤ x, which Agda needs to unify against lte, whose eventual indices are ?a ≤ ?a + ?b (after some renaming to avoid confusion). Doing so sets up the following series of equations that Agda must solve:

    ?a ~ x
    ?a + ?b ~ x

where we read ~ as “unifies to.” In order to correctly determine if a constructor needs to exist in a pattern match, Agda must be able to syntactically assign an expression to each metavariable (here, ?a and ?b). While we can use the first equation to unify ?a with x, there is no way to syntactically unify ?a + ?b with x. Even after replacing ?a, we get x + ?b ~ x. The problem is that there’s no syntactic way to get the ?b term all on its own in the equation x + ?b ~ x.

You and I know that the only solution to this problem is that ?b = 0, but this is a statement about number theory, and Agda doesn’t know anything about number theory. The pattern checker knows only about syntax and computation, neither of which makes progress here. Since there is no way to solve the unification problem x + ?b ~ x, Agda throws up its hands and calls uncle. Unfortunately, it chooses to do so with an extremely unhelpful error.

One possible solution here would be for Agda to simply allow you to give cases that it can’t be sure about, but this leads to downstream typechecking issues that would make the implementation of Agda significantly harder. Since the reasons you might want to do this as a user are dubious at best, Agda doesn’t support it, and requires you to find alternative ways to convince the language that you are doing meaningful things. We will not investigate those alternative ways here, except to point out how to avoid the situation altogether.
4.10 Overconstrained by Dot Patterns
One last subtle point about unification: rather surprisingly, we successfully implemented suc-mono without encountering the dreaded “not sure if there should be a case” problem. How can that have happened? We can get a feeling for the unification algorithm behind the scenes by explicitly binding our implicit arguments:

    suc-mono′ : {x y : ℕ} → x ≤ y → suc x ≤ suc y
    suc-mono′ {x} {y} x≤y = ?
Doing a MakeCase with argument x≤y (C-c C-c in Emacs and VS Code) in this hole will correctly split apart the x≤y, but in doing so, will also leave behind dot patterns for variables that it unified in the process. Recall that dot patterns arise from a constructor showing you which indices it must have, and constraining other variables in the process. Thus, dot patterns are an excellent way to look at what exactly Agda has solved:

    suc-mono′ : {x y : ℕ} → x ≤ y → suc x ≤ suc y
    suc-mono′ {x} {.(x + b)} (lte .x b) = lte (suc x) b
It’s worth going through the solved constraints here. In splitting lte, Agda introduced two new variables, a and b, subject to the constraints:

    a ~ x
    a + b ~ y

There is a solution here, namely:

    a ~ x
    y ~ x + b

which corresponds exactly to how Agda filled in the dot patterns above. Rather interestingly, we can implement a monomorphic version of suc-mono′ by restricting its type:
    suc-mono-mono : {x : ℕ} → x ≤ x → suc x ≤ suc x
    suc-mono-mono = suc-mono′
but we cannot inline the definition of suc-mono′ here, since we will get the “not sure” error. Looking at the constraints Agda must solve immediately shows us the problem:

    a ~ x
    a + b ~ x
There’s simply no way to solve this system of equations just by substituting variables with one another. We are required to express the constraint x ~ a + b somewhere in the pattern match, but the only variable that isn’t already spoken for is b itself, and we don’t have b isolated in our equation. Thus, the constraint can’t be satisfied, and therefore we are stuck.

The Agda folklore warns that one ought not use computing terms (that is to say, anything other than constructors) as type indices, for exactly this reason. This happens to be true, but as we have seen above, it’s not the whole story. The problem is not with computation per se; it’s that when you pattern match and bring these constraints into scope, they don’t work out to nice constructors that Agda can immediately pattern match on. Instead, Agda’s only recourse is to introduce a dot pattern, which reifies the computation, but at the cost of eliminating one of your bindings, that is, by removing a degree of freedom. When you run out of bindings, Agda has nowhere to reify these additional constraints, and you get the dreaded “I’m not sure if there should be a case” error.

The takeaway here is that type indices should always be bindings or constructors, but never function calls. Using a function call as an index risks running out of places to put the indices and will prevent Agda from being able to pattern match on your type. This is a particularly insidious problem because the errors happen far away from the definition, and can be hard to diagnose without constant vigilance.
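The contrast between the two situations can be reproduced in a few lines. The sketch below assumes that the chapter's first attempt had a single constructor lte : (a b : ℕ) → a ≤ a + b, as the error messages suggest; the relation is renamed _≤ᵃ_ here (a hypothetical name) so the snippet stands alone against the standard library.

```agda
open import Data.Nat using (ℕ; suc; _+_)

-- Assumed shape of the first attempt: the second index is the function
-- call a + b, not a variable or a constructor.
data _≤ᵃ_ : ℕ → ℕ → Set where
  lte : (a b : ℕ) → a ≤ᵃ (a + b)

-- Two independent indices x and y. Agda solves x := a and y := a + b
-- (the latter as a dot pattern), so the match goes through.
suc-mono : {x y : ℕ} → x ≤ᵃ y → suc x ≤ᵃ suc y
suc-mono (lte a b) = lte (suc a) b

-- Only one index x appears on both sides. After x := a there is no
-- variable left to absorb a + b, so uncommenting this clause is expected
-- to produce the "not sure if there should be a case" error.
-- stuck : {x : ℕ} → x ≤ᵃ x → suc x ≤ᵃ suc x
-- stuck (lte a b) = {!!}
```

The first clause checks because suc a + b reduces to suc (a + b) by the definition of _+_, which is exactly the computation the dot pattern records.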
4.11 Ordering the Natural Numbers
Having worked through this extremely long digression on the nature of Agda’s ability to perform pattern matching, we should now see what’s gone wrong with our first definition of _≤_ and know how to fix it. The solution here is to prevent Agda from introducing dot patterns, and the simplest way to do that is to only ever use constructors as indices to your data type.

A good way to proceed here is to work backwards; starting from each constructor, we can determine how to use it in order to show our desired less-than-or-equal-to relationship. The case of zero is easy: since zero is the smallest element, we have zero ≤ n for any number n! In the case of suc, we know that suc m ≤ suc n if and only if m ≤ n in the first place. This gives rise to a very natural type:

    module Definition-LessThanOrEqualTo where
      data _≤_ : Rel ℕ lzero where
        z≤n : {n : ℕ} → zero ≤ n
        s≤s : {m n : ℕ} → m ≤ n → suc m ≤ suc n
      infix 4 _≤_
This does happen to be the right[3] definition for _≤_. As in other chapters, let’s drop out of this definition module and import the same thing from the standard library. In doing so, we will ensure everything else we build will play nicely with future chapters and any other Agda code you might want to write against the standard library itself.

    open import Data.Nat
      using (_≤_; z≤n; s≤s)

    module Sandbox-≤ where
Let’s now again prove that 2 ≤ 5. Begin with a quick type:

    _ : 2 ≤ 5
    _ = ?
Asking Agda to Refine (C-c C-r in Emacs and VS Code) this hole has it use the s≤s constructor:

    _ : 2 ≤ 5
    _ = s≤s {! !}
Something interesting has happened here. Invoke TypeContext (C-c C-, in Emacs and VS Code) on the new hole, and you will see it has type 1 ≤ 4! By using s≤s, Agda has moved both sides of the inequality closer to zero. It makes sense when you stare at the definition of s≤s, but it’s a rather magical thing to behold for the first time. Throw another s≤s in the hole:

    _ : 2 ≤ 5
    _ = s≤s (s≤s {! !})
whose new hole now has type 0 ≤ 3. From here, the constructor z≤n now fits, which completes the definition:

    _ : 2 ≤ 5
    _ = s≤s (s≤s z≤n)

[3] Standard-library approved.
Thankfully, all our hard work now pays off, as we are able to implement our desired suc-mono and ≤-refl without any further hassle.

Exercise (Trivial) Prove suc-mono : {x y : ℕ} → x ≤ y → suc x ≤ suc y.

Solution

    suc-mono : {x y : ℕ} → x ≤ y → suc x ≤ suc y
    suc-mono = s≤s
Exercise (Easy) Prove ≤-refl : {x : ℕ} → x ≤ x.

Solution

    ≤-refl : {x : ℕ} → x ≤ x
    ≤-refl {zero} = z≤n
    ≤-refl {suc x} = s≤s ≤-refl
Exercise (Easy) Prove ≤-trans : {x y z : ℕ} → x ≤ y → y ≤ z → x ≤ z.

Solution

    ≤-trans : {x y z : ℕ} → x ≤ y → y ≤ z → x ≤ z
    ≤-trans z≤n y≤z = z≤n
    ≤-trans (s≤s x≤y) (s≤s y≤z) = s≤s (≤-trans x≤y y≤z)
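One payoff of indexing only by variables and constructors is that Agda can now decide, by unification alone, which cases are possible. The following sketch works against the standard library's _≤_; the names 1≰0 and ≤-pred′ are illustrative choices, not definitions from this chapter.

```agda
open import Data.Nat using (ℕ; zero; suc; _≤_; z≤n; s≤s)
open import Relation.Nullary using (¬_)

-- Neither constructor can produce 1 ≤ 0: z≤n needs zero on the left and
-- s≤s needs suc on the right. Agda sees this and accepts the absurd pattern.
1≰0 : ¬ (1 ≤ 0)
1≰0 ()

-- Going the other way, only s≤s can produce suc m ≤ suc n, so a single
-- clause suffices and no dot-pattern bookkeeping is left to the user.
≤-pred′ : {m n : ℕ} → suc m ≤ suc n → m ≤ n
≤-pred′ (s≤s m≤n) = m≤n
```

With the earlier lte-based definition, the same questions would have required reasoning about whether a + b could equal a given number, which is the kind of arithmetic the pattern checker cannot do.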
4.12 Preorders
As humans, we are naturally drawn to order, structure, and patterns, and thus the properties of reflexivity and transitivity can seem mundane to us. But this is a fact about the human mind, not about which mathematical properties are interesting! Being able to spot “mundane” properties like these is exactly what makes them of note. There is in fact a great amount of structure hidden inside of reflexivity and transitivity; the crushing majority of relations do not satisfy these two properties. Those that do are called preorders:

    module Sandbox-Preorders where
      open Sandbox-≤

      record IsPreorder {A : Set a} (_~_ : Rel A ℓ) : Set (a ⊔ ℓ) where
        field
          refl : Reflexive _~_
          trans : Transitive _~_
Unlike other constructions we’ve built for ourselves, we will not prefer to get this one from the standard library. The standard library heads off into astronaut territory when it comes to structures like this, generalizing away from a hardcoded dependency on propositional equality to taking some notion of equality as a parameter. We will investigate what exactly is going on there when we discuss setoids in sec. 7.9. But that is a topic far in the future, and for now, we will deal exactly with IsPreorder as it’s defined here.

We have already seen three preorders in this book, perhaps without even realizing it. Of course, _≤_ forms one:

    ≤-preorder : IsPreorder _≤_
    IsPreorder.refl ≤-preorder = ≤-refl
    IsPreorder.trans ≤-preorder = ≤-trans
as does _≡_, though we need to be a little careful in showing it. The most salient issue in showing IsPreorder _≡_ is that, given our new definition of IsPreorder, the identifiers refl and trans are no longer unambiguous. Agda just isn’t sure if we want the refl constructor for propositional equality, or refl from IsPreorder, and similar problems arise for trans. An easy solution is to give qualified identifiers for the particular things we’d like. We can give the alias PropEq to Chapter3-Proofs (the module where we first defined refl and trans) by way of the following syntax:

which now gives us unambiguous access to PropEq.refl and PropEq.trans:

    ≡-preorder : IsPreorder (_≡_)
    IsPreorder.refl ≡-preorder = PropEq.refl
    IsPreorder.trans ≡-preorder = PropEq.trans
The other issue arising from a naive implementation of ≡-preorder can now be seen: it’s this bright saffron background on IsPreorder. Agda has failed to fill in an implicit on our behalf. What’s gone wrong is that _≡_ is polymorphic in the type for which it shows equality, and so Agda doesn’t know how we’d like to instantiate that polymorphism. In fact, we don’t, and would like to keep it polymorphic. This can be done by explicitly filling in _≡_’s implicit A parameter, which we’d conveniently like to fill in with our variable also named A:
    ≡-preorder : IsPreorder (_≡_ {A = A})
    IsPreorder.refl ≡-preorder = PropEq.refl
    IsPreorder.trans ≡-preorder = PropEq.trans
Exercise (Trivial) Should the reader be feeling industrious, they are encouraged to prove that Sandbox-Relations.Unrelated and Sandbox-Relations.Related are also preorders.

Solution

    open Sandbox-Relations using (Related; related)

    Related-preorder : IsPreorder (Related {A = A})
    IsPreorder.refl Related-preorder = related
    IsPreorder.trans Related-preorder _ _ = related
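As one more instance to experiment with, the flipped ordering on the naturals is also a preorder. This is only a sketch: it restates IsPreorder so that it stands alone, borrows ≤-refl and ≤-trans from the standard library instead of this chapter, and the names _≥′_ and ≥′-preorder are made up for the example.

```agda
open import Level using (Level; _⊔_; 0ℓ)
open import Relation.Binary using (Rel; Reflexive; Transitive)
open import Data.Nat using (ℕ; _≤_)
open import Data.Nat.Properties using (≤-refl; ≤-trans)

-- Restated from the text so the snippet is self-contained.
record IsPreorder {a ℓ : Level} {A : Set a} (_~_ : Rel A ℓ) : Set (a ⊔ ℓ) where
  field
    refl  : Reflexive _~_
    trans : Transitive _~_

-- Hypothetical relation: "greater than or equal to", defined by flipping _≤_.
_≥′_ : Rel ℕ 0ℓ
m ≥′ n = n ≤ m

-- Reflexivity carries over unchanged; transitivity composes the two
-- underlying ≤ proofs in the opposite order.
≥′-preorder : IsPreorder _≥′_
IsPreorder.refl  ≥′-preorder         = ≤-refl
IsPreorder.trans ≥′-preorder x≥y y≥z = ≤-trans y≥z x≥y
```

Nothing about the argument is specific to ℕ: flipping any preorder yields another preorder by the same two lines.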
4.13 Preorder Reasoning
In sec. 3.10, we built equational reasoning tools for working with propositional equality. Now that we know a little more, recall that our equational reasoning machinery used only reflexivity and transitivity. That is to say, we can generalize equational reasoning so that it works over any preorder whatsoever! Let’s quickly build this new reasoning for preorders. At any given point, we’re going to want to be working in only a single preorder, so we can define a new module parameterized by the preorder we’d like:

    module Preorder-Reasoning
        {_~_ : Rel A ℓ} (~-preorder : IsPreorder _~_) where
By opening the ~-preorder record, we can bring its record fields into scope. The syntax here is a little odd, since we need to first tell Agda the type of the record:

      open IsPreorder ~-preorder public
By opening it public, we ensure that refl and trans both “leak” in when we open Preorder-Reasoning. In essence, public makes it as if we explicitly defined the imported identifiers in this module, just like when we list out our accomplishments at the end of each chapter. The rest of the preorder reasoning machinery will be presented without further commentary, as there is nothing new here. The only changes are that we’ve replaced _≡_ with _~_, and renamed _≡⟨_⟩_ to _≈⟨_⟩_:

      begin_ : {x y : A} → x ~ y → x ~ y
      begin_ x~y = x~y
      infix 1 begin_

      _∎ : (x : A) → x ~ x
      _∎ x = refl
      infix 3 _∎

      _≡⟨⟩_ : (x : A) → {y : A} → x ~ y → x ~ y
      x ≡⟨⟩ p = p
      infixr 2 _≡⟨⟩_

      _≈⟨_⟩_ : (x : A) → ∀ {y z} → x ~ y → y ~ z → x ~ z
      _ ≈⟨ x~y ⟩ y~z = trans x~y y~z
      infixr 2 _≈⟨_⟩_
We would, however, like to make one addition to this interface: making it play nicely with propositional equality. If we happen to know that two terms are propositionally equal, it would be nice to be able to use that fact in a reasoning block. Thus, we also include _≡⟨_⟩_:

      _≡⟨_⟩_ : (x : A) → ∀ {y z} → x ≡ y → y ~ z → x ~ z
      _ ≡⟨ PropEq.refl ⟩ y~z = y~z
      infixr 2 _≡⟨_⟩_
Any code wanting to do equational reasoning over a preorder is now able to: it need only open the Preorder-Reasoning module using its proof of being a preorder (that is, IsPreorder) as an argument.
4.14 Reasoning over ≤
Let’s quickly prove a non-trivial fact about the natural numbers, namely that 𝑛 ≤ 1 + 𝑛. You should be able to do this sort of thing in your sleep by now:

    n≤1+n : (n : ℕ) → n ≤ 1 + n
    n≤1+n zero = z≤n
    n≤1+n (suc n) = s≤s (n≤1+n n)
We can further use this fact and our preorder reasoning in order to show that 𝑛 ≤ 𝑛 + 1:

    open Chapter3-Proofs using (+-comm)

    n≤n+1 : (n : ℕ) → n ≤ n + 1
    n≤n+1 n = begin
      n      ≈⟨ n≤1+n n ⟩  -- 1
      1 + n  ≡⟨ +-comm 1 n ⟩
      n + 1  ∎
      where open Preorder-Reasoning ≤-preorder
The proof here is fine, but the syntax leaves a little to be desired. Notice that at marker 1 we are required to use _≈⟨_⟩_ to show that n ≤ 1 + n. But (from the perspective of someone reading this code with fresh eyes) what the heck is ≈? We’re proving something about _≤_! While ≈ is a reasonable name for a generic preorder, many preorders have existing names that it would be preferable to reuse. In this case, we’d like to use ≤!

The trick, as usual, is to make a new module that publicly opens Preorder-Reasoning, using renaming to change whatever names need work. Furthermore, while we’re here, we might as well fill in the preorder parameter with ≤-preorder:

    module ≤-Reasoning where
      open Preorder-Reasoning ≤-preorder
        renaming (_≈⟨_⟩_ to _≤⟨_⟩_)
        public
By now using ≤-Reasoning directly, our proof is worthy of much more delight:

    n≤n+1 : (n : ℕ) → n ≤ n + 1
    n≤n+1 n = begin
      n      ≤⟨ n≤1+n n ⟩
      1 + n  ≡⟨ +-comm 1 n ⟩
      n + 1  ∎
      where open ≤-Reasoning
Don’t be shy in introducing helper modules to put a specific spin on more general notions. Their judicious use can dramatically improve the developer experience, whether the developer be you or a user of your library. Either way, the effort will be appreciated.
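For comparison, the standard library ships a module of the same name and flavour. The sketch below assumes a reasonably recent standard library, where Data.Nat.Properties exports n≤1+n, +-comm, and a ≤-Reasoning module; it repeats the proof above without any of this chapter's own machinery.

```agda
open import Data.Nat using (ℕ; _+_; _≤_)
open import Data.Nat.Properties using (n≤1+n; +-comm; module ≤-Reasoning)

-- Same statement and same two steps as in the text, but every
-- ingredient here comes from the standard library.
n≤n+1 : (n : ℕ) → n ≤ n + 1
n≤n+1 n = begin
  n      ≤⟨ n≤1+n n ⟩
  1 + n  ≡⟨ +-comm 1 n ⟩
  n + 1  ∎
  where open ≤-Reasoning
```

The library version is built differently under the hood (it also tracks strict steps), so treat the resemblance as one of interface rather than implementation.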
4.15 Graph Reachability
We have shown that both _≤_ and _≡_ form preorders. From this you might be tempted to think that preorders are just tools that are sorta like ordering or equality. Not so. Let’s look at another example to break that intuition. Consider a graph (as in a network, not like a plot). Math textbooks often begin their discussion around graphs with the telltale phrase:

    Let 𝐺 = (𝑉, 𝐸) be a graph with vertices 𝑉 and edges 𝐸.

Left completely unsaid in this introduction is that 𝐸 is in fact a relation on 𝑉; given a graph with vertices 𝑉, it really ought to be the case that the edges do actually lie between the vertices!

As a computer scientist, you probably have implemented a graph before at some point, whether it be via pointer-chasing or as an adjacency matrix. These are indeed encodings of graphs, but they are concessions to computability, which we are not particularly interested in. Playing with graphs in Agda requires only some set V and an edge relation _⇒_ (\=>) over it:

    module Reachability
        {V : Set ℓ₁} (_⇒_ : Rel V ℓ₂) where
What can we say about _⇒_? Does it satisfy any of the usual relation properties? Think on that question for a moment before continuing. Does _⇒_ satisfy any relation properties? The question is not even wrong. We can say nothing about _⇒_ other than what we have asked of it, since it’s a parameter, and thus opaque to us. Given the definition, all we can say for sure about _⇒_ is that it’s a relation over V!
However, what we can do is construct a new relation on top of _⇒_, and stick whatever we’d like into that thing. One salient example here is the notion of reachability: given a starting vertex on the graph, is there a path to some other vertex? The distinction between the relation _⇒_ and the reachable relation on top of it is subtle but important: while there is no single road that connects Vancouver to Mexico City, there is certainly a path that does!

When exactly is one vertex reachable from another? The easiest case is if we already have an edge in _⇒_ that connects two vertices. As a trivial case, two vertices are already connected if they are the same. Finally, if we know an intermediary vertex is reachable from our starting point, and that the goal is reachable from there, we can connect the two paths. This gives rise to a very straightforward definition:

      private variable
        v v₁ v₂ v₃ : V

      data Path : Rel V (ℓ₁ ⊔ ℓ₂) where
        ↪_      : v₁ ⇒ v₂ → Path v₁ v₂
        here    : Path v v
        connect : Path v₁ v₂ → Path v₂ v₃ → Path v₁ v₃
Where we can type ↪ by scrolling through the possibilities under \r. It is not difficult to show that Path forms a preorder:
      Path-preorder : IsPreorder Path
      IsPreorder.refl Path-preorder = here
      IsPreorder.trans Path-preorder = connect
Pay attention to what we’ve done here, as this is a very general and reusable technique. Given some arbitrary relation _⇒_, about which we know nothing, we were able to extend that relation into a preorder. The ↪_ constructor injects values of type v₁ ⇒ v₂ into the type Path v₁ v₂. Since this is possible, it’s reasonable to think of Path as _⇒_, “but with more stuff in it.” Namely, enhanced with here and connect. The attentive reader will notice that it is exactly here and connect which are used in Path-preorder, which is why Path can turn any relation into a preorder.

More generally, the idea behind Path is to augment a type with just enough structure to allow you to build some well-known mathematical (or computational) idea on top. Usually the approach is to add constructors for every required property. In doing so, you find the free X. In this case, Path is the free preorder.
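One way to make the word "free" concrete is to note that a Path can be interpreted in any other reflexive, transitive relation that already contains the edges. The sketch below is an illustration rather than part of the chapter: it restates Path so that it stands alone, and the names fold, _≲_, ≲-refl, ≲-trans, and embed are all hypothetical.

```agda
open import Level using (Level; _⊔_)
open import Relation.Binary using (Rel; Reflexive; Transitive)

module _ {ℓ₁ ℓ₂ : Level} {V : Set ℓ₁} (_⇒_ : Rel V ℓ₂) where

  -- Restated from the text.
  data Path : Rel V (ℓ₁ ⊔ ℓ₂) where
    ↪_      : ∀ {v₁ v₂} → v₁ ⇒ v₂ → Path v₁ v₂
    here    : ∀ {v} → Path v v
    connect : ∀ {v₁ v₂ v₃} → Path v₁ v₂ → Path v₂ v₃ → Path v₁ v₃

  -- Assume some target relation _≲_ that is reflexive and transitive,
  -- together with a way to embed every edge into it.
  module _ {ℓ₃ : Level} (_≲_ : Rel V ℓ₃)
           (≲-refl  : Reflexive _≲_)
           (≲-trans : Transitive _≲_)
           (embed   : ∀ {v₁ v₂} → v₁ ⇒ v₂ → v₁ ≲ v₂) where

    -- Each constructor of Path is replaced by the matching piece of
    -- structure on _≲_; nothing else is needed.
    fold : ∀ {v₁ v₂} → Path v₁ v₂ → v₁ ≲ v₂
    fold (↪ e)         = embed e
    fold here          = ≲-refl
    fold (connect p q) = ≲-trans (fold p) (fold q)
```

That fold needs exactly reflexivity, transitivity, and an embedding of edges, and nothing more, is a good sanity check that Path adds just the preorder structure and no extra assumptions.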
4.16 Free Preorders in the Wild
We will now combine our free preorder with the preorder reasoning we built in sec. 4.13 to demonstrate actual paths through a social graph. Rather than incriminate any real group of humans, we can instead use the excellent early-noughties romantic comedy About a Boy (Hedges (2002)) as a case study. If you haven’t seen the film, you should consider remedying that as soon as possible. Don’t worry though, there are no spoilers here; it’s safe to continue.

Our first task is to define the vertices of the social graph, which of course are the people involved:

    module Example-AboutABoy where
      data Person : Set where
        ellie fiona marcus rachel susie will : Person
Some of these people are friends, which we can use as edges in our graph.

      private variable
        p₁ p₂ : Person

      data _IsFriendsWith_ : Rel Person lzero where
        marcus-will  : marcus IsFriendsWith will
        marcus-fiona : marcus IsFriendsWith fiona
        fiona-susie  : fiona IsFriendsWith susie
Friendship is usually considered to be symmetric. While we could add explicit constructors for the other direction of each of these friendships, it’s easier to add a sym constructor:

        sym : p₁ IsFriendsWith p₂ → p₂ IsFriendsWith p₁
What excellent early-noughties romantic comedy is complete without a series of potential love interests? We can enumerate who likes whom as another source of edges in our graph:

      data _IsInterestedIn_ : Rel Person lzero where
        marcus-ellie : marcus IsInterestedIn ellie
        will-rachel  : will IsInterestedIn rachel
        rachel-will  : rachel IsInterestedIn will
        susie-will   : susie IsInterestedIn will
As much as many people would prefer a world in which _IsInterestedIn_ is a symmetric relation, this is sadly not the case, and thus we do not require a constructor for it. Finally, we can tie together _IsFriendsWith_ and _IsInterestedIn_ with SocialTie, which serves as the definitive set of edges in our graph.

      data SocialTie : Rel Person lzero where
        friendship : p₁ IsFriendsWith p₂ → SocialTie p₁ p₂
        interest   : p₁ IsInterestedIn p₂ → SocialTie p₁ p₂
There is no preorder on SocialTie, but we can get one for free by using Path. Then we can look at how will and fiona relate in the social graph:

      open Reachability SocialTie

      will-fiona : Path will fiona
      will-fiona = begin
        will    ≈⟨ ↪ friendship (sym marcus-will) ⟩
        marcus  ≈⟨ ↪ friendship marcus-fiona ⟩
        fiona   ∎
        where open Preorder-Reasoning Path-preorder
or how rachel and ellie relate:

      rachel-ellie : Path rachel ellie
      rachel-ellie = begin
        rachel  ≈⟨ ↪ interest rachel-will ⟩
        will    ≈⟨ ↪ friendship (sym marcus-will) ⟩
        marcus  ≈⟨ ↪ interest marcus-ellie ⟩
        ellie   ∎
        where open Preorder-Reasoning Path-preorder
Agda’s ability to model and work with things like this is frankly amazing. Of course, I am likely the only person in the world interested in the social dynamics of About a Boy, but the fact that it’s possible is a testament to how much power we’ve developed.
4.17 Antisymmetry

Let’s take a step back from preorders and look some more at _≤_. For example, does it support symmetry? A moment’s thought convinces us that it can’t possibly. Just because 2 ≤ 5 doesn’t mean that 5 ≤ 2. However, _≤_ does satisfy a related notion, that of antisymmetry. Antisymmetry says that if we know 𝑚 ≤ 𝑛 and that 𝑛 ≤ 𝑚, then it must be the case that 𝑚 = 𝑛. Proving the antisymmetry of _≤_ is straightforward:

    ≤-antisym : {m n : ℕ} → m ≤ n → n ≤ m → m ≡ n
    ≤-antisym z≤n z≤n = PropEq.refl
    ≤-antisym (s≤s m≤n) (s≤s n≤m) =
      PropEq.cong suc (≤-antisym m≤n n≤m)
In addition, we can generalize this type to something more reusable, like we did with Reflexive, Symmetric, Transitive, Functional, and Total. This one is a little trickier, however, since it’s really a property over two relations: one corresponding to equality, and another to the ordering:

    Antisymmetric : Rel A ℓ₁ → Rel A ℓ₂ → Set _
    Antisymmetric _≈_ _≤_ =
      ∀ {x y} → x ≤ y → y ≤ x → x ≈ y
which does expand to our desired type, as we can show:

    _ : Antisymmetric _≡_ _≤_
    _ = ≤-antisym
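Antisymmetry is most useful as a way to manufacture equalities out of two inequalities. The sketch below uses the standard library's Antisymmetric and ≤-antisym, which take their arguments in the same order as the definitions above; squeeze is a hypothetical name chosen for this example.

```agda
open import Data.Nat using (ℕ; _≤_)
open import Data.Nat.Properties using (≤-antisym)
open import Relation.Binary using (Antisymmetric)
open import Relation.Binary.PropositionalEquality using (_≡_; cong)

-- The library's ≤-antisym has the same shape as the one proved above.
_ : Antisymmetric _≡_ _≤_
_ = ≤-antisym

-- Hypothetical helper: if m and n bound each other, any function agrees
-- on them. Antisymmetry produces m ≡ n, and cong pushes it through f.
squeeze : (f : ℕ → ℕ) {m n : ℕ} → m ≤ n → n ≤ m → f m ≡ f n
squeeze f m≤n n≤m = cong f (≤-antisym m≤n n≤m)
```

This "bound it from both sides" pattern is a common way to prove equalities between numbers that are awkward to compute directly.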
4.18 Equivalence Relations and Posets
The difference between _≡_’s symmetry and _≤_’s antisymmetry turns out to be the biggest differentiator between the two relations. We have seen that both are preorders, but whether a relation is symmetric or antisymmetric bifurcates the mathematical hierarchy.
Symmetric preorders like _≡_ are known as equivalence relations, and such things act exactly as equality should. An equivalence relation forms “buckets,” with every member of the underlying set in exactly one bucket, and everything in the bucket being related. For _≡_ there is one bucket for each element in the set, but you could imagine relating strings based on their length, where we would then get a bucket for every unique string length.

We’re going to define IsEquivalence and IsPartialOrder at the same time, and both are types parameterized by a given relation. While we could go through the effort of writing out all the necessary levels, sets, and relationship bindings for both, it’s simpler to put that stuff as parameters to an anonymous module:

    module _ {a ℓ : Level} {A : Set a} (_~_ : Rel A ℓ) where
Anything inside of this module will now inherit each of a, ℓ, A, and _~_, which can be a great time-saver when defining several closely related objects. And we give the module the name _, meaning “this is not really a module, it exists only to scope these parameters.” Inside the module we are now free to define IsEquivalence:

      record IsEquivalence : Set (a ⊔ ℓ) where
        field
          isPreorder : IsPreorder _~_
          sym : Symmetric _~_

        open IsPreorder isPreorder public
Note that it appears that IsEquivalence has no parameters, but this is not true. The anonymous module above scopes them for us, and any user of IsEquivalence will see that it has a _~_ parameter.

The partially ordered sets, usually shortened to posets, are antisymmetric preorders. If you think about preorders as directed graphs, then posets correspond to directed acyclic graphs. Their antisymmetry property precludes any possibility of cycles, since from a cycle we could derive the fact that both 𝑥 ≤ 𝑦 and 𝑦 ≤ 𝑥. Since the natural numbers all sit on a straight line, they have no cycles, which is why they form a poset.

We can code up IsPartialOrder analogously to IsEquivalence:

      record IsPartialOrder : Set (a ⊔ ℓ) where
        field
          isPreorder : IsPreorder _~_
          antisym : Antisymmetric _≡_ _~_
After popping the anonymous module, let’s show that _≡_ and _≤_ really are the sorts of relations we’ve claimed:

    ≡-equiv : IsEquivalence (_≡_ {A = A})
    IsEquivalence.isPreorder ≡-equiv = ≡-preorder
    IsEquivalence.sym ≡-equiv = PropEq.sym
and

    ≤-poset : IsPartialOrder _≤_
    IsPartialOrder.isPreorder ≤-poset = ≤-preorder
    IsPartialOrder.antisym ≤-poset = ≤-antisym
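The "buckets" picture, such as strings grouped by length, can be written down in general: any function f induces an equivalence relation that relates two inputs exactly when f sends them to the same output. The sketch below uses the standard library's IsEquivalence record (whose fields are refl, sym, and trans, a slightly different layout from the record defined above); the names _≈ᶠ_ and ≈ᶠ-equiv are made up for the example.

```agda
open import Level using (0ℓ)
open import Relation.Binary using (Rel; IsEquivalence)
open import Relation.Binary.PropositionalEquality using (_≡_; refl; sym; trans)

module _ {A B : Set} (f : A → B) where

  -- Hypothetical relation: x and y land in the same bucket when f
  -- cannot tell them apart. With f = length this is "same length".
  _≈ᶠ_ : Rel A 0ℓ
  x ≈ᶠ y = f x ≡ f y

  -- Every law is inherited directly from propositional equality on B.
  ≈ᶠ-equiv : IsEquivalence _≈ᶠ_
  ≈ᶠ-equiv = record
    { refl  = refl
    ; sym   = sym
    ; trans = trans
    }
```

Each distinct value in the image of f corresponds to one bucket, which matches the intuition of one bucket per string length.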
4.19 Strictly Less Than
So far we have discussed only the less-than-or-equal-to relationship between natural numbers. But sometimes we might want a strict less-than, without any of this “or equal to” business. That’s easy enough; we can just insert a suc on the right side:
∎ U+220E END OF PROOF (\qed)
∘ U+2218 RING OPERATOR (\o)
∙ U+2219 BULLET OPERATOR (\.)
∷ U+2237 PROPORTION (\ )
≅ U+2245 APPROXIMATELY EQUAL TO (\~=)
≈ U+2248 ALMOST EQUAL TO (\~~)
≗ U+2257 RING EQUAL TO (\o=)
≡ U+2261 IDENTICAL TO (\ )
⊎ U+228E MULTISET UNION (\u+)
⊔ U+2294 SQUARE CUP (\lub)
⊤ U+22A4 DOWN TACK (\top)
⊥ U+22A5 UP TACK (\bot)
⟨ U+27E8 MATHEMATICAL LEFT ANGLE BRACKET (\)
CHAPTER 9 Program Optimization

    module Chapter9-ProgramOptimization where
The purpose of theory is not satisfying idle curiosities; it is to make short work of otherwise difficult problems. In this final chapter, we will turn our gaze towards a difficult problem in computing, dynamic programming, and see how our new theoretical understanding completely eliminates the challenges. Dynamic programming often improves algorithmic complexity asymptotically; solving it is therefore equivalent to program optimization, and doing it automatically demonstrates a total understanding of the problem space. This is by no means a new idea, having first been done by Hinze (2000), but we now have enough comprehension to be able to understand (and prove!) where exactly good ideas like this come from.

As is fitting for a capstone, this chapter has a great number of dependencies in concepts:

Prerequisites
    open import Chapter1-Agda
      using (_×_; _,_; proj₁; proj₂; _⊎_; inj₁; inj₂)

    open import Chapter2-Numbers
      using (ℕ; zero; suc; _+_; _*_)

    open import Chapter3-Proofs
      using (_≡_; case_of_; module PropEq; module ≡-Reasoning)
    open PropEq
      using (refl; sym; trans; cong)

    open import Chapter4-Relations
      using (Level; _⊔_; lsuc; Σ)

    open import Chapter6-Decidability
      using (Dec; yes; no; map-dec)

    open import Chapter7-Structures
      using ( id; _∘_; const; _≗_; prop-setoid; Setoid
            ; ⊎-setoid; ×-setoid)

    open import Chapter8-Isomorphisms
      using ( Iso; _≅_; ≅-refl; ≅-sym; ≅-trans; ⊎-fin; fin-iso
            ; ×-fin; Vec; []; _∷_; lookup; Fin; zero; suc; toℕ
            ; _Has_Elements; ⊎-prop-homo; ×-prop-homo)
9.1 Why Can This Be Done?
Computing is a practical means to an end. One of the greatest tragedies in our field is the muddled thinking that arises from conflating “what do we want to do” with “how should we do it.” But confusion exists only in our minds, and never in reality. Because dynamic programming is an often-confusing solution to a class of problems, we will likely learn more by studying that class than by studying the offered solutions. Dynamic programming is helpful for problems with a great deal of overlapping subproblems. This notion of subproblems arises from the algorithmic concept of divide and conquer, which you will note is also muddled thinking—it too confuses the question with a means of answering it. At the end of the day, all computing reduces down to sampling functions at finitely many points. Seen through this lens, dynamic programming is a technique which makes functions less expensive to repeatedly sample, by caching their results somewhere convenient.
This is the true nature of dynamic programming as a technique. It is nothing more than thinking about problems inductively, and memoizing that induction. By virtue of being the sort of person who would read this book, you are clearly already capable of thinking compositionally, and so all we have left is to tackle the problem of memoization.

Why should we expect to be able to make progress on this problem? In sec. 8.10, we proved an isomorphism between functions on finite domains and vectors. Indeed, it is exactly this vector which is most often used as a caching data structure when doing dynamic programming by hand. Figuring out exactly how to index such a structure is often less clear, and requires cumbersome fiddling in the absence of explicit isomorphisms.

Note however that an effective memoization strategy is dependent on the problem. This “one big vector” isn’t necessarily always the best choice. If you know that you need to sample the entire function space, you might as well just sample the whole function into a table. But if sampling the function is sufficiently expensive and you have no guarantee you’ll need all of its image, you might prefer to memoize the function at only those points necessary. For functions with large input spaces, it would be quite wasteful to allocate a table large enough to memoize the whole function but proceed to only cache a handful of points in it.

Therefore, we conclude that different memoization strategies must result in different caching data structures. And in fact, this class of memoization strategies grows fast, corresponding to every possible way of splitting the input space into contiguous chunks. There is no one-size-fits-all memoization strategy, so our eventual solution to this problem must be polymorphic over all possible strategies.

It certainly feels like we have our work cut out for us, doesn’t it? Thankfully, the task is less monumental than it seems. The hardest part is simply organizing the problem, and our theory will guide us the rest of the way.
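The “one big vector” strategy can be sketched in a few lines. The following is a minimal illustration, assuming the Agda standard library rather than this book's own modules; the names cache, memo and memo-correct are hypothetical and appear nowhere else in the text. Whether the table is actually shared between calls depends on how the program is evaluated or compiled, so read it as a statement about meaning rather than about performance.

```agda
open import Data.Nat using (ℕ)
open import Data.Fin using (Fin)
open import Data.Vec using (Vec; tabulate; lookup)
open import Data.Vec.Properties using (lookup∘tabulate)
open import Relation.Binary.PropositionalEquality using (_≡_)

-- Sample f once at every point of its finite domain.
cache : {n : ℕ} {A : Set} → (Fin n → A) → Vec A n
cache f = tabulate f

-- Answer queries by indexing into the table.
memo : {n : ℕ} {A : Set} → (Fin n → A) → Fin n → A
memo f = lookup (cache f)

-- The table-backed function agrees with the original at every index.
memo-correct : {n : ℕ} {A : Set} (f : Fin n → A) (i : Fin n)
             → memo f i ≡ f i
memo-correct f i = lookup∘tabulate f i
```

The rest of the chapter can be read as generalizing cache from a single vector to a family of differently shaped tables, while keeping a proof in the spirit of memo-correct.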
9.2
Shaping the Cache
Our first order of business is to find a means of describing our desired memoization strategy, which in turn we will use to generate the data
type we will use to cache results.[1] The data structures we end up building will all turn out to generalize tries, which are more commonly thought of as structures for representing collections of strings. We will leave the formal connection to tries unexplored in this text, but use the terminology interchangeably.

Note that while there are many variations on the theme, we are looking for the building blocks of these cache shapes. It is undeniable that we must be able to support a flat vector of contiguous results. Another means of building tries is to combine two differently-shaped tries together. And our final means is direct composition; that is, we’d like to nest one trie inside another. In essence, this composition means we are no longer caching values, but instead caching caches of values.

Rather surprisingly, these are all the shape combinators we need in order to describe any trie we’d like. We’ll see some examples momentarily, but in the meantime, we can write Shape in Agda:

    data Shape : Set where
      num    : ℕ → Shape
      beside : Shape → Shape → Shape
      inside : Shape → Shape → Shape
The names of these constructors are pleasingly straightforward. We have num n, which corresponds to a table of n elements, while beside combines two caches “side by side”, and inside composes one cache inside of another. To get a feel for how Shape describes data structures, let’s look at a few examples. As we have seen, the simplest trie imaginable is just a vector, which we will represent via num. An 8-celled cache described by this constructor is illustrated in fig. 9.1; note that the filled squares represent places to cache results.
Figure 9.1: The trie described by num 8

The semantics of the trie built by num is that if any value in the table is required, the entire table will be populated. Sometimes this is not desirable, like when snappy startup times are expected, or when the function we’d like to memoize is prohibitively expensive. Thus, the num shape by itself gives us per-table caching. At the other end of the spectrum, we can get per-element caching semantics by nesting one table inside of another, as in fig. 9.2.

[1] Generating classes of algorithms giving a “factoring” of the shape of the problem is a trick I learned from Elliott (2017).
Figure 9.2: The trie described by inside (num 8) (num 1)

The inside constructor allows us to compose caches; in the case of fig. 9.2, we have nested a table of 1 inside a table of 8. The data structure described by this Shape is a vector of pointers to elements, where we have a convention that a pointer doesn’t point anywhere until an element in its subtree has been cached.

As another example of how flexible the tries defined by our Shape machinery are, we can look at using beside. This Shape lays out two tries side by side, which we can use to define a binary tree by subsequently nesting layers:

    bintrie : ℕ → Shape
    bintrie zero    = num 1
    bintrie (suc n) = beside (bintrie n) (bintrie n)
The result of this is illustrated in fig. 9.3:
Figure 9.3: The trie described by bintrie 3
Of course, nothing says we can’t go off the rails and define arbitrarily weird tries:

    weird : Shape
    weird = beside (beside (num 3) (inside (num 2) (num 1)))
                   (inside (num 3) (num 1))
which is illustrated in fig. 9.4.
Figure 9.4: The trie described by weird

Hopefully you agree that the tries describable by Shape are indeed rich in variation. Of course, we haven’t actually built any of this yet, merely given a type and its intended semantics. In the next section, we turn our attention towards building these tries.
9.3 Building the Tries

Whenever we’d like to build a data structure whose cases correspond to some other type, our immediate reaction should be to use an indexed data type. This is no exception; we can build a Trie data structure parameterized by the values we’d like to store in it, and indexed by its shape:

    private variable
      ℓ ℓ₁ ℓ₂ : Level

    data Trie (B : Set ℓ) : Shape → Set ℓ where
      empty : {sh : Shape}                        → Trie B sh            -- 1
      table : {n : ℕ} → Vec B n                   → Trie B (num n)
      both  : {m n : Shape} → Trie B m → Trie B n → Trie B (beside m n)
      nest  : {m n : Shape} → Trie (Trie B n) m   → Trie B (inside m n)  -- 2
Alongside constructors corresponding to the three for Shape, we have added a fourth constructor at 1, corresponding to an empty, unpopulated trie. A trie of any shape can be empty, thus empty makes no demands on the index of Trie. At 2, notice the non-regular recursion; our trie parameter no longer contains B, but instead Trie B n.

From the Shape of a trie, we can also compute an appropriate type for indexing it, given by Ix:

    Ix : Shape → Set
    Ix (num n)      = Fin n
    Ix (beside m n) = Ix m ⊎ Ix n
    Ix (inside m n) = Ix m × Ix n
This Ix sh type acts as a lookup key in a corresponding Trie B sh. If we ignore the empty case, a num-shaped trie is a vector, which we can index via a Fin. In the beside case, we have two sub-tries we’d like to differentiate between; a key is therefore the coproduct of the sub-trie keys. Similarly, for inside we have nested one trie inside another, meaning we need a key for each trie in order to successfully find a B.

The number of keys possible in a Shape is given by ∣_∣:

    ∣_∣ : Shape → ℕ
    ∣ num m ∣      = m
    ∣ beside m n ∣ = ∣ m ∣ + ∣ n ∣
    ∣ inside m n ∣ = ∣ m ∣ * ∣ n ∣
which we can prove by way of shape-fin:

    shape-fin : (sh : Shape) → prop-setoid (Ix sh) Has ∣ sh ∣ Elements
    shape-fin (num x)      = ≅-refl
    shape-fin (beside m n) = ≅-trans (≅-sym ⊎-prop-homo) (⊎-fin (shape-fin m) (shape-fin n))
    shape-fin (inside m n) = ≅-trans (≅-sym ×-prop-homo) (×-fin (shape-fin m) (shape-fin n))
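To get a concrete feel for Shape, Ix and ∣_∣, it can help to evaluate them on the examples from the figures. The block below is a self-contained sketch that restates the definitions against the Agda standard library (an assumption; the book uses its own chapter modules). The name key is hypothetical and introduced only for illustration.

```agda
open import Data.Nat using (ℕ; zero; suc; _+_; _*_)
open import Data.Fin using (Fin; zero; suc)
open import Data.Sum using (_⊎_; inj₁; inj₂)
open import Data.Product using (_×_; _,_)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

data Shape : Set where
  num    : ℕ → Shape
  beside : Shape → Shape → Shape
  inside : Shape → Shape → Shape

Ix : Shape → Set
Ix (num n)      = Fin n
Ix (beside m n) = Ix m ⊎ Ix n
Ix (inside m n) = Ix m × Ix n

∣_∣ : Shape → ℕ
∣ num m ∣      = m
∣ beside m n ∣ = ∣ m ∣ + ∣ n ∣
∣ inside m n ∣ = ∣ m ∣ * ∣ n ∣

bintrie : ℕ → Shape
bintrie zero    = num 1
bintrie (suc n) = beside (bintrie n) (bintrie n)

weird : Shape
weird = beside (beside (num 3) (inside (num 2) (num 1)))
               (inside (num 3) (num 1))

-- Three differently shaped caches, each with exactly 8 keys.
_ : ∣ inside (num 8) (num 1) ∣ ≡ 8
_ = refl

_ : ∣ bintrie 3 ∣ ≡ 8
_ = refl

_ : ∣ weird ∣ ≡ 8
_ = refl

-- One key into weird: the left trie, then its right sub-trie,
-- then outer index 1 and inner index 0 of the nested table.
key : Ix weird
key = inj₁ (inj₂ (suc zero , zero))
```

All three shapes cache the same number of results; they differ only in how much of the structure gets populated when a single key is requested.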
We will also require decidability of propositional equality over two indices, which we could write by hand, but will instead get by transporting across shape-fin and using our decidable equality over Fin.

    open import Data.Fin using (_≟_)
    open Iso using (to; from; from∘to)

    Ix-dec : {sh : Shape} → (ix₁ ix₂ : Ix sh) → Dec (ix₁ ≡ ix₂)
    Ix-dec {sh = sh} ix₁ ix₂ =
      let s = shape-fin sh
       in map-dec
            (λ toix₁=toix₂ → begin
              ix₁                ≡⟨ sym (from∘to s ix₁) ⟩
              from s (to s ix₁)  ≡⟨ cong (from s) toix₁=toix₂ ⟩
              from s (to s ix₂)  ≡⟨ from∘to s ix₂ ⟩
              ix₂                ∎)
            (cong (to s))
            (to s ix₁ ≟ to s ix₂)
      where open ≡-Reasoning
We’ll need three other miscellaneous helper functions and proofs. First, often we can infer the first argument of a Σ type, in which case we can omit it using -,_:

    -,_ : {A : Set ℓ₁} {a : A} {B : A → Set ℓ₂} → B a → Σ A B
    -,_ {a = a} b = a , b
Second, given a function out of Fin, we can use it to build a Vec, as in tabulate:

    tabulate : {n : ℕ} {A : Set ℓ} → (Fin n → A) → Vec A n
    tabulate {n = zero}  f = []
    tabulate {n = suc n} f = f zero ∷ tabulate (f ∘ suc)
Third, we know that calling lookup on tabulate is the same as sampling the function we used to build the table:

    lookup∘tabulate
      : {n : ℕ} {A : Set ℓ}
      → (f : Fin n → A)
      → (i : Fin n)
      → lookup (tabulate f) i ≡ f i
    lookup∘tabulate f zero    = refl
    lookup∘tabulate f (suc i) = lookup∘tabulate (f ∘ suc) i
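These helpers are small enough to try on concrete values. The following sketch uses the Agda standard library's versions of tabulate and -,_ (an assumption: they are expected to behave like the definitions above); firstThree and packed are hypothetical names used only here.

```agda
open import Data.Nat using (ℕ)
open import Data.Fin using (Fin; toℕ)
open import Data.Vec using (Vec; []; _∷_; tabulate)
open import Data.Product using (Σ; _,_; -,_; proj₁)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

-- tabulate samples the function at every index, in order.
firstThree : Vec ℕ 3
firstThree = tabulate toℕ

_ : firstThree ≡ 0 ∷ 1 ∷ 2 ∷ []
_ = refl

-- With -,_ the first component (the length) is inferred from the
-- type of the second, so we never have to write it down.
packed : Σ ℕ (Vec ℕ)
packed = -, firstThree

_ : proj₁ packed ≡ 3
_ = refl
```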
9.4
Memoizing Functions
While it’s tantalizing to jump in and begin implementing our memoizing trees, this is shortsighted. An implementation cannot possibly be correct without first specifying the problem. Take it from me, someone who wasted a week of his life by initially going down the garden path of trying to divorce implementation from proof.

Better than jumping in immediately is to take a moment and think about what exactly a memoized trie looks like. Having done that work, we will then immortalize this thinking in a new type, which indexes a Trie, proving that the trie does indeed memoize the particular function.[2] Only after all this work is done will we jump into implementation, using our types to guarantee correctness by construction.

First, we’ll need some variables in scope:

    private variable
      B : Set ℓ
      sh m n : Shape
      t₁ : Trie B m
      t₂ : Trie B n
      f : Ix sh → B
We now have three definitions that need to be defined simultaneously, which we can do by way of Agda’s mutual keyword. Mutual blocks introduce a new scope in which definitions can refer to one another, without the top-down order that Agda usually enforces.

We’ll define it momentarily, but for the time being, assume that we have a data type Memoizes f tr which proves that trie tr is filled only with values produced by function f. Then, a MemoTrie is a Trie alongside a proof that it does indeed memoize a function:

    mutual
      MemoTrie : {B : Set ℓ} {sh : Shape} → (Ix sh → B) → Set ℓ
      MemoTrie {B = B} {sh = sh} f = Σ (Trie B sh) (Memoizes f)
Getting the definition of Memoizes right is rather finicky, so we will proceed by case. First, its type:

      data Memoizes {B : Set ℓ}
            : {sh : Shape}
            → (f : Ix sh → B)
            → Trie B sh
            → Set ℓ
            where

[2] This is a trick I learned from Danielsson and Norell (2011), where you show correctness with respect to an indexed type that provides semantics.
Of particular interest here is the sheer number of indices we have for Memoizes. We might expect that sh and f could be parameters instead of indices, but each constructor of Memoizes makes different demands of the shape, on which the function is dependent. Thus they must both be indices.

For our first case, we note that an empty trie is vacuously memoized, for any function at all, as witnessed by the emptyM (for “memoized”) constructor:

        emptyM : {f : Ix sh → B} → Memoizes f empty
A table trie is memoized if the vector it contains was built via tabulate:

        tableM : {n : ℕ} {f : Ix (num n) → B}
               → Memoizes f (table (tabulate f))
We can show that a both trie is correctly memoized if its constituent tries split the function space in half, memoizing the inj₁ and inj₂ halves, respectively:

        bothM : {f : Ix (beside m n) → B}
              → Memoizes (f ∘ inj₁) t₁
              → Memoizes (f ∘ inj₂) t₂
              → Memoizes f (both t₁ t₂)
And now we come to the hard part: determining exactly when a nest trie correctly memoizes a function. Spiritually this should be the same as in bothM, except that we now need to split the function into an arbitrary number of smaller functions, show that each sub-trie correctly memoizes one, and use all of this information to actually build a nest trie.

We will accomplish splitting the function and showing each sub-trie correctly memoizes its cut by way of a function, transforming the index of each sub-trie into a MemoTrie. We will use a helper function, to-trie, to transform this function-of-sub-tries into the necessary nested Trie.

        nestM : {f : Ix (inside m n) → B}
              → (subf : (ix : Ix m) → MemoTrie (f ∘ (ix ,_)))
              → Memoizes f (nest (proj₁ (to-trie {f = f} subf)))
All that’s left is to write to-trie, which, despite the tricky-looking implementation, actually comes for free given the types.

      to-trie
          : {m n : Shape}
          → {f : Ix (inside m n) → B}
          → (subf : (ix : Ix m) → Σ (Trie B n) (Memoizes (f ∘ (ix ,_))))
          → MemoTrie (proj₁ ∘ subf)
      to-trie {m = num _} _ = -, tableM
      to-trie {m = beside _ _} subf
        with proj₂ (to-trie (subf ∘ inj₁))
           , proj₂ (to-trie (subf ∘ inj₂))
      ... | mt₁ , mt₂ = -, bothM mt₁ mt₂
      to-trie {m = inside _ _} f2 = -, nestM (λ i → to-trie λ j → f2 (i , j))
9.5
Inspecting Memoized Tries
Before we go through the work of actually getting values out of MemoTries, let’s take a moment to see how far our current machinery can go. We’ll write a dummy function to memoize, which simply turns its desired index into a ℕ:

    dummy : (sh : Shape) → Ix sh → ℕ
    dummy sh ix = toℕ (to (shape-fin sh) ix)
Now, for any arbitrary Shape, we can give a proof of memoization to Agda, and ask it to build us the relevant trie. For example, let’s look again at the Shape for fig. 9.4 above:
    weird : Shape
    weird = beside (beside (num 3) (inside (num 2) (num 1)))
                   (inside (num 3) (num 1))
For dummy weird, let’s now give a Memoizes, which chooses some sub-tries to be emptyM, and others to be full:

    _ : Memoizes (dummy weird) ?
    _ = bothM (bothM tableM emptyM)
              (nestM λ { zero             → -, emptyM
                       ; (suc zero)       → -, tableM
                       ; (suc (suc zero)) → -, emptyM
                       })
Recall that Solve (C-c C-s in Emacs and VS Code) fills in values that can be unambiguously inferred. If we invoke Solve in the hole above, Agda will synthesize the only valid Trie which can possibly satisfy the given Memoizes proof. In this case, it responds with:

    (both (both (table (0 ∷ 1 ∷ 2 ∷ [])) empty)
          (nest (table (empty ∷ table (6 ∷ []) ∷ empty ∷ []))))
which we can visualize as in fig. 9.5, where ∙ corresponds to an empty trie.
Figure 9.5: A memoized Trie, synthesized by Agda
In an incredible show, Agda managed to find the memoized Trie which corresponds to our proof! Take a moment or two to marvel at this result; our type Memoizes is precise enough that it completely constrains all valid Tries. Truly remarkable.
9.6 Updating the Trie

We are only a few steps away from a working, self-updating trie. Having successfully constrained what a memoized Trie must look like, we need only define the function which looks up a value in this Trie, possibly filling in more fields if they are not yet present. And then we will tie a little bow around the whole thing, wrapping all the machinery in a pleasant interface.

For now, we will need a lemma, replace, which replaces a single branch of a nestM’s subf function. The idea here is to create a new subf which compares any looked-up index with the one we’d like to replace; if they match, return the new Memoizes, otherwise, look up this index in the old subf.

    replace
      : {fst : Trie B n}
      → (x : Ix m)
      → Memoizes (f ∘ (x ,_)) fst
      → ((ix : Ix m) → MemoTrie (f ∘ (ix ,_)))
      → MemoTrie f
    replace x sub-mem subf = -, nestM (λ ix →
      case Ix-dec ix x of λ
        { (yes refl) → -, sub-mem
        ; (no z)     → subf ix
        }
      )
We are now ready for the main event, implementing get′, which looks up an index in a Memoizes.[3] If the index is already present in the trie, get′ simply returns the associated value. If it’s not, get′ will build just enough of the trie so that it can get the correct value.

Note that we don’t have mutation in Agda, so we can’t update the trie directly. Instead, we will return a new MemoTrie alongside, corresponding to what the trie would be if we could update it. This is purely a concession to Agda, and if you wanted to implement the same algorithm in another language, you could certainly perform the mutation. Hence the type of get′:

    get′ : {t : Trie B sh} → Memoizes f t → Ix sh → B × MemoTrie f

[3] Why don’t we look up an index in a Trie, you might be wondering? It’s because every Memoizes uniquely describes a memoized Trie, and we’re only interested in the case where we’re looking something up in a Trie that is guaranteed to memoize the given function.
There are eight cases we need to consider, so we will look at them in bunches. In general, we must first branch on the Memoizes proof, and, in the case of emptyM, subsequently branch on the Shape sh in order to determine how we must fill in the trie.

The first case is when we’ve encountered an empty table, which we must tabulate, being careful to look up the desired B in the resulting table, rather than evaluating f any more than we have to:

    get′ {sh = num x} {f = f} emptyM a =
      let t = tabulate f
       in lookup t a , table t , tableM
Note that, as you look through the remainder of the implementation, this is the only case in which we evaluate f.

If instead we’d like to look up a value in an empty both trie, we can branch on which sub-trie we’re looking at. This sub-trie is also empty, but will recursively find the right answer, returning us a minimally filled-in trie, which we can insert into the proper branch, leaving emptyM in the other branch:

    get′ {sh = beside m n} emptyM (inj₁ x)
      with get′ emptyM x
    ... | b , t₁ , memo = b , both t₁ empty , bothM memo emptyM
    get′ {sh = beside m n} emptyM (inj₂ y)
      with get′ emptyM y
    ... | b , t₂ , memo = b , both empty t₂ , bothM emptyM memo
The only other case in which we might be looking at an emptyM is when we are looking to build a nest trie. This is the same as replacing a branch of the empty-everywhere subf:

    get′ {sh = inside m n} {f = f} emptyM (x , y)
      with get′ { f = f ∘ (x ,_) } emptyM y
    ... | b , _ , sub-mem = b , replace x sub-mem λ ix → -, emptyM
In all other cases, the trie we’d like to index already exists. If it’s a table, we know it must already be filled in, and so we can just look up the answer in the vector:

    get′ {sh = num _} {t = table t} tableM a = lookup t a , table t , tableM
Otherwise, we need only call get′ recursively, and replace the branch we looked at with the updated sub-trie:

    get′ {sh = beside m n} (bothM l r) (inj₁ x)
      with get′ l x
    ... | b , t₁ , memo = b , both t₁ _ , bothM memo r
    get′ {sh = beside m n} (bothM l r) (inj₂ y)
      with get′ r y
    ... | b , t₂ , memo = b , both _ t₂ , bothM l memo
In the last case, we need to look inside an existent nestM trie, which means looking at its subf function, and then recursively calling get′ on what we find. Care must be taken to subsequently replace the sub-trie we found:[4]

    get′ {sh = inside m n} (nestM subf) (x , y)
      with subf x
    ... | _ , sub-mem
      with get′ sub-mem y
    ... | b , _ , _ = b , replace x sub-mem subf
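The control flow of get′ can be easier to see with the proofs stripped away. What follows is a deliberately simplified sketch, not the construction above: it assumes the Agda standard library, uses a cut-down Shape without inside, and passes the function explicitly instead of carrying a Memoizes proof. The names get, sh₀ and f₀ are hypothetical. Without Memoizes, nothing stops a caller from handing get a trie full of wrong values, which is exactly the gap the real development closes.

```agda
open import Data.Nat using (ℕ; _+_)
open import Data.Fin using (Fin; zero; suc; toℕ)
open import Data.Vec using (Vec; []; _∷_; tabulate; lookup)
open import Data.Sum using (_⊎_; inj₁; inj₂)
open import Data.Product using (_×_; _,_)
open import Function using (_∘_)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

-- A cut-down Shape: flat tables and side-by-side tries only.
data Shape : Set where
  num    : ℕ → Shape
  beside : Shape → Shape → Shape

Ix : Shape → Set
Ix (num n)      = Fin n
Ix (beside m n) = Ix m ⊎ Ix n

data Trie (B : Set) : Shape → Set where
  empty : {sh : Shape} → Trie B sh
  table : {n : ℕ} → Vec B n → Trie B (num n)
  both  : {m n : Shape} → Trie B m → Trie B n → Trie B (beside m n)

-- Look up an index, returning the value together with the
-- (possibly further populated) trie.
get : {B : Set} {sh : Shape}
    → (Ix sh → B) → Trie B sh → Ix sh → B × Trie B sh
get {sh = num _} f empty i =
  let t = tabulate f in lookup t i , table t
get {sh = num _} f (table t) i = lookup t i , table t
get {sh = beside m n} f empty (inj₁ x)
  with get {sh = m} (f ∘ inj₁) empty x
... | b , l = b , both l empty
get {sh = beside m n} f empty (inj₂ y)
  with get {sh = n} (f ∘ inj₂) empty y
... | b , r = b , both empty r
get {sh = beside m n} f (both l r) (inj₁ x)
  with get (f ∘ inj₁) l x
... | b , l′ = b , both l′ r
get {sh = beside m n} f (both l r) (inj₂ y)
  with get (f ∘ inj₂) r y
... | b , r′ = b , both l r′

sh₀ : Shape
sh₀ = beside (num 3) (num 2)

f₀ : Ix sh₀ → ℕ
f₀ (inj₁ i) = toℕ i
f₀ (inj₂ j) = 10 + toℕ j

-- The first query populates only the right-hand table.
_ : get {sh = sh₀} f₀ empty (inj₂ zero)
  ≡ (10 , both empty (table (10 ∷ 11 ∷ [])))
_ = refl

-- A second query is answered from that table and leaves the trie as is.
_ : get {sh = sh₀} f₀ (both empty (table (10 ∷ 11 ∷ []))) (inj₂ (suc zero))
  ≡ (11 , both empty (table (10 ∷ 11 ∷ [])))
_ = refl
```

As in get′, the only place the function is evaluated is the empty num case; every other clause just routes the index to the right sub-trie and rebuilds the path on the way back.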
And we’re done! We’ve got a satisfactory implementation of get′ that certainly seems to work! While we have good guarantees by the fact that get′ operates over MemoTries, an outside observer might not be as impressed with our handiwork as we are. To convince any potential naysayers, we can also show get′-is-fn, which shows that get′ and the function-being-memoized are extensionally equal:

    get′-is-fn
        : {sh : Shape} {t : Trie B sh} {f : Ix sh → B}
        → (mt : Memoizes f t)
        → proj₁ ∘ get′ mt ≗ f
    get′-is-fn {sh = num _}      emptyM x        = lookup∘tabulate _ x
    get′-is-fn {sh = beside _ _} emptyM (inj₁ x) = get′-is-fn emptyM x
    get′-is-fn {sh = beside _ _} emptyM (inj₂ y) = get′-is-fn emptyM y
    get′-is-fn {sh = inside _ _} emptyM (x , y)  = get′-is-fn emptyM y
    get′-is-fn {sh = num _}      tableM x        = lookup∘tabulate _ x
    get′-is-fn {sh = beside _ _} (bothM t₁ _) (inj₁ x) = get′-is-fn t₁ x
    get′-is-fn {sh = beside _ _} (bothM _ t₂) (inj₂ y) = get′-is-fn t₂ y
    get′-is-fn {sh = inside _ _} (nestM subf) (x , y)
      = get′-is-fn (proj₂ (subf x)) y

[4] Although, rather amazingly, failing to call replace will prevent the program from typechecking!

9.7 Wrapping It All Up

Our machinery so far has all operated over this weird function out of indices of our Shape, but that’s a poor interface for anyone who has a real function they’d like to memoize. All that remains is for us to package up our memoization code into something more readily usable.

The trick is we’ll take a Shape, the function we’d like to memoize, and a proof that its domain has the same size as the Shape. From there, we can run the isomorphism gauntlet to define a function more amenable to operating with our memoization machinery:

    module _ {A : Set} {B : Set ℓ}
             (sh : Shape) (f : A → B)
             (sized : prop-setoid A Has ∣ sh ∣ Elements)
        where
      private
        A≅Ix : prop-setoid A ≅ prop-setoid (Ix sh)
        A≅Ix = fin-iso sized (shape-fin sh)

        f′ : Ix sh → B
        f′ = f ∘ Iso.from A≅Ix

And finally, we can give an implementation of get with a much nicer signature indeed:
      Memoized : Set ℓ
      Memoized = MemoTrie f′

      get : Memoized → A → B × Memoized
      get (_ , memo) = get′ memo ∘ (Iso.to A≅Ix)
This is the culmination of our journey; we’ve come from nothing, building up everything as we go. It wasn’t very long ago that we defined numbers and addition for ourselves, and we have now followed that path all the way to automating program optimization. Make no mistake, this is the result of an impressive understanding of mathematics and its application to software.
Of this construction, we can finally be certain.
UNICODE IN THIS CHAPTER

×  U+00D7  MULTIPLICATION SIGN (\x)
Σ  U+03A3  GREEK CAPITAL LETTER SIGMA (\GS)
λ  U+03BB  GREEK SMALL LETTER LAMDA (\Gl)
′  U+2032  PRIME (\')
₁  U+2081  SUBSCRIPT ONE (\_1)
₂  U+2082  SUBSCRIPT TWO (\_2)
ℓ  U+2113  SCRIPT SMALL L (\ell)
ℕ  U+2115  DOUBLE-STRUCK CAPITAL N (\bN)
→  U+2192  RIGHTWARDS ARROW (\to)
∎  U+220E  END OF PROOF (\qed)
∘  U+2218  RING OPERATOR (\o)
∙  U+2219  BULLET OPERATOR (\.)
∣  U+2223  DIVIDES (\|)
∷  U+2237  PROPORTION (\ )
≅  U+2245  APPROXIMATELY EQUAL TO (\~=)
≗  U+2257  RING EQUAL TO (\o=)
≟  U+225F  QUESTIONED EQUAL TO (\?=)
≡  U+2261  IDENTICAL TO (\ )
⊎  U+228E  MULTISET UNION (\u+)
⊔  U+2294  SQUARE CUP (\lub)
⟨  U+27E8  MATHEMATICAL LEFT ANGLE BRACKET (\)
Appendix: Ring Solving
    module Appendix1-Ring-Solving where
You might have noticed something: doing proofs in Agda is frustrating! It seems like the proofs which are most obvious are also the ones that are tryingly tedious. These are the proofs that involve reasoning about arithmetic, which is a feat that we humans take for granted, having so much experience doing it. Agda’s mechanical insistence that we spell out every step of the tedious process by hand is indeed a barrier to its adoption, but thankfully, there are workarounds for those willing to plumb deeper into the depths of the theory.

Recall that when we were implementing *-cong₂-mod in sec. 5.5, that is, cong for modular arithmetic, we built a lot of setoid machinery and reasoning to avoid needing to solve these large proofs by hand. The particular problem here was attempting to solve the following equation:
𝑎𝑐 + (𝑐𝑥 + 𝑎𝑧 + 𝑥𝑧𝑛) × 𝑛 = 𝑏𝑑 + (𝑑𝑦 + 𝑏𝑤 + 𝑦𝑤𝑛) × 𝑛
subject to the additional facts
𝑎 + 𝑥𝑛 ≡ 𝑏 + 𝑦𝑛
𝑐 + 𝑧𝑛 ≡ 𝑑 + 𝑤𝑛
In order to get a sense of the actual effort required to solve this problem, we can solve the equation in pen and paper:

      𝑎𝑐 + (𝑐𝑥 + 𝑎𝑧 + 𝑥𝑧𝑛) ∗ 𝑛
    = 𝑎𝑐 + 𝑐𝑥𝑛 + 𝑎𝑧𝑛 + 𝑥𝑧𝑛𝑛
    = 𝑐 ∗ (𝑎 + 𝑥𝑛) + 𝑎𝑧𝑛 + 𝑥𝑧𝑛𝑛
    = 𝑐 ∗ (𝑎 + 𝑥𝑛) + 𝑧𝑛 ∗ (𝑎 + 𝑥𝑛)
    = 𝑐 ∗ (𝑏 + 𝑦𝑛) + 𝑧𝑛 ∗ (𝑏 + 𝑦𝑛)
    = 𝑐𝑏 + 𝑐𝑦𝑛 + 𝑧𝑛 ∗ (𝑏 + 𝑦𝑛)
    = 𝑐𝑏 + 𝑐𝑦𝑛 + 𝑧𝑛𝑏 + 𝑧𝑦𝑛𝑛
    = 𝑐𝑏 + 𝑧𝑛𝑏 + 𝑐𝑦𝑛 + 𝑧𝑦𝑛𝑛
    = 𝑏 ∗ (𝑐 + 𝑧𝑛) + 𝑐𝑦𝑛 + 𝑧𝑦𝑛𝑛
    = 𝑏 ∗ (𝑐 + 𝑧𝑛) + 𝑦𝑛 ∗ (𝑐 + 𝑧𝑛)
    = 𝑏 ∗ (𝑑 + 𝑤𝑛) + 𝑦𝑛 ∗ (𝑑 + 𝑤𝑛)
    = 𝑏𝑑 + 𝑏𝑤𝑛 + 𝑦𝑛 ∗ (𝑑 + 𝑤𝑛)
    = 𝑏𝑑 + 𝑏𝑤𝑛 + 𝑑𝑦𝑛 + 𝑦𝑤𝑛𝑛
    = 𝑏𝑑 + 𝑑𝑦𝑛 + 𝑏𝑤𝑛 + 𝑦𝑤𝑛𝑛
    = 𝑏𝑑 + (𝑑𝑦𝑛 + 𝑏𝑤𝑛 + 𝑦𝑤𝑛𝑛)
    = 𝑏𝑑 + (𝑑𝑦 + 𝑏𝑤 + 𝑦𝑤𝑛) ∗ 𝑛
This proof is already 15 lines long, and that’s including the inherent shortcuts that we take as humans, such as automatically reasoning over the associativity and commutativity of addition and multiplication. Imagine how much longer this proof would be if we had to spell out every single time we wanted to move a term around, and if we kept track of all the parentheses required to multiply out 𝑧 ∗ (𝑦 ∗ (𝑛 ∗ 𝑛)). Yeesh.

As you can imagine, the cost of writing expensive proofs for simple lemmas can be prohibitive, and get in our way of actually wanting to use Agda. Thankfully, this is not a cost we often need to pay, thanks to Agda’s ring solver.

Prerequisites

    open import Chapter2-Numbers
      using (ℕ; zero; suc)

    open import Chapter3-Proofs
      using (_≡_; module PropEq; module ≡-Reasoning)
    open PropEq
      using (refl; sym; cong)

    open import Chapter4-Relations
      using (Level)

    open import Chapter7-Structures
      using (List; []; _∷_; _∘_)

    open import Chapter8-Isomorphisms
      using (Fin; zero; suc)
9.8 Rings

The ring solver (presented here as a derivative work of Kidney (2019)) is a general purpose tool for automatically reasoning about rings. Rings are algebraic structures which generalize the relationships between addition and multiplication. A ring (here always a commutative one) has an associative, commutative binary operation called “addition” and another associative, commutative binary operation called “multiplication.” These operations need not correspond in any semantic way to the things we think of as being addition and multiplication; they merely need to fit into the same “ecosystem niche” that regular addition and multiplication do.

What does this mean? A ring must also have distinguished elements 0 and 1 that behave like you’d expect with respect to addition and multiplication, namely that we have the following pile of equalities: +-identityˡ, +-identityʳ, *-identityˡ, *-identityʳ, *-zeroˡ, *-zeroʳ, +-comm, *-comm, +-assocˡ, +-assocʳ, *-assocˡ, *-assocʳ, *-distribˡ-+, and *-distribʳ-+.

As you can see, there is a great deal of structure inherent in a ring! But this is just the structure required of a semiring. In order to get the full ring, we require an additive inverse operation analogous to unary negation, with the property that for any 𝑎 we have 𝑎 + −𝑎 = 0.

By virtue of generalizing addition and multiplication, addition and multiplication themselves had better form a ring! And indeed they do, with one caveat: the natural numbers don’t have any additive inverses, and so they can be, at best, a semiring. The integers do have additive inverses, and are therefore fully realizable as a ring.
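A few of these laws can be seen concretely by pulling them from the Agda standard library, which is an assumption here: the book proves its own versions in earlier chapters, and the standard library's names differ slightly from the list above (for instance, it has a single +-assoc). The local names below (add-comm, add-id, mul-distrib, add-inverse) are hypothetical.

```agda
open import Relation.Binary.PropositionalEquality using (_≡_)

-- The naturals satisfy the semiring laws...
module NatLaws where
  open import Data.Nat using (ℕ; _+_; _*_)
  open import Data.Nat.Properties
    using (+-comm; +-identityʳ; *-distribˡ-+)

  add-comm : (a b : ℕ) → a + b ≡ b + a
  add-comm = +-comm

  add-id : (a : ℕ) → a + 0 ≡ a
  add-id = +-identityʳ

  mul-distrib : (a b c : ℕ) → a * (b + c) ≡ a * b + a * c
  mul-distrib = *-distribˡ-+

-- ...while the integers additionally have additive inverses.
module IntLaws where
  open import Data.Integer using (ℤ; +_; _+_; -_)
  open import Data.Integer.Properties using (+-inverseʳ)

  add-inverse : (i : ℤ) → i + - i ≡ + 0
  add-inverse = +-inverseʳ
```

No analogue of add-inverse can be written for ℕ, which is why the natural numbers stop at being a semiring.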
Rings occupy an excellent space in the mathematical hierarchy, corresponding to the sort of algebraic reasoning that is required in grade-school, at least, before fractions are introduced. Given our extreme intuitive understanding of arithmetic over rings, it is the sort of reasoning that comes up everywhere in mathematics. Better yet: since we expect children to be able to solve it, there must exist an algorithm for determining the equivalence of two expressions over the same ring. In this chapter, we will get a feel for using Agda’s ring solver to tackle problems, and then dive in more deeply to see exactly how it works by implementing our own version.
9.9 Agda's Ring Solver

Agda’s standard library comes with a ring solver, which is a series of tools for automatically solving equalities over rings. Of course, calling it a ring solver is a bit of a misnomer, since the ring solver works over semirings as well, due to a subtle weakening of required ring structure. However, these details are irrelevant to our discussion here; all you need to keep in mind is that the ring solver works over any commutative semiring in addition to rings themselves.

The ring solver machinery exists in the standard library under Algebra.Solver.Ring.Simple, but many specialized versions are present. For example, the (semi)ring solver for the natural numbers is squirreled away under Data.Nat.Solver. We can pull it into scope, and get access to the solver itself by subsequently opening +-*-Solver:

    module Example-Nat-Solver where
      open import Data.Nat.Solver
      open +-*-Solver

      open import Chapter2-Numbers
        using (_+_; _*_)
In our pen and paper example above, we did a lot of work to show the equality of 𝑎𝑐 + (𝑐𝑥 + 𝑎𝑧 + 𝑥𝑧𝑛) × 𝑛 and 𝑐 × (𝑎 + 𝑥𝑛) + 𝑧𝑛 × (𝑎 + 𝑥𝑛). Let’s prove this with the ring solver. We can start with the type, which already is quite gnarly:

      lemma₁
          : (a c n x z : ℕ)
          → a * c + (c * x + a * z + x * z * n) * n
          ≡ c * (a + x * n) + z * n * (a + x * n)
Inside of +-*-Solver is solve, which is our front-end for invoking the ring solver. The type of solve is a dependent nightmare, but we can give its arguments informally:

1. n : ℕ: the number of variables that exist in the expression.
2. A function from n variables to a syntactic representation of the expression you’d like solved.
3. A proof that the two expressions have the same normal form. This is almost always simply refl.
4. n more arguments, for the specific values of the variables.

In lemma₁ we have five variables (a, c, n, x, and z), and so our first argument to solve should be 5. Next we need to give a function which constructs the syntax of the equality we’re trying to show. In general this means replacing _≡_ with _:=_, _+_ with _:+_, _*_ with _:*_, and any constant k with con k. The variables you receive from the function can be used without any adjustment. Thus the full implementation of lemma₁ is:

      lemma₁ = solve 5
        (λ a c n x z
            → a :* c :+ (c :* x :+ a :* z :+ x :* z :* n) :* n
           := c :* (a :+ x :* n) :+ z :* n :* (a :+ x :* n)
        ) refl
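The same recipe is easier to follow on smaller goals. The sketch below is self-contained and assumes the Agda standard library's Data.Nat in place of the book's Chapter2-Numbers; the names swap and lem are hypothetical. It shows a two-variable goal and one that uses con for a constant.

```agda
open import Data.Nat using (ℕ; _+_; _*_)
open import Data.Nat.Solver using (module +-*-Solver)
open +-*-Solver
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

-- Two variables, so the first argument to solve is 2.
swap : (x y : ℕ) → x + y ≡ y + x
swap = solve 2 (λ x y → x :+ y := y :+ x) refl

-- The literal 0 in the type becomes con 0 in the syntax.
lem : (m n : ℕ) → m * (n + 0) ≡ n * m
lem = solve 2 (λ m n → m :* (n :+ con 0) := n :* m) refl
```

In both cases the lambda is a symbol-for-symbol transcription of the type, which is the duplication the next section sets out to remove.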
It’s certainly not the most beautiful sight to behold, but you must admit that it’s much better than proving this tedious fact by hand.

The syntactic equality term we must build here is a curious thing. What exactly is going on? This happens to be a quirk of the implementation of the solver, but it’s there for a good reason. Recall that our “usual” operations (that is, _+_ and _*_ and, in general, values that work over ℕ) are computational objects; Agda will compute and reduce them if it is able to do so, and will make these rewrites regardless of what you actually write down.

This syntax tree is an annoying thing to write, but is necessary to help the ring solver know what it’s trying to solve. Remember, just because we’ve written out this expression with full syntax here doesn’t mean this is the actual term that Agda is working on! Agda is free to expand definitional equalities, meaning it might have already reduced some of these additions and multiplications away!

But when you think about solving these sorts of equations on paper, what you’re actually doing is working with the syntax, and not actually computing in any real sense. The algorithm to solve equations is to use a series of syntactic rewrite rules that allow us to move symbolic terms around, without ever caring about the computational properties of those symbolic terms.

Thus, the lambda we need to give to solve is a concession to this fact; we’d like Agda to prove, symbolically, that the two terms are equivalent, without requiring any computation of the underlying terms in order to do so. And in order to do so, we must explicitly tell Agda what the symbolic equation is, since all it has access to is some stuck value that exists in Agda’s meta-theory, rather than in the theory of the ring itself.

This duplication between the Agda expression of the term and the symbolic version of the same is regrettable. Are we doomed to write them both, every time? Thankfully not.
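The remark about definitional equalities can be made concrete with two nearly identical goals. In this sketch (assuming the Agda standard library; zero-left and zero-right are hypothetical names), the first holds by computation alone, because _+_ recurses on its first argument, while the second is stuck on the variable and needs a real proof.

```agda
open import Data.Nat using (ℕ; _+_)
open import Data.Nat.Properties using (+-identityʳ)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

-- 0 + x reduces to x before Agda ever compares the two sides,
-- so the term we wrote is not the term Agda works with.
zero-left : (x : ℕ) → 0 + x ≡ x
zero-left x = refl

-- x + 0 cannot reduce while x is unknown; refl is rejected here,
-- and we must appeal to a lemma proved by induction.
zero-right : (x : ℕ) → x + 0 ≡ x
zero-right x = +-identityʳ x
```

A solver that only saw the reduced terms would have no reliable way to recover what equation was intended, which is why it asks for the syntax explicitly.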
9.10
Tactical Solving
Agda has a powerful macro system, which, in full glory, is beyond the scope of this book. However, at a high level, the macro system allows regular Agda programs to access the typechecker. This is a tremendous (if fragile) superpower, and allows programmers to do all sorts of unholy things. One such capability is to use the type currently expected by Agda in order to synthesize values at compile time. Another is to syntactically inspect an Agda expression (or type) at compile time. Together, these features can be used to automatically derive the symbolic form required for doing ring solving.

To illustrate broadly how this works, we can write code of this form:

    a + (x + z) * n      ≡⟨ ? ⟩
    (a + x * n) + z * n

Agda knows that the type of the hole must be a + (x + z) * n ≡ (a + x * n) + z * n, and if we were to put a macro in place of the hole, that macro can inspect the type of the hole. It can then perform all of the necessary replacements (turning _+_ into _:+_ and so on) in order to write the ring-solving symbolic lambda for us. All that is left to do is to tell the solver which variables we’d like to use, by sticking them in a list.

We can demonstrate all of this by implementing ≈-trans again. This time, the tactical ring solver is found in Data.Nat.Tactic.RingSolver, and requires lists to be in scope as well:

    module Example-Tactical where
      open import Data.Nat.Tactic.RingSolver
We can then show ≈-trans:

      open import Chapter2-Numbers
        using (_+_; _*_)

      ≈-trans
          : (a b c n x y z w : ℕ)
          → a + x * n ≡ b + y * n
          → b + z * n ≡ c + w * n
          → a + (x + z) * n ≡ c + (w + y) * n
      ≈-trans a b c n x y z w pxy pzw = begin
        a + (x + z) * n      ≡⟨ solve (a ∷ x ∷ z ∷ n ∷ []) ⟩
        (a + x * n) + z * n  ≡⟨ cong (_+ z * n) pxy ⟩
        (b + y * n) + z * n  ≡⟨ solve (b ∷ y ∷ n ∷ z ∷ []) ⟩
        (b + z * n) + y * n  ≡⟨ cong (_+ y * n) pzw ⟩
        c + w * n + y * n    ≡⟨ solve (c ∷ w ∷ n ∷ y ∷ []) ⟩
        c + (w + y) * n      ∎
        where open ≡-Reasoning
The solve macro only works for terms of type x ≡ y, which means it can’t be used to show parameterized properties, like lemma₁ earlier. For that, we can instead invoke solve-∀:

      lemma₁
          : (a c n x z : ℕ)
          → a * c + (c * x + a * z + x * z * n) * n
          ≡ c * (a + x * n) + z * n * (a + x * n)
      lemma₁ = solve-∀
As you can see, ring solving is an extremely powerful technique, capable of automating away hours of tedious proof work. But where do these magical powers come from? How can this possibly work? Let’s implement our own ring solver to explore that question.
9.11
The Pen and Paper Algorithm
An interesting insight into how to solve this problem is to use the analogy of solving a maze. Not the corn-maze sort, but the variety that comes on the back of cereal boxes. Solving a maze is often a two-sided approach; you explore from the beginning of the maze, and you simultaneously explore from the end. The goal is to meet somewhere in the middle. If you can get to the same place from both sides, you can compose the two half-solutions into a final path to escape the maze.

Why does this work? In some sense, it’s because the first moves you can take from either direction are relatively constrained. The possibilities are few, and there is an obvious decision procedure in the form of “is this going roughly the right direction?” As you move further from your starting point, the number of possibilities increases exponentially; after all, there’s always the chance that you took the wrong direction on your first step. By exploring from both sides at once, we are minimizing the effects of these exponential blow-ups.

Furthermore, your notion of “the right direction to head” improves as you gain familiarity with the other side of the maze. Now that you have a path, you don’t necessarily need to find the end of the path, you just need to intersect it. As a result, we have more “targets” to aim our search at.

All of this applies to proofs as well. We have well-defined starting and stopping points, and are tasked with bridging the distance between them. Here too we have exponential blow-ups in complexity, so we can cover the most space by searching from the top and bottom at the same time. Of course, this heuristic doesn’t always work. But what if we had a well-defined “middle” to pathfind to?

The reason the ring solver is a ring solver, as opposed to just a solver, is that rings give us a healthy balance between expressiveness and solvability. Why is that? Rings admit a normal, or canonical, form. That is to say, we have a well-defined, unique notion of what terms in a ring should look like. That means two terms are equal exactly when they have the same normal form, the proverbial “middle” of the maze.

Polynomials are the best examples of the canonical form of rings. While we can express polynomials in any number of ways, by far the most common is in the “sum of descending powers.” To jog your memory, most polynomials look like the following:

$$x^3 + 3x^2 - 9x - 17$$
It’s perfectly acceptable, if weird, to write the above as:

$$(x - 9 + x^2 + 2x)x - 17$$
which is equivalent, but the mere fact that it doesn’t “look like a polynomial” is a strong indication that you have internalized the polynomial canonical form, whether or not you were aware of it. Given the existence of canonical forms, we can now reduce the problem of proving ring equality to:

1. Prove both terms are equal to their canonical form.
2. Compare the canonical forms.
3. If the canonical forms match, compose the earlier proofs.

This is a powerful, widely-useful technique, so stick it in your belt! Let’s stop for a quick illustration of the idea in action. We’d like to prove that $(x + 1)(x - 1)$ is equal to $x(1 + x) + 1 - x - 2$. The first step is to reduce each to normal form:

$$
\begin{aligned}
(x + 1)(x - 1) &= x(x + 1) - 1(x + 1) \\
               &= x^2 + x - 1(x + 1) \\
               &= x^2 + x - x - 1 \\
               &= x^2 - 1
\end{aligned}
$$

and

$$
\begin{aligned}
x(1 + x) + 1 - x - 2 &= x + x^2 + 1 - x - 2 \\
                     &= x^2 + x - x + 1 - 2 \\
                     &= x^2 + 1 - 2 \\
                     &= x^2 - 1
\end{aligned}
$$
These expressions do in fact have the same normal form, and thus they are equal to one another, which we can show simply by composing the two proofs:
$$
\begin{aligned}
(x + 1)(x - 1) &= x(x + 1) - 1(x + 1) \\
               &= x^2 + x - 1(x + 1) \\
               &= x^2 + x - x - 1 \\
               &= x^2 - 1 \\
               &= x^2 + 1 - 2 \\
               &= x^2 + x - x + 1 - 2 \\
               &= x + x^2 + 1 - x - 2 \\
               &= x(1 + x) + 1 - x - 2
\end{aligned}
$$
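The three-step recipe can be phrased as a tiny Agda lemma. What follows is only a sketch under assumptions made for illustration: the module name NormalFormSketch and the lemma name via-normal-form are hypothetical (neither comes from this chapter nor from the standard library), and the example works over plain ℕ rather than an arbitrary ring.

```agda
module NormalFormSketch where

open import Data.Nat using (ℕ; _+_; _*_)
open import Relation.Binary.PropositionalEquality
  using (_≡_; refl; sym; trans)

-- If both sides can be shown equal to a common "middle" nf,
-- the two half-proofs compose into a proof of x ≡ y.
-- (via-normal-form is a hypothetical name used only in this sketch.)
via-normal-form : {A : Set} {x y nf : A} → x ≡ nf → y ≡ nf → x ≡ y
via-normal-form x=nf y=nf = trans x=nf (sym y=nf)

-- Both sides compute to the same normal form, 12.
_ : (1 + 2) * 4 ≡ 2 * (5 + 1)
_ = via-normal-form {nf = 12} refl refl
```

For closed numerals the two half-proofs are both refl, since Agda can simply compute. The interesting case is when variables are involved and the half-proofs must be built symbolically, which is the job a ring solver takes on.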
The notion of polynomial generalizes to arbitrary rings. Why is that? We have addition and multiplication, both are associative and commutative, and multiplication distributes over addition. Because of the distributivity, we can always produce a sum-of-products structure, that is, distribute all multiplications over every addition. In other words, we can always replace expressions of the form

$$x(5 + y)$$

with

$$5x + xy$$

which is to say, we can always move the additions to be the outermost nodes in the expression tree. Similarly, because multiplication is commutative, we can freely group together all occurrences of the same element. So, we can happily combine the two $x$s in

$$xyx = xxy = x^2 y$$
Finally, the commutativity of addition means we can reorder the outermost terms. This allows us to sort the terms by their descending powers of 𝑥. This collection of transformations clearly allows us to put any polynomial of one variable into normal form. It’s not immediately clear how the approach generalizes to polynomials in multiple variables, but as we will see in a moment, there is a very elegant trick that ties everything together. Describing the canonical form in such detail also gives us an insight into why we have ring solvers but not semigroup solvers. Semigroups, having only a single, associative binary operator, simply don’t have
enough algebraic structure to require interesting proofs. If your semigroup is commutative (“Abelian,” in the jargon) then you can simply reorder all the terms so they appear in a row. It’s exactly the interplay between addition and multiplication that makes the problem at all interesting.
9.12
Horner Normal Form
In order to put a polynomial into normal form, we must have a technique for doing so. Of course, we could just write a function that fiddles with an expression tree until it is in normal form, but, in general, it’s very difficult to prove the correctness of “fiddling.” A much better technique is to build a type which is guaranteed to be in the desired form, and then write a function that produces something of that type.

The natural representation of this normal form is a list of coefficients. If we have $x^2 + 5x - 3$, we can use -3 ∷ 5 ∷ 1 ∷ [] as our normal form. Why in reversed order, you might ask? Because we don’t know what the biggest power in the polynomial is until we reach the end. For the sake of easier bookkeeping, if we store our powers as little endian, we can ensure that like terms are always in the same place in the list. That is, adding $x^2 + 5x - 3$ to $2x + 2$ is much easier to do when the lists are stored in little endian instead of big endian!

While lists are the right intuition, they are not exactly right for our use case, as they don’t scale well to multiple variables. Instead, we look to a very similar idea called Horner’s method, which expresses polynomials in a slightly different form. Rather than writing $x^2 + 5x - 3$, we instead write:

$$(1x + 5)x - 3$$

in Horner normal form (henceforth HNF.) Here, every expression in HNF is either a constant 𝔸 → HNF, or it is of the form HNF → 𝔸 → HNF. We can express this as a data type:

    module Sandbox-Univariate-HNF {ℓ : Level} (𝔸 : Set ℓ) where
      data HNF : Set ℓ where
        coeff : 𝔸 → HNF
        _*x+_ : HNF → 𝔸 → HNF
Looking at this, what we really have is a non-empty snoc list under a different guise. Despite its name, HNF is not truly a normal form, since
we have infinitely many ways of expressing any given term, simply by padding it with a zero for its next power:

      postulate 0# : 𝔸  -- (1)

      nonunique : HNF → HNF
      nonunique (coeff a) = coeff 0# *x+ a
      nonunique (a *x+ b) = nonunique a *x+ b
This is regrettable, but a very difficult thing to solve at the level of types. Agda’s real ring solver performs a normalization stage after every computation to remove any highest-order zero powers, but this adds a great deal of complexity. Since we are only putting together a toy example, we will not concern ourselves with this problem, but do keep in mind its presence.

Note that at (1), we have postulated 0#; this is only because we haven’t formally defined rings or semirings or any of the actual structure we will need to build a ring solver. So instead, we will simply postulate any piece we need, and you should treat this entire discussion as a sketch of the technique. The motivated reader is encouraged to fill in all the gaps!

Horner normal form is desirable for computation since it gives rise to an interpretation into 𝔸 directly, via:

      postulate
        _+_ : 𝔸 → 𝔸 → 𝔸
        _*_ : 𝔸 → 𝔸 → 𝔸

      eval : 𝔸 → HNF → 𝔸
      eval x (coeff a) = a
      eval x (a *x+ b) = (eval x a * x) + b
This requires only $O(n)$ multiplications of $x$, where $n$ is the highest power in the polynomial. Compare that to the naive version in which you compute $x^3$ as x * x * x, which requires $O(n^2)$ multiplications.
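To see the univariate encoding compute, here is a small self-contained sketch. It makes several assumptions that are not part of the chapter’s development: it fixes 𝔸 to ℕ from the standard library instead of postulating the operations, it adds a left-associative fixity for _*x+_ so that chains can be written without parentheses, and the names HornerSketch, poly and padded are hypothetical.

```agda
module HornerSketch where

open import Data.Nat using (ℕ; _+_; _*_)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

-- Univariate Horner normal form, specialized to ℕ coefficients.
data HNF : Set where
  coeff : ℕ → HNF
  _*x+_ : HNF → ℕ → HNF

-- Assumed fixity: a *x+ b *x+ c parses as (a *x+ b) *x+ c.
infixl 5 _*x+_

eval : ℕ → HNF → ℕ
eval x (coeff a) = a
eval x (a *x+ b) = (eval x a * x) + b

-- x² + 5x + 3, written as (1x + 5)x + 3
poly : HNF
poly = coeff 1 *x+ 5 *x+ 3

-- At x = 2: (1 * 2 + 5) * 2 + 3 = 17
_ : eval 2 poly ≡ 17
_ = refl

-- The same polynomial padded with a leading zero power.
-- It is a different HNF term, yet it evaluates identically,
-- which is the non-uniqueness discussed above.
padded : HNF
padded = coeff 0 *x+ 1 *x+ 5 *x+ 3

_ : eval 2 padded ≡ 17
_ = refl
```

Natural numbers have no subtraction to speak of, so the sketch uses $x^2 + 5x + 3$ rather than the running example $x^2 + 5x - 3$.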
9.13
Multivariate Polynomials
All of our original examples of using ring solvers involved polynomials in multiple variables; recall lemma₁ which was a polynomial in five
variables. Clearly multivariate polynomials are important to actually getting work done, and thus we must determine a means of encoding them. The trick is both delightful and simple. In all of our analyses above, we discussed how coefficients play into the thing, without explicitly defining what these coefficients were. Based on our experience with single-variable polynomials, we took for granted that the coefficients must be ring elements, but this is not a necessity. We can recover multivariate polynomials by instead insisting that our coefficients be polynomials in a different variable. That is, we could express the polynomial $x^2 + y^2 + xy + y + 5x - 3$ as $x^2 + (y + 5)x + (y^2 + y - 3)$. This technique generalizes to any number of variables, for example by sticking a polynomial in $z$ in as the coefficients on $y$.

Let’s start our actual ring solver module in order to explore this idea. Since we would like eventual computational properties, we will add the bare minimum structure on 𝔸 as parameters to our module.

    module Sandbox-RingSolver {ℓ : Level} {𝔸 : Set ℓ}
        (0# 1# : 𝔸)
        (_+_ _*_ : 𝔸 → 𝔸 → 𝔸)
        (let infixr 5 _+_; _+_ = _+_)  -- (1)
        (let infixr 6 _*_; _*_ = _*_) where

The strange use of let at (1) is an uncommon Agda idiom for defining a fixity on parameters, nothing more. We will require many algebraic definitions to be in scope:

      module _ {A : Set ℓ} where
        open import Algebra.Definitions {A = A} _≡_ public
Encoding our multivariate HNF in Agda isn’t too tricky; though admittedly the resulting syntax leaves much to be desired. We can parameterize HNF by a natural corresponding to how many distinct variables it has. Anywhere before we used HNF we now use HNF (suc n), and anywhere we used a scalar 𝔸 we instead use HNF n.

      private variable
        n : ℕ

      data HNF : ℕ → Set ℓ where
        const : 𝔸 → HNF zero
        coeff : HNF n → HNF (suc n)
        _*x+_ : HNF (suc n) → HNF n → HNF (suc n)
Notice that we have also added const in order to build polynomials in zero variables, which corresponds to sticking in scalar values. This representation works perfectly well, but requires a little alertness when constructing its terms by hand. To take a concrete example, if we are working with an HNF 2 (a polynomial in two variables, call them $a$ and $b$), then the _*x+_ constructor is used to construct both the $a$ and $b$ univariate polynomials! For example, we would write $a^2 + ab + b^2$, that is, $(1a + b)a + b^2$, as:

      a²+ab+b² : HNF 2
      a²+ab+b² =
        ( coeff (coeff (const 1#))
            *x+ (coeff (const 1#) *x+ const 0#)  -- outer *x+ in a; inner *x+ in b (the coefficient b)
        ) *x+                                    -- *x+ in a
        ( (coeff (const 1#) *x+ const 0#)        -- *x+ in b
            *x+ const 0#                         -- *x+ in b
        )
Here, _*x+_ refers both to 𝑎 and to 𝑏, depending on its type (which itself depends on the constructor’s position in the tree.) We’ve annotated a and b here to help, but, as you can see, it is no great joy to construct HNF terms by hand! Thankfully, we won’t need to, and will instead use HNF as a sort of “compilation target” for other operations.
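One way to gain confidence in a hand-built term is to evaluate it at a point. The following self-contained sketch does so under assumptions of its own: it fixes the coefficients to ℕ, it borrows the shape of the multivariate eval function that this chapter introduces in its section on semantics, it renames Fin’s constructors to fzero and fsuc purely to sidestep overloading with ℕ’s, and the names TwoVariableSketch, b¹ and env are hypothetical.

```agda
module TwoVariableSketch where

open import Data.Nat using (ℕ; zero; suc; _+_; _*_)
-- Fin's constructors are renamed so they cannot clash with ℕ's.
open import Data.Fin using (Fin) renaming (zero to fzero; suc to fsuc)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

private variable
  n : ℕ

data HNF : ℕ → Set where
  const : ℕ → HNF zero
  coeff : HNF n → HNF (suc n)
  _*x+_ : HNF (suc n) → HNF n → HNF (suc n)

-- Evaluate at an assignment of variables; variable 0 is the
-- "current" one, and coefficients see the remaining variables.
eval : (Fin n → ℕ) → HNF n → ℕ
eval v (const a) = a
eval v (coeff a) = eval (λ i → v (fsuc i)) a
eval v (a *x+ b) = v fzero * eval v a + eval (λ i → v (fsuc i)) b

-- The polynomial b on its own, as an HNF in one variable: 1b + 0.
b¹ : HNF 1
b¹ = coeff (const 1) *x+ const 0

-- a² + ab + b² = (1a + b)a + b², where b² = (1b + 0)b + 0.
a²+ab+b² : HNF 2
a²+ab+b² = (coeff (coeff (const 1)) *x+ b¹) *x+ (b¹ *x+ const 0)

-- a = 2, b = 3
env : Fin 2 → ℕ
env fzero    = 2
env (fsuc _) = 3

-- 2² + 2 * 3 + 3² = 19
_ : eval env a²+ab+b² ≡ 19
_ = refl
```

A single evaluation point is of course not a proof that the term denotes the intended polynomial, but it catches most slips of the pen.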
9.14
Building a Semiring over HNF
The idea of HNF is that it is a particular encoding of polynomials. Therefore, we should expect to be able to do anything with HNF that we could do with polynomials encoded some other way. Furthermore, by virtue of it being a normal form, we expect all of these operations to be closed, meaning that if you combine two HNFs, you should always get back another HNF. For example, we can implement addition over HNFs simply by adding like terms:

      _⊕_ : HNF n → HNF n → HNF n
      const a   ⊕ const b   = const (a + b)
      coeff a   ⊕ coeff b   = coeff (a ⊕ b)
      coeff a   ⊕ (b *x+ c) = b *x+ (a ⊕ c)
      (a *x+ b) ⊕ coeff c   = a *x+ (b ⊕ c)
      (a *x+ b) ⊕ (c *x+ d) = (a ⊕ c) *x+ (b ⊕ d)
      infixr 5 _⊕_
Does this really implement addition, you might be wondering? And if so, congratulations, you’ve acquired the correct mindset: that we should demand proof for anything as complicated as this. Don’t worry, we will prove that _⊕_ does in fact implement addition, although first we need to figure out exactly how to formally phrase that question. Another thing we’d like to be able to do is inject scalars directly into a polynomial, rather than faffing about with big chains of coeff in order to stick in a const. This is given by ↪:

      ↪ : 𝔸 → HNF n
      ↪ {zero}  a = const a
      ↪ {suc n} a = coeff (↪ a)
We can now lift 0# and 1# into any polynomial simply by injecting them:

      0H : HNF n
      0H = ↪ 0#

      1H : HNF n
      1H = ↪ 1#
Working our way towards multiplication over HNF, we will first need one last piece in place: a helper function for multiplying by the current variable.

      x* : HNF (suc n) → HNF (suc n)
      x* a = a *x+ 0H
Note the type here; this is necessarily a function over HNF (suc n), since there are no variables to multiply when dealing with HNF zero. We are now ready to implement _⊗_, which takes advantage of the well-known foiling rule that $(ax + b)(cx + d) = acx^2 + adx + bcx + bd$.

      _⊗_ : HNF n → HNF n → HNF n
      const a   ⊗ const b   = const (a * b)
      coeff a   ⊗ coeff b   = coeff (a ⊗ b)
      coeff a   ⊗ (b *x+ c) = (coeff a ⊗ b) *x+ (a ⊗ c)
      (a *x+ b) ⊗ coeff c   = (a ⊗ coeff c) *x+ (b ⊗ c)
      (a *x+ b) ⊗ (c *x+ d) =
        x* (x* (a ⊗ c)) ⊕ x* ((a ⊗ coeff d) ⊕ (c ⊗ coeff b)) ⊕ coeff (b ⊗ d)
      infixr 6 _⊗_
We have now implemented 0H, 1H, _⊕_ and _⊗_, which are all of the necessary moving pieces for a semiring. We could construct a full-blown ring instead by requiring a negation operation over 𝔸, and closing HNF over this operation as well, but that is left as an exercise to the dedicated reader.
9.15
Semantics
In order to prove that addition and multiplication do what they say on the tin, we must give a semantics to HNF, in essence, giving a specification for how they ought to behave. This is sometimes called a denotation or a model. Semantics are often given by a function into some other type. We saw a function like this in our univariate example, in which we evaluated an HNF down to an 𝔸. We will do the same thing here, except that our new eval function must take a mapping of variables to 𝔸, which we can encode as a function Fin n → 𝔸. Thus, we have:

      eval : (Fin n → 𝔸) → HNF n → 𝔸
      eval v (const a) = a
      eval v (coeff a) = eval (v ∘ suc) a
      eval v (a *x+ b) = v zero * eval v a + eval (v ∘ suc) b
Given a model of HNF, we would now like to show that everything we’ve built so far does in fact preserve meaning, which is to say, addition in HNF should correspond to addition over 𝔸, and so on and so forth. This mathematical property is known as a homomorphism, which means “structure preserving.” The idea being that the homomorphism maps structure on one side to equivalent structure on the other. As a first example, we can give the type of nullary homomorphisms:

      Homomorphism₀ : HNF n → 𝔸 → Set ℓ
      Homomorphism₀ h a = ∀ v → eval v h ≡ a
and subsequently show that there exists a homomorphism between ↪ a : HNF n and a : 𝔸, as per eval-↪:

      eval-↪ : (a : 𝔸) → Homomorphism₀ {n} (↪ a) a
      eval-↪ {zero}  a f = refl
      eval-↪ {suc n} a f = eval-↪ a (f ∘ suc)
There exist two special cases of eval-↪:

      eval-0H : Homomorphism₀ {n} 0H 0#
      eval-0H = eval-↪ 0#

      eval-1H : Homomorphism₀ {n} 1H 1#
      eval-1H = eval-↪ 1#
We also have two unary homomorphisms over eval, although their types are tricky enough that we don’t attempt to give a type synonym for them. The first is that evaluation of a coeff term is equivalent to evaluating it having dropped the current variable.

      eval-coeff
          : (f : Fin (suc n) → 𝔸)
          → (h : HNF n)
          → eval f (coeff h) ≡ eval (f ∘ suc) h
      eval-coeff f a = refl
and the other is that to-var (defined momentarily) simply evaluates to the desired variable. First we will write to-var, which transforms a Fin n into the corresponding variable in the correct coefficient space:

      to-var : Fin n → HNF n
      to-var zero    = x* 1H
      to-var (suc x) = coeff (to-var x)
We would like to show that the evaluation of this term is equivalent to just instantiating the correct variable. Constructing the homomorphism here requires some of the semiring structure over 𝔸, which we will postulate since we are only making a toy example. In a real implementation, however, these postulates should be required of whoever is instantiating the solver module.

      postulate
        +-identityʳ : RightIdentity 0# _+_
        *-identityʳ : RightIdentity 1# _*_

      eval-to-var
          : (f : Fin n → 𝔸)
          → (x : Fin n)
          → eval f (to-var x) ≡ f x
      eval-to-var f zero
        rewrite eval-0H (f ∘ suc)
        rewrite eval-1H (f ∘ suc)
        rewrite *-identityʳ (f zero)
          = +-identityʳ (f zero)
      eval-to-var f (suc x) = eval-to-var (f ∘ suc) x
There is a third unary homomorphism we’d like to show, namely that x* does what it should.

      open ≡-Reasoning

      eval-x*
          : (f : Fin (suc n) → 𝔸)
          → (h : HNF (suc n))
          → eval f (x* h) ≡ f zero * eval f h
      eval-x* f (coeff a) =
        begin
          f zero * eval f' a + eval f' (↪ 0#)
        ≡⟨ cong ((f zero * eval f' a) +_) (eval-0H f') ⟩
          f zero * eval f' a + 0#
        ≡⟨ +-identityʳ _ ⟩
          f zero * eval f' a
        ∎
        where
          f' = f ∘ suc
      eval-x* f (a *x+ b) =
        let f0 = f zero  -- (1)
            f' = f ∘ suc
            ↓  = eval f
            ↓' = eval f'
         in begin
          f0 * (f0 * ↓ a + ↓' b) + ↓' (↪ 0#)
        ≡⟨ cong (f0 * (f0 * ↓ a + ↓' b) +_) (eval-0H f') ⟩
          f0 * (f0 * ↓ a + ↓' b) + 0#
        ≡⟨ +-identityʳ _ ⟩
          f0 * (f0 * ↓ a + ↓' b)
        ∎
Notice that at (1) we have introduced a let binding in order to give shorter names to common expressions that frequently occur in our proof. This is a useful trick for managing the amount of mental capacity required to work through a proof. Now come the interesting pieces. We’d like to show two binary homomorphisms, one from _⊕_ to _+_, and another between _⊗_ and _*_. First, we can give the definition of a binary homomorphism:

      Homomorphism₂ : (HNF n → HNF n → HNF n) → (𝔸 → 𝔸 → 𝔸) → Set ℓ
      Homomorphism₂ f g =
        ∀ v x₁ x₂ → eval v (f x₁ x₂) ≡ g (eval v x₁) (eval v x₂)
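Before attempting the general proofs, it can help to watch the binary homomorphism property hold at a single concrete instance. The sketch below is self-contained and makes assumptions of its own: coefficients are fixed to ℕ, Fin’s constructors are renamed to fzero and fsuc to avoid overloading, and the names HomomorphismSketch, p, q and at-4 are hypothetical. It checks one instance only, and is in no way a substitute for the proof.

```agda
module HomomorphismSketch where

open import Data.Nat using (ℕ; zero; suc; _+_; _*_)
-- Fin's constructors are renamed so they cannot clash with ℕ's.
open import Data.Fin using (Fin) renaming (zero to fzero; suc to fsuc)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

private variable
  n : ℕ

data HNF : ℕ → Set where
  const : ℕ → HNF zero
  coeff : HNF n → HNF (suc n)
  _*x+_ : HNF (suc n) → HNF n → HNF (suc n)

-- Addition of like terms, as in the chapter, over ℕ coefficients.
_⊕_ : HNF n → HNF n → HNF n
const a   ⊕ const b   = const (a + b)
coeff a   ⊕ coeff b   = coeff (a ⊕ b)
coeff a   ⊕ (b *x+ c) = b *x+ (a ⊕ c)
(a *x+ b) ⊕ coeff c   = a *x+ (b ⊕ c)
(a *x+ b) ⊕ (c *x+ d) = (a ⊕ c) *x+ (b ⊕ d)

eval : (Fin n → ℕ) → HNF n → ℕ
eval v (const a) = a
eval v (coeff a) = eval (λ i → v (fsuc i)) a
eval v (a *x+ b) = v fzero * eval v a + eval (λ i → v (fsuc i)) b

p q : HNF 1
p = coeff (const 1) *x+ const 2   -- x + 2
q = coeff (const 3)               -- 3

-- Assign the single variable the value 4.
at-4 : Fin 1 → ℕ
at-4 _ = 4

-- Structurally, the constant terms have been added together.
_ : p ⊕ q ≡ coeff (const 1) *x+ const 5
_ = refl

-- And evaluating the sum agrees with summing the evaluations:
-- 4 * 1 + 5 on the left, (4 * 1 + 2) + 3 on the right.
_ : eval at-4 (p ⊕ q) ≡ eval at-4 p + eval at-4 q
_ = refl
```

Here refl suffices because everything is closed and Agda can compute both sides. The general statement quantifies over all v, x₁ and x₂, which is where the algebraic laws become necessary.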
The details of these two homomorphisms are quite cursed. As Reed Mullanix says, “solvers are fun because they condense all the suffering into one place.” The idea is that we will take on all the pain of solving ring problems, and tackle them once and for all. The result is hairy, to say the least. For the sake of this book’s length, we will not prove these two homomorphisms in their full glory; instead, we will sketch them out and leave the details for a particularly motivated reader. In order to show the homomorphism for addition, we will require +-assoc, which we again postulate, but in a real solver should instead be brought in as part of the proof that 𝔸 is a (semi)ring in the first place.

      postulate
        +-assoc : Associative _+_

      eval-⊕ : Homomorphism₂ {n} _⊕_ _+_
      eval-⊕ f (const a) (const b) = refl
      eval-⊕ f (coeff a) (coeff b) = eval-⊕ (f ∘ suc) a b
      eval-⊕ f (coeff a) (b *x+ c) = exercise-for-the-reader
      eval-⊕ f (a *x+ b) (coeff c)
        rewrite eval-⊕ (f ∘ suc) b c =
          sym (+-assoc _ _ _)
      eval-⊕ f (a *x+ b) (c *x+ d) = exercise-for-the-reader
The real pain in writing a ring solver is in the homomorphism for multiplication, which is presented here in a very sketched form. There are five cases we need to look at, the first four of which are rather reasonable:

      postulate
        *-distribˡ-+ : _*_ DistributesOverˡ _+_
        *-distribʳ-+ : _*_ DistributesOverʳ _+_

      eval-⊗ : Homomorphism₂ {n} _⊗_ _*_
      eval-⊗ f (const a) (const b) = refl
      eval-⊗ f (coeff a) (coeff b) = eval-⊗ (f ∘ suc) a b
      eval-⊗ f (coeff a) (b *x+ c) = exercise-for-the-reader
      eval-⊗ f (a *x+ b) (coeff c) = exercise-for-the-reader
The final case, which multiplies _*x+_ against _*x+_, is an extremely nasty piece of work. Recall that in the definition of _⊗_, we needed to invoke x* three times, _⊕_ three times, and _⊗_ itself four times. Every instance of these uses requires an invocation of the corresponding homomorphism, conged into the right place, and then algebraically manipulated so that like terms can be grouped. This proof is no laughing matter; remember, the ring solver coalesces all of the pain into one place, and this is where it has accumulated. Thankfully, that’s your problem, not mine:

      eval-⊗ f (a *x+ b) (c *x+ d) = exercise-for-the-reader
9.16
Syntax
We’re nearly at the home stretch. With our semantics out of the way, the next step in our ring solver is to implement the symbolic expression of our ring. This syntax is responsible for bridging the gap between the concrete values in the ring we’d like to equate, and their connections to HNF. While this might sound intimidating, it’s exceptionally straightforward after the previous slog proving the multiplication homomorphism. Our syntax for semirings is simple and unassuming:

      private variable
        ℓ₁ : Level

      data Syn (n : ℕ) : Set ℓ where
        var  : Fin n → Syn n
        con  : 𝔸 → Syn n
        _:+_ : Syn n → Syn n → Syn n
        _:*_ : Syn n → Syn n → Syn n
      infixl 5 _:+_
      infixl 6 _:*_
Additionally, we can assign semantics for Syn, which, given a mapping for the variables, produces an 𝔸.

      ⟦_⟧ : Syn n → (Fin n → 𝔸) → 𝔸
      ⟦ var v ⟧  vs = vs v
      ⟦ con c ⟧  vs = c
      ⟦ x :+ y ⟧ vs = ⟦ x ⟧ vs + ⟦ y ⟧ vs
      ⟦ x :* y ⟧ vs = ⟦ x ⟧ vs * ⟦ y ⟧ vs
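As a quick illustration of how a piece of syntax turns back into a value, here is a self-contained sketch of Syn and ⟦_⟧ with the coefficients fixed to ℕ. That specialization is an assumption made for the sake of a runnable example, as are the names SyntaxSketch, example and env.

```agda
module SyntaxSketch where

open import Data.Nat using (ℕ; _+_; _*_)
open import Data.Fin using (Fin; zero; suc)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

-- The semiring syntax, with ℕ constants.
data Syn (n : ℕ) : Set where
  var  : Fin n → Syn n
  con  : ℕ → Syn n
  _:+_ : Syn n → Syn n → Syn n
  _:*_ : Syn n → Syn n → Syn n
infixl 5 _:+_
infixl 6 _:*_

-- Interpret syntax by looking variables up in an environment.
⟦_⟧ : ∀ {n} → Syn n → (Fin n → ℕ) → ℕ
⟦ var v ⟧  vs = vs v
⟦ con c ⟧  vs = c
⟦ x :+ y ⟧ vs = ⟦ x ⟧ vs + ⟦ y ⟧ vs
⟦ x :* y ⟧ vs = ⟦ x ⟧ vs * ⟦ y ⟧ vs

-- The expression x * (5 + y), with x as variable 0 and y as variable 1.
example : Syn 2
example = var zero :* (con 5 :+ var (suc zero))

-- x = 3, y = 4
env : Fin 2 → ℕ
env zero    = 3
env (suc _) = 4

-- 3 * (5 + 4) = 27
_ : ⟦ example ⟧ env ≡ 27
_ = refl
```

The point of Syn is that example is plain data: unlike the Agda expression x * (5 + y), it can be inspected and transformed by ordinary functions.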
However, this is not the only interpretation we can give for Syn. There is also a transformation from Syn into HNF:

      hnf : Syn n → HNF n
      hnf (var x)  = to-var x
      hnf (con x)  = ↪ x
      hnf (x :+ b) = hnf x ⊕ hnf b
      hnf (x :* b) = hnf x ⊗ hnf b
It is exactly the relationship between ⟦_⟧ and hnf that we’re interested in. The former allows us to transform syntax into computable values in the domain of the user of our solver. The latter gives us a means of computing the normal form for a piece of syntax. The relevant theorem here is eval-hnf, which states that you get the same answer whether you evaluate the hnf or simply push the syntax through ⟦_⟧.

      eval-hnf
          : (f : Fin n → 𝔸)
          → (s : Syn n)
          → eval f (hnf s) ≡ ⟦ s ⟧ f
      eval-hnf f (var a) = eval-to-var f a
      eval-hnf f (con a) = eval-↪ a f
      eval-hnf f (s :+ s₁)
        rewrite eval-⊕ f (hnf s) (hnf s₁)
        rewrite eval-hnf f s
        rewrite eval-hnf f s₁ = refl
      eval-hnf f (s :* s₁)
        rewrite eval-⊗ f (hnf s) (hnf s₁)
        rewrite eval-hnf f s
        rewrite eval-hnf f s₁ = refl
9.17
Solving the Ring
Everything is now in place in order to actually solve equalities in our (semi)ring. The core of our solver is equate, the function which is capable of showing that two ring expressions are equivalent. The interface to this function leaves quite a lot to be desired, but we will work on the ergonomics momentarily. The idea here is to use eval-hnf to show that the interpretation of the left-hand syntax object is equivalent to the evaluation of its normal form. Subsequently, we ask the user for a proof that the normal forms are the same (which is always just refl) and then do the same chain of proof compositions backwards across the other ring expression. Thus, equate in all its glory is as follows:

      equate
          : (lhs rhs : Syn n)
          → hnf lhs ≡ hnf rhs
          → (f : Fin n → 𝔸)
          → ⟦ lhs ⟧ f ≡ ⟦ rhs ⟧ f
      equate lhs rhs lhs=rhs f = begin
        ⟦ lhs ⟧ f         ≡⟨ sym (eval-hnf f lhs) ⟩
        eval f (hnf lhs)  ≡⟨ cong (eval f) lhs=rhs ⟩
        eval f (hnf rhs)  ≡⟨ eval-hnf f rhs ⟩
        ⟦ rhs ⟧ f         ∎
While equate does do everything we’ve promised, its interface leaves much to be desired. In particular, it requires us to manage all of our variables by hand; not only must we give an explicit function f : Fin n → 𝔸 to evaluate the variables, but we also must encode them ourselves in the lhs and rhs Syn objects. This is a far cry from the standard library’s ring solver, which gives us the syntactic variables
in a lambda, and allows us to fill in the eventual values as additional arguments to the function. As our very last objective on this topic, we will turn our attention to assuaging both of these pain points.
9.18
Ergonomics
Some reflection on the difference between our ring solver and the one in Agda’s standard library suggests the path forwards on improving our solver’s ergonomics. Both of the aforementioned differences (providing the syntactic variables, and receiving the actual values for those variables) are instances of a single mechanism: the $n$-ary function. An $n$-ary function is one which receives $n$ arguments of type A before returning something of type B. Building such a thing is less difficult than it might seem at first blush.

Recall sec. 1.20, in which we discussed the curry/uncurry isomorphism, showing that it’s always possible to transform between a curried function of the form A → B → C, and a tupled function of the form A × B → C. This is exactly the same idea as what we’ll need to implement an $n$-ary function, except that we will use a Vec instead of a tuple. We can give the type of an $n$-ary function by doing induction on a natural number, corresponding to the number of arguments we still need to take. Writing a non-dependent version of this type is straightforward:

      open import Data.Vec using (Vec; []; _∷_; lookup; map)

      N-ary′ : ℕ → Set ℓ₁ → Set ℓ₁ → Set ℓ₁
      N-ary′ zero    A B = B
      N-ary′ (suc n) A B = A → N-ary′ n A B
While this works, it doesn’t allow the B type to depend on the vector accumulated in giving this type. Fixing the issue requires a little bit of brain-folding:

      N-ary : (n : ℕ) → (A : Set ℓ₁) → (Vec A n → Set ℓ₁) → Set ℓ₁
      N-ary zero    A B = B []
      N-ary (suc n) A B = (a : A) → N-ary n A (B ∘ (a ∷_))
In general, the non-dependent versions of functions are special cases of the dependent ones, in which the argument is simply ignored. This gives us a “better” definition for N-ary′:

      N-ary′ : ℕ → Set ℓ₁ → Set ℓ₁ → Set ℓ₁
      N-ary′ n A B = N-ary n A (λ _ → B)
Analogously to the curry and uncurry functions which provided the isomorphism between curried and tupled functions, we can give two functions to show that N-ary n A B is equivalent to a function Vec A n → B. Such a thing comes in two flavors: we must be able to convert one way, and be able to convert back. First we have curryⁿ, which transforms a vectorized function into an $n$-ary one:

      curryⁿ
          : {n : ℕ} {A : Set ℓ₁} {B : Vec A n → Set ℓ₁}
          → ((v : Vec A n) → B v)
          → N-ary n A B
      curryⁿ {n = zero}  x   = x []
      curryⁿ {n = suc n} x a = curryⁿ (x ∘ (a ∷_))
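To see the currying direction in action without the dependent types getting in the way, here is a self-contained sketch of the simpler, non-dependent variant. It assumes the first (direct) definition of N-ary′ at the lowest universe level, and the names NarySketch, curry′ and sum are hypothetical, chosen so as not to be confused with curryⁿ itself.

```agda
module NarySketch where

open import Data.Nat using (ℕ; zero; suc; _+_)
open import Data.Vec using (Vec; []; _∷_)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

-- Non-dependent n-ary functions: n arguments of type A, then a B.
N-ary′ : ℕ → Set → Set → Set
N-ary′ zero    A B = B
N-ary′ (suc n) A B = A → N-ary′ n A B

-- Turn a function over vectors into an n-ary function by
-- collecting the arguments one at a time.
curry′ : ∀ {n A B} → (Vec A n → B) → N-ary′ n A B
curry′ {zero}  f   = f []
curry′ {suc n} f a = curry′ {n} (λ v → f (a ∷ v))

-- A function that is easy to write over vectors...
sum : ∀ {n} → Vec ℕ n → ℕ
sum []      = 0
sum (x ∷ v) = x + sum v

-- ...and pleasant to call once curried: 1 + (2 + (3 + 0)) = 6.
_ : curry′ {3} sum 1 2 3 ≡ 6
_ = refl
```

The explicit {3} tells Agda how many arguments to expect; in the solver, this role is played by the n passed to solve.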
As an inverse, we have _$ⁿ_, which undoes the transformation made by curryⁿ. The name here might strike you as peculiar, but it’s a pun on a famous Haskell idiom where _$_ is the function application operator.

      _$ⁿ_
          : {n : ℕ} {A : Set ℓ₁} {B : Vec A n → Set ℓ₁}
          → N-ary n A B
          → ((v : Vec A n) → B v)
      _$ⁿ_ {n = zero}  f []      = f
      _$ⁿ_ {n = suc n} f (x ∷ v) = f x $ⁿ v

_$ⁿ_ and curryⁿ allow us to swap between an $n$-ary function (which is convenient for users, but hard to actually implement anything using) and a function over vectors (which is much easier to use as an implementer). Thus, we can use curryⁿ whenever we’d like to present a function to the user, and transform it into something workable via _$ⁿ_. Now that we have an $n$-ary function that we can use to give the user his syntactic variables, we’d like him to be able to give us both sides of the equation back. Recall that this is not possible with Syn,
which doesn’t contain any constructor for differentiating between the two sides. However, further thought shows that really we’d just like to get back two Syn objects. Rather than going through the work of making a new type to do this for us, we can simply re-purpose _×_ by giving it a new can of paint:

      open import Chapter1-Agda
        using (_×_)
        renaming (_,_ to _:=_)
        public
By renaming _,_ to _:=_, we can now write a syntactic equality as lhs := rhs, and our users are none the wiser. There is one final thing to do, and that’s to generate a vector full of distinct variables that we can inject into the syntax lambda that the user gives us. This is done in two pieces: the first step builds the distinct Fin values, and the second then executes map in order to transform them into Syn.

      fins : Vec (Fin n) n
      fins {zero}  = []
      fins {suc n} = zero ∷ map suc fins

      vars : Vec (Syn n) n
      vars = map var fins
And that’s all, folks. We can now give a full-strength definition of solve, equivalent to the one in Agda’s standard library:

      solve
          : (n : ℕ)
          → (eq : N-ary′ n (Syn n) (Syn n × Syn n))
          → (let x := y = eq $ⁿ vars {n})
          → hnf x ≡ hnf y
          → N-ary n 𝔸 (λ v → ⟦ x ⟧ (lookup v) ≡ ⟦ y ⟧ (lookup v))
      solve n eq x=y =
        let x := y = eq $ⁿ vars {n}
         in curryⁿ (equate x y x=y ∘ lookup)
UNICODE IN THIS CHAPTER

²   U+00B2    SUPERSCRIPT TWO (\^2)
×   U+00D7    MULTIPLICATION SIGN (\x)
ʳ   U+02B3    MODIFIER LETTER SMALL R (\^r)
ˡ   U+02E1    MODIFIER LETTER SMALL L (\^l)
λ   U+03BB    GREEK SMALL LETTER LAMDA (\Gl)
′   U+2032    PRIME (\')
ⁿ   U+207F    SUPERSCRIPT LATIN SMALL LETTER N (\^n)
₀   U+2080    SUBSCRIPT ZERO (\_0)
₁   U+2081    SUBSCRIPT ONE (\_1)
₂   U+2082    SUBSCRIPT TWO (\_2)
ℓ   U+2113    SCRIPT SMALL L (\ell)
𝔸   U+1D538   DOUBLE-STRUCK CAPITAL A (\bA)
ℕ   U+2115    DOUBLE-STRUCK CAPITAL N (\bN)
→   U+2192    RIGHTWARDS ARROW (\to)
↓   U+2193    DOWNWARDS ARROW (\d-)
↪   U+21AA    RIGHTWARDS ARROW WITH HOOK (\r)
∀   U+2200    FOR ALL (\all)
∎   U+220E    END OF PROOF (\qed)
∘   U+2218    RING OPERATOR (\o)
∷   U+2237    PROPORTION (\::)
≈   U+2248    ALMOST EQUAL TO (\~~)
≡   U+2261    IDENTICAL TO (\==)
⊕   U+2295    CIRCLED PLUS (\o+)
⊗   U+2297    CIRCLED TIMES (\ox)
⟦   U+27E6    MATHEMATICAL LEFT WHITE SQUARE BRACKET (\[[)
⟧   U+27E7    MATHEMATICAL RIGHT WHITE SQUARE BRACKET (\]])
⟨   U+27E8    MATHEMATICAL LEFT ANGLE BRACKET (\<)
⟩   U+27E9    MATHEMATICAL RIGHT ANGLE BRACKET (\>)
Acknowledgments
It’s easy to underestimate how hard it is to write a book. My wife says her superpower is forgetting how bad things can get; maybe she’s been rubbing off on me, because only a very crazy or very forgetful man would decide to write a third book. And that’s not to mention all the books that were started and died in the water before this one took hold.

This particular piece of prose has been a wonderful whirlwind. The seed came to me while hanging out with some friends who wanted to combine quantum mechanics with game theory. I offered to sit down with them to help formalize their ideas in Agda. I had no idea what I was doing, but it quickly became clear that my ability to weave proofs in Agda more than made up for my abject failure to understand what they were going on about. That night I jotted down some quick notes about what such a book could look like. Little did I know, those scribbles would come to dominate the next year of my life.

A lot has changed in that year: I’ve gotten married, gone back to school, and have done a lot of growing up. Through all of it, I’ve been working on this book, and it’s frankly a little bittersweet to be finished with this thing that’s been with me for so long.

Many people were involved in helping tie the knot in reality from which this book springs forth. Some of the particularly instrumental people in this journey are:

Erin Jackes, for being my muse, my guiding star, and my one true love. Without her support and outstanding company, I never would have found the wherewithal to make it to the finish line.

Reed Mullanix, who has never once balked at taking an hour out of his day to explain something to me. And I ask very often. Most of the technical things I know have been taught to me by Reed.

Conal Elliott, for his unwavering dedication to beauty and simplicity, and for inspiring me to demand more from my work. I pray that one day my taste will be half as good as his.

Other fantastic people whom I can’t praise enough include Solomon Bothwell, Shae Erisson, Li-yao Xia, Jonathan Lorimer, Asad Saeeduddin, Callan McGill, and the whole Cofree Coffee crew. Andrew McKnight, Kenneth Bruskiewicz, and Yanze Li were instrumental in helping me get through some nasty proofs. Farhad Mehta, John Hughes, Judah Jacobson, Ryan Hunter, thank you all for your support.

The punchy cover was designed by Laura Stepšytė. The interior styling was inspired by the work of Bob Nystrom, and the color scheme is lifted from Byrne’s Euclid. The book is compiled through an unholy matrimony made possible by John MacFarlane, Rijnard van Tonder and Amélia Liao, among others. Some ideas for what a book on Agda should be like came from Programming Language Foundations in Agda, although I must admit (abashedly) that I haven’t actually read it. Yet. This book uses Dan Zadorozny’s excellent font “Special Agent Condensed,” and the spectacular “Front Page Neue” by Allison James.

Finally, my most sincere gratitude goes out to everyone who has helped make Agda happen. It’s a marvelous piece of technology with a delightful community, and is the most fun I’ve had programming in years. Thank you all most sincerely, from the very bottom of my heart.

Sandy Maguire
Vancouver, Canada, October 2023.
About the Author

Sandy Maguire is a man of many facets, whose passion for software correctness has led him to become the author of three influential books in the field. He firmly believes that a combination of solid theoretical foundations and the practical wisdom to apply them is paramount in the world of software development.

In addition to theoretical software, Sandy’s interests are as diverse as they are unique. He’s an avid car-hater who finds solace from the busy city as a talented pianist. His enthusiasm doesn’t stop there, as he’s also a dedicated parkour enthusiast, constantly seeking new heights and challenges.

Not content with the status quo, Sandy stepped away from the software industry in 2018 to actively pursue change. He’s a self-described author, programmer, musician, and trouble-maker who’s been immersed in the world of coding for over two decades. With considerable time and experience, he feels like he’s finally making progress on where good software comes from.

Currently residing in Vancouver with his wife Erin, you can find Sandy vigorously researching programming languages and delving into the intricacies of mathematics. When he’s not busy advocating for purely functional programming, he can be found leaping over obstacles and attempting not to destroy coffee machines.

Sandy Maguire’s unique blend of expertise, creativity, and adventurous spirit makes him a compelling and dynamic figure in the world of software.
Books by Sandy Maguire

• Thinking with Types, 2018, Cofree Press
  available at https://leanpub.com/thinking-with-types
• Algebra-Driven Design, 2020, Cofree Press
  available at https://leanpub.com/algebra-driven-design
• Certainty by Construction, 2023, Cofree Press
  available at https://leanpub.com/certainty-by-construction