The real world is a complicated place; in many cases we don’t even know the probability of something happening. Nevertheless, there are a variety of circumstances where it is important to prove that, given certain reasonable assumptions, certain conclusions necessarily follow.
The desire to conclusively prove claims is an ancient desire. Euclid showed (in geometry) that it was possible to start with a short list of reasonable assumptions, called axioms, and then prove an astonishing array of results. In 1685 Gottfried Leibniz stated in “The Art of Discovery” that “The only way to rectify our reasonings is to make them as tangible as those of the Mathematicians, so... when there are disputes... we can simply say: Let us calculate... to see who is right.” At the time, the logic languages necessary to apply to many fields didn’t exist, and even if they did, the challenge of applying them manually would have made them less useful.
The good news is that today there are a variety of computer programs and logic systems (languages) that can help you conclusively prove claims about the real world. The bad news is that it’s often hard to find out how to use these tools. For example, many of their creators assume that users have a vast amount of mathematical background.
When I was writing my dissertation Fully Countering Trusting Trust through Diverse Double-Compiling I ended up having to learn a lot of poorly-documented “folklore”. In this paper, I briefly provide some “lessons learned” that I hope will help others who want to prove that certain claims are true.
Here are some brief lessons learned:
Unless you want to do things by hand, you’ll need a tool. Different tools implement different logic languages, and of course, the language you use matters greatly. So in practice, you end up selecting both the logic language and the tool simultaneously. (This is a lot like selecting a programming language, really.)
Unfortunately, the current practice is that you need to learn a little bit about several different tools and their languages before you can decide if they’re right for your problem. That’s unfortunate, but probably unavoidable. In particular, tools tend to add little improvements or extensions that may or may not be helpful to you.
My paper High Assurance (for Security or Safety) and Free-Libre / Open Source Software (FLOSS) specifically discusses a number of FLOSS formal methods tools that may be useful to you. I recommend that you prefer FLOSS proof tools for these purposes; they cost nothing, aren’t tied to a particular vendor (avoiding vendor lock-in), won’t disappear on you in the future, and so on.
There are several different basic logic languages. Going from weakest to strongest, major types of languages include propositional logic, first-order logic (FOL), modal FOL, and higher-order logic (HOL). ([Schumann2001, page 73] has a slightly more complicated summary about language expressiveness.) Propositional language is too weak for most ordinary users; programs that implement propositional languages (such as SAT solvers) are usually building blocks for creating larger systems instead of being used directly by end-users.
I recommend that if you’re new to this, start by learning first-order logic (FOL), and see if you can make your problem fit inside FOL. The modal logic and higher-order logic (HOL) systems can be very useful, but they are more difficult to automate (thus requiring more human intervention). What’s more, it’s much easier to learn other systems once you know FOL.
In practice, real-world languages have lots of extensions and “built-in theorems”. At the least, almost every tool builds in equality (=), because that’s necessary for lots of real work. Some systems (like prover9) build in list handling, so you can handle arbitrary-length lists; these can be really useful. Many systems build in or add various forms of arithmetic; if you need arithmetic, you need to use such systems instead of trying to add them yourself. The program CVC3 is a good program to use if you need something that includes lots of arithmetic. Other useful programs include HOL, PVS, and (if you’re proving programs) ACL2.
I should warn you that the term “model” has two subtly different meanings. The non-technical term “model” means “a simplified representation of the real world”; in this sense, if you are trying to prove something about the real world, then your assumptions and goal(s) are models of the real world. There’s also a technical meaning of the term “model” that means “an assignment of constants that meet all the given constraints”; this is the kind of model that “model-checkers” find. Both meanings are used, and it can be confusing. Sorry.
You can see a little more about formal methods in general at my formal methods page.
Many provers are based on “first-order logic” (FOL). If you can express your requirements using some version of FOL, then you can probably use lots of highly-automated tools that actually create the proof (instead of just checking it). Wikipedia’s “First-order logic” entry has some useful information. Some people use different symbols for the same concept, and may add or remove some operations, but once you know one variant it’s trivial to learn another variant.
My dissertation Fully Countering Trusting Trust through Diverse Double-Compiling section 5.2 gives a brief summary of FOL in 5.2. In short, in FOL, every expression is a term or a formula. A term (which denotes an object) is defined as: a variable, a constant, or a function application of form f(...) where each of the zero or more comma-separated parameters is a term. A formula (which denotes a truth value of true or false) is defined as being one of the following (if both A and B are formulas):
Expression | Meaning |
---|---|
-A | “not A”. If A is true, then -A is false; if A is false, then -A is true. |
A & B | “A and B”. Both A and B must be true for the expression to be true |
A | B | “A or B”. At least one of A or B must be true for the expression to be true. |
A -> B | “A implies B” or “if A, then B” |
A <-> B | “A implies B, and B implies A”. This is a short form of “(A -> B) & (B -> A)” |
A <- B | “B implies A”; this is just a sometimes-useful reversed form of “B -> A” |
all X A | “for all values of X, A is true”; X is a variable |
exists X A | “there exists some X where A is true”; X is a variable |
T1 = T1 | “T1 equals T2”; T1 and T2 must be terms |
T1 != T2 | “T1 is not equal to T2” |
p(...) | predicate (a “function” that returns a boolean); each of the 1 or more comma-separated parameters is a term. |
For example, you can state “all men are mortal” in FOL as:
% "For all X, if X is a man, then X is mortal" all X man(X) -> mortal(X).You can express “Socrates is a man” in FOL as:
man(socrates).From these two statements, you can then prove that Socrates is mortal:
mortal(socrates).
There are endless books and pages on FOL; a good library and/or a web search should get you started. “forall x: An introduction to Formal Logic” by P.D. Magnus is an open access introductory textbook in formal logic.
Here are some external pages on how to convert ideas into FOL:
Unfortunately, FOL has some real weaknesses that are easily extended, but many mathematicians don’t seem to realize that they are weaknesses. For example, you can’t define “if-then-else” easily in FOL. FOL has two “types”: formula (essentially booleans) and terms (everything else). You can define a multi-parameter object that returns a boolean (Predicate) or a term (function), but all parameters have to be terms (non-booleans) - and if-then-else inherently takes a boolean! This email notes why it’s important to add if-then-else for practical languages.
Prover9/Mace4 (GPLv2+ license) is a really neat combination of tools. Prover9 can be used to search for a proof, and mace4 can be used to search for a counter-example, which is an interesting combination. Prover9’s language is limited (it’s basically FOL), but if you can express your problem using its language, it’s extraordinarily good at automatically finding proofs or counter-examples. Prover9 includes support for equality and lists, and generally it has a "nice" syntax that's faily close to textbook formats (making it easy to use).
Prover9 is a "theorem prover" that works in a way similar to many other provers. First, it negates the "goal" statement (the idea is that you're trying to prove that there's some contradiction). Then, it tries to generate all possible facts from this set of assumptions (including the negated goal statement), until it finds a contradiction. Obviously, this is not the way that a human would work, but that's okay if you're simply trying to find out if something is true given certain assumptions. This works primarily because computers are remarkably fast, and the developers of prover9 (and other such programs) have honed heuristics that work in many cases.
Mace4 and prover9 include limited support for integer arithmetic. In Mace4 you can turn on arithmetic support with set(arithmetic), and it's fairly capable. Historically, prover9 couldn't handle arithmetic at all, but as of 2008-2009 its "production mode" can do this (use set(production) to turn this on). However, production mode disables many of prover9's other capabilities so this is much more limited (for example, it won't show the rewrites in the output). Many problems don't require calculation, and these limited capabilities may be sufficient for some problems that do. Prover9 has also recently added an if(boolean,if-term,else-term), which is very convenient but again has limitations. If your problem focuses on in-depth arithmetic calculations, prover9 is probably not the tool for you.
Prover9 is easy to invoke; just say “prover9 -f FILE.in” on the command line, where FILE.in gives the assumptions and goal to be proved. The prover9 documentation doesn’t describe its input format very clearly; here’s my attempt at documenting the prover9 formula syntax (in BNF).
There’s a paucity of small, simple examples of prover9 in use. It's hard to be simpler than mortal.in, which is the "Socrates is mortal" proof in prover9 syntax.
Here’s the proof that the square root of 2 is irrational, as prover9 input and prover9 output. (This is a translation of the Otter proof from Larry Wos, by Michael Beeson and William McCune, as published in “The Seventeen Provers of the World” compiled by Freek Wiedijk). Here’s the square root of 2 proof, as a graphic.
Here’s another example, distinct.in, which defines this predicate:
distinct([First, Second : Rest]) -> ( (First != Second) & distinct([First : Rest]) & distinct([Second : Rest])).The “distinct” predicate is useful all by itself, because “distinct” makes it easy to declare that a collection of terms are all different. Basically, “distinct” accepts a list of terms, and then declares that every term is different from all the other terms. For example, you can then state that a, b, c, and d are all different by stating “distinct([a,b,c,d]).”. This is also a useful example of how to do recursive list-handling in prover9.
Note: due to technical issues in prover9, if you include this definition of "distinct" in your assumptions but you never use it, then you may get a misleading message. If prover9 can find a proof but "distinct" is not used, it may output "SEARCH FAILED" followed by "Exiting with 1 proof" (or some other number). In such cases, prover9 did find a proof, but thought it had to keep searching.
Unfortunately, FOL doesn’t have any built-in notion of type, so programs like prover9 that implement FOL don’t have types either. You can implement types in prover9, using FOL; whether or not you should do this depends on your problem. If you add types where they aren’t needed, then the assumptions, goal, and proof may become much more complicated. In my dissertation I decided to discuss types in the informal text, but not use them in the actual proof. A trivial way to implement types is to simply make a predicate that is true if the object is a given type, e.g.:
% human(X) is true if X is human. human(abraham_lincoln). % penguin(X) is true if X is a penguin. penguin(tux).If there are only two types, then it’s easy to declare that if something is of one type, then it’s not of the other type. Here, we declare that any unknown X cannot be both a human and a penguin:
-(human(X) & penguin(X)).File penguin.in shows how to implement simple types in prover9's first-order logic (FOL)
That doesn’t scale well to many types if you want to make claims about many types. If you have 30 different types, and an item is at most one of those types, it’d be a pain to directly declare that the same way. Thankfully, you can then use recursion (just like the “distinct” predicate above) to declare that instances of different types are themselves distinct. The file distinct_types.in shows how to do this.
You can even define that certain types are subclasses of other classes, and so on, but again, you have to build that yourself. The file classes.in is a trivial example.
My dissertation Fully Countering Trusting Trust through Diverse Double-Compiling has some additional examples of using prover9.
I can’t really begin to scratch the surface here. I haven’t gone into a vast number of topics, but I hope this little paper has helped.
[Schumann2001] is not a bad introduction to using automated proof tools.
[Schumann2001] Schumann, Johann M. 2001. “Automated Theorem Proving in Software Engineering”. Berlin: Springer-Verlag. ISBN 3-540-67989-8
You might want to look at my Secure Programming HOWTO web page, Open Source Software and Software Assurance (Security), and High Assurance (for Security or Safety) and Free-Libre / Open Source Software (FLOSS). The open proofs page has information about open proofs.
You can also view my home page.