Beyond Linear Arithmetic: Automatic Proof Procedures for the Reals Prof. Lawrence C. Paulson Computer Laboratory, University of Cambridge 1 Previous Research and Track Record Lawrence C. Paulson is Professor of Computational Logic at the University of Cambridge, where he has held established posts since 1983. Paulson’s early work on LCF contributed much (both code and concepts) to HOL. Paulson introduced Isabelle in 1986 and has been building upon it ever since. Isabelle is a generic proof tool that supports higher-order logic (HOL), Zermelo-Fraenkel set theory (ZF) and other formalisms. Isabelle is widely used in research. Many developments are due to Prof. Tobias Nipkow’s group at the Technical University of Munich. Automatic proof search, one of Isabelle’s particular strengths, is however due to Paulson [10, 11]. The present proposal involves developing automatic proof procedures for the theory of the reals, including transcendental functions. It builds upon Paulson’s existing experience of developing heuristic proof procedures and building proof tools, as well as his extensive knowledge of the automated reasoning literature. The work will be done within the Cambridge Automated Reasoning Group. Hardware verification was pioneered here by Prof. M. J. C. Gordon and his students. They introduced what have become standard techniques, such as the use of higher-order logic to model hardware and software systems. The group’s work continues to attract worldwide attention. Former members such as Dr. John Harrison have taken formal verification to Intel and other companies. The group has built two of the most important proof environments used today, namely HOL and Isabelle. Institutes using Isabelle as a basis for their research include the University of Edinburgh, Carnegie-Mellon University and Australia’s Defence Science and Technology Organisation (DSTO). Several past projects at Cambridge involve Isabelle. Recent projects include the following: • Authentication Logics: New Theory and Implementations (EPSRC ref. GR/K77051), 1996–99. This project was concerned with proving the correctness of security protocols. It produced the inductive method for protocol verification, whereby detailed protocol models are proved to be correct or shown to be flawed. The results of this project have been widely cited and have been incorporated into many of today’s protocol verification tools. 1 • Verifying Electronic Commerce Protocols (EPSRC ref. GR/R 01156/01), 2000– 03. This continued the previous project, with the objective of verifying security protocols of industrial complexity. The huge SET protocol suite was analysed and some vulnerabilities found. The research assistant, Giampaolo Bella, investigated the Zhou-Gollmann non-repudiation protocol and a certified electronic mail protocol designed by Abadi et al. This project produced numerous publications. • Compositional Proofs of Concurrent Programs (EPSRC ref. GR/M75440), 2000–03. This project investigated the verification of reactive systems using UNITY and the guarantees-calculus of Sanders and Chandy. It made much progress on the problem of formalizing states; toward this objective, it produced an untyped UNITY environment, formalized within Isabelle/ZF. A number of journal and conference papers have appeared. • Automation for Interactive Proof (EPSRC ref. GR/S57198/01), 2004–07. This project aims to give interactive proof tools improved automation through an effective combination of interactive and automatic tools. It is developing techniques for transferring subgoals from an interactive prover to an automatic one and for transferring proofs in the opposite direction. The returned proofs are reconstructed (and thus verified) within the interactive prover. This project has been underway for less than a year, but a prototype incorporating these elements is already completed, linking Isabelle to the resolution provers Vampire and SPASS. The designated Research Assistant, Jia Meng, is currently working on Automation for Interactive Proof as a project student. She therefore has experience with Isabelle and with resolution provers. She will have finished her PhD by the time this project starts. She is capable, hard-working and enthusiastic, and already has two research publications. 2 2 Description of Proposed Research A primary goal of formal verification research is improved automation. Many verification tasks involve proving equations and inequations over the reals. For example, Denney et al. [4] describe an application of automated theorem proving to aerospace software; they intend that all proof obligations should be eliminated by simplification or other means, leaving none to be presented to the user. An example involving interactive proof is the formalization in Isabelle/HOL of the Prime Number Theorem [1], where the authors spent much time proving simple facts involving transcendental functions. Linear arithmetic is well understood, and many proof tools provide efficient decision procedures, at least for the quantifier-free case. Such procedures can automatically prove propositional combinations of inequalities between linear combinations of variables. Unfortunately, linear arithmetic restricts us to the language of =, <, ≤, + and multiplication by integer constants. How can we extend this language? The alternatives are to employ more general decision procedures or to seek heuristic methods. More general decision procedures do exist, because the theory of real closed fields is decidable: thus, our language can be extended with multiplication and quantifiers. Dolzmann et al. [5] survey the main algorithms, which work by quantifier elimination. Heuristic methods tend to be extensions of elementary variable elimination methods. In the 1980s, Boyer and Moore [3] showed how such a procedure could share information with other parts of a theorem prover. Hunt et al. [8] describe a more elaborate procedure that can handle some problems involving polynomials (and thus, products of variables). Tiwari [14] describes a different heuristic procedure for polynomials, using ideas from Gröbner basis methods. Decision procedures interest mathematicians, since they represent solutions to decision problems, while heuristic methods may seem to be ad-hoc. For verification purposes, heuristic methods are typically broader in scope and more efficient than decision procedures. Tiwari [14] shows that for many problems, his heuristic procedure easily outperforms QEPCAD [7], a decision procedure that performs partial cylindrical algebraic decomposition (and whose complexity is doubly exponential). Richardson’s theorem tells us not to expect a useful decision procedure for any theory that includes the transcendental functions [16]. Weispfenning [15] has published a decision procedure for linear arithmetic augmented with a single, well-behaved transcendental function; although the work may contribute useful insights, this type of highly specialized procedure does not address our objectives. Heuristic methods are therefore appropriate. This project will focus on heuristic procedures to prove arithmetic statements that may refer to transcendental functions and polynomials. The statements must be quantifier-free, with free variables being implicitly universally quantified. The project will include two other components: • A disproof procedure will identify invalid arithmetic statements so that 3 further proof attempts can be abandoned. A successful disproof procedure for functional programs already exists for Isabelle [2]. • A problem library for real arithmetic will assist the research community by allowing comparisons between different implementations and procedures. These problems will be published via TPTP (Thousands of Problems for Theorem Provers) [13], if possible, and otherwise on a dedicated website.1 The Project Objectives are pragmatic, and focus equally on interactive and automatic verification. 1. To develop heuristic automatic proof procedures for the theory of real arithmetic with transcendental functions. 2. To develop disproof procedures for this theory. 3. To publish a problem library suitable for evaluating such procedures. The aim is to develop procedures that are sound and effective against real-world examples. Background Most heuristic proof procedures for arithmetic [3, 8, 14] extend the FourierMotzkin decision procedure. They use proof by contradiction: the conjecture is negated and converted to disjunctive normal form (DNF). The original conjecture is true if and only if all disjuncts are unsatisfiable. Each disjunct is a conjunction of inequalities, so the proof procedures work by refuting such conjunctions. Disproving the conjecture, conversely, requires showing that one of the conjunctions is satisfiable. The Fourier-Motzkin procedure works by repeated variable elimination. For example, given 2x < 3y − 1 ∧ 2y − x < 0 ∧ y ≥ −1, we obtain 4y < 2x from the second conjunct and thus 4y < 3y − 1 from the first conjunct by transitivity. Thus y < −1, and from the third conjunct we obtain −1 < −1, contradiction. Hunt et al. [8] and Tiwari [14] both extend this procedure to polynomials by including mechanisms to eliminate nonlinear leading terms, while the earlier procedure of Boyer and Moore [3] is able to eliminate certain function applications by using stored facts about those functions. Looking further afield, Ratschan is undertaking a programme of research [12] that may be relevant to both proof and disproof procedures, though set in the context of constraint programming. P ROOF P ROCEDURE O UTLINE The proposed extension of Fourier-Motzkin starts with the observation that most of the well-known functions on the reals are continuous. Moreover, a typical function’s domain can be segmented such that the function is monotonic or antip monotonic on each interval. Functions such as x, e x , log(x), and K · x (product 1 Geoff Sutcliffe has already indicated his willingness to include such problems in TPTP. 4 by a variable known to be positive) are monotonic over their entire domain, while other functions are monotonic on intervals: sin(x) is monotonic on [−π/2, π/2]. If f is a monotonic and continuous segment of a function, then f −1 will also be monotonic and continuous with a suitable domain and range. The existence of inverses will assist in eliminating occurrences of f . The heuristic procedure has several ways to eliminate occurrences of functions. It can appeal to monotonicity, for example reducing f (A) < f (B) to A < B, or reducing f (A) < B to A < f −1 (B); such steps require checking, through a recursive call to the procedure, that A and B lie within the same monotonic segment of f . If there is no other way to proceed, function occurrences can be eliminated by appealing to linear upper or lower bounds. For typical functions, these can be obtained from Taylor expansions around a suitable point. Examples of such bounds include e x ≥ x + 1; log(x) ≤ x − 1 for x ≥ 1; x 2 ≥ 0; x 2 ≥ x for x ≥ 1; sin(x) ≤ x for x ≥ 0; 1/x < 2 − x if |x − 1| < 1. The procedure would probably not compute bounds during its execution, but would be equipped with suitable information prior to execution. Users would be able to augment the procedure with new information about functions used in their application. For example, the proof of u ≤ v ∧ x ≤ y → e u + e x ≤ e v + e y would involve refuting the set of inequalities u ≤ v, x ≤ y, e v + e y < e u + e x . To isolate y, the use of monotonicity and inverses would yield y < log(e u + e x − e v ), concluding x < log(e u + e x − e v ) by transitivity. To isolate x, the pattern f −1 (· · · f (x) · · ·) suggests applying f to both sides, here yielding e x < e u + e x − e v and therefore e v < e u . By monotonicity we obtain v < u, which contradicts u ≤ v and we are done. Clearly a practical procedure will require many further elements. Few functions are as well-behaved as e x . We need to take into account the division of a function into monotonic segments, on intervals that may be open, closed or unbounded at either end. Moreover, many examples seem to require additional knowledge such as e 0 = 1 and sin(0) = 0. D ISPROOF P ROCEDURE O UTLINE The disproof procedure will accept the same input language as the proof procedure, and it will also begin by transforming the negated conjecture into DNF. The obvious approach to disproving a conjecture is simply to generate random variable assignments. This can be done separately on each conjunctive part of the DNF: satisfying one of them yields a counterexample to the conjecture. Floating-point arithmetic, however, could give incorrect results due to rounding errors. False positives (incorrect “counterexamples”) are unacceptable: they could lead to the abandonment of a valid goal. False negatives (overlooked counterexamples) are less serious, but they are still failures. If the conjecture is an equality, say f (x) = g (x), then a floating-point calculation from a random value x could erroneously “disprove” the equality: a false positive. We should never accept a claimed counterexample unless all conjuncts evaluate to true using an exact real arithmetic package such as iRRAM [9]. The counterexample search does not have to be entirely random. We can 5 separate the set of conjuncts into a linear part and a non-linear part. The FourierMotzkin procedure can be used to generate variable assignments satisfying the linear part. These can then be tested against the non-linear part. Programme and Methodology Task 1: Gathering of test data. Several sources of sample problems are available, including Isabelle’s formal development of the reals and Avigad’s formalization of the Prime Number Theorem. Avigad has offered to contribute such problems; see the attached letter of support. Other examples could be obtained from published PVS theories and by making enquiries over various mailing lists. We shall organize them, put them into a uniform format and publish them, probably via the TPTP. Estimated time: 2 person-months. Task 2: Preliminary development of heuristic proof procedures. The outline has been given above, but many details need to be fleshed out, driven by experimentation with our corpus of test data. Estimated time: 12 person-months. Task 3: Evaluation of exact real arithmetic packages. There exist many packages for exact real arithmetic. Criteria for preferring one to another include reliability, support for transcendental functions, efficiency and likelihood of continued development. We may decide to develop a communication mechanism so that the disproof procedure can be developed independently from any particular arithmetic package. Estimated time: 2 person-months. Task 4: Preliminary development of disproof procedures. This task may appear to be straightforward, but many questions require investigation. For example, should we use floating-point arithmetic in the initial search for counterexamples, and only use exact real arithmetic to confirm a claimed counterexample? Such an approach avoids false positives, and the use of floating-point arithmetic makes it more efficient, but it leaves open the risk of false negatives. Estimated time: 9 person-months. Task 5: Integration with interactive proof tools. One objective of this project is to support interactive proofs in arithmetic. We intend to work with Prof. Tobias Nipkow (of the Technical University of Munich) to extend his arithmetic proof procedure to use the methods developed in the project. The main difficulty concerns reconstructing the proofs in the host prover’s formal system, but the heuristic proof methods will be designed to make proof reconstruction as simple as possible. We shall also co-operate with interested developers of other proof tools; in particular, many developers of the HOL system are based at Cambridge. Integration of the disproof procedures with other proof tools is simpler, because there are no proofs to be reconstructed. Estimated time: 9 person-months. 6 Task 6: Testing and refinement. The proof and disproof procedures will undergo continual testing with new problems, and will be refined accordingly. We also intend to evaluate them in their role of supporting interactive proof. Estimated time: 9 person-months. The total effort of 43 person-months allows 36 for the research assistant and 7 for the principal investigator (who will dedicate 20% of his time to the project). Relevance to Beneficiaries The project is designed to benefit a broad range of parties, both academic and industrial. • Developers of proof tools, both interactive and automatic, will be able to provide increased automation to their users. • The many research groups using arithmetic reasoning (for example, to verify hybrid systems) will benefit from increased automation. • Formal arithmetic reasoning will increasingly benefit mathematicians. Already the Flyspeck Project [6] exists to produce a machine-assisted formal proof of the Kepler Conjecture, because the existing proof is too complicated for human referees. Dissemination and Exploitation The source code arising from this project will be freely distributed via the Internet. The techniques and the experimental results will be presented in journal and conference papers. Justification of Resources Staff. Paulson will work part-time on the project, specifically to support tasks 2, 4 and 5. He proposes to supervise one full-time research assistant, Jia Meng. She will undertake all of the project tasks, with day-to-day guidance from Prof. Paulson. We are requesting the top scale point because Meng is already receiving job offers from overseas: we need to be able to offer a competitive salary. The project also requires a Computer Officer (10% time) to maintain the hardware and software used in the research. Travel and Subsistence. Conference attendance is essential to keep abreast of developments and to disseminate results. We are requesting funds to attend some of the main conferences, such as CADE, CAV, IJCAR and TPHOLs. We may wish to attend relevant workshops at Schloß Dagstuhl and elsewhere. We are also requesting funds for visits to other institutions such as TUM, as detailed on the application form. Consumables. The figure shown comprises UNIX-based computers for Paulson and Meng, one laptop (to be shared), and various other computing-related costs. 7 References [1] J. Avigad. Notes on a formalization of the prime number theorem. On the Internet at http://www.andrew.cmu.edu/user/avigad/isabelle/pntnotes_a4.pdf, Sept. 2004. [2] S. Berghofer and T. Nipkow. Random testing in Isabelle/HOL. In Software Engineering and Formal Methods (SEFM’04), pages 230–239. IEEE Computer Society, 2004. [3] R. S. Boyer and J. S. Moore. Integrating decision procedures into heuristic theorem provers: A case study of linear arithmetic. In J. E. Hayes, D. Michie, and J. Richards, editors, Machine Intelligence 11, pages 83–124. Oxford University Press, 1988. [4] E. Denney, B. Fischer, and J. Schumann. Using automated theorem provers to certify auto-generated aerospace software. In D. Basin and M. Rusinowitch, editors, Automated Reasoning — Second International Joint Conference, IJCAR 2004, LNAI 3097, pages 198–212. Springer, 2004. [5] A. Dolzmann, T. Sturm, and V. Weispfenning. Real quantifier elimination in practice. Technical Report MIP-9720, Universität Passau, D-94030, Germany, 1997. [6] T. C. Hales. The flyspeck project fact sheet. On the Internet at http://www.math.pitt.edu/~thales/flyspeck/. [7] H. Hong. QEPCAD — quantifier elimination by partial cylindrical algebraic decomposition. On the Internet at http://www.cs.usna.edu/~qepcad/B/QEPCAD.html. Web site includes sources and documentation. [8] W. A. Hunt, Jr., R. B. Krug, and J. Moore. Linear and nonlinear arithmetic in ACL2. In D. Geist and E. Tronci, editors, Correct Hardware Design and Verification Methods (CHARME), LNCS 2860, pages 319–333, 2003. [9] N. Mueller et al. iRRAM — exact arithmetic in C++. Sources and documentation available at http://www.informatik.uni-trier.de/iRRAM/, 2004. [10] L. C. Paulson. Generic automatic proof tools. In R. Veroff, editor, Automated Reasoning and its Applications: Essays in Honor of Larry Wos, chapter 3. MIT Press, 1997. [11] L. C. Paulson. A generic tableau prover and its integration with Isabelle. Journal of Universal Computer Science, 5(3):73–87, 1999. [12] S. Ratschan. Convergent approximate solving of first-order constraints by approximate quantifiers. ACM Trans. Comput. Logic, 5(2):264–281, 2004. [13] G. Sutcliffe and C. Suttner. The TPTP problem library for automated theorem proving. On the Internet at http://www.cs.miami.edu/~tptp/, 2004. [14] A. Tiwari. Abstraction based theorem proving: An example from the theory of reals. In C. Tinelli and S. Ranise, editors, PDPAR: Workshop on Pragmatics of Decision Procedures in Automated Deduction, pages 40–52. INRIA, Nancy, 2003. [15] V. Weispfenning. Deciding linear-transcendental problems. Technical Report MIP-0005, Universität Passau, D-94030 Passau, Germany, 2000. [16] E. W. Weisstein. Richardson’s theorem. From MathWorld—A Wolfram Web Resource, 1999. On the Internet at http://mathworld.wolfram.com/RichardsonsTheorem.html. 8 Project Plan The chart below indicates the dependencies among the tasks. Gathering of test data will continue throughout the project. 1: Gathering of test data 3: Evaluation of arithmetic packages 2: Development of proof procedures 4: Development of disproof procedures 5: Integration with interactive tools 6: Testing and Refinement 9