- "Our knowledge of the truth of a theorem depends on the correctness of its proof and on the correctness of all of the theorems used in its proof. It is a shaky foundation" because "many great and important theorems don’t actually have proofs. They have sketches of proofs, outlines of arguments, hints and intuitions that were obvious to the author (at least, at the time of writing) and that, hopefully, are understood and believed by some part of the mathematical community."
- Recognizing mathematical truth is harder than one might think at first. "If a theorem

has a short complete proof, we can check it. But if the proof is deep, difficult, and already fills 100 journal pages, if no one has the time and energy to fill in the details, if a “complete” proof would be 100,000 pages long, then we rely on the judgments of the bosses in the field. In mathematics, a theorem is true, or it’s not a theorem. But even in mathematics, truth can be political."

Yes, proofs of several published theorems contain mistakes. However, more often than not, the mistakes can be fixed and the results turn out to live to see the day after all.

Of course, I expect that mathematicians will want to improve upon the social notion of proof that underlies their profession. Quoting again from the opinion piece by Nathanson:

How can the reliability of mathematical proofs be improved? When should a proof be deemed to be "complete"? As a computer scientist, I'd say that a proof is really "complete" when it can be verified by a computer-based proof checker and it has been independently verified using a few such software tools. I know that this is a very stringent requirement, which will most likely never be implemented, but either we are ready to accept that mathematical truth is an unreliable but remarkably robust notion or we enlist the help of our computers to try and make it more reliable.

We mathematicians like to talk about the “reliability” of our literature, but it is, in fact, unreliable.

Part of the problem is refereeing. Many (I think most) papers in most refereed journals are not refereed. There is a presumptive referee who looks at the paper, reads the introduction and the statements of the results, glances at the proofs, and, if everything seems okay, recommends publication. Some referees do check proofs line-by-line, but many do not. When I read a journal article, I often find mistakes. Whether I can fix them is irrelevant. The literature is unreliable.

Do any of you think that proof-checking of research-level mathematical proofs will become commonplace in our lifetime?

Addendum dated 5 August 2008: I should have known that Doron Zeilberger would have commented on Nathanson's opinion piece. Of course, Dr. Z's opinion is thought provoking as usual. You can read it here. In summary, Dr. Z suggests two ways for improving the reliability of mathematical knowledge.

- In his words, "First computerize! Computers are much more reliable than humans, and as more and more mathematics is becoming amenable to computer checking, this is the way to go."
- Abandon anonymous refereeing.

Regarding the second suggestion, I am not so sure that abandoning anonymity in refereeing would have such a great effect on the reliability of mathematical literature. The bottom line is that one should exercise the greatest amount of professionalism in all aspects of the academic profession, regardless of whether one does something anonymously or not. Excellent referees are a great resource to have, and editors and PC members soon build a trusted core of referees that can be relied upon to provide high-quality feedback to authors and editors alike. Referees who do not do a good job tend to be ignored after a while, as do editors who act as black holes for the papers that are submitted to them.

## 7 comments:

An oldie, but a classic:

http://doi.acm.org/10.1145/512950.512970

"Social processes and proofs of theorems and programs" (DeMillo/Lipton/Perlis)

Well, I believe it should happen eventually. I also think that technically it's not that much of a problem.

In principle, proofs (unlike programs) are *made* to be verifiable. So, if proof is written in some formal language, it can be verified in straightforward manner. Moreover, such languages exist and are used in *practice*. I know at least one - TLA+, Temporary Logic of Actions (http://research.microsoft.com/users/lamport/tla/tla.html),

which is rather popular in specification, verification and formal reasoning about discrete dynamic systems.

So, most of required components are out there. What we really need to make it work is 1) unified approach, ideally it would be one standardized "proof language", 2) fine selection of software tools, available for everyone, 3) support from the society.

I can vividly imagine how it's going to be. We use "includes" instead of bibliographical references. Archives and journals do not accept papers, which are not written in formal languages. Paper reviewing becomes an order of magnitude simpler - all proofs are verified automatically on the server during submission. Papers are automatically generated from comments in proof source code. Arguments in natural language tend to become more sloppy - no reason to formalize here, check the source code if you like. Automatic generation of proofs is gaining popularity.

thanks for the post.

this is why i think alternative proofs are very very important (and should be published in journals more often).

I think the original author (Nathanson) makes too strong a distinction between theorems that are true, and those that are false. The history of mathematics is full with theorems that are false if one takes their claims literally. Similarly, the history of mathematics is replete with proofs that have serious gaps. Nathanson's example of the first proposition in Book I of Euclid's Elements ("On a given finite straight line to construct an equilateral triangle") is instructive here. The problem was overcome in some sense only about 2000 year after Euclid, by Hilbert's new axiomatisation of geometry. What Hilbert did was essentially to add new axioms, i.e. additional assumptions. This is very common in mathematics: a proof is faulty, even though it has been vetted by mathematical experts, but by adding a few simple and additional assumptions that do not stand in the way of applications of the theorem, that theorem can still be proven (other examples (1) the development that lead to axiomatic set theory that is the current most widely accepted foundation of mathematics, in particular the invention of the axiom of choice, (2) the development of calculus, with its move from continuity/convergence to uniform continuity/convergence). What does this show?

* It shows that the established process of refereeing is sufficient to weed out egregious errors, but not to prevent all errors.

* Moreover, the remaining errors tend to be fixable. The remaining errors will be weeded out more and more, the more a mathematical construction is used in other contexts.

Hence mathematical trust (like all trust) increases with time. This mirrors the experience in other scientific fields. In physics, an experiment, a theory establishes itself by being repeated by researchers other than the originators, and by making predictions that are different from those which it was intended to make.

What is the role of interactive theorem provers here?

* They complement the refereeing process. They are very good at things human referees are bad at, namely at going through large, boring and repetitive parts of proofs that are increasingly cropping up in proofs (e.g. Four Colour Theorem, Hales's proof of the Kepler conjecture etc). They are bad at creative invention which is something humans see themselves as being good at.

* Moreover, and crucially, their work is repeatable. Everybody can check at home if Gonthier's formalisation proves the Four Colour Theorem, we no longer have to rely on Gonthier's claim that he checked everything.

* Finally, and in the long run probably most importantly, they can be reused. Humanity is building up a library of formalised proofs which can be used in novel context to formalise more proofs.

All of this adds a novel dimension of certainty to mathematical proofs. I envision that in 50 years from now, a result will not be accepted unless it has a formal proof.

We are not there yet, proof assistants are still to difficult to use. In essence you need to spend a few years learning how to use these things before you can be very efficient, but I expect that problem to be overcome. This will be a great benefit to science.

An interesting question arises from the rise of zero-knowledge proofs. What would happen if scientist publish their results only as zero-knowledge proofs, so as to keep a competitive advantage over their colleagues? Scientific communication as we know it may come to an end to everybody's detriment.

As a PS to my post above, process algebra has plenty examples of proofs being improved over time too.

* Take for example recursive definitions of processes. If I remember correctly, Milner used process equations A = P. Later he moved to equations A(x1, ..., xn) = P with the free names of P being a subset of {x1, ..., xn}. The original formulation can cause inconsistency in some boundary cases, but in all the cases that humans are interested in, the problem doesn't matter and the latter, more stringent definition is expressive enough. (The thorny issue of alpha-conversion can be mined for more examples of this -- interestingly, the full problem of alpha-conversion has not been solved fully).

* Milner used the notion of weak bisimulation up to weak bisimulation to prove some theorems. It was later shown inconsistent. But the problem was fixable.

The December 2008 issue of the Notices of the AMS has some good articles on the subject.

http://www.ams.org/notices/200811

D. Zeilberger thinks "The Human Obsession With "Formal Proofs" is a Waste of the Computer's Time, and, Even More Regretfully, of Humans' Time".

http://www.math.rutgers.edu/~zeilberg/Opinion94.html

Post a Comment