Complete contracts through models
July 2005
While this column often touches on general issues of software engineering and
the IT industry, I will this time address a more technical question, which many
people ask in connection with Design by Contract. It's the matter of how to
formulate contracts that say not just the truth and only the truth, but all the
truth, specifying all the relevant properties of a class. We may call them
"complete contracts".
It is widely believed that complete contracts are not possible because of the
simplicity of the assertion language in Eiffel (boolean expressions plus the
"old" notation). Here I intend to show otherwise.
Many contracts that you find today in Eiffel programs and in libraries such as
EiffelBase are not complete. More precisely, the preconditions are complete,
especially in libraries, since a precondition is a commitment by a routine that
it will properly handle any call that satisfies the requirements it states;
you'd better then not forget anything you need for the routine to do its job
properly. This is usually not a problem. But postconditions and class invariants
do not always spell out all the properties ensured or maintained by a routine or
class. Sometimes, it's out of laziness; but it may be that the author of the
class finds the property too difficult or impossible to express. You will
occasionally see an assertion replaced by a comment, as in
-- No other element of
the list has the same value
an admission that the property is important but that the class author couldn't
express it formally.
Some people infer from such examples that the assertion language should be
extended to support first-order predicate calculus: "for all" and "there exists"
operations. But this is not an interesting development, because it loses the
simplicity of assertions, the ability to monitor them at run time easily (if we
use contracts as a debugging and testing tool), and does not address the core
issue anyway: for many properties of interest, first-order is not enough. How do
you express
"The graph has no cycles"
in first-order logic?
A simple solution is available for such examples: use functions. Since a
postcondition is a boolean expression, it can include calls to functions, for
example to one that, on a graph, returns true if and only if the graph is
acyclic. These are functions in the programming sense: routines producing a
result. They must of course be of a "pure" nature, not modifying anything in
their environment; the notion of Only postcondition clause, introduced by the
ECMA standard, helps guarantee this. It is not hard for the author of a GRAPH
class to add such a function, making it possible then to write a postcondition
such as
ensure
acyclic
which is now just as formal as any other, although it is no longer purely
mathematical.
Sometimes one may prefer a clearer separation between the elements used in the
implementation and those appearing in the contract. In that case a particularly
useful technique is that of models. Let me explain how it works by taking
a commonly cited example, that of a "push" operation on a class STACK (actually,
STACK [G] with a generic parameter) as in the EiffelBase library. Typically the
routine will read
put (x: G)
-- Add `x' on top.
require
not is_full
do
... Implementation (or deferred at the level
... of class STACK)
ensure
count = old count + 1
-- Clause 1
item = x
--
Clause 2
end
The precondition is complete. But the postcondition is not: its Clause 1 says
that the stack gains an element (`count' is the number of elements) and Clause 2
that the top (as given by `item') is the element x we just pushed. That's the
truth, it's only the truth, but it's not all the truth. It doesn't say that the
other elements
-- those present before, at position 1 to `old count' -- haven't changed.
The general idea of Design by Contract has always been that -- unlike heavy-duty
formal methods -- the method doesn't ask you to specify everything formally to
get any benefits. Partial postconditions such as the ones above are already of
great use, since they capture the essential elements of a specification. All the
usual advantages of contracts are applicable: documentation, testing,
disciplined inheritance, reuse support. And indeed the missing property seems
the easiest to achieve: who in his right mind would implement a "push" operation
by messing around with existing elements?
And yet we can't use this as an excuse if we want to take contracts to their
full consequences and benefits. If we are talking about large-scale reuse, with
components that can be downloaded from unknown sites, or used through Web
Services, arguments of the form "Who in his right mind would...?" can't satisfy
us any more, if only because there's a lot of guys nowadays in their wrong
minds. A devious implementation could advertise a seemingly nice specification
and wreak havoc with the data structure, for the greater enjoyment of hackers.
Even in the absence of evil intent, history shows that even the most unlikely
mixups will occasionally occur.
Can we make the specification complete in such a case? Extending the contract
language might be a possibility, but it's easy to see that it will be very much
"ad hoc" and that every new structure might suggest new assertion constructs
(See the Object Contraint Language of UML for an example). In their book "Design
by Contract, by Example" [2] Richard Mitchell and James McKim propose a
technique based on extra routines from the class -- in the spirit of the
function `acyclic' above -- but for such examples the specifications are
recursive and I have never been able to understand how they can be justified
mathematically.
It seems preferable to use the idea of model classes. A model class is an
Eiffel class describing purely mathematical concepts; it can then serve to
specify the abstract semantics (the contracts) of many other classes.
This technique is easy to illustrate on the STACK example. One of the useful
model classes is SEQUENCE. It describes the notion of finite sequence elements,
from mathematics. It has feature such as
- count: s.count, for a
sequence s, is the number of
items in s.
- extended: s.extended
(x) is another sequence made
of the items of s followed by the item x.
(As a consequence,
s.extended (x).count = s.count + 1.)
- is_empty:
s.is_empty is true if and only if s has
no items (i.e. s.count = 0).
- head: s.head, for
non-empty s, is a sequence made
of all the items of s except the last.
(As a consequence, s.head.count = s.count - 1).
And so on. A critical requirement on such model classes is that they may contain
no commands (other than creation procedures) and no attributes:
they are purely "functional", being equipped with queries only. That's why
SEQUENCE doesn't have a command `extend' that would modify a sequence to extend
it with an item; only a query `extended' that, as specified above, defines a new
sequence, without any idea of modifying the original.
In that respect, model classes are pure mathematical concepts, and their text
indeed directly reflects a mathematical theory as can be extracted from math
textbooks or from specification languages such as Z or B. It is particularly
useful, however, to express them in Eiffel syntax, in line with the general
notion of seamlessness that lies at the core of the Eiffel method; then we can
use these model elements directly in the contracts of normal Eiffel classes,
STACK implementations for example. It is also useful to provide an
implementation, so that the usual run-time assertion monitoring, as used for
debugging and testing, can continue to deliver its benefits -- even though, as
you're probably going to suspect, we also want to go beyond testing and start
thinking on proofs now that we are moving towards complete contracts. But
wait.
Model classes, then, are direct representations of well-defined mathematical
concepts, with rigorous semantics, directly usable in the contracts of ordinary
Eiffel programs.
How do we take advantage of model classes? It suffices to express the
specification of a class such as STACK in terms of a mathematical model
of the underlying software objects. We can do this explicitly by introducing, in
class STACK [G], a new query
model: SEQUENCE [G]
Then we simply express the effect of any operation in terms of its effect on the
model. Here for example is the new version of the `push' operation; the only
change is the addition of a `Clause 0' to the postcondition:
put (x: G)
-- Add `x' on top.
require
not is_full
do
... Implementation (or deferred at the level
... of class STACK)
ensure
model ~ old model.extended (x) -- Clause 0
count = old count + 1 -- Clause 1
item = x
-- Clause 2
end
(~, a new notation of the ECMA Eiffel standard, is object equality, with `a ~ b'
meaning the same as `equal (a, b)'.)
The new clause simply says that, if the stack is modeled by a sequence with the
convention that the last element is the top, then the effect of put (x) on the
model is to replace it by the same sequence extended by x.
----- ----- ----- ----- ----- -----
| | | | | | | | | | | |
| | | | | | | | | | | x |
| | | | | | | | | | | |
----- ----- ----- ----- ----- -----
<------------------------------------->
old model
<------------------------------------------------->
old model.extended (x)
Note that the previous assertion clauses, Clause 1 and Clause 2, still hold --
of course -- but now become consequences of the new Clause 0, thanks to the
properties of SEQUENCE (expressed of course by contracts in SEQUENCE itself --
contracts that come from the mathematical theory and that we take for granted,
never expecting to have to verify them since they have been proved before us by
distinguished mathematicians).
We will call the original clauses "abstract assertions" and the new ones "model
assertions". The distinction facilitates our life in the context of inheritance
-- but that's for another article (see reference [1]).
To illustrate further how we specify our classes with model properties, the
`pop' operation now becomes
remove
-- Remove top element
require
not is_empty
do
... Implementation (or deferred at the level
... of class STACK)
ensure
model ~ old model.head
-- Model assertion
count = old count - 1
-- Abstract
assertion
end
Here too the old abstract assertion becomes a consequence of the model
assertion. It is not hard to see that all the contracts of a class such as STACK
and its descendants can be made complete through this technique.
A model library covering the contract needs of many practical cases can be built
out of a remarkably small number of concepts. Tobias Widmer has written such a
library (see reference[3]) and used it to provide a version of a significant
EiffelBase subset with complete contracts. (His work was a master's thesis at
ETH; we are currently writing it up as an ordinary paper for wider distribution,
but the thesis text already provides a good idea.)
There are many applications of this work, all being pursued. In particular:
- It will be of great interest to have such
a "fully contracted"
version of the entire EiffelBase. This
should be particularly
attractive to EiffelBase users, who will
benefit from having
a practical library, covering critical
daily needs of programming
with a practical and high-performance
implementation,
and fully specified. The goal is not so far
away given
what Tobias has already done.
- The application to testing is exciting.
Design by Contract
leads to a whole new way of performing
tests, guided
by assertions; the more extensive the
assertions, the
higher the likelihood of uncovering a bug
(the testing
equivalent of hitting gold).
- Beyond tests, as already hinted, we
really want proofs.
Here complete contracts are a must, not
just a nice addition.
On this exciting topic of proofs, it's useful to add that... But wait. I have to
keep some material for later columns.
--Bertrand Meyer
References
----------
[1] Bertrand Meyer: A Framework for Proving Contract-Equipped Classes, in
Abstract State Machines 2003, Advances in Theory and Practice, 10th
International Workshop, Taormina (Italy), March 3-7, 2003, eds. Egon Börger,
Angelo Gargantini, Elvinia Riccobene, Springer-Verlag, 2003, pages 108-125.
Available at
http://se.ethz.ch/~meyer/publications/proofs/framework.pdf.
[2] Richard Mitchell, James McKim: Design by Contract, By Example,
Addison-Wesley, 2001.
[3] Tobias Widmer: Reusable Mathematical Models. Master's thesis, ETH Zurich,
available at
http://se.inf.ethz.ch/projects/tobias_widmer/.
|