Contract Programming
Contracts are a breakthrough technique to reduce the programming effort for large projects. Contracts are the concept of preconditions, postconditions, errors, and invariants.
Building contract support into the language makes for:
- a consistent look and feel for the contracts
- tool support
- it's possible the compiler can generate better code using information gathered from the contracts
- easier management and enforcement of contracts
- handling of contract inheritance
The idea of a contract is simple - it's just an expression that must evaluate to true. If it does not, the contract is broken, and by definition, the program has a bug in it. Contracts form part of the specification for a program, moving it from the documentation to the code itself. And as every programmer knows, documentation tends to be incomplete, out of date, wrong, or non-existent. Moving the contracts into the code makes them verifiable against the program.
Assert Contract
The most basic contract is the AssertExpression. An assert declares an expression that must evaluate to true, with an optional failure string as a second argument:
assert(expression); assert(expression, "message to display on failure");
As a contract, an assert represents a guarantee that the code must uphold. Any failure of this expression represents a logic error in the code that must be fixed in the source code. A program for which the assert contract is false is, by definition, invalid.
As a debugging aid, the compiler may insert a runtime check to verify that the expression is indeed true. If it is false, an AssertError is thrown. When compiling for release, this check is not generated. The special assert(0) expression, however, is generated even in release mode. See the AssertExpression documentation for more information.
The compiler is free to assume the assert expression is true and optimize subsequent code accordingly.
Pre and Post Contracts
The pre contracts specify the preconditions before a statement is executed. The most typical use of this would be in validating the parameters to a function. The post contracts validate the result of the statement. The most typical use of this would be in validating the return value of a function and of any side effects it has. In D, pre contracts begin with in, and post contracts begin with out. They come at the end of the function signature and before the opening brace of the function body.
Pre and post contracts can be written either in expression form (feature introduced in DMD 2.081.0), with a syntax similar to assert, or as block statements containing arbitrary code.
The expression form is:
in(expression) in(expression, "failure string") out(identifier; expression) out(identifier; expression, "failure string") out(; expression) out(; expression, "failure string") { ...function body... }
The block statement form is:
in { ...contract preconditions... } out { ...contract postconditions... } out (identifier) { ...contract postconditions... } do { ...function body... }
The optional identifier in either type of out contract is set to the return value of the function.
By definition, if a pre contract fails, then the function received bad parameters. If a post contract fails, then there is a bug in the function. In either case, an assert statement within the corresponding in or out block will throw an AssertError.
The keyword do can be used to announce the function body. Although any number of pre or post contracts of any form may follow each other, do is required only when the last contract before the body is a block statement. (Before the acceptance of DIP1003, the keyword body was required instead of do, and may still be encountered in old code bases. In the long term, body may be deprecated, but for now it's allowed both as a keyword in this context and as an identifier elsewhere, although do is preferred.)
Though any arbitrary D code is allowed in the in and out contract blocks, their only function should be to verify incoming and outgoing data. It is important to ensure that the code has no side effects, and that the release version of the code will not depend on any effects of the code. For a release build of the code, in and out contracts are not inserted.
Here is an example function in both forms:
int fun(ref int a, int b) in(a > 0) in(b >= 0, "b cannot be negative!") out(r; r > 0, "return must be positive") out(; a != 0) { // function body } int fun(ref int a, int b) in { assert(a > 0); assert(b >= 0, "b cannot be negative!"); } out(r) { assert(r > 0, "return must be positive"); assert(a != 0); } do { // function body }
The two functions are almost identical semantically. The expressions in the first are lowered to contract blocks that look almost exactly like the second, except that a separate block is created for each expression in the first, thus avoiding shadowing variable names.
In, Out and Inheritance
If a function in a derived class overrides a function from its super class, then only one of the in contracts of the function and its base functions must be satisfied. Overriding functions then becomes a process of loosening the in contracts.
A function without an in contract means that any values of the function parameters are allowed. This implies that if any function in an inheritance hierarchy has no in contract, then any in contracts on functions overriding it have no useful effect.
Conversely, all of the out contracts need to be satisfied, so overriding functions becomes a processes of tightening the out contracts.
Invariants
Invariants are used to specify characteristics of a class or struct that must always be true (except while executing a member function). For example, a class representing a date might have an invariant that the day must be 1..31 and the hour must be 0..23:
class Date { int day; int hour; this(int d, int h) { day = d; hour = h; } invariant { assert(1 <= day && day <= 31); assert(0 <= hour && hour < 24, "hour out of bounds"); } }
Invariant blocks should contain assert expressions, and should throw AssertErrors when they fail. Since DMD version 2.081.0, invariants can also be written as expression statements, with assert implied:
invariant(1 <= day && day <= 31); invariant(0 <= hour && hour < 24, "hour out of bounds");
The invariant is a contract saying that the asserts must hold true. The invariant is checked when a class or struct constructor completes, and at the start of the class or struct destructor. For public or exported functions, the order of execution is:
- preconditions
- invariant
- function body
- invariant
- postconditions
The invariant is not checked if the class or struct is implicitly constructed using the default .init value.
The code in the invariant may not call any public non-static members of the class or struct, either directly or indirectly. Doing so will result in a stack overflow, as the invariant will wind up being called in an infinitely recursive manner.
Invariants are implicitly const.
Since the invariant is called at the start of public or exported members, such members should not be called from constructors.
class Foo { public void f() { } private void g() { } invariant { f(); // error, cannot call public member function from invariant g(); // ok, g() is not public } }
The invariant can be checked with an assert() expression:
- classes need to pass a class object
- structs need to pass the address of an instance
auto mydate = new Date(); //class auto s = S(); //struct ... assert(mydate); // check that class Date invariant holds assert(&s); // check that struct S invariant holds
Class invariants are inherited, that is, any class invariant is implicitly in addition to the invariants of its base classes.
There can be more than one invariant declared per class or struct.
When compiling for release, the invariant code is not generated, and the compiled program runs at maximum speed. The compiler is free to assume the invariant holds true, regardless of whether code is generated for it or not, and may optimize code accordingly.