Business plus pleasure

Sept/October 2005

Our topic for this month is a set of distinct but closely connected innovations in the recently adopted Eiffel standard, already implemented in release 5.6 of EiffelStudio. They illustrate how carefully designed language mechanisms can, in the Eiffel spirit, reconcile strict methodological principles with user convenience, insistence on consistency and generality, and an ever-present concern to keep the language simple.

References to "the Standard" or "ECMA Eiffel" can be found at http://tinyurl.com/cq8gw
(full URL: http://www.ecma-international.org/publications/standards/Ecma-367.htm), which links to the PDF file.

UNIFORM ACCESS
-----------------------

We start from one of the basic rules of object-oriented programming: that the basic dynamic structure is the object, represented statically by its generating class, and that all accesses to the object should be through one of the features of that class. If you have an object of type STOCK and you want to know its current price, you can use

[ATTRIB_ACCESS]		my_stock.price


provided the class has a feature `price', here a query. It doesn't matter whether that query is an attribute (representing an object field) or a function (representing a computation); this is an internal representation decision, irrelevant to clients accessing objects through calls such as [ATTRIB_ACCESS]. This "Principle of Uniform Access" -- it doesn't matter to clients whether a query is implemented as an attribute or a function -- is applied throughout the rules of Eiffel; the Standard makes it even more systematic by allowing, for example, associating a precondition and a postcondition to an attribute. (This last point is not yet in 5.6.) Most other languages, as well as UML, don't care about Uniform Access; they fundamentally distinguish between attributes (fields) and functions (methods). The result is rather tricky: although you can export an attribute, making [ATTRIB_ACCESS] valid if `price' is exported, this is frowned upon and textbooks suggest not taking advantage of the possibility. This has two bad consequences. First, it is never a good idea to put a mechanism in a language and tell people not to use it. Prevention is better than cure (and what kind of language design is it that scolds its users for using perfectly legal facilities?). Second, people who follow the advice will have to rely on special functions whose only purpose is to return the value of the corresponding attributes. Instead of [ATTRIB_ACCESS], clients will write

[FUNCTION_CALL]		my_stock.get_price


which requires adding to STOCK a feature

	get_price: PRICE
			-- Return the value of `price'
		do
			Result := price
		end


or the equivalent in another language. Such functions are just noise, making the code longer for no good reason, especially in languages where you can't equip them with contracts.
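
As a minimal sketch of what Uniform Access buys the client, consider the following variant of STOCK, in which `price' happens to be a function. The features `total_value' and `share_count', and the use of INTEGER rather than PRICE, are assumptions made only to keep the sketch self-contained.

	class
		STOCK

	create
		make

	feature

		make (a_total_value, a_share_count: INTEGER)
				-- Initialize with the given total value and share count.
			require
				positive_share_count: a_share_count > 0
			do
				total_value := a_total_value
				share_count := a_share_count
			end

		total_value: INTEGER
				-- Hypothetical stored value of the whole holding.

		share_count: INTEGER
				-- Hypothetical number of shares.

		price: INTEGER
				-- Price per share, here computed on demand.
			do
				Result := total_value // share_count
			end

	invariant
		positive_share_count: share_count > 0

	end

A client writes `my_stock.price' whether `price' is this function or a plain attribute `price: INTEGER' set by `make'; switching from one representation to the other requires no change in client text.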

HOW TO EXPORT AN ATTRIBUTE
-----------------------

The real reason why programmers are advised against exporting attributes is that in these languages this means exporting them unrestrictedly: for writes as well as reads. With an exported attribute, clients may use not only [ATTRIB_ACCESS] but also

[ATTRIB_ASSIGN]		my_stock.price := 25


which violates a fundamental methodology rule, information hiding. The object-oriented way to achieve the effect of [ATTRIB_ASSIGN] is through a procedure call

[PROC_CALL]		my_stock.set_price (25)


with a command (procedure) in class STOCK that reads

	set_price (p: PRICE)
			-- Set the value of `price' to `p'.
		do
			price := p
		end


In my view the confusion, in the languages that permit [ATTRIB_ASSIGN], is to consider that exporting means exporting unrestrictedly. Because such exports then permit schemes like [ATTRIB_ASSIGN] that are clearly wrong methodologically, the natural reaction is to decree methodological rules that prohibit attribute exports altogether, including such harmless cases as [ATTRIB_ACCESS].

But there's nothing wrong with exporting an attribute, especially if you export it read-only. What we should not export is the information that it is an attribute rather than a function. It should be exported as a query (the common category for attributes and functions). Then clients know they can use [ATTRIB_ACCESS], which really should be called [QUERY_ACCESS]; and they can never use [ATTRIB_ASSIGN], if only because `price' could now be a function, and you cannot assign to a function.

	A note on the stock price example: it can seem strange that
	get_price returns a PRICE and that set_price accepts 25 as
	an argument; but in Eiffel this simply assumes that class PRICE
	includes a `convert' clause to convert from integers. Then in any
	context requiring a PRICE you can instead provide an integer,
	which will be converted to a PRICE object through the procedure
	specified in that clause.
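
For readers who have not yet met the `convert' clause, here is a hedged sketch of what such a class PRICE might look like. The names `make_from_integer' and `amount', and the validity rule, are assumptions for illustration; the article does not show the actual class.

	class
		PRICE

	create
		make_from_integer

	convert
		make_from_integer ({INTEGER})

	feature

		make_from_integer (n: INTEGER)
				-- Initialize from the integer `n'.
			do
				amount := n
			end

		amount: INTEGER
				-- Hypothetical internal representation.

		is_valid: BOOLEAN
				-- Is this an acceptable price? (Assumed rule: not negative.)
			do
				Result := amount >= 0
			end

	end

With this declaration a call such as `my_stock.set_price (25)' is understood as if the client had passed `create {PRICE}.make_from_integer (25)'.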


OPERATOR FUNCTIONS AND ARRAY ACCESS
-----------------------

To finish the review of some characteristics of Eiffel 3 (that is to say, pre-ECMA) that warrant further discussion, let's consider an important facility: infix and prefix functions. If instead of giving a function the name `plus' you call it

	infix "+"


then a typical call, instead of the usual "dot notation" form `m.plus (n)', is written `m + n'. This possibility and its `prefix' variant are applicable to many possible operators -- not just the standard ones such as +, - and * -- and provide an elegant reinterpretation of traditional arithmetic notation as mere syntax for a feature call of the purest object-oriented pedigree.

One example of an infix operator is "@" for arrays (no longer necessary, as we'll see, in Standard Eiffel), defined in the ARRAY Kernel Library class as a synonym for feature `item', which yields array elements: to get the element at index `i', you can equally well write, for an array `a' and an integer `i':

					a.item (i)

or

[AT_SIGN]			a @ i


WHAT'S WRONG WITH THIS PICTURE?
-----------------------

The techniques so far have been part of the Eiffel story for a long time and have generally been found satisfactory. But in its effort to leave no stone unturned the ECMA standards committee found room for improvement.

Let's start the criticism with the last of the mechanisms described. Specifically, with an issue on the borderline of the cosmetic, which has however turned out to be important to many people. I thought for a long time that there was no reason to scorn the [AT_SIGN] form of accessing array elements in favor of the well-established notation

[BRACKET_ACCESS]	a [i]


One may even argue that the [AT_SIGN] form beats [BRACKET_ACCESS] in the keystroke game (one symbol rather than two). But, like it or not, people not used to Eiffel often demand [BRACKET_ACCESS] and balk at what they see as a bizarre notation.

Where they have a point, and the issue goes beyond cosmetics, is for operations that add array assignment to array access. To express

[BRACKET_UPDATE]	a [i] := a [i] + 1


you must, in pre-ECMA Eiffel, write

[DOT_UPDATE]		a.put (a.item (i) + 1, i)


or the slightly more compact form

[DOT_AT_UPDATE]		a.put (a @ i + 1, i)


neither of which, one must admit, captures the simplicity and clarity of [BRACKET_UPDATE]. Of course they are much more in line with the usual object-oriented view of things, but is this worth the complication?

The problem, of course, is that while for the query `item' we may define an infix synonym "@", no such mechanism is available for the command `put' (whose goal, as you have guessed, is to replace by a given value the array element at a given index).

Infix and prefix features raise some other less important problems. First, they break the Uniform Access principle since in pre-ECMA Eiffel they can't be attributes, just functions. This is more a concern of theoretical elegance than a crucial problem, since one would seldom use an attribute as a prefix query. More practically relevant is the problem that environments such as EiffelStudio face when displaying feature names: they may have to make up fictitious names such as infix_plus.

WHY WE REQUIRE SETTERS
-----------------------

Let's continue backwards in the list of mechanisms described. Recalling the various schemes:

[ATTRIB_ACCESS]		my_stock.price
[FUNCTION_CALL]		my_stock.get_price
[ATTRIB_ASSIGN]		my_stock.price := 25   -- Not valid in pre-ECMA Eiffel
[PROC_CALL]		my_stock.set_price (25)


we saw that it would be useless and damaging to disallow [ATTRIB_ACCESS] and (through methodological rules) require [FUNCTION_CALL] instead, causing the proliferation of zillions of noise "getter" functions such as get_price.

But then can't we say the same about Eiffel's prohibition (this time through language rules) of [ATTRIB_ASSIGN], forcing the programmer to write [PROC_CALL] instead? Aren't "setter" procedures such as set_price just as bad?

Actually no, not quite as bad. A setter often does more than setting; instead of just performing

	price := p


for an argument `p', set_price could be of the form

	set_price (p: PRICE)
			-- Set stock price to `p'.
		require
			valid_price: p.is_valid
		do
			price := p
			stock_history.record (p)
		ensure
			is_set: price ~ p
		end


This form has a precondition, which specifies which arguments are acceptable; and in addition to the setting instruction, it records the change in a history database.

It's precisely because of this kind of scheme that we disallow [ATTRIB_ASSIGN]. Maybe, when you start using a certain data abstraction such as STOCK the only setting operations are plain assignments, so that you only need a very simple setter procedure:

	naively_set_price (p: PRICE)
			-- Set stock price to `p'.
		do
			price := p
		end


which may seem like a nuisance to write, like a get_price getter. Then a short-sighted programmer may be tempted simply to use remote assignments of the [ATTRIB_ASSIGN] form -- and, if using Eiffel, to be upset that the language doesn't let him do that. But the more experienced programmer knows that naive setters don't all remain naive forever. If you let clients freely mess up object fields, through instructions such as my_stock.price := 25, and then suddenly realize you need to impose a precondition, or extra processing such as updating the stock history every time the price changes, you are stuck: you would have to comb through the code of every possible client for such assignments. This is in fact not even physically possible in the case of a library class. But if right from the start you have used the [PROC_CALL] style, enforcing information hiding by requiring all such modifications to use a procedure call `my_stock.set_price (25)', you have a single place to update, the setter procedure set_price.

CAN AND SHOULD WE DO BETTER?
-----------------------

This is the reasoning that led to the pre-ECMA Eiffel rules, and clearly that reasoning remains valid. Violating data abstraction, information hiding and object technology principles for short-term convenience is never a good idea. Such violations will come back to haunt you as your software grows and diversifies.

And yet -- do we need the syntactical weight of [PROC_CALL]? Wouldn't it be nice to write [ATTRIB_ASSIGN], not as a direct field assignment (this is methodologically wrong and should never be permitted) but as a *shorthand* for [PROC_CALL]? Many people have asked this question over the years. The Eiffel community as a whole has until now overwhelmingly answered "no", largely out of concern for language simplicity.

Indeed the difference in syntactical complexity between [PROC_CALL] and [ATTRIB_ASSIGN] is not so great. But then when we start combining this with array access, as in our examples

[BRACKET_UPDATE]	a [i] := a [i] + 1
[DOT_UPDATE]		a.put (a.item (i) + 1, i)



it would be hard to resist a preference for the first form if it could be given exactly the same semantics as the second, without any violation of methodological principles. This is the most delicate case, since it involves not only a setter mechanism but also the specifics of bracket notation for indexing into an array; if we get it right the notation should also work for data structures other than arrays: lists, hash tables and many others.

Can we make life easier for client programmers, propose a notation that is in line with traditional practice while still rigorously compatible with object-oriented principles, achieve this through general and extendible mechanisms rather than ad hoc kludges, and keep the language simple in the process? We had pulled this off for arithmetic expressions in Eiffel 3, thanks to infix and prefix features. The ECMA committee went on the lookout for a similar feat directed at attribute assignment and indexed structures.

"PROPERTIES"
-----------------------

Before describing the result of that effort let's take a detour through another language's mechanism targeting the same basic issue. The mechanism is available in C#, which took it from Delphi. Besides attributes (fields) and routines (methods), a C# class can declare properties. A property typically provides a setter and a getter. It is generally connected with an attribute, although the connection is not explicitly declared; the attribute itself remains secret (non-exported), but the setter and getter provide the associated access and modification facilities. So we might declare, in a class STOCK



///This is C#, not Eiffel!

	private int price_internal;    /// This is the secret attribute

	public int price     /// This is the property
		{
		get                        /// This is the getter
			{
			return price_internal;
			}

		set                        /// This is the setter
			{
			if (value < 0)     /// Stand-in validity check
				{
				throw new ArgumentException ("Price out of range");
				}
			price_internal = value;
			}
		}


`get' and `set' are keywords used to define the getter and setter algorithms of a property. `value' is also a keyword; in a setter, it denotes the value used to set the property (like the argument `p' in the Eiffel set_price routine). With these definitions, a client may write neither of

///C# text attempt, but invalid
	x = my_stock.price_internal;
	my_stock.price_internal = 25;

since price_internal is secret, but it may use

///This is C#, not Eiffel!
	x = my_stock.price;
	my_stock.price = 25;


The first of these instructions, through the getter, returns the value of price_internal; the second one, through the setter, sets price_internal to 25 (the new value being denoted within the setter as `value').

So this achieves the goal of proposing an assignment-like syntax with the semantics of a setter procedure call. But is it worth it?

I say no. The meagre benefit doesn't justify the added language and programming complexity: three new keywords -- `get', `set' and `value' -- used (only!) for this mechanism; still the need to introduce a secret attribute as well as the property that shadows it, all because of the misplaced view that exporting an attribute would mean exporting it unrestrictedly; and the introduction of somewhat bizarre animals, the setter and the getter, which are like routines but don't have a name and are not really features (members) of the class. In addition, essentially every getter that ever gets written is of the above form -- `get {return the_secret_attribute}' -- and hence is little more than repetitive "noise" coding. Finally, all this must be explained to programmers, since novices can hardly be expected to guess the purpose of "properties"; the mechanism is not that easy to explain, and it contributes to the difficulty of learning the language and to the risk of misuse.

The intent of the mechanism may be laudable, but the realization is disappointing. Consideration of this experience (in Delphi) played a definite role in keeping us convinced, for many years, that it wasn't worth trying anything in this direction. This topic continued to figure in many discussions on Eiffel mailing lists, but the conclusion was usually that we should just stick to the clear application of O-O principles and forget about giving an assignment-like syntax to procedure calls. "Real programmers use a dot".

FEATURE NAMES AND THEIR ALIASES
-----------------------

It's time now to see the coordinated set of language adaptations through which Standard Eiffel addresses the various issues discussed. We start with the ones at the lowest level, seemingly directed at syntax only, and move up from there.

The first step is to get rid of infix and prefix features. Well, not really, we just change the convention. A consequence of the above observations is that it simplifies everyone's life if we can always assume that a feature has a name of the Identifier kind, like x or this_name or plus. We turn this into a principle of the language:

Feature principle: every feature has an associated identifier (ECMA standard, 8.5.13).

As a consequence, dot notation is always available for feature calls, so that we may write the sum of two integers as `m.plus (n)', class INTEGER having a feature of identifier `plus' for that purpose.

Of course we still want infix and prefix notation. This is obtained no longer by giving the feature a special name (infix "+" in Eiffel 3) but by adding an infix *alias* to its normal identifier name. The feature is declared as

	plus alias "+" (other: INTEGER): INTEGER
			-- Sum of current integer and `other'
		do
			...
		end


As a consequence of this declaration, a call `m.plus (n)' can also be written `m + n'. So we get the same effect as with infix features of Eiffel 3, but in a more regular and consistent fashion; the infix property is an addition to the normal properties of a feature, which include always having a name and being amenable to dot-notation calls.

In a given class, any particular operator can appear as alias of at most one feature with one argument -- like the above `plus' --, and of at most one feature with no argument. In other words, it may denote a unary (prefix) query and a binary (infix) query. This maintains the no-overloading policy of Eiffel. For unary operators, the feature may in ECMA Eiffel be an attribute as well as a function, as part of the general move to provide full Uniform Access by removing any unnecessary distinction between these two forms of query.
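
To illustrate the rule, here is a small sketch of a hypothetical class VECTOR_2 (not part of any library discussed here) in which the operator "-" serves as alias for both a unary and a binary query, alongside a "+" alias.

	class
		VECTOR_2

	create
		make

	feature

		make (a_x, a_y: REAL_64)
				-- Initialize with coordinates `a_x' and `a_y'.
			do
				x := a_x
				y := a_y
			end

		x, y: REAL_64
				-- Coordinates.

		plus alias "+" (other: VECTOR_2): VECTOR_2
				-- Sum of current vector and `other'.
			do
				create Result.make (x + other.x, y + other.y)
			end

		opposite alias "-": VECTOR_2
				-- Current vector with both coordinates negated (unary "-").
			do
				create Result.make (-x, -y)
			end

		minus alias "-" (other: VECTOR_2): VECTOR_2
				-- Difference of current vector and `other' (binary "-").
			do
				Result := Current + (-other)
			end

	end

For vectors `u' and `v', a client may then write `u - v' or `u.minus (v)', and `-u' or `u.opposite', with the same meaning in each pair.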

A BRACKET ALIAS
-----------------------

Yet another form of alias is the "bracket" alias, which reads just "[]". At most one feature in a class may have that alias. For example class ARRAY [G] now declares `item' as

	item alias "[]" (i: INTEGER): G
			-- Entry of index `i'
		require
			...
		do
			...
		ensure
			...
		end


The effect, as you have probably guessed, is that instead of the syntax

	a.item (i)


you may now take advantage of the bracket alias to denote an array element as:

	a [i]


The `infix "@"' synonym is no longer needed (although, as with other older notations, ISE libraries will continue to support it for compatibility).

With this notion of bracket alias you can use the notation of most other programming languages for arrays, but with a precise object-oriented meaning making it a normal case of the fundamental O-O computation mechanism, feature call. Arrays in Eiffel are not a magically built-in notion but objects defined, like all others, by a class. Compilers, of course, can recognize the specificity of the ARRAY class and optimize this particular kind of call.

Unlike bracket notations restricted to arrays, however, the bracket alias mechanism gives you full generality: you may introduce a bracket-aliased feature in any class. With a hash table of PHONE_NUMBER objects indexed by strings, you will be able to use either

	phone_numbers.item ("ELIZABETH")

or

	phone_numbers ["ELIZABETH"]


if `item' in HASH_TABLE has a bracket alias (it does in EiffelBase).

The feature with a bracket alias must be a query, but it can have any number of arguments; so with the appropriate class for three-dimensional arrays you could use

	my_array3 [i, j, k]


The alias mechanism -- including both its operator and bracket variants -- follows the idea of reconciling object-oriented techniques with traditional notations, as inaugurated by infix and prefix features in Eiffel 3, and improves on it.

ASSIGNER PROCEDURE
-----------------------

Let's come now to the thorniest issue: should we permit an assignment-like syntax for procedure calls? This has long been controversial in the Eiffel community, but I think that the ECMA mechanism will, after the possible initial shock, convince everyone.

Assume, whether it's a good idea or not, that we want to permit assignment-like syntax for changing the `price' of a STOCK. As before, we have the declaration

[ATTRIB_DECL]		price: PRICE


and the procedure set_price, unchanged but repeated here for convenience:

	set_price (p: PRICE)
			-- Set stock price to `p'.
		require
			valid_price: p.is_valid
		do
			price := p
			stock_history.record (p)
		ensure
			is_set: price ~ p
		end


Also as before, a client can call this procedure through

[PROC_CALL]		my_stock.set_price (25)


To permit, as an exact synonym, the form

[ATTRIB_ASSIGN]		my_stock.price := 25


it now suffices to add an `assign' specification to the above declaration of `price', [ATTRIB_DECL], rewriting it as

[ASSIGN_ATTRIB_DECL]	price: PRICE assign set_price


This states that the query `price' now has an associated "assigner command", the procedure set_price. The effect -- the only effect -- of this extra qualification is to allow [ATTRIB_ASSIGN] as an exact synonym for [PROC_CALL].

Eiffel programmers well steeped in the principles of object-oriented design tend at first to react negatively to this mechanism: isn't it a violation of the principles that ensure the quality of our software, from information hiding to Uniform Access? But in fact it's not. [ATTRIB_ASSIGN] is not an assignment; assigning to a field remains prohibited, as it should be. [ATTRIB_ASSIGN] is a specimen of a new construct, the "assigner call", with the semantics of a procedure call. It is just a different syntax for the procedure call [PROC_CALL], in exactly the same way that `m + n' is a different syntax for `m.plus (n)'. In the same way that you can still write `m.plus (n)' if you wish, you can still write `my_stock.set_price (25)'. But the assignment-like syntax is more convenient in some cases. Information hiding is not violated, since `my_stock.price := 25' remains a procedure call, with all its properties, including precondition checking if turned on, and execution of the full procedure body with its effect on `stock_history'. No confusion is possible for the reader of the program text since `a.b := c' is only permitted if `b' is a query with an associated assigner command, and then always has the semantics of a procedure call.
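
The following self-contained sketch may help make the point. It is a simplified variant of STOCK, not the class discussed above: the price is an INTEGER rather than a PRICE, and a hypothetical counter `change_count' stands in for the stock history.

	class
		STOCK

	feature

		price: INTEGER assign set_price
				-- Current price.

		change_count: INTEGER
				-- Number of price changes so far (stand-in for `stock_history').

		set_price (p: INTEGER)
				-- Set `price' to `p' and count the change.
			require
				non_negative: p >= 0
			do
				price := p
				change_count := change_count + 1
			ensure
				is_set: price = p
				counted: change_count = old change_count + 1
			end

	end

In a client, each of the two instructions

	my_stock.price := 25
	my_stock.set_price (25)

calls `set_price': the precondition applies to both, and each increments `change_count'. The assignment `price := p' inside the body of `set_price', with no qualifying target, is an ordinary assignment within the class itself, which has always been permitted.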

The assigner call mechanism is quite flexible. The only requirement is that the feature must be a query -- attribute or function with any number of arguments -- and that the assigner command must be a procedure with the same arguments plus one of the same type as the result of the query (PRICE in our example). An example where the query has an argument, unlike `price' above, is `item' in ARRAY [G], now declared as

	item (i: INTEGER): G assign put
		require ... do ... end


where `put' is the procedure that modifies an array entry:

	put (x: G; i: INTEGER)
			-- Set the `i'-th array entry to `x'.
		require ... do ... ensure ... end


This allows, for an array `a', writing

	a.item (3) := v


as a synonym for the procedure call

	a.put (v, 3)

COMBINING WITH ALIASES
-----------------------

All that remains is to marry the concept of assigner procedures with the alias mechanism. I am sure you have guessed the next step: with the full declaration of `item', specifying both the assigner procedure and the bracket alias

	item alias "[]" (i: INTEGER): G assign put
		require ... do ... end


you can now write the last example also as

	a [3] := v


as an exact synonym for the other two forms. Once again no assignment is involved; these are just syntactic variants for a feature call in the most authentic object-oriented spirit.

And of course we get to write

[BRACKET_UPDATE]	a [i] := a [i] + 1


generalizable to any number of arguments, as in (matrix multiplication)

	a [i, j] := a [i, j] + b [i, k] * c [k, j]


as soon as the corresponding queries have been given the appropriate bracket aliases and assigner commands. Few people actually writing such software would seriously claim that

	a.put (a.item (i, j) + b.item (i, k) * c.item (k, j), i, j)


is better, especially since the bracket form means exactly the same thing and, like the last one, is an uncompromising application of object-oriented concepts.

Since all the mechanisms described are completely general, you can use them in any class. So with the appropriate `assign' specification in HASH_TABLE, clients will be able to write

	phone_numbers ["ELIZABETH"] := "555-5555"


(where the use of a string on the right-hand side again takes advantage of a conversion procedure, here from strings to PHONE_NUMBER objects); this is just a syntactical variant for `phone_numbers.put ("555-5555", "ELIZABETH")'.
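
As an illustration of this generality, here is a sketch of a hypothetical two-dimensional GRID class combining a bracket alias and an assigner command. The class and its feature names are assumptions for the example; it also assumes a recent EiffelBase in which ARRAY provides the creation procedure `make_filled'.

	class
		GRID

	create
		make

	feature

		make (n_rows, n_columns: INTEGER)
				-- Create a grid of `n_rows' by `n_columns' zeros.
			require
				positive_rows: n_rows > 0
				positive_columns: n_columns > 0
			do
				rows := n_rows
				columns := n_columns
				create cells.make_filled (0.0, 1, n_rows * n_columns)
			end

		rows, columns: INTEGER
				-- Dimensions.

		item alias "[]" (i, j: INTEGER): REAL_64 assign put
				-- Entry at row `i', column `j'.
			require
				valid_row: 1 <= i and i <= rows
				valid_column: 1 <= j and j <= columns
			do
				Result := cells.item ((i - 1) * columns + j)
			end

		put (v: REAL_64; i, j: INTEGER)
				-- Set the entry at row `i', column `j' to `v'.
			require
				valid_row: 1 <= i and i <= rows
				valid_column: 1 <= j and j <= columns
			do
				cells.put (v, (i - 1) * columns + j)
			ensure
				is_set: item (i, j) = v
			end

	feature {NONE}

		cells: ARRAY [REAL_64]
				-- Row-by-row storage.

	end

For a grid `g', the instruction `g [2, 3] := g [2, 3] + 1.0' is then a synonym for `g.put (g.item (2, 3) + 1.0, 2, 3)', preconditions included.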

AN ASSESSMENT
-----------------------

Syntax is indeed what all this is about. But clarity and simplicity are important in quality software. The language changes that have been described here are in the end small:

  • A notion of alias, with its operator and bracket variants.
  • The possibility of associating an "assigner" procedure with a query.

That's all! The small size of the final mechanism doesn't mean that it was arrived at easily. There were heated discussions in Eiffel circles for many years, and several attempts were rejected, to the point that for a while everyone thought that an assignment-like equivalent for procedure calls was just a bad idea, best forgotten forever. But the issue kept coming back. The criteria for a solution were those described at the beginning of this article:

  • Keep the language simple: indeed there's only one extension, the ability to specify `assign comm' for a query, with just one new keyword. The other change, `alias', is a replacement for an existing mechanism, with a more general result; two keywords, `prefix' and `infix', are replaced by just one, `alias'. (Here too EiffelStudio will of course continue to support the previous mechanism for a long time.)
  • Make programmers' life easy: if we can avoid forcing new notations where there's an existing one with no clear downside, we should try to keep it with a newly reinterpreted semantics compatible with object-oriented concepts. That was already the case for operator expressions, with `m + n' understood as a feature call, and is now generalized to bracket notation `a [x]' as well as assignment syntax for calls, as in `a.b := c' and `a [i] := x'.
  • Enforce consistency and generality: none of what we have seen is specific to basic types such as INTEGER or to basic data structures such as arrays. You can use all the mechanisms in your own classes. There's another general principle at work here: what's good for the language designer is probably good for the language user too! Don't be selfish with your great ideas: let everyone else benefit from them too. That's why ARRAY is not a predefined notion but an Eiffel class, and anything that works for arrays can work for the rest of the world.
  • Always apply the methodological principles of Eiffel and object-oriented design.

I hope you will appreciate the application of these guidelines to the mechanisms described above, and enjoy using these mechanisms in your own software.

--Bertrand Meyer