Components and Contracts


May 2000: Beyond Objects: Components and Contracts

I largely agree with Bertrand Meyer’s view in his recent column ("What to Compose," March 2000). So let’s end our arguments here. Or should we? Admittedly, doing so would allow us to move on, but it wouldn’t be quite as much fun–and I certainly agree with Meyer when he writes: "I appreciate Software Development’s editors’ insights in setting up this multivoice column." So, let’s continue. This time, I shall address some of the questions Meyer posed and then move on to the important topic of contracts for component software. Readers less interested in the ongoing debate may prefer to "fast forward" to the latter half (beginning with heading "Components of Quality").

Binary: What and Why

Meyer admits that he finds the qualifications "source" and "binary" confusing, pointing out that in the "good old days (a long, long time ago–1992, perhaps) ‘source’ meant something like C or Pascal, and ‘binary’ meant code for some processor." Well, in the really old days, Fortran source, once completed and packaged into libraries, would be shipped as binary components. These components consisted of a deck of punched cards encoding the source of the Fortran code. Job Control Language (JCL) statements on leading cards would instruct the loader of the machine to first compile the cards. (Yes, nothing is new on the face of the earth!) In this case, the deck of cards, used as a software component, is in "binary form"–ready to be used by an automatic execution environment. The Fortran source is included verbatim in the deck, but the leading JCL commands provide the necessary closure to allow load-time compilation.

So, to reiterate: I believe that a binary unit’s main characteristic is that it can be used directly by the execution environment that the unit targets, whether it is a component or not. If the target environment contains an interpreter or compiler, then a binary unit can look very much like source code. However, true source code serves a different purpose: It is written by programmers to be read by both programmers and tools, with the intent of building things. Very often, source code units are not self-contained. For example, they textually include files from locations specified using file system paths, contain references to build-time variables (conditional compilation), do not contain explicit specifications of what they require and so on. In fact, a source code unit may not be usable outside of a delicate build environment.

It’s true that source code fragility depends on the language and development environment. For example, XML, in combination with XML namespaces, can be seen as a world of "source" that can be directly used as a "binary" as well. The same is true for many scripting languages. However, the fact that the same form can serve both purposes, that of source and that of binary unit, is not a reason to go soft on distinguishing between the two.

To summarize: A unit is a binary if it targets an execution environment; whether the form of that unit is readable, textual or machine code is irrelevant. A unit is a source if it targets human readers as well as development tools. The choice of ahead-of-time, just-in-time, or continuous online compilation or interpretation is one of execution technology that is unrelated to these terms.

Deployment and More

I’ve never claimed that components are simply units of deployment or that all units of deployment are components. Components, instead, are units of deployment that, in addition, satisfy a number of criteria. I entirely agree with Meyer’s seven criteria for components ("What to Compose," March 2000). His second and fifth criteria–in combination–help to explain why focusing on binary components (units of deployment) is important. These two criteria are: "A component may be used by clients without the intervention of the component’s developers," and "a component is usable on the sole basis of the specification that it ships with." If both of these criteria hold, what would a source component be good for? After all, we call a component source if it is meant to be read by humans as well as by tools. Yes, we could learn a great deal from studying the source, but we wouldn’t be allowed to modify the source without effectively creating a new component. Making source available together with a component is a separate issue (see, for example, the open source movement). And making only the source available requires the recipient to rebuild the component before it is usable; this typically requires more information than just the component’s source code, its specification of provided and required functionality, and quality attributes.

To summarize, a software component is a unit of deployment that is shipped in binary form and satisfies a number of criteria (that we don’t need to repeat here). Much of the confusion about this simple statement arises from interpretations of the words "binary" and "deployment"; I hope that this explanation clarifies my actual intent.

Components with Class

Meyer acknowledged that not every class qualifies as a component (and, as a consequence, that a component could contain multiple classes and other things). However, he postulated that the transformation of a given software artifact into a software component is easier when one starts with an object-oriented artifact instead of a non-object-oriented one. Instead of developing a lengthy argument, I would like to point out that, at least from my own experience, this is not necessarily true. I have seen some object-oriented artifacts that were indeed so carefully factored that dividing lines for component boundaries could be found. In the majority of cases, however, no such lines could be drawn. The tendency of class libraries to rely heavily on implementation inheritance is the most common showstopper here, but there are other problems. The use of directly exposed class fields, the spreading of concrete class names throughout implementations (where names of interfaces or abstract classes would do), and the reliance on complicated protocols of interaction that aren’t fully specified are other examples.

I have also seen plenty of non-object-oriented artifacts that have very similar problems and are equally hard to componentize–global variables are indeed a common limiting factor here. Nevertheless, there are many examples of traditional procedural abstractions that are easily wrapped and componentized. The reason is quite simple: procedural programming lends itself to layered structuring, while object-oriented programming does not. In a layered system, higher layers can call lower layers, but not the reverse. As long as procedures of a layered system do without global variables in their implementation, they can be clustered quite easily into components.

We will see shortly that the statelessness of easily componentized procedural systems naturally leads to a similar requirement for components in general. Also, the layering approach to system structures is natural when aiming at specifications based on pre- and postconditions–with the caveat that practically no object-oriented design is layered. This observation leads straight to the main topic of this column.

Stately Components

In "Point, Counterpoint" (Feb. 2000), I discussed why statefulness is incompatible with components. To avoid immediate confusion, let me emphasize that I have no problems with stateful objects created by a component. Meyer asked whether I was trying to exclude global variables, and yes, that is one thing that I wish to exclude. However, many object-oriented languages support global variables in less obvious forms, such as static member variables or fields. Such "class variables" (as opposed to instance variables) are global variables, but scoped by their defining class.
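A minimal Java sketch may make the distinction concrete. All class names here are invented for illustration. The first class keeps mutable state in a static field, so every client shares it. The second keeps its static code stateless and hands out objects that carry the state.

```java
// Hypothetical illustration; none of these names come from the column.

// A would-be component whose static code holds mutable state.
// The static field is a global variable, merely scoped by its class.
final class StatefulFormatter {
    private static String prefix = "";            // shared by every client

    static void setPrefix(String p) { prefix = p; }
    static String format(String s)  { return prefix + s; }
}

// The static part of this unit holds no mutable state; it only creates objects.
final class FormatterComponent {
    private FormatterComponent() {}

    static Formatter newFormatter(String prefix) { return new Formatter(prefix); }

    // State lives in instances, so separate clients get separate state.
    static final class Formatter {
        private final String prefix;
        private Formatter(String prefix) { this.prefix = prefix; }
        String format(String s) { return prefix + s; }
    }
}
```

With StatefulFormatter, a second client calling setPrefix silently changes what the first client gets back from format. With FormatterComponent, each client holds its own Formatter and the two cannot interfere.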

Meyer then asked whether my exclusion of state extends to a broader, nonlanguage sense of global variable, referring to mechanisms such as the Windows registry. This is an interesting and subtle point, and I am glad he brought it up because the issue needs to be clarified.

If a component statically relies on state kept in the Windows registry–rather than through objects that it creates–that state would qualify as unwanted global state. The registry can be seen as a directory database–an instance of the registry service component–that in the case of current Windows follows the Singleton pattern: There is only one instance per host. This, however, is not the problem. An actual problem arises if this database is used to make a component stateful. If the registry contains configuration information that is set at the time the component is installed and is kept constant for the lifetime of the component installation, then that is not state. Keeping such constants in a generally accessible place is fine.

If, however, a component with its static code accesses the registry and modifies entries there, then that component isn’t a component by my definition, because it uses the registry to store what amounts to global variables. Clients of the want-to-be component are now coupled by that global state and might suffer from side effects. No such problems arise if the component is truly one and all such updating accesses to the registry are performed by instances (objects) created by the component. Separate clients can now be served by separate instances. (You might object that it isn’t possible to build a system that doesn’t have global variables. That is indeed true, but the required ground-level global variables can and should be handled for us by the operating system.)

Before moving on, let us look at one further example: a typical file system interface. It would seem that a piece of software that implements a traditional procedural file system interface couldn’t be a component. After all, the file system on the disk would seem to be a huge global variable. On closer inspection the file system isn’t found guilty, though. All operations operate on file descriptors–handles to internal file and directory objects. There really is no need for state in the file system component itself.
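The handle-based style can be sketched in Java as follows. This is an in-memory stand-in, not a real file system. The names FileHandle, Volume and FileOps are invented, and passing the volume explicitly to open is an assumption of the sketch. The point is only that the procedures in FileOps keep no state of their own.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a file system; all names are invented.

// A handle refers to an internal file object and carries its own read position.
final class FileHandle {
    final StringBuilder contents;   // the underlying file object
    int position = 0;               // per-handle state
    FileHandle(StringBuilder contents) { this.contents = contents; }
}

// The stored files themselves: an object, not static state of the component.
final class Volume {
    final Map<String, StringBuilder> files = new HashMap<>();
}

// Procedural interface: every operation works on a handle; no static fields.
final class FileOps {
    private FileOps() {}

    static FileHandle open(Volume v, String name) {
        return new FileHandle(v.files.computeIfAbsent(name, n -> new StringBuilder()));
    }

    static void append(FileHandle h, String text) { h.contents.append(text); }

    // Returns the next character, or -1 at end of file.
    static int read(FileHandle h) {
        return h.position < h.contents.length() ? h.contents.charAt(h.position++) : -1;
    }
}
```

Two handles opened on the same file see the same contents but advance independently, which is the behavior one expects from file descriptors.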

Components of Quality

Moving from traditional software to component software is a challenge and, no doubt, many important problems still exist. Meyer is absolutely right to remind us that quality is a crucial aspect of a component–a composition tends to be as strong as its weakest component. Quality in terms of fully meeting a precise specification is the ideal for which we should strive.

In the world of components, the specifications that enable composition need to exist "between components." For example, the XML, CORBA, COM or EJB specifications are not owned or introduced by specific components. Instead, such specifications set a "standard" in the middle that components then claim to comply with. We need high-quality specifications in the middle, often called contracts, and high-quality implementations of components that indeed comply.

Contracts as entities in the middle are not naturally attached to specific classes–abstract or not. Instead, they are most naturally expressed as constraints and conditions over "model" variables (a.k.a. specification variables) that do not relate to an implementation. Classes or other programming constructs that then claim contract compliance need to establish the mapping between their implementation variables and the abstract model variables.
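As a rough illustration, the following Java sketch states a stack contract over a model variable (an abstract sequence) and lets one implementation supply the mapping from its own fields. Java cannot express the conditions themselves, so they appear as comments. The names StackContract, ArrayStack and StackContractCheck are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical contract "in the middle": its conditions are stated over a
// model variable (an abstract sequence), not over any implementation field.
interface StackContract {
    /** Mapping from the implementation variables to the model variable. */
    List<Integer> model();

    /** post: model() equals old model() with x appended. */
    void push(int x);

    /** pre: model() is non-empty.
     *  post: result is the last element of old model(), which is removed. */
    int pop();
}

// One implementation claiming compliance; it happens to use an array and a count.
final class ArrayStack implements StackContract {
    private int[] cells = new int[4];
    private int count = 0;

    @Override public List<Integer> model() {
        List<Integer> seq = new ArrayList<>();
        for (int i = 0; i < count; i++) seq.add(cells[i]);
        return seq;
    }

    @Override public void push(int x) {
        if (count == cells.length) cells = Arrays.copyOf(cells, 2 * count);
        cells[count++] = x;
    }

    @Override public int pop() {
        if (count == 0) throw new IllegalStateException("precondition violated: empty stack");
        return cells[--count];
    }
}

// A checker that knows only the contract, never the implementation.
final class StackContractCheck {
    private StackContractCheck() {}

    static void checkedPush(StackContract s, int x) {
        List<Integer> expected = new ArrayList<>(s.model());
        expected.add(x);
        s.push(x);
        if (!s.model().equals(expected)) throw new AssertionError("push postcondition violated");
    }
}
```

The checker never sees cells or count; a linked-list implementation with a different model() mapping could claim the same contract.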

Eiffel, and later Sather, are among the few languages that make some such constraints and mappings expressible in the programming language itself. Meyer deserves credit for the contribution he made by designing the language Eiffel and its support for design by contract. In his column, he correctly observes that I used neither Eiffel nor Sather in the coverage of contractual specifications in Component Software (Addison Wesley, 1998), although the book has numerous technical references to aspects of both these languages and many others. I chose a pragmatic course for the book, favoring Java to address the widest possible readership and Component Pascal to highlight a few items that are more naturally expressed in a small language with an explicit support for modules and procedures (besides classes and methods).

However, neither of the two languages used in my book (nor, for that matter, Eiffel or Sather) is capable of doing the "contracts as specifications in the middle" idea any justice. Simply put, there is no programming language available today that would. The closest approach is the separate support of Interface Definition Language (IDL) in the middle–a venerable approach that dates back to at least the early days of Ada and that today is most prominently used in CORBA and COM.

What’s in a Contract

IDL lends itself to the (incomplete) specification of interfaces, which can then be used by classes to specify both provided and required functionality. The traditional emphasis is on provided functionality: A class is declared to implement a list of interfaces. Required functionality is a more recent focus of attention. Traditional classes often contain the names of other classes or interfaces to handle typed instances and perform outcalls. To form a valid component, classes would have to be equipped with explicit declarations of such required functionality in order for them to be composable solely on the basis of their specification.
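In Java, which has no IDL of its own, one common approximation is to declare provided functionality with an implements clause and to make required functionality explicit as constructor parameters typed by interfaces. The sketch below uses invented names (LogSink, SpellChecker, SimpleSpellChecker) and is only one way to express the idea.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical interfaces standing in for IDL-style declarations.
interface LogSink { void log(String message); }                 // required by the checker
interface SpellChecker { boolean isCorrect(String word); }      // provided by the checker

// Provides: SpellChecker. Requires: LogSink.
// No concrete class name of a collaborator appears in the implementation.
final class SimpleSpellChecker implements SpellChecker {
    private final Set<String> words;
    private final LogSink log;

    SimpleSpellChecker(Set<String> words, LogSink log) {
        this.words = words;
        this.log = log;
    }

    @Override public boolean isCorrect(String word) {
        boolean ok = words.contains(word.toLowerCase(Locale.ROOT));
        if (!ok) log.log("unknown word: " + word);   // outcall through the required interface
        return ok;
    }
}
```

Because the requirement is visible in the constructor signature, whoever composes the system can supply any LogSink without reading the checker's implementation.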

As discussed in detail in Component Software, contracts need to go far beyond establishing functional requires and provides clauses. Quality attributes need to be incorporated. For example, a contract might include a parametric statement of the time and space implications of interacting under the rules of that contract. If that sounds esoteric, consider a contract that establishes a streaming link for live MPEG streams (the most popular family of standards for video-stream encoding). The MPEG specifications talk about bounds on jitter (latency variations) in real time. Bounds on bandwidth requirements, buffer space and so on are easily calculated based on the supported encoding quality.
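To suggest what a parametric quality clause might look like, here is a small Java sketch. The class name, the parameters and the sizing rule (peak rate multiplied by the jitter bound) are assumptions made for illustration, a simple first-order estimate rather than anything taken from the MPEG specifications.

```java
// Hypothetical quality clause of a streaming contract. The figures and the
// sizing rule are illustrative and not drawn from any MPEG specification.
final class StreamQualityClause {
    final long peakBitsPerSecond;   // bandwidth implied by the encoding quality
    final long maxJitterMillis;     // bound on latency variation

    StreamQualityClause(long peakBitsPerSecond, long maxJitterMillis) {
        this.peakBitsPerSecond = peakBitsPerSecond;
        this.maxJitterMillis = maxJitterMillis;
    }

    /** Bytes a receiver should buffer to ride out worst-case jitter at peak rate. */
    long minBufferBytes() {
        long bits = peakBitsPerSecond * maxJitterMillis / 1000;
        return (bits + 7) / 8;      // round up to whole bytes
    }

    /** Does a party offering this link rate and buffer size meet the clause? */
    boolean satisfiedBy(long linkBitsPerSecond, long bufferBytes) {
        return linkBitsPerSecond >= peakBitsPerSecond && bufferBytes >= minBufferBytes();
    }
}
```

For an assumed stream of 4,000,000 bits per second with a 50 ms jitter bound, minBufferBytes() evaluates to 25,000 bytes, so a party offering a 5 Mbit/s link and a 32 KB buffer would satisfy the clause.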

Other examples of quality attributes include compliance with security models, and compliance with testability standards. While we are still struggling to write down and verify precise specifications for functionality, our arsenal of approaches and tools thins when we try to venture out on quality attributes within contracts. That is not to say they are less important; they are just much harder to handle uniformly, and a great deal of work is still required to get it right.

Contract = Interface + Pre + Post?

Unfortunately, even the functional specification house is still not in order. The most prominent approach to adding contractual clauses to interfaces is the use of pre- and postconditions. Dating back to formal specifications of data type abstractions, pre- and postconditions assume that an operation’s (functional) semantics can be fully captured by pinpointing what needs to hold before invoking an operation and what needs to hold just after the operation returns. This is a good idea, and an even better one if invariants are factored in, stating that an invariant and a precondition hold before and the same invariant and a postcondition hold after an operation invocation.
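Java has no built-in contract clauses, but the scheme can be approximated with explicit run-time checks. The Account class below is a hypothetical example: the invariant is checked on entry and exit, the precondition on entry, and the postcondition against a saved old value.

```java
// Hypothetical example; Java has no contract syntax, so the clauses are
// written as comments and enforced with explicit checks.
final class Account {
    private long balance;           // invariant: balance >= 0

    private void checkInvariant() {
        if (balance < 0) throw new IllegalStateException("invariant violated");
    }

    long balance() { return balance; }

    /** pre: amount > 0.  post: balance == old balance + amount. */
    void deposit(long amount) {
        checkInvariant();
        if (amount <= 0) throw new IllegalArgumentException("precondition violated");
        long old = balance;
        balance += amount;
        if (balance != old + amount) throw new IllegalStateException("postcondition violated");
        checkInvariant();
    }

    /** pre: 0 < amount <= balance.  post: balance == old balance - amount. */
    void withdraw(long amount) {
        checkInvariant();
        if (amount <= 0 || amount > balance) throw new IllegalArgumentException("precondition violated");
        long old = balance;
        balance -= amount;
        if (balance != old - amount) throw new IllegalStateException("postcondition violated");
        checkInvariant();
    }
}
```

A caller that violates a precondition is rejected before any state changes, so the invariant still holds afterward.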

Unfortunately, classes (and non-object-oriented constructs such as callbacks before them) do not fall into the category of data processing abstractions. Indeed, it seems today that only the most trivial components are properly specifiable as data processing abstractions. The problem is externally observable invocations caused by the implementation of an operation, before that operation returns. Object-oriented programs do this all the time–there is hardly any method that wouldn’t rely on other objects by calling their methods. The same is often true for traditional implementations of data processing abstractions (traditionally, but confusingly, called Abstract Data Types). However, these implementations followed a strict layering model: it was simply not possible for the caller of an operation to observe that an abstraction was built out of simpler abstractions. A procedure calling a lower-level procedure wouldn’t transfer control to its caller before the lower-level procedure had returned. Pre- and postconditions keep the house of procedural and data abstraction in order. (And tend to simplify the componentization of such layered procedural systems–but I said that already.)

Sadly, moving to abstractions such as objects that do call out to other abstractions in a way that is (potentially) visible to the original caller breaks much of the beauty and correctness of the pre- and postcondition model. The most glaring problem is that pre- and postconditions cannot even refer to the presence or absence of outcalls! That is, this specification technology is totally oblivious to observable behavior of objects that are currently handling a request. Observable is the key point here: All object-oriented programming languages allow for such observation and many common patterns actually encourage such activity. The most obvious example here is the Observer pattern, of course. Nomen est omen. (Again, see Component Software for a detailed analysis of scenarios that show the problems with an approach based on pre- and postconditions alone. Also notice that these problems are orthogonal to issues of concurrency and that concurrency control, such as monitor models, cannot be used to address them.)
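A small Java sketch of the Observer case shows the difficulty. The names Cell, CellObserver and ObserverOutcallDemo are invented. The postcondition one would naturally write for set(v) is get() == v, yet it says nothing about the notification outcall, and an observer that re-enters the cell during that outcall can leave the caller looking at a different value.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of an outcall that pre- and postconditions do not mention.
interface CellObserver { void changed(Cell source); }

final class Cell {
    private int value;
    private final List<CellObserver> observers = new ArrayList<>();

    void attach(CellObserver o) { observers.add(o); }
    int get() { return value; }

    /** post (as usually written): get() == v. Silent about the outcalls below. */
    void set(int v) {
        value = v;
        // Outcall before set returns: observers run while the request is still being handled.
        for (CellObserver o : new ArrayList<>(observers)) o.changed(this);
    }
}

final class ObserverOutcallDemo {
    private ObserverOutcallDemo() {}

    /** Reports whether get() == v held when the outer set(v) returned. */
    static boolean postconditionHeldAfterSet(int v) {
        Cell cell = new Cell();
        // An observer that clamps large values re-enters the cell during notification.
        cell.attach(source -> { if (source.get() > 10) source.set(10); });
        cell.set(v);
        return cell.get() == v;
    }
}
```

Calling postconditionHeldAfterSet(5) yields true, while postconditionHeldAfterSet(42) yields false: the inner set(10) meets its own postcondition, but the caller of set(42) finds the cell holding 10. No concurrency is involved; everything runs on one thread.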

However, allow me to emphasize that pre- and postconditions are still a much better approximation of the (functional) specification of operations than informal commentary or no specification at all. A Contract Definition Language, as called for by Meyer, is indeed a good idea–and one that several research teams around the globe are working on–but getting there will not be easy. In a world of near speed-of-light communication, I am now looking forward to reading the forthcoming columns at a rate of one a month.

