May 01, 2002
Polyglot Programming
While some question the need for multilanguage support, .NET proves that such strange bedfellows as Fortran, Eiffel and C# can interoperate. First in a three-part series.
Bertrand Meyer
What does it take to support several programming languages within one environment? .NET, which has taken language interoperability to new heights, shows that it's possible, but only with the right design, the right infrastructure, and appropriate effort from both compiler writers and programmers. In this article, I'd like to go deeper than what I've seen published on the topic, to elucidate what it takes to provide true language openness. The experience that my colleagues have accumulated over the last three years of working to port Eiffel to .NET, as well as the countless discussions we've had with other .NET language implementers, informs this discussion.
Who Needs More Than One Language?
Even more significant is the matter of libraries. Whether your project uses one language or more, it can take advantage of reusable libraries, whose components may have originated in different source languages. Here, interoperability means that you can use whatever components best suit your needs, regardless of creed or language of origin.
This ability to mix languages offers great promise for the future of programming languages, as the practical advance of new language designs has been hindered by the library issue: Though you may have conceived the best language in the world, implemented an optimal compiler and provided brilliant tools, you still might not get the users you deserve because you can't match the wealth of reusable components that other languages are able to provide, merely because they've been around longer. Building bridges to these languages helps, but it's an endless effort if you have to do it separately for each one. In recent years, this library compatibility issue may have been the major impediment to the spread of new language ideas, regardless of their intrinsic value.
Language interoperability can overturn this obstacle. Under .NET, as long as your language implementation satisfies the basic interoperability rules of the environment (as explained in the following examples), you can take advantage of components written in any other language whose implementers have adhered to the same rules. That still means some work for compiler writers, but it's work they must do once for their language, not once for each language with which they want to interface.
The language openness of .NET is a welcome relief after the years of incessant Java attempts at language hegemony. For far too long, the Sun camp has preached the One Language doctrine.
The field of programming language design has a long, rich history, and there is no credible argument that the alpha and omega of programming, closing off any future evolution, was uttered in Silicon Valley in 1995. Microsoft's .NET breaks this lock. Everyone will benefit, even the Java community: Now that there's competition again, new constructs are (surprise!) again being considered for Java; one hears noises, for example, about Sun finally introducing genericity sometime in the current millennium. Such are the virtues of openness and competition.
The more than 20 languages ported or in the process of being ported to .NET range from Cobol and Fortran to Smalltalk, Oberon, Eiffel, Java, Perl, Scheme and Python. How does this all work? Do languages have to sacrifice anything? Should we believe those who say that it's all smoke and mirrors, and that deep down, all languages get reduced to a common denominator, whether we call it C#, Visual Basic .NET, managed C++ (or Java)? These are some of the questions I'll examine in this three-part article.
Language Operability at Work
I don't know about you, but I've never seen anything coming even close to this level of interoperability.
Affirmative Action
<%@ Assembly Name="conference" %>
<%@ Import Namespace="Conference_registration" %>
<%@ Page Language="C#" %>
<HTML>
<HEAD>
<TITLE>TOOLS CONFERENCE</TITLE>
<SCRIPT RUNAT="SERVER">
    /* Start of C# code */
    REGISTRAR conference_registrar;
    bool registered;
    String error_message;

    void Page_Init(Object Source, EventArgs E) {
        conference_registrar = new REGISTRAR();
        conference_registrar.start();
        ... More C# code ...
    }
... More HTML ...
The first C# line is the declaration of a C# variable called conference_registrar,
of type REGISTRAR. On the subsequent lines, we create an instance
of that class through a new expression, and assign it to conference_registrar;
and we call the procedure start on the resulting object. Presumably, REGISTRAR
is just some C# class in this system.
Presume not. Class REGISTRAR is an Eiffel class. The only C# code
in this example application is on the ASP.NET page, and consists of only a few
more lines than shown above; its task is merely to read the text entered into
the various fields of the page by a Web site visitor and to pass it on, through
the conference_registrar object, to the rest of the system: the
part written in Eiffel that does the actual processing.
Nothing in the above example (or the rest of the ASP.NET page) mentions Eiffel.
REGISTRAR is not declared as an Eiffel class, or a class in any
specific language: It's simply used as a class. The expression new REGISTRAR()
that creates an instance of the class might look to the unsuspecting C# programmer
like a C# creation, but in fact it calls the default creation procedure (constructor)
of the Eiffel class. Not that this makes any difference at the level of the
Common Language Runtime: At execution time, we don't have C# objects, Eiffel
objects or Visual Basic objects; we have .NET citizens with no distinction of
race, religion or language origin.
In the previous code sample, if we don't tell the runtime that REGISTRAR
is an Eiffel class, how is it going to find that class? Simple: namespaces.
Here's the beginning of the Eiffel class text of REGISTRAR:
indexing
    description: "[
        Registration services for a
        conference; include adding new
        registrants and new registrations.
    ]"
    dotnet_name: "Conference_registration.REGISTRAR"
class
    REGISTRAR
inherit
    WEB_SERVICE
create
    start
feature -- Initialization
    start is
            -- Set empty error message.
        do
            set_last_operation_successful (True)
            set_last_error_message ("No Error")
            set_last_registrant_identifier (-1)
        end
... Other features ...
The line preceded by dotnet_name says: "To the rest of the
.NET world, this class shall be part of the namespace Conference_registration,
where it shall be known under the name REGISTRAR." This enables
the Eiffel compiler to make the result available in the proper place for the
benefit of client .NET assemblies, whether they originated in the same language
or in another one.
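From the consumer's side, the effect of that dotnet_name line can be sketched in C#. This is only an illustration under stated assumptions: the REGISTRAR class below is a hand-written C# stand-in for what the Eiffel compiler would place in the assembly, and the LastErrorMessage property, the Client namespace and the RegistrarTypeName helper are hypothetical names introduced for the sketch.

```csharp
using System;

// Stand-in for the type the Eiffel compiler would emit under the external
// name "Conference_registration.REGISTRAR". Written in C# here only so the
// sketch compiles on its own; the body is an assumption.
namespace Conference_registration
{
    public class REGISTRAR
    {
        // Hypothetical property mirroring the state set by the Eiffel `start`.
        public string LastErrorMessage { get; private set; } = "";

        public void start()
        {
            LastErrorMessage = "No Error";
        }
    }
}

namespace Client
{
    public static class Program
    {
        // Hypothetical helper: creates a registrar through its namespace-qualified
        // name and reports the full name the runtime knows the type by.
        public static string RegistrarTypeName()
        {
            var registrar = new Conference_registration.REGISTRAR();
            registrar.start();
            return registrar.GetType().FullName;
        }

        public static void Main()
        {
            // Prints "Conference_registration.REGISTRAR"
            Console.WriteLine(RegistrarTypeName());
        }
    }
}
```

The client names only a namespace and a class; nothing in it records which language produced the type.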
Now reconsider the beginning of the ASP.NET page shown earlier:
<%@ Assembly Name="conference" %>
<%@ Import Namespace="Conference_registration" %>
<%@ Page Language="C#" %>
<HTML>
<HEAD>
<TITLE>TOOLS CONFERENCE</TITLE>
<SCRIPT RUNAT="SERVER">
... The rest as before ...
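The second line says to import the namespace Conference_registration.
The basic technique will always be the same, whatever the producer language (call it L1) and the consumer language (call it L2).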
The details may vary depending on the languages involved. On the producer side, L1, you may retain the original class names or, as in the preceding Eiffel example, explicitly specify an external class name. On the consumer side, you may have mechanisms to adapt the names of external classes and their features to the conventions of L2. Some flexibility is essential here, since what's acceptable
as an identifier in one language may not be in another: Visual Basic, for example,
accepts a hyphen in a feature name, as in
Combining Different Language Models
Starting from a source language, the compiler will map your programs into a common target, as shown in "Combining Different Language Models." This by itself isn't big news, since we could use the same figure to explain how compilers map various languages to the common model of, say, the Intel architecture. What is new is that the object model, as we've seen in detail, retains high-level structures such as classes and inheritance that have direct equivalents in source programs written in modern programming languages, especially object-oriented ones. This is what allows modules from different languages to communicate at the proper level of abstraction, by exchanging objects, all of which, as .NET objects, are guaranteed to have well-understood, language-independent properties.
Object Model Discrepancies
Even if we restrict our attention to object-oriented languages, we'll find discrepancies. Each has its own object model; while the key notions (class, object, inheritance, polymorphism, dynamic binding) are common, individual languages depart from the .NET model in some significant respects.
These object model discrepancies raise a serious potential problem: How do we fit different source languages into a common mold? There are two basic approaches: Either change the source language to fit the model, or let programmers use the language as before, and provide a mapping through the compiler. No absolute criterion exists: Both approaches are found in current .NET language implementations. C++ and Eiffel for .NET provide contrasting examples.
The Radical Solution
But if you then look at the specifications for managed classes, you'll realize that you're not in Kansas any more (assuming, for the sake of discussion, that Kansas uses plain C++). On the "no" side, there's no multiple inheritance except from (you guessed it) completely abstract classes, no support for templates, no C-style type casts. On the "yes" side, you'll find new .NET mechanisms such as delegates (objects representing functions) and properties (fields with associated methods). If this sounds familiar, that's because it is: Managed C++ is very close to C#, in spite of what the default Microsoft descriptions would have you believe. Predictably, the restrictions also rule out any cross-inheritance between managed and unmanaged classes.
The signal to C++ developers is hard to miss: The .NET designers don't think too highly of the C++ object model and expect you to move to the modern world as they see it. The role of Unmanaged C++ is simply to smooth the transition by allowing C++ developers to move an application to the managed side one class at a time. An existing C++ application will compile straight away as unmanaged. Then you'll try declaring specific classes as managed. The compiler will reject those that violate the rules of the managed world, for example, by using improper casts; the error messages will tell you what you must correct to turn these classes into proper citizens of the managed world.
For C++, this is indeed a defensible policy, as the language's object model (defined to a large extent by the constraint of backward compatibility with C, a language more than three decades old) is obsolete by today's standards.
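For readers who haven't met the two .NET mechanisms named above, here is a minimal sketch of a delegate and a property. It is written in C# rather than Managed C++ syntax, and the names Formatter, Counter, Demo, Describe and Plain are hypothetical, chosen only for illustration.

```csharp
using System;

// A delegate type: its instances are objects that represent functions
// taking an int and returning a string.
public delegate string Formatter(int value);

public class Counter
{
    private int count;

    // A property: clients use it like a field, but every access
    // runs through the associated get and set methods.
    public int Count
    {
        get { return count; }
        set
        {
            if (value < 0) throw new ArgumentOutOfRangeException("value");
            count = value;
        }
    }
}

public static class Demo
{
    // Calls whatever function the delegate object represents.
    public static string Describe(Counter c, Formatter f)
    {
        return f(c.Count);
    }

    static string Plain(int v) { return "count=" + v; }

    public static void Main()
    {
        var c = new Counter();
        c.Count = 3;          // goes through the set accessor
        Formatter f = Plain;  // wrap a method in a delegate object
        Console.WriteLine(Describe(c, f));  // prints "count=3"
    }
}
```

The assignment to Count looks like a field write but can still enforce a rule, and the delegate lets a function be passed around like any other object.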
Respecting Other Object Models
Fortunately, the answer is no, at least not if "you" here means the
programmer. The scheme described in "Combining Different
Language Models" doesn't require that all languages adhere to the .NET
object model; rather that they
Tune in next issue and discover how this all works out.