May 01, 2002
Modeling XML Applications

The XML Meta-data Interchange specification standardizes metamodels, models and

David Carlson

Read part 1, part 2, part 3 and part 4.

When mapping between UML and XML Schema, the first priority is to enable generation of a valid XML Schema from any UML class diagram. The developer needs no knowledge of the XML Schema specification (which can be intimidating to newcomers). Having this capability enables a rapid development process and supports reuse of the model vocabularies in several different deployment languages or environments, because the model isn't overly specialized to XML structure.

To meet specific XML design guidelines, customization of the generated schemas must be supported. Several XML experts have told me that the generated schema must be the same as one they would write by hand. This may include choice of global versus local element declarations, or a preference for reusable <group> definitions in the schema context. However, the best hand-authored schemas still follow a consistent set of design principles that I intend to duplicate with UML-assisted design.

Rules for XML Schema Generation
I didn't start from scratch when creating these mapping rules, but used the XML Meta-data Interchange (XMI) specification from the Object Management Group (OMG) as a foundation. XMI standardizes the exchange of metamodels, models and object instances between applications. The standard is equally applicable to database meta-data and XML schemas, or to any other model that you might define for your application. Simply stated, XMI defines a consistent way to create XML schemas from models and XML document instances from objects that are instances of those models.

The Catalog Markup Language
This UML diagram shows a platform-independent model for a simple product catalog. By following a default set of production rules, we can generate a complete XML Schema from this model, even though the model itself contains nothing specific to XML. All UML attributes are mapped to XML elements contained within an element for each class. Each navigable association end (with an arrow) is also mapped to an element definition within the originating class.

The XMI specification is written in terms of the Meta Object Facility (MOF), the language used to define the UML metamodel—essentially an abstract subset of the UML. For additional background on the MOF and XMI—without jumping into the deep end of the conceptual pool—I recommend reading the overview in section 2 of the MOF 1.3 specification, plus the design rationale in section 4 of the XMI 1.1 specification (see http://www.omg.org/mda/specs.htm).

I created an example product catalog design in UML, called the Catalog Markup Language (CatML), that illustrates the mapping to XML. The class diagram (see "The Catalog Markup Language") defines a catalog composed of items that may be either individual products or product bundles. A product bundle aggregates other products and/or bundles. Each catalog item is associated with exactly one supplier.

The following schema fragment is produced by applying the default mapping rules:

<xs:element name="CatalogItem" type="CatalogItemType"/>
<xs:complexType name="CatalogItemType">
  <xs:sequence>
    <xs:element name="globalIdentifier" 
        type="xs:string"/>
    <xs:element name="name" 
        type="xs:string"/>
    <xs:element name="description" 
        type="xs:string" minOccurs="0" 
        maxOccurs="1"/>
    <xs:element name="listPrice">
      <xs:complexType>
        <xs:sequence>
          <xs:element ref="Money"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>
    <xs:element name="supplier">
      <xs:complexType>
        <xs:sequence>
          <xs:element ref="Party"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>
  </xs:sequence>
</xs:complexType>
Each UML class creates a schema complexType definition containing a child element for each attribute within that class. (Later, we'll see how some of these UML attributes can be mapped to XML attributes instead of child elements.) The CatalogItem type is also specified by its associations to other classes in the model. The CatML diagram (above) includes one association that originates at CatalogItem, and that is designated by a navigation arrow at the opposite end. Each association end has a role name and multiplicity that specifies how the target class is related. These association ends are added to the content of the complexType definition along with the elements created from the UML attributes.

The following XML document is valid according to this definition:

<CatalogItem>
  <globalIdentifier>p-18463</globalIdentifier>
  <name>Wizard Laptop</name>
  <listPrice>
    <Money>
      <currency>USD</currency>
      <amount>2295.00</amount>
    </Money>
  </listPrice>
  <supplier>
    <Party>
      <globalIdentifier>s-1742</globalIdentifier>
      <Contact>
        . . .
      </Contact>
    </Party>
  </supplier>
</CatalogItem>
Because the UML attributes globalIdentifier, name and description have primitive data types, the schema includes their values directly as element content. The default mapping for associations, however, creates a wrapper element in the schema that corresponds to the role name in UML. This wrapper contains the instances of the associated class, which the schema references through the top-level element declared for that class's complexType. To enable these references, a top-level element is automatically created for each complexType in the schema, as shown in the first line of the CatalogItem schema fragment above.

A slight anomaly exists for the attribute listPrice. Because its type is a class instance instead of a primitive data type, it generates a schema structure identical to that used for associations. Some modelers prefer to represent a catalog item's list price as an association to Money in the CatML diagram, but I've found that many others prefer more readable diagrams that are obtained when a simple structure, like Money, is used frequently as an attribute type within the model. Either approach is allowed in UML.
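
The default production rules described above can be sketched in a few lines of code. This is only an illustration under stated assumptions, not the generator used for this article: the dictionary layout of CATALOG_ITEM, the PRIMITIVES table and the function names are hypothetical stand-ins for a real UML model read from XMI. The sketch uses Python's standard xml.etree.ElementTree module and applies one rule per feature: primitive-typed attributes become simple child elements, while class-typed attributes and navigable association ends become wrapper elements that refer to the top-level element of the target class.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)

# Hypothetical, simplified stand-in for one UML class. Each feature is
# (name, type, lower bound, upper bound); an upper bound of None means "*".
CATALOG_ITEM = {
    "name": "CatalogItem",
    "attributes": [
        ("globalIdentifier", "string", 1, 1),
        ("name", "string", 1, 1),
        ("description", "string", 0, 1),
        ("listPrice", "Money", 1, 1),
    ],
    "associations": [("supplier", "Party", 1, 1)],
}
# Assumed mapping from UML primitive type names to schema data types.
PRIMITIVES = {"string": "xs:string"}


def xs(tag):
    """Qualify a tag name with the XML Schema namespace."""
    return "{" + XS + "}" + tag


def add_occurs(element, lower, upper):
    """Write minOccurs/maxOccurs only when the multiplicity is not 1..1."""
    if (lower, upper) != (1, 1):
        element.set("minOccurs", str(lower))
        element.set("maxOccurs", "unbounded" if upper is None else str(upper))


def default_mapping(uml_class):
    """Return [top-level element, complexType] for one class."""
    name = uml_class["name"]
    top = ET.Element(xs("element"), {"name": name, "type": name + "Type"})
    ctype = ET.Element(xs("complexType"), {"name": name + "Type"})
    seq = ET.SubElement(ctype, xs("sequence"))
    features = uml_class["attributes"] + uml_class["associations"]
    for feat_name, feat_type, lower, upper in features:
        child = ET.SubElement(seq, xs("element"), {"name": feat_name})
        if feat_type in PRIMITIVES:
            # Primitive value: element content of a built-in schema type.
            child.set("type", PRIMITIVES[feat_type])
            add_occurs(child, lower, upper)
        else:
            # Class-typed feature: wrapper named after the role, containing
            # a reference to the target class's top-level element.
            inner_type = ET.SubElement(child, xs("complexType"))
            inner_seq = ET.SubElement(inner_type, xs("sequence"))
            ref = ET.SubElement(inner_seq, xs("element"), {"ref": feat_type})
            add_occurs(ref, lower, upper)
    return [top, ctype]


for node in default_mapping(CATALOG_ITEM):
    print(ET.tostring(node, encoding="unicode"))
```

Note that listPrice takes the same branch as supplier because its type is a class rather than a primitive, which is the anomaly just described.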

Generalization
Generalization from one class to another is a fundamental concept in object-oriented analysis and design. The specialized subclass inherits attributes and associations from all of its parent classes. This is easily represented in W3C XML Schema, as shown in this example for ProductBundle:

<xs:element name="ProductBundle" 
    type="ProductBundleType" 
    substitutionGroup="CatalogItem"/>
<xs:complexType name="ProductBundleType">
  <xs:complexContent>
    <xs:extension base="CatalogItemType">
      <xs:sequence>
        <xs:element name="contains">
          <xs:complexType>
            <xs:sequence>
              <xs:element ref="CatalogItem" 
                  minOccurs="1" 
                  maxOccurs="unbounded"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>
There are two differences from the previous schema definition for CatalogItem. The ProductBundle element includes substitutionGroup="CatalogItem", which means that whenever the CatalogItem element is required as an XML content element, ProductBundle may be substituted in its place. Thus, ProductBundle (or, similarly, Product) can be used as the content of an item element in either Catalog or Discount.
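
To make the effect of substitutionGroup concrete, here is a minimal sketch that lists which top-level elements may appear wherever a given head element is referenced. It rests on several assumptions: the three-line schema is a cut-down stand-in (the Product declaration is not shown in this article and is assumed to mirror ProductBundle), the schema has no target namespace, and abstract or blocked declarations are ignored. The function name substitutable_for is hypothetical.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Cut-down schema: only the top-level element declarations matter here.
# The Product declaration is assumed, by analogy with ProductBundle.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="CatalogItem" type="CatalogItemType"/>
  <xs:element name="Product" type="ProductType"
      substitutionGroup="CatalogItem"/>
  <xs:element name="ProductBundle" type="ProductBundleType"
      substitutionGroup="CatalogItem"/>
</xs:schema>"""


def substitutable_for(schema_root, head):
    """Names of the elements allowed wherever `head` is referenced."""
    allowed = [head]
    pending = [head]
    while pending:
        current = pending.pop()
        # Substitution is transitive, so members of a member also qualify.
        for decl in schema_root.findall(XS + "element"):
            name = decl.get("name")
            if decl.get("substitutionGroup") == current and name not in allowed:
                allowed.append(name)
                pending.append(name)
    return allowed


root = ET.fromstring(SCHEMA)
print(substitutable_for(root, "CatalogItem"))
# prints ['CatalogItem', 'Product', 'ProductBundle']
```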

The complexType definition for ProductBundle is an extension of the base complexType named CatalogItemType. There is, however, a significant difference in the way this inheritance structure is interpreted in UML and in XML Schema. In UML, the order of attributes and association references within a class is not specified, and the features inherited from parent classes freely intermingle with locally defined attributes and associations in a subclass. In XML Schema, however, inherited elements are treated as a group, so the five elements inherited from CatalogItem are an ordered group in a ProductBundle element, followed in sequence by another ordered group of the elements defined by the ProductBundle class. A single group of these six elements (five inherited and one local to the subclass) cannot be defined when using XML Schema extension.
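
The ordering consequence of extension can be checked mechanically. The sketch below is a simplified illustration, not a schema processor: it assumes a schema without a target namespace, handles only sequence content, and trims the wrapper elements (listPrice, supplier, contains) down to empty declarations. The function name content_order is hypothetical. It walks the extension chain and returns the inherited elements first, followed by the locally defined ones.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Trimmed version of the two complexType definitions; the nested content of
# the wrapper elements is omitted because only the element order matters.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="CatalogItemType">
    <xs:sequence>
      <xs:element name="globalIdentifier" type="xs:string"/>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="description" type="xs:string" minOccurs="0"/>
      <xs:element name="listPrice"/>
      <xs:element name="supplier"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="ProductBundleType">
    <xs:complexContent>
      <xs:extension base="CatalogItemType">
        <xs:sequence>
          <xs:element name="contains"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
</xs:schema>"""


def content_order(schema_root, type_name):
    """Element names in the order a valid instance must present them."""
    ctype = schema_root.find(XS + "complexType[@name='" + type_name + "']")
    extension = ctype.find(XS + "complexContent/" + XS + "extension")
    if extension is None:
        inherited, holder = [], ctype
    else:
        # The base type's elements always come first, as one ordered group.
        inherited = content_order(schema_root, extension.get("base"))
        holder = extension
    local = [e.get("name")
             for e in holder.findall(XS + "sequence/" + XS + "element")]
    return inherited + local


root = ET.fromstring(SCHEMA)
print(content_order(root, "ProductBundleType"))
# prints ['globalIdentifier', 'name', 'description', 'listPrice',
#         'supplier', 'contains']
```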

Customizing XML Schema Design
UML provides a foundation for modeling the structure and behavior of most software systems, but some domain- or platform-specific situations require model information beyond what UML can express. This issue is solved through the use of UML extension profiles. A UML profile is defined by three items: stereotypes, tagged values (properties) and constraints. A profile defines these items and explains how they extend the UML in a particular domain, which, in this case, is XML schema design.

A complete definition of the UML profile for XML Schema is beyond the scope of this article, but is included in an appendix of my book, Modeling XML Applications with UML (Addison-Wesley, 2001), and applied in several examples available on the XMLmodeling.com portal. Four stereotypes are introduced in the following example, along with their tagged value properties (the default property value is in bold):

<<XSDsimpleType>> on a UML class:

  • derivation (restriction | list | union)
<<XSDelement>> on a UML attribute or association end:

  • position (integer value) within a sequence model group
  • anonymousType (true | false)
  • anonymousRole (true | false)
  • form (qualified | unqualified)
  • nillable (true | false)
<<XSDattribute>> on a UML attribute or association end:

  • use (prohibited | optional | required | fixed)
  • form (qualified | unqualified)
<<XSDgroup>> on a UML class:

  • no tagged values
Each stereotype is assigned to one or more UML constructs that are modified by the profile extension, and can be further specified by adding one or more properties that refine its meaning or impact on a model. For example, a stereotype assigned to a UML class extends the meaning of a "class" within the profile's domain, and the stereotype's properties are added to the specification of that class in the model.

Using this subset of the UML profile for XML Schema, let's make the following customizations to the CatML model:

  1. The globalIdentifier attributes on CatalogItem and Party should map to XML attributes and have an XML ID data type within the following CatalogItemType complexType definition.
  2. The CatalogItem listPrice attribute and supplier association end should use an anonymous type in the schema, such that the XML tags for Money and Party are omitted from the document instance, and only their content is included.
  3. The definition of Contact should become a <group> definition in the XML Schema, which is then included by reference within the content of Party.
  4. The Money complexType should contain the value for its amount without requiring the <amount> tag in a document.
Here's a conforming XML document:

<CatalogItem globalIdentifier="p-18463">
  <name>Wizard Laptop</name>
  <listPrice currency="USD">2295.00</listPrice>
  <supplier globalIdentifier="s-1742">
    <name>Jones e-Supply</name>
    <city>Boulder</city>
    <state>CO</state>
    <postalCode>80303</postalCode>
  </supplier>
</CatalogItem>
The UML class diagram depicted in "Customizing the XML Schema" (below) includes profile extensions that implement this customization. Rational Rose (used to create these diagrams) does not display tagged value properties on the diagram, although they are included in the model definition; the {anonymousType=true} tag was added to this figure as a plain text annotation for illustration.

By applying the UML profile when generating the schema from this model, the following definition is now obtained for CatalogItem:

<xs:complexType name="CatalogItemType">
  <xs:sequence>
    <xs:element name="name" 
        type="xs:string"/>
    <xs:element name="description" 
        type="xs:string" minOccurs="0" 
        maxOccurs="1"/>
    <xs:element name="listPrice" 
        type="MoneyType"/>
    <xs:element name="supplier" 
        type="PartyType"/>
  </xs:sequence>
  <xs:attribute name="globalIdentifier" 
          type="xs:ID" use="required"/>
</xs:complexType>
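
As a rough illustration of how the profile information steers generation, the following sketch produces a complexType of the same shape as the one above. It is an assumption-laden toy rather than the tool used for this article: the FEATURES list, its dictionary keys and the function name customized_mapping are hypothetical, and only two customizations are handled. A feature stereotyped XSDattribute becomes an xs:attribute, and a class-typed feature tagged anonymousType=true refers directly to the target complexType instead of using a wrapper element. When no use value is given, the sketch falls back to "optional", which is XML Schema's own default.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)

# Hypothetical record of CatalogItem's features with their profile settings.
FEATURES = [
    {"name": "globalIdentifier", "type": "xs:ID",
     "stereotype": "XSDattribute", "use": "required"},
    {"name": "name", "type": "xs:string"},
    {"name": "description", "type": "xs:string", "optional": True},
    {"name": "listPrice", "type": "Money", "anonymousType": True},
    {"name": "supplier", "type": "Party", "anonymousType": True},
]


def q(tag):
    """Qualify a tag name with the XML Schema namespace."""
    return "{" + XS + "}" + tag


def customized_mapping(class_name, features):
    """Build a complexType, honoring XSDattribute and anonymousType."""
    ctype = ET.Element(q("complexType"), {"name": class_name + "Type"})
    seq = ET.SubElement(ctype, q("sequence"))
    for feat in features:
        if feat.get("stereotype") == "XSDattribute":
            # Appended to the complexType itself, so attribute declarations
            # land after the sequence, as XML Schema requires.
            ET.SubElement(ctype, q("attribute"), {
                "name": feat["name"],
                "type": feat["type"],
                "use": feat.get("use", "optional"),
            })
            continue
        element = ET.SubElement(seq, q("element"), {"name": feat["name"]})
        if feat["type"].startswith("xs:"):
            element.set("type", feat["type"])
        elif feat.get("anonymousType"):
            # No wrapper: the role element itself has the target's type.
            element.set("type", feat["type"] + "Type")
        else:
            # Default rule: wrapper element around a reference.
            inner = ET.SubElement(element, q("complexType"))
            inner_seq = ET.SubElement(inner, q("sequence"))
            ET.SubElement(inner_seq, q("element"), {"ref": feat["type"]})
        if feat.get("optional"):
            element.set("minOccurs", "0")
            element.set("maxOccurs", "1")
    return ctype


print(ET.tostring(customized_mapping("CatalogItem", FEATURES),
                  encoding="unicode"))
```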
I modified the model so that the UML class for Money extends an XML Schema data type, and I added the stereotype <<XSDattribute>> to the currency attribute. Using these profile customizations, the following schema is obtained:

<xs:complexType name="MoneyType">
  <xs:simpleContent>
    <xs:extension base="xs:double">
      <xs:attribute name="currency" 
          type="xs:token" use="required"/>
    </xs:extension>
  </xs:simpleContent>
</xs:complexType>
Customizing the XML Schema
This UML diagram shows a subset of the product catalog model with stereotypes added to customize the generated XML Schema. Three of the UML attributes are stereotyped such that they will generate XML attribute definitions in the schema. A tagged value (anonymousType = true) is shown on the supplier end of the association from CatalogItem to Party. The Contact class is marked as a group definition in the schema; its content elements will be included within Party, without an element corresponding to the Contact class itself.

You can download all of these examples as both Rational Rose models and complete XML Schema documents from http://XMLmodeling.com.

Tool Integration
A typical UML tool is a large, complex software package that tends to dominate a developer's work environment. However, several interesting changes are pending that may enable a new generation of configurable modeling components. Foremost among them is the OMG's XMI specification. I've described it in this article as the basis for generating XML schemas from any UML model. This same specification is used to serialize instances of a UML metamodel for exchanging models between different vendors' tools.

Most common UML tools now support an XMI representation of their models. (The UML standard does not, however, completely define diagram graphics for interchange; stay tuned for UML 2.0, which will correct this deficiency.) Some tools, such as the open source ArgoUML, use XMI as their native file storage format.

I envision an XML design tool that operates in loose collaboration with any UML tool that can import and export XMI representations of its models. I've used this XMI format of UML models to create a Web-based tool for bidirectional transformation between UML and XML Schema.

The creation of integrated development environment (IDE) frameworks for tool integration is another interesting trend. The NetBeans framework used by Sun Microsystems' Forte development tool has become popular for sharing component-based tools (see www.netbeans.org). And, last year, IBM released their Eclipse IDE framework, which also enables a multitude of vendors and individual developers to integrate complementary tools within a shared plug-in environment (see www.eclipse.org). Both NetBeans and Eclipse frameworks are open source.

In February 2002, Rational Software announced their new XDE UML product, which is integrated into the Eclipse framework. TogetherSoft and WebGain have also committed to support for Eclipse. I have integrated my XML Schema generation and reverse engineering tool into Eclipse; all of the schema examples in this article were generated using this tool.

The fourth part of this series will describe techniques for reverse engineering an existing XML Schema into a UML model, again using the XMI representation as an intermediary to gain UML tool independence.
