January 01, 2002
Taming the XML Beast, Part II
Last week, I introduced the basics of XML: I discussed how XML is about data rather than presentation, and I covered the basics of XML syntax, including elements and attributes. This week, we'll look more deeply at XML: its demand for syntactical rules, how you can gain control over your XML documents using document type definitions, and how to add style using style sheets.
Well-Formed XML Documents
In order for an XML document to be properly written (in XML jargon, well formed), the document must adhere to specific rules. I'll introduce the general rules here, and then follow up with details.
Let's look at these rules in detail.
The XML Declaration
When writing an XML document, you should include the XML declaration as the very first line of the document:
The
A Unique Root Element
XML documents must contain exactly one root element. A familiar example of this is found in XHTML, which is an application of XML. In that case, the root is the <html> element.
Close All Non-Empty Elements
In HTML, many people insert
Terminate Empty Tags
Not all elements contain content. For example, the HTML
Nesting Symmetry
An element's content may contain other elements. But you must close each start tag with its corresponding end tag in the reverse order in which the tags were opened (first opened, last closed). Back to our element container analogy, you can put other containers inside a container, but you have to close each inner container with its corresponding lid before you can close the outer containers:
An incorrect example:
A well-formed example:
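Here's a rough sketch of how the rules so far play out in practice. It assumes Python's standard-library parser as a stand-in for whatever XML parser you use, and the helper name is_well_formed and the element names in the fragments are hypothetical, chosen only for illustration:

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the standard-library parser accepts xml_text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical fragments; each one isolates a single rule.
cases = {
    "two root elements": "<listing/><listing/>",
    "unclosed element": "<listing><name>Ann</listing>",
    "unterminated empty tag": "<listing><br></listing>",
    "overlapping tags": "<b><i>text</b></i>",
    "terminated empty tag": "<listing><br/></listing>",
    "properly nested tags": "<b><i>text</i></b>",
}

for label, fragment in cases.items():
    verdict = "well formed" if is_well_formed(fragment) else "rejected"
    print(f"{label}: {verdict}")
```

The first four fragments are rejected and the last two are accepted. Note that the parser doesn't try to repair anything: a single violation makes the whole document unusable.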
Quoted Attribute Values
Unlike HTML attributes, attribute values in XML must be quoted, either by a pair of double quotes or a pair of single quotes.
Unacceptable non-quoted attribute: <table width=400>
Correctly quoted attribute: <table width="400"> or <table width='400'>
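To see the quoting rule enforced, here's a small sketch, again assuming Python's standard-library parser. The table elements are written as empty elements (with a trailing slash) so that each fragment is a complete document on its own:

```python
import xml.etree.ElementTree as ET

# Both quoting styles are accepted and yield the same attribute value.
for markup in ('<table width="400"/>', "<table width='400'/>"):
    print(ET.fromstring(markup).get("width"))  # prints 400 each time

# The unquoted form is not well formed, so the parser refuses it.
try:
    ET.fromstring("<table width=400/>")
    unquoted_accepted = True
except ET.ParseError:
    unquoted_accepted = False

print(unquoted_accepted)  # prints False
```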
Character Entities
Whenever the XML parser encounters certain characters, such as the < and > symbols, it interprets them as markup. So to use these symbols in your content text, you have to use their entity references. Most HTML developers are familiar with the non-breaking space entity, &nbsp;. In XML, only five character entities have been predefined:
&gt;    >    greater than
&lt;    <    less than
&amp;   &    ampersand
&apos;  '    apostrophe
&quot;  "    double quote
A Well-Formed Addressbook: XML Document Sample
With these basic rules under our belt, let's examine a well-formed XML document:
The very first line is the XML declaration. Following the XML declaration is the root element <address_book>. The root element appears only once, and everything else in the document is contained within this element. An XML document can also have processing instructions, which would appear outside the root element, following the XML declaration in a fashion similar to an HTML document's <head> section. Each subsequent element begins with a start tag, contains some content (either data or nested elements), and ends with a closing tag.
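As a minimal sketch of that structure, the snippet below parses a hypothetical address book with Python's standard-library parser. Only address_book and listing come from this article; the name and email elements and all of the data are assumptions made up for illustration. Two of the predefined entities appear in the names to show how the parser resolves them:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the XML declaration comes first, then one root element.
ADDRESS_BOOK = """<?xml version="1.0"?>
<address_book>
  <listing>
    <name>Smith &amp; Jones</name>
    <email>info@example.com</email>
  </listing>
  <listing>
    <name>Pat O&apos;Neil</name>
    <email>pat@example.com</email>
  </listing>
</address_book>"""

root = ET.fromstring(ADDRESS_BOOK)

# Entity references are replaced by the characters they stand for.
names = [listing.findtext("name") for listing in root.findall("listing")]

print(root.tag)  # prints address_book
print(names)     # prints ['Smith & Jones', "Pat O'Neil"]
```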
Valid XML Documents
A well-formed XML document may be fine for standalone pages, but to make real use of XML, you'll want to specify your own guidelines. These guidelines describe the elements your XML must contain, the sequence of those elements, and what those elements may contain. This is done using a DTD (document type definition). When an XML document follows the basic XML rules for well-formedness and the rules of its specified DTD, it is said to be a valid XML document.
Why bother with a DTD? Well, let's say I want to share my address book with a friend. He wants to merge my data into his own address book. To do that, we must both share the same tag set so the data can be used in exactly the same way. So we would have to work together to come up with a DTD that would work for both of us. That's another great advantage of XML: with an accepted DTD, different parties can share and exchange data regardless of the application used to process that data.
Writing DTDs can end up being a fairly complex process. I'm providing just a small sample below. You can imagine that the more details you want to have in your DTD, the longer and more involved it becomes. You can learn more about writing DTDs in detail at one of the tutorial sites included in the sidebar of this article.
Each element of the XML document is explicitly defined with its element name, and the contents it may contain are specified in the parentheses. Elements such as address_book may contain one or more listing elements, as denoted by the plus sign (+). Elements listed as content must appear in the order and frequency indicated in the element definition. #PCDATA (parsed character data) simply means the element contains text.
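Here's a rough sketch of what such element definitions look like and how a parser sees them. It assumes a hypothetical DTD written inline in the document (an internal subset) and uses the expat parser bundled with Python. The address_book and listing definitions follow the description above; name and email are made-up child elements:

```python
import xml.parsers.expat

# Hypothetical document carrying its DTD in an internal subset.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE address_book [
  <!ELEMENT address_book (listing+)>
  <!ELEMENT listing (name, email)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT email (#PCDATA)>
]>
<address_book>
  <listing><name>Ann</name><email>ann@example.com</email></listing>
</address_book>"""

declared = []


def on_element_decl(name, model):
    # Called once per <!ELEMENT ...> definition; model describes the
    # allowed content and is ignored in this sketch.
    declared.append(name)


parser = xml.parsers.expat.ParserCreate()
parser.ElementDeclHandler = on_element_decl
parser.Parse(DOCUMENT, True)

print(declared)  # prints ['address_book', 'listing', 'name', 'email']
```

The parser reports each definition in the order it appears, which is a handy way to check that a DTD says what you think it says.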
While IE5 checks to make sure XML documents are well formed and checks the syntax of your DTD, it doesn't validate XML documents. You would need to install a third-party XML parser to validate your XML document against your DTD.
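The gap between well formed and valid is easy to demonstrate. In this sketch, a hypothetical DTD requires every listing to contain a name followed by an email, yet the document leaves the name out. Python's standard-library parser, which (as far as I know) doesn't validate against a DTD either, accepts it without complaint:

```python
import xml.etree.ElementTree as ET

# Well formed, but NOT valid: the listing is missing its required <name>.
INVALID_BUT_WELL_FORMED = """<?xml version="1.0"?>
<!DOCTYPE address_book [
  <!ELEMENT address_book (listing+)>
  <!ELEMENT listing (name, email)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT email (#PCDATA)>
]>
<address_book>
  <listing><email>ann@example.com</email></listing>
</address_book>"""

# No error is raised, because only well-formedness is checked.
root = ET.fromstring(INVALID_BUT_WELL_FORMED)
name_missing = root.find("listing/name") is None

print(name_missing)  # prints True
```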
Adding a Little Style
The XML output on IE5 is great for outlining the tree structure of a document, but it isn't really the way you'll be displaying your data to users. XML only defines the structure of your data. Remember: XML is just about the data. If you want to modify the way your data is presented on a page, you'll need to use a style sheet.
CSS should be familiar enough to most people. The only item that needs further explanation is display:block. In terms of formatting, elements are rendered either "inline," as part of a string of characters (like the HTML anchor element), or as "blocks," where elements such as the HTML <p> appear as separate blocks. Since XML itself declares no formatting whatsoever, the data would be displayed as one long string, one element right after another. So we need to style certain elements to be "block" level, in essence causing a hard return at the end of the line.
Here's a sample XSL style sheet for my address book:
In this XSL sample, we are using XSL to define an HTML-based template. Within this template, we've added a page title and header. Then for each
There! We now have a well-formed XML document styled with XSL!
XML & Beyond
Of course, there's a lot more to XML than what I've covered in this article. There are XML-based technologies used to build XML applications, such as XSLT (XSL Transformations), and XLink, which allows you to make any element a hyperlink.
Then there are applications created from XML to enhance Web development, such as SVG (Scalable Vector Graphics) and WML (Wireless Markup Language). XML DTDs have been created to be industry-standard data structures, such as MathML for creating complex mathematical formulas, or VoiceXML, used for making Internet information available via phone and voice.
But just because XML is far-reaching doesn't mean it's difficult to understand. Learning how to code XML is really nothing more than remembering a few syntax rules and knowing your data intimately. Once you've mastered XML syntax and data, the possibilities of what you can create using XML are endless!
Bonnie is a technical writer who designs and develops Web sites and creates system documentation.