The ABCs of XML, Part 2

This article provides more of what you need to know to survive in the world of connected data by introducing XML’s fundamental concepts to those who have managed to avoid this important technology.

By John T. Sever, President, Cascade Controls

The World Wide Web Consortium's (W3C) XML Recommendation opens with a list of 10 design goals. The first goal states: “XML shall be straightforwardly usable over the Internet.” Straightforward or not, eXtensible markup language (XML) is used extensively over the Internet, and has become a de facto standard for data interchange.

Because of its association with Internet applications, however, automation engineers have sidestepped XML, assuming it has little applicability to their daily work. This is a mistake. Though this technology has been used extensively for Internet-based applications, XML is an extremely simple and flexible data format with untold uses waiting to be uncovered and implemented in industrial controls and automation.


Alone, XML data is simply raw text that has little to offer automation engineers. But XML isn’t alone. Developers everywhere have jumped aboard the XML bandwagon to create a seemingly bottomless reservoir of tools, applications, services, and standards all designed to create, consume, translate, store, and present XML data. This infrastructure of supporting applications is what makes XML such a compelling choice for application data. This article will introduce XML’s fundamental concepts for those who have so far managed to avoid this important technology. Parts 3 and 4 of this four-part article will address XML supporting technologies. [“The ABCs of XML, Part 1” ran in CONTROL, June ’06.]

Not a Typical Language
XML isn’t a language in the sense of having defined keywords, functions, or statements. XML is often compared to hypertext markup language (HTML) because it works well with HTML applications, has similar markup, and has been joined with HTML to create the XHTML specification. However, the HTML specification defines a list of element tags like <body>, <h1>, <b>, and <i> with defined behavior for HTML browsers.

XML has no predefined set of tags; anyone can create their own tags and attributes to suit their application's needs. Instead, the XML specification defines a set of markup rules that must be followed for the marked-up text to be interpreted as XML data.
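To make the idea of an application-defined vocabulary concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree parser. The element and attribute names (Controller, Loop, Setpoint, name, tag, units) are hypothetical, invented for this illustration; nothing in the XML specification defines them. The parser accepts them because the text follows the markup rules, not because it recognizes the names.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary made up for this sketch; XML defines none of these names.
DOC = """<Controller name="Boiler1">
  <Loop tag="FIC-101">
    <Setpoint units="gpm">42.5</Setpoint>
  </Loop>
</Controller>"""

# The parser only checks the markup rules; the meaning of each tag
# is entirely up to the application that reads the data.
root = ET.fromstring(DOC)
setpoint = root.find("./Loop/Setpoint")

print(root.tag, root.get("name"))
print(setpoint.get("units"), float(setpoint.text))
```

Any application that agrees on the same names can read this data, which is what makes the format useful for interchange.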

10 Well-Formed Rules
XML is organized in a logical or physical structure called a “document.” An XML document may be a file on disk, it may be streamed from a server, or it may be hard-coded text inside an HMI VBA application. Though the data may have many different sources, the document metaphor still applies as long as it’s “well formed.” To be considered well formed, an XML document must adhere to the constraints defined in the W3C’s XML specification. These constraints can be distilled into 10 easy rules.

  1. XML is just plain old text
    XML is designed to be human-readable as text. This means any text editor can be used with an XML document. A simple text editor will treat an XML document just as it would an INI file, a CSV file, or any text file.
  2. XML is data
    XML is designed as a flexible self-describing data structure. By itself, XML can’t do anything, nor does it define how data should be processed or handled. By contrast, HTML includes both data and a description of how it should be displayed in a browser. 
  3. XML documents must have one root element
    There can be only one top-level root element in an XML document, and all other elements must be between the root element start and end tags.
  4. XML white space data is preserved
    HTML reduces consecutive white space characters to a single space character. With XML, white space is interpreted as data—just as any other character. 
  5. XML naming rules
    Element names can’t include white space, must start with a letter or underscore (not a digit), and can’t include characters that are used for markup such as <, >, ;, &, among others. It’s generally a good idea to limit element and attribute names to letters, numbers, and underscores.
  6. XML elements must be closed
    An element can be closed with an end tag, or optionally with the shorthand notation for empty elements. By contrast, HTML doesn’t require that elements be closed. In fact, most browsers will attempt to render any HTML element whether or not it’s closed properly. XML is not so forgiving. An empty element is one with no value and no child elements, though it may have attributes. An empty element may be closed with the shorthand notation /> at the end of the start tag. For example, <Value /> is equivalent to <Value></Value>.
  7. XML elements must be properly nested
    HTML browsers have traditionally tolerated overlapping elements like this: <b><i>bold and italic text</b> italic only</i>. This type of element crossing is not allowed in XML. An element that starts inside a parent element must end inside the same parent before the parent element is closed.
  8. XML is case sensitive
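
Several of the rules listed so far can be checked mechanically, because a conforming parser rejects any text that breaks them. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the sample strings are made up for this illustration. It exercises rules 3, 6, 7, and 8 as parse failures, then spot-checks rule 4 and the empty-element shorthand from rule 6.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as an XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Illustrative samples: one valid document and one violation per rule.
samples = {
    "single root, closed, nested": "<Tag><Value>1</Value></Tag>",
    "two root elements (rule 3)": "<Tag></Tag><Tag></Tag>",
    "unclosed element (rule 6)": "<Tag><Value>1</Tag>",
    "overlapping elements (rule 7)": "<b><i>bold and italic</b> italic only</i>",
    "case mismatch (rule 8)": "<Value>1</value>",
}
results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'well formed' if ok else 'rejected'}")

# Rule 4: white space inside an element reaches the application unchanged.
padded = ET.fromstring("<Value>  a   b  </Value>").text

# Rule 6: the shorthand and the start/end tag pair describe the same
# empty element (no text, no children).
short = ET.fromstring("<Value />")
full = ET.fromstring("<Value></Value>")
same_empty = (short.text or "", len(short)) == (full.text or "", len(full))
```

Only the first sample parses; the other four are rejected outright rather than repaired, which is the practical difference from a forgiving HTML browser.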