The ABCs of XML, Part 2

This article provides more of what you need to know to survive in the world of connected data by introducing XML’s fundamental concepts to those who have managed to avoid this important technology.

Page 2 of 3 1 | 2 | 3 View on one page

In XML, the tag <recipe> isn’t the same as <Recipe>. HTML, however, isn’t case sensitive, so <h1> is identical to <H1>.
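
The case rule can be checked with any XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the sample element names are illustrative assumptions, not part of any standard.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Start and end tags that differ only in case do not match in XML.
matching = is_well_formed("<recipe>Feed</recipe>")
mismatched = is_well_formed("<recipe>Feed</Recipe>")

# Element names that differ only in case are two distinct names.
root = ET.fromstring("<Book><recipe/><Recipe/></Book>")
child_names = [child.tag for child in root]
```

Here the first document parses, the second is rejected because </Recipe> does not close <recipe>, and the parser reports recipe and Recipe as separate element names.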

  • XML attribute names must be unique within an element
    An element may have any number of attributes but each attribute name must be unique. The following example is incorrect:
    <People Person="John Doe" Person="John Smith" />
    This example could be structured properly as follows:
    <People>
    <Person>John Doe</Person>
    <Person>John Smith</Person>
    </People>
    Or as:
    <People>
    <Person Name="John Doe"/>
    <Person Name="John Smith"/>
    </People>
  • XML attribute values must be quoted
    XML attribute values must be enclosed in either single quotes or double quotes. If an attribute value contains a double-quote character, enclose the value in single quotes. Likewise, if the value contains a single-quote character, enclose it in double quotes. For attribute values that may include both types of quotation character, XML’s predefined entities may be used: &quot; for the double-quote character and &apos; for the single-quote character.
  • Comments
    The markup for a comment in XML is identical to that in HTML. A comment opens with <!-- and closes with --> and may span multiple lines.
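
The three rules in the list above can be exercised with a parser. This is a rough sketch using Python's standard xml.etree.ElementTree module; the Note element and its Text attribute are made-up names used only for illustration.

```python
import xml.etree.ElementTree as ET

# Unique names: a repeated attribute name is rejected by the parser.
try:
    ET.fromstring('<People Person="John Doe" Person="John Smith" />')
    duplicate_accepted = True
except ET.ParseError:
    duplicate_accepted = False

# The restructured form parses, and each person is a separate element.
people = ET.fromstring(
    "<People>"
    '<Person Name="John Doe"/>'
    '<Person Name="John Smith"/>'
    "</People>"
)
names = [person.get("Name") for person in people]

# Quoting: either quote style works, and the predefined entities
# cover values that contain both kinds of quote.
single = ET.fromstring("<Note Text='She said \"hi\"'/>").get("Text")
double = ET.fromstring('<Note Text="It\'s done"/>').get("Text")
both = ET.fromstring('<Note Text="It&apos;s &quot;done&quot;"/>').get("Text")

# Comments: they may span lines, and this parser skips them by default.
commented = ET.fromstring(
    "<People><!-- first line\nsecond line --><Person/></People>"
)
child_tags = [child.tag for child in commented]
```

In each case the parser hands back the attribute value with the surrounding quotes removed and the entities already resolved, so the application never sees &quot; or &apos; directly.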

    XML Declaration
    An XML document may begin with an optional XML declaration. When present, the declaration must precede all other content, including comments and whitespace, and it carries no document data of its own. It’s used to provide information to XML processors about the document's content. Because the declaration is not an element, it must not have a closing tag. The declaration looks like this:

    <?xml version="1.0" encoding="utf-8" standalone="yes"?>

    If the declaration is included, version is the only required attribute, and must have a value of either 1.0 or 1.1. Version 1.1 supports special Unicode character handling functionality that’s rarely needed, and, therefore, version 1.0 is used almost exclusively.

    The encoding attribute defines the character encoding used by the document, so that an XML processor may properly parse the document. When no encoding is declared and no byte-order mark indicates otherwise, XML processors assume UTF-8.
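
To illustrate why the encoding attribute matters, here is a small sketch using Python's standard xml.etree.ElementTree module. The Name element and the text "Café" are invented sample data, and the xml_declaration argument to tostring assumes Python 3.8 or later.

```python
import xml.etree.ElementTree as ET

# The declaration names ISO-8859-1, so the parser decodes the bytes correctly.
latin1_doc = (
    '<?xml version="1.0" encoding="iso-8859-1"?><Name>Café</Name>'
).encode("iso-8859-1")
declared_text = ET.fromstring(latin1_doc).text

# With no declaration, the bytes are read as UTF-8.
utf8_doc = "<Name>Café</Name>".encode("utf-8")
default_text = ET.fromstring(utf8_doc).text

# The same ISO-8859-1 bytes without a declaration are not valid UTF-8,
# so the parser rejects them.
try:
    ET.fromstring("<Name>Café</Name>".encode("iso-8859-1"))
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False

# When serializing, the declaration can be requested explicitly.
serialized = ET.tostring(ET.Element("Name"), encoding="utf-8", xml_declaration=True)
```

The serialized bytes begin with an XML declaration that this library generates itself; its exact quoting style may differ from the hand-written example above, which is fine because XML accepts either kind of quote.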

    Processing Instructions
    Any number of processing instructions may appear below the XML declaration and before the root element. Each must be enclosed in <? and ?>, like the XML declaration, and provides application-specific handling information. For example, a Microsoft Word 2003 XML document may include a processing instruction that tells the Windows operating system to identify the XML document as an MS Word file. When double-clicked, an XML file with this processing instruction opens in MS Word. The processing instruction looks like this:

    <?mso-application progid="Word.Document"?>
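
An application can read processing instructions as ordinary nodes. The sketch below uses Python's standard xml.dom.minidom module; the Document root element is a hypothetical stand-in, not the real Word 2003 root element.

```python
from xml.dom import minidom

# Hypothetical document: only the processing instruction is taken
# from the article; the root element name is made up.
doc_text = (
    '<?xml version="1.0" encoding="utf-8"?>'
    '<?mso-application progid="Word.Document"?>'
    "<Document/>"
)
doc = minidom.parseString(doc_text)

# Collect (target, data) pairs for instructions that precede the root.
instructions = [
    (node.target, node.data)
    for node in doc.childNodes
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
]
root_name = doc.documentElement.tagName
```

The XML declaration does not show up in this list: although it uses the same <? and ?> delimiters, parsers treat it separately from processing instructions. The part after the target name (here progid="Word.Document") is passed through as a plain string for the application to interpret.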

    XML Validation
    An XML document has a specific structure of element names, attribute names, and hierarchical parent-child relationships. As long as a document meets the requirements for “well-formedness,” it can have any structure and contain any data. This flexibility is what makes XML extensible.

    However, applications that interpret XML documents expect the XML to adhere to a particular structure. Validation is the process of checking an XML document for conformance to a defined structure, or schema. A schema can be defined in the XML document itself, or the document can reference an external schema document. There are multiple standards for defining a schema, including Document Type Definition (DTD), XML Schema Definition (XSD) language, and XML Data Reduced (XDR). An XML document that adheres to a defined schema is judged to be “valid.”

    A schema isn’t required when developing XML applications, and, in fact, can significantly complicate XML application development.  When you control a document's content and related applications, you can work more efficiently without a schema.

    Software vendors that support XML data normally publish a schema, so other applications can properly validate content before working with a document. A control system vendor that supports import of XML data into the control system will likely validate a document before the import process to prevent loading data that may lead to a control system fault.
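
The difference between "well-formed" and "valid" can be shown without a schema language. Python's standard library does not include a DTD or XSD validator, so the sketch below substitutes a hand-written structural check for real schema validation. The function name check_recipe and the rules it enforces (a Recipe root, at least one Step, a Name in every Step) are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET


def check_recipe(xml_text):
    """Return a list of structural problems; an empty list means the
    document satisfies the hypothetical rules encoded below."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # Not even well-formed, so structure cannot be checked.
        return [f"not well-formed: {exc}"]

    problems = []
    if root.tag != "Recipe":
        problems.append(f"root element is <{root.tag}>, expected <Recipe>")
    steps = root.findall("Step")
    if not steps:
        problems.append("<Recipe> has no <Step> children")
    for index, step in enumerate(steps, start=1):
        if step.find("Name") is None:
            problems.append(f"<Step> {index} is missing a <Name> child")
    return problems


valid_doc = "<Recipe><Step><Name>Feed:1</Name></Step></Recipe>"
invalid_doc = "<Recipe><Step/></Recipe>"  # well-formed, but breaks the rules
valid_problems = check_recipe(valid_doc)
invalid_problems = check_recipe(invalid_doc)
```

Both sample documents are well-formed, but only the first passes the structural rules. A real schema expresses rules like these declaratively, which is what lets a vendor publish them for other applications to check against.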

    Namespaces
    XML namespaces solve a problem that can occur when an element name has different meanings within a single document. For example, the element name template is an XSLT keyword, and its meaning is different from that of the template element used in an MS Word XML document. All elements and attributes in an XML document are included in a namespace, even if a namespace isn’t explicitly declared. When no namespace is defined in a document, content is included in the default null namespace. A namespace may be defined as an attribute of the start tag of an element with the following format:

    xmlns:prefix="namespaceURI"

    Where a namespace is declared for an element, all child elements with the same prefix are included in the same namespace. The element where the namespace is declared may also be included in the namespace if the same prefix is used in the element name.

    <cc:Recipe xmlns:cc="http://www.cascon.com/Recipe">
       <cc:Step cc:XPos="600" cc:YPos="600" AcquireUnit="yes">
          <cc:Name>Feed:1</cc:Name>
          <UnitAlias>Unit500</UnitAlias>
       </cc:Step>
    </cc:Recipe>

    In the sample above, the prefix cc refers to the namespace http://www.cascon.com/Recipe. The elements in this namespace are Recipe, Step, and Name, and the attributes XPos and YPos belong to it as well. The element UnitAlias and the attribute AcquireUnit are included in the default null namespace.
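
A namespace-aware parser makes this membership visible. The following sketch parses the sample with Python's standard xml.etree.ElementTree module, which writes a qualified name as {namespaceURI}localname; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

NS = "http://www.cascon.com/Recipe"

sample = """
<cc:Recipe xmlns:cc="http://www.cascon.com/Recipe">
   <cc:Step cc:XPos="600" cc:YPos="600" AcquireUnit="yes">
      <cc:Name>Feed:1</cc:Name>
      <UnitAlias>Unit500</UnitAlias>
   </cc:Step>
</cc:Recipe>
"""

root = ET.fromstring(sample.strip())
step = root[0]

# Prefixed names are expanded to the namespace URI; unprefixed names
# stay bare because the sample declares no default namespace.
root_tag = root.tag
child_tags = [child.tag for child in step]
attribute_names = sorted(step.attrib)

# Lookups map a prefix of the caller's choosing to the namespace URI.
name_text = root.findtext("cc:Step/cc:Name", namespaces={"cc": NS})
```

The prefix itself carries no meaning: the lookup at the end would work equally well with any other prefix mapped to the same URI, because only the URI identifies the namespace.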
