Namespaces in XML and SXML

Namespaces provide a mechanism for distinguishing the names used in XML documents, so that these names can stay simple and meaningful while remaining unique.

Although the XML Recommendation was introduced without any notion of namespaces, most XML-related technologies rely on them. The Namespaces in XML Recommendation was published by the World Wide Web Consortium in January 1999, shortly after the XML Recommendation itself.

Namespaces in XML

In the data model implied by XML, an XML document contains a tree of elements. Each element has an element type name (sometimes called the tag name) and a set of attributes; each attribute consists of a name and a value. The element type name is generally intended to express the semantic meaning of the element. As discussed in James Clark's XML Namespaces, applications typically use the element type name and the attributes of an element to determine how to process it.

In XML 1.0 without namespaces, element type names and attribute names are unstructured strings using a restricted set of characters, similar to identifiers in programming languages. This is problematic in a distributed environment like the Web, because there is no way of guaranteeing the uniqueness of the element type name, i.e. semantically different elements and attributes can have identical names.

For example, an attribute named "color" may be used to express the fact that the title element should be displayed in red on the computer screen:

<title color="red">...</title>

and an attribute with the same name may be used to specify the color of the element's content when printed on paper:

<title color="red" color="gray16">...</title>

Not only are these two attributes indistinguishable; the element in the latter example is not even well-formed XML, since the XML Recommendation does not allow multiple attributes with identical names in the same element.
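The well-formedness problem can be observed with an ordinary XML parser. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the helper name is_well_formed is introduced here for illustration only and does not come from the original text.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if `text` parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# One "color" attribute: accepted by the parser.
single = '<title color="red">...</title>'
# Two attributes with the same name on one element: rejected by the parser.
duplicate = '<title color="red" color="gray16">...</title>'

print(is_well_formed(single))
print(is_well_formed(duplicate))
```

The parser rejects the second element before any application code gets a chance to interpret either attribute.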

The XML Namespaces Recommendation provides a solution for situations like this by extending the data model so that element type names and attribute names can be qualified with a Uniform Resource Identifier (URI). Every namespace has to be declared using a family of reserved attributes, which bind its URI to a namespace prefix, because a URI cannot be used directly as part of an XML name.

If an element type name or attribute name contains a colon, the part of the name before the colon is treated as a prefix and the part after the colon as a local name. A prefix foo refers to the URI given as the value of the xmlns:foo attribute.

Thus the attributes in the example above can be qualified by different URIs as

<title
   xmlns:display="http://colors.com/display/"
   xmlns:printer="http://colors.com/printer/"
   display:color="red" printer:color="gray16">...</title>

The role of the URI in a name is purely to allow applications to recognize the name. There are no assumptions or guarantees about the resource identified by the URI.
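As a rough illustration of how a namespace-aware application sees such an element, the sketch below parses a title element with two qualified color attributes. It assumes Python's standard xml.etree.ElementTree module, which exposes every qualified name in the form {URI}local-name; the constant names DISPLAY and PRINTER are chosen here for readability only.

```python
import xml.etree.ElementTree as ET

DISPLAY = "http://colors.com/display/"
PRINTER = "http://colors.com/printer/"

doc = (
    '<title xmlns:display="http://colors.com/display/" '
    'xmlns:printer="http://colors.com/printer/" '
    'display:color="red" printer:color="gray16">...</title>'
)

title = ET.fromstring(doc)

# Each qualified attribute is stored under "{URI}local-name",
# so the two "color" attributes no longer collide.
display_color = title.get("{%s}color" % DISPLAY)
printer_color = title.get("{%s}color" % PRINTER)

print(display_color, printer_color)
```

Nothing in this sketch ever retrieves the URIs; they serve only as identifiers, which matches the remark that no assumptions are made about the resource a URI identifies.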

Declaring namespaces through attributes is rather verbose, but the XML Namespaces Recommendation allows these declarations to be inherited. If a prefix foo is used in a name but the element has no xmlns:foo attribute, the value of its parent element's xmlns:foo attribute is used; if the parent has no xmlns:foo attribute either, the value of its grandparent's xmlns:foo attribute is used, and so on. The XML Namespaces Recommendation does not require element type names and attribute names to be qualified, and it provides a means of specifying a default namespace for element names. Unprefixed attributes, however, are not affected by the default namespace. An illustrative example is provided by James Clark in XML Namespaces.
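The inheritance and default-namespace rules can be checked with a small experiment. This is a sketch only, again assuming Python's xml.etree.ElementTree; the document and the example.org URIs are hypothetical and serve purely as opaque identifiers.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: both declarations appear only on the root element.
doc = """
<doc xmlns="http://example.org/default" xmlns:p="http://example.org/p">
  <section>
    <p:item kind="plain" p:kind="qualified"/>
    <item kind="plain"/>
  </section>
</doc>
"""

root = ET.fromstring(doc)
section = root[0]
prefixed_item, unprefixed_item = section[0], section[1]

# The prefix p is declared on <doc>, yet it resolves two levels further down.
print(prefixed_item.tag)
# Unprefixed element names pick up the default namespace ...
print(section.tag, unprefixed_item.tag)
# ... but unprefixed attribute names do not.
print(sorted(prefixed_item.attrib))
```

The unprefixed kind attribute stays a plain name, while p:kind on the same element is qualified by the URI inherited from the root.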

Namespaces in SXML

Since a URI can contain characters that are not allowed in an XML name, it cannot be included in such a name literally. However, any valid URI character is a valid character in a Scheme symbol (and thus in an SXML name). This makes it possible to qualify SXML names directly, using URIs as the qualifiers of local names in universal names. For example:

<title
   http://colors.com/display/:color="red"
   http://colors.com/printer/:color="gray16">...</title>

That is, the rightmost colon in an SXML name separates the local name from the namespace URI. Although such long SXML names look cumbersome when written out, they are memory-efficient as a data structure because they are Scheme symbols. No matter how long the name of a symbol is, it is stored just once, in a symbol table; all other occurrences of the symbol are merely references to the corresponding slot in that table. Such a representation follows the idea of the Namespaces Recommendation, which says: "Note that the prefix functions only as a placeholder for a namespace name. Applications should use the namespace name, not the prefix, in constructing names whose scope extends beyond the containing document."
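The rightmost-colon rule is easy to state in code. The sketch below uses Python strings as a stand-in for Scheme symbols; the function name split_sxml_name is hypothetical, and sys.intern is only a rough analogue of a Scheme symbol table, not a description of how any SXML implementation stores names.

```python
import sys


def split_sxml_name(name):
    """Split an SXML name at its rightmost colon.

    Returns (namespace, local_name); namespace is None for an unqualified name.
    """
    namespace, colon, local = name.rpartition(":")
    if not colon:
        return None, local
    return namespace, local


print(split_sxml_name("http://colors.com/display/:color"))
print(split_sxml_name("title"))

# Rough analogue of a symbol table: equal interned strings share one object,
# so a long qualified name is stored once however often it occurs.
prefix = "http://colors.com/display/"
a = sys.intern(prefix + ":color")
b = sys.intern(prefix + ":color")
print(a is b)
```

Because the split happens at the last colon, the colons inside the URI itself (as in "http:") cause no ambiguity.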

Besides the direct way of qualifying names with URIs, SXML supports the concept of namespace-ids, which are quite similar to XML namespace prefixes. Like a prefix, a namespace-id stands for a namespace URI. The distinctive feature of a namespace-id is that there is a 1-to-1 correspondence between namespace-ids and the corresponding namespace URIs. This is generally not true for XML namespace prefixes and namespace URIs: different XML prefixes may specify the same namespace URI, and XML namespace prefixes may be redefined in child elements.
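Both ways in which XML prefixes fail to be 1-to-1 can be demonstrated in a few lines. This is an illustrative sketch assuming Python's xml.etree.ElementTree; the document and its example.org URIs are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: prefixes x and y both name the same URI, and the
# prefix x is later redefined on a child element.
doc = """
<root xmlns:x="http://example.org/one" xmlns:y="http://example.org/one">
  <x:a/>
  <y:a/>
  <inner xmlns:x="http://example.org/two">
    <x:a/>
  </inner>
</root>
"""

root = ET.fromstring(doc)
first, second, inner = root[0], root[1], root[2]

# Different prefixes, same URI: the names are identical after resolution.
print(first.tag == second.tag)
# Same prefix, different URI after redefinition: the names differ.
print(first.tag == inner[0].tag)
```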

A namespace-id is thus a shortcut for a namespace URI in SXML names. The association between namespace-ids and namespace URIs is defined in the administrative node *NAMESPACES*, which is located before the document element:

(*TOP*
  (@@
    (*NAMESPACES*
      (rdf "http://www.w3.org/1999/02/22-rdf-syntax-ns#")
      (dc "http://purl.org/dc/elements/1.1/")))
  (*PI* xml "version=\"1.0\"")
  (rdf:RDF
    (rdf:Description
      (dc:creator "Karl Mustermann")
      (dc:title "Algebra")
      (dc:subject "mathematics")
      (dc:date "2000-01-23")
      (dc:language "EN")
      (dc:description "An introduction to algebra"))))

The set of examples below illustrates the relationship between XML prefixes, SXML with directly qualified names, and SXML with namespace-ids. The sample document contains a resource description expressed in the Resource Description Framework (RDF). RDF has its own namespace, for which the rdf: prefix is typically used. Consider a situation in which another namespace URI, "http://www.resources-of-different-family.com", is also used in the resource being described. Since "rdf" is a natural abbreviation for this URI as well, rdf: may end up being used as its prefix too. In XML this causes a kind of collision: the same prefix is used for two different URIs, and prefix redeclarations are required:

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://www.w3.org/TR/rdf-syntax-grammar">
    <rdf:editor xmlns:rdf="http://www.resources-of-different-family.com">
      <rdf:Description xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
        <rdf:fullName xmlns:rdf="http://www.resources-of-different-family.com"
        >Dave Beckett</rdf:fullName>
      </rdf:Description>
    </rdf:editor>
  </rdf:Description>
</rdf:RDF>
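To see what a namespace-aware parser makes of these redeclarations, the document can be parsed and its resolved element names listed. This is a sketch assuming Python's xml.etree.ElementTree, which writes resolved names as {URI}local-name; the constants RDF and RODF are shorthand introduced here.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RODF = "http://www.resources-of-different-family.com"

doc = """\
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://www.w3.org/TR/rdf-syntax-grammar">
    <rdf:editor xmlns:rdf="http://www.resources-of-different-family.com">
      <rdf:Description xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
        <rdf:fullName xmlns:rdf="http://www.resources-of-different-family.com"
        >Dave Beckett</rdf:fullName>
      </rdf:Description>
    </rdf:editor>
  </rdf:Description>
</rdf:RDF>
"""

root = ET.fromstring(doc)

# Element names in document order, with every prefix resolved to its URI.
names = [element.tag for element in root.iter()]
for name in names:
    print(name)
```

Although every element is written with the rdf: prefix, the resolved names alternate between the two URIs, which is exactly the information a reader has to reconstruct by tracking the redeclarations.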

In SXML with directly qualified names the document reads clearly, although at the price of a verbose written representation:

(*TOP*
  (*PI* xml "version=\"1.0\"")
  (http://www.w3.org/1999/02/22-rdf-syntax-ns#:RDF
    (http://www.w3.org/1999/02/22-rdf-syntax-ns#:Description
      (@
        (http://www.w3.org/1999/02/22-rdf-syntax-ns#:about
          "http://www.w3.org/TR/rdf-syntax-grammar"))
      (http://www.resources-of-different-family.com:editor
        (http://www.w3.org/1999/02/22-rdf-syntax-ns#:Description
          (http://www.resources-of-different-family.com:fullName
            "Dave Beckett"))))))

SXML with namespace-ids requires two different namespace-ids and provides a 1-to-1 relationship between namespace-ids and URIs:

(*TOP*
  (@@
    (*NAMESPACES*
      (rdf "http://www.w3.org/1999/02/22-rdf-syntax-ns#")
      (rodf "http://www.resources-of-different-family.com")))
  (*PI* xml "version=\"1.0\"")
  (rdf:RDF
    (rdf:Description
      (@ (rdf:about "http://www.w3.org/TR/rdf-syntax-grammar"))
      (rodf:editor
        (rdf:Description
          (rodf:fullName "Dave Beckett"))))))
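The correspondence between the namespace-id form and the directly qualified form can be sketched as a simple tree rewrite. The code below models SXML as nested Python lists whose first item is the node name; this modelling, and the function names namespace_table, expand and expand_document, are assumptions made for illustration and are not part of any SXML toolkit.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RODF = "http://www.resources-of-different-family.com"

# SXML modelled as nested lists: first item is the name, the rest are children.
sxml = ["*TOP*",
        ["@@", ["*NAMESPACES*", ["rdf", RDF], ["rodf", RODF]]],
        ["*PI*", "xml", 'version="1.0"'],
        ["rdf:RDF",
         ["rdf:Description",
          ["@", ["rdf:about", "http://www.w3.org/TR/rdf-syntax-grammar"]],
          ["rodf:editor",
           ["rdf:Description",
            ["rodf:fullName", "Dave Beckett"]]]]]]


def namespace_table(top):
    """Collect the ns-id -> URI associations from the *NAMESPACES* node."""
    for node in top[1:]:
        if isinstance(node, list) and node[0] == "@@":
            for aux in node[1:]:
                if aux[0] == "*NAMESPACES*":
                    return {ns_id: uri for ns_id, uri in aux[1:]}
    return {}


def expand(node, table):
    """Return a copy of `node` with namespace-ids replaced by their URIs."""
    if not isinstance(node, list):
        return node  # text content is left untouched
    ns_id, colon, local = node[0].rpartition(":")
    name = table[ns_id] + ":" + local if colon and ns_id in table else node[0]
    return [name] + [expand(child, table) for child in node[1:]]


def expand_document(top):
    """Drop the @@ node and qualify every name in the document directly."""
    table = namespace_table(top)
    body = [n for n in top[1:] if not (isinstance(n, list) and n[0] == "@@")]
    return [top[0]] + [expand(n, table) for n in body]


table = namespace_table(sxml)
# 1-to-1 correspondence: no two namespace-ids share a URI.
one_to_one = len(set(table.values())) == len(table)
expanded = expand_document(sxml)
print(one_to_one)
print(expanded[2][0])
```

Under these assumptions the rewrite is a pure lookup: because namespace-ids are 1-to-1 with URIs and are declared once for the whole document, no scoping has to be tracked while walking the tree.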

While in a generic XML document a single namespace prefix may be associated with two or more different URIs, the majority of "real life" XML documents never redeclare a prefix with a different URI.

Such documents provide a 1-to-1 relation between namespace prefixes and URIs, and their representations using XML namespace prefixes and SXML namespace-ids are very similar. However, in SXML the namespace URIs themselves can be used instead of namespace-ids, and this option may be employed for better performance and more compact data representation.

Links:

James Clark. XML Namespaces.

Namespaces in XML. W3C Recommendation.

SXML Specification.

Namespaces in XML and SXML. RDLJ 2003-3