In a message to the ietf-xml-use mailing list, James Clark
makes a detailed case for giving RELAX NG at least "equal
billing" with the W3C XML schema language as a recommended
formalism in IETF protocols using XML.
Responding to a recently updated IETF Internet-Draft,
Guidelines for the Use of XML within IETF Protocols
(draft-hollenbeck-ietf-xml-guidelines-04.txt), Clark opens
his message as follows:
I just had a look at
draft-hollenbeck-ietf-xml-guidelines-04. Section 4.6 says
"XML Schema should be used as the formalism in the
absence of clearly stated reasons to choose another." I
strongly disagree with this
recommendation.
I believe RELAX NG is
preferable in many situations to XML Schema and should
receive at least equal billing. Concretely, I propose in
the sentence above changing "XML Schema" to "XML Schema
or RELAX NG".
[...] I don't think RELAX NG
is just another mechanism. It is a solid, mature and
stable specification. It has been developed in an open
standards process (in OASIS). It has multiple,
independent and interoperable implementations. It is
based on a solid body of CS theory (tree automata). It is
on track to become a fully-fledged International
Standard: it recently went out as a Draft International
Standard.
In the remainder of the message, Clark details a variety
of reasons for preferring RELAX NG as a language to
"communicate unambiguously and precisely to a human reader
what XML documents are legal for [a particular XML
application]". He notes that RELAX NG:
- is designed to be simple and easy to understand; the
  specification for the W3C XML schema language, in
  contrast, is far from being simple or easy to understand
- includes a normative, formal description of the
  semantics of a RELAX NG schema and has a solid basis in
  tree automata theory; the W3C XML schema language has no
  such basis
- integrates attributes into content models; the W3C XML
  schema language's support for attributes is totally
  inadequate and provides no advance over DTDs
- provides strong support for unordered content; the W3C
  XML schema language provides very weak support for
  unordered content
- provides a modular approach to use of datatypes; the
  W3C XML schema language is totally lacking in
  modularity -- tied to the single collection of
  datatypes defined in its specification
- has a clear, unambiguous notion of validity; the W3C
  XML schema language does not: for example, it provides
  no way to specify what is allowed as the root element
  of a document instance
- treats validation as a process with two independent
  inputs: a schema and an instance to be validated with
  respect to the schema; the W3C XML schema language does
  not provide a similar clean separation between the two
- never changes the information that an application
  receives -- it specifies purely what is valid and what
  is invalid; the W3C XML schema language, in contrast,
  supports infoset augmentation
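Three of the points above (a schema that names the allowed root element, attributes that take part in content models, and validation that only answers "valid" or "invalid") can be illustrated with a small Python sketch. This is not RELAX NG and does not use any RELAX NG implementation: the tuple-based pattern format and the names `matches`, `validate`, `CARD_SCHEMA`, and `is_valid` are hypothetical, invented here purely for illustration.

```python
import xml.etree.ElementTree as ET

# Toy pattern language, loosely inspired by the ideas in the list above.
# It is NOT RELAX NG; every name here is a hypothetical illustration.
#   ("attribute", n)  : element has exactly one attribute n and no children
#   ("child", n)      : element has no attributes and exactly one child n
#   ("choice", a, b)  : element matches pattern a or pattern b


def matches(pattern, elem):
    """Return True if the element's attributes and children fit the pattern."""
    kind = pattern[0]
    children = [child.tag for child in elem]
    if kind == "attribute":
        return set(elem.attrib) == {pattern[1]} and not children
    if kind == "child":
        return not elem.attrib and children == [pattern[1]]
    if kind == "choice":
        return any(matches(alternative, elem) for alternative in pattern[1:])
    raise ValueError(f"unknown pattern kind: {kind}")


def validate(schema, instance):
    """Two independent inputs in, a plain boolean out; the instance is only read."""
    root_tag, pattern = schema
    return instance.tag == root_tag and matches(pattern, instance)


# The schema names the allowed root ("card") and lets the name be given
# either as an attribute or as a child element, within one content model.
CARD_SCHEMA = ("card", ("choice", ("attribute", "name"), ("child", "name")))


def is_valid(xml_text):
    """Parse the text and validate it against CARD_SCHEMA."""
    return validate(CARD_SCHEMA, ET.fromstring(xml_text))
```

Under these assumptions, `<card name="J"/>` and `<card><name>J</name></card>` are both accepted, a card carrying both forms is rejected, and so is any document whose root is not `card`. The parsed instance is the same before and after the call, which is the sense in which validation here adds nothing to what the application receives.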
In a followup to Clark's posting, Tim Bray concurs,
saying that "to the extent that there is an IETF style of
doing things, RNG is right there." That statement will
probably ring true to many people familiar with the IETF
and how its history of RFCs (Requests for Comments) --
developed by small groups of technical experts and reviewed in
an open, distributed fashion -- contrasts with the W3C's
institutional process. A couple of excerpts from the Free
On-Line Dictionary of Computing entry for RFC seem relevant:
The RFC tradition of
pragmatic, experience-driven, after-the-fact standard
writing done by individuals or small working groups has
important advantages over the more formal,
committee-driven process typical of ANSI or
ISO.
The RFCs are most
remarkable for how well they work -- they manage to have
neither the ambiguities that are usually rife in informal
specifications, nor the committee-perpetrated misfeatures
that often haunt formal standards.