James Clark, the first recipient of the IDEAlliance XML Cup Award and opening keynote speaker at XML 2001, gave a lively description of the five challenges facing the XML community.
1) Make progress without compromising the strength of XML
Clark described the main strength of XML as its diversity. XML "does everything from
encoding SOAP procedure calls with a lifetime of a few milliseconds up to encoding
Buddhist religious texts with a lifetime of millenniums," a power which emerges from its simplicity: "XML doesn't do much by itself… to me XML is a syntax
for encoding a labeled tree". Clark urged XML's keepers to keep this original simplicity
and generality by "separating out functionality that is general purpose,
just dealing with this labeled tree from high levels that are specific to particular
domains whether those domains are databases, or documents or web services or
whatever." He presented as unfortunate counter-examples the cases of W3C XML Schema date types and the xsi:nil
attribute.
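To make the "labeled tree" view concrete, here is a minimal Python sketch using the standard library. The sample document and the helper name `walk` are illustrative assumptions and do not come from Clark's talk. It only lists each node's depth, label, attributes and text, without attaching any domain-specific meaning to them.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
DOC = '<order id="42"><item sku="a1">Tea</item><item sku="b2">Rice</item></order>'

def walk(element, depth=0):
    """Yield (depth, label, attributes, text) for every element in the tree."""
    yield depth, element.tag, dict(element.attrib), (element.text or "").strip()
    for child in element:
        yield from walk(child, depth + 1)

nodes = list(walk(ET.fromstring(DOC)))
for depth, label, attrs, text in nodes:
    # Indentation mirrors the nesting of the labeled tree.
    print("  " * depth + label, attrs, text)
```

Anything beyond this (dates, nil values, database or document semantics) would sit in a layer above the tree, which is the separation the talk argues for.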
2) Don't neglect the foundation
So many applications now rely on XML that its foundation should be "rock solid", and Clark advised developers to "spend some time fixing the base" after giving a detailed list of things to fix in XML 1.0 and its companion specifications. Highlights included "bizarre" behavior in XML 1.0, which "tells you that you need to report unparsed entities but not that you should report elements and attributes"; Namespaces in XML, XML Infoset and XML Base, which should be included in the core standard; DTDs, which are "basically one big mess" because they mix so many different features in a non-XML syntax and which we need "to learn to live without"; and the character entities, for which a replacement still needs to be found.
3) Fill in the missing pieces
The processing model is becoming complex: it currently covers validation, inclusion and transformation, and will soon cover query processing. There is no generic solution for "controlling the processing pipeline". Each feature triggers processing in its own way (doctype, xsi:schemaLocation, stylesheet location, etc.), and none of them works terribly well.
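As a rough illustration of what an explicitly controlled pipeline could look like, the Python sketch below parses a document once and then passes the tree through an ordered list of steps. The step functions `validate` and `transform`, the helper `run_pipeline` and the sample document are all hypothetical. This is not a mechanism Clark proposed, only one way to picture the idea of declaring the pipeline outside the document.

```python
import xml.etree.ElementTree as ET
from functools import reduce

def validate(root):
    """Toy validation step: every <book> must have a <title> child."""
    for book in root.iter("book"):
        if book.find("title") is None:
            raise ValueError("book without title")
    return root

def transform(root):
    """Toy transformation step: build a new tree holding only the titles."""
    out = ET.Element("titles")
    for title in root.iter("title"):
        ET.SubElement(out, "title").text = title.text
    return out

def run_pipeline(xml_text, steps):
    """Parse once, then pass the tree through each step in order."""
    return reduce(lambda tree, step: step(tree), steps, ET.fromstring(xml_text))

# Hypothetical input document.
DOC = "<library><book><title>Sample Title</title></book></library>"
result = run_pipeline(DOC, [validate, transform])
print(ET.tostring(result, encoding="unicode"))
```

The order of the steps lives in one place, the list passed to `run_pipeline`, rather than being scattered across a doctype declaration, an xsi:schemaLocation attribute and a stylesheet reference.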
4) Improve XML processing
Noting that the current solutions to XML processing are just "too much work, too difficult, too error prone", Clark distinguished between using a general-purpose language with an XML API and using an XML-specific language.
If you use a general-purpose programming language to process XML, you need "a standard pull API" with readers and writers like those in .NET ("just because it comes out of Microsoft is not necessarily bad") and loosely coupled data binding interfaces "automating the process of mapping between the XML and the data model", even when the internal structure is not a tree but a directed graph.
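For readers unfamiliar with the pull style, here is a small sketch using `XMLPullParser` from Python's standard library. It is not the .NET reader API Clark mentioned, and the log document and the variable `warn_messages` are made up for illustration. The point is that the application feeds data in and asks for events when it is ready, instead of being called back by the parser.

```python
from xml.etree.ElementTree import XMLPullParser

# Ask for both element-start and element-end events.
parser = XMLPullParser(events=("start", "end"))

# Data can arrive in arbitrary chunks (hypothetical log document).
parser.feed('<log><entry level="warn">disk low</entry>')
parser.feed('<entry level="info">ok</entry></log>')
parser.close()

warn_messages = []
# The application pulls events at its own pace.
for event, elem in parser.read_events():
    if event == "end" and elem.tag == "entry" and elem.get("level") == "warn":
        warn_messages.append(elem.text)

print(warn_messages)
```

Because the consuming loop is ordinary application code, state can be kept in local variables rather than spread across callback handlers.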
If you use an XML-specific language, XSLT is the current choice, even though it is "used for truly scary things". XSLT lacks a type system, so it cannot detect type errors. Clark felt that the world needs a "better XML specific language" and that "XQuery is promising".
5) Avoid premature standardization
Arguing that "having a standard is not always a good thing" and that "early standardization may cut off innovation" by preventing small organizations from coming along with better solutions, to the point where standards become "anti-competitive", Clark urged us to fight against premature standardization. And since "a lot of people accept standards uncritically", "people need to apply their critical faculties to standards". Clark also stated that "not everything needs to be a standard", giving the example of processing solutions, which do not need to be exchanged and thus do not need to be standards. Clark reminded developers wanting to avoid "vendor lock-in" that open source is also a good solution.