New titles: Nutshell, New Riders
08:28, 5 Feb 2001 UTC | Michael Smith

New for this month is XML in a Nutshell from O'Reilly. Also recently published: Inside XML from New Riders.

XML in a Nutshell

W. Scott Means and Elliotte Rusty Harold's XML in a Nutshell provides just the kind of concise coverage of its subject that you would expect from an O'Reilly Nutshell book. At under 500 pages, it's still small enough to be genuinely portable, yet thorough enough to serve well as a single source for general XML conceptual and reference information.

Means and Harold (also the author of The XML Bible and the force behind the Cafe con Leche XML news site) have thoughtfully divided the book into four major parts:

Conceptual information

It's great to see that Part 1, in addition to introducing pretty much what you would expect (fundamentals, DTDs, namespaces), also treats internationalization -- a subject neglected by many other guides. In fact, XML in a Nutshell seems to show a much better than average awareness of internationalization concerns throughout.

However, one conspicuous deficiency in Part 1 is the complete absence of discussion of any schema languages other than DTDs. In fact, the only significant mention of schemas is way back in Chapter 14 (XML as a Data Format), where the authors unfortunately make the mistake of referring to W3C XML Schema as "the" XML Schema language.

I wouldn't expect the book to already include mention of James Clark's TREX validation language -- which has come along only very recently -- but it's misleading to mention just the W3C's schema language and not any of the alternative or "partner" languages, such as Rick Jelliffe's Schematron or Murata (Makoto)'s RELAX, that have been in use for some time. That sort of oversight is inconsistent with the high quality of the rest of the book.

Also, although discussion of them would have been more than appropriate in the section on public IDs, XML in a Nutshell makes no mention of OASIS/SGML Open catalogs, powerful entity-resolution tools that have very practical applications, are supported by a variety of XML editors and other tools, and are actually in wide use for resolving public IDs, despite the book's misleading statement to the contrary:

In practice, public IDs aren't used very often. Almost all XML parsers rely on the URI to actually validate the document.

Although the part about the deficiencies of many XML parsers may very well be true, it's no excuse for ignoring a powerful feature that all of them should support and that many of them can support with the addition of Arbortext's freely available Java Catalog classes. Anyway, that particular oversight is more understandable given that most other XML books make the same mistake.
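
To make the idea concrete, here is a minimal sketch in Java of the kind of lookup a catalog performs: mapping a public ID to a local copy of a DTD, and letting the parser fall back to the system ID when no mapping exists. It uses only the standard SAX EntityResolver interface. The class name PublicIdResolver, the addPublic method, and the public ID and file path in the usage note are hypothetical, and this is not the API of Arbortext's Catalog classes, which read real catalog files rather than an in-memory map.

```java
import java.util.HashMap;
import java.util.Map;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical illustration: a tiny in-memory stand-in for a catalog.
public class PublicIdResolver implements EntityResolver {
    // Maps a public ID to the URI of a local copy of the entity.
    private final Map<String, String> catalog = new HashMap<>();

    // Register one public-ID-to-URI mapping (hypothetical helper).
    public void addPublic(String publicId, String localUri) {
        catalog.put(publicId, localUri);
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        String local = (publicId == null) ? null : catalog.get(publicId);
        if (local == null) {
            // No mapping: returning null tells the parser to use the system ID.
            return null;
        }
        InputSource source = new InputSource(local);
        source.setPublicId(publicId);
        return source;
    }
}
```

A SAX2 parser can be given such a resolver through XMLReader.setEntityResolver; after a call like addPublic("-//EXAMPLE//DTD Demo//EN", "file:///tmp/demo.dtd"), documents that declare that public ID would be validated against the local file rather than whatever URI the document happens to name.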

Related technologies and implementation

The related technologies and implementation information in the middle of the book is nicely divided into a Narrative-centric documents section (for document authors and document-oriented processing of XML) and a Data-centric documents section (for developers/programmers).

Narrative-centric documents

The narrative-centric documents section starts out appropriately with discussion and examples of two industry-standard, narrative-centric XML/SGML vocabularies: TEI and DocBook. This should help to make the section immediately relevant to those interested in using XML for document-authoring applications. Along with covering XHTML, XSLT/XPath, XLink/XPointer, and CSS, the section also includes a very good chapter on XSL-FO.

However, in spite of the breadth of coverage this section provides, I think it's worth pointing out that for document-authoring types, XML in a Nutshell may be neither the best starting point nor the best single reference. Web markup authors, for example, might be better off with just something like Valentine and Minnick's XHTML.

And for authors of technical documentation, Walsh and Muellner's DocBook: The Definitive Guide and the closely related resources available at the official DocBook site and at Walsh's own DocBook site form a focused, specific body of XML-based authoring, transforming, and publishing information that is much more useful for technical-document applications than the general information in XML in a Nutshell. (I haven't yet read Ray's Learning XML, but based on the sample chapter and table of contents -- which seems to indicate that DocBook examples are used in several chapters -- it might also turn out to be a useful starting point.)

This is not a criticism of XML in a Nutshell, because implementation specifics fall outside of the scope of the book. It's just that as a rule of thumb, if you have a specific need for XML, your primary resources are always going to be the specific guides that most closely match your need, not a more general resource.

Data-centric documents

Programmers looking for how-to information about data-centric applications of XML may find this section, at less than forty pages, a little thin. However, as in the Narrative-centric documents section, it's not within the scope of the book to provide implementation specifics. Instead, it gives a concise conceptual overview and, in the Reference section, complete DOM and SAX references.

General comments

XML in a Nutshell makes it clear from the beginning that there are some XML applications it does not attempt to cover in any depth at all -- for example, SVG, MathML, and RDF. Those exclusions seem appropriate given the size and scope of the book. What XML in a Nutshell does cover in depth is well-organized, well-written, and well-indexed. If you're looking for one concise guide for general conceptual and reference information about XML, it's a very good choice.

Inside XML

Although it's been available since November of last year, Inside XML carries a 2001 copyright, which makes it recent enough for discussion here.

Wealth of examples

The best thing about Steven Holzner's Inside XML is the wealth of examples it provides: extensive examples of DTD syntax, XML instances, and chunks of code (Java and JavaScript/ECMAScript). It extends the usefulness of the examples even further by using shading to highlight the parts most relevant to the discussion each illustrates -- a very helpful technique that it would be nice to see more books use. For examples, see pages 6 and 14 of the sample chapter (1MB PDF). And the code examples are all available as a downloadable file (5MB) from the book's website.

Scope and organization

At more than 1000 pages, Inside XML is a massive book: usable perhaps as a desk reference, but much too big to be portable. And though it's indexed fairly well, getting oriented in it can be a bit of a challenge: its twenty chapters are not grouped into clearly labeled parts, though they easily could have been.

The book provides just one separate reference section: a complete but unannotated copy of the W3C XML recommendation. Despite the book's bulk, reference sections for XSLT/XPath, XLink/XPointer, DOM, SAX, and character sets are conspicuously absent. I think many will judge that lack of reference information to be a fairly serious deficiency in a book of this type.

Extras and omissions

Inside XML provides much more thorough coverage (forty pages with extensive examples) of the W3C XML Schema language than XML in a Nutshell does. However, it also makes the same mistake that XML in a Nutshell does: mentioning just the W3C's schema language and not any of the alternative or "partner" languages, such as Rick Jelliffe's Schematron or Murata (Makoto)'s RELAX, that have been in use for some time.

Some users may appreciate other extras that Inside XML does include: an RDF chapter, a VML chapter (though not one on SVG), and a "WML, ASP, JSP, Servlets, and Perl" chapter, which, however, starts out with the statement that WML is a "popular application targeted at cordless phones..." (It goes on to mention PDAs and WAP devices, but the sort of imprecision of language shown in that statement unfortunately crops up in other parts of the book -- sometimes, apparently, quite consciously. For example, a section on creating XHTML documents is titled "XHTML Programming", though I think most readers would agree that what it actually treats is simply XHTML markup.)

Unfortunately, there is no mention here either of the powerful entity-resolution capabilities provided by OASIS/SGML Open catalogs, though they have obvious practical applications, are supported by a variety of XML editors and other tools, and can be supported in many other applications with the addition of Arbortext's freely available Java Catalog classes.

And despite the volume of model XML instances Inside XML contains, it never once provides a single example of an industry-standard, document-oriented vocabulary such as TEI or DocBook, instead mostly favoring manufactured, data-oriented examples.

Java and JavaScript details

Though Inside XML never comes right out and says so, it is very much a book intended for developers, not for document authors. Appropriately, it contains a big bunch of chapters in a row (more than 300 pages) on working with XML using Java and JavaScript (but with a chapter on CSS inexplicably dropped right in the middle). That includes two chapters that are essentially non-XML-specific introductions to Java and JavaScript. New developers may welcome the all-in-one approach of including those two "intro to programming" chapters; others may wish the book had not been weighed down by the extra hundred pages they take up.

Microsoft bias?

The cover of Inside XML declares that it is a "foundational book that covers both the Microsoft and non-Microsoft approach to XML programming." And just a few pages in, the author states that:

It's a fact of life that most XML software these days is targeted at Windows.... I wish that there were more support for other operating systems that I like, such as UNIX, but right now a lot of it is Windows-only.

However, despite that statement, he does go on to point out that many XML tools are Java-based and therefore platform-independent. And the only chapters that seem to me to actually show an extensive platform bias are the ones on JavaScript and VML.

Who should buy it?

If you're a Java or (Microsoft) JavaScript developer looking for a guide that provides lots of examples -- and you're willing to live with the book's lack of reference information -- take a look at the sample chapter (1MB PDF), code examples (5MB), and table of contents (12KB PDF). If those are to your taste, you may find Inside XML useful.

xmlhack: developer news from the XML community
