Encodings
Character Collections
21:57, 9 Nov 1999 UTC | Simon St.Laurent

A new note from the W3C, A Notation for Character Collections on the World Wide Web, combines XML and some CSS notation to describe sets of Unicode characters.

The notation provides for both a core group of characters guaranteed to be in a particular set (the 'kernel') and an outer bound (the 'hull') beyond which characters are guaranteed not to be in the set. In this document, all characters are Universal Character Set code points, as defined by ISO/IEC 10646 and Unicode.
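The kernel and hull idea can be illustrated with a small Python sketch. It is not the note's notation or any W3C API: the class name, the field names and the three result strings are hypothetical, and it assumes the hull is read as an outer bound that contains the kernel, so anything outside the hull is definitely excluded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CharacterCollection:
    """Hypothetical model of a collection given by a kernel and a hull."""

    kernel: frozenset  # code points guaranteed to be in the collection
    hull: frozenset    # assumed outer bound: nothing outside it is in the collection

    def __post_init__(self):
        # Under the outer-bound assumption the kernel must lie inside the hull.
        if not self.kernel <= self.hull:
            raise ValueError("kernel must be a subset of hull")

    def membership(self, ch: str) -> str:
        """Classify a single character as 'in', 'maybe' or 'out'."""
        cp = ord(ch)
        if cp in self.kernel:
            return "in"
        if cp in self.hull:
            return "maybe"
        return "out"


# Example: upper-case ASCII letters are guaranteed, lower-case ones are only possible.
letters = CharacterCollection(
    kernel=frozenset(range(0x41, 0x5B)),
    hull=frozenset(range(0x41, 0x7B)),
)
```

With this model, `letters.membership("A")` reports a guaranteed member, while a character that is in the hull but not in the kernel is left undecided.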

Character collections are much like character sets; the note uses the term 'collection' because 'character set' has been widely misused. Collections are described using an XML vocabulary. The syntax for character ranges builds on CSS2, and references to other collections use XLink.
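As an illustration of the CSS2-style range syntax, here is a minimal Python sketch that turns one range token into a pair of code points. It assumes the three forms of the CSS2 unicode-range descriptor (a single code point such as U+20A7, an interval such as U+AC00-D7FF, and trailing '?' wildcards such as U+00??); the note's own syntax may differ in detail, and the function name is hypothetical.

```python
import re

# The three token shapes assumed here, modelled on the CSS2 unicode-range descriptor.
_SINGLE = re.compile(r"U\+([0-9A-Fa-f]{1,6})")
_INTERVAL = re.compile(r"U\+([0-9A-Fa-f]{1,6})-([0-9A-Fa-f]{1,6})")
_WILDCARD = re.compile(r"U\+([0-9A-Fa-f]{0,5})(\?{1,6})")


def parse_unicode_range(token: str) -> tuple:
    """Return (first, last) code points covered by one range token."""
    token = token.strip()

    match = _SINGLE.fullmatch(token)
    if match:
        code_point = int(match.group(1), 16)
        return code_point, code_point

    match = _INTERVAL.fullmatch(token)
    if match:
        first, last = int(match.group(1), 16), int(match.group(2), 16)
        if first > last:
            raise ValueError(f"descending range: {token!r}")
        return first, last

    match = _WILDCARD.fullmatch(token)
    if match and len(match.group(1)) + len(match.group(2)) <= 6:
        prefix, wild = match.groups()
        # Each '?' stands for any hex digit, so it spans 0 through F.
        first = int(prefix + "0" * len(wild), 16)
        last = int(prefix + "F" * len(wild), 16)
        return first, last

    raise ValueError(f"not a recognised range token: {token!r}")
```

For example, `parse_unicode_range("U+00??")` covers the first 256 code points, and `parse_unicode_range("U+AC00-D7FF")` returns the two endpoints of the interval unchanged.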

This document, written by the W3C's Martin Duerst, is not a working draft backed by a W3C working group, at least not yet.

  
xmlhack: developer news from the XML community

