A new note from the W3C, A Notation for Character Collections on the World Wide Web, combines XML
and some CSS notation to describe sets of Unicode characters.
The notation provides for both a core group of characters guaranteed to be in a particular set (the
'kernel') and an outer bound on the set (the 'hull'); characters outside the hull are guaranteed not to
be in the set. In this particular document, all characters are in fact Universal Character Set code
points, as defined by ISO/IEC 10646 and Unicode.
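
The kernel and hull idea can be illustrated with a minimal Python sketch. It assumes the kernel is a
lower bound and the hull an upper bound on the collection, so a code point is definitely in, definitely
out, or undetermined. The names `Membership` and `classify` and the sample ranges are hypothetical and
are not taken from the note.

```python
from enum import Enum


class Membership(Enum):
    IN = "in"            # guaranteed to be in the collection
    OUT = "out"          # guaranteed not to be in the collection
    UNKNOWN = "unknown"  # allowed by the hull, but not guaranteed by the kernel


def classify(code_point, kernel, hull):
    """Three-way membership test, assuming kernel <= collection <= hull."""
    if code_point in kernel:
        return Membership.IN
    if code_point not in hull:
        return Membership.OUT
    return Membership.UNKNOWN


# Hypothetical collection: uppercase Latin letters are guaranteed,
# and nothing outside printable ASCII is allowed.
kernel = set(range(0x41, 0x5B))  # U+0041 to U+005A
hull = set(range(0x20, 0x7F))    # U+0020 to U+007E

print(classify(ord("A"), kernel, hull))       # Membership.IN
print(classify(ord("a"), kernel, hull))       # Membership.UNKNOWN
print(classify(ord("\u00e9"), kernel, hull))  # Membership.OUT
```
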
Character collections are much like character sets, though the collection terminology is used because
the term 'character set' has been so widely misused.
Collections are described using an XML vocabulary: the syntax for describing character ranges builds
on CSS2, and references to other collections use XLink.
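
As a rough illustration of the CSS2 side, the sketch below parses a single token in the style of the
CSS2 unicode-range descriptor (for example U+20A7, U+00?? or U+AC00-D7FF) into a pair of code points.
It assumes the note's range syntax resembles that descriptor; the function name `parse_unicode_range`
is hypothetical, and the XML and XLink parts of the notation are not modeled.

```python
import re

# One token: "U+" followed by 1-6 hex digits or "?" wildcards,
# optionally followed by "-" and 1-6 hex digits.
_RANGE = re.compile(r"U\+([0-9A-Fa-f?]{1,6})(?:-([0-9A-Fa-f]{1,6}))?")


def parse_unicode_range(token):
    """Return (first, last) code points for one CSS2-style range token."""
    match = _RANGE.fullmatch(token.strip())
    if match is None:
        raise ValueError(f"not a unicode-range token: {token!r}")
    start, end = match.groups()
    if "?" in start:
        # Wildcards must be trailing and cannot be combined with an explicit end.
        if end is not None or "?" in start.rstrip("?"):
            raise ValueError(f"misplaced wildcard in {token!r}")
        first = int(start.replace("?", "0"), 16)
        last = int(start.replace("?", "F"), 16)
    else:
        first = int(start, 16)
        last = int(end, 16) if end is not None else first
    if first > last or last > 0x10FFFF:
        raise ValueError(f"range out of bounds in {token!r}")
    return first, last


print(parse_unicode_range("U+20A7"))       # (8359, 8359)
print(parse_unicode_range("U+00??"))       # (0, 255)
print(parse_unicode_range("U+AC00-D7FF"))  # (44032, 55295)
```
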
This document, written by the W3C's Martin Duerst, is not a working draft backed by a
W3C working group, at least not yet.