XML 2003 session report: News from the world of DSDL
15:40, 15 Dec 2003 UTC | Uche Ogbuji

10 December 2003 at XML 2003 in Philadelphia...

Eric van der Vlist kicked off a block of presentations opening up the world of ISO Document Schema Definition Languages (DSDL) (ISO/IEC JTC 1 SC 34 WG 1), and some of the innovative work being undertaken in that working group. Eric presented "ISO DSDL Overview and Update", proceeding through the various parts of DSDL in order. Note that at the time of writing the numbering on the DSDL Web site is out of date; the numbering in van der Vlist's presentation and in this report is up to date as of DSDL's May meeting.

  • Part 1: Interoperability framework. This part will become a formal roadmap and outline of DSDL as a whole.
  • Part 2: Grammar-based validation. This part is a rewrite of the RELAX NG OASIS Specification to meet the requirements of ISO publications, i.e. more formal language. The features will remain the same, and the two specifications are meant to be identical for the purpose of assessing conformance. Eventually the RELAX NG compact syntax will be added as an addendum to DSDL Part 2. DSDL Part 2 is now a "Final Draft International Standard" (FDIS), i.e. at the final approval stage before publication as an official ISO standard.
  • Part 3: Rule-based validation. The intent is to create a hosting language for expressing general-purpose rules in XML. The main input is Schematron, and it has been decided that, in effect, DSDL Part 3 will represent the evolution of Schematron. One example of what DSDL Part 3 will add to Schematron is extensibility, so that it supports not only XPath 1.0 but also expressions taken from other languages such as EXSLT, XPath 2.0, XSLT 2.0, and even XQuery 1.0.
  • Part 4: Selection of validation candidates. This part is creating NVDL, a means of splitting up documents comprising multiple vocabularies so that they can be more easily validated. There have been many inputs to this part, but James Clark's Namespace Routing Language (NRL) is the main input to the process.
  • Part 5: Datatypes. The intent is to develop a framework for creating new primitive data types. Jeni Tennison's Datatype Library Language is an input. It defines an XML language for specifying regular expressions for the lexical representation of new types. This much alone is provided for (to some extent) by the facet mechanisms in WXS, but the important distinction in DSDL Part 5 is that it adds a mechanism for mapping these new data types to the value space, which is not possible in WXS. In effect it allows you to specify the semantics as well as the syntax of new data types, which is essential.
  • Part 6: Path-based integrity constraints. Van der Vlist said there's not yet much to say about this part. Its goal is to define features similar to WXS's xs:unique, xs:key and xs:keyref, but there have been no contributions to date.
  • Part 7: Character repertoire validation. The goal of Part 7 is to develop a language that allows schema designers to constrain the character sets that can be used in various lexical structures in XML. There are ways to express some such restrictions at present in RELAX NG, but they break down in cases such as mixed content. Part 7 would work by, for example, allowing one to express constraints such as "element and attribute names as well as PI targets must be basic Latin-1" or "numbers must not appear in element and attribute names". Van der Vlist showed examples of Character Repertoire Validation for XML (CRVX), a proposal for Part 7 by Erik Wilde.
  • Part 8: Declarative document manipulation. Van der Vlist admitted that this part is rather mysterious to him; the fact that it is meant as an improvement on architectural forms should earn him some sympathy. Architectural forms have a reputation for being very powerful, but also a cause of brain meltdown. The intent of this part is to develop an architectural forms system that is practical and fits smartly into the remainder of DSDL.
  • Part 9: Datatype- and namespace-aware DTDs. The intent, in a nutshell, is to keep SGML DTDs alive in the XML space by adding some of the features in search of which people usually move to other schema languages.
  • Part 10: Validation management. This part used to be called "validation frameworks". It is meant to be the glue that allows you to combine different parts of DSDL. It provides a pipeline framework for pre-processing and validation of documents. Contributions include van der Vlist's own XVIF and Rick Jelliffe's Schemachine. Van der Vlist also contributed a variation on XVIF called XVIF/outie, and he showed examples of this.
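
To make the Part 7 idea more concrete, here is a rough Python sketch of the two example constraints mentioned above. It is not CRVX and not anything proposed for DSDL; it only hand-codes the kind of check such a language would let a schema designer declare. The function names are hypothetical, "basic Latin" is assumed here to mean the range U+0000 to U+007F, and PI targets are ignored because Python's default ElementTree parser discards processing instructions.

```python
import xml.etree.ElementTree as ET


def local_name(qname):
    """Strip the '{namespace}' prefix ElementTree puts on qualified names."""
    return qname.rsplit("}", 1)[-1]


def name_violations(xml_text):
    """Return (name, reason) pairs for element and attribute names that
    break either of the two illustrative rules."""
    violations = []
    root = ET.fromstring(xml_text)
    for element in root.iter():  # document order, root included
        names = [element.tag] + list(element.attrib)
        for name in map(local_name, names):
            # Assumed rule 1: names use only characters up to U+007F.
            if any(ord(ch) > 0x7F for ch in name):
                violations.append((name, "outside Basic Latin"))
            # Assumed rule 2: no digits anywhere in a name.
            if any(ch.isdigit() for ch in name):
                violations.append((name, "contains a digit"))
    return violations
```

Calling name_violations on a small document reports each offending element or attribute name in document order. A declarative repertoire language would express the same rules as data rather than code, and could extend them to the text and mixed content where, as noted above, grammar-based restrictions break down.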

An audience member expressed concern that DSDL is too "secretive". He also pointed to a dearth of documents available for public comment, despite the clear volume of activity, and noted that the public mailing list archives were very sparse while many of the archives were private. DSDL members in attendance reassured him that exclusion is not the intention, and expressed a willingness to address concerns about the openness of the project.
