James Clark has announced that Jing, his validator for RELAX NG, now supports
"feasible validation" (a type of "gradual" or
"progressive" validation), as discussed in Rick Jelliffe's
presentation "When Well-Formed is too much and Validity is
too little" at XML Europe 2002.
Progressive validation support is useful if you have
documents that are still incomplete or "under construction"
but want to check that you've not made any validity
mistakes in them so far. In the announcement, Clark states
the option will cause Jing to:
"first transform the schema
by wrapping each element, attribute, data and list
element in an optional element and then validate against
this transformed schema. The net result is to check
whether the document could be made valid by inserting
additional elements and attributes."
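The transformation Clark describes can be sketched in a few lines of Python. This is only an illustration of the tree rewrite, not Jing's actual implementation: the function name make_feasible and the WRAPPED set are assumptions made for this example, and the sketch makes no attempt to respect RELAX NG's restrictions on where optional patterns may appear.

```python
import xml.etree.ElementTree as ET

RNG_NS = "http://relaxng.org/ns/structure/1.0"

# The four RELAX NG patterns named in the announcement (assumed local names).
WRAPPED = {"element", "attribute", "data", "list"}


def make_feasible(pattern):
    """Return a copy of `pattern` in which every element, attribute,
    data and list pattern is wrapped in an <optional> pattern.

    Hypothetical helper for illustration; the input tree is not modified.
    """
    copy = ET.Element(pattern.tag, dict(pattern.attrib))
    copy.text, copy.tail = pattern.text, pattern.tail
    for child in pattern:
        copy.append(make_feasible(child))

    local_name = pattern.tag.rpartition("}")[2]
    in_rng_namespace = pattern.tag.startswith("{" + RNG_NS + "}")
    if in_rng_namespace and local_name in WRAPPED:
        wrapper = ET.Element("{" + RNG_NS + "}optional")
        # Move trailing whitespace onto the wrapper so layout is preserved.
        wrapper.tail, copy.tail = copy.tail, None
        wrapper.append(copy)
        return wrapper
    return copy


if __name__ == "__main__":
    schema = ET.fromstring(
        '<element name="doc" xmlns="http://relaxng.org/ns/structure/1.0">'
        '<attribute name="id"/>'
        '<element name="title"><text/></element>'
        "</element>"
    )
    ET.register_namespace("", RNG_NS)
    print(ET.tostring(make_feasible(schema), encoding="unicode"))
```

Validating a document against the rewritten schema then reports only content the original schema could never allow, since everything that was required has become optional.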
In a comment posted to an earlier xmlhack story about fuzzy validation, Rick Jelliffe
elaborates by describing some scenarios in which you might
find "feasible validity" checking useful:
"You may be marking up some
text and intend to add the 'required' metadata later: you
just want to find out if there are problems (elements
names wrong, etc) in the elements you have done. You
don't want to be bored by 'errors' which are really due
to your workplan, nor (worse) have the validation fail
and stop before it gets to the elements you have been
working on."
In the abstract to the paper Jelliffe presented
at XML Europe 2002, he outlines three techniques for
checking the validity and well-formedness of documents that
are incomplete or still under construction:
- weak validation:
A technique based on "strength-reducing schema (or
DTD) particles so that all elements are optional.
Again, only infeasible documents will cause
validation errors." It will thus raise errors if
something forbidden by the schema is included, but
allow mandatory elements to be missing. Clark's new
option in Jing implements support for weak
validation.
- partial ordering:
A technique based on "finding which tags (start-
or end-) can feasibly appear before or after each
other; unfeasible markup can be discovered before the
document is even well-formed." Partial ordering
(which can be expressed very simply using Jelliffe's
one-element Hook validation language, described
as "like a checksum for a schema") checks for a more
limited set of constraints; it simply attempts to
answer the question:
"Does this element
have a feasible name, ancestry, previous-siblings
and contents?"
Partial ordering validation is thus useful as a
means for checking for errors even relatively early
in the process of creating/editing valid
documents.
- Schematron phases:
A technique which provides "a managed way to
express many kinds of different constraints, allowing
documents to be validated first against some criteria
then others, suitable to the document's progress
through a markup process."
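To make the partial ordering idea more concrete, here is a minimal Python sketch of checking which tags can feasibly follow each other in text that is not yet well-formed. It is a simplified illustration, not Jelliffe's Hook language or any Topologi code: the FEASIBLE_NEXT table, the three-element vocabulary, and the function infeasible_tags are all invented for this example, and only directly adjacent tags are compared.

```python
import re

# Hypothetical table: for each tag, the tags that may feasibly come next.
# None stands for the start of the input. The vocabulary is invented.
FEASIBLE_NEXT = {
    None: {"<doc>"},
    "<doc>": {"<title>", "<para>", "</doc>"},
    "<title>": {"</title>"},
    "</title>": {"<para>", "</doc>"},
    "<para>": {"</para>"},
    "</para>": {"<para>", "</doc>"},
    "</doc>": set(),
}

# Matches start- and end-tags; attributes are skipped, and comments and
# processing instructions are not matched at all.
TAG = re.compile(r"<(/?)([A-Za-z_][\w.-]*)[^<>]*>")


def infeasible_tags(text):
    """Return (offset, message) pairs for tags that cannot feasibly
    appear where they do. The text need not be well-formed."""
    problems = []
    previous = None
    for match in TAG.finditer(text):
        tag = "<" + match.group(1) + match.group(2) + ">"
        if tag not in FEASIBLE_NEXT:
            problems.append((match.start(), "unknown tag " + tag))
            continue  # keep comparing against the last known tag
        if tag not in FEASIBLE_NEXT[previous]:
            where = "at start of input" if previous is None else "after " + previous
            problems.append((match.start(), tag + " is not feasible " + where))
        previous = tag
    return problems


if __name__ == "__main__":
    # Unfinished but feasible: nothing is reported.
    print(infeasible_tags("<doc><title>Draft</title><para>Unfinished text"))
    # A title after a paragraph is reported, although the text is incomplete.
    print(infeasible_tags("<doc><para>x</para><title>Late</title>"))
```

Because the check works on a stream of tags rather than a parsed tree, it can flag a misplaced or misspelled tag while the closing tags are still missing.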
Jelliffe contrasts all three of these validation
techniques -- which, as a class, might generally be called
"gradual" or "progressive" validation techniques -- with
the techniques of checking for all-or-nothing strict
validity and/or "partial validity" (which requires that
"all the elements in every element must match unambiguously
part of the way through the content model of the element")
and states that "the first wave of XML editors [that is,
most existing XML editors] required well-formedness and
usually provided partial validity only, or even required
strict validity."
Topologi, a company Jelliffe founded, implements
progressive validation in two applications: the Topologi Collaborative Markup Editor and the Topologi Schematron Validator (free download). The latter
does not check for feasible validity per se, but it does
support Schematron phases.