More magic from James Clark: He's announced the alpha release
of nXML, a new mode for editing XML
documents from within GNU Emacs. It's a milestone in that
it's the first open-source editing application to enable
context-sensitive validated editing
against Relax NG schemas. It also provides a
clever mechanism for real-time, automatic visual
identification of validity errors, along with flexible
syntax-highlighting capabilities, with many
other features planned for future releases.
To get the current release, go to Clark's Thai Open Source
download site, and look for the latest
nxml-mode-200nnnnn.tar.gz
distribution. To get started using it, follow the
installation instructions in the
README file in the distribution,
and see the TUTORIAL file for
instructions on using its context-sensitive completion
feature, as well as details about customizing its
file-name-based and root-element-based schema
auto-assignment mechanism. Once you've got it up and
running, type
M-x describe-mode or
C-h m for more information.
You will find that despite its "alpha" status, nXML is quite
stable and usable for real-world editing tasks already. But
if you do end up needing help, or find a bug, or want to make
a feature suggestion, there's an
emacs-nxml-mode mailing list that Clark
has set up for nXML discussion and support.
Context-sensitive editing using completion
The Emacs/nXML mechanism for doing context-sensitive
insertion/completion of markup is similar to the mechanism
that Emacs/PSGML provides:
-
Place your cursor at some point in a document.
-
Type a keyboard combination (in the nXML case,
C-Return) to do
context-sensitive checking to see what markup
(elements, attributes, or enumerated attribute
values) is valid at that point in the document; Emacs
then opens up a "completion" buffer containing a list
of the valid markup choices.
-
Either use a mouse to select one of the choices from
the completion buffer, or type the first few letters
of one of the choices, and then TAB to cause Emacs to
do completion on that name or value (for an example,
see the screenshot in
Figure 1).
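To make the completion step concrete, here is a minimal Python sketch of prefix completion over a list of valid choices. It is not nXML's code (nXML is written in Emacs Lisp); the `complete` function and the `valid_here` list are hypothetical names invented for illustration, and the element names are just sample DocBook elements.

```python
import os


def complete(prefix, candidates):
    """Return (completed_text, matches) for a typed prefix.

    Imitates TAB completion: the prefix is extended as far as all
    matching candidates agree, and the matches are listed.
    """
    matches = sorted(c for c in candidates if c.startswith(prefix))
    if not matches:
        return prefix, []
    # commonprefix compares character by character, which is what we want.
    return os.path.commonprefix(matches), matches


# Hypothetical list of element names valid at the cursor position.
valid_here = ["para", "programlisting", "procedure", "itemizedlist"]

print(complete("it", valid_here))
print(complete("pro", valid_here))
```

A prefix that matches a single choice completes fully; an ambiguous prefix is extended only up to the point where the remaining choices diverge.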
But while Emacs/PSGML is limited to doing its
markup-checking strictly against DTDs, Emacs/nXML does its
checking against Relax NG schemas (specified in the
Relax NG compact syntax; the nXML
mode distribution includes Relax NG compact-syntax schemas
for DocBook, XHTML, XSLT, RDF/XML, and Relax NG
itself).
By enabling this kind of context-sensitive
Relax NG-aware editing of XML documents, and making it
possible to put together a completely DTD-free, open-source
XML toolchain (that is, Emacs/nXML, used in combination
with Relax NG-aware processing applications such as
Daniel Veillard's xmllint and
xsltproc, which are provided in
the libxml2 and
libxslt distributions),
nXML eliminates what may
have been the last remaining reason many users have had
for keeping DTDs around.
Because going DTD-less means that you can also go without
DOCTYPE declarations in your documents, and because the
Relax NG specification does not mandate any way of
associating a document with a Relax NG schema, some mechanism
needs to be provided at the editing application level.
Emacs/nXML provides two: one for manually
specifying a Relax NG schema by browsing for it on your local
filesystem, and one customizable mechanism for
automatically associating a document with a schema.
The schema auto-association mechanism works by looking at
the filename extension of the document (it's configured by
default to do it for .html, .xsl, .rdf,
and .rnc files), or failing that, by looking at the
document's root element (for example, it's configured by
default to associate the DocBook schema with documents that
have book or
article root elements, the XSLT
schema with documents that have
stylesheet or
transform root elements, etc.).
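The lookup order described above (filename extension first, root element as the fallback) can be sketched in a few lines of Python. This is only an illustration of the idea, not how nXML implements or configures it: the table names, the `choose_schema` function, and the schema file names are all assumptions, and the tables hold only a subset of the defaults mentioned above.

```python
import os
import xml.etree.ElementTree as ET

# Hypothetical lookup tables; the schema file names are placeholders.
BY_EXTENSION = {".html": "xhtml.rnc", ".xsl": "xslt.rnc", ".rdf": "rdfxml.rnc"}
BY_ROOT = {
    "book": "docbook.rnc",
    "article": "docbook.rnc",
    "stylesheet": "xslt.rnc",
    "transform": "xslt.rnc",
}


def root_element_name(xml_text):
    """Return the local name of the root element, or None if none is found."""
    parser = ET.XMLPullParser(events=("start",))
    try:
        parser.feed(xml_text)
        for _event, elem in parser.read_events():
            # Drop a leading "{namespace-uri}" if the element is namespaced.
            return elem.tag.rsplit("}", 1)[-1]
    except ET.ParseError:
        pass
    return None


def choose_schema(filename, xml_text):
    """Pick a schema by extension, falling back to the root element."""
    ext = os.path.splitext(filename)[1].lower()
    if ext in BY_EXTENSION:
        return BY_EXTENSION[ext]
    return BY_ROOT.get(root_element_name(xml_text))


print(choose_schema("guide.xml", "<article><title>T</title></article>"))
```

Because the pull parser reports the root start-tag as soon as it has been read, the fallback also works on a document that is still incomplete, which is the usual state of a file being edited.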
Spotting validity errors in real time
Another powerful feature that Emacs/nXML provides is a
completely automated mechanism for visually identifying
validity errors in a document, in real-time -- one that
doesn't require you to take any manual action to initiate
validity checking.
The feature is similar to a feature in the
Topologi Collaborative Markup Editor
(a relatively new commercial application that takes a
number of novel approaches to XML editing). The
Emacs/nXML implementation of the feature works like
this: As you are editing a document, nXML:
-
does background re-parsing and re-validating of the
document in the idle periods between the times when
you are actually typing in content
-
visually highlights all instances of invalidity it
finds in the document (by default, the value of the
Emacs "face" it uses to highlight invalidity
instances is a red underline -- but the highlighting
can be changed by customizing that face)
If you then mouse over one of the invalidity-highlighted
points in the document, popup text appears describing the
validity error (see
Figure 2). Or, if you move the text cursor to the
location of the invalidity highlighting, the description of
the validity error instead appears in the "minibuffer" echo
area at the bottom of the Emacs interface (see
Figure 3). You can also use a keyboard combination
(C-c C-n) to step through all validity errors in the
document.
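The overall pattern (mark the document as changed on every edit, re-check it only when the editor is idle, and keep a list of error positions to step through) can be modelled in Python roughly as follows. This is a simplified sketch under stated assumptions, not nXML's incremental validator: the `BackgroundChecker` class and its method names are hypothetical, and the stand-in check only tests well-formedness with the standard library rather than validating against a Relax NG schema.

```python
import xml.etree.ElementTree as ET


def check_well_formed(text):
    """Return a list of (line, column, message) tuples, empty if no error.

    Stand-in for real validation: it checks well-formedness only and
    reports at most the first problem the parser hits.
    """
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        line, column = err.position
        return [(line, column, str(err))]
    return []


class BackgroundChecker:
    def __init__(self, text, check=check_well_formed):
        self.text = text
        self.check = check
        self.dirty = True   # True when the text changed since the last check
        self.errors = []    # sorted (line, column, message) tuples
        self.runs = 0       # number of checks actually performed

    def on_edit(self, new_text):
        """Called on every change; cheap, does no checking itself."""
        self.text = new_text
        self.dirty = True

    def on_idle(self):
        """Called when the user pauses; re-checks only if something changed."""
        if self.dirty:
            self.errors = sorted(self.check(self.text))
            self.runs += 1
            self.dirty = False

    def next_error(self, line, column):
        """Return the first recorded error after the given position, or None."""
        for error in self.errors:
            if (error[0], error[1]) > (line, column):
                return error
        return None


checker = BackgroundChecker("<doc><title>Draft</doc>")
checker.on_idle()
print(checker.next_error(0, 0))
```

Keeping `on_edit` trivial and doing the expensive work in `on_idle` is what lets the checking stay out of the way while you type.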
Syntax highlighting and indentation
Emacs/nXML provides what must be by far the best and most
configurable syntax-highlighting capabilities of any XML
editing application currently available: 30+ customizable
Emacs faces enable you to independently control color and
character formatting of everything from the level of
element and attribute names down to the level of different
types of markup delimiters (angle-bracket tag delimiters,
the quote marks around attribute values, etc.).
And nXML indenting works in pretty much the way you'd
normally expect in other Emacs modes: You can just hit TAB
to shift a line over to its appropriate hierarchical level
of indentation, or do C-M-\
(indent-region) to indent a
region.
But one aspect of working with indenting that Clark does
not seem to have dealt with yet is paragraph filling; for
unindented regions, manual paragraph filling or re-filling
(fill-paragraph or
M-q) works as expected -- but not
for indented paragraphs. (Fill support seems to be on his
TODO list for future releases.) For now, a workaround for
filling an indented paragraph is simply to first move it
back to the left margin (by marking the indentation
whitespace and then doing
kill-rectangle --
C-x r k), then doing
fill-paragraph, and then
indent-region to re-indent it.
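The same three-step idea (strip the indentation, fill, re-indent) is easy to see outside Emacs. Here is a small Python analogue using the standard `textwrap` module; it is only an illustration of the workaround's logic, and the `fill_indented` function name and its `width` parameter are made up for this sketch.

```python
import textwrap


def fill_indented(paragraph, width=70):
    """Re-fill an indented paragraph, keeping its original indentation."""
    if not paragraph.strip():
        return paragraph
    first = paragraph.splitlines()[0]
    indent = first[: len(first) - len(first.lstrip())]
    # Step 1: move the paragraph back to the left margin.
    flush_left = textwrap.dedent(paragraph)
    # Step 2: fill it, leaving room for the indentation added in step 3.
    filled = textwrap.fill(flush_left, width=width - len(indent))
    # Step 3: re-indent every line.
    return textwrap.indent(filled, indent)


text = "    one two three\n    four five six seven\n    eight"
print(fill_indented(text, width=20))
```

One difference from the Emacs workaround: this sketch subtracts the indentation from the fill width so the re-indented lines still fit, whereas filling at the left margin and then indenting can push lines past the fill column.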
Clark once described Relax NG as "a conservative,
evolutionary refinement of well-proven ideas from SGML and
XML DTDs". Emacs/nXML, even in this "alpha" stage, may be
seen in part as an evolutionary refinement in XML editing
-- with some features (context-sensitive completion) very
similar to capabilities in existing editors such as
Emacs/PSGML, some features (configurable syntax
highlighting) that are incremental improvements over
existing capabilities, and at least one feature (automatic
real-time highlighting of validity errors) that is a sort
of next-generation step beyond capabilities in most current
editors.
That said, as usable as it may be in its current state,
Clark seems to be considering it just a start, with
significant new development planned; the description for
the
emacs-nxml-mode mailing list says, in
part, that its purpose is to "discuss details of what
features the mode should provide and how they should
work". And in his initial
release announcement, Clark writes:
This is still very much a
work in progress. Most of the work has been on
providing the underlying infrastructure to support
incremental parsing and validation. There's still much
to be done in exploiting this infrastructure in support
of XML editing. I hope early users will help figure out
the best way to do this.
Also, the TODO file in the
distribution contains a long, long list of potential
changes. Here's a sample of some of the more intriguing
ones:
-
Command to insert an element template, including all
required attributes and child elements
-
Use RDDL to locate a schema based on the namespace
URI
-
Structure view + Collapse and expand elements (using
invisible, intangible and display text properties)
[this seems like it might
be something like the folded-editing support that
PSGML provides]
-
Smart selection command that selects increasingly
large syntactically coherent chunks of XML. If point
is in an attribute value, first select complete
value; then if command is repeated, select value plus
delimiters, then select attribute name as well, then
complete start-tag, then complete element, then
enclosing element, etc.
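The smart-selection item in that list boils down to "grow the selection to the smallest syntactic unit that encloses it". A rough Python sketch of that step follows; it is speculative, since the feature is only a TODO entry. The `expand_selection` function is hypothetical, and the spans are worked out by hand for one sample string rather than produced by a parser.

```python
def expand_selection(current, spans):
    """Return the smallest span that encloses, and differs from, current.

    current and every span are (start, end) offsets into the text.
    If nothing larger encloses the selection, it is returned unchanged.
    """
    start, end = current
    enclosing = [
        (s, e)
        for (s, e) in spans
        if s <= start and end <= e and (s, e) != (start, end)
    ]
    if not enclosing:
        return current
    return min(enclosing, key=lambda span: span[1] - span[0])


text = '<a href="x">hi</a>'
# Hand-computed spans: value, value plus quotes, whole attribute,
# start-tag, whole element.
spans = [(9, 10), (8, 11), (3, 11), (0, 12), (0, 18)]

selection = (9, 9)  # point sits inside the attribute value
for _ in range(5):
    selection = expand_selection(selection, spans)
    print(text[selection[0]:selection[1]])
```

Each repetition of the command corresponds to one more call, so the selection walks outward through the nesting levels in order.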
An idea that I'd personally like to see implemented: Add a
mechanism for specifying lists of elements from a
particular schema to 'ignore' -- that is, elements that,
though they are in the schema, the user wants to omit from
context-sensitive completion lists for element names.
The rationale behind that is that many users have problems
working with large schemas (like DocBook) because those
schemas contain many elements that they have no use for at
all and would just like to ignore completely. But it's a
challenging and time-consuming process to create a schema
customization to remove unwanted elements. It's much more
efficient and easily user-customizable to provide an
'ignore' capability at the editing application layer level.
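At the editor level, such an 'ignore' capability could be as small as a filter applied to the completion list just before it is shown. Here is a Python sketch of the idea; the `visible_choices` function and the sample element names are purely illustrative and do not correspond to any existing nXML option.

```python
def visible_choices(valid_here, ignored):
    """Drop user-ignored element names from a completion list.

    The schema itself is untouched: ignored elements stay valid,
    they are simply not offered for completion.
    """
    ignored = set(ignored)
    return [name for name in valid_here if name not in ignored]


# Hypothetical completion list and per-user ignore list.
valid_here = ["para", "formalpara", "simpara", "itemizedlist"]
ignored = ["formalpara", "simpara"]

print(visible_choices(valid_here, ignored))
```

Since only the list of offered names changes, a document that already uses an ignored element would still validate against the unmodified schema.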
And XInclude support would be a really nice thing to have
as well.