XML for document preservation
17:12, 26 Jan 2001 UTC | Eric van der Vlist

Isn't it surprising to find a recommendation, not yet three years old and not approved by any official standards body, being appraised by engineers, lawyers and archivists invited by their government to debate the long-term preservation of digital documents?

Yet that was exactly the case at a meeting organized by the MTIC [1], for the French Prime Minister, to present a "guide for the preservation of digital documents." It included presentations by Alain Bensoussan, a lawyer specializing in the issues of digital documents, and by Catherine Dhérent, representing the "Archives Nationales."

Despite their virtual nature, digital documents are threatened by the lack of long-term stability of their media. The French standard NF Z42-013 and the law on the validity of digital documents as formal proof require that documents be written on non-rewritable media, which are guaranteed for only ten years -- a very brief period of time from the archivists' point of view.

This physical deterioration is aggravated by the short life cycle of the logical formats used to represent documents.

The long-term preservation of digital documents thus requires setting up a dynamic process to schedule, run and audit the physical and logical migrations needed to keep documents alive.

In this context, XML can be used for different purposes:

  • XML is a format that meets the requirements defined by the MTIC -- it's an open recommendation, easy to transform, and should therefore be easy to migrate.
  • XML allows content to be separated from presentation, and the two to be stored separately.
  • The guide recommends defining an XML envelope for the documents, which would contain the description of the document, its requirements for preservation, its access control and the history of its migrations.
  • XML is a good candidate for describing the metadata associated with the document -- possibly as a part of its envelope. The MTIC will set up a specific working group for this issue.
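
To make the envelope idea more concrete, here is a minimal sketch in Python of what such a wrapper might look like. The guide's actual envelope format is not described in this article, so every element name, attribute and sample value below (envelope, preservation, access-control, migration-history, the dates, the 30-year retention) is a hypothetical chosen only for illustration. The content itself is stored separately and referenced from the envelope.

```python
import xml.etree.ElementTree as ET


def build_envelope(doc_id, description, retention_years, readers,
                   migrations, content_href):
    """Build a hypothetical preservation envelope.

    All element and attribute names are illustrative assumptions;
    they are not taken from the MTIC guide.
    """
    env = ET.Element("envelope", {"id": doc_id})

    # Description of the document
    ET.SubElement(env, "description").text = description

    # Requirements for preservation
    pres = ET.SubElement(env, "preservation")
    ET.SubElement(pres, "retention", {"years": str(retention_years)})

    # Access control: one element per authorized reader
    access = ET.SubElement(env, "access-control")
    for reader in readers:
        ET.SubElement(access, "reader").text = reader

    # History of physical and logical migrations
    history = ET.SubElement(env, "migration-history")
    for date, kind, note in migrations:
        entry = ET.SubElement(history, "migration",
                              {"date": date, "kind": kind})
        entry.text = note

    # The content is stored separately and only referenced here
    ET.SubElement(env, "content", {"href": content_href})
    return env


# Made-up sample data, for illustration only
envelope = build_envelope(
    doc_id="contract-0001",
    description="Supply contract",
    retention_years=30,
    readers=["archivist", "legal-department"],
    migrations=[
        ("2001-01-26", "physical", "Copied to new non-rewritable media"),
        ("2001-01-26", "logical", "Converted from word processor format to XML"),
    ],
    content_href="contract-0001.xml",
)

xml_text = ET.tostring(envelope, encoding="unicode")
print(xml_text)
```

Keeping the migration history inside the envelope means that each physical or logical migration can be audited from the archived object itself, without relying on an external log.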

The presentation by Alain Bensoussan focused on the legal issues, showing that presentation also carries semantic value that courts may need, and that documents should therefore be kept together with all the "drivers" needed to display them.

The issue could be controversial, since the configuration used by the author of a document and those used by its readers are usually different. A dispute over a contract edited as (X)HTML with tools from supplier A, and displayed with missing text in browser B by a customer, would probably be difficult to judge.

Any webmaster knows that such things are all too easy to reproduce, and this example gives a new perspective on the legal implications of tools that fail to conform to standards.

Copies of the presentations and of the guide should be available online soon.

[1] Mission interministérielle de soutien technique pour le développement des technologies de l'information et de la communication dans l'administration

xmlhack: developer news from the XML community