Rick Jelliffe, Topologi,
2001-11-27
DZIP is an implemented format used by Topologi to allow the
packaging and interchange of metadata, schemas and scripts
for XML document types. We are making its spec available for
other software creators to use, and to propose it as the basis
for an XAR (XML Application Archive format).
What It Is
A simple format for bundling and distributing the various
schemas, scripts and metadata needed by generic XML desktop
applications.
- Brain-friendly: It is just ZIP with some standard
naming conventions for particular files inside it. One convention
is that if there is a file index.html, that file is the
human-readable documentation.
- User-friendly: Users can download and install a
single file with all the different configuration and metadata
files they need to work on a document type.
- Integrator-friendly: Integrators can readily create
a DZIP file with any value-added files just using the existing
utilities on most developers' PCs. Configuration files for
multiple vendors' products can co-exist, so the same DZIP
file can serve a complete system.
- Infrastructure-friendly: Compatible with various
manifest and catalog formats.
- Vendor-friendly: We found it took less than a half
day to add DZIP support.
What It Is Not
- For Data: DZIP is not a way to package XML documents; it
is only aimed at configuration files such as schema files.
Imagine the kinds of things that an XML desktop application
might require when you click on File>New....
- For Complex Relationships: Manifest formats (sometimes
called packaging formats) allow very sophisticated
relationships to be expressed. However, DZIP contents itself
with merely providing very flexible and convenient ways
to distribute basic configuration or schema files for a
single document type.
- Super Efficient: ZIP is not a format with best-of-breed
archiving, indexed access, signing, encryption, compression,
localization, etc. DZIP is appropriate when the convenience
factor outweighs the need for high-end performance. However,
ZIP does provide adequate compression, checksums, random
access and permissions.
- For Pay-by-Use: DZIP is targeted mostly at the
needs of the publishing industry. Consequently, aspects
such as digital rights are not at issue. An integrator could
charge for providing or maintaining a DZIP file as part
of a service, but there is no special mechanism built in
for copy protection, billing, user tracking, licensing,
etc.
- Anti-WWW: DZIP does not assume any model of how
the user gets hold of the DZIP file. An application could
dynamically source it over the WWW, a vendor might provide
it as part of a shrink-wrapped application, or an integrator
might deploy it over a corporate intranet.
Motivation
There is no standard format for packaging and distributing
the various schemas, scripts and metadata needed by generic
XML applications.
This kind of format is required when online access to each
component individually over the WWW is inappropriate: when
the user may be offline, when the scripts are to be purchased
and installed on the user's computer, or when the resources override
the default resources retrievable using the WWW.
Scenario: My company, Topologi,
is creating markup tools which need to be readily configurable
and flexible. It is beyond the expertise and patience of most
end-users to configure a large XML or SGML application: we
needed to provide some facility where the end user can make
one menu selection and have all the appropriate configuration
files loaded. DZIP provides a simple way to implement and
deploy this.
Following is a screen shot of a selection mechanism in a
Topologi product under development, which opens DZIP files
for one-click configuration.
Scenario: An integrator, Allette
Systems, has long experience in building SGML and XML
systems for customers; yet they report it is still often tedious
to do because a complete XML system often requires installation
and configuration of several different components, each of
which may require multiple files. Furthermore, current inflexible
systems require more maintenance effort than clients should
have to carry. A DZIP system would be simpler to deploy and,
potentially, a single DZIP file could be made which includes
the resources for each of the components in the system, despite
being from different vendors.
Description
1. A DZIP package is a ZIP file.
Rationale: ZIP is a common format, widely available on PCs
and supported in Java. This is compatible with the OASIS Interchange
Package rules, which do not specify the format.
2. A DZIP file is organized so that
- Files in the root of the ZIP archive give information
about that archive; in particular, a file called index.xhtml
(or index.html, index.htm, default.xhtml,
default.html, or failing those
*.txt) contains documentation about that DZIP file
and document type. (Such a file could also be the RDDL directory,
see below.) A Catalog file can be put here too.
- Files in the second level (i.e., */*) contain
configuration files for the document type, detected by
their extensions.
- Files in other levels contain vendor-specific code. It
is advisable to use the path of your domain to provide namespacing:
vendors may care to make vendor/domainname/
such as vendor/topologi.com/ well-known locations
in which their systems will look for specific configuration
files.
Rationale: The index file of the first level may require
its own CSS stylesheets, DTDs etc. Therefore it is inappropriate
to look in the root directory for configuration files. Instead,
we look in the second level.
Simplicity demands that there should be no mapping tables
if a resource only has a single name. By enforcing that resources
should be in subdirectories, the root directory is kept clean,
and DZIP packagers are not constrained to follow any naming
or organization convention.
Plurality demands that configuration for different vendors'
products can co-exist. This is not so much so that a single
DZIP can support many different applications of the same class,
but rather that an integrator can deploy the configuration
files for all the components in a production chain for a particular
client.
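The three levels described in rule 2 can be sketched in a few lines of Python. This is only an illustration, not part of the format: the function names and the "root"/"config"/"vendor" labels are my own, and treating the listed documentation names as a preference order is an assumption.

```python
from collections import defaultdict

# Documentation names from rule 2; using them as a preference order is an assumption.
DOC_NAMES = ["index.xhtml", "index.html", "index.htm", "default.xhtml", "default.html"]


def classify_entries(names):
    """Group ZIP member names by the level at which they sit in the archive."""
    levels = defaultdict(list)
    for name in names:
        if name.endswith("/"):
            continue  # skip explicit directory entries
        depth = name.count("/")
        if depth == 0:
            levels["root"].append(name)    # information about the archive
        elif depth == 1:
            levels["config"].append(name)  # second level, i.e. */*
        else:
            levels["vendor"].append(name)  # deeper, vendor-specific levels
    return dict(levels)


def find_documentation(root_files):
    """Return the documentation file in the root, or None if there is none."""
    for candidate in DOC_NAMES:
        if candidate in root_files:
            return candidate
    # Failing those, fall back to any *.txt file (first in sorted order).
    text_files = sorted(n for n in root_files if n.lower().endswith(".txt"))
    return text_files[0] if text_files else None
```

For a real archive, the list of names could come from `zipfile.ZipFile(path).namelist()` in the Python standard library.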
3. A DZIP package may have an OASIS catalog, in its root
directory. The catalog has the name *.soc or CATALOG. If present,
the catalog should be used to map names.
Rationale: This is compatible with the OASIS Interchange
Package rules. Note that a simple user-agent may ignore the
OASIS catalog, and bear any consequent failure.
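As a rough sketch of rule 3, a user agent might locate the catalog like this. The function name is hypothetical, and two choices here are my assumptions rather than anything the rule states: a file named CATALOG is preferred over a *.soc file, and the .soc extension is matched case-insensitively.

```python
def find_catalog(names):
    """Return the name of the root-level OASIS catalog, or None if absent."""
    root_files = [n for n in names if "/" not in n]
    if "CATALOG" in root_files:  # assumption: CATALOG wins over *.soc
        return "CATALOG"
    soc_files = sorted(n for n in root_files if n.lower().endswith(".soc"))
    return soc_files[0] if soc_files else None
```

A simple user agent that ignores catalogs would just never call such a function.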
4. There should only be one DTD, one XML Schema, one RELAX
schema, one Schematron schema, one CSS stylesheet, one XSLT
stylesheet.
Rationale: A document type may have several possible stylesheets
or several possible DTDs. However, providing more than
one requires more vendor support, to show the user choices.
A DZIP file may of course have other DTDs or files in deeper,
private levels.
5. The following extensions have significance:
- .dtd: XML markup declarations
- .ent: XML parameter entity declarations
- .mod: A safe extension for submodules, not available as the root
of anything
- .css: A CSS stylesheet
- .xsl or .xslt: An XSL stylesheet
- .sch: A Schematron schema
- .rlx: A RELAX NG schema
- .xsd or .xsi: A W3C XML Schema schema
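The list above, together with the one-of-each rule, could be applied by a user agent roughly as follows. This is a sketch under assumptions: the kind labels and function names are invented for illustration, and matching extensions case-insensitively is my choice, not something the rules specify.

```python
import posixpath
from collections import defaultdict

# Extension -> kind of resource; the kind labels are illustrative only.
KINDS = {
    ".dtd": "dtd",
    ".ent": "entities",
    ".mod": "module",
    ".css": "css",
    ".xsl": "xslt",
    ".xslt": "xslt",
    ".sch": "schematron",
    ".rlx": "relax",
    ".xsd": "xml-schema",
    ".xsi": "xml-schema",
}

# Kinds for which only one file is expected at the second level.
SINGLE_KINDS = {"dtd", "xml-schema", "relax", "schematron", "css", "xslt"}


def second_level_resources(names):
    """Map each recognised kind to the second-level files that provide it."""
    found = defaultdict(list)
    for name in names:
        if name.endswith("/") or name.count("/") != 1:
            continue  # only files at */* count as configuration files
        extension = posixpath.splitext(name)[1].lower()
        kind = KINDS.get(extension)
        if kind is not None:
            found[kind].append(name)
    return dict(found)


def duplicated_kinds(found):
    """List the kinds that appear more than once despite the one-of-each rule."""
    return sorted(k for k in SINGLE_KINDS if len(found.get(k, [])) > 1)
```

A user agent could warn about, or offer a choice among, anything reported by `duplicated_kinds`; files in deeper levels are never considered.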
6. In the root directory, a file "dzip_icon16.gif" is a 16x16
icon useful for adding to GUIs by user agents: it is some
representation of the document type.
Rationale: For better user agents.
7. The DZIP file is named using a convention:
name-version.dzp
where
- a user agent will use all text before the first "-" as
a name that can be presented to the user, e.g. in a menu
item without having to open the DZIP archive;
- the version number is any string after the first "-",
to which a user agent may attach some policy (e.g.,
to select the one with the largest string value);
- the extension .dzp is used. (I use this to keep
any future adoption of an XAR extension .xar free.)
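The naming convention lends itself to a very small parser. The sketch below is illustrative only: the function names are hypothetical, and `pick_latest` implements just the example policy mentioned above (largest string value), which is a plain string comparison and so would rank "9" above "10".

```python
import os


def parse_dzip_name(path):
    """Split 'name-version.dzp' into (name, version); None if not a .dzp file."""
    stem, extension = os.path.splitext(os.path.basename(path))
    if extension.lower() != ".dzp":
        return None
    # Everything before the first "-" is the name; the rest is the version.
    name, _, version = stem.partition("-")
    return name, version  # version is "" when there is no "-"


def pick_latest(paths):
    """For each name, keep the path whose version string compares largest."""
    best = {}
    for path in paths:
        parsed = parse_dzip_name(path)
        if parsed is None:
            continue
        name, version = parsed
        if name not in best or version > best[name][0]:
            best[name] = (version, path)
    return {name: path for name, (version, path) in best.items()}
```

A user agent could use the keys of the `pick_latest` result directly as menu items, without opening any archive.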
Other Technology
The other technologies to consider in this area are as follows.
- OASIS
CATALOGs, a table notation for mapping between names
and locations of entities, with an XML version. In a DZIP file, if there is a CATALOG file in the root
level it will be treated as an OASIS Catalog file; this
file would give relative (i.e. to the root) system identifiers
for public identifiers that the document type uses.
- RDDL
Resource Directory Description Language, an XHTML notation
for representing the various resources associated with a
namespace URI. In a DZIP file, an index.html file in the
root will be treated as giving general documentation for
the document type; this file could be a RDDL (XHTML) document,
as a manifest.
- XPackage
is a manifest format by the Open-EBook consortium. In XPackage
terminology, DZIP is a package archive. Given any
DZIP file, an XPackage package description instance
can be automatically generated, as a manifest. Such an XPackage
package description instance can be included in a
DZIP file, though there is no particular naming convention
at the moment to identify which file is the XPackage.
- DIME,
Direct Internet Message Encapsulation, is a proposal
from Microsoft researchers for encapsulating multiple payloads
together, naming them with IDs and allowing efficient access.
Presumably a DZIP file could be unbundled and sent using
DIME; however, DIME itself does not provide the conventions
on which DZIP relies.
For further information, see Wrap Your App by Leigh Dodds, XML.COM