<?xml version="1.0" encoding="utf-8"?>
<!-- To edit this file, you will need to use a font
  with mathematical operators such as Lucida Sans Unicode
 	∀  For All
 	∈  Element Of
 	⇒ Rightwards Double Arrow
 	¬ Not Sign
 	∧ Logical And
 	The emdash is used by NCR.
-->
<!-- stylesheet needs:  
	note contains ul
-->
<!--
	TODO:
		Element syntax definitions
	    Compile schemas: use James' RELAX NG schematron schema
	    Predicate logic for lets and keys
	    Clarify terminology of firing versus matching
	    Fill in References
	    Complete terminology
	    Align bindings arguments with formal arguments and terms
-->
<document xmlns:p="http://relaxng.org/ns/proofsystem">

<head> 
<organization>ISO/IEC</organization>
<document-type>International Standard</document-type>
<stage>enquiry</stage>
<secretariat>ANSI</secretariat>
<tc-number>1</tc-number>
<tc-name>Information Technology</tc-name>
<sc-number>34</sc-number>
<sc-name>Document Description and Processing Languages</sc-name>
<serial-number>320</serial-number>
<wg-number>1</wg-number>
<document-number>19757</document-number>
<part-number>3</part-number>
<document-language>E</document-language>
<title>
  <main>Document Schema Definition Languages (DSDL)</main>
  <complementary>Rule-based validation &#x2014; Schematron</complementary>
</title>
<date>2003-07-01</date>
</head>

<foreword>

<part-list>
<part><number>0</number><title>Overview</title></part>
<part><number>1</number><title>Interoperability framework</title></part>
<part><number>2</number><title>Grammar-based validation &#x2014; RELAX NG</title></part>
<part><number>3</number><title>Rule-based validation &#x2014; Schematron</title></part>
<part><number>4</number><title>Selection of validation candidates</title></part>
<part><number>5</number><title>Datatypes</title></part>
<part><number>6</number><title>Path-based integrity constraints</title></part>
<part><number>7</number><title>Character repertoire validation</title></part>
<part><number>8</number><title>Declarative document manipulation</title></part>
<part><number>9</number><title>Datatype- and namespace-aware DTDs</title></part>
</part-list>

</foreword>

<introduction>

<p>The structure of <this/> is as follows. 

<xref to="syntax"/>
describes the syntax of a Schematron schema. 
<xref to="semantics"/> describes the semantics of a 
correct Schematron schema; 
the semantics specify when a document is
valid with respect to a Schematron schema.  

Finally, <xref to="conformance"/>
describes conformance requirements for Schematron validators.
</p>
<p>Normative annexes provide RELAX NG and Schematron schemas
for Schematron, and the default query language binding to
XSLT.</p>

<p><This/> is based on the <xref to="schematron"/>. </p>

</introduction>

<scope>

<p><This/> specifies Schematron, a schema language for XML.</p>
<p>Considered theoretically,
a Schematron schema reduces to a non-chaining rule system
whose terms are boolean functions invoking an external query language
on the instance and other visible XML documents, 
with syntactic features to reduce specification size and to allow
efficient implementation.</p>

<p>Considered as a document type, a Schematron schema contains
natural language assertions concerning an instance, marked up
with various elements and attributes for testing these 
assertions, and for simplifying and grouping the assertions. </p>

<p>Considered analytically, 
Schematron has two characteristic high-level
abstractions: the pattern and the phase. These allow 
the representation of non-regular, non-sequential constraints 
that Part 2 cannot specify, and various dynamic or contingent
constraints.</p>

<p><This/> establishes requirements for Schematron schemas and specifies
when an XML document matches the patterns specified by a Schematron
schema.</p>

</scope>

<normative-references>

<p>The following referenced documents are indispensable for the
application of <this/>. For dated references, only the edition
cited applies. For undated references, the latest edition of the
referenced document (including any amendments) applies.</p>

<note><p>Each of the following documents has a unique identifier that
is used to cite the document in the text.  The unique identifier
consists of the part of the reference up to the first
comma.</p></note>

<note><p>The definitions of Part 1 and Part 2
also apply to <this/>.</p></note>

<referenced-document id="xpath-rec">
<abbrev>W3C XPath</abbrev>
<title>XPATH???</title>
<field>W3C Recommendation</field>
<url>http://www.w3.org/TR/???</url>
</referenced-document>

<referenced-document id="xslt-rec">
<abbrev>W3C XSLT</abbrev>
<title>XSLT??</title>
<field>W3C Recommendation</field>
<url>http://www.w3.org/TR/???</url>
</referenced-document>


</normative-references>

<terms-and-definitions>


<term-and-definition>
<term>schema</term>

<definition>specification of a set of XML documents</definition>

</term-and-definition>
 
 <term-and-definition>

<term>correct schema</term>
<definition>schema that satisfies all the requirements of
<this/></definition>
</term-and-definition>

<term-and-definition>
<term>good schema</term>

<definition>a correct schema whose queries
terminate and do not
add constraints to those of the natural language
assertions. Note: it is not possible, in general, to compute
whether a schema is good.
</definition>
</term-and-definition>

<term-and-definition>
<term>valid with respect to a schema</term>
<definition>member of the set of XML documents described by the
schema. An instance is valid if no assertion tests
in fired rules of active patterns fail.
</definition>
</term-and-definition>

<term-and-definition>
<term>phase</term>
<definition>a named, unordered collection of patterns; patterns may belong
to more than one phase; two names #ALL and #DEFAULT are reserved</definition>
</term-and-definition>

<term-and-definition>
<term>active phase</term>
<definition>one particular phase, whose patterns are 
used for validation</definition>
</term-and-definition>


<term-and-definition>
<term>pattern</term>
<definition>a named structure in instances 
described by a lexically ordered collection of rules</definition>
</term-and-definition>


<term-and-definition>
<term>active pattern</term>
<definition>a pattern belonging to the active phase</definition>
</term-and-definition>


<term-and-definition>
<term>abstract pattern</term>
<definition>a pattern whose rules are
parameterized to some extent</definition>
</term-and-definition>

<term-and-definition>
<term>subject</term>
<definition>a particular information item which matches a rule
and which is used as the base for assertions</definition>
</term-and-definition>

<term-and-definition>
<term>assertion</term>
<definition>a natural language statement;
an assertion "succeeds" or "fails"</definition>
</term-and-definition>

<term-and-definition>
<term>assertion test</term>
<definition>an assertion modelled/implemented by a boolean query</definition>
</term-and-definition>

<term-and-definition>
<term>rule</term>
<definition>unordered collection of assertions</definition>
</term-and-definition>

<term-and-definition>
<term>rule context</term>
<definition>???? a query to specify subjects 
???a selection of elements; 
a rule is said to fire when an information item
matches its query</definition>
</term-and-definition>

<term-and-definition>
<term>abstract rule</term>
<definition>a collection of assertions 
which can be included in other rules but which
does not fire itself</definition>
</term-and-definition>

<term-and-definition>
<term>variable</term>
<definition>let and param</definition>
</term-and-definition>

<term-and-definition>
<term>diagnostic</term>
<definition>named ...</definition>
</term-and-definition>

<term-and-definition>
<term>key</term>
<definition></definition>
</term-and-definition>




<term-and-definition>
<term>parameter</term>
<definition></definition>
</term-and-definition>

 
<term-and-definition>
<term>macro substitution</term>
<definition></definition>
</term-and-definition>


<term-and-definition>
<term>element inclusion</term>
<definition></definition>
</term-and-definition>


<term-and-definition>
<term>text substitution</term>
<definition></definition>
</term-and-definition>


</terms-and-definitions>

<clause><title>Notation</title>
<clause>
<title>XPath</title>
<p><This/> uses <xref to="xpath-rec"/> to express the names of 
information items in the schema. In <this/> the prefix <code>sch</code> 
is bound to the Schematron namespace URI.</p>

</clause>
<clause>
<title>Predicate Logic</title>
<p><This/> uses predicate logic to express the semantics of
a Schematron schema. 
The following functions are defined:</p>
<notation-list>

<notation-item><notation>∈</notation>
<notation-definition><p>is member of, an infix relation, 
used in the set-operation sense. </p></notation-definition>
</notation-item>
</notation-list>

<note><p>The more familiar term "is element of" is not used,
to avoid confusion with XML elements.
Where y is an element in a simplified schema, x ∈ y
is defined here as the
<code>child::x</code> path from the context of subject y
as defined by <xref to="xpath-rec"/>.
Where y is the instance being validated,
x ∈ y
is defined here as all the subjects (information items) in the instance
that can be accessed by the query language,
as specified in the query language binding.
Where y is the name of the active phase,
one of the following is true:</p>
<ul>
<li><p>x ∈ y is defined here as the path
<code>//sch:pattern </code>
when y has the special
value #ALL </p>
</li>

<li><p>x ∈ y is defined here as the path
<code>//sch:pattern[@id=/sch:schema/@default-phase] </code>
when y has the special
value #DEFAULT </p>
</li>
<li><p>Otherwise x ∈ y is defined here as the path
<code>../sch:pattern[@id=//sch:phase[@id="y"]/sch:active/@pattern]</code>
where y is a name. </p>
</li>
</ul>
</note>
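<p>As an informal illustration of phase membership, consider the
following schema fragment. It is a sketch only: the pattern and phase
names are hypothetical, and the <code>context</code> and <code>test</code>
attributes are assumed to behave as in earlier Schematron releases,
since the element syntax is not yet defined in <this/>.</p>
<pre>
&lt;sch:schema xmlns:sch="http://www.ascc.net/xml/schematron">
  &lt;sch:phase id="quick">
    &lt;sch:active pattern="structure"/>
  &lt;/sch:phase>
  &lt;sch:pattern id="structure">
    &lt;sch:rule context="chapter">
      &lt;sch:assert test="title">A chapter should have a title.&lt;/sch:assert>
    &lt;/sch:rule>
  &lt;/sch:pattern>
  &lt;sch:pattern id="links">
    &lt;sch:rule context="xref">
      &lt;sch:assert test="@to">A cross-reference should name its target.&lt;/sch:assert>
    &lt;/sch:rule>
  &lt;/sch:pattern>
&lt;/sch:schema>
</pre>
<p>With the active phase #ALL, both the "structure" and "links"
patterns are members of the active phase. With the active phase
"quick", only the "structure" pattern is a member, because it is the
only pattern named by an <code>sch:active</code> child of that phase.</p>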
<notation-list>
<notation-item><notation>position( r )</notation>
<notation-definition><p>the XPath function  position()
of a rule r in its parent pattern</p>
</notation-definition></notation-item>

<notation-item><notation>match ( r, s, d )</notation>
<notation-definition><p>a function returning boolean
provided by the query language binding: it returns true
iff the subject s from the document d 
matches the context expression of rule r</p>
</notation-definition></notation-item>

<notation-item><notation>assert ( a,  s, d)</notation>
<notation-definition><p>a function returning boolean
provided by the query language binding: it returns true
iff the assertion a is true when applied to
the subject s from the document d 
</p>
</notation-definition></notation-item>

<!--notation-item><notation>assert ( a,  i, d, v )</notation>
<notation-definition><p>a function returning boolean
provided by the query language binding: it returns true
iff the assertion a is true when applied to
the information item i of the document d 
using the named variable values list v.
</p>
</notation-definition></notation-item-->
</notation-list>
</clause>

<clause id="syntax">
<title>Syntax</title>
<clause id="namespace">
<title>Namespace and Whitespace</title>

<p>All elements shown in the grammar are qualified with the namespace
URI:</p>

<pre>http://www.ascc.net/xml/schematron</pre>


<p>Any element can also have foreign attributes in addition to the
attributes shown in the grammar. A foreign attribute is an attribute
with a name whose namespace URI is neither the empty string nor the
Schematron namespace URI.  Any non-empty element 
may have foreign child elements in addition
to the child elements shown in the grammar. A foreign element is an
element with a name whose namespace URI is not the Schematron namespace
URI.  There are no constraints on the relative position of foreign
child elements with respect to other child elements.</p>
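<p>As an informal illustration, the following schema fragment carries
one foreign attribute and two foreign elements. It is a sketch only:
the <code>ex</code> prefix and its namespace URI are hypothetical, and
the <code>id</code>, <code>context</code> and <code>test</code>
attributes are assumed to behave as in earlier Schematron releases.</p>
<pre>
&lt;sch:schema xmlns:sch="http://www.ascc.net/xml/schematron"
            xmlns:ex="http://example.org/ns/annotation"
            ex:status="draft">
  &lt;ex:comment>Reviewed by the editorial team.&lt;/ex:comment>
  &lt;sch:pattern id="titles">
    &lt;ex:comment>Foreign elements may appear at any position.&lt;/ex:comment>
    &lt;sch:rule context="chapter">
      &lt;sch:assert test="title">A chapter should have a title.&lt;/sch:assert>
    &lt;/sch:rule>
  &lt;/sch:pattern>
&lt;/sch:schema>
</pre>
<p>Here <code>ex:status</code> is a foreign attribute, because its
namespace URI is neither empty nor the Schematron namespace URI, and
each <code>ex:comment</code> is a foreign element.</p>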

<p>Any element can also have as children strings that consist
entirely of whitespace characters, where a whitespace character is one
of U+0020, U+0009, U+000D or U+000A. There are no constraints on the relative
position of whitespace string children with respect to child
elements. 
</p>

<p>Leading and trailing whitespace is allowed for the value of any
attribute, and shall be stripped.</p>
 
 </clause>


<clause>
<title><code>active</code> element</title>

<p><code>empty</code></p>

</clause>

<clause>
<title><code>assert</code> element</title>

<p><code>empty</code></p>

</clause>


<clause>
<title><code>dir</code> element</title>

<p><code>empty</code></p>

</clause>


<clause>
<title><code>diagnostics</code> element</title>

<p><code>empty</code></p>

</clause>


<clause>
<title><code>diagnostic</code> element</title>

<p><code>empty</code></p>

</clause>

<clause>
<title><code>extends</code> element</title>

<p><code>empty</code></p>

</clause>
<clause>
<title><code>key</code> element</title>

<p><code>empty</code></p>

</clause>


<clause>
<title><code>let</code> element</title>

<p><code>empty</code></p>

</clause>


<clause>
<title><code>library</code> element</title>

<p><code>empty</code></p>

</clause>

<clause>
<title><code>name</code> element</title>

<p><code>empty</code></p>

</clause>

<clause>
<title><code>ns</code> element</title>

<p><code>empty</code></p>

</clause>

<clause>
<title><code>p</code> element</title>

<p><code>empty</code></p>

</clause>


<clause>
<title><code>param</code> element</title>

<p><code>empty</code></p>

</clause>
<clause>
<title><code>pattern</code> element</title>

<p><code>empty</code></p>

</clause>


<clause>
<title><code>phase</code> element</title>

<p><code>empty</code></p>

</clause>

<clause>
<title><code>report</code> element</title>

<p><code>empty</code></p>

</clause>

<clause>
<title><code>rule</code> element</title>

<p><code>empty</code></p>

</clause>

<clause>
<title><code>schema</code> element</title>

<p><code>empty</code></p>

</clause>

<clause>
<title><code>span</code> element</title>

<p><code>empty</code></p>

</clause>


<clause>
<title><code>title</code> element</title>

<p><code>empty</code></p>

</clause>

<clause>
<title><code>value-of</code> element</title>

<p><code>empty</code></p>

</clause>


</clause>


<clause id="semantics">
<title>Semantics</title>

<clause id="func">
<title>Validation Function</title>

<p>A Schematron validator is a function 
returning "valid", "invalid" or "error".
The function notionally performs two steps: 
transforming the schema into a simple syntax, 
then testing the instance against the simple syntax.
</p>
<note><p><This/> does not constrain other information 
provided by an implementation nor
other uses of Schematron schemas.
However, it is the intent of <this/> that implementations
be able to provide rich, specific diagnostics, customized with
values, that assist in detecting and rectifying problems.</p></note>
<p>A Schematron validator is a function over
the following:
</p>
<ul>
<li><p>a query language binding</p></li>
<li><p>a schema document</p></li>
<li><p>an instance to be validated</p></li>
<li><p>external instances, if the query language invokes them</p></li>
<li><p>a phase name, or #ALL if all patterns shall be active patterns,
or #DEFAULT if the phase attribute on the schema element shall be
used</p></li>
<li><p>a list of name-value pairs, if the schema uses external variables.</p></li>
</ul>

</clause>
<clause id="preprocessing">
<title>Simple Syntax</title>


<p>A Schematron validator shall perform the following transformation
steps on the schema, resulting in a schema in the simple syntax:</p>
<ol>
<li><p>Resolve all libraries by element inclusion</p></li>
<li><p>Resolve all abstract patterns by macro substitution</p></li>
<li><p>Resolve all abstract rules in the schema by element inclusion</p></li>
<li><p>Resolve all top-level parameters by text substitution</p></li>
<li><p>Convert all sch:report elements into sch:assert elements with negated tests</p></li>
<li><p>Remove (ignore) all elements in foreign namespaces.</p></li>
</ol>
<p>The resulting simple syntax is also a valid Schematron instance
in the full syntax. The simple syntax differs from the full syntax
in that none of the following XPaths selects anything in it: </p>
<ul>
<li><p><code>//sch:library</code></p></li>
<li><p><code>//sch:pattern[@abstract="true"]</code></p></li>
<li><p><code>//sch:rule[@abstract="true"]</code></p></li>
<li><p><code>/sch:schema/sch:param</code></p></li>
<li><p><code>//sch:rule/sch:report</code></p></li>
</ul>
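<p>As an informal illustration of the negation step, consider a rule
in the full syntax that uses <code>sch:report</code>. It is a sketch
only: it assumes the XSLT query language binding, a <code>test</code>
attribute carrying the assertion test, and a hypothetical
<code>draft</code> attribute in the instance.</p>
<pre>
&lt;sch:rule context="chapter">
  &lt;sch:report test="@draft">A chapter is marked as a draft.&lt;/sch:report>
&lt;/sch:rule>
</pre>
<p>Under those assumptions, the corresponding rule in the simple
syntax would carry the negated test, with the natural language
statement unchanged:</p>
<pre>
&lt;sch:rule context="chapter">
  &lt;sch:assert test="not(@draft)">A chapter is marked as a draft.&lt;/sch:assert>
&lt;/sch:rule>
</pre>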
</clause>


</clause>

<clause id="simpleval">
<title>Schema Semantics</title>
<p>This clause gives the semantics of a good schema
that has been transformed into the simple syntax.</p>

<p>A good schema with no use of keys or variables
satisfies the following predicate:</p>
<pre>
    ∃ ( instance, schema, active-phase ),
	∀( subject, pattern, rule, assertion ) :
		subject ∈ instance,
		pattern ∈ schema,
		pattern ∈ active-phase,
		rule ∈ pattern,
		assertion ∈ rule :
			match ( rule, subject, instance ) 
		 	∧ ( ∀(previous-rule ) : 
		 		previous-rule ∈ pattern,
		 		position (previous-rule ) &lt; position( rule ) :
		 			¬ ( match ( previous-rule, subject, instance )))
		 	⇒ assert ( assertion,  subject, instance ) = true
</pre>
<note><p>In natural language, that is
"<i>There exists an instance, schema and active-phase combination
where, for each subject, pattern, rule and assertion
(the subject being a member of that instance,
the pattern being a member of that schema,
the pattern being a member of that active-phase,
the rule being a member of that pattern,
the assertion being a member of that rule),
the following is true:
if the subject in an instance
matches the rule, and that subject has not
been matched by a previous rule in the same pattern,
then the particular assertion 
evaluates to true when applied to the particular subject
and instance.</i>"
</p></note>
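<p>As an informal worked example of the predicate, consider the
following pattern and instance. It is a sketch only: the element names
are hypothetical, and the XSLT query language binding is assumed.</p>
<pre>
&lt;sch:pattern id="sections">
  &lt;sch:rule context="appendix/section">
    &lt;sch:assert test="@label">An appendix section should have a label.&lt;/sch:assert>
  &lt;/sch:rule>
  &lt;sch:rule context="section">
    &lt;sch:assert test="title">A section should have a title.&lt;/sch:assert>
  &lt;/sch:rule>
&lt;/sch:pattern>
</pre>
<pre>
&lt;doc>
  &lt;section>&lt;title>Scope&lt;/title>&lt;/section>
  &lt;appendix>&lt;section label="A.1"/>&lt;/appendix>
&lt;/doc>
</pre>
<p>The first <code>section</code> matches only the second rule, and
its assertion succeeds because a <code>title</code> child is present.
The <code>section</code> inside <code>appendix</code> matches the first
rule, whose assertion succeeds because of the <code>label</code>
attribute. It would also match the second rule, but since a previous
rule in the same pattern has already matched it, the antecedent of the
implication is false for the second rule and the missing
<code>title</code> is not tested. The instance is therefore valid with
respect to this pattern.</p>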
<!--
<p>A good schema with use of variables and keys is
as follows:</p>
<pre xml:space="preserve">
    ∃ ( document schema active-phase )
	∀( context pattern rule assertion ) :
		context ∈ document
		pattern ∈ schema
		pattern ∈ active-phase
		rule ∈ pattern
		assertion ∈ rule 

		augmented-instance = augment( instance, ∀(key: key ∈ schema: key) )		
		:
			match ( rule, context, document ) 
		 	∧ ( ∀(previous-rule ) : 
		 		previous-rule ∈ pattern
		 		position (previous-rule ) &lt; position( rule ) :
		 assert ( assertion,  context, document,	
		 	∀(let-variable : 
				let-variable ∈ rule :
				evaluate( let-variable, information, document ))) 
			// WRONG: one let rule can reference a previous rule
			) = true	
</pre>
-->

</clause>
<clause id="querylang">
<title>Query Language Binding</title>

<p>A query language binding shall provide the
following:
</p>
<ul>
<li><p>The general query language used.
A name token which identifies the query language.
The data model.</p></li>
<li><p>The rule context query language. 
The rule context scope.</p></li>
<li><p>The assertion test, 
a function which returns a data value
coercible into boolean.</p></li>
</ul>

<note><p>The following query language names are
reserved and recommended:</p>
<ul>
<li><p>xslt</p></li>
<li><p>exslt</p></li> 
<li><p>xslt2</p></li>
<li><p>xpath</p></li>
<li><p>xpath2</p></li>
<li><p>xquery</p></li>
</ul>
</note>
<p>A Schematron implementation which does
not support the query language specified by a schema
shall fail with an error.</p>

<p>A query language binding may provide the following:
</p>
<ul>
<li><p>The name query language, a function which returns a 
data value coercible into a string.</p></li>
<li><p>The value-of query language, a function which returns a 
data value coercible into a string.</p></li>
<li><p>The key path language.</p></li>
<li><p>The let value query language, a function which returns a 
data value. </p></li>
<li><p>The variable delimiter convention, a lexical 
convention such as a delimiter by which
the use of a variable in a query expression 
shall be recognized.</p></li>
<li><p>The abstract pattern parameter convention,
a lexical convention such as a delimiter
by which the parameters of
abstract patterns  inside query expressions
shall be recognized.</p></li>
</ul>



</clause>

<clause id="order">
<title>Order and side-effects</title>
<p>The order in which elements
are validated is implementation-dependent,
without altering the validity of the instance.</p>
<p>The order in which patterns are used is
implementation-dependent,
without altering the validity of the instance.</p>
<p>The order in which elaborated rule contexts
are matched is implementation-dependent,
without altering the validity of the instance.
</p>
<p>The order in which assertions are tested
is implementation-dependent,
without altering the validity of the instance.</p>

<p>The only elements for which order is significant
are the sch:rule and sch:let elements.
</p>
<p>The sch:rule elements act as an if-then-else
chain within each pattern. An implementation may 
make order non-significant 
by converting rule contexts to elaborated rule contexts. 
An elaborated rule context consists of the negated union
of all the lexically previous rule contexts in the same
pattern, intersected with the current rule context.
</p>
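<p>As an informal illustration of elaborated rule contexts, the
following sketch shows a pattern whose second rule context has been
elaborated so that the two rules can be matched in any order. The
element names are hypothetical, and the XSLT query language binding is
assumed.</p>
<pre>
&lt;!-- Original rule contexts, in lexical order:
     1. appendix/section
     2. section
-->
&lt;sch:pattern id="sections">
  &lt;sch:rule context="appendix/section">
    &lt;sch:assert test="@label">An appendix section should have a label.&lt;/sch:assert>
  &lt;/sch:rule>
  &lt;!-- Elaborated: "section", excluding what rule 1 already matches. -->
  &lt;sch:rule context="section[not(parent::appendix)]">
    &lt;sch:assert test="title">A section should have a title.&lt;/sch:assert>
  &lt;/sch:rule>
&lt;/sch:pattern>
</pre>
<p>No subject can match both contexts, so the if-then-else behaviour
is preserved without relying on the order of the rules.</p>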
<p>An sch:let element may use lexically previous variables
within the same rule or global variables.</p>
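<p>As an informal illustration, the following rule declares two
variables, the second of which uses the first. It is a sketch only:
the <code>name</code> and <code>value</code> attributes of
<code>sch:let</code> are assumptions, since the element syntax is not
yet defined in <this/>; the element names in the queries are
hypothetical; and the XSLT query language binding, with its
<code>$</code> delimiter, is assumed.</p>
<pre>
&lt;sch:rule context="order">
  &lt;sch:let name="lines" value="count(line)"/>
  &lt;!-- May use $lines because that variable is lexically previous. -->
  &lt;sch:let name="limit" value="$lines * 2"/>
  &lt;sch:assert test="$limit >= count(note)">An order should have at most
    two notes per line.&lt;/sch:assert>
&lt;/sch:rule>
</pre>
<p>Reversing the two <code>sch:let</code> elements would make the
reference to <code>$lines</code> a use of a variable that is not
lexically previous.</p>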

<note><p>A wide variety of implementation
strategies are therefore possible.
</p></note>

<p>All queries shall act as pure functions. 
Queries shall not alter the instance in any
way visible to other queries. 
<This/> does not specify any augmentation 
of the instance being validated as an outcome of validation.</p>

<p>The only element which has a side-effect is key,
which may provide extra index information for other
queries.</p>


</clause>

</clause>

<clause id="conformance">
<title>Conformance</title>
<clause id="full-conformance">
<title>Full Conformance</title>

<p>A full-conformance Schematron validator shall be able to determine for
any XML document whether it is a correct schema.</p>
<ul>
<li><p>A correct schema conforms to the constraints
of the normative RELAX NG schema of <this/>.</p></li>
<li><p>A correct schema conforms to the constraints
of the normative Schematron schema of <this/>.</p></li>
<li><p>A correct schema's attributes conform to the grammars
specified by the query language binding in use.</p></li>
</ul>


<p>A full-conformance Schematron validator shall be able to determine 
for any XML document and for any good schema whether 
the document is valid with respect to the schema. 
</p>

<note><p>It is not a requirement of <this />
that a full-conformance Schematron validator 
shall be able to determine whether the validation will 
terminate or whether the queries are feasible against
some other schema for the instance.
The ability to determine these
depends on the query language used.
Where the query language allows incorrectness to be established,
implementations are encouraged to report this information
as part of validation.</p></note>
</clause>

<clause id="simple-conformance">
<title>Simple Conformance</title>

<p>A simple-conformance Schematron validator shall be able 
to report for any XML document whether it may not be a valid 
Schematron schema.</p>
<ul>
<li><p>A valid schema conforms to the constraints
of the normative RELAX NG schema of <this/>.</p></li>
<li><p>A valid schema conforms to the constraints
of the normative Schematron schema of <this/>.</p></li>
<li><p>A valid schema's attributes conform to the grammars
specified by the query language binding in use.</p></li>

</ul>


<p>A simple-conformance Schematron validator shall 
be able to determine for any XML
document and for any good schema whether the document is
valid with respect to the schema. </p>

<note><p>It is not a requirement of <this />
that a simple-conformance Schematron validator 
shall be able to determine whether validation will 
terminate or whether the queries are feasible against
some other schema for the instance.
The ability to determine these
depends on the query language used.
Where the query language allows incorrectness to be established,
implementations are encouraged to report this information
as part of validation.</p></note>
</clause>
</clause>
<annex normative="true">
<title>RELAX NG schema for Schematron</title>

<p>A correct Schematron schema shall be valid with respect to the
following RELAX NG schema.</p>
<p><strong>RELAX NG schema for Schematron goes here</strong></p>
<!--
<rngref src="iso-schematron.rng"/>
-->
</annex>

<annex normative="true">
<title>Schematron schema for Schematron</title>

<p>A correct Schematron schema shall be valid with respect to the
following Schematron schema. This schema does not specify
constraints which can be expressed by 
the RELAX NG schema for Schematron.</p>
<p><strong>Schematron Schema for Schematron goes here</strong></p>
<!--
<rngref src="iso-schematron.sch"/>
-->
</annex>

<annex normative="true">
<title>Schematron Query Language Binding for XSLT</title>
<p>A Schematron schema with no language binding or a language
binding with the value "xslt" shall use the following binding.</p>
<ul>
<li><p>The query language used is the extended version of
<xref to="xpath-rec"/> specified in <xref to="xslt-rec"/>.
Consequently, the data model used is the data model of those
specifications.</p></li>
<li><p>The rule context is interpreted according to the production
XXX of XSLT. The rule context may be elements, attributes, 
comments and processing instructions.</p></li>
<li><p>The assertion test is interpreted according to production
XXX of XSLT.</p></li>
<li><p>The name query is interpreted according to production 
XXX of XSLT.</p></li>
<li><p>The value-of query is interpreted according to production
XXX of XSLT.</p></li>
<li><p>The key path is interpreted according to production
XXX of XSLT. A Schematron key is equivalent to an XSLT
key.</p></li>
<li><p>The let value is interpreted according to production
XXX of XSLT. </p></li>
<li><p>A Schematron let expression is treated 
as an XSLT variable. The XSLT $ delimiter signifies the use
of a variable in an assertion test, name query or value-of
query.</p></li>
<li><p>The notation for signifying an abstract pattern parameter is to prefix
the token with the <strong>????</strong>. 
This is a character not found in URLs or XPaths.</p></li>
</ul>



</annex>

<bibliography>

<referenced-document id="schematron">
<abbrev>Schematron</abbrev>
<title>Resource Description for Schematron (web page)</title>
<field>Rick Jelliffe</field>
<field>Computing Centre, Academia Sinica, Taipei</field>
<url>http://www.ascc.net/xml/schematron</url>
</referenced-document>

</bibliography>

</document>
