In order to start processing files using Schematron, you're going to need a few files on your system. You will find the XSLT files on Schematron.com. The ones you want are:
The remaining files you can type in, adding to them as needed. Our file needing validation starts off very simply. It has no DTD or Schema (we have Schematron!). It represents a book. Quite boring and very simple. Example 2.1, “File input.xml, the simplest input document” shows this file. We can add complexity when we need it to show Schematron features. So let's look for the constraints we want to apply.
Example 2.1. File input.xml, the simplest input document
<?xml version="1.0" encoding="utf-8" ?>
<doc>
  <chapter id="c1">
    <title>chapter title</title>
    <para>Chapter content</para>
  </chapter>
  <chapter id="c2">
    <title>chapter 2 title</title>
    <para>Content</para>
  </chapter>
  <chapter id="c3">
    <title>Title</title>
    <para>Chapter 3 content</para>
  </chapter>
</doc>
Now for the constraints. What rules do we want to apply? As you may imagine, I'm going to pick some that may be odd, primarily to demonstrate the functionality of Schematron. I'll try and keep them reasonably sensible.
The first rule is to check that each chapter has a title. Before defining that rule in the Schematron file we need to know something of the outline Schematron file that will be used in all the examples.
Since this file is testing the file input.xml, I'm going to name it input.sch. Example 2.2, “File input.sch, an empty Schematron file.” shows this file. I'm using .sch as the filename extension simply as a reminder that it is a Schematron file.
Example 2.2. File input.sch, an empty Schematron file.
<?xml version="1.0" encoding="utf-8"?>
<iso:schema xmlns="http://purl.oclc.org/dsdl/schematron"
            xmlns:iso="http://purl.oclc.org/dsdl/schematron"
            xmlns:dp="http://www.dpawson.co.uk/ns#"
            queryBinding='xslt2'
            schemaVersion='ISO19757-3'>
  <iso:title>Test ISO schematron file. Introduction mode</iso:title>
  <iso:ns prefix='dp' uri='http://www.dpawson.co.uk/ns#'/>
  <!-- Your constraints go here -->
</iso:schema>
The general heading for a Schematron file. Note the namespaces in use. Add them as normal for any XML file.
The required constraints go in the body of the file.
Do you think Schematron knows all about your namespaces? No. For each one specific to you, that you need, add it here as an ns element.
Not much to look at. The document element is in the Schematron namespace. The Schematron namespace http://purl.oclc.org/dsdl/schematron is associated with the prefix iso, as it is in all these examples. Previous versions of Schematron used the sch prefix. You can choose whatever prefix you want. Just make sure you associate it with the right namespace.
The queryBinding attribute specifies which version of XSLT we are going to use to process the rules. The title is used in the final output as … surprisingly, a document title! The only other content is a foreign namespace definition. I've included it here simply to show how it's done. We'll use it later. If your input document is namespaced, you'll need to add the namespace in two places: as a declaration in the document element, and as an ns element. That's it!
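As a sketch of those two places, assume an input document whose elements live in a namespace of your own. The bk prefix and the http://example.org/book URI below are hypothetical, chosen only for illustration:

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Sketch only: the bk prefix and its namespace URI are hypothetical. -->
<iso:schema xmlns:iso="http://purl.oclc.org/dsdl/schematron"
            xmlns:bk="http://example.org/book"
            queryBinding="xslt2">
  <!-- Place 1 is the xmlns:bk declaration above, as for any XML file. -->
  <!-- Place 2 is the ns element, which tells Schematron about the prefix. -->
  <iso:ns prefix="bk" uri="http://example.org/book"/>
  <!-- Constraints can then use the prefix, e.g. context="bk:chapter" -->
</iso:schema>
```

The prefix used in the ns element should be the one you intend to use in your context and test expressions.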
Now to add the constraints at the place marked in Example 2.2, “File input.sch, an empty Schematron file.”
Example 2.3. Check for a chapter title
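A minimal sketch of such a check, consistent with the description that follows. It is meant to sit inside the iso:schema element of Example 2.2, which is where the iso prefix is declared. The exact wording of the message is an assumption:

```xml
<!-- Sketch only: the message wording is an assumption. -->
<iso:pattern>
  <!-- The rule applies wherever the context is a chapter element -->
  <iso:rule context="chapter">
    <!-- The message is output only when the test is false -->
    <iso:assert test="title">A chapter should have a title</iso:assert>
  </iso:rule>
</iso:pattern>
```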
Starting with the pattern element. This is basically a grouping wrapper. For example, we may choose to group all related chapter-level checks within one pattern. Within a pattern element there is a rule element. There could be many rules within a single pattern. It is good practice to restrict the number of rules such that the group is coherent and can be quickly understood.
The rule element is at the heart of Schematron. This expresses a rule that you want to run against the input document. Two points to note here. Firstly the context attribute. This may be viewed in the same way as the match attribute on the xsl:template element in an XSLT stylesheet. The key point is that this specifies the context (used in just the same way as a context is used in XSLT) in which the rules will be applied. So for this case, the rule will be applied where the context is the chapter element in our input.xml document. Again, note that the rule element has just one child, an assert element. As before, it may have many child elements, and the context will remain that specified by the rule element.
A word of caution. Some rules are said to be abstract. This is defined to be the case when the attribute abstract has a value of true. If a rule has a context attribute, then it cannot have the abstract value set to true. More on this later, see Chapter 9, The extends element. The grammar for the rule element is, using pseudo DTD syntax:
attributes: abstract[true], id
children: Let*, (Assert | Report | extends)+
attributes: abstract[false]?, context, id?
children: Let*, (Assert | Report | extends)+
So a rule is either abstract or has a context. The latter use is the more common one.
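The two forms might look like this side by side. This is a sketch, assuming the ISO Schematron extends element with its rule attribute; the id value needs-title and the message text are made up for illustration:

```xml
<iso:pattern>
  <!-- Abstract rule: no context, but an id so that it can be referenced.
       The id "needs-title" is hypothetical. -->
  <iso:rule abstract="true" id="needs-title">
    <iso:assert test="title">This element should have a title</iso:assert>
  </iso:rule>
  <!-- Concrete rule: has a context and pulls in the abstract rule's checks -->
  <iso:rule context="chapter">
    <iso:extends rule="needs-title"/>
  </iso:rule>
</iso:pattern>
```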
We need a clear understanding of the assert element, so please slow down a little reading this paragraph! Two aspects are key. Firstly the test attribute, which acts in just the same way as the test attribute on the xsl:when element in XSLT. It's a boolean test returning either true or false. It is executed within the context of the rule (the chapter element in this case). So if we look at the input document for which we are writing the rules, for each chapter element, we are making an assertion that the chapter element has a title element as a child. That can only be either true or false. A chapter has a title element as a direct child, or it doesn't. That's the syntax. Now the semantics.
This is an assert statement. See section ¶ 5.4.2 in 2. An assert statement is (sort of) negative. What I mean by that is that if the test passes, the assertion is said to succeed. The text content of the assert statement (A chapter should.....) is the message you want to be output if the assertion fails. What this means in fact is, if the test passes the associated message is not output. If the test fails, the message will be output! Now re-read this paragraph. I know it made my head hurt. This is why it matches our test for a title child element. If the title is there, no message is output. If the title is missing, then the test fails and the message is output to the report file.
It becomes easier to accept when you see its inverse, the report element.
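To make the contrast concrete, here is a sketch placing an assert next to a report in the same rule. The report test and both messages are invented for illustration:

```xml
<iso:pattern>
  <iso:rule context="chapter">
    <!-- assert: the message is output when the test is false -->
    <iso:assert test="title">A chapter should have a title</iso:assert>
    <!-- report: the message is output when the test is true -->
    <iso:report test="count(para) &gt; 1">This chapter has more than one paragraph</iso:report>
  </iso:rule>
</iso:pattern>
```

Against input.xml as given, neither message should be output: every chapter has a title, and no chapter has more than one para.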
To recap. We want to check if each chapter element has a title child. The context attribute on the rule element is set to chapter. The assert statement uses the test attribute to test that such a child element exists. If the test fails, then the text contained within the assert element is output in the report! That completes the description of the first element. A little tedious, but I hope worthwhile.
Before moving on to other elements, we should check that it all works in practice. If all the tests pass, this is really quite boring. It works on the principle that no news is good news, so that a test which passes does nothing. So the assert should not report anything, since our input file complies with our single constraint!
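To see the assertion fire, one could instead validate a deliberately broken input. The file below is a hypothetical variant of input.xml in which chapter c2 has lost its title:

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Hypothetical variant of input.xml: chapter c2 has no title child. -->
<doc>
  <chapter id="c1">
    <title>chapter title</title>
    <para>Chapter content</para>
  </chapter>
  <chapter id="c2">
    <para>Content</para>
  </chapter>
</doc>
```

For chapter c1 the title test is true, so nothing is output. For chapter c2 the test is false, so the assertion message should appear in the report.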