Syntax

Revision History
Revision 0.1 2007-03-07 Dave Pawson
Initial Issue

Table of Contents

Validation choices

This chapter addresses various (pretty minor) syntax issues. If you want the whole story, see the full Relax NG schema, or nvdl.html, which is a prettier version of the same information.

I stated earlier that I will only use Relax NG syntax for the schemas and sub-schemas. Using other schema languages is not a problem, though, as Example 6.1 shows.

Example 6.1. Alternate schema usage

<?xml version="1.0" encoding="utf-8"?>
<!-- syntax.ex1.nvdl -->
<rules xmlns="http://purl.oclc.org/dsdl/nvdl/ns/structure/1.0"
       startMode="doc">

  <mode name="doc">
    <namespace ns="http://document">
      <!-- (1) -->
      <validate
          message="A variant on the schema specification"
          schemaType="application/x-rnc"> <!-- (2) -->
        <schema>routing.1.rnc</schema>
      </validate>
    </namespace>
  </mode>

  <mode name="rest">
    <namespace ns="http://head">
      <validate schema="routing.1.xsd"/>
    </namespace>

    <namespace ns="http://body">
      <validate schemaType="application/xml-dtd"
                schema="routing.1.dtd"/> <!-- (3) -->
    </namespace>
  </mode>
</rules>
   
  
1

Rather than using the schema attribute this example uses the schema element. The message attribute enables helpful comments in the script.

2

When validating against a schema written in the Relax NG compact syntax, the schemaType must be specified.

3

We can even use DTDs.


Take due care, though. The standard says, “If v contains a schema element as a child, its content is used as the schema. When the content is a string and v has the schemaType attribute, its value shall be a MIME mediatype (see IETF RFC2046) and be used for determining the schema language. When the content is a foreign element, its namespace is used for determining the schema language.” Put simply, if it's a modern XML-based schema language, then you should be OK. Otherwise, specify the MIME media type as per IETF RFC 2046.
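The last case in that quotation, a foreign element as the content of schema, might look like the sketch below. The inline grammar and its doc element are invented for illustration; only the NVDL and Relax NG namespace URIs are real. No schemaType is given, on the assumption that the processor works out the schema language from the namespace of the grammar element.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Sketch only: an inline Relax NG schema as the content of schema -->
<rules xmlns="http://purl.oclc.org/dsdl/nvdl/ns/structure/1.0">
  <namespace ns="http://document">
    <validate>
      <schema>
        <!-- Foreign element: its namespace identifies the schema language -->
        <grammar xmlns="http://relaxng.org/ns/structure/1.0">
          <start>
            <!-- Hypothetical document element, any text content allowed -->
            <element name="doc" ns="http://document">
              <text/>
            </element>
          </start>
        </grammar>
      </schema>
    </validate>
  </namespace>
</rules>
```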

Validation choices

The user has many (some might say too many) choices when validating an XML instance:

  1. Which instance parts to select

  2. Which schema to use (or write)

  3. Whether to use a grammar-based validator or a rules-based one such as Schematron

  4. When or whether some sections can be ignored

  5. Whether to be lax or very restrictive

All these aspects could be considered when designing the validation scenario.

How restrictive do you want to be? Remember that saying, be forgiving in what you receive and strict in what you send? How forgiving can you afford to be? It's your choice.

Some of the many options are shown here.

Use a rule to trigger an action. Rules have various actions (reject, validate, allow, attach, attachPlaceholder, cancelNestedActions, unwrap) which may be used selectively to determine the validation pattern wanted. Since anyNamespace matches any section, it should be viewed as a default action.
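As a rough sketch of anyNamespace acting as the default, the script below validates one namespace and rejects everything else. The file name document.rng is hypothetical, and the rules sit directly in the root element rather than in a mode.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Sketch only: one explicit rule plus a catch-all default -->
<rules xmlns="http://purl.oclc.org/dsdl/nvdl/ns/structure/1.0">
  <!-- Sections in this namespace are validated -->
  <namespace ns="http://document">
    <validate schema="document.rng"/>
  </namespace>
  <!-- Every other section falls through to this rule -->
  <anyNamespace>
    <reject/>
  </anyNamespace>
</rules>
```

Swapping reject for allow in the anyNamespace rule would turn the same script from strict into lax.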

Rules are either within the root element or within nested mode elements.

Use modes for your initial selection. Within any mode you can dispatch on each of many different namespaces, so modes make a script's validation more selective. Modes can be sequenced either by using the useMode attribute to name another mode, or by nesting a mode as a child element. A nested mode cannot be 'reached' from other modes (in the same way that private methods are not accessible in an OO system). A nested mode is either the first child of a mode, in which case the two are candidates for merging if the namespaces are the same, or the child of a validate element; nested modes may themselves contain further nested modes. Example 6.2 shows this.

Example 6.2. Nested modes

<mode name="start">
  <namespace ns="http://1">
    <validate schema="1.rng"> <!-- (1) -->
      <mode>
        <namespace ns="http://1.1">
          <validate schema="1.1.rng"
                    useMode="next"/> <!-- (2) -->
        </namespace>
      </mode>
    </validate>
  </namespace>
</mode>

<mode name="next"> <!-- (3) -->
  <namespace ns="http://2">
    <validate schema="2.rng"/>
  </namespace>
</mode>

1

Rule is: trigger on http://1 namespace

2

Then change mode to the nested mode (must be in namespace http://1.1) and validate that using the schema 1.1.rng, then

3

Change mode and look for this rule, validating in namespace http://2 with schema 2.rng


Note

Remember that the nested mode element is an alternative to using the useMode attribute on the parent validate element.
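For comparison, here is a sketch of Example 6.2 rewritten with useMode on the parent validate element in place of the nested mode. The mode name inner is my own invention, and, unlike a nested mode, it can be reached from any other mode in the script.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Sketch only: Example 6.2 using named modes instead of a nested mode -->
<rules xmlns="http://purl.oclc.org/dsdl/nvdl/ns/structure/1.0"
       startMode="start">

  <mode name="start">
    <namespace ns="http://1">
      <!-- useMode replaces the nested mode child -->
      <validate schema="1.rng" useMode="inner"/>
    </namespace>
  </mode>

  <!-- Hypothetical name for what was the anonymous nested mode -->
  <mode name="inner">
    <namespace ns="http://1.1">
      <validate schema="1.1.rng" useMode="next"/>
    </namespace>
  </mode>

  <mode name="next">
    <namespace ns="http://2">
      <validate schema="2.rng"/>
    </namespace>
  </mode>
</rules>
```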

Equally, you can call up more than one validator on any one namespace, sequencing through each one on the same section: basic grammar checks, then business rules, then finance rules, and so on. You will need to understand the strengths of your validators to get the best out of them.
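A minimal sketch of several validators on one section follows. The namespace and all three file names are hypothetical, and I am assuming, not promising, that your NVDL processor applies the validate actions in the order written.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Sketch only: three validators applied to the same section -->
<rules xmlns="http://purl.oclc.org/dsdl/nvdl/ns/structure/1.0">
  <namespace ns="http://example.org/invoice">
    <!-- Basic grammar check (Relax NG) -->
    <validate schema="invoice.rng"/>
    <!-- Business rules (Schematron) -->
    <validate schema="invoice-business.sch"/>
    <!-- Finance rules (Schematron) -->
    <validate schema="invoice-finance.sch"/>
  </namespace>
</rules>
```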

You can <reject> or <allow> an entire namespace section. This could introduce very lax or very restrictive processing.
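Both actions can sit side by side in one script. In this sketch the two example.org namespaces are made up: sections in the first are let through without any schema, and sections in the second always cause an error.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Sketch only: lax for one namespace, restrictive for another -->
<rules xmlns="http://purl.oclc.org/dsdl/nvdl/ns/structure/1.0">
  <!-- Lax: anything in this namespace passes unchecked -->
  <namespace ns="http://example.org/annotations">
    <allow/>
  </namespace>
  <!-- Restrictive: anything in this namespace is an error -->
  <namespace ns="http://example.org/deprecated">
    <reject/>
  </namespace>
</rules>
```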

When selecting sections, you can use explicit namespaces, or you could use wildcards within the namespace name, for example accepting all namespaces from W3C. Use the wildcard character (defaults to *) in the namespace ns attribute.
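The W3C case might be sketched like this. The wildCard attribute is written out here only to make the default visible; leaving it off should give the same result.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Sketch only: accept every namespace that starts with the W3C prefix -->
<rules xmlns="http://purl.oclc.org/dsdl/nvdl/ns/structure/1.0">
  <!-- Matches, for example, http://www.w3.org/1999/xhtml -->
  <namespace ns="http://www.w3.org/*" wildCard="*">
    <allow/>
  </namespace>
</rules>
```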

Remember that rules apply to elements by default. Use the match attribute on namespace with a value of attributes to select attributes. Note that the match attribute is also available on the anyNamespace element.
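As a sketch of attribute matching, the rule below rejects any attribute from a made-up internal namespace wherever it turns up. Elements in that namespace are not touched by this rule, since it only matches attributes.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Sketch only: a rule that selects attributes rather than elements -->
<rules xmlns="http://purl.oclc.org/dsdl/nvdl/ns/structure/1.0">
  <!-- Hypothetical namespace; match switches the rule to attributes -->
  <namespace ns="http://example.org/internal" match="attributes">
    <reject/>
  </namespace>
</rules>
```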

You can tailor the validator action by passing parameters to the validator. Use the option element within the validate element.
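Passing an option might look like the sketch below. The option name is a placeholder URI that I have made up, as are the namespace, the schema file and the argument; the options actually understood depend on the validator, so check your implementation's documentation for the real names.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Sketch only: passing a parameter to the validator -->
<rules xmlns="http://purl.oclc.org/dsdl/nvdl/ns/structure/1.0">
  <namespace ns="http://example.org/invoice">
    <validate schema="invoice-business.sch">
      <!-- Hypothetical option name and argument, not a real validator option -->
      <option name="http://example.org/validator/option/phase"
              arg="quick-check"/>
    </validate>
  </namespace>
</rules>
```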