XML Schema

 

What is XML Schema?

 

XML Schema is a method for “defining the structure, content and semantics of XML documents”. A schema is written using standard XML rules: we use the same syntax to write an XML Schema that we use to write any well-formed XML document. The schema describes the content and structure of an XML document, and it also determines whether that document is valid. When we use schemas, we work with a two-document model: one is the instance document (the well-formed XML document that carries the data), and the other is the schema document. In simple words, a schema describes a class of XML documents, and the data document is an instance of that class.

 

Why XML Schemas?

 

An XML document is said to be well-formed when it follows the basic syntax rules, for example that every opening tag is closed by a corresponding closing tag. A well-formed document is not automatically valid, though: its validity can only be checked once there is a corresponding XML schema to check it against. Once you have a schema, your XML document is checked not only for the basic rules and syntax required of any XML document, but is also compared with the corresponding XML schema to check its validity.
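
The well-formedness half of this check can be sketched with Python's standard library. This is only an illustration: the helper name `is_well_formed` and the two sample strings are assumptions, and the sketch does not validate against a schema, which needs a schema-aware validator that the standard library does not provide.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical samples: the second one never closes <body>.
GOOD = "<article><body>Contents of the story</body></article>"
BAD = "<article><body>Contents of the story</article>"

print(is_well_formed(GOOD))
print(is_well_formed(BAD))
```

A document that passes this test is well-formed, but it may still break the rules of its schema.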

 

XML Schemas support namespaces, and they also support inheritance and sub-classing of types. Another major advantage of schemas is the concept of data types, which specifies what kind of data each element may contain. If your schema says that a particular element may contain only a numeric value, then the validator checks that only numeric data appears in that element of the XML document.
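
To give a feel for what such a data type check does, the sketch below tests whether the text of an element looks like an integer. The function name `looks_like_integer` and the sample values are assumptions made for this illustration, and the regular expression is only a simplified stand-in for the check a real schema validator performs on a numeric type.

```python
import re

# Optional sign followed by one or more ASCII digits.
_INTEGER = re.compile(r"[+-]?[0-9]+")


def looks_like_integer(text):
    """Return True if text, ignoring surrounding whitespace, is an integer literal."""
    return _INTEGER.fullmatch(text.strip(" \t\r\n")) is not None


for sample in ["54", " -7 ", "D-54", "5.4", ""]:
    print(repr(sample), looks_like_integer(sample))
```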

 

Let's take a small example to understand XML schemas. First let's look at the XML document, and then let's go on to the corresponding XML Schema.

 

XML document

 

<?xml version="1.0"?>

<article>
        <headline section="business">
                <mainhead>Main headline goes here</mainhead>
                <subhead>Sub-headline goes here</subhead>
        </headline>
        <body>Contents of the story</body>
        <stats>
                <bureauID>D-54</bureauID>
        </stats>
</article>

 

Now let's take a moment to understand this XML document. As you can see, <article> is the root (document) element. It has the child elements <headline>, <body> and <stats>, and <headline> and <stats> in turn have child elements of their own. Elements that have child elements, attributes, or both are considered complexType. The <body> element contains only data, which makes it a simpleType: an element is a simpleType when it has neither child elements nor attributes.
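
The rule in this paragraph can be tried out on a copy of the article document. The sketch below is a rough illustration: the `classify` helper is hypothetical, and it applies only the rule of thumb stated here (children or attributes mean complexType) rather than reading any schema.

```python
import xml.etree.ElementTree as ET

# Copy of the article document, typed with straight quotes.
ARTICLE = """<?xml version="1.0"?>
<article>
    <headline section="business">
        <mainhead>Main headline goes here</mainhead>
        <subhead>Sub-headline goes here</subhead>
    </headline>
    <body>Contents of the story</body>
    <stats>
        <bureauID>D-54</bureauID>
    </stats>
</article>"""


def classify(element):
    """Return 'complexType' if the element has children or attributes, else 'simpleType'."""
    has_children = len(element) > 0
    has_attributes = bool(element.attrib)
    return "complexType" if has_children or has_attributes else "simpleType"


root = ET.fromstring(ARTICLE)
# Walk the whole tree, root included, and record the kind of each element.
kinds = {el.tag: classify(el) for el in root.iter()}
for tag, kind in kinds.items():
    print(f"{tag}: {kind}")
```

Here <headline> counts as complex for two reasons (children and the section attribute), while <mainhead>, <subhead>, <body> and <bureauID> carry only text.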

 

Now let's take a look at the XML schema for the above XML document.

 

<schema targetNamespace="http://www.Mycompany.com/news"
        xmlns:news="http://www.Mycompany.com/news"
        xmlns:xsd="http://www.w3.org/1999/XMLSchema">

<element name="article" type="news:articleType"/>

<complexType name="articleType">
        <element name="headline" type="news:headlineType"/>
        <element name="body" type="xsd:string"/>
        <element name="stats" type="news:statsType"/>
</complexType>

<complexType name="headlineType">
        <element name="mainhead" type="xsd:string"/>
        <element name="subhead" type="xsd:string"/>
        <attribute name="section" type="xsd:string"/>
</complexType>

<complexType name="statsType">
        <element name="bureauID" type="news:BureauIDType"/>
</complexType>

<simpleType name="BureauIDType">
        <pattern value="[A-Z]-\d{2}"/>
</simpleType>

</schema>

 

The schema above is itself a well-formed XML document. It relies on namespaces: names with the prefix xsd, such as xsd:string, are identified as belonging to the XML Schema namespace by the declaration:

xmlns:xsd="http://www.w3.org/1999/XMLSchema"

and names with the prefix news, such as news:articleType, belong to the author's own vocabulary, declared by:

xmlns:news="http://www.Mycompany.com/news"

The 1999 URI comes from an early working draft of XML Schema; the final W3C Recommendation uses http://www.w3.org/2001/XMLSchema.
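
How a prefix leads to a namespace can be sketched in a few lines of Python. The snippet reads the namespace declarations from a copy of the schema's opening tag and then resolves two prefixed names from the schema. The helper names `prefix_map` and `resolve` are assumptions made for this illustration, and `resolve` assumes every name it is given carries a prefix.

```python
import io
import xml.etree.ElementTree as ET

# Copy of the schema's start tag, typed with straight quotes and closed immediately.
SCHEMA_START = """<schema targetNamespace="http://www.Mycompany.com/news"
        xmlns:news="http://www.Mycompany.com/news"
        xmlns:xsd="http://www.w3.org/1999/XMLSchema">
</schema>"""


def prefix_map(xml_text):
    """Collect the prefix -> namespace URI bindings declared in the document."""
    bindings = {}
    source = io.BytesIO(xml_text.encode("utf-8"))
    # "start-ns" events report each namespace declaration as a (prefix, uri) pair.
    for _event, (prefix, uri) in ET.iterparse(source, events=("start-ns",)):
        bindings[prefix] = uri
    return bindings


def resolve(qname, bindings):
    """Split 'prefix:local' and return (namespace URI, local name)."""
    prefix, local = qname.split(":", 1)
    return bindings[prefix], local


bindings = prefix_map(SCHEMA_START)
print(resolve("xsd:string", bindings))
print(resolve("news:articleType", bindings))
```

The prefix itself is only a local shorthand; what identifies the vocabulary is the URI it is bound to.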

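In the schema, elements are first declared and then defined: the declaration of article names the element and gives its type as news:articleType, and the complexType named articleType then defines what that type contains.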
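
Because the schema is itself XML, the link from a declaration to its definition can be followed with an ordinary XML parser. The sketch below does so on a copy of the news schema, typed with straight quotes and with stats pointing at statsType (an assumption about the author's intent). The helper `defining_node` is hypothetical, and applying the pattern value with Python's `re` module is also an assumption: schema regular expressions and Python's are different dialects that happen to agree on this simple pattern.

```python
import re
import xml.etree.ElementTree as ET

SCHEMA = r"""<schema targetNamespace="http://www.Mycompany.com/news"
        xmlns:news="http://www.Mycompany.com/news"
        xmlns:xsd="http://www.w3.org/1999/XMLSchema">
    <element name="article" type="news:articleType"/>
    <complexType name="articleType">
        <element name="headline" type="news:headlineType"/>
        <element name="body" type="xsd:string"/>
        <element name="stats" type="news:statsType"/>
    </complexType>
    <complexType name="headlineType">
        <element name="mainhead" type="xsd:string"/>
        <element name="subhead" type="xsd:string"/>
        <attribute name="section" type="xsd:string"/>
    </complexType>
    <complexType name="statsType">
        <element name="bureauID" type="news:BureauIDType"/>
    </complexType>
    <simpleType name="BureauIDType">
        <pattern value="[A-Z]-\d{2}"/>
    </simpleType>
</schema>"""

root = ET.fromstring(SCHEMA)

# Definitions: every named complexType or simpleType directly under <schema>.
definitions = {
    node.get("name"): node
    for node in root
    if node.tag in ("complexType", "simpleType")
}

# Declarations: element name -> declared type, wherever the element appears.
declarations = {el.get("name"): el.get("type") for el in root.iter("element")}


def defining_node(element_name):
    """Return the node defining the element's type, or None for built-in xsd: types."""
    prefix, local = declarations[element_name].split(":", 1)
    return definitions.get(local) if prefix == "news" else None


print(defining_node("article").get("name"))
print(defining_node("body"))

# Follow bureauID to its simpleType and try the pattern on the sample value.
pattern = defining_node("bureauID").find("pattern").get("value")
print(re.fullmatch(pattern, "D-54") is not None)
```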