Modifying an XML Schema with XSLT

Recently I ran into an issue with an XSD that I had generated with our modelling tool. The XSD did not match what I wanted, so I had to modify it. I could easily do this by hand, but then I would have to redo it every time I regenerated the XSD after a model change. So I decided to use XSLT for the modifications. I used XSLT quite extensively on previous projects, but those were about 8 years ago, and it made me realize how fast you lose knowledge when you don't use it anymore.
Anyway, to avoid this situation for the next few years, I decided to post about the basics of XSLT that I used here, so I always have a reference to come back to. The generated XSD was lacking several things:

  • no use of namespaces (both import and prefixes)
  • names of custom defined types didn’t end with the suffix ‘Type’
  • some complex types were missing

There were several more issues, but the idea stays the same.
Here is an example XSD as the tool could generate it. Note that it refers to the type myPostalCode without defining it:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
	<xs:complexType name="Address">
		<xs:sequence>
			<xs:element name="street" type="xs:string"/>
			<xs:element name="nr" type="xs:integer"/>
			<xs:element name="postalCode" type="myPostalCode"/>
		</xs:sequence>
	</xs:complexType>
	<xs:complexType name="Customer">
		<xs:sequence>
			<xs:element name="firstName" type="xs:string" minOccurs="0"/>
			<xs:element name="lastName" type="xs:string"/>
			<xs:element name="address" type="Address"/>
		</xs:sequence>
	</xs:complexType>
</xs:schema>

Here is what I would like the XSD to look like:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:tns="htpp://www.pascalalma.net/customer" xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="htpp://www.pascalalma.net/customer">
	<xs:element name="customer" type="tns:CustomerType"/>
	<xs:complexType name="AddressType">
		<xs:sequence>
			<xs:element name="street" type="xs:string"/>
			<xs:element name="nr" type="xs:integer"/>
			<xs:element name="postalCode" type="tns:myPostalCodeType"/>
		</xs:sequence>
	</xs:complexType>
	<xs:complexType name="CustomerType">
		<xs:sequence>
			<xs:element name="firstName" type="xs:string" minOccurs="0"/>
			<xs:element name="lastName" type="xs:string"/>
			<xs:element name="address" type="tns:AddressType"/>
		</xs:sequence>
	</xs:complexType>
	<xs:complexType name="myPostalCodeType">
		<xs:sequence>
			<xs:element name="firstPart">
				<xs:simpleType>
					<xs:restriction base="xs:string">
						<xs:length value="4"/>
					</xs:restriction>
				</xs:simpleType>
			</xs:element>
			<xs:element name="secondPart">
				<xs:simpleType>
					<xs:restriction base="xs:string">
						<xs:length value="2"/>
					</xs:restriction>
				</xs:simpleType>
			</xs:element>
		</xs:sequence>
	</xs:complexType>
</xs:schema>
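
To make the effect of the target namespace concrete, here is a sketch of an instance document that this schema should accept. The values are made up for illustration. Because the schema does not set elementFormDefault, it defaults to unqualified, so only the top-level customer element is in the target namespace and the nested elements stay unqualified. The namespace URI is copied exactly as it is spelled in the schema, since namespace names are compared as plain strings.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative instance only; all values are invented for this example -->
<tns:customer xmlns:tns="htpp://www.pascalalma.net/customer">
	<!-- firstName is optional (minOccurs="0") -->
	<firstName>Jan</firstName>
	<lastName>Jansen</lastName>
	<address>
		<street>Main Street</street>
		<!-- nr is an xs:integer -->
		<nr>12</nr>
		<postalCode>
			<!-- firstPart must be exactly 4 characters, secondPart exactly 2 -->
			<firstPart>1234</firstPart>
			<secondPart>AB</secondPart>
		</postalCode>
	</address>
</tns:customer>
```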

To get to this last XSD I used the following XSLT (see the inline comments for explanation):

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema">
	<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
	<!-- first copy the root and apply templates-->
	<xsl:template match="/">
		<xsl:copy>
			<!-- copy the attributes (if any) of the root element -->
			<xsl:copy-of select="@*"/>
			<xsl:apply-templates/>
		</xsl:copy>
	</xsl:template>
	<!-- match any other element and copy it with the attributes -->
	<xsl:template match="*">
		<xsl:copy>
			<xsl:copy-of select="@*"/>
			<xsl:apply-templates/>
		</xsl:copy>
	</xsl:template>
	<!-- Add the necessary namespaces to the root of the schema -->
	<xsl:template match="xs:schema">
		<!-- Recreate the xs:schema element-->
		<xsl:element name="xs:schema">
			<!-- Copy the existing attributes in the original xs:schema element to this one -->
			<xsl:copy-of select="@*"/>
			<!-- add Namespace -->
			<xsl:namespace name="tns">htpp://www.pascalalma.net/customer</xsl:namespace>
			<xsl:attribute name="targetNamespace">htpp://www.pascalalma.net/customer</xsl:attribute>
			<!-- add toplevel element -->
			<xsl:element name="xs:element">
				<xsl:attribute name="name">customer</xsl:attribute>
				<xsl:attribute name="type">tns:CustomerType</xsl:attribute>
			</xsl:element>
			<!-- continue with matching all child elements of the xs:schema element -->
			<xsl:apply-templates/>
			<!-- call a custom template to add an extra complexType to the schema that couldn't be generated by the modelling tool-->
			<xsl:call-template name="addPostalCodeType"/>
		</xsl:element>
	</xsl:template>
	<!-- suffix all complexType names with 'Type' -->
	<xsl:template match="xs:complexType">
		<xsl:choose>
			<!-- Check if the type already has the suffix 'Type' -->
			<xsl:when test="substring(@name, (string-length(@name)-3)) = 'Type'">
				<!-- if so, just copy the existing attributes -->
				<xsl:copy>
					<xsl:copy-of select="@*"/>
					<xsl:apply-templates/>
				</xsl:copy>
			</xsl:when>
			<xsl:otherwise>
				<!-- if not, copy all attributes and overwrite the attribute 'name' so the suffix 'Type' is added -->
				<xsl:copy>
					<xsl:copy-of select="@*"/>
					<xsl:attribute name="name"><xsl:value-of select="@name"/>Type</xsl:attribute>
					<xsl:apply-templates/>
				</xsl:copy>
			</xsl:otherwise>
		</xsl:choose>
	</xsl:template>
	<!-- Prefix all types with 'tns:'. All default xs: types are untouched.    -->
	<xsl:template match="xs:element">
		<!-- Create a variable holding the local name of the element's type. The attribute is atomized with data()
			first, because an attribute node itself is never an instance of an atomic type -->
		<xsl:variable name="elementType">
			<xsl:choose>
				<xsl:when test="data(@type) instance of xs:string or data(@type) instance of xs:untypedAtomic">
					<xsl:choose>
						<xsl:when test="contains(@type,':')">
							<xsl:value-of select="substring-after(@type,':')"/>
						</xsl:when>
						<xsl:otherwise>
							<xsl:value-of select="@type"/>
						</xsl:otherwise>
					</xsl:choose>
				</xsl:when>
				<xsl:otherwise>
					<xsl:value-of select="@type"/>
				</xsl:otherwise>
			</xsl:choose>
		</xsl:variable>
		<!-- Create a variable with the namespace prefix of the attribute if any. -->
		<xsl:variable name="elementTypePrefix">
			<xsl:choose>
				<xsl:when test="data(@type) instance of xs:QName">
					<xsl:value-of select="prefix-from-QName(@type)"/>
				</xsl:when>
				<xsl:when test="data(@type) instance of xs:string">
					<xsl:value-of select="substring-before(@type,':')"/>
				</xsl:when>
				<xsl:otherwise>
					<xsl:value-of select="substring-before(@type,':')"/>
				</xsl:otherwise>
			</xsl:choose>
		</xsl:variable>
		<!-- Now the variables are filled we can use it to determine if we need to add a prefix -->
		<xsl:choose>
			<!-- if the type starts with xs:, or the element has no type attribute at all, copy the element unchanged -->
			<xsl:when test="$elementTypePrefix = 'xs' or not(@type)">
				<xsl:copy>
					<xsl:copy-of select="@*"/>
					<xsl:apply-templates/>
				</xsl:copy>
			</xsl:when>
			<xsl:otherwise>
				<xsl:copy>
					<xsl:copy-of select="@*"/>
					<!-- A few steps back we renamed the complexTypes so they end with 'Type'. Now we also have to
						add the suffix 'Type' in the places where these types are referred to -->
					<xsl:if test="substring($elementType, (string-length($elementType)-3)) != 'Type'">
						<xsl:attribute name="type">tns:<xsl:value-of select="@type"/>Type</xsl:attribute>
					</xsl:if>
					<xsl:if test="substring($elementType, (string-length($elementType)-3)) = 'Type'">
						<xsl:attribute name="type">tns:<xsl:value-of select="@type"/></xsl:attribute>
					</xsl:if>
					<xsl:apply-templates/>
				</xsl:copy>
			</xsl:otherwise>
		</xsl:choose>
	</xsl:template>
	<!-- The next template generates the complexType 'myPostalCodeType' when the template is called -->
	<xsl:template name="addPostalCodeType">
		<xsl:element name="xs:complexType">
			<xsl:attribute name="name">myPostalCodeType</xsl:attribute>
			<xsl:element name="xs:sequence">
				<xsl:element name="xs:element">
					<xsl:attribute name="name">firstPart</xsl:attribute>
					<xsl:element name="xs:simpleType">
						<xsl:element name="xs:restriction">
							<xsl:attribute name="base">xs:string</xsl:attribute>
							<xsl:element name="xs:length">
								<xsl:attribute name="value">4</xsl:attribute>
							</xsl:element>
						</xsl:element>
					</xsl:element>
				</xsl:element>
				<xsl:element name="xs:element">
					<xsl:attribute name="name">secondPart</xsl:attribute>
					<xsl:element name="xs:simpleType">
						<xsl:element name="xs:restriction">
							<xsl:attribute name="base">xs:string</xsl:attribute>
							<xsl:element name="xs:length">
								<xsl:attribute name="value">2</xsl:attribute>
							</xsl:element>
						</xsl:element>
					</xsl:element>
				</xsl:element>
			</xsl:element>
		</xsl:element>
	</xsl:template>
</xsl:stylesheet>
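
As one possible alternative, the same result can probably be reached with less code by starting from an identity template and overriding only the attributes that need to change. The stylesheet below is a minimal sketch, not the version I actually ran. It assumes an XSLT 2.0 processor, uses ends-with() instead of the substring() arithmetic, and writes the fixed parts as literal result elements. The parameter name targetNs is made up for this example. Unlike the stylesheet above, it only rewrites type references that have no prefix at all and leaves any prefixed reference untouched.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
		xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
		xmlns:xs="http://www.w3.org/2001/XMLSchema">
	<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
	<!-- hypothetical parameter holding the target namespace used in the example above -->
	<xsl:param name="targetNs" select="'htpp://www.pascalalma.net/customer'"/>

	<!-- identity template: copy every node and attribute unless a more specific template matches -->
	<xsl:template match="@* | node()">
		<xsl:copy>
			<xsl:apply-templates select="@* | node()"/>
		</xsl:copy>
	</xsl:template>

	<!-- add the namespace, the target namespace, the top-level element and the missing type -->
	<xsl:template match="xs:schema">
		<xsl:copy>
			<xsl:namespace name="tns" select="$targetNs"/>
			<xsl:apply-templates select="@*"/>
			<xsl:attribute name="targetNamespace" select="$targetNs"/>
			<xs:element name="customer" type="tns:CustomerType"/>
			<xsl:apply-templates select="node()"/>
			<xs:complexType name="myPostalCodeType">
				<xs:sequence>
					<xs:element name="firstPart">
						<xs:simpleType>
							<xs:restriction base="xs:string">
								<xs:length value="4"/>
							</xs:restriction>
						</xs:simpleType>
					</xs:element>
					<xs:element name="secondPart">
						<xs:simpleType>
							<xs:restriction base="xs:string">
								<xs:length value="2"/>
							</xs:restriction>
						</xs:simpleType>
					</xs:element>
				</xs:sequence>
			</xs:complexType>
		</xsl:copy>
	</xsl:template>

	<!-- append 'Type' to complexType names that do not end with it yet -->
	<xsl:template match="xs:complexType/@name[not(ends-with(., 'Type'))]">
		<xsl:attribute name="name" select="concat(., 'Type')"/>
	</xsl:template>

	<!-- prefix unprefixed type references with 'tns:' and add the suffix 'Type' when it is missing -->
	<xsl:template match="xs:element/@type[not(contains(., ':'))]">
		<xsl:attribute name="type"
				select="concat('tns:', ., if (ends-with(., 'Type')) then '' else 'Type')"/>
	</xsl:template>
</xsl:stylesheet>
```

Because the overrides match attributes rather than whole elements, elements without a type attribute (for example those with an anonymous simpleType) pass through the identity template unchanged.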

Although this example is very specific and there are better ways to obtain similar results, for me it is a handy reminder of how to solve an issue like this with XSLT, and perhaps it helps you in some way too. In another post I will show how I run this transformation in Maven and how to validate the outcome.

About Pascal Alma

Pascal is a senior IT consultant and has been working in IT since 1997. He closely follows the latest developments in new technologies (Mobile, Cloud, Big Data) and is particularly interested in Java open source tool stacks, cloud-related technologies like AWS, and mobile development such as building iOS apps with Swift. Specialties: Java/JEE/Spring, Amazon AWS, API/REST, Big Data, Continuous Delivery, Swift/iOS.