December 27, 2008
Posted by Eddie
My XSLT Toolbox – copy and copy-of
Using XSLT to copy elements is extremely common when you’re transforming a source document of a certain type (XML, HTML, etc.) to the same type. Often you need an exact, verbatim copy of an element, but other times you need to selectively choose certain elements to copy and others to discard. XSLT makes this process quite elegant using its xsl:copy-of and xsl:copy elements. The following is a step-by-step tutorial on how these elements are used.
When you need an exact copy of an element and its children, you use the xsl:copy-of element, which copies the selected element together with its attributes and everything nested inside it. Given the following XML data, which represents a (trivial) inventory of a store, let’s say you want an exact copy of any items with the name “XSLT”.
<?xml version="1.0" encoding="UTF-8"?>
<inventory>
  <item id="1">
    <name>The Little Schemer</name>
    <type>book</type>
    <author>Friedman</author>
    <author>Felleisen</author>
    <list-price>29.95</list-price>
    <sell-price>26.99</sell-price>
    <cost>17.92</cost>
  </item>
  <item id="2">
    <name>XSLT</name>
    <type>book</type>
    <author>Tidwell</author>
    <list-price>49.95</list-price>
    <sell-price>34.99</sell-price>
    <cost>22.92</cost>
  </item>
  <item id="3">
    <name>Romeo and Juliet</name>
    <type>compact disc</type>
    <conductor>Rostropovich</conductor>
    <list-price>18.98</list-price>
    <sell-price>13.99</sell-price>
    <cost>9.92</cost>
  </item>
</inventory>
You simply apply the following XSLT stylesheet to your source document:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="/">
    <xsl:copy-of select="inventory/item[name = 'XSLT']"/>
  </xsl:template>
</xsl:stylesheet>
This gives you exactly what you were looking for: the “item” with the name “XSLT”.
<?xml version="1.0" encoding="utf-8"?>
<item id="2">
  <name>XSLT</name>
  <type>book</type>
  <author>Tidwell</author>
  <list-price>49.95</list-price>
  <sell-price>34.99</sell-price>
  <cost>22.92</cost>
</item>
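If the select expression matches more than one item, xsl:copy-of copies each of them in document order, and the result then needs a single root element to stay well-formed. Here is a minimal sketch of that, assuming you want every book from the inventory; the wrapper element name “books” is made up for this illustration and is not part of the inventory format.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="/">
    <!-- "books" is a hypothetical wrapper so the output has exactly one root element -->
    <books>
      <!-- deep-copy every item whose type is "book", in document order -->
      <xsl:copy-of select="inventory/item[type = 'book']"/>
    </books>
  </xsl:template>
</xsl:stylesheet>
```

Applied to the inventory above, both book items should come out as complete copies inside the wrapper, while the compact disc is left out.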
That was easy, so now let’s say you want to do a little more with your inventory document. Your boss wants a copy of it to look at the numbers and do some accounting. She doesn’t care about the authors or conductors, so she’d like that information left out. Also, she would like an additional piece of information for each item, the amount of profit off each item sold, the difference between the sell-price and the cost.
Because we are adding a piece of information and getting rid of elements that don’t affect the accounting, we can’t use xsl:copy-of: it would output an exact copy of the item element, its attribute nodes, and its child nodes. This exact copy is called a deep copy, because it copies not only the element but everything nested inside it as well. The solution is to use xsl:copy, which performs a shallow copy: it copies only the current node and ignores its attributes and children.
Since xsl:copy only copies one node at a time, you need to explicitly specify that you want to continue copying attribute nodes and child nodes. xsl:apply-templates gives us the leverage to write a template that accomplishes that. The following template matches any attribute or child node, copies that node, and then recursively applies templates to any attribute or child nodes it has in the source tree.
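To see the difference concretely, here is a small sketch that applies both instructions to the same item. The element names “comparison”, “shallow”, and “deep” are made up for this illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="/">
    <!-- hypothetical wrapper element so the output has a single root -->
    <comparison>
      <xsl:apply-templates select="inventory/item[name = 'XSLT']"/>
    </comparison>
  </xsl:template>
  <xsl:template match="item">
    <!-- shallow copy: only the item element itself, without attributes or children -->
    <shallow><xsl:copy/></shallow>
    <!-- deep copy: the item element with its id attribute and all of its children -->
    <deep><xsl:copy-of select="."/></deep>
  </xsl:template>
</xsl:stylesheet>
```

With the inventory above, “shallow” should contain just an empty item element, with no id attribute and no children, while “deep” contains the complete item.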
<!-- @* matches any attribute node on the current element,
     node() matches any child nodes of the current element -->
<xsl:template match="@*|node()">
  <!-- shallow copy... only copy the node you're on (be that attribute or child node) -->
  <xsl:copy>
    <!-- apply this template to any other attribute or child nodes found -->
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>
Using this template, I can write an XSLT stylesheet that will copy the entire source document without any changes. This isn’t quite what we were looking for, because we wanted to add and remove child elements of item, but it is the first step.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="/">
    <xsl:apply-templates/>
  </xsl:template>
  <!-- copy all attribute or child nodes in place -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
So how do we add the profit made from each item and remove the unnecessary information? We use the fact that our xsl:template match="@*|node()" template has a very low priority. How default XSLT priorities are determined is an advanced topic I won’t go into right now, but feel free to explore it if you are interested. Our template is given a default priority of -0.5, whereas a template that matches an element by name, such as xsl:template match="foo", is given a default priority of 0. When more than one template matches a node, the one with the higher priority wins, so removing the “author” and “conductor” elements is easy: just declare templates for them that produce no output!
<xsl:template match="author"/>
<xsl:template match="conductor"/>
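As an alternative sketch (not what the rest of this tutorial uses), the two empty templates can be merged into one with a union pattern. Each alternative in the union is treated as its own name test with default priority 0, so it still beats the copy-everything template.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- one empty template for both names: matching elements produce no output -->
  <xsl:template match="author | conductor"/>
  <!-- copy all other attribute or child nodes in place (default priority -0.5) -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Which form you pick is mostly a matter of taste. Separate templates are easier to comment individually, while the union keeps a long list of dropped elements in one place.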
We use the same technique to add an element to our “item” elements. First we use xsl:copy to copy the item node itself. Then we apply-templates to any attribute or child nodes found. When an author or conductor element is found, it will match our explicit rules and produce no output, so it will not be copied into our result. Finally, we create a new element named “profit”, which will contain the difference between the sell-price and the cost.
<xsl:template match="item">
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
    <profit><xsl:value-of select="(sell-price - cost)"/></profit>
  </xsl:copy>
</xsl:template>
So we arrive at our final XSLT stylesheet, which looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="/">
    <xsl:apply-templates/>
  </xsl:template>
  <!-- for each item, copy the <item> element, and apply-templates to its
       attributes and children. Finally, create a <profit> element, the
       difference between the sell-price and the cost. -->
  <xsl:template match="item">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
      <profit><xsl:value-of select="(sell-price - cost)"/></profit>
    </xsl:copy>
  </xsl:template>
  <!-- don't copy these, when they're found, there is no output -->
  <xsl:template match="author"/>
  <xsl:template match="conductor"/>
  <!-- copy all attribute or child nodes in place -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
When applied to our source document, we get the result our boss wanted: it excludes any “author” or “conductor” elements and includes a “profit” element.
<?xml version="1.0" encoding="utf-8"?>
<inventory>
  <item id="1">
    <name>The Little Schemer</name>
    <type>book</type>
    <list-price>29.95</list-price>
    <sell-price>26.99</sell-price>
    <cost>17.92</cost>
    <profit>9.07</profit>
  </item>
  <item id="2">
    <name>XSLT</name>
    <type>book</type>
    <list-price>49.95</list-price>
    <sell-price>34.99</sell-price>
    <cost>22.92</cost>
    <profit>12.07</profit>
  </item>
  <item id="3">
    <name>Romeo and Juliet</name>
    <type>compact disc</type>
    <list-price>18.98</list-price>
    <sell-price>13.99</sell-price>
    <cost>9.92</cost>
    <profit>4.07</profit>
  </item>
</inventory>
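One caveat about the profit values: XPath 1.0 numbers are double-precision floats, so depending on the processor a subtraction such as 26.99 - 17.92 may be serialized with a long tail of digits rather than a tidy 9.07. A minimal sketch of one way to guard against that, assuming two decimal places is the format you want, is to wrap the subtraction in the standard format-number() function:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="item">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
      <!-- round the difference to exactly two decimal places before output -->
      <profit><xsl:value-of select="format-number(sell-price - cost, '0.00')"/></profit>
    </xsl:copy>
  </xsl:template>
  <!-- don't copy these, when they're found, there is no output -->
  <xsl:template match="author"/>
  <xsl:template match="conductor"/>
  <!-- copy all attribute or child nodes in place -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

The only change from the stylesheet above is the select expression inside the profit element; everything else is copied exactly as before.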
Using this technique, we can then easily prepare another XSLT stylesheet to generate an inventory list for the customer, which will exclude the “cost” element, since we don’t want them knowing it! All we need is to match all attribute and child nodes and copy them as normal, while providing no output when the XSLT processor encounters a “cost” element.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="/">
    <xsl:apply-templates/>
  </xsl:template>
  <!-- don't copy the cost, the customer doesn't need to know! -->
  <xsl:template match="cost"/>
  <!-- copy all attribute or child nodes in place -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
Personally, because I so often transform XML sources to XML output, I end up using xsl:copy-of and xsl:template match="@*|node()" all the time. In fact, xsl:template match="@*|node()" just happens to be the first piece of code in my XSLT toolbox.
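The same trick works for attributes, because a pattern such as @id also gets a default priority of 0. As a sketch only, assuming (hypothetically) that the customer list should hide the internal id attribute as well as the cost:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- don't copy the cost, the customer doesn't need to know! -->
  <xsl:template match="cost"/>
  <!-- assumption for this sketch: the id attribute is internal, so drop it too -->
  <xsl:template match="@id"/>
  <!-- copy all other attribute or child nodes in place -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Each item element should then come out without its id attribute and without a cost child, with everything else copied unchanged.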
2 Comments
July 24, 2009
Great explanation, taking a markup languages class at a university and this example has shed much light on an assignment. Thanks for the tutorial!
July 24, 2009
Thanks a lot! I am glad that I could help someone with something! Lemme know if there’s anything more you’d like to see here!