In a previous post I explored how simple processing XML can be with Groovy using XmlSlurper. That example was fairly basic, and it occurred to me that it would be helpful to beef it up to cover some other aspects of working with XML using XmlSlurper. More specifically, in this post I’d like to offer a more robust example that covers the following:
1. Referencing an element that does not exist
2. Verifying that XmlSlurper is case sensitive
3. Extracting attributes from the XML
4. Referencing XML elements whose names contain a hyphen
5. Walking a more complex nested graph
For each example in this post, we’ll use the following XML, which is housed in a file, to demonstrate the features we’re going to explore.
<?xml version='1.0' encoding='UTF-8'?>
<customers>
    <customer id='12345'>
        <firstName>FirstOne</firstName>
        <lastName>FirstLastName</lastName>
        <addresses>
            <address id='11'>
                <line-1>first address line1</line-1>
                <line-2>first address line2</line-2>
                <city>first-city</city>
                <state>first-state</state>
                <postal-code>first-postal</postal-code>
            </address>
            <address id='22'>
                <line-1>second address line1</line-1>
                <line-2>second address line2</line-2>
                <city>second-city</city>
                <state>second-state</state>
                <postal-code>second-postal</postal-code>
            </address>
        </addresses>
    </customer>
    <customer id='67890'>
        <firstName>SecondOne</firstName>
        <lastName>SecondLastName</lastName>
        <addresses>
            <address id='33'>
                <line-1>third address line1</line-1>
                <line-2>third address line2</line-2>
                <city>third-city</city>
                <state>third-state</state>
                <postal-code>third-postal</postal-code>
            </address>
            <address id='44'>
                <line-1>fourth address line1</line-1>
                <line-2>fourth address line2</line-2>
                <city>fourth-city</city>
                <state>fourth-state</state>
                <postal-code>fourth-postal</postal-code>
            </address>
        </addresses>
    </customer>
</customers>
Here’s the test method that I’ll use to touch on each point:
public void processAdvancedXMLFile(String fileName)
{
    // fileName is the path to the XML shown above, e.g. "D:\\temp\\AdvancedExample.xml"
    def file = new File(fileName)
    def customers = new XmlSlurper().parse(file)

    // printing a node prints the concatenated text of everything beneath it
    println("customers: " + customers)
    println("customers: " + customers.customer)

    // missing element, wrong case, and an attribute
    println("Bogus element: " + customers.bogusField)
    println("Case Matters: " + customers.customer[0].FIRSTNAME)
    println("Customer ID Number: " + customers.customer[0].@id)

    println("All Customer first names: " + customers.customer.firstName)
    println("First Customer's first names: " + customers.customer[0].firstName)

    // note: despite the label, customer[0] limits this to the first customer's last name
    println("All Customer last names: " + customers.customer[0].lastName)

    // walk the entire xml tree
    println("Walking the entire xml tree using xmlslurper")
    customers.children().each { customer ->
        println("First Name: " + customer.firstName)
        println("Last Name: " + customer.lastName)

        customer.addresses.children().each { address ->
            println("Address id: " + address.@id)
            // element names containing a hyphen must be quoted
            println("Address line 1: " + address.'line-1')
            println("Address line 2: " + address.'line-2')
            println("Address city: " + address.city)
            println("Address state: " + address.state)
            println("Address postal-code: " + address.'postal-code')
        }
        println("============================================")
    }
}
Finally, here’s the output window for a sample run:
customers: FirstOneFirstLastNamefirst address line1first address line2first-cityfirst-statefirst-postalsecond address line1second address line2second-citysecond-statesecond-postalSecondOneSecondLastNamethird address line1third address line2third-citythird-statethird-postalfourth address line1fourth address line2fourth-cityfourth-statefourth-postal
customers: FirstOneFirstLastNamefirst address line1first address line2first-cityfirst-statefirst-postalsecond address line1second address line2second-citysecond-statesecond-postalSecondOneSecondLastNamethird address line1third address line2third-citythird-statethird-postalfourth address line1fourth address line2fourth-cityfourth-statefourth-postal
Bogus element:
Case Matters:
Customer ID Number: 12345
All Customer first names: FirstOneSecondOne
First Customer's first names: FirstOne
All Customer last names: FirstLastName
Walking the entire xml tree using xmlslurper
First Name: FirstOne
Last Name: FirstLastName
Address id: 11
Address line 1: first address line1
Address line 2: first address line2
Address city: first-city
Address state: first-state
Address postal-code: first-postal
Address id: 22
Address line 1: second address line1
Address line 2: second address line2
Address city: second-city
Address state: second-state
Address postal-code: second-postal
============================================
First Name: SecondOne
Last Name: SecondLastName
Address id: 33
Address line 1: third address line1
Address line 2: third address line2
Address city: third-city
Address state: third-state
Address postal-code: third-postal
Address id: 44
Address line 1: fourth address line1
Address line 2: fourth address line2
Address city: fourth-city
Address state: fourth-state
Address postal-code: fourth-postal
============================================
BUILD SUCCESSFUL (total time: 6 seconds)
One nice aspect of working with XmlSlurper is that it never returns null when an element is not found; you get back an empty result that prints as an empty string. Another key aspect of working with XmlSlurper is that it’s case sensitive. Whether you ask XmlSlurper for an element that does not exist or use the wrong case for one that does, the response is the same empty result.
In my example, I first requested an element which does not exist (customers.bogusField), followed by one that does exist (firstName) but which I tried to retrieve using the wrong case (FIRSTNAME). In the output window you can see that nothing comes back in either case.
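Because a missing element and a wrong-case element both print as nothing, it can be handy to test for them explicitly. Here’s a minimal sketch of how I’d do that, using a trimmed-down inline copy of the XML above rather than the file. The variable names are just for illustration, and the import assumes Groovy 3 or later.

```groovy
// Groovy 3+ location; on Groovy 2.x drop this import (groovy.util.XmlSlurper is auto-imported)
import groovy.xml.XmlSlurper

// Trimmed-down inline copy of the sample XML (assumption: the real data lives in a file)
def xml = '''<customers>
  <customer id='12345'>
    <firstName>FirstOne</firstName>
  </customer>
</customers>'''

def customers = new XmlSlurper().parseText(xml)

def missing = customers.bogusField                // element that does not exist
def wrongCase = customers.customer[0].FIRSTNAME   // exists, but only as firstName
def present = customers.customer[0].firstName

// isEmpty() and size() tell a missing element apart from a present one
println("missing isEmpty:   " + missing.isEmpty())
println("wrongCase size:    " + wrongCase.size())
println("present isEmpty:   " + present.isEmpty())
println("present text:      " + present.text())
```

Checking isEmpty() up front avoids silently printing blanks when an element name has been mistyped.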
The next item from my list is to show how easy it is to pull attributes from the XML. Accessing attributes is just as easy as accessing elements; the format is only slightly different. Where customers.customer[0].firstName grabs the first customer in the XML doc and returns the contents of its firstName element, to grab an attribute from that same customer element you would write customers.customer[0].@nameOfAttribute. In the example provided, I pulled the id attribute from the customer using the following code: customers.customer[0].@id
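The same @ syntax also works inside closures, which makes it easy to gather an attribute from every customer or to look a customer up by its id. This is a small sketch against a shortened inline copy of the XML; the variable names are mine and the import assumes Groovy 3 or later.

```groovy
// Groovy 3+ location; on Groovy 2.x drop this import (groovy.util.XmlSlurper is auto-imported)
import groovy.xml.XmlSlurper

// Shortened inline copy of the sample XML
def xml = '''<customers>
  <customer id='12345'><firstName>FirstOne</firstName></customer>
  <customer id='67890'><firstName>SecondOne</firstName></customer>
</customers>'''

def customers = new XmlSlurper().parseText(xml)

// text() turns the attribute result into a plain String
def firstId = customers.customer[0].@id.text()

// Collect the id attribute of every customer element
def allIds = customers.customer.collect { it.@id.text() }

// Find the customer whose id attribute has a given value
def match = customers.customer.find { it.@id.text() == '67890' }

println("first id: " + firstId)
println("all ids: " + allIds)
println("first name for 67890: " + match.firstName)
```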
The next item on my list is one of the few quirky things I’ve uncovered when working with XmlSlurper. If your XML element name contains a hyphen, you must tweak your syntax just a bit. It helps to understand why: Groovy sees the hyphen as a minus sign, so address.line-1 is read as address.line minus 1. To tell Groovy not to treat the hyphen as a minus sign, you simply need to enclose the element name in single quotes.
Here are a couple of lines from my example above which illustrate this:
println("Address line 1: " + address.'line-1')
println("Address line 2: " + address.'line-2')
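Since the quoted name is really just a string, double quotes work as well, and the name can even be built on the fly. Here’s a rough sketch using a single address element from the sample XML. The variable names are illustrative and the import assumes Groovy 3 or later.

```groovy
// Groovy 3+ location; on Groovy 2.x drop this import (groovy.util.XmlSlurper is auto-imported)
import groovy.xml.XmlSlurper

// One address element taken from the sample XML
def xml = '''<address id='11'>
  <line-1>first address line1</line-1>
  <line-2>first address line2</line-2>
  <postal-code>first-postal</postal-code>
</address>'''

def address = new XmlSlurper().parseText(xml)

// Single or double quotes both stop Groovy from parsing the hyphen as a minus sign
def line1 = address.'line-1'.text()
def postal = address."postal-code".text()

// With double quotes the element name can be interpolated
def lines = (1..2).collect { n -> address."line-${n}".text() }

println("line 1: " + line1)
println("postal code: " + postal)
println("all lines: " + lines)
```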
Walking a more complex tree is the next item I wanted to address. The example presented here has multiple customers within the customers tag, with each customer in turn having simple elements (firstName, lastName) as well as multiple addresses. The provided example walks this tree and prints out all of the element values using the children() method with a Groovy closure. I don’t show it in the example, but it’s also very easy to print out the name of an XML element by using the name() method. Here’s an example for doing that:
println("Prints firstName: " + customers.customer[0].firstName.name())
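Combining children() and name() also lets you walk a tree without knowing its element names ahead of time. The following is only a sketch of that idea: the recursive walk closure and the lines list are my own illustrative names, the XML is a cut-down copy of the sample, and the import assumes Groovy 3 or later.

```groovy
// Groovy 3+ location; on Groovy 2.x drop this import (groovy.util.XmlSlurper is auto-imported)
import groovy.xml.XmlSlurper

// Cut-down inline copy of the sample XML
def xml = '''<customers>
  <customer id='12345'>
    <firstName>FirstOne</firstName>
    <lastName>FirstLastName</lastName>
    <addresses>
      <address id='11'>
        <city>first-city</city>
      </address>
    </addresses>
  </customer>
</customers>'''

def customers = new XmlSlurper().parseText(xml)

// Recursively visit every element, indenting two spaces per level.
// Elements without child elements are printed together with their text.
def lines = []
def walk
walk = { node, int depth ->
    def kids = node.children()
    def label = ('  ' * depth) + node.name()
    lines << (kids.isEmpty() ? label + ': ' + node.text() : label)
    kids.each { child -> walk(child, depth + 1) }
}

walk(customers, 0)
lines.each { println(it) }
```

The closure is declared before it is assigned so that it can call itself.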
In a future post I’ll devote some time to moving data between domain objects and xml.
==================================================================
An example of pulling elements from an Atom feed:
Since someone asked, I’ve added a sample which pulls the id elements from a sample Atom feed (you can find this sample on Wikipedia):
def data = '''<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
    <title>Example Feed</title>
    <subtitle>A subtitle.</subtitle>
    <link href="http://example.org/feed/" rel="self"/>
    <link href="http://example.org/"/>
    <updated>2003-12-13T18:30:02Z</updated>
    <author>
        <name>John Doe</name>
        <email>johndoe@example.com</email>
    </author>
    <id>urn:uuid:60a76c80-d399-11d9-b91C-0003939e0af6</id>
    <entry>
        <title>Atom-Powered Robots Run Amok</title>
        <link href="http://example.org/2003/12/13/atom03"/>
        <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
        <updated>2003-12-13T18:30:02Z</updated>
        <summary>Some text.</summary>
    </entry>
</feed>'''

def atomData = new XmlSlurper().parseText(data)
println("Sample output from our xml atom feed:")
println("id for the feed: " + atomData.id)
println("id for the entry: " + atomData.entry.id)
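A real feed usually has more than one entry, so a natural next step is to loop over the entries and pull out a few fields from each, including the href attribute on the link element. Here’s a small sketch of that against a shortened copy of the same sample feed. The map keys and variable names are my own choices, and the import assumes Groovy 3 or later.

```groovy
// Groovy 3+ location; on Groovy 2.x drop this import (groovy.util.XmlSlurper is auto-imported)
import groovy.xml.XmlSlurper

// Shortened copy of the sample Atom feed
def data = '''<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Feed</title>
  <link href="http://example.org/feed/" rel="self"/>
  <link href="http://example.org/"/>
  <id>urn:uuid:60a76c80-d399-11d9-b91C-0003939e0af6</id>
  <entry>
    <title>Atom-Powered Robots Run Amok</title>
    <link href="http://example.org/2003/12/13/atom03"/>
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
  </entry>
</feed>'''

def feed = new XmlSlurper().parseText(data)

// feed.link only looks at links directly under <feed>; pick the one marked rel="self"
def selfLink = feed.link.find { it.@rel.text() == 'self' }.@href.text()

// Turn each entry into a simple map of the fields we care about
def entries = feed.entry.collect { e ->
    [id: e.id.text(), title: e.title.text(), href: e.link.@href.text()]
}

println("self link: " + selfLink)
entries.each { println(it.title + " -> " + it.href) }
```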
You can also set up XmlSlurper to be aware of your namespaces if needed. Here’s an example to show how to do that:
public void processXmlWithNamespaces(){
    def loanXml =
    '''<loan
        xmlns:customer="urn:somecompany:customers"
        xmlns:account="urn:somecompany:accounts">
        <customer:name>Joe Customer</customer:name>
        <account:name>First Mortgage</account:name>
        <periods>360</periods>
    </loan>'''

    def loan = new XmlSlurper().parseText(loanXml)

    println("combined name: " + loan.name)
    def ns = [:]
    ns.customer = "urn:somecompany:customers"
    ns.account = "urn:somecompany:accounts"
    loan.declareNamespace(ns)

    println("customer name: " + loan.'customer:name')
    println("account name: " + loan.'account:name')
}
Note the single quotes around ‘account:name’ and ‘customer:name’. These are needed for the same reason as the quotes around the hyphenated element names: a colon is not valid in an unquoted Groovy property name.
Here’s the output from a sample run:
combined name: Joe CustomerFirst Mortgage
customer name: Joe Customer
account name: First Mortgage
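If you’d rather not build the map line by line, declareNamespace also accepts a map directly and returns the result it was called on, so it can be chained onto parseText. The sketch below shows that, along with namespaceURI() for checking which namespace a matched element actually belongs to. The variable names are mine, and the import assumes Groovy 3 or later.

```groovy
// Groovy 3+ location; on Groovy 2.x drop this import (groovy.util.XmlSlurper is auto-imported)
import groovy.xml.XmlSlurper

def loanXml = '''<loan
    xmlns:customer="urn:somecompany:customers"
    xmlns:account="urn:somecompany:accounts">
  <customer:name>Joe Customer</customer:name>
  <account:name>First Mortgage</account:name>
  <periods>360</periods>
</loan>'''

// Declare the prefix-to-URI mapping in one chained call
def loan = new XmlSlurper().parseText(loanXml)
        .declareNamespace(customer: 'urn:somecompany:customers',
                          account: 'urn:somecompany:accounts')

def customerName = loan.'customer:name'.text()
def accountName = loan.'account:name'.text()

// The unqualified name matches both elements; namespaceURI() tells them apart
def byNamespace = loan.name.collectEntries { [(it.namespaceURI()): it.text()] }

println("customer name: " + customerName)
println("account name: " + accountName)
byNamespace.each { uri, text -> println(uri + " -> " + text) }
```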
If you want to learn more about XmlSlurper I’d also highly recommend the book:
The format for the namespace example presented above came from his excellent book (Scott’s a good presenter too if you ever have the opportunity to hear him speak).
Handy, thanks ... one question, how would you use XmlSlurper inside a Grails app to pull in an element named 'id' ? (as in an atom feed, where there is an atom : id element)
'node.id' works in groovysh, but not in a Grails app; I suspect the issue is the special injected behaviour Grails has around the id property.
I updated the post to include a couple of new examples. If this doesn't help, feel free to drop me an email (send me the xml you're trying to process).
Is it also possible to write XML with XmlSlurper, or only read?
Are these examples also valid with XmlParser? Thanks.