Using XML with LINQ Cookbook
XML is still being used by legacy applications. Today, I provide a list of quick recipes for managing and manipulating XML using LINQ.
Before the weekend started, a good friend asked me for help with an XML document.
The job called for searching and replacing text throughout specific XML documents.
Before we hung up, they mentioned it would be great to get a refresher on how to use XML with LINQ (Language Integrated Query).
What a great idea!
I thought a cheat sheet, or "cookbook," of techniques for using LINQ with XML would make a great post.
The number of applications still using XML is vast and new developers aren't as experienced with LINQ-to-XML. Most want to use JSON since it's the newer file format on the block.
With a large number of legacy applications requiring maintenance, every developer should know how to manage and manipulate their XML instead of the other way around.
XML can be a handful, but today, I hope to provide a quick cookbook of fast and effective how-to's on using XML spanning from beginner's tips to advanced practices.
XML Terminology
To truly understand how LINQ works with XML, you need some XML terminology first.
Let's look at an example XML document.
<Locations>
  <Location id="100">
    <Title>Magic Kingdom</Title>
    <Area id="101">
      <Title>Tomorrowland</Title>
      <Attractions>
        <Attraction id="105">
          <Title>AstroJets</Title>
        </Attraction>
        <Attraction id="110">
          <Title>Space Mountain</Title>
        </Attraction>
        <Attraction id="120">
          <Title>Monsters, Inc Laughing Floor</Title>
        </Attraction>
      </Attractions>
    </Area>
  </Location>
  <Location id="200">
    <Title>EPCOT</Title>
    <Area id="205">
      <Title>Future World</Title>
      <Attractions>
        <Attraction id="210">
          <Title>Spaceship Earth</Title>
        </Attraction>
        <Attraction id="220">
          <Title>Test Track</Title>
        </Attraction>
        <Attraction id="225">
          <Title>The Land</Title>
        </Attraction>
      </Attractions>
    </Area>
    <Area id="230">
      <Title>World Showcase</Title>
      <Attractions>
        <Attraction id="240">
          <Title>England</Title>
        </Attraction>
        <Attraction id="242">
          <Title>Canada</Title>
        </Attraction>
        <Attraction id="245">
          <Title>France</Title>
        </Attraction>
        <Attraction id="250">
          <Title>Germany</Title>
        </Attraction>
      </Attractions>
    </Area>
  </Location>
</Locations>
The terms are as follows:
- Node - A node is any single item in an XML document's tree. The root node for this particular document is Locations, which holds two other nodes (Location), and each Location node holds Area nodes. Text, such as "Magic Kingdom," can be a node type. In the W3C DOM, an attribute, such as id on each node, is another node type. To understand all of the different node types, refer to the W3C DOM Spec.
- Element - An element is a type of node and includes everything from the opening tag through the closing tag. For example, the first element under the first Location element is Title, whose content is "Magic Kingdom."
- Attribute - Attributes are name/value pairs contained in an element's opening tag. For example, the first Location element has an attribute named "id" whose value is "100."
For now, these are the basics we'll focus on for the remainder of this post.
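To tie these terms to the LINQ-to-XML classes, here is a minimal sketch that parses a trimmed-down copy of the sample and picks out one element, one attribute, and one text node. The inline XML string is an abbreviated stand-in for the full document. One thing worth knowing: in LINQ to XML, attributes are XAttribute objects reached through Attribute() or Attributes(), and they do not show up in Nodes().

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Abbreviated copy of the sample document (assumption: trimmed for brevity).
var xml = @"<Locations>
  <Location id=""100"">
    <Title>Magic Kingdom</Title>
  </Location>
</Locations>";

var doc = XDocument.Parse(xml);

// Element: the first Location under the root.
XElement location = doc.Root.Element("Location");

// Attribute: a name/value pair on that element.
XAttribute id = location.Attribute("id");

// Node: the text inside Title is itself a node (an XText).
XElement title = location.Element("Title");
XNode firstNode = title.FirstNode;

Console.WriteLine(location.Name);            // Location
Console.WriteLine($"{id.Name}={id.Value}");  // id=100
Console.WriteLine(firstNode.NodeType);       // Text
Console.WriteLine(title.Value);              // Magic Kingdom
```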
How do you load XML?
There are multiple ways to load an XML document.
One way is to create the XML document from a string.
// Requires: using System.Xml;
var xml = "<Location id=\"100\">" +
    "<Title>Magic Kingdom</Title>" +
    "<Area id=\"101\" >" +
    "<Title>Tomorrowland</Title>" +
    "<Attractions>" +
    "<Attraction id=\"105\">" +
    "<Title>AstroJets</Title>" +
    "</Attraction>" +
    "<Attraction id=\"110\">" +
    "<Title>Space Mountain</Title>" +
    "</Attraction>" +
    "<Attraction id=\"120\">" +
    "<Title>Monsters, Inc Laughing Floor</Title>" +
    "</Attraction>" +
    "</Attractions>" +
    "</Area>" +
    "</Location>";

var doc = new XmlDocument();
doc.LoadXml(xml);
Another way is to load it through a file.
var doc = new XmlDocument();
doc.Load(@"C:\XmlDocument\disneyrides.xml");
Loading it through XDocument is just as easy, and XDocument is what we'll be using for our LINQ examples.
var doc = XDocument.Load(@"C:\XmlDocument\disneyrides.xml");
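XDocument can also be built from a string, much like the XmlDocument.LoadXml example earlier. Here is a minimal sketch using XDocument.Parse; the shortened XML string is only for illustration.

```csharp
using System;
using System.Xml.Linq;

// Shortened version of the sample XML, for illustration only.
var xml = @"<Location id=""100"">
  <Title>Magic Kingdom</Title>
</Location>";

// Parse is the XDocument counterpart of XmlDocument.LoadXml.
var doc = XDocument.Parse(xml);

Console.WriteLine(doc.Root.Name);                      // Location
Console.WriteLine((string)doc.Root.Element("Title"));  // Magic Kingdom
```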
What's the difference between XmlDocument and XDocument?
XDocument is the LINQ-enabled counterpart to XmlDocument. XmlDocument has been part of .NET since version 1.0, well before LINQ came around, while XDocument arrived alongside LINQ in .NET Framework 3.5. It's probably better to use XDocument for managing your XML.
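To make the difference concrete, here is a small side-by-side sketch of the same lookup in both APIs: an XPath query against XmlDocument and a LINQ query against XDocument. The XML fragment is a cut-down piece of the sample, and the variable names are mine.

```csharp
using System;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

var xml = @"<Attractions>
  <Attraction id=""105""><Title>AstroJets</Title></Attraction>
  <Attraction id=""110""><Title>Space Mountain</Title></Attraction>
</Attractions>";

// XmlDocument: DOM-style API, queried here with an XPath expression.
var xmlDoc = new XmlDocument();
xmlDoc.LoadXml(xml);
XmlNode domTitle = xmlDoc.SelectSingleNode("//Attraction[@id='110']/Title");

// XDocument: the same lookup written as a LINQ query.
var xDoc = XDocument.Parse(xml);
string linqTitle = xDoc.Descendants("Attraction")
    .Where(a => (string)a.Attribute("id") == "110")
    .Select(a => (string)a.Element("Title"))
    .FirstOrDefault();

Console.WriteLine(domTitle.InnerText);  // Space Mountain
Console.WriteLine(linqTitle);           // Space Mountain
```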
How can I convert an XmlDocument into an XDocument?
I wrote a post a while back about my top 10 extremely useful extension methods and one of them was converting between XDocument and XmlDocument and back again (extension method number 2).
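That post isn't reproduced here, but one common way to convert in both directions is to hand one document's reader to the other's Load method. The sketch below does that with XmlNodeReader and CreateReader. The helper names ToXDocument and ToXmlDocument are hypothetical, and they're written as plain local functions rather than the extension methods from that post.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper: XmlDocument -> XDocument via an XmlNodeReader.
static XDocument ToXDocument(XmlDocument source)
{
    using var reader = new XmlNodeReader(source);
    reader.MoveToContent();
    return XDocument.Load(reader);
}

// Hypothetical helper: XDocument -> XmlDocument via the XDocument's reader.
static XmlDocument ToXmlDocument(XDocument source)
{
    var result = new XmlDocument();
    using var reader = source.CreateReader();
    result.Load(reader);
    return result;
}

var original = new XmlDocument();
original.LoadXml("<Location id=\"100\"><Title>Magic Kingdom</Title></Location>");

XDocument asX = ToXDocument(original);
XmlDocument back = ToXmlDocument(asX);

Console.WriteLine((string)asX.Root.Element("Title"));        // Magic Kingdom
Console.WriteLine(back.DocumentElement.GetAttribute("id"));  // 100
```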
How do you create an XDocument "document?"
To create a new XML document using XDocument, you begin with the XDocument constructor and work your way down the tree by nesting XElement and XAttribute objects inside one another.
To recreate the sample loaded XML starting at Location, here is the proper XDocument syntax to create a new Location.
var doc = new XDocument(
    new XElement("Location",
        new XAttribute("id", "100"),
        new XElement("Title", "Magic Kingdom"),
        new XElement("Area",
            new XAttribute("id", "101"),
            new XElement("Title", "Tomorrowland"),
            new XElement("Attractions",
                new XElement("Attraction",
                    new XAttribute("id", "105"),
                    new XElement("Title", "AstroJets")
                ),
                new XElement("Attraction",
                    new XAttribute("id", "110"),
                    new XElement("Title", "Space Mountain")
                ),
                new XElement("Attraction",
                    new XAttribute("id", "120"),
                    new XElement("Title", "Monsters, Inc Laughing Floor")
                )
            )
        )
    )
);
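Because the XElement constructor accepts sequences as content, you don't have to type every child by hand. Here is a hedged sketch that projects a small in-memory list into Attraction elements with Select. The rides array is made-up illustrative data, not something from the original document's source.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Illustrative data only; in practice this might come from a database.
var rides = new[]
{
    (Id: 105, Title: "AstroJets"),
    (Id: 110, Title: "Space Mountain"),
};

// Each tuple becomes one Attraction element with an id attribute and a Title child.
var attractions = new XElement("Attractions",
    rides.Select(r =>
        new XElement("Attraction",
            new XAttribute("id", r.Id),
            new XElement("Title", r.Title))));

Console.WriteLine(attractions);
```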
How can I assign a namespace to an XDocument?
To move all of the elements into a namespace, rebuild each element's name by combining the namespace with the element's existing local name (xdoc here is an XDocument you've already loaded).
XNamespace ns = @"http://mynamespace";
foreach (XElement elem in xdoc.Descendants())
{
    elem.Name = ns + elem.Name.LocalName;
}
If you want every element written out with a prefix, declare that prefix for the namespace with an xmlns attribute on the root.
xdoc.Root.Add(new XAttribute(XNamespace.Xmlns + "myns", ns));
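Here are the two namespace steps put together as a small runnable sketch against a tiny document. The point to notice is that once elements live in a namespace, queries need the namespace-qualified name too. The one-line XML string is just a stand-in.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

var xdoc = XDocument.Parse("<Location><Title>EPCOT</Title></Location>");
XNamespace ns = "http://mynamespace";

// Move every element (root included) into the namespace.
foreach (XElement elem in xdoc.Descendants())
{
    elem.Name = ns + elem.Name.LocalName;
}

// Declare the "myns" prefix on the root so it is used when the XML is written out.
xdoc.Root.Add(new XAttribute(XNamespace.Xmlns + "myns", ns));

// Queries must now use the namespace-qualified name.
string title = (string)xdoc.Root.Element(ns + "Title");
bool unqualifiedMissing = xdoc.Root.Element("Title") == null;

Console.WriteLine(title);               // EPCOT
Console.WriteLine(unqualifiedMissing);  // True
```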
How can I retrieve all of the attractions?
This will grab all of the attractions throughout the entire XML document.
var attractions = xdoc.Root.Descendants("Attraction");
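Most of the time you want the data rather than the elements themselves. As a quick sketch, this runs the same Descendants call against an abbreviated copy of the sample and projects each attraction to its title.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Abbreviated copy of the sample document.
var xdoc = XDocument.Parse(@"<Locations>
  <Location id=""100"">
    <Title>Magic Kingdom</Title>
    <Area id=""101"">
      <Title>Tomorrowland</Title>
      <Attractions>
        <Attraction id=""105""><Title>AstroJets</Title></Attraction>
        <Attraction id=""110""><Title>Space Mountain</Title></Attraction>
      </Attractions>
    </Area>
  </Location>
</Locations>");

// Descendants searches every level below the root, not just direct children.
var titles = xdoc.Root
    .Descendants("Attraction")
    .Select(a => (string)a.Element("Title"))
    .ToList();

Console.WriteLine(titles.Count);               // 2
Console.WriteLine(string.Join(", ", titles));  // AstroJets, Space Mountain
```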
How can I get all attractions for just EPCOT?
This gets a little trickier as we dig down into the XML document.
First, you need to get the parent element (Location).
Once you have each Location, project its title together with the attractions under its areas, then use the Where clause to keep only the location whose title is EPCOT.
var attractions = xdoc.Root.Elements("Location")
    .Select(location => new
    {
        // Casting the element to string reads its text content.
        title = (string)location.Element("Title"),
        children = location
            .Elements("Area")
            .Elements("Attractions")
            .Elements("Attraction")
            .ToList()
    })
    .Where(t => t.title == "EPCOT")
    // SelectMany flattens the result into a single list of Attraction elements.
    .SelectMany(t => t.children)
    .ToList();
We grabbed seven attractions from the EPCOT element.
This is a simple one. Trust me, some LINQ queries can get complicated.
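For comparison, here is another way to get the same answer: filter the Location elements first, then ask for their Attraction descendants. Treat it as an alternative sketch, not a replacement for the query above. The sample below is an abbreviated document, so it returns three attractions instead of seven.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Abbreviated copy of the sample document: one attraction at Magic Kingdom, three at EPCOT.
var xdoc = XDocument.Parse(@"<Locations>
  <Location id=""100"">
    <Title>Magic Kingdom</Title>
    <Area id=""101"">
      <Title>Tomorrowland</Title>
      <Attractions>
        <Attraction id=""110""><Title>Space Mountain</Title></Attraction>
      </Attractions>
    </Area>
  </Location>
  <Location id=""200"">
    <Title>EPCOT</Title>
    <Area id=""205"">
      <Title>Future World</Title>
      <Attractions>
        <Attraction id=""210""><Title>Spaceship Earth</Title></Attraction>
        <Attraction id=""220""><Title>Test Track</Title></Attraction>
      </Attractions>
    </Area>
    <Area id=""230"">
      <Title>World Showcase</Title>
      <Attractions>
        <Attraction id=""242""><Title>Canada</Title></Attraction>
      </Attractions>
    </Area>
  </Location>
</Locations>");

// Filter to the EPCOT location, then collect every Attraction beneath it.
var epcotAttractions = xdoc.Root.Elements("Location")
    .Where(l => (string)l.Element("Title") == "EPCOT")
    .Descendants("Attraction")
    .ToList();

var epcotTitles = epcotAttractions.Select(a => (string)a.Element("Title")).ToList();

Console.WriteLine(epcotAttractions.Count);          // 3
Console.WriteLine(string.Join(", ", epcotTitles));  // Spaceship Earth, Test Track, Canada
```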
How do you return sorted elements?
You can sort the returned elements by calling the OrderBy() or OrderByDescending() method on them.
var attractions = xdoc.Root
    .Descendants("Attraction")
    .OrderBy(e => e.Element("Title").Value)
    .ToList();
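One gotcha when sorting: element and attribute values are strings, so sorting by id as text puts "105" ahead of "95". Casting the attribute to int sorts numerically instead. In this sketch, the attraction with id 95 ("Sample Ride") is a hypothetical entry I added purely to show the difference.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// "Sample Ride" with id 95 is a made-up entry for illustration.
var xdoc = XDocument.Parse(@"<Attractions>
  <Attraction id=""110""><Title>Space Mountain</Title></Attraction>
  <Attraction id=""95""><Title>Sample Ride</Title></Attraction>
  <Attraction id=""105""><Title>AstroJets</Title></Attraction>
</Attractions>");

// Sorting by the raw string value compares character by character.
var byText = xdoc.Root.Elements("Attraction")
    .OrderBy(a => a.Attribute("id").Value, StringComparer.Ordinal)
    .Select(a => (string)a.Element("Title"))
    .ToList();

// Casting the attribute to int sorts by numeric value.
var byNumber = xdoc.Root.Elements("Attraction")
    .OrderBy(a => (int)a.Attribute("id"))
    .Select(a => (string)a.Element("Title"))
    .ToList();

Console.WriteLine(string.Join(", ", byText));    // AstroJets, Space Mountain, Sample Ride
Console.WriteLine(string.Join(", ", byNumber));  // Sample Ride, AstroJets, Space Mountain
```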
Conclusion
LINQ-to-XML provides a quick and easy way to control any XML document whether you need to add, remove, search, or update.
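As a parting sketch, here are those four operations (search, update, add, remove) in one place, run against a two-attraction fragment of the sample. The replacement title "Astro Orbiter" is just an illustrative value for the text swap.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

var xdoc = XDocument.Parse(@"<Attractions>
  <Attraction id=""105""><Title>AstroJets</Title></Attraction>
  <Attraction id=""110""><Title>Space Mountain</Title></Attraction>
</Attractions>");

// Search: find the element to change.
XElement astro = xdoc.Root.Elements("Attraction")
    .First(a => (string)a.Attribute("id") == "105");

// Update: replace the text of its Title element (illustrative replacement text).
astro.Element("Title").Value = "Astro Orbiter";

// Add: append a new attraction (values taken from the sample document).
xdoc.Root.Add(
    new XElement("Attraction",
        new XAttribute("id", "120"),
        new XElement("Title", "Monsters, Inc Laughing Floor")));

// Remove: delete the attraction with id 110.
xdoc.Root.Elements("Attraction")
    .Where(a => (string)a.Attribute("id") == "110")
    .Remove();

var remaining = xdoc.Root.Elements("Attraction")
    .Select(a => (string)a.Element("Title"))
    .ToList();

Console.WriteLine(string.Join(" | ", remaining));  // Astro Orbiter | Monsters, Inc Laughing Floor
```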
As I mention in most of my posts, I will be updating this as more people ask for specific LINQ-to-XML queries so it will become a living document.
To reach the goal of any XML query, you always need to know where you are in the hierarchy. THAT is the biggest challenge.
Once you determine your path, you can LINQ your way through the most complicated XML document.
What's the biggest XML document you've worked with? Post your comments below.