Reading and extracting values from XML files with PowerShell

XML is used in everything from configuration files to Microsoft Office documents. Not surprisingly, PowerShell is also XML aware. By using PowerShell, you can extract values from an XML file, and if necessary, PowerShell can even perform some sort of action based on those values.

Before I show you how to use PowerShell to read data from an XML file, I need to talk a little bit about an XML file’s structure. Here is a really simple example of an XML file:

<Fruit>
  <Fruit Name="Apple" Color="Red" />
  <Fruit Name="Grape" Color="Purple" />
  <Fruit Name="Blueberry" Color="Blue" />
</Fruit>

XML files: Start at the root

Rule No. 1 for XML files is that every XML file has to have exactly one root element. In this case, <Fruit> is the root element, and Fruit is also the only element name that the file uses. More complex XML files generally include an entire hierarchy of elements.

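The lines in the middle of the sample XML file that list fruit names and colors contain attributes (Name and Color). Attributes hold values (values are the actual data), and these values must be surrounded by quotation marks. Additionally, every element must be closed, either with a separate closing tag such as </Fruit> or, for an element that has no content of its own, with a forward slash placed just before the final angle bracket (/>). Attributes themselves do not have closing tags.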

So now that I have given you a quick crash course in XML file syntax, let’s take a look at how to create a PowerShell script that can extract data from an XML file. For the sake of demonstration, I will be using the XML file listed above. I will name the file Demo.xml.
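To follow along, the sample file needs to exist on disk. The sketch below is one way to create it. The temp-folder location is an assumption made purely for illustration (the commands later in this article use C:\Data\Demo.xml), and the markup assumes that the three fruit entries sit inside an enclosing Fruit root element, as described above.

```powershell
# Assumed location for illustration; adjust to wherever you keep Demo.xml.
$Path = Join-Path ([System.IO.Path]::GetTempPath()) 'Demo.xml'

# Single-quoted here-string, so PowerShell does not expand anything inside it.
$Xml = @'
<Fruit>
  <Fruit Name="Apple" Color="Red" />
  <Fruit Name="Grape" Color="Purple" />
  <Fruit Name="Blueberry" Color="Blue" />
</Fruit>
'@

# Write the text to disk, then confirm that the file was created.
Set-Content -Path $Path -Value $Xml
Test-Path -Path $Path
```

If you use a different folder, remember to change the path in the later commands to match.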

There are actually several different ways that you can access XML data from PowerShell. In my opinion, the easiest way to access XML data from PowerShell is to treat the XML data as an object. Thankfully, this is easier than it sounds.
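For comparison, one of the other approaches is the Select-Xml cmdlet, which queries XML with an XPath expression. The following is only a minimal sketch: it works on an inline copy of the sample data (assumed to match Demo.xml) rather than on the file itself, and the variable names are my own.

```powershell
# Inline copy of the sample data (assumed to match Demo.xml).
$Xml = @'
<Fruit>
  <Fruit Name="Apple" Color="Red" />
  <Fruit Name="Grape" Color="Purple" />
  <Fruit Name="Blueberry" Color="Blue" />
</Fruit>
'@

# Select every Fruit element that sits directly beneath the Fruit root.
$Nodes = Select-Xml -Content $Xml -XPath '/Fruit/Fruit'

# Each result exposes the matching element through its Node property.
$Nodes | ForEach-Object { $_.Node.GetAttribute('Name') }
```

XPath is powerful, but it is one more syntax to learn, which is why the rest of this article sticks with the object-based approach.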

If you were using PCs back in the days of DOS, you might remember a command named Type. You could enter the Type command, followed by a filename, and DOS would display the file’s contents. At the time, that was one of the most common ways of reading a text file.

Fast forward a few decades, and Microsoft took the Type command’s functionality and integrated it into a PowerShell cmdlet called Get-Content. The Get-Content cmdlet works in much the same way as the Type command (in Windows PowerShell, type is in fact an alias for Get-Content). Just type Get-Content followed by the file name, and PowerShell will display the file’s contents. You can see an example of this in the figure below.

In PowerShell, you can even take things a step further and assign the output of the Get-Content command to a variable. This effectively stores the file’s contents within the variable. This works great if you are dealing with a run-of-the-mill text file, but it doesn’t work so well for XML. The reason is that when you read an XML file in this way, the file’s contents are stored as plain text (one string for each line of the file) rather than as structured data. You can, of course, extract data from the text, but doing so requires complex string manipulations. As noted earlier, it is a lot easier to just create an object.

Assigning the output of the Get-Content command to a variable looks something like this:

$Data = Get-Content C:\Data\Demo.xml
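To make the limitation concrete, here is a rough sketch of what working with the raw text involves. It writes a throwaway copy of the sample to a temp folder (a hypothetical path, not the C:\Data location used above), and the regular expression is just one of many ways the string matching could be done.

```powershell
# Create a throwaway copy of the sample file (hypothetical temp path).
$Path = Join-Path ([System.IO.Path]::GetTempPath()) 'Demo.xml'
$Xml = @'
<Fruit>
  <Fruit Name="Apple" Color="Red" />
  <Fruit Name="Grape" Color="Purple" />
  <Fruit Name="Blueberry" Color="Blue" />
</Fruit>
'@
Set-Content -Path $Path -Value $Xml

$Data = Get-Content -Path $Path

# Get-Content returns one string for each line of the file.
$Data.Count
$Data[1].GetType().FullName

# Pulling a value out of a line means pattern matching on raw text.
if ($Data[1] -match 'Name="([^"]+)"') { $Matches[1] }
```

The pattern has to be adjusted for every attribute you want, and it breaks if the file's layout changes.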

In this case, the $Data variable contains string data. If we want to create an object instead, we need only to preface the variable with [xml]. This tells PowerShell to convert the text into an XML document object rather than leaving it as string data. Here is what the command looks like:

[xml]$Data = Get-Content C:\Data\Demo.xml
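If you want to confirm that the conversion worked, you can inspect the variable's type. The sketch below assumes a throwaway copy of the sample in a temp folder (a hypothetical path) and adds the -Raw switch, which is available in PowerShell 3.0 and later and reads the file as a single string. The command shown above works without it.

```powershell
# Create a throwaway copy of the sample file (hypothetical temp path).
$Path = Join-Path ([System.IO.Path]::GetTempPath()) 'Demo.xml'
$Xml = @'
<Fruit>
  <Fruit Name="Apple" Color="Red" />
  <Fruit Name="Grape" Color="Purple" />
  <Fruit Name="Blueberry" Color="Blue" />
</Fruit>
'@
Set-Content -Path $Path -Value $Xml

# Read the whole file as one string and convert it to an XML document.
[xml]$Data = Get-Content -Path $Path -Raw

# The variable now holds an XmlDocument object rather than text.
$Data.GetType().FullName

# The document's root element is named Fruit.
$Data.DocumentElement.LocalName
```

If the file is not well-formed XML, the conversion fails with an error instead of returning an object.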

Now that the XML file is being interpreted as XML data, you can navigate the variable’s contents using something called dot notation. This simply means that you can navigate the XML file’s hierarchy by using words separated by periods (dots). Let me give you an example.

Earlier I created an XML variable named $Data. If I simply enter $Data into PowerShell, then PowerShell will return the word Fruit. If I then enter $Data.Fruit into PowerShell, then PowerShell will return fruit names (Apple, Grape, Blueberry). You can see what this looks like in the figure below.

That’s great, but what about the colors that were associated with each type of fruit? We can access those using the expression $_.Fruit.Color. Let me show you how this works.

As previously explained, you can type $Data.Fruit to see the various types of fruit that are described by the XML file. What that method did not take into account, however, is that each attribute has a name. The attribute name used for the types of fruit is Name. Similarly, the attribute used for the fruit color is Color. As such, we can retrieve the names and colors of the fruit by using these commands:

$Data.Fruit | ForEach-Object {$_.Fruit.Name}
$Data.Fruit | ForEach-Object {$_.Fruit.Color}

You can see what this looks like below.
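As a variation on the two commands above, the names and colors can also be retrieved in a single pass. This is only a sketch: it casts an inline copy of the sample data (assumed to match Demo.xml) instead of reading the file, and the output wording is my own.

```powershell
# Inline copy of the sample data, converted straight to an XML document.
[xml]$Data = @'
<Fruit>
  <Fruit Name="Apple" Color="Red" />
  <Fruit Name="Grape" Color="Purple" />
  <Fruit Name="Blueberry" Color="Blue" />
</Fruit>
'@

# $Data.Fruit is the root element; its Fruit property holds the three entries.
# Format each entry's Name and Color attributes into one line of text.
$Lines = $Data.Fruit.Fruit | ForEach-Object { '{0} is {1}' -f $_.Name, $_.Color }
$Lines
```

Reading both attributes from the same element keeps each name paired with its own color.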

So what if I wanted to access these attributes directly? Doing so isn’t quite as intuitive as you might expect, but there is a big hint in the two lines of code above. Notice how those lines both begin with $Data.Fruit. They also end with $_.Fruit, a period, and an attribute name. The point is that the word fruit appears twice. This is important if you want to access the attributes directly. Here is an example of how it is done:

$Names = $Data.Fruit.Fruit | Select -ExpandProperty Name

Now, if you enter $Names, it will show all of the fruit names. You can see what this looks like below.
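The same direct access makes it possible to act on the values, as mentioned at the start of this article. Here is a minimal sketch that looks for a fruit of a particular color. It uses an inline copy of the sample data (assumed to match Demo.xml), and the $Purple variable and message text are hypothetical.

```powershell
# Inline copy of the sample data, converted straight to an XML document.
[xml]$Data = @'
<Fruit>
  <Fruit Name="Apple" Color="Red" />
  <Fruit Name="Grape" Color="Purple" />
  <Fruit Name="Blueberry" Color="Blue" />
</Fruit>
'@

# Keep only the entries whose Color attribute is Purple.
$Purple = $Data.Fruit.Fruit | Where-Object { $_.Color -eq 'Purple' }

# Take an action based on what was found.
if ($Purple) { "Found a purple fruit: $($Purple.Name)" }
```

In a real script, the if block could do something more useful than printing a message.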

XML files and PowerShell: A perfect match

As you can see, PowerShell makes it relatively easy to read the data that is stored within an XML file. In case you are wondering, it is also possible to use PowerShell to create XML files or to modify values within existing XML files, but that’s another story for another day.
