Today we will learn how to convert XML to JSON and XML to Dict in Python. We can use the Python xmltodict module to read an XML file and convert it to a dict or to JSON data. We can also stream over large XML files and convert them to dictionaries piece by piece. Before stepping into the coding part, let’s first understand why XML conversion is necessary.
Converting XML to Dict/JSON
XML is no longer the default choice for new applications, but many large systems on the web still use this format. XML is more verbose than JSON, so most developers prefer the latter in their applications. When an application needs to understand XML provided by some source, converting it to JSON by hand can be tedious. The xmltodict module in Python makes this task easy and straightforward.
Getting started with xmltodict
Before we can use the xmltodict module, we need to install it. We will mainly use pip to perform the installation.
Install xmltodict module
Here is how we can install the xmltodict module from the Python Package Index using pip:
pip install xmltodict
The installation finishes quickly because xmltodict is a very lightweight module. Another plus point is that this module has an official Debian package, which can be installed with the apt tool:
sudo apt install python-xmltodict
On newer Debian and Ubuntu releases, the Python 3 package is named python3-xmltodict.
Python XML to JSON
The best place to start is the operation this module was primarily made for: XML to JSON conversion. Let’s look at a code snippet that shows how this can be done:
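import xmltodict
import pprint
import json
# Element names match the keys used in the dictionary example later on;
# the value of the "what" attribute is illustrative.
my_xml = """
    <audience>
        <id what="attribute">123</id>
        <name>Shubham</name>
    </audience>
"""
pp = pprint.PrettyPrinter(indent=4)
pp.pprint(json.dumps(xmltodict.parse(my_xml)))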
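If the goal is JSON that is easy to read, json.dumps accepts an indent argument, which may be simpler than going through pprint. The following is a minimal sketch of that variation; the element names and the attribute value in the sample XML are assumptions made for illustration.

```python
import json

import xmltodict

# Sample XML; element names and the attribute value are assumed for illustration.
my_xml = """
<audience>
    <id what="attribute">123</id>
    <name>Shubham</name>
</audience>
"""

# parse() returns a nested dictionary.
parsed = xmltodict.parse(my_xml)

# indent=4 asks json.dumps for a multi-line, indented JSON string.
json_text = json.dumps(parsed, indent=4)
print(json_text)
```

Because json_text is an ordinary string, it can be written to a file or sent over the network as it is.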
Here, we use the parse(...) function to convert the XML data to a dictionary, and then we use the json module to serialize it as a JSON string.
Converting XML File to JSON
Keeping XML data in the code itself is neither always possible nor realistic. Usually, we keep our data in a database or in files. We can read such files directly and convert them to JSON as well. Let’s look at a code snippet that performs the conversion with an XML file:
import xmltodict
import pprint
import json
# Read the whole file and parse it into a dictionary
with open('person.xml') as fd:
    doc = xmltodict.parse(fd.read())
pp = pprint.PrettyPrinter(indent=4)
pp.pprint(json.dumps(doc))
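For files that are too large to read into memory at once, xmltodict also offers a streaming mode through the item_depth and item_callback arguments of parse. The following is a minimal sketch of that mode. The audiences document, the second name, and the handle_audience function are hypothetical, and an in-memory io.BytesIO object stands in for a large file opened in binary mode.

```python
import io

import xmltodict

# Hypothetical data: a root element that holds many <audience> items.
xml_bytes = b"""<audiences>
    <audience><id>1</id><name>Shubham</name></audience>
    <audience><id>2</id><name>Asha</name></audience>
</audiences>"""

names = []


def handle_audience(path, item):
    # path is a list of (tag, attributes) pairs from the root down to this item;
    # item is the dictionary built for one <audience> element.
    names.append(item["name"])
    # Returning True tells xmltodict to keep parsing.
    return True


# item_depth=2 selects the children of the root element (the root is depth 1).
# A file opened with open(..., "rb") could be passed here instead of BytesIO.
xmltodict.parse(io.BytesIO(xml_bytes), item_depth=2, item_callback=handle_audience)
print(names)
```

In this mode the callback sees one item at a time, so the complete document never has to be held as a single dictionary.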
Using the open(...) function was straightforward: we used it to get a file object, read its contents, and parsed them into a dictionary that json.dumps(...) then serialized to JSON.
Python XML to Dict
As the module name suggests, xmltodict converts the XML data we provide to a plain Python dictionary. So, we can access the data with dictionary keys as well. Here is a sample program:
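import xmltodict
# Element names follow the keys accessed below; the attribute value is illustrative.
my_xml = """
    <audience>
        <id what="attribute">123</id>
        <name>Shubham</name>
    </audience>
"""
my_dict = xmltodict.parse(my_xml)
# The whole <id> element, including its attribute
print(my_dict['audience']['id'])
# Only the "what" attribute of <id>
print(my_dict['audience']['id']['@what'])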
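Two details are worth knowing when reading values by key: the text of an element that also carries attributes is stored under the #text key, and an element that repeats becomes a list. The following sketch illustrates both, along with the force_list argument of parse, which keeps the shape predictable. The sample documents and the second name are made up for illustration.

```python
import xmltodict

# An element with an attribute: its text is stored under "#text".
with_attr = xmltodict.parse('<audience><id what="attribute">123</id></audience>')
print(with_attr["audience"]["id"]["#text"])

# One <name> child gives a string, two <name> children give a list.
one = xmltodict.parse("<audience><name>Shubham</name></audience>")
two = xmltodict.parse(
    "<audience><name>Shubham</name><name>Asha</name></audience>"
)
print(one["audience"]["name"])
print(two["audience"]["name"])

# force_list makes the listed tags always come back as lists,
# so calling code does not need to handle both shapes.
forced = xmltodict.parse(
    "<audience><name>Shubham</name></audience>", force_list=("name",)
)
print(forced["audience"]["name"])
```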
So, tags can be used as keys, and so can attributes. Attribute keys just need to be prefixed with the @ symbol.
Supporting Namespaces in XML
In XML data, we usually have a set of namespaces that define the scope of the data provided by the XML file. When converting to JSON, these namespaces often need to persist in the JSON output as well. Let us consider this sample XML file:
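<!-- The namespace URI and element names are illustrative; only the values 123 and Shubham are from the original sample -->
<root xmlns="http://example.com/ns">
    <audience>
        <id what="attribute">123</id>
        <name>Shubham</name>
    </audience>
</root>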
Here is a sample program that includes the XML namespaces in the JSON output as well:
import xmltodict
import pprint
import json
# process_namespaces=True expands each tag name with its namespace URI
with open('person.xml') as fd:
    doc = xmltodict.parse(fd.read(), process_namespaces=True)
pp = pprint.PrettyPrinter(indent=4)
pp.pprint(json.dumps(doc))
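To see what namespace processing does to the keys without needing a file, the same call can be run on an inline string. The following is a minimal sketch; the example.com namespace URIs and the x and y elements are placeholders. It also uses the namespaces argument of parse, which maps a namespace URI to a shorter prefix, or to None to drop it.

```python
import xmltodict

# Placeholder document with a default namespace and one prefixed namespace.
xml = """
<root xmlns="http://example.com/default" xmlns:a="http://example.com/a">
    <x>1</x>
    <a:y>2</a:y>
</root>
"""

# Keys are expanded to "<namespace URI>:<tag name>".
expanded = xmltodict.parse(xml, process_namespaces=True)
print(list(expanded.keys()))

# Map URIs to short prefixes; None removes the namespace from the key.
namespaces = {
    "http://example.com/default": None,
    "http://example.com/a": "a",
}
collapsed = xmltodict.parse(xml, process_namespaces=True, namespaces=namespaces)
print(collapsed["root"]["x"], collapsed["root"]["a:y"])
```

Collapsing the namespaces this way keeps the resulting JSON keys short while still telling the two vocabularies apart.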
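With process_namespaces=True, every key in the output is prefixed with the URI of the namespace its element belongs to, followed by a colon.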
JSON to XML conversion
Although converting from XML to JSON is the prime objective of this module, xmltodict also supports the reverse operation, converting JSON to XML. We will provide the JSON data in the program itself. Here is a sample program:
import xmltodict
# A single root key ("data") wraps all of the values
student = {
    "data": {
        "name": "Shubham",
        "marks": {
            "math": 92,
            "english": 99
        },
        "id": "s387hs3"
    }
}
print(xmltodict.unparse(student, pretty=True))
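The program above starts from a Python dictionary. When the input is actual JSON text, it can be loaded with json.loads first and then passed to unparse. The following is a minimal sketch of that path; it reuses the same student data, and the final parse call is there only to show that values come back as strings after a round trip.

```python
import json

import xmltodict

# JSON text with a single root key, as it might arrive from a file or an API.
json_text = (
    '{"data": {"name": "Shubham", '
    '"marks": {"math": 92, "english": 99}, "id": "s387hs3"}}'
)

# JSON text -> dictionary -> XML text.
student = json.loads(json_text)
xml_text = xmltodict.unparse(student, pretty=True)
print(xml_text)

# Parsing the XML again gives the same structure, but numbers are now strings,
# because XML has no separate number type.
round_trip = xmltodict.parse(xml_text)
print(round_trip["data"]["marks"]["math"])
```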
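Please note that the data must have a single key at the root level for this to work, because an XML document has exactly one root element. Suppose we modify our program so that it contains multiple keys at the very first level of the data: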
import xmltodict
# Three keys at the root level and no single wrapping key
student = {
    "name": "Shubham",
    "marks": {
        "math": 92,
        "english": 99
    },
    "id": "s387hs3"
}
print(xmltodict.unparse(student, pretty=True))
In this case, we have three keys at the root level. If we try to unparse this form of JSON, xmltodict raises a ValueError, because the document must have exactly one root.
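One way to handle such data is to wrap it under a single root key before calling unparse. The following is a minimal sketch of that idea; the root name student is a hypothetical choice, and any valid XML element name would do.

```python
import xmltodict

student = {
    "name": "Shubham",
    "marks": {"math": 92, "english": 99},
    "id": "s387hs3",
}

# Several root-level keys: unparse refuses to build a document.
error = None
try:
    xmltodict.unparse(student, pretty=True)
except ValueError as exc:
    error = exc
    print("unparse failed:", exc)

# Wrapping the data under one key ("student" is a hypothetical root name)
# gives the XML document the single root element it needs.
wrapped = {"student": student}
xml_text = xmltodict.unparse(wrapped, pretty=True)
print(xml_text)
```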
Conclusion
In this lesson, we studied an excellent Python module that can be used to parse XML and convert it to JSON and vice versa. We also learned how to convert XML to a dict using the xmltodict module.
Reference: API Doc