Parsing XML in Java

There are several ways to parse an XML file. Some parsers transform the XML file into a traversable DOM tree. The tree consists of a root node and one or more child nodes belonging to the root node. These child nodes can have child nodes of their own, and so forth. Nodes without children are referred to as leaf nodes.
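
To make the tree structure concrete, here is a minimal sketch that parses a small XML string into a DOM tree and walks it recursively to count the leaf nodes. The class name TreeWalkSketch, the helper names parseRoot and countLeaves, and the sample XML are made up for this illustration. Note that in the DOM the text inside an element is itself a child node, so text nodes count as leaves here.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, used only for this illustration
class TreeWalkSketch {

	// Parse an XML string into a DOM tree and return its root node
	static Node parseRoot(String xml) throws Exception {
		DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
		Document document = builder.parse(new InputSource(new StringReader(xml)));
		return document.getDocumentElement();
	}

	// Count the nodes in this subtree that have no children (leaf nodes)
	static int countLeaves(Node node) {
		NodeList children = node.getChildNodes();
		if (children.getLength() == 0) {
			// No children: this node is a leaf
			return 1;
		}
		int leaves = 0;
		for (int i = 0; i < children.getLength(); i++) {
			leaves += countLeaves(children.item(i));
		}
		return leaves;
	}

	public static void main(String[] args) throws Exception {
		// Leaves: the text "x", the text "y" and the empty element <e/>
		Node root = parseRoot("<a><b>x</b><c><d>y</d><e/></c></a>");
		System.out.println("Leaf nodes: " + countLeaves(root));
	}
}
```

The sample XML is written without whitespace between the tags on purpose: line breaks and indentation between elements would show up as additional text nodes in the tree.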

[Figure: An XML tree]

An alternative to the DOM parser is the so-called SAX parser. SAX parsers are event-driven: instead of turning the XML file into a tree, they fire events to an event handler, for example when the parser detects the start of an element. The event handler can then act on the data found in the element. Because the parser handles elements without storing a tree in memory, it typically uses less memory than a DOM parser and is often faster, especially for large files.
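
As a rough sketch of the event-driven style, the following handler records every event the SAX parser reports for a small XML string. The names SaxSketch, RecordingHandler and collectEvents are invented for this example; the parser classes themselves come from javax.xml.parsers and org.xml.sax. The events are collected in a list only so they can be inspected afterwards. A real handler would usually act on each event right away and keep nothing.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class, used only for this illustration
class SaxSketch {

	// Event handler that records each event the parser fires
	static class RecordingHandler extends DefaultHandler {
		final List<String> events = new ArrayList<>();

		@Override
		public void startElement(String uri, String localName, String qName, Attributes attributes) {
			events.add("start " + qName);
		}

		@Override
		public void characters(char[] ch, int start, int length) {
			// The parser may deliver text in several chunks; short values
			// like the ones used here normally arrive in one piece
			String text = new String(ch, start, length).trim();
			if (!text.isEmpty()) {
				events.add("text " + text);
			}
		}

		@Override
		public void endElement(String uri, String localName, String qName) {
			events.add("end " + qName);
		}
	}

	// Run the SAX parser over an XML string and return the recorded events
	static List<String> collectEvents(String xml) throws Exception {
		SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
		RecordingHandler handler = new RecordingHandler();
		parser.parse(new InputSource(new StringReader(xml)), handler);
		return handler.events;
	}

	public static void main(String[] args) throws Exception {
		for (String event : collectEvents("<message><from>Foo</from><to>Bar</to></message>")) {
			System.out.println(event);
		}
	}
}
```

No tree is built at any point: once endElement has been called for an element, the parser keeps nothing about it.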

This post demonstrates a Java program that parses an XML file. It uses a DOM parser because that is the easier of the two to work with. The following XML file, message.xml, represents a message in a simple messaging format.

<?xml version="1.0" encoding="UTF-8"?>
<message>
	<from>Foo</from>
	<to>Bar</to>
	<subject>Hi</subject>
	<body>What's up?</body>
</message>

Every message simply consists of a sender, a recipient, a subject line and the body of the message. The XML parser is going to parse the XML file and output the results to the console.

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.*;

public class XMLParser {

	DocumentBuilderFactory documentBuilderFactory;
	DocumentBuilder documentBuilder;

	public XMLParser() throws Exception {
		documentBuilderFactory = DocumentBuilderFactory.newInstance();
		documentBuilder = documentBuilderFactory.newDocumentBuilder();
	}

	public void parseXML(String xmlFilePath) throws Exception {
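		// Parse the XML file into a DOM tree. parse(String) expects a URI,
		// so the file path is wrapped in a File to be safe.
		Document domTree = documentBuilder.parse(new java.io.File(xmlFilePath));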

		// Select the root node
		Node rootNode = domTree.getDocumentElement();

		System.out.println("Root node: " + rootNode.getNodeName());

		// Traverse the root node and print all the elements
		NodeList childNodes = rootNode.getChildNodes();

		for(int i=0;i<childNodes.getLength();i++){
			// Select the child node
			Node childNode = childNodes.item(i);

			// Make sure this child node is of type Element
			if(childNode.getNodeType() == Node.ELEMENT_NODE){
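				// The text of an element is stored in a child text node,
				// not in the element itself. Empty elements have no such
				// child, so guard against null.
				Node textNode = childNode.getFirstChild();
				String value = (textNode != null) ? textNode.getNodeValue() : "";

				// Print the name and value of the child node
				System.out.println("\tNode name: " + childNode.getNodeName() +
						" value: " + value);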
			}
		}

	}

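}

The parser is started from a separate Main class, which goes in its own file (Main.java) because a Java source file can hold only one public top-level class.

public class Main {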

	/**
	 * @param args
	 */
	public static void main(String[] args) {
		try {
			// Create our parser
			XMLParser xmlParser = new XMLParser();
			// Parse message.xml
			xmlParser.parseXML("message.xml");
		} catch(Exception ex){
			System.out.println("An error has occurred: " + ex.getMessage());
		}
	}

}

Output:

Root node: message
	Node name: from value: Foo
	Node name: to value: Bar
	Node name: subject value: Hi
	Node name: body value: What's up?
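
The parser above only prints what it finds. If the fields of a message are needed as values, one possible variation is to collect them into a map instead. The sketch below is an illustration only: the class name MessageFieldsSketch and the method readFields are made up, and the XML is passed in as a string so the example does not depend on a file. It uses getTextContent(), which returns an empty string for an empty element rather than failing.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class, used only for this illustration
class MessageFieldsSketch {

	// Read the child elements of the root node into a name -> text map
	static Map<String, String> readFields(String xml) throws Exception {
		DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
		Document document = builder.parse(new InputSource(new StringReader(xml)));
		Element root = document.getDocumentElement();

		// LinkedHashMap keeps the fields in document order
		Map<String, String> fields = new LinkedHashMap<>();
		NodeList children = root.getChildNodes();
		for (int i = 0; i < children.getLength(); i++) {
			Node child = children.item(i);
			// Skip whitespace text nodes between the elements
			if (child.getNodeType() == Node.ELEMENT_NODE) {
				fields.put(child.getNodeName(), child.getTextContent());
			}
		}
		return fields;
	}

	public static void main(String[] args) throws Exception {
		String xml = "<message><from>Foo</from><to>Bar</to>"
				+ "<subject>Hi</subject><body>What's up?</body></message>";
		Map<String, String> fields = readFields(xml);
		System.out.println("From: " + fields.get("from"));
		System.out.println("Subject: " + fields.get("subject"));
	}
}
```

If an element name occurs more than once, the map keeps only the last value, so this approach only suits formats like the message above where each field appears once.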