Level: Introductory
Soma Ghosh (sghosh@entigo.com), Application developer, Entigo
16 Sep 2003
More and more enterprise and Java technology projects are making use of XML as a medium to store data in a portable fashion. But due to the increased processing power demanded by XML parsers, J2ME applications have largely been left out of this trend. Now, however, small-footprint XML parsers for the Java language are emerging that will allow MIDP programmers to take advantage of the power of XML. Soma Ghosh illustrates their potential with a sample application.
The fusion of Java and XML technologies creates the powerful combination of portable code and portable data. But where does the Java 2 Platform, Micro Edition (J2ME) fit in? In this article, I'll show some of the progress that has been made in cutting XML parsers down to a size suited to J2ME applications and limited-resource platforms. I'll use the kXML package to write an application for the MIDP profile that can parse an XML document. But first, I'll take a look at the world of XML parsers, and find out some of the reasons why they've been slow to move to the J2ME platform.
XML parsers: An overview
An XML processing model describes the steps an application should take to process XML; an application that implements such a model is called an XML parser. You can integrate an XML parser into your Java applications with the Java API for XML Processing (JAXP). JAXP allows applications to parse and transform XML documents using an API that is independent of any particular XML processor implementation. Through a plug-in scheme, developers can change XML processor implementations without altering their applications.
The XML parsing process operates in three phases:
- XML input processing. In this stage, the application parses and validates the source document; recognizes and searches for relevant information based on its location or its tagging in the source document; extracts the relevant information when it is located; and, optionally, maps and binds the retrieved information to business objects.
- Business logic handling. This is the stage in which the actual processing of the input information takes place. It might result in the generation of output information.
- XML output processing. In this stage, the application constructs a model of the document to be generated with the Document Object Model (DOM). It then either applies XSLT style sheets or directly serializes to XML.
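The output-processing phase can be sketched with the standard JAXP classes that ship with Java SE (they are not part of MIDP). The class name RssWriter and the method channelXml are made up for this illustration; the sketch builds a small DOM tree and serializes it directly, without a style sheet.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper: builds a tiny RSS document as a DOM tree, then serializes it.
final class RssWriter {
    static String channelXml(String title) throws Exception {
        // Construct the in-memory model of the document to be generated.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element rss = doc.createElement("rss");
        rss.setAttribute("version", "0.91");
        doc.appendChild(rss);
        Element channel = doc.createElement("channel");
        rss.appendChild(channel);
        Element titleElement = doc.createElement("title");
        titleElement.setTextContent(title);
        channel.appendChild(titleElement);

        // A transformer created without a style sheet copies the tree to the output as-is.
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }
}
```

To apply an XSLT style sheet instead, the same code could obtain its Transformer from TransformerFactory.newTransformer(Source), passing the style sheet as the source.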
SAX (the Simple API for XML) and DOM are the most common processing models. If you use a SAX-based parser to process an XML document, you need to code methods to handle events thrown by the parser as it encounters the different tokens of the markup language. Because a SAX parser generates a transient flow of events, the XML input processing steps described above (parsing, recognizing, extracting, and mapping) must be performed in a single cycle: Each caught event is handled immediately and the relevant information is passed on with the event. SAX-based parsers fall into the category of push parsers. A push parser reads through an entire XML document. As it encounters various parts of the document, it notifies a listener object.
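As a rough illustration of the push model, the following sketch uses the SAX classes from Java SE (not available on MIDP) to collect the text of every title element. The class TitleHandler and its titlesOf method are names invented for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the text of every <title> element.
final class TitleHandler extends DefaultHandler {
    final List<String> titles = new ArrayList<String>();
    private StringBuilder current; // non-null only while inside <title>

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("title".equals(qName)) {
            current = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The parser may deliver the text of one element in several chunks.
        if (current != null) {
            current.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("title".equals(qName) && current != null) {
            titles.add(current.toString());
            current = null;
        }
    }

    // Parses the whole document; the parser pushes events into the handler.
    static List<String> titlesOf(String xml) throws Exception {
        TitleHandler handler = new TitleHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.titles;
    }
}
```

Note that the handler has to keep its own state (the current field) between callbacks, because each event is gone once its method returns.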
If you use a DOM-based parser, you need to write code to walk through the tree-like data structure that the parser will create from the source document. With DOM, the XML input processing is done in at least two cycles. First, the DOM parser creates a tree-like data structure, called a DOM tree, that models the XML source document; then the application walks through the DOM tree, searching for relevant information to extract and further process. This last cycle can be repeated as many times as necessary, since the DOM tree persists in memory. DOM-based parsers fall into the category of model parsers. A model parser reads an entire document and creates a representation of the document in memory.
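The two cycles of the model approach might look like the following sketch, again written against the Java SE DOM API rather than a MIDP library. The class DomTitles and its methods are hypothetical names chosen for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class DomTitles {
    // Cycle 1: the parser reads the entire document and builds the tree in memory.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Cycle 2: the application walks the tree, depth first, looking for an element.
    // Returns the text of the first element named tagName, or null if there is none.
    static String findText(Node node, String tagName) {
        if (node.getNodeType() == Node.ELEMENT_NODE && tagName.equals(node.getNodeName())) {
            return node.getTextContent();
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            String found = findText(child, tagName);
            if (found != null) {
                return found;
            }
        }
        return null;
    }
}
```

Because the tree stays in memory, findText can be called repeatedly on the same Document (once for title, once for description) without parsing the source again.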
Both push and model parsers require an amount of memory and processing power that is beyond the capabilities of many J2ME devices. To get around those device limitations, a third type of parser, called a pull parser, can be used. A pull parser reads a small amount of a document at once. The application drives the parser through the document by repeatedly requesting the next piece. The kXML parser that I'll use in my sample application is an example of a pull parser.
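To show the pull style in isolation, here is a minimal sketch that uses StAX (javax.xml.stream) from Java SE as a stand-in. It is not kXML and does not run on MIDP, but the control flow is the same idea: the application asks for each event. The class PullTitle is a hypothetical name.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

final class PullTitle {
    // Returns the text of the first <title> element, or null if there is none.
    static String firstTitle(String xml) throws Exception {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                // The application, not the parser, requests each next event.
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "title".equals(reader.getLocalName())) {
                    // Stop as soon as the wanted text is found; the rest is never read.
                    return reader.getElementText();
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }
}
```

The early return is the main attraction on a constrained device: once the application has what it needs, it simply stops pulling.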
XML parsing in a MIDP environment
You can use XML parsers in J2ME applications to interface with an existing XML service. For example, you could get a customized view of news on your phone from an aggregator site that summarizes headlines and story descriptions for a news site in XML format.
XML parsers tend to be bulky, with heavy runtime memory requirements. To be usable in the MIDP environment, an XML parser must be small enough to meet the resource constraints of MIDP-based devices. It should also be easily portable, so that minimal effort is required to bring it to MIDP.
Two frequently used XML parsers for resource-constrained devices are kXML and NanoXML. kXML is written exclusively for the J2ME platform (CLDC and MIDP). As of version 1.6.8 for MIDP, NanoXML supports DOM parsing. (See Resources for links to NanoXML and kXML.)
Performance issues in deploying XML parsers
There are several performance issues that you should keep in mind while deploying XML parsing in a MIDP application:
- Increase in size: An XML parser is code-intensive and increases the overall size of an application. This is a particularly important consideration for resource-constrained MIDP devices. There are several optimization techniques you can use to fight code expansion. First, you should remove resource files that are not in use. You should also use obfuscators that will remove unused classes, methods, and variables from your JAR file.
- Heavy string parsing: XML parsers use intensive string parsing to perform their jobs; this will add to the overhead in MIDP applications with low runtime memory. XML documents that J2ME applications parse need to be small and contain as much useful information as possible.
- Slow response time: As the MIDP application parses a relatively large amount of XML data, the response time will increase. The XML files to be parsed should be small, and the parsing should be done in a thread of execution that is separate from the main application.
I'll use some of these techniques to keep my sample application streamlined.
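The advice about parsing in a separate thread can be sketched as follows. This is only an outline under stated assumptions: BackgroundParse, Task, and Listener are hypothetical names, and the actual parsing work is whatever the caller supplies in Task.run(). The code sticks to Thread and Runnable, which are also available on CLDC.

```java
// Hypothetical helper that runs a parsing task off the calling thread.
final class BackgroundParse {
    /** Hypothetical unit of work, for example a call into an XML parser. */
    interface Task {
        String run() throws Exception;
    }

    /** Hypothetical callback that receives the outcome on the worker thread. */
    interface Listener {
        void parsed(String text);
        void failed(Exception e);
    }

    // Starts the task on a new thread and returns immediately,
    // so the caller (such as a command handler) is not blocked.
    static Thread start(final Task task, final Listener listener) {
        Thread worker = new Thread(new Runnable() {
            public void run() {
                try {
                    listener.parsed(task.run());
                } catch (Exception e) {
                    listener.failed(e);
                }
            }
        });
        worker.start();
        return worker;
    }
}
```

In a MIDlet, a command handler could call start() and return at once, leaving the listener to update the screen when the text is ready.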
An XML-parsing J2ME application
In this section, I'll develop a J2ME application, ParseXML, that parses RSS (rich site summary)-formatted XML documents and displays the information encoded in those documents on a phone screen. RSS is a simple XML format that summarizes headlines and story descriptions for a news site. You can see an example of an RSS document in Listing 1.
Listing 1. RSS data to be parsed by the application
<?xml version="1.0"?>
<!DOCTYPE rss PUBLIC
    "-//Netscape Communications//DTD RSS 0.91//EN"
    "http://my.netscape.com/publish/formats/rss-0.91.dtd">
<rss version="0.91">
  <channel>
    <title>Meerkat: An Open Wire Service</title>
    <link>http://meerkat.oreillynet.com</link>
    <description>
      Meerkat is a Web-based syndicated content reader based on RSS ("Rich Site Summary").
      RSS is a fantastic, simple-yet-powerful syndication system rapidly gaining momentum.
    </description>
  </channel>
</rss>
On execution, ParseXML displays the title and description from the above RSS data on the phone screen. The UI aspects of the application are handled by the javax.microedition.lcdui package.
The XML parser used in this application is kXML 1.2. In order to make kXML classes available to the application, you'll need to download the kxml.zip package from the kXML site (see Resources for a link). Then, copy the contents of the package to the appropriate folder in your development system. If you're using Sun's J2ME toolkit (see Resources), copy the contents to TOOLKIT_HOME\apps\ParseXML\lib.
The J2ME application initially displays a text box with two commands: XML, which triggers the display of the parsed XML data, and Exit. When the user selects the XML command, the viewXML() method is called. In this method, the application creates an instance of XmlParser to walk its way through the document. For every element it finds, the application checks for the tag names title and description in order to find the text that you want to display. This process continues recursively. The user can exit the application at any time by selecting Exit.
The RSS data is made available to ParseXML in the form of a String. Because XmlParser reads its input from a Reader, the String is first converted into a byte array, which is used to construct a ByteArrayInputStream. That stream is wrapped in an InputStreamReader, which is in turn passed to the XmlParser constructor. This process is illustrated in Listing 2.
Listing 2. Creating an instance of XmlParser to parse a string containing RSS data
byte[] xmlByteArray = xmlStr.getBytes();
ByteArrayInputStream xmlStream = new ByteArrayInputStream( xmlByteArray );
InputStreamReader xmlReader = new InputStreamReader( xmlStream );
XmlParser parser = new XmlParser( xmlReader );
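Listing 2 calls getBytes() and the InputStreamReader constructor without naming an encoding, so both fall back to the platform default. A slightly more defensive variant, sketched below, names the encoding on both sides so that non-ASCII characters survive the round trip. The helper XmlReaders.fromString is a made-up name, and the sketch assumes the device supports UTF-8.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;

// Hypothetical helper: wraps an XML string in a Reader using one explicit encoding.
final class XmlReaders {
    static Reader fromString(String xml) throws IOException {
        // Encode and decode with the same charset (assumed to be available).
        byte[] bytes = xml.getBytes("UTF-8");
        return new InputStreamReader(new ByteArrayInputStream(bytes), "UTF-8");
    }
}
```

The returned Reader could then be handed to the parser in place of xmlReader.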
To capture data from an aggregator site, the application should open a URL connection and read the RSS data from an InputStream. The InputStream is made available to the XmlParser through an InputStreamReader. This is illustrated in Listing 3.
Listing 3. Get the RSS data from an aggregator site
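// Open an HTTP connection to the aggregator site's RSS feed
HttpConnection hc = (HttpConnection)Connector.open(url);
InputStream is = hc.openInputStream();
// Wrap the byte stream in a Reader for the parser
Reader reader = new InputStreamReader(is);
XmlParser parser = new XmlParser(reader);
// Close is and hc once parsing is finished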
When the XmlParser's read() method encounters an item, it returns a ParseEvent object. This object contains valuable information, such as the event type (whether it represents the start of a tag, the end of a tag, the end of a document, text, or whitespace), the event name (that is, the tag name), and the event text (that is, the text enclosed between the start and end tags). As illustrated in Listing 4, in the example the parser finds the tag names title and description, then reads further to extract the text for display purposes.
A ParseEvent reports one of the event types Xml.START_TAG, Xml.END_TAG, Xml.END_DOCUMENT, Xml.WHITESPACE, or Xml.TEXT, depending on whether the parser has encountered the start of a tag, the end of a tag, the end of the document, whitespace, or text between tags.
Listing 4. Picking up title and description from RSS data
case Xml.START_TAG:
    // see API doc of StartTag for more access methods
    // Pick up Title for display
    if ("title".equals(event.getName()))
    {
        pe = parser.read();
        title = pe.getText();
    }
    // Pick up description for display
    if ("description".equals(event.getName()))
    {
        pe = parser.read();
        desc = pe.getText();
    }
Listing 5 contains the complete code listing for the ParseXML application.
Listing 5. ParseXML application
import javax.microedition.midlet.*;
import javax.microedition.lcdui.*;
import javax.microedition.io.*;
import java.io.*;
// kXML imports
import org.kxml.*;
import org.kxml.parser.*;

public class ParseXML extends MIDlet implements CommandListener {
    private Command exitCommand; // The exit command
    private Command displayXML;  // On execution, it displays title and description
                                 // on phone screen
    private Display display;     // The display for this MIDlet
    // UI items for display of title and description on phone screen
    private static TextBox t;
    private static String textBoxString = "";
    // XML String
    private String xmlStr = "";

    public ParseXML() {
        display = Display.getDisplay( this );
        exitCommand = new Command( "Exit", Command.EXIT, 2 );
        displayXML = new Command( "XML", Command.SCREEN, 1 );
        // The XML String in form of RSS
        StringBuffer xmlString = new StringBuffer();
        xmlString.append("<?xml version=\"1.0\"?>");
        xmlString.append("<!DOCTYPE rss PUBLIC \"-//Netscape Communications//DTD RSS 0.91//EN\" ");
        xmlString.append("\"http://my.netscape.com/publish/formats/rss-0.91.dtd\">");
        xmlString.append("<rss version=\"0.91\">");
        xmlString.append("<channel><title>Meerkat: An Open Wire Service</title>");
        xmlString.append("<link>http://meerkat.oreillynet.com</link>");
        xmlString.append("<description>Meerkat is a Web-based syndicated content ");
        xmlString.append("reader based on RSS (\"Rich Site Summary\"). ");
        xmlString.append("RSS is a fantastic, simple-yet-powerful syndication ");
        xmlString.append("system rapidly gaining momentum.");
        xmlString.append("</description><language>en-us</language>");
        xmlString.append("</channel>");
        xmlString.append("</rss>");
        xmlStr = xmlString.toString();
    }

    public void startApp() {
        // The textbox displays title and description from a RSS String
        t = new TextBox( "MIDlet XML", "kXML", 256, 0 );
        t.addCommand( exitCommand );
        t.addCommand( displayXML );
        t.setCommandListener( this );
        display.setCurrent( t );
    }

    /**
     * Pause is a no-op since there are no background activities or
     * record stores that need to be closed.
     */
    public void pauseApp() { }

    /**
     * Destroy must clean up everything not handled by the garbage collector.
     * In this case there is nothing to clean up.
     */
    public void destroyApp(boolean unconditional) { }

    /*
     * Respond to commands, including exit. On the exit command, clean up
     * and notify that the MIDlet has been destroyed.
     */
    public void commandAction(Command c, Displayable s) {
        if ( c == exitCommand ) {
            destroyApp( false );
            notifyDestroyed();
        }
        else if ( c == displayXML ) {
            try {
                viewXML();
            }
            catch( Exception e ) {
                e.printStackTrace();
            }
        }
    }

    // Sets up the kXML parser and calls traverse() to parse the whole XML String.
    // An IOException is passed on to the caller instead of being silently dropped.
    public void viewXML() throws IOException {
        byte[] xmlByteArray = xmlStr.getBytes();
        ByteArrayInputStream xmlStream = new ByteArrayInputStream( xmlByteArray );
        InputStreamReader xmlReader = new InputStreamReader( xmlStream );
        XmlParser parser = new XmlParser( xmlReader );
        try {
            traverse( parser, "" );
        }
        catch ( Exception exc ) {
            exc.printStackTrace();
        }
    }

    /**
     * Traverses the XML file
     */
    public static void traverse( XmlParser parser, String indent ) throws Exception
    {
        boolean leave = false;
        String title = "";
        String desc = "";
        do {
            ParseEvent event = parser.read();
            ParseEvent pe;
            switch ( event.getType() ) {
                // For example, <title>
                case Xml.START_TAG:
                    // see API doc of StartTag for more access methods
                    // Pick up Title for display
                    if ("title".equals(event.getName()))
                    {
                        pe = parser.read();
                        title = pe.getText();
                    }
                    // Pick up description for display
                    if ("description".equals(event.getName()))
                    {
                        pe = parser.read();
                        desc = pe.getText();
                    }
                    textBoxString = title + " " + desc;
                    traverse( parser, "" ); // recursion call for each <tag></tag>
                    break;
                // For example, </title>
                case Xml.END_TAG:
                    leave = true;
                    break;
                // Reached after the closing </rss> tag, when the input is exhausted
                case Xml.END_DOCUMENT:
                    leave = true;
                    break;
                // For example, the text between tags
                case Xml.TEXT:
                    break;
                case Xml.WHITESPACE:
                    break;
                default:
                    break;
            }
        } while( !leave );
        t.setString( textBoxString );
    }
}
Figure 1 shows the ParseXML application in action.
Figure 1. ParseXML application at run time
The bottom line
In this article, you saw how you can use J2ME to fuse Java technology and XML -- in other words, to fuse portable code with portable data. Designing J2ME applications with embedded parsers can be a challenge because of the resource constraints inherent in J2ME devices. However, with the gradual availability of compact parsers suited to the MIDP platform, XML parsing will soon be a widely used feature of the Java platform on mobile devices.
Resources
- Check out the kXML parser, which I use in the sample application. You can download the kxml.zip package from this site.
- Stay connected to the wireless world with the developerWorks Wireless zone.
- The developerWorks Java technology zone keeps you up to date on J2ME, MIDlet, and other events in the Java arena.
- Find the latest developments on wireless, mobile, and voice computing on IBM's Pervasive Computing site.
- Build Java apps with IBM's VisualAge for Java.
- Download Sun's J2ME Toolkit to run and test your wireless applications.
- java.sun.com has several MIDP articles that offer additional insight on the pros and cons of the XML parsing discussed in this article.
About the author
A holder of a master's degree in computer science and engineering, Soma Ghosh has developed a wide range of Java applications in the areas of e-commerce and networking over the past eight years. She believes that wireless commerce represents the near future of the industry and has recently been drawn to develop wireless versions of existing desktop components and models. Soma is an open source Java developer and is currently associated with Entigo, a pioneer in e-business solutions and B2B sell- and service-side e-commerce products. Contact her at sghosh@entigo.com.