Assignment One
CS 6890 - Web-based Database Management Systems
Utah State University

Overview

Weight: The homework will count for 8% of your final grade.
Due Date: Tuesday, February 5 (by 11:59 PM)
This assignment is an introduction to parsing XML in Java. We will use Java's DOM-style parsing. You may do the assignment using Windows or Linux. Part of the assignment is provided in the tarred, gzipped directory one.tar.gz.

Compiling Environment

Commands in the Makefile are set up to compile and run under Linux.

Turnin

All code should be developed in the one directory. When you are done and certain that everything is working correctly, create a tarred, gzipped copy of your directory by executing
  make tar
This first runs make clean to remove all of the .class files in your directory, and then creates the file one.tar.gz in your directory. Turn in the assignment using the turnin page by uploading the file one.tar.gz. You may turn in your assignment as many times as you like.

Documentation

All documentation should be done using javadoc. We will automatically run javadoc on your programs and examine the results. For more information on javadoc, execute
  man javadoc
in your Linux shell.

Grading

The assignment will be graded for good programming style (indentation and appropriate comments), as well as clean compilation and execution (the programs should work!).

Makefile

The Makefile has targets to compile and test your code. The supported targets are listed below, where File stands for the name of one of your Java programs.
  • File.class - Compile File.java to create File.class. For instance, to compile ElementIgnore.java execute
      make ElementIgnore.class
    
  • File.one - Test File.java on the file xml/one.xml. For instance, to test ElementIgnore.java on the file xml/one.xml execute
      make ElementIgnore.one
    
    There are several test targets supported: File.one, File.two, File.three, and File.four.
  • File.doc - Run javadoc on File.java and put the output into the doc/ directory. For instance, to produce the documentation for ElementIgnore.java execute
      make ElementIgnore.doc
    
  • tar - Explained above, see the turnin instructions.
  • clean - Delete all .class files.

Parsing XML

To produce the following programs, you will use the built-in Java package to parse XML. In other words, don't write your own parser. For each program, use the Document Object Model (DOM) style of parsing to create a "tree" of the XML. The Traverse.java file is a working example that parses an XML document, creates the DOM tree, and traverses the tree to print the document.
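
As a starting point, the sketch below shows one way to build a DOM tree with the parser classes that ship with the JDK (javax.xml.parsers and org.w3c.dom). The class name DomParseSketch and the inline XML string are assumptions made for illustration; they are not part of the provided files, and Traverse.java may be organized differently.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical class name; not part of the provided assignment files.
class DomParseSketch {

    // Build a DOM tree from XML text using the parser bundled with the JDK.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Wrap parser and I/O failures so callers see a single exception type.
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<hi name=\"23\"><w age=\"36\"/></hi>");
        Element root = doc.getDocumentElement();
        System.out.println(root.getTagName());                 // name of the root element
        System.out.println(root.getAttribute("name"));         // value of its name attribute
        System.out.println(root.getChildNodes().getLength());  // number of child nodes
    }
}
```

To read a file such as xml/one.xml instead of a string, DocumentBuilder also offers a parse overload that takes a java.io.File.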

Traverse.java

An example inorder parse tree traversal program is included in the file Traverse.java. Please study the traversal program to learn how to traverse the tree structure.

PrettyPrint.java - 33%

This program should print a nicely indented XML tree.
  • Each element is printed on a separate line.
  • A text node consisting entirely of whitespace can be either ignored or printed (if printed, it will appear as at least one blank line).
  • Each element or text node in the content of an element should be indented two spaces.
  • Each attribute should be printed on a separate line, indented with respect to the enclosing element.
  • Each comment or processing instruction node should be ignored.
For example, suppose xml/one.xml contains the following.
<?xml version="1.0"?>
<hi name="23"> <w age="36"      file="j">
<howdy>hello</howdy> </w> </hi>
Then executing make PrettyPrint.one would produce the following output.
<hi 
  name="23">
  <w 
    file="j" 
    age="36">
    <howdy>
      hello
    </howdy>
  </w>
</hi>
Note that the order of attributes is unimportant (the order may be reversed by the parsing).
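
The rules above can be approached with a recursive walk that carries the current indentation. The following is a minimal sketch, not the reference solution: the class name PrettyPrintSketch and its methods are hypothetical, whitespace-only text nodes are ignored (one of the two permitted choices), other text is trimmed before printing (an assumption), and special characters are not re-escaped.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical class name; a sketch of the indentation idea only.
class PrettyPrintSketch {

    // Parse the XML text and return the indented form as a string.
    static String format(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            print(doc.getDocumentElement(), "", out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Append one node, and recursively its children, at the given indentation.
    static void print(Node node, String indent, StringBuilder out) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            out.append(indent).append('<').append(node.getNodeName());
            // Each attribute goes on its own line, indented relative to the element.
            NamedNodeMap attrs = node.getAttributes();
            for (int i = 0; i < attrs.getLength(); i++) {
                Node attr = attrs.item(i);
                out.append('\n').append(indent).append("  ")
                   .append(attr.getNodeName()).append("=\"")
                   .append(attr.getNodeValue()).append('"');
            }
            out.append(">\n");
            // Content is indented two more spaces than the enclosing element.
            for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
                print(c, indent + "  ", out);
            }
            out.append(indent).append("</").append(node.getNodeName()).append(">\n");
        } else if (node.getNodeType() == Node.TEXT_NODE) {
            String text = node.getNodeValue().trim();
            if (!text.isEmpty()) {            // whitespace-only text is skipped
                out.append(indent).append(text).append('\n');
            }
        }
        // Comments, processing instructions, and other node types are ignored.
    }

    public static void main(String[] args) {
        System.out.print(format("<hi name=\"23\"> <w age=\"36\"><howdy>hello</howdy> </w> </hi>"));
    }
}
```

The indentation is passed down as a string rather than a depth counter, so each recursive call only has to append two spaces.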

Count.java - 33%

This program prints a count of each kind of element. You might want to use a Hashtable (or a HashMap) to store the counts. For example, suppose xml/two.xml contains the following XML.
<?xml version="1.0"?>
<yyy>abcd<zzz>

</zzz>A<yyy>B</yyy> Groovy! </yyy>

Then executing make Count.two would produce the following output.
yyy -> 2
zzz -> 1
Note that whitespace such as carriage returns is not contained within an element and so should be retained.
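
One possible shape for the counting step is sketched below. It is an illustration under stated assumptions: the class name CountSketch is hypothetical, a TreeMap is swapped in for the suggested hash table so that the names come out in sorted order, and the elements are collected with getElementsByTagName("*") rather than with a hand-written traversal like the one in Traverse.java.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class name; a sketch of the counting idea only.
class CountSketch {

    // Return a map from element name to the number of times it occurs.
    static Map<String, Integer> count(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // TreeMap keeps the names sorted; a HashMap would also work.
            Map<String, Integer> counts = new TreeMap<>();
            // "*" matches every element in the document, including the root.
            NodeList all = doc.getElementsByTagName("*");
            for (int i = 0; i < all.getLength(); i++) {
                counts.merge(all.item(i).getNodeName(), 1, Integer::sum);
            }
            return counts;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\"?>\n"
                + "<yyy>abcd<zzz>\n\n</zzz>A<yyy>B</yyy> Groovy! </yyy>\n";
        for (Map.Entry<String, Integer> e : count(xml).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```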

Number.java - 33%

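This program outputs the label of the "scope" in which each element appears. A scope is the extent of an element, from the starting tag through the ending tag. Every element creates its own scope. Elements are labeled with consecutive numbers, starting at 1, in the order their starting tags appear. The scope is represented as a range of numbers, from the element's own label through the largest label among its descendants (inclusive); an element with no element children has a range that starts and ends at its own label. For example, assume the file xml/one.xml contains the following.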
<?xml version="1.0"?>
<hi><name>23</name>
 <w><file>j</file><age>36</age>
  <howdy>hello</howdy>
 </w>
</hi>
Then executing make Number.one would produce the following output.
hi (1-6) 
name (2-2)
w (3-6) 
file (4-4)
age (5-5)
howdy (6-6)
The range of numbers is given after each element.
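
The labeling can be computed in a single recursive pass: an element takes the next free number when it is entered, and the end of its range is the largest number handed out inside it. The sketch below assumes this reading of the example; the class name NumberSketch and its methods are hypothetical and not part of the provided files.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical class name; a sketch of the scope-labeling idea only.
class NumberSketch {

    // Return one line per element, in document order, such as "hi (1-6)".
    static List<String> scopes(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String> lines = new ArrayList<>();
            label(doc.getDocumentElement(), 1, lines);
            return lines;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Label element e with 'start' and return the largest label used in its subtree.
    static int label(Element e, int start, List<String> lines) {
        int slot = lines.size();
        lines.add(null);               // reserve this element's line to keep document order
        int last = start;              // with no element children the range is start-start
        for (Node c = e.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ELEMENT_NODE) {
                last = label((Element) c, last + 1, lines);
            }
        }
        lines.set(slot, e.getTagName() + " (" + start + "-" + last + ")");
        return last;
    }

    public static void main(String[] args) {
        String xml = "<hi><name>23</name>\n <w><file>j</file><age>36</age>\n"
                + "  <howdy>hello</howdy>\n </w>\n</hi>";
        for (String line : scopes(xml)) {
            System.out.println(line);
        }
    }
}
```

The end of a range is only known after the children have been visited, which is why each element's line is reserved first and filled in afterwards.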
  E-mail questions or comments to Curtis.Dyreson at usu dot edu