Java for C++ Programmers

Adapted from Vicki Allan's adaptation of Marvin Solomon's (University of Wisconsin) tutorial


Introduction

This tutorial is intended as a self-help guide. If you get lost using it, take note of your problems and let me know so I can make it clearer.

The code used in this tutorial is available for download in a single archive file. Get tutorial-code.zip or tutorial-code.tar.bz2 as appropriate.

Applications vs. Applets

The first thing you have to decide when writing a Java program is whether you are writing an application or an applet. An applet is a piece of code designed to display a part of a document. It is run by a browser (such as Firefox or Internet Explorer) in response to an <object> tag in the document. (Older versions of the HTML specification described an <applet> tag, but its use is now deprecated.) An application, on the other hand, is a standalone program.

Java was originally designed to build active, multimedia, interactive environments, so its standard runtime library has lots of features to aid in creating user interfaces. There are standard classes to create scrollbars, pop-up menus, etc. There are special facilities for manipulating URLs and network connections.

Bytecode

Bytecode is computer object code that is processed by a program, usually referred to as a virtual machine, rather than by the “real” computer machine, the hardware processor. The virtual machine converts each generalized machine instruction into a specific machine instruction or instructions that this computer's processor will understand. Bytecode is the result of compiling source code written in a language that supports this approach. Most computer languages, such as C and C++, require a separate compiler for each computer platform — that is, for each computer operating system and the hardware set of instructions that it is built on. Using a language that comes with a virtual machine for each platform, your source language statements need to be compiled only once and will then run on any platform.

The best-known language today that uses the bytecode and virtual machine approach is Java. Rather than being interpreted one instruction at a time, Java bytecode can be recompiled at each particular system platform by a just-in-time (JIT) compiler. Usually, this will enable the Java program to run faster. In Java, bytecode is contained in a binary file with a .class suffix.

The Java API

The Java language is actually rather small and simple: an order of magnitude smaller and simpler than C++, and in some ways, even smaller and simpler than C. However, it comes with a very large and constantly growing library of utility classes. Fortunately, you only need to know about the parts of this library that you really need, you can learn about it a little at a time, and there is excellent, browsable, on-line documentation. These libraries are grouped into packages. One set of about 175 packages, called the Java 2 Platform API, comes bundled with the language (API stands for “Application Programming Interface”). You will primarily use classes from three of these packages:

  • java.lang contains things, like strings, that are essentially built in to the language,
  • java.io contains support for input and output, and
  • java.util contains some handy data structures such as lists and hash tables.

A First Example

Large parts of Java are identical to C++. For example, the following procedure, which sorts an array of integers using insertion sort, is exactly the same in C++ or Java.¹

 
// Sort the array a[] in ascending order using an insertion sort.
void sort(int a[], int size)
{
  for (int i = 1; i < size; i++)
    {
      // a[0..i-1] is sorted; insert a[i] in the proper place
      int j, x = a[i];
      for (j = i - 1; j >= 0 && a[j] > x; j--)
        {
          a[j + 1] = a[j];
        }
 
      // now a[0..j] are all <= x and a[j+2..i] are > x
      a[j+1] = x;
    }
}
        

Note that the syntax of control structures (such as for and if), assignment statements, variable declarations, and comments is the same in Java as in C++.

A Java program to test the sort procedure is different in a few ways. Here is a complete Java program using the sort procedure.

 
// SortTest.java
 
import java.util.Random;
 
class SortTest {
 
    // Sort the array a[] in ascending order using an insertion sort.
    static void sort(int a[], int size)
    {
        for (int i = 1; i < size; i++)
            {
                // a[0..i-1] is sorted; insert a[i] in the proper place
                int j, x = a[i];
                for (j = i - 1; j >= 0 && a[j] > x; j--)
                    {
                        a[j + 1] = a[j];
                    }
                // now a[0..j] are all <= x and a[j+2..i] are > x
                a[j + 1] = x;
            }
    }
 
    // Test program to test sort
    public static void main(String args[])
    {
        if (args.length != 1)
            {
                System.out.println ("usage: java SortTest array-size");
                System.exit (1);
            }
        int size = Integer.parseInt (args[0]);
        int test[] = new int[size];
        Random r = new Random();
 
        for (int i = 0; i < size; i++)
            test[i] = r.nextInt (100);
        System.out.println ("before");
        for (int i = 0; i < size; i++)
            System.out.print (" " + test[i]);
        System.out.println ();
        sort (test, size);
        System.out.println ("after");
        for (int i = 0; i < size; i++)
            System.out.print (" " + test[i]);
        System.out.println ();
        System.exit (0);
    }
}
        

To try the program out, follow the instructions in the getting started document to run it (with command-line arguments) in Eclipse. Add a few errors to the code and recompile to see how the IDE complains. For example, Java carries out conservative flow analysis to make sure that every local variable is definitely assigned before each access, so reading a local variable that might be unassigned is a compile-time error.

There are several things to note about this program. First, Java has no “top-level” or “global” variables or functions. A Java program is always a set of class definitions. Thus, we had to make sort and main member functions (called methods in Java) of a class, which we called SortTest.

Second, the main function is handled somewhat differently in Java from C++. In Java, the first thing executed is the method called main of the indicated class (in this case SortTest). The main method takes only one parameter, an array of strings (denoted String args[] in Java). This array has one element for each word on the command line following the name of the class being executed. Thus if we set the command line arguments to “10” in our example call, when main starts, args is an array of length one, and args[0] is the string "10" (not the integer 10). There is no separate argument to tell you how many words there are, but in Java, you can tell how big any array is by using its length field. In this case args.length == 1.
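The behavior of args can be checked with a small program. This is only a sketch: the class ArgsDemo and its describe method are hypothetical names, not part of the tutorial archive. Note that args[0] holds a String, so its contents are compared with equals rather than ==.

```java
// ArgsDemo.java (hypothetical example, not part of the tutorial archive)
class ArgsDemo {

    // Build a one-line description of the command-line arguments.
    static String describe(String[] args) {
        StringBuilder s = new StringBuilder(args.length + " argument(s):");
        for (int i = 0; i < args.length; i++) {
            s.append(" [").append(i).append("]=").append(args[i]);
        }
        return s.toString();
    }

    public static void main(String[] args) {
        // Unlike argv[0] in C++, args[0] is the first argument,
        // not the name of the program.
        System.out.println(describe(args));
        if (args.length == 1 && args[0].equals("10")) {
            System.out.println("The argument is the string \"10\"");
        }
    }
}
```

Running it with the single argument 10 should report one argument and then print the second message.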

The third difference to note is the way I/O is done in Java. System.out in Java is roughly equivalent to cout in C++, and

System.out.println(whatever);

is roughly equivalent to

cout << whatever << endl;

The equivalent C++ program would use three functions from the standard library, atoi, random, and exit. Integer.parseInt does the same thing as atoi: It converts the character string "10" to the integer value ten, and System.exit(1) does the same thing as exit(1): It immediately terminates the program, returning an exit status of 1 (meaning something's wrong). The library class Random defines random number generators. The statement Random r = new Random() creates an instance of this class, and r.nextInt (100) uses it to generate an integer between 0 (inclusive) and 100 (exclusive).
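As a rough illustration of these library calls, the sketch below shows Integer.parseInt rejecting text that is not a number (it throws NumberFormatException) and checks the range of values produced by nextInt. The class name ParseAndRandom, the helper methods, the -1 sentinel, and the seed 42 are all assumptions made for this example.

```java
import java.util.Random;

// ParseAndRandom.java (hypothetical example)
class ParseAndRandom {

    // Parse a size argument, returning -1 if the text is not a valid integer.
    // (-1 is an arbitrary sentinel chosen because array sizes are never negative.)
    static int parseSize(String text) {
        try {
            return Integer.parseInt(text);
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    // Check that count draws from nextInt(bound) all fall in 0..bound-1.
    static boolean allInRange(Random r, int bound, int count) {
        for (int i = 0; i < count; i++) {
            int v = r.nextInt(bound);
            if (v < 0 || v >= bound) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(parseSize("10"));    // a valid integer
        System.out.println(parseSize("ten"));   // not a number: sentinel value
        // A fixed seed makes the sequence repeatable from run to run.
        System.out.println(allInRange(new Random(42), 100, 1000));
    }
}
```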

Finally, note that the #include directives from C++ have been replaced by import declarations. Although they have roughly the same effect, the mechanisms are different. In C++, #include <iostream> pulls in a source file called iostream from a source library and compiles it along with the rest of the program. #include is usually used to include files containing declarations of library functions and classes, but the file could contain any C++ source code whatever. The Java declaration import java.util.Random makes it possible to drop the package name (java.util) and refer to the class as plain Random. Without the import statement, that class would still be accessible, but would have to be accessed by its fully qualified name (java.util.Random).

For simple types of interactive input and output, a JOptionPane is appropriate. Consider the following example:

 
// InputWindow.java
 
import javax.swing.JOptionPane;  // Refer to plain JOptionPane below
public class InputWindow
{
    public static void main(String args[])
    {
        String xString;   // first string entered by user
        String yString;   // second string entered by user
        int x;            // first number to add
        float y;          // second number to add
        float sum;        // sum of x and y
 
        xString = JOptionPane.showInputDialog ("Enter integer");
        yString = JOptionPane.showInputDialog ("Enter float");
 
        // Convert numbers from type String to the appropriate type
        x = Integer.parseInt (xString);
        y = Float.parseFloat (yString);
        sum = x + y;
 
        // Display the results
        JOptionPane.showMessageDialog(null, "The sum is " + sum, "Results",
                                      JOptionPane.PLAIN_MESSAGE );
 
        System.exit (0);   // Is this necessary?  Try leaving it out.
   }
}
          

The JOptionPane interacts with the user through Java Strings. We can convert from a string to an integer by invoking Integer.parseInt (xString). We can convert from a float to a string by concatenating a float to a string such as "The sum is " + sum.
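The same conversions can be tried without any dialog windows. The following is a minimal sketch; the class name Conversions and the method sumMessage are hypothetical.

```java
// Conversions.java (hypothetical example)
class Conversions {

    // Add an integer and a float, both given as text, and build the message.
    static String sumMessage(String xString, String yString) {
        int x = Integer.parseInt(xString);      // String -> int
        float y = Float.parseFloat(yString);    // String -> float
        float sum = x + y;                      // x is widened to float
        return "The sum is " + sum;             // float -> String by concatenation
    }

    public static void main(String[] args) {
        System.out.println(sumMessage("2", "1.5"));   // prints: The sum is 3.5
        System.out.println(String.valueOf(3.5f));     // explicit float -> String
    }
}
```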

EXERCISE 1

Copy SortTest.java onto your computer. Verify that you can compile it, and run it for 100 integers. Add a method to perform a binary search on the array. Using JOptionPane (as shown in InputWindow.java), allow the user to specify which integer he/she would like to search for. Using JOptionPane for output, indicate the number of times the specified integer was found.

Names, Packages, and Separate Compilation

As in C or C++, case is significant in Java identifiers. Aside from a few reserved words, like if, while, etc., the Java language places no restrictions on what names you use for functions, variables, classes, etc. However, there is a standard naming convention, which all the standard Java libraries follow.

  • Names of classes are in MixedCase starting with a capital letter. If the most natural name for the class is a phrase, start each word with a capital letter, as in StringBuffer.
  • Names of constants (see below) are ALL_UPPER_CASE. Separate individual words in a phrase with underscores, as in MIN_VALUE.
  • Other names are in lower case or mixedCase, starting with a lower-case letter.
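
A small sketch of these conventions in use follows. All of the names in it (NamingDemo, MAX_RETRIES, retryCount, canRetry) are made up for illustration.

```java
// NamingDemo.java (hypothetical class illustrating the naming conventions)
class NamingDemo {
    static final int MAX_RETRIES = 3;   // constant: ALL_UPPER_CASE with underscores

    int retryCount;                     // field: mixedCase, lower-case first letter

    // Method: mixedCase, lower-case first letter.
    boolean canRetry() {
        return retryCount < MAX_RETRIES;
    }

    public static void main(String[] args) {
        NamingDemo demo = new NamingDemo();   // class: MixedCase, capital first letter
        demo.retryCount = 3;
        System.out.println(demo.canRetry());  // prints false
    }
}
```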

Simple class definitions in Java look rather like class definitions in C++:

class Pair { int x, y; }

Each class definition should go in a separate file, and the name of the source file must be exactly the same (including case) as the name of the class, with ".java" appended. For example, the definition of Pair must go in file Pair.java. Compiling the file produces a .class file (here, Pair.class). There are exceptions to the rule that requires a separate source file for each class. In particular, class definitions may be nested. However, this is an advanced feature of Java, and you should never nest class definitions unless you know what you are doing!

There is a large set of predefined classes, grouped into packages. The full name of one of these predefined classes includes the name of the package as prefix. We already saw the class java.util.Random. The import statement allows you to omit the package name from one of these classes. Because the SortTest program starts with

import java.util.Random;

we can write

Random r = new Random();

rather than

java.util.Random r = new java.util.Random();

You can import all the classes in a package at once with the wild-card notation:

import java.io.*;

The package java.lang is special; every program behaves as if it started with

import java.lang.*;

whether it does or not. You can define your own packages, but defining packages is an advanced topic beyond the scope of what's required for this course.

The import statement doesn't really do anything. It just introduces a convenient abbreviation for a fully-qualified class name. When a class needs to use another class, all it has to do is reference it. The Java compiler will know that it is supposed to be a class by the way it is used, will locate the appropriate .class file, and will even compile the corresponding .java file if necessary. (That's why it's important for the name of the file to match the name of the class.) For example, here is a simple program that uses two classes:

 
class HelloTest
{
    public static void main(String[] args)
    {
        Hello greeter = new Hello();
        greeter.speak();
    }
}
 
class Hello
{
    void speak() {
        System.out.println("Hello World!");
    }
}

Put each class in a separate file (HelloTest.java and Hello.java). Then compile as before.

Values, Objects, and Pointers

It is sometimes said that Java doesn't have pointers. That is not true. In fact, objects can only be referenced with pointers. More precisely, variables can hold primitive values (such as integers or floating-point numbers) or references (pointers) to objects. A variable cannot hold an object, and you cannot make a pointer to a primitive value. Since you don't have a choice, Java doesn't have a special notation like C++ does to indicate when you want to use a pointer.

There are exactly eight primitive types in Java: boolean, char, byte, short, int, long, float, and double. Most of these are similar to types with the same name in C++. We mention only the differences.

A boolean value is either true or false. You cannot use an integer where a boolean is required (e.g. in an if or while statement) nor is there any automatic conversion between boolean and integer.
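
The sketch below shows what this means in practice for a C++ programmer: a test such as if (n) must be written as an explicit comparison, and converting between boolean and int must be spelled out. The class and method names are hypothetical.

```java
// BooleanDemo.java (hypothetical example)
class BooleanDemo {

    // In C++ one could write "if (n)"; Java requires an explicit comparison.
    static boolean isNonZero(int n) {
        return n != 0;
    }

    // There is no cast between boolean and int, so convert explicitly.
    static int toInt(boolean b) {
        return b ? 1 : 0;
    }

    public static void main(String[] args) {
        int n = 5;
        // if (n) { ... }     // compile-time error in Java: n is an int, not a boolean
        if (isNonZero(n)) {
            System.out.println("n is non-zero");
        }
        System.out.println(toInt(n > 3));   // prints 1
    }
}
```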

A char value is 16 bits, rather than the 8 bits it occupies in C or C++, to allow for all sorts of international alphabets. As a practical matter, however, you are unlikely to notice the difference. The byte type is an 8-bit signed integer (like signed char in C or C++).

A short is 16 bits and an int is 32 bits, just as in C or C++ on machines with 32-bit machine words. (In C and C++, the size is platform-dependent, but in Java it is guaranteed to be 32 bits on every platform.) On many 32-bit platforms, C++ int and long are both 32-bit quantities. A Java long, however, is 64 bits long (twice as big as an int), so it can hold any value from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807. The types float and double are just like in C++ on 32-bit machines: 32-bit and 64-bit floating point numbers, respectively. Also note that Java has support for objects representing integers of arbitrary size (the library class java.math.BigInteger).
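
These sizes can be confirmed from within a program, since the standard wrapper classes expose them as constants. The sketch below also uses java.math.BigInteger to compute a value too large for a long. The class name PrimitiveSizes and the method twoToThe64 are hypothetical.

```java
import java.math.BigInteger;

// PrimitiveSizes.java (hypothetical example)
class PrimitiveSizes {

    // Compute 2^64 exactly, which is too large for any primitive integer type.
    static BigInteger twoToThe64() {
        return BigInteger.valueOf(2).pow(64);
    }

    public static void main(String[] args) {
        System.out.println("char:  " + Character.SIZE + " bits");
        System.out.println("byte:  " + Byte.SIZE + " bits");
        System.out.println("short: " + Short.SIZE + " bits");
        System.out.println("int:   " + Integer.SIZE + " bits");
        System.out.println("long:  " + Long.SIZE + " bits");
        System.out.println("long range: " + Long.MIN_VALUE + " to " + Long.MAX_VALUE);
        System.out.println("2^64 = " + twoToThe64());
    }
}
```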

As in C++, objects are instances of classes. There is no prefix * or & operator or infix -> operator.

As an example, consider the following class declaration, which is legal in both C++ and Java (C++ additionally requires a terminating semicolon):

class Pair { int x, y; }

C++ vs. Java object references

  C++                    Java
  Pair origin;           Pair origin = new Pair();
  Pair *p, *q, *r;       Pair p, q, r;
  origin.x = 0;          origin.x = 0;
  p = new Pair;          p = new Pair();
  p -> y = 5;            p.y = 5;
  q = p;                 q = p;
  r = &origin;           not possible
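
The Java side of this comparison can be exercised with the following sketch, which shows that the assignment q = p copies the reference rather than the object. AliasDemo is a hypothetical class, and Pair is repeated here only so that the example stands alone.

```java
// AliasDemo.java (hypothetical; Pair is repeated so the file stands alone)
class Pair { int x, y; }

class AliasDemo {

    // Returns true if a write through q is visible through p.
    static boolean aliasSharesObject() {
        Pair p = new Pair();   // p refers to a newly created object
        Pair q = p;            // q refers to the same object; nothing is copied
        q.y = 5;
        return p.y == 5 && p == q;   // == on objects compares references
    }

    public static void main(String[] args) {
        Pair r = null;         // a reference variable may refer to no object at all
        System.out.println(aliasSharesObject());   // prints true
        System.out.println(r == null);             // prints true
    }
}
```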

As in C++, arguments to a Java procedure are passed “by value”, but realize that when an object is passed as a parameter, the value of the reference is passed, not the value of the object. Hence, changes made via a reference parameter change the object visible to the calling routine. However, if you assign a new value to a reference parameter, that change is not passed back to the calling routine, since a parameter is really just a local variable. Consider the following example.

 
// ObjectRef.java
 
public class ObjectRef
{
    public void f()
    {
        int n = 1;
        Pair p = new Pair();
        p.x = 2; p.y = 3;
        System.out.println(n);    // prints 1
        System.out.println(p.x);  // prints 2
        g(n,p);
        System.out.println(n);    // still prints 1
        System.out.println(p.x);  // prints 100
    }
 
    public void g(int num, Pair ptr)
    {
        System.out.println(num);  // prints 1
        num = 17;                 // changes only the local copy
        System.out.println(num);  // prints 17
        System.out.println(ptr.x);// prints 2
        ptr.x = 100;              // changes the x field of caller's Pair
        ptr = null;               // changes only the local ptr
    }
 
    public static void main(String[] args) 
    {
        ObjectRef ref = new ObjectRef();
        ref.f();
    }
}
          

The formal parameters num and ptr are local variables in the procedure g initialized with copies of the values of n and p. Any changes to num and ptr affect only the copies. However, since ptr and p point to the same object, the assignment to ptr.x in g changes the value of p.x.

Unlike C++, Java has no way of declaring reference parameters, and unlike C++ or C, Java has no way of creating a pointer to a (non-object) value, so you can't do something like this:

// swap2.cpp
// C++ only
#include <iostream>
using namespace std;
void swap2(int &xp, int &yp)
{
  int tmp = xp;
  xp = yp;
  yp = tmp;
}
 
int main()
{
  int this_one = 88, that_one = 99;
  swap2(this_one, that_one);  // now this_one == 99 and that_one == 88
  cout << "this_one == " << this_one << "; that_one == " << that_one << endl;
  return 0;
}
        

You will probably miss reference parameters most in situations where you want a procedure to return more than one value. As a work-around, you can (a) return or pass in an object containing multiple fields, or (b) return an array containing multiple entries.
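
Both work-arounds are sketched below. The names MinMax, MinMaxDemo, minMax, and swapped are hypothetical; the values 88 and 99 echo the C++ swap2 example.

```java
// MinMaxDemo.java (hypothetical names throughout)
class MinMax {
    int min, max;
}

class MinMaxDemo {

    // (a) Return an object containing multiple fields.
    // Assumes the array a has at least one element.
    static MinMax minMax(int[] a) {
        MinMax result = new MinMax();
        result.min = a[0];
        result.max = a[0];
        for (int i = 1; i < a.length; i++) {
            if (a[i] < result.min) result.min = a[i];
            if (a[i] > result.max) result.max = a[i];
        }
        return result;
    }

    // (b) Return an array containing multiple entries: here, the two
    // arguments in swapped order.
    static int[] swapped(int first, int second) {
        return new int[] { second, first };
    }

    public static void main(String[] args) {
        MinMax m = minMax(new int[] {3, 9, 1, 7});
        System.out.println(m.min + " " + m.max);    // prints 1 9

        int thisOne = 88, thatOne = 99;
        int[] s = swapped(thisOne, thatOne);
        thisOne = s[0];
        thatOne = s[1];
        System.out.println(thisOne + " " + thatOne); // prints 99 88
    }
}
```

The object version is usually easier to read, because each result has a name; the array version needs no extra class but only works when all the results have the same type.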

Garbage Collection

New objects are created by the new operator in Java just like C++ (except that an argument list is required after the class name, even if the constructor for the class doesn't take any arguments so the list is empty). However, there is no delete operator. The Java system automatically deletes objects when no references to them remain. This is a much more important convenience than it may at first seem. The delete operator is error-prone. Deleting objects too early in C++ can lead to dangling references, as in:

 
// C++ code causing a dangling reference
p = new Pair();
// ...
q = p;
// ... later
delete p;
q -> x  = 5; // oops!
        

To be fair, we should point out that this particular example only causes a problem because we created an alias, also a dangerous and error-prone technique. Furthermore, modern compilers are starting to catch problems like this via dataflow analysis. Nonetheless, why worry when you can use a language that does not permit such problems to happen in the first place?

A related problem is that of forgetting to delete objects that are no longer used. When the last reference to such an object is lost, it becomes impossible to release the storage it occupies. This is called a storage leak. Note that, while Java eliminates all dangling references and storage leaks, it does not eliminate all memory management problems. If you keep a reference to an object that will no longer be used, the garbage collector cannot collect it. Hence, it is possible to make a program with modest storage requirements crash with an out of memory error because uncollected garbage mounts up until it consumes all available space.
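
A minimal sketch of how such uncollected garbage can accumulate is shown below. The class RetainedGarbage, the history list, and the buffer size are assumptions invented for this illustration; the point is only that an object stays reachable for as long as some collection still refers to it.

```java
import java.util.ArrayList;

// RetainedGarbage.java (hypothetical example)
class RetainedGarbage {
    // A list that is appended to but never read again.
    static ArrayList<int[]> history = new ArrayList<int[]>();

    // Each call allocates a buffer and, by mistake, keeps a reference to it.
    static void processRequest() {
        int[] buffer = new int[1024];
        history.add(buffer);   // buffer stays reachable, so it cannot be collected
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            processRequest();
        }
        System.out.println(history.size() + " buffers still reachable");
        history.clear();       // dropping the references makes the buffers collectable
        System.out.println(history.size() + " buffers still reachable");
    }
}
```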

Consider the following example of creating a linked list.

 
// Node.java
public class Node
{
   int val;
   Node next;
 
   Node (int v, Node n)
    {
        val = v;
        next = n;
    }
}
        
 
// List.java
 
import java.util.Random;
 
public class List
{
    Node head;  // The front of the list
    public void insert(int newVal)
    {
        Node newNode = new Node(newVal, head);
        head = newNode;
    }
 
    /* Every object has a toString() method, since the granddaddy of all
     * classes, Object, has such a method.  Note also that the syntax
     *
     *   String s = "constant string" + someObject;
     *
     * is translated internally into
     *
     *   String s = new StringBuilder("constant string")
     *                  .append(someObject.toString()).toString();
     *
     * Note also that StringBuilder is new with JDK 1.5.  Previous versions
     * of the JDK used StringBuffer in an identical fashion.
     */
    public String toString()
    {
        StringBuilder s = new StringBuilder("List contains\n");
        for (Node curr = head; curr != null; curr = curr.next)