This tutorial is
intended as a self-help guide. If you get lost using it, take note of your
problems and let me know so I can make it clearer.
The code used in this tutorial is
available for download in a single archive file. Get tutorial-code.zip
or tutorial-code.tar.bz2 as appropriate.
The first thing you have to
decide when writing a Java program is whether you are writing an application
or an applet. An applet is a piece of code designed to display a part of
a document. It is run by a browser (such as Firefox or Internet Explorer) in
response to an <object> tag
in the document. (Older versions of the HTML specification described an <applet> tag, but its use is now deprecated.) An application,
on the other hand, is a standalone program.
Java was originally designed to
build active, multimedia, interactive environments, so its standard runtime
library has lots of features to aid in creating user interfaces. There are standard
classes to create scrollbars, pop-up menus, etc. There are special facilities
for manipulating URLs and network connections.
Bytecode is computer object code
that is processed by a program, usually referred to as a virtual machine,
rather than by the “real” computer machine, the hardware processor.
The virtual machine converts each generalized machine instruction into a
specific machine instruction or instructions that this computer's processor
will understand. Bytecode is the result of compiling source code written in a
language that supports this approach. Most computer languages, such as C and
C++, require a separate compiler for each computer platform — that is,
for each computer operating system and the hardware set of instructions that it
is built on. Using a language that comes with a virtual machine for each
platform, your source language statements need to be compiled only once and
will then run on any platform.
The best-known language today
that uses the bytecode and virtual machine approach is Java. Rather than being
interpreted one instruction at a time, Java bytecode can be recompiled at each
particular system platform by a just-in-time (JIT) compiler. Usually, this will
enable the Java program to run faster. In Java, bytecode is contained in a
binary file with a .class suffix.
The Java language is actually
rather small and simple — an order of magnitude smaller and simpler than
C++, and in some ways, even smaller and simpler than C. However, it comes with
a very large and constantly growing library of utility classes. Fortunately,
you only need to know about the parts of this library that you really need, you
can learn about it a little at a time, and there is excellent, browsable, on-line documentation. These
libraries are grouped into packages. One set of about 175 packages,
called the Java 2 Platform API, comes bundled with the language (API
stands for “Application Programming Interface”). You will primarily
use classes from three of these packages:
Large parts of Java are identical
to C++. For example, the following procedure, which sorts an array of integers
using insertion sort, is exactly the same in C++ or Java.¹
// Sort the array a[] in ascending order using an insertion sort.
void sort(int a[], int size)
{
    for (int i = 1; i < size; i++) {
        // a[0..i-1] is sorted; insert a[i] in the proper place
        int j, x = a[i];
        for (j = i - 1; j >= 0 && a[j] > x; j--) {
            a[j + 1] = a[j];
        }
        // now a[0..j] are all <= x and a[j+2..i] are > x
        a[j + 1] = x;
    }
}
Note that the syntax of control structures
(such as for and if), assignment statements, variable declarations, and comments are all the
same in Java as in C++.
A Java program to test the sort procedure
is different in a few ways. Here is a complete Java program
using the sort procedure.
// SortTest.java
import java.util.Random;

class SortTest {
    // Sort the array a[] in ascending order using an insertion sort.
    static void sort(int a[], int size) {
        for (int i = 1; i < size; i++) {
            // a[0..i-1] is sorted; insert a[i] in the proper place
            int j, x = a[i];
            for (j = i - 1; j >= 0 && a[j] > x; j--) {
                a[j + 1] = a[j];
            }
            // now a[0..j] are all <= x and a[j+2..i] are > x
            a[j + 1] = x;
        }
    }

    // Test program to test sort
    public static void main(String args[]) {
        if (args.length != 1) {
            System.out.println ("usage: java SortTest array-size");
            System.exit (1);
        }
        int size = Integer.parseInt (args[0]);
        int test[] = new int[size];
        Random r = new Random();

        for (int i = 0; i < size; i++)
            test[i] = r.nextInt (100);

        System.out.println ("before");
        for (int i = 0; i < size; i++)
            System.out.print (" " + test[i]);
        System.out.println ();

        sort (test, size);

        System.out.println ("after");
        for (int i = 0; i < size; i++)
            System.out.print (" " + test[i]);
        System.out.println ();

        System.exit (0);
    }
}
Java carries out conservative flow
analysis to make sure that for every access, the local variable is definitely
assigned before the access. To try it out,
follow the instructions on the getting started document to run
the program (with command-line arguments) in Eclipse. Add a few errors to the code and recompile to
see how the IDE complains. There are several things to note about this program.
First, Java has no “top-level” or “global” variables or
functions. A Java program is always a set of class definitions. Thus, we had to make sort and main member functions (called methods
in Java) of a class, which we called SortTest.
Second, the main function is handled somewhat differently
in Java than in C++. In Java, the first thing executed is the method called main of the indicated class (in this case SortTest). The main method takes only one parameter, an array
of strings (denoted String args[] in Java). This array has one element for each
word on the command line following the name of the class being
executed. Thus if we set the command line arguments to “10” in our
example call, when main starts, args is an array of length one, and args[0] is the string "10" (not the integer 10). There is no separate argument to tell
you how many words there are, but in Java, you can tell how big any
array is by using length. In this case args.length == 1.
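As a minimal sketch of how the argument array behaves, the following small class reports how many words were given on the command line and echoes each one. The names ArgsDemo and describe are made up for this illustration and are not part of the tutorial archive.

```java
// ArgsDemo.java (hypothetical example, not part of the tutorial code)
class ArgsDemo {
    // Build a one-line description of the argument array.
    // args.length gives the number of elements; no separate count is passed.
    static String describe(String[] args) {
        StringBuilder sb = new StringBuilder();
        sb.append("count = ").append(args.length);
        for (int i = 0; i < args.length; i++) {
            sb.append("; args[").append(i).append("] = \"")
              .append(args[i]).append("\"");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(args));
    }
}
```

Running it as "java ArgsDemo 10" should print: count = 1; args[0] = "10"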
The third difference to note is the way
I/O is done in Java. System.out in Java is roughly equivalent to cout in C++, and
System.out.println(whatever);
is roughly equivalent to
cout << whatever << endl;
The equivalent C++ program would use three
functions from the standard library, atoi, random, and exit. Integer.parseInt does the same thing as atoi: It converts the character string
"10" to the integer value ten, and System.exit(1) does the same thing as exit(1): It immediately terminates the program,
returning an exit status of 1 (meaning something's wrong). The library class Random defines random number generators. The statement
Random r = new Random() creates an instance of this class, and r.nextInt (100) uses it to generate an integer between 0
(inclusive) and 100 (exclusive).
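The library calls just described can be tried in isolation with a sketch like the one below. RandomDemo and randomArray are names invented for this illustration, and the fixed seed 42 is an assumption added only so that repeated runs print the same numbers; SortTest itself uses an unseeded Random.

```java
import java.util.Random;

// RandomDemo.java (hypothetical example, not part of the tutorial code)
class RandomDemo {
    // Fill a new array with `size` pseudo-random values in the range 0..99.
    static int[] randomArray(int size, Random r) {
        int[] result = new int[size];
        for (int i = 0; i < size; i++) {
            result[i] = r.nextInt(100);   // 0 (inclusive) to 100 (exclusive)
        }
        return result;
    }

    public static void main(String[] args) {
        int size = Integer.parseInt("10");   // the string "10" becomes the int 10
        // Assumption: a fixed seed, so the output is repeatable.
        int[] test = randomArray(size, new Random(42));
        for (int i = 0; i < test.length; i++) {
            System.out.print(" " + test[i]);
        }
        System.out.println();
    }
}
```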
Finally, note that the #include directives from C++ have been replaced by
import
declarations. Although they have roughly the same effect, the mechanisms are
different. In C++, #include <iostream> pulls in a source file called iostream from a source library and compiles it
along with the rest of the program. #include is usually used to include files
containing declarations of library functions and classes, but the file could
contain any C++ source code whatever. The Java declaration import
java.util.Random makes
it possible to drop the package name (java.util) and refer to the class as plain Random. Without the import statement, that class would still be
accessible, but would have to be accessed by its fully qualified name
(java.util.Random).
For simple types of interactive input and
output, a JOptionPane is appropriate. Consider the following example:
// InputWindow.java
import javax.swing.JOptionPane; // Refer to plain JOptionPane below

public class InputWindow
{
    public static void main(String args[])
    {
        String xString; // first string entered by user
        String yString; // second string entered by user
        int x;          // first number to add
        float y;        // second number to add
        float sum;      // sum of x and y

        xString = JOptionPane.showInputDialog ("Enter integer");
        yString = JOptionPane.showInputDialog ("Enter float");

        // Convert numbers from type String to the appropriate type
        x = Integer.parseInt (xString);
        y = Float.parseFloat (yString);
        sum = x + y;

        // Display the results
        JOptionPane.showMessageDialog(null, "The sum is " + sum, "Results",
                                      JOptionPane.PLAIN_MESSAGE );

        System.exit (0); // Is this necessary? Try leaving it out.
    }
}
The JOptionPane interacts with the user through Java
Strings. We can convert from a string to an integer by invoking Integer.parseInt
(xString). We can convert
from a float to a string by concatenating a float to a string such as "The sum
is " + sum.
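The same conversions can be exercised without any dialog windows. The sketch below is only an illustration; the class ConvertDemo and the method sumMessage are hypothetical names, not part of InputWindow.java.

```java
// ConvertDemo.java (hypothetical example, not part of the tutorial code)
class ConvertDemo {
    // Parse an int and a float from strings and return a message with their sum.
    static String sumMessage(String xString, String yString) {
        int x = Integer.parseInt(xString);      // String -> int
        float y = Float.parseFloat(yString);    // String -> float
        float sum = x + y;                      // x is widened to float first
        return "The sum is " + sum;             // concatenation converts sum to a String
    }

    public static void main(String[] args) {
        System.out.println(sumMessage("3", "2.5"));   // prints: The sum is 5.5
    }
}
```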
Copy SortTest.java
onto your computer. Verify that you can compile it, and run it for 100
integers. Add a method to perform a binary search on the array. Using JOptionPane (as shown in
InputWindow.java), allow the user to specify which integer he/she would
like to search for. Using JOptionPane for output, indicate the number of times the specified integer was found.
As in C or C++, case is significant in
Java identifiers. Aside from a few reserved words, like if, while, etc., the Java language places no restrictions
on what names you use for functions, variables, classes, etc. However, there is
a standard naming convention, which all the standard Java libraries follow.
Simple class definitions in Java look
rather like class definitions in C++:
class Pair {
    int x, y;
}
Each class definition should go in a
separate file, and the name of the source file must be exactly the same
(including case) as the name of the class, with ".java" appended. For example, the
definition of Pair must go in
file Pair.java. The file is compiled and produces a .class file. There are exceptions to the rule that
requires a separate source file for each class. In particular, class
definitions may be nested. However, this is an
There is a large set of predefined
classes, grouped into packages. The full name of one of these
predefined classes includes the name of the package as prefix. We already saw
the class java.util.Random. The import statement allows you to omit the package name
from one of these classes. Because the SortTest program
starts with
import java.util.Random;
we can write
Random r = new Random();
rather than
java.util.Random r = new java.util.Random();
You can import all the classes in a package at once with the wild-card
notation:
import java.io.*;
The package java.lang is special; every program behaves as if it started with
import
java.lang.*;
whether it does or not. You can define your own packages, but defining packages
is an
The import statement doesn't really do
anything. It just introduces a convenient abbreviation for a fully-qualified
class name. When a class needs to use another class, all it has to do is
reference it. The Java compiler will know that it is supposed to be a class by
the way it is used, will import the appropriate .class file, and will even compile a .java file if necessary. (That's why it's important
for the name of the file to match the name of the class). For example, here is
a simple program that uses two classes:
class HelloTest
{
    public static void main(String[] args)
    {
        Hello greeter = new Hello();
        greeter.speak();
    }
}

class Hello
{
    void speak()
    {
        System.out.println("Hello World!");
    }
}
Put each class in a separate file (HelloTest.java and Hello.java). Then compile as before.
It is sometimes said that Java doesn't
have pointers. That is not true. In fact, objects can only be
referenced with pointers. More precisely, variables can hold primitive values
(such as integers or floating-point numbers) or references (pointers)
to objects. A variable cannot hold an object, and you cannot make a pointer to
a primitive value. Since you don't have a choice, Java doesn't have a special
notation like C++ does to indicate when you want to use a pointer.
There are exactly eight primitive types in
Java: boolean, char, byte, short, int, long, float, and double. Most of these are similar to types with
the same name in C++. We mention only the differences.
A boolean value is either true or false. You cannot use an integer where a boolean is required (e.g. in an if or while statement) nor is there any automatic
conversion between boolean and integer.
A char value is 16 bits rather than 8 bits, as
it is in C or C++, to allow for all sorts of
international alphabets. As a practical matter, however, you are unlikely
to notice the difference. The byte type is an 8-bit signed integer (like signed char in C or C++).
A short is 16 bits and an int is 32 bits, just as in C or C++ on machines
with 32-bit machine words. (In C and C++, the size is platform-dependent, but
in Java it is guaranteed to be 32 bits on every platform.) On many 32-bit
platforms, C++ int and long are both 32-bit quantities. A Java long, however, is 64 bits long—twice as
big as an int—so it can hold any value from -9,223,372,036,854,775,808 to
9,223,372,036,854,775,807. The types float and double are just like in C++ on 32-bit machines:
32-bit and 64-bit floating point numbers, respectively. Also note that Java has
support for objects
representing integers of arbitrary size.
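The sizes above can be checked from within a program. The following is a small sketch, assuming the SIZE constants on the wrapper classes (available since JDK 1.5) and java.math.BigInteger as the arbitrary-size integer class; PrimitiveSizes and twoToThe64th are names made up for the example.

```java
import java.math.BigInteger;

// PrimitiveSizes.java (hypothetical example, not part of the tutorial code)
class PrimitiveSizes {
    // 2 raised to the 64th power: one more than a 64-bit word can count,
    // so it does not fit in a long but is fine as a BigInteger.
    static BigInteger twoToThe64th() {
        return BigInteger.valueOf(2).pow(64);
    }

    public static void main(String[] args) {
        System.out.println("char bits:  " + Character.SIZE);  // 16
        System.out.println("byte bits:  " + Byte.SIZE);       // 8
        System.out.println("short bits: " + Short.SIZE);      // 16
        System.out.println("int bits:   " + Integer.SIZE);    // 32
        System.out.println("long bits:  " + Long.SIZE);       // 64
        System.out.println("long range: " + Long.MIN_VALUE + " to " + Long.MAX_VALUE);
        System.out.println("2^64 =      " + twoToThe64th());  // 18446744073709551616
    }
}
```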
As in C++, objects are instances
of classes. There is no prefix * or & operator or infix -> operator.
As an example, consider the following
class declaration, which is legal in both C++ and Java (C++ additionally requires a semicolon after the closing brace):
class Pair {
    int x, y;
}
Table: C++ vs. Java object references (columns: C++ | Java). The cell contents are missing here; the only surviving entry is "not possible" in the Java column of the last row.
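To illustrate the Java side of the comparison, here is a minimal sketch of how object references behave. It is not a reconstruction of the table above; ReferenceDemo and aliasDemo are names invented for the example, and Pair is repeated so that the block stands alone.

```java
// ReferenceDemo.java (hypothetical example, not part of the tutorial code)
class Pair {
    int x, y;
}

class ReferenceDemo {
    // Modify a Pair through one variable and read it back through another.
    static int aliasDemo() {
        Pair p = new Pair();   // p holds a reference to a newly created Pair
        Pair q = p;            // q refers to the same object; nothing is copied
        q.x = 5;               // fields are always reached with '.', never '->'
        return p.x;            // 5, because p and q share one object
    }

    public static void main(String[] args) {
        System.out.println(aliasDemo());   // prints 5
    }
}
```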
As in C++, arguments to a Java procedure
are passed “by value”, but realize that when an object is passed as
a parameter, the value of the reference is passed, not the value of
the object. Hence, changes made via a reference parameter change the object
visible to the calling routine. However, if you assign a new value to a
reference parameter, that change is not passed back to the calling routine,
since a parameter is really just a local variable. Consider the following
example.
// ObjectRef.java
public class ObjectRef
{
    public void f()
    {
        int n = 1;
        Pair p = new Pair();
        p.x = 2;
        p.y = 3;
        System.out.println(n);    // prints 1
        System.out.println(p.x);  // prints 2
        g(n,p);
        System.out.println(n);    // still prints 1
        System.out.println(p.x);  // prints 100
    }

    public void g(int num, Pair ptr)
    {
        System.out.println(num);  // prints 1
        num = 17;                 // changes only the local copy
        System.out.println(num);  // prints 17
        System.out.println(ptr.x);// prints 2
        ptr.x = 100;              // changes the x field of caller's Pair
        ptr = null;               // changes only the local ptr
    }

    public static void main(String[] args)
    {
        ObjectRef ref = new ObjectRef();
        ref.f();
    }
}
The formal parameters num and ptr are local variables in the procedure g initialized with copies of the
values of n and p. Any changes to num and ptr affect only the copies. However, since ptr and p point to the same object, the assignment
to ptr.x in g changes the value of p.x.
Unlike C++, Java has no way of declaring reference
parameters, and unlike C++ or C, Java has no way of creating a pointer to a
(non-object) value, so you can't do something like this:
// swap2.cpp
// C++ only
#include <iostream>
using namespace std;

void swap2(int &xp, int &yp)
{
    int tmp = xp;
    xp = yp;
    yp = tmp;
}

int main()
{
    int this_one = 88, that_one = 99;
    swap2(this_one, that_one); // now this_one == 99 and that_one == 88
    cout << "this_one == " << this_one << "; that_one == " << that_one << endl;
    return 0;
}
You will probably miss reference parameters
most in situations where you want a procedure to return more than one value. As
a work-around, you can (a) return or pass in an object containing multiple
fields, or (b) return an array containing multiple entries.
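Both work-arounds might look something like the following sketch, which swaps two integers without reference parameters. IntPair, SwapDemo, swapFields, and swapped are hypothetical names chosen for this illustration.

```java
// SwapDemo.java (hypothetical example, not part of the tutorial code)
class IntPair {
    int first, second;
}

class SwapDemo {
    // Work-around (a): the caller passes in an object; its fields are swapped in place.
    static void swapFields(IntPair p) {
        int tmp = p.first;
        p.first = p.second;
        p.second = tmp;
    }

    // Work-around (b): return both results in a two-element array.
    static int[] swapped(int x, int y) {
        return new int[] { y, x };
    }

    public static void main(String[] args) {
        IntPair p = new IntPair();
        p.first = 88;
        p.second = 99;
        swapFields(p);
        System.out.println("first == " + p.first + "; second == " + p.second);

        int thisOne = 88, thatOne = 99;
        int[] result = swapped(thisOne, thatOne);
        thisOne = result[0];
        thatOne = result[1];
        System.out.println("thisOne == " + thisOne + "; thatOne == " + thatOne);
    }
}
```

In both cases the values end up exchanged (99 first, then 88), but the caller has to unpack the result explicitly, which is the price of not having reference parameters.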
New objects are created by the new operator in Java just like C++ (except
that an argument list is required after the class name, even if the constructor
for the class doesn't take any arguments so the list is empty). However, there
is no delete operator. The Java system automatically deletes objects when no references
to them remain. This is a much more important convenience than it may at first
seem. The delete operator is error-prone. Deleting objects too early in C++ can lead to dangling
references, as in:
// C++ code causing a dangling reference
p = new Pair();
// ...
q = p;
// ... later
delete p;
q -> x = 5; // oops!
To be fair, we should point out that this
particular example only causes a problem because we created an alias,
also a dangerous and error-prone technique. Furthermore, modern compilers are
starting to catch problems like this via dataflow analysis. Nonetheless, why
worry when you can use a language that does not permit such problems to happen
in the first place?
A related problem is that of forgetting to
delete objects
that are no longer used. When the last reference to such an object is lost, it
becomes impossible to release the storage it occupies. This is called a storage
leak. Note that, while Java eliminates all dangling references and storage
leaks, it does not eliminate all memory management problems. If you keep a
reference to an object that will no longer be used, the garbage collector
cannot collect it. Hence, it is possible to make a program with modest storage
requirements crash with an out of memory error because uncollected garbage
mounts up until it consumes all available space.
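As a rough sketch of how such uncollected garbage can accumulate, consider a class that remembers every buffer it is handed. The class names History and LeakDemo, and the buffer size, are assumptions made up for this illustration; the point is only that objects stay alive for as long as some reference to them remains.

```java
import java.util.ArrayList;

// LeakDemo.java (hypothetical example, not part of the tutorial code)
class History {
    // Every buffer stored here stays reachable, so the garbage
    // collector cannot reclaim it, even if it is never used again.
    private final ArrayList<int[]> buffers = new ArrayList<int[]>();

    void remember(int[] buffer) {
        buffers.add(buffer);
    }

    int size() {
        return buffers.size();
    }

    // Dropping the references makes the buffers eligible for collection.
    void forget() {
        buffers.clear();
    }
}

class LeakDemo {
    public static void main(String[] args) {
        History history = new History();
        for (int i = 0; i < 1000; i++) {
            history.remember(new int[1024]);   // about 4 KB each, all still referenced
        }
        System.out.println("retained buffers: " + history.size());   // prints 1000
        history.forget();
        System.out.println("retained buffers: " + history.size());   // prints 0
    }
}
```

If forget were never called and the loop ran long enough, the program would eventually fail with an out of memory error, even though none of the old buffers is ever read again.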
Consider the following example of creating
a linked list.
// Node.java
public class Node
{
    int val;
    Node next;

    Node (int v, Node n)
    {
        val = v;
        next = n;
    }
}

// List.java
import java.util.Random;

public class List
{
    Node head; // The front of the list

    public void insert(int newVal)
    {
        Node newNode = new Node(newVal, head);
        head = newNode;
    }

    /* Every object has a toString() method, since the granddaddy of all
     * classes, Object, has such a method. Note also that the syntax
     *
     *     String s = "constant string" + someObject;
     *
     * is translated internally into
     *
     *     String s = new StringBuilder("constant string")
     *         .append(someObject.toString()).toString();
     *
     * Note also that StringBuilder is new with JDK 1.5. Previous versions
     * of the JDK used StringBuffer in an identical fashion.
     */
    public String toString()
    {
        StringBuilder s = new StringBuilder("List contains\n");
        for (Node curr = head; curr != null; curr = curr.next)