In this chapter we provide a brief overview of the Java Programming Language. The tutorial presented here is intended to introduce readers to just enough Java to write physics analysis routines for Java Analysis Studio, and is far from a complete Java tutorial (for more detailed tutorials see the references at the end of this chapter). No prior knowledge of Java or object-oriented programming is assumed. Readers already familiar with C or C++ should find the contents very familiar.
The Java programming language is an object-oriented language developed by Sun, and
popularized by its introduction as a way of programming "web applets" in
Netscape and other browsers. Java code is compiled into a
machine-independent pseudo-machine-code called bytecodes. By
convention Java source code is kept in files with extension .java, and the bytecodes
generated by compiling the source are kept in files with extension .class. Since Java
bytecodes are machine independent they can be run on any machine, and can easily be moved
from machine to machine over a network. Java Analysis Studio makes use of this feature to
allow analysis code to be written and compiled on the desktop machine, and then either
executed locally or moved to a data server to be executed. When Java bytecodes are
executed on a particular machine they are normally converted to native machine code at
runtime, a process known as Just-In-Time (JIT) compilation. Thus
analysis code written in Java can attain execution speeds not too far removed from
natively compiled code, and considerably faster than the interpreted languages previously used for
physics analysis, such as COMIS and IDA. The compile/load speed of Java is very fast, so
the turnaround time for modifying-compiling-reloading and running analysis code in Java is
also very good.
Since Java is a pure object-oriented language all code is written as classes. A class typically represents a set of concrete or abstract items, such as Histograms, Cuts, Particles, or Event Analyzers. Objects represent a specific instance of an item contained within the set, thus a specific histogram may be represented by an object of class Histogram. (By convention classes are given capitalized names, while objects are spelt with an initial lowercase letter). Procedures within classes are called methods, thus a Histogram class would typically contain methods for filling the histogram as well as methods for extracting information about the histogram. Classes normally have one or more constructors, which are methods used to create new objects of that class. Constructors always have the same name as the class itself, and unlike other methods they cannot return a value (since they implicitly return a new object).
To create a very simple Histogram class in Java one could write:
    public class Histogram {
        Histogram(String name) {
            m_name = name;
        }
        public String getName() {
            return m_name;
        }
        public void fill(double value, double weight) {
            // fill bins here
        }
        private String m_name;
    }
This code declares a class called Histogram. The class is declared public meaning that anyone can access the Histogram class. The histogram class contains one constructor, which by convention has the same name as the name of the class. In this case the constructor takes a single argument of type String, which is stored into the variable m_name. Note that m_name is declared at the level of the class itself, rather than inside any of the methods, which indicates that the variable is a member variable. Member variables have the same life-span as the objects which contain them, thus whenever an object of class Histogram is created one string variable m_name will be created within that object, and the variable will maintain its value until that object is destroyed. m_name is declared private, meaning that it can only be accessed directly from within the Histogram class itself.
Our example Histogram class also contains two ordinary methods: getName, which returns the name stored by the constructor, and fill, which takes two arguments, the value to be histogrammed (value) and the weight to add to the corresponding bin. For simplicity the body of fill is omitted (fortunately we don't really need to write our own Histogram class, since Java Analysis Studio already contains a fully functional histogram class which we can just use!).
Note that all statements in Java must end with a semi-colon, and that curly braces ({}) are used to delimit the beginning and end of blocks of code, such as the bodies of methods. Double slashes are used to begin single-line comments. (Anyone familiar with C or C++ will notice that most Java conventions are exactly the same as for those languages.)
In Java, objects of a particular class are created by using the new keyword. Thus to create and fill a histogram in Java one would write:
    Histogram myhist = new Histogram("My Histogram");
    myhist.fill(99.0,1.0);
In the first line above a local variable myhist is declared, which is of type Histogram, and which is assigned a new histogram object (named "My Histogram"). In the second line the fill method of the histogram object is invoked. In Java it is not necessary (or possible) to explicitly delete an object, rather the object will be automatically destroyed once there are no active references to it (a process known as garbage collection).
One of the most powerful features of object-oriented languages is the concept of inheritance, whereby one class may inherit all or a subset of the methods and member variables of a super-class. For example we could implement a BetterHistogram class as follows:
    public class BetterHistogram extends Histogram {
        BetterHistogram(String name) {
            super(name);
        }
        public void fill(double value, double weight) {
            m_nEntries++;
            super.fill(value,weight);
        }
        public int getNEntries() {
            return m_nEntries;
        }
        private int m_nEntries = 0;
    }
In this example the class BetterHistogram is declared as extending Histogram, which
means that it inherits all of the methods and member variables of the Histogram class, but
can, in addition, add its own methods and member variables. All classes implicitly inherit
from the base class Object, which is built in to the Java language.
Java classes can only extend one super-class (i.e. Java does not support multiple-inheritance) although they can implement any number of interfaces (see below).
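As a rough sketch of what inheritance buys you, the following self-contained example repeats cut-down versions of the two classes above. They are renamed SimpleHistogram and CountingHistogram purely for illustration (these names, and the InheritanceDemo class, are assumptions and not part of Java Analysis Studio). It shows that a variable of the super-class type can refer to a sub-class object, and that the overriding fill method is the one that gets called.

```java
// Cut-down stand-ins for the Histogram and BetterHistogram classes above.
class SimpleHistogram {
    private String m_name;

    SimpleHistogram(String name) { m_name = name; }

    public String getName() { return m_name; }

    public void fill(double value, double weight) {
        // bins would be filled here
    }
}

class CountingHistogram extends SimpleHistogram {
    private int m_nEntries = 0;

    CountingHistogram(String name) { super(name); }

    // Overrides fill: count the entry, then let the super-class do the filling.
    public void fill(double value, double weight) {
        m_nEntries++;
        super.fill(value, weight);
    }

    public int getNEntries() { return m_nEntries; }
}

class InheritanceDemo {
    public static void main(String[] args) {
        CountingHistogram counting = new CountingHistogram("pt");
        SimpleHistogram plain = counting;   // a sub-class object used via its super-class type
        plain.fill(1.5, 1.0);               // invokes CountingHistogram.fill
        plain.fill(2.5, 1.0);
        System.out.println(plain.getName() + " has " + counting.getNEntries() + " entries");
        // prints: pt has 2 entries
    }
}
```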
When using Java Analysis Studio the most common way in which you will encounter
inheritance is when you write your own event analysis routines. Java Analysis Studio
contains a built-in class EventAnalyzer, which can be thought of as an empty framework for
performing event analysis. The class contains a method which is called to process each
event (processEvent) and methods which are called at the beginning and end of each run
(beforeFirstEvent and afterLastEvent), but each of these methods is empty, meaning that it
does not actually perform any data analysis (or anything else). The purpose of a framework
class such as EventAnalyzer is to allow you to extend it, for example to provide
a MyAnalysis class which actually does something useful (your physics analysis). For
example:
    public class MyAnalysis extends EventAnalyzer {
        public void processEvent(EventData d) {
            // perform analysis and fill histograms here.
        }
    }
As mentioned above, classes in Java can only extend a single super-class. However, in addition to classes Java supports a second concept called an interface. Like classes, interfaces define a set of methods, but unlike classes they cannot contain any implementation of these methods. (Interfaces in Java are very similar to pure-virtual classes in C++).
The following is an example of an interface.
    public interface FourVector {
        public double x();
        public double y();
        public double z();
        public double t();
        public double magnitude();
    }
What the above means is that anything which calls itself a FourVector must provide implementations of each of the methods specified in the interface. A class can be declared to provide an implementation of an interface using the implements keyword. For example:
    public class Particle implements FourVector {
        private double px, py, pz, mass;

        public Particle(double px, double py, double pz, double mass) {
            this.px = px;
            this.py = py;
            this.pz = pz;
            this.mass = mass;
        }
        // The accessors return the stored member variables px, py and pz.
        public double x() { return px; }
        public double y() { return py; }
        public double z() { return pz; }
        public double t() { return mass; }
        public double magnitude() {
            return Math.sqrt(px*px + py*py + pz*pz + mass*mass);
        }
    }
A single class can provide an implementation of any number of interfaces. As far as the user of an interface is concerned, they work exactly the same as classes, so with the above definitions you can write:
    FourVector v = new Particle(1,2,3,0.5);
    double e = v.magnitude();
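The practical benefit is that code written against the interface works with every class that implements it. The sketch below repeats a cut-down FourVector interface so that it stands alone; the AtRest class and the transverse method are hypothetical names invented for this illustration.

```java
// Cut-down copy of the FourVector interface so the example stands alone.
interface FourVector {
    double x();
    double y();
    double z();
    double t();
}

// Hypothetical second implementation: a four-vector with no spatial components.
class AtRest implements FourVector {
    private final double m_t;

    AtRest(double t) { m_t = t; }

    public double x() { return 0; }
    public double y() { return 0; }
    public double z() { return 0; }
    public double t() { return m_t; }
}

class InterfaceDemo {
    // Accepts any object whose class implements FourVector.
    static double transverse(FourVector v) {
        return Math.sqrt(v.x() * v.x() + v.y() * v.y());
    }

    public static void main(String[] args) {
        FourVector v = new AtRest(0.5);
        System.out.println(transverse(v));   // prints 0.0
    }
}
```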
Classes in Java are normally defined inside packages. Packages have two functions: they group related classes together, and they provide a namespace so that classes with the same short name in different packages do not clash.
For example the full name of the String class is java.lang.String,
indicating that it is in the java.lang package. By convention package names are
always in lower case (a convention which is just about universally followed),
and should begin with the reversed domain name of
the creating organization, for example edu.stanford.slac.jas.Histogram
(a convention often ignored since it can lead to unwieldy package names, and not
everyone has their own domain name).
When defining a class you will normally start your .java file with a package statement:
package my.analysis;
This statement implies that any classes defined inside the file are considered to be in package my.analysis. If you do not put a package statement in your file your classes will be considered to be in the "unnamed package". In general the unnamed package is good for quick tests and experimenting, but for code that you expect to use longer term an explicit package statement is a good idea.
Wherever a class name appears you can use the full name including the package name; however, this leads to a lot of typing and can adversely affect the clarity of your code. As an alternative you can use import statements following the package statement at the top of your program. For example:
    import java.lang.String;
    import java.lang.*;
The first line imports a specific class, the second line imports all of the classes in package java.lang. Once you have imported a class you can refer to it by its short name (e.g. String). If you import two packages which contain a class with the same name you will still need to refer to it using its fully qualified name. Note that in reality you do not ever need to import package java.lang since it is unique in always being considered to be implicitly imported. The package statement, if it exists, must be the first statement in the file, and must be immediately followed by any import statements.
Java requires that a class whose full name is my.analysis.Histogram be defined in a file called Histogram.java which resides in a directory my/analysis.
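Methods and member variables within Java classes can have access modifiers applied to them, which control where they can be used from. The allowed access modifiers are public (usable from anywhere), protected (usable from within the same package and from sub-classes), and private (usable only from within the class itself); if no modifier is given, the member is usable only from within the same package.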
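To make the effect of access modifiers concrete, here is a small sketch. The Cut class and its members are hypothetical and exist only for this illustration; the commented-out line shows an access that the compiler would reject.

```java
class Cut {
    private double m_threshold;          // usable only inside Cut
    double looseness = 0.1;              // no modifier: usable within the same package
    protected String m_label = "cut";    // same package and sub-classes

    public Cut(double threshold) { m_threshold = threshold; }

    // Public: callable from anywhere, and the only way to test against m_threshold.
    public boolean accept(double value) { return value > m_threshold; }
}

class AccessDemo {
    public static void main(String[] args) {
        Cut cut = new Cut(2.0);
        System.out.println(cut.accept(3.0));   // prints true
        // cut.m_threshold = 0;   // would not compile: m_threshold is private
    }
}
```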
In Java there are only two types of variables, intrinsic and reference types. Intrinsic variables (more commonly called primitive variables) are those that refer to built-in simple types, such as int, double, float and boolean. A complete list of built-in types is given in the following table:
Type | Description |
---|---|
byte | 8-bit signed integer. |
short | 16-bit signed integer. |
int | 32-bit signed integer. |
long | 64-bit signed integer. |
float | 32-bit IEEE754 floating-point. |
double | 64-bit IEEE754 floating-point. |
char | 16-bit Unicode character. Unicode is an extension of ASCII to support international character sets. |
boolean | A true or false value, using the keywords true and false -- pretty clever. There is no conversion between booleans and other types, such as ints. |
Note that Java completely defines the size and behavior of all built-in types, so they should behave identically on all platforms. When intrinsic variables are passed to functions they are always passed by value, thus the variable within the function is initially set to the value of the passed argument, but subsequent changes to the variable inside the function will have no effect on the value of the variable passed in.
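A minimal sketch of pass-by-value for intrinsic types follows; the class and method names are made up for this illustration.

```java
class PassByValueDemo {
    // The parameter n is a copy of the caller's value.
    static void addOne(int n) {
        n = n + 1;   // changes only the local copy
    }

    // Calls addOne and returns the caller's variable afterwards.
    static int tryToChange(int start) {
        int count = start;
        addOne(count);
        return count;   // still equal to start
    }

    public static void main(String[] args) {
        System.out.println(tryToChange(5));   // prints 5
    }
}
```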
The only other type of variable in Java is a reference to an object. Reference variables always either point to an object of a particular type, or have the special value null. Objects are only created if the new operator is explicitly used; the assignment operator just creates a second reference to the same object. Thus the statements:
    Histogram a = new Histogram("my histogram");
    Histogram b = a;
create one histogram object and set variables a and b to point to that same
object. Therefore modifying the object pointed to by a will also modify the object pointed
to by b (since they are the same object). This can be a little confusing until one gets
used to it. For example, using the BetterHistogram class defined earlier:
    BetterHistogram a = new BetterHistogram("my histogram");
    BetterHistogram b = a;
    b.fill(1.0, 1.0);
    System.out.println("a has "+a.getNEntries()+" entries");
will print "a has 1 entries", not "a has 0 entries".
The arithmetic operators in Java are almost identical to those in C or C++. These arithmetic operators can be used on any integer or floating point operands. The operands will be automatically promoted as necessary (thus adding an int and a double will produce a double).
Operator | Use | Description |
---|---|---|
+ | op1 + op2 | Adds op1 and op2 |
- | op1 - op2 | Subtracts op2 from op1 |
* | op1 * op2 | Multiplies op1 by op2 |
/ | op1 / op2 | Divides op1 by op2 |
% | op1 % op2 | Computes the remainder of dividing op1 by op2 |
++ | op++ | Increments op by 1; evaluates to value before incrementing |
++ | ++op | Increments op by 1; evaluates to value after incrementing |
-- | op-- | Decrements op by 1; evaluates to value before decrementing |
-- | --op | Decrements op by 1; evaluates to value after decrementing |
+ | +op | Promotes op to int if it's a byte, short, or char |
- | -op | Arithmetically negates op |
Java does not have an exponentiation operator like Fortran's ** operator; you must use the java.lang.Math.pow method described under Mathematical
Functions below.
Note that the + operator can also be used to concatenate Strings. Other than this one special case, arithmetic operators can only be used on the built-in Java types, thus even if you define your own Complex type you will not be able to use the + operator to add Complex objects together, since Java does not support operator overloading.
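A few of the points above are easy to trip over, in particular integer division and the difference between the prefix and postfix forms of ++. The following sketch (class and method names are illustrative only) collects them in one place.

```java
class ArithmeticDemo {
    static String results() {
        int i = 7 / 2;                      // both operands are ints: integer division
        double d = 7 / 2.0;                 // the int is promoted to double first
        int r = 7 % 2;                      // remainder
        int n = 5;
        int before = n++;                   // before gets 5, then n becomes 6
        int after = ++n;                    // n becomes 7, then after gets 7
        double cube = Math.pow(2.0, 3.0);   // in place of Fortran's 2.0**3.0
        return i + " " + d + " " + r + " " + before + " " + after + " " + cube;
    }

    public static void main(String[] args) {
        System.out.println(results());   // prints: 3 3.5 1 5 7 8.0
    }
}
```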
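The relational operators (>, >=, <, <=, ==, !=) compare two operands and produce a boolean result, while the logical operators (&&, ||, !, &, |) can only be used on boolean operands. Unlike C, Java will not automatically convert integers to booleans.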
Operator | Use | Return true if |
---|---|---|
> | op1 > op2 | op1 is greater than op2 |
>= | op1 >= op2 | op1 is greater than or equal to op2 |
< | op1 < op2 | op1 is less than op2 |
<= | op1 <= op2 | op1 is less than or equal to op2 |
== | op1 == op2 | op1 and op2 are equal |
!= | op1 != op2 | op1 and op2 are not equal |
&& | op1 && op2 | op1 and op2 are both true, conditionally evaluates op2 |
\|\| | op1 \|\| op2 | either op1 or op2 is true, conditionally evaluates op2 |
! | ! op | op is false |
& | op1 & op2 | op1 and op2 are both true, always evaluates op1 and op2 |
\| | op1 \| op2 | either op1 or op2 is true, always evaluates op1 and op2 |
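The difference between the conditional (&&) and unconditional (&) forms only matters when evaluating the right-hand operand has a side effect or could fail. The sketch below counts how often the right-hand side is evaluated; all names in it are hypothetical.

```java
class ShortCircuitDemo {
    static int calls = 0;

    // Records that it was evaluated, then reports whether x is positive.
    static boolean isPositive(double x) {
        calls++;
        return x > 0;
    }

    // Returns how many times isPositive ran for a given left-hand operand.
    static int callsMade(boolean left) {
        calls = 0;
        boolean a = left && isPositive(1.0);   // skipped when left is false
        boolean b = left & isPositive(1.0);    // always evaluated
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(callsMade(false));   // prints 1
        System.out.println(callsMade(true));    // prints 2
    }
}
```

A typical use of && is to guard a call on a reference that might be null, so that the method is only invoked once the reference is known to be non-null.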
One thing to be aware of is that the == operator will only consider two
references to be equal if they point to the same object, thus:
    String a = new String("xyz");
    String b = new String("xyz");
    boolean result = (a == b);
will set result equal to false, even though both strings have the same contents. You
should use the equals method to compare strings for equality:
    String a = new String("xyz");
    String b = new String("xyz");
    boolean result = a.equals(b);
Java supports one other conditional operator, the ?: operator. This
operator is a ternary operator (it takes three operands) and is basically short-hand for an if-else
statement:
boolean-expression ? op1 : op2
The ?: operator evaluates boolean-expression and returns op1 if it's true and op2
if it's false.
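As a small illustration (the class and method names are invented for this sketch), the two methods below are equivalent; the first uses ?: and the second spells out the if-else statement it replaces.

```java
class TernaryDemo {
    // Absolute value using the ?: operator.
    static double absolute(double x) {
        return (x < 0) ? -x : x;
    }

    // The same logic written as an if-else statement.
    static double absoluteLong(double x) {
        if (x < 0) {
            return -x;
        } else {
            return x;
        }
    }

    public static void main(String[] args) {
        System.out.println(absolute(-2.5));       // prints 2.5
        System.out.println(absoluteLong(-2.5));   // prints 2.5
    }
}
```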
Bitwise operators can be used on integer operands.
Operator | Use | Operation |
---|---|---|
>> | op1 >> op2 | shift bits of op1 right by distance op2 |
<< | op1 << op2 | shift bits of op1 left by distance op2 |
>>> | op1 >>> op2 | shift bits of op1 right by distance op2 (unsigned) |
& | op1 & op2 | bitwise and |
\| | op1 \| op2 | bitwise or |
^ | op1 ^ op2 | bitwise xor |
~ | ~op | bitwise complement |
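The sketch below shows a typical use of shifting and masking (extracting a field from a packed word, whose layout is purely hypothetical here) and the difference between the signed (>>) and unsigned (>>>) right shifts on a negative number.

```java
class BitwiseDemo {
    static String results() {
        int word = 0xA5;                    // binary 1010 0101
        int field = (word >> 4) & 0xF;      // upper four bits: 1010 = 10
        int arithmeticShift = -8 >> 1;      // sign bit is copied in: -4
        int logicalShift = -8 >>> 1;        // zero is shifted in: 2147483644
        int mask = 1 << 3;                  // bit 3 set: 8
        return field + " " + arithmeticShift + " " + logicalShift + " " + mask;
    }

    public static void main(String[] args) {
        System.out.println(results());   // prints: 10 -4 2147483644 8
    }
}
```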
The assignment operators below are just shorthand ways of performing common operations such as incrementing a variable by a given amount. They are normally clearer (and less prone to typos) than their longer counterparts.
Operator | Use | Equivalent to |
---|---|---|
+= | op1 += op2 | op1 = op1 + op2 |
-= | op1 -= op2 | op1 = op1 - op2 |
*= | op1 *= op2 | op1 = op1 * op2 |
/= | op1 /= op2 | op1 = op1 / op2 |
%= | op1 %= op2 | op1 = op1 % op2 |
&= | op1 &= op2 | op1 = op1 & op2 |
\|= | op1 \|= op2 | op1 = op1 \| op2 |
^= | op1 ^= op2 | op1 = op1 ^ op2 |
<<= | op1 <<= op2 | op1 = op1 << op2 |
>>= | op1 >>= op2 | op1 = op1 >> op2 |
>>>= | op1 >>>= op2 | op1 = op1 >>> op2 |
As you have already seen, Java statements all end with ; and multiple
statements may be grouped together into a block using curly braces {}.
In addition Java supports all of the loop and conditional statements of the C language
(although long-term Fortran users may be dismayed by the lack of a goto statement).
Statement | Keyword |
---|---|
decision making | if-else , switch-case |
loop | for , while , do-while |
miscellaneous | break , continue , label: , return |
The usage of these statements is fairly self-explanatory, as the examples below will hopefully demonstrate.
    int testscore = 76;   // example value; a local variable must be assigned before it is read
    char grade;
    if (testscore >= 90) {
        grade = 'A';
    } else if (testscore >= 80) {
        grade = 'B';
    } else if (testscore >= 70) {
        grade = 'C';
    } else if (testscore >= 60) {
        grade = 'D';
    } else {
        grade = 'F';
    }
    int month;
    . . .
    switch (month) {
        case 1:  System.out.println("January"); break;
        case 2:  System.out.println("February"); break;
        case 3:  System.out.println("March"); break;
        case 4:  System.out.println("April"); break;
        case 5:  System.out.println("May"); break;
        case 6:  System.out.println("June"); break;
        case 7:  System.out.println("July"); break;
        case 8:  System.out.println("August"); break;
        case 9:  System.out.println("September"); break;
        case 10: System.out.println("October"); break;
        case 11: System.out.println("November"); break;
        case 12: System.out.println("December"); break;
        default: System.out.println("Huh?????"); break;
    }
The switch statement inherits C's behavior of "falling through" from one case
to the following case unless an explicit break statement is
inserted after each case, as in the above example. Note also the use of the default
label to catch otherwise unmet cases.
There are two forms of the while loop, one which tests the condition at the top of the loop, and one which tests it at the end of the loop (and hence always executes the loop body at least once).
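    // Form 1: the condition is tested at the top of the loop
    int i = 0;
    while (i<100) {
        i++;
    }

    // Form 2: the condition is tested at the end of the loop
    int i=0;
    do {
        i++;
    } while (i<100);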
The for loop perhaps requires some explanation for those not familiar with C. The for statement contains three clauses, separated by semi-colons. The first clause is executed once at the beginning of the loop, the second clause is executed before each iteration of the loop, and the third clause is executed at the end of each iteration of the loop. Any of the clauses can be omitted (although the semi-colons are still required). The first clause may contain a variable declaration, in which case the variable is only accessible from within the body of the for loop. The second clause, if present, must evaluate to a logical expression, and if false the loop will be exited.
    for (int i=0; i<100; i++) {
        System.out.println(i);
    }
All loop constructs may contain a continue statement, meaning that execution should immediately skip to the next iteration of the loop, or the break statement, meaning that the loop should be immediately terminated and execution continued from after the loop. Continue and break statements normally operate on the innermost loop, although this can be modified by explicitly labeling the loop, and using a break or continue statement with a label.
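Labeled loops are the least familiar of these constructs, so here is a minimal sketch. The method name and the label rows are illustrative choices; the label lets the break statement leave both loops at once.

```java
class LabelDemo {
    // Returns the index of the first row containing a negative value, or -1 if none does.
    static int firstRowWithNegative(double[][] data) {
        int found = -1;
        rows:
        for (int i = 0; i < data.length; i++) {
            for (int j = 0; j < data[i].length; j++) {
                if (data[i][j] < 0) {
                    found = i;
                    break rows;   // leaves both loops, not just the inner one
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        double[][] data = { {1.0, 2.0}, {3.0, -4.0}, {-5.0, 6.0} };
        System.out.println(firstRowWithNegative(data));   // prints 1
    }
}
```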
Finally the return statement can be used to return from a method call. If the method's return type is anything but void the return statement must specify a return value.
Java contains many common mathematical functions as part of the built-in java.lang.Math class. Unfortunately you must normally prefix these methods with
the class name (Math), making complicated expressions a bit unwieldy.
The Math class contains two useful constants, Math.E and Math.PI, as well as many methods including Math.pow(double,double)
(raise to a power), Math.sqrt(double), Math.log(double) (natural log) and the trigonometric functions Math.sin(double), Math.cos(double) etc.
The Math class also contains a simple random number generator, Math.random(),
which returns a random double greater than or equal to 0 and less than 1. For a more complete random number
generator, which also allows the seed to be set and can generate normally
distributed random numbers, see the class java.util.Random.
Example:
    double r = Math.random();
    double phi = Math.random()*Math.PI*2;
    double x = r*Math.sin(phi);
    double y = Math.sqrt(r*r - x*x);
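For reproducible results it is often more convenient to use java.util.Random with an explicit seed. The following sketch (the class name, method name and seed value are arbitrary choices for illustration) averages a number of normally distributed values; running it twice with the same seed gives the same answer.

```java
import java.util.Random;

class RandomDemo {
    // Mean of n normally distributed numbers from a generator with the given seed.
    static double gaussianMean(long seed, int n) {
        Random random = new Random(seed);
        double sum = 0;
        for (int i = 0; i < n; i++) {
            sum += random.nextGaussian();   // mean 0, standard deviation 1
        }
        return sum / n;
    }

    public static void main(String[] args) {
        // The same seed always produces the same sequence, hence the same mean.
        System.out.println(gaussianMean(12345L, 1000));
        System.out.println(gaussianMean(12345L, 1000));
    }
}
```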
A sequence of character data is called a string and is implemented in the Java
environment by the String class. The Java language contains a few special shortcuts for handling
Strings: any occurrence of a quoted string constant will automatically be
converted to a String, and the concatenation operator (+) can be used to concatenate two
Strings together to produce a new String. Finally the concatenation operator (+) can be
used to concatenate a String with any other object, in which case the object is first
converted to a String (using the object's toString method).
    String world = "World";
    System.out.println("Hello "+world);
    System.out.println("The time is now "+new Date());   // Date is java.util.Date, which must be imported
String objects are immutable--that is, they
cannot be changed once they've been created. Java provides a different class, StringBuffer,
which you can use to create and manipulate character data on the fly.
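As a brief sketch of StringBuffer in use (the class and method names below are invented for this example), the following builds up a comma-separated list piece by piece and converts it to a String only at the end.

```java
class StringBufferDemo {
    // Joins the bin contents into a single comma-separated String.
    static String joinBins(int[] bins) {
        StringBuffer buffer = new StringBuffer();
        for (int i = 0; i < bins.length; i++) {
            if (i > 0) buffer.append(", ");   // separator before every element but the first
            buffer.append(bins[i]);           // append accepts ints as well as Strings
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinBins(new int[] {3, 1, 4}));   // prints: 3, 1, 4
    }
}
```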
Arrays in Java are handled by array objects. In common with other objects they are created using the new operator, although the syntax is slightly modified. The statement:
int[] arrayOfInts = new int[100];
creates an array containing 100 ints, and assigns a reference to the array to the variable arrayOfInts. As in C and C++, array elements are numbered from 0, and are accessed as follows:
    for (int i=0; i<arrayOfInts.length; i++)
        arrayOfInts[i] = 0;
The member variable length can be used to access the dimension of an array. As well as arrays of all the built-in types, Java also allows arrays of reference types, such as:
    String[] arrayOfStrings = new String[10];
    for (int i = 0; i < arrayOfStrings.length; i++) {
        arrayOfStrings[i] = new String("Hello " + i);
    }
Like C and C++ Java does not directly support multi-dimensional arrays, but it does support arrays of arrays which give much the same functionality:
    double[][] arrayOfArrayOfDoubles = new double[10][3];
    for (int i=0; i<10; i++)
        for (int j=0; j<3; j++)
            arrayOfArrayOfDoubles[i][j] = 0;
The Java language has built-in support for handling errors, using a mechanism known as exception handling. To generate an exception in your code use the throw statement. For example:
if (x < 0) throw new IllegalArgumentException("x must be >= 0");
In Java exceptions are represented by instances of classes which extend
Throwable.
Exceptions fall into two categories, checked exceptions
and unchecked exceptions. If a method throws a checked
exception it must explicitly declare that the exception can be thrown, using a
throws clause. Declaring unchecked exceptions using a
throws clause is optional. For example:
    public double MySqrt(double x) throws IllegalArgumentException {
        if (x < 0) throw new IllegalArgumentException("x must be >= 0");
        return Math.sqrt(x);
    }
(Note that the Math.sqrt() method does not throw an exception when given a negative number, instead it returns a special double value, Double.NaN, which represents an undefined number. This is the normal behavior for floating point operations in Java).
Unchecked exceptions are those that extend either Error or RuntimeException.
All other exceptions are checked. In general checked exceptions are used for
errors that could have been expected to happen in a well defined place (for
example IO errors when reading a file), whereas unchecked exceptions are used for
errors that could happen almost anywhere (for example running out of memory).
These definitions are however rather vague, so it is often a matter of taste and
style whether to use a checked or unchecked exception.
You can deal with exceptions in your programs using a try ... catch statement. For example:
    try {
        for (int i=0; i<errors.length; i++) {
            errors[i] = MySqrt(errors[i]);
        }
    } catch (IllegalArgumentException x) {
        System.err.println("Error calculating errors");
        x.printStackTrace();
    }
If a call to MySqrt results in an exception being thrown, the loop will immediately be terminated and the body of the catch clause executed. If an exception is thrown inside a routine and is not caught using a try ... catch statement it is "bubbled up" to the caller of that method, and the caller of the caller etc., until either a catch clause is found, or the top level routine is reached, in which case the exception is reported by Java and the program terminated.
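The following sketch puts these pieces together for a checked exception. BadEventException and the three methods are hypothetical names invented for this illustration: the exception is thrown two calls down, passes through a method that merely declares it, and is finally caught at the top.

```java
// Hypothetical checked exception: it extends Exception, not RuntimeException.
class BadEventException extends Exception {
    BadEventException(String message) { super(message); }
}

class ExceptionDemo {
    // Throws a checked exception, so the throws clause is mandatory.
    static double checkEnergy(double energy) throws BadEventException {
        if (energy < 0) throw new BadEventException("negative energy: " + energy);
        return energy;
    }

    // Does not catch the exception, so it must declare it and let it bubble up.
    static double totalEnergy(double[] energies) throws BadEventException {
        double sum = 0;
        for (int i = 0; i < energies.length; i++) {
            sum += checkEnergy(energies[i]);
        }
        return sum;
    }

    // Catches the exception thrown two levels down and reports failure as NaN.
    static double safeTotal(double[] energies) {
        try {
            return totalEnergy(energies);
        } catch (BadEventException x) {
            System.err.println("Skipping event: " + x.getMessage());
            return Double.NaN;
        }
    }

    public static void main(String[] args) {
        System.out.println(safeTotal(new double[] {1.0, 2.0}));    // prints 3.0
        System.out.println(safeTotal(new double[] {1.0, -2.0}));   // prints NaN
    }
}
```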
Last Modified: January 14, 2004