Reading 18: Java, part 1
I love coffee, I love tea.
I love the java jive, 'cause it loves me!
- "The Java Jive" (song lyrics)
Overview
So far in the course, we've been learning how to program in Python. For the rest of the course, we will be learning and using the Java programming language. These readings are intended to get you up to speed on Java quickly, as well as to give you an idea of Java's strengths and weaknesses relative to Python.
Topics
- Why learn another programming language?
- Why Java?
- How Java is different from Python
- Java basics
Why learn another programming language?
In CS 1, you've been learning (and hopefully enjoying) Python. But Python is only one of dozens of programming languages in use today, and the reality is that in order to be an accomplished programmer, you will need to learn more than one language. Fortunately, after learning one language, learning the next is (usually) much easier. This is particularly the case when the new language is quite similar to one you already know. That will turn out to be the case for Java and Python, though there are some key differences too.
The reason why you should learn another programming language is simple: no language is the best tool for all tasks. Even though Python is a wonderful language and is useful for a broad set of tasks, it can't be the best choice for everything. For instance, one limitation of Python is that code written in it is slow compared to many other languages (including Java), so if speed is important for you, Python may not be the best choice.
Why Java?
Java was designed and released in the mid-1990s (a few years after Python first appeared). At the time, the dominant languages for software development were C and C++. Both C and C++ were (and are) languages in which you can write very efficient code, but they can also be painful to use. They don't handle memory automatically, so even creating something as simple as an array is a fair amount of work; you need to tell the computer when to create the memory the array uses, and when you don't need it anymore. Also, C++ is extremely complex, and at the time, many programmers were wishing for a simpler language that was nearly as efficient as C++. In addition, C and C++ require that you recompile (recreate) a program for each kind of computer that uses it, and C/C++ code that works fine on one kind of computer (say, a computer running macOS) won't necessarily work on another kind (say, a computer running Windows) because the software libraries on the two systems are completely different. (In computer-speak, we say that programs written in C and C++ are not portable.)
So, to sum up, the goal of Java was to be a programming language that
- was almost as fast as C or C++
- was simpler than C++
- didn't require manual memory management
- was portable (the same code works everywhere)
In addition, the Java authors wanted Java to be an object-oriented language, which we will discuss in detail below.
Over the years, Java has become one of the most popular programming languages, so learning it will definitely be worth your time.
How Java is different from Python
Java has many similarities to Python, as well as many important differences.
Like Python, Java:
- is garbage collected (you don't have to manage memory yourself)
- is object-oriented
- is portable (mostly)
Java differs in these ways:
- Java is completely object-oriented. In Python, you can use objects or not, but in Java, they are all that there is.
- Java requires explicit type declarations on variables, whereas Python doesn't.
- Java is considerably more portable than Python, and is pretty much the most portable language you can use. Not just the language, but all the core libraries are standardized, and code you write on one computer will run unaltered on any other. (Java's slogan is "Write once, run anywhere!", sometimes abbreviated as "WORA".)
- Java code runs much faster than Python code on average.
- Java code tends to be much more verbose than Python (meaning it takes many more lines of code to say the same thing).
Java basics
Installing Java
The assignments will walk you through the process of installing Java, but if you want to do it on your own, you can go to this site and download and install it yourself.
The two programs that we will be needing in this reading are:
- javac: the Java compiler
- java: the Java bytecode interpreter
We'll show you how to use these below.
A simple program
Let's create a simple program in Java. This program, when run in a terminal, will print the words "hello, world!" to the terminal and exit.
public class Hello {
    public static void main(String[] args) {
        System.out.println("hello, world!");
    }
}
Before we discuss the program itself, let's see how to run it. First, copy the code above into a code editor (like Visual Studio Code), and save it as the file Hello.java. Then type this in a terminal:
$ javac Hello.java
(The $ is the terminal prompt; yours may look different.)
Assuming that the current directory of the terminal is the same one in which Hello.java is located, this will compile the Java code into Java bytecode, which is stored in a file ending in the file extension .class. If we list the files in the directory, we'll see it:
$ ls
Hello.class  Hello.java
In addition to the original file we wrote (Hello.java), there is a new file: Hello.class.
Here's what happened: when you typed javac Hello.java, you were invoking the Java compiler, whose job it is to "compile" Java source code into a simpler and easier-to-interpret version, which is called Java bytecode. The details of Java bytecode aren't important now; all you need to know is that it's the name of the code representation that is output by the Java compiler.
In order to actually run Java code, another step is needed. You have to take the compiled Java code (here, Hello.class), and run it using the Java bytecode interpreter, a program which is just called java. (Sometimes we'll just call this the "java interpreter".) That looks like this:
$ java Hello
hello, world!
Notice that the command-line argument to the java program is the name Hello (not Hello.java or Hello.class). The java interpreter uses that name to find the Hello.class file (and not the Hello.java file, which it doesn't care about). Once it finds this file, it will interpret its contents, which in this case leads to it printing the words "hello, world!" on the terminal.
OK, so that's what the program does. Let's take another look at the program:
public class Hello {
    public static void main(String[] args) {
        System.out.println("hello, world!");
    }
}
There's a lot to unpack here, but notice first that the equivalent program in Python would contain only one line:
print("hello, world!")
In addition, you don't have to compile Python code yourself; you just run it by telling Python where the code's file is. For instance, if the file with the Python code is called hello.py, then we can immediately do this:
$ python hello.py
hello, world!
So it's clear that Java
- is much more verbose than Python
- requires two steps to run the program instead of one
Java is an object-oriented language, and not only that, it's a "pure" object-oriented language. What this means will take a while to explain, but one aspect of it is that almost all Java code is contained inside of code constructs called classes. You can see this from the first line:
public class Hello {
This code defines a class called Hello. (We'll discuss what public means later.) So let's talk about classes. And, since classes are inextricably linked to objects, let's talk about them too.
Classes and objects
Since Java is an "object-oriented language", one of the main things you do in Java is to create objects. But how do you create objects? And for that matter, what are objects?
A good way to understand objects is to realize that there are only two fundamental things in programming:
- data
- functions that act on data
You've already seen both of these in Python, many times. Now imagine that it was possible to combine data and functions into one entity — that's what an object is!
You might wonder why it's advantageous to combine data and functions into one thing. One reason is that you can use the same function name for different kinds of object data. This is why we use the "dot syntax" in Python (and Java!); for two different kinds of objects foo and bar, foo.length() and bar.length() could be doing completely different things, but they both use the name length for the name of the function. Functions that are part of an object are referred to as methods. If we weren't using objects, we'd have to invent different names, and we would also have to pass the data as an argument:
foo_length(foo); // ugly!
bar_length(bar); // ugly!
// vs.
foo.length(); // nice!
bar.length(); // nice!
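To make the foo/bar idea concrete, here is a minimal sketch in which two unrelated kinds of objects each provide their own length method. The class names Playlist, Sentence, and LengthDemo are invented for this illustration and are not part of Java itself.

```java
// Playlist, Sentence, and LengthDemo are hypothetical classes invented for this sketch.
class Playlist {
    String[] songs = {"Java Jive", "Tea for Two", "Black Coffee"};

    // For a playlist, "length" means the number of songs.
    int length() {
        return songs.length;
    }
}

class Sentence {
    String text = "I love coffee";

    // For a sentence, "length" means the number of characters.
    int length() {
        return text.length();
    }
}

class LengthDemo {
    public static void main(String[] args) {
        Playlist foo = new Playlist();
        Sentence bar = new Sentence();
        System.out.println(foo.length()); // prints 3
        System.out.println(bar.length()); // prints 13
    }
}
```

Both calls use the same name, length, but each object runs the method defined in its own class.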
So this is certainly convenient. But a more important reason why objects are useful is that they allow you to restrict access to the data inside the object so that only the methods can directly access it. The effect of this is to make code more robust and easier to debug; if the internal data gets messed up, the only thing that could have messed it up is a call to one of the methods of the object. In contrast, if the data was accessible to any function, it can be very hard to find out which function messed up the data. (This is the same reason we don't like to have global variables.) The technical name for this phenomenon is data hiding.
The nice thing about data hiding is that the only way the rest of the code can interact with an object is by calling the methods of the object. The internal data of the object is not accessible, so if you decide you want to change the internal data or the way the methods interact with it, it won't affect any external code as long as the methods give the same results. This is called data abstraction, and it makes it easier to develop programs, especially large programs that need to be modified over time.
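As a rough sketch of data hiding, the class below keeps its data in a field marked private, a Java keyword this reading hasn't introduced yet, so that only the class's own methods can read or change it. The names Counter and CounterDemo are invented for this example.

```java
// Counter and CounterDemo are hypothetical classes invented for this sketch.
class Counter {
    // "private" means only code inside Counter can touch this field.
    private int count = 0;

    // The only way to change count from outside is to call this method.
    void increment() {
        count = count + 1;
    }

    // The only way to read count from outside is to call this method.
    int getCount() {
        return count;
    }
}

class CounterDemo {
    public static void main(String[] args) {
        Counter c = new Counter();
        c.increment();
        c.increment();
        System.out.println(c.getCount()); // prints 2
        // c.count = -5;  // would not compile: count is private to Counter
    }
}
```

If count ever ends up with a wrong value, the only code that needs checking is inside Counter. And if Counter later stores its data differently, CounterDemo keeps working as long as increment and getCount behave the same way.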
OK, so that's what you need to know about objects. What about classes? A class is a programming construct which contains information on how to build objects. Basically, classes are what you, the programmer, write in order to create objects. Classes can contain several different things:
- methods: functions that are part of the objects created by the class
- constructors: special methods which create objects of the class (also called instances of the class)
- fields: variables that "belong" to objects created from the class
- static methods: methods which are part of the class as a whole, but not part of objects created from the class.
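Here is a small sketch showing all four ingredients in one class. Cup and CupDemo are hypothetical names chosen for this illustration, and the private keyword (which restricts the field to the class's own methods) is an addition the reading hasn't covered yet.

```java
// Cup and CupDemo are hypothetical classes invented for this sketch.
class Cup {
    // Field: each Cup object has its own ounces value.
    private int ounces;

    // Constructor: runs when "new Cup(...)" is evaluated.
    Cup(int startingOunces) {
        ounces = startingOunces;
    }

    // Method: acts on one particular Cup object.
    void sip() {
        if (ounces > 0) {
            ounces = ounces - 1;
        }
    }

    // Method: reports the current value of the field.
    int getOunces() {
        return ounces;
    }

    // Static method: belongs to the class as a whole, needs no Cup object.
    static String drinkName() {
        return "coffee";
    }
}

class CupDemo {
    public static void main(String[] args) {
        Cup cup = new Cup(8);                  // calls the constructor
        cup.sip();                             // calls a method on the object
        System.out.println(cup.getOunces());   // prints 7
        System.out.println(Cup.drinkName());   // prints coffee
    }
}
```

Notice that sip and getOunces are called on an object (cup), while drinkName is called on the class itself (Cup).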
Terminology
If your head is spinning now because of all the terminology, that's completely natural. Object-oriented programming does have a lot of jargon associated with it. Don't worry, you'll pick it up quickly!
This is a good time to return to our example:
public class Hello {
    public static void main(String[] args) {
        System.out.println("hello, world!");
    }
}
This is a class called Hello. It contains:
- no fields
- no regular methods
- no constructors
- one static method, called main
The main method is special; it is where the program starts executing. When you enter the command java Hello, the Java interpreter:
- finds the file Hello.class,
- finds the code for the main method inside the Hello class,
- and executes it!
Java vs. Python
The main method in Java is very similar to the if __name__ == '__main__': idiom in Python.
In this case, the main method contains a single line:
System.out.println("hello, world!");
This is Java's way of telling the computer to print the string "hello, world!" to the terminal. The ln in println means "add a newline after printing the string". (Python does the same thing, but just calls it print.) The System.out stuff indicates where to print the message; System.out corresponds to printing to the terminal. Technically, out is what's called a PrintStream (something you can print on) and System is a class which represents the computer in an abstract way.
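To see what the ln in println buys you, this short sketch contrasts it with System.out.print, which prints without adding a newline. The class name PrintDemo is made up for this example.

```java
// PrintDemo is a hypothetical class name chosen for this sketch.
class PrintDemo {
    public static void main(String[] args) {
        System.out.print("hello, ");    // no newline: the next output continues on this line
        System.out.println("world!");   // prints, then moves to a new line
        System.out.println("done");     // appears on its own line
    }
}
```

Running this prints "hello, world!" on one line, followed by "done" on the next.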
We mentioned the reason for the name main above, but let's look at the entire line:
public static void main(String[] args) {
There's a lot of stuff here! The String[] args part is the (only) argument to the main method. For this method, it refers to the command-line arguments passed to the program when it was run. These are represented as an array of strings. So if you happened to run the program like this:
$ java Hello foo bar baz
then the args argument would be the array of strings containing the strings "foo", "bar", and "baz", in order. Command-line arguments are very useful for passing information into a program at the moment that it is run.
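The Hello program ignores its arguments, so here is a sketch of a program that actually looks at them. The class name Echo and the helper method describe are invented for this example. It also uses a for loop, which looks different from Python's and hasn't been covered yet.

```java
// Echo and describe are hypothetical names invented for this sketch.
class Echo {
    // Builds one line of text per argument, of the form "index: value".
    static String describe(String[] args) {
        String result = "";
        for (int i = 0; i < args.length; i++) {
            result = result + i + ": " + args[i] + "\n";
        }
        return result;
    }

    public static void main(String[] args) {
        // args.length is the number of command-line arguments.
        System.out.println("number of arguments: " + args.length);
        System.out.print(describe(args));
    }
}
```

If this is saved as Echo.java and compiled with javac Echo.java, then running java Echo foo bar baz should report 3 arguments and then list them as 0: foo, 1: bar, and 2: baz, one per line.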
The String[] args syntax illustrates another important aspect of Java that isn't there in Python (at least, not by default): you have to declare the types of all function arguments and local variables! (Type declarations are a big part of why Java is as verbose as it is.) In this case, this just says that args has the type "array of strings", which is represented in Java as String[]. The name String is the name of a class which represents character strings, and the [] tells us that this is actually an array of strings, not just a single string.
The void part is the return type of the main method; it states that the method doesn't return anything when called. You are required to specify the return type for any method you write in Java.
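As an illustration of declared types, the sketch below defines three methods with different return types and stores their results in typed local variables. The class ReturnTypes and the methods greet, square, and shout are hypothetical names invented for this example.

```java
// ReturnTypes and its methods are hypothetical names invented for this sketch.
class ReturnTypes {
    // void: prints something but returns nothing to the caller.
    static void greet(String name) {
        System.out.println("hello, " + name + "!");
    }

    // int: the caller gets an integer back.
    static int square(int n) {
        return n * n;
    }

    // String: the caller gets a string back.
    static String shout(String s) {
        return s.toUpperCase() + "!";
    }

    public static void main(String[] args) {
        greet("world");                // prints hello, world!
        int nine = square(3);          // local variables need declared types too
        String loud = shout("java");
        System.out.println(nine);      // prints 9
        System.out.println(loud);      // prints JAVA!
    }
}
```

In Python, none of the words void, int, or String would appear; the Java compiler uses them to check that each value is used consistently before the program ever runs.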
The static keyword indicates that this method (main) is a static method of the class. That just means that you don't need an instance of the class to call it, which is good, because this class doesn't have any constructors or any obvious way to create new objects from this class.
The public keyword indicates that this method (main) can be called by any code anywhere; it's part of the public interface of the class.
Notice that Java is extremely verbose and also extremely picky; you have to state exactly what the type of each local variable or method argument is, you have to specify the return type of every method, you have to specify if a method is static (if it is), and you have to specify if a method is public (if it is). There are many other things we can specify in Java as well.
Coming up
We've said all we have to say about this trivial example, so in the next reading we will introduce more Java features and we'll look at more realistic examples.