Java, Java, Java: From the Java Library: The Java Collections Framework and Generic Types

19. Data Structures: Lists, Stacks, and Queues

19.7. From the Java Library: The Java Collections Framework and Generic Types

java.sun.com/j2se/1.5.0/docs/api/

generic types

THE JAVA CLASS LIBRARY contains implementations of some abstract data types. The Java utility package, java.util.*, contains a good number of classes and interfaces designed to facilitate storing and ma- nipulating groups of objects. This family of related interfaces and classes is called the Java collections framework. It contains structures that corre- spond to the ADTs that we have just discussed, plus other data structures. Java 5.0 has reimplemented the Java collections framework using generic types that allow a programmer to specify a type for the objects that are stored in the structure.

Generic types in Java

Declaring classes that use the generic type construct introduced in Java 5.0 involves using new syntax to refer to the class name. Such classes and in- terfaces, including those in the collections framework, use angle brackets containing one or more variables (separated by commas) to refer to un- specified type names. For example, you would use <E> or <K,V> to refer to unspecified type names. Thus, names of classes or interfaces imple- mented with generic types are written with the syntax ClassName<E>.

Let’s reconsider the Vector class, which was introduced in Chapter 9. The Vector class, which is part of the Java collections framework, has a generic type implementation in Java 5.0. Figure 16.28 describes the Vector<E> class. Notice that the E refers to an unspecified type name, that is, the name of a class or interface. This type is specified when a corresponding variable is declared. The type must also be included after a

constructor’s type name when an object is instantiated and assigned to the variable. The following code demonstrates how to create a Vector<E> object for storing String objects.

Figure 16.28: The java.util.- Vector class is implemented with a generic type in Java 5.0.

In effect, the <E> serves as parameter for the type of objects that will be stored in the Vector. Java 5.0 still allows the use of the unparameterized Vector class which is equivalent to instantiating a Vector<Object>

SECTION 6 • Java Collections and Generic Types783

object. If you use a Vector object, the above code would be written as follows.

One benefit a generic type provides is type checking of method argu- ments at compile time. If strVec is a Vector<String> object, then the statement

will generate a compile-time error. By contrast, if strVec was just a plain Vector object, no error would be found at compile time. Thus, if a programmer wishes to create an array of String objects, using generic types will help guarantee that the objects being stored are actually of type String. In this way, using generic types helps to reduce the number of programming errors and thereby makes programs safer and more robust. A second benefit of using generic types is that the return type of objects retrieved from the data structure will be of the specified type rather than of type Object. If you compare the last statement in each of the two code segments above, you can see that using a generic type eliminates the need to cast an Object to a String. This is a big convenience for the programmer, because forgetting to cast objects from one type to another is

a common programming error.

The java.util.Stack<E> class

The Java collections framework includes the Stack<E> class, imple- mented as a subclass of the Vector<E> class. It contains the methods shown in Figure 16.29. For the most part, its methods provide the same functionality as the methods we developed earlier in this chapter.

Note that the methods provide the functionality of a stack ADT but the de- tails of its implementation are hidden from the user. An object of this class can be declared, instantiated, and used in a manner like the Vector<E> code.


Stack<E>

+empty() : boolean +peek() : E +pop() : E +push(in o : E) : E +search(in o : Object) : int

Figure 16.29: The java.util.- Stack<E> class is a subclass of Vector<E>.

SELF-STUDY EXERCISE

EXERCISE 16.11 Write a class with only a main() method that modi- fies Figure 16.23 so that it uses the parameterized java.util.Stack<E> class instead of the Stack class used there.

Figure16.30:

The

The List<E> interface

and the LinkedList<E> class

The java.util.LinkedList<E> is an implementation of a linked list (Fig. 16.30). Like our implementation earlier in this chapter, it contains methods that can be used to define the standard stack and queue methods. Many of the standard list-processing methods are defined as part of the java.util.List<E> interface. The advantage of defining list op- erations as an interface is that they can be implemented by a number of data structures. Code for using the list methods can be written to work

independently of the data structure being used.

For example, the collections framework contains LinkedList<E> and ArrayList<E>, both of which implement the List<E> interface. In this section, we will demonstrate how to make appropriate use of the List<E> interface and data structures that implement it.

Suppose that a programmer is developing an application to track ac-

tivity of employees working at a small company’s phone-in help desk. The programmer has decided to use the LinkedList<E> data structure to store objects of the PhoneRecord class that was defined earlier in this chapter and will use methods of the List<E> interface to manipulate the data. A list seems to be an appropriate structure for this problem since

LinkedList<E> class imple- ments the List<E> interface. Only a partial list of methods are shown.

An unknown (but relatively small) amount of data will be involved.

The company wants the data stored in the order it is generated.

The main use of the data will be to print out the list’s phone records.

The programmer might write a short method like that in Figure 16.31 to demonstrate how the List<E> and LinkedList<E> structures will be used.

Figure 16.31: A method that demonstrates the interface List<E> and the class LinkedList<E>.

Note that the statement

declares a variable theList of interface type List<E> but assigns an ob- ject of class type LinkedList<E>. This is appropriate because the class implements the interface and the code uses only methods from the inter- face. The class ArrayList<E> in the collections framework also imple- ments the List<E> interface. It uses an array rather than a linked list to store elements and has a constructor with an int parameter that sets the size of the array. If the programmer knew that theList would contain close to, but always less than, 100 elements, then it might be better to declare:

Also note the unusual looking for loop at the end of the method. This is a new feature of Java 5.0 which can be used to simplify the coding of loops

that iterate through every object in a collection of objects. The statementThe for–each loop

should be thought of as executing the enclosed statements for each PhoneRecord object, pr, in the theList data structure. In previous ver- sions of Java, an interface named Iterator had to be used to enumerate all the elements in a collection. The Iterator approach is more flexible— for example, it allows you to iterate through just some of the members of the collection— and will therefore still have to be used for more complex loops.

The output of the method will be:

In the next section we will examine two other structures in the collec- tions framework, the Set interface and the Map interface.

Using the Set and Map Interfaces

The Set and Map interfaces are similar to the List interface in that there are multiple classes in the collections framework that implement them.

Using the Set Interface.

The Set interface is modeled after the set theory principles taught in math- ematics. In mathematics, sets are groups of elements with a clearly defined

algorithm for deciding if any given element is in any given set. Elements can be added to sets and can be removed from sets. Sets cannot have duplicate elements; if an element is added to a set that already contains an element equal to it, the new set still has a single such element. The ele- ments of a set have no natural order; two sets that have the same elements listed in different orders are considered to be the same set.

In computer science and in Java, data structures that model sets are designed for large collections of data. Such data structures have a method that determines if an object is in a given set with an efficient algorithm. For large data sets, using such a method is much faster than iterating through a list. Sometimes, you may or may not be able to list the elements of a set data structure in some natural order, depending on how the data

structure is implemented. An incomplete listing of the methods of the

Set<E> interface is given in the UML diagram in Figure 16.32.

TreeSet<E> and HashSet<E> are two classes in the collections

Figure 16.32:A partial list of methods of the Set<E> interface.

Overriding methods

framework that implement the Set<E> interface. They both provide fast operations to check whether an element is in a set. They also provide quick insertion of an element into the set or removal of an element from a set. For large sets—those having at least several thousand elements—where there are large numbers of insertions, deletions, and tests for whether elements are in a set, linked lists would be much slower.

When using the Set<E> interface for a user-defined class E, you will likely want to override the definition of the equals() method from the Object class in E because that method is used when computing the value of aSet.contains(anElement). When using the TreeSet<E> class for a user defined class E, you should implement the compareTo() method of the Comparable interface because it is used to order the el- ements of E. In the next section, we will discuss the specific manner in which elements are ordered. Finally, when using the HashSet<E> class for a user defined class E, you should override the hashCode() method of the Object class because it is used HashSet<E>. Hash codes are in- dexes that are computed from the particular object that is being stored in the HashSet. Given an object’s hash code, the object can be retrieved directly, as we do with object’s stored in an array. However, we will not discuss hash codes in any detail in this text.

Problem Statement

Let’s think about a simple example for using a set data structure. Sup- pose that a programmer is developing an application for a large com- pany for maintaining a no–call list. The programmer has decided to use the TreeSet<E> data structure to store objects of the PhoneRecord class that was defined earlier in this chapter and will use methods of the Set<E> interface to manipulate the data.

A TreeSet seems to be an appropriate structure for this problem, since

A large amount of data will be involved.

The company wants the PhoneRecord data stored in alphabeti- cal order.

The main use of the data will be to test whether names are in the set.

The programmer might write a short method like that in Figure 16.33 to demonstrate how the Set<E> and TreeSet<E> structures will be used.

Figure 16.33: A method that demonstrates use of the interface Set<E>

and the class TreeSet<E>.

In order for the testSet() method to work as we would like, we need to have the PhoneRecord class implement the Comparable interface and to override the equals() method. For this example, it is reason- able to assume that the name field of PhoneRecord objects should be unique so that it can be used to decide if two PhoneRecord objects are equal. The name field of PhoneRecord can also be used to define the other two methods discussed above. Thus, add the following code to the PhoneRecord class.

The output of the TestSet() method is listed below:

Notice that Jane M PhoneRecord appears only once in the listing of elements in the set.

Using the Map<K,V> Interface.

The Map<K,V> interface is modeled after looking up definitions for words in a dictionary. In computer science, maps are considered to be a collection of pairs of elements. A pair consists of a key that corresponds to a word being looked up and a value corresponding to the definition of the word. Pairs can be added to maps and can be removed from maps. Maps cannot have distinct pairs with the same keys; if you attempt to add a pair to a

map that already contains a pair with the same key, the second pair will

replace the first. An incomplete listing of the methods of the Map<K,V>

Figure 16.34:A partial list of methods of Map<K,V>.

interface is given in the UML diagram in Figure 16.34. TreeMap<K,V> and HashMap<K,V> are two classes in the collections framework that implement the Map<K,V> interface.

Let’s now consider a simple example of using a map data structure. Suppose that our programmer has been hired by a large company to de- velop an application that maintains an electronic phone list for company employees. The programmer has decided to use the TreeMap<E> data structure to store pairs of names and telephone numbers and will use methods of the Map<V,K> interface to manipulate the data.

A TreeMap seems like an appropriate data structure for this problem,

since

A large amount of data will be involved.

The company wants the PhoneRecord data stored in alphabeti- cal order.

The main use of the data will be to use names to access telephone numbers.

The programmer might write a short method like that in Figure 16.35 to

demonstrate how the Map<K,V> and TreeMap<K,V> structures will be used.

The output for this test program is:

Notice the the second phone number for Jane M is the one that is stored in the data structure.

Figure 16.35: A method that demonstrates use of the interface Map<K,V>

and the class TreeMap<K,V>.