Java, Java, Java: Array Algorithms: Sorting

12. Arrays and Array Processing

12.5. Array Algorithms: Sorting

Sorting an array is the process of arranging its elements in ascending or descending order. Sorting algorithms are among the most widely used algorithms. Any time large amounts of data are maintained, there is some

Figure 9.12: The AnalyzeFreq class definition.

need to arrange them in a particular order. For example, the telephone company needs to arrange its accounts by the last name of the account holder as well as by phone number.

Insertion Sort

The first sorting algorithm we’ll look at is known as insertion sort, so named because as it traverses through the array from the first to the last element, it inserts each element into its correct position in the partially sorted array.

For an array of N elements, let’s think of the array as divided into two parts. The sorted part will be the left hand side of the array. And the unsorted part will be the right hand side of the array. Initially, the sorted part consists of the first element in the array—the element at index 0.

Insertion sort moves through the unsorted portion of the array—that is its loop variable, k, ranges from 1 through N-1. On each iteration it inserts the kth element into its correct position in the sorted part of the array. To insert an element into the sorted part of the array, it may be necessary to move elements greater than the one being inserted out of the way.

In pseudocode, insertion sort can be represented as follows:

As is apparent from the pseudocode, we have a nested for loops. The outer

(k) loop, iterates through the array from 1 to N-1. The inner loop iterates as many times as necessary, starting with the element just to the left of the kth element in order to insert the kth element into its correct position in the sorted portion. Note that the kth element is always removed from the array (and stored in the variable x), to make room for elements that have to be moved to the right.

To see how this works, consider an integer array containing the ages of five friends:

For this five-element array, insertion sort initially will assume that the el- ement at index 0 is in the correct position. The vertical line marks the boundary between the sorted and unsorted portions of the array. The outer loop will look at each of the remaining elements, one at a time, in- serting it into its proper position in the sorted portion of the array. To insert 20, the number at index 1, the inner loop will move 21 to the right by one position. To do this, the algorithm will remove 20 from its location and store it in x. It will then move 21 one space to the right. Finally, it will insert 20, which is stored in x, at index 0, where it belongs relative to the other elements in the sorted part of the array. At this point, the sorted portion of the array consists of the first two elements, which are in the correct order, relative to each other.

For the next element, 27, none of elements in the sorted portion need to be moved, so the inner for loop will iterate zero times. This gives us:

For the fourth element, 24, only the previous element, 27, needs to be moved to the right, giving:

At this point, the sorted part of the array consists of the first four elements, which are in the correct order relative to each other. Finally, for the last element, 19, all of the elements in the sorted part of the array need to be moved one space to the right. This will require four iterations of the inner loop. We show the state of the array after each iteration of the inner for loop:

Clearly, the fact that so many elements may have to moved on each it- eration of the outer loop shows that insertion sort is not a very efficient algorithm.

The Sort class (Fig 9.13) provides an implementation of the insertionSort() method. There are several points worth noting about this code. First, because it takes an int array as a parameter, the insertionSort() method will sort any array of integers, regardless of the array’s length.

Figure 9.13: Source code for the insertionSort() method. Note in

main() how an integer array is passed to the method.

Array parameters

Second, note how empty brackets ([]) are used to declare an array pa- rameter. If the brackets were omitted, then arr would be indistinguish- able from an ordinary int parameter. Using the brackets indicates that this method takes an array of integers as its parameter.

Third, note how an array of integers is passed to the insertionSort() method in the main() method:

That is, when passing an array to a method, you use just the name of the array, without brackets. Both of the following statements would cause syntax errors:

Selection sort algorithm

In the first case, empty brackets are only used when you declare an array variable, not when you are passing the array to a method. In the second case, intArr[5] is an int, not an array, and cannot legally be passed to insertionSort().

Finally, within the insertionSort() method itself, note that we declare the index for the inner for loop outside of the for statement. This is so it can be used outside the scope of the for loop to insert temp at location arr[i+1], its correct location. Note also that the index of its correct loca- tion is i+1, rather than just i. This is because the inner loop might iterate past location 0, which would give i a value of -1 at that point.

Selection Sort

There are a large variety of array sorting algorithms. Selection sort is dif- ferent from, but comparable to, insertion sort in its overall performance. To illustrate the selection sort algorithm, suppose you want to sort a deck of 25 index cards, numbered from 1 to 25. Lay the 25 cards out on a table, one card next to the other. Starting with the first card, look through the deck and find the smallest card, the number 1 card, and exchange it with the card in the first location. Then, go through the deck again starting at the second card, find the next smallest card, the number 2 card, and exchange it with the card in the second location. Repeat this process 24 times.

Translating this strategy into pseudocode gives the following algorithm:

For a deck of 25 cards, you need to repeat the outer loop 24 times. In other words, you must select the smallest card and insert it in its proper location 24 times. The inner loop takes care of finding the smallest remaining card. On each iteration of this outer loop, the algorithm assumes that the card specified by the outer loop variable, count, is the smallest card (line 2). It

usually won’t be, of course, but we have to start somewhere.

The inner loop then iterates through the remaining cards (from count+1 to 25) and compares each one with the card that is currently the smallest (lines 4 and 5). Whenever it finds a card that is smaller than the smallest card, it designates it as the smallest card (line 5). At the end of the loop, the smallestCard variable will remember where the smallest card is in the deck.

Finally, when the inner loop is finished, the algorithm swaps the small- est card with the card in the location designated by count.

Algorithm: Swapping Memory Elements

An important feature of the selection sort algorithm is its need to swap two array elements, or cards, to continue our example. Swapping two memory elements, whether they are array elements or not, requires the use of a temporary variable. For example, to swap the jth and kth elements

in an int array named arr, you would use the following algorithm:Swapping algorithm

The temp variable temporarily stores the jth element so its value is not lost when its location is overwritten by the kth element. The need for this variable is a subtlety that beginning programmers frequently over- look. But consider what would happen if we used the following erroneous

algorithm:Swapping blunder

If arr[j] refers to 4 and arr[k] refers to 2 in the array 1 4 2 8, then the erroneous algorithm would produce 1 2 2 8, the wrong result.

The following method implements the swap algorithm for two elements,

el1 and el2 of an int array:

SELF-STUDY EXERCISES

EXERCISE 9.8 Sort the array, 24 18 90 1 0 85 34 18, using the insertion sort algorithm. Show the order of the elements after each iteration of the outer loop.

EXERCISE 9.9 Sort the array, 24 18 90 1 0 85 34 18, using the selection sort algorithm. Show the order of the elements after each iteration of the outer loop.

EXERCISE 9.10 Write a Java code segment to swap two Student ob- jects, student1 and student2.

EXERCISE 9.11 Write a Java implementation of the selectionSort() method to sort an array of int.

Passing a Value and Passing a Reference

Recall from Chapter 3 that when an Object is passed to a method, a copy of the reference to the Object is passed. Because an array is an object, a reference to the array is passed to insertionSort(), rather than the whole array itself. This is in contrast to how a value of a primitive type is passed. In that case, a copy of the actual value is passed.

One implication of this distinction is that for arguments of primitive type, the original argument cannot be changed from within the method be- cause the method has only a copy of its value. For example, the follow- ing method takes an int parameter n, which is incremented within the method:

But because n is a parameter of primitive type, incrementing it within the method has no effect on its associated argument. Thus, in the following segment, the value of Num—n’s associated argument—will not be affected by what goes on inside the add() method. The output produced by the code segment is shown in the comments:

Passing a primitive value

Note that while n’s value has changed inside the method, Num’s value remains unaffected.

The case is much different when we pass a reference to an object. In that case, the object itself can be manipulated from within the method. The insertionSort() method is a good illustration. In the following

code segment, the array anArr is printed, then sorted, and then printedPassing an object

again:

As you can see, the object itself (the array) has been changed from within the method. This shows that changes within insertionSort to the ar- ray referenced by arr are actually being made to anArr itself. If fact, because insertionSort() is passed a copy of the reference variable anArr, both arr and anArr are references to the very same object—that is, to the same array (Fig. 9.14).

The justification for passing a reference to an object rather than the en-

Method call overhead

Figure 9.14: When an array is passed to a method, both the pa- rameter and the corresponding ar-

Method Definition (parameter)Method Call (argument)

gument refer to the same object.public void insertionSort (int arr [ ])

{

...

}

insertionSort (anArr)

tire object itself is a matter of efficiency. A reference uses just 4 bytes of data, whereas an object may use thousands of bytes. It would just be too inefficient to copy hundreds of bytes each time an object is passed to a method. Instead, the method is passed a reference to the object, thereby giving it access to the object without incurring the expense of copying large amounts of data. Indeed, Java provides no way to pass a copy of an object to a method.

SELF-STUDY EXERCISE

EXERCISE 9.12 Give the values that will be stored in myArr and k after you invoke mystery(myArr, k), where myArr, k and mystery() are declared as follows: