10. Strings and String Processing

10.2. String Basics

 

Before we cover the new material on Strings, let’s first review what we know about this topic. In Java, Strings are considered full-fledged objects. A String object is a sequence of the characters that make up the string, plus the methods that are used to manipulate the string. The java.lang.String class (Fig. 7.1) is a direct subclass of Object, and it contains many public methods that can be used to perform useful operations on strings (such as concatenation). We will discuss a selection of the more commonly used methods, but for a full listing and description of the String methods, see the Java API documentation for java.lang.String.

Like other object variables, String variables serve as references to their respective objects. However, unlike other Java objects, Strings have certain characteristics in common with the primitive data types. For example, as we have already seen, Java allows for literal strings. A string literal is a sequence of zero or more characters contained in double quotes, such as “Socrates” and “” (the empty string). Java allows us to perform operations on literal strings, such as concatenation: the expression "Hello" + "world" results in the string "Helloworld". Java also allows us to use string literals to initialize String variables with an assignment statement. These exceptional features greatly simplify the use of Strings in our programs. Given how much we use Strings, incorporating these features into Java seems like a good design decision.

Constructing Strings

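To create String objects, the String class provides many constructors, including the following two:

    public String()
    public String(String original)

The first is the default constructor, which creates an empty string. The second creates a new String from an existing one.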

When we create an object using the first constructor, as in

    String name = new String();

Java will create a String object and make name the reference to it. Figure 7.2 shows a hypothetical representation of a String object. In addition to storing the sequence of characters that make up the string, Java also stores an integer value representing the number of characters in the string. We have chosen to represent these two elements as the private instance variables value, for the sequence of characters, and count, for the number of characters. In fact, we don’t know exactly how Java stores the sequence of characters. That information is hidden. As Figure 7.2 illustrates, when we use the default constructor, the value of the string is the empty string and its count is 0.

Figure 7.2: An empty string is a String object with value “” and count 0.

The second constructor is the copy constructor for the String class. A copy constructor is a constructor that makes a duplicate, sometimes called a clone, of an object. Many Java classes have copy constructors. Consider two statements: the first constructs a String from the literal “Hello”, and the second passes that String to the copy constructor. These two statements would result in two distinct String objects, both storing the word “Hello”.
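The behavior of the two constructors can be checked with a short sketch. The class name ConstructorDemo, the helper methods, and the variable names s1 and s2 are chosen here for illustration and are not taken from the original listing.

```java
class ConstructorDemo {
    // Default constructor: creates a new, empty String object.
    static String makeEmpty() {
        return new String();
    }

    // Copy constructor: creates a new object holding the same characters.
    static String makeCopy(String original) {
        return new String(original);
    }

    public static void main(String[] args) {
        String name = makeEmpty();
        System.out.println(name.length());   // prints 0

        String s1 = new String("Hello");
        String s2 = makeCopy(s1);
        System.out.println(s1.equals(s2));   // prints true: same characters
        System.out.println(s1 == s2);        // prints false: two distinct objects
    }
}
```

The == operator compares references, so it reports whether two variables refer to the same object, while equals() compares the characters themselves.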

 

Note that in the first of the preceding statements, we used the literal string “Hello” in the constructor. When Java encounters a new literal string in a program, it constructs an object for it. For example, if your program contained the literal “Socrates”, Java would create an object for it and treat the literal itself as a reference to the object (Fig. 7.3).

Figure 7.3: The literal String “Socrates”.


We often use a string literal to assign a value to a String variable. For example, suppose the declaration String s is followed by the assignment s = "Socrates".

In this case, the reference variable s is initially null; that is, it has no referent, no object, to refer to. However, after the assignment statement, s refers to the literal object “Socrates”, which is depicted in Figure 7.3. Given these two statements, we still have only one object, the String object containing the word “Socrates”. But now we have two references to it: the literal string “Socrates” and the reference variable s.
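A minimal sketch of this situation follows. It assumes that s is declared as a class variable, because only fields start out as null; a local variable must be assigned before it can be read. The class name LiteralReferenceDemo is hypothetical.

```java
class LiteralReferenceDemo {
    // A field of type String starts out as null.
    // (A local variable, by contrast, must be assigned before it is read.)
    static String s;

    public static void main(String[] args) {
        System.out.println(s == null);         // true on a first run: no referent yet
        s = "Socrates";
        System.out.println(s == "Socrates");   // prints true: s and the literal share one object
    }
}
```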

Assignment statements can also be used as initializers when declaring a String variable. For example, suppose name1 is declared and initialized to the literal “”, and name2 and name3 are each declared and initialized to the literal “Socrates”.

Figure 7.4: The variables name1, name2, and name3 serve as references to the literal String objects “Socrates” and “”.

Figure 7.5: Together with the objects in Figure 7.4, there are now four different String objects with eight different references to them, including the literals “Socrates” and “”.

In this example, Java does not construct new String objects. Instead, as Figure 7.4 shows, it simply makes the variables name1, name2, and name3 serve as references to the same objects that are referred to by the literal strings “” and “Socrates”. This is a direct consequence of Java’s policy of creating only one object to serve as the referent of a literal string, no matter how many occurrences there are of that literal in the program. Thus, these declarations result in no new objects, just new references to existing objects. The justification for this policy is that it saves memory in our programs. Instead of creating a String object for each occurrence of the literal “Socrates”, Java creates one object and lets all occurrences of “Socrates” refer to that object.

Finally, consider declarations that do invoke the String constructors: name4 is initialized with new String(), name5 with new String("Socrates"), and name6 with name4.

In this case, as shown in Figure 7.5, Java creates two new objects and sets name4 to refer to the first and name5 to refer to the second. It gives name4 the empty string as its value, and it gives name5 “Socrates” as its value. But these two objects must be distinguished from the objects corresponding to the literals (“” and “Socrates”) themselves. The declaration of name6 just creates a second reference to the object referred to by name4.
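The relationships summarized in Figures 7.4 and 7.5 can be tested with the == operator, which compares references rather than characters. The sketch below uses the variable names from the text; the declarations themselves are reconstructed from the descriptions above, and the class name StringReferencesDemo is hypothetical.

```java
class StringReferencesDemo {
    public static void main(String[] args) {
        // Initialized from literals: no new objects are constructed.
        String name1 = "";
        String name2 = "Socrates";
        String name3 = "Socrates";

        // Initialized with constructors: two new objects are created.
        String name4 = new String();
        String name5 = new String("Socrates");
        String name6 = name4;                      // second reference to name4's object

        System.out.println(name2 == name3);        // true: one shared literal object
        System.out.println(name1 == name4);        // false: name4 is a separate empty string
        System.out.println(name2 == name5);        // false: name5 is a separate "Socrates"
        System.out.println(name4 == name6);        // true: same object
        System.out.println(name2.equals(name5));   // true: same characters
    }
}
```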

 

Concatenating Strings

Another way to build a String object is to concatenate two other strings. Recall from Chapter 2 that there are two ways to perform string concatenation in Java: We can use the concat() method or the concatenation operator, +. For example, the concatenation operator, +, can be used to create the String “Jacqueline Kennedy Onassis” from its parts, and the String method concat() can be used to print “JacquelineOnassis” by joining “Jacqueline” and “Onassis”.

Using the + symbol as the string concatenation operator is another example of operator overloading (using the same operator for two or more different operations), which we encountered in Chapter 5.
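Both forms of concatenation can be sketched as follows. The class name ConcatDemo, the helper methods, and the variable names lastName and full are assumptions made for illustration; only the resulting strings come from the text.

```java
class ConcatDemo {
    // Joins two strings with the + operator.
    static String joinWithOperator(String a, String b) {
        return a + b;
    }

    // Joins two strings with the concat() method.
    static String joinWithMethod(String a, String b) {
        return a.concat(b);
    }

    public static void main(String[] args) {
        String lastName = "Onassis";   // hypothetical variable name
        String full = "Jacqueline " + "Kennedy " + lastName;
        System.out.println(full);                                     // Jacqueline Kennedy Onassis
        System.out.println(joinWithMethod("Jacqueline", lastName));   // JacquelineOnassis
    }
}
```

Neither form inserts a space, so any separator must be part of one of the operands.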

Note that primitive types are automatically promoted to Strings when they are mixed with concatenation operators. Thus, the statement

    System.out.println("The sum of 5 and 5 = " + (5 + 5));

will print the string “The sum of 5 and 5 = 10”. Note that the integer addition, (5 + 5), is performed first, before the integer result is converted into a String. If we had left off the parentheses around the addition operation, the second plus sign would also be interpreted as a concatenation operator. Thus,

    System.out.println("The concatenation of 5 and 5 = " + 5 + 5);

would print “The concatenation of 5 and 5 = 55”.
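The effect of the parentheses can be confirmed with a small sketch. The class and method names are hypothetical.

```java
class PromotionDemo {
    // The parenthesized integer addition is evaluated first, giving 10.
    static String withParentheses() {
        return "The sum of 5 and 5 = " + (5 + 5);
    }

    // Without parentheses, + is applied left to right, so each 5 is
    // converted to a String and appended in turn.
    static String withoutParentheses() {
        return "The concatenation of 5 and 5 = " + 5 + 5;
    }

    public static void main(String[] args) {
        System.out.println(withParentheses());      // The sum of 5 and 5 = 10
        System.out.println(withoutParentheses());   // The concatenation of 5 and 5 = 55
    }
}
```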

SELF-STUDY EXERCISES

EXERCISE 7.1 What will be printed by each of the following segments of code?

a. String s1 = "silly"; System.out.println(s1);

b. String s2 = s1; System.out.println(s2);

c. String s3 = new String(s1 + " stuff"); System.out.println(s3);

EXERCISE 7.2 Write a String declaration that satisfies each of the following descriptions:

a. Initialize a String variable, str1, to the empty string.

b. Instantiate a String object, str2, and initialize it to the word stop.

c. Initialize a String variable, str, to the concatenation of str1 and str2.

EXERCISE 7.3 Evaluate the following expressions:

a. M + N

b. M + s1

c. s1 + s2

EXERCISE 7.4 Draw a picture, similar to Figure 7.5, showing the objects and references that are created by the following declarations:

Indexing Strings

 

 

 

 

 

Programmers often need to take strings apart or put them together or rearrange them. Just think of the many word-processing tasks, such as cut and paste, that involve such operations. To help simplify such operations, it is useful to know how many characters a string contains and to number, or index, the characters that make up the string.

The number of characters in a string is called its length. The String instance method, length(), returns an integer that gives the String’s length. For example, the empty string “” has length 0, and “Socrates” has length 8.

    Indexes:     0  1  2  3  4  5  6  7
    Characters:  S  o  c  r  a  t  e  s

Figure 7.6: The string “Socrates” has eight characters, indexed from 0 to 7. This is an example of zero indexing.

The position of a particular character in a string is called its string index. All Strings in Java are zero indexed; that is, the index of the first character is zero. (Remember, zero indexing is contrasted with unit indexing, in which we start counting at 1.) For example, in “Socrates”, the letter S occurs at index 0, the letter o occurs at index 1, r occurs at index 3, and so on. Thus, the String “Socrates” contains eight characters indexed from 0 to 7 (Fig. 7.6). Zero indexing is customary in programming languages. We will see other examples of this when we talk about arrays and vectors.
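The relationship between length and index can be illustrated with a short sketch. It uses charAt(), a standard String method that returns the character at a given index; that method is not introduced in this section, and the class name IndexDemo is hypothetical.

```java
class IndexDemo {
    public static void main(String[] args) {
        String s = "Socrates";
        System.out.println("".length());   // prints 0
        System.out.println(s.length());    // prints 8

        // Valid indexes run from 0 to length() - 1.
        for (int i = 0; i < s.length(); i++) {
            System.out.println(i + " " + s.charAt(i));
        }
    }
}
```

Because indexing starts at zero, the last character is at index length() - 1, and asking for the character at index length() would throw a StringIndexOutOfBoundsException.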

 

 

Converting Data to Strings

The String.valueOf() method is a class method that is used to convert a value of some primitive type into a String object. For example, the expression String.valueOf(128) converts its int argument to the String “128”.

There are different versions of valueOf(), each of which has the following type of signature:

    public static String valueOf(Type value)

where Type stands for any primitive data type, including boolean, char, int, double, and so on.

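The valueOf() method is most useful for initializing Strings. Because valueOf() is a class method, it is called on the String class itself rather than on a String instance, and the String it returns can be used to initialize a String variable, as in String.valueOf(128).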
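A few of the valueOf() versions are sketched below. The class name ValueOfDemo, the variable names, and the sample values other than 128 are chosen here for illustration.

```java
class ValueOfDemo {
    public static void main(String[] args) {
        String number = String.valueOf(128);    // "128"
        String truth  = String.valueOf(true);   // "true"
        String letter = String.valueOf('B');    // "B"
        String real   = String.valueOf(2.5);    // "2.5"

        System.out.println(number + " " + truth + " " + letter + " " + real);
        // prints: 128 true B 2.5
    }
}
```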

We have already seen that Java automatically promotes primitive type values to String where necessary, so why do we need the valueOf() methods? For example, we can initialize a String to the text of Math.PI (approximately 3.14159) with a concatenation expression such as "" + Math.PI. In this case, because it is part of a concatenation expression, the value of Math.PI will automatically be promoted to a String value. The point of the valueOf() method is twofold. First, it may be the method that the Java compiler relies on to perform string promotions such as this one. Second, using it in a program, even when it is not completely necessary, makes the promotion operation explicit rather than leaving it implicit. This helps to make the code more readable. (Also, see Exercise 7.9.)
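The implicit and explicit forms can be compared side by side. This is a sketch under the assumption that the implicit form is written as "" + Math.PI; the class and variable names are hypothetical.

```java
class ExplicitConversionDemo {
    public static void main(String[] args) {
        String implicit = "" + Math.PI;              // conversion implied by the + operator
        String explicit = String.valueOf(Math.PI);   // conversion stated outright

        System.out.println(implicit.equals(explicit));   // prints true
        System.out.println(explicit);                    // prints 3.141592653589793
    }
}
```

Both forms yield the same characters, so the choice between them is a matter of readability rather than behavior.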

 

SELF-STUDY EXERCISES

EXERCISE 7.5 Evaluate each of the following expressions:

a. String.valueOf(45)

b. String.valueOf(128 - 7)

c. String.valueOf('X')

EXERCISE 7.6 Write an expression to satisfy each of the following descriptions:

a. Convert the integer value 100 to the string "100".

b. Convert the character 'V' to the string "V".

c. Initialize a new String object to X times Y.