
The Character Class in Java

The data type char is primitive and has no methods. For this reason, the Character class in the java.lang core package includes a large number of methods that are useful for dealing with single characters; many of these methods are static. This class includes methods for testing, such as whether a character is a digit, a letter, or a special character.

 

Is That So?

What all test methods have in common is that they start with the prefix is and return a boolean. In addition, methods are available for converting, for example, to uppercase or lowercase. The following list includes a few examples:

 

Results of Some is*() Methods
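As a rough illustration of such results, the following sketch prints the outcome of a few is*() tests for single characters. The class name IsMethodsDemo and the helper describe(...) are made up for this example and aren't part of the Java API.

```java
class IsMethodsDemo {

    /** Hypothetical helper: summarizes a few is*() results for one character. */
    static String describe( char c ) {
        return "digit=" + Character.isDigit( c )
             + " letter=" + Character.isLetter( c )
             + " upper=" + Character.isUpperCase( c )
             + " lower=" + Character.isLowerCase( c )
             + " whitespace=" + Character.isWhitespace( c );
    }

    public static void main( String[] args ) {
        // digit=false letter=true upper=false lower=true whitespace=false
        System.out.println( describe( 'a' ) );
        // digit=true letter=false upper=false lower=false whitespace=false
        System.out.println( describe( '7' ) );
        // digit=false letter=false upper=false lower=false whitespace=true
        System.out.println( describe( ' ' ) );
    }
}
```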

 

All these methods “know” about the properties of each Unicode character. Furthermore, the code point of each Unicode character is always the same, no matter whether a program is executed in Germany or Mongolia.

 

Note: The term “letter” describes more than just well-known letters like “a.” Unicode contains more than 100,000 characters, including letters and digits from many different scripts.

Testing Whether a String Consists Only of Digits

In the following example, we’ll declare a method that will run through a string and test if the string consists only of digits. Although such functionality is useful in practice, Java Platform, Standard Edition (Java SE) doesn’t provide a simple method for it.

 

public class IsNumeric {

    /**
     * Returns {@code true} if the String contains only Unicode digits.
     * An empty string or {@code null} leads to {@code false}.
     *
     * @param string Input String.
     * @return {@code true} if string is numeric, {@code false} otherwise.
     */
    public static boolean isNumeric( String string ) {
        if ( string == null || string.length() == 0 )
            return false;

        for ( int i = 0; i < string.length(); i++ )
            if ( ! Character.isDigit( string.charAt( i ) ) )
                return false;
        return true;
    }

    public static void main( String[] args ) {
        System.out.println( isNumeric( "1234" ) ); // true
        System.out.println( isNumeric( "12.4" ) ); // false
        System.out.println( isNumeric( "-123" ) ); // false
    }
}

 

Our method defines that null and an empty string aren’t considered numeric. You could just as well specify that null should lead to an exception and that an empty string is numeric. Conventions like these are up to the author of the library, and different utility libraries with such helper functions follow different conventions.

 

Our example uses two String methods: length() returns the length of a string, and charAt(int) returns the character at the desired position. A loop iterates over the string and tests each character with isDigit(...). If a character is not a digit, return false exits the loop and the method immediately. If the loop runs to completion, return true reports that every character was a digit.
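To make the point about conventions concrete, here is a minimal sketch of the other convention mentioned above: null leads to an exception, and an empty string counts as numeric. It swaps the explicit loop for a stream over the characters, which is a different technique from the one used in the listing. The names NumericConventions and isNumericStrict are hypothetical.

```java
import java.util.Objects;

class NumericConventions {

    /**
     * Alternative convention (an assumption for this sketch): null is rejected
     * with a NullPointerException, and the empty string counts as numeric
     * because no character fails the test.
     */
    static boolean isNumericStrict( String string ) {
        Objects.requireNonNull( string, "string must not be null" );
        // chars() streams the UTF-16 code units; allMatch stops at the first non-digit
        return string.chars().allMatch( Character::isDigit );
    }

    public static void main( String[] args ) {
        System.out.println( isNumericStrict( "1234" ) ); // true
        System.out.println( isNumericStrict( "" ) );     // true
        System.out.println( isNumericStrict( "12.4" ) ); // false
    }
}
```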

Overview of the Most Important is*(...) Methods

final class java.lang.Character

implements Serializable, Comparable<Character>

 

  • static boolean isDigit(char ch)

Is it a digit, that is, 0 to 9 or a decimal digit from another script?

  • static boolean isLetter(char ch)

Is it a letter?

  • static boolean isLetterOrDigit(char ch)

Is it an alphanumeric character?

  • static boolean isLowerCase(char ch)
  • static boolean isUpperCase(char ch)

Is it a lowercase letter or an uppercase letter?

  • static boolean isWhitespace(char ch)

Is it a space, line feed, return, or tab (i.e., whitespace)?
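The methods listed above can be combined when traversing a string. The following sketch counts letters, digits, and whitespace characters; the class CharStats and its count(...) method are invented for illustration.

```java
import java.util.Arrays;

class CharStats {

    /** Hypothetical helper: returns { letters, digits, whitespace } found in the text. */
    static int[] count( String text ) {
        int letters = 0, digits = 0, whitespace = 0;
        for ( int i = 0; i < text.length(); i++ ) {
            char c = text.charAt( i );
            if ( Character.isLetter( c ) )
                letters++;
            else if ( Character.isDigit( c ) )
                digits++;
            else if ( Character.isWhitespace( c ) )
                whitespace++;
        }
        return new int[] { letters, digits, whitespace };
    }

    public static void main( String[] args ) {
        // 5 letters (R, o, o, m, b), 2 digits, 2 whitespace characters (space, line feed)
        System.out.println( Arrays.toString( count( "Room 12b\n" ) ) ); // [5, 2, 2]
    }
}
```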

 

Converting Characters to Uppercase/Lowercase

To convert a character to uppercase/lowercase, the Character class declares the methods toUpperCase(...) and toLowerCase(...). The is*(...) methods that carry out the testing are often used when a string is traversed.

 

Our next example asks a user to enter a string. Valid letters should be converted to uppercase, and any whitespace should be replaced with an underscore. To iterate over the input, we’ll again use the String methods length() and charAt(int).

 

String input = new java.util.Scanner( System.in ).nextLine();

for ( int i = 0; i < input.length(); i++ ) {
    char c = input.charAt( i );
    if ( Character.isWhitespace( c ) )
        System.out.print( '_' );
    else if ( Character.isLetter( c ) )
        System.out.print( Character.toUpperCase( c ) );
}

 

For example, for the input “honiara brotherhood guesthouse1,” the output is “HONIARA_BROTHERHOOD_GUESTHOUSE.” The “1” disappears because it’s neither whitespace nor a letter.
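If the converted text is needed as a value rather than as console output, the same logic could be wrapped in a method that collects the characters in a StringBuilder. This is only a sketch; the names UpperCaseUnderscore and convert are hypothetical.

```java
class UpperCaseUnderscore {

    /** Uppercases letters, replaces whitespace with '_', and drops everything else. */
    static String convert( String input ) {
        StringBuilder result = new StringBuilder( input.length() );
        for ( int i = 0; i < input.length(); i++ ) {
            char c = input.charAt( i );
            if ( Character.isWhitespace( c ) )
                result.append( '_' );
            else if ( Character.isLetter( c ) )
                result.append( Character.toUpperCase( c ) );
            // any other character (digits, punctuation) is skipped
        }
        return result.toString();
    }

    public static void main( String[] args ) {
        // HONIARA_BROTHERHOOD_GUESTHOUSE
        System.out.println( convert( "honiara brotherhood guesthouse1" ) );
    }
}
```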

 

final class java.lang.Character

implements Serializable, Comparable<Character>

  • static char toUpperCase(char ch)
  • static char toLowerCase(char ch)

The static methods return the matching uppercase or lowercase letter.

 

Note: The methods toUpperCase(...) and toLowerCase(...) exist twice: once as static methods on Character, where they accept exactly one char as an argument, and once as object methods on String objects. Take care with Character.toUpperCase('ß'): the result is “ß,” whereas the String method "ß".toUpperCase() returns “SS,” that is, a string that is one character longer. Even though Unicode now has an uppercase “ẞ” (U+1E9E), Java still returns U+00DF, not U+1E9E, for Character.toUpperCase('ß').
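The difference described in the note can be checked with a few lines. The sketch below writes ß as the escape \u00DF to stay independent of the source file encoding and passes Locale.ROOT so the result doesn’t depend on the default locale; the class and method names are made up for this example.

```java
import java.util.Locale;

class SharpSDemo {

    /** Character variant: always maps one char to exactly one char. */
    static char upperChar( char c ) {
        return Character.toUpperCase( c );
    }

    /** String variant: the result may be longer than the input. */
    static String upperString( String s ) {
        return s.toUpperCase( Locale.ROOT );
    }

    public static void main( String[] args ) {
        char sharpS = '\u00DF';                                    // the letter ß
        System.out.println( upperChar( sharpS ) == sharpS );       // true, ß stays ß
        System.out.println( upperString( "\u00DF" ) );             // SS
        System.out.println( upperString( "\u00DF" ).length() );    // 2
    }
}
```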

 

From Character to String

To convert a Unicode character to a string, you can use the overloaded static String method valueOf(char). A comparable method also exists in Character, namely, the static method toString(char). Both methods are limited to characters that fit into a single 2-byte char. The static method Character.toString(int), available since Java 11, creates a string for any Unicode code point, and so, Character.toString(128123) results in a string with a ghost.
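A short sketch of the three conversions follows. It assumes Java 11 or later because of Character.toString(int); the wrapper fromCodePoint and the class name are hypothetical. Note that the ghost needs two char values in the resulting string even though it’s a single code point.

```java
class CharToStringDemo {

    /** Hypothetical wrapper around Character.toString(int), which requires Java 11+. */
    static String fromCodePoint( int codePoint ) {
        return Character.toString( codePoint );
    }

    public static void main( String[] args ) {
        String a = String.valueOf( 'A' );
        String b = Character.toString( 'A' );
        System.out.println( a.equals( b ) );                             // true

        String ghost = fromCodePoint( 128123 );                          // U+1F47B
        System.out.println( ghost.length() );                            // 2
        System.out.println( ghost.codePointCount( 0, ghost.length() ) ); // 1
    }
}
```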

 

From char to int: From Character to Number*

When characters come from a user input, you are often required to convert them to numbers. The digit '5' is to become the numeric value 5. According to old hacker traditions, the solution was always to subtract the value of '0'. The ASCII zero '0' has the char value 48, and '1' then has the value 49, until '9' finally reaches 57. Logically, '5' - '0' = 53 - 48 = 5. The solution has the disadvantage of only working for ASCII digits.
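The subtraction trick can be sketched as follows. The range check and the return value -1 for non-ASCII-digit input are assumptions added for this example; the method name asciiDigitValue is made up.

```java
class AsciiDigit {

    /**
     * Returns the value of an ASCII digit by subtracting '0',
     * or -1 (a convention chosen for this sketch) for any other character.
     */
    static int asciiDigitValue( char c ) {
        return ( c >= '0' && c <= '9' ) ? c - '0' : -1;
    }

    public static void main( String[] args ) {
        System.out.println( asciiDigitValue( '5' ) );      // 5, i.e., 53 - 48
        System.out.println( asciiDigitValue( 'x' ) );      // -1
        // A Devanagari digit five is a digit, but not an ASCII digit
        System.out.println( asciiDigitValue( '\u096B' ) ); // -1
    }
}
```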

 

A neat Java solution is to convert the char to a string and then parse the string with an Integer method, for example, in the following way:

 

char c = '5';

int i = Integer.parseInt( String.valueOf(c) ); // 5

 

The parseInt(...) method is fully internationalized and also converts decimal numbers from other scripts, such as Hindi/Sanskrit:

 

System.out.println( Integer.parseInt( "\u096B" ) ); // 5 (Devanagari digit five)

 

This method works but isn’t efficient for single characters in loops. Two other ways, using static methods from the Character class, are available.

The getNumericValue(…) Method

The Character method getNumericValue(char) returns the numeric value of a digit. Of course, this method has been internationalized too. Consider the following example:

 

int i = Character.getNumericValue( '5' );
System.out.println( i ); // 5
System.out.println( Character.getNumericValue( '\u096B' ) ); // 5 (Devanagari digit five)

 

The method is much more powerful than parseInt(...) because it knows the numeric value of every Unicode character that has one, including, for example, the Roman numerals (I, II, III, IV, V, VI, VII, VIII, IX, X, XI, XII, L, C, D, and M), which Unicode places starting at \u2160:

 

System.out.println( Character.getNumericValue( '\u216f' ) ); // 1000

 

The Integer.parseInt(...) method can’t handle \u216f; thus, Integer.parseInt("\u216f") throws a NumberFormatException.
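The contrast between the two methods can be sketched like this. The helper parses(...) is hypothetical and only reports whether Integer.parseInt(...) accepted the input.

```java
class NumericValueDemo {

    /** Hypothetical helper: true if Integer.parseInt accepts the string. */
    static boolean parses( String s ) {
        try {
            Integer.parseInt( s );
            return true;
        }
        catch ( NumberFormatException e ) {
            return false;
        }
    }

    public static void main( String[] args ) {
        System.out.println( Character.getNumericValue( '\u216f' ) ); // 1000
        System.out.println( Character.getNumericValue( '?' ) );      // -1, no numeric value
        System.out.println( parses( "5" ) );                         // true
        System.out.println( parses( "\u216f" ) );                    // false
    }
}
```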

The *digit(...) Methods

The Character class also has conversion methods for digits with respect to any base, and vice versa.

 

final class java.lang.Character

implements Serializable, Comparable<Character>

 

  • static int digit(char ch, int radix)

Returns the numeric value that the character ch has in the base radix; base 10 is the most common. For example, Character.digit('f', 16) is equal to 15. Any number system with a base between Character.MIN_RADIX (2) and Character.MAX_RADIX (36) is allowed. If no conversion is possible, the return value is -1.

  • static char forDigit(int digit, int radix)

Converts a numeric value to a character. For example, Character.forDigit(6, 8) is “6,” and Character.forDigit(12, 16) is “c.”
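The two methods are inverses of each other for valid digits, which the following sketch checks. The class DigitDemo and the method roundTrips(...) are invented for this example.

```java
class DigitDemo {

    /** Checks that digit(forDigit(d, radix), radix) == d for every digit of the base. */
    static boolean roundTrips( int radix ) {
        for ( int d = 0; d < radix; d++ ) {
            char c = Character.forDigit( d, radix );
            if ( Character.digit( c, radix ) != d )
                return false;
        }
        return true;
    }

    public static void main( String[] args ) {
        System.out.println( Character.digit( 'f', 16 ) );    // 15
        System.out.println( Character.digit( 'F', 16 ) );    // 15, case doesn't matter
        System.out.println( Character.digit( '9', 8 ) );     // -1, not an octal digit
        System.out.println( Character.forDigit( 12, 16 ) );  // c
        System.out.println( roundTrips( Character.MAX_RADIX ) ); // true
    }
}
```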

 

Example: The following example converts a string of digits into an integer:

 

char[] chars = { '3', '4', '0' };
int result = 0;
for ( char c : chars ) {
    result = result * 10 + Character.digit( c, 10 );
    System.out.println( result );
}

The output is 3, 34, and 340.
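The same idea can be generalized to any base and extended with a check for invalid characters. The following is a sketch under assumptions: the names DigitParser and parse are hypothetical, throwing a NumberFormatException is a convention chosen here, and overflow isn’t detected.

```java
class DigitParser {

    /**
     * Converts a string of digits in the given base to an int.
     * Sketch only: no sign handling and no overflow check.
     */
    static int parse( String digits, int radix ) {
        if ( digits.isEmpty() )
            throw new NumberFormatException( "Empty input" );

        int result = 0;
        for ( char c : digits.toCharArray() ) {
            int value = Character.digit( c, radix );
            if ( value == -1 )   // digit(...) signals an invalid character with -1
                throw new NumberFormatException( "Not a digit in base " + radix + ": " + c );
            result = result * radix + value;
        }
        return result;
    }

    public static void main( String[] args ) {
        System.out.println( parse( "340", 10 ) ); // 340
        System.out.println( parse( "ff", 16 ) );  // 255
        System.out.println( parse( "101", 2 ) );  // 5
    }
}
```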

 

Editor’s note: This post has been adapted from a section of the book Java: The Comprehensive Guide by Christian Ullenboom.
