Monday, April 5, 2010

Java String.trim() not trimming Whitespace!

The problem:

I was using String.trim() to remove leading and trailing whitespace from an imported string.
However, it turns out that trim() only strips characters with a code of 32 (the ASCII space) or lower.
That is, any whitespace character above that range, such as the non-breaking space, will still persist!
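
A minimal sketch of the behaviour, in case it helps to see it in isolation. The class name NbspTrimDemo and the helper survivesTrim are made up for this illustration; only trim(), Character.isWhitespace() and Character.isSpaceChar() come from the JDK.

```java
class NbspTrimDemo {
    // Hypothetical helper: true if trim() leaves the string unchanged.
    static boolean survivesTrim(String s) {
        return s.trim().equals(s);
    }

    public static void main(String[] args) {
        String ascii = " hello ";            // padded with ASCII spaces (code 32)
        String nbsp = "\u00A0hello\u00A0";   // padded with non-breaking spaces (code 160)

        System.out.println(ascii.trim().length()); // prints 5: both spaces removed
        System.out.println(nbsp.trim().length());  // prints 7: nothing removed

        // Java does not count U+00A0 as whitespace, but it is a Unicode space character.
        System.out.println(Character.isWhitespace('\u00A0')); // prints false
        System.out.println(Character.isSpaceChar('\u00A0'));  // prints true
    }
}
```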

Reference: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4080617

Diagnosis:

Upon inspecting each character of the string, the offending whitespace character returned a value of 160 (U+00A0, the non-breaking space).
The code to inspect the string is below:
        // Print the numeric code of every character in the string
        for (int i = 0; i < string160.length(); i++) {
            char c = string160.charAt(i);
            int intC = (int) c;
            System.out.println(intC);
        }


Solution:

The solution was to remove all instances of this troublesome character, including any in the middle of the string, with the following code:

string160 = string160.replaceAll("[\\u00A0]", "");
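
If the non-breaking spaces inside the string need to be kept, one alternative is to strip them only from the two ends. This is a rough sketch and not what I actually used; the class NbspTrim and the methods isTrimmable and trimWithNbsp are hypothetical names, and it assumes U+00A0 is the only extra character that should be treated like whitespace.

```java
class NbspTrim {
    // Assumption: trim everything String.trim() would, plus the non-breaking space.
    static boolean isTrimmable(char c) {
        return c <= ' ' || c == '\u00A0';
    }

    // Removes trimmable characters from both ends and leaves the middle untouched.
    static String trimWithNbsp(String s) {
        int start = 0;
        int end = s.length();
        while (start < end && isTrimmable(s.charAt(start))) {
            start++;
        }
        while (end > start && isTrimmable(s.charAt(end - 1))) {
            end--;
        }
        return s.substring(start, end);
    }
}
```

With this, string160 = NbspTrim.trimWithNbsp(string160); would drop the leading and trailing 160s but keep any that sit between words.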
