Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

I'm currently looking into string concatenation options and the penalty they impose on overall performance. My test case produces results that blow my mind, and I'm not sure whether I'm overlooking something.

Here is the deal: concatenating strings with + in Java (for example something + somethingElse, where the operands are not compile-time constants) is compiled into code that creates a new StringBuilder every time the expression runs.
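As a rough illustration of that claim, the sketch below puts a += concatenation next to the explicit StringBuilder form that a JDK 7 javac approximately emits for it. The class and method names are made up for this example, and newer JDKs (9 and later) compile + differently, so treat this as an approximation rather than the exact bytecode.

```java
public class PlusDesugarSketch {

    // What the source code says.
    public static String withPlus(String everything, String line) {
        everything += line;
        return everything;
    }

    // Approximately what a JDK 7 javac generates for the method above
    // (an assumption about that compiler, not a guarantee for every JDK):
    // a fresh StringBuilder per concatenation, then a copy back into a String.
    public static String withExplicitBuilder(String everything, String line) {
        everything = new StringBuilder().append(everything).append(line).toString();
        return everything;
    }

    public static void main(String[] args) {
        System.out.println(withPlus("Lorem ", "ipsum"));
        System.out.println(withExplicitBuilder("Lorem ", "ipsum"));
    }
}
```

Both methods return the same string; the point is only that the short form hides an allocation and two copies on every call.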

For my test-case, I'm loading a file from my HDD that has 1661 lines of example data (classic "Lorem Ipsum"). This question is not about the I/O performance, but about the performance of the different string concat methods.

import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;

public class InefficientStringConcat {

    public static void main(String[] args) throws Exception {
        // Get a file with example data:

        System.out.println("Starting benchmark");
        // Read and measure:
        for (int i = 0; i < 10; i++) {
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(new FileInputStream(new File("data.txt")))
            );

            long start = System.currentTimeMillis();
            // Un-comment method to test:
            //inefficientRead(in);
            //betterRead(in);
            long end = System.currentTimeMillis();
            // Prefix the run number so the output matches the results below.
            System.out.println("#" + (i + 1) + " Took " + (end - start) + "ms");

            in.close();
        }
    }

    // Appends every line to a single StringBuilder.
    public static String betterRead(BufferedReader in) throws IOException {
        StringBuilder b = new StringBuilder();
        String line;
        while ((line = in.readLine()) != null) {
            b.append(line);
        }
        return b.toString();
    }

    // Builds the result with repeated += on a String.
    public static String inefficientRead(BufferedReader in) throws IOException {
        String everything = "", line;
        while ((line = in.readLine()) != null) {
            everything += line;
        }
        return everything;
    }
}

As you can see, the setup is the same for both tests. Here are the results:

Using the inefficientRead() method:

Starting benchmark
#1 Took 658ms
#2 Took 590ms
#3 Took 569ms
#4 Took 567ms
#5 Took 562ms
#6 Took 570ms
#7 Took 563ms
#8 Took 568ms
#9 Took 560ms
#10 Took 568ms

Using the betterRead() method:

Starting benchmark
#1 Took 42ms
#2 Took 10ms
#3 Took 5ms
#4 Took 7ms
#5 Took 16ms
#6 Took 3ms
#7 Took 4ms
#8 Took 5ms
#9 Took 5ms
#10 Took 13ms

I'm running the tests with no extra parameters to the java-command. I'm running a MacMini3,1 from early 2009 and Sun JDK 7:

[luke@BlackBox ~]$ java -version
java version "1.7.0_09"
Java(TM) SE Runtime Environment (build 1.7.0_09-b05)
Java HotSpot(TM) Client VM (build 23.5-b02, mixed mode)

This strikes me as a very large difference. Am I doing something wrong in measuring this, or is this supposed to happen?


1 Answer

Am I doing something wrong in measuring this, or is this supposed to happen?

It's supposed to happen. Constructing a long string using repeated string concatenation is a known performance anti-pattern: every concatenation has to create a new string with a copy of the original string and also a copy of the additional string. You end up with O(N²) performance. When you use StringBuilder, most of the time you're just copying the additional string into a buffer. Occasionally the buffer will run out of space and need to be expanded (by copying the existing data into a new buffer), but that doesn't happen often (due to the buffer expansion strategy).
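To make the copying argument concrete, here is a small counting model rather than a timing test. It tallies how many characters get copied when N equal-sized pieces are joined by repeated concatenation versus appended to a growable buffer. The class name, the 100-character line length, and the doubling growth policy are all assumptions for illustration; the real StringBuilder growth strategy is an implementation detail and may differ.

```java
public class ConcatCopyCount {

    // Characters copied when building the result with `s += piece`:
    // every step writes a brand-new string holding the old contents plus the piece.
    public static long copiesWithConcat(int pieces, int pieceLength) {
        long copied = 0;
        long currentLength = 0;
        for (int i = 0; i < pieces; i++) {
            currentLength += pieceLength;
            copied += currentLength; // the whole new string is written
        }
        return copied;
    }

    // Characters copied with a buffer that doubles its capacity when full
    // (a simplified model, not the exact StringBuilder policy).
    public static long copiesWithBuffer(int pieces, int pieceLength) {
        long copied = 0;
        long capacity = 16;
        long length = 0;
        for (int i = 0; i < pieces; i++) {
            if (length + pieceLength > capacity) {
                while (length + pieceLength > capacity) {
                    capacity *= 2;
                }
                copied += length; // existing data moves into the bigger buffer once
            }
            copied += pieceLength; // the new piece is appended
            length += pieceLength;
        }
        return copied;
    }

    public static void main(String[] args) {
        int pieces = 1661;     // line count from the question
        int pieceLength = 100; // assumed average line length
        System.out.println("concat copies: " + copiesWithConcat(pieces, pieceLength));
        System.out.println("buffer copies: " + copiesWithBuffer(pieces, pieceLength));
    }
}
```

In this model the concatenation count grows with the square of the number of pieces (pieceLength * N * (N + 1) / 2), while the buffer count stays within a small constant multiple of the final length.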

See my article on string concatenation for details. It's a very old article, so it predates StringBuilder, but the fundamentals haven't changed. (Basically, StringBuilder is like StringBuffer, but without synchronization.)
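If you want to confirm that the gap comes from the concatenation itself and not from file I/O, one option is to time both approaches on lines that are already in memory. The following is a minimal sketch under that assumption: the class name, the helper methods, and the generated sample lines (a stand-in for data.txt) are hypothetical, and a plain System.nanoTime() loop like this is still a crude measurement, not a rigorous benchmark.

```java
import java.util.ArrayList;
import java.util.List;

public class InMemoryConcatTiming {

    // Same logic as inefficientRead(), but over lines already in memory.
    public static String concatWithPlus(List<String> lines) {
        String everything = "";
        for (String line : lines) {
            everything += line;
        }
        return everything;
    }

    // Same logic as betterRead(), but over lines already in memory.
    public static String concatWithBuilder(List<String> lines) {
        StringBuilder b = new StringBuilder();
        for (String line : lines) {
            b.append(line);
        }
        return b.toString();
    }

    // Hypothetical stand-in for the contents of data.txt.
    public static List<String> sampleLines(int count) {
        List<String> lines = new ArrayList<String>();
        for (int i = 0; i < count; i++) {
            lines.add("Lorem ipsum dolor sit amet, consectetur adipiscing elit " + i);
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> lines = sampleLines(1661);
        for (int run = 1; run <= 10; run++) {
            long t0 = System.nanoTime();
            String a = concatWithPlus(lines);
            long t1 = System.nanoTime();
            String b = concatWithBuilder(lines);
            long t2 = System.nanoTime();
            System.out.println("#" + run
                    + " plus: " + (t1 - t0) / 1000 + "us"
                    + ", builder: " + (t2 - t1) / 1000 + "us"
                    + ", same result: " + a.equals(b));
        }
    }
}
```

The first few runs include JIT warm-up, so the later iterations are the ones worth comparing.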

