Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
I understand that using a BufferedReader (wrapping a FileReader) is going to be significantly slower than using a BufferedInputStream (wrapping a FileInputStream), because the raw bytes have to be converted to characters. But I don't understand why it is so much slower! Here are the two code samples that I'm using:

BufferedInputStream inputStream = new BufferedInputStream(new FileInputStream(filename));
try {
  byte[] byteBuffer = new byte[bufferSize];
  int numberOfBytes;
  do {
    numberOfBytes = inputStream.read(byteBuffer, 0, bufferSize);
  } while (numberOfBytes >= 0);
}
finally {
  inputStream.close();
}

and:

BufferedReader reader = new BufferedReader(new FileReader(filename), bufferSize);
try {
  char[] charBuffer = new char[bufferSize];
  int numberOfChars;
  do {
    numberOfChars = reader.read(charBuffer, 0, bufferSize);
  } while (numberOfChars >= 0);
}
finally {
  reader.close();
}

I've tried tests using various buffer sizes, all with a 150 megabyte file. Here are the results (buffer size is in bytes; times are in milliseconds):

Buffer   Input
  Size  Stream  Reader
 4,096    145     497
 8,192    125     465
16,384     95     515
32,768     74     506
65,536     64     531

As can be seen, the fastest time for the BufferedInputStream (64 ms) is seven times faster than the fastest time for the BufferedReader (465 ms). As I stated above, I don't have an issue with a significant difference; but this much difference just seems unreasonable.

My question is: does anyone have a suggestion for how to improve the performance of the BufferedReader, or an alternative mechanism?

1 Answer

The BufferedReader has to convert the bytes into chars. This byte-by-byte parsing and copying into a wider type is expensive relative to a straight copy of blocks of data.

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

byte[] bytes = new byte[150 * 1024 * 1024];
Arrays.fill(bytes, (byte) '\n');  // fill with a single repeated character

for (int i = 0; i < 10; i++) {
    long start = System.nanoTime();
    StandardCharsets.UTF_8.decode(ByteBuffer.wrap(bytes));  // bulk byte-to-char decode
    long time = System.nanoTime() - start;
    System.out.printf("Time to decode %,d MB was %,d ms%n",
            bytes.length / 1024 / 1024, time / 1000000);
}

prints

Time to decode 150 MB was 226 ms
Time to decode 150 MB was 167 ms

NOTE: Interleaving this decoding with system calls can slow down both operations, as the system calls can disturb the CPU caches.
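As for an alternative mechanism: one option is to read the raw bytes in fast block reads and do the byte-to-char conversion in a single bulk decode afterwards, instead of decoding as you stream. A minimal sketch, assuming a UTF-8 file whose path is supplied by the caller (class and method names here are illustrative, not from the question):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class BulkDecode {
    // Read the whole file as bytes, then convert bytes to chars in one pass.
    static CharBuffer readAndDecode(Path file) throws IOException {
        byte[] bytes = Files.readAllBytes(file);  // single fast block read
        return StandardCharsets.UTF_8.decode(ByteBuffer.wrap(bytes));  // single bulk decode
    }

    public static void main(String[] args) throws IOException {
        Path file = Path.of(args[0]);  // e.g. the 150 MB test file
        long start = System.nanoTime();
        CharBuffer chars = readAndDecode(file);
        long time = System.nanoTime() - start;
        System.out.printf("Read and decoded %,d chars in %,d ms%n",
                chars.length(), time / 1_000_000);
    }
}
```

The trade-off is that the whole file is held in memory at once, so this only suits files that comfortably fit in the heap; for larger inputs you would decode in chunks with a CharsetDecoder instead.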

