Archive for October, 2006

Optimization 33: Input Buffering

Sunday, October 29th, 2006

We now reverse direction, from output streams to input streams. The issues are analogous, although with a different twist, and the most important one is still buffering. We'll continue driving the discussion with the same program we used earlier, only this time, instead of writing Trade records to a file, we will read them back in from a file. The code for reading the Trade records is:

    private void in() throws IOException {
        Trade buy = null;
        Trade sell = null;
        String line = null;

        DataInputStream dis = new DataInputStream(
            new FileInputStream("stock.dat"));

        long start = System.currentTimeMillis(); // <+++ Start timing
        while ((line = dis.readLine()) != null) { // 100,000 iterations
        }
        long stop = System.currentTimeMillis(); // <+++ Stop timing

        System.out.println(" in time = " + (stop - start));
        dis.close();
    }

When we created this file, we used the println() method, so the Trade records are separated by a newline character. We therefore need an input stream that lets us read one line at a time, and DataInputStream is one such stream. Notice that the in() method as it stands above does not perform any buffering. Reading a file of 100,000 Trade records with this method took 40 seconds. (Don't compare this directly to our previous output measurements. The output tests wrote a million records to the file, while here we read only 100,000.) This is pretty slow: about 10 times slower than writing the same number of records out to the file.

The problem with the code above is that it truly reads one byte at a time. That is not the fault of the readLine() method. readLine() must inspect every character for line termination, but that does not mean the characters must be physically read one at a time. The real problem is that the DataInputStream is wrapped directly around a nonbuffered input stream, FileInputStream. The FileInputStream is simple and stupid: if you ask for a single byte, a physical read of a single byte takes place. What we really want is a logical read of a single byte backed by physical reads of large chunks, so that we minimize the overhead per byte. You get exactly this result if you insert a BufferedInputStream between the DataInputStream and the FileInputStream. So, let's do that:

    DataInputStream dis = new DataInputStream(
        new BufferedInputStream(
            new FileInputStream("stock.dat")));

That's a one-line change. The rest of the code remains intact, demonstrating the beauty of the Java I/O library design. Performance of this version improved by a whopping factor of almost 9, from 40 seconds to 4.7 seconds. See Figure 5.5.

Figure 5.5. Input buffering is highly recommended
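
The difference between logical and physical reads can be made visible without a stopwatch. The following is a minimal sketch, not the book's test program: CountingInputStream and sourceCalls() are hypothetical names introduced here, and an in-memory byte array stands in for stock.dat. It counts how many read calls actually reach the underlying source when the consumer asks for one byte at a time.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class InputBufferingSketch {

    /** Hypothetical helper: counts the read calls that reach the underlying source. */
    static final class CountingInputStream extends FilterInputStream {
        long calls = 0;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    /**
     * Consumes the data with single-byte read() calls, the way a line scanner
     * would, and returns how many read calls reached the source.
     * A bufferSize of 0 means no BufferedInputStream is placed in between.
     */
    static long sourceCalls(byte[] data, int bufferSize) {
        CountingInputStream source =
            new CountingInputStream(new ByteArrayInputStream(data));
        InputStream in = bufferSize > 0
            ? new BufferedInputStream(source, bufferSize)
            : source;
        try {
            while (in.read() != -1) {
                // logical read of one byte at a time
            }
            in.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return source.calls;
    }

    public static void main(String[] args) {
        byte[] data = new byte[100_000]; // stand-in for the contents of stock.dat
        System.out.println("unbuffered source reads = " + sourceCalls(data, 0));
        System.out.println("buffered source reads   = " + sourceCalls(data, 8192));
    }
}
```

With a real file each of those source reads is a trip to the operating system, which is where the per-byte overhead discussed above would come from. The 8192-byte buffer size is an arbitrary choice for the sketch.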

Optimization 32: Prefer Byte Stream to Unicode

Sunday, October 29th, 2006

You could split the output streams into roughly two sets: those that handle Unicode strings, called writers, and the rest, which handle byte arrays. When you send Unicode characters down a writer stream, they eventually reach an underlying sink that may very well be a socket or a file. The problem with those end points is that they don't "talk" Unicode, they "talk" bytes. Somewhere along the way, your Unicode characters must be converted to a byte stream, and that conversion is expensive (see Optimization 4). The writers basically sit on top of more primitive byte streams. The OutputStreamWriter is the bridge class that glues the two sets together by converting a Unicode stream to a byte stream. When you define a FileWriter, as in

    FileWriter fw = new FileWriter("stock.dat");

what you really get is equivalent to

    OutputStreamWriter osw = new OutputStreamWriter(
        new FileOutputStream("stock.dat"));

The writer streams provide a more convenient interface to work with and, as is often the case, convenience trades off against efficiency. Somebody, somewhere, has got to do the work on our behalf, and it is usually not as fast as what we could do ourselves. Let's compare a buffered file writer to a buffered file output stream. We have already measured the buffered FileOutputStream in the previous section. We used

    BufferedOutputStream fos = new BufferedOutputStream(
        new FileOutputStream("stock.dat"));

in the execution of one million iterations of the loop below:

    for (int i = 0; i < n; i++) { // 1,000,000 iterations
        fos.write(buy.toBytes());
        fos.write(sell.toBytes());
    }

This loop executed in 19 seconds. We now replace the stream with a writer:

    BufferedWriter fw = new BufferedWriter(
        new FileWriter("stock.dat"));

We also have to modify the test loop to fit the BufferedWriter interface:

    for (int i = 0; i < n; i++) {
        fw.write(buy.toString(), 0, buy.toString().length());
        fw.write(sell.toString(), 0, sell.toString().length());
    }

With this writer, the loop execution time increased to 34 seconds. See Figure 5.4.

Figure 5.4. An output stream is faster than a writer
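
To illustrate where the extra work sits, here is a small sketch under stated assumptions: RECORD is a hypothetical stand-in for a Trade record (the Trade class is not shown in the text), the sink is an in-memory ByteArrayOutputStream rather than a file, and the charset is fixed to US-ASCII for the example. Both paths deliver identical bytes to the sink. The byte-stream path encodes the record once up front, while the writer path pushes every write through the character-to-byte encoder inside OutputStreamWriter.

```java
import java.io.BufferedOutputStream;
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class WriterVersusStreamSketch {

    // Hypothetical stand-in for one Trade record.
    static final String RECORD = "01-01-99,buy,100,IBM,140\n";

    /** Byte-stream path: the record is encoded once, then written as raw bytes. */
    static byte[] viaByteStream(int n) {
        byte[] encoded = RECORD.getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (BufferedOutputStream out = new BufferedOutputStream(sink)) {
            for (int i = 0; i < n; i++) {
                out.write(encoded);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    /** Writer path: every write passes through the char-to-byte encoder. */
    static byte[] viaWriter(int n) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (BufferedWriter out = new BufferedWriter(
                new OutputStreamWriter(sink, StandardCharsets.US_ASCII))) {
            for (int i = 0; i < n; i++) {
                out.write(RECORD);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        byte[] fromStream = viaByteStream(3);
        byte[] fromWriter = viaWriter(3);
        System.out.println("same bytes on the sink: "
            + Arrays.equals(fromStream, fromWriter));
    }
}
```

Since the sink receives the same bytes either way, the writer's convenience buys nothing at the destination. Whether the encoding cost matters in a given program is something to measure rather than assume.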

Optimization 31: Don't Flush Prematurely

Sunday, October 29th, 2006

Flushing output data should not be an automatic reflex. There are some subtle issues that ought to be considered. As a general rule, you are better off buffering your output data, and you should flush only if the progress of the computation depends on it. For example, a socket client sending a request to a server must flush the request; otherwise, the data will get stuck in the output buffer and the client will hang. On the other hand, it would not be smart to flush bits and pieces of the request. The point is to find the right spot where flushing is necessary and to avoid doing it too often.

Take our application, for example. The only time you may consider flushing is at the end of a transaction. In our case, that occurs when you are done writing all the Trade objects. However, if the end of a transaction coincides with the closing of the stream, then you don't have to flush at all, because the close() call will perform a flush() for you. I have seen code in practice that performs periodic flushes unnecessarily. If the application produces enough data to exceed the buffer's capacity, you may want to leave flushing decisions to the buffered stream itself.

Let's introduce premature flushing into our code and see what happens. We will use a BufferedOutputStream but will flush it after every Trade record:

    private void out(int n) throws FileNotFoundException, IOException {
        Trade buy = new Trade("01-01-99", true, 100, "IBM", 140);
        Trade sell = new Trade("01-08-99", false, 100, "IBM", 150);

        BufferedOutputStream fos = new BufferedOutputStream(
            new FileOutputStream("stock.dat"));

        long start = System.currentTimeMillis(); // <+++ Start timing
        for (int i = 0; i < n; i++) { // 1,000,000 iterations
            fos.write(buy.toBytes());
            fos.flush();
            fos.write(sell.toBytes());
            fos.flush();
        }
        long stop = System.currentTimeMillis(); // <+++ Stop timing

        System.out.println(" out time = " + (stop - start));
        fos.close();
    }

The million iterations we use to measure this code executed in 38 seconds. See Figure 5.2.

Figure 5.2. Periodic flush() calls will hurt
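
A rough way to see why the flushes hurt is to count how many writes reach the underlying sink. The sketch below is an illustration only: CountingSink, writeRecords(), and the RECORD bytes are hypothetical stand-ins, and the sink is in memory rather than a file. It writes the same records through a 512-byte BufferedOutputStream with and without a flush() after each record.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class PrematureFlushSketch {

    /** Hypothetical sink that counts how many write calls reach it. */
    static final class CountingSink extends OutputStream {
        long writes = 0;
        long bytes = 0;

        @Override
        public void write(int b) {
            writes++;
            bytes++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            writes++;
            bytes += len;
        }
    }

    // Hypothetical stand-in for the bytes of one Trade record.
    static final byte[] RECORD =
        "01-01-99,buy,100,IBM,140\n".getBytes(StandardCharsets.US_ASCII);

    /** Writes n records through a 512-byte buffer, optionally flushing after each one. */
    static CountingSink writeRecords(int n, boolean flushEachRecord) {
        CountingSink sink = new CountingSink();
        try (BufferedOutputStream out = new BufferedOutputStream(sink, 512)) {
            for (int i = 0; i < n; i++) {
                out.write(RECORD);
                if (flushEachRecord) {
                    out.flush(); // forces the buffered bytes down to the sink now
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } // close() flushes whatever is still buffered
        return sink;
    }

    public static void main(String[] args) {
        CountingSink eager = writeRecords(1000, true);
        CountingSink lazy = writeRecords(1000, false);
        System.out.println("flush per record: " + eager.writes + " sink writes");
        System.out.println("flush on close:   " + lazy.writes + " sink writes");
    }
}
```

Both runs deliver the same number of bytes. Flushing after every record simply defeats the buffer, so each record becomes its own write to the sink.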

Optimization 30: Output Buffering

Sunday, October 29th, 2006

Buffering is about minimizing the overhead per byte. If you send data one byte at a time, you have to traverse the whole output machinery for each byte. If you chunk the data into larger blocks, you spread the fixed cost over a large number of bytes and minimize the amortized cost. This is nothing new; buffering output data enhances efficiency in every programming language I have ever worked with. The Java I/O stream library makes buffering very easy. All you need to do is wrap a buffering stream around your intended output stream. We will demonstrate the power of buffering by writing Trade records to a file using a FileOutputStream with and without buffering. The first version uses a FileOutputStream with no buffering:

    private void out(int n) throws FileNotFoundException, IOException {
        Trade buy = new Trade("01-01-99", true, 100, "IBM", 140);
        Trade sell = new Trade("01-08-99", false, 100, "IBM", 150);

        FileOutputStream fos = new FileOutputStream("stock.dat");

        long start = System.currentTimeMillis(); // <+++ Start timing
        for (int i = 0; i < n; i++) { // 1,000,000 iterations
            fos.write(buy.toBytes());
            fos.write(sell.toBytes());
        }
        long stop = System.currentTimeMillis(); // <+++ Stop timing

        System.out.println(" out time = " + (stop - start));
        fos.close();
    }

The loop under test iterates one million times, writing buy and sell Trade representations to the file. I bought 100 shares and dumped them a week later for a hefty gain. They happen to be the same trades over and over, but that's okay. We are only trying to give the stream a workout. The test loop executed in 33 seconds. In the second test we simply decorated the FileOutputStream with a BufferedOutputStream, as in

    BufferedOutputStream fos = new BufferedOutputStream(
        new FileOutputStream("stock.dat"));

That's it. The rest of the code remains the same. The execution time of the loop dropped to 19 seconds.

Without buffering, every Trade was flushed out to the file, roughly every 50 bytes. The default buffer size of BufferedOutputStream is 512 bytes, which allowed us to flush multiple Trade records at once and improve efficiency. Typically, flushing one byte at a time will be very costly. Efficiency improves as the size of the buffer grows. At some point you reach optimal efficiency, and beyond that you get diminishing returns or even a slight drop in efficiency. Trying to find that "sweet spot," we increased the buffer size from the default of 512 to 4,096:

    BufferedOutputStream fos = new BufferedOutputStream(
        new FileOutputStream("stock.dat"), 4096);

We also reduced it to 64 to get the other extreme. It turns out that the default of 512 was close enough to the "sweet spot." See Figure 5.1 (measurements are in seconds; lower is better).
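
The buffer-size experiment can be reproduced with a small harness along these lines. This is a sketch under assumptions, not the original test program: the class, the Result type, measure(), and the RECORD bytes are hypothetical, a temporary file replaces stock.dat, and the buffer sizes are passed explicitly because the default differs between JDK releases. Timings will vary widely with the machine, the file system, and the JVM, so treat the numbers as relative only.

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class OutputBufferSizeSketch {

    // Hypothetical stand-in for one Trade record (the text's records are about 50 bytes).
    static final byte[] RECORD =
        "01-01-99,true,100,IBM,140\n".getBytes(StandardCharsets.US_ASCII);

    /** Elapsed time and resulting file size for one run. */
    static final class Result {
        final long millis;
        final long fileBytes;

        Result(long millis, long fileBytes) {
            this.millis = millis;
            this.fileBytes = fileBytes;
        }
    }

    /** Writes n records to a temporary file; a bufferSize of 0 means no buffering. */
    static Result measure(int n, int bufferSize) {
        try {
            Path file = Files.createTempFile("stock", ".dat");
            try {
                long start = System.nanoTime();
                try (OutputStream out = open(file, bufferSize)) {
                    for (int i = 0; i < n; i++) {
                        out.write(RECORD);
                    }
                } // the timing includes close(), which flushes the last buffer
                long millis = (System.nanoTime() - start) / 1_000_000;
                return new Result(millis, Files.size(file));
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static OutputStream open(Path file, int bufferSize) throws IOException {
        OutputStream raw = new FileOutputStream(file.toFile());
        return bufferSize > 0 ? new BufferedOutputStream(raw, bufferSize) : raw;
    }

    public static void main(String[] args) {
        int[] sizes = {0, 64, 512, 4096}; // 0 = plain FileOutputStream
        for (int size : sizes) {
            Result r = measure(1_000_000, size);
            System.out.println("buffer " + size + ": " + r.millis + " ms, "
                + r.fileBytes + " bytes written");
        }
    }
}
```

Running each size several times and discarding the first run helps reduce JVM warm-up noise before looking for a "sweet spot."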