Binary I/O Classes

paulike

Paul Ngugi

Posted on July 1, 2024

The abstract InputStream is the root class for reading binary data, and the abstract OutputStream is the root class for writing binary data. The design of the Java I/O classes is a good example of applying inheritance, where common operations are generalized in superclasses, and subclasses provide specialized operations.

[Figure: some of the classes for performing binary I/O, rooted at InputStream and OutputStream]

The figure above lists some of the classes for performing binary I/O. InputStream is the root for binary input classes, and OutputStream is the root for binary output classes. The figures below list all the methods in the classes InputStream and OutputStream. Almost all the methods in the binary I/O classes are declared to throw java.io.IOException or a subclass of java.io.IOException.

[Figure: methods of the InputStream class]

[Figure: methods of the OutputStream class]

FileInputStream/FileOutputStream

FileInputStream/FileOutputStream is for reading/writing bytes from/to files. The methods these classes use for reading and writing are inherited from InputStream and OutputStream; FileInputStream/FileOutputStream introduces no new methods for reading or writing data. To construct a FileInputStream, use the constructors shown in the figure below.

[Figure: FileInputStream constructors]

A java.io.FileNotFoundException is thrown if you attempt to create a FileInputStream for a nonexistent file.

To construct a FileOutputStream, use the constructors shown in the figure below.

[Figure: FileOutputStream constructors]

If the file does not exist, a new file will be created. If the file already exists, the first two constructors will delete the current content of the file. To retain the current content and append new data into the file, use the last two constructors and pass true to the append parameter.
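The effect of the append parameter can be checked with a small sketch. The class name AppendModeSketch and the method writeThree are hypothetical names introduced here for illustration; the sketch writes three bytes and reports the resulting file length.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

class AppendModeSketch {
    // Writes three bytes to fileName and returns the file length afterwards.
    // With append == false the existing content is discarded first;
    // with append == true the bytes are added to the end of the file.
    static long writeThree(String fileName, boolean append) throws IOException {
        try (FileOutputStream output = new FileOutputStream(fileName, append)) {
            output.write(new byte[] {1, 2, 3});
        }
        return new File(fileName).length();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(writeThree("temp.dat", false)); // file is truncated, then holds 3 bytes
        System.out.println(writeThree("temp.dat", true));  // 3 more bytes appended, 6 in total
    }
}
```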

Almost all the methods in the I/O classes throw java.io.IOException. Therefore, you must either declare java.io.IOException in the throws clause of the method or place the code in a try-catch block, as shown below:

[Figure: declaring java.io.IOException in the method header versus catching it in a try-catch block]
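As a rough text version of the two options, the following sketch shows one method that declares the exception and one that catches it. The class and method names are hypothetical, and the try-with-resources form used to close the stream is the one the article explains further on.

```java
import java.io.FileOutputStream;
import java.io.IOException;

class IOExceptionHandling {
    // Option 1: declare the exception and let the caller handle it.
    static void writeDeclaring(String fileName) throws IOException {
        try (FileOutputStream output = new FileOutputStream(fileName)) {
            output.write(65);
        }
    }

    // Option 2: catch the exception where it occurs.
    // Returns true if the byte was written, false if an I/O error occurred.
    static boolean writeCatching(String fileName) {
        try (FileOutputStream output = new FileOutputStream(fileName)) {
            output.write(65);
            return true;
        } catch (IOException ex) {
            System.err.println("I/O error: " + ex.getMessage());
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        writeDeclaring("temp.dat");
        System.out.println(writeCatching("temp.dat"));
    }
}
```

Which option fits depends on whether the method can do anything useful about the failure; if it cannot, declaring the exception keeps the decision with the caller.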

The code below uses binary I/O to write ten byte values from 1 to 10 to a file named temp.dat and reads them back from the file.

[Code listing: writing ten byte values to temp.dat and reading them back]

The program uses try-with-resources to declare and create the input and output streams so that they are automatically closed after they are used. The java.io.InputStream and java.io.OutputStream classes implement the AutoCloseable interface. The AutoCloseable interface defines the close() method that closes a resource. Any object of the AutoCloseable type can be used with the try-with-resources syntax for automatic closing.

A FileOutputStream is created for the file temp.dat in line 8. The for loop writes ten byte values into the file (lines 11–12). Invoking write(i) is the same as invoking write((byte)i). Line 17 creates a FileInputStream for the file temp.dat. Values are read from the file and displayed on the console in lines 20–22. The expression ((value = input.read()) != -1) (line 21) reads a byte from input.read(), assigns it to value, and checks whether it is –1. The input value of –1 signifies the end of a file.
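A text sketch of such a program might look like the following. It is not the original listing, so the line numbers cited above do not apply to it, and the class and method names are hypothetical.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

class FileStreamSketch {
    // Writes the byte values 1 through 10 to the given file.
    static void writeValues(String fileName) throws IOException {
        try (FileOutputStream output = new FileOutputStream(fileName)) {
            for (int i = 1; i <= 10; i++)
                output.write(i); // only the low-order byte of i is written
        }
    }

    // Reads every byte of the file and returns the values separated by spaces.
    static String readValues(String fileName) throws IOException {
        StringBuilder result = new StringBuilder();
        try (FileInputStream input = new FileInputStream(fileName)) {
            int value;
            while ((value = input.read()) != -1) // -1 signals the end of the file
                result.append(value).append(' ');
        }
        return result.toString().trim();
    }

    public static void main(String[] args) throws IOException {
        writeValues("temp.dat");
        System.out.println(readValues("temp.dat")); // prints 1 2 3 4 5 6 7 8 9 10
    }
}
```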

The file temp.dat created in this example is a binary file. It can be read from a Java program, but a text editor cannot display its contents in a readable form.

When a stream is no longer needed, always close it using the close() method or automatically close it using a try-with-resources statement. Not closing streams may cause data corruption in the output file, or other programming errors.

A relative file name is resolved against the current working directory, that is, the directory from which the program is run. For the example in this book, that directory is c:\book, so the file temp.dat is located at c:\book. If you wish to place temp.dat in a specific directory, replace the statement that creates the FileOutputStream with

FileOutputStream output = new FileOutputStream("directory/temp.dat");

An instance of FileInputStream can be used as an argument to construct a Scanner, and an instance of FileOutputStream can be used as an argument to construct a PrintWriter. You can create a PrintWriter to append text into a file using

new PrintWriter(new FileOutputStream("temp.txt", true));

If temp.txt does not exist, it is created. If temp.txt already exists, new data are appended to the file.
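Both wrappers can be tried together in a short sketch. The class AppendTextSketch and its methods are hypothetical names: one method appends a line through a PrintWriter, the other counts lines through a Scanner.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.Scanner;

class AppendTextSketch {
    // Appends one line of text to the file, creating the file if necessary.
    static void appendLine(String fileName, String line) throws IOException {
        try (PrintWriter out = new PrintWriter(new FileOutputStream(fileName, true))) {
            out.println(line);
        }
    }

    // Counts the lines in the file by reading it through a Scanner.
    static int countLines(String fileName) throws IOException {
        int count = 0;
        try (Scanner in = new Scanner(new FileInputStream(fileName))) {
            while (in.hasNextLine()) {
                in.nextLine();
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        appendLine("temp.txt", "first run");
        appendLine("temp.txt", "second run");
        System.out.println(countLines("temp.txt")); // grows by two on every run
    }
}
```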

FilterInputStream/FilterOutputStream

Filter streams are streams that filter bytes for some purpose. The basic byte input stream provides a read method that can be used only for reading bytes. If you want to read integers, doubles, or strings, you need a filter class to wrap the byte input stream. FilterInputStream and FilterOutputStream are the base classes for filtering data. When you need to process primitive numeric types, use DataInputStream and DataOutputStream to filter bytes.

DataInputStream/DataOutputStream

DataInputStream reads bytes from the stream and converts them into appropriate primitive-type values or strings. DataOutputStream converts primitive-type values or strings into bytes and outputs the bytes to the stream.

DataInputStream extends FilterInputStream and implements the DataInput interface, as shown in the figure below.

[Figure: DataInputStream extends FilterInputStream and implements DataInput]

DataOutputStream extends FilterOutputStream and implements the DataOutput interface, as shown in the figure below.

[Figure: DataOutputStream extends FilterOutputStream and implements DataOutput]

DataInputStream implements the methods defined in the DataInput interface to read primitive data-type values and strings. DataOutputStream implements the methods defined in the DataOutput interface to write primitive data-type values and strings. Primitive values are copied from memory to the output without any conversions. Characters in a string may be written in several ways, as discussed in the next section.

Characters and Strings in Binary I/O

A Java char is a 16-bit Unicode value and occupies two bytes. The writeChar(char c) method writes the two-byte Unicode value of character c to the output. The writeChars(String s) method writes the two-byte Unicode value of each character in the string s to the output. The writeBytes(String s) method writes only the lower byte of the Unicode value of each character in the string s to the output; the high byte is discarded. The writeBytes method is suitable for strings that consist of ASCII characters, since an ASCII code is stored only in the lower byte of a Unicode value. If a string contains non-ASCII characters, you have to use the writeChars method to write the string.

The writeUTF(String s) method writes two bytes of length information to the output stream, followed by the modified UTF-8 representation of every character in the string s. UTF-8 is a coding scheme that allows systems to operate with both ASCII and Unicode. Java uses Unicode, and the ASCII character set is a subset of the Unicode character set. Since many applications need only the ASCII character set, it is wasteful to represent a 7-bit ASCII character as a 16-bit Unicode character. The modified UTF-8 scheme stores a character using one, two, or three bytes. Characters are coded in one byte if their code is between 0x01 and 0x7F, in two bytes if their code is greater than 0x7F and less than or equal to 0x7FF (the null character 0x00 is also coded in two bytes), or in three bytes if their code is greater than 0x7FF.

The initial bits of the first byte of a UTF-8 character indicate whether the character is stored in one byte, two bytes, or three bytes. If the first bit is 0, it is a one-byte character. If the first bits are 110, it is the first byte of a two-byte sequence. If the first bits are 1110, it is the first byte of a three-byte sequence. The number of bytes in the encoded string is stored in the two bytes preceding the UTF-8 characters. For example, writeUTF("ABCDEF") actually writes eight bytes (i.e., 00 06 41 42 43 44 45 46) to the file, because the first two bytes store the encoded length, which is 6.

The writeUTF(String s) method converts a string into a series of bytes in the modified UTF-8 format and writes them into an output stream. The readUTF() method reads a string that has been written using the writeUTF method.

The UTF-8 format has the advantage of saving a byte for each ASCII character, because a Unicode character takes up two bytes and an ASCII character in UTF-8 only one byte. If most of the characters in a long string are regular ASCII characters, using UTF-8 is more efficient.
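The byte counts described in this section can be inspected directly. The sketch below is an illustration rather than part of the original article: it writes into a ByteArrayOutputStream (used here only so the bytes can be examined in memory), and the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class StringEncodingSketch {
    // Returns the bytes produced by writeUTF: a 2-byte length plus modified UTF-8.
    static byte[] viaWriteUTF(String s) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(s);
        }
        return bytes.toByteArray();
    }

    // Returns the bytes produced by writeChars: two bytes per character.
    static byte[] viaWriteChars(String s) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeChars(s);
        }
        return bytes.toByteArray();
    }

    // Returns the bytes produced by writeBytes: the low byte of each character.
    static byte[] viaWriteBytes(String s) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeBytes(s);
        }
        return bytes.toByteArray();
    }

    // Formats a byte array as space-separated two-digit hexadecimal values.
    static String hex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data)
            sb.append(String.format("%02X ", b & 0xFF));
        return sb.toString().trim();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(hex(viaWriteUTF("ABCDEF")));      // 00 06 41 42 43 44 45 46
        System.out.println(viaWriteChars("ABCDEF").length);  // 12
        System.out.println(viaWriteBytes("ABCDEF").length);  // 6
    }
}
```

For an ASCII string of six characters, writeUTF therefore costs eight bytes against twelve for writeChars; writeBytes is shortest at six bytes but loses the high byte of any non-ASCII character.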

Creating DataInputStream/DataOutputStream

DataInputStream/DataOutputStream are created using the following constructors:

public DataInputStream(InputStream instream)
public DataOutputStream(OutputStream outstream)

The following statements create data streams. The first statement creates an input stream for the file in.dat; the second statement creates an output stream for the file out.dat.

DataInputStream input = new DataInputStream(new FileInputStream("in.dat"));
DataOutputStream output = new DataOutputStream(new FileOutputStream("out.dat"));

The code below writes student names and scores to a file named temp.dat and reads the data back from the file.

[Code listing: TestDataStream.java]

A DataOutputStream is created for file temp.dat in line 8. Student names and scores are written to the file in lines 11–16. A DataInputStream is created for the same file in line 20. Student names and scores are read back from the file and displayed on the console in lines 23–25.
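A text sketch along the same lines is shown below. It is not the original TestDataStream.java, so the line numbers cited above do not apply to it; the class name, method names, and the sample names and scores are all illustrative assumptions.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

class DataStreamSketch {
    // Writes each name with writeUTF, followed by its score with writeDouble.
    static void writeStudents(String fileName, String[] names, double[] scores)
            throws IOException {
        try (DataOutputStream output =
                 new DataOutputStream(new FileOutputStream(fileName))) {
            for (int i = 0; i < names.length; i++) {
                output.writeUTF(names[i]);
                output.writeDouble(scores[i]);
            }
        }
    }

    // Reads count records back in the same order and format they were written.
    static String readStudents(String fileName, int count) throws IOException {
        StringBuilder result = new StringBuilder();
        try (DataInputStream input =
                 new DataInputStream(new FileInputStream(fileName))) {
            for (int i = 0; i < count; i++)
                result.append(input.readUTF()).append(' ')
                      .append(input.readDouble()).append('\n');
        }
        return result.toString();
    }

    public static void main(String[] args) throws IOException {
        String[] names = {"John", "Jim", "George"};   // sample data, assumed
        double[] scores = {85.5, 185.5, 105.25};      // sample data, assumed
        writeStudents("temp.dat", names, scores);
        System.out.print(readStudents("temp.dat", names.length));
    }
}
```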

DataInputStream and DataOutputStream read and write Java primitive-type values and strings in a machine-independent fashion, thereby enabling you to write a data file on one machine and read it on another machine that has a different operating system or file structure. An application uses a data output stream to write data that can later be read by a program using a data input stream.

DataInputStream filters data from an input stream into appropriate primitive-type values or strings. DataOutputStream converts primitive-type values or strings into bytes and outputs the bytes to an output stream. You can view DataInputStream/FileInputStream and DataOutputStream/FileOutputStream as working in a pipeline, as shown in the figure below.

[Figure: DataInputStream/FileInputStream and DataOutputStream/FileOutputStream working in a pipeline]

You have to read data in the same order and format in which they are stored. For example, since names are written in UTF-8 using writeUTF, you must read names using readUTF.

Detecting the End of a File

If you keep reading data past the end of a DataInputStream, an EOFException is thrown. This exception can be used to detect the end of a file, as shown in the code below.

[Code listing: detecting the end of a file with EOFException]

The program writes three double values to the file using DataOutputStream (lines 8–12) and reads the data using DataInputStream (lines 14–17). When reading past the end of the file, an EOFException is thrown. The exception is caught in line 19.
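The same idea can be sketched in text form as follows. This is not the original listing, so the line numbers above do not apply to it; the class name, method names, and the three sample values are assumptions made for illustration.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class EndOfFileSketch {
    // Writes the given double values to the file, eight bytes each.
    static void writeDoubles(String fileName, double... values) throws IOException {
        try (DataOutputStream output =
                 new DataOutputStream(new FileOutputStream(fileName))) {
            for (double v : values)
                output.writeDouble(v);
        }
    }

    // Reads doubles until readDouble() runs past the end of the file.
    static List<Double> readAll(String fileName) throws IOException {
        List<Double> values = new ArrayList<>();
        try (DataInputStream input =
                 new DataInputStream(new FileInputStream(fileName))) {
            while (true)
                values.add(input.readDouble());
        } catch (EOFException ex) {
            // Expected: all data has been read.
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        writeDoubles("test.dat", 4.5, 43.25, 3.2); // sample values, assumed
        System.out.println(readAll("test.dat"));   // prints [4.5, 43.25, 3.2]
    }
}
```

Catching EOFException separately from other IOExceptions keeps a genuine read failure from being mistaken for a normal end of file.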

BufferedInputStream/BufferedOutputStream

BufferedInputStream/BufferedOutputStream can be used to speed up input and output by reducing the number of disk reads and writes. Using BufferedInputStream, a whole block of data on the disk is read into a buffer in memory in a single operation. The individual data are then delivered to your program from the buffer, as shown in part (a) of the figure below. Using BufferedOutputStream, the individual data are first written to a buffer in memory. When the buffer is full, all data in the buffer are written to the disk in a single operation, as shown in part (b) of the figure below.

[Figure: (a) buffered input; (b) buffered output]

BufferedInputStream/BufferedOutputStream does not contain new methods. All the methods in BufferedInputStream/BufferedOutputStream are inherited from the InputStream/OutputStream classes. BufferedInputStream/BufferedOutputStream manages a buffer behind the scenes and automatically reads/writes data from/to disk on demand.

You can wrap a BufferedInputStream/BufferedOutputStream on any InputStream/OutputStream using the constructors shown in the figures below.

[Figure: BufferedInputStream constructors]

[Figure: BufferedOutputStream constructors]

If no buffer size is specified, a default size is used; the Java API does not fix its value, and current OpenJDK implementations use a buffer of up to 8192 bytes. You can improve the performance of the TestDataStream program in TestDataStream.java above by adding buffers to the streams in lines 8 and 20, as follows:

DataOutputStream output = new DataOutputStream(
    new BufferedOutputStream(new FileOutputStream("temp.dat")));
DataInputStream input = new DataInputStream(
    new BufferedInputStream(new FileInputStream("temp.dat")));

You should always use buffered I/O to speed up input and output. For small files, you may not notice performance improvements. However, for large files—over 100 MB—you will see substantial improvements using buffered I/O.
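To set the buffer size explicitly, pass it as the second constructor argument. The sketch below is a minimal illustration with hypothetical names; the 8192-byte size is an arbitrary choice for the example, not a recommendation from the article.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

class BufferedRoundTrip {
    static final int BUFFER_SIZE = 8192; // assumed size, chosen for illustration

    // Writes the integers 0 .. count-1 through a buffered data stream.
    static void writeInts(String fileName, int count) throws IOException {
        try (DataOutputStream output = new DataOutputStream(
                 new BufferedOutputStream(new FileOutputStream(fileName), BUFFER_SIZE))) {
            for (int i = 0; i < count; i++)
                output.writeInt(i);
        } // closing the stream flushes any bytes still held in the buffer
    }

    // Reads count integers back through a buffered data stream and sums them.
    static long sumInts(String fileName, int count) throws IOException {
        long sum = 0;
        try (DataInputStream input = new DataInputStream(
                 new BufferedInputStream(new FileInputStream(fileName), BUFFER_SIZE))) {
            for (int i = 0; i < count; i++)
                sum += input.readInt();
        }
        return sum;
    }

    public static void main(String[] args) throws IOException {
        writeInts("temp.dat", 1000);
        System.out.println(sumInts("temp.dat", 1000)); // prints 499500
    }
}
```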
