If an application performs disk operations, disk I/O should be monitored for possible performance issues. Some applications, such as databases, make heavy use of disk as a major part of their core functionality, and almost all applications write important information about their state or behavior to an application log as events occur. Disk I/O utilization is the most useful monitoring statistic for understanding application disk usage since it is a measure of active disk I/O time. Disk I/O utilization, along with system or kernel CPU utilization, can be monitored using iostat on Linux and Solaris.
To use iostat on Linux, the optional sysstat package must be installed. To monitor disk utilization on Windows Server systems, the Performance Monitor has several performance counters available under its Logical Disk performance object. On Solaris, iostat -xc shows disk utilization for each disk device on the system along with reporting CPU utilization. This command is useful for showing both disk utilization and system or kernel CPU utilization together. The following example shows a system that has three disks, sd0, sd2, and sd4, with disk I/O utilization of 22%, 13%, and 36%, respectively, along with 73% system or kernel CPU utilization.
The other statistics from iostat are not as important for application performance monitoring since they do not report a “busy-ness” indicator.
To monitor disk I/O utilization and system or kernel CPU utilization on Linux, you can use iostat -xm. The following is an example of iostat -xm from a Linux system showing 97% and 69% utilization for disks hda and hdb, respectively, along with 16% system or kernel CPU utilization. Columns reporting 0 values were removed from the output for ease of reading.
One of the challenges with monitoring disk I/O utilization is identifying which files are being read or written and which application is the source of the disk activity. Recent versions of Solaris 10 and Solaris 11 Express include several DTrace scripts in the /usr/demo/dtrace directory that can help monitor disk activity. The iosnoop.d DTrace script provides details such as which user id is accessing the disk, what process is accessing the disk, the size of the disk access, and the name of the file being accessed. The iosnoop.d script is also included in the Solaris DTraceToolkit, downloadable at http://www.solarisinternals.com/wiki/index.php/DTraceToolkit. The following is example output from executing iosnoop.d while launching NetBeans IDE. Many files are accessed during a NetBeans IDE launch, so the output is trimmed for brevity.
$ iosnoop.d
UID   PID  D    BLOCK  SIZE COMM     PATHNAME
97734 1617 R  4140430  1024 netbeans /techpaste/tmp/netbeans
97734 1617 R  4141518  1024 bash     /techpaste/tmp/netbeans/modules
97734 1617 R  4150956  1024 bash     /techpaste/tmp/netbeans/update
97734 1697 R  4143242  1024 java     /techpaste/tmp/netbeans/var
97734 1697 R  4141516  1024 java     /techpaste/tmp/netbeans/config
97734 1697 R  4143244  1024 java     /techpaste/tmp/netbeans/var/log
97734 1697 R  4153884  1024 java     /techpaste/tmp/netbeans/docs
97734 1697 R  4153884  1024 java     /techpaste/tmp/netbeans/docs
97734 1697 R  4153884  1024 java     /techpaste/tmp/netbeans/docs
97734 1697 R  4153884  1024 java     /techpaste/tmp/netbeans/docs
97734 1697 R  4153884  1024 java     /techpaste/tmp/netbeans/docs
97734 1697 R  4153884  1024 java     /techpaste/tmp/netbeans/docs
97734 1697 R  4153884  1024 java     /techpaste/tmp/netbeans/docs
97734 1697 R  4153884  1024 java     /techpaste/tmp/netbeans/docs
97734 1697 R 12830464  8192 java     /usr/jdk1.6.0/jre/lib/rt.jar
97734 1697 R 12830480 20480 java     /usr/jdk1.6.0/jre/lib/rt.jar
97734 1697 R 12830448  8192 java     /usr/jdk1.6.0/jre/lib/rt.jar
97734 1697 R 12830416  8192 java     /usr/jdk1.6.0/jre/lib/rt.jar
97734 1697 R 12830432  4096 java     /usr/jdk1.6.0/jre/lib/rt.jar
97734 1697 R 12828264  8192 java     /usr/jdk1.6.0/jre/lib/rt.jar
[... additional output removed ...]
The “UID” column reports the user id responsible for performing the disk access. The “PID” column is the process id of the process performing the disk access. The “D” column indicates whether the disk access is a read or a write: “R” = read, “W” = write. The “BLOCK” column is the disk block. The “SIZE” column is the amount of data accessed in bytes. The “COMM” column is the name of the command performing the disk access, and the “PATHNAME” column is the name of the file being accessed.
The pattern to look for in the output of iosnoop.d is repeated accesses to the same file and the same disk block by the same command, process id, and user id. For example, in the preceding output there are eight disk accesses of 1024 bytes on the same disk block, 4153884, which may indicate an optimization opportunity. It may be that the same information is being accessed multiple times. Rather than rereading the data from disk each time, the application may be able to keep the data in memory and reuse it, avoiding expensive disk reads. If the same data is not being accessed, it may be possible to read a larger block of data at once and reduce the number of disk accesses.
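The idea of keeping data in memory instead of rereading it can be sketched in Java roughly as follows. This is only an illustration under simple assumptions: the class name FileContentCache and its methods are hypothetical, whole files are cached, entries are never evicted or invalidated, and callers are trusted not to modify the returned array. A real application would need to decide how much memory to spend and when cached data becomes stale.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: reads each file from disk at most once and serves later requests from memory. */
public class FileContentCache {
    private final Map<Path, byte[]> cache = new ConcurrentHashMap<>();
    private final AtomicInteger diskReads = new AtomicInteger();

    /** Returns the file's bytes, reading from disk only on the first request for that path. */
    public byte[] get(Path file) {
        Path key = file.toAbsolutePath().normalize();
        return cache.computeIfAbsent(key, p -> {
            try {
                byte[] data = Files.readAllBytes(p); // the only place that touches the disk
                diskReads.incrementAndGet();
                return data;
            } catch (IOException e) {
                // computeIfAbsent cannot throw checked exceptions, so wrap it
                throw new UncheckedIOException(e);
            }
        });
    }

    /** Number of successful reads that actually went to the file system. */
    public int diskReads() {
        return diskReads.get();
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("cache-demo", ".txt");
        Files.write(tmp, "some documentation text".getBytes());

        FileContentCache cache = new FileContentCache();
        for (int i = 0; i < 8; i++) {
            cache.get(tmp); // eight lookups, as in the repeated accesses above
        }
        System.out.println("disk reads for 8 lookups: " + cache.diskReads()); // prints 1
        Files.delete(tmp);
    }
}
```

Whether such a cache pays off depends on how often the data changes and how large it is; the operating system's own file cache may already absorb some repeated reads.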
At a larger scale, if high disk I/O utilization is observed with an application, it may be worthwhile to further analyze the performance of your system’s disk I/O subsystem by looking more closely at its expected workload, disk service times, seek times, and the time spent servicing I/O events. If disk I/O utilization needs to be reduced, several strategies may help. At the hardware and operating system level, any of the following may help:
– A faster storage device
– Spreading file systems across multiple disks
– Tuning the operating system to cache larger amounts of file system data structures
At the application level, any strategy that minimizes disk activity will help, such as reducing the number of read and write operations by using buffered input and output streams, or integrating a caching data structure into the application to reduce or eliminate disk interaction. Buffered streams reduce the number of system calls to the operating system and consequently reduce system or kernel CPU utilization. This may not improve disk I/O performance, but it makes more CPU cycles available for other parts of the application or for other applications running on the system. The JDK provides buffered stream classes that are easy to use, such as java.io.BufferedOutputStream and java.io.BufferedInputStream.
An often overlooked aspect of disk performance is whether the disk cache is enabled. Some systems are installed and configured with the disk cache disabled. An enabled disk cache improves the performance of applications that rely heavily on disk I/O. However, use caution if you discover that a system’s default setting has the disk cache disabled: enabling the disk cache may result in corrupted data in the event of an unexpected power failure.
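To make the buffered-stream point concrete, here is a minimal Java sketch that counts how many write calls reach the underlying file stream with and without a java.io.BufferedOutputStream in front of it. The class BufferedWriteDemo, the helper CountingOutputStream, and the 8192-byte buffer size are assumptions made for illustration. The count of calls reaching the FileOutputStream is used as a rough stand-in for the number of system calls; it is not a measurement of disk or CPU utilization.

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class BufferedWriteDemo {

    /** Hypothetical helper: counts write calls that are passed on to the wrapped stream. */
    static final class CountingOutputStream extends FilterOutputStream {
        long calls;

        CountingOutputStream(OutputStream out) {
            super(out);
        }

        @Override
        public void write(int b) throws IOException {
            calls++;
            out.write(b);
        }

        @Override
        public void write(byte[] b, int off, int len) throws IOException {
            calls++;
            out.write(b, off, len); // pass the whole chunk through in one call
        }
    }

    /** Writes the given number of bytes one at a time and returns how many writes reached the file stream. */
    static long countWrites(Path file, int bytes, boolean buffered) throws IOException {
        CountingOutputStream counter = new CountingOutputStream(new FileOutputStream(file.toFile()));
        OutputStream target = buffered ? new BufferedOutputStream(counter, 8192) : counter;
        try (OutputStream out = target) {
            for (int i = 0; i < bytes; i++) {
                out.write('x'); // the application still issues one write per byte
            }
        } // closing flushes any bytes still held in the buffer
        return counter.calls;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("buffered-demo", ".log");
        int bytes = 100_000;
        System.out.println("unbuffered writes reaching the file: " + countWrites(tmp, bytes, false));
        System.out.println("buffered writes reaching the file:   " + countWrites(tmp, bytes, true));
        Files.delete(tmp);
    }
}
```

In the unbuffered case every single-byte write is passed straight to the file stream, while the buffered variant forwards data only when its buffer fills or the stream is closed, so far fewer calls reach the operating system for the same data.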