Linux I/O Performance Tests using dd

From Thomas-Krenn-Wiki

Under Linux, the dd command can be used for simple sequential I/O performance measurements. This article explains which parameters to use and why. For more detailed I/O performance benchmarking, the Flexible I/O Tester (Fio) can be used.

Basics

dd copies data at a low level.[1] In doing so, device files are often accessed directly. Beware that incorrect use of dd can quickly lead to data loss, so we strongly recommend performing the steps described below only on test systems.

Measuring Write Performance

Modern operating systems do not normally write files immediately to RAID systems or hard disks. RAM that is not currently in use serves as a cache for writes and reads (regarding this, see also Operating System Caches).

To keep these caches from distorting I/O performance measurements, use the oflag parameter. The following three flags are of interest (for details, see dd --help and Dd using direct or synchronized I/O):

  • direct (use direct I/O for data)
  • dsync (use synchronized I/O for data)
  • sync (likewise, but also for metadata)
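
The effect of the three flags can be compared by running the same small write once per flag. The following is a minimal sketch, assuming GNU dd; the function name dd_flag_test and its default sizes are hypothetical and not part of dd. Some file systems reject direct I/O, in which case the direct run prints an error instead of a result.

```shell
# Sketch: run one small write test with a given oflag value and
# print only dd's summary line. Assumes GNU dd.
dd_flag_test() {
    flag=$1            # direct, dsync or sync
    target=$2          # a test file, never a device that holds data
    bs=${3:-512}       # block size (hypothetical default)
    count=${4:-1000}   # number of blocks (hypothetical default)
    # dd prints its statistics on stderr; the summary is the last line.
    LC_ALL=C dd if=/dev/zero of="$target" bs="$bs" count="$count" \
        oflag="$flag" 2>&1 | tail -n 1
}

# Example usage (writes to /root/testfile, as in the examples below):
#   for flag in direct dsync sync; do
#       printf '%s: ' "$flag"
#       dd_flag_test "$flag" /root/testfile
#   done
```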

To measure write performance, the data to be written should be read from /dev/zero[2] and ideally written to an empty RAID array, hard disk or partition (for example, of=/dev/sda for the first hard disk or of=/dev/sda2 for the second partition on the first hard disk). If this is not possible, a normal file in the file system (for example, of=/root/testfile) can be written instead. For safety reasons, the following examples use test files. The write performance measured this way will be a little lower, because file system metadata is written as well.

Important: When writing to a device (such as /dev/sda), the data stored there will be lost. For that reason, you should only use empty RAID arrays, hard disks or partitions.

Note:

  • When using if=/dev/zero and bs=1G, Linux will need 1 GB of free space in RAM. If your test system does not have sufficient RAM available, use a smaller value for bs (such as bs=512M, with count=2 if the total should remain one gigabyte).
  • To get results closer to real-life conditions, we recommend repeating the tests described here several times (three to ten times, for example). This makes outliers easy to spot. Outliers can be caused by cron jobs, interrupts or other work running in parallel, all of which can briefly affect performance. An extreme example would be updatedb being started by a cron job while the test runs.
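
The repetition recommended above can be scripted. The following is a minimal sketch, assuming GNU dd; the function name repeat_dd_test and its defaults are hypothetical. It prints one summary line per run so that outliers stand out when the lines are compared.

```shell
# Sketch: repeat the same dd write test several times.
# Assumes GNU dd; all defaults are hypothetical.
repeat_dd_test() {
    target=$1          # a test file, never a device that holds data
    runs=${2:-5}       # number of repetitions
    bs=${3:-512M}      # smaller than 1G for systems with little free RAM
    count=${4:-2}      # 2 x 512M keeps the total at one gigabyte
    flag=${5:-direct}  # oflag value: direct, dsync or sync
    i=1
    while [ "$i" -le "$runs" ]; do
        printf 'run %s: ' "$i"
        # dd prints its statistics on stderr; keep only the summary line.
        LC_ALL=C dd if=/dev/zero of="$target" bs="$bs" count="$count" \
            oflag="$flag" 2>&1 | tail -n 1
        i=$((i + 1))
    done
}

# Example usage: five runs against /root/testfile with the defaults.
#   repeat_dd_test /root/testfile
```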

Laptop Example

In this example, the test data will be written to /root/testfile. The test system (a Thinkpad T43 Type 2668-4GG) had 1.5 GByte of RAM and a Fujitsu MHT2060AH hard disk rotating at 5,400 rpm.

Laptop Throughput (Streaming I/O)

One gigabyte was written for the test, first with the hard disk's write cache activated (hdparm -W1 /dev/sda):

root@grml ~ # dd if=/dev/zero of=/root/testfile bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 32.474 s, 33.1 MB/s
root@grml ~ # 

Then, with the write cache deactivated (hdparm -W0 /dev/sda):

root@grml ~ # dd if=/dev/zero of=/root/testfile bs=1G count=1 oflag=direct    
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 123.37 s, 8.7 MB/s
root@grml ~ # 
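
The throughput figure that dd reports is the number of bytes copied divided by the elapsed time, with 1 MB counted as 1,000,000 bytes. As a quick check of the two results above, the following sketch redoes the division with awk; the function name throughput_mb_s is hypothetical.

```shell
# Sketch: recompute dd's MB/s figure from bytes copied and elapsed seconds.
throughput_mb_s() {
    # $1 = bytes copied, $2 = elapsed seconds
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / s / 1000000 }'
}

throughput_mb_s 1073741824 32.474   # write cache on:  prints 33.1
throughput_mb_s 1073741824 123.37   # write cache off: prints 8.7
```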

Laptop Latency

In this test, 512 bytes were written one thousand times, first with the write cache activated (hdparm -W1 /dev/sda):

root@grml ~ # dd if=/dev/zero of=/root/testfile bs=512 count=1000 oflag=direct
1000+0 records in
1000+0 records out
512000 bytes (512 kB) copied, 0.36084 s, 1.4 MB/s
root@grml ~ #

Then, with the write cache deactivated (hdparm -W0 /dev/sda). One thousand accesses required about 11.19 seconds, so a single access took about 11.19 ms:

root@grml ~ # dd if=/dev/zero of=/root/testfile bs=512 count=1000 oflag=direct
1000+0 records in
1000+0 records out
512000 bytes (512 kB) copied, 11.1865 s, 45.8 kB/s
root@grml ~ # 
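
The latency per access follows from dividing the elapsed time by the number of writes. The following sketch applies that arithmetic to both laptop measurements; the function name latency_ms is hypothetical.

```shell
# Sketch: average latency per write in milliseconds.
latency_ms() {
    # $1 = elapsed seconds reported by dd, $2 = number of writes (count)
    awk -v s="$1" -v n="$2" 'BEGIN { printf "%.4f\n", s * 1000 / n }'
}

latency_ms 0.36084 1000   # write cache on:  prints 0.3608
latency_ms 11.1865 1000   # write cache off: prints 11.1865
```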

Server with RAID10 Example

In this example, the test data was written to an empty partition. The test system was a 2U Intel Dual-CPU SC823 Server with six 147 GB SAS Fujitsu MBA3147RC (15,000 rpm) hard disks and an Adaptec 5805 RAID controller with the cache activated and a BBU.

Server Throughput (Streaming I/O)

One gigabyte was written for the test:

test-sles10sp2:~ # dd if=/dev/zero of=/root/testfile bs=1G count=1 oflag=dsync
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 5.11273 seconds, 210 MB/s
test-sles10sp2:~ #

Server Latency

In this test, 512 bytes were written one thousand times. The 0.084 seconds measured for one thousand accesses correspond to about 0.084 ms per access. This value is so low because of the RAID controller's cache:

test-sles10sp2:~ # dd if=/dev/zero of=/root/testfile bs=512 count=1000 oflag=dsync
1000+0 records in
1000+0 records out
512000 bytes (512 kB) copied, 0.083902 seconds, 6.1 MB/s
test-sles10sp2:~ #
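
The summary lines in this article come from two dd versions, one printing "s" and the other "seconds". The following sketch extracts the elapsed time from either form and converts it to a latency per access. The function name dd_latency_ms is hypothetical, and the pattern assumes the English summary format shown above.

```shell
# Sketch: read a dd summary line on stdin and print the average
# latency per write in milliseconds.
dd_latency_ms() {
    count=$1   # number of writes (the count= value given to dd)
    # Take the number after "copied, "; " s" matches both "s" and "seconds".
    secs=$(sed -n 's/.*copied, \([0-9.][0-9.e+-]*\) s.*/\1/p')
    awk -v s="$secs" -v n="$count" 'BEGIN { printf "%.4f\n", s * 1000 / n }'
}

echo '512000 bytes (512 kB) copied, 0.083902 seconds, 6.1 MB/s' |
    dd_latency_ms 1000    # prints 0.0839
```

To feed a live measurement into the function, dd's stderr can be piped in with 2>&1 followed by tail -n 1, since the statistics are written to stderr.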

References

  1. Dd (Unix) (en.wikipedia.org)
  2. /dev/zero (en.wikipedia.org)



Author: Werner Fischer

Werner Fischer, working in the Web Operations & Knowledge Transfer team at Thomas-Krenn, completed his studies of Computer and Media Security at FH Hagenberg in Austria. He is a regular speaker at many conferences like LinuxTag, OSMC, OSDC, LinuxCon, and author for various IT magazines. In his spare time he enjoys playing the piano and training for a good result at the annual Linz marathon relay.

