RAID Controller and Hard Disk Cache Settings
This article describes the cache settings available for RAID controllers and hard disks, including the settings that are necessary to avoid data loss during a power failure (which could otherwise destroy the file system).
Caches Affecting Data Access
Caches at various levels are involved in accessing data. They will be briefly described here.
Operating System Cache
Modern operating systems use a so-called page cache. When data is written, it is first stored in this cache. The content of the cache is transferred to the underlying storage system periodically, and whenever system calls such as sync or fsync are made. This underlying system may be a RAID controller or the hard disk itself.
Under Linux, the amount of working memory (in megabytes) currently used for the page cache is shown in the cached column of the output of the free -m command.
[root@testserver ~]# free -m
             total       used       free     shared    buffers     cached
Mem:         15976      15195        781          0        167       9153
-/+ buffers/cache:       5874      10102
Swap:         2000          0       1999
[root@testserver ~]#
The Linux Page Cache Basics article will provide additional information about this topic.
If the power fails, the content of the page cache will be lost.
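As a rough illustration, the size of the page cache and the portion of it that has not yet been written back can be read from /proc/meminfo on Linux. The sketch below assumes the standard Cached and Dirty fields; the function name meminfo_mb and its optional file argument are hypothetical helpers introduced here, not part of any tool.

```shell
#!/bin/sh
# Print one field of /proc/meminfo in whole megabytes (values there are in kB).
# meminfo_mb is a hypothetical helper name; the optional second argument lets
# the function read a saved copy instead of the live /proc/meminfo.
meminfo_mb() {
    field=$1
    file=${2:-/proc/meminfo}
    awk -v f="$field:" '$1 == f { printf "%d\n", $2 / 1024 }' "$file"
}

# Example usage on a Linux host:
# echo "Page cache:      $(meminfo_mb Cached) MB"
# echo "Not yet written: $(meminfo_mb Dirty) MB"
# sync    # ask the kernel to write dirty pages to the storage system
```

The Dirty value is the part of the page cache that exists only in memory, so it is the part that matters if the power fails.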
RAID Controller Cache
RAID controller caches can significantly increase write performance. Typical cache sizes are currently 256, 512 or 1024 MB. If the power fails, the content of this cache is lost unless it is protected by a battery backup unit (BBU) or battery backup module (BBM). BBUs and BBMs have integrated batteries, which can generally preserve the content of the cache for up to 72 hours. If the server is restarted within that period, the data in the cache can be recovered.
Note: The battery status should be checked at regular intervals, since capacity decreases over the life of the battery. When the battery becomes too weak (generally after one to three years), it should be replaced (just like a notebook battery). If the battery status is not checked, there is a risk that after several years the battery will only be able to retain the content of the cache for a very short period, which could lead to data loss if a power failure lasts longer than that. The Battery Backup Unit (BBU/BBM) Maintenance for RAID Controllers article provides detailed information about this issue.
Note: RAID controllers that do not use a BBU to protect the cache, but instead copy the content of the cache to flash memory in the event of a power failure, do not require special cache protection maintenance. The new Adaptec Series 5Z controllers provide this capability with the so-called “Zero-maintenance Cache Protection” (ZMCP) feature.
Hard Disk Cache
Hard disks also have an integrated cache. 16 to 64 megabytes is common for these caches. The content of these caches cannot be protected by a BBU or BBM. Newer 3ware controllers protect the content of caches integrated into hard disks using a proprietary Write Journal (Storsave Configuration Settings). Otherwise, the content of these caches would normally be lost upon power failure.
Risks from the Loss of Cache Content during a Power Failure
If the RAID controller or the hard disk reports to the operating system that data has been written while the data has in fact only been stored in a cache, a sudden power failure can, in the worst case, lead to complete loss of data. This can occur, for example, when a number of updates are made to the file system’s metadata but only some of those updates are written to the disk, while the remaining updates are lost with the cache.
The only option in such a case is to restore from the most recent backup. The cache settings for secure operation (see below) reduce these risks, although they also reduce performance somewhat.
Settings for Secure Operation
The objective of secure operation is to avoid losing data held in the caches of the RAID controller and the hard disks during a power failure.
The following ground rules should be used for secure operation.
- When using a cache with the RAID controller, the cache content should be protected by a BBU or BBM. The battery status should be checked periodically. The batteries should be replaced every two or three years, as a general rule. In addition, the fact that a BBU or BBM can only protect the content of the cache for up to 72 hours must be taken into consideration. If a power failure might last several days, the battery could be fully discharged, resulting in the loss of the cache content. The cache content will be permanently secured when using controllers with Zero-maintenance Cache Protection (ZMCP) features.
- Hard disk caches normally cannot be secured by a BBU or BBM. For that reason, these caches should be deactivated.
- Comment: With newer RAID controllers, 3ware provides a proprietary Write Journal feature, which protects the cache content when the Storsave setting is Protection or Balanced (in the case of Balanced, only if a BBU is used). Exactly how this feature works has not been documented. However, because the data was protected during a power failure in our tests, the content of the hard disk cache must presumably also be held in some form in the RAID controller’s cache (which is protected by the BBU).
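The ground rules above can be expressed as a small checklist. The following is only a sketch: check_cache_rules and its arguments are hypothetical, the values have to be supplied by hand (nothing is read from a controller), and three years is assumed as the upper limit for battery age.

```shell
#!/bin/sh
# Hypothetical checklist helper that encodes the ground rules above.
# Arguments:
#   $1  RAID controller cache: on | off
#   $2  cache protection:      bbu | zmcp | none   (bbu also stands for BBM)
#   $3  battery age in whole years (only evaluated when protection is bbu)
#   $4  hard disk cache:       on | off
# Prints one line per violated rule and returns non-zero if any was found.
check_cache_rules() {
    raid_cache=$1 protection=$2 battery_age=$3 hdd_cache=$4
    problems=0

    if [ "$raid_cache" = on ] && [ "$protection" = none ]; then
        echo "RAID controller cache is active without BBU/BBM or ZMCP"
        problems=$((problems + 1))
    fi
    # The text recommends replacement every two or three years; three years
    # is used here as an assumed upper limit.
    if [ "$raid_cache" = on ] && [ "$protection" = bbu ] && [ "$battery_age" -ge 3 ]; then
        echo "battery is $battery_age years old and due for replacement"
        problems=$((problems + 1))
    fi
    # Exception not modelled here: 3ware controllers with the Write Journal.
    if [ "$hdd_cache" = on ]; then
        echo "hard disk cache is active"
        problems=$((problems + 1))
    fi
    [ "$problems" -eq 0 ]
}

# Example call:
# check_cache_rules on bbu 1 off && echo "configuration follows the rules"
```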
With the help of a Perl script, we were able to test the possibility of data loss during a power failure under Linux. Background information regarding this script can be found at: http://brad.livejournal.com/2116715.html
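The general idea behind this kind of test can be sketched as follows. This is a simplified, single-machine illustration and not the referenced script: all function and file names are hypothetical, and in a real test the log of acknowledged writes has to be kept on a second machine so that it survives the power cut.

```shell
#!/bin/sh
# Simplified sketch of a write-acknowledgement test (not the referenced script).
# All function names and file names here are hypothetical.

# Append numbered records to DATA. After each record has been flushed,
# note its number in LOG. In a real test LOG must live on another machine.
write_records() {
    data=$1 log=$2 count=$3
    i=1
    while [ "$i" -le "$count" ]; do
        echo "record $i" >> "$data"
        # Flush to the storage system; fall back to a global sync if this
        # version of sync does not accept a file argument.
        sync "$data" 2>/dev/null || sync
        echo "$i" >> "$log"
        i=$((i + 1))
    done
}

# After the power cut and reboot: every record number in LOG must be in DATA.
# Prints the number of acknowledged records that are missing.
verify_records() {
    data=$1 log=$2
    missing=0
    while read -r n; do
        grep -qx "record $n" "$data" 2>/dev/null || missing=$((missing + 1))
    done < "$log"
    echo "$missing"
}
```

If verify_records reports a value other than 0 after a power cut, some layer acknowledged writes that it was unable to retain.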
The following table shows the results of testing various RAID controllers and diverse settings. All of the RAID controllers were used in combination with BBUs or BBMs for the tests.
|No.||RAID controller||Firmware||Setting||Result|
|1||Adaptec 5405||5.2-0 (15726)||HDD Cache active||Max. performance, data loss possible|
|2||Adaptec 5405||5.2-0 (15726)||HDD Cache deactivated (by arcconf CLI)||Error-free test (no data loss)|
|3||Areca 1210||1.43||HDD Cache deactivated (Disk Write Cache Mode set to Disabled)||Error-free test (no data loss)|
|4||Areca 1210||1.43||HDD Cache active (Disk Write Cache Mode set to Enabled)||Max. performance, data loss possible|
|5||Onboard SATA||-||HDD Cache active||Max. performance, data loss possible|
|6||Onboard SATA||-||HDD Cache deactivated (by hdparm)||Error-free test (no data loss)|
|7||3ware 9550SX||FE9X 3.02.00.016||HDD Cache probably active (Storsave Perform)||Max. performance, data loss possible|
|8||3ware 9550SX||FE9X 3.02.00.016||HDD Cache probably active (Storsave Balanced)||Error-free test (no data loss)|
|9||3ware 9550SX||FE9X 3.02.00.016||HDD Cache probably active (Storsave Protection)||Error-free test (no data loss)|
Note: The results listed above reflect only our own tests. To test server systems from end to end, we recommend that customers conduct their own tests.
- Regarding the Adaptec 5405, the cache of each individual hard disk was deactivated as shown in this example for the first hard disk:
arcconf SETCACHE 1 DEVICE 0 0 wt
- Regarding the Onboard SATA test, we deactivated the hard disk cache as follows:
hdparm -W0 /dev/sda
For most hard disks, this setting was lost after a reboot and had to be applied again. The Microsoft knowledge base article contains additional information about this.
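Because the hdparm setting is lost on reboot for most disks, it has to be applied again at every start. A minimal sketch of such a step is shown below; disable_hdd_write_cache and the HDPARM override are hypothetical names introduced here, the device list is only an example, and the boot mechanism that calls it depends on the distribution.

```shell
#!/bin/sh
# Sketch: switch off the write cache of several disks, e.g. from a boot script.
# disable_hdd_write_cache is a hypothetical helper. HDPARM can be overridden,
# for example HDPARM="echo hdparm" prints the commands instead of running them.
disable_hdd_write_cache() {
    for dev in "$@"; do
        ${HDPARM:-hdparm} -W0 "$dev" || echo "could not change $dev" >&2
    done
}

# Example call (requires root):
# disable_hdd_write_cache /dev/sda /dev/sdb
```

Running hdparm -W /dev/sda without a value afterwards shows the current write-caching state of the disk.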
Author: Werner Fischer
Werner Fischer, working in the Knowledge Transfer team at Thomas-Krenn, completed his studies of Computer and Media Security at FH Hagenberg in Austria. He is a regular speaker at many conferences like LinuxTag, OSMC, OSDC, LinuxCon, and author for various IT magazines. In his spare time he enjoys playing the piano and training for a good result at the annual Linz marathon relay.