RAID

From Thomas-Krenn-Wiki

A RAID (Redundant Array of Independent Disks) combines several hard disks or other data carriers into a single logical drive. Depending on the RAID level, it can provide higher reliability and higher performance than a single hard disk. A RAID can be implemented as Hardware RAID, Software RAID or Firmware/Driver RAID.

RAID level

| RAID level | RAID 0 | RAID 1 | RAID 4 | RAID 5 | RAID 6 | RAID 10 |
|---|---|---|---|---|---|---|
| Minimum number of HDDs | 2 | 2 | 3 | 3 | 4 (5 for 3ware) | 4 |
| Data security | none | failure of one drive | failure of one drive | failure of one drive | failure of two drives | failure of one drive per sub-array |
| Capacity utilization | 100% | 50% | 67% (3 HDDs) to 94% (16 HDDs) | 67% (3 HDDs) to 94% (16 HDDs) | 50% (4 HDDs) to 88% (16 HDDs) | 50% |
| Rebuild after HDD failure | not possible, data is lost (*) | copy of the mirrored HDD | calculation of the original content using XOR (all remaining HDDs must be read completely) | calculation of the original content using XOR (all remaining HDDs must be read completely) | calculation of the original content of the HDD (depending on the RAID 6 implementation) | copy of the mirrored HDD |
| Rebuild after double HDD failure | not possible, data is lost (*) | not possible, data is lost (*) | not possible, data is lost (*) | not possible, data is lost (*) | calculation of the original content of both HDDs (depending on the RAID 6 implementation) | only possible if the two failed hard disks belong to different mirrors; each is then copied from its mirrored HDD |

(*) In many cases, data can only be restored by RAID data recovery.
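
The capacity utilization figures in the table follow from how many disks' worth of space each level spends on redundancy. The following is a minimal sketch, assuming identically sized disks, a two-disk mirror for RAID 1 and a stripe over two-disk mirrors for RAID 10. The function name `capacity_utilization` is illustrative only.

```python
def capacity_utilization(level: str, disks: int) -> float:
    """Return the usable fraction of raw capacity for a RAID set.

    Assumes all disks have the same size. RAID 1 is treated as a
    two-way mirror and RAID 10 as a stripe over two-way mirrors.
    """
    minimum = {"0": 2, "1": 2, "4": 3, "5": 3, "6": 4, "10": 4}
    if level not in minimum:
        raise ValueError(f"unsupported RAID level: {level}")
    if disks < minimum[level]:
        raise ValueError(f"RAID {level} needs at least {minimum[level]} disks")
    if level == "0":
        return 1.0  # no redundancy, all space is usable
    if level in ("1", "10"):
        return 0.5  # every block is stored twice
    parity_disks = 2 if level == "6" else 1  # RAID 4 and 5 use one, RAID 6 two
    return (disks - parity_disks) / disks


for level, disks in [("5", 3), ("5", 16), ("6", 4), ("6", 16)]:
    print(f"RAID {level} with {disks} disks: {capacity_utilization(level, disks):.0%}")
```

For 3 and 16 disks in RAID 5 and for 4 and 16 disks in RAID 6, this reproduces the ranges given in the table.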

RAID 0

  • high performance: several hard disks can be used in parallel for both reading and writing
  • no reliability: if a single hard disk fails, all data in the RAID volume is lost.
RAID 0: higher performance, but no reliability
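
How striping lets several disks work in parallel can be illustrated by mapping logical blocks to disks in round-robin order. This is only a sketch: it assumes a stripe unit of exactly one block, whereas real implementations use configurable chunk sizes, and `locate_block` is a hypothetical helper.

```python
def locate_block(logical_block: int, disks: int) -> tuple[int, int]:
    """Map a logical block number to (disk index, block offset on that disk)."""
    if disks < 2:
        raise ValueError("RAID 0 needs at least two disks")
    return logical_block % disks, logical_block // disks


# Eight consecutive logical blocks on a three-disk RAID 0
layout = [locate_block(block, 3) for block in range(8)]
print(layout)
```

Consecutive logical blocks land on different disks, so a large sequential read or write keeps all disks busy at once. The same mapping also shows why a single failed disk destroys the volume: every file larger than a few blocks has pieces on every disk.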


RAID 1

  • Performance: When writing, the performance is almost that of a single hard disk. If several processes are reading simultaneously (e.g. read access by several users), it is possible to read from both drives in parallel and thus increase the read performance. See also SSD RAID Performance-Tests - RAID 1.
  • reliability: all data is mirrored completely, so the failure of a hard disk does not lead to a loss of data.
RAID 1: all data is mirrored


RAID 4

RAID 4 works in a similar way to RAID 5, except that the parity data is stored on a dedicated hard disk and not distributed across several hard disks as with RAID 5.

RAID 4: parity data is stored on a dedicated data carrier. If a data carrier fails and it is replaced by a new one, the original content can be restored.
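
The difference between a dedicated parity disk and distributed parity can be shown by listing which disk holds the parity of each stripe. The rotation used for RAID 5 below is just one of several possible layouts and is an assumption of this sketch; both function names are illustrative.

```python
def raid4_parity_disk(stripe: int, disks: int) -> int:
    """RAID 4: parity always lives on the last disk, whatever the stripe."""
    return disks - 1


def raid5_parity_disk(stripe: int, disks: int) -> int:
    """RAID 5 (one possible layout): parity moves one disk to the left per stripe."""
    return (disks - 1 - stripe) % disks


disks = 4
for stripe in range(5):
    print(stripe, raid4_parity_disk(stripe, disks), raid5_parity_disk(stripe, disks))
```

With RAID 4 every parity update hits the same disk, while the rotating layout spreads parity writes evenly across all disks.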


RAID 5

  • failure of a hard disk does not lead to a loss of data: In a RAID 1, only 50 percent of the capacity of the two hard disks can be used. RAID 5 is a RAID level that can be used with several hard disks and, like RAID 1, tolerates the failure of a single hard disk, while using a larger share of the capacity.
  • parity data: Instead of mirroring the data completely, a RAID 5 calculates parity data. An XOR operation is used for this.
  • distribution of parity data: The parity data is distributed to all available hard disks and is not stored on a single hard disk as with a RAID 4. In a RAID 4, this parity hard disk is subject to greater wear than the other hard disks in the RAID array, which is why RAID 4 is hardly relevant today compared to RAID 5.
  • higher reading performance: When reading large amounts of data, several hard disks can be used in parallel, thus achieving higher read performance compared to a single hard disk.
  • writing performance: When writing small amounts of data, a read access is required first so that the new parity data can be calculated and written for the affected stripe (read-modify-write, write penalty). Hardware RAID controllers with integrated caches cushion this problem.
  • initialization required when creating a RAID 5: The original parity must be correct for read-modify-write to work. This is guaranteed by initializing the RAID 5 when the RAID set is set up. Depending on the size of the RAID set, the initialization may take several hours or days. In contrast, RAID 1 and RAID 10 (at least with Linux Software RAID) do not require initialization.[1]
  • time-consuming restoration of a failed hard disk: With a RAID 1, it is sufficient to copy the content of the remaining hard disk to a new hard disk after a hard disk failure to restore the RAID volume. If a hard disk in a RAID 5 fails, the data of all remaining hard disks is read when restoring to a new hard disk in order to calculate the content of the replaced hard disk using XOR calculation.
RAID 5: parity information is distributed to all data carriers. If a data carrier fails and it is replaced by a new one, the original content can be restored.
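
The XOR parity calculation, the rebuild of a failed disk and the read-modify-write update described above can be sketched for a single stripe as follows. The block contents and the helper `xor_blocks` are made up for illustration; a real RAID 5 works on much larger chunks and many stripes.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equally sized blocks byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe with three data blocks and one parity block
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(*data)

# Rebuild: the block of a failed disk is the XOR of all surviving blocks,
# which is why every remaining disk has to be read
rebuilt = xor_blocks(data[0], data[2], parity)
assert rebuilt == data[1]

# Small write (read-modify-write): old data and old parity are read first,
# then the new parity is derived without touching the other data disks
new_block = b"ZZZZ"
new_parity = xor_blocks(parity, data[1], new_block)
data[1] = new_block
assert new_parity == xor_blocks(*data)
```

The second assertion only holds because the old parity was correct to begin with, which is the reason a RAID 5 has to be initialized before use.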


RAID 6

  • the failure of two hard disks does not lead to a loss of data
  • different implementations of RAID 6: there are different mathematical approaches to creating and using the double parity data (for example Galois field or RAID DP)
RAID 6: double parity information allows the failure of two data carriers without loss of data
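
As an illustration of the Galois field approach, the sketch below uses one well-known construction: a plain XOR parity P plus a second parity Q whose terms are weighted with powers of 2 in GF(2^8), here with the reduction polynomial 0x11d. Whether a given controller uses exactly this scheme is not stated in this article, so the polynomial, the generator and all function names are assumptions of the example. It recovers two lost data blocks from the surviving blocks plus P and Q.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8), reducing with the polynomial 0x11d."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_pow(a: int, n: int) -> int:
    """Raise a to the n-th power by repeated multiplication."""
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def gf_inv(a: int) -> int:
    """Multiplicative inverse: a^254, because a^255 == 1 for every a != 0."""
    return gf_pow(a, 254)


def compute_pq(data: list[bytes]) -> tuple[bytes, bytes]:
    """P is the XOR of all blocks, Q the XOR of the blocks weighted by 2^i."""
    p = bytearray(len(data[0]))
    q = bytearray(len(data[0]))
    for i, block in enumerate(data):
        coeff = gf_pow(2, i)
        for k, byte in enumerate(block):
            p[k] ^= byte
            q[k] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)


def recover_two(data, x, y, p, q):
    """Recover data blocks x and y (passed as None in data) from P and Q."""
    size = len(p)
    # Treat the lost blocks as zeros to get the partial parities of the survivors
    survivors = [block if block is not None else bytes(size) for block in data]
    p_part, q_part = compute_pq(survivors)
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    denom_inv = gf_inv(gx ^ gy)
    dx, dy = bytearray(size), bytearray(size)
    for k in range(size):
        pxy = p[k] ^ p_part[k]  # equals Dx ^ Dy
        qxy = q[k] ^ q_part[k]  # equals 2^x * Dx ^ 2^y * Dy
        dx[k] = gf_mul(denom_inv, qxy ^ gf_mul(gy, pxy))
        dy[k] = pxy ^ dx[k]
    return bytes(dx), bytes(dy)


blocks = [b"disk", b"data", b"goes", b"here"]
p, q = compute_pq(blocks)
damaged = [blocks[0], None, blocks[2], None]
assert recover_two(damaged, 1, 3, p, q) == (blocks[1], blocks[3])
```

A single lost data block can still be rebuilt from P alone with plain XOR; Q is only needed once a second block is missing.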


RAID 10

RAID 10: offers the high performance of RAID 0, combined with the data safety of RAID 1
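
Whether a RAID 10 survives a given set of failed disks depends on whether any mirror has lost all of its members. A minimal sketch, assuming two-disk mirrors in which disks 2k and 2k+1 form mirror pair k; the function name `raid10_survives` is illustrative only.

```python
def raid10_survives(failed: set[int], disks: int) -> bool:
    """Return True if no two-disk mirror has lost both of its members.

    Assumes disks 2k and 2k+1 form mirror pair k.
    """
    if disks < 4 or disks % 2:
        raise ValueError("this sketch expects an even number of disks, at least 4")
    return all(not {2 * k, 2 * k + 1} <= failed for k in range(disks // 2))


print(raid10_survives({0, 2}, 4))  # failed disks belong to different mirrors
print(raid10_survives({0, 1}, 4))  # both disks of the same mirror have failed
```

The first call reports a survivable double failure, the second a loss of data, because the stripe set has lost one of its mirrors completely.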


RAID: avoiding data loss and restoring data

In the article RAID data rescue, you can read what you should definitely know when using a RAID system:

  • symptoms of an impending loss of data
  • dangers of RAID systems
  • prevention of data loss with RAID

References

More information


Author: Werner Fischer

Werner Fischer, working in the Knowledge Transfer team at Thomas-Krenn, completed his studies of Computer and Media Security at FH Hagenberg in Austria. He is a regular speaker at many conferences like LinuxTag, OSMC, OSDC, LinuxCon, and author for various IT magazines. In his spare time he enjoys playing the piano and training for a good result at the annual Linz marathon relay.

