Mdadm recovery and resync
This article takes a closer look at the recovery and resync operations of the Linux software RAID tool mdadm. It shows the relevant commands and explains the individual steps.
Perform Rebuild/Recovery
The following attributes apply to a rebuild:[1]
- It fills a new disk with the relevant information from an existing array.
- It assumes that all data must be rewritten and that the data in the existing array is correct.
- It is performed when a disk has been replaced.
Force Manual Recovery
Intentionally Set Faulty Partition
root@ubuntumdraidtest:~# mdadm --manage --set-faulty /dev/mdN /dev/sdX1
mdadm: set /dev/sdX1 faulty in /dev/mdN
Status of Software RAIDs
root@ubuntumdraidtest:~# mdadm -D /dev/mdN
provides the following overview:
/dev/md0:
        Version : 1.2
  Creation Time : Mon Sep 30 14:59:51 2013
     Raid Level : raid1
     Array Size : 2095040 (2046.28 MiB 2145.32 MB)
  Used Dev Size : 2095040 (2046.28 MiB 2145.32 MB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent
    Update Time : Mon Oct 7 08:11:41 2013
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 1
  Spare Devices : 0
           Name : ubuntumdraidtest:0 (local to host ubuntumdraidtest)
           UUID : ebe7bfba:a20bf402:f7954545:d920460f
         Events : 471
    Number   Major   Minor   RaidDevice State
       3       8       17        0      active sync   /dev/sdb1
       1       0        0        1      removed
       2       8       33        -      faulty spare   /dev/sdc1
Here the sdc1 partition of the RAID md0 was set to faulty and is now displayed as faulty spare. The state of the RAID changed from clean to clean, degraded.
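The state change described above can also be checked from a script. The following is a minimal sketch that only parses the text of mdadm -D; the function names md_state and md_is_degraded are hypothetical helpers introduced for illustration and are not part of mdadm.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the "State :" line from
# `mdadm -D` output read on stdin (trailing whitespace removed).
md_state() {
    sed -n 's/^[[:space:]]*State[[:space:]]*:[[:space:]]*//p' |
        sed 's/[[:space:]]*$//'
}

# Hypothetical helper: succeed if the state read on stdin contains "degraded".
md_is_degraded() {
    md_state | grep -q 'degraded'
}

# Demonstration on an excerpt of the output shown above (no RAID required).
printf '%s\n' \
    '          State : clean, degraded' \
    ' Active Devices : 1' |
    md_state
```

On a system with a real array, the same helper could be fed directly, for example with mdadm -D /dev/mdN | md_is_degraded && echo "array is degraded".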
Hot Remove sdX1
root@ubuntumdraidtest:~# mdadm --manage /dev/mdN -r /dev/sdX1
mdadm: hot removed /dev/sdX1 from /dev/mdN
Add sdX1
root@ubuntumdraidtest:~# mdadm --manage /dev/mdN -a /dev/sdX1
mdadm: added /dev/sdX1
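The three steps above (set faulty, hot remove, add) can be combined into one small function. This is a sketch under assumptions, not a tested procedure: force_rebuild and the MDADM variable are hypothetical names introduced here. MDADM exists only so that the sequence can be previewed with MDADM="echo mdadm" before anything is executed.

```shell
#!/bin/sh
# Hypothetical override: set MDADM="echo mdadm" to print the commands
# instead of running them. Defaults to the real mdadm binary.
MDADM=${MDADM:-mdadm}

# Hypothetical helper: force a rebuild of one member by setting it faulty,
# hot removing it and adding it again. Stops at the first failing step.
force_rebuild() {
    if [ $# -ne 2 ]; then
        echo "usage: force_rebuild /dev/mdN /dev/sdX1" >&2
        return 2
    fi
    array=$1    # the RAID device, e.g. /dev/mdN
    member=$2   # the member partition, e.g. /dev/sdX1
    $MDADM --manage --set-faulty "$array" "$member" &&
        $MDADM --manage "$array" -r "$member" &&
        $MDADM --manage "$array" -a "$member"
}
```

A preview run would be MDADM="echo mdadm" followed by force_rebuild /dev/mdN /dev/sdX1, which only prints the three mdadm calls. Running it for real requires root privileges and degrades the array until the rebuild has finished.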
Visualization
cat /proc/mdstat
shows:
md0 : active raid1 sdc1[2] sdb1[3]
      2095040 blocks super 1.2 [2/1] [U_]
      [>....................]  recovery =  0.4% (9600/2095040) finish=3.6min speed=9600K/sec
unused devices: <none>
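The progress line in /proc/mdstat is easy to extract for monitoring. The sketch below assumes the line format shown above ("recovery = 0.4%" or "resync = 19.7%"); md_progress is a hypothetical helper name, not an mdadm feature.

```shell
#!/bin/sh
# Hypothetical helper: print "<operation> <percent>" for every running
# recovery or resync found in /proc/mdstat-formatted text on stdin.
md_progress() {
    sed -nE 's/.*(recovery|resync)[[:space:]]*=[[:space:]]*([0-9.]+%).*/\1 \2/p'
}

# Demonstration on the output shown above (no RAID required).
printf '%s\n' \
    'md0 : active raid1 sdc1[2] sdb1[3]' \
    '      2095040 blocks super 1.2 [2/1] [U_]' \
    '      [>....................]  recovery =  0.4% (9600/2095040) finish=3.6min speed=9600K/sec' |
    md_progress
# prints: recovery 0.4%
```

On a live system the helper would be called as md_progress < /proc/mdstat.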
nmon shows that data is read from /dev/sdb and written to /dev/sdc.
┌nmon─13g──────[H for help]───Hostname=ubuntumdraidtRefresh= 2secs ───15:08.05─┐
│ Disk I/O ──/proc/diskstats────mostly in KB/s─────Warning:contains duplicates─│
│DiskName Busy Read WriteMB|0 |25 |50 |75 100|│
│sda 0% 0.0 0.0| > |│
│sda1 0% 0.0 0.0|> |│
│sda2 0% 0.0 0.0|>disk busy not available |│
│sda5 0% 0.0 0.0| > |│
│sdb 5% 66.3 0.0|RRR > |│
│sdb1 6% 66.3 0.0|RRR > |│
│sdc 72% 0.0 65.3|WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW >│
│sdc1 72% 0.0 65.3|WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW >│
│sdd 0% 0.0 0.0|> |│
│sdd1 0% 0.0 0.0|>disk busy not available |│
│dm-0 0% 0.0 0.0| > |│
│dm-1 0% 0.0 0.0|> |│
│md0 0% 0.0 0.0|>disk busy not available |│
│Totals Read-MB/s=132.6 Writes-MB/s=130.7 Transfers/sec=541.7 │
│──────────────────────────────────────────────────────────────────────────────│
Behavior when using a spare disk
If a RAID is operated with a spare disk, the spare takes over for the disk that was set to faulty, and a rebuild is performed automatically. The disk set to faulty appears as faulty spare in the output of mdadm -D /dev/mdN.
To put it back into the array as a spare disk, it must first be removed using mdadm --manage /dev/mdN -r /dev/sdX1 and then added again using mdadm --manage /dev/mdN -a /dev/sdX1.
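To find out which members need to be removed and re-added, the device table of mdadm -D can be filtered for the word faulty. The following is only a sketch; md_faulty_devices is a hypothetical helper, and it assumes that the device path is the last field of each member line, as in the output shown earlier in this article.

```shell
#!/bin/sh
# Hypothetical helper: print the device path of every member marked
# "faulty" in `mdadm -D` output read on stdin.
md_faulty_devices() {
    awk '/faulty/ && $NF ~ /^\/dev\// { print $NF }'
}

# Demonstration on member lines from the output shown earlier.
printf '%s\n' \
    '       3       8       17        0      active sync   /dev/sdb1' \
    '       1       0        0        1      removed' \
    '       2       8       33        -      faulty spare   /dev/sdc1' |
    md_faulty_devices
# prints: /dev/sdc1

# On a real system (as root), each listed member could then be removed
# and added again as a spare:
#   mdadm -D /dev/mdN | md_faulty_devices | while read -r dev; do
#       mdadm --manage /dev/mdN -r "$dev" && mdadm --manage /dev/mdN -a "$dev"
#   done
```

This only makes sense for a disk that was set to faulty intentionally, as in this test; a disk that failed on its own should be examined or replaced before it is added again.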
Resync
The following properties apply to a resync:[1]
- It ensures that all data in the array is synchronized, i.e. consistent.
- It assumes that the data is (or should be) intact.
- If an error occurs while reading from a device, the data is taken from another device and written to all others.
Force Manual Resync
Stop Array
root@ubuntumdraidtest:~# mdadm --stop /dev/mdN
mdadm: stopped /dev/mdN
Perform Resync
root@ubuntumdraidtest:~# mdadm --assemble --run --force --update=resync /dev/mdN /dev/sdX1 /dev/sdY1
mdadm: /dev/mdN has been started with 2 drives.
- For further information, see [2].
- For a list of commands, see /usr/share/doc-base/mdadm-readme-recipes.
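The stop and assemble steps above can be wrapped in one function. As before, this is a sketch under assumptions: force_resync and the MDADM variable are hypothetical names, and MDADM="echo mdadm" turns the call into a preview that only prints the commands.

```shell
#!/bin/sh
# Hypothetical override: set MDADM="echo mdadm" to print the commands
# instead of running them. Defaults to the real mdadm binary.
MDADM=${MDADM:-mdadm}

# Hypothetical helper: stop an array and reassemble it with a forced resync.
# Usage: force_resync /dev/mdN /dev/sdX1 [/dev/sdY1 ...]
force_resync() {
    if [ $# -lt 2 ]; then
        echo "usage: force_resync /dev/mdN member [member ...]" >&2
        return 2
    fi
    array=$1
    shift       # the remaining arguments are the member devices
    $MDADM --stop "$array" &&
        $MDADM --assemble --run --force --update=resync "$array" "$@"
}
```

The array must not be in use (for example mounted) when it is stopped, and the real commands require root privileges.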
Visualization
cat /proc/mdstat
shows:
Every 2.0s: cat /proc/mdstat                          Wed Sep 25 15:19:59 2013
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid1 sdb1[0] sdc1[1]
      2095040 blocks super 1.2 [2/2] [UU]
      [===>.................]  resync = 19.7% (414656/2095040) finish=0.2min speed=138218K/sec
unused devices: <none>
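Instead of watching the output manually, a script can poll /proc/mdstat until the resync line disappears. The sketch below uses two hypothetical helpers, md_sync_running and md_wait_sync. It only looks for recovery and resync lines, so other operations such as a reshape are not covered.

```shell
#!/bin/sh
# Hypothetical helper: succeed while the given mdstat-formatted file
# (default: /proc/mdstat) reports a recovery or resync.
md_sync_running() {
    grep -Eq '(recovery|resync)[[:space:]]*=' "${1:-/proc/mdstat}"
}

# Hypothetical helper: poll until no recovery or resync is reported.
# $1 = file to read (default /proc/mdstat), $2 = poll interval in seconds.
md_wait_sync() {
    file=${1:-/proc/mdstat}
    interval=${2:-5}
    while md_sync_running "$file"; do
        sleep "$interval"
    done
}
```

A call such as md_wait_sync && echo "resync finished" would block until the array is in sync. mdadm also offers a --wait option for a given md device, which may be the simpler choice where it is available.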
References
- [1] Linux RAID Wiki, recovery and resync (raid.wiki.kernel.org)
- [2] Resync auf einem Software-RAID erzwingen (web.archive.org)
Author: Thomas Niedermeier
Thomas Niedermeier works in the product management team at Thomas-Krenn and completed his bachelor's degree in business informatics at the Deggendorf University of Applied Sciences. He has been employed at Thomas-Krenn since 2013 and takes care of OPNsense firewalls, the Thomas-Krenn-Wiki and firmware security updates.