Mdadm recovery and resync

From Thomas-Krenn-Wiki

The following article takes a closer look at the recovery and resync operations of the Linux software RAID tool mdadm. We show a few commands and explain the individual steps.

Perform Rebuild/Recovery

The following properties apply to a rebuild:[1]

  • fills a new disk with the relevant data from an existing array
  • assumes that all data must be rewritten and that the data on the existing array is correct
  • is performed when a disk has been replaced

Force Manual Recovery

Intentionally Set Faulty Partition

root@ubuntumdraidtest:~# mdadm --manage --set-faulty /dev/mdN /dev/sdX1
mdadm: set /dev/sdX1 faulty in /dev/mdN

Status of Software RAIDs

mdadm -D /dev/mdN provides the following overview:

/dev/md0:
        Version : 1.2
  Creation Time : Mon Sep 30 14:59:51 2013
     Raid Level : raid1
     Array Size : 2095040 (2046.28 MiB 2145.32 MB)
  Used Dev Size : 2095040 (2046.28 MiB 2145.32 MB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

    Update Time : Mon Oct  7 08:11:41 2013
          State : clean, degraded 
 Active Devices : 1
Working Devices : 1
 Failed Devices : 1
  Spare Devices : 0

           Name : ubuntumdraidtest:0  (local to host ubuntumdraidtest)
           UUID : ebe7bfba:a20bf402:f7954545:d920460f
         Events : 471

    Number   Major   Minor   RaidDevice State
       3       8       17        0      active sync   /dev/sdb1
       1       0        0        1      removed

       2       8       33        -      faulty spare   /dev/sdc1


Here the sdc1 partition of the RAID md0 was set to faulty and is now displayed as faulty spare. The state of the RAID changed from clean to clean, degraded.
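The state shown by mdadm -D can also be extracted programmatically, e.g. for monitoring. The following is only a minimal sketch in Python; the sample text is an abbreviated copy of the mdadm -D output above:

```python
# Abbreviated copy of the `mdadm -D /dev/md0` output shown above.
sample = """\
/dev/md0:
        Version : 1.2
     Raid Level : raid1
          State : clean, degraded
 Active Devices : 1
 Failed Devices : 1
"""

def parse_mdadm_detail(text):
    """Parse the 'Field : value' lines of `mdadm -D` output into a dict."""
    info = {}
    for line in text.splitlines():
        if " : " in line:
            key, _, value = line.partition(" : ")
            info[key.strip()] = value.strip()
    return info

info = parse_mdadm_detail(sample)
print(info["State"])           # → clean, degraded
print(info["Failed Devices"])  # → 1
```

In practice, the text would come from running mdadm -D via a subprocess instead of a hard-coded sample.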

Hot Remove sdX1

root@ubuntumdraidtest:~# mdadm --manage /dev/mdN -r /dev/sdX1
mdadm: hot removed /dev/sdX1 from /dev/mdN

Add sdX1

root@ubuntumdraidtest:~# mdadm --manage /dev/mdN -a /dev/sdX1
mdadm: added /dev/sdX1

Visualization

cat /proc/mdstat shows:

md0 : active raid1 sdc1[2] sdb1[3]
      2095040 blocks super 1.2 [2/1] [U_]
      [>....................]  recovery =  0.4% (9600/2095040) finish=3.6min speed=9600K/sec

unused devices: <none>
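The recovery progress in /proc/mdstat can likewise be read out with a small script, e.g. to alert when a rebuild stalls. A minimal sketch (the regular expression matches the recovery line format shown above):

```python
import re

# Abbreviated copy of the /proc/mdstat recovery line shown above.
mdstat = ("[>....................]  recovery =  0.4% "
          "(9600/2095040) finish=3.6min speed=9600K/sec")

m = re.search(
    r"recovery\s*=\s*([\d.]+)%\s*\((\d+)/(\d+)\)\s*"
    r"finish=([\d.]+)min\s*speed=(\d+)K/sec",
    mdstat,
)
percent, done, total, finish_min, speed = m.groups()
print(percent)                 # → 0.4
print(int(total) - int(done))  # blocks still to rebuild → 2085440
```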

nmon shows that data is read from /dev/sdb and written to /dev/sdc.

┌nmon─13g──────[H for help]───Hostname=ubuntumdraidtRefresh= 2secs ───15:08.05─┐
│ Disk I/O ──/proc/diskstats────mostly in KB/s─────Warning:contains duplicates─│
│DiskName Busy  Read WriteMB|0          |25         |50          |75       100|│
│sda        0%    0.0    0.0|        >                                        |│
│sda1       0%    0.0    0.0|>                                                |│
│sda2       0%    0.0    0.0|>disk busy not available                         |│
│sda5       0%    0.0    0.0| >                                               |│
│sdb        5%   66.3    0.0|RRR             >                                |│
│sdb1       6%   66.3    0.0|RRR             >                                |│
│sdc       72%    0.0   65.3|WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW             >│
│sdc1      72%    0.0   65.3|WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW             >│
│sdd        0%    0.0    0.0|>                                                |│
│sdd1       0%    0.0    0.0|>disk busy not available                         |│
│dm-0       0%    0.0    0.0|        >                                        |│
│dm-1       0%    0.0    0.0|>                                                |│
│md0        0%    0.0    0.0|>disk busy not available                         |│
│Totals Read-MB/s=132.6    Writes-MB/s=130.7    Transfers/sec=541.7            │
│──────────────────────────────────────────────────────────────────────────────│

Behavior when using a spare disk

If a RAID is operated with a spare disk, the spare takes over for the disk that was set to faulty, and a rebuild is performed automatically. The disk set to faulty appears in the output of mdadm -D /dev/mdN as faulty spare.

To put it back into the array as a spare disk, it must first be removed using mdadm --manage /dev/mdN -r /dev/sdX1 and then added again using mdadm --manage /dev/mdN -a /dev/sdX1.

Resync

The following properties apply to a resync:[1]

  • Ensures that all data in the array is synchronized and consistent.
  • Assumes that the data is (or should be) correct.
  • If an error occurs while reading from one device, the data from another device is written to all the others.
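For RAID 1, the principle behind the last point can be illustrated with a toy model: during a resync, the blocks of one mirror (here the first) are written to all other mirrors until they are identical. This is only a sketch of the idea, not of the actual kernel implementation:

```python
# Toy model of a RAID 1 resync: two mirrors that have diverged.
mirror_a = [b"A", b"B", b"C", b"D"]
mirror_b = [b"A", b"X", b"C", b"Y"]  # blocks 1 and 3 are inconsistent

def resync(source, targets):
    """Copy every block of `source` onto all `targets` (block-wise)."""
    for i, block in enumerate(source):
        for t in targets:
            if t[i] != block:
                t[i] = block  # rewrite the inconsistent block

resync(mirror_a, [mirror_b])
print(mirror_a == mirror_b)  # → True
```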

Force Manual Resync

End Array

root@ubuntumdraidtest:~# mdadm --stop /dev/mdN
mdadm: stopped /dev/mdN

Perform Resync

root@ubuntumdraidtest:~# mdadm --assemble --run --force --update=resync /dev/mdN /dev/sdX1 /dev/sdY1
mdadm: /dev/mdN has been started with 2 drives.
  • For further information see:[2]
  • For a list of commands, see /usr/share/doc-base/mdadm-readme-recipes.

Visualization

cat /proc/mdstat shows:

Every 2.0s: cat /proc/mdstat                                                      Wed Sep 25 15:19:59 2013

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid1 sdb1[0] sdc1[1]
     2095040 blocks super 1.2 [2/2] [UU]
     	[===>.................]  resync = 19.7% (414656/2095040) finish=0.2min speed=138218K/sec

unused devices: <none>
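The finish estimate in this output can be verified by hand: the blocks still to be synchronized, divided by the current speed, give the remaining time. A quick check with the numbers from the output above (mdstat blocks are 1 KiB, speed is in K/sec):

```python
done, total = 414656, 2095040   # blocks (1 KiB each), from the output above
speed = 138218                  # K/sec, from the output above

remaining_kib = total - done
eta_min = remaining_kib / speed / 60

# mdstat truncates the percentage, hence 19.7% in the output above.
print(round(100 * done / total, 1))  # → 19.8
print(round(eta_min, 1))             # → 0.2 (matches finish=0.2min)
```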

References

  1. Linux RAID Wiki: recovery and resync (raid.wiki.kernel.org)
  2. Resync auf einem Software-RAID erzwingen ("Forcing a resync on a software RAID", web.archive.org)

Author: Thomas Niedermeier

Thomas Niedermeier, working in the Knowledge Transfer team at Thomas-Krenn, completed his bachelor's degree in business informatics at the Deggendorf University of Applied Sciences. Thomas has been working for Thomas-Krenn since 2013 and is mainly responsible for the maintenance of the Thomas-Krenn wiki.