How do I convert my root disk to RAID1 after installation of Red Hat Enterprise Linux 7?

cah
General of the Army / Fleet Admiral / General of the Air Force
Posts: 1342
Joined: Sun Aug 17, 2008 5:05 am

How do I convert my root disk to RAID1 after installation of Red Hat Enterprise Linux 7?

Post by cah »

After my Solaris 11 server got corrupted, I migrated from Solaris to Linux (CentOS 7.3, derived from RHEL 7.3).

Since I was focused on getting the OS up and running at the time, I didn't get a chance to set up hardware RAID 1. Now that the system has been running for a few months, putting the root disk into hardware RAID is no longer an option, because doing so would destroy the data on the disk. The only way to make it redundant is to set up software RAID.

Here are the steps I took to protect the server by setting up RAID 1 (mirroring) on my boot and root disks.
  1. Gather the partition information from your main disk /dev/sda.

    Code: Select all

    /root%parted /dev/sda u s p
    Model: ATA Hitachi HDS72101 (scsi)
    Disk /dev/sda: 1953525168s
    Sector size (logical/physical): 512B/512B
    Partition Table: msdos
    Disk Flags: 
    
    Number  Start     End          Size         Type     File system  Flags
     1      2048s     2099199s     2097152s     primary  xfs          boot
     2      2099200s  1953523711s  1951424512s  primary               lvm
    
    NOTE: partition 1 is /boot and partition 2 is the LVM physical volume that holds / (root).

    Also, check the current partitions on sdb:

    Code: Select all

    /root%parted /dev/sdb u s p
    Model: ATA ST1000DM003-1CH1 (scsi)
    Disk /dev/sdb: 1953525168s
    Sector size (logical/physical): 512B/4096B
    Partition Table: msdos
    Disk Flags: 
    
    Number  Start   End          Size         Type     File system  Flags
     1      32130s  1953503999s  1953471870s  primary               boot
    
    Note: Both disks have the same number of sectors: 1953525168s.
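
    Since both disks are MBR (msdos) and the same size, an alternative is to clone the whole partition table in one shot with sfdisk rather than re-creating each partition by hand. A minimal sketch (the dump filename is arbitrary; review it before replaying):

    Code: Select all

    /root%sfdisk -d /dev/sda > sda-table.dump    # dump the source disk's partition layout
    /root%sfdisk /dev/sdb < sda-table.dump       # replay the same layout onto the new disk
    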
  2. Using the start and end sectors, reproduce the partitioning scheme from the previous command on the new unused disk.

    Code: Select all

    /root%parted /dev/sdb mklabel msdos --> partition table is already msdos, this step can be skipped
    /root%parted /dev/sdb mkpart primary 2048s 2099199s
    Warning: You requested a partition from 1049kB to 1075MB (sectors
    2048..2099199).
    The closest location we can manage is 1049kB to 16.5MB (sectors 2048..32129).
    Is this still acceptable to you?
    Yes/No? No
    
    I figured I had to remove the existing partition before I could set it up properly.

    Code: Select all

    /root%parted /dev/sdb      
    GNU Parted 3.1
    Using /dev/sdb
    Welcome to GNU Parted! Type 'help' to view a list of commands.
    (parted) help                                                             
      align-check TYPE N                        check partition N for TYPE(min|opt)
            alignment
      help [COMMAND]                           print general help, or help on
            COMMAND
      mklabel,mktable LABEL-TYPE               create a new disklabel (partition
            table)
      mkpart PART-TYPE [FS-TYPE] START END     make a partition
      name NUMBER NAME                         name partition NUMBER as NAME
      print [devices|free|list,all|NUMBER]     display the partition table,
            available devices, free space, all found partitions, or a particular
            partition
      quit                                     exit program
      rescue START END                         rescue a lost partition near START
            and END
      rm NUMBER                                delete partition NUMBER
      select DEVICE                            choose the device to edit
      disk_set FLAG STATE                      change the FLAG on selected device
      disk_toggle [FLAG]                       toggle the state of FLAG on selected
            device
      set NUMBER FLAG STATE                    change the FLAG on partition NUMBER
      toggle [NUMBER [FLAG]]                   toggle the state of FLAG on partition
            NUMBER
      unit UNIT                                set the default unit to UNIT
      version                                  display the version number and
            copyright information of GNU Parted
    (parted) rm 1                                                             
    (parted) u s p                                                            
    Model: ATA ST1000DM003-1CH1 (scsi)
    Disk /dev/sdb: 1953525168s
    Sector size (logical/physical): 512B/4096B
    Partition Table: msdos
    Disk Flags: 
    
    Number  Start  End  Size  Type  File system  Flags
    
    (parted) q                                                                
    Information: You may need to update /etc/fstab.
    
    Then, I was able to run the command successfully.

    Code: Select all

    /root%parted /dev/sdb mkpart primary 2048s 2099199s
    Information: You may need to update /etc/fstab.
    
    A quick check:

    Code: Select all

    root%parted /dev/sdb u s p                                               
    Model: ATA ST1000DM003-1CH1 (scsi)
    Disk /dev/sdb: 1953525168s
    Sector size (logical/physical): 512B/4096B
    Partition Table: msdos
    Disk Flags: 
    
    Number  Start  End       Size      Type     File system  Flags
     1      2048s  2099199s  2097152s  primary
     
    Looking good. Move on to the next partition.

    Code: Select all

    # parted /dev/sdb mkpart primary 2099200s  1953523711s
    
    Status check:

    Code: Select all

    /root%parted /dev/sdb u s p                                               
    Model: ATA ST1000DM003-1CH1 (scsi)
    Disk /dev/sdb: 1953525168s
    Sector size (logical/physical): 512B/4096B
    Partition Table: msdos
    Disk Flags: 
    
    Number  Start     End          Size         Type     File system  Flags
     1      2048s     2099199s     2097152s     primary
     2      2099200s  1953523711s  1951424512s  primary
     
  3. Add the RAID flag on all partitions that will be mirrored.

    Code: Select all

    /root%parted /dev/sda set 1 raid on                                       
    Information: You may need to update /etc/fstab.
    /root%parted /dev/sda set 2 raid on                                       
    Information: You may need to update /etc/fstab.
    /root%parted /dev/sdb set 1 raid on
    Information: You may need to update /etc/fstab.
    /root%parted /dev/sdb set 2 raid on                                       
    Information: You may need to update /etc/fstab.
    
  4. Create a degraded RAID device on the first partition of the new disk. This will be used for your boot partition (/boot). NOTE: Use the --metadata=1.0 option for the array that stores /boot, otherwise the bootloader will not be able to read it; version-1.0 superblocks sit at the end of the partition, leaving the filesystem at its usual offset.

    Code: Select all

    /root%mdadm --create /dev/md0 --level=1 --raid-disks=2 missing /dev/sdb1 --metadata=1.0
    mdadm: array /dev/md0 started.
    
    NOTE: --level=1 means it creates a RAID 1 (mirror) array.
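
    A quick way to confirm the degraded array is assembled is the kernel's md status file, which lists every active array and its members:

    Code: Select all

    /root%cat /proc/mdstat
    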
  5. Create the same filesystem on the new degraded RAID array /dev/md0 as exists on /dev/sda1. XFS is the default filesystem in Red Hat Enterprise Linux 7.

    Code: Select all

    /root%mkfs.xfs /dev/md0 
    meta-data=/dev/md0               isize=512    agcount=4, agsize=65532 blks
             =                       sectsz=4096  attr=2, projid32bit=1
             =                       crc=1        finobt=0, sparse=0
    data     =                       bsize=4096   blocks=262128, imaxpct=25
             =                       sunit=0      swidth=0 blks
    naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
    log      =internal log           bsize=4096   blocks=1605, version=2
             =                       sectsz=4096  sunit=1 blks, lazy-count=1
    realtime =none                   extsz=4096   blocks=0, rtextents=0
    
  6. Mount the new RAID array and copy over the files from /boot.

    Code: Select all

    /root%mkdir /mnt/md0
    /root%mount /dev/md0 /mnt/md0
    /root%rsync -a /boot/ /mnt/md0/
    /root%sync
    /root%umount /mnt/md0
    /root%rmdir /mnt/md0
    
  7. Unmount the current /boot, and mount the new RAID volume there.

    Code: Select all

    /root%umount /boot
    /root%mount /dev/md0 /boot
    
  8. Add the old disk's boot partition (/dev/sda1) to the new array to complete the mirror.

    Code: Select all

    /root%mdadm /dev/md0 -a /dev/sda1
    mdadm: added /dev/sda1
    
  9. Monitor the RAID status and wait for the recovery to complete.

    Code: Select all

    /root%mdadm -D /dev/md0 
    /dev/md0:
            Version : 1.0
      Creation Time : Wed Aug 30 15:30:37 2017
         Raid Level : raid1
         Array Size : 1048512 (1023.94 MiB 1073.68 MB)
      Used Dev Size : 1048512 (1023.94 MiB 1073.68 MB)
       Raid Devices : 2
      Total Devices : 2
        Persistence : Superblock is persistent
    
        Update Time : Wed Aug 30 15:34:15 2017
              State : clean 
     Active Devices : 2
    Working Devices : 2
     Failed Devices : 0
      Spare Devices : 0
    
               Name : hsiao.net:0  (local to host hsiao.net)
               UUID : a15cbc22:c27202f8:31c08617:3def65df
             Events : 33
    
        Number   Major   Minor   RaidDevice State
           2       8        1        0      active sync   /dev/sda1
           1       8       17        1      active sync   /dev/sdb1
    
  10. Find the UUID of /dev/md0 using blkid.

    Code: Select all

    /root%blkid  |grep md0
    /dev/md0: UUID="6767ec6d-5437-45ed-a04b-208aef0c4c55" TYPE="xfs"
    
  11. Update the /etc/fstab file with the new device for /boot (comment out the old line and add the new UUID line).

    Code: Select all

    /root%grep boot /etc/fstab
    #UUID=c46975ae-27fa-4785-b024-4ceec84f9f61 /boot                   xfs     defaults        0 0
    UUID=6767ec6d-5437-45ed-a04b-208aef0c4c55 /boot                   xfs     defaults        0 0
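
    One way to prove the new entry is what actually mounts /boot is to remount it from fstab and check (findmnt is part of util-linux on RHEL 7):

    Code: Select all

    /root%umount /boot
    /root%mount /boot       # resolves the device from /etc/fstab
    /root%findmnt /boot     # the source should now be /dev/md0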
    
  12. Create a degraded RAID device on the second partition of the new disk. This will be used for your LVM partition (/).

    Code: Select all

    /root%mdadm --create /dev/md1 --level=1 --raid-disks=2 missing /dev/sdb2 --metadata=1.2
    mdadm: array /dev/md1 started.
    
    NOTE: With metadata 1.2, the array came out smaller: 930.39 GiB (238178 extents) instead of the 930.51 GiB (238209 extents) of /dev/sda2. As a result, 'pvmove /dev/sda2 /dev/md1' failed with "Insufficient free space".

    Instead, I had to use metadata 1.0 to get md1 to the correct size.

    Code: Select all

    /root%mdadm --create /dev/md1 --level=1 --raid-disks=2 missing /dev/sdb2 --metadata=1.0
    mdadm: /dev/sdb2 appears to be part of a raid array:
           level=raid1 devices=2 ctime=Wed Aug 30 15:38:05 2017
    Continue creating array? y
    mdadm: array /dev/md1 started.
    
    NOTE 1: --level=1 means it creates a RAID 1 (mirror) array.
    NOTE 2: The default metadata version for --create is 1.2. Apparently, 1.0 was needed in my case.
  13. Add this new array to your LVM stack by extending the existing volume group onto it.

    Code: Select all

    /root%vgextend cl /dev/md1 
      Physical volume "/dev/md1" successfully created.
      Volume group "cl" successfully extended
    
  14. Move the physical extents from the old partition to the new array.

    Code: Select all

    /root%pvmove /dev/sda2 /dev/md1
      Insufficient free space: 238209 extents needed, but only 238178 available
      Unable to allocate mirror extents for pvmove0.
      Failed to convert pvmove LV to mirrored
    
    After switching to metadata 1.0, I was able to run the pvmove command.

    Code: Select all

    /root%pvmove /dev/sda2 /dev/md1
      /dev/sda2: Moved: 0.00%
      /dev/sda2: Moved: 0.15%
      /dev/sda2: Moved: 0.30%
      /dev/sda2: Moved: 0.42%
      /dev/sda2: Moved: 0.56%
      /dev/sda2: Moved: 0.71%
      /dev/sda2: Moved: 0.86%
      /dev/sda2: Moved: 1.01%
      /dev/sda2: Moved: 1.16%
      /dev/sda2: Moved: 1.31%
      /dev/sda2: Moved: 1.46%
      /dev/sda2: Moved: 1.61%
    .....
    
    This will take some time to complete. It started around 13:15 on 08/30/2017 and progressed at roughly 0.01% per second or slower, so the pvmove took about 3 hours to finish.
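
    While pvmove runs, progress can also be watched from another terminal; the -a flag of lvs shows hidden volumes such as the temporary [pvmove0] mirror:

    Code: Select all

    /root%lvs -a -o name,copy_percent,devices cl
    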
  15. Remove the old partition from the volume group and LVM stack.

    Code: Select all

    /root%vgreduce cl /dev/sda2 
      Removed "/dev/sda2" from volume group "cl"
    /root%pvremove /dev/sda2 
      Labels on physical volume "/dev/sda2" successfully wiped.
    
  16. At the moment there are many known issues with the lvmetad cache in RHEL 7, so it's better to disable it.
    Modify use_lvmetad as below in /etc/lvm/lvm.conf:

    Code: Select all

    vi /etc/lvm/lvm.conf
    
            # Changed from 1 to 0 to disable lvmetad cache - CAH 08/30/2017
            #use_lvmetad = 1
            use_lvmetad = 0
    
    Stop lvm2-lvmetad service.

    Code: Select all

    /boot%systemctl stop lvm2-lvmetad.service
    Warning: Stopping lvm2-lvmetad.service, but it can still be activated by:
      lvm2-lvmetad.socket
    /boot%systemctl disable lvm2-lvmetad.service
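
    The warning above shows the socket unit can still re-activate the service, so stop and disable the socket as well (these are the standard unit names on RHEL 7):

    Code: Select all

    /boot%systemctl stop lvm2-lvmetad.socket
    /boot%systemctl disable lvm2-lvmetad.socket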
    
  17. Add the old partition to the degraded array to complete the mirror.
    Check /dev/md1 status first:

    Code: Select all

    /root%mdadm -D /dev/md1
    /dev/md1:
            Version : 1.0
      Creation Time : Wed Aug 30 16:13:10 2017
         Raid Level : raid1
         Array Size : 975712064 (930.51 GiB 999.13 GB)
      Used Dev Size : 975712064 (930.51 GiB 999.13 GB)
       Raid Devices : 2
      Total Devices : 1
        Persistence : Superblock is persistent
    
      Intent Bitmap : Internal
    
        Update Time : Wed Aug 30 19:19:48 2017
              State : clean, degraded 
     Active Devices : 1
    Working Devices : 1
     Failed Devices : 0
      Spare Devices : 0
    
               Name : hsiao.net:1  (local to host hsiao.net)
               UUID : 73c0656c:37edea8c:1fd11ddb:d2683a2e
             Events : 114
    
        Number   Major   Minor   RaidDevice State
           -       0        0        0      removed
           1       8       18        1      active sync   /dev/sdb2
    
    Add /dev/sda2 to the array:

    Code: Select all

    /root%mdadm /dev/md1 -a /dev/sda2
    mdadm: added /dev/sda2
    
  18. Monitor the RAID status and wait for the recovery to complete.

    Code: Select all

    /root%mdadm -D /dev/md1          
    /dev/md1:
            Version : 1.0
      Creation Time : Wed Aug 30 16:13:10 2017
         Raid Level : raid1
         Array Size : 975712064 (930.51 GiB 999.13 GB)
      Used Dev Size : 975712064 (930.51 GiB 999.13 GB)
       Raid Devices : 2
      Total Devices : 2
        Persistence : Superblock is persistent
    
      Intent Bitmap : Internal
    
        Update Time : Wed Aug 30 19:20:04 2017
              State : clean, degraded, recovering 
     Active Devices : 1
    Working Devices : 2
     Failed Devices : 0
      Spare Devices : 1
    
     Rebuild Status : 0% complete
    
               Name : hsiao.net:1  (local to host hsiao.net)
               UUID : 73c0656c:37edea8c:1fd11ddb:d2683a2e
             Events : 123
    
        Number   Major   Minor   RaidDevice State
           2       8        2        0      spare rebuilding   /dev/sda2
           1       8       18        1      active sync   /dev/sdb2
    
    This will take some time to rebuild; again, roughly 2 to 3 hours.
    Here is the status after the spare rebuilding is done:

    Code: Select all

    /root%mdadm -D /dev/md1
    /dev/md1:
            Version : 1.0
      Creation Time : Wed Aug 30 16:13:10 2017
         Raid Level : raid1
         Array Size : 975712064 (930.51 GiB 999.13 GB)
      Used Dev Size : 975712064 (930.51 GiB 999.13 GB)
       Raid Devices : 2
      Total Devices : 2
        Persistence : Superblock is persistent
    
      Intent Bitmap : Internal
    
        Update Time : Thu Aug 31 00:15:42 2017
              State : clean 
     Active Devices : 2
    Working Devices : 2
     Failed Devices : 0
      Spare Devices : 0
    
               Name : hsiao.net:1  (local to host hsiao.net)
               UUID : 73c0656c:37edea8c:1fd11ddb:d2683a2e
             Events : 4429
    
        Number   Major   Minor   RaidDevice State
           2       8        2        0      active sync   /dev/sda2
           1       8       18        1      active sync   /dev/sdb2
    
  19. Scan the mdadm metadata and write the RAID information to /etc/mdadm.conf.

    Code: Select all

    /root%ls -l /etc/mdadm.conf
    ls: cannot access /etc/mdadm.conf: No such file or directory
    /root%mdadm --examine --scan >/etc/mdadm.conf
    /root%ls -l /etc/mdadm.conf                  
    -rw-r--r-- 1 root root 176 Aug 30 19:25 /etc/mdadm.conf
    
    /root%cat /etc/mdadm.conf  
    ARRAY /dev/md/0  metadata=1.0 UUID=a15cbc22:c27202f8:31c08617:3def65df name=hsiao.net:0
    ARRAY /dev/md/1  metadata=1.0 UUID=73c0656c:37edea8c:1fd11ddb:d2683a2e name=hsiao.net:1
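
    As a cross-check, the running arrays should report matching ARRAY lines; mdadm --detail --scan queries the active arrays rather than the on-disk superblocks:

    Code: Select all

    /root%mdadm --detail --scan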
    
  20. Update /etc/default/grub with the MD device UUIDs. (This info is available once md0/md1 have been created.)

    Code: Select all

    /boot%mdadm -D /dev/md* |grep UUID
               UUID : a15cbc22:c27202f8:31c08617:3def65df
               UUID : 73c0656c:37edea8c:1fd11ddb:d2683a2e
    
    /boot%grep GRUB_CMDLINE_LINUX /etc/default/grub 
    #GRUB_CMDLINE_LINUX="ipv6.disable=1 crashkernel=auto rd.lvm.lv=cl/root rd.lvm.lv=cl/swap rhgb quiet"
    GRUB_CMDLINE_LINUX="rd.md.uuid=a15cbc22:c27202f8:31c08617:3def65df rd.md.uuid=73c0656c:37edea8c:1fd11ddb:d2683a2e ipv6.disable=1 crashkernel=auto rd.lvm.lv=cl/root rd.lvm.lv=cl/swap rhgb quiet"
    
  21. Regenerate grub.cfg.

    Code: Select all

    /boot/grub2%cp -p grub.cfg grub.cfg.20170309
    
    /boot/grub2%grub2-mkconfig -o /boot/grub2/grub.cfg 
    Generating grub configuration file ...
      WARNING: Not using lvmetad because config setting use_lvmetad=0.
      WARNING: To avoid corruption, rescan devices to make changes visible (pvscan --cache).
      WARNING: Not using lvmetad because config setting use_lvmetad=0.
      WARNING: To avoid corruption, rescan devices to make changes visible (pvscan --cache).
    /usr/sbin/grub2-probe: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
    Found linux image: /boot/vmlinuz-3.10.0-514.el7.x86_64
    Found initrd image: /boot/initramfs-3.10.0-514.el7.x86_64.img
    /usr/sbin/grub2-probe: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
    /usr/sbin/grub2-probe: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
    /usr/sbin/grub2-probe: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
    Found linux image: /boot/vmlinuz-0-rescue-acac49b8c0944ac189fdf455a5e7d7c0
    Found initrd image: /boot/initramfs-0-rescue-acac49b8c0944ac189fdf455a5e7d7c0.img
      WARNING: Not using lvmetad because config setting use_lvmetad=0.
      WARNING: To avoid corruption, rescan devices to make changes visible (pvscan --cache).
      WARNING: Not using lvmetad because config setting use_lvmetad=0.
      WARNING: To avoid corruption, rescan devices to make changes visible (pvscan --cache).
    /usr/sbin/grub2-probe: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
    done
    
  22. Verify that both of your disks are listed in /boot/grub2/device.map. Add them if needed.

    Code: Select all

    /boot/grub2%cat /boot/grub2/device.map
    # this device map was generated by anaconda
    (hd0)      /dev/sda
    
    /boot/grub2%cp -p device.map device.map.20170220
    
    /boot/grub2%vi device.map
    
    /boot/grub2%cat /boot/grub2/device.map          
    # this device map was generated by anaconda
    (hd0)      /dev/sda
    (hd1)      /dev/sdb
    
  23. Re-install GRUB on both disks.
    NOTE: Use --metadata=0.9 for the boot array if the command below fails with an error like "cannot find /dev/md0 in /dev/sd* device".

    Code: Select all

    /boot/grub2%grub2-install /dev/sda
    Installing for i386-pc platform.
    grub2-install: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
    grub2-install: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
    Installation finished. No error reported.
    
    /boot/grub2%grub2-install /dev/sdb
    Installing for i386-pc platform.
    grub2-install: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
    grub2-install: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
    Installation finished. No error reported.
    
  24. Rebuild the initramfs image with --mdadmconf.
    It is recommended to make a backup copy of the initramfs first, in case the new version has an unexpected problem:

    Code: Select all

    /boot%cp -p initramfs-3.10.0-514.el7.x86_64.img initramfs-3.10.0-514.el7.x86_64.img.20170220
    
    /boot%dracut -f --mdadmconf
    
    -rw------- 1 root root 31407076 Aug 30 19:45 initramfs-3.10.0-514.el7.x86_64.img
    -rw------- 1 root root 31446130 Feb 20  2017 initramfs-3.10.0-514.el7.x86_64.img.20170220
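
    To verify the rebuilt image actually picked up the mdraid bits, its contents can be listed with lsinitrd (shipped with dracut):

    Code: Select all

    /boot%lsinitrd initramfs-3.10.0-514.el7.x86_64.img | grep -i mdadm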
    
  25. Reboot the machine to make sure everything is correctly utilizing the new software RAID devices.

    One may pull the original /dev/sda out and boot up to see whether /dev/md0 (still containing the former /dev/sdb) comes up as expected. It should. After pulling the original /dev/sda, the status shows one member missing, and the former /dev/sdb1 becomes /dev/sda1:

    Code: Select all

    # mdadm -D /dev/md0
    .....
    /dev/md0:
               Version : 1.0
         Creation Time : Thu Aug 17 18:10:36 2017
            Raid Level : raid1
            Array Size : 511936 (499.94 MiB 524.22 MB)
         Used Dev Size : 511936 (499.94 MiB 524.22 MB)
          Raid Devices : 2
         Total Devices : 1
           Persistence : Superblock is persistent
    
           Update Time : Thu Aug 17 19:33:32 2017
                 State : clean, degraded 
        Active Devices : 1
       Working Devices : 1
        Failed Devices : 0
         Spare Devices : 0
    
    Consistency Policy : unknown
    
                  Name : test.lab.msp.redhat.com:0  (local to host test.lab.msp.redhat.com)
                  UUID : 78046e00:70413e71:4b0009b7:4ee5e2f9
                Events : 50
    
        Number   Major   Minor   RaidDevice State
           -       0        0        0      removed
           1       8        1        1      active sync   /dev/sda1
    .....
    
    The original disk will then need to be added back to the RAID arrays (/dev/md0 and /dev/md1):

    Code: Select all

    # mdadm /dev/md0 -a /dev/sdb1
    # mdadm /dev/md1 -a /dev/sdb2
    
    NOTE: Be careful, remember sdb1 is for /boot while sdb2 is for /.
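
    A quick lsblk beforehand helps confirm which partition is which before adding them back:

    Code: Select all

    # lsblk -o NAME,SIZE,TYPE,MOUNTPOINT /dev/sda /dev/sdb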

    Check the status of the raid rebuild:

    Code: Select all

    <snip>
     [root@test ~]# mdadm -D /dev/md0
    /dev/md0:
               Version : 1.0
         Creation Time : Thu Aug 17 18:10:36 2017
            Raid Level : raid1
            Array Size : 511936 (499.94 MiB 524.22 MB)
         Used Dev Size : 511936 (499.94 MiB 524.22 MB)
          Raid Devices : 2
         Total Devices : 2
           Persistence : Superblock is persistent
    
           Update Time : Thu Aug 17 19:35:11 2017
                 State : clean, degraded, resyncing (DELAYED) 
        Active Devices : 1
       Working Devices : 2
        Failed Devices : 0
         Spare Devices : 1
    
    Consistency Policy : unknown
    
                  Name : test.lab.msp.redhat.com:0  (local to host test.lab.msp.redhat.com)
                  UUID : 78046e00:70413e71:4b0009b7:4ee5e2f9
                Events : 52
    
        Number   Major   Minor   RaidDevice State
           2       8       17        0      spare rebuilding   /dev/sdb1
           1       8        1        1      active sync   /dev/sda1
    </snip>
    
    When completed, it should look normal again:

    Code: Select all

    <snip>
     
    /dev/md0:
               Version : 1.0
         Creation Time : Thu Aug 17 18:10:36 2017
            Raid Level : raid1
            Array Size : 511936 (499.94 MiB 524.22 MB)
         Used Dev Size : 511936 (499.94 MiB 524.22 MB)
          Raid Devices : 2
         Total Devices : 2
           Persistence : Superblock is persistent
    
           Update Time : Thu Aug 17 19:36:39 2017
                 State : clean 
        Active Devices : 2
       Working Devices : 2
        Failed Devices : 0
         Spare Devices : 0
    
    Consistency Policy : unknown
    
                  Name : test.lab.msp.redhat.com:0  (local to host test.lab.msp.redhat.com)
                  UUID : 78046e00:70413e71:4b0009b7:4ee5e2f9
                Events : 69
    
        Number   Major   Minor   RaidDevice State
           2       8       17        0      active sync   /dev/sdb1
           1       8        1        1      active sync   /dev/sda1
     
    </snip>
    
How to remove a RAID device?
To remove an existing RAID device, first deactivate it by running the following command as root:

Code: Select all

mdadm --stop <raid_device>
Once deactivated, remove the RAID device itself:

Code: Select all

mdadm --remove <raid_device>
Finally, zero superblocks on all devices that were associated with the particular array:

Code: Select all

mdadm --zero-superblock <component_device…>
Example:

Code: Select all

~]# mdadm --detail /dev/md3 | tail -n 4
    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1
       2       8       33        2      active sync   /dev/sdc1
In order to remove this device, first stop it by typing the following at a shell prompt:

Code: Select all

~]# mdadm --stop /dev/md3
mdadm: stopped /dev/md3
Once stopped, you can remove the /dev/md3 device by running the following command:

Code: Select all

~]# mdadm --remove /dev/md3
Finally, to remove the superblocks from all associated devices, type:

Code: Select all

~]# mdadm --zero-superblock /dev/sda1 /dev/sdb1 /dev/sdc1
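
Before zeroing, it can be worth confirming that a device really carries an md superblock; mdadm --examine prints the superblock details if one is present:

Code: Select all

~]# mdadm --examine /dev/sda1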
CAH, The Great
cah
General of the Army / Fleet Admiral / General of the Air Force
Posts: 1342
Joined: Sun Aug 17, 2008 5:05 am

Replacing a failed HDD in the software RAID1 on RHEL/CentOS 7

Post by cah »

Got the new Seagate Barracuda 2 TB HDD today (09/19/2019).
I put it into the computer and it was recognized by the BIOS right away. The previous Hitachi I ordered last week was definitely defective.
After the BIOS saw both HDDs, I went ahead and booted up the server.

Check the hardware list (focusing on disk1 and disk2)

Code: Select all

# lshw
...
           *-disk:0
                description: ATA Disk
                product: Hitachi HDS72101
                vendor: Hitachi
                physical id: 1
                bus info: scsi@1:0.0.0
                logical name: /dev/sda
                version: A3EA
                serial: JP2940HD0NEKAC
                size: 931GiB (1TB)
                capabilities: partitioned partitioned:dos
                configuration: ansiversion=5 logicalsectorsize=512 sectorsize=512 signature=0003ff21
              *-volume:0
                   description: Linux raid autodetect partition
                   physical id: 1
                   bus info: scsi@1:0.0.0,1
                   logical name: /dev/sda1
                   capacity: 1GiB
                   capabilities: primary bootable multi
              *-volume:1
                   description: Linux raid autodetect partition
                   physical id: 2
                   bus info: scsi@1:0.0.0,2
                   logical name: /dev/sda2
                   serial: yFI7pm-YF1z-5F8d-gCaT-45oT-uKvg-HJMhnN
                   size: 930GiB
                   capacity: 930GiB
                   capabilities: primary multi lvm2
           *-disk:1
                description: ATA Disk
                product: ST2000DM008-2FR1
                physical id: 0.0.0
                bus info: scsi@2:0.0.0
                logical name: /dev/sdb
                version: 0001
                serial: ZFL0FF1D
                size: 1863GiB (2TB)
                configuration: ansiversion=5 logicalsectorsize=512 sectorsize=4096
Check partitions on disk1 (/dev/sda)

Code: Select all

/root%parted /dev/sda u s p
Model: ATA Hitachi HDS72101 (scsi)
Disk /dev/sda: 1953525168s
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags: 

Number  Start     End          Size         Type     File system  Flags
 1      2048s     2099199s     2097152s     primary  xfs          boot, raid
 2      2099200s  1953523711s  1951424512s  primary               raid
 
Check partitions on disk2 (/dev/sdb)

Code: Select all

/root%parted /dev/sdb u s p
Error: /dev/sdb: unrecognised disk label
Model: ATA ST2000DM008-2FR1 (scsi)                                        
Disk /dev/sdb: 3907029168s
Sector size (logical/physical): 512B/4096B
Partition Table: unknown
Disk Flags:


NOTE: This new HDD has 2 TB (3907029168s). After recreating the same 2 partitions (ending at sector 1953523711), there are still 1953505456 sectors left to use.
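
The leftover space can be seen with parted's free listing:

Code: Select all

/root%parted /dev/sdb u s p free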

Label the new HDD msdos to match the existing HDD in the mirror.

Code: Select all

/root%parted /dev/sdb mklabel msdos 
Information: You may need to update /etc/fstab.
Create the first partition (/boot) on the new HDD

Code: Select all

/root%parted /dev/sdb mkpart primary 2048s 2099199s                       
Information: You may need to update /etc/fstab.
Check the partition(s) on the new HDD (disk2, /dev/sdb)

Code: Select all

/root%parted /dev/sdb u s p                                               
Model: ATA ST2000DM008-2FR1 (scsi)
Disk /dev/sdb: 3907029168s
Sector size (logical/physical): 512B/4096B
Partition Table: msdos
Disk Flags: 

Number  Start  End       Size      Type     File system  Flags
 1      2048s  2099199s  2097152s  primary
Create the 2nd partition

Code: Select all

/root%parted /dev/sdb mkpart primary 2099200s  1953523711s
Information: You may need to update /etc/fstab.
Check the partition(s) again

Code: Select all

/root%parted /dev/sdb u s p                                               
Model: ATA ST2000DM008-2FR1 (scsi)
Disk /dev/sdb: 3907029168s
Sector size (logical/physical): 512B/4096B
Partition Table: msdos
Disk Flags: 

Number  Start     End          Size         Type     File system  Flags
 1      2048s     2099199s     2097152s     primary
 2      2099200s  1953523711s  1951424512s  primary
Install grub2 on disk2

Code: Select all

/root%grub2-install /dev/sdb
Installing for i386-pc platform.
grub2-install: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
grub2-install: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
Installation finished. No error reported.
Check the details on the first mirror for /boot

Code: Select all

/root%mdadm -D /dev/md0
/dev/md0:
           Version : 1.0
     Creation Time : Wed Aug 30 15:30:37 2017
        Raid Level : raid1
        Array Size : 1048512 (1023.94 MiB 1073.68 MB)
     Used Dev Size : 1048512 (1023.94 MiB 1073.68 MB)
      Raid Devices : 2
     Total Devices : 1
       Persistence : Superblock is persistent

       Update Time : Fri Sep 20 00:40:24 2019
             State : clean, degraded 
    Active Devices : 1
   Working Devices : 1
    Failed Devices : 0
     Spare Devices : 0

Consistency Policy : resync

              Name : hsiao.net:0  (local to host hsiao.net)
              UUID : a15cbc22:c27202f8:31c08617:3def65df
            Events : 453

    Number   Major   Minor   RaidDevice State
       2       8        1        0      active sync   /dev/sda1
       -       0        0        1      removed
The mirror showed /dev/sdb1 was removed.

Add /dev/sdb1 to the first mirror (/dev/md0)

Code: Select all

/root%mdadm /dev/md0 -a /dev/sdb1
mdadm: added /dev/sdb1
Check the details on the first mirror for /boot again

Code: Select all

/root%mdadm -D /dev/md0          
/dev/md0:
           Version : 1.0
     Creation Time : Wed Aug 30 15:30:37 2017
        Raid Level : raid1
        Array Size : 1048512 (1023.94 MiB 1073.68 MB)
     Used Dev Size : 1048512 (1023.94 MiB 1073.68 MB)
      Raid Devices : 2
     Total Devices : 2
       Persistence : Superblock is persistent

       Update Time : Fri Sep 20 00:41:18 2019
             State : clean 
    Active Devices : 2
   Working Devices : 2
    Failed Devices : 0
     Spare Devices : 0

Consistency Policy : resync

              Name : hsiao.net:0  (local to host hsiao.net)
              UUID : a15cbc22:c27202f8:31c08617:3def65df
            Events : 472

    Number   Major   Minor   RaidDevice State
       2       8        1        0      active sync   /dev/sda1
       3       8       17        1      active sync   /dev/sdb1
Since the partition is very small, the resync completed before I checked the details again.

Check the details on the second mirror, which holds the LVM PV for /

Code: Select all

/root%mdadm -D /dev/md1          
/dev/md1:
           Version : 1.0
     Creation Time : Wed Aug 30 16:13:10 2017
        Raid Level : raid1
        Array Size : 975712064 (930.51 GiB 999.13 GB)
     Used Dev Size : 975712064 (930.51 GiB 999.13 GB)
      Raid Devices : 2
     Total Devices : 1
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Fri Sep 20 00:41:52 2019
             State : clean, degraded 
    Active Devices : 1
   Working Devices : 1
    Failed Devices : 0
     Spare Devices : 0

Consistency Policy : bitmap

              Name : hsiao.net:1  (local to host hsiao.net)
              UUID : 73c0656c:37edea8c:1fd11ddb:d2683a2e
            Events : 302490

    Number   Major   Minor   RaidDevice State
       2       8        2        0      active sync   /dev/sda2
       -       0        0        1      removed
The mirror showed /dev/sdb2 was removed.

Add /dev/sdb2 to the second mirror (/dev/md1)

Code: Select all

/root%mdadm /dev/md1 -a /dev/sdb2
mdadm: added /dev/sdb2
/root%mdadm -D /dev/md1          
/dev/md1:
           Version : 1.0
     Creation Time : Wed Aug 30 16:13:10 2017
        Raid Level : raid1
        Array Size : 975712064 (930.51 GiB 999.13 GB)
     Used Dev Size : 975712064 (930.51 GiB 999.13 GB)
      Raid Devices : 2
     Total Devices : 2
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Fri Sep 20 00:42:27 2019
             State : active, degraded, recovering 
    Active Devices : 1
   Working Devices : 2
    Failed Devices : 0
     Spare Devices : 1

Consistency Policy : bitmap

    Rebuild Status : 0% complete

              Name : hsiao.net:1  (local to host hsiao.net)
              UUID : 73c0656c:37edea8c:1fd11ddb:d2683a2e
            Events : 302505

    Number   Major   Minor   RaidDevice State
       2       8        2        0      active sync   /dev/sda2
       3       8       18        1      spare rebuilding   /dev/sdb2
It showed the mirror rebuilding, starting at 0%.
After 2 hours, it was about 60% done.

Code: Select all

/root%mdadm -D /dev/md1
/dev/md1:
           Version : 1.0
     Creation Time : Wed Aug 30 16:13:10 2017
        Raid Level : raid1
        Array Size : 975712064 (930.51 GiB 999.13 GB)
     Used Dev Size : 975712064 (930.51 GiB 999.13 GB)
      Raid Devices : 2
     Total Devices : 2
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Fri Sep 20 02:45:59 2019
             State : clean, degraded, recovering 
    Active Devices : 1
   Working Devices : 2
    Failed Devices : 0
     Spare Devices : 1

Consistency Policy : bitmap

    Rebuild Status : 61% complete

              Name : hsiao.net:1  (local to host hsiao.net)
              UUID : 73c0656c:37edea8c:1fd11ddb:d2683a2e
            Events : 305507

    Number   Major   Minor   RaidDevice State
       2       8        2        0      active sync   /dev/sda2
       3       8       18        1      spare rebuilding   /dev/sdb2
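For a lighter-weight progress check, /proc/mdstat also reports the percent complete along with an estimated finish time and rebuild speed:

Code: Select all

/root%cat /proc/mdstat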
I will test booting from /dev/sdb when the mirror rebuilding is completed.
CAH, The Great