Replacing a broken disk in SVM RAID5

It’s inevitable that one day the metacheck script (I strongly encourage you to use it if you don’t already) will report a metadevice problem. In nine cases out of ten the root cause will be a failed disk.

# metastat d90
d90: RAID
    State: Needs Maintenance 
    Invoke: metareplace d90 c0t13d0s2 
    Interlace: 512 blocks
    Size: 335421567 blocks (159 GB)
Original device:
    Size: 335424000 blocks (159 GB)
        Device      Start Block  Dbase        State Reloc  Hot Spare
        c0t11d0s2       8019        No         Okay   Yes 
        c0t12d0s2       7874        No         Okay   Yes 
        c0t13d0s2      10218        No  Maintenance   Yes 
        c0t14d0s2      10218        No         Okay   Yes 
        c0t15d0s2       8019        No         Okay   Yes 
        c0t10d0s2       8019        No         Okay   Yes 
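
If you’d rather not eyeball the table, the failed component can be pulled out of the metastat output with a few lines of shell. This is only a sketch: failed_components is a made-up helper name, and it assumes the column layout shown above (device in the first column, state in the fourth). On Solaris, nawk or /usr/xpg4/bin/awk may be a safer choice than the default awk.

```shell
# Print every component whose State column reads "Maintenance".
# Reads metastat output on stdin, so it can be tried on saved output first.
# Assumes the column order: Device, Start Block, Dbase, State, ...
failed_components() {
    awk '$1 ~ /^c[0-9]+t[0-9]+d[0-9]+s[0-9]+$/ && $4 == "Maintenance" { print $1 }'
}

# Typical use (hypothetical, d90 is the metadevice from this post):
#   metastat d90 | failed_components
```

The header lines ("State: Needs Maintenance", "Invoke: ...") are skipped because their first field doesn’t look like a cXtYdZsN slice name.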

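It’s not a big deal to fix this properly without unmounting the file system: just plug a new disk into an empty slot and run metareplace. But what if all slots are occupied? Well, theoretically it shouldn’t be a problem either: unconfigure the failed disk with cfgadm, swap it for a new one, configure it again and run metareplace.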
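
The textbook hot-swap sequence can be sketched as a tiny dry-run helper. Take it as an illustration rather than a recipe: swap_plan is a hypothetical function that only prints the commands instead of running them, the attachment point format (c0::dsk/c0t13d0) is copied from the cfgadm call in this post, and labeling the new disk is left out.

```shell
# Print (do not run) the hot-swap command sequence for one disk.
# Usage: swap_plan <controller> <disk> <slice> <metadevice>
swap_plan() {
    ctrl=$1 disk=$2 slice=$3 md=$4
    ap="${ctrl}::dsk/${disk}"    # attachment point ID in cfgadm notation
    echo "cfgadm -c unconfigure ${ap}"
    echo "# physically swap the disk now"
    echo "cfgadm -c configure ${ap}"
    echo "metareplace -e ${md} ${disk}${slice}"
}

# Example: swap_plan c0 c0t13d0 s2 d90
```

Printing the plan first makes it easy to double-check the disk name before anything touches the hardware.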

But in my case these steps just didn’t work, and I understand why: my broken disk was still part of an SVM metadevice. In fact, the error message explained it quite clearly:

# cfgadm -c unconfigure c0::dsk/c0t13d0
cfgadm: Component system is busy, try again: failed to offline: 
     Resource                     Information              
------------------  ---------------------------------------
/dev/dsk/c0t13d0s2  component of RAID "/dev/md/dsk/d90"    
/dev/md/dsk/d90     mounted filesystem "/usr/fsrv/archive" 

Since I couldn’t take the filesystem offline and cause downtime, my last resort was to skip the cfgadm steps and simply pull the disk out of its slot. No sooner said than done. Thankfully, it worked smoothly, and once the new disk had found its shelter inside the server the rest was trivial:

# metareplace -e d90 c0t13d0s2

# metastat d90
d90: RAID
    State: Resyncing    
    Resync in progress:  0.0% done
    Interlace: 512 blocks
    Size: 335421567 blocks (159 GB)
Original device:
    Size: 335424000 blocks (159 GB)
        Device      Start Block  Dbase        State Reloc  Hot Spare
        c0t11d0s2       8019        No         Okay   Yes 
        c0t12d0s2       7874        No         Okay   Yes 
        c0t13d0s2      10218        No    Resyncing   Yes 
        c0t14d0s2      10218        No         Okay   Yes 
        c0t15d0s2       8019        No         Okay   Yes 
        c0t10d0s2       8019        No         Okay   Yes 
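
To keep an eye on the resync without rereading the whole table, the percentage can be cut out of the metastat output. Again just a sketch: resync_percent is a made-up name, and it assumes the "Resync in progress: N% done" line looks exactly like the one above.

```shell
# Extract the resync percentage from metastat output read on stdin.
# Prints nothing when no "Resync in progress" line is present.
resync_percent() {
    sed -n 's/.*Resync in progress: *\([0-9.]*\) *% done.*/\1/p'
}

# Possible polling loop (hypothetical; stops once the line disappears):
#   while pct=$(metastat d90 | resync_percent); [ -n "$pct" ]; do
#       echo "d90 resync: ${pct}%"; sleep 60
#   done
```

An empty result only means the progress line is gone, so it’s still worth running metastat once more to confirm the state is back to Okay.
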
Posted on October 27, 2009 at 2:35 pm by sergeyt
In: Solaris
