Replacing a broken disk in SVM RAID5

It’s ineluctable that one day the metacheck script (I strongly encourage you to use it if you don’t already) will report a metadevice problem. In nine cases out of ten the root cause will be a failed disk.
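
By the way, if metacheck isn’t part of your routine yet, a daily cron entry is all it takes. A minimal sketch, assuming you saved Sun’s sample metacheck.sh under /usr/local/bin (the path is just my habit, adjust to taste):

# crontab -l | grep metacheck
0 8 * * * /usr/local/bin/metacheck.sh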

# metastat d90
d90: RAID
    State: Needs Maintenance 
    Invoke: metareplace d90 c0t13d0s2 
    Interlace: 512 blocks
    Size: 335421567 blocks (159 GB)
Original device:
    Size: 335424000 blocks (159 GB)
        Device      Start Block  Dbase        State Reloc  Hot Spare
        c0t11d0s2       8019        No         Okay   Yes 
        c0t12d0s2       7874        No         Okay   Yes 
        c0t13d0s2      10218        No  Maintenance   Yes 
        c0t14d0s2      10218        No         Okay   Yes 
        c0t15d0s2       8019        No         Okay   Yes 
        c0t10d0s2       8019        No         Okay   Yes 

Not a big deal to fix it correctly without unmounting the file system. Just plug a new disk into an empty slot and do a metareplace. But what if all the slots are occupied? Well, theoretically that shouldn’t be a problem either:

  • Do cfgadm -c unconfigure c#::dsk/c#t#d# to unconfigure the disk.
  • Replace it with a new one and use cfgadm -c configure c#::dsk/c#t#d# to make it visible to the system.
  • Don’t forget about metadevadm -u c#t#d#s#.
  • Finally, run metareplace to initiate resyncing (the full sequence is spelled out below).
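
Using the failed disk from my example above, the whole dance would look roughly like this:

# cfgadm -c unconfigure c0::dsk/c0t13d0
(physically swap the disk)
# cfgadm -c configure c0::dsk/c0t13d0
# metadevadm -u c0t13d0s2
# metareplace -e d90 c0t13d0s2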

But in my case these steps just didn’t work, and I understand why – my broken disk was still part of the SVM configuration. Actually, it was clearly explained to me in the following error message:

# cfgadm -c unconfigure c0::dsk/c0t13d0
cfgadm: Component system is busy, try again: failed to offline: 
     Resource                     Information              
------------------  ---------------------------------------
/dev/dsk/c0t13d0s2  component of RAID "/dev/md/dsk/d90"    
/dev/md/dsk/d90     mounted filesystem "/usr/fsrv/archive" 

Since I couldn’t take the filesystem offline and cause downtime, my last resort was to skip the cfgadm steps and brusquely pull the disk from its slot. Said and done. Thankfully, it worked smoothly, and once the new disk found its shelter inside the server the rest was trivial…

# metareplace -e d90 c0t13d0s2

# metastat d90
d90: RAID
    State: Resyncing    
    Resync in progress:  0.0% done
    Interlace: 512 blocks
    Size: 335421567 blocks (159 GB)
Original device:
    Size: 335424000 blocks (159 GB)
        Device      Start Block  Dbase        State Reloc  Hot Spare
        c0t11d0s2       8019        No         Okay   Yes 
        c0t12d0s2       7874        No         Okay   Yes 
        c0t13d0s2      10218        No    Resyncing   Yes 
        c0t14d0s2      10218        No         Okay   Yes 
        c0t15d0s2       8019        No         Okay   Yes 
        c0t10d0s2       8019        No         Okay   Yes 
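
If you’re as impatient as I am, a throwaway loop like this keeps you posted on the progress (interrupt it with Ctrl-C once the resync is done):

# while true; do metastat d90 | grep 'Resync in progress'; sleep 60; done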

Studying HDS 9990 and 9985 (THI1570)

Tomorrow I’m leaving for Saint Petersburg for a 5-day training course titled “Installing, Configuring and Maintaining Hitachi Universal Storage Platform™ V and Universal Storage Platform™ VM”. Once I’m done with it, my CTS profile will be 100% complete. Not bad at all ;-)

Back to childhood

Having just returned from a children’s theatre (sorry, only in Russian) I can’t wait to express the inner feelings that brimmed over in me. The play, called “A hedgehog in the fog”, was awesome and brilliantly performed. And to tell the truth, sitting there in the dark with my kid on my lap I felt like I was a little boy myself, like there had never been those mature years at all. And my eyes were almost wet… Maybe I’m too sentimental, or maybe it’s because I have a birthday today, or maybe both. God knows!

Veritas romp

Just came across a really funny piece of code in OpenSolaris.

/*
 * XXX - Don't port this to new architectures
 * A 3rd party volume manager driver (vxdm) depends on the symbol romp.
 * 'romp' has no use with a prom with an IEEE 1275 client interface.
 * The driver doesn't use the value, but it depends on the symbol.
 */
void *romp;		/* veritas driver won't load without romp 4154976 */

Backing up and restoring Sun Cluster

Imagine for a second, even if it’s an extremely unusual situation, that both nodes of your cluster have miserably failed simultaneously because of a buggy component. Now there is a dilemma: whether to reconfigure everything from scratch or, since you’re a vigilant SA, use the backups. But how do you restore the cluster configuration, DID devices, resource and disk groups, etc.? I tried to simulate such a failure in my testbed environment, using the simplest SC configuration with one node and two VxVM disk groups, to give a general overview and to show how easy and straightforward the whole process is.

As you might already know, the CCR (Cluster Configuration Repository) is a central database that contains:

  • Cluster and node names
  • Cluster transport configuration
  • The names of Solaris Volume Manager disk sets or VERITAS disk groups
  • A list of nodes that can master each disk group
  • Operational parameter values for data services
  • Paths to data service callback methods
  • DID device configuration
  • Current cluster status

This information is kept under /etc/cluster/ccr, so it’s quite obvious that we need a backup of the whole /etc/cluster directory, since some crucial data, e.g. nodeid, are stored under /etc/cluster as well:

# cd /etc/cluster; find ./ | cpio -co > /cluster.cpio
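
It never hurts to check that the archive is actually readable before you come to rely on it:

# cpio -itc < /cluster.cpio | head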

And don’t forget about the /etc/vfstab, /etc/hosts, /etc/nsswitch.conf and /etc/inet/ntp.cluster files. Of course, there is a lot more to keep in mind, but here I’m speaking only about SC related files. Now, here goes the trickiest part. Why would I give it such a strong name? Because I learned this the hard way during my experiment: if you omit it, you won’t be able to load the did module and recreate the DID device entries (here I deliberately decided not to back up the /devices and /dev directories). Write down or try to remember the output:

# grep did /etc/name_to_major
did 300

# grep did /etc/minor_perm
did:*,tp 0666 root sys

Of course, you could simply add those two files to the backup list – whatever you prefer.
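
If you’d rather keep those entries together with the rest of the backup, a dumb one-liner will do (the file name is arbitrary):

# grep did /etc/name_to_major /etc/minor_perm > /did_entries.txt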

Now it’s safe to reinstall the OS and, once that’s done, install the SC software. Don’t run scinstall – it’s unnecessary.
First, create a new partition for globaldevices on the root disk. It’s better to assign it the same number it had before the crash, to avoid editing the /etc/vfstab file and bothering with the scdidadm command. Next, edit the /etc/name_to_major and /etc/minor_perm files and make the appropriate changes, or simply overwrite them with copies from your backup.
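
A minimal sketch of restoring the did entries by hand, assuming the values recorded earlier (did 300 and did:*,tp 0666 root sys, as on my box):

# grep '^did ' /etc/name_to_major >/dev/null || echo 'did 300' >> /etc/name_to_major
# grep '^did:' /etc/minor_perm >/dev/null || echo 'did:*,tp 0666 root sys' >> /etc/minor_perm

Now do a reconfiguration reboot: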

# reboot -- -r
or

# halt
ok> boot -r

or

# touch /reconfigure; init 6

When you’re back, check that the did module was loaded and that pseudo/did@0:admin exists under /devices:

# modinfo | grep did
285 786f2000 3996 300 1 did (Disk ID Driver 1.15 Aug 20 2006)

# ls -l /devices/pseudo/did\@0\:admin
crw------- 1 root sys 300, 0 Oct 2 16:30 /devices/pseudo/did@0:admin

You should also see that /global/.devices/node@1 was successfully mounted.
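
A quick df will confirm it:

# df -k /global/.devices/node@1

So far, so good. But we are still in non-cluster mode. Let’s fix that.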

# mv /etc/cluster /etc/cluster.orig
# mkdir /etc/cluster; cd /etc/cluster
# cpio -i < /cluster.cpio

Restore the other files, e.g. /etc/vfstab and others of that ilk, and reboot the system. Once it’s back, double-check that the DID entries have been created:

#  scdidadm -l 
1        chuk:/dev/rdsk/c1t6d0          /dev/did/rdsk/d1
2        chuk:/dev/rdsk/c2t0d0          /dev/did/rdsk/d2
3        chuk:/dev/rdsk/c2t1d0          /dev/did/rdsk/d3
4        chuk:/dev/rdsk/c3t0d0          /dev/did/rdsk/d4
5        chuk:/dev/rdsk/c3t1d0          /dev/did/rdsk/d5
6        chuk:/dev/rdsk/c6t1d0          /dev/did/rdsk/d6
7        chuk:/dev/rdsk/c6t0d0          /dev/did/rdsk/d7

And that the corresponding device nodes under /dev/did are actually in place:

# for p in `scdidadm -l | awk '{print $3"*"}'`; do ls -l $p; done

Finally, import the VxVM disk groups, if that hasn’t been done automatically, and bring them online:

# vxdg import testdg
# vxdg import oradg

# scstat -D

-- Device Group Servers --

                          Device Group        Primary             Secondary
                          ------------        -------             ---------
  Device group servers:     testdg              testbed                -
  Device group servers:     oradg               testbed                -


-- Device Group Status --

                              Device Group        Status              
                              ------------        ------              
  Device group status:        testdg              Offline
  Device group status:        oradg               Offline


# scswitch -z -D oradg -h testbed
# scswitch -z -D testdg -h testbed

# scstat -D

-- Device Group Servers --

                         Device Group        Primary             Secondary
                         ------------        -------             ---------
  Device group servers:  testdg              testbed                -
  Device group servers:  oradg               testbed                -


-- Device Group Status --

                              Device Group        Status              
                              ------------        ------              
  Device group status:        testdg              Online
  Device group status:        oradg               Online

Easy!