Disk replacement with VxVM

Be aware that with Solaris 10 and VxVM MP3 the correct disk replacement procedure now looks like this:

# vxdisk rm
Offline and unconfigure the LUN
# luxadm -e offline /dev/rdsk/s2
# cfgadm -c unconfigure -o unusable_SCSI_LUN cX::
Clean up/recreate the device tree
# devfsadm -Cv
Recreate and rescan VxVM devices
# vxddladm stop eventsource
# mv disk.info disk.backup
# mv array.info array.backup
# rm /dev/vx/dmp/*
# rm /dev/vx/rdmp/*
# vxdisk scandisks
# vxdctl enable
# vxddladm start eventsource

Otherwise, you run the risk of admiring the following error message every time the “vxdctl enable” command is invoked:

VxVM vxdctl ERROR V-5-1-0 Data Corruption Protection Activated – User Corrective Action Needed
VxVM vxdctl INFO V-5-1-0 To recover, first ensure that the OS device tree is up to date (requires OS specific commands).
VxVM vxdctl INFO V-5-1-0 Then, execute ‘vxdisk rm’ on the following devices before reinitiating device discovery:

I stumbled upon this issue myself when replacing two disks in a D240 JBOD.

VxVM is watching over you

A couple of days ago a colleague told me about a neat feature of VxVM 5.x that can be quite helpful in the field. Imagine a customer complains about a VxVM misconfiguration and blames your team for sloppy work. To prove him wrong, you can sift through VxVM’s command log files to get a list of the commands you typed during the initial configuration. If the customer did something wrong himself and is now trying to shift the blame onto you, these log files are just as valuable: simply show where and when he made the mistake. The log files are kept in /etc/vx/log and are named /etc/vx/log/cmdlog and /etc/vx/log/cmdlog.number for the current and historic command logs respectively. The vxcmdlog(1M) command gives you some control over this feature.
One thing to keep in mind is that not every command script is logged:

Most command scripts are not logged, but the command binaries that they call are logged. Exceptions are the vxdisksetup, vxinstall, and vxdiskunsetup scripts, which are logged.
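
To pull just the commands out of these logs, something along the following lines could work. This is a minimal sketch: it assumes that each log entry consists of a metadata line starting with # followed by the command line itself, and the function name logged_commands is made up for illustration.

```shell
# logged_commands FILE...
# Print only the command lines from the given VxVM command log files.
# Assumption: metadata lines start with '#'; the commands sit on their own lines.
logged_commands() {
    for f in "$@"; do
        # Skip files that do not exist or cannot be read.
        [ -r "$f" ] || continue
        # Keep every line that is neither a comment nor blank.
        awk '!/^#/ && NF' "$f"
    done
}
```

A typical call would be logged_commands /etc/vx/log/cmdlog /etc/vx/log/cmdlog.1, piped through grep to narrow the list down to a particular command.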

Enjoy!

Mending VxVM

Last night I was upgrading a domain on our SF6900 box to Solaris 10. Since we use VxVM to mirror our root disk, there was no point in preparing a dedicated backup on tape or on a separate disk: it is much easier and more accurate to split rootdg into two disk groups, e.g. rootdg and rootdgold, and later import rootdgold, i.e. our saved copy of rootdg, to restore any file we need. But when I tried to start the volumes, it turned out that not everything had gone smoothly:

# vxvol -g rootdgold startall
VxVM vxvol ERROR V-5-1-11804 Volume coredump is empty and cannot be started
VxVM vxvol ERROR V-5-1-11804 Volume oem_home is empty and cannot be started
VxVM vxvol ERROR V-5-1-11804 Volume var is empty and cannot be started
VxVM vxvol ERROR V-5-1-11804 Volume rootvol is empty and cannot be started
VxVM vxvol ERROR V-5-1-11804 Volume swapvol is empty and cannot be started

Looking at the volumes, I found that all of them were indeed in the EMPTY state:

# vxprint -g rootdgold -v
TY NAME         ASSOC        KSTATE   LENGTH   PLOFFS   STATE    TUTIL0  PUTIL0
v  coredump     fsgen        DISABLED 62918208 -        EMPTY    -       -
v  oem_home     fsgen        DISABLED 4202688  -        EMPTY    -       -
v  rootvol      root         DISABLED 8395200  -        EMPTY    -       -
v  swapvol      swap         DISABLED 50330496 -        EMPTY    -       -
v  var          fsgen        DISABLED 8395200  -        EMPTY    -       -

Fortunately, not much needs to be done to get out of this situation. You can use either “vxvol init clean” or “vxmend fix clean” to set the state of the named plex to CLEAN:

# vxmend -g rootdgold fix clean rootvol-02
# vxmend -g rootdgold fix clean coredump-02
# vxmend -g rootdgold fix clean oem_home-02
# vxmend -g rootdgold fix clean var-02
# vxmend -g rootdgold fix clean swap-02

Once that was done, I started the volumes and mounted them successfully.

# vxvol -g rootdgold startall
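
With more than a handful of volumes it can be handy to list the ones stuck in the EMPTY state before and after the fix. The helper below is only a sketch: it assumes the vxprint -v column layout shown earlier (STATE in the seventh column), and the name empty_volumes is hypothetical.

```shell
# empty_volumes
# Read `vxprint -g <diskgroup> -v` output on stdin and print the names of
# all volume records ("v" in the first column) whose STATE column is EMPTY.
# Assumption: STATE is the 7th whitespace-separated field, as in the listing above.
empty_volumes() {
    awk '$1 == "v" && $7 == "EMPTY" { print $2 }'
}
```

It would be used as vxprint -g rootdgold -v | empty_volumes; no output after the fix means no volume is left in the EMPTY state.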

Veritas romp

Just came across a really funny piece of code in OpenSolaris.

/*
  * XXX - Don't port this to new architectures
  * A 3rd party volume manager driver (vxdm) depends on the symbol romp.
  * 'romp' has no use with a prom with an IEEE 1275 client interface.
  * The driver doesn't use the value, but it depends on the symbol.
  */
 void *romp;		/* veritas driver won't load without romp 4154976 */

Do vxinitrd after kernel upgrade

Just a quick reminder. If the root disk on your Linux box is encapsulated, don’t forget to recreate the initrd image after the kernel has been updated.

/usr/lib/vxvm/bin/vxinitrd /boot/VxVM_initrd.img `uname -r`

Otherwise you’ll end up being welcomed by the following message once you reboot.

Kernel panic - not syncing: Attempted to kill init!
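
A cheap sanity check before rebooting is to compare the timestamps of the kernel image and the VxVM initrd. The sketch below is only a heuristic based on modification times; the function name initrd_is_stale is made up, and the kernel image path in the usage note is an assumption that varies between distributions.

```shell
# initrd_is_stale KERNEL_IMAGE INITRD_IMAGE
# Returns 0 (true) if the initrd is missing or older than the kernel image,
# and non-zero otherwise. Relies on the -ot test supported by bash, ksh and dash.
initrd_is_stale() {
    kernel=$1
    initrd=$2
    # A missing initrd certainly needs to be created.
    [ ! -e "$initrd" ] && return 0
    # Otherwise it is stale if it was last modified before the kernel image.
    [ "$initrd" -ot "$kernel" ]
}
```

For example: initrd_is_stale "/boot/vmlinuz-$(uname -r)" /boot/VxVM_initrd.img && echo "rebuild the initrd with vxinitrd". Here /boot/vmlinuz-$(uname -r) is an assumed location for the kernel image.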

Have a nice day!

NetBackup’s error 96

Actually this error is trivially easy to overcome.

# bperror -S 96
unable to allocate new media for backup, storage unit has none available
The tape manager (bptm) could not allocate a new volume for backups.
This indicates that the storage unit has no more volumes available in the
volume pool for this backup. Note that NetBackup will not change storage
units during the backup.

All it says is that we’ve run out of free tapes in our pool. Use the following command to get the list of recommended actions from Symantec.

# bperror -S 96 -r

In the vast majority of cases, all you have to do to resolve the issue is find an appropriate tape and expire it. The expired tape is then automatically moved to the ScratchPool (don’t tell me you don’t have one ;-)), from which it will later be reused when your backup job starts on schedule.

# bpmedialist
# bpexpdate -d 0 -m BU0001

Easy.

vxdmpadm path activate weirdness

I just stumbled upon some strange behavior of vxdmpadm which requires further investigation.
I ran into the problem while trying to set a certain path “active” in order to load-balance the data flow on an HDS array between its controllers. The built-in help clearly states that:

# vxdmpadm setattr help

vxdmpadm setattr path
 pathtype=

pathtype can be either:
        active
        nomanual
        nopreferred
        preferred [priority=]
        primary
        secondary
        standby

So I duly expected it to work as advertised, but instead I got the following error:

# vxdmpadm getsubpaths  ctlr=c2
NAME         STATE[A]   PATH-TYPE[M] DMPNODENAME  ENCLR-TYPE   ENCLR-NAME   ATTRS
================================================================================
c2t50060E800042A5F0d81s2 ENABLED(A) PRIMARY      hds9500-alua0_0051 HDS9500-ALUA hds9500-alua0   -
c2t50060E800042A5F3d81s2 ENABLED    SECONDARY    hds9500-alua0_0051 HDS9500-ALUA hds9500-alua0   -
c2t50060E800042A5F0d82s2 ENABLED(A) SECONDARY    hds9500-alua0_0052 HDS9500-ALUA hds9500-alua0   -
c2t50060E800042A5F3d82s2 ENABLED    PRIMARY      hds9500-alua0_0052 HDS9500-ALUA hds9500-alua0   -

# vxdmpadm getsubpaths  ctlr=c3
NAME         STATE[A]   PATH-TYPE[M] DMPNODENAME  ENCLR-TYPE   ENCLR-NAME   ATTRS
================================================================================
c3t50060E800042A5F1d81s2 ENABLED(A) PRIMARY      hds9500-alua0_0051 HDS9500-ALUA hds9500-alua0   -
c3t50060E800042A5F2d81s2 ENABLED    SECONDARY    hds9500-alua0_0051 HDS9500-ALUA hds9500-alua0   -
c3t50060E800042A5F1d82s2 ENABLED(A) SECONDARY    hds9500-alua0_0052 HDS9500-ALUA hds9500-alua0   -
c3t50060E800042A5F2d82s2 ENABLED    PRIMARY      hds9500-alua0_0052 HDS9500-ALUA hds9500-alua0   -

# vxdmpadm setattr path c2t50060E800042A5F3d82s2 pathtype=active
VxVM vxdmpadm ERROR V-5-1-10357  Invalid argument or attribute specified.

It looks like I’ll need to dig deeper to find the culprit, but as a workaround I simply disabled the second path to force a failover to the one I wanted to make active.
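
When juggling paths like this, it helps to see at a glance which path of each DMP node currently carries the I/O, i.e. the ones flagged ENABLED(A) in the listings above. The filter below is a small sketch that assumes the getsubpaths column layout shown there; the name active_paths is hypothetical.

```shell
# active_paths
# Read `vxdmpadm getsubpaths ...` output on stdin and print
# "<dmpnodename> <pathname>" for every path whose state is ENABLED(A).
# Assumption: NAME is field 1, STATE[A] is field 2 and DMPNODENAME is field 4.
active_paths() {
    awk '$2 == "ENABLED(A)" { print $4, $1 }'
}
```

Usage would be vxdmpadm getsubpaths ctlr=c2 | active_paths, repeated for each controller to see how the active paths are spread.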

Update.
As always, RTFM rules, and I must admit that my assumption, namely that this option could change the state shown in the second column to active, was completely wrong. The man page states plainly that pathtype=active is used to change a standby path to active.

# vxdmpadm setattr path c2t50060E800042A5F3d81s2 pathtype=standby
# vxdmpadm getsubpaths 

NAME         STATE[A]   PATH-TYPE[M] DMPNODENAME  ENCLR-NAME   CTLR   ATTRS
================================================================================
c2t50060E800042A5F3d81s2 ENABLED    SECONDARY    hds9500-alua0_0051 hds9500-alua0 c2     STANDBY

# vxdmpadm setattr path c2t50060E800042A5F3d81s2 pathtype=active
# vxdmpadm getsubpaths 

NAME         STATE[A]   PATH-TYPE[M] DMPNODENAME  ENCLR-NAME   CTLR   ATTRS
================================================================================
c2t50060E800042A5F3d81s2 ENABLED    SECONDARY    hds9500-alua0_0051 hds9500-alua0 c2       -

Since in my case the path was already active, it made no sense to activate it a second time, hence the error. So there were actually two options:

  • Use vxdmpadm disable
  • Use vxdmpadm setattr path pathtype=standby/active

So, folks, never underestimate the documentation. ;-)

Data restoration from tape

Recently I had to restore some data from a tape written by NetBackup, so purely for reference I decided to write this short post.
First we need to mount the tape (I did this with the robtest utility) and perform a robot inventory to make NetBackup aware of the new tape. Keep in mind that the barcode visible on the tape itself might not match what was written to the tape during the backup. Such a discrepancy can result from different NetBackup barcode rules. To double-check, use more, e.g. more /dev/rmt/13cbn
After that I ran the following set of commands:

bpimport -create_db_info -id F006L1 -L /tmp/bpimport.log
bplist -C client_name -l -t 4 -R /
bprestore -B -S source -D destination -C client_name -t 0 \
-L /tmp/bprestore.log -R /tmp/rename_file /what/to/restore
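
The file passed with -R tells bprestore where to put the restored files. As far as I can tell, each entry has the form "change <source path> to <destination path>"; treat that format as an assumption and check the bprestore man page for your release. A small helper (the name make_rename_file is hypothetical) could generate such a file:

```shell
# make_rename_file FILE SOURCE DESTINATION
# Write a single rename entry to FILE so that data backed up under SOURCE
# would be restored under DESTINATION.
# Assumption: the rename file syntax is "change <source> to <destination>".
make_rename_file() {
    printf 'change %s to %s\n' "$2" "$3" > "$1"
}
```

For instance, make_rename_file /tmp/rename_file /what/to/restore /tmp/restored would redirect the restore above to /tmp/restored, a path chosen purely as an example.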

Missing volboot file

If executing “vxdctl enable” gives you the following error:

VxVM vxdctl ERROR V-5-1-1589 enable failed: Volboot file not loaded

then this sequence could help you to resolve the problem:

vxiod set 10
vxconfigd -d
vxdctl init
vxdctl enable

Good luck.