I hit a small snag when I tried to upgrade my old (r151018) OmniOS installation to OmniOS CE as described in the ANNOUNCEMENT OmniOS Community Edition – OmniOSce r151022h.
During the “pkg update” stage I got something similar to the following:
pkg update: The certificate which issued this certificate:/C=US/ST=Maryland/O=OmniTI/OU=OmniOS/CN=OmniOS r151018 Release Signing Certificate/emailAddress=omnios-supp…@omniti.com could not be found.
Thankfully, the solution was a straightforward sequence of steps to upgrade to r151020, then to r151021 and finally to r151022.
From there I was able to upgrade to OmniOS CE successfully. Even the “-r” option in “pkg update -rv” worked like a charm, which is worth noting because this option doesn’t exist in r151018. I could probably have skipped r151021 altogether, but it’s always better to be safe than sorry.
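The hop-by-hop path above can be written down as a dry-run plan before touching the system. This is only a sketch: REPO_BASE is a placeholder URL, the publisher name "omnios" and the boot environment names are assumptions, and none of them come from the announcement, so substitute whatever applies to your setup. The function only prints the commands; it does not run them.

```shell
#!/bin/sh
# Dry-run planner for a stepwise OmniOS upgrade. It only PRINTS commands.
# ASSUMPTION: REPO_BASE is a placeholder, not a real repository URL.
REPO_BASE="${REPO_BASE:-https://pkg.example.org/omnios}"

plan_upgrade() {
    # One hop per release given on the command line, in order.
    for rel in "$@"; do
        # Swap every existing origin of the (assumed) "omnios" publisher
        # for the repository of the next release.
        echo "pkg set-publisher -G '*' -g $REPO_BASE/$rel/ omnios"
        # Update into a fresh boot environment (the name is hypothetical).
        echo "pkg update -v --be-name omnios-$rel"
        # Reboot into the new boot environment before the next hop.
        echo "init 6"
    done
}

plan_upgrade r151020 r151021 r151022
```

Reviewing the printed plan first makes it easier to spot a wrong repository URL before a boot environment is created for it.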
I had a hard time googling the steps to grow a ZFS rpool online without using zpool append, so here is my little story of how I did it.
Before I begin, please note that everything below applies only to configurations where rpool consumes a whole disk. If another partition sits right after your rpool (or after some gap), you risk corrupting the data. So don’t blindly use the last cylinder (or “$”) when it’s time to modify the partition table.
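One way to check the "rpool consumes the whole disk" precondition is to look at the prtvtoc output and list every slice, other than the whole-disk backup slice (tag 5), that has a non-zero size. The sketch below assumes the usual six-column prtvtoc layout (partition, tag, flags, first sector, sector count, last sector); the function name data_slices is made up for this example, and the sample input is the partition map from this post.

```shell
#!/bin/sh
# List slices with a non-zero sector count, ignoring comment lines and
# the whole-disk "backup" slice (tag 5). Reads prtvtoc output on stdin.
# ASSUMPTION: standard prtvtoc column order.
data_slices() {
    awk '$1 !~ /^\*/ && NF >= 6 && $2 != 5 && $5 > 0 { print $1 }'
}

# Real use on the system (not run here):
#   prtvtoc /dev/dsk/<your_device>s2 | data_slices
data_slices <<'EOF'
*                          First     Sector    Last
* Partition  Tag  Flags    Sector     Count    Sector  Mount Directory
       0      2    00          0  65518080  65518079
       2      5    01          0  65518080  65518079
EOF
```

If this prints anything besides the slice that holds rpool, there is another partition on the disk and the steps below should not be followed as written.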
- Here is the rpool:
- Save the partition table first so that the information can be used later to correctly re-label (re-partition) the expanded disk. By the way, use “zpool status rpool -v” to find the device name:
- Behind the scenes the LUN was expanded. luxadm is quite handy for confirming that (pay attention to the “Unformatted capacity” field):
- Time for the scariest part, i.e. re-labeling the disk by installing a new partition table.
To be able to do that, the disk’s new geometry must somehow be conveyed to the format utility, and that’s surprisingly easy to achieve. Just run format, select the disk and use the “type” option to autoconfigure it:
- Notice that it’s now configured with the new capacity.
- Don’t leave the format prompt yet since we are not done. The next step is to carve out the partition table. Remember that I only had two partitions (0 for root and 2 for backup) and your situation might be completely different, so don’t copy/paste rashly.
- Use the numbers from the menu to select the partition you want to modify:
- For me, it was enough to set the size of all partitions to zero except two: 0 (tagged as root) and 2 (tagged as backup). As you can see below, the wm (write-mountable) flag was set only for partition 0, whilst the rest have wu (write-unmountable):
- Label the disk and quit the format tool:
- Finally, it’s time to grow the ZFS rpool and confirm that we’re golden:
# zpool list rpool
NAME    SIZE  ALLOC   FREE  CAP  HEALTH  ALTROOT
rpool  31.1G  23.6G  7.52G  75%  ONLINE  -
# prtvtoc /dev/dsk/c4t60060E80167D3C0000017D3C000010CAd0s2
* /dev/dsk/c4t60060E80167D3C0000017D3C000010CAd0s2 partition map
*
* Dimensions:
*     512 bytes/sector
*     512 sectors/track
*      15 tracks/cylinder
*    7680 sectors/cylinder
*    8533 cylinders
*    8531 accessible cylinders
*
* Flags:
*   1: unmountable
*  10: read-only
*
*                          First     Sector    Last
* Partition  Tag  Flags    Sector     Count    Sector  Mount Directory
       0      2    00          0  65518080  65518079
       2      5    01          0  65518080  65518079
# luxadm display /dev/rdsk/c4t60060E80167D3C0000017D3C000010CAd0s2
DEVICE PROPERTIES for disk: /dev/rdsk/c4t60060E80167D3C0000017D3C000010CAd0s2
  Vendor:               HITACHI
  Product ID:           OPEN-V -SUN
  Revision:             7006
  Serial Num:           50 17D3C10CA
  Unformatted capacity: 36864.000 MBytes
  Read Cache:           Enabled
    Minimum prefetch:   0x0
    Maximum prefetch:   0x0
  Device Type:          Disk device
# format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
       0. c1t0d0
          /pci@0/pci@0/pci@2/scsi@0/sd@0,0
       1. c4t60060E80167D3C0000017D3C000010CAd0
          /scsi_vhci/ssd@g60060e80167d3c0000017d3c000010ca
Specify disk (enter its number): 1
selecting c4t60060E80167D3C0000017D3C000010CAd0
[disk formatted]
/dev/dsk/c4t60060E80167D3C0000017D3C000010CAd0s0 is part of active ZFS pool rpool. Please see zpool(1M).

FORMAT MENU:
        disk       - select a disk
        type       - select (define) a disk type
        partition  - select (define) a partition table
        current    - describe the current disk
        format     - format and analyze the disk
        repair     - repair a defective sector
        label      - write label to the disk
        analyze    - surface analysis
        defect     - defect list management
        backup     - search for backup labels
        verify     - read and display labels
        save       - save new disk/partition definitions
        inquiry    - show vendor, product and revision
        volname    - set 8-character volume name
        !<cmd>     - execute <cmd>, then return
        quit
format> type

AVAILABLE DRIVE TYPES:
        0. Auto configure
        1. Quantum ProDrive 80S
        2. Quantum ProDrive 105S
        3. CDC Wren IV 94171-344
        4. SUN0104
        5. SUN0207
        6. SUN0327
        7. SUN0340
        8. SUN0424
        9. SUN0535
        10. SUN0669
        11. SUN1.0G
        12. SUN1.05
        13. SUN1.3G
        14. SUN2.1G
        15. SUN2.9G
        16. Zip 100
        17. Zip 250
        18. Peerless 10GB
        19. HITACHI-OPEN-V-SUN-7005
        20. SUN300G
        21. other
Specify disk type (enter its number): 0
c4t60060E80167D3C0000017D3C000010CAd0: configured with capacity of 35.99GB
selecting c4t60060E80167D3C0000017D3C000010CAd0
[disk formatted]
format> partition

PARTITION MENU:
        0      - change `0' partition
        1      - change `1' partition
        2      - change `2' partition
        3      - change `3' partition
        4      - change `4' partition
        5      - change `5' partition
        6      - change `6' partition
        7      - change `7' partition
        select - select a predefined table
        modify - modify a predefined partition table
        name   - name the current table
        print  - display the current table
        label  - write partition map and label to the disk
        quit
partition> print
Current partition table (unnamed):
Total disk cylinders available: 9828 + 2 (reserved cylinders)

Part      Tag    Flag     Cylinders        Size            Blocks
  0       root    wm       0 - 9827       35.99GB    (9828/0/0) 75479040
  1 unassigned    wu       0               0         (0/0/0)           0
  2     backup    wu       0 - 9827       35.99GB    (9828/0/0) 75479040
  3 unassigned    wu       0               0         (0/0/0)           0
  4 unassigned    wu       0               0         (0/0/0)           0
  5 unassigned    wu       0               0         (0/0/0)           0
  6 unassigned    wu       0               0         (0/0/0)           0
  7 unassigned    wu       0               0         (0/0/0)           0
Ready to label disk, continue? yes
# zpool online -e rpool /dev/dsk/c4t60060E80167D3C0000017D3C000010CAd0s0
# zpool list rpool
NAME    SIZE  ALLOC   FREE  CAP  HEALTH  ALTROOT
rpool  35.9G  23.6G  12.3G  65%  ONLINE  -
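Before writing the label it is worth checking that the numbers add up. The block counts in the outputs above are simply cylinders multiplied by sectors per cylinder, and the reported size follows from the 512-byte sector. A small sketch of that arithmetic (the helper names sectors and size_gb are made up for this example):

```shell
#!/bin/sh
# sectors = accessible cylinders * sectors per cylinder
sectors() {
    echo $(( $1 * $2 ))
}

# Size in GB (2^30 bytes) for a sector count and a sector size in bytes.
size_gb() {
    awk -v s="$1" -v b="$2" 'BEGIN { printf "%.2f\n", s * b / (1024 * 1024 * 1024) }'
}

sectors 8531 7680        # old label: 8531 accessible cylinders
sectors 9828 7680        # new label: 9828 cylinders available
size_gb 75479040 512     # new slice size with 512-byte sectors
```

The first two results match the sector counts printed by prtvtoc and format (65518080 and 75479040), and the last one matches the 35.99GB that format reports for the reconfigured disk.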
Good luck, and expand your pools safely.
The latest episode of the BSD Now podcast (103) brought a fantastic and hilarious interview with Bryan Cantrill, who is well known for his wit and right-on-the-bullseye rants. It’s been a while since I cried laughing, so this video is highly recommended. Not to mention that his talk was very educational from both the technical (epoll, kqueue) and the historical points of view. Bookmarked and added to the favorites.
This is another post about the usefulness of reading man pages and READMEs.
The other day I was patching a Solaris box and was greeted with the following error:
This appears to be an attempt to install the same architecture and version of a package which is already installed. This installation will attempt to overwrite this package.
/some_long_path_to/install/checkinstall: /some_long_path_to//install/checkinstall: cannot open
pkgadd: ERROR: checkinstall script did not complete successfully
No changes were made to the system.
I was beating my head against the wall for quite a while before I decided to give “man patchadd” a try. Thankfully, the helpful paragraph turned up in the blink of an eye:
pkgadd is invoked by patchadd and executes the installation scripts in the pkg/install directory. The checkinstall script is executed with its ownership set to user install, if there is no user install then pkgadd executes the checkinstall script as noaccess. The SVR4 ABI states that the checkinstall shall only be used as an information gathering script. If the permissions for the checkinstall script are changed to something other than the initial settings, pkgadd may not be able to open the file for reading, thus causing the patch installation to abort with the following error:
pkgadd: ERROR: checkinstall script did not complete successfully.
Needless to say, once the patch was moved to another directory where the noaccess user (the system had no install user) had sufficient permissions, the problem went away.
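A quick way to spot this kind of problem is to walk up from the checkinstall script to / and flag anything that "other" users cannot read or traverse. The sketch below is an approximation under stated assumptions: it only looks at the classic mode bits shown by ls -ld (no ACLs, no group membership), and the function name check_other_access is made up for this example.

```shell
#!/bin/sh
# Report the file if "other" cannot read it, and every parent directory
# that "other" cannot search. Only the classic mode bits are inspected.
check_other_access() {
    target=$1
    mode=$(ls -ld "$target" | cut -c1-10)
    case $mode in
        ???????r??) ;;                        # other has read permission
        *) echo "not world-readable: $target" ;;
    esac
    dir=$(dirname "$target")
    while :; do
        mode=$(ls -ld "$dir" | cut -c1-10)
        case $mode in
            ?????????[xt]) ;;                 # other has search permission
            *) echo "not world-searchable: $dir" ;;
        esac
        case $dir in
            /|.) break ;;                     # reached the top
        esac
        dir=$(dirname "$dir")
    done
}

# Example (the path is a placeholder):
#   check_other_access /some_long_path_to/install/checkinstall
```

No output means every component is open to "other"; any reported directory is a candidate for the "cannot open" error above.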
Have safe and flawless patching!
One day you might find yourself in the same situation I was in when I wasn’t able to create a new boot environment:
# lucreate -n SolarisFeb16
Analyzing system configuration.
Comparing source boot environment file systems with the file system(s) you specified for the new boot environment.
Determining which file systems should be in the new boot environment.
Updating boot environment description database on all BEs.
Updating system configuration files.
The device is not a root device for any boot environment; cannot get BE ID.
Creating configuration for boot environment .
Source boot environment is .
Creating boot environment .
Cloning file systems from boot environment to create boot environment .
Creating snapshot for on .
Creating clone for on .
Setting canmount=noauto for > in zone on .
ERROR: The boot environment name does not have a boot device defined in .
ERROR: Root slice devicePopulation of boot environment does not have a boot device defined in .> for BE was not found: . successful.
Creation of boot environment successful.
Even though the last two lines claim that population and creation were successful, luactivate disagrees:
# luactivate SolarisFeb16
ERROR: Unable to determine the configuration of the current boot environment
The root cause was an outdated 121430-xx patch. More importantly, this patch is not applied when the Recommended Patch Cluster is installed:
Live Upgrade patch 121430-XX is included in the patches/ directory of the patchset, but this patch will not be applied during patchset installation. The decision to apply the Live Upgrade patch is left to the user, this is done to accommodate users who wish to independently manage the version of the Live Upgrade patch on their system. Where a user wishes to apply the Live Upgrade patch, this needs to be done manually with the patchadd command.
After installing the latest 121430-93 (as of this writing), the problem disappeared.
A good reminder to myself to always check README(s).
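To see which revision of the Live Upgrade patch is installed before running lucreate, the output of “showrev -p” can be filtered for the patch ID. The sketch below assumes the usual “Patch: ID-REV Obsoletes: ...” line format; the function name latest_rev is made up, and the sample line fed to it is invented purely for illustration.

```shell
#!/bin/sh
# Print the highest installed revision of a patch ID, reading
# "showrev -p" style output on stdin.
# ASSUMPTION: each line looks like "Patch: 121430-72 Obsoletes: ...".
latest_rev() {
    awk -v id="$1" '
        $1 == "Patch:" {
            split($2, p, "-")
            if (p[1] == id && (!found || p[2] + 0 > max)) {
                max = p[2] + 0
                found = 1
            }
        }
        END { if (found) print max; else print "not installed" }'
}

# Real use on the system (not run here):
#   showrev -p | latest_rev 121430
# Invented sample input, for illustration only:
latest_rev 121430 <<'EOF'
Patch: 121430-72 Obsoletes: Requires: Incompatibles: Packages: SUNWlur
EOF
```

If the printed revision is lower than the one shipped in the patches/ directory of the patchset, apply the newer one manually with patchadd, as the README says.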
Yesterday Oracle announced the availability of Solaris 11.2 beta with a bunch of sweet enhancements, e.g. OpenStack, Solaris Kernel Zones, Unified Archives, compliance checking and reporting, automation with Puppet and more.
Find out more by reading Solaris 11.2 Beta – What’s new
For those who are interested in hands-on experience, Solaris 11.2 beta is also available for download in different formats, including a VirtualBox VM template.
Now I know what I will be doing during the upcoming four-day state holiday.
If you got bored pressing “y” every time you delete a package and pkgrm asks its “Do you want to remove this package? [y,n,?,q]” question, there is a very easy solution:
# yes | pkgrm package1 package2 ... package3
It has saved me a lot of time.
A quick hint on how to find out why a newly imported service doesn’t start automatically on system reboot. So… You’ve just imported the service’s manifest, rebooted the system and noticed that the service is still in the offline state:
svccfg import service_manifest.xml
init 6
svcs service_name
STATE          STIME    FMRI
offline        15:55:52 svc:service_name:default
A good start is to use “svcs -l service_name” to check the state of every service your service depends on (ssh is used as an example):
$ svcs -l ssh
fmri         svc:/network/ssh:default
name         SSH server
enabled      true
state        online
next_state   none
state_time   Wed Nov 06 15:55:01 2013
logfile      /var/svc/log/network-ssh:default.log
restarter    svc:/system/svc/restarter:default
contract_id  60
dependency   require_all/none svc:/system/filesystem/local (online)
dependency   optional_all/none svc:/system/filesystem/autofs (disabled)
dependency   require_all/none svc:/network/loopback (online)
dependency   require_all/none svc:/network/physical (online)
dependency   require_all/none svc:/system/cryptosvc (online)
dependency   require_all/none svc:/system/utmp (online)
dependency   require_all/restart file://localhost/etc/ssh/sshd_config (online)
So you’ve checked the logs and all the dependencies and still don’t fully understand why the new service didn’t start. I feel your pain, as I was in exactly the same situation. Take a look at the services that depend on your service (again, I will use ssh as an example):
svcs -D ssh
STATE          STIME    FMRI
online         15:55:51 svc:/milestone/multi-user-server:default
Or instead, the services it depends upon:
svcs -d svc:/application/management/sma:default
STATE          STIME    FMRI
online         15:54:38 svc:/milestone/name-services:default
online         15:54:49 svc:/system/cryptosvc:default
online         15:54:50 svc:/milestone/network:default
online         15:54:51 svc:/system/filesystem/local:default
online         15:55:00 svc:/milestone/sysconfig:default
online         15:55:01 svc:/system/system-log:default
online         15:55:06 svc:/network/rpc/rstat:default
If you don’t see a “milestone” mentioned in the output, then that is very likely the problem and explains why the service didn’t start when you rebooted the system. Basically it’s the same as if you hadn’t run “chkconfig service_name on” on a RedHat-like or “update-rc.d service_name defaults” on a Debian-like Linux system. Refer to the table below to get an idea of which run level corresponds to which milestone:
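|Run Level|SMF Milestone|
|S|svc:/milestone/single-user:default|
|2|svc:/milestone/multi-user:default|
|3|svc:/milestone/multi-user-server:default|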
For the sake of completeness here is a full list of all milestones available (at least on the system I have near me):
svcs "milestone*"
STATE          STIME    FMRI
disabled       15:54:42 svc:/milestone/patching:default
online         15:54:38 svc:/milestone/name-services:default
online         15:54:49 svc:/milestone/devices:default
online         15:54:50 svc:/milestone/network:default
online         15:54:50 svc:/milestone/single-user:default
online         15:55:00 svc:/milestone/sysconfig:default
online         15:55:34 svc:/milestone/multi-user:default
online         15:55:51 svc:/milestone/multi-user-server:default
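The "is a milestone mentioned?" check described above is easy to script when there are several services to look at. A minimal sketch, assuming the three-column STATE/STIME/FMRI layout shown in the outputs in this post; the function name milestone_fmris is made up for this example.

```shell
#!/bin/sh
# Print every milestone FMRI found in "svcs -D" (or "svcs -d") output
# read from stdin. No output means no milestone is involved.
# ASSUMPTION: the FMRI is the third whitespace-separated column.
milestone_fmris() {
    awk '$3 ~ /^svc:\/milestone\// { print $3 }'
}

# Real use on the system (not run here):
#   svcs -D service_name | milestone_fmris
# Sample input taken from the ssh example in this post:
milestone_fmris <<'EOF'
STATE          STIME    FMRI
online         15:55:51 svc:/milestone/multi-user-server:default
EOF
```

An empty result for “svcs -D” together with an empty result for “svcs -d” is the situation the manifest changes in the next step are meant to fix.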
Time to open the service’s manifest file and edit it. Here you have two options:
- Either make a milestone depend upon your service (the first example);
- Or make your service depend on a milestone (the second example).
<dependent name='ssh_multi-user-server' grouping='optional_all' restart_on='none'>
    <service_fmri value='svc:/milestone/multi-user-server' />
</dependent>

<dependency name='network' grouping='require_all' restart_on='error' type='service'>
    <service_fmri value='svc:/milestone/network:default'/>
</dependency>
Now disable, delete, import and enable the service once again:
svcadm disable service_name
svccfg delete service_name
svccfg import service_name.xml
svcadm enable service_name
At this point you could safely reboot the system.
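The disable, delete, import and enable cycle can be wrapped in a small helper with a dry-run switch, so that the exact commands can be reviewed first. This is only a sketch: the names run, reimport_service and DRY_RUN are made up for this example, and the extra “svccfg validate” step and the “-s” flag of “svcadm disable” (wait until the service is actually disabled) are additions that are not part of the steps above.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reimport_service() {
    fmri=$1
    manifest=$2
    # Stop early if the edited manifest does not validate.
    run svccfg validate "$manifest" || return 1
    run svcadm disable -s "$fmri"
    run svccfg delete "$fmri"
    run svccfg import "$manifest"
    run svcadm enable "$fmri"
}

# Example: reimport_service service_name service_name.xml
```

Setting DRY_RUN=0 executes the commands for real; it is worth doing that only after the printed plan looks right.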
Recently I wrote a post about configuring an OpenLDAP server with TLS support on RHEL, available here. There I also mentioned how to set up Linux to authenticate against an LDAP server, but I didn’t say a word about Solaris. That’s unfair, and I’m going to fix it by providing a quick guide on how to set up an LDAP client in Solaris 10.
- First of all, add the LDAP server’s certificate to your local certificate database. Otherwise, you won’t be able to set up a TLS session:
- Verify that everything was done right:
- Set up the Solaris LDAP client:
- The rest is almost the same as in the Linux world:
- Take another look at your configuration:
- Use some very basic tools, i.e. id or getent, to make sure you can query the LDAP server and receive a correct response.
- Finally, try to ssh into your server with an LDAP-aware account.
/usr/sfw/bin/certutil -N -d /var/ldap/
/usr/sfw/bin/certutil -A -n "LDAP server certificate" -i /path_to_where_you_copied_ldap_certificate_file -a -t CT -d /var/ldap
/usr/sfw/bin/certutil -L -d /var/ldap/
ldapclient manual \
    -a credentialLevel=proxy \
    -a authenticationMethod=tls:simple \
    -a domainName=example.com \
    -a defaultSearchBase=DC=example,DC=com \
    -a proxyDN="cn=svc_ldp_proxy,dc=example,dc=com" \
    -a proxyPassword=PASSWORD \
    -a serviceSearchDescriptor="passwd:ou=people,?sub" \
    -a serviceSearchDescriptor="group:ou=group,?sub?gidnumber" \
    -a serviceSearchDescriptor="netgroup:ou=netgroup,?sub" \
    -a serviceSearchDescriptor="shadow:ou=people,?sub?uid=*" \
    -a followReferrals=false \
    LDAP_SERVER_IP:LDAP_SERVER_PORT
Please note that your serviceSearchDescriptor attributes might be different, depending on your LDAP structure. This attribute just instructs the LDAP client how to build its search queries, in my particular case for passwd, group, netgroup and shadow records.
passwd: compat
passwd_compat: ldap
group: files ldap
hosts: files dns
ipnodes: files dns
networks: files
protocols: files
rpc: files
ethers: files
netmasks: files
bootparams: files
publickey: files
netgroup: ldap
automount: files
aliases: files
services: files
printers: user files
auth_attr: files
prof_attr: files
project: files
tnrhtp: files
tnrhdb: files
cat /etc/pam.conf | grep sshd-kbdint
sshd-kbdint auth requisite pam_authtok_get.so.1 debug
sshd-kbdint auth required pam_unix_cred.so.1 debug
sshd-kbdint auth binding pam_unix_auth.so.1 server_policy debug
sshd-kbdint auth required pam_ldap.so.1 debug
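When reviewing the configuration, it helps to list only those name-service databases that actually consult LDAP, since a stray “ldap” on the wrong line (or a missing one) is easy to overlook. A minimal sketch for a file in nsswitch.conf format; the function name ldap_databases is made up for this example.

```shell
#!/bin/sh
# Print the name of every database in an nsswitch.conf style file whose
# source list contains "ldap". Comment lines are skipped.
ldap_databases() {
    awk '
        /^[[:space:]]*#/ { next }
        NF >= 2 {
            db = $1
            sub(/:$/, "", db)             # drop the trailing colon
            for (i = 2; i <= NF; i++) {
                if ($i == "ldap") {
                    print db
                    break
                }
            }
        }' "$1"
}

# Example: ldap_databases /etc/nsswitch.conf
```

For the nsswitch.conf shown above this should list passwd_compat, group and netgroup, which are the three databases handed over to LDAP in this setup.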
If anything goes wrong, you could do the following:
- Use ldapsearch -v to make sure you can set up a TLS session with your LDAP server successfully.
- Enable PAM debugging and check the logs. To do that, just run “touch /etc/pam_debug”, edit /etc/syslog.conf and add a new line (if it isn’t already there, of course):
And restart syslog with “svcadm restart svc:/system/system-log:default”.
- Analyze the logs on your LDAP server.
- Switch off TLS and sniff the traffic with snoop to make sure your LDAP client sends reasonable queries.
Have fun and happy tinkering!