Expanding Solaris filesystem

You probably know the growfs command, the key tool for expanding a UFS file system non-destructively. But sometimes a growfs attempt ends with the following error message:

failed to disable logging

The reason for this error is not obvious. According to the source code, it is printed when the return value from rl_log_control() is not RL_SUCCESS. But a look inside rl_log_control() does not make the actual reason much clearer. Since rv is initialized to RL_SUCCESS at the very beginning of the function, there appear to be only two places where it could become RL_SYSERR:

	rl_result_t	rv = RL_SUCCESS;

	if (alreadymounted == RL_TRUE)
		fd = open(li.li_mntpoint, O_RDONLY);
	else
		fd = open(li.li_tmpmp, O_RDONLY);
	if (fd == SYSERR) {
		perror("open");
		rv = RL_SYSERR;
		goto out;
	}

	fl.nbytes_requested = 0;
	fl.nbytes_actual = 0;
	fl.error = FIOLOG_ENONE;

	if (ioctl(fd, request, &fl) == SYSERR) {
		perror("ioctl");
		(void) close(fd);
		rv = RL_SYSERR;
		goto out;
	}

Unfortunately, I wasn’t able to dtrace the issue more thoroughly: stopping xntpd and syslog resolved the problem, and all attempts to reproduce the same behavior on our testbed system failed.

Update 1
Actually, I’ve just successfully reproduced the faulty behavior and am going to fiddle with dtrace tomorrow.
Update 2
Found a related bug, ID 6625306.
Update 3
In the end, I didn’t reach the root cause of the problem, but from what I’ve observed I can confirm, as described in the aforementioned bug, that the error “failed to disable logging” is misleading. Below is the actual sequence of function calls:

mkfs.c:growinit()
       |
       V
roll_log.c:rl_log_control()
       |
       V
ioctl.c:ioctl()
       |
       V
vnode.c:fop_ioctl()
       |
       V
ufs_vnops.c:ufs_ioctl()
       |
       V
ufs_log.c:ufs_fiologdisable()
       |
       V
lufs.c:lufs_disable()

Armed with dtrace, it was easy to verify that the ioctl functions returned 0, that is, success, and the truss output agrees:

5858:   open("/", O_RDONLY)                             = 5
5858:   ioctl(5, _ION('f', 88, 0), 0xFFBFC490)          = 0
5858:   ioctl(5, _ION('f', 72, 0), 0xFFBFC4AC)          = 0
5858:   close(5)                                        = 0

But the net result was still negative and I had to stop xntpd to grow the root file system.

Here are the components which add up to the issue:

  • Solaris 10 10/09 s10s_u8wos_08a SPARC
  • 5.10 Generic_142900-02 sun4u sparc
  • Veritas-5.0_MP3_RP2
  • Encapsulated root disk

Update 4
I believe this is going to be the last update in the series of inconsistent postings regarding the “failed to disable logging” error. What I overlooked yesterday is the following check in roll_log.c:

	if (((request == _FIOLOGENABLE) && (!logenabled)) ||
	    ((request == _FIOLOGDISABLE) && logenabled))
		rv = RL_FAIL;

As I mentioned before, the second ioctl, which checks _FIOISLOG, returned success in my case. But what I didn’t check is the logic hidden behind it, partly because I misinterpreted the comment on the _FIOISLOG definition in sys/filio.h, which says:

#define	_FIOISLOG	_IO('f', 72)		/* disksuite/ufs protocol */

I agree it’s a lame excuse. Anyway, here is the code from ufs_log.c:ufs_fioislog():

/*
  * ufs_fioislog
  *	Return true if log is present and active; otherwise false
  */
 /* ARGSUSED */
 int
 ufs_fioislog(vnode_t *vp, uint32_t *islog, cred_t *cr, int flags)
 {
 	ufsvfs_t	*ufsvfsp	= VTOI(vp)->i_ufsvfs;
 	int		active;
 
 	active = (ufsvfsp && ufsvfsp->vfs_log);
 	if (flags & FKIOCTL)
 		*islog = active;
 	else if (suword32(islog, active))
 		return (EFAULT);
 	return (0);
 }

So, after running a trivial dtrace script I just confirmed what I should have noticed straight away:

#!/usr/sbin/dtrace -s

#pragma D option quiet

fbt:ufs:ufs_fioislog:entry
/stringof(args[0]->v_path)=="/"/
{
        self->path=stringof(args[0]->v_path);
        printf ("Vnode path is %s\n", self->path);
}

fbt:ufs:ufs_fioislog:return
/self->path=="/"/
{
        trace (args[1]);
        self->path=0;
}

So the “failed to disable logging” error message is valid and not misleading, but, as correctly noted in the bug:

lufs_enable()/lufs_disable() used by the ufs_ioctl()’s _FIOLOGENABLE and _FIOLOGDISABLE
do report errors while attempting to enable/disable the ondisk log via a corresponding
structure fiolog_t defined in ufs_filio.h and sometimes in addition with a real
error return value returned by to ufs_ioctl()

I should’ve been more vigilant and checked the fiolog_t structure properly, because my initial script contained a mistake. So finally, with

#!/usr/sbin/dtrace -s

#pragma D option quiet

fbt:ufs:lufs_disable:entry
{
self->v_path=stringof(args[0]->v_path);
self->fiolog=args[1];
printf ("Vnode path - %s\n", self->v_path);
}

fbt:ufs:lufs_disable:return
/self->v_path=="/"/
{
printf ("Return value - %d \nfiolog->error - %d", args[1], self->fiolog->error);
exit(0);
}

I received the following result:

Vnode path - /
Return value - 0 
fiolog->error - 4

So in the end the picture became clearer: we failed to write-lock the file system.

#define	FIOLOG_ENONE	0
#define	FIOLOG_ETRANS	1
#define	FIOLOG_EROFS	2
#define	FIOLOG_EULOCK	3
#define	FIOLOG_EWLOCK	4
#define	FIOLOG_ECLEAN	5
#define	FIOLOG_ENOULOCK 6

Why did it fail to write-lock? Well, further dtracing of ufs_lockfs.c:ufs__fiolfs() revealed that the mount device was simply busy: the function returned 16, which, according to errno.h, means:

#define	EBUSY		16		/* Mount device busy */

Ooh!
