Ubuntu – How to stop all processes in a chroot

chrootinitupstart

I have a number of LVM partitions, each containing an Ubuntu installation. Occasionally, I want to do an apt-get dist-upgrade, to update an installation to the most recent packages. I do this with chroot – the process is usually something like:

$ sudo mount /dev/local/chroot-0 /mnt/chroot-0
$ sudo chroot /mnt/chroot-0 sh -c 'apt-get update && apt-get dist-upgrade'
$ sudo umount /mnt/chroot-0

[ not shown: I also mount and unmount /mnt/chroot-0/{dev,sys,proc} as bind-mounts to the real /dev, /sys and /proc, as the dist-upgrade seems to expect these to be present ]

However, after upgrading to precise, this process no longer works – the final umount will fail because there are still open files on the /mnt/chroot-0 filesystem. lsof confirms that there are processes with open files in the chroot. These processes have been started during the dist-upgrade, I'm assuming this is because certain services in the chroot need to be restarted (eg, through service postgresql restart) after the package is upgraded.

So, I figure I need to tell upstart to stop all the services that are running within this chroot. Is there a way to reliably do this?

I've tried:

cat <<EOF | sudo chroot /mnt/chroot-0 /bin/sh
# stop 'initctl' services 
initctl list | awk '/start\/running/ {print \$1}' | xargs -n1 -r initctl stop
EOF

Where initctl list seems to do the right thing and only list processes that have been started in this particular root. I've tried adding this too, as suggested by Tuminoid:

cat <<EOF | sudo chroot /mnt/chroot-0 /bin/sh
# stop 'service' services
service --status-all 2>/dev/null |
    awk '/^ \[ \+ \]/ { print \$4}' |
    while read s; do service \$s stop; done
EOF

However, these doesn't seem to catch everything; processes that have daemonised and been reparented to PID 1 don't get stopped. I've also tried:

sudo chroot /mnt/chroot-0 telinit 0

But in this case, init doesn't distinguish between the separate roots and shuts down the entire machine.

So, is there any way to tell init to stop all processes in a particular chroot, so that I can safely unmount the filesystem? Does upstart have any facility to SIGTERM/SIGKILL all child processes (as would be done during regular shutdown) within a chroot?

Best Answer

I don't trust anything but the kernel to keep a sane state here, so I don't (ab)use init to get this job done, nor do I count on myself actually knowing what is or isn't mounted (some packages can mount extra filesystems, like binfmt_misc). So, for process slaughter, I use:

PREFIX=/mnt/chroot-0
FOUND=0

for ROOT in /proc/*/root; do
    LINK=$(readlink $ROOT)
    if [ "x$LINK" != "x" ]; then
        if [ "x${LINK:0:${#PREFIX}}" = "x$PREFIX" ]; then
            # this process is in the chroot...
            PID=$(basename $(dirname "$ROOT"))
            kill -9 "$PID"
            FOUND=1
        fi
    fi
done

if [ "x$FOUND" = "x1" ]; then
    # repeat the above, the script I'm cargo-culting this from just re-execs itself
fi

And for umounting chroots, I use:

PREFIX=/mnt/chroot-0
COUNT=0

while grep -q "$PREFIX" /proc/mounts; do
    COUNT=$(($COUNT+1))
    if [ $COUNT -ge 20 ]; then
        echo "failed to umount $PREFIX"
        if [ -x /usr/bin/lsof ]; then
            /usr/bin/lsof "$PREFIX"
        fi
        exit 1
    fi
    grep "$PREFIX" /proc/mounts | \
        cut -d\  -f2 | LANG=C sort -r | xargs -r -n 1 umount || sleep 1
done

As an addendum, I'd point out that approaching this as an init problem is probably the wrong way to look at it, unless you actually have an init in the chroot and a separate process space (ie: in the case of LXC containers). With a single init (outside the chroot), and a shared process space, this is no longer "init's problem", but rather just up to you to find the processes that happen to have the offending path, hence the above proc walk.

It's not clear from your initial post if these are fully-bootable systems that you're just upgrading externally (which is how I read it), or if they're chroots that you use for things like package builds. If it's the latter, you might also want a policy-rc.d in place (like the one dropped in by mk-sbuild) that just forbids init jobs starting in the first place. Obviously, that's not a sane solution if these are meant to be bootable systems as well.