A recurring question in my conversations about zones is: what is the boot sequence of zones when you autostart them? For all practical purposes ... there isn't a sequence. Okay, of course there is one. It's the same order that shows up when you look at your zones with zoneadm list -pi. The reason for this is simple, just look into the method script responsible for booting the zones. You will find it at /lib/svc/method/svc-zones. But the sequence is mostly irrelevant because of the & after the zoneadm boot command: each boot is sent to the background, so the script doesn't wait for one zone before it starts the next one.
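To see why the & removes any meaningful ordering, here is a minimal sketch of the pattern. It is not the real svc-zones script: boot_zone is a hypothetical stand-in for zoneadm -z <zone> boot, and the zone names are simply the ones used later in this article.

```shell
#!/bin/sh
# Hypothetical stand-in for "zoneadm -z <zone> boot", so the sketch
# runs anywhere; it only reports what it would do.
boot_zone() {
    echo "booting $1"
}

# Issue all boots in the background, the way a trailing & does.
boot_all_zones() {
    for zone in "$@"; do
        boot_zone "$zone" &
    done
    # Wait for the background jobs; their completion order is not defined.
    wait
    echo "all boots issued"
}

boot_all_zones serverzone clientzone
```

The loop walks the zones in a fixed order, but nothing guarantees that the first zone is up (or even further along) before the second boot starts.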
In some cases this isn't what you want (and most questions about the sequence arise out of the non-sequence), for example because you have an application zone that needs a database zone.
A simple and obvious idea
I thought about it for a moment and arrived at a relatively simple idea. There is already something in Solaris that solves dependencies and starts components in the right sequence. It's the Service Management Facility.
I want to put a disclaimer on the following stuff ... it's an idea, a proof-of-concept hack I put together in 20 minutes or so. When I really think about it, I can see several points of improvement, but I have to give a two-hour talk in Kiel next week, so the current state has to be sufficient for now. It took vastly longer to write the explanations than to write the scripts.
A showcase for the idea
The problem I've used for demonstration is as follows. You have a zone serverzone. This zone provides file services with Samba. You have another zone clientzone, and for some strange reason there is an application in this zone that should only be started when the Samba service is available. Okay, not a good example, but I didn't want to install a database and a web server with scripting.

Of course a well-behaved system of several applications shouldn't insist on a certain sequence of application startup, however there are cases where your applications weren't written by people aware of this rule.
Okay, the generic startup of zones doesn't help. It starts all zones at the same moment. And now this obvious idea comes into play.
Implementation of the idea
We won't start the zones via /lib/svc/method/svc-zones, so don't put them in autoboot mode. We will create a distinct startup SMF service per zone.
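A small sketch for taking the zones out of autoboot mode, assuming the two zone names used in this article. The ZONECFG variable is my own addition so that you can do a dry run (for example with ZONECFG="echo zonecfg") before you touch the configuration.

```shell
#!/bin/sh
# Path to zonecfg; override it for a dry run (this knob is an assumption
# of the sketch, not something zonecfg itself provides).
ZONECFG=${ZONECFG:-/usr/sbin/zonecfg}

# Switch off autoboot for every zone given as argument.
disable_autoboot() {
    for zone in "$@"; do
        $ZONECFG -z "$zone" "set autoboot=false" || return 1
    done
}

# Usage: disable_autoboot serverzone clientzone
```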
Starting a single zone with an SMF service
I've created an SMF manifest for each zone. The first manifest is for the zone serverzone. Put the following content into the file /var/tmp/serverzone.xml:
<?xml version="1.0"?>
<!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">
<service_bundle type='manifest' name='zonestarter'>
<service
name='zonesequencer/serverzone'
type='service'
version='1'>
<dependency
name='network'
grouping='require_all'
restart_on='none'
type='service'>
<service_fmri value='svc:/milestone/network:default' />
</dependency>
<exec_method
type='method'
name='start'
exec='/lib/svc/method/zonestarter %m %{config_params/name}'
timeout_seconds='120' />
<exec_method
type='method'
name='refresh'
exec='/lib/svc/method/zonestarter %m %{config_params/name}'
timeout_seconds='120' />
<exec_method
type='method'
name='stop'
exec='/lib/svc/method/zonestarter %m %{config_params/name}'
timeout_seconds='120' />
<property_group name='startd' type='framework'>
<propval
name='duration' type='astring' value='transient' />
</property_group>
<property_group name='general' type='framework'>
<propval
name='action_authorization'
type='astring'
value='solaris.smf.manage.zonestarter.serverzone' />
<propval name='value_authorization'
type='astring'
value='solaris.smf.manage.zonestarter.serverzone' />
</property_group>
<instance name='default' enabled='false'>
<property_group
name='config_params' type='application'>
<propval
name='name' type='astring' value='serverzone'/>
</property_group>
</instance>
<stability value='Unstable' />
<template>
<common_name>
<loctext xml:lang='C'>
zonestartup
</loctext>
</common_name>
</template>
</service>
</service_bundle>
Now import it via svccfg import /var/tmp/serverzone.xml.
The second one is for the zone clientzone. Put the content into /var/tmp/clientzone.xml:
<?xml version="1.0"?>
<!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">
<service_bundle type='manifest' name='zonestarter'>
<service
name='zonesequencer/clientzone'
type='service'
version='1'>
<dependency
name='network'
grouping='require_all'
restart_on='none'
type='service'>
<service_fmri value='svc:/milestone/network:default' />
</dependency>
<exec_method
type='method'
name='start'
exec='/lib/svc/method/zonestarter %m %{config_params/name}'
timeout_seconds='120' />
<exec_method
type='method'
name='refresh'
exec='/lib/svc/method/zonestarter %m %{config_params/name}'
timeout_seconds='120' />
<exec_method
type='method'
name='stop'
exec='/lib/svc/method/zonestarter %m %{config_params/name}'
timeout_seconds='120' />
<property_group name='startd' type='framework'>
<propval
name='duration' type='astring' value='transient' />
</property_group>
<property_group name='general' type='framework'>
<propval
name='action_authorization'
type='astring'
value='solaris.smf.manage.zonestarter.clientzone' />
<propval name='value_authorization'
type='astring'
value='solaris.smf.manage.zonestarter.clientzone' />
</property_group>
<instance name='default' enabled='false'>
<property_group
name='config_params' type='application'>
<propval
name='name' type='astring' value='clientzone'/>
</property_group>
</instance>
<stability value='Unstable' />
<template>
<common_name>
<loctext xml:lang='C'>
zonestartup
</loctext>
</common_name>
</template>
</service>
</service_bundle>
You have to import it into the SMF repository as well, so do this via svccfg import /var/tmp/clientzone.xml. Obviously this manifest needs a method script, so I wrote this little script. Put it at /lib/svc/method/zonestarter:
# cat zonestarter.sh
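#!/bin/sh
. /lib/svc/share/smf_include.sh

# Nothing to do on systems without zoneadm
[ ! -x /usr/sbin/zoneadm ] && exit 0

#if [ -z "$SMF_FMRI" ]; then
#    echo "this script can only be invoked by smf(5)"
#    exit $SMF_EXIT_ERR_NOSMF
#fi

cd /
PATH=/usr/sbin:/usr/bin; export PATH

# $1 is the method name (%m), $2 the zone name (config_params/name)
case "$1" in
'start')
    # Report a failed boot to SMF instead of going online anyway
    zoneadm -z "$2" boot || exit $SMF_EXIT_ERR_FATAL
    ;;
'stop')
    zoneadm -z "$2" halt
    ;;
'refresh')
    # Restart the zone: halt it, then boot it again
    zoneadm -z "$2" halt
    zoneadm -z "$2" boot
    ;;
*)
    echo "Usage: $0 { start | stop | refresh }"
    exit 1
    ;;
esac
exit 0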
Make the script executable by
chmod +x /lib/svc/method/zonestarter
Dependencies and checks
When you use those manifests and scripts without further components, you would have essentially the same behaviour as before. SMF tries to start all services that are in the state offline and have no unfulfilled dependencies. But just configuring a dependency between the services zonesequencer/serverzone and zonesequencer/clientzone wouldn't help either. Both are transient. Such services are considered online as soon as their method script returns the exit code 0. A successful zoneadm boot yields such an exit code, but it doesn't wait until all services in the zone are up and running.
Let's steal an idea from the probing of cluster services: we check whether the service we have to wait for is available by actually using it. For this case I put a test file into the public share of my Samba service in the zone serverzone.
# /usr/sfw/bin/smbclient -d 0 -N -c "ls" \\\\192.168.56.102\\public
Anonymous login successful
Domain=[HIVEMIND] OS=[Unix] Server=[Samba 3.0.33]
. D 0 Wed Feb 17 14:22:34 2010
.. D 0 Wed Feb 17 14:05:20 2010
test2 0 Wed Feb 17 14:22:34 2010
checkfile.smf.d8e8fca2dc0f896fd7cb4cb0031ba249 A 1024 Wed Feb 17 14:20:11 2010
test A 0 Wed Feb 17 14:10:03 2010
57858 blocks of size 262144. 43180 blocks available
It's a safe assumption that the Samba service has been started in serverzone when you see the test file in the share. You can easily check this in scripts with smbclient.
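As a sketch of such a check: the function below reads a share listing on stdin and succeeds when the test file is in it. The name share_has_file is made up for this example; the smbclient call in the usage comment is the one from this article.

```shell
#!/bin/sh
# Succeeds (exit code 0) when the listing on stdin mentions the file $1.
# The output of grep is thrown away instead of using "grep -q", which
# the classic Solaris /usr/bin/grep may not support.
share_has_file() {
    grep "$1" >/dev/null
}

# Usage against a real share (host and share name as in this article):
#   /usr/sfw/bin/smbclient -d 0 -N -c "ls" //192.168.56.102/public |
#       share_has_file checkfile.smf.d8e8fca2dc0f896fd7cb4cb0031ba249
```

Only the exit code matters here, which is exactly what a method script needs for its decision.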
With this assumption we develop a method script. It's meant for a transient service as well, so its start method doesn't return until the script was able to connect to the Samba service and has seen the test file.
Okay, put this little script into /lib/svc/method/zonechecker_samba and make it executable afterwards:
#!/bin/bash
. /lib/svc/share/smf_include.sh

if [ -z "$SMF_FMRI" ]; then
    echo "this script can only be invoked by smf(5)"
    exit $SMF_EXIT_ERR_NOSMF
fi

cd /
PATH=/usr/sbin:/usr/bin; export PATH

# Read a property of the service instance this method runs for
getproparg() {
    val=`svcprop -p $1 $SMF_FMRI`
    [ -n "$val" ] && echo $val
}

testfile=`getproparg config_params/testfile`
host=`getproparg config_params/host`
share=`getproparg config_params/share`

# List the share anonymously. A function keeps the quoting and the
# redirection intact, which a command stored in a string would not.
checkshare() {
    /usr/sfw/bin/smbclient -g -d0 -N -c "ls" "//$host/$share" 2>/dev/null
}
echo "checking //$host/$share for $testfile"

case "$1" in
'start')
    # Poll every 5 seconds until the test file shows up in the share
    while true; do
        if checkshare | grep "$testfile" >/dev/null; then
            exit 0
        fi
        sleep 5
    done
    ;;
'stop')
    # not implemented
    ;;
'refresh')
    # not implemented
    ;;
*)
    echo "Usage: $0 { start | stop | refresh }"
    exit 1
    ;;
esac
exit 0
As you may have noticed, the script doesn't contain configurable data like the hostname, the share or the name of the test file. The script gets this information from the SMF service configuration.
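The lookup itself is nothing more than one svcprop call per property. A minimal sketch of that pattern outside of a method script; the SVCPROP variable and the name read_param are assumptions of this example, and the FMRI in the usage comment is the one from the manifest that follows.

```shell
#!/bin/sh
# Command used to query the SMF repository; overridable for testing
# (an assumption of this sketch).
SVCPROP=${SVCPROP:-/usr/bin/svcprop}

# Print the value of property $2 of service instance $1.
read_param() {
    $SVCPROP -p "$2" "$1"
}

# Usage:
#   read_param svc:/zonechecker/serverzone:default config_params/host
```

Inside a method script you don't have to spell out the FMRI, because SMF hands it over in the environment variable SMF_FMRI.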
Okay, put this stuff into
/var/tmp/zonechecker.serverzone.xml:
<?xml version="1.0"?>
<!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">
<service_bundle type='manifest' name='zonechecker'>
<service
name='zonechecker/serverzone'
type='service'
version='1'>
<dependency
name='serverzone'
grouping='require_all'
restart_on='none'
type='service'>
<service_fmri value='svc:/zonesequencer/serverzone:default' />
</dependency>
<dependent name='clientzone' grouping='require_all' restart_on='none'>
<service_fmri value='svc:/zonesequencer/clientzone:default' />
</dependent>
<exec_method
type='method'
name='start'
exec='/lib/svc/method/zonechecker_samba %m'
timeout_seconds='0' />
<exec_method
type='method'
name='refresh'
exec='/lib/svc/method/zonechecker_samba %m'
timeout_seconds='0' />
<exec_method
type='method'
name='stop'
exec='/lib/svc/method/zonechecker_samba %m'
timeout_seconds='0' />
<property_group name='startd' type='framework'>
<propval
name='duration' type='astring' value='transient' />
</property_group>
<property_group name='general' type='framework'>
<propval
name='action_authorization'
type='astring'
value='solaris.smf.manage.zonechecker.serverzone' />
<propval name='value_authorization'
type='astring'
value='solaris.smf.manage.zonechecker.serverzone' />
</property_group>
<instance name='default' enabled='false'>
<property_group
name='config_params' type='application'>
<propval name='testfile' type='astring' value='checkfile.smf.d8e8fca2dc0f896fd7cb4cb0031ba249' />
<propval name='host' type='astring' value='192.168.56.102' />
<propval name='share' type='astring' value='public' />
</property_group>
</instance>
<stability value='Unstable' />
<template>
<common_name>
<loctext xml:lang='C'>
zonestartup
</loctext>
</common_name>
</template>
</service>
</service_bundle>
You have to import this manifest as well with
svccfg import /var/tmp/zonechecker.serverzone.xml
Before I proceed I want to explain some components of this manifest. The most important parts are:
- The configuration of the dependency and the dependent puts this service between the startup of serverzone and clientzone. Just a short explanation: the dependent element in a manifest enables you to tell SMF that other services depend on your new service, without the need to modify the other manifests.
Due to this configuration the service svc:/zonesequencer/clientzone:default depends on svc:/zonechecker/serverzone:default, which in turn depends on svc:/zonesequencer/serverzone:default. svc:/zonesequencer/clientzone:default only starts when svc:/zonechecker/serverzone:default was started successfully (translated: it was able to find the test file via SMB), and svc:/zonechecker/serverzone:default only starts when svc:/zonesequencer/serverzone:default has been started.
- The role of the config_params property group is obvious. It configures the name of the test file and the host and share where the script should look for it.
- The timeout_seconds value of the methods is set to 0. This translates into "no timeout".
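All three manifests set enabled='false' on their instances, so nothing happens until you enable the services. A sketch of that step, assuming the FMRIs from the manifests above; the SVCADM variable exists only to allow a dry run and is my addition.

```shell
#!/bin/sh
# Path to svcadm; override it for a dry run (assumption of this sketch).
SVCADM=${SVCADM:-/usr/sbin/svcadm}

# Enable the starter for the server zone, the checker and the starter
# for the client zone. The order of these calls doesn't matter, SMF
# works out the start order from the dependencies.
enable_zone_services() {
    for fmri in \
        svc:/zonesequencer/serverzone:default \
        svc:/zonechecker/serverzone:default \
        svc:/zonesequencer/clientzone:default
    do
        $SVCADM enable "$fmri" || return 1
    done
}

# Usage: enable_zone_services
```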
A test
Let's try this hack. I've disabled Samba in serverzone to ensure that I have enough time to observe the situation, and so that I can trigger the boot of clientzone simply by enabling the service in serverzone. I did a reboot of the system afterwards. After a few moments the system should be ready for login. So let's look for our services in the global zone.
# svcs -a | grep "zone"
online 20:55:41 svc:/zonesequencer/serverzone:default
online 20:56:53 svc:/system/zones:default
offline 20:55:32 svc:/zonesequencer/clientzone:default
offline* 20:55:41 svc:/zonechecker/serverzone:default
Let's dissect this output: Forget about svc:/system/zones:default. It just starts zones configured for autoboot. online just tells you that the method script went through your configuration and looked for such zones. No zone with autoboot, no zone boots triggered by this service.
svc:/zonesequencer/serverzone:default is online. So the zoneadm -z serverzone boot command in our script went through successfully.
offline* must be read as "The service isn't online, but I'm trying to get it there at the moment". As our method script doesn't return before it was able to connect to the Samba service in serverzone, and as we've switched off the timeout, the service will stay in this state until the Samba connect was successful. The offline of svc:/zonesequencer/clientzone:default can be translated to "I'm offline, and as there are still dependencies (we know it's just svc:/zonechecker/serverzone:default) in the offline or offline* state, I don't try to start it at the moment."
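If you want to watch this from a script, you can filter the svcs output for services that aren't online yet. A minimal sketch, assuming the three-column layout (state, start time, FMRI) shown above; pending_services is a made-up name.

```shell
#!/bin/sh
# Reads "STATE STIME FMRI" lines on stdin and prints the FMRI of every
# service whose state is not exactly "online" (offline, offline*, ...).
pending_services() {
    awk '$1 != "online" && NF >= 3 { print $3 }'
}

# Usage: svcs -H -a | grep zone | pending_services
```

An empty result means that all matching services have made it through their start methods.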
When you look at the output of
zoneadm you will see exactly the matching situation.
# zoneadm list -pi
0:global:running:/::native:shared
1:serverzone:running:/zones/serverzone:c32c5e48-e6fb-6793-d064-f62e12bea4aa:native:shared
-:clientzone:installed:/zones/clientzone:23278132-6693-e55c-f859-cfc961a7f9aa:native:shared
serverzone has booted ... clientzone has not.
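The colon-separated output of zoneadm list -p is easy to process in scripts. A small sketch that prints name and state per zone, assuming the field order shown above (ID, name, state, path, ...); zone_states is a made-up helper name.

```shell
#!/bin/sh
# Reads "zoneadm list -p" style lines on stdin and prints "name state".
zone_states() {
    awk -F: '{ print $2, $3 }'
}

# Usage: zoneadm list -pi | zone_states
```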
Now let's start the Samba service in serverzone. I just zlogin into the zone and do an svcadm enable samba over there. When you look at the state of your services again, you will see that all zone services went successfully through their method scripts.
# svcs -a | grep "zone"
online 20:55:41 svc:/zonesequencer/serverzone:default
online 20:56:53 svc:/system/zones:default
online 21:01:30 svc:/zonechecker/serverzone:default
online 21:01:32 svc:/zonesequencer/clientzone:default
A quick check of the zone states confirms that both zones are running now:
# zoneadm list -pi
0:global:running:/::native:shared
1:serverzone:running:/zones/serverzone:c32c5e48-e6fb-6793-d064-f62e12bea4aa:native:shared
2:clientzone:running:/zones/clientzone:23278132-6693-e55c-f859-cfc961a7f9aa:native:shared
Conclusion
The scripts and manifests provided here aren't complete, but I hope that they give you a starting point for your own work when you have to boot your zones in a certain sequence that depends on service availability.