The individual owning this blog works for Oracle in Germany. The opinions expressed here are his own, are not necessarily reviewed in advance by anyone but the individual author, and neither Oracle nor any other party necessarily agrees with them.
Friday, March 9. 2012
Yesterday i had the opportunity to show Oracle VM Server for SPARC in action in front of customers. Not a single slide was used ... everything was live. The following entry shows what i essentially did in this demo. Long-time users of LDoms or Oracle VM Server for SPARC (as they are called today) have probably seen all of this before, however they weren't the intended audience of this walkthrough. In this example i've configured the control domain and one guest domain, installed the guest with Solaris 11 and migrated it live (without service interruption) from one system to another.
Okay, i started with two unconfigured (okay, to be exact ... deconfigured) systems of the type SPARC T3-4, so i had plenty of resources to play with. The first system was node1 listening on 10.128.0.72, the second system was node2 listening on 10.128.0.73.
Just to be sure, i checked the configuration.
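A quick ldm list on both nodes does the job:

```shell
# show all logical domains with their state, console port, vcpus and memory
ldm list
```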
There was just a single logical domain with all resources (512 virtual CPUs and 256 GB memory) assigned to it. The situation on the second node was the same. No wonder: same HW config, same SW config.
Ensure that you have enabled the vntsd daemon on both systems
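On a Solaris 10 control domain this is a single svcadm call:

```shell
# vntsd provides the virtual consoles for the guest domains
svcadm enable svc:/ldoms/vntsd:default
```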
Okay, the basics were the same, now i had to start the basic configuration. It's important to know that those single large domains will act as so-called control domains, however they can be significantly smaller for that task. The already running Solaris 10 was kept unharmed and became the OS of the control domain.
First step was to configure the virtual console server:
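The command for that, with the names and port range described below:

```shell
# virtual console concentrator primary-vcc0 in the domain primary, ports 5000-5100
ldm add-vcc port-range=5000-5100 primary-vcc0 primary
```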
With this command you configure a console server named primary-vcc0 in the domain primary, listening on ports 5000 to 5100. Okay, the next step was to configure the so-called virtual disk server. As long as you don't configure any hardware directly into a domain, like a networking card for iSCSI or an HBA for storage access, the virtual disk server is the daemon that provides storage devices to all guest domains.
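The virtual disk server needs just a one-liner as well:

```shell
ldm add-vds primary-vds0 primary
```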
With this command we have configured a virtual disk server called primary-vds0 in the domain primary. The next step is the configuration of the networking. For this task we configure a virtual switch.
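Again just one command:

```shell
# virtual switch primary-vsw0, uplinked to the real network via igb0
ldm add-vsw net-dev=igb0 primary-vsw0 primary
```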
The virtual switch called primary-vsw0 is running in the domain primary and it's connecting into the real world via the device igb0. When you want to check all the services you have just configured, you can do this with a single command.
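That single command is ldm list-services:

```shell
# shows the vcc, vds and vsw services we have just configured
ldm list-services primary
```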
At the moment this primary domain is using all the resources of the system. In order to be able to configure some guests, we have to free up some room. So at first we reduce the number of assigned crypto units; i just want to give it one.
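Reducing the crypto units looks like this:

```shell
# leave just one crypto unit (MAU) in the primary domain
ldm set-mau 1 primary
```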
In the next step we assign 8 processors to the domain.
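One more one-liner:

```shell
ldm set-vcpu 8 primary
```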
Okay, let's check the current configuration.
Okay, just 8 virtual CPUs, however the domain still occupies all the memory in the system. We have to reduce that. Technically it's possible to do this on the running system, but shrinking a running logical domain from 256 GB to 8 GB is quite some work, so most often it is just faster to put the domain into deferred reconfiguration mode, do the configuration and reboot the system, as at this moment nothing runs on the system anyway. In deferred reconfiguration the config change is accepted, but it will be executed with the next reboot:
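Deferred reconfiguration is started like this:

```shell
# put the primary domain into deferred reconfiguration mode
ldm start-reconf primary
```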
Now we set the memory of the domain primary to 8 GB
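```shell
ldm set-memory 8G primary
```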
Saving the config to the ILOM and rebooting the system.
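The name of the saved configuration ("initial" here) is of course arbitrary:

```shell
# save the current configuration to the service processor, then reboot
ldm add-config initial
shutdown -y -g0 -i6
```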
Okay, while the first system is rebooting, we just repeat the same configuration steps on the second system:
We now check the config on both systems. On the first system:
On the second system.
Okay ... i have to explain a little bit ... 10.10.1.37 is a S7000 filer i've used for central storage. In the directory /ldoms/isos i've put an iso of the Solaris 11 11/11 text install image.
As i want to install the ldom i will create later on, i add this iso to the virtual disk server as a device:
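The iso sits on an NFS share of the filer; the exact filename below is an assumption, the rest is as described:

```shell
# export the iso read-only as the virtual disk device sol11iso
ldm add-vdsdev options=ro /net/10.10.1.37/ldoms/isos/sol-11-1111-text-sparc.iso sol11iso@primary-vds0
```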
Okay, i want to demo live migration in this walkthrough, so i need some shared storage. It's obvious why: it makes no sense to migrate a logical domain to a system that doesn't have access to the same disk devices. So i configured my S7000 filer to offer some LUNs via iSCSI. However, i have to configure the primary domain in order to actually use these LUNs. This is pretty easy. At first we tell the iSCSI initiator of Solaris 10 that there are disks to find behind 10.10.1.37.
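That's a single iscsiadm call:

```shell
iscsiadm add discovery-address 10.10.1.37
```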
Now we tell Solaris to discover the LUNs behind this IP.
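```shell
# enable the sendtargets discovery method
iscsiadm modify discovery --sendtargets enable
```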
And now we populate the /dev tree with the necessary device nodes.
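```shell
devfsadm -i iscsi
```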
Okay, repeat on the second system.
Okay, let's have a look at what the system has found. From the configuration on the filer i knew that there must be something like 600144F0C56DC0FB00004F586FD60004 in the disk id. As the disk is unlabeled at that time, the format command will offer to do this labeling for you. Do it ... you need a labeled disk.
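format lists the disks; you should find the WWN in one of the device names:

```shell
# look for c0t600144F0C56DC0FB00004F586FD60004d0 in the list, select it, label it
format
```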
Okay, now check the availability of the disk on the other server.
Check ... it's there. The next step is the last one in this tour that we have to execute on both systems. With this command we assign the disk /dev/dsk/c0t600144F0C56DC0FB00004F586FD60004d0s2 on both nodes as lmtest001iscsibootdisk to the virtual disk server called primary-vds0.
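```shell
ldm add-vdsdev /dev/dsk/c0t600144F0C56DC0FB00004F586FD60004d0s2 lmtest001iscsibootdisk@primary-vds0
```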
Okay, now we configure our first guest domain.
At first we just create the domain.
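```shell
ldm add-domain lmtest001
```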
Now we add 8 virtual CPUs to the domain.
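```shell
ldm add-vcpu 8 lmtest001
```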
Of course a domain needs memory, so i give it 16 GB.
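```shell
ldm add-memory 16G lmtest001
```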
Now i'm creating a networking interface for the domain lmtest001 connected to the virtual switch primary-vsw0 and naming it vnet1.
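```shell
# virtual network device vnet1 for lmtest001, attached to primary-vsw0
ldm add-vnet vnet1 primary-vsw0 lmtest001
```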
Okay, as my iSCSI disk is totally empty, i have to provide an installation image (i could do this via Jumpstart or AI, however that would be out of scope of this short article). So i assign the virtual disk device sol11iso on the virtual disk server primary-vds0 (remember, we configured it earlier) to lmtest001. To the domain it's known as vdisk_iso.
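```shell
ldm add-vdisk vdisk_iso sol11iso@primary-vds0 lmtest001
```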
Now i have to assign the iscsi boot disk to the domain. The command is quite similar.
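```shell
# the iscsi LUN appears in the domain as the virtual disk "bootdisk"
ldm add-vdisk bootdisk lmtest001iscsibootdisk@primary-vds0 lmtest001
```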
The next step is to declare the boot device and to tell the system to boot automatically from it.
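Both are OBP variables set via ldm:

```shell
ldm set-var boot-device=bootdisk lmtest001
ldm set-var auto-boot\?=true lmtest001
```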
However, the domain is still inactive and no resources have been bound to it.
So we bind the resources with a single command:
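```shell
ldm bind-domain lmtest001
```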
When you look up the status again, you see a state transition. The domain isn't "inactive" any longer, it's now bound.
Now it's time to start up the domain.
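The console port (5000 in this sketch) is the one ldm list reports for the domain:

```shell
ldm start-domain lmtest001
# connect to the domain's console via the vntsd port shown by ldm list
telnet localhost 5000
```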
When you look at the console of the freshly started domain, you see a boot prompt just like with a native SPARC machine. As there is no operating system on the device we've called bootdisk, the system doesn't come up but stays at that prompt.
Now let's boot from the iso image:
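```shell
# at the OpenBoot ok prompt of the domain's console
boot vdisk_iso
```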
Okay, at first the system comes up from the ISO, as you can see.
Okay, this will now take a while. I won't write about it. It's a standard Solaris 11 install. You know the drill.
After the reboot initiated by the installation procedure, the system comes up with an installed OS. As you will recognize from the string of the boot device, you have now booted from the iSCSI boot disk.
Okay, let's play a little bit with the domain. Log in to the shell of the system. When you execute prtdiag, you will see 16 GB of memory and 8 virtual CPUs.
Okay, let's assume we've changed our mind and want a domain with 8 additional virtual CPUs. You can do this while the domain is running:
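From the control domain this is a live operation:

```shell
ldm add-vcpu 8 lmtest001
```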
When you do another prtdiag in the still running domain, you will see 16 virtual CPUs.
Okay, 8 additional gigabytes of memory may be a nice idea, too. So let's add them to the running domain.
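```shell
ldm add-memory 8G lmtest001
```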
Do another prtdiag, and we see 24 gigs of memory.
However, we aren't really in a decisive mood today, think that our first config was nicer, and revert to the old values. So we remove the 8 gigs of memory again from the domain lmtest001.
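```shell
ldm remove-memory 8G lmtest001
```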
And again our domain has just 16 GB.
Now we just have to remove the 8 additional vcpus.
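```shell
ldm remove-vcpu 8 lmtest001
```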
Okay, a last time we will execute prtdiag.
Again, back to 8 virtual CPUs in the domain.
Okay, but now back to our demonstration of live migration. I would like to demonstrate the live migration with some network traffic, so i need an IP address. So i configure one on the network interface of the domain.
Just a short test from my local workstation. Just a short remark ... the round-trip times are that bad because my servers were in Scotland and i was in the Düsseldorf FTL lounge, connected via VPN over a UMTS line ...
Okay, let's kick off the live migration. With this command i order the logical domain manager to migrate the domain lmtest001 to the server running a control domain on 10.128.0.73. The password in my case was the root password of that control domain.
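```shell
# ldm prompts for the password of the target control domain
ldm migrate lmtest001 10.128.0.73
```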
This will take a while, and you will just get back your prompt in a very unspectacular way as soon as the migration has completed.
What happened in the meantime?
Well, a single ping has been lost. However, what's more interesting: there is no domain lmtest001 on my server anymore.
Because it's on the other one.
Of course i could migrate back to my old system.
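Same command, other direction, executed on the control domain on 10.128.0.73:

```shell
ldm migrate lmtest001 10.128.0.72
```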
It has disappeared from this server ...
And is now back on the original one.
Do you want to learn more?
docs.oracle.com: Oracle VM for SPARC Documentation
Very nice write-up. No more downtime for planned hardware maintenance. Amazing feature.
How would you compare two technologies: LDOMs and solaris zones? Which one is going to be developed and used in future? Could you recommend any nice white paper to read about the pros and cons of both those technologies in comparison? Don't you feel that oracle would end development of LDOMs in order to make zoning the only virtualisation technology?
LDOMs are hardware partitioning, whereas a zone is a virtualized Solaris instance.
Oracle likes the hardware partitioning; it lets purchasers of their high-end servers utilise them quickly without requiring more/cheaper hardware. E.g.: Oracle DB licensed per proc.
Hope this helps.
The difference between hardware and software virtualisation is clear. What is unclear for me is the further development of both technologies and the real benefits of LDOMs in comparison to zones from a practical POV. It seems to me that LDOMs are to be killed in the future as they provide no real advantages. Am I right or not?
"Both Oracle Database 9i R2 and 10g R2 databases have been certified to run in an Oracle Solaris Container. A licensing agreement recognizes capped Oracle Solaris 10 Containers as hard partitions. The ability to license only the CPUs or cores configured in an Oracle Solaris Container provides flexibility, consolidation opportunities, and possible cost savings." From here: http://developers.sun.com/solaris/docs/oracle_containers.pdf
Well, you are wrong. Totally.
1. LDOMs and Zones are mechanisms with different targets.
2. Zones are designed as a lightweight mechanism. LDOMs have a higher level of isolation, but therefore also a higher overhead. Zones are based on a single kernel; each LDOM has its own kernel. Both situations have advantages and disadvantages. You can have different OS kernels in each LDOM. As you may have noticed, the demo above runs a Solaris 10 control domain with a Solaris 11 guest domain. With Zones you have exactly the same kernel in each zone, which can be problematic if one vendor says "i want this OS patch level" and another says "i want this one" and both are not identical.
3. There is no "XOR" in this question. Most customers use both, in the sense of having LDOMs with a large number of Zones in each of them.
4. The article above should show you one of the real advantages. You can't live migrate with zones, however with LDOMs you can. I know that there are vendors that offer some kind of live migration for a kind of zone, however even friends of those architectures admit that those technologies are a mess, and you have to work with more and more conversion/lookup tables, thus increasing the overhead for normal operation.
thank you for an excellent detailed post. I have a few questions...
so the lesson is, an iSCSI bootdisk coming from a S7000 filer was logically migrated from node1 (a T4 system) to another physical chassis called node2 (a T4 system).
This gives node1 and node2 the ability to "host" or pass back and forth a Solaris OS instance called "lmtest001" that is wholly contained on OBP device "/virtual-devices@100/channel-devices@200/disk@1:a"
Q. What is the approximate time in seconds or minutes to accomplish "ldm migrate lmtest001 10.128.0.73"? Is your prompt held for the duration, and does it report explicit success or failure of the migration command?
Q. What is the "state" of the Solaris OS instance "lmtest001" during this timeframe? is it running or must it be "halted" with init 0?
I only have 1 T5540 series (need another!), just migrating away from the Sun 6800 (Dynamic Reconfiguration within the same chassis), shared FC storage (Sun 2540) and Veritas (VCS) world.
No, the lesson is "How to move a running OS with it's running application from one server to another"!
1. The time for the migration can be considerable, as the content of the memory of the LDOM is transferred to the other system. LDM will return as soon as the migration has completed or ended with an error.
2. lmtest001 is running during the migration. You can migrate a domain under full production load. Depending on the rate of memory page changes there may be a short moment where the domain is frozen at the very end of the migration; however, as i showed in the example, just a single ping timed out here.
Very nice Joerg. Where can I find the docs/requirements for ldm? Can I do Live Migration with the T1-series T1000 boxes?
And does a live migrated system drag it's NFS mounted file systems with it to the new location?
And can I migrate to unlike T-series machines, like from the T1000 to a T5440 and back? Thanks for your time.
Jörg, one more question (which I didn't ask on Thursday as we were already short on time).
In your example you assign a fixed number of processors to the LDOM. Is it possible to assign "dynamic" resource like you could do while creating resource pools (e.g. "poolcfg -c 'create pset zone_pset (uint pset.min=2; uint pset.max=4)' ") for zones?
Yes, you can do that, it's called Dynamic Resource Management.
Great. So this would be a facility to "overbook" the hardware?
If so, what gets checked when migrating to another box? I guess at least "vcpu-min" must be matched?
Btw, nice to see that this blog is getting back to life lately.
Pleased to see some great articles being posted on your site again ... looks like your site could be my most visited once again.
Hope all is going well
All the best
Good article and questions, here is another one
If you have zones running inside the LDOM guest, can you still live migrate the LDOM with the zones inside it?
Of course ... as the Zones are part of the migrated OS, the Zones are migrated with the LDOM.