Less known Solaris features - IP Multipathing (Part 3): Foundations 2

As I wrote before, there are two methods of failure detection: link-based and probe-based. Both have advantages and disadvantages.

Methods of failure detection

Link based

The link-based method is the faster of the two. Whenever the link goes down, IPMP is notified of the interface’s state change almost immediately, so it can react to such failures nearly instantaneously. Furthermore, it doesn’t need any test addresses: it doesn’t check availability on the IP layer, so there is no need for the interface to communicate independently of the data address. But there is a big disadvantage. It doesn’t check the health of your IP connection; it just checks whether there is a link. It’s like a small signal light that indicates there’s power on the plug but doesn’t tell you whether it’s 220 V or 110 V. There are situations in which a purely link-based mechanism is misleading, especially as networks get more complex. Just think about the following network:

Let’s assume that link 1 fails. Obviously the link at the physical interface goes down. The link-based mechanism detects this failure, and the system can react by switching over to the other network card.

But now let’s assume that link 2 fails.

Link 1 is still up, so the system considers the connection to the network functional. There is no change in the flags of the IP interface. Nevertheless, your network connection is broken, as your default router is gone. A link means nothing when you can’t communicate over it. At first, such scenarios don’t sound very common, and an intelligent network design can prevent them. That’s correct, but just think about el-cheapo media converters from fibre to copper that don’t take down the link on the copper side when the link is down on the fibre side (albeit any decent media converter has a feature that mirrors the link-down state from one side to the other to ease management and to notify the connected system about problems). Or small switches that are misused as media converters (don’t laugh … I have found dusty old 10BASE-T hubs in raised floors that had been working perfectly as media converters for years and years).
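The two failure cases can be written down as a tiny toy model. This is only a sketch of the reasoning, assuming a path of host, link 1, switch (or media converter), link 2, default router; the function names are made up for illustration and nothing here touches a real interface.

```python
# Toy model of the example network:
#   host --link 1-- switch/media converter --link 2-- default router
# All names are illustrative; this does not query real hardware.

def link_based_check(link1_up: bool) -> bool:
    """The host only sees the state of the link attached to its own interface."""
    return link1_up


def router_reachable(link1_up: bool, link2_up: bool) -> bool:
    """IP traffic to the default router needs every link on the path."""
    return link1_up and link2_up


scenarios = {
    "all links up": (True, True),
    "link 1 fails": (False, True),
    "link 2 fails": (True, False),
}

for name, (link1_up, link2_up) in scenarios.items():
    seen = link_based_check(link1_up)
    real = router_reachable(link1_up, link2_up)
    verdict = "ok" if seen == real else "failure goes unnoticed"
    print(f"{name}: link-based says up={seen}, router reachable={real} ({verdict})")
```

Only the "link 2 fails" case produces a mismatch: the host still sees a link although the router cannot be reached.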

Probe based

So how can you circumvent this problem? The solution is somewhat obvious: don’t check only the link on the physical layer, check on the layer that really matters. In the case of networking: don’t check whether there is a physical link, check whether you can reach other systems via IP. Probe-based failure detection does exactly this. As I wrote before, it uses ICMP messages to check for a functional IP network. So it can check whether you really have an IP connection to your default router and not just a link to a switch somewhere between the server and the router. But this method has a disadvantage as well: you need considerably more IP addresses. Every interface in the IPMP group needs a test address. The test address is used to test the connection and stays on the interface even in the case of a failure (obviously you still need the test mechanism to detect that the link was repaired by the admin). Given n interfaces, you need n test addresses; an IPMP group with four connections needs four test addresses. However, you can ease the consumption of IP addresses by using a private network for the test addresses that is different from the network containing the data addresses. But I will get to this at the end of this chapter.
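To make the address bookkeeping concrete, here is a minimal sketch that hands out one test address per interface from a separate private network. The interface names, the network 192.168.200.0/24 and the helper `plan_test_addresses` are assumptions chosen for illustration; the sketch only does the arithmetic and configures nothing.

```python
import ipaddress


def plan_test_addresses(interfaces, test_network):
    """Assign one test address per interface from a dedicated test network."""
    hosts = iter(ipaddress.ip_network(test_network).hosts())
    plan = dict(zip(interfaces, hosts))
    if len(plan) != len(interfaces):
        raise ValueError("test network is too small for this IPMP group")
    return plan


# Hypothetical IPMP group with four interfaces and a hypothetical private network.
group = ["e1000g0", "e1000g1", "e1000g2", "e1000g3"]
plan = plan_test_addresses(group, "192.168.200.0/24")

for ifname, addr in plan.items():
    print(f"{ifname}: test address {addr}")
print(f"{len(group)} interfaces need {len(plan)} test addresses")
```

Because the test addresses come from their own private range, none of the addresses in the data network are used up by probing.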

Failure/Repair detection time

Another interesting question is the speed of failure and repair detection. It’s different for the two mechanisms. For link-based failure detection it’s easy: as soon as the IPMP subsystem becomes aware that an interface has lost the RUNNING flag, the interface is considered down. That’s nearly instantaneous. Even probe-based IPMP uses this mechanism to speed up the failover: link-based failure detection stays active even when you use probe-based failure detection. But what about the reaction time of probe-based failure detection? As I told you before, the mechanism is based on ICMP messages. There are two simple rules:

In the default configuration, probing takes place roughly every 2 seconds. You can observe this by snooping the interface once you have put it into an IPMP group.

jmoekamp@hivemind:~# snoop -d e1000g0 -t a -r  icmp
Using device e1000g0 (promiscuous mode)
18:56:10.82015 -> ICMP Echo request (ID: 11017 Sequence number: 27065)
18:56:10.82045 -> ICMP Echo reply (ID: 11017 Sequence number: 27065)
18:56:12.64018 -> ICMP Echo request (ID: 11017 Sequence number: 27066)
18:56:12.64053 -> ICMP Echo reply (ID: 11017 Sequence number: 27066)
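
The probe interval can be read off the two echo requests in the capture above. A small sketch that just subtracts the two timestamps (copied verbatim from the snoop output):

```python
from datetime import datetime

# Timestamps of the two ICMP echo requests, copied from the snoop output above.
first_request = "18:56:10.82015"
second_request = "18:56:12.64018"

fmt = "%H:%M:%S.%f"
interval = datetime.strptime(second_request, fmt) - datetime.strptime(first_request, fmt)
interval_seconds = interval.total_seconds()

print(f"Time between probes: {interval_seconds:.2f} s")
```

For this capture the result is about 1.82 seconds, which matches the "roughly every 2 seconds" stated before.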

Given the 2 seconds between the probes, a failure is detected in 10 seconds by default, and a repair is detected in 20 seconds. However, you can change these values in case you need faster failure detection. I will explain that later in this tutorial.
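
The arithmetic behind these defaults can be sketched as a small state machine. The thresholds of 5 and 10 consecutive probes are an assumption, derived here simply by dividing the detection times by the 2-second probe interval; the function `track_state` is a made-up name, and the real daemon may behave differently in detail.

```python
PROBE_INTERVAL = 2           # seconds between probes (default, from the text)
FAILURE_DETECTION_TIME = 10  # seconds (default, from the text)
REPAIR_DETECTION_TIME = 20   # seconds (default, from the text)

# Assumption: thresholds derived by dividing detection time by probe interval.
FAILED_PROBES_NEEDED = FAILURE_DETECTION_TIME // PROBE_INTERVAL  # 5
GOOD_PROBES_NEEDED = REPAIR_DETECTION_TIME // PROBE_INTERVAL     # 10


def track_state(probe_results):
    """Yield the interface state ('ok' or 'failed') after each probe result."""
    state, streak = "ok", 0
    for replied in probe_results:
        # Count only consecutive probes that contradict the current state.
        contradicts = (state == "ok" and not replied) or (state == "failed" and replied)
        streak = streak + 1 if contradicts else 0
        needed = FAILED_PROBES_NEEDED if state == "ok" else GOOD_PROBES_NEEDED
        if streak >= needed:
            state = "failed" if state == "ok" else "ok"
            streak = 0
        yield state


# Five lost probes in a row, followed by ten answered ones.
results = [False] * 5 + [True] * 10
states = list(track_state(results))

print(states[3], states[4])    # state after the 4th and 5th lost probe
print(states[13], states[14])  # state after the 9th and 10th answered probe
```

In this model the interface flips to failed with the fifth lost probe (about 10 seconds) and back to ok with the tenth answered probe (about 20 seconds), which is why a repair takes twice as long to be noticed as a failure.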