I’ve had a few readers send in questions about iSCSI network design in an ESXi environment. The main point of confusion is when to use multiple subnets for iSCSI, as opposed to VMkernel binding, based on various vendor documents and best practice guides.
Let’s start with what VMware has to say. There are a lot of VMware KBs that outline configuration considerations for iSCSI. The primary one to read is entitled Considerations for using software iSCSI port binding in ESX/ESXi, which has several good nuggets of information contained within. Other good ones are:
- Multi-homing on ESXi/ESX
- Cannot reach iSCSI target after enabling iSCSI port binding on ESX/ESXi 4.x and ESXi 5.x
- Rescanning takes a long time when using multiple VMkernel ports with port binding to access two or more storage arrays on different broadcast domains (super long name!)
The major takeaway is that if you wish to use VMkernel binding (which you should), use a single subnet (broadcast domain) for iSCSI traffic without any routing between the initiator(s) and target(s). Binding requires that all initiators can log into all targets; if there are multiple subnets being used, initiators would have to cross subnets in order to log into some of the targets. And again – routing with binding is not supported, nor does it make a lot of sense. In an MPIO scenario, IO would alternate between switched and routed interfaces, making latency and troubleshooting funky.
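As a rough sketch, binding two VMkernel ports to the software iSCSI adapter looks like the following. The adapter and vmk names here are assumptions for illustration; check yours with `esxcli iscsi adapter list` and `esxcli network ip interface list`.

```shell
# Assumed names: software iSCSI adapter vmhba33, iSCSI vmknics vmk1/vmk2.
# Both vmknics sit on the same subnet as the targets, with no routing in between.
esxcli iscsi networkportal add --adapter=vmhba33 --nic=vmk1
esxcli iscsi networkportal add --adapter=vmhba33 --nic=vmk2

# Confirm the bindings took effect:
esxcli iscsi networkportal list --adapter=vmhba33
```

Remember that each bound vmknic must be backed by a port group with exactly one active uplink (no standby), or the bind will be rejected.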
I’ve drawn a diagram that I think is a bit more helpful for describing the pitfalls of a multiple subnet iSCSI design when VMkernel binding is enabled. The environment below has two networks – blue and green. The use of VMkernel binding means that every initiator on the host is going to attempt to log in to all targets that have been added manually or retrieved using the send targets request (read my other blog post for more on caveats with multiple subnets and dynamic discovery). Assuming that the networks are unable to route between one another, the blue initiator would be unable to reach the green target, and vice versa.
Single Subnet Design
To fix the above scenario, I’d suggest a single subnet topology (assuming your array supports it). It will result in 4 paths to the LUN. In an active / passive controller configuration, this means 2 active paths and 2 standby paths. This works great for a host with 2 network adapters – we can send IO down both of them no matter which controller is active.
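The path math here is simple: with binding, every bound vmknic logs into every reachable target portal. A quick sketch, assuming the 2-adapter host and 2-controller array described above:

```shell
# With port binding on a single subnet, path count = initiators x targets.
initiators=2   # two bound vmknics on the host
targets=2      # one target portal per array controller
echo "$((initiators * targets)) paths to the LUN"
```

With an active/passive array, the NMP marks the two paths to the owning controller active and the other two standby, so both adapters can still carry IO.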
Multiple Subnet Design
If you poke around various vendor documentation, and even older VMware docs, you’ll see references to multiple subnet design for iSCSI. There’s two reasons for this:
- The VMkernel TCP/IP stack will only use one vmk per subnet, based on the routing table. Using multiple subnets is a way to deterministically force traffic out of specific vmks.
- Limiting the number of paths to a LUN when a large number of initiators and/or targets are present, to avoid excessive logins and link overcommitment.
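You can see the first point for yourself by inspecting the host’s VMkernel routing table. With two iSCSI subnets, each subnet’s connected route pins traffic to exactly one vmk (the addresses and vmk names below are illustrative, not from a real host):

```shell
# On the ESXi host: list the VMkernel IPv4 routing table.
esxcli network ip route ipv4 list

# Example output (illustrative):
# Network      Netmask        Gateway  Interface
# 10.0.10.0    255.255.255.0  0.0.0.0  vmk1
# 10.0.20.0    255.255.255.0  0.0.0.0  vmk2
```

Without binding, traffic destined for a 10.0.10.x target will always leave vmk1, and 10.0.20.x traffic will always leave vmk2 – which is exactly the deterministic behavior a multiple subnet design relies on.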
EMC’s VNX TechBook has a section on iSCSI (published Jan 2015) and recommends against VMkernel Binding, instead preferring multiple subnets in a somewhat more complex design that is further debated in a few different community threads. The doc describes binding (explicit assignment) as being technically possible but not supported (page 44-45) due to the multi-target nature of the array.
Further, another caveat with a single iSCSI subnet is that in some scenarios you’d have 2-4 initiators talking to 4-8 targets, which could result in 32 paths (4 initiators * 8 targets) to a LUN. That’s far too many paths: IO would live on an initiator for longer than desired, and the added paths provide no performance gain. There’s also a lot of login chatter required to keep 32 paths alive, which is wasteful.
Using multiple subnets allows for sufficient front-end port usage on the array in addition to the network adapters on the host. Here’s an example that does not use VMkernel binding; instead, multiple subnets are used, which would result in 4 total paths to the LUN instead of 8. Note that I’m assuming the host has 2 network adapters, which is why you’d want to keep the paths down (1 active path per adapter). The major difference between the first diagram and the one below is that we’re now presenting both the blue and green networks to both array controllers, instead of blue only on the left and green only on the right.
As always, read your specific vendor’s documentation because they have the final say in what they’ll support you using when you dial for help. 🙂