First, let’s get the jokes out of the way: yes, it’s a technical blog post. No, I haven’t forgotten I had a blog. I know, I know.
One of the projects that I’ve been working on at SolidFire for the last few months is putting together the VMware and Cisco UCS reference architecture, and helping the team put together a UCS best practices document.
Part of that design is to take advantage of the stateless nature of the UCS blade platform by booting the ESXi hosts from the SolidFire array. UCS has supported this feature for a while, but the process certainly has its share of quirks. I’m not going to go through and show the entire process for setting up iSCSI boot-from-SAN, although if there’s any interest in that part I can probably put together a Camtasia video walkthrough.
What I did want to show was an “undocumented feature” we ran into when trying to use CHAP authentication to the SolidFire array as part of the booting process.
You can see that it’s associated with a SolidFire account called “BOOT-AI”, and if we look at that account, you can see that it has a random target and initiator password that has been generated for it by the system.
Maybe, at this point, you are thinking “Gee, it doesn’t make ANY SENSE AT ALL that there would be a password complexity setting on the profile, since it’s just passing that back to whatever storage array is being used.” That would be a logical thought, but you’d be wrong. Luckily, there’s a pretty good help function that gives you the information you are looking for.
Here’s the problem: this is a total lie. It’s plain not true. Watch what happens when we use the standard, random password that was generated with the account.
First, we set the iSCSI Boot Parameters in the Boot Order screen in UCSM for the appropriate service profile, using the Volume IQN from the boot volume and the authentication profile we created earlier:
When you get to this point, you try to do some basic troubleshooting from the FI, and when you try to pull the iSCSI configuration for this blade, you get no data returned, indicating that the iSCSI stack never initialized at all:
In fact, if we look at the VIF Paths on the blade, and then even check the ARP table on the FIs, we see that the network stack never tried to initialize. If you were to boot from a live CD at this point, you’d see that there are no network adapters connected to the blade at all. All of this, because you followed directions and put a password with 12 to 16 characters into a field that returned no errors.
Yes, this took me a little more than two days to troubleshoot. I wonder if Cisco has an Internet of Things script that can get that back for me…
So, the fix is pretty simple. We go into the SolidFire array and change the account password to one that uses only alphanumeric characters plus the - (hyphen), _ (underscore), : (colon), and . (period).
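If you want to catch this before you lose two days to it, a quick check along these lines might help. This is only a rough sketch in Python, not anything from Cisco or SolidFire: the function names are made up for illustration, the allowed character set is simply the one from the fix above, and the 12 to 16 character length is the range mentioned earlier in this post. I’m also assuming "alphanumeric" means plain ASCII letters and digits.

```python
import re
import secrets
import string

# Assumption: the safe character set is exactly the one described in this post:
# ASCII letters and digits plus hyphen, underscore, colon, and period.
SAFE_CHARS = string.ascii_letters + string.digits + "-_:."
SAFE_PATTERN = re.compile(r"[A-Za-z0-9\-_:.]+")


def is_boot_safe(secret, min_len=12, max_len=16):
    """Return True if the CHAP secret uses only the safe characters
    and falls inside the assumed 12 to 16 character length range."""
    if not (min_len <= len(secret) <= max_len):
        return False
    return SAFE_PATTERN.fullmatch(secret) is not None


def offending_chars(secret):
    """Return the sorted list of characters that fall outside the safe set."""
    return sorted({c for c in secret if c not in SAFE_CHARS})


def generate_boot_safe(length=16):
    """Generate a random secret drawn only from the safe character set."""
    return "".join(secrets.choice(SAFE_CHARS) for _ in range(length))


if __name__ == "__main__":
    candidate = "p@ssw0rd!example"  # hypothetical secret, not a real one
    print(is_boot_safe(candidate), offending_chars(candidate))
    print(generate_boot_safe())
```

Running a system-generated secret through `offending_chars` before pasting it into the authentication profile shows exactly which characters to get rid of, and `generate_boot_safe` gives you a replacement to set on the SolidFire account.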
Now, the host boots just fine. Hopefully this helps someone else out there having issues.