When the vSphere 5.5 bits dropped this weekend, I pulled everything down and started to upgrade my lab, which was on vSphere 5.1U1. I did the upgrade as outlined in the 5.5 documentation, forgoing the simple install because I had an existing SSO environment that I wanted to update.
Yes, I did a full backup of all databases and made a clone of the vCenter server itself. This isn’t my first go-round with a new version release of vSphere.
The install went well, and it appeared that all of the services came back up. The web client came up fine and I logged in with the built-in .local SSO account. I could see the AD integration set up in the SSO configuration, and I could browse the domain for users, computers and groups with no issue. I verified that the vCenter permissions were still in place, using a domain group I’d created and added all admin accounts to. No problems at all.
Then, I logged out of the .local account and tried to log in as a domain account. That’s when things got fun.
First thing that pops up, I get this error:
“Cannot parse group information”
Ok, I’ve seen this one before. Last time I got this error it was because of an “invalid” character in the AD group or user name. Here’s the KB that covers it:
But, that’s not my issue. What next? I tried all of the following:
- Remove and re-add the domain identity source to the SSO config. No good.
- Remove and re-add the domain identity source as an LDAP Server. That works, but then I get the error “Client is not authenticated to VMware inventory service”.
- I try uninstalling and reinstalling the Inventory Service and the Web Client. No good.
- I try manually re-registering the Inventory Service, the Web Client and SSO to each other using this KB: Re-pointing and re-registering VMware vCenter Server 5.x and components. No good.
- I can temporarily resolve the issue by changing the service account used for the SSO service from the domain account to “Local System”, but that’s a terrible workaround…
- Completely uninstall vCenter and all of the components and do a fresh install. Maybe it’s just the upgrade that was the issue… No good. After a fresh install I get the same error.
- Use the vCenter Appliance. No good.
So it’s now been the better part of two days, and I’m giving up hope. I send up a flare on Twitter, and Philip Leighton jumps in to help since we are both seeing the same issue. We can’t find anything wrong, or anything that really fixes the issue, so I give up and get on a plane to a partner event in Ottawa.
Finally, something shows up on VMTN today: a thread where someone else is having the same issue. The people in the thread go through the same troubleshooting steps I did, and collectively we learn some things:
- The common denominator seems to be that the AD source being used is of the Win2012 variety…
- SSO and local vCenter server accounts always seem to work with no issues.
- If you move SSO to a server that isn’t part of the Win2012 domain, authentication magically works again!
- If you use the Client Integration plugin on a Windows client that is logged into the domain, it also still works!
- No other workaround to using a domain account to log in seems to work consistently.
It starts to get better when someone who appears to be a VMware employee (Fill out your VMTN profile, SrinuA!) gets the original poster to upload his sso-support bundle. At 5pm yesterday, he comes back and confirms it’s a bug, with a fix expected in EP1. He also offers to let the original poster test a .dll fix, although I didn’t see any results of that.
After a few more people chime in that they are having the same issue (the thread is over 500 views at this point, less than two days after it was created), SrinuA comes back at noon today with the following update:
To clarify this issue exists the SSO/vCenter systems which are deployed on win2k12 machine and are joined to a win2k12 domain, and an identity source is setup to use Active Directory with windows authentication and you are using a domain user from the win2k12 domain to login. We are preparing a patch dll which contains the fix and will put up a kb article with the patch dll attached. We will put a kb article which will contain the patch dll with the instructions on how to apply this patch within 12-24 hours. Sorry for the delayed response and thanks for being patient.
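If you want to check whether your own environment matches that description, here is a rough Python sketch of the four conditions as the quoted update lists them. The class and field names are made up for illustration and do not come from vSphere or any SSO tooling. It only restates the update as a checklist and does not detect anything on a live system.

```python
from dataclasses import dataclass


@dataclass
class Environment:
    # Hypothetical field names; fill these in by hand for your own setup.
    sso_on_win2012: bool                      # SSO/vCenter deployed on a Win2012 machine
    joined_to_win2012_domain: bool            # that machine is joined to a Win2012 domain
    identity_source_is_ad_windows_auth: bool  # identity source is AD with Windows authentication
    login_user_from_win2012_domain: bool      # logging in as a user from that domain


def matches_reported_bug(env: Environment) -> bool:
    """Return True only when every condition from the quoted update holds."""
    return all([
        env.sso_on_win2012,
        env.joined_to_win2012_domain,
        env.identity_source_is_ad_windows_auth,
        env.login_user_from_win2012_domain,
    ])


# A setup where all four conditions hold.
lab = Environment(True, True, True, True)

# The same setup, but with the identity source added as an LDAP server instead.
lab_with_ldap_source = Environment(True, True, False, True)
```

With these values, `matches_reported_bug(lab)` is true and `matches_reported_bug(lab_with_ldap_source)` is false. Treat a match as a hint to look for the patch, not as a diagnosis.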
So that’s where we are right now. As soon as the patch is put out, I’ll add the link to it and the KB here. My lesson learned here is that I should have put my findings up on VMTN immediately. I was worried that, since this was a lab environment, my issues were of my own making rather than something systemic with the application. Maybe we could have gotten this patch sooner if I’d started the conversation Monday morning when I found the issue. I won’t make that mistake again.