Service Providin’ Ain’t Easy

My team has been invited into the Cisco labs in RTP to build out a Type 1 vBlock and see what sort of value it provides for us as a multi-tenant cloud provider.  The evaluation process should be interesting, and I'll detail what we do here.  If you have any questions, please let me know and I'll pass them on to the team and try to get you an answer.

What a day.  Almost 8 straight hours of whiteboarding, and we still have a list of things to get through before we actually start building out operational processes.  Working with us today were Scott Lowe and Scott Baker from the vSpecialist team, along with Andy Sholomon and Connie Varner from the Cisco Advanced Services team.  Joining me from my team were Ronnie Frames, our Director of Network Services, and Brett Impens, our Virtualization Product Manager.

Discussion, theory, and whiteboarding dominated the day, as they should in an engagement like this.  The general consensus from the group was that the complexity introduced by the multi-tenant and high-growth aspects of our environment was going to be our biggest hurdle.  After detailing the three different physical networks and hundreds of VLANs that currently connect into each VM host, we spent a good amount of time discussing how that will look once we collapse everything down to one pair of vNICs and a single Nexus 1000v dvSwitch.  At one point, one of the Cisco team members mentioned that they had large banking customers who felt things were too complex with 15 VLANs, and we had to laugh.
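
To give a sense of what that collapse looks like on the 1000v side, here's a minimal sketch (the tenant names and VLAN IDs are made up, not ours) of generating vEthernet port-profiles per tenant network with a bit of Python, rather than hand-building hundreds of port groups:

    # A minimal sketch (hypothetical tenant names and VLAN IDs) of how hundreds
    # of tenant VLANs might be rendered as Nexus 1000v port-profiles once
    # everything is collapsed onto a single dvSwitch: template the stanza once,
    # generate it per network.

    TENANT_VLANS = {
        "tenant-a-web": 210,
        "tenant-a-db": 211,
        "tenant-b-web": 340,
    }

    def port_profile(name, vlan):
        """Render one vEthernet port-profile stanza for the Nexus 1000v."""
        return "\n".join([
            f"port-profile type vethernet {name}",
            "  switchport mode access",
            f"  switchport access vlan {vlan}",
            "  vmware port-group",
            "  no shutdown",
            "  state enabled",
        ])

    if __name__ == "__main__":
        print("\n!\n".join(port_profile(n, v) for n, v in sorted(TENANT_VLANS.items())))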

In the end we determined that policy-based routing was going to be key, and that we needed to push Layer 3 routing into the Nexus 7000 switch layer.  Of course, that domino hit others, and we ended up redoing the firewall stack upstream as well as the connections to the services-based networks downstream.  Nothing happens in a vacuum.
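
To make the PBR piece a little more concrete, here's a rough sketch of the shape of the per-tenant policy we whiteboarded for the Nexus 7000s: an ACL to match a tenant's prefix, a route-map to set its next hop, and the VLAN interface the policy hangs off of.  The prefixes, SVIs, and next hops below are placeholders, not our actual design:

    # A rough sketch (hypothetical prefixes, SVIs, and next hops) of the NX-OS
    # policy-based routing pieces discussed for the Nexus 7000s: an ACL that
    # matches a tenant's prefix, a route-map that sets its next hop, and the
    # VLAN interface the policy is applied to.

    TENANTS = [
        # (name, tenant prefix, VLAN interface, next hop toward that tenant's firewall)
        ("tenant-a", "10.10.0.0/16", "Vlan210", "192.0.2.10"),
        ("tenant-b", "10.20.0.0/16", "Vlan340", "192.0.2.20"),
    ]

    def pbr_stanza(name, prefix, svi, next_hop):
        acl, rmap = f"acl-{name}", f"pbr-{name}"
        return "\n".join([
            f"ip access-list {acl}",
            f"  permit ip {prefix} any",
            f"route-map {rmap} permit 10",
            f"  match ip address {acl}",
            f"  set ip next-hop {next_hop}",
            f"interface {svi}",
            f"  ip policy route-map {rmap}",
        ])

    if __name__ == "__main__":
        print("feature pbr")
        print("\n!\n".join(pbr_stanza(*t) for t in TENANTS))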

While the discussions continued on the network side, Scott Lowe and Scott Baker dug in on the UCS blade provisioning process.  By the end of it, we had built a process that combined some extremely slick scripting, a couple of tricks with the UCS service policies, some custom configuration on the SAN, and a few PXE workarounds (not easy in a network where ANY untagged packet is immediately discarded…), which let us go from putting a blade into the chassis to having a full ESX install booting from SAN in about 12 minutes.  Looking back, it was pretty unbelievable, and it will really put us back in the driver's seat when it comes to provisioning customers quickly and programmatically.  We always try to keep the risk of error to a minimum, and this platform really aids that effort.
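
As a small illustration of the programmatic angle (the host, credentials, and blade DN here are placeholders), this is the kind of call a provisioning script can make against the UCS Manager XML API with plain Python and requests: log in, pull a blade's inventory record so automation can react when new hardware lands in a chassis, then log out.  The real workflow obviously goes a lot further than this.

    # A small illustration (placeholder host, credentials, and DN) of driving the
    # UCS Manager XML API from Python: authenticate, fetch one blade's inventory
    # record, and log out. The real provisioning flow goes much further than this.

    import requests
    import xml.etree.ElementTree as ET

    UCSM = "https://ucsm.example.com/nuova"  # UCS Manager XML API endpoint

    def api(body):
        """POST one XML API request and return the parsed response element."""
        resp = requests.post(UCSM, data=body, verify=False, timeout=30)
        resp.raise_for_status()
        return ET.fromstring(resp.content)

    # aaaLogin returns a session cookie that every later request must carry
    login = api('<aaaLogin inName="admin" inPassword="password" />')
    cookie = login.attrib["outCookie"]

    # configResolveDn fetches a single managed object by its distinguished name
    blade = api(f'<configResolveDn cookie="{cookie}" dn="sys/chassis-1/blade-1" '
                'inHierarchical="false" />')
    for mo in blade.iter("computeBlade"):
        print(mo.attrib.get("model"), mo.attrib.get("operState"))

    api(f'<aaaLogout inCookie="{cookie}" />')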

Tomorrow, we really start putting rubber to the road.  With the Nexus 7000 switches built and the base of the scripting in place, we only need to work through storage provisioning and best practices before we can start building clusters.  We also have some executive meetings with the Cisco team, and both my boss and his boss will be on-site.  It should be a good day; let's see what we can get done.