Cause and Effect: A Cisco Story

I’m in a strange mood.  I’m struggling to find the motivation I need for the final stretch before VMworld, and a combination of a brutal travel schedule, way too much stuff on my plate and just a general sense of being personally unsettled has me feeling out of sorts.

Personally and professionally, the news out of Cisco that they are going to lay off 4,000 more employees in an effort to jumpstart growth hits close to home.  In addition to some of those people inevitably being good friends and colleagues of mine, a healthy and innovative Cisco is good for the industry, and watching them struggle is hard.  I also have a fond place in my heart for them for a completely personal reason, because while I’ve never been a Cisco employee, in a very real sense Cisco is responsible for where I’ve ended up.

Let me tell you that story.

When I sit with the executive teams from our customers and talk about transforming the alignment of IT to better match the business, I’m not reading from a script; I’m sharing my personal scars.  Prior to working at VCE, I spent 6 years with a regional service provider in a number of capacities.  The last of my jobs there was as the Director of Managed Services, and we struggled mightily to build products and processes, in part because of the rigid and sometimes random way that teams were siloed.

In particular, our network team was isolated to the point of frustration.  In the past, when the company had only sold data center space, it made sense to have the network team separate since, in a real way, they were the only “technology” people in the company.  The problem was, as the company evolved from offering just colo space to providing shared, managed and finally “cloud” services, that isolation which was once understandable became a huge liability.

The scope of the liability was pretty significant.  First, the actual management of the network team was located in a different city from the rest of the company, and that lack of communication was challenging.  Second, the core of the network was designed, first and foremost, to serve an ISP.  It was a textbook Cisco design, from the core, distribution and access layers to the STP and VLAN layouts, and it worked as you’d expect.  We ultimately had a half dozen different BGP peering arrangements with different telcos, and that part of our offering was as solid as a rock.  We provided fantastic bandwidth to our customers.

The problem came when our offerings expanded to include more than just public internet access.  We started offering shared backup services, monitoring and management services, shared storage services and finally virtualized/cloud data center services, and each of these came with its own demands on the network design.  VLANs, which were easily managed when each customer had a grand total of one, became much more of an issue when a customer might have five or more, and when the number of customers increased 10-fold as the company grew.  STP planning was relatively easy with a dozen devices, but it was far, far more evil when there were close to a hundred per data center.  A rough back-of-envelope sketch of that growth is below.
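
To put some purely illustrative numbers on that scaling problem, here’s a minimal sketch.  The customer and VLAN counts are hypothetical, not figures from the actual environment, but the 802.1Q limit of 4,094 usable VLAN IDs is real:

```python
# Back-of-envelope VLAN growth in a multi-tenant environment.
# All customer/VLAN counts below are hypothetical examples, not real figures.

VLAN_ID_SPACE = 4094  # usable 802.1Q VLAN IDs (0 and 4095 are reserved)

def vlans_needed(customers: int, vlans_per_customer: int) -> int:
    """Total VLANs to plan, document, and prune through STP."""
    return customers * vlans_per_customer

early = vlans_needed(customers=20, vlans_per_customer=1)    # colo-only days
later = vlans_needed(customers=200, vlans_per_customer=5)   # 10x customers, 5+ VLANs each

print(f"early: {early} VLANs, later: {later} VLANs "
      f"({later / VLAN_ID_SPACE:.0%} of the 802.1Q ID space, "
      f"before any per-service duplication)")
```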

It would have been possible, at an early stage, to separate the design of the services networks from the public-facing networks, but because there was little to no communication between the teams, the network group had no concept of what was needed.  Instead, they did what they knew, and built a completely separate network stack for each service, complete with its own core, distribution and access layers, overlapping VLAN assignments, site-to-site links and complete logical and physical isolation.

Of course, those of you who understand multi-tenant networking are going to see where this is going.  Soon, we had customers with gear in multiple data centers, utilizing multiple services, causing all sorts of headaches.  I’m sure you can use your imagination.
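
If it helps to make the headache concrete, here’s a hedged sketch of the kind of conflict this creates.  The service names and VLAN IDs are made up, but the pattern — per-service stacks that each allocated VLAN IDs independently, then had to be stitched together for a customer using more than one service — is the real problem:

```python
# Hypothetical per-service VLAN allocations; each stack was built in isolation,
# so the same VLAN IDs were reused freely across stacks.
vlan_assignments = {
    "backup":  {100: "Customer A", 200: "Customer B"},
    "storage": {100: "Customer C", 300: "Customer A"},
    "cloud":   {200: "Customer A", 300: "Customer D"},
}

def find_collisions(stacks):
    """VLAN IDs that mean different customers on different stacks --
    exactly what bites you when the stacks need to be interconnected."""
    seen = {}
    collisions = []
    for service, vlans in stacks.items():
        for vlan_id, customer in vlans.items():
            if vlan_id in seen and seen[vlan_id][1] != customer:
                collisions.append((vlan_id, seen[vlan_id], (service, customer)))
            seen.setdefault(vlan_id, (service, customer))
    return collisions

for vlan_id, first, second in find_collisions(vlan_assignments):
    print(f"VLAN {vlan_id}: {first} vs {second}")
```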

As a backdrop to this, while all of the gear we used was made by Cisco, almost none of it had been bought directly, and we struggled with reliability and support for the longest time.  The tipping point came when a Catalyst switch, originally manufactured for Eastern Europe, that was being used to provide the access layer for the VMware management network in one of our data centers failed, and failed wide open.  I blogged about the issue and what the downstream effects were here.  It was at this point I started looking for alternatives.  I couldn’t bring the mountain to Mohammad, so I was determined to find another mountain.

I’d actually started this process a couple months earlier.  In November of 2009, VMware, Cisco and EMC had announced the Acadia/VCE joint venture, and in February of 2010 I’d begun what had to have been one of the first full-scale POCs with what would become a Vblock 2.  Led by Andy Sholomon (@asholomon) and the rest of the Cisco Advanced Services team, we spent a full week redesigning everything about our environment, especially the network layout.  (PS: the background of Andy’s Twitter page is the Vblock we used for the testing, and to my knowledge it’s the first VMAX-based Vblock in existence.)  In fact, at one point Andy looked at our Director of Network Engineering and told him that we had the worst network design ever, and it was all I could do not to jump up and down clapping.  Of course, the failure we ended up having in May proved his point pretty well.

This project was also the first real exposure I’d had to the still relatively small EMC vSpecialist team, and the list of people involved read like a who’s who.  Chad Sakac sponsored the POC from the EMC side, and brought along Scott Lowe, Scott Baker, Jonathan Donaldson and maybe even a few more.  Getting to work with that technology and learn from that group of people was pretty eye-opening.  The world I’d known all of a sudden seemed very, very small indeed.

From there, I knew what I wanted to do professionally.  Less than 7 months later I was an Acadia employee, and I’ve been there (since renamed to VCE, of course) ever since.

So, in a real way, it was Cisco who validated my feelings on traditional network design in a multi-tenant service environment, and it was Cisco who showed me the power that converged infrastructure had to make the platform simple.  Cisco was also a pretty optimistic place at the time; the UCS platform was new and exciting, the Nexus line of switches was making an impact… Success wasn’t guaranteed, but there was a general feeling of optimism that made the hard work seem worth it.  I don’t see that optimism today, after the earnings news and layoffs.  Even the seemingly manufactured outrage at the VMware acquisition of Nicira doesn’t seem to be able to sustain the energy level.  I’m not sure what will.  Hopefully they can find a way to get it back.

But the larger story is how the organization of the company had a direct and negative impact on our ability to deliver the services that our customers wanted to pay us for.  Even in a small organization, any divergence between the desired business outcome and the processes that enable it can be disastrous.  I know this from personal experience, and I try to use it to implore customers to make the changes they need to make.

So, that’s the story.  And I’ve successfully procrastinated through another couple hours, avoiding having to work on another PowerPoint presentation.