AOL Power Hog Award

I have received a ton of emails since my post about our Uptime Institute Server Roundup Award asking about our “Power Hog” Award. In case you missed it, part of our internal analysis involved identifying inefficient servers and systems, and we motivated the owners of those systems to migrate their installations to the cloud infrastructure that we built out. You definitely knew you were in trouble when a Power Hog Award arrived on your desk. I guess we were not above using shame as a tactic. So for those of you who were interested in seeing our illustrious(?) award, I thought I would share a photo of one.

[Photo: the Power Hog Award pig trophy]

\Mm

Attacking the Cruft

Today the Uptime Institute announced that AOL won the Server Roundup Award. The achievement has gotten some press already (at Computerworld, PCWorld, and related sites) and I cannot begin to tell you how proud I am of my teams. One of the more personal transitions and journeys I have made since my experience scaling the Microsoft environments from tens of thousands of servers to hundreds of thousands of servers has been truly understanding the complexity of a problem most larger, established IT departments have been dealing with for years. In some respects, scaling infrastructure, while incredibly challenging and hard, is in large part a uni-directional problem space. You are faced with growth and more growth followed by even more growth. All sorts of interesting things break when you get to big scale. Processes, methodologies, and technologies all quickly fall by the wayside as you climb ever up the ladder of scale.

At AOL I faced a multi-directional problem space in that, as a company and as a technology platform, we were still growing. Added to that there were 27 years of what I call “Cruft”. I define “Cruft” as years of build-up of technology, processes, politics, fiscal orphaning and poor operational hygiene. This cruft can act as a huge boat anchor and a barrier to an organization trying to drive agility in its online and IT operations. On top of this Cruft, a layer of what can best be described as lethargy or perhaps apathy can sometimes develop and add even more difficulty to the problem space.

One of the first things I encountered at AOL was the cruft. In any organization, everyone always wants to work on the new, cool, interesting things. Mainly because they are new and interesting, out of the norm. Essentially the fun stuff! But the ability of the organization to really drive the adoption of new technologies and methods was always slowed, gated or in some cases altogether prevented by years of interconnected systems, lost owners, servers of unknown purpose lost in the distant historical memory and the like. This I found in healthy populations at AOL.

We initially set about building a plan to attack this cruft: to earnestly remove as much of it as possible and drive the organization towards agility. Initially we called this list of properties, servers, equipment and the like the Operations $/-\!+ list. As this name was not very user-friendly, it migrated into a series of initiatives grouped under the name of Ops-Surdities. These programs attacked different types of cruft and were, at a high level, grouped into three main categories:

The Absurdity List – A list of projects/properties/applications that had questionable value, no owner, no direction, or the like but were still drawing load and resources from our data centers. The plan here was to develop action plans for each of the items that appeared on this list.

Power Hog – An effort to audit our data center facilities, equipment, and the like, looking for inefficient servers, installations, and/or technology and migrating them to new, more efficient platforms or our AOL Cloud infrastructure. You knew you were in trouble, and that you were marked, when a trophy of a bronze pig appeared on your desk or in your office.

Ops Hygiene – The sometimes tedious task of tracking down older machines and systems that may have been decommissioned in the past, marked for removal, or fully depreciated but were never truly removed. Pure Vampiric load. You may or may not be surprised how much of this exists in modern data centers. It’s a common issue I have had with most data center management professionals in the industry.
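For readers who want to picture what an Ops Hygiene sweep might look like against an asset inventory, here is a minimal Python sketch. To be clear, this is not the tooling we used at AOL. The `Asset` record, its field names, and the sample inventory are all hypothetical, and a real CMDB export will look different. It simply flags anything that is still drawing power even though it was decommissioned, marked for removal, or fully depreciated.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    # Hypothetical fields for illustration; a real CMDB export will differ.
    name: str
    status: str              # e.g. "active", "decommissioned", "marked_for_removal"
    fully_depreciated: bool
    drawing_power: bool      # still racked and powered on


def hygiene_candidates(assets):
    """Return assets that should be gone on paper but still draw power."""
    stale_statuses = {"decommissioned", "marked_for_removal"}
    return [
        a for a in assets
        if a.drawing_power
        and (a.status in stale_statuses or a.fully_depreciated)
    ]


# Made-up sample inventory, not real data.
inventory = [
    Asset("web-01", "active", False, True),
    Asset("db-legacy", "decommissioned", True, True),
    Asset("mail-07", "marked_for_removal", False, True),
    Asset("old-cache", "decommissioned", True, False),
    Asset("batch-03", "active", True, True),
]

for asset in hygiene_candidates(inventory):
    print(asset.name)
```

A list like this is only a starting point. Each flagged machine still needs a person to track down an owner and confirm it can really go.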

So here we are, in a timeline measured in under a year, and having been told all along the way by “crufty old-timers” that we would never make any progress, my teams have decommissioned almost 10,000 servers from our environments. (Actually this number is greater now, but the submission deadline for the award was earlier in the year.) What an amazing accomplishment. What an amazing team!

So how did we do it?

As we will be presenting this in a lot more detail at the Uptime Symposium, I am not going to give away all of our secrets in a blog post. That should give you a good reason to head to the Uptime event, listen to the primary leaders of this effort, and ask them in person how they did it. It may be a good use of that travel budget your company has been sitting on this year.

What I will share are some guidelines on approach and some things to be wary of if you are facing similar challenges in your organization.

FOCUS AND ATTENTION

I cannot tell you how many people I have spoken with who have tried to go after ‘cruft’ like this time and time again and failed. One of the key drivers for success in my mind is ensuring that there is focus and attention on this kind of project at all levels, across all organizations, and most importantly from the TOP. Too often executives give out blind directives with little to no follow-through and assume this kind of thing gets done. They are generally unaware of the natural resistance to this kind of work in most IT organizations. Having motivated, engaged, and focused leadership on these types of efforts goes an extraordinarily long way to making headway here.

BEWARE OF ORGANIZATIONAL APATHY

The human factors that stack up against a project like this are impressive. While people may not be openly in revolt over such projects, there is a natural resistance to getting things done. This work is not sexy. This work is hard. This work is tedious. It likely means going back and touching equipment and kit that has not been messed with for a long time. You may have competing organizational priorities which place this kind of work at the bottom of the workload priority list. In addition to having executive buy-in and focus, make sure you have some really driven people running these programs. You are looking for CAN DO people, not MAKE DO people.

TECHNOLOGY CAN HELP, BUT IT’S NOT YOUR HEAVY LIFTER

Probably a bit strange for a technology blog to say, but it’s true. We have an incredible CMDB and Asset System at AOL. This was hugely helpful to the effort in really getting to the bottom of the list. However, no amount of technology in place will be able to perform the myriad tasks required to actually make material movement on this kind of work. Some of it requires negotiation, some of it requires strength of will, some of it takes pure persistence in running these issues down and working with the people. Understanding what is still required and what can be moved requires people. We had great technologies in place from the perspective of knowing where our stuff was, what it did, and what it was connected to. We had great technologies like our Cloud to ultimately move some of these platforms to. However, you need to make sure you don’t go too far into the people trap. I have a saying in my organization: there is a perfect number of project managers and security people in any organization, the number where the work output and value delivered is highest. What is that number? Depends, but you definitely know when you have one too many of each.

MAKE IT FUN IF YOU CAN

From the brass pigs to minor celebrations each month as we worked through the process, we ensured that the attention given to the effort was not negative. Sure, it can be tough work, but at the end of the day you are substantially investing in the overall agility of your organization. It’s something to be celebrated. In fact, at the completion of our aggressive goals the primary project leads involved did a great video (which you can see here) to highlight and celebrate the win. Everyone had a great laugh and a ton of fun doing what was ultimately a tough grind of work. If you are headed to Symposium I strongly encourage you to reach out to my incredible project leads. You will be able to recognize them from the video... without the mustaches, of course!

\Mm

A Digital Adieu

Those who follow the news from Digital Realty Trust closely may have recently read that I have decided to leave the company to focus a bit more on some personal work/life balance issues. With this move comes a new role that I will talk more about in the coming days and weeks.

I would like to take a moment and talk about my time and experience at Digital and what I believe to be some industry ground-breaking work that is being done there. The first thing that strikes me about the company is the quality and dedication of the people. The staff within the organization are incredibly committed both to providing the best product (in terms of engineering and construction) and to an obsessive regimen around Operations. In my role running all aspects of design, construction, and operations, this passion showed through every single day. It was a delight to work with such motivated people.

From the outside it might be difficult to gauge just how significant this operation truly is. As many of you know, I have run some large programs before, but they all pale in comparison to the size, scope, and complexity of the work happening at DLR. It’s one thing to be building a couple of very large facilities and quite another to be building out tens upon tens of data center construction initiatives across the world. There simply is no organization in the world that has to construct, manage and operate more data centers, period. In addition to these “blocking and tackling” items there is also a healthy focus on modularization and evolving data center design and prototyping. This focus is not just about driving additional efficiencies in power and cooling, but also in cost and time to deploy. A true intersection of business requirements. On top of all this you add the Pod Architecture Services program and Build to Suit program, which further extend Digital’s capabilities to those looking to build “Do it Yourself” (DIY) Data Centers. In short, it was a ton of fun with incredible opportunities for growth.

In my time at the company I have focused on driving additional streamlining efforts and operational rigor across the board and have helped set the engineering direction of the company. This work has already begun to pay some significant dividends and I am sure will continue to do so well into the future. But let me be clear: the success of these initiatives will be delivered by a top-rate team with few peers in the industry.

In short, Digital was a great experience and I feel blessed to have made some life-long friends there as well. So as I start a new chapter in my life, I bid a fond adieu to a Data Center Juggernaut and look boldly forward to what is to come, for me and for Digital.

\Mm

Must Have Swag…..

I try not to post much business-related stuff (à la Digital Realty Trust) on Loosebolts as it’s my own place to rant and rave. To be clear, none of the things I say on here represent the views of the company whatsoever. But sometimes there are things that come along that really make me smile and I have to comment on them.

As you know, I am a huge fan of modularization in the data center. Modularization in construction, modularization in operation; modularization is just all-around goodness, from the technical perspective through to the business side of things. That’s why the newest marketing campaign from Digital has me smiling ear to ear. The new Data Center Construction kit brings back memories from when I was a kid and built giant structures for my little people to generally live, die, and party in. It was of course a modular approach that led to endless hours of fun and imagination. Applying these fond remembrances of youth and combining them with both the modular data center movement and general fun will make this the MUST-HAVE piece of swag in the industry. Data Center Knowledge posted a video about the toys a few weeks ago. I can definitely tell you it will lead to hours of fun and wasted time at work putting it together. I should know, my completed “data center” sits proudly in my office!

After all we are all just kids at heart, aren’t we?

\Mm