The Weight of Technical Liberty…Cutting the Cruft

Over the next few months, it’s my sincere desire to share with you some of the amazing technology accomplishments currently underway at First Data and how we are attempting to change the industry. In any conversation about the future, you must begin by framing the past. As you may or may not know, First Data was founded in 1971. Its early years were hallmarked by significant technology innovation, with a number of ‘firsts’ in the enablement of credit card processing across the globe.

Throughout the years the company grew both organically and through a large number of mergers and acquisitions on a global scale, which ultimately enabled it to become the international leader it is today. I will spare you a longer commercial for the company and state only that today it has more scale and technology reach than any other company like it in the #Fintech space.

I share this information because it’s that unencumbered growth over decades of acquisition, an ever-changing field of regulatory and compliance requirements, and a steadily growing list of platforms and services that ultimately led to the largest trove of ‘Cruft’ I have ever been challenged with in my career. It’s a challenge 45 years in the making.

As you may recall I first defined ‘Cruft’ while engaged at the Turn-Around at AOL:

Cruft is defined as years of build-up of technology, processes, politics, fiscal orphaning, and poor operational hygiene that ultimately impede technical agility and operation.  Additionally, Cruft can create an acidic cloud of lethargy or apathy in the workforce that ultimately sucks the energy out of innovation from within.

When I originally defined the term I was referring to the work we accomplished attacking the Cruft in a different organization which ultimately led to the company winning the Uptime Institute’s “Server Round-Up” Award. That award was created to promote full IT and Facilities integration and improve overall energy efficiency.  While recognized for the energy efficiency improvement, it was really a by-product of other technological and organizational wins for the company.

Our work on ‘attacking the Cruft’ at First Data has resulted in similar, in fact greater, energy cost savings, but more importantly it has reduced and continues to reduce the operational complexity of our environments. Attacking the Cruft problem along the technology, process, and hygiene axes has produced some very powerful and significant results. While we are far from completing the task, the last twenty-four (24) months have yielded some staggering progress.

Is this really my metric? So Not Technical…

The first challenge I had was finding a way to truly quantify the reductions in a metric everyone could understand. Simply counting servers was not enough, because it could not account for storage equipment, network equipment, and other kit that does not easily fold into that definition. Measuring decreases in power usage, while absolutely telling from a purely technical perspective, obscured the tremendous amount of work and passion the teams poured into modernizing our plant. Many of the people who consume information about our modernization efforts are not technology or energy wonks. We had to come up with a metric that was universal, one that everyone, even non-technical people, could understand and visualize. In the end, we settled on the ‘ton’.

I know what you are thinking… the ton? As in… like… weight?

Yes. 

It’s not as cool as measuring in megawatts, or measured computational capacity, or MIPS, or IOPS, or whatever metric is fashionable these days, but it is universal. With any of those other metrics, the scale of the work output would just get lost. So what did we achieve over the last 24 months?

  1. We removed 220+ tons of IT Equipment from our global data centers.
  2. We consolidated and shut down 5 data centers across the world, and have an aggressive plan to continue to consolidate more.
  3. We employed large-scale internal virtualization technology and open source cloud technologies, and are building a hybridized cloud controller. Together this has moved nearly 75% of our physical distributed server environments to a virtualized footprint. (I will share more on that in a different post.)

There were other significant achievements as well, which we can discuss at a later date. But as I said, we had to set the framework of what the starting position was. We still have a mountain of work to do in this space, but the momentum has started and passions have been ignited. Those passions are blowing away that “acidic cloud” that results from Cruft. The results speak for themselves. That is an incredible amount of work to achieve in just 24 months. It’s not just about establishing a set of technical goals for an organization to achieve. As a leader, it’s about ensuring that you have created the fertile soil for those changes to take place and have empowered your people to make decisions along that alignment.

Of course, none of this could have been achieved if the firm, from the top down, were not dedicated to driving this kind of significant change. First Data is truly blessed with a board and leadership team who not only understand technology: they have lived it, they have managed it, they have won with it. It’s a very unique set of variables that have been toggled.

While tonnage may be an easier metric for non-techies to understand how much equipment was removed, it is hard to grasp just how much 220 tons actually represents. As these efforts over the last two years have created more operational simplicity, giving us the freedom and liberty to expand and explore new technology approaches, it is only fitting to associate it with the Statue of Liberty, which by coincidence also weighs roughly 220 tons. Visualize that.

\Mm

Bippity Boppity Boom! The Impact of Enchanted Objects on Development, Infrastructure and the Cloud

I have been spending a bunch of my time recently thinking through the impact of what David Rose of Ditto Labs and MIT Media Lab romantically calls ‘Enchanted Objects’. What are enchanted objects? Enchanted Objects are devices, appliances, tools, dishware, anything that is ultimately connected to the Internet (or any connected network) and becomes to some degree aware of the world around it. Imagine an umbrella with a light on its hilt that lights up if it may rain today, reminding you that you might want to bring it along on your travels. Imagine your pantry and refrigerator communicating with your grocery cart at the store while you shop, letting you know the things you are running low on, or even bypassing the part where you have to shop and automatically ordering them to your home. This approach is going to fundamentally change everything you know in life, from credit cards to having a barbeque with friends. These things and their capabilities are going to change our world in ways that we cannot even fathom today. Our Technology Industry calls this emerging field the Internet of Things. Ugh! How absolutely boring. Our industry has this way of sucking all the fun out of things, doesn’t it? I personally feel that ‘Enchanted Objects’ is a far more compelling classification, as it speaks to the possibilities, wonderment and possibly terror that lie in store for us. If we must make it sound ‘technical’ maybe we can call it the Enchantosphere.

While I may someday do a post about all of the interesting things I have found out there already, or the ideas that I have come up with for this new enchanted world, I wanted to reflect a bit on what it means for the things that I normally write about. You know, things like the cloud, big infrastructure, and scaled software development. So go grab your walking staff of traffic conditions and come on an interesting journey into the not-so-distant world of Cloud powered magic…

The first thing you need to understand is, if you work in this industry, you are not an idle player in this magical realm.  You are, for lack of a better term, a wizard or an enchanter.   Your role will be pivotal in creating magic items, maintaining the magic around us, or ensuring that the magic used by everyone stays strong. While the Dungeons and Dragons and fantasy book references are almost limitless for this conversation I am going to try and bring it back to the world we know today.  I promise.  I am really just trying to tease out a glimpse of the world to come and the importance of the cloud, data center infrastructure, and the significant impacts on software development and how software based services may have to evolve. 

The Magical Weaves Surround Us

Every device and enchanted item will be connected. Whether through Wi-Fi in your work and home, over mobile networks, or all of the above and more, these Enchanted Objects will be connected to the magical weaves all around us. If you happen to be a network engineer, you know that I am talking to you. All of these objects are going to have to connect to something. If you are one of those folks who are stuck in IPv4, you had better upgrade yourself. There just isn’t enough address space there to connect everything in our magical world of the future. IPv6 will be a must. In fact, these devices could just be that ‘killer app’ that drives global adoption of the standard even faster. But it’s not just about address space; these kinds of connected objects are going to open up and challenge whole new areas in security, spectrum management, routing, and a host of other areas. I am personally thinking through some very interesting source-based routing applications in the Enchantosphere as well. The short of it is, this new magical world is going to stress the limits of how things are connected today, and Network Engineers will be charged with keeping our magical weaves flowing to allow our charmed existences to continue. You are the Keepers of the Magical Weave, and I am not talking about a tricked out hairpiece either.
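For the muggles who want to see the address-space math, here is a quick back-of-the-envelope sketch. The address counts follow directly from the 32-bit and 128-bit address sizes; the device count is purely my own assumption for illustration (50 connected objects per person, 8 billion people), not a forecast.

```python
# Back-of-the-envelope comparison of IPv4 and IPv6 address space.
IPV4_BITS = 32    # IPv4 addresses are 32 bits wide
IPV6_BITS = 128   # IPv6 addresses are 128 bits wide

ipv4_addresses = 2 ** IPV4_BITS
ipv6_addresses = 2 ** IPV6_BITS

# ASSUMPTION, for illustration only: 50 enchanted objects per person
# across a population of 8 billion people.
ASSUMED_OBJECTS = 50 * 8_000_000_000

print(f"IPv4 addresses:  {ipv4_addresses:,}")
print(f"IPv6 addresses:  {ipv6_addresses:,}")
print(f"Assumed objects: {ASSUMED_OBJECTS:,}")
print(f"Fits in IPv4? {ASSUMED_OBJECTS <= ipv4_addresses}")
print(f"Fits in IPv6? {ASSUMED_OBJECTS <= ipv6_addresses}")
```

Even ignoring reserved ranges and the addresses already in use, the assumed object count overshoots the entire IPv4 space many times over, while barely registering against IPv6.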

While only briefly mentioned above, Security Engineers are going to have to evolve significantly as well. This will lead into whole new areas and fields of privacy protection that are hard to even conceive of at this point. Even things like Health and Safety will need to be considered. Imagine a stove that starts pre-heating itself based on where you are on your commute home and the dinner menu you have planned. While some of those controls will need to be programmed into the software itself, there is no doubt that those capabilities will need to be well guarded. Why, I can almost see the Wards and Glyphs of Protection you will have to create.

The Wizard’s Tower

As cool as all these enchanted objects could be, they would all be worthless IP-enabled husks without the advent of the construct that we now call The Cloud. When I talk about ‘The Cloud’ I am talking about more than just virtualized server instances and marketing-laden terminology. I am talking about Data Centers. I am talking about automation. I am talking about ubiquitous compute capabilities all around the world. The actual physical places where the magical services live! The Data Centers, which include the technologies of both IT and facilities infrastructure and automation: the proverbial Wizard’s Tower! This is where our enchanted objects will come to discover who they are, how they work, what they should do, and retrieve any new capabilities they may yet magically receive. This new world is going to drive the need for more compute centers across the globe. This growth will not just be driven by demand, although the demand will admittedly be huge, but by other more mundane ‘muggle’ matters such as regulatory requirements, privacy enforcement, taxation and revenue. I bet you were figuring that with all this newfound magical power flying around we would finally be able to rid ourselves of lawyers, legislators, government hacks, and the like. Alas, it is after all still the real world. Cloud computing capacity will continue to grow, the demand for services will increase, and an entire ecosystem of software and services that sit atop the various cloud providers will be born.

I don’t know if many of you have read Robert Jordan’s fantasy series ‘The Wheel of Time’, but in that series he has a classification of enchanted objects called ter’angreal. These are single purpose or limited power artifacts that anyone can use. Like my example of the umbrella that lights up if it’s going to rain after it checks with Weatherbug for weather conditions in your area, or a ring that lights up to let you know that there is a new Loosebolts post available to read, or a garden gnome whose hat lights up when it detects evidence of plant eating bugs in your garden. These are devices that require no technical knowledge to use or configure, yet give some value to their owners. They do their function and that is it. By the way, I am an engineer not a marketing guy; if you don’t like my examples of special purpose enchanted objects you can tweet me better ones at @mjmanos.

These devices will reach out, download their software, learn their capabilities, and just work as advertised. Software in this model may seem very similar to today’s software development techniques and environments, but I believe we will begin to see fundamental changes in how software works and is distributed. Software will be portable. Services will be portable. That will allow for truly amazing “multi-purpose” enchanted objects. The ability to download “apps” to these objects could become commonplace. Even something as commonplace as a credit card could evolve into a piece of software or code that could be transported around in various devices. Simply wave that RFID enabled stick (ok, wand) that contains your credit card app at the register, and as long as you are wearing your necklace which stores your digital ID the transaction goes through. Two factor authentication in the real world. Or instead of a wand, maybe it’s just your wallet. Thinking about this app enabled platform gives a whole new meaning to the Capital One catchphrase “What’s in your wallet?” In short, a whole host of software, services, and other capabilities will become incredibly portable, and allow for some very interesting enchanted objects indeed.
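To make the wand-and-necklace transaction a little more concrete, here is a toy sketch of that two factor check. Everything in it is hypothetical and invented for illustration: the `Wand` and `Necklace` types, the `CARD_OWNERS` lookup, and the `authorize` function. A real payment flow would involve cryptographic tokens and an issuer, none of which is modeled here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Wand:
    card_token: str  # factor one: the portable credit card "app"


@dataclass(frozen=True)
class Necklace:
    owner_id: str  # factor two: the wearer's digital ID


# Hypothetical issuer-side record linking card tokens to owner IDs.
CARD_OWNERS = {"card-1234": "alice"}


def authorize(wand: Optional[Wand], necklace: Optional[Necklace]) -> bool:
    """Approve only when both objects are present and belong together."""
    if wand is None or necklace is None:
        return False  # a single factor is never enough
    return CARD_OWNERS.get(wand.card_token) == necklace.owner_id


print(authorize(Wand("card-1234"), Necklace("alice")))    # both factors match
print(authorize(Wand("card-1234"), None))                 # necklace left at home
print(authorize(Wand("card-1234"), Necklace("mallory")))  # someone else's wand
```

The point of the sketch is only that the card becomes data carried by one object and the identity by another, so the register needs both before the magic works.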

The bottom line here is that we are just beginning to see into a new world of the Internet of Things… of Enchanted Objects.   The simpler things become the more complex they truly are.   Those of us who deal with large scale infrastructure, software and service development, and cloud based technologies have a heck of a ride ahead of us.  We are the keepers of the complex, Masters of the Arcane, and needers of a good bath.

\Mm

This is just lost on so many companies / organizations…

Having experienced nearly all of the pain and desire one could have in trying to scale out applications, operations, and infrastructure, I have become a huge proponent of blending efforts between Development and Operations. Additionally, I think the blend should include lower level stuff like facilities as well. The entire online paradigm fundamentally changes how the problem space should be viewed.

With concepts like NoOps, DevOps, and the like becoming fashionable in the Development community, it’s probably no surprise that these issues are being addressed from people’s own comfort spaces. To a development engineer, those Ops folks are crusty and cranky. To an Operations engineer, those darn developers don’t really code for long term operations. It’s always the ‘throw the code over the wall and the Ops folks will make it work’ mentality. In reality both sides are right.

The simple truth is (in my opinion) that the University System is to a large degree failing the industry, especially when it comes to developing for future platforms. Graduates are coming out by the thousands versed in Java, Ruby on Rails, and insert your favorite flavor of high level web platform here. The number of graduates who understand the underlying systems, and more basically how things work, is shrinking by the year. Add to this mix an understanding of developing code to RUN, and the infrastructural and operational requirements associated with it, and you are dealing with a very rare set of skills. Many of the big companies who do build for the RUN of software (read as SaaS, large scale online services, etc.) actually go through a bit of “re-education” with new hires, either to teach them these skill sets for the first time or to “re-program” the bad stuff they learned out.

I had a series of related things come through my inbox, along with a video shared with me from last year’s Velocity conference. I think they are powerful thought provokers to read and watch.

The first is the video from Velocity by Theo Schlossnagle, a founder and principal at OmniTI. It is somewhat skunk-worked under the heading of Career Development. Theo takes this from the perspective of the individual, but it is easily applied to organizations at large. It’s 13 minutes, but well worth the time.

http://velocityconf.com/velocity2011/public/schedule/detail/20406

The second is a post by Adrian Cockcroft from Netflix talking about the development and evolution of DevOps/NoOps in the culture there. The approach is right on, although I think to a large degree some of the real “ops” stuff has been outsourced to the cloud provider. That being said, I think the mindset shown here, from a broader “Development Responsibilities” perspective, is definitely the right way to think about the problem space. I have often talked about the Netflix Chaos Monkey approach and just how powerful that paradigm is:

http://perfcap.blogspot.com/2012/03/ops-devops-and-noops-at-netflix.html

The last is actually a response to the Cockcroft post by John Allspaw, who used to run Flickr’s Operations and is now at Etsy. While arguing the benefit of a stronger Ops presence and involvement, he also highlights the benefits of having Development and Engineering more aware of their surroundings.

Happy Reading!

\Mm