Industry Impact : Brothers from Different Mothers and Beyond…


My reading material and video watching habits these past two weeks have brought me some incredible joy and happiness. Why? Because Najam Ahmad of Facebook is finally getting some credit for the amazing work that he has done, and is still doing, in the world of Software Defined Networking. In my opinion Najam is a force majeure in the networking world. He is passionate. He is focused. He just gets things done. Najam and I worked very closely at Microsoft as we built out and managed the company’s global infrastructure. So closely, in fact, that we were frequently referred to as brothers from different mothers. Wherever Najam was, I was not far behind, and vice versa. We laughed. We cried. We fought. We had a lot of fun while delivering some pretty serious stuff. To find out that he is behind the incredible Open Compute Project advances in networking is not surprising at all. Always a forward-thinking guy, he has never been satisfied with the status quo.
If you have missed any of that coverage, I strongly encourage you to have a read at the links below.


This got me thinking about the legacy of the Microsoft program on the cloud and infrastructure industry at large. Data Center Knowledge ran an article covering the impact of some of the Yahoo alumni a few years ago. Many of those folks are friends of mine and deserve great credit. In fact, Tom Furlong now works side by side with Najam at Facebook. The purpose of my thoughts is not to take away from their achievements and impacts on the industry, but rather to highlight the impact of some of the amazing people and alumni from the Microsoft program. It’s a long overdue acknowledgement of the legacy of that program and how it has been a real driving force in large scale infrastructure. The list of folks below is by no means comprehensive, and it doesn’t cover the talented people Microsoft maintains in their deep stable who continue to push the innovative boundaries of our industry.

Christian Belady of Microsoft – Here we go, first person mentioned and I already blow my own rule. I know Christian is still there at Microsoft, but it’s hard not to mention him as he is the public face of the program today. He was an innovative thinker before he joined the program at Microsoft and was a driving thought leader and thought provoker while I was there. While his industry-level engagements have been greatly sidelined as he steers the program into the future, he continues to be someone willing to throw everything we know and accept today into the wind to explore new directions.
Najam Ahmad of Facebook – You thought I was done talking about this incredible guy? Not in the least; few people have solved network infrastructure problems at scale like Najam has. With his recent work on the OCP front finally coming to the fore, he continues to push the boundaries of what is possible. I remember long meetings with network vendors where Najam tried to influence capabilities and features with the box manufacturers within the paradigm of the time, and his work at Facebook is likely to land him in a position where he is both loved and reviled by the industry at large. If that doesn’t say you’re an industry heavyweight…nothing does.
James Hamilton of Amazon – There is no question that James continues to drive deep thinking in our industry. I remain an avid reader of his blog and follower of his talks. Back in my Microsoft days we would sit and argue philosophical issues around the approach to our growth, towards compute, towards just about everything. Those conversations either changed or strengthened my positions as the program evolved. His work in the industry, while at Microsoft and beyond, has continued to shape thinking around data centers, power, compute, networking and more.
Dan Costello of Google – Dan Costello now works at Google, but his impacts on the Generation 3 and Generation 4 data center approaches, and on the modular DC industry direction overall, will be felt for a very long time to come whether Google goes that route or not. Incredibly well balanced in his approach between technology and business, his ideas and talks continue to shape infrastructure at scale. I will spare people the story of how I hired him away from his previous employer, but if you ever catch me at a conference, it’s a pretty funny story. Not to mention the fact that he is the second best break dancer in the data center industry.
Nic Bustamonte of Google – Nic is another guy who has had some serious impact on the industry as it relates to innovating the running and operating of large scale facilities. His focus on the various aspects of the operating environments of large scale data centers, monitoring, and internal technology has shifted the industry and really set the infancy of DCIM in motion. Yes, BMS systems have been around forever, and DCIM is the next iteration and blending of that data, but his early work here has continued to influence thinking around the industry.
Arne Josefsberg of ServiceNow – Today Arne is the CTO of ServiceNow, focusing on infrastructure and management for enterprises and big players alike, and if their overall success is any measure, he continues to impact the industry through results. He is *THE* guy who had the foresight to build an organization that could adapt to this growing change of building and operating at scale. He is the architect of an amazing team that would eventually change the industry.
Joel Stone of Savvis/CenturyLink – Previously the guy who ran global operations for Microsoft, he has continued to drive excellence in operations at Global Switch and now at Savvis. An early adopter and implementer of blending facilities and IT organizations, he mastered issues a decade ago that most companies are still struggling with today.
Sean Farney of Ubiquity – Truly the first data center professional who ever had to productize and operationalize data center containers at scale. Sean has recently taken on the challenge of diversifying data center site selection and placement at Ubiquity, repurposing old neighborhood retail spaces (Sears, etc.) in the industry. Given the general challenges of finding places with a confluence of large scale power and network, this approach may prove to be quite interesting as markets continue to drive demand.
Chris Brown of Opscode – One of the chief automation architects during my time at Microsoft, he has moved on to become the CTO of Opscode. Everyone on the planet who is adopting and embracing a DevOps approach has heard of, and is probably using, Chef. In fact, if you are doing any kind of automation at large scale you are likely using his code.
None of these people would be comfortable with the attention, but I do feel credit should be given to these amazing individuals who are changing our industry every day. I am so very proud to have worked the trenches with these people. Life is always better when you are surrounded by those who challenge and support you, and in my opinion these folks have taken it to the next level.
\Mm

Bippity Boppity Boom! The Impact of Enchanted Objects on Development, Infrastructure and the Cloud

I have been spending a bunch of my time recently thinking through the impact of what David Rose of Ditto Labs and the MIT Media Lab romantically calls ‘Enchanted Objects’. What are enchanted objects? Enchanted Objects are devices, appliances, tools, dishware, anything that is ultimately connected to the Internet (or any connected network) and becomes to some degree aware of the world around it. Imagine an umbrella with a light on its hilt that lights up if it may rain today, reminding you that you might want to bring it along on your travels. Imagine your pantry and refrigerator communicating with your grocery cart at the store while you shop, letting you know the things you are running low on, or even bypassing the part where you have to shop and automatically ordering them to your home. This approach is going to fundamentally change everything you know in life, from credit cards to having a barbecue with friends. These things and their capabilities are going to change our world in ways that we cannot even fathom today. Our technology industry calls this emerging field the Internet of Things. Ugh! How absolutely boring. Our industry has this way of sucking all the fun out of things, doesn’t it? I personally feel that ‘Enchanted Objects’ is a far more compelling classification, as it speaks to the possibilities, wonderment, and possibly terror that lie in store for us. If we must make it sound ‘technical’ maybe we can call it the Enchantosphere.

While I may someday do a post about all of the interesting things I have found out there already, or the ideas that I have come up with for this new enchanted world, I wanted to reflect a bit on what it means for the things that I normally write about. You know, things like the cloud, big infrastructure, and scaled software development. So go grab your walking staff of traffic conditions and come on an interesting journey into the not-so-distant world of cloud-powered magic…

The first thing you need to understand is that if you work in this industry, you are not an idle player in this magical realm. You are, for lack of a better term, a wizard or an enchanter. Your role will be pivotal in creating magic items, maintaining the magic around us, or ensuring that the magic used by everyone stays strong. While the Dungeons and Dragons and fantasy book references are almost limitless for this conversation, I am going to try and bring it back to the world we know today. I promise. I am really just trying to tease out a glimpse of the world to come and the importance of the cloud, data center infrastructure, the significant impacts on software development, and how software-based services may have to evolve.

The Magical Weaves Surround Us

Every device and enchanted item will be connected. Whether through WiFi in your work and home, over mobile networks, or all of the above and more, these Enchanted Objects will be connected to the magical weaves all around us. If you happen to be a network engineer, you know that I am talking to you. All of these objects are going to have to connect to something. If you are one of those folks who are stuck in IPv4, you had better upgrade yourself. There just isn’t enough address space there to connect everything in our magical world of the future. IPv6 will be a must. In fact, these devices could be just the ‘killer app’ that drives global adoption of the standard even faster. But it’s not just about address space; these kinds of connected objects are going to open up and challenge whole new areas in security, spectrum management, routing, and a host of other areas. I am personally thinking through some very interesting source-based routing applications in the Enchantosphere as well. The short of it is, this new magical world is going to stress the limits of how things are connected today, and network engineers will be charged with keeping our magical weaves flowing to allow our charmed existences to continue. You are the Keepers of the Magical Weave, and I am not talking about a tricked-out hairpiece either.
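To put the IPv4-versus-IPv6 gap in concrete terms, here is a quick back-of-the-envelope calculation. The device counts are my own illustrative assumptions, not forecasts; the address-space sizes are simply 2^32 and 2^128:

```python
# Rough comparison of IPv4 vs IPv6 address space.
ipv4_addresses = 2 ** 32     # ~4.3 billion addresses
ipv6_addresses = 2 ** 128    # ~3.4 x 10^38 addresses

# Illustrative assumption: 10 billion people each owning
# 1,000 connected "enchanted objects".
devices_needed = 10_000_000_000 * 1_000

print(f"IPv4 space: {ipv4_addresses:,}")
print(f"Devices needed (assumed): {devices_needed:,}")
print(f"IPv4 falls short: {devices_needed > ipv4_addresses}")  # True
print(f"IPv6 headroom: ~{ipv6_addresses // devices_needed:.1e}x over")
```

Even under those made-up assumptions, IPv4 is short by a factor of thousands, while IPv6 still has roughly 10^25 times more room than we would need.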

While just briefly mentioned above, security engineers are going to have to evolve significantly as well. This will lead into whole new areas and fields of privacy protection that are hard to even conceive of at this point. Even things like health and safety will need to be considered. Imagine a stove that starts pre-heating itself based on where you are on your commute home and the dinner menu you have planned. While some of those controls will need to be programmed into the software itself, there is no doubt that those capabilities will need to be well guarded. Why, I can almost see the Wards and Glyphs of Protection you will have to create.

The Wizard’s Tower

As cool as all these enchanted objects could be, they would all be worthless IP-enabled husks without the advent of the construct that we now call The Cloud. When I talk about ‘The Cloud’ I am talking about more than just virtualized server instances and marketing-laden terminology. I am talking about data centers. I am talking about automation. I am talking about ubiquitous compute capabilities all around the world. The actual physical places where the magical services live! The data centers, which include the technologies of both IT and facilities infrastructure and automation: the proverbial Wizard’s Tower! This is where our enchanted objects will come to discover who they are, how they work, what they should do, and retrieve any new capabilities they may yet magically receive. This new world is going to drive the need for more compute centers across the globe. This growth will not just be driven by demand, although the demand will admittedly be huge, but by other more mundane ‘muggle’ matters such as regulatory requirements, privacy enforcement, taxation, and revenue. I bet you were figuring that with all this newfound magical power flying around we would finally be able to rid ourselves of lawyers, legislators, government hacks, and the like. Alas, it is, after all, still the real world. Cloud computing capacity will continue to grow, the demand for services will increase, and an entire eco-system of software and services that sit atop the various cloud providers will be birthed.

I don’t know if many of you have read Robert Jordan’s fantasy series ‘The Wheel of Time’, but in that series he has a classification of enchanted objects called ter’angreal. These are single purpose or limited power artifacts that anyone can use. Like my example of the umbrella that lights up if it’s going to rain after it checks with Weatherbug for conditions in your area, or a ring that lights up to let you know that there is a new Loosebolts post available to read, or a garden gnome whose hat lights up when it detects evidence of plant-eating bugs in your garden. These are devices that require no technical knowledge to use or configure, but give some value to their owner. They do their function and that is it. By the way, I am an engineer, not a marketing guy; if you don’t like my examples of special purpose enchanted objects you can tweet me better ones at @mjmanos.

These devices will reach out, download their software, learn their capabilities, and just work as advertised. Software in this model may seem very similar to today’s software development techniques and environments, but I believe we will begin to see fundamental changes in how software works and is distributed. Software will be portable. Services will be portable. This allows for truly amazing “multi-purpose” enchanted objects. The ability to download “apps” to these objects could become commonplace. Even something as commonplace as a credit card could evolve into a piece of software or code that could be transported around in various devices. Simply wave that RFID-enabled stick (ok, wand) that contains your credit card app at the register, and as long as you are wearing the necklace which stores your digital ID, the transaction goes through. Two-factor authentication in the real world. Or instead of a wand, maybe it’s just your wallet. When thinking about this app-enabled platform it gives a whole new meaning to the Capital One catchphrase, “What’s in your wallet?” The bottom line here is that a whole host of software, services, and other capabilities will become incredibly portable, and allow for some very interesting enchanted objects indeed.

The bigger point is that we are just beginning to see into a new world of the Internet of Things… of Enchanted Objects. The simpler things become, the more complex they truly are. Those of us who deal with large scale infrastructure, software and service development, and cloud-based technologies have a heck of a ride ahead of us. We are the Keepers of the Complex, Masters of the Arcane, and needers of a good bath.

\Mm

Google Purchase of Deep Earth Mining Equipment in Support of ‘Project Rabbit Ears’ and Worldwide WIFI availability…

(10/31/2013 – Mountain View, California) – Close examination of Google’s data center construction related purchases has revealed the procurement of large scale deep earth mining equipment.   While the actual need for the deep mining gear is unclear, many speculate that it has to do with a secretive internal project that has come to light known only as Project: Rabbit Ears. 

According to sources not at all familiar with Google technology infrastructure strategy, Project Rabbit Ears is the natural outgrowth of Google’s desire to provide ubiquitous infrastructure worldwide. On the surface, these efforts seem consistent with other incorrectly speculated projects such as Project Loon, Google’s attempt to provide Internet services to residents of the upper atmosphere through the use of high altitude balloons, and a project that has only recently become visible and the source of much public debate, known as ‘Project Floating Herring’, in which a significantly sized floating barge with modular container-based data centers has been spied sitting in the San Francisco Bay.

“You will notice there is no power or network infrastructure going to any of those data center shipping containers,” said John Knownothing, chief Engineer at Dubious Lee Technical Engineering Credibility Corp.  “That’s because they have mastered wireless electrical transfer at the large multi-megawatt scale.” 

Real estate rates in the Bay Area have increased almost exponentially over the last ten years, making the construction of large scale data center facilities an expensive endeavor. During the same period, the Port of San Francisco has unfortunately seen a steady decline in its import-export trade. After a deep analysis it was discovered that docking fees in the Port of San Francisco are considerably undervalued and will provide Google with an incredibly cheap real estate option in one of the most expensive markets in the world.

It will also allow them to expand their use of renewable energy through tidal power generation built directly into the barge’s hull. “They may be able to collect as much as 30 kilowatts of power sitting on the top of the water like that,” continues Knownothing, “and while none of that technology is actually visible, possible, or exists, we are certain that Google has it.”

While the technical intricacies of the project fascinate many, the initiative does have its critics like Compass Data Center CEO, Chris Crosby, who laments the potential social aspects of this approach, “Life at sea can be lonely, and no one wants to think about what might happen when a bunch of drunken data center engineers hit port.”  Additionally, Crosby mentions the potential for a backslide of human rights violations, “I think we can all agree that the prospect of being flogged or keel hauled really narrows down the possibility for those outage causing human errors. Of course, this sterner level of discipline does open up the possibility of mutiny.”

However, the public launch of Project Floating Herring will certainly need to await the delivery of the more shrouded Project Rabbit Ears for various reasons. Most specifically, the primary reason for the development of this technology is so that Google can ultimately drive the floating facility out past twelve miles into international waters, where it can dodge all national, regional, and local taxation, as well as the safe harbor and privacy legislation of any country or national entity on the planet that would use its services. In order to realize that vision, in the current network paradigm, Google would need exceedingly long network cables to attach to network access points and carrier connection points as the facilities drive through international waters.

This is where Project Rabbit Ears becomes critical to the Google strategy. Making use of the deep earth mining equipment, Google will be able to drill deep into the Earth’s crust, into the mantle, and ultimately build a large network access point near the Earth’s core. This planetary WiFi solution will be centrally located to cover the entire Earth without the use of regional WiFi repeaters. Google’s floating facilities could then gain access to unlimited bandwidth and provide yet another consumer-based monetization strategy for the company.

Knownothing also speculates that such a move would allow Google to make use of enormous amounts of free geothermal power and almost singlehandedly become the greenest power user on the planet. Speculation also abounds that Google could then sell that power through its as-yet-un-invented large scale multi-megawatt wireless power transfer technology, as unseen on its floating data centers.

Much of the discussion around this kind of Google-driven technology innovation has been given a credible amount of veracity, and has been discussed by many seemingly intelligent technology news outlets and industry organizations who should intellectually know better, but prefer not to acknowledge the inconvenient lack of evidence.

 

\Mm

Editor’s Note: I have many close friends in the Google infrastructure organization and firmly believe that they are doing some amazing, incredible work in moving the industry along, especially in solving problems at scale. What I find simply amazing is how often, in the search for innovation, our industry creates things that may or may not be there and convinces itself so firmly that they exist.

2014: The Year Cloud Computing and Internet Services Will Be Taxed. A.K.A. Je déteste dire ça. Je vous l’avais dit. (I hate to say it. I told you so.)

 


It’s one of those times I really hate to be right. As many of you know, I have been talking about the various grassroots efforts afoot across many of the EU member countries to start driving a more significant tax regimen on Internet-based companies. My predictions for the last few years have been more cautionary tales, based on what I saw happening from a regulatory perspective on a much smaller scale, country to country.

Today’s Wall Street Journal has an article discussing France’s move to begin taxing Internet companies that derive revenue from users and companies across the entirety of the EU, while holding those companies responsible to the tax base in each country. This means such legislation is likely to become quite fractured and tough for Internet companies to navigate. The French proposition asks the European Commission to draw up proposals by the spring of 2014.

This is likely to have a very interesting effect (read: cost increases) across just about every aspect of Internet and cloud computing resources. From a business perspective this is going to increase costs, which will likely be passed on to consumers in small but interesting ways. Internet advertising will need to be differentiated on a country-by-country basis, and advertisers will end up having different cost structures. Cloud computing companies will DEFINITELY need to understand where customer instances are running, and whether or not they are making money. Potentially more impactful, customers of cloud computing may be held to account for taxation obligations that they did not know they had! Things like data center site selection are likely going to become even more complicated from a tax analysis perspective, as countries with higher populations may become no-go zones (perhaps) or require the passage of even more restrictive laws.

It’s not like the seeds of this haven’t been around since 2005; I think most people just preferred to turn a blind eye to the fact that the seed was sprouting into a full-fledged tree. Going back to my Cat and Mouse papers from a few years ago… the cat has caught the mouse; it’s now the mouse’s move.

\Mm

 

Author’s Note: If you don’t have a subscription to the WSJ, All Things Digital did a quick synopsis of the article here.

The Soft Whisper that Big Data, Cloud Computing, and Infrastructure at Scale should treat as a Clarion Call.

The Cloud Jail

On Friday, August 23rd, the Chinese government quietly released Shi Tao from prison.   He was released a full fifteen months before his incarceration was supposed to end.  While certainly a relief to his family and friends, it’s likely a bittersweet ending to a sour turn of events.

Just who is Shi Tao, and what the heck does he have to do with Big Data? Why is he important to cloud computing and big infrastructure? Is he a world-class engineer who understands technology at scale? Is he a deep thinker of all things cloud? Did he invent some new technology poised to revolutionize and leapfrog our understanding?

No.  He is none of these things.  

He is a totem of sorts. A living parable and a reminder of the realities that many in the cloud computing industry, and those who deal with Big Data, rarely if ever address head on. He represents the cautionary tale of what can happen if companies and firms don’t fully vet the real-world impacts of their technology choices: the site selection of their data centers, the impact of how data is stored, where that data is stored, the methods used to store the data. In short, a responsibility for the full accounting and consideration of their technological and informational artifacts.

To an engineering mind that responsibility generally means the most efficient storage of data at the least cost, using the most direct method or the highest performing algorithm. In short… to continually build a better mouse trap.

In site selection for new data centers, it would likely be limited to just the basic real estate and business drivers. What is the power cost? What is the land cost? What is my access to water? Is there sufficient network nearby? Can I negotiate tax breaks at the country and/or local levels?

In selecting a cloud provider, it’s generally about avoiding large capital costs and paying for what I need, when I need it.

In the business landscape of tomorrow, these thoughts will prove short-sighted and may expose your company to significant cost and business risks it is not contemplating, or worse!

Big Data is becoming a dangerous game. To be fair, content and information in general have always been a bit of a dangerous game. In technology, we just go on pretending we live under a Utopian illusion that fairness ultimately rules the world. It doesn’t. Businesses take on inherent risk in collecting, storing, analyzing, and using the data that they obtain. Does that sound alarmist or jaded? Perhaps, but it’s spiced with some cold hard realities that are becoming more present every day, and you ignore them at your own peril.

Shi was arrested in 2004 and sentenced to prison the following year on charges of disclosing state secrets.  His crime? He had sent details of a government memo restricting news coverage to a human rights group in the United States.  The Chinese government demanded that Yahoo! (his mail provider) turn over all mail records (Big Data) to the authorities. Something they ultimately did.  

Now before you go and get your Western democracy sensibilities all in a bunch and cry foul, that ugly cold hard reality thing I was talking about plays a real part here. As Yahoo was operating as a business inside China, they were bound to comply with Chinese law, no matter how hard the action was for them to stomach. Around that time Yahoo sold most of its stake in the Chinese market to Alibaba, and as of the last month or so Yahoo has left China altogether.

Yahoo’s adventure in data information risk and governmental oversight was not over, however. They were brought before the US Congress on charges of human rights violations, placing them once again into a pot of boiling water, this time from a governmental concern closer to home.

These events took place almost seven years ago, and I would argue that the world of information, big data, and scaled infrastructure has actually gotten more convoluted and tricky to deal with since. With the advent of Amazon AWS and other cloud services, a lack of understanding of regional and local safe harbor practices among enterprises and startups alike, and concepts like chain of custody and complicated, recursive ownership rights, things can become obfuscated to the point of insanity if you don’t have a program to manage them. We don’t have to use the example of China either; similar complexities are emerging across, and internal to, Europe. Is your company really thinking through Big Data? Do you fully understand ownership in a clouded environment? Who is responsible for taxation when your local business is hosted internationally? What if your cloud servers, with your data, hosted by a cloud platform, were confiscated by local and regional governments without your direct involvement? Are you strategically storing data in a way that protects you? Do you even have someone looking at these risks to your business?

As a recovering network engineer I am reminded of an old joke referring to the OSI Model. The OSI Model categorizes all functions of a communication system into seven logical layers. It makes internetworking clear, efficient, and easily categorized. Of course, as every good network engineer knows, it doesn’t account for Layers 8 and 9. But wait! You said there were only seven! Well, Layers 8 and 9 are Politics and Religion. These layers exist in cloud computing and Big Data too, and are potentially more impactful to the business overall.

All of these scenarios do not necessarily lend themselves to the most direct or efficient path, but it’s pretty clear that you can save yourself a whole lot of time and heartache if you think about them strategically. The infrastructure of tomorrow is powerful, robust, and ubiquitous. You simply cannot manage this complex eco-system the same way you have in the past, and just like the technology, your thinking needs to evolve.

\Mm

Lots of interest in the MicroDC, but do you know what I am getting the most questions about?

 Scott Killian of AOL talks about the MicroDC

Last week I put up a post about how AOL.com now has 25% of all traffic running through our MicroDC infrastructure. There was a great follow-up post by James LaPlaine, our VP of Operations, on his blog Mental Effort, which goes into even greater detail. While many of the email inquiries I get have been based around the technology itself, surprisingly a large majority of the notes have been questions around how to make your software, applications, and development efforts ready for such an infrastructure, and what the timelines for realistically doing so would be.

The general response, of course, is that it depends. If you are a web-based platform or property focused solely on Internet-based consumers, or a firm that needs diversified presence in different regions without the hefty price tag of renting and taking down additional space, this may be an option. However, many enterprise-based applications have been written in a way that is highly dependent upon localized infrastructure and short application-based latency, and they lack adequate scaling. So for more corporate data center applications this may not be a great fit. It will take some time for those big traditional application firms to truly build out their infrastructure to work in an environment like this (they may never do so). I suspect most will take an easier approach and try to ‘cloudify’ their own applications and run them within their own infrastructure or data centers under their control. This essentially allows them to control the access portion of users’ needs, but continue to rely on the same kinds of infrastructure you might have in your own data center to support it. It’s much easier to build a web-based application which then connects to a traditional IT-based environment than to truly build out infrastructure capable of accommodating scale. I am happy to continue answering questions as they come up, but as I had an overwhelming response of questions about this I thought I would throw something quick up here that will hopefully help.

 

\Mm

On Micro Datacenters, Sandy, Supercomputing 2012, and Coding for Containerized Data Centers….


As everyone is painfully aware, last week the United States saw the devastation caused by Superstorm Sandy. My original intention was to talk about yet another milestone with our Micro Data Center approach. As the storm slammed into the East Coast, I felt it was probably a bad time to talk about achieving something significant, especially as people were suffering through the storm's aftermath. In fact, after the storm AOL kicked off an incredible supplies drive and sent truckloads of goods up to the worst of the affected areas.

So, here we are a week after the storm, and while people are still in need and suffering, it is clear that the worst is over and the cleanup and healing have begun. It turns out that Superstorm Sandy also allowed us to test another interesting case in the journey of the Micro Data Center, which I will touch on below.

25% of ALL AOL.COM Traffic runs through Micro Data Centers

I have talked about the potential value of our use of Micro Data Centers and the pure agility and economics the platform will provide for us. Up until this point we had used this technology in pockets. Think of our explorations as focusing on beta and demo environments. But that all changed in October, when we officially flipped the switch and began taking production traffic for AOL.COM with the Micro Data Center. We are currently (and have been since flipping the switch) running about 25% of all traffic coming to our main web site. This is an interesting achievement in many ways. First, from a performance perspective, we are manually limiting the platform (it could do more!) to ~65,000 requests per minute and a traffic volume of about 280 Mbits per second. To date I haven't seen many people post performance statistics about applications in modular use, so hopefully this is relevant and interesting to folks in terms of the volume of load an approach such as this could take. We celebrated this at a recent All-Hands with an internal version of our MDC being plugged into the conference room. To prove our point we added it to the global pool of capacity for AOL.com and started taking production traffic right there at the conference facility. This proves in large part the value, agility, and mobility a platform like this could bring to bear.
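For a rough sense of scale, here is a quick back-of-the-envelope calculation from the figures above (the per-request size is an implied average derived from those two numbers, not something we measured directly):

```python
# Back-of-the-envelope math for the MDC production numbers quoted above.
requests_per_minute = 65_000
traffic_mbit_per_sec = 280

requests_per_sec = requests_per_minute / 60                       # ~1,083 req/s
bits_per_request = (traffic_mbit_per_sec * 1_000_000) / requests_per_sec
kb_per_request = bits_per_request / 8 / 1024                      # implied average response size

print(f"{requests_per_sec:.0f} req/s, ~{kb_per_request:.0f} KB per request")
# → 1083 req/s, ~32 KB per request
```

In other words, a single manually throttled unit is absorbing over a thousand requests every second of a top-tier web property.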

Scott Killian, AOL's data center guru, talks about the deployment of AOL's Micro Data Center. An internal version went 'live' during the talk.

 

As I mentioned before, Superstorm Sandy threw us another curveball as the hurricane crashed into the Mid-Atlantic. While Virginia was not hit anywhere near as hard as New York and New Jersey, there were incredible sustained winds, torrential rains, and storm-related damage everywhere. Through it all, our outdoor version of the MDC weathered the storm just fine and continued serving traffic for AOL.com without fail.

 

This kind of Capability is not EASY or Turn-Key

That's not to say there isn't a ton of work to do to get an application to work in an environment like this. If you take the problem space at its different levels (DNS, load balancing, network redundancy, configuration management, underlying application-level timeouts, and systems dependencies like databases and other information stores), the non-infrastructure-related work and coding is not insignificant. There is a huge amount of complexity in running a site like AOL.com: lots of interdependencies, sophistication, advertising-related collection and distribution, and the like. It's safe to say that this is not as simple as throwing up an Apache/Tomcat instance into a VM.
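To give one tiny, illustrative taste of that application-level work (this is not AOL's code, just a generic sketch), even something as basic as calling a dependency needs explicit timeouts, retries, and backoff before it can survive in a distributed footprint like this, rather than hanging forever on a dead database or service:

```python
import time
import urllib.request
import urllib.error

def fetch_with_retry(url, timeout=2.0, retries=3, backoff=0.5):
    """Fetch a dependency, failing fast and retrying with exponential backoff
    instead of hanging indefinitely when a remote system is down."""
    last_error = None
    for attempt in range(retries):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                return resp.read()
        except (urllib.error.URLError, TimeoutError) as err:
            last_error = err
            time.sleep(backoff * (2 ** attempt))  # wait longer after each failure
    raise last_error
```

Multiply that pattern across every dependency, plus DNS strategy, load balancer health checks, and configuration management, and the scope of the coding effort becomes clear.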

I have talked for quite a while about what Netflix engineers originally coined as Chaos Monkeys: the ability, development paradigm, or even rogue processes for your applications to survive significant infrastructure and application-level outages. It's essentially taking the redundancy out of the infrastructure and putting it into the code. While extremely painful at the start, the long-term savings are proving hugely beneficial. For most companies, this is still something futuristic, very far out there. They may be beholden to software manufacturers and developers to start thinking this way, which may take a very, very long time. Infrastructure is the easy way to solve it. It may be easy, but it's not cheap. Nor, if you care about the environmental angle, is it very 'sustainable' or green. Limit the infrastructure; limit the waste. While we haven't really thought about rolling this up into our environmental positions, perhaps we should.
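For readers unfamiliar with the concept, here is a toy sketch of the idea (Netflix's actual Chaos Monkey is a far more sophisticated tool that terminates real cloud instances; this only illustrates the principle of deliberately killing capacity so the application code, not the hardware, must provide the redundancy):

```python
import random

class Instance:
    """Stand-in for a server instance; a real version would call a cloud API."""
    def __init__(self, name):
        self.name = name
        self.alive = True

    def terminate(self):
        self.alive = False

def chaos_monkey(instances, kill_probability=0.1, rng=random):
    """Randomly terminate instances so the application is forced to prove
    it can survive the loss, rather than assuming the infrastructure is safe."""
    killed = []
    for inst in instances:
        if inst.alive and rng.random() < kill_probability:
            inst.terminate()
            killed.append(inst.name)
    return killed
```

An application that keeps serving traffic while this runs against its pool has moved its redundancy into the code, which is exactly the discipline being described above.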

The point is that getting to this level of redundancy is going to take work, and to that end it will continue to be a regulator or anchor slowing greater adoption of more modular approaches. But at least in my mind the future is set; directionally, it will be hard to ignore the economics of this type of approach for long. Of course, as an industry we need to start training or re-training developers to think in this kind of model: to build code in such a way that it takes into account the Chaos Monkey potential out there.

 

Want to see One Live?


We have been asked to provide an AOL Micro Data Center for the Supercomputing 2012 conference next week in Salt Lake City, Utah with our partner Penguin Computing. If you want to see one of our internal versions live and up close, feel free to stop by and take a look. Jay Moran (my Distinguished Engineer here at AOL) and Scott Killian (the leader of our data center operations teams) will be onsite to discuss the technologies and our use cases.

 

\Mm

Insider Redux: Data Barn in a Farm Town

I thought I would start my first post by addressing the second New York Times article first. Why? Because it specifically mentions activities and messages sourced from me at the time when I was responsible for running the Microsoft data center program. I will try to track the timeline mentioned in the article with my specific recollections of the events, so that, as Paul Harvey used to say, you can know 'the REST of the STORY'.

I remember my first visit to Quincy, Washington. It was a bit of a road trip for me and a few other key members of the Microsoft site selection team. We had visited a few of the local communities and power utility districts, doing our due diligence on the area at large. Our 'heat map' process had led us to Eastern Washington state, not very far (just a few hours) from the 'mothership' of Redmond, Washington. It was a bit of a crow-eating exercise for me, as just a few weeks earlier I had proudly exclaimed that our next facility would not be located on the West Coast of the United States. We were developing an interesting site selection model that would categorize and weight areas around the world. It would take in FEMA disaster data, fault zones, airport and logistics information, location of fiber optic and carrier presence, workforce distributions, regulatory and tax data, water sources, and power. This was going to be the first real construction effort undertaken by Microsoft. The cost of power was definitely a factor, as the article calls out, but just as important was the generation mix of the power in the area: in this case a predominance of hydroelectric, with a low-to-no carbon footprint (rivers, I came to find out, actually give off carbon emissions). Regardless, the generation mix was and would continue to be a hallmark of site selection for the program while I was there. The crow-eating began when we realized that the 'greenest' area per our methodology was actually located in Eastern Washington, along the Columbia River.

We had a series of meetings with real estate folks, the local Grant County PUD, and the economic development folks of the area. Back in those days the secrecy around who we were was paramount, so we kept our identities and that of our company secret, like geeky secret agents on an information-gathering mission. We would not answer questions about where we were from, who we were, or even our names. We 'hid' behind third-party agents who took everyone's contact information and acted as brokers of information. That was early days; the cloak and dagger would soon fall away as it became more advantageous to be known in tax negotiations with local and state governments.

During that trip we found the perfect parcel of land: 75 acres with great proximity to local substations, just down-line from the dams on the nearby Columbia River. It was November 2005. As we left that day and headed back, it was clear that we felt we had found site selection gold. As we started to prepare a purchase offer, we got wind that Yahoo! was planning a trip out to the area as well. As the local folks seemingly thought we were a bank or large financial institution, they wanted to let us know that someone on the Internet was interested in the area too. This acted like a lightning rod, and we raced back and locked up the land before Yahoo had a chance to leave the Bay Area. In those early days the competition was fierce; I have tons of interesting tales of cloak-and-dagger intrigue between Google, Microsoft, and Yahoo. While it was work, there was definitely an air of something big on the horizon, a sense that we were all at the beginning of something. In many ways the technology professionals involved, regardless of company, forged deep relationships and rivalries with each other.

Manos on the bean field, December 2005

The article talks about how the 'gee-whiz moment faded pretty fast'. While I am sure that it faded in time (as all things do), I also recall the huge increase in local business as thousands of construction workers descended upon this wonderful little town, the tours we would give local folks and city council dignitaries, and a spirit of truly working together. Then of course there was the ultimate reduction in property taxes resulting from even our first building, and an increase in home values to boot. It's an oft-missed benefit that I am sure the town of Quincy and Grant County have continued to enjoy as the data center cluster added Yahoo, Sabey, IAC, and others. I warmly remember the opening day ceremonies, the ribbon cutting, and a sense of pride that we did something good. Corny? Probably, but that was the feeling. There was no talk of generators. There were no picket signs; in fact, the EPA of Washington state had no idea how to deal with a facility of this size, and I remember openly working in partnership with them. That of course eventually wore off to the realities of life. We had a business to run, the city moved on, and concerns eventually arose.

The article calls out a showdown between Microsoft and the Power Utility District (PUD) over a fine for missing a capacity forecasting target. As this happened well after I left the company, I cannot really comment on that specific matter. But I can see how that forecast could miss. Projecting power usage months ahead is more than a bit of science mixed with art. It gets into the complexity of understanding capacity planning in your data centers. How big will certain projects grow? Will they meet expectations or fall short? New product launches can be duds or massive successes. All of these things go into a model to try and forecast the growth. If you think this is easy, I would submit that NO ONE in the industry has been able to master the crystal ball. I would also submit that most small companies haven't been able to figure it out either. At least at companies like Microsoft, Google, and others you can start using the law of large numbers to get close. But you will always miss, either too high or too low. Guess too low and you impact internal budgeting figures and run rates. Not good. Guess too high and you could fall short of minimum contracts with utility companies and be subject to fines.
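To make the forecasting problem concrete, here is a toy sketch (all numbers invented purely for illustration; they are not Microsoft's or the PUD's figures) of how even a reasonable trend extrapolation misses badly the moment a product launch surges:

```python
# Toy capacity forecast: extrapolate power demand from recent growth,
# then compare against an "actual" where a launch unexpectedly takes off.
history_mw = [4.0, 4.4, 4.8, 5.2]          # past monthly demand, MW (invented)
growth = history_mw[-1] - history_mw[-2]   # naive linear trend
forecast = [round(history_mw[-1] + growth * i, 1) for i in (1, 2, 3)]

actual = [5.6, 6.9, 8.4]                   # surprise success from month 2 on
errors = [round(a - f, 1) for a, f in zip(actual, forecast)]
print(forecast)  # [5.6, 6.0, 6.4]
print(errors)    # [0.0, 0.9, 2.0] -- the model misses low once demand surges
```

Flip the surprise the other way (a launch that flops) and the same model misses high, which is exactly the too-high/too-low trap described above.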

In the case mentioned in the article, the approach taken, if true, would not be the smartest method, especially given the monthly electric bill for these facilities. It's a cost of doing business and largely inconsequential at the amount of consumption these buildings draw. Again, if true, it was a PR nightmare waiting to happen.

At this point the article breaks out and talks about how the Microsoft experience would feel more like dealing with old-school manufacturing rather than ‘modern magic’ and diverts to a situation at a Microsoft facility in Santa Clara, California.

The article references that this situation is still being dealt with inside California, so I will not go into any detailed specifics, but I can tell you something is rotten in the state of Denmark, and I don't mean the diesel fumes. Microsoft purchased that facility from another company. As the usage of the facility ramped up to the levels it was certified to operate at, operators noticed a pretty serious issue developing. While the building was rated to run at a certain load size, it was clear that the underground feeders were undersized, and the by-product could have polluted the soil and gotten into the water system. This was an inherited problem, and Microsoft did the right thing and took the high road to remedy it. It is my recollection that all sides were clearly in the know about the risks and agreed to the generator usage whenever needed while the larger issue was fixed. If this has come up as an 'air quality issue', I personally would guess that there are politics at play. I'm not trying to be an apologist, but if true, it goes to show that no good deed goes unpunished.

At this point the article cuts back to Quincy. It's a great town, with great people. To some degree it was the winner of the Internet jackpot lottery because of the natural tech resources it is situated on. I thought the figures quoted around taxes were an interesting component missed in much of the reporting I read.

“Quincy’s revenue from property taxes, which data centers do pay, has risen from $815,250 in 2005 to a projected $3.6 million this year, paying for a library and repaved streets, among other benefits, according to Tim Snead, the city administrator.”

As I mentioned in yesterday's post, my job is ultimately to get things done and deliver results. When you are in charge of a capital program as large as Microsoft's was at the time, your mission is clear: deliver the capacity and start generating value for the company. When I was presented the last crop of beans harvested from the field at the ceremony, we still had some way to go before all construction and capacity were ready. One of the key missing components was the delivery and installation of a transformer for one of the substations required to bring the facility up to full service. The article notes that I was upset that the PUD was slow to deliver the capacity. That capacity, I would add, had been promised along a certain set of timelines, commitments were made, and money was exchanged based upon those commitments. As you can see from the article, the money exchanged was not insignificant. If Mr. Culbertson felt that I was a bit arrogant in demanding follow-through on promises and commitments after monies and investments were made in a spirit of true partnership, my response would be 'welcome to the real world'. As far as being cooperative: by April the construction had already progressed 15 months since its start. Hardly a surprise, and if it was, perhaps the 11-acre building and the large construction machinery driving around town could have been a clue to the sincerity of the investment and timelines. Harsh? Maybe. Have you ever built a house? If so, then you know you need to make sure the process is tightly managed and controlled to ensure you make the delivery date.

The article then goes on to talk about the permitting for the diesel generators. By the Department of Ecology's own admission, "At the time, we were in scramble mode to permit our first one of these data centers." Additionally, the article states:

Although emissions containing diesel particulates are an environmental threat, they were not yet classified as toxic pollutants in Washington. The original permit did not impose stringent limits, allowing Microsoft to operate its generators for a combined total of more than 6,000 hours a year for “emergency backup electrical power” or unspecified “maintenance purposes.”

At the time all this stuff was so new that everyone was learning together. I simply don't buy that this was some kind of Big Corporation versus Little Farmer thing. I cannot comment on the events of 2010, where Microsoft asked for itself to be disconnected from the grid. Honestly, that makes no sense to me even if the PUD was working on the substation, and there I would agree with the article's 'experts'.

Well, that's my take on my recollection of events during those early days of the Quincy build-out as it relates to the articles. Maybe someday I will write a book, as the process and adventures of those early days of the birth of Big Infrastructure were certainly exciting. The bottom line is that the data center industry is amazingly complex, and the forces in play are as varied as technology, politics, people, and everything in between. There is always a deeper story. More than meets the eye. More variables. Decisions are never black and white and are always weighed against a dizzying array of forces.

\Mm

Pointy Elbows, Bags of Beans, and a little anthill excavation…A response to the New York Times Data Center Articles

I have been following with some interest the series of articles in the New York Times by Jim Glanz. The series premiered on Sunday with an article entitled Power, Pollution and the Internet, which was followed up today with a deeper dive into some specific examples. The examples today (Data Barns in a Farm Town, Gobbling Power and Flexing Muscle) focused on the Microsoft program, a program with which I have more than some familiarity, since I ran it for many years. After just two articles, reading the feedback in comments and seeing some of the reaction in the blogosphere, it is very clear that there is more than a significant amount of misunderstanding, over-simplification, and a lack of detail that I think is important. In responding I want to be very clear that I am not representing AOL, Microsoft, or any other organization; these are my own personal observations and opinions.

As mentioned in both of the articles, I was one of hundreds of people interviewed by the New York Times for this series. In those conversations with Jim Glanz a few things became very apparent. First, he has been on this story for a very long time, at least a year. As far as journalists go, he was incredibly deeply engaged and armed with tons of facts. In fact, he had a trove of internal emails, meeting minutes, and a mountain of data from government filings that must have taken him months to collect. Secondly, he had the very hard job of turning this very complex space into a format the uneducated masses can begin to understand. Therein lies much of the problem: this is an incredibly complex space to try to communicate to those not tackling it day to day, or who do not understand the technological and regulatory forces involved. This is not an area or topic that can be distilled down to a sound bite. If this were easy, there really wouldn't be a story, would there?

At issue for me is that the complexity of the forces involved gets scant attention, the articles aiming instead for the "data centers are big bad energy vampires hurting the environment" story. It's clearly evident reading through the comments on both of the articles so far, which claim the sources and causes of the problem run the gamut from poor web page design to conspiracies by governments or multinational companies to corner the market on energy.

So I thought I would take a crack at it article by article, to shed some light (the kind that doesn't burn energy) on some of the topics and just call out where I disagree completely. In full transparency, the "Data Barns" article doesn't necessarily paint me as a "nice guy". Sometimes I am. Sometimes I am not. I am not an apologist, nor do I intend to be one in this post. I am paid to get stuff done. To execute. To deliver. Quite frankly, the PUD missed deadlines (the progenitor event to my email quoted in the piece), and sometimes people (even utility companies) have to live in the real world of consequences. I think my industry reputation, work, and fundamental stances around driving energy efficiency and environmental conservancy in this industry can stand on their own, both publicly and for those who have worked for me.

There is an inherent irony here: these articles were published both in print and electronically to maximize audience and readership. To do that, they made "multiple trips" through a data center, and they ultimately reside in one (or more). The articles seem to suggest that keeping things online is bad, which goes against their own availability and need. Doesn't the New York Times expect to make these articles available online for people to read? They are posted online already. Perhaps they expect that their microfiche experts would be able to serve the demand for these articles in the future? I do not think so.

This is a complex ecosystem of users, suppliers, technology, software, platforms, content creators, data (both BIG and small), regulatory forces, utilities, governments, financials, energy consumption, people, personalities, politics, company operating tenets, and community outreach, to name a very few. On top of managing all these variables, operators also have to keep things running with no downtime.

\Mm

The AOL Micro-DC adds new capability

Back in July, I announced AOL's Data Center Independence Day with the release of our new 'Micro Data Center' approach. In that post I highlighted the terrific work the teams put in to revolutionize our data center approach and align it completely not only to technology goals but to business goals as well. It was an incredible amount of engineering and work to get to that point, and it would be foolish to think the work represented a 'one and done' type of effort.

So today I am happy to announce the roll out of a new capability for our Micro-DC – An indoor version of the Micro-DC.

Aol MDC-Indoor2

While the first instantiations of our new capability were focused on outdoor environments, we were also hard at work on an indoor version with the same set of goals. Why work on an indoor version as well? Well, you might recall that in the original post I stated:

We are no longer tied to traditional data center facilities or colocation markets. That doesn't mean we won't use them; it means we now have a choice. Of course this is only possible because of the internally developed cloud infrastructure, but we have freed ourselves from having to be bolted onto or into existing big infrastructure. It allows us to have an incredible amount of geo-distributed capacity at a very low cost point in terms of upfront capital and ongoing operational expense.

We need to maintain a portfolio of options for our products and services. In this case, having an indoor version of our capabilities ensures that our solution can live absolutely anywhere. This will allow our footprint, automation and all, to live inside any data center colocation environment or the interior of any office building anywhere around the planet, and retain the extremely low maintenance profile we were targeting from an operational cost perspective. In a sense you can think of it as "productizing" our infrastructure. Could we have just deployed racks of servers, network kit, etc. like we have always done? Sure. But by continuing to productize our infrastructure we continue to drive down our short-term and long-term infrastructure costs. In my mind, productizing your infrastructure is actually the next evolution in standardization. You can have infrastructure standards in place (server model, RAM, HD space, access switches, core switches, and the like), but until you get to that next phase of standardizing, automating, and "productizing" it into a discrete set of capabilities, you only get a partial win.
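As a purely hypothetical illustration (none of these names or numbers are AOL's actual specifications), the difference between a parts list and a "product" is that the product is a discrete, versioned, immutable unit of capability you order as a whole:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class InfraProduct:
    """A hypothetical 'productized' unit of infrastructure: ordered,
    deployed, and tracked as one versioned capability, not a parts list."""
    name: str
    version: str
    servers: int
    ram_gb_per_server: int
    disk_tb_per_server: int
    access_switches: int
    rated_requests_per_min: int

# An illustrative SKU; every field is an invented example value.
MICRO_DC_INDOOR = InfraProduct(
    name="micro-dc-indoor",
    version="1.0",
    servers=96,
    ram_gb_per_server=64,
    disk_tb_per_server=2,
    access_switches=4,
    rated_requests_per_min=65_000,
)
```

Because the spec is frozen, a deployed unit can't drift from its definition; you change capability by shipping a new version, which is what makes the automation and low-maintenance profile possible.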

Some people have asked me, "Why didn't you begin with the interior version to start with? It seems like it would be the easier one to accomplish." Indeed, I cannot argue with them; it would probably have been easier, as there were far fewer challenges to solve. You can make basic assumptions around where this kind of indoor solution would live, and reduce much of the complexity. I guess it all nets out to a philosophy of solving the harder problems first. Once you prove the more complicated use case, the easier ones come much faster. That is definitely the situation here.

While this new capability continues the success we are seeing in redefining the cost and operations of our particular engineering environments, the real challenge here (as with all sorts of infrastructure and cloud automation) is whether we can map the same success onto our applications and services working correctly in that space. On that note, I should have more to post soon. Stay tuned!

 

\Mm