On Micro Datacenters, Sandy, Supercomputing 2012, and Coding for Containerized Data Centers….

As everyone is painfully aware, last week the United States saw the devastation caused by Superstorm Sandy. My original intention was to talk about yet another milestone with our Micro Data Center approach. As the storm slammed into the East Coast, I felt it was probably a bad time to talk about achieving something significant, especially as people were suffering through the storm's aftermath. In fact, after the storm AOL kicked off an incredible supplies drive and sent truckloads of goods up to the worst of the affected areas.

So here we are, a week after the storm, and while people are still in need and suffering, it is clear that the worst is over and the cleanup and healing have begun. It turns out that Superstorm Sandy also allowed us to test another interesting case in the journey of the Micro Data Center, which I will touch on as well.

25% of ALL AOL.COM Traffic runs through Micro Data Centers

I have talked before about the potential value of our use of Micro Data Centers and the pure agility and economics the platform will provide for us. Up until this point we had used this technology in pockets. Think of our explorations as focusing on beta and demo environments. But that all changed in October, when we officially flipped the switch and began taking production traffic for AOL.com with the Micro Data Center. We are currently (and have been since flipping the switch) running about 25% of all traffic coming to our main web site on it. This is an interesting achievement in many ways. From a performance perspective, we are manually limiting the platform (it could do more!) to ~65,000 requests per minute and a traffic volume of about 280 Mbit/s. To date I haven't seen many people post performance statistics about applications in modular use, so hopefully this is relevant and interesting to folks in terms of the volume of load an approach such as this could take.

We celebrated this at a recent All-Hands, with an internal version of our MDC plugged in right there in the conference room. To prove our point we added it to the global pool of capacity for AOL.com and started taking production traffic at the conference facility. This demonstrates in large part the value, agility, and mobility a platform like this could bring to bear.
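For a sense of scale, the two caps quoted above can be converted into per-second and per-request terms. This is only back-of-the-envelope arithmetic, not a measurement: it assumes 1 Mbit = 1,000,000 bits and that all of the ~280 Mbit/s is attributable to those ~65,000 requests per minute, which is a simplification.

```python
# Figures quoted in the post; both are manually imposed caps, not platform limits.
REQUESTS_PER_MINUTE = 65_000
TRAFFIC_MBIT_PER_SEC = 280

# Convert the request rate to a per-second figure.
requests_per_second = REQUESTS_PER_MINUTE / 60

# Average traffic per request, assuming every bit belongs to one of those requests.
bits_per_request = TRAFFIC_MBIT_PER_SEC * 1_000_000 / requests_per_second
kilobytes_per_request = bits_per_request / 8 / 1000

print(f"{requests_per_second:.0f} requests per second")
print(f"{kilobytes_per_request:.1f} kB of traffic per request on average")
```

Under those assumptions the capped load works out to roughly 1,083 requests per second and about 32 kB of traffic per request.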

Scott Killian, AOL's Data Center guru, talks about the deployment of AOL's Micro Data Center. An internal version went 'live' during the talk.

 

As I mentioned before, Superstorm Sandy threw us another curveball as the hurricane crashed into the Mid-Atlantic. While Virginia was not hit anywhere near as hard as New York and New Jersey, there were incredible sustained winds, tumultuous rains, and storm-related damage everywhere. Through it all, our outdoor version of the MDC weathered the storm just fine and continued serving traffic for AOL.com without fail.

 

This kind of Capability is not EASY or Turn-Key

That’s not to say there isn’t a ton of work to do to get an application to work in an environment like this. Whatever level of the problem space you look at, whether it be DNS, load balancing, network redundancy, configuration management, underlying application-level timeouts, or system dependencies like databases and other information stores, the non-infrastructure work and coding is not insignificant. There is a huge amount of complexity in running a site like AOL.com: lots of interdependencies, sophistication, advertising-related collection and distribution, and the like. It’s safe to say that this is not as simple as throwing an Apache/Tomcat instance into a VM.

I have talked for quite a while about what Netflix engineers originally coined as Chaos Monkeys. The idea covers the ability, the development paradigm, or even the rogue processes that push your applications to survive significant infrastructure and application-level outages. It’s essentially taking the redundancy out of the infrastructure and putting it into the code. While extremely painful at the start, the long-term savings are proving hugely beneficial. For most companies, this is still something futuristic and very far out there. They may be beholden to software manufacturers and developers to start thinking this way, which may take a very, very long time. Infrastructure is the easy way to solve it. It may be easy, but it’s not cheap. Nor, if you care about the environmental angle, is it very ‘sustainable’ or green. Limit the infrastructure. Limit the waste. While we haven’t really thought about it in terms of rolling it up into our environmental positions, perhaps we should.

The point is that getting to this level of redundancy is going to take work, and to that end it will continue to be a regulator or anchor slowing down greater adoption of more modular approaches. But at least in my mind the future is set: directionally, it will be hard to ignore the economics of this type of approach for long. Of course, as an industry we need to start training or re-training developers to think in this kind of model, and to build code in a way that takes into account the Chaos Monkey potential out there.
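As a rough illustration of what "putting the redundancy into the code" can look like, here is a minimal sketch of application-level failover. It is not how AOL.com or Netflix actually implement it. Every name in it (`call_with_failover`, `make_backend`, `BackendError`, the pool member names) is hypothetical, and the backends are in-process fakes standing in for real sites so the example runs without a network.

```python
from typing import Callable, Sequence


class BackendError(Exception):
    """Raised by a (hypothetical) backend that is down or unreachable."""


class AllBackendsFailed(Exception):
    """Raised when no backend in the pool could serve the request."""


def call_with_failover(backends: Sequence[Callable[[str], str]], request: str) -> str:
    """Try each backend in order and return the first successful response."""
    errors = []
    for backend in backends:
        try:
            return backend(request)
        except BackendError as exc:
            # A dead site is treated as a normal event: note it and move on
            # to the next member of the pool.
            errors.append(exc)
    raise AllBackendsFailed(f"{len(errors)} backends failed for {request!r}")


def make_backend(name: str, healthy: bool) -> Callable[[str], str]:
    """Build a fake backend; healthy=False simulates a site knocked offline."""
    def backend(request: str) -> str:
        if not healthy:
            raise BackendError(f"{name} is unavailable")
        return f"{name} served {request}"
    return backend


# Hypothetical pool: the first member is "down", as if a Chaos Monkey hit it.
pool = [
    make_backend("mdc-outdoor", healthy=False),
    make_backend("mdc-indoor", healthy=True),
    make_backend("traditional-dc", healthy=True),
]

result = call_with_failover(pool, "/index.html")
print(result)  # mdc-indoor served /index.html
```

The request still succeeds even though the first site is gone, because the caller, rather than redundant hardware behind a single endpoint, knows how to route around the failure. A real deployment would also need timeouts, health checks, and the DNS and load-balancing work mentioned earlier.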

 

Want to see One Live?

We have been asked to provide an AOL Micro Data Center for the Supercomputing 12 conference next week in Salt Lake City, Utah, with our partner Penguin Computing. If you want to see one of our internal versions live and up close, feel free to stop by and take a look. Jay Moran (my Distinguished Engineer here at AOL) and Scott Killian (the leader of our data center operations teams) will be onsite to discuss the technologies and our use cases.

 

\Mm

Author: mmanos

Infrastructure at Scale Technologist and Cloud Aficionado.

7 thoughts on “On Micro Datacenters, Sandy, Supercomputing 2012, and Coding for Containerized Data Centers….”

    1. Thanks Kevin. If you are headed to 7×24 in Phoenix, I can share more. Or it's a quick jaunt up to SC12 in Salt Lake City to see an indoor version. 🙂

    1. Derrick, for a wide number of reasons we don't talk about how many we have deployed, but it's safe to say fewer than 10. What this does prove to us is that the technology is viable, and at a significant cost benefit over deploying traditional capacity. It's just more work upfront on the development side. But if you do it right, you only have to do it once, and you will be able to leverage that code over and over again.

  1. For those of you who will be at the Supercomputing show in Salt Lake this week, AOL’s Jay Moran and Scott Killian are speaking in the Penguin booth, 1217, at 11 am & 1 pm on Thursday. The internal version is on display too, as Mike noted above.

  2. The fact that 25% of all AOL traffic runs through micro data centers was quite surprising. It makes you wonder what the future of data management holds and if micro data centers will be the wave of the future.

    Great post.
