Archive for the ‘Cloud Infrastructure’ Category

 Scott Killian of AOL talks about the MicroDC

Last week I put up a post about how 25% of all AOL.com traffic is now running through our MicroDC infrastructure. There was a great follow-up post by James LaPlaine, our VP of Operations, on his blog Mental Effort, which goes into even greater detail. While many of the email inquiries I get have been about the technology itself, a surprisingly large share of the notes have been questions about how to make your software, applications, and development efforts ready for such an infrastructure, and what a realistic timeline for doing so would be.

The general response, of course, is that it depends. If you are a web-based platform or property focused solely on Internet-based consumers, or a firm that needs a diversified presence in different regions without the hefty price tag of renting and later taking down additional space, this may be an option. However, many enterprise applications have been written in a way that is highly dependent on localized infrastructure and short application latency, and they lack adequate scaling. So for more corporate data center applications this may not be a great fit. It will take some time for those big traditional application firms to truly build out their infrastructure to work in an environment like this (they may never do so). I suspect most will take an easier approach and try to ‘cloudify’ their own applications and run them within their own infrastructure or data centers under their control. This essentially will allow them to control the access portion of users’ needs, but continue to rely on the same kinds of infrastructure you might have in your own data center to support it. It’s much easier to build a web-based application which then connects to a traditional IT-based environment than to truly build out infrastructure capable of accommodating scale. I am happy to continue to answer questions as they come up, but as I had an overwhelming number of questions about this, I thought I would throw something quick up here that will hopefully help.

 

\Mm



As everyone has been painfully aware, last week the United States saw the devastation caused by Superstorm Sandy. My original intention was to talk about yet another milestone with our Micro Data Center approach. As the storm slammed into the East Coast, I felt it was probably a bad time to talk about achieving something significant, especially as people were suffering through the storm’s aftermath. In fact, after the storm AOL kicked off an incredible supplies drive and sent truckloads of goods up to the worst of the affected areas.

So, here we are a week after the storm, and while people are still in need and suffering, it is clear that the worst is over and the cleanup and healing have begun. It turns out that Super Storm Sandy also allowed us to test another interesting case in the journey of the Micro Data Center, which I will touch on below.

25% of ALL AOL.COM Traffic runs through Micro Data Centers

I have talked about the potential value of our use of Micro Data Centers and the pure agility and economics the platform will provide for us. Up until this point we had used this technology in pockets. Think of our explorations as focusing on beta and demo environments. But that all changed in October when we officially flipped the switch and began taking production traffic for AOL.COM with the Micro Data Center. We are currently (and have been since flipping the switch) running about 25% of all traffic coming to our main web site through it. This is an interesting achievement in many ways. First, from a performance perspective we are manually limiting the platform (it could do more!) to ~65,000 requests per minute and a traffic volume of about 280 Mbit/s. To date I haven’t seen many people post performance statistics about applications in modular use, so hopefully this is relevant and interesting to folks in terms of the volume of load an approach such as this could take. We celebrated this at a recent All-Hands, with an internal version of our MDC plugged in right in the conference room. To prove our point we added it to the global pool of capacity for AOL.com and started taking production traffic right there at the conference facility. This demonstrates in large part the value, agility and mobility a platform like this could bring to bear.
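For a rough sense of scale, the two throttle figures above can be converted into per-second and per-request terms. This is back-of-the-envelope arithmetic only. It assumes the roughly 280 Mbit/s is the traffic generated by those same ~65,000 requests per minute (the post does not say so explicitly) and that megabits are decimal.

```python
# Back-of-the-envelope conversion of the figures quoted in the post.
# Assumption (not stated in the post): the ~280 Mbit/s is the traffic
# generated by the same ~65,000 requests per minute.

REQUESTS_PER_MINUTE = 65_000
TRAFFIC_MBIT_PER_SEC = 280  # decimal megabits: 1 Mbit = 1,000,000 bits


def requests_per_second(requests_per_minute: float) -> float:
    """Convert a per-minute request rate into a per-second rate."""
    return requests_per_minute / 60


def avg_kilobytes_per_request(mbit_per_sec: float, req_per_sec: float) -> float:
    """Average payload per request in kB (1 kB = 1,000 bytes)."""
    bytes_per_sec = mbit_per_sec * 1_000_000 / 8
    return bytes_per_sec / req_per_sec / 1_000


rps = requests_per_second(REQUESTS_PER_MINUTE)
kb = avg_kilobytes_per_request(TRAFFIC_MBIT_PER_SEC, rps)
print(f"{rps:.0f} requests/s, about {kb:.1f} kB per request")
```

Under those assumptions the throttle works out to roughly 1,083 requests per second at an average of about 32 kB each.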

Scott Killian, AOL's Data Center guru, talks about the deployment of AOL's Micro Data Center. An internal version went 'live' during the talk.

 

As I mentioned before, Super Storm Sandy threw us another curveball as the hurricane crashed into the Mid-Atlantic.   While Virginia was not hit anywhere near as hard as New York and New Jersey, there were incredible sustained winds, tumultuous rains, and storm related damage everywhere.  Through it all, our outdoor version of the MDC weathered the storm just fine and continued serving traffic for AOL.com without fail. 

 

This kind of Capability is not EASY or Turn-Key

That’s not to say there isn’t a ton of work to do to get an application to work in an environment like this. If you take the problem space at its different levels, whether it be DNS, load balancing, network redundancy, configuration management, underlying application-level timeouts, or system dependencies like databases and other information stores, the non-infrastructure work and coding is not insignificant. There is a huge amount of complexity in running a site like AOL.com: lots of interdependencies, sophistication, advertising-related collection and distribution, and the like. It’s safe to say that this is not as simple as throwing up an Apache/Tomcat instance in a VM.

I have talked for quite a while about what Netflix engineers originally coined as Chaos Monkeys: the ability, development paradigm, or even rogue processes that force your applications to survive significant infrastructure and application level outages. It’s essentially taking the redundancy out of the infrastructure and putting it into the code. While extremely painful at the start, the long-term savings are proving hugely beneficial. For most companies, this is still something futuristic, very far out there. They may be beholden to software manufacturers and developers to start thinking this way, which may take a very, very long time. Infrastructure is the easy way to solve it. It may be easy, but it’s not cheap. Nor, if you care about the environmental angle on it, is it very ‘sustainable’ or green. Limit the infrastructure. Limit the waste. While we haven’t really thought about it in terms of rolling it up into our environmental positions, perhaps we should.

The point is that getting to this level of redundancy is going to take work, and to that end it will continue to be a regulator or anchor slowing down greater adoption of more modular approaches. But at least in my mind the future is set; directionally, it will be hard to ignore the economics of this type of approach for long. Of course, as an industry we need to start training or re-training developers to think in this kind of model, and to build code in such a way that it takes into account the Chaos Monkey potential out there.
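To make "redundancy in the code" a little more concrete, here is a minimal sketch of the idea, not a description of how AOL or Netflix actually implement it. The replica names, the `call_with_failover` helper, and the `chaos_round` function are all hypothetical. One simulated instance is killed at random, and the caller is expected to serve the request from a surviving replica anyway.

```python
import random
from typing import Callable, List, Sequence


class AllReplicasFailed(Exception):
    """Raised when no replica could serve the request."""


def call_with_failover(replicas: Sequence[Callable[[], str]]) -> str:
    """Try each replica in turn and return the first successful response."""
    errors: List[Exception] = []
    for replica in replicas:
        try:
            return replica()
        except ConnectionError as exc:
            errors.append(exc)  # remember the failure, move on to the next replica
    raise AllReplicasFailed(f"{len(errors)} replicas failed")


def make_replica(name: str, alive: bool) -> Callable[[], str]:
    """Build a fake replica; a 'dead' one behaves like a killed instance."""
    def handler() -> str:
        if not alive:
            raise ConnectionError(f"{name} is unreachable")
        return f"served by {name}"
    return handler


def chaos_round(names: Sequence[str], rng: random.Random) -> str:
    """Kill one randomly chosen replica, then serve a request anyway."""
    victim = rng.choice(list(names))
    replicas = [make_replica(n, alive=(n != victim)) for n in names]
    return call_with_failover(replicas)


if __name__ == "__main__":
    # Hypothetical site names, for illustration only.
    print(chaos_round(["mdc-dulles", "dc-east", "dc-west"], random.Random()))
```

The design choice worth noticing is that the failure handling lives in the caller rather than in duplicated hardware, which is the trade described above: more work in the application, less infrastructure underneath it.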

 

Want to see One Live?


We have been asked to provide an AOL Micro Data Center for the Supercomputing 12 conference next week in Salt Lake City, Utah with our partner Penguin Computing. If you want to see one of our internal versions live and up close, feel free to stop by and take a look. Jay Moran (my Distinguished Engineer here at AOL) and Scott Killian (the leader of our data center operations teams) will be onsite to discuss the technologies and our use cases.

 

\Mm


I thought I would start my first post by addressing the second New York Times article first. Why? Because it specifically mentions activities and messages sourced from me at the time when I was responsible for running the Microsoft Data Center program. I will try to track the timeline mentioned in the article with my specific recollections of the events. As Paul Harvey used to say, so that you can know the ‘REST of the STORY’.

I remember my first visit to Quincy, Washington. It was a bit of a road trip for me and a few other key members of the Microsoft site selection team. We had visited a few of the local communities and power utility districts, doing our due diligence on the area at large. Our ‘Heat map’ process had led us to Eastern Washington state, not very far (just a few hours) from the ‘mothership’ in Redmond, Washington. It was a bit of a crow-eating exercise for me, as just a few weeks earlier I had proudly exclaimed that our next facility would not be located on the West Coast of the United States. We were developing an interesting site selection model that would categorize and weight areas around the world. It would take in FEMA disaster data, fault zones, airport and logistics information, location of fiber optic and carrier presence, workforce distributions, regulatory and tax data, water sources, and power. This was going to be the first real construction effort undertaken by Microsoft. The cost of power was definitely a factor, as the article calls out. But just as important was the generation mix of the power in the area, in this case predominantly hydroelectric, with a low to no carbon footprint (rivers, I came to find out, actually give off carbon emissions). Regardless, the generation mix was and would continue to be a hallmark of the program’s site selection while I was there. The crow-eating exercise began when we realized that the ‘greenest’ area per our methodology was actually located in Eastern Washington along the Columbia River.
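A model that "categorizes and weights" areas can be pictured as a weighted score per candidate site. The sketch below is purely illustrative: the factor names loosely follow the inputs listed above, but the weights, the 0 to 10 scoring scale, and the function names are assumptions, since the post does not disclose how the actual model worked.

```python
from typing import Dict, List

# Hypothetical weights; none of these numbers come from the actual model.
# Each factor is assumed to be scored 0-10, with higher meaning better.
WEIGHTS: Dict[str, float] = {
    "disaster_risk": 0.15,       # FEMA disaster data, fault zones
    "logistics": 0.10,           # airport and logistics information
    "fiber_and_carriers": 0.15,
    "workforce": 0.10,
    "tax_and_regulation": 0.15,
    "water": 0.05,
    "power_cost": 0.15,
    "generation_mix": 0.15,      # e.g. share of hydroelectric power
}


def site_score(scores: Dict[str, float], weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted average of a site's factor scores, normalized by total weight."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    total_weight = sum(weights.values())
    return sum(weights[f] * scores[f] for f in weights) / total_weight


def rank_sites(candidates: Dict[str, Dict[str, float]]) -> List[str]:
    """Return candidate names ordered from best to worst score."""
    return sorted(candidates, key=lambda name: site_score(candidates[name]), reverse=True)
```

Plotting such scores by region gives something like the ‘heat map’ described in the post, and shifting weight toward `generation_mix` is the kind of change that would pull a hydro-heavy region to the top.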

We had a series of meetings with real estate folks, the local Grant County PUD, and the economic development folks of the area. Back in those days the secrecy around who we were was paramount, so we kept our identities and that of our company secret, like geeky secret agents on an information-gathering mission. We would not answer questions about where we were from, who we were, or even our names. We ‘hid’ behind third-party agents who took everyone’s contact information and acted as brokers of information. That was early days… the cloak and dagger would soon come out of the process, as being known became a more advantageous tool in tax negotiations with local and state governments.

During that trip we found the perfect parcel of land: 75 acres with great proximity to local substations, just down line from the dams on the nearby Columbia River. It was November 2005. As we left that day and headed back, it was clear that we felt we had found site selection gold. As we started to prepare a purchase offer we got wind that Yahoo! was planning on taking a trip out to the area as well. As the local folks seemingly thought that we were a bank or large financial institution, they wanted to let us know that someone on the Internet was interested in the area as well. This acted like a lightning rod, and we raced back to the area and locked up the land before Yahoo had a chance to leave the Bay Area. In these early days the competition was fierce. I have tons of interesting tales of cloak and dagger intrigue between Google, Microsoft, and Yahoo. While it was work, there was definitely an air of something big on the horizon, a sense that we were all at the beginning of something. In many ways, many of the technology professionals involved, regardless of company, forged some deep relationships and competition with each other.

Manos on the Bean Field, December 2005

The article talks about how the ‘Gee-Whiz moment faded pretty fast’. While I am sure that it faded in time (as all things do), I also recall the huge increase in local business as thousands of construction workers descended upon this wonderful little town, the tours we would give local folks and city council dignitaries, and a spirit of truly working together. Then of course there was the ultimate reduction in property taxes resulting from even our first building, and an increase in home values to boot at the time. It’s an oft-missed benefit that I am sure the town of Quincy and Grant County have continued to enjoy as the data center cluster added Yahoo, Sabey, IAC, and others. I warmly remember the opening day ceremonies and ribbon cutting and a sense of pride that we did something good. Corny? Probably, but that was the feeling. There was no talk of generators. There were no picket signs. In fact the EPA of Washington state had no idea how to deal with a facility of this size, and I remember openly working in partnership with them. That of course eventually gave way to the realities of life. We had a business to run, the city moved on, and concerns eventually arose.

The article calls out a showdown between Microsoft and the Public Utility District (PUD) over a fine for missing a capacity forecasting target. As this happened well after I left the company, I cannot really comment on that specific matter. But I can see how that forecast could miss. Projecting power usage months ahead is more than a bit of science mixed with art. It gets into the complexity of understanding capacity planning in your data centers. How big will certain projects grow? Will they meet expectations or fall short? New product launches can be duds or massive successes. All of these things go into a model to try to forecast the growth. If you think this is easy, I would submit that NO ONE in the industry has been able to master the crystal ball. I would also submit that most small companies haven’t been able to figure it out either. At least at companies like Microsoft, Google, and others you can start using the law of averages and of large numbers to get close. But you will always miss, either too high or too low. Guess too low and you impact internal budgeting figures and run rates. Not good. Guess too high and you could miss contractual minimums with utility companies and be subject to fines.
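The two ways of missing can be made explicit with a small sketch. Everything here is hypothetical: the function name, the megawatt figures, and the idea of a single contractual minimum are illustrative assumptions, not the terms of any real utility contract.

```python
from typing import Dict


def forecast_miss(forecast_mw: float, actual_mw: float,
                  contract_minimum_mw: float) -> Dict[str, float]:
    """Summarize how a power forecast missed, in either direction.

    over_forecast_mw  > 0 means demand exceeded the forecast (a budget and
                      run-rate problem).
    below_minimum_mw  > 0 means actual draw fell short of the contracted
                      minimum (the case that can trigger a fine).
    """
    error_pct = (actual_mw - forecast_mw) / forecast_mw * 100
    return {
        "error_pct": error_pct,
        "over_forecast_mw": max(0.0, actual_mw - forecast_mw),
        "below_minimum_mw": max(0.0, contract_minimum_mw - actual_mw),
    }


# Hypothetical numbers: a 40 MW forecast backed by a 36 MW contractual minimum.
guessed_too_high = forecast_miss(forecast_mw=40, actual_mw=30, contract_minimum_mw=36)
guessed_too_low = forecast_miss(forecast_mw=40, actual_mw=48, contract_minimum_mw=36)
print(guessed_too_high)
print(guessed_too_low)
```

In the first case the site draws 6 MW less than it committed to, and in the second it needs 8 MW more than it budgeted for. Neither outcome is free, which is why the forecast is always a balancing act.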

In the case mentioned in the article, the approach taken if true would not be the smartest method especially given the monthly electric bill for these facilities. It’s a cost of doing business and largely not consequential at the amount of consumption these buildings draw. Again, if true, it was a PR nightmare waiting to happen.

At this point the article breaks out and talks about how the Microsoft experience would feel more like dealing with old-school manufacturing rather than ‘modern magic’ and diverts to a situation at a Microsoft facility in Santa Clara, California.

The article references that this situation is still being dealt with inside California, so I will not go into any detailed specifics, but I can tell you something does not smell right in the state of Denmark, and I don’t mean the diesel fumes. Microsoft purchased that facility from another company. As the usage of the facility ramped up to the levels it was certified to operate at, operators noticed a pretty serious issue developing. While the building was rated to run at a certain load, it was clear that the underground feeders were undersized, and the by-product could have polluted the soil and gotten into the water system. This was an inherited problem, and Microsoft did the right thing and took the high road to remedy it. It is my recollection that all sides were clearly in the know about the risks, and agreed to the generator usage whenever needed while the larger issue was fixed. If this has come up as an ‘air quality issue’, I personally would guess that there are politics at play. I’m not trying to be an apologist, but if true, it goes to show that no good deed goes unpunished.

At this point the article cuts back to Quincy. It’s a great town, with great people. To some degree it was the winner of the Internet Jackpot lottery because of the natural tech resources it is situated on. I thought that the figures quoted around taxes were an interesting component missed in much of the reporting I read.

“Quincy’s revenue from property taxes, which data centers do pay, has risen from $815,250 in 2005 to a projected $3.6 million this year, paying for a library and repaved streets, among other benefits, according to Tim Snead, the city administrator.”

The last bag of beans harvested in Quincy

As I mentioned in yesterday’s post, my job is ultimately to get things done and deliver results. When you are in charge of a capital program as large as Microsoft’s program was at the time, your mission is clear: deliver the capacity and start generating value for the company. As I was presented the last crop of beans harvested from the field at the ceremony, we still had some ways to go before all construction and capacity was ready to go. One of the key missing components was the delivery and installation of a transformer for one of the substations required to bring the facility up to full service. The article notes that I was upset that the PUD was slow to deliver the capacity. That capacity, I would add, was promised along a certain set of timelines, and commitments were made and money was exchanged based upon those promises. As you can see from the article, the money exchanged was not insignificant. If Mr. Culbertson felt that I was a bit arrogant in demanding a follow-through on promises and commitments after monies and investments were made in a spirit of true partnership, my response would be ‘Welcome to the real world’. As far as being cooperative, by April the construction had already progressed 15 months since its start. Hardly a surprise, and if it was, perhaps the 11 acre building and large construction machinery driving around town could have been a clue to the sincerity of the investment and timelines. Harsh? Maybe. Have you ever built a house? If so, then you know you need to make sure that the process is tightly managed and controlled to ensure you make the delivery date.

The article then goes on to talk about the permitting for the diesel generators. By the Department of Ecology’s own admission, “At the time, we were in scramble mode to permit our first one of these data centers.” The article also states that:

Although emissions containing diesel particulates are an environmental threat, they were not yet classified as toxic pollutants in Washington. The original permit did not impose stringent limits, allowing Microsoft to operate its generators for a combined total of more than 6,000 hours a year for “emergency backup electrical power” or unspecified “maintenance purposes.”

At the time all this stuff was so new, everyone was learning together. I simply don’t buy that this was some kind of Big Corporation versus Little Farmer thing. I cannot comment on the events of 2010 where Microsoft asked for itself to be disconnected from the grid. Honestly that makes no sense to me, even if the PUD was working on the substation, and I would agree with the article’s ‘experts’.

Well, that’s my take on my recollection of events during those early days of the Quincy build out as it relates to the articles. Maybe someday I will write a book, as the process and adventures of those early days of the birth of Big Infrastructure were certainly exciting. The bottom line is that the data center industry is amazingly complex and the forces in play are as varied as technology to politics to people and everything in between. There is always a deeper story. More than meets the eye. More variables. Decisions are never black and white and are always weighed against a dizzying array of forces.

\Mm


I have been following with some interest the series of articles in the New York Times by Jim Glanz. The series premiered on Sunday with an article entitled Power, Pollution and the Internet, which was followed up today with a deeper dive into some specific examples. The examples today (Data Barns in a Farm Town, Gobbling Power and Flexing Muscle) focused on the Microsoft program, a program with which I have more than some familiarity since I ran it for many years. After just two articles, reading the feedback in comments, and seeing some of the reaction in the blogosphere, it is very clear that there is more than a significant amount of misunderstanding, over-simplification, and a lack of detail that I think is probably important. In writing about it, I want to be very clear that I am not representing AOL, Microsoft, or any other organization; these are my own personal observations and opinions.

As mentioned in both of the articles, I was one of hundreds of people interviewed by the New York Times for this series. In those conversations with Jim Glanz a few things became very apparent. First, he has been on this story for a very long time, at least a year. As far as journalists go, he was incredibly deeply engaged and armed with tons of facts. In fact, he had a trove of internal emails, meeting minutes, and a mountain of data from government filings that must have taken him months to collect. Secondly, he had the very hard job of turning this very complex space into a format where the uneducated masses can begin to understand it. Therein lies much of the problem: this is an incredibly complex space to try to communicate to those not tackling it day to day, or who do not even understand the technological and regulatory forces involved. This is not an area or topic that can be sifted down to a sound bite. If this were easy, there really wouldn’t be a story, would there?

At issue for me is that the complexity of the powers involved seems to get scant attention, with the articles aiming instead for the larger “Data Centers are big bad energy vampires hurting the environment” story. It’s clearly evident reading through the comments on both of the articles so far, which claim that the sources and causes range from poor web page design to conspiracies by governments or multi-national companies to corner the market on energy.

So I thought I would take a crack, article by article, at shedding some light (the kind that doesn’t burn energy) on some of the topics, and just call out where I disagree completely. In full transparency, the “Data Barns” article doesn’t necessarily paint me as a “nice guy”. Sometimes I am. Sometimes I am not. I am not an apologist, nor do I intend to be one in this post. I am paid to get stuff done. To execute. To deliver. Quite frankly the PUD missed deadlines (the progenitor event to my email quoted in the piece) and sometimes people (even utility companies) have to live in the real world of consequences. I think my industry reputation, work, and fundamental stances around driving energy efficiency and environmental conservancy in this industry can stand on their own, both publicly and with those who have worked for me.

There is an inherent irony here in that these articles were published both in print and electronically to maximize the audience and readership. To do that, these articles made “multiple trips” through a data center, and ultimately reside in one (or more). They seem to suggest that keeping things online is bad, which seems to go against the availability of, and need for, the articles themselves. Doesn’t the New York Times expect to make these articles available online for people to read? They’re posted online already. Perhaps they expect that their microfiche experts would be able to serve the demand for these articles in the future? I do not think so.

This is a complex eco-system of users, suppliers, technology, software, platforms, content creators, data (both BIG and small), regulatory forces, utilities, governments, financials, energy consumption, people, personalities, politics, company operating tenets, and community outreach, to name a very few. On top of managing through all these variables, data center operators also have to keep things running with no downtime.

\Mm


Back in July, I announced AOL’s Data Center Independence Day with the release of our new ‘Micro Data Center’ approach.   In that post we highlighted the terrific work that the teams put in to revolutionize our data center approach and align it completely to not only technology goals but business goals as well.   It was an incredible amount of engineering and work to get to that point and it would be foolish to think that the work represented a ‘One and Done’ type of effort.  

So today I am happy to announce the roll out of a new capability for our Micro-DC – An indoor version of the Micro-DC.

Aol MDC-Indoor2

While the first instantiations of our new capability were focused on outdoor environments, we were also hard at work at an indoor version with the same set of goals.   Why work on an indoor version as well?   Well you might recall in the original post I stated:

We are no longer tied to traditional data center facilities or colocation markets. That doesn’t mean we won’t use them, it means we now have a choice. Of course this is only possible because of the internally developed cloud infrastructure, but we have freed ourselves from having to be bolted onto or into existing big infrastructure. It allows us to have an incredible amount of geo-distributed capacity at a very low cost point in terms of upfront capital and ongoing operational expense.

We need to maintain a portfolio of options for our products and services. In this case, that means having an indoor version of our capabilities to ensure that our solution can live absolutely anywhere. This will allow our footprint, automation and all, to live inside any data center colocation environment or the interior of any office building anywhere around the planet, and retain the extremely low maintenance profile that we were targeting from an operational cost perspective. In a sense you can think of it as “productizing” our infrastructure. Could we have just deployed racks of servers, network kit, etc. like we have always done? Sure. But by continuing to productize our infrastructure we continue to drive down our short-term and long-term infrastructure costs. In my mind, productizing your infrastructure is actually the next evolution in standardization of your infrastructure. You can have infrastructure standards in place: server model, RAM, HD space, access switches, core switches, and the like. But until you get to that next phase of standardizing, automating, and ‘productizing’ it into a discrete set of capabilities, you only get a partial win.

Some people have asked me, “Why didn’t you begin with the interior version to start with? It seems like it would be the easier one to accomplish.” Indeed I cannot argue with them; it would probably have been easier, as there were far fewer challenges to solve. You can make basic assumptions about where this kind of indoor solution would live, and reduce much of the complexity. I guess it all nets out to a philosophy of solving the harder problems first. Once you prove the more complicated use case, the easier ones come much faster. This is definitely the situation here.

While this new capability continues the success we are seeing in re-defining the cost and operations of our particular engineering environments, the real challenge here (as with all sorts of infrastructure and cloud automation) is whether or not we can achieve similar success in getting our applications and services to work correctly in that space. On that note, I should have more to post soon. Stay tuned!

 

\Mm


Yesterday we celebrated Independence Day here in the United States.   It’s a day where we embrace the freedoms we enjoy as a country, look back on where we have come, and celebrate the promise of the future.   Yesterday was also a different kind of Independence Day for my teams at AOL.  A Data Center Independence Day, if you will. 

You may or may not have been following the progress of the work that we have been doing here at AOL over the last 14 or so months, but the pace of change has been simply breathtaking. One of the first things I did when I entered the company was deeply review all aspects of Operations, from Data Centers to Network Engineering to the engineering teams supporting the products and services, and everything in between. The net of the exercise was that AOL was probably similar to most companies out there in terms of technology mix, from the CRUFT that I mentioned in a previous post to the latest technologies. There were some incredible technologies built over the last three decades, some outdated processes and procedures, and, if I am honest, traces of a culture where the past had more meaning than the present or future.

In a very short period of time all of that changed. We aggressively made changes to the organization, re-aligned priorities, and perhaps most of all created and defined a powerful collection of changes and evolutions we would need to bring about on very aggressive timelines. These changes were part of a defined Technology Roadmap that broke the work we needed to accomplish into three categories. The first category focused on the internal technical challenges and the tools we needed to build to enhance our own internal efficiencies. The second category focused on the technical challenges and aggressive things we could do to enhance and bring greater scalability to our products and services. This would include things like additional services and technology suites for our internally developed cloud infrastructure, and other items that would allow for more rapid delivery of our products and services. The last category was for the incredibly aggressive “wish list” types of changes: items that could be so disruptive, so incredibly game-changing for us, that they could redefine our work on the whole. In fact we named this group of work “Nibiru” after a mythical planet that is said to cross into our solar system, wreak havoc, and bring about great change.

On July 4, 2012, one of our Nibiru items arrived, and I am ecstatic to state that we achieved our “Data Center Independence Day”. Our primary “Nibiru” goal was to develop and deliver a data center environment without the need of a physical building. The environment needed to require as little physical “touch” as possible and allow us the ultimate flexibility in terms of how we delivered capacity for our products and services. We called this effort the Micro Data Center. If you think about the number of things that need to change to evolve to this type of strategy, it’s a bit mind-boggling.


Here are just a few of the things we had to look at, change, and automate to even make this kind of achievement possible:

  • Developing an entirely new Technology Suite and the ability to deliver that capacity anywhere in the world with minimal to no staffing.
  • Delivering extremely dense compute capacity (think the latest technology) to give us the longest possible use of these assets once deployed into the field.
  • The ability to deliver a “Micro Data Center” anywhere on the planet regardless of temperature and humidity conditions.
  • The ability to support, maintain, and administer remotely.
  • The ability to fit into the power envelope of a normal office building.
  • Participation in our cloud environment and capabilities.
  • The processes by which these facilities are maintained and serviced.
  • and much much more…

In my mind, it’s one thing to claim a technical achievement; it’s quite another to operationalize that achievement and make the process of supporting it repeatable. That’s my measure as to when you can REALLY declare victory. Science experiments don’t count. It has to just plain work. To that end our first “beta” site for the technology was the AOL campus in Dulles, Virginia. Out on a lonely slab of concrete in the back of one of the buildings our future has taken shape.

Thanks in part to a lot of the work going on in the data center containerization space, we were able to jump start much of the work in a relatively quick fashion. In fact the pace set by the Data Center and Technology Operations teams in delivering this achievement is more than a bit astounding. Most, if not all, of the existing AOL Data Centers would fall somewhere around a traditional Tier III / Tier II Uptime Institute definition. The teams really pushed way outside their comfort zones to deliver some incredible evolutions in a very short period of time. Of course there were steps along the way to get here. But those steps now seem to be in double time. A few months back we announced the launch of ATC, our first completely automated facility. The work that went into ATC was foundational to our achievement yesterday. It allowed us to really start working on the hard stuff first. That is to say, the ‘Operationalization’ of these kinds of environments. It set the stage for how we could move to this next tier of evolution. Below is a summary of some of the achievements of our ATC launch, but if you are curious about the specifics of our work there feel free to click the ‘Breaking the Chrysalis’ post I did at that time. You can see how the work that we have been driving in our own internal cloud environments, the changes in operational procedure, and the change in thought are additive and fundamental to our latest achievement. It’s especially interesting to note that with all of the blips and hiccups occurring in the ‘cloud industry’, like the leap second and the terrible storms on the East Coast this week which affected many data centers, ATC, our completely unmanned facility, just kept humming along with no issues (to be fair, our traditional facilities had no issues either), even though much of the initial negative feedback we had received was based solely around the reliability of such moves.
It goes to show how important engineering FOR Operation is.  For AOL we have built this in from the start.

What does this actually buy AOL?

Ok, we stuck some computers in a box and we made sure it requires very little care and feeding – what does this buy us? Quite a bit actually. Jay Moran, the Distinguished Engineer who was in charge of driving this effort, is always quick to point out that the problem space here is not just about the technology. It has to be a marriage with the business side as well. Obviously the inherent flexibility of the design allows us a greater number of places around the planet we can deploy capacity to, and that in and of itself is pretty revolutionary. We are no longer tied to traditional data center facilities or colocation markets. That doesn’t mean we won’t use them; it means we now have a choice. Of course this is only possible because of the internally developed cloud infrastructure, but we have freed ourselves from having to be bolted onto or into existing big infrastructure. It allows us to have an incredible amount of geo-distributed capacity at a very low cost point in terms of upfront capital and ongoing operational expense. This is a huge game changer. So much so, allow me to do a bit of the ‘back of the napkin math’ with you. Let’s call our global capacity in terms of compute, storage, etc. that we have today in our traditional environments the Total Compute Capability, or TCC. It’s essentially the bandwidth for the work that we can get done. Inside the cost for TCC you have operating costs such as power, lease costs, Data Center facility maintenance costs, support staff, etc. You additionally have the depreciation for the facilities themselves (or the specific buildouts, if colocating), the server and other equipment depreciation, and the rest. Let’s call that baseline cost X. The Micro Data Center strategy, built out with our latest, most dense server standards and infrastructure, would allow us to have 5X the amount of total TCC at less than 10% of the cost and physical footprint.
If you think about how this will allow us to aggregate and grow over time it ultimately drives us to a VERY LOW operational cost structure for delivering our products and services.   Additionally it positions us for the future in very significant ways.

  • It redefines software architecture for greater resiliency
  • It allows us an incredibly flexible platform for driving and addressing privacy laws, regulatory oversight, and other such concerns allowing us to respond rapidly.
  • It further reduces energy consumption and carbon emissions (important as taxation evolves around the world, as well as for ongoing operational costs)
  • Gives us the ability to drive Edge Computing delivery to potentially bypass CDNs for certain content.
  • Gives us the capability to drive ‘Community-in-a-box’, whereby we can quickly launch new products in new markets, quickly expand existing footprints like Patch on a low cost but still hyper-local platform, and give the Huffington Post a platform to rapidly partner and enter new markets with minimal turn-up costs.
  • The fact that the technology mix in our SKUs is composed of compute, storage, and network capacity maximizes the number of products and services we can deploy to it.
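
The back-of-the-napkin math above can be made concrete with a small sketch. It assumes only the two figures quoted in the post (5X the TCC, at less than 10% of baseline cost X); the function name and parameters are hypothetical and are not part of any AOL tooling. Treating 10% as the upper bound on cost, it works out what that implies for cost per unit of capacity.

```python
# Illustrative only: the names below are assumptions made for this sketch.
# The 5x capacity and 10% cost figures are the ones quoted in the post.

def unit_cost_ratio(capacity_multiple: float, cost_fraction: float) -> float:
    """Return the new cost per unit of TCC as a fraction of the baseline unit cost.

    capacity_multiple: new capacity relative to baseline TCC (5.0 means 5X).
    cost_fraction: new total cost relative to baseline cost X (0.10 means 10%).
    """
    if capacity_multiple <= 0:
        raise ValueError("capacity_multiple must be positive")
    if cost_fraction < 0:
        raise ValueError("cost_fraction must not be negative")
    # Baseline unit cost is X / TCC; the new unit cost is
    # (cost_fraction * X) / (capacity_multiple * TCC), so X and TCC cancel.
    return cost_fraction / capacity_multiple


ratio = unit_cost_ratio(capacity_multiple=5.0, cost_fraction=0.10)
print(f"Cost per unit of TCC: at most {ratio:.0%} of baseline")
print(f"Improvement per unit of TCC: at least {1 / ratio:.0f}x")
```

Because the post says "less than 10%", the 2% unit cost (a 50x improvement per unit of capacity) is a ceiling rather than an exact figure.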

As Always, It’s Really About the People

I cannot let a post about this huge win for us go by without mentioning the teams involved in delivering this capability. This is not just a win for AOL, or to a lesser degree for the industry at large as another proof point that it can evolve if it puts its mind to changing; it is above all a win for the Technology Teams at AOL. When I was first approached about joining AOL, my slightly sarcastic and comedic response was probably much like yours – ‘Are they still around?’ But the fact of the matter is that AOL has a vision of where they want to go, and what they want to be. That was compelling for me personally, compelling enough for me to make the move. What has truly amazed me however is the dedication and tenacity of its employees. These achievements would not be possible without the outright aggressiveness with which the organization has moved the company forward. It’s always hard to assess from the outside just how hard an effort is to achieve internally. In the case of our Micro Data Center strategy, the teams faced just about every kind of barrier to delivering this capacity. Every kind of excuse to not make it, or even not to try. They put all of those things aside and just plain executed. If you allow me a small moment of bravado – not only did my teams simply kick ass, they did it in a way that moved the needle for the company, and in my mind once again catapulted themselves to the forefront of operations and technology at scale. We still have a bunch of Nibiru projects to deliver, so my guess is we haven’t heard the last of these big wins.

\Mm


Site Selection can be a tricky thing. You spend a ton of time upfront looking for that perfect location: weighing the confluence of dozens of criteria, digging through fiber maps, and looking at real estate, income and other state taxes. Even the best laid plans and most thoughtful of approaches can be waylaid by changes in government, the emergence of new laws, and other regulatory changes which can put your selection at risk. I was recently made aware of yet another cautionary item you might want to pay attention to: Pay to Play laws and budget-challenged states.

As many of my frequent readers know, I am from Chicago. In Chicago, and Illinois at large, “Pay to Play” has a much different connotation than the topic I am about to bring up. In fact the Chicago version broke out into an all-out national and international scandal. There is a great book about it if you are interested, aptly entitled Pay to Play.

The Pay to Play that I am referring to is an emerging set of regulations and litigation techniques that require companies to pay tax bills upfront (without any kind of recourse or mediation), which then forces companies to litigate to try to recover those taxes if they are unfair. Increasingly I am seeing this in states where budgets are challenged and governments looking for additional funds are targeting Internet based products and services. In fact, I was surprised to learn that AOL has been going through this very challenge. While I will not comment on the specifics of our case (it’s not specifically related to Data Centers anyway), it may highlight potential pitfalls and longer term items to take into account when performing Data Center Site Selection. You can learn more about the AOL case here, if you are interested.

For me it highlights that a lack of understanding of Internet services by federal and local governments, combined with a lack of inhibition in aggressively pursuing revenue despite that lack of understanding, can be dangerous and impactful to companies in this space. This can pose real dangers, especially when selecting a site for a facility. These types of challenges can come into play whether you are building your own facility, selecting a colocation facility and hosting partner, or, at a stretch, even where your cloud provider may have located its facility.

It does raise the question of whether you have checked into the financial health of the states you may be hosting your data and services in. Have you looked at the risk that this may pose to your business? It may be something to take a look at!

 

\Mm



Today marked the closing set of sessions for DataCentres2012 and my keynote session to the attendees. After sitting through a series of product, technology, and industry trend presentations over the last two days I was feeling that my conversation would at the very least be something different. Before I get to that, I wanted to share some observations from the morning.

It all began with an interesting run-down of the Data Center and infrastructure industry trends across Europe from Steve Wallage of The BroadGroup. It contained some really compelling information and highlighted some interesting divergence between the European market and the US market in terms of adoption and trends of infrastructure. It looks like they have a method for those interested to get their hands on the detailed data (for purchase). The part I found particularly interesting was the significant slowdown of the Wholesale data center market across Europe while Colocation providers continued to do well. Additionally, the percentages of change within the customer base of those providers by category were compelling and demonstrated a fundamental shift and move of content related customers across the board.

This presentation was followed by a panel of European Thought Leaders made up mostly of those same colocation providers. Given the presentation by Wallage I was expecting some interesting data points to emerge. While there was a range of ideas and perspectives represented by the panel, I have to say it really got me worked up, and not in a good way. In many ways I felt the responses from many (not all) on the panel highlighted a continued resistance to change in thinking around everything from efficiency to technology approach. It represented the things I despise most about our industry at large. Namely the slow adoption of change. The warm embrace of the familiar. The outright resistance to new ideas. At one point, a woman in the front row, whom I believe was from Germany, got up to ask whether the panelists had any plans to move their facilities outside of the major metros. She referenced Christian Belady’s presentation around the idea of Data as Energy and remote locations like Quincy, Washington or Luleå, Sweden. She referred to the overall approach of thinking differently as quite visionary. Now the panel could easily have referred to the fact that companies like Microsoft, Google, Facebook and the like have much greater software level control than a colo provider could provide. Or perhaps they could have referenced that most of their customers are limited by distance to existing infrastructure deployments due to inefficiencies in commercial or custom internally deployed applications, such as databases architected for in-rack or in-facility response times. They did at least reference that most customers tend to be server huggers and want their infrastructure close by.

Instead the initial response was quite strange in my mind. It was to go after the idea that any of this was “innovative”, implying that nothing was really innovative about what Microsoft had done, and that the fact that they built a “mega data center” in Dublin shows that nothing innovative is really happening. Really? The adoption of 100% Air Side economization is something everyone does? The deployment of containerized compute capacity is run of the mill? The concepts around the industrialization of compute are old hat? I had to do a mental double take and question whether they were even listening during ANY of the earlier sessions. Don’t get me wrong, I am not trying to be an apologist for the Microsoft program; in fact there are some tenets of the program I find myself not in agreement with. However, you cannot deny that they are doing VERY different things. It illustrated an interesting undercurrent I felt during the entire event (and maybe even our industry). I definitely got the sensation of a growing gap between users’ requirements, forward roadmaps, and desires on one side and what manufacturers and service providers are providing on the other. This panel, and a previous panel on modularization, highlighted these gulfs pretty clearly. At a minimum I walked away with an interesting new perspective on some of the companies represented.

It was then time for me to give my talk. Every discussion up until this point had really focused on technology or industry trends. I was going to talk about something else. Something more important. The one thing seemingly missing from the entire event. That is – the people attending. All the technology in the world, all of the understanding of the trends in our industry, are nothing unless the people in the room are willing to act. Willing to step up and take active roles in their companies to drive strategy. As I have said before – to get out of the basement and into the penthouse. The pressures on our industry and our job roles have never been more complicated. So I walked through regulations, technologies, and cloud discussions. Using the work that we did at AOL as a backdrop and example, I really tried to drive home my main point: that our industry, and specifically the people doing all the work, are moving to a role of managing a complex portfolio of technologies, contracts, and a continuum of solutions. Gone are the days when we could hide, sheltered in our data center facilities. We need to let go of our resistance to change and evolve with the industry, or it will evolve around us. I walked through specific examples of how AOL has had to broaden its own perspective and approach to this widening view of our work roles at all levels. I even pre-announced something we are calling Data Center Independence Day: an aggressive adoption of modularized compute capacity that we call Micro Data Centers, to help solve many of the issues we are facing as a business, along with the rough business case as to why it makes sense for us to move to this model. I will speak more of that in the weeks to come with a greater degree of specifics, but I stressed again the need for a wider perspective to manage a large portfolio of technologies and approaches to be successful in the future.

In closing – the event was fantastic.   The ability this event provides to network with leaders and professionals across the industry was first rate.   If I had any real constructive feedback it would be to either lengthen sessions, or reduce panel sizes to encourage more active and lively conversations.  Or both!

Perhaps at the end of the day, it’s truly the best measure of a good conference if you walk away wishing that more time could be spent on the topics. As for me, I am headed back Stateside and back to digging into the challenges of my day job. To the wonderful host city of Nice, I say Adieu!

 

\Mm



This week I am headed off to France to be a keynote speaker at DataCentres2012. In my opinion this event is the pre-eminent infrastructure and operations conference across the whole of Europe, if not the world. It regularly attracts the best speakers and infrastructure professionals in the industry (myself excluded of course – perhaps they are looking for some comic relief?) along with an incredible list of attendees which is a veritable who’s who of our industry world-wide.

By the looks of it, Cloud and Energy concerns will be the topic of many of the conversations. No doubt many will focus on the impact of technology, its usefulness, features, and capabilities. But those of you who have heard me speak before know that I believe there is a larger, more personal story, for both the professionals and the companies they represent. The problems we are facing in the industry today are complicated, multi-faceted, and deep-rooted in our own past decisions or the decisions of our predecessors. We sometimes think technology alone is the salve for all ills. This is no more true than the purchase of a pen being the solution to writer’s block.

My talk will center on this multi-faceted problem space. I will use real world examples of how I have tackled, and am still tackling, those issues. Who knows? I might even let loose some of the top secret work we have been doing internally to position us for the future.

As always – If you happen to be at the conference or in Nice -  Don’t be a stranger!  

\Mm


Today we celebrated going live with IPv6 versions of many of our top rated sites. The work was done in advance of our participation in the IPv6 Launch Day. For the uninitiated, IPv6 Launch Day is the date when major sites will begin to have their websites publicly available in the new Internet numbering scheme, and it is currently set for June 6, 2012. As many of you likely know, the IPv4 space, which has served the Internet so well since its inception, is running out of unique addresses. I am especially proud of the fact that we are the first of the largest Internet players to achieve this unique feat. In fact three of our sites occupy slots in the Top 25 Sites in the ranking, including www.aol.com, www.engadget.com, and www.mapquest.com. As with all things there are some interesting caveats. For example, Google is IPv6 enabled for some ISPs, but not all. I am specifically highlighting global availability.

 


 

The journey has been an interesting one and there are many lessons learned going through these exercises. In many conversations on this topic with other large firms exploring this move, I often hear how difficult the process appears to be, along with a general reluctance to even begin. As the old saying goes, even the longest journey begins with the first step.

This work was far from easy and our internal team had some great learnings in our efforts to take our sites live, beginning with World IPv6 Day in 2011. Although our overall IPv6 traffic levels remain pretty tiny (sustained at about 4-5Mb/s), they are likely to grow as more ISPs convert their infrastructure to IPv6.

Perhaps the most significant thing I would like to share is that while migrating to the new numbering system, I was very pleased to find the number of options available to companies in staging their moves to IPv6. Companies have a host of options available to them outside of an outright full scale renumbering of their networks. There are IPv4 to IPv6 gateways, capabilities already built into your existing routing and switching equipment that could ease the way, and even some capabilities in external service providers like Akamai that could help ease your adoption into the new space. Eventually everything will need to get migrated and you will need a comprehensive plan to get you there, but it’s nice to know that firms have a bunch of options available to assist in this technical journey.
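
When staging a move like this, one simple sanity check is whether a given hostname is actually published over both IPv4 and IPv6. The sketch below is one way to do that with Python's standard library, and it is not the tooling our team used. The function names are hypothetical, and the sample addresses come from the ranges reserved for documentation (192.0.2.0/24 and 2001:db8::/32).

```python
import ipaddress
import socket


def address_families(addresses):
    """Report which IP versions appear in a list of address strings."""
    versions = {ipaddress.ip_address(addr).version for addr in addresses}
    return {"ipv4": 4 in versions, "ipv6": 6 in versions}


def is_dual_stack(addresses):
    """True when the list contains at least one IPv4 and one IPv6 address."""
    families = address_families(addresses)
    return families["ipv4"] and families["ipv6"]


def resolve(hostname, port=80):
    """Return the addresses the local resolver gives for hostname.

    Needs working DNS, and the result depends on the network the check
    is run from, so treat it as a spot check rather than proof.
    """
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})


# Offline example using documentation-only addresses.
sample = ["192.0.2.10", "2001:db8::10"]
print(address_families(sample))
print(is_dual_stack(sample))
```

To check a live site, pass the output of `resolve("www.aol.com")` to `is_dual_stack`. A resolver that sits behind an IPv4-only network may not return IPv6 addresses even when the site publishes them, which is the same caveat as the per-ISP availability mentioned above.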

 

\Mm

