Data Center Think Tanks Sheepish on Weight Loss

LoseWeight.jpg

Matt Stansbury over at Data Center Facilities Pro wrote an interesting post about a panel featuring Uptime’s Ken Brill.  The post cautions folks about using PUE as a benchmarking standard between data centers.

I can’t say I really disagree with what he says. In my mind, self-measurement is always an intensely personal thing.  To me, PUE is a great self-measurement tool for driving towards power efficiency in your data center.  Do you include lighting?  Do you include your mechanical systems?  To me those questions are not all that dissimilar to the statement, "I am losing weight."  Are you weighing yourself nude?  In your underwear?  With your shoes on?

I do think the overall PUE metric could go a little farther to fully define what *MUST* be in the calculation, especially if you are going to use it comparatively.  But those who want to use this metric in some kind of competitive game are completely missing the point.  This is ultimately about using the power resources you have at the highest possible efficiency.  As I have stated over and over, most recently at the Data Center Dynamics conference in Seattle: every data center is different.  If I compared the efficiency of one of our latest-generation facilities in San Antonio or Dublin to a facility built 10 years ago, even making sure we were comparing apples to apples with like systems included, of course the latest-generation facilities would come out ahead.  A loss of 5 pounds on an Olympic runner with 4% body fat and a loss of 5 pounds on a professional sumo wrestler have dramatically different effects (or non-effects).

Everyone knows I am a proponent of PUE/DCiE.  So when you read this, understand where my baggage is carried.  To me the use of either or both of these is a matter of audience.  Engineers love efficiency.  Business managers understand overhead.  Regardless, what matters is that the measurement is consistent and, more importantly, that the measurement is happening with some regularity.  That is more important than anything.
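To make the weigh-in analogy concrete, here is a minimal sketch of the arithmetic behind both numbers. The kW figures are hypothetical, and which loads you fold into "total facility power" is exactly the nude-or-shoes question above.

```python
# Minimal sketch of the PUE / DCiE arithmetic (hypothetical figures).
it_load_kw = 1000.0       # power delivered to the IT equipment
cooling_kw = 450.0        # chillers, CRAHs, pumps, fans
power_losses_kw = 120.0   # UPS, PDU, transformer losses
lighting_kw = 30.0        # include it or not -- just be consistent

total_facility_kw = it_load_kw + cooling_kw + power_losses_kw + lighting_kw

pue = total_facility_kw / it_load_kw    # always >= 1.0 by definition
dcie = it_load_kw / total_facility_kw   # the same ratio expressed as efficiency

print(f"PUE  = {pue:.2f}")    # 1.60
print(f"DCiE = {dcie:.1%}")   # 62.5%
```

Change what you count in the numerator and the "weight" changes, which is why the comparison only means something when everyone steps on the same scale the same way.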

If we are going to attempt to use PUE for full-scale facility comparison, a couple of things have to happen.   At Microsoft we measure PUE aggressively and often.  This speaks to the time element that Ken mentions in his talk in the post.   It would be great for the Green Grid or Uptime or anyone to produce the "Imperial Standard."  One could even imagine these groups earning some extra revenue by certifying facilities to the "Imperial PUE standard."  This would include minimum measurement cycles (once a day, twice a day, average for a year, peak for a year, etc.).  Heaven knows it would be a far more useful metric for measuring data centers than the current LEED certifications, but that’s another post for another time.  Seriously, the time element is hugely important.  Measuring your data center once at midnight in January while experiencing the coldest winter on record might make great marketing, but it doesn’t mean much.
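As a sketch of what a certified measurement cycle might look like (the sampling interval and the roll-ups here are my own assumptions, not anyone's published standard), the point is simply that regular samples get aggregated into average and peak figures rather than a single flattering reading:

```python
from statistics import mean

# Hypothetical: (timestamp, total_facility_kw, it_load_kw) samples taken on a
# fixed cycle, say hourly, for a full year. The interval and the roll-up rules
# are assumptions for illustration, not a published standard.
def summarize_pue(samples):
    """Roll raw power samples into the figures a certification might require."""
    pue_values = [total / it for _, total, it in samples if it > 0]
    return {
        "sample_count": len(pue_values),
        "annual_average_pue": mean(pue_values),
        "annual_peak_pue": max(pue_values),
        "annual_best_pue": min(pue_values),
    }
```

A single midnight-in-January reading tells you almost nothing; the spread between the annual average and the annual peak is where the real story lives.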

As an industry we have to face the fact that there are morons amongst us.  This, of course, is made worse if people are trying to advertise PUE as a competitive advantage, mostly because it means they have engaged marketing people to "enhance" the message.  Ken’s example of someone announcing a PUE of 0.8 should instantly flag that person as an idiot; a PUE below 1.0 would mean the facility somehow delivers more power to the IT gear than it draws in total.  Hand them an Engvallian sign.  But even barring these easy-to-identify examples, we must remember that any measurement can be gamed.  In fact, I would go so far as to say that gaming measurements is the national pastime of all businesses.

Ultimately I just chalk this up to another element of “Green-washing” that our industry is floating in.


Ken also argues that the use of the word "Power" is incorrect, because power is a point-in-time measurement rather than an over-time measurement, and that we should be focused on "Energy."  According to Ken, this could ultimately doom the measurement on the whole.  I think this misses the point entirely on two fronts.  First, whether you call it power or energy, the naming semantics don’t really matter.  They matter to English professors and people writing white papers, but in terms of actually doing something, they have no effect.  The simple act of measuring is the most critical concept here.  Measure something, get better.  Whether you like PUE or DCiE, or whether you want to adopt "Energy," call it EUE, and embrace a picture of a sheep with power monitoring apparatus attached to its back, the name doesn’t really matter.  (Though I must admit, a snappy mascot might actually drive more people to measure.)  Just do something!
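For what it’s worth, the power-versus-energy distinction is just integration over time. A rough sketch, with the sampling interval as an assumption:

```python
# Power is an instantaneous reading (kW); energy is power integrated over time
# (kWh). Hypothetical sketch: turn fixed-interval power readings into energy.
def energy_kwh(power_readings_kw, interval_hours=1.0):
    """Sum interval power readings (kW) into energy consumed (kWh)."""
    return sum(reading * interval_hours for reading in power_readings_kw)

print(energy_kwh([950, 1010, 980], interval_hours=1.0))  # 2940 kWh over 3 hours
```

Either way, the readings have to exist before the naming argument matters.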

My informal polling at speaking engagements on the state of the industry continues, and I am sad to say the number of people actively measuring power consumption remains less than 10% (let alone measuring for efficiency!), and if anything the number seems to be declining.

In my mind, as an end-user, the standards bodies and think tank organizations like Uptime, the Green Grid, and others should really stop bickering over whose method of calculation is better or has the best name.  We have enough of a challenge getting the industry to adopt ANY KIND of measurement.  Arguing the finer points of absurdity only magnifies the thrash and ensures we confuse most data center operators into more non-action.  As an industry we are heading down a path with our gun squarely aimed at our foot.   If we are not careful, the resultant wound is going to end in amputation.

– MM

Reflections on my visit to the US Army War College

dl-ar-317.jpg

This week I had the honor and privilege of being a guest at the US Army War College as part of their 2008 Strategy Implementation Seminar.  The program focuses on strategy and leadership to further prepare the graduates for their future roles as leaders of the United States Military.  The program culminates in a special seminar that brings in outside leaders, academics, and essentially a cross section of the American public and exposes the graduating members of the course to the rich diversity of thought found in our society. 

Conversely it is an incredible opportunity to expose those same industry leaders and cross section of the American public to the leadership talent and depth of the military.  While I am certain that the graduates take away interesting nuggets of perspective, information, thoughts, approaches, and methodologies from the guests, I would have to say the education the guests get in return is equally if not more valuable.

This event had a very profound effect on me as a leader, as an individual, and as a United States citizen. 

The dedication, talent, intelligence, and sheer mental fortitude present in these students was astounding.  Their grasp of issues as far-ranging as economics, politics (both domestic and foreign), law, and of course the military was simply complete. Perhaps more importantly, their understanding of the interplay of these topics and the resultant gaps would put most of the people I deal with to shame as ignorant simpletons.

The Seminar brings a host of authors, strategists, and industry and military leaders to a single location for an intense and well-regimented program of thought and discourse on a wide variety of topics.  There is a very strong non-attribution component to the conversations, which allows the participants, speakers, panel members, and guests to be open and forthright in the exchange of ideas.  It is truly powerful stuff.  The conversations were never timid, always insightful, and you felt that all points of view got an equal share of the spotlight, something many people might not expect from such an event.

As a leader, the lessons learned from this event will stay with me forever.  While there are always nuggets of leadership theory one can glean from any such effort (and I certainly picked up a few here), the event exposed me to different types of interactions, and greatly increased my ability to manage my ‘situational awareness’. 

I had a great many conversations with students and guests alike.  The interactions with the other guests provided a wonderful opportunity to network with people across many walks of life, an experience most of us never truly get.  To interact with professors, think tank analysts, politicians, industry leaders, and news reporters, all engaged in active debate, was personally very gratifying for me. 

More impactful still was the interaction with the students. The students were not all Army.  There was representation from the Marines, Navy, Air Force, State Department, and others.   Most of the students I talked to had been to Iraq or Afghanistan, in many cases multiple times.   These folks had stories that would make you laugh, make you cry, make your heart sing for joy, and take you to some dark places, and above all allow you, for a brief crystalline moment, to see into and through the eyes of those who defend this country.   To gauge the quality of the men and women putting it on the line.  For me this was a powerful personal experience.

If you are ever so fortunate as to be invited to this event, go! I can guarantee it will be one of the richest and most rewarding experiences of your life. Kudos to the Department of Defense and the US Army War College for building such a strong and robust program.

And finally, a great big "Hoo-Ah!" to all my compatriots in Seminar 6. I enjoyed your hospitality and willingness to bring me into your fold.  You have taught me a truly important lesson: "Always remember to start slow, then taper off."   🙂

-Mike Manos

Data Center Leadership video posted at TechEd


I recently did a panel at TechEd in Orlando.  While there, Lewis Curtis caught me for a moment to shoot some questions at me around data center leadership.  It may or may not be interesting to you (my money is on the latter 🙂).  But if you are interested, the link can be found below:

On Datacenter Leadership

 

-Mm

Green Grid Data Center Indicator + CADE = Something useful!

There are times when two concepts merge and the result is better than either one alone.  It is not unlike the old television commercial where two people collide with each other, one eating a chocolate bar, the other a vat of peanut butter.   The resulting lines are television gold:

“Your chocolate is in my peanut butter!”

“Your peanut butter is on my chocolate!”

“HEY!” (in unison with smiles)

I have been anxiously waiting for the Green Grid to publish their work on the Data Center Indicator tool.  My good friend Christian Belady and the incredible folks in the technical workgroups came up with something that made me smile and gave CADE a way to be a viable metric. 

The Data Center Indicator tool gives you a visual representation across all the factors important to operating and measuring a data center.  It’s no secret that I get quite passionate about the need to measure your data center.   The lack of strict, rigorous, uniform measurement across the data center industry is one of the biggest tragedies we are inflicting upon ourselves. 

This tool is not necessarily for beginners as it assumes you have a good set of data and active measurement already in place.   However, in terms of quickly identifying trends and understanding your environment, I find it quite unique and interesting.   In fact, many of the same factors represented are rolled up into the CADE metric.  


The white paper, which has been published on the Green Grid site (White Paper #15), is a great way to take a holistic view of your environment over time, and it is even suitable for executives not familiar with the intricacies of data center or mission critical facilities.    If you take the rolled-up percentage from CADE and combine it with this type of graph, you have a great KPI and a mechanism that makes the information actionable.  That, dear readers, is something any facilities manager can use.
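As a rough sketch of what I mean (the numbers, columns, and trend rule below are my own made-up illustrations, not values from the white paper), a single rolled-up CADE percentage tracked month over month, with the indicator-style dimensions kept alongside it, turns the pretty graph into something you can act on:

```python
# Hypothetical monthly roll-up: one headline CADE percentage as the KPI, with
# indicator-style dimensions kept alongside so any dip can be traced to a cause.
# All figures are illustrative assumptions, not Green Grid or McKinsey values.
monthly = [
    {"month": "Jan", "cade_pct": 6.7, "pue": 1.8, "facility_util_pct": 55, "it_util_pct": 22},
    {"month": "Feb", "cade_pct": 8.2, "pue": 1.7, "facility_util_pct": 58, "it_util_pct": 24},
    {"month": "Mar", "cade_pct": 7.8, "pue": 1.7, "facility_util_pct": 60, "it_util_pct": 22},
]

for prev, cur in zip(monthly, monthly[1:]):
    delta = cur["cade_pct"] - prev["cade_pct"]
    trend = "improving" if delta > 0 else "slipping"
    print(f'{cur["month"]}: CADE {cur["cade_pct"]}% ({trend}, {delta:+.1f} points)')
```

The headline number tells an executive whether things are getting better; the columns next to it tell the facilities manager why.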

-Mm

Struggling with CADE, McKinsey / Uptime Metric (RE-POST)

percentage.jpg

This is a Re-post of my original blog post on May 5th regarding tortured thoughts around the CADE Data Center Metric put forward by McKinsey. This has relevance to my next post and I am placing it here for your convenience.

 

I guess I should start this post with the pre-emptive statement that, as a key performance indicator, I support the use of CADE or other metrics that tie both facilities and IT into a single number.  In fact, we have used a similar metric internally at Microsoft.  But the fact is, at the end of the day, I believe any such metric must be useful and actionable.  Maybe it’s because I have to worry about operations as well.  Maybe it’s because I don’t think you can roll the total complexity of running a facility into one metric.  In short, I don’t think dictating yet another metric, especially one that doesn’t lend itself to action, is helpful.

As some of you know, I recently gave keynote speeches at both Data Center World and the 2008 Uptime Symposium.  Part of those speeches included a simple query of the audience: how many people are measuring energy efficiency in their facilities?  Keep in mind that the combined audience of both engagements numbered between 2,000 and 2,400 data center professionals.  Arguably these are the 2,400 who really view data centers as a serious business within their organizations.  These are folks whose full-time jobs are running and supporting data center environments for some of the most important companies around the world.   At each conference, fewer than 10% of them raised their hands.   The fact that many in the industry, including Ken Brill at the Uptime Institute, the Green Grid, and others, have been preaching about measurement for at least the last three years and fewer than 10% of the industry has adopted this best practice is troublesome.  

Whether you believe in measuring PUE or DCiE, you need to be measuring *something* in order to get even one variable of the CADE metric.  This lack of instrumentation and/or process within the firms most motivated to have it speaks volumes about the lack of success this metric is going to have over time.  It therefore follows that if they are not measuring efficiency, they likely don’t understand their total facility utilization (electrically speaking) either.  The IT side may have an easier way of getting variables for system utilization, but how many firms have host-level performance agents in place? 

I want to point out that I am speaking about the industry in general.  Companies like ours, which are investing hundreds of millions of dollars, get the challenges and requirements in this space.  It’s not a nice-to-have, it’s a requirement.  But when you extend this to the rest of the industry, there is a massive gap.

Here are some interesting scenarios that, when extended to the industry, may break or complicate the CADE metric:

  • As you cull dead servers out of your environment, the components of CADE move against each other because they are not independent.  Remove dead servers and average server utilization goes up, but data center utilization drops roughly in proportion, so the metric stays largely unchanged; if anything, PUE gets a little worse because the fixed overhead is now spread over less IT load, which can actually nudge the metric down even though you did the right thing.  Keep in mind that the results only make sense when kept in context of one another (see the sketch after this list).
  • Hosting providers like Savvis, Equinix, Dupont Fabros, Digital Realty Trust, and the army of others will effectively be exempt from participating.  They will need to report back-of-house numbers to their customers (effectively PUE), but they do not have access to their customers’ server information.  It seems to me that CADE reporting in hosted environments will be difficult if not impossible.  Since the design of their facilities will need to play a large part in the calculation, effective tracking becomes difficult.  Additionally, at what level would overall utilization be measured?
  • If hosters are exempted, then CADE has a very limited application and shelf life.  You have to own the whole problem for it to be effective.  
  • As I mentioned, I think CADE has strong possibilities for those firms who own their entire stack.   But most of the data centers in the world would probably not fall into the "all-in" bucket.
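To make the first bullet concrete, here is my paraphrase of the CADE arithmetic with made-up numbers; roughly, a facility energy-efficiency term, a facility utilization term, and an IT utilization term multiplied together (with an IT energy-efficiency factor usually treated as a placeholder). Check the McKinsey/Uptime material before treating these factor definitions as gospel.

```python
# My paraphrase of the CADE arithmetic, with made-up numbers. The factor
# definitions are my reading of the proposal, not an authoritative formula.
def cade(facility_energy_eff, facility_util, it_util, it_energy_eff=1.0):
    """Multiply the factors into a single headline percentage."""
    return facility_energy_eff * facility_util * it_util * it_energy_eff

# Before culling dead servers (hypothetical figures):
before = cade(facility_energy_eff=0.55,  # roughly 1/PUE for a PUE of ~1.8
              facility_util=0.60,        # IT load versus design capacity
              it_util=0.20)              # average server utilization

# After culling: server utilization rises, facility utilization falls, and the
# efficiency term slips a touch as fixed overhead spreads over less IT load.
after = cade(facility_energy_eff=0.53,
             facility_util=0.55,
             it_util=0.22)

print(f"CADE before: {before:.1%}, after: {after:.1%}")  # 6.6% vs. 6.4%
```

The operationally correct action barely moves, or even dents, the headline number, which is exactly why the components have to be read in context rather than chased as a single score.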

I can’t help but think we are putting the cart before the horse in this industry.  CADE may be a great way to characterize data center utilization, but it’s completely useless if the industry isn’t even measuring the basics.  I have come to the realization that this industry does a wonderful job of telling its members WHAT to do, but fails to follow up with the HOW.  CADE is meant for higher-level consumption, specifically those execs who lack the technical skill-sets to make heads or tails of efficiencies and how they relate to overall operations.   For them, this metric is perfect.  But we have a considerable way to go before the industry at large gets there.

Regardless, I strongly suggest each and every one of you adopt the takeaway from Symposium: measure, measure, measure.

Upcoming Speaking Events

speech.gif

For those of you interested, I thought I would outline some of my upcoming speaking events:

August 7th, 2008 – Data Center Dynamics Keynote – Seattle

Presentation: “The Need for Data Center Glasnost”

September 16th, 2008 – Data Center Dynamics Keynote – Chicago

Presentation: “Containers, Fact, Fiction and Fantasy”

October 2, 2008 – Power and Cooling ’08 – London

Presentation: “Think Green, Think Different, Think DATACENTRES”

November 17th, 2008 – 7×24 Exchange Keynote with C. Belady – Palm Springs

Presentation: "Kicking Anthills: The Challenge Landscape of Mission Critical Environments"

-Mm

Do Not Tighten Bolts

nuts-and-bolts.jpg

I have a pretty incredible job.   My current role has me leading Microsoft’s efforts at designing, constructing, and operating its world-wide Data Center infrastructure in support of our cloud services initiatives, or more correctly "Software + Services".  

There aren’t many people around tasked with this kind of challenge; in fact, the number of companies attempting it can be counted on one or two hands.   One routine question I get asked is, "What methodology or approach do you use to deliver against this challenge?"  The question of course assumes there is an answer, as if there is some book one can run out and purchase to figure it out.   No such book exists. 

The real answer involves the hard work and dedication of an incredibly talented team focused on a single mission.   That is exactly what we have on the data center challenge at Microsoft.  I have the world’s most talented team and I am not in the least shy about my confidence in them.   However, even the raw materials produced by this incredible team, the technology breakthroughs in data center design, the incredible process and automation improvements, and the focused reduction of energy consumption and drive for greater energy-use efficiency are not 100% of the formula.

There is one missing element.  Clues to this missing element can be found in products for sale at your local Target or Walmart store.   Many people may be surprised by this, but I suspect most won’t. 

I recently spent a good part of a weekend putting together deck furniture for my home.   It was good quality stuff, it had the required parts and hardware, and, not unlike other do-it-yourself furniture, it had directions that left a lot to be desired. In many ways it’s like IT infrastructure or running any IT shop.   You have all the tools, you have all the raw components, but how you put it all together is where the real magic happens, and the directions are usually just as vague on how to do it.

One of the common themes across all steps of the deck furniture assembly was a single refrain: "Do Not Tighten Bolts."   The purpose was to get all of the components together, even if a bit loose, to ensure you had the right shape and all components were in the right place, and then, and only then, tighten the bolts.

If you really want to know the secret to putting together solutions at scale, remember the "Do Not Tighten Bolts" methodology.   Assemble the raw components, ensure you have the right shape and that all components are in the right place, and then tighten it down.   This can be, and is, an iterative process.   Keep working to get that right shape.  Keep working to find the right component configuration.  Tighten bolts.    As I built my first deck chair, there was a significant amount of trial and error.  The second deck chair, however, was seamless, even with the same cruddy directions.   Once you learn the "Do Not Tighten" technique, the assembly process is quick and provides you with great learnings.   

Some may feel this approach is too simplistic, or that it lacks the refinement of a polished methodology.   The fact of the matter is that cloud services infrastructure takes lots of hard work, great natural resources, and above all flexibility without adherence to a dogmatic approach.

That is why I have named this blog “Loose Bolts”.  I will be moving my formal blog activity to this forum and hopefully post interesting topics from time to time.

Thanks so much for reading,

Mike Manos

Stirring Anthills – My Response to the Recent Computer World Article


 

** THIS IS A RE-POST From my former BLOG Site, saving here for continuity and posterity **

When one inserts the stick of challenge and change into the anthill of conventional and dogmatic thinking they are bound to stir up a commotion.

That is exactly what I thought when I read the recent Computerworld article by Eric Lai on containers as a data center technology.  The article, found here, outlines six reasons why containers won’t work and asks if Microsoft is listening.   Personally, I found it an intensely humorous article, albeit not really unexpected.  My first response was "only six?"  You only found six reasons why it won’t work?  Internally we thought of a whole lot more than that when the concept first appeared on our drawing boards. 

My Research and Engineering team is challenged with vetting technologies for applicability, efficiency, flexibility, longevity, and perhaps most importantly, fiscal viability.   You see, as a business, we are not into investing in solutions whose net effect is adding cost for cost’s sake.    Every idea is painstakingly researched, prototyped, and piloted.  I can tell you one thing: the internal push-backs on the idea numbered much more than six, and the biggest opponent (my team will tell you) was me!

The true value of any engineering organization is to give different ideas a chance to mature and materialize.  The Research and Engineering teams were tasked with making sure this solution had solid legs, saved money, gave us the scale, and ultimately was something we felt would add significant value to our program.  I can assure you the amount of math, modeling, and research that went into this effort was significant.  The article contends we are bringing a programmer’s approach to a mechanical engineer’s problem.  I am fairly certain that my team of professional and certified engineers took some offense at that, as would Christian Belady, who has conducted extensive research and metrics work for the data center industry. Regardless, through the various keynote addresses we’ve participated in over the last few months, we have tried to make the point that containers are not for everyone.   They address a very specific requirement for properties that can afford a different operating environment.  We are using them for rapid and standardized deployment at a scale the average IT shop does not need, or have the tools to address. 

Those who know me best know that I enjoy a good tussle; it probably has to do with growing up on the south side of Chicago.  My team calls me ornery; I prefer "critical thought combatant."   So I decided I would try to take on the "experts" and the points in the article myself, with a small rebuttal posted here:

Challenge 1: Russian-doll-like nesting (servers on racks in containers) leads to more moreness.

Huh?  This challenge has to do with perceived problems on the infrastructure side of the house and the complexity of managing such infrastructure in this configuration.   The primary technical challenge raised here is harmonics.   Harmonics can be addressed in a multitude of ways and, as accurately quoted, is a solvable problem.  Many manufacturers have solutions for harmonics issues, and I can assure you this got a pretty heavy degree of technical review.   Most of these solutions are not very expensive and in some cases are included at no cost.   We have several large facilities, and I would like to think we have built up quite a stable of understanding and knowledge in running these types of facilities.  From an ROI perspective, we have that covered as well.   The savings from containers (depending upon application, size, etc.) can be as high as 20% over conventional data centers.   These same metrics and savings have been discovered by others in the industry.  The larger question is whether containers are the right fit for you.  Some can answer yes, others no. After intensive research and investigation, the answer was yes for Microsoft.

Challenge 2: Containers are not as Plug and Play as they may seem.

The first real challenge in this section is about shipment of gear, and the claim that it would be a monumental task for us to determine or verify functionality.   We deploy tens of thousands of servers per month. As I have publicly talked about, we moved from individual servers as a base unit, to entire racks as a scale unit, to a container of racks.   The process of determining functionality is incredibly simple.  You can ask any network, Unix, or Microsoft professional just how easy this is, but let’s just say it’s a very small step in our "container commissioning and startup" process.  

The next challenge in this section is truly off base.   The expert is quoted as saying that the "plug and play" aspect of containers is itself put in jeopardy due to the single connection to the wall for power, network, etc.  One can envision a container with a long electrical extension cord.  I won’t disclose our "secret sauce" here, but a standard 110V extension cord just won’t cut it.  You would need a mighty big shoe size to trip over and unplug one of these containers. The bottom line is that connections this large require electricians for installation or removal. I am confident we are in no danger of falling prey to this hazard. 

However, I can say that regardless of the infrastructure technology, the point made about thousands of machines going dark at one time could happen.  Although our facilities have been designed around the "Fail Small Design" created by my Research and Engineering group, outages can always happen.  As a result, and being a software company, we have been able to build our applications in such a way that the loss of server/compute capacity never takes the application completely offline.  It’s called application geo-diversity.  Our applications live in and across our data center footprint. By putting redundancy into the applications, physical redundancy is not needed.  This is an important point, and one that scares many "experts."   Today, there is a huge need for experts who understand the interplay of electrical and mechanical systems, folks who make a good living driving business continuity and disaster recovery efforts at the infrastructure level.   If your applications could survive whole-facility outages, would you invest in that kind of redundancy?  If your applications were naturally geo-diversified, would you need a specific DR/BCP plan?   Not all of our properties are there yet, but you can rest assured we have achieved this across a majority of our footprint.  This kind of thing is bound to make some people nervous.   But fear not, IT and DC warriors: these challenges are being tested and worked out in the cloud computing space, and it still has some time before it makes its way into the applications present in a traditional enterprise data center.

As a result, we don’t need to put many of our applications and infrastructure on generator backup.  To quote the article:

"Few data centers dare to make that choice, said Jeff Biggs, senior vice president of operations and engineering for data center operator Peak 10 Inc., despite the average North American power uptime of 99.98%. "That works out to be about 17 seconds a day," said Biggs, who oversees 12 data centers in southeastern states. "The problem is that you don’t get to pick those 17 seconds."

He is exactly right.  Two points I would highlight here: first, the industry has some interesting technologies called battery and rotary UPSes that can easily ride through 17 seconds if required; second, and more importantly, we truly do not care.   Look, many industries, the financial sector among them, have some very specific guidelines around redundancy and reliability.   This drives tens of millions to hundreds of millions of dollars of extra cost per facility.   The cloud approach eliminates this requirement at the infrastructure level and moves it up to the application. 
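To illustrate what "moving it up to the application" looks like in practice, here is a toy sketch of application geo-diversity; the site names and the health check are made up, and this is not a description of Microsoft’s actual mechanism:

```python
import random

# Toy illustration of redundancy in the application rather than the facility:
# try replicas of a service spread across several (made-up) sites, so a whole
# site going dark degrades capacity instead of taking the service offline.
REPLICA_SITES = ["dc-alpha", "dc-bravo", "dc-charlie"]

def call_service(request, site_is_up):
    """Attempt each replica in a shuffled order until one answers."""
    for site in random.sample(REPLICA_SITES, k=len(REPLICA_SITES)):
        if site_is_up(site):
            return f"handled {request} at {site}"
    raise RuntimeError("all replicas unavailable")

# Simulate one entire facility being dark; requests still get served.
print(call_service("req-001", lambda site: site != "dc-alpha"))
```

When the application can shrug off a facility outage, those 17 unpickable seconds stop being worth tens of millions of dollars of generators and redundant plant.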

Challenge 3: Containers leave you less, not more, agile.

I have to be honest; this argument is one that threw me for a loop at first.   My initial thought upon reading the challenge was, "Sure, building out large raised-floor areas to a very specific power density is ultimately more flexible than dropping a container in a building, where density and server performance can be traded off within a total power consumption envelope."   NOT!  I can’t tell you how many data centers I have walked through with eight-foot, 12-foot, or greater aisles between rack rows because the power densities per rack were consuming more floor space.   The fact is, at the end of the day, your total power consumption level is what matters.   But as I read on, the actual hurdles listed had nothing to do with this aspect of the facility.

The hurdles revolved around people, opportunity cost around lost servers, and some strange notion about server refresh being tied to the price of diesel. A couple of key facts:

· We have invested in huge amounts of automation in how we run and operate.   The fact is that even at 35 people across seven days a week, I believe we are still fat and could drive this down even more.   This is running thin; it’s running smart.  

· With the proper maintenance program in place, with professionals running your facility, with a host of tools to automate many of the tasks in the facility itself, and with complete ownership of both the IT and the facilities space, you can do wonders.  This is not some recent magic that we cooked up in our witches’ brew; this is how we have been running for almost four years! 

In my first address internally at Microsoft I put forth my own challenge to the team.   In effect, I outlined how data centers were the factories of the 21st century and that, like it or not, we were all modern-day equivalents of those who experienced the industrial revolution.  Much like factories ("bit factories," I called them), our goal was to automate everything we do, in effect bringing in the robots, to continue the analogy.  If the assembled team felt their value was in wrench turning, they would have limited career growth within the group; if they up-leveled themselves and put an eye towards automating those tasks, their value would be compounded.  Since that time some people have left for precisely that reason.   Deploying tens of thousands of machines per month is not sustainable to do with humans in the traditional way, both in the front of the house (servers, network gear, etc.) and the back of the house (facilities).   It’s a tough message but one I won’t shy away from.  I have one of the finest teams on the planet running our facilities.   It’s a fact: automation is key. 

Around the opportunity cost of failed machines in a container, from a power perspective there are ultimately two scenarios.   One is that the server has failed hard and is dead in the container.  In that scenario the server is not drawing power anyway, and while the container itself may be drawing less power than it could, there is not necessarily an "efficiency" hit.   The other scenario is that the machine dies in some half-state or loses a drive or similar component.   In this scenario you may be drawing energy that is not producing "work."  That’s a far more serious problem as we think about overall work efficiency in our data centers.  We have ways through our tools to mitigate this by either killing the machine remotely or ensuring that we prune that server’s power by killing it at an infrastructure level.   I won’t go into the details here, but we believe efficiency is the high-order bit.   Do we potentially strand power in this scenario?  Perhaps.  But as mentioned in the article, if the failure rate is too high, or the economics of the stranding begin to impact the overall performance of the facility, we can always swap the container out with a new one and instantly regain that power.   We can do this significantly more easily than a traditional data center could because I don’t have to move servers or racks of equipment around in the data center (i.e., we are more flexible).   One thing to keep in mind is that all of our data center professionals are measured by the overall uptime of their facility, the overall utilization of the facility (as measured by power), and the overall efficiency of their facility (again, as measured by power).  There is no data center manager in my organization who wants to be viewed as lacking in these areas, and they give them intense scrutiny.  Why?  When your annual commitments are tied to these metrics, you tend to pay attention to them. 

The last hurdle here revolves around server life expectancy, technology refresh rates, and, somehow, the price of diesel and green-ness.

"Intel is trying to get more and more power efficient with their chips," Biggs said. "And we’ll be switching to solid-state drives for servers in a couple of years. That’s going to change the power paradigm altogether." But replacing a container after a year or two when a fraction of the servers are actually broken "doesn’t seem to be a real green approach, when diesel costs $3.70 a gallon," Svenkeson said.

Clear as mud to me.  I am pretty sure the "price of diesel" for getting the containers to me is included in the price of the containers; I don’t see a separate diesel charge.  In fact, I would argue that shipping around 2,000 servers individually would ultimately be less green, or (at least in travel costs alone) a push.   If we dwell a moment longer on the "green-ness" factor, there is something to be said for the container in that the box it arrives in is the box I connect to my infrastructure.   What happens to all the foam product and cardboard with 2,000 individual servers?  Regardless, we recycle all of our servers.  We don’t just "throw them away."

On the technology refresh side of the hurdle, I will put on my business hat for a second.  Frankly, I don’t know too many people who depreciate server equipment over less than three years; those who do typically depreciate over one year.  But having given talks at Uptime and AFCOM in the last month, the common lament across the industry was that people were keeping servers (albeit power-inefficient servers) well past their useful life because they were "free."   Technology refresh IS a real factor for us, and if anything this approach allows us to adopt new technologies faster.   I get to upgrade a whole container’s worth of equipment to the best performance and highest efficiency when I do refresh, and best of all there is minimal "labor" involved.  I would also like to point out that containers are not the only technology direction we have.  We solve problems with the best solution; containers are just one tool in our tool belt.   In my personal experience, the data center industry often falls prey to the old "if your only tool is a hammer, every problem looks like a nail" syndrome.

Challenge 4: Containers are temporary, not a long term solution.

Well, I still won’t talk about who is in the running for our container builds, but I will speak to the challenges put forth here.   Please keep in mind that Microsoft is not a traditional "hoster."  We are an end user.  We control all aspects of construction, server deployments, and the applications that go into our facilities.  Hosting companies do not.   This section argues that while we are in a growth mode now, it won’t last forever, therefore making containers temporary. The main point that everyone seems to overlook is that the container is a scale unit for us, not a technology solution for incremental capacity or for providing capacity in remote regions.   If I deploy 10 containers in a data center, and each container holds 2,000 servers, that’s 20,000 servers.  When those servers are end of life, I remove 10 containers and replace them with 10 more.   Maybe those new models have 3,000 servers per container due to continuing energy-efficiency gains.   What’s the alternative?  How people-intensive do you think un-racking 20,000 servers, followed by racking 20,000 more, would be?   The bottom line here is that containers are our scale unit, not an end technology solution.   It’s a very important distinction that seems lost in multiple conversations.  Hosting companies don’t own the gear inside their facilities; their users do. It’s unlikely they will ever experience this kind of challenge or need.  The rest of my points are accurately reflected in the article. 

Challenge 5: Containers don’t make a data center Greener

This section has nothing to do with containers; it has to do with facility design.  While containers may be able to take advantage of the various cooling mechanisms available in the facility, the statement is effectively correct that "containers" don’t make a data center greener.   There are some minor aspects of "greener" that I mentioned previously around shipping materials and the like, but the real "green data center" is in the overall energy-use efficiency of the building.

I was frankly shocked at some of the statements in this section:

An airside economizer, explained Svenkeson, is a fancy term for "cutting a hole in the wall and putting in a big fan to suck in the cold air." Ninety percent more efficient than air conditioning, airside economizers sound like a miracle of Mother Nature, right?  Except that they aren’t. For one, they don’t work — or work well, anyway — during the winter, when air temperature is below freezing. Letting that cold, dry air simply blow in would immediately lead to a huge buildup of static electricity, which is lethal to servers, Svenkeson said.

Say what?  Airside economization is a bit more than that.  I am fairly certain that economizers do work and that there are working examples across the planet.   Do you need a facility-level understanding of when to use them and when not to?  Sure.   Regardless, all the challenges listed here can be easily overcome.   Site selection also plays a big role; our site selection and localization of design decide which packages we deploy.   To some degree, I feel this whole argument falls into another one of the religious wars ongoing in the data center industry: AC vs. DC, liquid-cooled vs. air-cooled, etc.  Is water-side economization effective? Yes.  Is it energy efficient?  Not when compared to air-side economization in a location tailor-made for it.  If you can get away with cooling from the outside and you don’t have to chill any water (which takes energy), then it is inherently more efficient in its use of energy.  Look, the fact of the matter is we have both horses in the race.  It’s about being pragmatic and intelligent about when and where to use which technology.  

Some other interesting bits for me to comment on:

Even with cutting-edge cooling systems, it still takes a watt of electricity to a cool a server for every watt spent to power it, estimated Svenkeson. "It’s quite astonishing the amount of energy you need," Svenkeson said. Or as Emcor’s Baker put it, "With every 19-inch rack, you’re running something like 40,000 watts. How hot is that? Go and turn your oven on."

I would strongly suggest a quick look at the data that the Green Grid and Uptime have on this subject.   Worldwide PUE metrics (or DCiE if you like efficiency numbers better) show significant variation from that one-for-one claim.   Some facilities reach a PUE of 1.2, roughly 83% efficient, at certain times of the year or in certain locations.   Additionally, the comment that every 19-inch rack draws 40 kW is outright wrong.  Worldwide averages show that racks draw somewhere between 4 kW and 6 kW.  In special circumstances densities approach that number, but as an average it is fantastically high. 

But with Microsoft building three electrical substations on-site sucking down a total of 198 megawatts, or enough to power almost 200,000 homes, green becomes a relative term, others say. "People talk about making data centers green. There’s nothing green about them. They drink electricity and belch heat," Biggs said. "Doing this in pods is not going to turn this into a miracle."

I won’t publicly comment on the specific size of the substation, but I would kindly point anyone interested in the subject to substation design best practices and sizing.  How you design and accommodate a substation for things like maintenance, configuration, and much more is an interesting topic in itself.  I won’t argue that the facility isn’t large by any standard; I’m just saying there is complexity one needs to look into there.   Yes, data centers consume energy; being "green" means you are doing everything you can to ensure every last watt is being used for some useful product of work.  That’s our mission. 

Challenge 6: Containers are a programmer’s approach to a mechanical engineer’s problem.

As I mentioned before, a host of professional engineers that work for me just sat up and coughed. I especially liked:

"I think IT guys look at how much faster we can move data and think this can also happen in the real world of electromechanics," Baker said. Another is that techies, unfamiliar with and perhaps even a little afraid of electricity and cooling issues, want something that will make those factors easier to control, or if possible a nonproblem. Containers seem to offer that. "These guys understand computing, of course, as well as communications," Svenkeson said. "But they just don’t seem to be able to maintain a staff that is competent in electrical and mechanical infrastructure. They don’t know how that stuff works."

I can assure you that outside of my metrics and reporting tool developers, I have absolutely no software developers working for me.   I own IT and facilities operations.   We understand the problems, we understand the physics, we understand quite a bit. Our staff has expertise and backgrounds as far-ranging as running facilities on nuclear submarines to facilities systems for space-going programs.  We have more than a bit of expertise here. With regard to the comment that we are unable to maintain a competent staff, the folks responsible for managing the facilities have had a zero percent attrition rate over the last four years.  I would easily put my team up against anyone in the industry. 

I get quite touchy when people start talking negatively about my team and their skill-sets, especially when they make blind assumptions.  The fact of the matter is that, due to the increasing visibility around data centers, the IT and facilities sides of the house had better start working together to solve the larger challenges in this space.  I see it and hear it at every industry event: the us-versus-them between IT and facilities, with neither realizing that this approach spells doom for them both.  It’s about time somebody challenged something in this industry.  We have already seen that, left to its own devices, technological advancement in data centers has by and large stood still for the last two decades.  As Einstein said, "We can’t solve problems by using the same kind of thinking we used when we created them."

Ultimately, containers are but the first step in a journey with which we intend to shake up the industry.  If the thought process around containers scares you, then the innovations, technology advances, and challenges currently in various states of thought, pilot, and implementation will be downright terrifying.  I guess, in short, you should prepare for a vigorous stirring of the anthill.