Insider Redux: Data Barn in a Farm Town

I thought I would start my first post by addressing the second New York Times article first. Why? Because it specifically mentions activities and messages sourced from me at the time when I was responsible for running the Microsoft Data Center program. I will try to track the timeline mentioned in the article against my specific recollections of the events. As Paul Harvey used to say, so that you can know the ‘REST of the STORY’.

I remember my first visit to Quincy, Washington. It was a bit of a road trip for myself and a few other key members of the Microsoft site selection team. We had visited a few of the local communities and power utility districts doing our due diligence on the area at large. Our ‘Heat map’ process had led us to Eastern Washington state, not very far (just a few hours) from the ‘mothership’ of Redmond, Washington. It was a bit of a crow-eating exercise for me, as just a few weeks earlier I had proudly exclaimed that our next facility would not be located on the West Coast of the United States. We were developing an interesting site selection model that would categorize and weight areas around the world. It would take in FEMA disaster data, fault zones, airport and logistics information, location of fiber optic and carrier presence, workforce distributions, regulatory and tax data, water sources, and power. This was going to be the first real construction effort undertaken by Microsoft. The cost of power was definitely a factor, as the article calls out, but just as important was the generation mix of the power in the area, in this case a predominance of hydroelectric with a low to no carbon footprint (rivers, it turns out, actually give off carbon emissions, I came to find out). Regardless, the generation mix was and would continue to be a hallmark of the program’s site selection while I was there. The crow-eating exercise began when we realized that the ‘greenest’ area per our methodology was actually located in Eastern Washington along the Columbia River.

We had a series of meetings with Real Estate folks, the local Grant County PUD, and the Economic Development folks of the area. Back in those days the secrecy around who we were was paramount, so we kept our identities and that of our company secret, like geeky secret agents on an information gathering mission. We would not answer questions about where we were from, who we were, or even our names. We ‘hid’ behind third party agents who took everyone’s contact information and acted as brokers of information. Those were the early days…the cloak and dagger would soon fall away, as being known became a more advantageous tool in tax negotiations with local and state governments.

During that trip we found the perfect parcel of land: 75 acres with great proximity to local substations, just down the line from the dams on the nearby Columbia River. It was November 2005. As we left that day and headed back, it was clear that we felt we had found Site Selection gold. As we started to prepare a purchase offer, we got wind that Yahoo! was planning on taking a trip out to the area as well. As the local folks seemingly thought that we were a bank or large financial institution, they wanted to let us know that someone on the Internet was interested in the area too. This acted like a lightning rod, and we raced back to the area and locked up the land before Yahoo had a chance to leave the Bay Area. In these early days the competition was fierce. I have tons of interesting tales of cloak and dagger intrigue between Google, Microsoft, and Yahoo. While it was work, there was definitely an air of something big on the horizon, a sense that we were all at the beginning of something. In many ways the technology professionals involved, regardless of company, forged deep relationships and rivalries with each other.

Manos on the bean field, December 2005

The article talks about how the ‘Gee-Whiz moment faded pretty fast’. While I am sure that it faded in time (as all things do), I also seem to recall the huge increase in local business as thousands of construction workers descended upon this wonderful little town, the tours we would give local folks and city council dignitaries, and a spirit of true working together. Then of course there was the ultimate reduction in property taxes resulting from even our first building, and an increase in home values to boot at the time. It’s an oft-missed benefit that I am sure the town of Quincy and Grant County have continued to enjoy as the Data Center Cluster added Yahoo, Sabey, IAC, and others. I warmly remember the opening day ceremonies and ribbon cutting and a sense of pride that we did something good. Corny? Probably – but that was the feeling. There was no talk of generators. There were no picket signs; in fact the EPA of Washington state had no idea how to deal with a facility of this size, and I remember openly working in partnership with them. That of course eventually wore off to the realities of life. We had a business to run, the city moved on, and concerns eventually arose.

The article calls out a showdown between Microsoft and the Power Utility District (PUD) over a fine for missing a capacity forecasting target. As this happened well after I left the company, I cannot really comment on that specific matter. But I can see how that forecast could miss. Projecting power usage months ahead is more than a bit of science mixed with art. It gets into the complexity of understanding capacity planning in your data centers. How big will certain projects grow? Will they meet expectations or fall short? New product launches can be duds or massive successes. All of these things go into a model to try and forecast the growth. If you think this is easy, I would submit that NO ONE in the industry has been able to master the crystal ball. I would also submit that most small companies haven’t been able to figure it out either. At least at companies like Microsoft, Google, and others you can start using the law of large numbers to get close. But you will always miss, either too high or too low. Guess too low and you impact internal budgeting figures and run rates. Not good. Guess too high and you could fall short of minimum contracts with utility companies and be subject to fines.
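To make that roll-up concrete, here is a minimal sketch in Python of the kind of forecast described above. The service names, growth rates, and launch probabilities are entirely hypothetical; a real model is far richer (seasonality, hardware lead times, utility contract tiers, and so on).

```python
# Hypothetical roll-up of per-service demand into a facility power forecast.
services = [
    # (name, current kW, expected monthly growth, probability a planned launch lands)
    ("search", 4000, 0.05, 1.00),
    ("mail",   2500, 0.02, 1.00),
    ("new_product_launch", 1500, 0.10, 0.50),   # could be a dud or a hit
]

def forecast_kw(services, months):
    """Expected facility IT load after `months`, weighting planned launches by probability."""
    total = 0.0
    for name, base_kw, growth, p_launch in services:
        total += p_launch * base_kw * (1 + growth) ** months
    return total

for horizon in (3, 6, 12):
    print(f"{horizon:>2} months out: {forecast_kw(services, horizon):,.0f} kW expected")
# The point of the post: guess low and you blow the budget and run rate,
# guess high and you can miss minimum-take contracts with the utility.
```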

In the case mentioned in the article, the approach taken, if true, would not be the smartest method, especially given the monthly electric bill for these facilities. It’s a cost of doing business and largely not consequential at the amount of consumption these buildings draw. Again, if true, it was a PR nightmare waiting to happen.

At this point the article breaks out and talks about how the Microsoft experience would feel more like dealing with old-school manufacturing rather than ‘modern magic’ and diverts to a situation at a Microsoft facility in Santa Clara, California.

The article references that this situation is still being dealt with inside California so I will not go into any detailed specifics, but I can tell you something does not smell right in the state of Denmark, and I don’t mean the diesel fumes. Microsoft purchased that facility from another company. As the usage of the facility ramped up to the levels it was certified to operate at, operators noticed a pretty serious issue developing. While the building was rated to run at a certain load size, it was clear that the underground feeders were undersized, and the by-product could have polluted the soil and gotten into the water system. This was an inherited problem, and Microsoft did the right thing and took the high road to remedy it. It is my recollection that all sides were clearly aware of the risks and agreed to the generator usage whenever needed while the larger issue was fixed. If this has come up as an ‘air quality issue’, I personally would guess that there are politics at play. I’m not trying to be an apologist, but if true, it goes to show that no good deed goes unpunished.

At this point the article cuts back to Quincy. It’s a great town, with great people. To some degree it was the winner of the Internet Jackpot lottery because of the natural tech resources it is situated on. I thought the figures quoted around taxes were an interesting component missed in much of the reporting I read.

“Quincy’s revenue from property taxes, which data centers do pay, has risen from $815,250 in 2005 to a projected $3.6 million this year, paying for a library and repaved streets, among other benefits, according to Tim Snead, the city administrator.”

As I mentioned in yesterday’s post, my job is ultimately to get things done and deliver results. When you are in charge of a capital program as large as Microsoft’s was at the time, your mission is clear: deliver the capacity and start generating value to the company. As I was presented with the last crop of beans harvested from the field at the ceremony, we still had some ways to go before all construction and capacity was ready to go. One of the key missing components was the delivery and installation of a transformer for one of the substations required to bring the facility up to full service. The article notes that I was upset that the PUD was slow to deliver the capacity. Capacity, I would add, that was promised along a certain set of timelines; commitments were made and money was exchanged based upon those commitments. As you can see from the article, the money exchanged was not insignificant. If Mr. Culbertson felt that I was a bit arrogant in demanding follow-through on promises and commitments after monies and investments were made in a spirit of true partnership, my response would be ‘Welcome to the real world’. As far as being cooperative, by April the construction had already progressed 15 months since its start. Hardly a surprise, and if it was, perhaps the 11-acre building and large construction machinery driving around town could have been a clue to the sincerity of the investment and timelines. Harsh? Maybe. Have you ever built a house? If so, then you know you need to make sure that the process is tightly managed and controlled to ensure you make the delivery date.

The article then goes on to talk about the permitting for the diesel generators. By the Department of Ecology’s own admission, “At the time, we were in scramble mode to permit our first one of these data centers.” It also states that:

Although emissions containing diesel particulates are an environmental threat, they were not yet classified as toxic pollutants in Washington. The original permit did not impose stringent limits, allowing Microsoft to operate its generators for a combined total of more than 6,000 hours a year for “emergency backup electrical power” or unspecified “maintenance purposes.”

At the time all this stuff was so new, and everyone was learning together. I simply don’t buy that this was some kind of Big Corporation versus Little Farmer thing. I cannot comment on the events of 2010 where Microsoft asked for itself to be disconnected from the grid. Honestly, that makes no sense to me even if the PUD was working on the substation, and on that point I would agree with the article’s ‘experts’.

Well, that’s my take on my recollection of events during those early days of the Quincy build-out as it relates to the articles. Maybe someday I will write a book, as the process and adventures of those early days of the birth of Big Infrastructure were certainly exciting. The bottom line is that the data center industry is amazingly complex, and the forces in play are as varied as technology, politics, people, and everything in between. There is always a deeper story. More than meets the eye. More variables. Decisions are never black and white and are always weighed against a dizzying array of forces.

\Mm

Pointy Elbows, Bags of Beans, and a little anthill excavation…A response to the New York Times Data Center Articles

I have been following with some interest the series of articles in the New York Times by Jim Glanz.  The series premiered on Sunday with an article entitled Power, Pollution and the Internet, which was followed up today with a deeper dive into some specific examples.  The examples today (Data Barns in a Farm Town, Gobbling Power and Flexing Muscle) focused on the Microsoft program, a program with which I have more than some familiarity since I ran it for many years.  After just two articles, reading the feedback in comments, and seeing some of the reaction in the blogosphere, it is very clear that there is a significant amount of misunderstanding, over-simplification, and a lack of detail that I think is important.  In responding, I want to be very clear that I am not representing AOL, Microsoft, or any other organization; these are my own personal observations and opinions.

As mentioned in both of the articles, I was one of hundreds of people interviewed by the New York Times for this series.  In those conversations with Jim Glanz a few things became very apparent.  First, he has been on this story for a very long time, at least a year.  As far as journalists go, he was incredibly deeply engaged and armed with tons of facts.  In fact, he had a trove of internal emails, meeting minutes, and a mountain of data from government filings that must have taken him months to collect.  Secondly, he had the very hard job of turning this very complex space into a format the uneducated masses can begin to understand.  Therein lies much of the problem: this is an incredibly complex space to communicate to those not tackling it day to day, or who do not understand the technological and regulatory forces involved.  This is not an area or topic that can be boiled down to a sound bite.  If this were easy, there really wouldn’t be a story, would there?

At issue for me is that the complexity of the forces involved gets scant attention, with the articles aiming instead for the “Data Centers are big bad energy vampires hurting the environment” story.  It’s clearly evident reading through the comments on both of the articles so far, where commenters claim that the sources and causes are everything from poor web page design to government or multi-national conspiracies to corner the market on energy.

So I thought I would take a crack, article by article, at shedding some light (the kind that doesn’t burn energy) on some of the topics and just call out where I disagree completely.  In full transparency, the “Data Barns” article doesn’t necessarily paint me as a “nice guy”.  Sometimes I am.  Sometimes I am not.  I am not an apologist, nor do I intend to be one in this post.  I am paid to get stuff done.  To execute.  To deliver.  Quite frankly, the PUD missed deadlines (the progenitor event to my email quoted in the piece), and sometimes people (even utility companies) have to live in the real world of consequences.  I think my industry reputation, my work, and my fundamental stances around driving energy efficiency and environmental conservancy in this industry can stand on their own, both publicly and with those who have worked for me.

There is an inherent irony here: these articles were published both in print and electronically to maximize the audience and readership.  To do that, these articles made “multiple trips” through a data center, and ultimately reside in one (or more).  They seem to suggest that keeping things online is bad, which goes against the availability needs of the articles themselves.  Doesn’t the New York Times expect to make these articles available online for people to read?  They are posted online already.  Perhaps they expect that their microfiche experts will be able to serve the demand for these articles in the future?  I do not think so.

This is a complex ecosystem of users, suppliers, technology, software, platforms, content creators, data (both BIG and small), regulatory forces, utilities, governments, financials, energy consumption, people, personalities, politics, company operating tenets, and community outreach, to name just a few.  On top of managing all of these variables, operators also have to keep things running with no downtime.

\Mm

Sites and Sounds of DataCentre2012: Thoughts and my Personal Favorite presentations Day 1

We wrapped our first full day of talks here at DataCentre2012 and I have to say the content was incredibly good.  A couple of the key highlights that really stuck out in my mind came from the talk given by Christian Belady, who covered some interesting bits of the Microsoft Data Center Strategy moving forward.  Of course I have a personal interest in that program, having been there for Generation 1 through Generation 4 of its evolution.  Christian covered some of the technology trends that they are incorporating into their Generation 5 facilities.  It was some very interesting stuff, and he went into deeper detail than I have heard so far around the concept of co-generation of power at data center locations.  While I personally have some doubts about the all-in costs and the immediacy of its applicability, it was great to see some deep, meaningful thought and differentiation out of the Microsoft program.  He also went into some interesting “future” visions which talked about data being the next energy source.  While he took this concept to an entirely new level, I do feel he is directionally correct.  His correlations between the delivery of “data” in a utility model rang very true to me, as I have long preached (for over 5 years now) that we are at the dawning of the Information Utility.

Another fascinating talk came from Oliver J Jones of a company called Chayora.  Few people and companies really understand the complexities and idiosyncrasies of doing business in China, let alone the development and deployment of large scale infrastructure there.  The presentation done by Mr. Jones was incredibly well done.  Articulating the size, opportunity, and challenges of working in China through the lens of the data center market, he nimbly worked in the benefits of working with a company with this kind of expertise.  It was a great way to quietly sell Chayora’s value proposition, and looking around I could tell the room was enthralled.  His thoughts and data points had me thinking and running through scenarios all day long.  Having been to many infrastructure conferences and seen hundreds if not thousands of presentations, anyone who can capture that much of my mindshare for the day is a clear winner.

Tom Furlong and Jay Park of Facebook gave a great talk on OCP with a focus on their new facility in Sweden.  They also talked a bit about their other facilities in Prineville and North Carolina as well.  With Furlong taking the mechanical innovations and Park going through the electrical, it was a great talk that created lots of interesting questions.  An incredibly captivating portion of the talk was around calculating data center availability.  In all honesty it was the first time I had ever seen this topic taken head on at a data center conference.  In my experience, like PUE, availability calculations can fall under the spell of marketing departments who truly don’t understand that there SHOULD be real math behind the calculation.  There were two interesting takeaways for me.  The first was just how much impact this portion of the talk had on the room in general.  There was an incredible number of people taking notes as Jay Park went through the equation and the way to think about it.  It led me to my second revelation: there are large parts of our industry who don’t know how to do this.  In private conversations after their talk, some people confided that they had never truly understood how to calculate this.  It was an interesting wake-up call for me to ensure I cover the basics even in my own talks.
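Since so many people admitted they had never seen the math, here is a minimal sketch of the textbook availability arithmetic (MTBF, MTTR, and series/parallel composition). It is illustrative only, with made-up component numbers, and is not necessarily the specific equation Jay Park walked through.

```python
def availability(mtbf_hours, mttr_hours):
    """Steady-state availability of a single component: uptime / (uptime + downtime)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

def series(*avail):
    """System fails if ANY component fails (e.g. utility feed -> UPS -> PDU)."""
    result = 1.0
    for a in avail:
        result *= a
    return result

def parallel(*avail):
    """System fails only if ALL redundant paths fail (e.g. 2N power paths)."""
    unavail = 1.0
    for a in avail:
        unavail *= (1.0 - a)
    return 1.0 - unavail

# Hypothetical numbers: one power path made of three series components,
# then the same path deployed 2N (two independent paths in parallel).
path = series(availability(8760, 4), availability(20000, 8), availability(50000, 2))
print(f"single path: {path:.5f}")                 # ~0.99910
print(f"2N paths:    {parallel(path, path):.7f}") # ~0.9999992
```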

After the Facebook talk it was time for me to mount the stage for the Global Thought Leadership Panel.  I was joined on stage by some great industry thinkers including Christian Belady of Microsoft; Len Bosack (founder of Cisco Systems), now CEO of XKL Systems; Jack Tison, CTO of Panduit; Kfir Godrich, VP and Chief Technologist at HP; John Corcoran, Executive Chairman of Global Switch; and Paul-Francois Cattier, Global VP of Data Centers at Schneider Electric.  That’s a lot of people and brainpower to fit on a single stage.  We really needed three times the amount of time allotted for this panel, but that is the way these things go.  Perhaps one of the most interesting recurring themes from question to question was the general agreement that, at the end of the day, great technology means nothing without the will to do something different.  There was an interesting debate on the differences between enterprise users and large scale users like Microsoft, Google, Facebook, Amazon, and AOL.  I was quite chagrined and a little proud to hear AOL named in that list of luminaries (it wasn’t me who brought it up).  But I was quick to point out that AOL is a bit different in that it has been around for 30 years and our challenges are EXACTLY like enterprise data center environments.  More on that tomorrow in my keynote, I guess.

All in all, it was a good day – there were lots of moments of brilliance in the panel discussions throughout the day.  One regret I have was with the panel regarding DCIM.  They ran out of time for questions from the audience, which was unfortunate.  People continue to confuse DCIM with BMS version 2.0 and really miss capturing the work and soft costs, let alone the ongoing commitment to the effort once started.  Additionally there is the question of what you do with mountains of collected data once you have them.  I had a bunch of questions on this topic for the panel, including whether any of the major manufacturers were thinking about building a decision engine over the data collection.  To me it’s a natural outgrowth and next phase of DCIM.  The one case study they discussed was InterXion.  It was a great effort, but I think in the end it maintained the confusion between a BMS with a web interface and true Facilities and IT integration.  Another panel, on Modularization, got some really lively discussion on feature/functionality, differentiation, and lack of adoption.  To a real degree it highlighted an interesting gulf between manufacturers (mostly represented by the panel), who need to differentiate their products, and the users, who require vendor interoperability of the solution space.  It probably doesn’t help to have Microsoft or myself in the audience when it comes to discussions around modular capacity.  On to tomorrow!

\Mm

A Well Deserved Congratulations to Microsoft Dublin DC Launch

Today Microsoft announced the launch of their premier flagship data center facility in Dublin, Ireland.  This is a huge achievement in many ways and from many angles.    While there are those who will try and compare this facility to other ‘Chiller-less’ facilities, I can assure you this facility is unique in so many ways.   But that is a story for others to tell over time.

I wanted to personally congratulate the teams responsible for delivering this marvel and acknowledge the incredible amount of work in design, engineering, and construction to make this a reality.  To Arne, and the rest of my old team at Microsoft in DCS – Way to go! 

\Mm

PS – I bet there is much crying and gnashing of teeth as the unofficial Limerick collection will now come to a close.  But here is a final one from me:

 

A Data Centre from a charming green field did grow,

With energy and server lights did it glow

Through the lifting morning fog,

An electrical Tir Na Nog,

To its valiant team – Way to Go!

Generation 4 – A deeper look

Christian Belady (our Principal Power and Cooling Architect) and David Gauthier (from one of our Data Center Engineering teams) put a post up answering some of the many questions we have been getting around our Generation 4 approach.  It’s some good additional primer information and addresses some of the recurring themes we are getting in mail.

Check out their joint blog at : http://blogs.technet.com/msdatacenters/

There is also a good video interview of them here.

/Mm

Our Vision for Generation 4 Modular Data Centers – One way of Getting it just right . . .

 

image

Data Centers are a hot topic these days. No matter where you look, this once obscure aspect of infrastructure is getting a lot of attention. For years, there have been cost pressures on IT operations and this, when the need for modern capacity is greater than ever, has thrust data centers into the spotlight. Server and rack density continues to rise, placing DC professionals and businesses in tighter and tougher situations while they struggle to manage their IT environments. And now hyper-scale cloud infrastructure is taking traditional technologies to limits never explored before and focusing the imagination of the IT industry on new possibilities.

At Microsoft, we have focused a lot of thought and research around how to best operate and maintain our global infrastructure and we want to share those learnings. While obviously there are some aspects that we keep to ourselves, we have shared how we operate facilities daily, our technologies and methodologies, and, most importantly, how we monitor and manage our facilities. Whether it’s speaking at industry events, inviting customers to our “Microsoft data center conferences” held in our data centers, or through other media like blogging and white papers, we believe sharing best practices is paramount and will drive the industry forward.  So in that vein, we have some interesting news to share.

Today we are sharing our Generation 4 Modular Data Center plan. This is our vision and will be the foundation of our cloud data center infrastructure in the next five years. We believe it is one of the most revolutionary changes to happen to data centers in the last 30 years. Joining me in writing this blog are Daniel Costello, my director of Data Center Research and Engineering, and Christian Belady, principal power and cooling architect. I feel their voices will add significant value to driving understanding around the many benefits included in this new design paradigm.

Our “Gen 4” modular data centers will take the flexibility of containerized servers—like those in our Chicago data center—and apply it across the entire facility. So what do we mean by modular? Think of it like “building blocks”, where the data center will be composed of modular units of prefabricated mechanical, electrical, security components, etc., in addition to containerized servers.

Was there a key driver for the Generation 4 Data Center?

If we were to summarize the promise of our Gen 4 design into a single sentence it would be something like this: “A highly modular, scalable, efficient, just-in-time data center capacity program that can be delivered anywhere in the world very quickly and cheaply, while allowing for continued growth as required.”  Sounds too good to be true, doesn’t it?  Well, keep in mind that these concepts have been in initial development and prototyping for over a year and are based on cumulative knowledge of previous facility generations and the advances we have made since we began our investments in earnest on this new design.

One of the biggest challenges we’ve had at Microsoft is something Mike likes to call the ‘Goldilocks Problem’.  In a nutshell, the problem can be stated as:

The worst thing we can do in delivering facilities for the business is not have enough capacity online, thus limiting the growth of our products and services.

The second worst thing we can do in delivering facilities for the business is to have too much capacity online.

This has led to a focus on smart, intelligent growth for the business — refining our overall demand picture. It can’t be too hot. It can’t be too cold. It has to be ‘Just Right!’ The capital dollars of investment are too large to make without long term planning. As we struggled to master these interesting challenges, we had to ensure that our technological plan also included solutions for the business and operational challenges we faced as well. 

So let’s take a high level look at our Generation 4 design

Are you ready for some great visuals? Check out this video at Soapbox. Click here for the Microsoft 4th Gen Video.  It’s a concept video that came out of my Data Center Research and Engineering team, under Daniel Costello, that will give you a view into what we think is the future.

image

From a configuration, constructability and time-to-market perspective, our primary goals and objectives are to modularize the whole data center. Not just the server side (like the Chicago facility), but the mechanical and electrical space as well. This means using the same kinds of parts in pre-manufactured modules, the ability to use containers, skids, or rack-based deployments, and the ability to tailor the redundancy and reliability requirements to the application at a very specific level.

image

Our goals from a cost perspective were simple in concept but tough to deliver. First and foremost, we had to reduce the capital cost per critical megawatt by the class of use.  Some applications can run with N-level redundancy in the infrastructure, while others require a little more infrastructure for support. These different classes of infrastructure requirements meant that optimizing for all cost classes was paramount.  At Microsoft, we are not a one-trick pony and have many Online products and services (240+) that require different levels of operational support. We understand that and ensured that we addressed it in our design, which will allow us to reduce capital costs by 20%-40% or greater depending upon class.

For example, non-critical or geo-redundant applications have low hardware reliability requirements on a location basis. As a result, Gen 4 can be configured to provide stripped-down, low-cost infrastructure with little or no redundancy and/or temperature control.  Let’s say an Online service team decides that, due to the dramatically lower cost, they will simply use uncontrolled outside air with temperatures ranging from 10-35 C and 20-80% RH. The reality is we are already spec’ing this for all of our servers today and working with server vendors to broaden that range even further as Gen 4 becomes a reality.  For this class of infrastructure, we can eliminate generators, chillers, UPSs, and possibly more, lowering costs relative to traditional infrastructure.

Applications that demand a higher level of redundancy or temperature control will use configurations of Gen 4 that meet those needs; however, they will also cost more (but still less than traditional data centers). We see this cost difference driving engineering behavioral change, in that we predict more applications will move towards geo-redundancy to lower costs.
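As a rough illustration of what mapping applications to a DC class might look like in practice, here is a hedged sketch in Python. The class names and the tighter set points for the second class are invented for illustration and are not Microsoft’s actual taxonomy; only the 10-35 C / 20-80% RH envelope and the eliminated components come from the text above.

```python
# Hypothetical mapping of application classes to the infrastructure they get.
# Class names and the "business_critical" values are illustrative only.
DC_CLASSES = {
    "geo_redundant": {           # non-critical or geo-redundant workloads
        "generators": False,
        "chillers": False,
        "ups": False,
        "inlet_temp_c": (10, 35),
        "relative_humidity_pct": (20, 80),
    },
    "business_critical": {       # needs tighter environmental and power control
        "generators": True,
        "chillers": True,
        "ups": True,
        "inlet_temp_c": (18, 27),
        "relative_humidity_pct": (40, 60),
    },
}

def provision(app_class: str) -> dict:
    """Return the infrastructure profile a Gen 4 module would be configured with."""
    return DC_CLASSES[app_class]

print(provision("geo_redundant"))   # stripped-down: no generators, chillers, or UPS
```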

Another cool thing about Gen 4 is that it allows us to deploy capacity when our demand dictates it.  Once finalized, we will no longer need to make large upfront investments. Imagine driving capital costs more closely in line with actual demand, thus greatly reducing the time-to-market for adding capacity online, a benefit inherent in the design.  Also reduced is the amount of construction labor required to put these “building blocks” together. Since the entire platform requires pre-manufacture of its core components, on-site construction costs are lowered. This allows us to maximize our return on invested capital.

 

image

In our design process, we questioned everything. You may notice there is no roof, and some might be uncomfortable with this. We explored the need for one, and throughout our research we got some surprising (positive) results that showed one wasn’t needed.

In short, we are striving to bring Henry Ford’s Model T factory to the data center. http://en.wikipedia.org/wiki/Henry_Ford#Model_T.  Gen 4 will move data centers from a custom design and build model to a commoditized manufacturing approach. We intend to have our components built in factories and then assemble them in one location (the data center site) very quickly. Think about how a computer, car or plane is built today. Components are manufactured by different companies all over the world to a predefined spec and then integrated in one location based on demands and feature requirements.  And just like Henry Ford’s assembly line drove the cost of building and the time-to-market down dramatically for the automobile industry, we expect Gen 4 to do the same for data centers. Everything will be pre-manufactured and assembled on the pad.

image

And did we mention that this platform will be, overall, incredibly energy efficient? From a total energy perspective not only will we have remarkable PUE values, but the total cost of energy going into the facility will be greatly reduced as well.  How much energy goes into making concrete?  Will we need as much of it?  How much energy goes into the fuel of the construction vehicles?  This will also be greatly reduced! A key driver is our goal to achieve an average PUE at or below 1.125 by 2012 across our data centers.  More than that, we are on a mission to reduce the overall amount of copper and water used in these facilities. We believe these will be the next areas of industry attention when and if the energy problem is solved. So we are asking today…“how can we build a data center with less building”?

image

We have talked openly and publicly about building chiller-less data centers and running our facilities using aggressive outside economization. Our sincerest hope is that Gen 4 will completely eliminate the use of water. Today’s data centers use massive amounts of water and we see water as the next scarce resource and have decided to take a proactive stance on making water conservation part of our plan. 

By sharing this with the industry, we believe everyone can benefit from our methodology.  While this concept and approach may be intimidating (or downright frightening) to some in the industry, disclosure ultimately is better for all of us. 

Gen 4 design (even more than just containers), could reduce the ‘religious’ debates in our industry. With the central spine infrastructure in place, containers or pre-manufactured server halls can be either AC or DC, air-side economized or water-side economized, or not economized at all (though the sanity of that might be questioned).  Gen 4 will allow us to decommission, repair and upgrade quickly because everything is modular. No longer will we be governed by the initial decisions made when constructing the facility. We will have almost unlimited use and re-use of the facility and site. We will also be able to use power in an ultra-fluid fashion moving load from critical to non-critical as use and capacity requirements dictate. 

Finally, we believe this is a big game changer. Gen 4 will provide a standard platform that our industry can innovate around. For example, all modules in our Gen 4 will have common interfaces clearly defined by our specs and any vendor that meets these specifications will be able to plug into our infrastructure.  Whether you are a computer vendor, UPS vendor, generator vendor, etc., you will be able to plug and play into our infrastructure. This means we can also source anyone, anywhere on the globe to minimize costs and maximize performance.  We want to help motivate the industry to further innovate—with innovations from which everyone can reap the benefits. 
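To illustrate conceptually what “common interfaces clearly defined by our specs” could look like, here is a minimal sketch. The field names and values are invented for illustration and are not the actual Gen 4 specification.

```python
from dataclasses import dataclass

@dataclass
class ModuleInterface:
    """Hypothetical plug-and-play contract a vendor module would have to meet.

    In principle, any container, UPS, or generator module exposing these
    interfaces could attach to the central spine regardless of who built it.
    """
    power_connection: str        # e.g. "480V 3-phase, busway tap-off"
    cooling_connection: str      # e.g. "chilled water quick-connect" or "air-side"
    network_uplink: str          # e.g. "redundant fiber pair"
    footprint_m: tuple           # (length, width) on the pad
    monitoring_protocol: str     # e.g. "points list per spec"

server_pac = ModuleInterface(
    power_connection="480V 3-phase, busway tap-off",
    cooling_connection="air-side economized",
    network_uplink="redundant fiber pair",
    footprint_m=(12.2, 2.4),     # roughly a 40-foot container
    monitoring_protocol="points list per spec",
)
print(server_pac)
```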

To summarize, the key characteristics of our Generation 4 data centers are:

  • Scalable
  • Plug-and-play spine infrastructure
  • Factory pre-assembled: Pre-Assembled Containers (PACs) & Pre-Manufactured Buildings (PMBs)
  • Rapid deployment
  • De-mountable
  • Reduced TTM (time-to-market)
  • Reduced construction
  • Sustainable measures
  • Map applications to DC Class

image

We hope you join us on this incredible journey of change and innovation!

Long hours of research and engineering time have been invested in this process. There are still some long days and nights ahead, but the vision is clear. Rest assured, however, that as we refine Generation 4, the team will soon be looking to Generation 5 (even if it is a bit farther out).  There is always room to get better.

So if you happen to come across Goldilocks in the forest, and you are curious as to why she is smiling you will know that she feels very good about getting very close to ‘JUST RIGHT’.   

Generations of Evolution – some background on our data center designs

We thought you might be interested in understanding what happened in the first three generations of our data center designs. When Ray Ozzie wrote his Software plus Services memo, it posed a very interesting challenge to us. The winds of change were at ‘tornado’ proportions.  That “plus Services” tag had some significant (and unstated) challenges inherent to it.  The first was that Microsoft was going to evolve even further into an operations company.  While we had been running large scale Internet services since 1995, this development led us to an entirely new level.  Additionally, these “services” would span across both Internet and Enterprise businesses. To those of you who have to operate “stuff”, you know that these are two very different worlds in operational models and challenges. It also meant that, to achieve the required level of reliability and performance, our infrastructure was going to have to scale globally and in a significant way.

It was in that intense atmosphere of change that we first started re-evaluating data center technology and processes in general, and our ideas began to reach farther than what was accepted by the industry at large. This was the era of Generation 1.  As we look at where most of the world’s data centers are today (and where our facilities were), it represented all the known learning and design requirements that had been in place since IBM built the first purpose-built computer room. These facilities focused more around uptime, reliability and redundancy. Big infrastructure was held accountable to solve all potential environmental shortfalls. This is where the majority of infrastructure in the industry still is today.

We soon realized that traditional data centers were quickly becoming outdated. They were not keeping up with the demands of what was happening technologically and environmentally.  That’s when we kicked off our Generation 2 design. Gen 2 facilities started taking into account sustainability, energy efficiency, and really looking at the total cost of energy and operations. No longer did we view data centers just for the upfront capital costs, but we took a hard look at the facility over the course of its life.  Our Quincy, Washington and San Antonio, Texas facilities are examples of our Gen 2 data centers where we explored and implemented new ways to lessen the impact on the environment. These facilities are considered two leading industry examples, based on their energy efficiency and ability to run and operate at new levels of scale and performance by leveraging clean hydro power (Quincy) and recycled waste water (San Antonio) to cool the facility during peak cooling months.

As we were delivering our Gen 2 facilities into steel and concrete, our Generation 3 facilities were rapidly driving the evolution of the program. The key concepts for our Gen 3 design are increased modularity and a greater concentration on energy efficiency and scale.  The Gen 3 facility will be best represented by the Chicago, Illinois facility currently under construction.  This facility will seem very foreign compared to the traditional data center concepts most of the industry is comfortable with. In fact, if you ever sit around in our container hangar in Chicago, it will look incredibly different from a traditional raised-floor data center. We anticipate this modularization will drive huge efficiencies in terms of cost and operations for our business. We will also introduce significant changes in the environmental systems used to run our facilities.  These concepts and processes (where applicable) will help us gain even greater efficiencies in our existing footprint, allowing us to further maximize infrastructure investments.

This is definitely a journey, not a destination, for our industry. In fact, our Generation 4 design has been under heavy engineering for viability and cost for over a year.  While the demands of our commercial growth required us to make investments as we grew, we treated each step in the learning as a process for further innovation in data centers.  The design for our future Gen 4 facilities enabled us to make visionary advances that addressed the challenges of building, running, and operating facilities all in one concerted effort.

/Mm/Dc/Cb

In disappointment, there is opportunity. . .

I was personally greatly disappointed with the news coming out of last week that the Uptime Institute had branded Microsoft and Google as the enemy of traditional data center operators.  To be truthful, I did not give the reports much credence, especially given our long and successful relationship with that organization.  However, when our representatives to the event returned and corroborated the story, I have to admit that I felt more than a bit let down.

As reported elsewhere, there are some discrepancies in how our mission was portrayed versus the reality of our position.  One of the primary messages of our cloud initiatives is that there is a certain amount of work/information that you will want to access via the cloud, and there is some work/information that you will want to keep private.  It’s why we call it SOFTWARE + SERVICES.  There are quite a few things people just would not feel comfortable running in the cloud.  We are doing this (data center construction and operation) because the market, competitive forces, and our own research are driving us there.  I did want to address some of the misconceptions coming out of that meeting, however:

On PUE, Measurement, and our threat to the IT industry

The comments that Microsoft and Google are the biggest threat to the IT industry and that Microsoft is “making the industry look bad by putting our facilities in areas that would bring the PUE numbers down” are very interesting.  First, as mentioned before, please revisit our Software + Services strategy; it’s kind of hard to be a threat if we are openly acknowledging the need for corporate data centers in our expressed strategy.  I can assure you that we have no intention of making anyone look “bad”, nor do we in any way market our PUE values.  We are not a data center real estate firm and we do not lease out our space, where this might even remotely be a factor.

While Microsoft believes in economization (both water and air-side), not all of our facilities employ this technology.  In fact, if a criticism does exist, it’s that we believe it’s imperative to widen your environmental envelopes as far as you can.  Simply stated – run your facilities hotter!

The fact of the matter is that Microsoft has invested in both technology and software to allow us to run our environments more aggressively than a traditional data center environment.  We understand that certain industries have very specific requirements around the operation and storage of information which drive and dictate certain physical reliability and redundancy needs.  I have been very vocal about getting the best PUE for your facility.  Our targets are definitely unrealistic for the industry at large, but the goal of driving the most efficiency you can out of your facilities is something everyone should be focused on.

It was also mentioned that we do not measure our facilities over time, which is patently untrue.  We have years and years’ worth of measured information for our facilities, with multiple measurements per day.  We have been fairly public about this and have produced specifics on numbers (including at the Uptime Symposium last year), which makes this somewhat perplexing.

On Bullying the Industry

If the big cloud players are trying to bully the industry with money and resources, I guess I have to ask – to what end?  Does this focus on energy efficiency equate to something bad?  Aside from the obvious corporate responsibility of using resources wisely and lowering operating costs, the visibility we are bringing to this space is not inherently bad.  Given the energy constraints we are seeing across the planet, a focus on energy efficiency is a good thing.

Let’s not Overreact, There is yet Hope

While many people (external and internal) approached me about pulling out of the Uptime organization entirely, or even suggested that we create a true not-for-profit end user forum motivated by technology and operations issues alone, I think it’s more important to stay the course.  As an industry we have so much yet to accomplish.  We are at the beginning of some pretty radical changes in technology, operations, and software that will define our industry in the coming decades.  Now is not the time to splinter, but instead to redouble our efforts to work together in the best interests of all involved.

Instead of picking apart the work done by the Green Grid and attacking the PUE metric by and large, I would love to see Uptime and Green Grid working together to give some real guidance.  Instead of calling out that PUEs of 1.2 are unrealistic for traditional data center operators, would it not be more useful for Uptime and Green Grid to produce PUE targets and ranges associated with each Uptime Tier?  In my mind that would go a long way toward driving the standardization of reporting and reducing ridiculous marketing claims of PUE.

This industry is blessed with two organizations full of smart people attacking the same problem set.  We will continue our efforts through the Microsoft Data Center Experience (MDX) events, conferences, and white-papers to share what we are doing in the most transparent way possible.

/Mm

Out of the Box Paradox – Manifested (aka Chicago Area Data Center begins its journey)

clip_image001

With modern conventional thinking and untold management consultants coaching people to think outside the box, I find it humorous that we have actually physically manifested an “Out of the Box Paradox” in Chicago.  

What is an Out of the Box Paradox you ask?  Well I will refer to Wikipedia on this one for a great example:

“The encouragement of thinking outside the box, however, has possibly become so popular that thinking inside the box is starting to become more unconventional.  This kind of “going against the grain means going with the grain” mentality causes a paradox in that there may be no such thing as conventionality when unconventionality becomes convention.”

The funny part here is that we are actually doing this with….you guessed it…..boxes. Today we finished the first phase of construction and we are rolling into the testing of container-based deployments.  Our facility in Chicago is our first purpose-built data center to accommodate containers on a large scale.  It has been an incredibly interesting journey.  The challenges of solving things that have never been done before are many.  We even had to create our own container specification, one specifically with the end-user in mind to ensure we maximized the cost and efficiency gains possible, not to mention standard blocking and tackling issues like standardizing power, water, network and other interfaces.  All sorts of interesting things have been discovered, corrected, and perfected.  From electrical harmonics issues to streamlining materials movement, to whole new operational procedures.

Chicago Container Spaces with load banks

The facility is already simply amazing and it’s a wonder to behold. Construction kicked off only one year ago and when completed it will have the capacity to scale to hundreds of thousands of servers which can be deployed (and de-commissioned as needed) very quickly.  The joke we use internally is that this is not your mother’s data center.  You get that impression from the first moment you step into the “hangar bay” on the first floor. The “hangar’s” first floor will house the container deployments and I can assure you it is like no data center you have ever seen.  It’s one more step to the industrialization of the IT world, or at least the cloud-scale operations space.  To be fair, and it’s important to note, only one half of the total facility is ready at this point, but even half of this facility is significant in terms of total capacity.

That “Industrialization of IT” is one of the core tenets of my mission at Microsoft. Throwing smart bodies at dumb problems is not really smart at all. The real quest is how to drive innovation and automation into everything that you do to reduce the amount of work that needs to be performed by humans.  Dedicate your smart people to solving hard problems.  It’s more than a mission, it’s a philosophy deeply rooted in our organization.  Besides, industry numbers tell us that humans are the leading cause of outages in data center facilities. :) Our Chicago facility is a huge step in driving that industrialization forward.  It truly represents an evolution and demonstrates what could happen when you blend the power of software with breakthrough, innovative design and engineering. Even for buildings!

 Chicago Container Spines being constructed

I have watched with much interest the back and forth on containers in the media, in the industry, and the interesting uses being proposed by the industry. The fact of the matter is that Containers are a great “Out of the Box Paradox” that really should not be terribly shocking to the industry at large. 

The idea of “containment” is almost as old as mechanical engineering and thermodynamics itself. Containment gives you the ability to manage the heat, or lack thereof, more effectively in individual ecosystems. Forward-looking designers have been doing “containment” for a long time. So going back to the “out of the box is in the box” paradox, the concept is not terribly new.  It’s the application at our scale, and specifically to the data center world, that is most interesting.

It allows us to get out of the traditional decision points common to the data center industry, in that certain infrastructure decisions actually reside in the container itself, which allows for a much quicker refresh cycle of key components and the ability to swap in the next greatest technology rapidly.  Therefore, by default, it allows us to deploy our capital infrastructure much more closely aligned with actual need versus the large step functions one normally sees in data center construction (build a large expensive facility and fill it up over time, versus build capacity out as you need it).  This allows you to better manage costs, better manage your business, and gives you the best possible ramp for technology refresh.  You don’t particularly care if it’s AC or DC, if it’s water cooled or air cooled.  Our metrics are simple – give us the best performing, most efficient, lowest TCO technology to meet our needs. If today that’s AC, great.  Tomorrow DC?  Fantastic.  Do I want to be able to do a bake-off between the two?  Sure. I don’t have to reinvest huge funds in my facilities to make those changes.

For those of you who have real lives and have not been following the whole container debate, here is a quick recap -

  1. Microsoft is using standard 40 foot shipping containers for the deployment of servers in support of the software + services strategy and in support of our cloud services infrastructure initiatives.
  2. The containers can house as many as 2500 servers achieving a density of 10 times the amount of compute in the equivalent space in a traditional data center.
  3. We believe containers offer huge advantages at scale in terms of both initial capital and ongoing operating costs.
  4. This idea has met some resistance in the industry. As highlighted by my interesting back and forth with Eric Lai from Computerworld magazine. Original article can be found here, with my “Anthills” response found here.
  5. Chicago represents one of the first purpose-built container facilities ever.

To be clear, as I have said in the past, containers are not for everyone, but they are great for us.

The other thing which is important is the energy efficiency of the containers. Now I want to be careful here, as the reporting of efficiency numbers can be a dangerous exercise in the blogosphere. But our testing shows that our containers in Chicago can deliver an average PUE of 1.22 with an AVERAGE ANNUAL PEAK PUE of 1.36. I break these two numbers out separately because there is still some debate (at least in the circles I travel in) on which of these metrics is more meaningful.  Regardless of your position on which is more meaningful, you have to admit those numbers are pretty darn compelling.

image

For the purists and math-heads out there, Microsoft includes house lighting and office loads in our PUE calculation. They are required to run the facility so we count them as overhead.
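For those same math-heads, here is a minimal sketch of the arithmetic in Python. The load breakdown is hypothetical (it is not a measurement from Chicago, though it is chosen to land on the published 1.22 average); the point is simply that lighting and office loads sit in the numerator as facility overhead, not in the IT denominator.

```python
def pue(it_load_kw, cooling_kw, power_losses_kw, lighting_kw, office_kw):
    """Power Usage Effectiveness = total facility power / IT equipment power.

    House lighting and office loads count as facility overhead (numerator),
    consistent with counting everything required to run the facility.
    """
    total_facility_kw = it_load_kw + cooling_kw + power_losses_kw + lighting_kw + office_kw
    return total_facility_kw / it_load_kw

# Hypothetical container hall snapshot (illustrative breakdown only):
print(round(pue(it_load_kw=10_000, cooling_kw=1_400,
                power_losses_kw=600, lighting_kw=120, office_kw=80), 2))
# -> 1.22, i.e. 22% overhead on top of every watt delivered to the servers
```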

On the “Sustainability” side of containers, it’s also interesting to note that shipping 2500 servers in one big container meaningfully reduces the CO2 related to transportation, not to mention the amount of packaging material eliminated.

So in my mind, containers are driving huge cost and efficiency gains (read: cost benefits in addition to “green” benefits) for the business.  This is an extremely important point: as Microsoft expands its data center infrastructure, it is essential that we follow an established smart-growth methodology for our facilities that is designed to prevent overbuilding, and thus avoid the associated costs to the environment and to our shareholders.  We are a business after all.  We must do all of this while also meeting the rapidly growing demand for Microsoft’s Online and Live services.

Containers and this new approach are definitely a change in how facilities have traditionally been developed, and as a result many people in our industry are intimidated by them.  But they shouldn’t be. Data centers have not changed in fundamental design for decades.  Sometimes change is good. Any new idea is always met with resistance, but with a little education things change over time.

In that vein, we are looking at holding our second Microsoft Data Center Experience (MDX) event in Chicago in the Spring/Summer of 2009.  Our first event, held in San Antonio, was basically an opportunity for a couple hundred Microsoft enterprise customers to tour our facilities, ask all the questions they wanted, interact with our data center experts (mechanical, electrical, operations, facilities management, etc.), and generally get a feel for our approach. It’s not that ours is the right way, or the wrong way…just our way.  Think of it as an Operations event for Operations people, by Operations people.

It’s not glamorous; there are no product pitches, no slick brochures, no hardware hunks or booth babes, but hopefully it’s interesting.  That first event was hugely successful, with incredible feedback from our customers. As a result, we decided to do the same thing in Chicago with the very first container data center.  Which of course makes things a bit tricky.  While the facility will be going through a rigorous testing phase from effectively now moving forward, we thought it better to ensure that any and all construction activity is formally complete before we move large groups of people through the facility, for safety’s sake.  Plus, I don’t think I have enough hard hats and safety gear for you all.

So if you attended MDX-San Antonio and really want to drill deeper in on containers, in a facility custom built for them, or would like to attend just to ask questions, look for details from your Microsoft account management team or your local Microsoft sales office next Spring. (Although it’s not a sales event, you are more likely to reach someone there faster than calling into Global Foundation Services directly; after all, we have a global infrastructure to run.)

/Mm