Uptime Institute Interview with Dr. Jonathan Koomey
Dr. Jonathan Koomey, a consulting professor at Stanford University's Civil and Environmental Engineering school and a project scientist at Lawrence Berkeley National Laboratory, sat down with the Institute to talk about his upcoming research into data center efficiency and the problems facing large corporations.

The following are excerpts from the interview:

 

INSTITUTE: You've been doing a lot of research into energy efficiency. What does your research show?

KOOMEY: I've been investigating options for improving the energy efficiency of data centers as a way to reduce the total cost of ownership for these energy-intensive facilities.

INSTITUTE: Total cost of ownership includes many things, but the cost of cooling and getting the heat out of those systems is a new twist, right?

KOOMEY: It's true, total cost of ownership has been defined in different ways. For our analysis we're looking at the cost of installing the information technology (IT) equipment, the power delivery systems, and the cooling equipment. It also includes the cost of electricity and the other operating expenses. Generally, we're not counting the cost of the software because that's not something these efficiency operations affect. We really want to get to the items where we see large and growing costs and where we can have influence over those costs. The most rapidly growing cost has been that for cooling and power infrastructure.

 

INSTITUTE: So what do your trend lines show?

KOOMEY: We're analyzing the trend lines for the amount of computation, the amount of energy used for computation and the cost of that computation. Both energy use and cost per computation are going down, but the cost per computation for IT equipment is going down much faster than the energy use per computation. Ken Brill talks about it in terms of how many watts of power you need for every $1,000 of expenditure on IT hardware. If there are more watts per $1,000 of IT equipment, that means you have to pay for the cooling and the power delivery for those watts. More and more watts means you're spending more on infrastructure, so infrastructure as a percentage of the total cost of the data center is growing. Right now site infrastructure and IT costs are roughly comparable in magnitude for typical business data centers, which means that if you build a data center, roughly half the cost of the facility is the servers and other IT equipment. The other half is the cost of the site infrastructure for cooling and power. But as the power draw per $1,000 of IT expenditure goes up, infrastructure becomes a greater and greater portion of the total cost.
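
[Editor's illustration] The relationship described above can be sketched in a few lines of Python. This is a minimal sketch, not the model discussed in the interview: the function name, the $10-per-watt infrastructure cost, and the wattage values are all hypothetical placeholders, chosen only so that the starting point matches the roughly even split between IT and site infrastructure.

```python
def infrastructure_share(watts_per_1000_usd: float, infra_usd_per_watt: float) -> float:
    """Fraction of installed capital cost that goes to site infrastructure.

    Both arguments are hypothetical inputs for illustration, not figures
    from the interview.
    """
    it_cost = 1000.0  # normalise to $1,000 of IT hardware
    infra_cost = watts_per_1000_usd * infra_usd_per_watt  # power and cooling capacity
    return infra_cost / (it_cost + infra_cost)


# Assumed $10 of power and cooling capacity per watt (placeholder value).
for watts in (100, 200, 300):
    share = infrastructure_share(watts, infra_usd_per_watt=10.0)
    print(f"{watts} W per $1,000 of IT -> infrastructure share {share:.0%}")
```

Under these assumed inputs, tripling the watts per $1,000 of IT hardware moves the infrastructure share from one half to three quarters of installed cost.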

 

INSTITUTE: Let's fast forward five years. You're running a data center and you want to install new IT equipment. What does that picture look like?

KOOMEY: There are always uncertainties in such forecasts, but if current trends continue, more than three quarters of the total installed costs will be for site infrastructure. Five or 10 years ago, IT costs were the main costs. Now site infrastructure capital costs rival the IT costs, and soon they will exceed the IT costs. The problem, of course, is that most companies don't do a good job of analyzing or minimizing the total costs of ownership for data center installations. Most data center installations are driven by business needs, which are then translated to IT needs. Only at the end of this process are facilities costs usually considered. This works fine when IT costs are the main capital costs, but it causes big financial problems for many companies when site infrastructure costs become significant. The fundamental organizational problem is that no one is responsible for minimizing the total cost of owning and operating the data center. The IT budgets and the facilities budgets are usually separate, or 'siloed,' and those two parts of the organization don't usually communicate and coordinate as well as the CEO might prefer. The people in IT are making decisions that are totally rational from their perspective, but they're committing the company to a lot of expenditures that don't come out of IT's budget. And that creates problems for other parts of the organization. This is what I call a perverse incentive problem.

 

INSTITUTE: Something fundamental inside of corporations has shifted. At one point, the silos they erected made sense. What changed?

KOOMEY: The consequences of having these silos are much different than 10 years ago, when IT was the major cost. Typically, IT expenditures drive the data center. When infrastructure was 10 percent of the total capital cost, that wasn't a big deal. If it's three-quarters of the total capital cost, it creates an expenditure problem for the company.

 

INSTITUTE: In the past, many companies looked at function versus process. Process tends to cut across multiple departments, so you can assess cost to various departments. The functional approach apparently doesn't work anymore, right?

KOOMEY: In the past, the driver was the part that owned and operated IT equipment (focused mainly on function). In the future, it will have to be someone in the financial organization, to whom the IT and facilities people report as part of a whole systems process. Their budgets all have to go up to the CFO, presumably. All the people who are critical to the data center will have to be in the room when big decisions are made. Typically what happens is that the IT folks make purchase decisions, the servers show up on the loading dock on Friday afternoon, and the facilities guys pull their hair out because they didn't know the IT equipment was coming.

 

INSTITUTE: How can that lack of communication be allowed inside a company?

KOOMEY: That's the effect of siloing. If the budgets are set up by department, you have this problem. But there are also social issues of prestige and influence. The IT folks often have a bit more clout in many companies than facilities folks do, even though both are critical to making the data center work well.

 

INSTITUTE: Are those silos breaking down?

KOOMEY: They have to break down to deal with this collision with the infrastructure costs. But different companies are addressing the issues at different rates.

 

INSTITUTE: If all of this is based on total cost of ownership, let's define that. Where does it start and where does it end?

KOOMEY: Total cost of ownership starts with IT equipment expenditures. It also includes site infrastructure costs, which are the cost of the uninterruptible power supply and the power distribution units, the raised floor area, the air conditioning units and the cabling. There's IT equipment broadly defined, which includes servers, storage devices, and networking equipment such as switches and routers. Part of the infrastructure delivers cooling and power, part of it moves information to and fro. Then there are operating costs, like electricity, labor, property taxes, network fees, etc. Those are the elements of the total cost we're addressing here. You asked what companies need to do to minimize the total cost. We've already talked about the need to combine and rationalize budgets. Companies also have to develop metrics for understanding energy efficiency and performance for both IT equipment and site infrastructure. You need to be able to make intelligent choices, and you can't do that without metrics. These metrics need to be defined by some sort of neutral third party, with procedures well defined, so that all companies that produce IT equipment, for example, will report the energy use and performance information in a standardized way. Then it will be easy to do comparisons between servers from different companies. It isn't easy now.

 

INSTITUTE: Total cost of ownership can vary by geography, right?

KOOMEY: Absolutely. It depends on climate and electricity rates, as well as labor costs, which all vary regionally.

 

INSTITUTE: Can you give us some more details on the metrics that are needed?

KOOMEY: You need metrics to buy the equipment in the first place, as well as metrics to help you operate the data center effectively. Right now, there are not widely used metrics on either side. There are discussions about which metrics should be used, and the SPECpower group will be coming out with metrics that will allow purchasers to say which machine they want. That's the first step. On the infrastructure side, you need to be able to rank the efficiency of your infrastructure equipment compared to other data centers in your company and other facilities with comparable activity level in other parts of the country. Once you create rankings, it creates a force for continuous improvement.

 

INSTITUTE: Are the metrics enough to make good choices in a complicated environment?

KOOMEY: You need both the metrics and a cost model to determine what the costs depend on. If you increase the IT load by a certain amount, how much more do you have to spend on infrastructure? You need to have some sort of simple model so you can do the tradeoffs. Right now, there is no simple model, but I'm working on one for [The Uptime Institute] right now.

 

INSTITUTE: So we gather this is going to be like an energy rating on an appliance, where it tells you how much it will cost to run for a year?

KOOMEY: That boils it down to the essence. You want to be able to give people a sense that a data center with these characteristics will cost this much. Here's how much it will cost to buy, here's how much it will cost to operate. Consider the power supply in a low-end server. The power supply takes AC power and turns it into DC because that's what the chips use. Those power supplies typically are 75 to 80 percent efficient. You can get them 88 to 90 percent efficient, but people don't buy those for the bulk of the market because people making the servers think they're competing on first costs. In the market, without metrics, they are competing on first cost. But if you spend $20 or $30 on a more efficient power supply, that will pay for itself five times over in saved infrastructure costs. If you give them a simple model to compute the total costs, the person buying the server will know they will save $100 if they pay the extra $20 for the server, so that's an easy case to make, assuming that the person buying the server sees that benefit in his budget.
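
[Editor's illustration] The power supply argument can be worked through as arithmetic. The sketch below is only an illustration under stated assumptions: the efficiencies come from the ranges quoted above, while the 100-watt DC load, the $10-per-watt infrastructure cost, and the $25 price premium are hypothetical values that do not come from the interview.

```python
def input_watts(dc_load_watts: float, efficiency: float) -> float:
    """AC power drawn from the wall to deliver a given DC load."""
    return dc_load_watts / efficiency


def infrastructure_savings(dc_load_watts: float, eff_standard: float,
                           eff_efficient: float, infra_usd_per_watt: float) -> float:
    """Infrastructure capital avoided by the watts the better supply saves."""
    watts_saved = (input_watts(dc_load_watts, eff_standard)
                   - input_watts(dc_load_watts, eff_efficient))
    return watts_saved * infra_usd_per_watt


# Hypothetical inputs: 100 W DC load, $10 of infrastructure per watt,
# $25 extra for the efficient supply. Efficiencies are from the quoted ranges.
savings = infrastructure_savings(100.0, eff_standard=0.80, eff_efficient=0.90,
                                 infra_usd_per_watt=10.0)
premium = 25.0
print(f"Infrastructure cost avoided: ${savings:.2f}")
print(f"Savings per dollar of premium: {savings / premium:.1f}x")
```

With these placeholder inputs the avoided infrastructure cost is several times the price premium, which is the shape of the argument above. Real results depend on the actual server load and the site's cost per watt.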

 

INSTITUTE: How widespread is this first-cost, sticker-shock mentality?

KOOMEY: It's pervasive throughout the economy. The first cost issue is something that has to be faced in the data center because the cost is so large. But this is a solvable issue. Once the true costs are seen at the top level of the company, people will start making more intelligent decisions.

 

INSTITUTE: Is that because these are long-term investments?

KOOMEY: Yes, these are mission-critical purchases. They are designed for 15 years, even though IT equipment in the data centers will turn over more frequently than that—typically every three or four years. The IT budget usually is a fixed budget. If you say you have to spend another 10 percent for servers that means you get 10 percent fewer servers. Spending more may be the right economic choice when you think in terms of total cost, but that doesn't matter to IT now. They're going to use up IT budget and buy as many servers as they can.

 

INSTITUTE: So the charge-back structure for departments has to change?

KOOMEY: Up until now, people were charged by the square foot for space in the data center. Increasingly they have to consider the number of kilowatts consumed, which will almost certainly drive some institutional changes.
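
[Editor's illustration] The difference between the two charge-back bases can be shown with a proportional allocation. This is a minimal sketch: the department names, floor areas, power draws, and annual cost are all made up for illustration, and a real charge-back scheme would likely blend several cost drivers.

```python
def allocate(total_cost: float, usage_by_dept: dict) -> dict:
    """Split a cost across departments in proportion to a usage measure."""
    total_usage = sum(usage_by_dept.values())
    if total_usage <= 0:
        raise ValueError("total usage must be positive")
    return {dept: total_cost * usage / total_usage
            for dept, usage in usage_by_dept.items()}


# Hypothetical departments: equal floor space, very different power draw.
floor_area_sqft = {"dept_a": 500, "dept_b": 500}
power_kw = {"dept_a": 20, "dept_b": 80}
annual_cost = 1_000_000.0  # placeholder annual data center cost

by_area = allocate(annual_cost, floor_area_sqft)
by_power = allocate(annual_cost, power_kw)
for dept in floor_area_sqft:
    print(f"{dept}: by area ${by_area[dept]:,.0f}, by power ${by_power[dept]:,.0f}")
```

In this made-up case both departments pay the same under a square-foot charge, while a kilowatt-based charge shifts most of the cost to the department that draws most of the power.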

 

INSTITUTE: How will these ratings be applied? Will it be a study with tables or will they be listed on the servers, almost like the energy use ratings on white goods?

KOOMEY: Servers are complicated, so it will probably be a chart that says if you're computing at full load, here's the energy use. If you're computing at 50 percent or 20 percent, here's the energy use. The people buying servers are generally pretty sophisticated. But one of the dirty little secrets of data centers is that most servers run at 5 or 10 percent of their computing load. This results in a lot of valuable capital sitting around doing nothing, which is one of the reasons virtualization has gained a foothold. If you're running one application on one server at 5 percent load, on average, you could be running six or eight applications on that server and still not be taxing the server's capabilities. There really is a large potential for reducing inefficiencies in capital use, and also energy use, by virtualization.
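
[Editor's illustration] The consolidation arithmetic can be sketched as follows. The 5 percent per-application load is from the answer above; the 40 percent utilization ceiling and the count of 100 applications are hypothetical assumptions, and the sketch ignores virtualization overhead, peak loads, and memory or I/O limits.

```python
import math


def apps_per_host(per_app_load_pct: int, ceiling_pct: int) -> int:
    """How many applications fit on one host under a utilization ceiling."""
    return ceiling_pct // per_app_load_pct


def hosts_needed(n_apps: int, per_app_load_pct: int, ceiling_pct: int) -> int:
    """Hosts required after consolidation (one app per host before)."""
    return math.ceil(n_apps / apps_per_host(per_app_load_pct, ceiling_pct))


# 5% average load per application (from the interview);
# 40% ceiling and 100 applications are assumed for illustration.
n_apps = 100
print("Applications per host:", apps_per_host(5, 40))
print("Hosts before:", n_apps, "after:", hosts_needed(n_apps, 5, 40))
```

A lower ceiling of 30 percent would give six applications per host, which brackets the "six or eight" range mentioned above.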

 

INSTITUTE: Once you implement virtualization, what's the theoretical improvement in efficiency on total cost of ownership?

KOOMEY: I don't know if anyone has actually calculated that number. It's very application-specific.

 

INSTITUTE: Who runs the data center of the future, when all these metrics are in place?

KOOMEY: The Uptime Institute talks about Integrated Critical Environment teams, where you bring in financial people, facilities people and IT people on decisions about how the data center is operated. You will need to have all the key players at the table. In terms of the day-to-day operations, the IT folks will make decisions about what equipment is used and what applications are run, and the facilities folks will decide when to take something down for maintenance and backup and how to deal with an electricity outage. Each part of the organization will have its responsibilities. But ultimately, to preserve the reliability of the data center, they will need to have Integrated Critical Environment teams—or some other way to coordinate decisions related to the data center.

 

INSTITUTE: How do you calculate energy usage and cost? What are the factors you analyze?

KOOMEY: For a specific data center, you need to know the complement of equipment. You start out with square feet of electrically active floor area. That's the area where you put the server racks.

 

INSTITUTE: Does it matter how they're arranged?

KOOMEY: Yes. It matters for air flow and heat dissipation. You start with the amount of IT equipment. That tells you how much heat needs to be removed. Then you assess the physical geometry and the air flows, which will help you calculate how much cooling you need.

 

INSTITUTE: Does that make each data center unique?

KOOMEY: Yes, each of them is different, but for purposes of understanding the underlying relationships you can abstract the key characteristics. That's what the simple model for total cost of ownership is all about. You can create a simple model in a spreadsheet that most data center operators will look at and say, 'Okay, that's not exactly like my facility, but it looks very similar, and if I tweak this parameter and this parameter it will be close enough.' For purposes of understanding the relationships, you can poll the wisdom of Ken Brill and Bob Sullivan and Pitt Turner and compile that in a way that other people will find useful at the high level. You're not going to use it to design a data center, but maybe the CFO will use it to get a handle on IT energy use and IT costs, and infrastructure energy use and infrastructure cost. It gives them the basic tools to understand what's going on.

 

INSTITUTE: This sounds like it's going to force a restructuring inside of large corporations, which in the past were driven by decisions made at the top. Now it looks as if reality will creep up from the basement.

KOOMEY: Exactly. In this particular case, there's no way to hide it anymore. The only way to solve the problem is to change the way data center management is organized.