Since the electricity bill of a data center constitutes a significant portion of its overall operational costs, reducing this cost has become important. We investigate cost reduction opportunities that arise from using uninterruptible power supply (UPS) units as energy storage devices. This represents a deviation from the usual use of these devices as mere transitional fail-over mechanisms between utility and captive sources such as diesel generators. We consider the problem of opportunistically using these devices to reduce the time average electric utility bill in a data center. Using the technique of Lyapunov optimization, we develop an online control algorithm that can optimally exploit these devices to minimize the time average cost. This algorithm operates without any knowledge of the statistics of the workload or electricity cost processes, making it attractive in the presence of workload and pricing uncertainties. An interesting feature of our algorithm is that its deviation from optimality shrinks as the storage capacity is increased. Our work opens up a new area in data center power management.
C.4 [Performance of Systems]: Modeling techniques; Design studies
Algorithms, Performance, Theory
Power Management, Data Centers, Stochastic Optimization, Optimal Control
Data centers spend a significant portion of their overall operational costs on their electricity bills. As an example, one recent case study suggests that a large 15 MW data center (on the more energy-efficient end) might spend about $1M on its monthly electricity bill. In general, a data center spends between 30% and 50% of its operational expenses on power. A large body of research addresses these expenses by reducing the energy consumption of data centers. This includes designing/employing hardware with better power/performance trade-offs [9,17,20], software techniques for power-aware scheduling, workload migration, and resource consolidation, among others. Reducing energy consumption, however, is not the same as reducing the electricity bill: power prices exhibit variations along time, space (geography), and even across utility providers. As an example, consider Fig. 1, which shows the average hourly spot market prices for the Los Angeles Zone LA1 obtained from CAISO. These correspond to the week of 01/01/2005-01/07/2005 and denote the average price of 1 MW-Hour of electricity. Consequently, minimizing energy consumption need not coincide with minimizing the electricity bill.
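To make the distinction concrete, here is a toy calculation (the prices and draws are made up for illustration, not taken from the CAISO data) showing two draw schedules that consume identical total energy yet incur different bills under time-varying prices:

```python
# Hypothetical hourly spot prices ($/MWh) over four slots.
prices = [30, 30, 80, 80]

# Two schedules drawing the same total energy (40 MWh):
flat = [10, 10, 10, 10]       # constant draw
shifted = [15, 15, 5, 5]      # draw shifted toward cheap slots

def bill(draw):
    """Total cost of a draw schedule under the given prices."""
    return sum(p * d for p, d in zip(prices, draw))

assert sum(flat) == sum(shifted)  # equal energy consumption
print(bill(flat), bill(shifted))  # 2200 vs 1700: same energy, lower bill
```

The shifted schedule cuts the bill by over 20% without reducing energy use, which is precisely the opportunity that price diversity creates.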
Given this diversity in power prices and availability, attention has recently turned toward demand response (DR) within data centers. DR within a data center (or a set of related data centers) attempts to optimize the electricity bill by adapting the data center's needs to the temporal, spatial, and cross-utility diversity exhibited by power prices. The key idea behind these techniques is to preferentially shift power draw toward times, places, or utilities offering cheaper prices. Typically, constraints in the form of performance requirements for the workload (e.g., response times offered to the clients of a Web-based application) limit the cost reduction benefits that can result from such DR. Whereas existing DR techniques have relied on various forms of workload scheduling/shifting, a complementary knob to facilitate such movement of power needs is offered by energy storage devices, typically uninterruptible power supply (UPS) units, residing in data centers.
A data center deploys captive power sources, typically diesel generators (DG), that keep it powered when the utility experiences an outage. The UPS units serve as a bridging mechanism that facilitates this transition from utility to DG: upon a utility failure, the data center is kept powered by the UPS unit using energy stored within its batteries until the DG can start up and provide power. Although this transition takes only 10-20 seconds, UPS units have enough battery capacity to keep the entire data center powered at its maximum power needs for anywhere between 5 and 30 minutes. Tapping into these energy reserves can allow a data center to improve its electricity bill. Intuitively, the data center would store energy within the UPS unit when prices are low and use it to augment the draw from the utility when prices are high.
In this paper, we consider the problem of developing an online control policy that exploits the UPS unit, along with the presence of delay-tolerance within the workload, to optimize the data center's electricity bill. This is a challenging problem because data centers experience time-varying workloads and power prices with possibly unknown statistics. Even when statistics can be approximated (say, by learning from past observations), traditional approaches to constructing optimal control policies involve Markov Decision Theory and Dynamic Programming. It is well known that these techniques suffer from the "curse of dimensionality," where the complexity of computing the optimal strategy grows with the system size. Furthermore, such solutions result in hard-to-implement systems, where significant recomputation might be needed when statistics change.
In this work, we make use of a different approach that can overcome the challenges associated with dynamic programming. This approach is based on the recently developed technique of Lyapunov optimization, which enables the design of online control algorithms for such time-varying systems. These algorithms operate without requiring any knowledge of the system statistics and are easy to implement. We design such an algorithm for optimally exploiting the UPS unit and the delay-tolerance of workloads to minimize the time average cost. We show that our algorithm gets within O(1/V) of the optimal solution, where the maximum value of V is limited by the battery capacity. We note that, for the same parameters, a dynamic programming based approach (if it can be solved) will yield a better result than our algorithm. However, this gap shrinks as the battery capacity is increased. Our algorithm is thus most useful when such scaling is practical.
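To give a flavor of how a Lyapunov (drift-plus-penalty) controller makes per-slot decisions, the following sketch is our own simplification, not the paper's exact algorithm: X stands for a (possibly shifted) virtual queue tracking the battery level, V is the optimality knob, and R_max/D_max are assumed rate bounds. Substituting P = W + R - D into the per-slot objective V*price*P + X*(R - D) reduces it to a constant plus (V*price + X)*(R - D), so the minimizer is bang-bang:

```python
def per_slot_decision(price, W, X, V, R_max, D_max):
    """One slot of a drift-plus-penalty controller (illustrative sketch).

    Minimizes V*price*P + X*(R - D) subject to meeting the demand
    W = P - R + D, bounded rates, and a nonnegative grid draw P.
    Since the objective equals V*price*W + (V*price + X)*(R - D),
    the sign of (V*price + X) dictates full discharge or full recharge.
    """
    if V * price + X > 0:
        # Stored energy is "valuable" to spend now: discharge, don't recharge.
        R, D = 0.0, min(D_max, W)   # D <= W keeps the grid draw nonnegative
    else:
        # Energy is cheap relative to the queue state: recharge at full rate.
        R, D = R_max, 0.0
    P = W + R - D                   # grid draw covers workload plus net recharge
    return P, R, D

# High price, positive queue state: discharge to offset the grid draw.
print(per_slot_decision(price=80, W=10, X=5, V=0.1, R_max=3, D_max=4))
# Low price, negative queue state: recharge at full rate.
print(per_slot_decision(price=30, W=10, X=-10, V=0.1, R_max=3, D_max=4))
```

The actual algorithm in the paper additionally enforces battery capacity limits through the construction of the virtual queue; the sketch only shows the bang-bang structure that the drift-plus-penalty objective induces.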
One recent body of work proposes online algorithms for using UPS units for cost reduction by shaving workload "peaks" that correspond to higher energy prices [3, 4]. This work is highly complementary to ours in that it offers a worst-case competitive ratio analysis, while our approach looks at the average case performance. Whereas a variety of work has looked at workload shifting for power cost reduction or for other reasons such as performance and availability, our work differs both in its use of energy storage and in the cost optimality guarantees offered by our technique. Some research has considered consumers with access to multiple utility providers, each with a different carbon profile, power price, and availability, and has looked at optimizing cost subject to performance and/or carbon emissions constraints. Another line of work has looked at cost reduction opportunities offered by geographical variations in utility prices for data centers where portions of workloads could be serviced from one of several locations [11,18]. Finally, another line of work considers the use of rechargeable batteries for maximizing system utility in a wireless network. While all of this research is highly complementary to ours, there are three key differences: (i) our investigation of energy storage as an enabler of cost reduction, (ii) our use of the technique of Lyapunov optimization, which allows us to offer a provably cost optimal solution, and (iii) our combination of energy storage with delay-tolerance within workloads.
We consider a time-slotted model. In the basic model, we assume that the total power demand generated by the data center in a slot must be met in that slot itself (using a combination of power drawn from the utility and the battery). Thus, no buffering of the workload generated by the data center is allowed. We will relax this constraint in Sec. 6, where we allow buffering of some of the workload while providing worst case delay guarantees. In the following, we use the terms UPS and battery interchangeably.
Let W(t) be the total workload (in units of power) generated in slot t. Let P(t) be the total power drawn from the grid in slot t, out of which R(t) is used to recharge the battery. Also, let D(t) be the total power discharged from the battery in slot t. Then, in the basic model, the following constraint must be satisfied in every slot (Fig. 2):

W(t) = P(t) - R(t) + D(t)
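A minimal sketch of one slot of this model, under assumed (illustrative, not stated above) bounds: a finite battery capacity Y_max and recharge/discharge rate limits R_max and D_max:

```python
def step(Y, W, P, R, D, Y_max, R_max, D_max):
    """Advance the battery level Y by one slot under the basic model.

    Checks the per-slot power balance W = P - R + D, the assumed
    rate bounds, and that the battery level stays within [0, Y_max],
    then returns the next battery level Y + R - D.
    """
    assert abs(W - (P - R + D)) < 1e-9, "demand must be met within the slot"
    assert 0 <= R <= R_max and 0 <= D <= D_max, "rate bound violated"
    Y_next = Y + R - D
    assert 0 <= Y_next <= Y_max, "battery level out of range"
    return Y_next

# Example: battery at 5 units, demand 10, draw 12 from the grid and
# recharge 2, leaving the battery at 7 units.
print(step(Y=5, W=10, P=12, R=2, D=0, Y_max=20, R_max=3, D_max=4))
```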