Comparative Study on Power Gating Techniques for Lower Power Delay Product, Smaller Power Loss, Faster Wakeup Time

Power gating is one of the most popular leakage reduction techniques. We compare various power gating schemes in terms of power delay product, energy loss, and wake-up time using the 45-nm Predictive Technology Model. We conclude that the Dual-Switch Power Gating (DSPG) scheme shows a lower power delay product, smaller energy loss, and faster wake-up time than other power gating schemes such as Single-Switch and Charge-Recycled Power Gating. Based on these advantages, the DSPG is suggested in this paper as a viable candidate for fine-grain leakage control, where logic blocks switch between the active and sleep modes very frequently and stay in each mode only briefly.


Introduction
As CMOS feature size continues to scale down, transistor density increases, and power consumption becomes a very important constraint in very-large-scale integration (VLSI) design. Power dissipation comes from two sources: static power and dynamic power. Dynamic power is dissipated when signals switch while the system is in active mode. Static power is dissipated even when no signals are changing their values. Dynamic power consists of switching power and short-circuit power. Switching power is caused by the charging and discharging of load capacitance. Short-circuit power is caused by the direct current path that briefly forms between the supply and ground while both the pull-up and pull-down networks conduct during an input transition. The main sources of static power are sub-threshold leakage, gate leakage, gate-induced drain leakage, oxide tunneling, and junction leakage [1]. As device scaling goes on, these leakage currents keep increasing and can account for as much as a third of total power [2].
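The decomposition described above can be summarised in its usual textbook form. The symbols α (switching activity factor), C_L (load capacitance), f (clock frequency), I_sc (mean short-circuit current), and I_leak (total leakage current) are introduced here for illustration only and are not taken from the original text.

```latex
\begin{align}
P_{\text{total}}   &= P_{\text{dynamic}} + P_{\text{static}} \\
P_{\text{dynamic}} &= \underbrace{\alpha\, C_L\, V_{DD}^{2}\, f}_{\text{switching}}
                    + \underbrace{V_{DD}\, I_{sc}}_{\text{short circuit}} \\
P_{\text{static}}  &= V_{DD}\, I_{\text{leak}}
\end{align}
```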
Leakage current is particularly important in mobile devices, where battery lifetime is determined by leakage during sleep time. To mitigate leakage current, a number of low-leakage techniques have been developed over many years [3][4][5][6]. Among them, power gating has been widely used: leakage current is cut off by a PMOS header or an NMOS footer with a high threshold voltage [7]. At the wake-up moment, the header or footer that was off is turned on, and the logic block powered through it goes from sleep mode to active mode.
New power gating schemes with a charge recycling technique have been introduced [8][9], in which the switching energy that would otherwise be lost in turning the power switches on and off can be lowered. This energy saving comes from charge sharing between the virtual V_DD and V_SS lines at both the sleep-in and wake-up moments. Here, the virtual V_DD and V_SS lines are connected to the real power supply and ground through the PMOS switch and NMOS switch, respectively. When the sleep time is very short, however, charge-recycled power gating can lose more energy than conventional power gating without charge sharing. Moreover, charge-recycled power gating needs extra time to equalize its virtual V_DD and V_SS lines, so its wake-up time is longer. This large energy loss and slow wake-up may prevent charge-recycled power gating from being used in a fine-grain leakage control scheme, where the logic blocks switch between the active and sleep modes very frequently and only briefly. The sleep time of a fine-grain leakage control scheme is therefore likely to be much shorter than that of a coarse-grain leakage suppression scheme [10][11][12]. To be useful for fine-grain leakage reduction, a power gating circuit should wake the logic block as fast as possible at the wake-up moment [13][14], and the energy loss due to power gating has to be as small as possible.
Recently, dual power gating has been revisited and shown to give lower leakage consumption than conventional power gating and charge recycling power gating (CRPG) under the same timing constraint [15]. There, the three schemes are analysed in scenarios of 10% or 20% timing overhead. To meet the same timing constraint, however, the switch area overhead of dual power gating has to be at least four times that of conventional power gating and charge recycling power gating. This means that achieving lower leakage consumption with dual power gating in high-speed applications comes at increased cost. In this paper, dual power gating is compared with the other two schemes with low cost and low leakage consumption in mind. A solution with low cost, low power delay product, fast wake-up time, and small energy loss at a reasonable speed can be very helpful in applying this technique, for example, in wireless sensor network systems.
In this paper, we extend our previous work to demonstrate further advantages of the proposed leakage reduction technique [16]. We compare three power gating schemes: Single-Switch Power Gating (SSPG), which can be regarded as the conventional power gating technique, Charge-Recycled Power Gating (CRPG) [8][9], and Dual-Switch Power Gating (DSPG), in terms of energy loss due to power gating, power delay product, wake-up time, and so on. The comparison shows that the DSPG has the lowest energy loss among the 3 schemes regardless of how long the sleep time is. Moreover, the DSPG can wake up faster than the CRPG because it does not need any extra time for charge sharing. We also need to mention ground bounce noise, which becomes more significant as the supply voltage is scaled down, because the IR drop and di/dt noise introduced by abrupt changes on the virtual power lines increase [17]. The ground bounce noise of the DSPG is known to be lower than that of the other two schemes due to its small voltage swing and small rush current on the power lines [17][18]. Based on the comparison, we suggest in this paper that the DSPG, with its smaller energy loss, smaller power delay product, and faster wake-up, is a viable candidate for the fine-grain leakage control scheme and is more suitable for it than the others.

Various power gating schemes
Figure 1(a) illustrates the SSPG scheme, which has two logic blocks, L0 and L1, made of low threshold voltage (low V_TH) transistors with large leakage current. To cut off the leakage during the sleep time, L0 and L1 are powered through the header, MP0, and the footer, MN0, respectively, which are made of high threshold voltage (high V_TH) transistors. Here V_SSV and V_DDV are the virtual V_SS and V_DD lines, which are connected to the real V_SS and V_DD lines when the footer and header are turned on. Conversely, when MP0 and MN0 are off, V_SSV rises toward V_DD and V_DDV falls toward V_SS. PGN and PGP are the enable signals for MN0 and MP0, respectively.
Figure 1(b) shows the CRPG scheme, where V_SSV and V_DDV, controlled by MN0 and MP0, respectively, are connected to each other through MN1 and MP1, which constitute a transmission gate. TGN and TGP turn on this transmission gate at both the sleep-in and wake-up moments, so that charge is shared between V_SSV and V_DDV. The DSPG is shown in Figure 1.
Figure 2 compares the V_SSV and V_DDV waveforms of the 3 schemes. Here, sleep-in and wake-up happen at t_0 and t_3, respectively, and t_sleep denotes the sleep time. PGN is the control signal for the footer and TGN is the control signal for the transmission gate in the CRPG. At both sleep-in and wake-up, the transmission gate is turned on for the short intervals t_1 - t_0 and t_3 - t_2, as shown in Figure 2.
Consider first the V_DDV of the SSPG. When the sleep time is long in Figure 2, V_DDV has a voltage swing as large as ΔV_0 at the wake-up time t_3, so the SSPG loses a large amount of switching energy at this moment. When the sleep time is short, ΔV_0 is small, and the SSPG loses only a small amount of switching energy at wake-up. Next, for the CRPG, V_SSV and V_DDV are equalized during t_1 - t_0 and then start to decay toward the real V_DD and V_SS, respectively. They are equalized again during t_3 - t_2 and restored to the real V_SS and V_DD at t_3. At t_3, the CRPG in Figure 2 has a voltage swing as large as ΔV_1 on its V_DDV. Comparing the CRPG with the SSPG, the CRPG has a larger voltage swing on its V_DDV at wake-up when the sleep time is short, which means that the CRPG is not effective in saving energy for short sleep times. Unlike ΔV_0 of the SSPG, ΔV_1 of the CRPG is almost the same regardless of the sleep time. This is because the V_DDV and V_SSV of the CRPG are equalized at every sleep-in and wake-up moment, so their voltage swings are about half of V_DD regardless of the sleep time.
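The qualitative swing behaviour read off Figure 2 can be illustrated with a toy model. The sketch below is not the simulation used in this paper: it assumes a first-order exponential decay of the virtual rail with a hypothetical time constant `tau`, and the DSPG curve (discussed in the next paragraph) is assumed to share the initial slope of the SSPG while saturating at half of V_DD. The function names are illustrative only.

```python
import math

V_DD = 1.1  # supply voltage in volts, the value quoted for the adder simulation


def swing_sspg(t_sleep, tau):
    """Toy V_DDV swing of the SSPG: the rail decays from V_DD toward V_SS."""
    return V_DD * (1.0 - math.exp(-t_sleep / tau))


def swing_crpg(t_sleep, tau):
    """Toy V_DDV swing of the CRPG: equalization gives about V_DD/2 always."""
    return V_DD / 2.0


def swing_dspg(t_sleep, tau):
    """Toy V_DDV swing of the DSPG: SSPG-like at first, capped at V_DD/2."""
    return (V_DD / 2.0) * (1.0 - math.exp(-2.0 * t_sleep / tau))


if __name__ == "__main__":
    tau = 1e-6  # hypothetical decay time constant in seconds (assumption)
    for t_sleep in (10e-9, 1e-6, 1e-3):
        print(
            f"t_sleep={t_sleep:.0e} s  "
            f"SSPG={swing_sspg(t_sleep, tau):.3f} V  "
            f"CRPG={swing_crpg(t_sleep, tau):.3f} V  "
            f"DSPG={swing_dspg(t_sleep, tau):.3f} V"
        )
```

Under these assumptions the CRPG swing exceeds the SSPG swing for short sleep times and falls below it for long ones, while the DSPG swing never exceeds half of V_DD.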
Finally, consider the DSPG, whose V_DDV swing is as small as ΔV_2. With a short sleep time, the voltage swing of the DSPG is similar to that of the SSPG. As the sleep time becomes longer, ΔV_2 becomes larger, but unlike the SSPG it does not exceed half of V_DD. For a long sleep time, its voltage swing is almost the same as ΔV_1 of the CRPG.
Figure 3 shows the analysis results of a 31-stage ring oscillator at a temperature of 27 °C. The simulation is done using the 45-nm Predictive Technology Model (PTM) [19] with various supply voltages. Here, the DSPG uses both PMOS and NMOS switches to cut off the power lines, so the voltage drop across its switches is slightly larger than in the SSPG, which uses only an NMOS switch. Consequently, the delay of the DSPG is slightly higher than that of the SSPG, as shown in Figure 3(a). The power delay product, however, is a metric of energy efficiency that measures the energy consumed per switching event. The power delay product of the DSPG is 7% smaller than that of the SSPG, as shown in Figure 3(b), indicating that the DSPG has higher energy efficiency even in the active mode.
Figure 4(a) compares the 3 schemes in terms of energy loss. The logic block used here is composed of 50% INVs, 25% NANDs, and 25% NORs, and the channel width of the power switch is 10% of the total channel width of the logic block. The power-gating energy loss is defined as the amount of energy lost between the sleep-in and wake-up moments. For a given sleep time, if the energy loss due to power gating is smaller than the active leakage energy that would be dissipated during the sleep time without power gating, power gating saves energy. Conversely, if the energy loss is larger than the active leakage energy, it is better not to use power gating. The sleep time at which the energy loss of power gating equals the active leakage energy is defined as the crossover time.
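The crossover time defined above can be found numerically once the power-gating energy loss is known as a function of sleep time. The following is a minimal sketch, assuming the active leakage energy is simply a constant leakage power multiplied by the sleep time; `crossover_time`, `energy_loss`, and `p_leak` are hypothetical names, and the numbers in the usage note are illustrative rather than taken from Figure 4.

```python
def crossover_time(energy_loss, p_leak, t_lo, t_hi, iters=100):
    """Return the sleep time t in [t_lo, t_hi] where energy_loss(t) == p_leak * t.

    energy_loss: function mapping sleep time (s) to power-gating energy loss (J)
    p_leak:      active leakage power (W), assumed constant (an assumption)
    Bisection is applied to g(t) = p_leak * t - energy_loss(t), which must be
    non-positive at t_lo and non-negative at t_hi.
    """
    def g(t):
        return p_leak * t - energy_loss(t)

    if g(t_lo) > 0 or g(t_hi) < 0:
        raise ValueError("bracket [t_lo, t_hi] does not contain a crossover")

    for _ in range(iters):
        t_mid = 0.5 * (t_lo + t_hi)
        if g(t_mid) < 0:
            t_lo = t_mid  # gating still costs more than leaking: move right
        else:
            t_hi = t_mid  # leakage energy now dominates: move left
    return 0.5 * (t_lo + t_hi)
```

For example, with a constant energy loss of 2 pJ and a leakage power of 0.1 mW, `crossover_time(lambda t: 2e-12, 1e-4, 0.0, 1e-6)` converges to 20 ns. Doubling the leakage power halves that value, so a larger leakage power gives a shorter crossover time.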
The crossover time is very important when applying a power gating technique to fine-grain leakage control circuits, where logic blocks switch between the active and sleep modes very frequently and only briefly. In Figure 4(a), when the sleep time is short, the SSPG has a smaller power-gating energy loss than the CRPG. As mentioned earlier, this is because the SSPG has smaller voltage swings on its V_DDV and V_SSV than the CRPG when the sleep time is short. As the sleep time becomes longer, the CRPG begins to have smaller voltage swings on V_DDV and V_SSV than the SSPG, so its power-gating energy loss becomes smaller and some energy can be saved. Among the 3 schemes, the DSPG shows the smallest power-gating energy loss whether the sleep time is short or long. For a short sleep time, the V_DDV and V_SSV of the DSPG change as little as those of the SSPG, keeping its energy loss as small as that of the SSPG. Compared with the CRPG, the DSPG reduces the energy loss by 85% for a sleep time of 10 ns at 27 °C. For a long sleep time of 10 s, the DSPG saves 30% compared with the SSPG. This saving arises because the V_DDV and V_SSV swing of the DSPG is only about half of the swing of the SSPG, as shown in Figure 2. Note also that the DSPG does not lose any energy in equalizing V_DDV and V_SSV, so it saves more energy than the CRPG, as shown in Figure 4(a).
The crossover time can be extracted from Figure 4(a): the SSPG, CRPG, and DSPG have crossover times of 35 ns, 100 ns, and 30 ns, respectively. Figure 4(b) shows the energy loss of power gating at a temperature of 100 °C. Comparing Figures 4(a) and (b), the crossover times at 100 °C are shorter than those at 27 °C, because the sub-threshold leakage at 100 °C is larger. In Figure 4(b), the SSPG, CRPG, and DSPG have crossover times of 17 ns, 35 ns, and 12 ns, respectively, indicating that the DSPG is the most suitable for fine-grain leakage control, which demands a short crossover time.
One more concern with the CRPG is the equalizing time, defined by t_1 - t_0 and t_3 - t_2 in Figure 2. The CRPG needs this time for the transmission gate to equalize V_DDV and V_SSV, resulting in a longer wake-up time than the SSPG and DSPG. If the equalizing time is not long enough to equalize V_SSV and V_DDV fully, the energy loss of the CRPG increases. Figure 5(a) shows how the power-gating energy loss of the CRPG changes with the equalizing time: as the equalizing time becomes shorter, the CRPG has a larger energy loss. For the SSPG and DSPG, the energy loss is independent of the equalizing time. To achieve an energy loss as low as around 30 pJ, the equalizing time should be longer than 250 ps. This equalizing time is added to the wake-up time. Such a slow wake-up may prevent the CRPG from being used in a fine-grain leakage control scheme, where a short wake-up time is required so as not to degrade the active-mode performance.
Figure 5 also shows the wake-up times of the SSPG, CRPG, and DSPG for varying sleep time. As expected, the wake-up time of the CRPG is the longest among the 3 schemes due to the equalizing time. For the SSPG and DSPG, the wake-up times become longer and then saturate as the sleep time increases. Here the wake-up time is defined as the time at which V_SSV and V_DDV are restored to 90% of their final values of V_SS and V_DD.
We also investigated the layout overhead of the SSPG, CRPG, and DSPG. The SSPG and DSPG have the same layout area as long as their power switches have the same size. The CRPG, however, needs a larger area for its transmission gate, as shown in Figure 1(b). To equalize V_DDV and V_SSV in a short time, the widths of MP1 and MN1 in Figure 1(b) must be increased further, which makes the area overhead larger.
Table 1: 32-bit input vectors applied to the 32-bit carry-look-ahead adder.
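The 90% wake-up criterion used for the wake-up time comparison can be applied to a sampled waveform as sketched below. This is an illustrative helper, not the measurement script used for Figure 5: it handles only the V_DDV rail, assumes the samples are ordered in time, and interpolates linearly between samples. The name `wakeup_time` and its parameters are hypothetical.

```python
def wakeup_time(times, v_ddv, t_wake, v_dd, fraction=0.9):
    """Time from t_wake until V_DDV first reaches fraction * v_dd.

    times, v_ddv: equally long sequences of samples, times ascending
    t_wake:       instant at which the power switch is turned on
    Returns None if the threshold is never reached after t_wake.
    """
    threshold = fraction * v_dd
    prev_t = prev_v = None
    for t, v in zip(times, v_ddv):
        if t < t_wake:
            continue  # ignore samples taken before the wake-up command
        if v >= threshold:
            if prev_t is None:
                return t - t_wake  # already restored at the first sample
            # Linear interpolation between the last sample below the
            # threshold and the first sample at or above it.
            t_cross = prev_t + (threshold - prev_v) * (t - prev_t) / (v - prev_v)
            return t_cross - t_wake
        prev_t, prev_v = t, v
    return None
```

With samples `times = [0, 1, 2, 3]` and `v_ddv = [0.5, 0.5, 0.9, 1.1]`, a wake-up command at `t_wake = 1`, and `v_dd = 1.1`, the 0.99 V threshold is crossed at t = 2.45, giving a wake-up time of 1.45 in the same time unit as the samples.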

Simulation results
FFFFFFFF FFFFFFFF

Figure 6. (a) Power-gating energy loss of the 32-bit carry-look-ahead adder when the sleep time is as short as 10 ns, for the 45-nm PTM, V_DD = 1.1 V, and W_PG/W_Logic = 10%. The DSPG consumes almost the same energy as the SSPG, but its energy loss is smaller than that of the CRPG by as much as 72% on average. This result is consistent with Figure 4(a).
(b) Power-gating energy loss of the 32-bit carry-look-ahead adder when the sleep time is as long as 4 s. The DSPG consumes less energy than the SSPG and CRPG by as much as 32% and 18% on average, respectively. As expected from Figure 2, the SSPG and DSPG show the largest and smallest energy loss, respectively.
In this paper, the width of the transmission gate in Figure 1(b) is half the width of the power switches, so the area penalty of the CRPG is as large as 15%, compared with a penalty as small as 10% for the SSPG and DSPG.
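The 10% and 15% figures follow from simple arithmetic if the area penalty is taken to be proportional to total switch width, which is an assumption of this sketch rather than a statement from the layout. The function name and parameters are illustrative.

```python
def area_penalty(switch_ratio, tg_ratio=0.0):
    """Switch area relative to logic width, assuming area scales with width.

    switch_ratio: power-switch width / total logic width (0.10 in this paper)
    tg_ratio:     transmission-gate width / power-switch width (0.5 for CRPG)
    """
    return switch_ratio * (1.0 + tg_ratio)


sspg_or_dspg = area_penalty(0.10)       # no transmission gate
crpg = area_penalty(0.10, tg_ratio=0.5) # transmission gate at half the width
print(f"SSPG/DSPG: {sspg_or_dspg:.0%}, CRPG: {crpg:.0%}")
```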
The three power gating schemes are applied to a 32-bit carry-look-ahead (CLA) adder to compare the energy loss due to power gating. The 32-bit adder is implemented using the 45-nm PTM at V_DD = 1.1 V and 27 °C. Figures 6(a) and (b) show the energy loss of the 32-bit CLA adder when the sleep time is 10 ns and 4 s, respectively. The simulated input vectors of the 32-bit adder are shown in Table 1. In Figure 6(a), for a sleep time of 10 ns, the CRPG shows the largest energy consumption, which is caused by its V_1,S being larger than those of the SSPG and DSPG. From this figure, the SSPG, CRPG, and DSPG have average energy losses of 2.3 pJ, 8 pJ, and 2.25 pJ, respectively. When the sleep time is as long as 4 s, the SSPG loses 35.2 pJ on average, compared with 29.2 pJ for the CRPG and 23.9 pJ for the DSPG. Among the 3 schemes, the DSPG loses the least energy for its power gating, making it the most suitable for fine-grain leakage-controlled VLSIs.
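The relative savings quoted for the adder can be checked directly from the average energy losses given above. The short script below only reproduces that arithmetic; the helper name `saving_percent` is illustrative.

```python
def saving_percent(reference, value):
    """Percentage by which `value` is smaller than `reference`."""
    return 100.0 * (reference - value) / reference


# Average power-gating energy loss of the 32-bit CLA adder, in pJ
short_sleep = {"SSPG": 2.3, "CRPG": 8.0, "DSPG": 2.25}   # sleep time 10 ns
long_sleep = {"SSPG": 35.2, "CRPG": 29.2, "DSPG": 23.9}  # sleep time 4 s

dspg_vs_crpg_short = saving_percent(short_sleep["CRPG"], short_sleep["DSPG"])
dspg_vs_sspg_long = saving_percent(long_sleep["SSPG"], long_sleep["DSPG"])
dspg_vs_crpg_long = saving_percent(long_sleep["CRPG"], long_sleep["DSPG"])

print(f"10 ns: DSPG vs CRPG {dspg_vs_crpg_short:.1f}%")
print(f"4 s:   DSPG vs SSPG {dspg_vs_sspg_long:.1f}%")
print(f"4 s:   DSPG vs CRPG {dspg_vs_crpg_long:.1f}%")
```

The three values round to 72%, 32%, and 18%, matching the averages reported for Figure 6.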
The ISCAS-85 benchmark circuits C432, C499, and C880 are also simulated to verify that the DSPG is better than the others in terms of leakage power consumption. The normalized leakage power is compared at 27 °C and 100 °C, as shown in Table 2 and Table 3, respectively.

Conclusion
Among various power gating techniques, we have compared 3 power gating schemes in terms of power delay product, energy loss, and wake-up time using the 45-nm Predictive Technology Model. The comparison results show that the DSPG has smaller energy loss, a lower power delay product, and a faster wake-up time than the other power gating schemes. Based on these advantages, we suggest the DSPG as a viable candidate for a fine-grain leakage control scheme, where logic blocks switch between the active and sleep modes very frequently and only briefly.