Transcript
Hi folks, greetings of the day. This is Pun Gupta here.
Today I'm going to talk about adaptive power calculation for low-power or battery-operated chip design.
In today's scenario, when we look at our lives, we are using a lot of systems or devices which are battery operated and have become an essential part of our life. Thus, it becomes very necessary to make sure those devices run longer on battery: you don't need to charge them again and again, and whenever you need it, the battery is available to supply adequate current.
How can we make sure that happens? We can design the system and the chip with the correct specifications, so that we can deploy a battery which can provide adequate current whenever it is needed.
So a proper power calculation is required when we are designing the system, the chip, or any other part of the overall device that we will be using in our day-to-day life.
Some examples of these devices are smartphones, which we use in day-to-day life: we want to charge them for a short time and use them for a long time. There are tablets, cars, and many more.
So let's go through this presentation and see which components are responsible for power calculation, what the current state of power calculation is, how we calculate it today, what the problems or flaws in our current methodology and flows are during power calculation for chips, and how they impact the overall process.
So, first slide: the power calculation challenge. What are the challenges that make this process a little bit difficult, or that insert inaccuracies when we calculate the power?
There are a lot of power parameters that we need for power calculation, and they change dynamically: they keep changing as the chip design process progresses.
If we look at any holistic chip design process, we can divide it into multiple sub-phases: architecture, then RTL, then physical design, and then final signoff.
These parameters are the inputs to the power formula: the toggle (switching activity) file, the load capacitance, and sometimes the frequency and voltage variation levels as well. All these parameters are changing dynamically, and if they change while we are calculating, our power estimate may not be accurate.
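To make that dependency concrete, here is a minimal sketch (mine, not from the talk's slides) of the standard CMOS power equation these parameters feed into; every number in it is illustrative, not from any real design.

```python
# A minimal sketch of the standard power equation these parameters feed into.
# All numbers below are illustrative.

def estimate_power(alpha, c_load, v_dd, freq, i_leak):
    """Total power = dynamic switching power + static leakage power.

    alpha  -- toggle (activity) rate, taken from the switching activity file
    c_load -- effective switched load capacitance, in farads
    v_dd   -- supply voltage, in volts
    freq   -- clock frequency, in hertz
    i_leak -- total leakage current, in amperes
    """
    p_dynamic = alpha * c_load * v_dd ** 2 * freq
    p_leakage = i_leak * v_dd
    return p_dynamic + p_leakage

# Any one of these inputs moving mid-project moves the estimate with it.
print(estimate_power(alpha=0.15, c_load=2e-9, v_dd=0.8, freq=1.2e9, i_leak=5e-3))
```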
Early inaccuracies are another component. Say we started with one version of the architecture and did not consider a few features, and those features get added at a later stage. If they are high-frequency switching features, they may change the power computation, and that can cascade into many other things in the later stages of the design.
The power distribution network also comes into the picture. We calculated the power at some number and designed the power network for it, but by the time we are using it, a lot of things have changed: let us say we added a feature or changed the RTL, which increased the area of the design, and it needs more current now. When we need more current, we have to make the power network more robust, and that adds load capacitance, and so adds to the total power of the design.
So these are some of the challenges we are seeing in the day-to-day chip design process.
The current state is a fixed calculation. How are we doing it? We have some specification based on the architecture and we look into it. Say we are doing a CPU design: fetch will be happening in this cycle, so this logic will be switching only in this particular cycle; for two cycles this amount of logic will be off, so only leakage will be there; and so on. We have this kind of scenario, but we cannot actually look cumulatively at how the entire design will be switching, so there is a possibility that we are still leaving a significant margin of error.
Also, we are using static design parameters: okay, we will have this much area, the area will grow by this much, and this is how it is going to change our load capacitance. But there is a possibility that when we go through the design closure phase, these parameters change, and if we do not account for that, the calculations we have done for the power may be inaccurate or a poor approximation.
Sometimes what we see is that at a late stage, when we are done with all these things, we figure out that whatever power we were estimating has changed. The reason may be that during area convergence the designer used a lot of leaky cells, LVT or ELVT cells, which reduced the area but actually increased the power. All these things mixed and matched will result in a suboptimal design which can have higher power consumption.
It can have lower performance, and at the same time the battery may be draining really fast. So that is the current state of chip design. How can we overcome all of this?
We know that when we are starting the design we have some architecture, but can we account for an extra feature if we know about it? Can we model some kind of additional margin when we are computing the power initially?
The actual toggle file based on the design requirements comes at the end of physical design closure, when we run gate-level simulation. But can we model these kinds of FSDBs at the start, so that we can calculate the power in a better-approximated way? The calculation can then be a comprehensive power estimate using the preliminary specification.
Also, we can use some statistical models based on prior experience: prior design experience, prior technology experience, and what we think a given feature will add, things like that.
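As a toy illustration of that statistical-model idea, here is a hedged sketch that fits measured power from prior designs against area and frequency and extrapolates to a new design; all the data points are made up.

```python
import numpy as np

# A toy sketch of the statistical-model idea: fit power measured on prior
# designs against area and frequency, then predict the new design's baseline.
# Every data point here is fabricated for illustration.

# columns: area (mm^2), frequency (GHz); target: measured power (W)
X = np.array([[10.0, 1.0], [14.0, 1.2], [20.0, 1.5], [25.0, 1.8]])
y = np.array([1.1, 1.6, 2.6, 3.9])

# least-squares linear fit with an intercept column
A = np.hstack([X, np.ones((len(X), 1))])
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)

new_design = np.array([18.0, 1.4, 1.0])  # area, frequency, intercept term
print(f"predicted baseline power: {new_design @ coeffs:.2f} W")
```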
Now, at the same time, we can dynamically incorporate real-time measurement data and refine our parameters. At the architecture level we estimated that this much switching activity will be there, but as we move further we find we are not able to do enough clock gating. Can we change those parameters and recompute the power? Adjusting those power specifications and enhancing the precision of the power computation will help us optimize the battery life.
After that, we can validate those results by benchmarking against whatever test vectors we have, to see whether the improvements we have made are actually going to meet the power requirement or not. If not, we need to provide feedback to the architecture team, the RTL team, or the physical design team that we are not meeting it, and we can take this adaptive methodology through a couple of cycles to close it.
Now, the key milestones for adaptive recalculation. As I mentioned, from the architecture definition we get the baseline power estimate, where we have a specification and can use statistical modeling.
We can continue this through RTL development: when we know this is the RTL and this is the activity that will be there, we can run some simulation at that time and do all the power analysis based on workload analysis and the available RTL.
Next is physical design. When the design is placed, clock tree synthesis is done, and routing is done, we have more realistic values of load capacitance and a good-quality toggle rate file, and we can do much more accurate power profiling at the physical design stage. Then comes signoff and validation.
When the design is ready to tape out, ready to go to the foundry for mask generation, what kind of power calculations are there at that time? We can do a rigorous power computation based on whatever FSDBs we have. We can have functional FSDBs and DFT-based FSDBs, and in those two scenarios there can also be multiple modes. So for each and every mode: whatever we were expecting, are we meeting those goals or not?
We can check that, mode by mode, at signoff time.
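As a rough illustration, a per-mode signoff check might look like the sketch below; the mode names and all the milliwatt numbers are hypothetical.

```python
# A minimal sketch of a per-mode signoff check: compare the computed power
# for every functional and DFT mode against its target. All values invented.

targets_mw  = {"functional_max": 850, "functional_idle": 40, "dft_shift": 600}
computed_mw = {"functional_max": 832, "functional_idle": 55, "dft_shift": 590}

for mode, target in targets_mw.items():
    verdict = "meets goal" if computed_mw[mode] <= target else "FAIL, feed back"
    print(f"{mode}: {computed_mw[mode]} mW vs {target} mW target -> {verdict}")
```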
Next is tracking the input quality at each and every stage. As I mentioned, when we are initially starting the design, we can account for some of the factors. Take the toggle rate: all the sequentials and all the inputs/outputs should have at least 45% of their activity annotated. The parasitic data can be assumed to be 80% based on models and 20% based on real extraction. The same can be scaled going further.
During the synthesis stage we can consider that a 65% toggle rate should be met, and the parasitic data can be a little more accurate, 55% or more, because we have physically aware synthesis capabilities available. At place and route we expect 80 to 100 percent, or I'll say, in the ballpark, between 95 and 100 percent of the toggle rate to be met. The completeness of parasitics should be 85 to 100 percent, and then we can calculate a more accurate power.
Why are some of these numbers a little lower? Because we are not yet considering the optimizations. We can calculate these numbers at the start of place and route and again at the end of place and route, on the final design; that is where the 80% and the 95 to 100% come from.
So when we have this kind of collateral at each level, we can include a few things based on statistical models while a few inputs are the actual ones, and we can have a very well-approximated power calculation.
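Here is a sketch of those stage-by-stage quality gates; the thresholds follow the numbers quoted above, while the checking helper itself is a hypothetical illustration.

```python
# A sketch of the stage-by-stage input-quality gates described above.
# Thresholds follow the talk; the helper itself is hypothetical.

QUALITY_GATES = {
    # stage:            min % toggles annotated, min % real parasitics
    "start":           {"toggle_pct": 45, "parasitic_pct": 20},
    "synthesis":       {"toggle_pct": 65, "parasitic_pct": 55},
    "place_and_route": {"toggle_pct": 95, "parasitic_pct": 85},
}

def check_input_quality(stage, toggle_pct, parasitic_pct):
    gate = QUALITY_GATES[stage]
    issues = []
    if toggle_pct < gate["toggle_pct"]:
        issues.append(f"{stage}: {toggle_pct}% toggles annotated, "
                      f"need >= {gate['toggle_pct']}%")
    if parasitic_pct < gate["parasitic_pct"]:
        issues.append(f"{stage}: {parasitic_pct}% real parasitics, "
                      f"need >= {gate['parasitic_pct']}%")
    return issues  # report these back to the owning team

print(check_input_quality("synthesis", toggle_pct=58, parasitic_pct=60))
```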
Next, the automation strategy. We can build our methodologies or processes in such a way that we intelligently identify the available input parameters. In the flow: if a file exists, like an FSDB file, use it; otherwise, use a pessimistic toggle rate.
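That fallback rule might look like the following sketch; the file path and the 0.30 default rate are illustrative assumptions.

```python
import os

# A minimal sketch of the fallback rule: prefer the real activity file if
# it exists, otherwise fall back to a pessimistic toggle rate and flag it.

PESSIMISTIC_TOGGLE_RATE = 0.30  # deliberately high, so power is not underestimated

def pick_activity_source(fsdb_path, flags):
    if os.path.exists(fsdb_path):
        return "fsdb", fsdb_path
    flags.append(f"FSDB missing: {fsdb_path}; using pessimistic "
                 f"toggle rate {PESSIMISTIC_TOGGLE_RATE}")
    return "default_rate", PESSIMISTIC_TOGGLE_RATE

flags = []
source = pick_activity_source("run/func_mode.fsdb", flags)
```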
Then there is the implementation of optimal calculation models based on data maturity: based on whatever data maturity is there, we can refine our models. And we can proactively flag things: okay, the FSDB is missing, this library is missing, the toggle rate looks really low. All these things we can flag and report to the actual owner.
Whether it needs to be fed back to the RTL team, to the architecture team, or to the physical design engineer. Wherever there is a gap of something missing, we can bridge it with the help of predictive models.
Let us say the DFT FSDB is initially missing: can we generate some kind of model which can predict the activity, so it can be leveraged in these computations? Any historical data can also be included, and self-prediction is also available if we can utilize AI.
At the same time, when we are doing all these computations, we can provide our feedback to all the cross-team members. We can work with, say, the design verification team: this is what we are seeing, this is what we should see, where is the mismatch? And we can look at all the industry-standard tools to make sure that whatever power analysis we have is accurate.
We also created a kind of dashboard which can be shared across all the teams, so they can look at it at the same time and take proactive action wherever required.
Power distribution network optimization is also part of it. Based on the initial computation we designed some kind of power network, and we need to precisely calibrate it to match the dynamic power requirements, so that we can provide adequate current to all the circuits at the required frequency and voltage.
We need to make sure we do not have any IR drop issues or any electromigration, and that we have a good network which can supply adequate current.
At the same time, we need to make sure that we are leaving enough routing resources for the clocks and for the signal routing, so that we will not run into a crunch or congestion situation. And overall, we want enough power gating in place: when a certain amount of logic does not need to switch, we have power gating available.
Or any kind of operation which can provide more battery life, like dynamic voltage and frequency scaling: let us say we can reduce the frequency, reduce the voltage, and perform that operation at a slower rate. Or take a camera as an example: when we are not using it, the system should automatically switch it off. Those kinds of optimizations can be done to get better battery life.
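Purely as an illustration of why DVFS helps: dynamic power scales with the square of voltage times frequency, so lowering both together compounds the saving. The operating points below are made up.

```python
# Illustrative only: dynamic power scales with V^2 * f, so lowering voltage
# and frequency together compounds the saving. Operating points are invented.

def dynamic_power(alpha, c_load, v_dd, freq):
    return alpha * c_load * v_dd ** 2 * freq

p_full = dynamic_power(alpha=0.15, c_load=2e-9, v_dd=0.9, freq=1.5e9)
p_dvfs = dynamic_power(alpha=0.15, c_load=2e-9, v_dd=0.7, freq=0.9e9)
print(f"DVFS point runs at {100 * p_dvfs / p_full:.0f}% of full dynamic power")
```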
When we use these kinds of techniques in real life, we see that we can save 30 to 40%; in my case, I have seen a 37% power saving during chip operation. We saw that when we calculated the power accurately and designed our network accordingly, we were able to improve efficiency in terms of performance by 42% without any area penalty, and battery longevity improved by 2.8 times. All these numbers come from one of the battery-operated devices running at a very high frequency.
Implementation roadmap. As I mentioned, all these statistical models, learning from prior designs, can be integrated into the tools, the methodology, and the flow. The teams of engineers, all the way from architecture to RTL and implementation, can be trained on the adaptive power calculation methodologies, so that they can look at each and every thing very carefully and provide the right inputs, or do the right job.
For a physical design engineer, if they are seeing more power in their reports, they can analyze why the power is higher: are they using the right toggle rates, or did they use too many leaky cells, or is it something else? Next, process updates.
When we are implementing these kinds of methodologies, we need to have checkpoints which automatically trigger the power calculation and tell us. In the flow itself: when we are running synthesis, we need to write all the power reports; when we are running place and route, it should write a power report at the end of placement, after clock tree synthesis, and after routing. When we track it through the various phases, we can see the differences and easily identify the root cause.
If there are any differences, we can systematically refine these prediction models, FSDBs, or any other required inputs for the computation, and that always helps to get more accurate power.
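A hedged sketch of such checkpoints: record the total power each flow stage reports and flag any jump between stages for root-causing. Stage names, wattages, and the 10% threshold are all illustrative.

```python
# A sketch of flow checkpoints: record each stage's reported power and
# surface any large jump between stages. All numbers are illustrative.

power_by_stage = {}

def checkpoint(stage, total_watts, warn_pct=10.0):
    if power_by_stage:
        prev_stage, prev_watts = list(power_by_stage.items())[-1]
        delta_pct = 100.0 * (total_watts - prev_watts) / prev_watts
        if abs(delta_pct) > warn_pct:
            print(f"power moved {delta_pct:+.1f}% between {prev_stage} "
                  f"and {stage}; find the root cause")
    power_by_stage[stage] = total_watts

checkpoint("synthesis", 2.10)
checkpoint("placement", 2.18)   # +3.8%, quiet
checkpoint("cts", 2.65)         # +21.6%, flagged
checkpoint("routing", 2.70)     # +1.9%, quiet
```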
So, key takeaways. When we use adaptive methodologies versus static ones, we have more accurate power computation at each and every stage, and we can collaborate with the right team to provide feedback, so that they can enhance their process to provide better inputs to the next stage.
For example, I have seen that I had four cores switching simultaneously and consuming a lot of power. I provided this feedback to the architecture team and they came up with a solution: okay, let's not turn on all the cores at the same time; we will turn on the cores based on the load requirement, so that we are not drawing all the power at once.
And if there is a need for all four cores to switch at the same time, we can distribute the load in such a way that, based on the clock skew, all the switching will not be happening at the same moment.
A data-driven approach is always good for any kind of methodology and process in engineering. When we have comprehensive input quality tracking, we know what quality is there, whether it is going to give us the correct computation or not, whether anything else is missing, and what kind of margins we can add to avoid errors. All these data-based calculations can assure us that we are making the right design choices.
Automation, of course: with AI and ML available today, we can fetch a lot of data, model the missing gaps, and utilize that in the day-to-day process. At the same time, these can be easily integrated into the EDA flows and can provide accuracy to our implementation and other methodologies.
It also makes these calculations really fast when we use AI and machine learning, though there is a cost: actual cost as well as runtime and high compute requirements.
So this is all from my side for today.
Please reach out to me if you have any questions.
Thank you very much.
Have a great day.