Conf42 Golang 2025 - Online

- premiere 5PM GMT

Enhancing Timing Closure with Slack-Aware Post-Routing Cell Legalization in ECO

Abstract

Unlock the secret to faster, more efficient timing closure! Our slack-aware post-routing cell legalization method minimizes timing violations and optimizes resource use, achieving better QoR with fewer iterations. Say goodbye to delays and hello to faster, more reliable chip design workflows!

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hi everyone, my name is Puneet, and today I'm going to share some innovative methodologies for timing ECO implementation during the chip physical design phase. We can call it optimizing timing ECO implementation through slack-aware post-routing cell legalization. Why is this an important aspect of PnR? When we apply any ECO during chip timing closure or any other phase, the tool does legalization based on net length. It does not consider slack or any other aspect of the design. In such a scenario, when the tool throws these cells, or any timing-critical cell, far away, it results in a new timing violation, which makes it necessary to discuss these things as part of a regular day-to-day ECO implementation approach.
Overall, what we can do with this approach is divide the legalization process into multiple steps and control the legalization in a very sophisticated way. This methodology helps prioritize the negative-slack cells. We can legalize some of the cells strategically, optimize the design step by step, and maintain the QoR.
To do that, we first need to understand where the ECO requirement comes from. ECO requirements can arise from many places during the physical design phase. Generally, we run PnR in some of the dominant scenarios, whereas multiple scenarios need to be covered for timing closure. If we run PnR in only some scenarios, we may still have a little timing closure work left in the other scenarios, so at the end of PnR we apply these timing ECOs. Or, let us say, we started the design with some functionality, and during design verification the team finds an issue with the design, so a logical ECO needs to be applied.
Such a scenario can occur, and we need to apply the ECO and do further optimization with respect to that change. At the same time, during DRC/LVS closure, we sometimes need to move cells here and there in the design, and these requirements arise. So let's look at the details of this presentation.
This brings us to the first of the challenges of timing ECO implementation. When we start applying the ECO, in certain areas there are many cells sitting densely next to each other, and we need to insert some cells or upsize or downsize a cell. In that scenario the cells have to move, and if they move from their original location, they can create new timing violations. If we can preserve or fix those cells and move only the other cells, we can reduce this challenge. The second challenge comes from the conventional placement algorithm. What we have noticed is that the tool generally does not consider timing, and it keeps adding new violations as it moves cells wherever it can find a place for them. The third scenario is congested areas. Let us say that to fix a hold timing violation, the tool needs to insert one more buffer, cannot find room for it, and puts that cell far away. The net routed from that newly inserted cell can then cause an additional violation. It may be a setup violation or a hold violation, and sometimes we have seen it because of crosstalk.
So what ECO corrective actions do we take in that scenario? As I mentioned, we do cell upsizing based on the setup requirement to improve the timing margins. We insert new cells: a buffer if one is needed or, during a logical ECO, some logic gates. And we need to do very careful placement, even in a congested region.
In other cases, we need to place cells next to each other so that we can reduce the timing closure requirement. The third thing is that we have seen cells sitting far away, with that path creating a timing violation. Can we bring that cell next to the driver cell and reduce the parasitics so that we can improve that path? These are some of the ECO corrective actions we take during our PnR phase.
Now, what is different in the approach I'm explaining here? When the tool sees that a buffer has been inserted or a cell upsized, it looks for neighboring empty spaces and places the cells there. But if those spaces are not available, the tool can throw the cell very far away, where it can create a new violation. So how can we reduce that?
The approach here is to run a complete, full timing analysis on the design. See in which area we are applying the ECO, and identify the nearby areas where the ECO cells will be placed. Tag the cells with their slacks, do slack filtering, and based on that filtering fix the critical cells. Whatever cells are unfixed and need legalization are moved by the extreme algorithm. The extreme algorithm means that if we need to move a cell by two tracks, we first move the neighboring cell to the next track and then move this cell, so we are displacing all the cells by relative placement. Once this is done, we can go to iterative refinement, where we take some loops and legalize these cells by repositioning them again and again until we see that timing is completely fixed. Here we are not creating new violations, because we fix the most critical cells first. Then we move the least timing-critical cells, then the next level of cells, and finally, when everything else is legalized, we try to legalize the critical cells that were fixed.
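As a rough illustration of the fix-then-displace idea described above, here is a minimal Python sketch for a single placement row. It is not the speaker's implementation or any tool's API: the `Cell` fields, the zero-slack threshold, the right-only push, and the example row are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    name: str
    x: int         # left edge, in placement sites (assumed unit)
    width: int     # width, in placement sites
    slack: float   # worst slack through the cell, in ns (assumed input)
    fixed: bool = False


def fix_critical(cells, threshold=0.0):
    """Lock every cell whose slack is below the (hypothetical) threshold."""
    for cell in cells:
        cell.fixed = cell.slack < threshold


def ripple_legalize(cells, row_width):
    """Remove overlaps in one row by pushing movable cells to the right.

    Cells are updated in place. Returns False if a movable cell would have
    to be pushed into a fixed cell or past the end of the row.
    """
    cursor = 0  # first site not yet occupied
    # sorted() is stable, so cells with equal x keep their list order.
    for cell in sorted(cells, key=lambda c: c.x):
        if cell.fixed:
            if cell.x < cursor:
                return False
            cursor = cell.x + cell.width
        else:
            cell.x = max(cell.x, cursor)
            cursor = cell.x + cell.width
    return cursor <= row_width


row = [
    Cell("u1", 0, 2, 0.30),
    Cell("u2", 2, 2, -0.10),      # negative slack, so it gets fixed
    Cell("eco_buf", 4, 1, 0.10),  # new ECO cell dropped on top of u3
    Cell("u3", 4, 2, 0.20),
    Cell("u4", 6, 2, 0.50),
]
fix_critical(row)
ok = ripple_legalize(row, row_width=10)
placement = {c.name: c.x for c in row}
```

In this example the ECO buffer keeps its target site, `u3` and `u4` each shift by one site, and the fixed cell `u2` does not move. A real legalizer would also push left, work across rows, and respect routing tracks.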
So in this approach, we are not creating any new violations. At the same time, we are taking care of everything together, without giving up one cost function for another.
For slack-based cell prioritization, we can create four categories, as I mentioned. First are the most critical slack cells, which have the highest priority; generally, we keep our flops fixed. Then come moderately critical slack cells, for example a three-input AND gate that has critical connections to multiple cells; those can be considered secondary critical slack cells. Then there are minor negative slack cells and noncritical slack cells. In this scenario, category-one cells are hard fixed, category two is hard fixed, category three is soft fixed, and category four is unfixed. We first move the noncritical cells and legalize them, then unfix the category-three cells and legalize them, then move the category-two cells. The first category, I think, we should not move. We can just change some of the tracks here and there to make sure there are no DRC violations because of those cells.
With such a methodology, this progressive or looped enhancement gives us a very good placement of the critical cells. We are not creating any new violations in terms of DRC or timing, and we are maintaining the performance of the design. In recent experience, I have seen these algorithms be very useful in CPU or GPU designs where we are trying to push as much density as possible.
The iterative legalization process, again, as I mentioned: we fix all the high negative slack, or critical slack, cells and legalize the movable cells. Then we go back and evaluate timing, and we don't need to run the full timing update. We can run an update on just the cells affected by the ECO.
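The four slack categories and the order in which they are released for legalization can be sketched as follows. This is only an illustration: the threshold values, the function names, and the example cell names are hypothetical, and the talk also uses cell type (for example, keeping flops fixed), which this slack-only sketch ignores.

```python
from enum import IntEnum


class Category(IntEnum):
    MOST_CRITICAL = 1    # hard fixed, not moved
    MODERATE = 2         # hard fixed, released last
    MINOR_NEGATIVE = 3   # soft fixed, released second
    NON_CRITICAL = 4     # unfixed, legalized first


def categorize(slack_ns, severe=-0.100, moderate=-0.050):
    """Map a slack value to a category; both thresholds are assumptions."""
    if slack_ns <= severe:
        return Category.MOST_CRITICAL
    if slack_ns <= moderate:
        return Category.MODERATE
    if slack_ns < 0.0:
        return Category.MINOR_NEGATIVE
    return Category.NON_CRITICAL


def legalization_passes(slacks):
    """Return (passes, locked): passes in release order, plus locked cells."""
    order = (Category.NON_CRITICAL, Category.MINOR_NEGATIVE, Category.MODERATE)
    groups = {cat: [] for cat in Category}
    # Least critical cells come first inside each pass.
    for name in sorted(slacks, key=slacks.get, reverse=True):
        groups[categorize(slacks[name])].append(name)
    passes = [(cat, groups[cat]) for cat in order if groups[cat]]
    return passes, groups[Category.MOST_CRITICAL]


slacks = {"ff_a": -0.150, "and3_b": -0.060, "buf_c": -0.010,
          "inv_d": 0.200, "buf_e": 0.050}
passes, locked = legalization_passes(slacks)
for category, names in passes:
    print(category.name, names)
print("locked:", locked)
```

Each pass would be followed by an incremental timing check before the next category is released, which is the loop the talk describes.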
Then we unfix the less critical cells and move those, and we can probably fix all these timing violations in three or four cycles and move on. So this methodology not only takes care of timing. We can also create a grouping where we maintain the critical path timing without creating any new DRC violations.
Let's see the stream algorithm implementation. We sort the cells based on priority and then create the ordered queue. To do this, we can report the max rise slack in any tool, tag the cells accordingly, and create a list of those cells. We can look into the placement analysis, then unfix the less critical cells and place them in the available neighboring empty area. Then we can go for optimal location selection for the rest of the cells, which are critical, so that we honor the minimum wire length algorithm as well. And then we can do the incremental timing updates. In that way, we are taking care of everything at the same time.
Now some of the quality of results I have seen in designs based on this approach. WNS, which is worst negative slack, improved by 15% by utilizing this approach for legalization. Initially, let's say the design was at 70% utilization, and the tool threw a cell somewhere far away and created a new slack violation; to fix that, it inserted a new cell, which created another new slack violation. That kind of iterative process took a very long time to close because of the new slack violations. In this approach we do it in a very controlled way, observing everything at the same time, and so a WNS improvement of 15% was observed during ECO application. When we see a good WNS improvement, TNS will automatically improve as well. These are the things we noticed. At the same time, I'm taking care of my ECO in one single shot and not looking at new ECOs arising from different kinds of ECO requirements.
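For readers less familiar with the two metrics mentioned above, WNS is the minimum endpoint slack and TNS is the sum of all negative endpoint slacks. The short sketch below computes both and a percentage improvement. The slack values are made-up illustrative numbers, not data from the talk, and `improvement_pct` is a hypothetical helper.

```python
def wns(slacks):
    """Worst negative slack: the minimum endpoint slack."""
    return min(slacks)


def tns(slacks):
    """Total negative slack: the sum of all negative endpoint slacks."""
    return sum(s for s in slacks if s < 0.0)


def improvement_pct(before, after):
    """Percent reduction in the magnitude of a negative metric."""
    return 100.0 * (after - before) / abs(before)


# Illustrative endpoint slacks in ns (assumed values).
before = [-0.200, -0.050, 0.030]
after = [-0.170, -0.020, 0.030]

wns_gain = improvement_pct(wns(before), wns(after))
tns_gain = improvement_pct(tns(before), tns(after))
print(f"WNS improvement: {wns_gain:.1f}%")
print(f"TNS improvement: {tns_gain:.1f}%")
```

Because every violating endpoint contributes to TNS, avoiding newly created violations during legalization tends to help TNS even when the worst path itself changes only a little.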
So I'm reducing the whole ECO cycle to fewer iterations. Let us say I fixed one ECO and created some new timing violations, fixed those, and created some new timing violations again. That process is not happening now. So I can reduce my ECO cycle time by, say, 40 to 50%, and I'm applying fewer ECOs now. As for the runtime overhead, when I'm legalizing only some cells, the placer can go and legalize those cells very fast and come back. I'm also not doing the complete timing update, so that gives me some advantage as well when I'm doing the legalization or timing updates.
These are some of the QoR checks from a seven-nanometer mobile SoC implementation. What I noticed during ECO implementation is that I'm not stuffing in new cells, and there is less need for adding more buffers and other things while adopting this algorithm. So in the areas where the ECO needs to be applied, there is a huge amount of congestion reduction. By adopting these methodologies, as I mentioned, if we do it very carefully, new violations will not come, so the ECO requirement also reduces. In that scenario, I have seen that I'm applying only half of the ECOs or fewer. And when I'm applying far fewer ECOs, my timing closure is much faster. I have seen this kind of scenario in CPU designs and GPU designs, and I've worked on a neural network accelerator. The designs were really, really congested, and these requirements had to be adopted and implemented in such a way. These are some of the advantages I have seen in the designs, which definitely helped me get the chip out on time. Yields were really nice. When the silicon came back, we did not see many problems, since we had taken care of all these things at the same time.
So, some of the implementation considerations for these kinds of methodologies. When we go with partial ECO implementation, doing legalization and timing closure step by step, and then run routing again and look back at the result, we have a runtime optimization scenario where we can look at a very specific area, target it for implementation, and definitely improve the accuracy. These methodologies can be integrated into the flows very easily. I have written some Tcl scripts, which the tool can read and also write. Based on that, we can write a CSV file listing the cells and their slacks, do some sorting on it, and the tool can read it back and apply the algorithm. So it is very easy to use from the integration perspective: we can apply it in existing flows as well as develop new methodologies. Parameter tuning is another thing. A lot of this can be guided by the tool as well. The tool already has some of the stream commands built in, so we can call those commands and use them to do this kind of ECO.
The conclusion here is that when we do timing closure this way, it significantly improves predictability. It definitely reduces a lot of ECO iterations, and when we are looking at very high-frequency, highly congested designs, this is the only way to go for any ECO application. In the industry, I have worked on seven-nanometer designs, five-nanometer designs, and lower nodes of high-performance designs. This is really important, and there are a lot of industry applications in other designs as well. In the future, a lot of things can be added on top of this algorithm. If we are looking at power, or at area, these legalization schemes can help in those aspects as well. So this methodology, overall I'll say, presents a new era of legalization for ECO cells.
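The CSV hand-off mentioned above can be pictured with a small sketch. The talk's scripts are Tcl running inside the P&R tool; this Python version only illustrates the idea of writing cells and slacks to a CSV file and reading them back in sorted order. The column names `cell` and `slack_ns` and both function names are assumptions.

```python
import csv
import io


def write_slack_csv(slacks, stream):
    """Write one row per cell: name and slack in ns (assumed columns)."""
    writer = csv.writer(stream)
    writer.writerow(["cell", "slack_ns"])
    for name, slack in slacks.items():
        writer.writerow([name, f"{slack:.3f}"])


def read_worst_first(stream):
    """Read the CSV back and return (cell, slack) pairs, worst slack first."""
    rows = [(row["cell"], float(row["slack_ns"]))
            for row in csv.DictReader(stream)]
    return sorted(rows, key=lambda item: item[1])


# An in-memory buffer stands in for the file exchanged with the tool.
buffer = io.StringIO()
write_slack_csv({"inv_d": 0.200, "ff_a": -0.150, "buf_c": -0.010}, buffer)
buffer.seek(0)
queue = read_worst_first(buffer)
for cell, slack in queue:
    print(cell, slack)
```

Keeping the exchange format this simple is what makes it easy to slot into an existing flow: any script that can emit two columns can feed the legalization step.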
It also helps in all kinds of PPA aspects of the designs and can be easily integrated and connected to any flow in any technology, based on the requirement. That is all from my side for today, and thank you very much for listening. Please feel free to reach out to me if you have any questions. Thanks. Have a great day. Bye.

Puneet Gupta

Principal ASIC Physical Design Engineer @ Block


