Optimizing Cache Usage in Docker Builds

Abstract

Discover practical ways to speed up your Docker builds by optimizing cache usage. This talk covers simple techniques like organizing layers, using bind mounts, and working with cache mounts. Ideal for developers who want to save time and improve their build process.

Transcript

This transcript was autogenerated. To make changes, submit a PR.

Hello everyone. My name is Ijeoma Eti, and I'll be talking about optimizing cache usage in Docker builds. About me: I'm a software engineer, and I'm very passionate about sharing knowledge through writing. I'm also an active contributor to the open source community, and you can connect with me on LinkedIn or Twitter.

Let's get started by talking about why Docker builds can feel very slow. One of the biggest pain points with Docker builds is that even the smallest code change can trigger a full rebuild. That means you are often rebuilding parts of your application that haven't even changed, and this can significantly increase build times. Without optimizing your Dockerfile, Docker will usually run through unnecessary steps again and again, wasting compute power. If you've ever worked in a CI/CD environment, this is a very real problem, because it causes long build times and slows down your testing and deployment, which ultimately reduces developer productivity. No one wants to sit around waiting for builds to finish when they could be shipping features.

So why does this optimization really matter? First, it speeds up development and allows developers to iterate and test their changes quickly. It also saves money, because a faster build means we don't have to spend as much compute power, which can reduce infrastructure cost.

Now let's take a closer look at how Docker builds actually work. The first thing that happens when you start a build is that Docker loads the build context. When you run docker build, it gathers all the files in the directory where you are building the image. These files make up the build context and can affect the build process. The next thing it does is parse the Dockerfile.
Docker reads the instructions in the Dockerfile and executes them one by one. The next step is that it creates immutable layers. Docker creates a new layer for each instruction in the Dockerfile, and these layers are immutable: once they're built, they do not change. The last thing that happens is that Docker uses caching to speed up rebuilds, that is, when you need to build your application again using Docker. Docker reuses the unchanged layers to avoid redundant processing, making your build much faster.

So this is an example, and let me break it down really quickly. The first line is FROM python:3.10, which is the base image. Ideally you could use something like an Alpine image, which is much smaller. Then it copies the requirements.txt file first, since this is a Python project, and then it installs the dependencies. This step can be cached if requirements.txt doesn't change: if you do not add any new module or package to requirements.txt, it doesn't install the dependencies again. The next thing is that it copies the rest of the application files, and then it defines the command to run the application.

But there's one very important thing: the order of the instructions really matters. Placing COPY requirements.txt and RUN pip install first helps Docker cache the dependencies separately from the application code. This avoids unnecessarily reinstalling dependencies when only the code changes.

Now that we've covered the fundamentals, let's explore ways to optimize caching for even faster builds. How does caching work? Now that we understand how Docker builds work, let's talk about caching.
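The effect of instruction order described above can be sketched with a small Python model. This is a simplified illustration, not Docker's actual cache algorithm: the Dockerfile text is a reconstruction of the one described in the talk (the slide itself is not in the transcript), and `app.py`, the `flask` pin, and the helper names `layer_keys` and `rebuilt_steps` are hypothetical.

```python
import hashlib

# Hypothetical reconstruction of the Dockerfile described in the talk.
DOCKERFILE = """\
FROM python:3.10
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["python", "app.py"]
"""


def layer_keys(dockerfile, files):
    """Return one cache key per instruction (simplified model).

    Each key hashes the parent key plus the instruction text. For COPY,
    the contents of the copied files are hashed in as well.
    """
    keys, parent = [], ""
    for line in dockerfile.strip().splitlines():
        h = hashlib.sha256((parent + line).encode())
        parts = line.split()
        if parts[0] == "COPY":
            src = parts[1]
            # "COPY . ." covers every file in the build context.
            names = sorted(files) if src == "." else [src]
            for name in names:
                h.update(name.encode() + files[name].encode())
        parent = h.hexdigest()
        keys.append(parent)
    return keys


def rebuilt_steps(dockerfile, old_files, new_files):
    """List the instructions whose cache key differs between two builds."""
    old = layer_keys(dockerfile, old_files)
    new = layer_keys(dockerfile, new_files)
    lines = dockerfile.strip().splitlines()
    return [line for line, a, b in zip(lines, old, new) if a != b]


# Two hypothetical build contexts: only the application code differs.
before = {"requirements.txt": "flask==3.0.0\n", "app.py": "print('v1')\n"}
after = {"requirements.txt": "flask==3.0.0\n", "app.py": "print('v2')\n"}

for step in rebuilt_steps(DOCKERFILE, before, after):
    print("rebuild:", step)
```

In this model, a change to `app.py` only reports the final `COPY . .` and `CMD` steps as rebuilt, while the dependency install step keeps its key. Moving `COPY . .` above the `RUN pip install` line would put the install step in the rebuilt list too, which is the ordering problem the talk warns about.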
Caching is one of the most important ways Docker speeds up builds. So what is the Docker cache? Basically, Docker saves previously built image layers and reuses unchanged steps whenever possible. This avoids rebuilding the parts of the image that haven't changed, making the build much faster. For example, if you install dependencies in a Dockerfile and they haven't changed, Docker reuses the previously built layer instead of running the installation process again.

A key concept is layer caching: each instruction in the Dockerfile creates a layer, and some layers can be cached while others can't. Let's go over the table in the slides. On the previous slide I showed you a Dockerfile, so here I'm going to talk about which layers in that Dockerfile can be cached. The FROM python:3.10 line is cacheable. The base image is usually cached, so unless you pull a new version, this layer remains the same. The working directory doesn't change often, so Docker usually caches that step. The step for copying requirements.txt is also cacheable: it stays cached as long as requirements.txt remains the same. And for the install step, if requirements.txt hasn't changed, Docker will reuse the cached dependencies. The command to copy the entire application code is usually not cacheable. This step copies all the files in the directory, and if any file changes, this step breaks the cache and is forced to rebuild. This is why it's important to structure your Dockerfile strategically.

Now, the problem: why Docker builds can become inefficient. Even with caching, Docker builds can be very slow, and here's why. One thing is that small mistakes in the Dockerfile can invalidate the cache and force unnecessary rebuilds. Another thing is the structure.
If the instructions are not properly ordered, this can also force an unnecessary rebuild. Another thing is frequent dependency changes: if you are constantly updating your dependencies, you're forcing Docker to reinstall everything in requirements.txt every time. Then there is copying everything early: if you copy all your files at the beginning of your Dockerfile, any change can break the cache and trigger a rebuild. Also, bigger images usually take longer to build and deploy.

Some common pitfalls, as I said earlier, are poorly ordered instructions. Another is using ADD instead of COPY. ADD can extract files unnecessarily, so try to use COPY instead of ADD unless you are downloading or extracting archives. Also, any change to a copied file invalidates the cache, so be very explicit about the files you copy, and try not to always use wildcards.

Best practices for optimizing Docker builds using the cache: structure your Dockerfile for maximum cache reuse. You need to place stable instructions before the frequently changed ones. You can use multi-stage builds to reduce the final image size: you only want to keep what's necessary in the final image, to reduce bloat. Make sure to leverage the .dockerignore file to reduce the build context size, so exclude any unnecessary files that could slow down the build. Another thing we want to do is use build arguments, that is ARG, instead of environment variables. Because ARG is only available during build time, it keeps your image cleaner.

We can see from the Dockerfile I placed here how it is properly structured. It is optimized because the dependencies are copied first, so they are cached separately.
The application files are copied last, which prevents unnecessary cache invalidation. Those are the basic caching techniques. Next I'll be talking about advanced caching techniques.

Now that we've covered the basic caching techniques, there are some more advanced ones, and the first is using mounts for build caching. One powerful way to speed things up is by using mounts for caching. Bind mounts or volume mounts can help persist dependencies across builds and reduce redundant installation. We can add --mount=type=cache with BuildKit. This allows Docker to cache intermediate build files without creating unnecessary layers, which helps improve performance.

We can also leverage external cache sources. Docker has Buildx for distributed caching, and this allows the cache to be stored and shared across multiple platforms and environments. I've placed an example here of how to store the build cache in a registry so every new build can reuse it. This ensures that builds reuse the cached layers, making them much faster, especially if you're working in a CI/CD environment.

Another thing: once you've set up your cache layers properly, you want to measure or debug your build performance, and there are different ways of checking it. One way is by using docker build --progress=plain. This gives you more detailed output so you can check whether the layers are being reused. You can also use docker history, which lists the image layers and their sizes so you can see exactly what is cached; it is also how you can analyze a Docker image. And you can use time docker build, which measures the duration so you can track whether your caching has improved. All these tools help spot any inefficiencies.
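The registry cache and timing steps mentioned above could be wired together roughly as follows. This is a minimal sketch, not the example from the slides: the registry references are placeholders, the helper names `buildx_command` and `timed_build` are hypothetical, and exporting cache to a registry may require a Buildx builder that supports cache export.

```python
import shlex
import subprocess
import time


def buildx_command(image, cache_ref, context="."):
    """Build the argv for a Buildx build that reads and writes a registry cache."""
    return [
        "docker", "buildx", "build",
        "--progress=plain",  # detailed log, useful for checking which steps are reused
        "--cache-from", f"type=registry,ref={cache_ref}",
        "--cache-to", f"type=registry,ref={cache_ref},mode=max",
        "--tag", image,
        "--push",
        context,
    ]


def timed_build(argv):
    """Run the build and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(argv, check=True)
    return time.perf_counter() - start


# Placeholder references; replace with a registry you can push to.
cmd = buildx_command(
    "registry.example.com/app:latest",
    "registry.example.com/app:buildcache",
)
print(shlex.join(cmd))
```

Printing the command keeps the sketch safe to run anywhere. Calling `timed_build(cmd)` twice on a machine with Docker and registry access would give a rough before and after comparison, similar to running `time docker build` by hand.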
Fixing the inefficiencies you find keeps your builds running really fast.

In conclusion, Docker build caching is essential for speeding up builds and reducing resource usage. By structuring your Dockerfiles effectively, avoiding common mistakes, and utilizing advanced techniques such as Buildx, as I mentioned, we can significantly improve build performance. The next step after all of this is to apply it in your real-life projects, and then analyze your build times and check whether the caching works properly, using the techniques I explained earlier. Thank you.

Conf42 Cloud Native 2025 - Online

March 06 2025 - premiere 5PM GMT

Eti Ijeoma

Backend Engineer @ Manufactured
