Conf42 Golang 2023 - Online

Memory Management in Go: The good, the bad and the ugly


Abstract

Memory management will always be a contention point for certain projects when working in cloud native environments. The aim of the game is to be as low-cost as possible, but this is hard with poorly written code. Managing memory correctly is key, and in this talk Liam outlines how to do exactly this.

Summary

  • Today I'm going to be focusing on memory management in Go. Going to look at a few code examples, some good and some bad, and help explain them. Also looking at memory management in some other languages, namely Java, Python, and some Rust. Also going to throw in a live demo at the end.
  • Memory management is a way of keeping track of memory locations in your program and on your system. It helps to prevent memory leaks, stop security vulnerabilities appearing, and stop your system and programs from slowing down. Understanding how to manage memory is really important.
  • A goroutine is a lightweight execution thread. It helps to create parallelism, and it helps you have asynchronous running of your code. How do you manage memory in Go? There are two ways to do that.
  • A memory leak is typically when you have a memory allocation that is referenced but it's no longer needed and it's not freed up. This can eventually cause your program to crash or slow down your system significantly. How can you personally help as a developer?
  • The next one, the garbage collector. We are declaring a global variable of data. Once it's completed and once it's finished, we have still got a problem, because that global variable is still there. How do you fix it? Well, we give it a local scope; and for the recursive example, we reduce the recursive call number.
  • There is absolutely no guarantee that task one will finish before task two. Perfect example of a go routine and parallel execution. Let's have a look at how you can dictate and manage memory in your program.
  • Rust uses ownership and borrowing, a completely different approach to how go works with memory. Python's somewhat similar. It uses a garbage collector, and it uses a technique called reference counting. What are my top tips for effective memory management?
  • In VS Code, we can see how much memory is being used in our programs. We can see leaky function right at the top with a use rate of 91%. And of course we could see a lot more of it.
  • So let's head back to the slides and we will just finish off with that. I would like to say a massive thank you to everybody for joining my talk today. And if you have any questions, please do reach out to me on social media.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hello, everybody, and welcome. I hope you're all enjoying the conference, and welcome to my talk. Today I'm going to be focusing on memory management in Go, and I've namely called it the good, the bad, and the ugly. And that's because we're going to be covering a whole bunch of topics around memory management. So a quick agenda. What is this session going to be about? I'm going to tell you a little bit about me, who I am, what I do, and where I came from. Then an introduction to memory management itself, how Go's memory model works, and how you manage memory in Go. We're going to look at a few code examples, some good and some bad, and help explain them. And then we're going to look at some memory management in some other languages, namely Java, Python, and some Rust. We're also going to look at some top tips, and I might even throw in a live demo at the end as well. So who am I? Well, my name's Liam Hampton and I'm a Microsoft senior cloud advocate. I'm an all ser ambassador because security is something I always think about, and I feel whenever I speak to developers, it's not always top of their list. I'm a DevNetwork advisory board member and I like to write a lot of Go code. That's my background. I also like to travel the world, as you can see, with a few pictures here on the slide. So with everything I do, I like to have some learning goals. If there's anything that I would like you to take away from my talk today, it is to understand the Go memory model and understand how to manage memory in Go. Let's talk a little bit about memory management holistically. What is it? Well, memory management is a way of keeping track of memory locations in your program and on your system, regardless of whether they are allocated and referenced, shall we say, or unreferenced. Memory management is a holistic view of what's going on and how your program can run perfectly without falling over, without failing, and without running out of memory on your system. Well, why is that important? We ask that for a number of reasons, but namely to prevent memory leaks, stop security vulnerabilities appearing, and to stop your system and programs from slowing down. You need to understand how to work efficiently on your system, and you want to be the most cost effective you can be. Therefore, understanding how to manage memory is really important, not just programmatically, but in how you understand the way the system works. So let's take a step back and look at the generics of memory management, everything that we know and love already. Stacks and heaps: what are they? Well, a stack stores local variables and function call frames. So whenever you kick off a new function, whenever you call it, it creates a call frame and that is pushed onto a stack. Now, that brings me to the second point of it being last in, first out. So let's look at this as if you're stacking some books on the diagram. We have got book one and you're pushing number two, pushing number three, four, five and six. And it creates a lovely stack, quite aptly named, if I say so myself. And then let's say you want to get the first book from the stack. You need to take each one off the top, one at a time. So it would go six, five, four, three, two and one. That's pretty typical for a stack, because it's supposed to be quick, agile and fast. However, if we look at the sizes, stacks are typically fixed in size.
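To make that last in, first out behaviour concrete, here is a minimal sketch of the book analogy as a slice-backed stack in Go (an illustrative example, not code from the talk):

```go
package main

import "fmt"

func main() {
	// Push books one to six onto the stack.
	stack := []int{}
	for book := 1; book <= 6; book++ {
		stack = append(stack, book) // push
	}

	// To get back to book one you have to pop from the top:
	// six, five, four, three, two, one.
	for len(stack) > 0 {
		top := stack[len(stack)-1]   // the last book pushed
		stack = stack[:len(stack)-1] // pop it off
		fmt.Println(top)
	}
}
```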
If we look at a standard, typical Linux distribution, the default size is about eight megabytes. That's how big a thread's stack is allowed to be. However, if we look at how Go manages that, and we'll talk a little bit about this later on, we use something called goroutines, as you may or may not be aware. These are another layer of abstraction away from the operating system, which typically deals with threads. But this is something inside the language already. Now, this actually helps us with stack allocation, because a goroutine's stack starts at about two kilobytes of memory, quite small. So that's really good, really fast and really efficient. Now let's look at a heap. What does a heap look like? Well, imagine it's like a cloud, as we can see here on the diagram, and next to it we have got two stacks. These store dynamically allocated memory. It is not for quick allocation, and it's definitely not for grabbing quick bits of data out of memory. It is there for longevity. These grow and shrink during the execution of a program, which makes it dynamic, but that also makes it slower and a lot less efficient. So anything that cannot be stored in a stack is typically put into the heap, which is good, but that can become a problem later down the line. So as we said, stacks are for short lived data and heaps are for long lived data. So, longevity. How does Go manage all of this? I've kind of alluded to it already, but what does Go's memory model look like? Well, it has a garbage collector. It is pretty famous for its garbage collector. It's one of the key features of the language, and it automatically goes around after your program to reclaim memory that was allocated. So it does it automatically for you and gives you a hands-off approach to memory management; unlike some other languages, like I said, it is not manual, and it reduces security and leak risk. Now, this is really important. One of the key problems that we have with memory management as it stands is security vulnerabilities. Understanding how to close connections, how to close sockets, is important. Understanding what happens to your memory when you're allocating global versus locally scoped variables is also really important. This helps to prevent the leaks and security risks that you may see. The garbage collector, like I said, runs around after you and helps to free up the unreferenced memory for you. Pretty cool, and it's really, really important. Next we have goroutines and channels. Well, what is a goroutine? I've kind of already spoken about it a little bit, but it is a lightweight execution thread. It's a layer abstracted away from the operating system, which typically deals with your threads, but it's a function that executes concurrently with the rest of the program. It helps to create parallelism, and it helps you have asynchronous running of your code, and that in turn is very cheap. It helps you have a lightweight program that runs asynchronously. It's brilliant. And that means less overhead, which is even better. So when you actually go to write a goroutine, or when you want to dictate or sort of have it in your program, you would just put the word go and then the function signature. And I'll show you an example of this afterwards.
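As a quick concrete illustration of that go keyword, here is a minimal sketch (the sync.WaitGroup is only there so main does not exit before the goroutine has printed):

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	var wg sync.WaitGroup
	wg.Add(1)

	// Putting go in front of the function call runs it as a goroutine:
	// a lightweight thread managed by the Go runtime, starting with a
	// very small stack that grows as needed.
	go func() {
		defer wg.Done()
		fmt.Println("hello from a goroutine")
	}()

	wg.Wait() // block until the goroutine has finished
}
```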
What's a channel? Well, a channel is the transportation between the two, or more, goroutines that you may have. It's a communication means; it actually allows you to send data to a channel from a goroutine and then pull it off again, so they talk to one another. Because what's the point in having a goroutine that's processing some data and another goroutine processing some more data and them not being able to communicate? Well, this is exactly what a channel is for, and it helps to prevent race conditions, locks and other synchronization problems that you may face. And again, with the syntax, you dictate it with the chan keyword, which, I mean, I think Go has about 25 keywords. It's a really awesome language like that. But when you want to write to it, let's say we have c. C is the channel, and we want to take x and put the value of x onto the channel. Well, you'd do that with the notation of an arrow. And likewise, when you want to read from the channel, you would take that arrow and put it on the other side to take it off again. Quite simple. So in essence, what does the memory model look like and what does it do? Well, it ensures the program doesn't run out of memory by using the garbage collector, and it really helps you a lot. It allows goroutines to communicate safely and keep good state. Therefore it is perfect to run your parallel code. Brilliant. Now, that's a whistle-stop tour. There's obviously a lot more to it, but we're going to keep it at that high level just for now. How do you manage that memory? So, once you have looked at the garbage collector, and we have looked at goroutines and channels, how do you manage memory in Go? Well, there are two ways that can help you do that. The first is the new function. Now, this is allocating memory of a variable for a given type, and it is zero valued at this point. So let's take an example. We've got pointer at the bottom, and we are going to be calling the function new and we're going to give it the type of an integer. Therefore, if we want to get the value of pointer, well, then it's just going to be zero, its zero value from the beginning. And the second way is to use the make function. Now, this is allocating memory for data structures. So if you want arrays or slices, maps, channels, and you then want to use them straight away with a default value, this is when you'd use the make function. Again, let's look at an example. We want to make a slice. So we say we want to declare a slice of integers and we want to give it the values three and five, which with make are the slice's length and capacity. This is how you would do that, and this is where you'd use make. So you'd use new when you want to initialize a variable and then use it later on in your program, which is totally fine. And then you can use make when you want to create a data structure and use it straight away, also fine. Two really good ways to help with memory management in Go. So what about memory leaks? I've said it a few times and I'm going to say it again: memory leaks are bad. Now, how can we avoid them? What is it? Well, a memory leak is typically when you have a memory allocation that is referenced but it's no longer needed and it's not freed up. So this can eventually cause your program to crash or slow down your system significantly. This is really bad, and we don't want to run into any of these, and there are ways to avoid them. But let's look at some typical scenarios of when you'd come across this. So if you're not terminating a goroutine completely, or properly rather, this can continue to hold on to allocated memory and it just holds it.
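A hedged sketch of that scenario: a goroutine that blocks forever on a channel nobody sends to, so it never terminates and never releases what it holds.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// leakyWorker starts a goroutine that blocks forever on a channel
// nobody ever sends to. The goroutine never terminates, so the memory
// it references is never freed: the classic goroutine leak.
func leakyWorker() {
	c := make(chan int)
	go func() {
		buf := make([]byte, 1<<20) // ~1 MB held by the leaked goroutine
		<-c                        // blocks forever: no one ever sends on c
		_ = buf
	}()
}

func main() {
	for i := 0; i < 100; i++ {
		leakyWorker()
	}
	time.Sleep(100 * time.Millisecond)
	// Every one of those goroutines is still alive, still holding memory.
	fmt.Println("goroutines still running:", runtime.NumGoroutine())
}
```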
It doesn't do anything with it, it's just there. But that memory block is now stuck. You're not freeing it up. Now imagine if you had a number of goroutines doing the same thing. Well, then you'd have a memory allocation of this much, and it would eventually fill up bit by bit, causing a problem later down the line. So if somebody else, or another goroutine, wanted to use a piece of memory, well, then it can't, because it's already taken up. What do you do? Well, it's just going to fall over. Secondly, another really common mistake that I see is assigning global variables and never using them again. And I'm going to show you an example in a moment. But you never want to assign a global variable and do nothing with it. You always want to clean up after yourself, and it doesn't always happen. Again, a scenario of a memory leak. And of course, the famous infinite loop. You're going to be taking memory upon memory upon memory and you're not going to be doing anything with it. It's just going to be dormant. That is a classic example of causing memory leaks. What tools can we use, then? Well, there are a few that we can use, and the Go toolchain has a really powerful one, pprof. This is basically a built-in package that can be used to analyze and understand your Go program and your functions. And I'm going to show you an example, hopefully at the end. How can you personally help as a developer? Well, there are a few things. Number one, it's pretty obvious: be vigilant. Don't use global variables if you're not going to allocate them and deallocate them efficiently and properly. You want to understand the code that you're writing, and you need to understand it properly. And secondly, the defer keyword. Now, this will help to reduce leaks with files, sockets, database connections, anything that you do not want to leave open. If you are opening, I don't know, say a file, and I'm going to show you this in a moment, you're going to want to close that file regardless of whether the function is going to pass or not, or whether it's going to complete. So let's have a little look at some code. Here's a good example of the defer keyword in this bit of code. We're opening file.txt. We're then going to check for an error, which is pretty typical, and then we're going to defer the closure. Now, this is the really important part. This file close will execute after the surrounding function returns, which means that even if that function fails or doesn't complete processing properly, the file is still closed. Really important, because we don't want to leave that open. That's when we use the defer keyword: network connections, database connections, sockets, files, things that you don't want to leave open, which can also create security vulnerabilities. Again, not something we want to do. The next one, the garbage collector. What are we doing here? Well, we're creating a struct called mystruct of type struct, and inside there we have got data, and that is of type []byte, so it's a byte slice. In the main function, we are declaring it as a variable. We are then going to allocate it with about 100 megabytes. This is a really, really good example of the garbage collector, because once this function ends, it's then going to clear up after itself. The garbage collector can reclaim the memory that mystruct has been allocated at this point. So once it's completed, it's done, and it will clear it off. Perfect.
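A minimal sketch of those two good patterns together might look like this (assuming a file named file.txt exists next to the program):

```go
package main

import (
	"fmt"
	"os"
)

type myStruct struct {
	data []byte
}

// process allocates roughly 100 MB, but only with local scope. Once it
// returns, nothing references the struct any more, so the garbage
// collector is free to reclaim that memory.
func process() {
	s := myStruct{data: make([]byte, 100<<20)}
	_ = s
}

func main() {
	// Defer the close immediately after a successful open, so the file
	// is closed even if the rest of the function fails.
	f, err := os.Open("file.txt")
	if err != nil {
		fmt.Println("error opening file:", err)
		return
	}
	defer f.Close()

	process()
}
```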
Now let's look at some bad examples. In here, we are declaring a global variable called data, and it's of type []byte. Inside the main function, we are giving it, again, 100 megabytes. However, once it's completed and once it's finished, we have still got a problem, because that global variable is still there. So how do you fix it? Well, you move that data allocation into the function and you give it a local scope. At this point, we don't want it to be global. We don't want it to exist outside of this function, because it's never going to get cleared up. The garbage collector is just going to look at it and think, you know what, it's still there, it's still referenced, it's still being used, but it's completely dormant. So you make it locally scoped so that when that function does complete, it is then cleared up. All of that data is freed up for the next one. Let's look at another bad example. Recursive functions, everyone's favorite. Right, in here we are calling recursion and we're giving it a big number, and that is then going to call the recursion function a number of times. Inside here we have an if statement: if n is not equal to zero, then call the same thing again, and the same thing again, and the same thing again. That is really bad, because you are continuously taking up memory and you're continuously iterating over this loop, and you don't want to be there. So it's eventually going to run out of memory at some point. Okay, or it might not run out of memory; it might slow down significantly during its execution. Admittedly, you're going to need a much bigger number than this to really bring something down on modern-day systems, but this is still a very valid example. How do you fix it? Well, we reduce the recursive call number. That's a pretty obvious one. Okay, there is no golden nugget for this one. However, reduce the call number, or use a controlled iterative solution such as a range or a for loop, something that we can control specifically as a developer. You don't want to just leave it to its own devices and continue doing it time after time after time. So now let's look at some goroutines, let's look at some code and how it really works. So what do we have? Let's go over to the browser, into the Go playground, and let's have a look, shall we? All righty. So over here we have got a goroutines function. We have got go task number one and go task number two, and then we're going to wait for a little bit. We're going to wait for those tasks to finish. In here, all we're doing is printing task number one and task number two. That's all we're doing. Nothing too complicated. However, when we're running task number one and task number two in goroutines, there is absolutely no guarantee that task one will finish before task two. And we're going to see if we can get it to do it today. So let's run this function a couple of times and see what we get. No guarantee it'll even work this time, but we're going to see if we can try and find out. So we have one and two, we've got two and one. Straight away, two has finished before number one. A perfect example of a goroutine and parallel execution.
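That playground snippet was roughly along these lines (a hedged reconstruction; the exact code in the talk may differ, and the sleep is simply the "wait for a little bit" mentioned above):

```go
package main

import (
	"fmt"
	"time"
)

func task(n int) {
	fmt.Println("task number", n)
}

func main() {
	// Both tasks run as goroutines, so there is no guarantee that
	// task one finishes before task two.
	go task(1)
	go task(2)

	// Wait a little bit so the goroutines get a chance to finish
	// before main exits.
	time.Sleep(100 * time.Millisecond)
}
```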
Let's have a look at channels. I briefly spoke about them; let's have a look at them in action. So first of all, we are creating a new channel. We're calling it c, and we're saying make chan of type int. So, integers. We just want to be working with numbers here. Then we're going to send a value to that channel. We're going to say go, so we're going to dictate another goroutine, and in there we've got the channel and we're going to pass 42 onto the channel. Then what we're going to do is read from that channel. So we want to take the data off of it at this point; we want to pull the data out of that channel. So once this executes, we would expect to see 42. So let's go ahead and run it, and of course we do. Now, the next one is where we're taking it down just a little bit deeper, and we're going to look at pointers and referencing with memory addresses. We are saying Liam's number is equal to 27 as an integer. We are then saying the variable pointer is equal to the memory location of Liam's number. And then we want to print them out. We want to print out the pointer, which is going to be the memory reference or the memory location, and then we want to print out what the pointer points to, so then we actually print out the value. Then we're going to look at a little bit of arithmetic. We're going to say the dereferenced pointer is equal to the dereferenced pointer plus two, and we would expect to get a numerical value. So let's print that out and see what we get. So we go ahead and run it. And down here we can straight away see the value of pointer. So this line here, line number 16, is the memory address location. And then we have another one: line number 17 is 27, the actual data at that memory location. And then we're just going to do a bit of arithmetic. So we're basically substituting the values in: 27 plus two, which equals 29. That is how we're going to be working with memory locations and pointers and references, and you do that with ampersands and little stars. So that's a really good way to look at how you can dictate and manage memory in your program. Okay, you don't want to be overwriting the wrong values. You also want to keep a true value at some point, and this is how you work with them.
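For reference, those two playground snippets were roughly as follows (a hedged reconstruction, so the line numbers mentioned above will not match exactly):

```go
package main

import "fmt"

func main() {
	// Channel example: send 42 onto a channel from a goroutine,
	// then read it back off on the main goroutine.
	c := make(chan int)
	go func() { c <- 42 }()
	fmt.Println(<-c) // 42

	// Pointer example: take the address of a value, print the
	// address and the value, then change the value through the pointer.
	liamsNumber := 27
	pointer := &liamsNumber
	fmt.Println(pointer)  // a memory address, e.g. 0xc000012028
	fmt.Println(*pointer) // 27
	*pointer = *pointer + 2
	fmt.Println(liamsNumber) // 29
}
```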
So let's go back to the slideshow. Okay, so what are we doing next? Let's have a look at some memory management in some other languages. Let's go and have a look at Rust. How does Rust manage memory? Well, it's a little bit different to Go. Rust uses ownership and borrowing, a completely different approach to how Go works with its memory. So ownership is where a piece of data basically has a single owner that is responsible for managing its lifespan, and it's deallocated by the Rust compiler automatically. Completely different. And borrowing is the idea that a piece of data can temporarily be borrowed by another piece of code, but it must be done in a way that guarantees its safety and integrity. So again, it's got really tight security on its data, on how it's passed around and where it's passed around. But it's a really different concept to what we're used to in Go. So let's have a look at an example on the slide. I've got a main function, and we have got a string. So let's say we're giving s the value of a string, and it's going to be hello. Then we're saying we want to calculate the length, but we're going to pass the string into that function, and we're only going to give it a reference to the string. Now we're borrowing the initial value of s with the ampersand. We're passing in a reference to calculate length. From there, we're going to take it in. We're still using it. It's still locally scoped, and it's still only a reference. We're just borrowing that version of s, and then we want to give it an output, so it passes it back into the main function, which is still accessible because it's owned by the main function, and then we're going to print out the value. So it says hello is five, so five characters. That is how Rust is working with its memory allocation: it passes by borrowing and ownership. Completely different to Go. Then we're going to have a little look at Python. Now, Python's somewhat similar. It uses a garbage collector, and it uses a technique called reference counting. So a reference is a way for a program to access an object in memory. Okay? So when a variable is assigned to an object, a reference to that object is created. This is how Python works. It also has a cyclic garbage collector, which basically periodically checks for unreachable, unreferenced objects in memory, and that's how it then frees them up. So if it can't call it, if it can't reach it, it will free it. It also has built-in memory management. This cyclic garbage collector is sometimes a bit of a pain in Python because it runs in the background, and when it executes or tries to free up memory, it typically pauses the execution of your code, at least as I understand it to be. That can then cause security vulnerabilities and memory leaks. So it's maybe not the most efficient way, but it certainly works well in Python. And everyone's old trusty Java. It's actually really similar to Go in a lot more ways than people may think. It uses a stack and a heap in a very similar way. It has a garbage collector which manages the memory on the stack and the heap, and it uses a technique called mark and sweep, very similar to what I just said. It will go around and try to find all the referenced and all the allocated memory, and it marks it. Everything that it finds that is not referenced, it will then sweep away and get rid of. A little bit different to how Go works, because Go actually uses a stop-the-world technique. But that is a conversation for another day. So what are my top tips for effective memory management? Use the defer keyword. It schedules a function to execute later, typically when the surrounding function returns. So always use it. This will help clean up your files, close connections, and release any locks that are still there. Number two, use the garbage collector wisely. It's important to understand you have it. It's important to know it's there, and it works in your favor. But if you are consistently creating and discarding lots of short lived objects, it's going to make your program slower. Too much of something good can sometimes lead to something bad, and in this case you can run into that problem with Go. So be very selective and be very pragmatic when you are making decisions with your code. And number three, monitor memory utilization on your system. Use the tools that are available to you as a developer, such as the utilities on a Mac, which is what I use. Use pprof, use the built-in tools, use the Go toolchain to help you create a more dynamic program. Just understand it from a lower level, and it will help you write good code.
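To make that third tip concrete, here is a minimal, hedged sketch of exposing the pprof endpoints over HTTP. It is similar in shape to the program in the demo below, but simplified; the leakyFunction here just grows a package-level slice so the profiler has something to show.

```go
package main

import (
	"fmt"
	"net/http"
	_ "net/http/pprof" // imported for its side effects: registers the /debug/pprof handlers
	"time"
)

var sink [][]byte

// leakyFunction keeps appending to a package-level slice so the heap
// keeps growing and shows up clearly in the heap profile.
func leakyFunction() {
	for {
		sink = append(sink, make([]byte, 1<<20)) // ~1 MB per iteration
		time.Sleep(100 * time.Millisecond)
	}
}

func main() {
	// Serve the pprof endpoints on port 6060 in a goroutine.
	go func() {
		fmt.Println(http.ListenAndServe("localhost:6060", nil))
	}()

	fmt.Println("hello world")
	leakyFunction()
}
```

With that running, `go tool pprof http://localhost:6060/debug/pprof/heap` followed by the `top` command inside pprof shows which functions are holding the memory.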
In conclusion, there are a few things that I want to say. Memory management is complicated. It is by no means easy, and this talk very much scratches the surface, but hopefully it opens the doors and ignites some inspiration for you to want to dig a little bit deeper into the management of memory in Go. The garbage collector handles most of it for you. Like I said, it's important to know it's there. It's important to know it works in your favor. Just understand it properly. Thirdly, memory management is different across languages. You may be comfortable in one language and it can be completely different in another, which ultimately leads to a different style of coding and understanding. So there's more to it than just knowing how to create a hello world. Understanding how to manage it once you create more complex programs is super important. And of course, leaks are really bad. So with that, before I close off, let's go and see pprof in action. I am going to come out of my slides and open up VS Code. In VS Code, I have got a little program that is running. We've got a couple of imports, pretty standard. We've got pprof also imported, but we're ignoring it. We have the little underscore which, if you write Go, you'll know means you're ignoring anything with an underscore; here we're importing the package just for its side effects. We have a main function which is spinning up a goroutine, and it's going to create a server for me on port 6060. Now, that's really interesting, because port 6060 is actually where pprof runs. We're going to print out hello world and we're going to have a wait group. Now, this basically blocks and allows goroutines to complete at this point. Then we're going to add one to the wait group, and we're just going to call the leaky function. Now, leaky function is a function I've got written at the bottom, which is going to do some fun stuff for us. It's just going to go around a for loop a lot of times. So let's spin this up and see what we get. So let's start the main function with go run main.go. Now, hopefully we should see an output here. Cool, so we got hello world. Now let's go and check out pprof. Let's run go tool pprof and we're going to give it the heap profile at localhost:6060. Right, we're in now. Just like you would do in your terminal if you want to follow some log or something, we're going to use the command top. This is going to help show us where our memory is being utilized. And straight away we can see leaky function right at the top with a use rate of 91%. So it's using a lot of memory, and the longer I leave it, the more it's going to fill up. And of course we could see a lot more of it. Now, there is also another way that we can see this. We can go into a browser and get a full visual view of the heap as well, but that's a completely separate kettle of fish. This is just a really quick way in your terminal to see how much memory you're using and where it's being used in your functions. So let's head back to the slides and we will just finish off with that. I would like to say a massive thank you to everybody for joining my talk today. And if you have any questions, please do reach out to me on social media. I would love to answer your questions. I'd love to have conversations around this. I'm learning just like you. So please do connect, and thank you very much and goodbye.
...

Liam Conroy Hampton

Senior Regional Cloud Advocate @ Microsoft



