Conf42 Golang 2021 - Online

Concurrency for beginners

Video size:

Abstract

As a beginner myself (not so long ago) trying to learn this language, it always seemed like C++/Python like language. Golang is much more than just functions and execution, its probably the easiest to write code which exploits concurrency at its best and in an intuitive and readable manner.

This talk, I’ll discuss “channels” and “go-routines” with an example of MergeSort to demonstrate the power of this language for beginners.

Summary

  • Jay Jayaganesh Kalyanasundaram is a software engineer in Google. This lightning talk will cover his journey of venturing into and understanding concurrency in Golang. He discovered the go keyword, which is basically a magic to do multithreading in a very easy manner. Also came across the channels which he never heard before in any other language.
  • We now use channels instead of a slice, so that we get the stream of the data as and when they are ready. We are benefited by directly transferring the data with channels to improve the concurrency. That's pretty much it for the small lightning overview of how I learned to use concurrency in Golang.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hi, I am Jay Jayaganesh Kalyanasundaram, a software engineer in Google. This lightning talk in the beginner's track will cover my journey of venturing into and understanding concurrency in Golang. When I first came across the syntaxes in Golan, I thought it's almost same as C and has slight differences, like the keyword where for variable and funk for functions. I'm an avid user of oops and found that structs with interfaces can help mimic the thought of classes. Golang also supports weight groups, which can be really useful for lock retentions. Finally, for me, at the first glance, Golang was the performance of c in cpython. Like intuitive language, merge sort is one of the good known algorithms to sort an array with complexity o of n login. It works by breaking the array into two halves and recursively sorting the halves and finally merging them. For me, this is one of the good problems to code in a new language to understand the semantics of functions, variable declaration initializations, array slice manipulations. So I went ahead and coded the same in Golan. The sort function here simply breaks the array into two halves and sorts them recursively and merges them, and finally returns the result. The merge function, on the other hand, uses a two pointer to iterate through the two arrays to be merged, and updates the final array with the smaller of those two elements pointed by those two pointers. You can read more about it separately in the Wikipedia link, which I shared in the previous slide. So after coding the basic merge sort, I dwelled deeper into Golang to understand the other salient features, and discovered the go keyword, which is basically a magic to do multithreading in a very easy manner. I wrote a lot of code in C plus plus with multithreading by using the library threads. Though it wasn't rocket science, it wasn't as intuitive and easy as just using the go followed by the function name. This go keyword launches a go routine, which is the fundamental unit of running a multi threaded program in Golang. I also came across the channels which I've never heard before in any other language. My first thought towards the handling of threads was that this is a server optimization problem, and it's not of much use to me, although because the go keyword was so tempting, I decided to give that a try. With that, I basically wrote the same merge sort with slight channels by using the go routines. So here, as you can see, s one and s two are those half two halves which are basically sorted in those two concurrent threads defer in a scope is basically the last command which will be executed in that scope. So in this case, the wake group release is basically the last command which will be done in the scope of the funk. For each of those goroutines, we use weight group so that we ensure to get both the slices fully sorted before we start merging. As you can see, the waiting for the two goroutines to complete is the part which is slowing the execution down. So I was thinking, how can I speed up this execution? So I was thinking we could stream the data so that we don't need to wait for these total of two halves to be fully sorted and can proceed with the merging as and when we get the parts of the sorted data. So I thought of going back to what I came across, and that was channels. So I started reading a bit more about channels, and my first glance at them was, these are more like water pipes. So it basically transfers data from one end to the other. Okay, so what are the salient features of a channel? First, it has a start and it has an end. So it has a sender and it has a receiver. Second, it has a buffer to hold and this could be zero. This buffer is basically like the length of the pipe. So if the length is zero, imagine it's like a trying. So you just have a ring and you have the source and the sink pretty much with no gap between them. And if the receiver is not receiving more like there is a gap towards the end of the pipe, the sender will be blocked. The sender can't send more water through the pipe. Similarly, if the sender is not sending, the receiver will not know about it and will keep waiting indefinitely. The sender, however, can close the pipe once they have sent all the water or all the data, in this case for the channel, and basically say the channel that they have sent all the data, and the pipe or the channel, once has transmitted all this data to the receiver, will intimate the receiver that it has actually transmitted all the data, and it is going to close the pipe now once for all. So, with this analogy of water pipe, let's get back to writing the same code with channels. We now use channels instead of a slice, so that we get the stream of the data as and when they are ready, instead of waiting for the two fully sorted slices. So here you sre basically sorting those two halves again concurrently and merging them concurrently. That is, as and when you get the data from the individual channels. As a recap from what we did so far, we started off with sorting the two halves of the slice sequentially and then doing the merge. Then we used the go keyword and improved it by sorting those two halves concurrency and then merging them at once, and then finally we sorted and merged them again concurrently with no blocking at all. So we are benefited by directly transferring the data with channels to improve the concurrency. Also here I would highly recommend reading this article in the Go block, which basically explains why it's better to share memory by again communicating rather than communicating by sharing memory. The latter needs log contentions to ensure safe access by different processes. Now a quick anecdote of how I used it practically so we were in a system in our team. We use a system in our team where we had a front end which was surfacing a command line interface for the user. It had multiple backends which it was communicating with, and every backend interaction took a few minutes. Sometimes it was also in the order of about 510 minutes. So overall it was in the order of half an hour to 1 hour. The front ends and the back ends were run as a single binary. So for the user it is just one single CLI command which they execute, or one single binary which is finally executed. The user wanted to know the progress of the CLI, because when you're waiting for half an hour, it's kind of annoying to just see a blank screen. You would want to see something more interactive. So finally we had to solve the problem of how do we report the intermediate progress before the function call to each of the backends actually returns? So we basically built a channel from the front end to each of these backends, and the channel was populated with the progress as and when some significant milestone was reached, and each of this was populated to the front end, which basically aggregated that and showed to the user at appropriate instances. The other benefit we had was there was no extra overhead with logging them in separate place and using some extra db and stuff, so we were able to solve it in a very cheap manner. That's pretty much it for the small lightning overview of how I learned to use concurrency in Golang and the different salient features, including the most powerful of all the channels. Hope it was useful to the beginners. And thank you.
...

Jayaganesh Kalyanasundaram

SRE @ Google

Jayaganesh Kalyanasundaram's LinkedIn account



Awesome tech events for

Priority access to all content

Video hallway track

Community chat

Exclusive promotions and giveaways