Conf42 Site Reliability Engineering 2021 - Online

Migrating a monolith to Cloud-Native and the stumbling blocks that you don’t know about


Abstract

So your company has finally decided to move to the Cloud Native ecosystem. You’ve landed on containerization as your first step. You heard that all you needed to do was containerize your first app and then push it to Kubernetes/OpenShift/Nomad, and the cost savings would just come. You’ve done this, and well, things have not gone as planned. Some of the tech didn’t do what you expected, and wait, what do you mean our OpEx has gone up? Simply said: the promise of containerization or migrating to the Cloud Native ecosystem can be a lie if you don’t do your homework. Sadly, most companies don’t. In this talk, I’ll explain a few gotchas that a “few” enterprises, in the guise of AsgharLabs, hit moving towards the Cloud Native world, and hopefully, you’ll learn from their mistakes, so your trip down this path will be more comfortable and closer to the promise.

Outline
  • Introductions
  • What is AsgharLabs and where they started, what they thought they needed to do
  • Where I came into the conversation to help AsgharLabs
  • Questions you should ask after getting your app containerized
  • Where are the architectural advantages and disadvantages?
  • Are we doubling up on things?
  • Isn’t automation good here? Why is this thing so complicated now?
  • Questions you should ask about the cultural shift that will happen
  • How the economics of the Cloud can differ from your Datacenter
  • What do you mean our support is now Stack Overflow?
  • What do you mean our goal is to move away from the CCB?
  • Some tangible things you can start with to help become more successful
  • Build that pipeline extension
  • Collaborate with other teams
  • Visibility and Monitoring
  • Conclusion and where you can go from here

Summary

  • You can enable your DevOps for reliability with ChaosNative. Create your free account at ChaosNative. JJ Asghar is a developer advocate for IBM Cloud. Hopefully you can hear me and see me.
  • The promise of containerization or migrating to the cloud native ecosystem can be a lie if you don't do your homework. In this talk, I'll explain a few gotchas that a few enterprises, in the guise of a company called AsgharLabs, hit. Hopefully you'll learn from their mistakes.
  • AsgharLabs is just a multinational tech conglomerate with multiple subsidiaries. JJ asked these companies some very simple but very tough questions. Spoiler alert: the results were nowhere near what they thought they could provide.
  • Who containerized your app? Was it developers or the operations team? You need to know who they are and what made them decide to do what they did. Why did you containerize? If you don't actually need to, and you're making money, you shouldn't have to.
  • The promise of containerization is that you should be able to take your app, wrap it up in a container and ship it anywhere. The next step is naturally repackaging to microservices. Is this cloud the best for your company? Or is it something just forced upon you?
  • Now we need to talk about refactoring into the strangler pattern. As you take your different WAR files and break them up, you turn them into little microservices. In turn, you'll be able to roll things out quicker.
  • The goal is to move away from the change control board. Your Kubernetes cluster can actually be scoped to what it needs to be. One pod isn't as good as multiple smaller pods. You're going to have to sit down and really refactor and architect your application.
  • You need to lean more towards pipelines and collaborate to get the different widgets and applications out at the right time. You'll need CI and CD pipelines. You also need to learn to collaborate with the other teams. It is unbelievably powerful when you start really, truly learning how to share, collaborate and move forward.
  • When it came to visibility and monitoring, Nagios wouldn't cut it anymore. There are too many moving parts in the cloud native ecosystem for a single tool to give you that visibility. As long as you can get that graph that goes up and to the right, that'll be some of the best monitoring.
  • How do the economics of the cloud differ from your data center? You can't depreciate anything that is in the cloud. You're going to have to work with your CFO and your accounting teams to make sure you stay inside the budgets. There are some companies that want to leverage fully open source work.
  • When moving to the cloud native ecosystem, be sure you're choosing the right tool for the right job. Look for optimizations instead of features. If you pay up front, you'll get dividends for that later on.
  • Thank you. Again, I'm @jjasghar on Twitter and awesome@ibm.com. I look forward to hopefully seeing you in real life soon. Thanks so much. Bye.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Are you an SRE, a developer, or a quality engineer who wants to tackle the challenge of improving reliability in your DevOps? You can enable your DevOps for reliability with ChaosNative. Create your free account at ChaosNative Litmus Cloud.
Hi, my name is JJ Asghar and I'm a developer advocate for IBM Cloud. Hopefully you can hear me and see me, and thank you so much for having me at Conf42. I look forward to this talk. So let's see if I got this right. We'll go ahead and transition and you should see some slides now. Wonderful. So yes, let us continue. We are here to talk about migrating a monolith to cloud native and some stumbling blocks you may not have heard about. Again, hi, I'm JJ, developer advocate for IBM Cloud. You can reach me at awesome@ibm.com or find me on Twitter at @jjasghar.
So your company has finally decided to move to the cloud native ecosystem. You've landed on containerization as your first step, and you heard all you need to do is containerize your first app, push it to Kubernetes, OpenShift or Nomad, and the cost savings will just come. You've done this and, well, things haven't gone as well as you thought they would. What do you mean our OpEx has gone up? Simply said, the promise of containerization or migrating to the cloud native ecosystem can be a lie if you don't do your homework. Sadly, most companies don't. In this talk, I'll explain a few gotchas that a few enterprises, in the guise of a company called AsgharLabs, hit moving towards the cloud native world. And hopefully you'll learn from their mistakes, so your trip down this path gets you closer to the promise of containerization.
So let's talk about what AsgharLabs is. AsgharLabs is just a multinational tech conglomerate with multiple subsidiaries, also known as a mask for me so that I'm not naming companies outright. So I'll just say AsgharLabs, and you can imagine it's some Fortune company out there.
In all seriousness, it's a collection of different companies that I've run into and the stories that I want to tell, with the companies kept anonymous. And yes, it's a fake company. It doesn't really exist, though the website does. I do actually have some of my test emails come from there, but no, we're not hiring, if you're wondering. It was supposed to be a joke.
So what did AsgharLabs, or what did most of these companies, think they needed to do? They thought they could treat this like their migration from the physical data center or colocation to the VMware ecosystem. Remember back in the day, with the promise of virtualization, when you could say, hey, instead of four physical machines, you could turn them into 16 VMs, with the four physical machines becoming hypervisors of some sort? They thought they could take that same monolithic concept from virtualization and just move it to cloud native. Bare metal to VMs is the same as VMs to containers, right? That's what they thought they needed to do. Spoiler alert: it's not.
So where do I come into this conversation? Normally, it's after a few "successful" migrations that they've had, and honestly, I'd question whether these were successful or not. It's nowhere near what they thought they could provide, or how much they could have gotten for the time, effort and money they spent on it. I came in as the cloud native person and asked some very simple but very tough questions to these companies.
So let's ask some very straightforward questions that are deceptively hard to answer. The first question is, who containerized your app? Was it developers or the operations team? Was there more than a couple of status meetings (cough, AsgharLabs, cough; that's supposed to be a joke) between the project teams? And then who actually shipped it? Did you actually work closely together to make that happen? I mean, was it a completely different team?
Was there a containerization team? I've seen what they call centers of excellence, where they basically had a team of people who were supposed to funnel all this stuff, but they didn't have the same skills that the developers or the operations teams had. So you need to figure out who actually built that container and gave it to you. I mean, if you're building it yourself and it's a small team, sure, that's great. But if you're a massive corporation and you have separation of powers to the extreme, you need to know who they are and what made them decide to do the thing they did.
That leads into the second question: why did you containerize your app? Honestly ask that question. You should have that answer very readily. AsgharLabs' teams containerized because they were told to, for no other reason than some execs said that our core software stack now needs to be next gen. It's the exact same thing I saw at another company where they were told they needed to be on the cloud, not realizing that they were taking on a massive engineering effort to move to the cloud. It was that exact same CIO at a different company, or the same personality type, I should say. You could tell they had read some article saying that we needed to be next gen so we could get the next group of engineers to come and play with our technology. So we need to move things. But why? If you don't actually need to, and you're making money, you shouldn't have to. Anyway, we'll get deeper into that in a minute.
And where did you deploy, or plan on deploying, this containerized app? This stems from a conversation about your choice of cloud. Was it because of some ELA? There are a couple of clouds out there that will give you all you can eat for the first year, and then all of a sudden your costs skyrocket because you didn't realize you didn't cap it right. There are other companies, like IBM Cloud of course, that have some really interesting opportunities for enterprises specifically.
We don't do that, by the way. We actually have really good predictive modeling. But that's a different conversation. Is this cloud the best for your company? Or is this something just forced upon you? Did you do your homework to actually understand that? It turns out Cloud A, Cloud B, and Cloud C all focus on different things. Maybe you should look at all of them, or maybe you should put all your eggs in one basket. It really depends. And you should spend the time to do that work.
And believe it or not, I've actually asked this question to a rather large corporation. I said, so what did you containerize? And they looked at me like I was crazy. And then they're like, oh yeah, we took this WAR file (it might have actually been an EAR file, now that I think about it), we shoved it in a container, and we containerized our app. And I was just like, wait, what? You just took this WAR file and wrapped it in a container and shoved it on Kubernetes, and you're wondering why your app isn't doing what you expect it to do? And they're like, yeah, well, isn't that the whole point of containerization? And I took a moment and I was like, no, there's a lot more here.
So let's take a quick aside and talk about some architectural changes that are required as you move towards this. Yes, the promise of containerization is that you should be able to take your app, wrap it up in a container and ship it anywhere. But there are nuances to this, and I really hope that through this presentation you actually see that it's not just cut and... Let's take, because of course I work at IBM, I need to have WebSphere somewhere in my presentation. That's a joke. But let's take a quick aside and go into these architectures. The first one is what our AsgharLabs company did: a replatform example. They took their legacy application, which was basically just a WAR file, shoved it into a container, in this case from WebSphere, and threw it on OpenShift or Kubernetes.
And that's cool. That's a great first step. Not your final step, your first step, because you need to make sure that your containerization is fine before you start talking about the advantages of it. But if you didn't rearchitect your application, you basically have one big blob of your application, so you don't take advantage of anything that's inside of OpenShift or Kubernetes, which we'll talk about in a moment.
The next step is naturally repackaging to microservices, where you start breaking up your application into a couple of different WAR files. For instance, here you see this EAR file split into two different WARs, and you shove them onto Kubernetes or OpenShift with your MQ sitting there, and your application starts talking a little more intelligently to all the things inside of the Kubernetes cluster. And that's really important. That's the next natural progression. So you've taken your single WAR file, turned it into a slightly more complex application, and put it on OpenShift or Kubernetes. Now you can actually have intercommunication, and you can actually have scaling, which is nice. You can scale out your application layer if you need to, which makes things much more advantageous. You start seeing some real ROI here, because now, instead of having one or two or three machines that run your WebSphere infrastructure, you can actually leverage the cloudiness (and "cloudiness" is trademarked, of course; that's another joke) and scale out how you need. But you're not quite done yet.
Now we need to talk about refactoring into the strangler pattern. And yes, I did say the word refactor. And yes, you are going to have to refactor your application.
So now, as you take your different WAR files and break them up, you turn them into little microservices and actually start giving different functions and features of your application to small containerized microservices inside of Kubernetes or OpenShift. This takes time. This doesn't happen overnight. So where you had that promise of containerization, of shoving that WAR file in and calling it a day, by leveraging microservices you can now start taking complexity out of that WAR file, or out of that Java application; I'm just going to use WAR as the canonical example. You can leverage the scheduler inside of Kubernetes or OpenShift now, so you don't need a scheduler inside your application. If you need to spin out jobs to do other stuff, you can spin out microservices to do that, giving those features their own history and microservice. In turn, you'll be able to roll things out quicker. Because now, instead of doing that big bang release of that WAR file every x number of days or months or sometimes years, you can release that microservice in an intelligent way. But we'll get deeper into that in a moment.
So let's talk about some real, tangible architectural advantages and disadvantages of what we just went over. First of all, as I just said, or at least implied, a moment ago: velocity is probably the most important thing that you get out of this. The ability for services to focus on their own histories, and to scope the clusters to what you actually need, is great. You actually start seeing real ROI. You don't need to have a bunch of machines sitting there in a data center, or VMs on the cloud, not doing anything. Now your Kubernetes cluster can actually be scoped to what it needs to be. One pod isn't as good as multiple smaller pods. It's not like VMs anymore.
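As a rough illustration of handing scheduled work to the cluster instead of an in-app scheduler, the sketch below builds a Kubernetes batch/v1 Job manifest in Python and prints it as JSON. The job name, image and arguments are hypothetical placeholders, not anything from AsgharLabs; only the manifest fields themselves (apiVersion, kind, backoffLimit, restartPolicy) are standard Kubernetes.

```python
import json
from typing import Dict, List


def build_job_manifest(name: str, image: str, args: List[str]) -> Dict:
    """Return a Kubernetes batch/v1 Job manifest as a plain dict.

    Each unit of work becomes its own Job (and therefore its own pod),
    so the cluster scheduler can place it on any node, rather than the
    application spawning another process inside one already-busy pod.
    """
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            # Retry a failed pod at most twice before marking the Job failed.
            "backoffLimit": 2,
            "template": {
                "spec": {
                    # Jobs require Never or OnFailure; Never keeps failed pods for inspection.
                    "restartPolicy": "Never",
                    "containers": [
                        {"name": name, "image": image, "args": args}
                    ],
                }
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical job name, image and arguments, for illustration only.
    manifest = build_job_manifest(
        "nightly-report",
        "registry.example.com/reports:1.0",
        ["--date", "2021-01-01"],
    )
    print(json.dumps(manifest, indent=2))
```

Saved to a file, a manifest like this could be submitted with `kubectl apply -f`, at which point the Kubernetes scheduler, not the application, decides which node runs the work.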
Granted, this does require a higher level of cooperation between your teams, and you'll need to build more advanced integration tests, along with most likely a completely different deployment system and policy system than the one you have, but you get some real, real benefits.
So I didn't realize this when I wrote this talk, but CCB was a term that I thought was commonplace, and it turns out it's not. CCB means change control board. So, JJ, what do you mean our goal is to move away from the change control board? At least in some of the enterprises I've personally worked at, the CCB was literally a meeting (well, not now, but back in the day) of the first-line managers, who would sit in a room, and whenever they had a release, they would literally put a thumbs up in that room to say, yes, we can release it at that time. Well, at that time I was the operations guy, and I had the privilege of waking up at three in the morning on a Saturday to release that code. Needless to say, that was not great. But from the enterprise standpoint, it was great. It was wonderful. Everybody had buy-in; everyone actually gave the thumbs up to say yes, we should release it. But inside the cloud native ecosystem, you can't have that. If you're going to be releasing ten to 15 times, sometimes n times a day, you can't have a room of middle managers with a thumbs up there. So you need to recognize that if the CCB and that type of policy still exist today, then as you move towards this cloud native ecosystem, you need to get rid of them. If you have these meetings, and I know some of you do, you need to be sure that they go by the wayside.
So hey JJ, aren't we doubling up? Like, I have a scheduler already built into my app, or I've got a load balancer already in my data center. This already exists on Kubernetes. So why would I do that when I've already spent all this time and effort to gain knowledge in this space?
Well, you'll need to audit and verify that you aren't actually doubling up on work and technology. You're going to have to sit down and really refactor and architect your application. A great example was that AsgharLabs had a scheduler in their Java stack and also attempted to use Kubernetes to schedule pods. It was kind of weird. It was really, really weird. They spent so much time trying to figure out why, when they scheduled a job to do processing or something like that, it would only ever stay inside that one pod. It was because the scheduler for Java would just spin up another process inside of the pod, right? And they're like, JJ, we have this three node cluster (it was a three node cluster at the time), but only one node is ever actually doing any work. This seems really weird. Why is it doing this? And I started digging into it and I recognized that, oh, it turns out the reason why these nodes are idle is because you're not actually leveraging the scheduler for Kubernetes. So you should spend some time and break out your scheduler so it creates other pods, and you start sharing the load across the whole cluster. It was a true, what's a good word for it, light bulb moment for that cluster and those people. But the beauty of it, well, not really the beauty, more so the challenge, is that they still haven't actually done it, because they didn't realize how hard it was. So arguably they failed at that cloud native migration, and they were like, we have other priorities.
Anyway, long story short, you need to recognize that there is tooling inside of Kubernetes and the cloud native ecosystem, like OpenShift, to handle a lot of this stuff. Take load balancers, for instance. Load balancers exist as a software layer inside of OpenShift and Kubernetes.
And given the way the ingress works, are you really going to need an F5 in front of your Kubernetes cluster or your OpenShift cluster? You have to sit down and really verify and audit what you're doing.
So isn't automation good here? And honestly, why are things so complicated now, right? Like, come on, there's a lot going on here. Well, first of all, of course automation is good here. There are all these moving parts. You're going to need to leverage automation to make the computers do the work for you. Humans are error prone, we all know this, right? I've probably made four mistakes in this talk already, but hopefully you haven't noticed them. Another joke, hopefully. You need to take humans out of the equation. And then honestly, your app has probably always been this complicated. You just now get to see into the complication, if that makes sense. You'll have to visualize the complexity when you start breaking these things out into microservices. No longer do you have two or three enterprise architects who understand how your whole application works. Now you have a bunch of people who take care of a bunch of microservices and understand how they interact with one another. It helps remove tribal knowledge. You'll be able to visualize and start focusing on the different bottlenecks and optimizations that you can gain from having this knowledge. And when you've truly gotten to microservices, you'll be amazed at how much information you can get on how your application is actually running and where optimizations can happen. Not just internal business logic, but external business logic too, where all of a sudden you may discover that there's no need for this external API anymore, because it turns out we can actually do this internally, or whatever. It opens up so many things. Having the shared knowledge is unbelievably powerful.
A great friend of mine said this to me the other day when I was walking through this talk with him, and it really does focus everything down when it comes to your monolithic app and moving to microservices. You had an ornery (I can't say the word; an unprofessional speaker, it's embarrassing) bull mastiff. And now you have 13 yipping chihuahuas. Now take a moment and really, really envision that in your head. You see that big dog. You still have to feed it, you still have to walk it. It barks, and really bad things happen, right? It can take down the postman if need be. Now you've got 13 yipping chihuahuas. With all 13 of them working on it, they might take that postman down, and you're going to have to feed 13 dogs now. But at the same time, it's much easier to deal with one chihuahua while the twelve others are happy, compared to one big dog that you pay all your attention to. It's a really great observation about moving into the cloud native ecosystem.
And on the flip side, if something goes wrong, well, just as Ken stole this quote, I'm stealing it from him. It really hits home: "We replaced our monolith with microservices, so every outage is like a murder mystery." It's true. You're going to have to really learn how to work together as teams to make these things happen. You have to walk through each process and what it did when. More importantly, you have to create standardized logging and standardized APIs between the different services and the teams, so people can understand what actually happened. It's very challenging, and it's something you've really got to spend time and effort on across your whole organization.
So let's talk about some questions you should ask to make sure that the culture shift can happen. I mentioned the CCB earlier, and at AsgharLabs, the CCB became something almost out of The Phoenix Project. At the beginning, no one showed up, or even when they did, they disengaged, and it just became a burden.
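To make the "murder mystery" outages mentioned above a little easier to solve, the standardized logging could be sketched roughly as follows: every service emits JSON lines with the same field names, and a request ID is passed along so one request can be traced across services. The field names (service, request_id) and the two service names are assumptions for illustration, not a format described in the talk.

```python
import json
import logging
import uuid


class JsonFormatter(logging.Formatter):
    """Render every log record as one JSON object with a fixed set of keys."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            # These two attributes are supplied through `extra=` by the caller.
            "service": getattr(record, "service", "unknown"),
            "request_id": getattr(record, "request_id", None),
            "message": record.getMessage(),
        }
        return json.dumps(payload)


def make_logger(service: str) -> logging.Logger:
    """Create a logger whose output uses the shared JSON format."""
    logger = logging.getLogger(service)
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def log_event(logger: logging.Logger, service: str, request_id: str, message: str) -> None:
    """Log a message tagged with the service name and the shared request ID."""
    logger.info(message, extra={"service": service, "request_id": request_id})


if __name__ == "__main__":
    # One request ID, generated at the edge and handed to each service it touches.
    request_id = str(uuid.uuid4())
    log_event(make_logger("orders"), "orders", request_id, "order received")
    log_event(make_logger("billing"), "billing", request_id, "charge attempted")
```

Because both hypothetical services share the same keys, searching the combined logs for a single request_id reconstructs what each one did and in what order.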
Moving to cloud native, you need to start allowing for self-orchestration of rollouts and updates. You need to lean more towards pipelines and collaborate to get the different widgets and applications out at the right time. And I mention the pipeline because you're going to have to build that pipeline along with the cultural shift that's going to happen. You'll need CI and CD pipelines. You need to leverage standards and linting so you can always make sure that your code is up to your standards. One of the reasons why Go is so easy to read is that the gofmt command exists. Go came with a standard out of the box. So at 3:00 a.m., when something goes horribly wrong, the cognitive overhead of reading code and arguing over where a parenthesis goes is no longer there. As an operations person reading code at three in the morning when my PagerDuty goes off, I was never happy to try to figure out why something was there, and I would spend time trying to understand it instead of just reading it like a book. So having that inside your pipeline, having formatting standards that everyone agrees on as the way to write code, and having formatting and linting to enforce this, really does take away a huge amount of issues down the line.
You also need to learn to collaborate with the other teams. One of the hardest things I saw at AsgharLabs was actually dealing with the collaboration between the teams. They had some great propaganda about scrum teams and tribes and whatnot, but still, people wanted to do things the old way. Collaboration isn't just status meetings, it's more than that. It's declaring shared contracts for jobs and responsibilities, with constant communication between teams. Jira tickets can only get you so far. One of the most successful things I ever saw at a subsidiary of AsgharLabs was that every sprint, they switched one person from one tribe to another in the global app.
This allowed for new challenges and new blood for each team. So every two weeks, someone new joins your team, across all the different microservices. All of a sudden, you had to train someone new every two weeks on how to get that feature out. And before you knew it, the amount of on-ramping needed was negligible. People were actually coming together and understanding: oh, it turns out Billy Bob and Jane Doe over there were working on something just like this in another microservice. It created this amazing culture. Granted, it was a massive undertaking to get that off the ground, and there had to be some really high-up agreements about it, but the velocity of that company just skyrocketed. It is unbelievably powerful when you start really, truly learning how to share, collaborate and move forward. It started with contracts and tickets and things like that, but when you actually sit down and work together, it's unbelievably powerful.
AsgharLabs operations thought they could just buy one more product and call it a day. When it came to visibility and monitoring, Nagios wouldn't cut it anymore. And they learned it the hard way. Sometimes you have to put multiple applications in place for visibility, so that portions of your team only see what they care about. That single pane of glass is a great thing to give your marketing people so they can see the line that goes up and to the right. But in all seriousness, when you're actually doing this day in and day out, you're going to need different tooling for different situations. You're going to have to sit down and realize that even though some companies say their product can do everything, you're going to need a lot of different ones. And you're going to need people with expertise in the different technologies too. Sadly, AsgharLabs wanted a single pane of glass. It's unrealistic. There are too many moving parts in the cloud native ecosystem for one tool to give you that visibility.
So you need to work on all of these things. It was a huge cultural shift, but again, as long as you can get that graph that goes up and to the right, that'll be some of the best monitoring. And if you want to ping me later, I can finish that joke off.
So how do the economics of the cloud differ from your data center? OpEx? Yes, and everything can be paid by credit card, which is great. CFOs go back and forth about this, but you need to recognize that CFOs will start wondering why expenses are going up. Some love it because, assuming your team can keep a hold of the costs, you can really predictably understand what your costs are going to be. On the flip side, you can't depreciate anything that is in the cloud, which is a little annoying for some CFOs. You're going to have to work with your CFO and your accounting teams to make sure you stay inside the budgets, as much as operations people, SREs and DevOps professionals don't ever think about budgeting. At least most of them don't. Don't lie. You're going to have to sit down and think about it. So I strongly suggest you build a bridge to your accounting team and respect what's going on there, because it'll only make life easier in the long run.
Hey JJ, what do you mean that all our support is now on Stack Overflow? Well, yeah, okay, you're right. In a few places it's true, especially if you're using open source Kubernetes. There's no company behind it, right? I mean, there are some companies that contribute to the ecosystem, but there's no throat to choke. There are companies that can support you, but when it comes to actual upstream Kubernetes, sorry. Don't get me wrong, if you're running it on a cloud with AKS or EKS, or IKS for that matter, or if you're running OpenShift and you have Red Hat there, you have some throats to choke if something goes wrong. But there are some companies out there that want to leverage fully open source work, and you need to start thinking about it.
When you move to the cloud native ecosystem, building containers with Docker and then shipping those containers out, there are some conversations going on about it right at this moment, and you really need to think about whether it's worth it for your company or not, and work towards that. So keep that in mind.
So let's talk about some tangible things you can start with to move forward. There is a ton of technology and software to help you keep going. The best thing you can do is take a moment and figure out, when you containerized your app, did you really containerize it, or did you just wrap it in a pod and wash your hands of it? Have a long conversation about why you did this. Was it because you didn't want to be left behind? Is there an actual reason for you to move into the cloud native ecosystem? Or is it because you thought you could leverage some other software out there to make your customers happy? There really are a lot of options out there, and you really need to have these conversations.
So let's talk about a quick conclusion here. Ideally, while masking all these corporations, my exposure to AsgharLabs has helped me highlight some of the consistent issues I've found. The best thing you can do is first ask, do you really need to? And if you really are committed, you should take a beat and look for optimizations instead of features. This will drive your teams crazy. It'll drive your executives even crazier, because they're going to ask, why are we stopping and not releasing features? And you'll say, well, we're rearchitecting things. It's going to take some time, and you've got to be reasonable about that. If you pay up front, you'll be able to get dividends for that later on. And if you use the correct tool for the job, you'll get there. As a great friend of mine also said, you wouldn't use a saw if you needed a hammer, and you wouldn't use a hammer when you needed a saw, right? They both can do the same job.
I mean, you can use a saw to hammer in a nail, and you can use a hammer to break a piece of wood apart, but they're not designed to do that. So when moving to the cloud native ecosystem, be sure you're choosing the right tool for the right job, and you'll miss those stumbling blocks. Thank you so much for your time. And let me go ahead and go back to the other little screen here. Thank you. Again, I'm @jjasghar on Twitter and awesome@ibm.com. I look forward to hopefully seeing you in real life soon. Thanks so much. Bye.

JJ Asghar

Developer Advocate @ IBM
