Conf42 DevOps 2023 - Online

Plan CI/CD on the Enterprise level!

Abstract

CI/CD seems to be simple. But let’s take a step back and look at it from a helicopter view. Let’s think about designing CI/CD processes for the project, the team, even the whole company. Let’s go through the “architecture of CI/CD”. What areas should we cover? How do we talk with stakeholders about CI/CD when we design the backbone of a DevOps-driven organization?

Summary

  • Let's talk about CI/CD, but a little differently than we usually do. We will not talk about tools, pipelines, and integrations, at least not about specific ones. So let's get started.
  • Pawel Piwosz is a DevOps Institute ambassador, an AWS Community Builder, and a CD Foundation ambassador. Today we will look at CI/CD from a different perspective, from the helicopter view. We will look at the topic from the organization's perspective, not the pipeline's.
  • The design of a system is a mirror of the communication structure inside the organization. If two components in our application have to communicate very closely and very intensively, the teams responsible for them have to communicate the same way. This leads us to a great book and a great concept called Team Topologies.
  • As DevOps emerged, in a way, from lean, value streams should be in our blood all the time. And last but not least, security. Most often it is the last topic, the last element.
  • Branching strategy generally means what we do with the code in our version control system. If you are not familiar with the branch-per-environment approach, think about Gitflow and multiply it by the number of your environments. Whether we work with TDD or BDD will also influence how we code.
  • Do we use continuous delivery or continuous deployment? What kind of testing strategies do we have? Somewhere in between we have artifact management. Do you use agile or waterfall? What is the approach? Is this approach valid for your CI/CD lifecycle?
  • Your business type can be a constraint. CI/CD for software delivered to millions of cars will probably look different than CI/CD for a new piece of code on a small website. Responsibilities cover everything that prepares us to act quickly and effectively. For DevOps, we talk about how the DevOps part is composed.
  • Security is really important. We can use this mind map to talk about CI/CD at the organization level and to see all the elements that need to be taken into consideration. This is not about ordering all teams to use Jenkins. It is about thinking how we can improve collaboration in terms of tooling.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hello everyone. Let's talk about CI/CD, but a little differently than we usually do. We will not talk about tools, pipelines, and integrations, at least not about specific tools, pipelines, and integrations, how to use them, how to configure them, et cetera. We will do something different. So let's get started. My name is Pawel Piwosz. I am a DevOps Institute ambassador, an AWS Community Builder, and also a CD Foundation ambassador. I work as a Lead Systems Engineer at EPAM Systems.
To begin, let's think about how we usually create our CI/CD. This is, I would say, the most common approach, and I have found it everywhere in discussions with people about CI/CD. We have some code, we know what kind of environments we want to deploy to, and we know some specific tool: Jenkins, Azure DevOps, TeamCity, whatever. Those three elements together create the pipeline for us. We use this tool to deliver this code to these environments. And that's it, right? We consider mainly the environments, as the previous slide showed, we consider the tools, we consider the tests. Sometimes, if we are advanced enough, those of us who are seniors even consider releases, how we do the releases. If you are above even this senior level, you will think about branching, and some kind of Jedi inside the team will think about reverts, because this is also part of CI/CD. Of course, I'm kidding a little bit, but not that much.
All right, so let's think about this from a different perspective, because what can go wrong with this specific way of doing things? Is there anything that should be done a little differently? Today we will look at CI/CD from a different perspective, from the helicopter view. We will step back, all the way into space, and look at it from a completely different point of view.
So we will look at the topic not from the engineer's perspective but from the organization's. We will not look at the pipeline itself, how it is designed or implemented; we will look at the process. And in order to start properly, as a first point, we need to take the organization. Why is that? Because of a couple of things, and this is very important. It is something that we very often, maybe not forget, but don't see; we don't have it in our minds when we do our daily work. So why organization? Because of a couple of elements, and one of the most important is Conway's Law. What is Conway's Law? To explain it in simple terms, I will say that the design of the system, so what we develop and how we develop it, is a mirror of the communication structure inside the organization. What does that mean? It means that if two components in our application have to be very close together, have to communicate with each other very closely and very intensively, and the teams responsible for those components do not talk to each other, it will not work. Those teams need to follow the same kind of communication pattern as the system does.
And Conway's Law leads us to a great book and a great concept called Team Topologies. It is based on the book by two authors, Mr. Skelton and Mr. Pais. They created an approach where we can see all those elements and how they work together. We see here, for example, the platform team. Platform engineering is a hype right now, but this hype comes from a kind of misunderstanding of what the platform team really is. We will not go into details here because we have only 25 minutes left. So let's go further.
What do we need to consider when we design, when we talk about CI/CD? The first element we already discussed, quickly and briefly, but we discussed it: organization. What else? For example, branching strategy.
Delivery strategy, artifact management, SDLC, quality gates, capabilities, constraints, responsibilities, releases, and we are in the middle of the list at this point. Rollbacks. KPIs; yes, I know, all of us love KPIs. DevOps as a practice. Monitoring and measurements; those two things are different. Monitoring is not equal to measurements, and measurements are not part of monitoring. We will go into details a little bit later. What kind of environments we have. Value streams, also a more and more popular element. But honestly speaking, as DevOps emerged in a way from lean, value streams should be in our blood all the time, because the value stream is one of the important parts of lean. What tools we use. And last but not least, security. I always have fun talking about security at the end because, yeah, why not? Something needs to stay in our heads. So most often it is the last topic, the last element.
So let's go through the details. We have environments, and here we can say it's easy, right? We have dev, we have QA, we have pre-prod, release candidate, production. What is the problem here? The problem is easy to spot: we have many environments, and this number can change. We need to understand the ownership of each environment, the structure of each environment, and what kind of access we have to each environment. It may happen that you have full control over the dev and QA environments but no control at all over production, where production is deployed and monitored only by automated structures like CI/CD, infrastructure as code, monitoring platforms, et cetera. So those elements are important to understand. Also, once we know how many environments we have and what the transitions between them are, we are able to design our test strategy through the chain: where specific tests will be executed. Next part, branching strategy.
This is probably the most common, best understood element, definitely more than the environments. Branching strategy generally means what we do with the code in our version control system. This is where CI/CD somehow starts; this is the first point where the code from different developers meets. And we have a couple of strategies: we have trunk-based, we have Gitflow, we have a strategy based on stories or tasks. We can even have branches per environment. If you are not familiar with this approach, please think about Gitflow and multiply it by the number of your environments. And yes, it goes crazy. What is the code structure, what is the repository design, how do we provide access to the code there? For example, how do we design pull requests, how do we design the communication about pull requests, how do we assign approvers to those pull requests? Do we have protected branches? How do we do escalations? Because yes, again, we all love escalations, but let's be honest with each other: an escalation path is needed, because if something happens, and it will happen at some point, we need to know what to do and who to notify. What kind of integrations do we have? Do we use feature toggles or not, and how do we design those feature toggles? Do we do tests? Here I'm not thinking about unit testing as such, but more about approaches like TDD, BDD, et cetera, because this will also somehow influence how we code. And why is this important? Because, for example, if we have a very small codebase, do we want to use Gitflow? Maybe not. If we use a specific tool for our deployments, for example AWS CodeDeploy, we would probably like to go with a specific branching strategy, and those are interconnected somehow. So branching strategy is one end of the stick; the opposite end of the stick is the delivery strategy. And here we need to talk not only about delivery itself, but also about CI, right?
What is the trigger, what is the build process, what kind of steps do we have inside? What are the responsibilities of team members? What is the communication pattern? Do we use continuous delivery or continuous deployment? A small change, in the wording at least, but a really, really big difference. Continuous delivery means that we have some stop inside the process where a manual action can be taken; continuous deployment means that there is no manual action, no stop. You commit the code, and you have it in production. So this also influences other choices. What kind of testing strategies do we have? Do we have manual testing? If we have manual testing, we will never have continuous deployment. Unit, integration, end-to-end, smoke, acceptance, performance: there are a lot of different tests. What are the environments, when do we need to deploy our code, what do we do between stages? What will happen between, for example, QA and release candidate, or release candidate and production? This is also important. What kind of deployment strategies do we have? Blue-green, blue-violet. You have probably heard blue-something-else, some other color. For me it is violet; for me it doesn't make any difference because I have a tendency to be colorblind, so who cares. A/B deployment, canary, rolling, even recreate. And now think about this: we have those two components which talk with each other very closely, and one team deploys using canary and the second deploys using recreate. What can happen? Because something may happen. So how will you deal with that?
All right. Somewhere in between we have artifact management. Here it's important what kind of tools we use, how we create the naming convention, how we version our artifacts, how we inform others about artifacts, what the access management to those artifact storages is. And what is the integration?
I was thinking a lot about that, and I found that the best description for integration in terms of an artifact management system is to say that it can be loosely integrated. For example, you have CodePipeline and CodeDeploy, and you keep your artifacts in S3 storage; this is not part of the CodePipeline service itself. Or it can be highly integrated, like for example in Azure DevOps, where the artifact management system is part of the tool itself. This may also be important, because if you have the loosely integrated kind, there are probably a lot more actions needed, let's call them actions: security, access, et cetera.
SDLC. In other words, do you use agile or waterfall? What is the approach? And is this approach valid for your CI/CD lifecycle? Is CI/CD itself treated as an internal or external product, as part of a platform? This is also important to know.
Quality gates. This is quite simple. What types of quality gates do we have? If we have manual quality gates, we do not have continuous deployment; I said that already. How do we notify, how do we test things in the gates, what kind of metrics do we provide from those tests, what kind of delays can we allow? Some tests in quality gates can be quite long and time-consuming, even resource-consuming. If we have, let's say, a project that has to deliver every five minutes, we cannot allow four-hour-long tests. Simple as that. So understanding the quality gates is also important.
What kind of capabilities do we have? Here we talk about team composition, team maturity, team knowledge, meaning experience, knowledge of tools, et cetera, collaboration between the teams and between the people inside the team, the topology of the organization (we talked about this in terms of organization), people constraints, and what kind of tooling we have.
So those are the elements we need to consider before we even go to the next slide, which is constraints, because honestly, all of those elements can also be your constraints. Your business type can be a constraint. You will probably have a different approach to CI/CD when you own a company that presents pictures of cats on the Internet than when you have, I don't know, a space mission company. When you are responsible for some kind of space mission, you cannot just use exactly the same tooling and the same approach. Or there are different constraints: for example, you need to keep your artifacts, you need to keep backward compatibility, for 20 years, because why not? Or your delivery process is quite heavy: for example, you need to deliver new software to all the IoT devices in cars, and you have millions of cars around the world. Your CI/CD will probably look a little different than for a new piece of code in the PHP script for your site with pictures of cats. What is the time constraint? This is also important. What are the security constraints, tooling constraints, technical constraints? For security, for example, if you are not allowed to use cloud-based environments, you will not go with Travis, CircleCI, et cetera, because you simply cannot.
What are the responsibilities? Here we talk about people, we talk about processes, we talk about contact points, escalation paths, policies, times, procedures, et cetera: everything that prepares us to act quickly and effectively, let's put it this way.
Releases. Please keep this in mind: release doesn't mean deploy. You can have the SDLC constructed in such a way that you have, for example, some kind of committee that accepts or rejects the changes. If you have something like that in your SDLC, it means that you will never have continuous deployment. Simple. Rollbacks.
This is an interesting part. Rollbacks we generally understand like this: we have the code in version X, we have the code in version X plus one, we deploy version X plus one, and something happens. We just come back to version X. And yes, this is correct; this is what we call rolling back. But we also have rolling forward. That's interesting, right? And it sounds a little bit scary: roll back forward. But okay, now imagine you have two microservices, and one is rolled back and the second is rolled forward. You see my point here? But it is not only about this. It is also about RTOs and RPOs, recovery time objectives and recovery point objectives. Are we allowed to have any data loss? Do we need to have zero downtime even for rollbacks? What kind of granularity do we have? Do we roll back everything, or do we roll back just one service or part of a service? What do we do with databases? Those are questions we need to answer, and they should really be answered at the organization level.
KPIs. This topic is, again, quite beloved by us. Generally it is about service level agreements, service level objectives, and service level indicators. If our process allows us to restore something, or perform some action, in four hours, and there is no way to make this shorter, then either we need to use these four hours for the SLA, or we need to change our approach to fulfill a shorter SLA. Simple as that.
DevOps. Here we talk about the composition of the DevOps part, let's say. Do we use what I really hate, the DevOps engineer? Do we have a DevOps engineer in the team? Do we have a separate DevOps team outside the individual teams? Maybe there is no DevOps as a role at all, which is how it should be. But anyway, what do we do with automation? What do we do with cooperation and feedback? How do we treat feedback loops, and how do we work with them?
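One rule comes up several times in the talk: manual testing, a manual quality gate, or an approval committee each rule out continuous deployment. A minimal sketch of that rule follows, assuming a pipeline can be described as a list of steps flagged as manual or automated. The `Step` class, the `delivery_mode` function, and the example step names are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One pipeline step (hypothetical model, not tied to any CI tool)."""
    name: str
    manual: bool = False  # True if a person must act before the pipeline continues


def delivery_mode(steps: list[Step]) -> str:
    """Return 'continuous deployment' only when no step waits for a person."""
    blockers = [step.name for step in steps if step.manual]
    if blockers:
        return "continuous delivery (manual stops: " + ", ".join(blockers) + ")"
    return "continuous deployment"


# Example pipeline with an approval committee, as in the release discussion.
pipeline = [
    Step("build"),
    Step("unit tests"),
    Step("release committee approval", manual=True),
    Step("deploy to production"),
]
print(delivery_mode(pipeline))
```

Running the same check across the pipelines of all teams is one way to see, at the organization level, which teams can realistically aim for continuous deployment and which manual stops stand in the way.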
What kind of measurements do we have in terms of DevOps?
Monitoring. This is quite simple: how we monitor our pipelines, how we monitor the effectiveness of our pipelines, how we monitor the stability of our pipelines. So here we talk about metrics, tools, and responsibilities as well: what kind of events, actions, and alerts we have, and how we handle logs and metrics. This part is technical, whereas measurements are operational. And here we talk about something called, for example, DORA. This is quite a nice set of metrics created to measure DevOps. DORA has four base metrics, which we can split into two categories. One is throughput, and here we have deployment frequency and lead time (cycle time). The second is stability, where we have time to recover and change failure rate. Those are taken from the monitoring data, partially at least, but are treated a little differently.
And the value stream. Generally, if you look at CI/CD in most organizations and map your process, a big part of this process will be taken up by CI/CD. What this really means is that you need to look at CI/CD as a very integrated part of the whole chain of your value stream. So looking for all the waste, all the bottlenecks, problems, et cetera, here will also help you improve the value stream itself.
All right, tools. It's quite simple: what kind of tools we use. But not only that. For example, if you have three teams, and those three teams are using completely different tools, and, what's even worse, those three teams pay the licenses for those three tools, probably a lot of money is wasted. First of all, you have a kind of fragmented common knowledge, because every team is using something else. And also you pay for three different licenses where you could potentially pay for only one. And this brings us back to Conway's Law, to organization, to the setup of everything around communication, et cetera. Security.
Finally. I published, not that recently, a couple of weeks ago, an update to my framework, the framework I'm talking about right now, and there is a lot about security in it; it has become the biggest branch in this framework. Security is really important. We have a couple of aspects here. There is the security of the tool itself, of course, and this is the most obvious one for us. But there is something else inside the chain. We are somehow forced, at least those of us who work for the US government, for example, to generate SBOMs, software bills of materials. As this is already law in the US, it will spread, and more and more clients will expect this kind of document to be generated inside the pipeline. What else? How do we secure the pipeline against any kind of injection, any kind of issue during pipeline execution? How do we deal with all the required dependencies? How many of you check your dependencies in terms of security, and the dependencies of those dependencies? You know what I mean. Especially when we talk about Python, when we talk about Node.js, or JavaScript in a broader perspective, there are tons of dependent libraries which you need to download. How often do you check whether they are secure or not?
All of this brings us to a kind of mind map. We can use this mind map to talk about CI/CD at the organization level and to see all the elements that need to be taken into consideration, brought into focus, improved, et cetera.
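The question about "dependencies of those dependencies" can be made concrete with a small sketch. It assumes the direct dependencies of each package are already known as a plain mapping; the `transitive_dependencies` function and all package names below are hypothetical. A real pipeline would read this information from a lock file or an SBOM rather than from a hand-written dictionary.

```python
from collections import deque


def transitive_dependencies(direct: dict[str, list[str]], root: str) -> set[str]:
    """Collect everything `root` pulls in, directly or indirectly (breadth-first)."""
    seen: set[str] = set()
    queue = deque(direct.get(root, []))
    while queue:
        package = queue.popleft()
        if package in seen:
            continue  # already visited; also protects against dependency cycles
        seen.add(package)
        queue.extend(direct.get(package, []))
    seen.discard(root)  # a cycle could lead back to the root itself
    return seen


# Hypothetical dependency graph: package -> its direct dependencies.
graph = {
    "my-service": ["web-framework", "http-client"],
    "web-framework": ["template-engine", "http-client"],
    "http-client": ["tls-lib"],
}
to_check = transitive_dependencies(graph, "my-service")
print(f"{len(graph['my-service'])} direct, {len(to_check)} total to check:", sorted(to_check))
```

In this made-up graph the service declares two dependencies but actually ships four, which is the gap the speaker is pointing at.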
And why do I say that this is a kind of architecture of CI/CD? It is very simple: on one side we have all of those elements, which we discussed very briefly, but we discussed them; we also have all of the quality attributes of the architecture, all those "-ilities": capacity, scalability, interoperability, usability, security, et cetera. So we can architect the vision from different perspectives; CI/CD itself can look completely different from different points of view. And why would we like to use this approach? Believe me, I have used this a couple of times in a real environment, in real action, let's say, and it works. It works perfectly fine. So why this approach? Because we can architect the vision. We can put all those points together in relation to other points, to align the projects, to align the stakeholders and the business, to build a common understanding of what is possible and what is not possible, to unify the tech stack knowledge, and to unify the people. And please remember, this is not about ordering all teams to use Jenkins. No. It is about thinking how we can improve collaboration, in terms of tooling as well.
I tend to call it a kind of Rosetta Stone of CI/CD. (And I see that I have a typo here.) Because this is a translation point. This is a translation point for the business, for the stakeholders, for engineers, for architects, and also for security. You can bring all those people into one room and talk with them at the same time. This is also what I have done, and it is very valuable to have a broad understanding, at, let's call it, a sufficient level, among all the parties that contribute to your organization's success. Because at the end of the day, it's not the team, or DevOps, or QA, or whatever single team that sells the product; the organization does. So thank you very much. I hope it was useful for you.
If you have any questions, you can reach me on LinkedIn. And if you are interested in this framework, if I have started any thinking process in your minds right now, you can go to www.cicd.run, where you will find this framework and can download the mind map in its most current version. So thank you very much, enjoy the rest of the day, and see you somewhere else very soon. Thank you very much.
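The four DORA metrics named in the talk (deployment frequency and lead time for throughput, time to recover and change failure rate for stability) can be computed from a simple deployment log. The sketch below is only an illustration under stated assumptions: the `Deployment` record and `dora_metrics` function are hypothetical, lead time is taken as commit-to-production time, and time to recover is measured from the failed deployment to the moment service was restored.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass(frozen=True)
class Deployment:
    """One production deployment (hypothetical record format)."""
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a production failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def _hours(delta: timedelta) -> float:
    return delta.total_seconds() / 3600


def dora_metrics(deployments: list[Deployment], period_days: int) -> dict[str, float]:
    """Compute the two throughput and two stability metrics for a period."""
    if not deployments or period_days <= 0:
        raise ValueError("need at least one deployment and a positive period")
    failures = [d for d in deployments if d.failed]
    lead_times = [_hours(d.deployed_at - d.committed_at) for d in deployments]
    recoveries = [_hours(d.restored_at - d.deployed_at)
                  for d in failures if d.restored_at is not None]
    return {
        # Throughput
        "deployments_per_day": len(deployments) / period_days,
        "median_lead_time_hours": median(lead_times),
        # Stability
        "change_failure_rate": len(failures) / len(deployments),
        "median_time_to_recover_hours": median(recoveries) if recoveries else 0.0,
    }


# Made-up data: four deployments over four days, one of them failing.
t0 = datetime(2023, 1, 2, 9, 0)
day, hour = timedelta(days=1), timedelta(hours=1)
log = [
    Deployment(t0, t0 + 2 * hour),
    Deployment(t0 + day, t0 + day + 4 * hour, failed=True,
               restored_at=t0 + day + 5 * hour),
    Deployment(t0 + 2 * day, t0 + 2 * day + 6 * hour),
    Deployment(t0 + 3 * day, t0 + 3 * day + 4 * hour),
]
metrics = dora_metrics(log, period_days=4)
print(metrics)
```

The input fields mostly come from monitoring and pipeline data, which matches the point in the talk that measurements draw on monitoring but are treated as an operational view rather than a technical one.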

Pawel Piwosz

Lead Systems Engineer @ EPAM Systems



