Conf42 DevSecOps 2022 - Online

Setting up and managing GitHub actions for multiple projects in an organization

Abstract

With over 56 million developers using GitHub globally, Git and GitHub are becoming quintessential version control tools. Aside from version control, they provide continuous integration and deployment features that help ensure a project's quality and continuity, which we will be exploring in this talk.

Summary

  • Good day. Welcome to the session on getting started with GitHub actions and some best practices along the way. The prerequisites are familiarity with Git and the concepts of commits, branches, and remote versus local repositories.
  • GitHub actions can be described as the automation platform on GitHub. These actions run on GitHub-hosted runners, which are virtual machines. Once an action is triggered, it runs the commands you specified, and its result can be used to enforce status checks. This is the phase that distinguishes CI pipelines from other sorts of pipelines.
  • For each organization you can create a special repository called the .github repository. This repository houses templates for anything to do with the organization, for example workflow or GitHub action related templates. The default branch variable can only be used in template actions that are specified within the .github repository.
  • A GitHub workflow within a repository called report generates reports on the organization. You can specify that the action be triggered periodically or at a specified time. A cron expression specified as a schedule ensures that this action runs once every day.
  • Secrets or credentials needed to trigger API calls or do some work can be configured in GitHub secrets. You need to be wary whenever you run GitHub actions on pull requests or changes that come from forked repositories. When you go to settings, the Actions tab lets you configure only a restricted number of things.
  • Three links that you can use to set up GitHub actions for your repositories, and to look into using it for other programming languages such as Go, JavaScript, and so on. Feel free to refer to them, and feel free to post any questions or comments, or reach out to me via LinkedIn.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Good day. Welcome to the session on getting started with GitHub actions and some best practices along the way. I'm Ranjan Mohan, a senior security engineer at Menlo Security. A couple of prerequisites for this presentation: you should be familiar with Git and the concepts of commits, branches, and remote versus local repositories. Familiarity with any remote Git-based service such as GitHub, Bitbucket or GitLab would also be useful.
Diving right into the session: what are GitHub actions? In an oversimplified sentence, GitHub actions can be described as the automation platform on GitHub. You could use it to perform any actions related to Git repositories hosted on GitHub. The way these actions function can be split broadly into three phases. The first is the trigger, the event that causes the action to start or run. It could be anything from changing a branch to opening or managing a pull request, or even scheduling a cron expression to run the action periodically. These actions run on GitHub-hosted runners, which are basically virtual machines that GitHub hosts in its environment. At the moment, three operating systems are supported by GitHub runners: macOS, Windows and Ubuntu. Once the action is triggered, it actually runs: it executes the commands and code that you have set to run as a part of the action. You define an action in YAML format using the syntax specified by GitHub. And once the action runs, you could use it to enforce status checks. This is the phase which distinguishes CI pipelines from other sorts of pipelines.
In the case of continuous integration pipelines, where you want to ensure code quality checks or even security scan checks, you could have GitHub actions report a pass or failure status and use that to gate PR merges or changes to a particular branch, so that you have a baseline code quality check for all changes that go into your repository.
Now, when we talk about maintaining GitHub actions, I have a few questions that help determine where and how you should maintain them. Do you reuse the same action for multiple repositories within the same GitHub organization? If the answer is yes, then you would want to store the action in the .github repository of that organization so that it's easy to set it up in the remaining repositories of the same organization. The only problem this solves is the ease of setting up the action in other repositories. There is still a problem it doesn't address: if the original action file, the YAML file, changes, you would need to manually propagate the changes to all the repositories using the action. The second question is: does your action need access to secrets, whether from GitHub secrets or any other secret storage utility that you have? If yes, please ensure that you do not trigger any of these actions from untrusted forks. I will explain and also show you why.
Now, moving on to GitHub: this is an organization that my friends and I have been working on for the past three years. It's called padao. It contains an array of Python and Java libraries, and a few JavaScript libraries, that you could use for your day-to-day programming. As you can see, there are 30 repositories that are a part of this organization at the moment. For each organization you could create a special repository called the .github repository, which houses templates for anything to do with the organization. For example, you could host workflow or GitHub action related templates over here.
You could host the readme for the organization within the profile folder, and you could also store issue or pull request templates for all repositories in the organization within the .github subfolder. Now let's focus on the workflow templates part of it. As you can see, there are three actions within this folder: the Maven build YAML, the Maven package publish YAML, and the Python build YAML. We're going to look into the Maven build YAML as an example action. As you can see, there are three files associated with this action. One is the properties JSON, which basically contains the metadata for that particular action: the name, the description, and so on. Then you have the scalable vector graphics (SVG) file, which is the icon for the action, and the YAML file, which actually contains the steps that need to be run as a part of the action.
Let's take a look at the JSON file. You can see it has some standard metadata fields such as name, description, icon name and categories. One interesting field over here is called file patterns, which can be used to suggest this action for any repository in your organization based on any file patterns that are matched. For example, any repository in my organization that contains a pom.xml file will be suggested this Maven build, clean, test and verify action in the Actions tab. We'll be looking into it shortly.
Now, going back to the workflow templates folder, let's look at the action and see what its different parts are. We start out by specifying the name of the action, and this on section over here is the trigger section. This is where you specify what triggers this particular action or workflow. For this particular action, pushing to the default branch or opening a pull request against the default branch are the triggers.
As you can see, we're using a variable called $default-branch, which we can only use in template actions that are specified within the .github repository. When you set up this action in any of the other repositories within this organization, it gets replaced by the default branch of the repository you're setting it up for. Now, moving on to jobs. We have created a job called build, and we specify that it runs on a set of operating systems: Ubuntu, Windows and macOS. I've specified the Java value, as in the JDK version, as 14.0.1. The steps of this particular build job are as follows. It starts by using a preexisting action on GitHub called checkout, which checks out the current repository. Then I've specified a git command to check out any submodules if they are present in the repository. Then I'm setting up the JDK using another prebuilt action on GitHub. After that, I'm using this step to cache any Maven-related dependencies that I have pulled as a part of setting up this project. For example, if my Java project has, say, 15 dependencies which it needs to pull every time it runs or tests the project, then caching them lets any other run of this GitHub action reuse the cache and save some time on setting up the dependencies. That is the advantage of this particular step. Then we do a linting check using a tool called Checkstyle, and a static code analysis check using a tool called PMD. Then I run all the Java unit tests using JUnit, and I use JaCoCo to enforce and verify code coverage. So as you can see, this YAML file specifies the set of steps that I would like to run as a part of this action, and it is triggered whenever someone pushes to the default branch directly or opens a pull request against it.
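The workflow template walked through above could be sketched roughly as follows. This is a minimal illustration, not the speaker's actual file: the workflow name, step names, action versions, JDK distribution, cache key, and Maven goals are all assumptions, and the Checkstyle, PMD and JaCoCo plugins are assumed to be configured in the project's pom.xml.

```yaml
# Sketch of a Maven build workflow template stored in an organization's
# .github repository. Everything not named in the talk is an assumption.
name: Maven Build

on:
  push:
    # $default-branch is only valid in workflow templates; GitHub replaces
    # it with the target repository's default branch at setup time.
    branches: [ $default-branch ]
  pull_request:
    branches: [ $default-branch ]

jobs:
  build:
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        os: [ubuntu-latest, windows-latest, macos-latest]
        java: ['14.0.1']
    steps:
      - name: Check out the repository
        uses: actions/checkout@v4

      - name: Check out submodules, if any
        run: git submodule update --init --recursive

      - name: Set up the JDK
        uses: actions/setup-java@v4
        with:
          # Assumed distribution; the talk does not say which one is used,
          # and not every distribution ships every version on every OS.
          distribution: zulu
          java-version: ${{ matrix.java }}

      - name: Cache Maven dependencies
        uses: actions/cache@v4
        with:
          path: ~/.m2/repository
          # Assumed key scheme: reuse the cache until a pom.xml changes.
          key: ${{ runner.os }}-maven-${{ hashFiles('**/pom.xml') }}
          restore-keys: |
            ${{ runner.os }}-maven-

      - name: Lint with Checkstyle
        run: mvn -B checkstyle:check

      - name: Static analysis with PMD
        run: mvn -B pmd:check

      - name: Run unit tests and verify coverage
        # Assumes JaCoCo's coverage check is bound to the verify phase.
        run: mvn -B verify
```

Each operating system in the matrix gets its own run of the build job, so a failure on one platform shows up as a separate failed status check.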
Now that we have gone through one example, let's look at another example of a GitHub workflow, within a repository called report. This is a repository that I personally use to generate reports on the organization. It generates the report as a readme within the same repository. When we look at this particular action, you can see that the trigger points are not only pushes or pull requests to the main branch, which happens to be the default branch for this repository. There is also a cron expression specified as a schedule, which ensures that this action runs once every day. That's the advantage of using a cron expression as a schedule: you could specify that the action be triggered at a specified time every single day, or on some other periodic basis. Another thing that I've specified as a part of the triggers for this action is something known as workflow dispatch, and what this lets me do is manually trigger this action. In general, whenever you just use pull request, push, or even schedule, there wouldn't be a facility for you to manually trigger the action. But since I've specified workflow dispatch, that is possible.
How we go about doing that is: we go to the Actions tab, we click on the action, and then we see this breadcrumb over here which says it has a workflow dispatch event trigger, and it gives you an option to run this particular action on any branch within this repository. So I can select my branch as main and click on Run workflow, and we should be able to see the job shortly. The action has been queued, and when I click on it, it's starting to set up the job. It's running all the steps within the action file: installing all the dependencies, which is quick because I have already cached them and I'm just retrieving them, and then running the report.
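For reference, the trigger section described for the report workflow might look roughly like this. It is a sketch under assumptions: the talk only says the action runs once every day, so the time of day in the cron expression is hypothetical.

```yaml
# Sketch of the triggers for a daily report workflow.
on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]
  schedule:
    # Hypothetical time of day: 06:00 UTC, once every day.
    - cron: '0 6 * * *'
  # Adds the manual "Run workflow" option in the Actions tab.
  workflow_dispatch:
```

Scheduled workflows are evaluated in UTC, which is worth keeping in mind when picking the hour.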
After it runs the report, it commits and pushes the report to the same repository, which you'll be able to see, and it caches the current dependencies so they can be reused later, and so on. Perfect, looks like the job has completed successfully. Now if I go to the code and look at the readme: nice. This happens to be the latest report from today, the 26th of November at 06:56 a.m. UTC, which is the current time. This report generation action also runs periodically, once every day, and it basically displays all the repositories in this organization along with certain information: whether each is maintained or not, whether it is publicly exposed or a private repo, how many open issues there are, and so on. So feel free to use this template to generate your own reports using GitHub actions.
Now, coming back to the organization page, one other aspect of configuring settings for this organization is a section known as secrets. Whenever you run any actions, whether for CI-related goals or any other purpose, you might require some sort of secrets or credentials in order to trigger API calls or do some work. All those custom secrets that you need, you would configure in GitHub secrets and then fetch in your action. The way this works is that whenever you configure something in the secrets section of the organization, it gets set as an environment variable within the runner running the action. So you could just ask your script to fetch the environment variable corresponding to the secret and use it. But as much as this prevents us from checking the secret into our repository, it also poses one small issue. Let's take a look at this particular PR. This particular PR is merged into the main branch of one of the repositories within the organization.
We're looking at the jpopper repository, and the changes are coming from a user called SIl s one 10, who is bringing the changes from their fork, from a branch called issue 33. Since they control the code that they write in issue 33, they could also modify the GitHub action to read the secrets and expose them via an API call to their own servers, or just send them externally to any other public-facing service that they own. That could lead to compromising secrets or any other sensitive information that you retrieve using GitHub secrets in your action. So you need to be wary whenever you run GitHub actions on pull requests or changes that come from forked repositories.
How we go about configuring this is interesting. When you go to settings, the Actions tab lets you configure only a restricted number of things. It lets you say, okay, you can only allow the actions and reusable workflows in this organization, or also actions that are created by GitHub or by verified creators. This helps to an extent, but it still doesn't prevent people with forked repositories from making changes and running that action as a part of a new pull request. So we go to the organization page and the settings section, and we go to Actions. Over here we see a section called Fork pull request workflows from outside collaborators, with three different settings: require approval for first-time contributors who are new to GitHub; require approval for first-time contributors, meaning those contributing for the first time to any repository within this organization; and require approval for all outside collaborators. Ideally, you would want to require approval for all outside collaborators, so that only once a person has approved the pull request or the action request will the workflow run for their changes. Another interesting setting over here is workflow permissions.
For example, in the case of the report repository, I need read and write permissions for the GitHub action because it writes the report back to the repository. But in case you just need to build and don't need to write anything back to the repository, always, always keep read-only permissions as the default. Okay, perfect. Now that we have looked at GitHub secrets, how they're used, and what not to do when dealing with pull requests from forks, let's get back to the presentation.
As a follow-up, I have specified three links that you could use not only to set up GitHub actions for your repositories, but also to look into using it for many other programming languages such as Go, JavaScript, and so on. The first link has information on setting up starter actions for a variety of programming languages on GitHub, and the second link is to the organization report that I generate. You could always fork it and modify it as per your organization's needs. I've also specified the workflow templates for Java and Python repositories that we use in our organization. I'll include all these links in the description section as well. Feel free to refer to them, and feel free to post any questions or comments, or reach out to me via LinkedIn. I'd be more than happy to connect, answer, and help in any way. Thank you for your time. Hope to see you now in the presentation.

Ranjan Mohan

Senior Security / Software Engineer @ Menlo Security

Ranjan Mohan's LinkedIn account


