Conf42 Site Reliability Engineering 2022 - Online

A One Woman Show of Migrating an Entire R&D SCM From Bitbucket to GitLab


Abstract

Writing code is something that we learned. Managing a project end to end (E2E)? Probably not that much.

In this talk, I'll share my journey of migrating the entire R&D codebase from Bitbucket to GitLab on my own, but with the great help of people along the way: planning, implementation, and handoffs.

I’ll share best practices for managing a technical project with a lot of takeaways you could adopt so your project will be handled smoothly and successfully.

Summary

  • In this talk I'll show my journey of migrating an entire R&D codebase from Bitbucket Cloud to self-hosted GitLab on my own. I'll share best practices for managing a technical project, with a lot of takeaways that you could adopt.
  • Hila Fish is a senior DevOps engineer with 15 years of experience in the tech industry. She also helps organize conferences in Israel, DevOpsDays Tel Aviv and the StatsCraft monitoring conference, and she's a lead singer in a cover band.
  • In total, from planning through implementation and handoffs, the project took one month and a half to complete. I'm going to drill down into every aspect of this structure during this presentation.
  • Before planning, ask about the deadline, the reason for the migration (for example on premises to cloud, or SaaS to self-hosted) and any limitations. Management wants milestones and due dates, so the high-level plan was converted into actionable, measurable milestones.
  • The milestones ran from Terraform code through networking, GitLab up and running, a first repo migration, backups and monitoring, to a second repo migration and pipeline. Everything that progressed towards the actual usage of GitLab was done in the continuous migration phase, and the last milestone is that the migration is fully done.
  • The most important thing in communication is hearing what isn't said. Information, when delivered properly without overburdening the recipient with details, can help ease the decision-making process. An ongoing project takes patience: patience when planned things go wrong and when you wait for other teams.
  • GitLab was totally new to me. I didn't know GitLab at all before that project. I really had to rely on documentation a lot, since my tight schedule didn't really allow me enough time to play with the system. GitLab changed their official documentation due to an issue we raised during the implementation.
  • GitLab was a new tool for the R&D department. They didn't know GitLab at all. I wanted to give them a sense of support and have them know that for anything they need, we are here for them. I also created a dedicated Slack channel for them to ask questions and raise bugs.
  • Planning is a must, especially for a long-term project. Collaboration matters. Documentation really is key. Change is hard when people are used to working in a certain way. You need to take that into consideration when you plan your project.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hi everyone, thank you so much for joining me. Writing code is something that we learn, right? But managing a project end to end, probably not that much. In this talk I'll show my journey of migrating an entire R&D codebase from Bitbucket Cloud to self-hosted GitLab on my own, but with the help of people along the way, of course: planning, implementation and handoffs. I'll share best practices for managing a technical project, with a lot of takeaways that you could adopt so your project will be handled smoothly and successfully.

So first of all, hi, my name is Hila Fish. I'm a senior DevOps engineer and I work for Wix. I have 15 years of experience in the tech industry, which really allowed me to see that the DevOps culture is what helps companies achieve great things. I also help organize conferences in Israel, DevOpsDays Tel Aviv and the StatsCraft monitoring conference. I'm a mentor in courses and communities for women in tech and other communities. And I'm a lead singer in a cover band, as you can see in this picture, which is a lot of fun.

Okay, so this project, in total, from planning to implementation and handoffs and the whole works, took one month and a half to complete. Is it a lot? Yes, no, you tell me after seeing this presentation. So the project structure looks like this. First, I started out with planning: what to think about, things to consider, foreseeing bottlenecks and tackling them upfront as much as possible, and of course deadlines, et cetera. Then the implementation phase: making sure that GitLab is up and running, and also integrations for doing stuff with GitLab, Jenkins and Jira, and other aspects like security, backups and restores, handling expected blockers, and also leaving room for the unexpected, which is very, very important. Next up, I continued to the training for the R&D department, because this tool was completely new to them.
So I needed to make sure that they get familiar with working with GitLab, because I didn't want to spring it on them, and I wanted them to be sure they know what they're doing before the actual migration. So this step was pretty important to make sure that they feel comfortable with it. And then documentation and handoffs. I always leave documentation behind with anything that I do, but in this case it was extra crucial because, fun fact, I left the company that I did this project for shortly after finishing the project. Of course, before the project even started, I raised the flag that I'm planning to leave, and they said that they want me to do it anyway. So that's why I knew that I have to make sure that the documentation is flawless, in my opinion. Of course I can't really tell, but I really wanted to make sure that I leave everything behind in a way that they can handle GitLab and not feel helpless, right? Because I don't want them to feel like, oh, Hila left, now what are we going to do, right? So this phase was very important. I'm going to drill down into every aspect of this structure during this presentation, so we will talk about it a little bit later. And hovering above everything were the ongoing project statuses, because I have managers, and the managers want to see what the deadlines are and really follow up on what's going on. So throughout the project I pushed notifications about anything that happens, blockers and stuff like that. We will speak about it also later on.

Okay, so I mentioned planning, but actually there was a pre-planning phase. So when I got the task, I first started asking questions to understand the scope of the project. So what is my deadline? If I know the deadline, I can organize my time and set priorities accordingly. Why are we doing this migration? Because there are several reasons for doing a migration. We want to move from on premises to cloud.
We want to move from a SaaS solution to a self-hosted one, or any other internal reason. So this question is very important, because I want to make sure the project process and outcome match the management's needs and requirements. And if I don't know why we're doing it, I wouldn't know if what I'm trying to achieve, and the way I'm achieving it, really works. So that's why I asked this question and then planned accordingly. And any limitations: do we need to do things in a certain way? For example, GitLab has two offerings: Omnibus, which is the VM installation, and the Helm release for the Kubernetes deployment. So if I knew, for example, that my company didn't utilize Kubernetes at all, then I assume I would go with the VM installation, because I wouldn't want to introduce an entirely new technology along with a new tool for SCM, right? Two changes at once. So I had, not really concerns, but I really wanted to make sure that if we have any limitations, I know about them upfront so I can plan accordingly.

And speaking of planning accordingly, a goal without a plan is just a wish, right? So we really need to make sure that we plan ahead so we know what we are going to execute. So once I felt like I had enough information to go on, I created a high-level plan. So this was the plan, and it's nice, right? But a plan is great for me, the executor of the plan, while management wants milestones and due dates. This is how they can measure progress and convey status in a clear manner. So that's why I've taken the plan that you can see here and converted it to be actionable and measurable by setting up milestones and due dates. So let's go over each one.

Terraform code. This was the first step, because it was the, let's say, prerequisite for anything else: Terraform code to create the project for GitLab. I say project; we had two, because we had dev and production.
I've deployed GitLab on the production environment for the actual source control management, and also deployed it on the dev environment for two purposes: one, to try out upgrades on dev first to see that nothing breaks, and two, to let dev personnel play with it whenever they want, to get familiar with the tool and also to work on their own mini dev projects before taking them to the production instance. So during the migration phase they could play with the tool and get familiar with it, but also, if I upgrade to version X and this version brings some new features with it, they can play with the new features on dev and then come knowledgeable to the prod instance. So everything that was related to the GitLab deployment was done in this phase. Terraform was utilized to create the projects and also the GitLab deployment, and everything was covered in this phase because it was the prerequisite.

Next up we had the networking troubleshooting. Since networking was handled by a different team, I automatically added a dedicated slot for troubleshooting, since it involves other people, a.k.a. potential bottlenecks, right? It doesn't mean that the network troubleshooting was restricted only to this phase. I had to troubleshoot issues throughout the project, but I tried to map every source and destination I could and have it all checked in this phase.

Then the next milestone is GitLab up and running. We want to make sure that everything is up and running and we can use GitLab. This phase included a DNS name, a certificate and the SAML integration.

Then the next milestone was a first bare repo migration. I used the GitLab migration feature that allowed me to import repositories from Bitbucket to GitLab. I had some issues with it, opened a ticket to support and stuff like that, so it took some time. Not a lot, but still.
And this milestone is really about getting to the point where the repo is migrated to GitLab and I can pull from it, push to it, and do the whole thing related to repository management.

The next milestone is the peripheral infra: backups and monitoring. So I made sure that I can back up everything. And when I say everything, I really mean everything: full system backup, repo backup and branch backup. And not only backups. I also tested out the restores, full system, repo and branch, because I wanted to make sure that if something happens, they know what to do. At that time the restores weren't really covered as well as they could be in the GitLab documentation. So I created a step-by-step procedure for how to do the restores, because when something bad happens we want to know exactly what to do, since it's stressful as it is. So the documentation that I created later on also included the restore procedure. And I also dealt with the monitoring here: system monitoring and application monitoring, the GitLab metrics. And in this phase I tried to close any loose ends regarding networking issues that I had until now, with SSH and HTTPS tested from several sources and stuff like that.

The next milestone was a second repo migration and pipeline. So we had repositories that you just need to work with as is. But we had repositories that Jenkins did stuff on, right? Pulled them and then built Docker images from files there, right? So I wanted to make sure that this whole ordeal works. So in this phase I set up the Jenkins integration and the configurations needed on the Jenkins side and the GitLab side. I could talk about that alone in a separate talk, because there were several settings that needed to be done. So in this phase I really just wanted to make sure that the pipeline works and is tested fully against a GitLab repository.

The next milestone was continuous migration, because in parallel to everything that I'm going to say here, the migration continued.
So I divided the repositories by team, and it was coordinated with each team when their repositories would get migrated. But in parallel I also did other things, like training for my team, which needed to maintain GitLab, but also for the R&D, which needs to use it. So I gave them an initial intro on main features and stuff like that, and the gradual login. The implementation that we used was a SAML integration where, when a user performs an initial login, the user entity gets created on GitLab. So the training was another way to tell everyone who was on the call and in the meeting: hey, please log into the system, play with it and stuff like that, so your user will get created. And then, once the user got created, I could set up the permissions based on the folder structure. I will mention it also later on. So everything that was really progressing towards the actual usage of GitLab was done in this phase.

And the last milestone is that the migration is fully done. I finished working on the documentation and handoffs and the whole works. And I can tell you, under the migration done section, that the backend repo was the biggest one and its import took time. I tested it out on the dev instance and I saw that the import took several hours. So I knew that I don't want to disrupt the R&D in that way and paralyze them for several hours, because whenever I did the migration of each repository I told them: hey, in order to avoid sync issues, you can't push to Bitbucket while I'm doing the migration. So since this repository took several hours, I knew that I want to make sure that they can work seamlessly. That's why I did this migration on a weekend, and the weekend in Israel is Friday and Saturday, so it was a happy Sunday after. Yeah, that's basically it. And again, finishing documentation and handoffs and all this stuff. So this was the entire plan.
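The per-team scheduling described above, with long imports pushed to a weekend because pushes to Bitbucket are frozen during an import, could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Repo` fields, the two-hour threshold and the sample repository and team names are hypothetical and do not come from the talk.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Repo:
    name: str
    team: str
    est_import_hours: float  # assumed estimate, e.g. measured on a dev instance


def plan_migration(repos, weekend_threshold_hours=2.0):
    """Group repos by owning team and flag long imports for a weekend slot.

    Pushes to the old remote are frozen while a repo is imported, so
    imports at or above the threshold are kept out of working hours.
    """
    batches = defaultdict(list)
    weekend = []
    for repo in sorted(repos, key=lambda r: (r.team, r.name)):
        if repo.est_import_hours >= weekend_threshold_hours:
            weekend.append(repo.name)
        else:
            batches[repo.team].append(repo.name)
    return dict(batches), weekend


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    repos = [
        Repo("backend", "platform", 5.0),
        Repo("deploy-scripts", "platform", 0.2),
        Repo("web-app", "frontend", 0.5),
    ]
    batches, weekend = plan_migration(repos)
    print("Per-team batches:", batches)
    print("Weekend imports:", weekend)
```

Each per-team batch can then be coordinated with its team leader, while the weekend list is handled separately.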
I also added some stuff that I learned later on, after the fact, but this was the plan, and this is what I detailed, explained and showcased to my managers before I actually started to work on this project.

So for everything related to the planning, I created a planning doc that consists of the milestones that I just showed you and extra sections. I decided to put them all in this planning doc because I wanted to have one place that has all aspects referenced. So, first steps: a mapping of needed steps, both on my side and on other teams, so I can lay it all out and see what can be done in parallel and what should be done ASAP to reduce bottlenecks later on. The to-do section was for out-of-scope, high-level tasks, things to do and remember after this project is finished. Read more: I read a lot of GitLab documentation and I saw some stuff that we should dwell on later and get back to, because it's not important right now, but it is important in general. So these kinds of things I wrote under the read more section. And things to think about: considerations for annual audits and stuff like that.

And some more things to add to the planning doc, because why not? Some addendums that I added were reported issues that I've encountered, both on the GitLab side and internal issues. I of course listed them in separate Jira tickets, but again, it's a centralized place to see any issues that could potentially endanger the project's timeline. So that's why I put it here for my management to view. And I also added some more appendices. Access mapping: inbound and outbound accessibility of the current Bitbucket hosting, so we can make sure they are open for the self-hosted GitLab offering as well. I marked what was open, so it was both a place to check for status and an audit of what is needed.
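An access mapping like the one described above could be re-checked with a small script. The sketch below is an assumption about how that might look, not what was used in the project: the host name and ports are placeholders, and a TCP connect only verifies reachability from the machine running the script, so it would need to run from each source in the mapping.

```python
import socket


def tcp_probe(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_access(endpoints, probe=tcp_probe):
    """Probe each (label, host, port) entry and return {label: status}."""
    return {
        label: "open" if probe(host, port) else "blocked"
        for label, host, port in endpoints
    }


if __name__ == "__main__":
    # Hypothetical destinations; replace with the real access mapping.
    endpoints = [
        ("gitlab-https", "gitlab.example.com", 443),
        ("gitlab-ssh", "gitlab.example.com", 22),
    ]
    for label, status in check_access(endpoints).items():
        print(f"{label}: {status}")
```

The `probe` parameter is injectable so the mapping logic can be exercised without touching the network.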
I later added this access mapping to the GitLab documentation that I created, but during the project it was a good thing to have, especially for keeping track of what still needed to be done. The LDAP groups defined in the SAML offering to allow access to the self-hosted GitLab were also mentioned here, and the main repositories and CI pipelines, to track progress of what I had already migrated and what not. This was something that was added by my manager. He felt the need to add it, so to ease his mind, I went along and marked completed on those that were migrated.

Okay, so we talked about the planning, right, which is extensive as it is. But I think that we really need to take a minute to talk about another important aspect of managing a project, which is communication. So as you can see here, the most important thing in communication is hearing what isn't said. Since I worked on this project alone, it was crucial to update my managers on the status so they can report back to their managers, and to show them that I'm progressing according to the needed timelines. So I had weekly meetings to update my management on the progress, but I also set up ad hoc meetings with updates on blockers: either issues where I needed their attention and help to escalate, or just an FYI, this is how I'm handling it, if you're okay with it, cool. My managers and I created Jira tickets for really everything, and I wrote detailed updates and the current status on each one. I created more tickets if there were more things to do along the way, and also tickets for other teams to do things that I needed, like set up networking and provide us signed certificates and stuff like that, so they can run with it in parallel and we can reduce bottlenecks. The other thing that I'm saying here is basically to convey the message that information is power, right?
And information, when delivered properly, without overburdening the recipient with details, which is important, can help ease the decision-making process, deliver a feeling of stability and allow your ongoing independence in running the project, which is very, very important, especially when you are working on a project alone. And an ongoing project takes patience: patience when planned things go wrong and when you wait for other teams to do their work. So we need this patience, but we also need to think: hey, this takes too long, maybe we need to escalate. So this is another aspect of communication that I want to mention here. And last but not least, I had a meeting with the R&D team leaders to prepare them for the migration and explain what is going to happen, plus involve them in the folder structure decision. Because this is how I divided the repositories: I created a folder structure, each folder had the repositories related to that team, and then the permissions were set up at the folder level. So since the R&D team leaders are the code owners, I wanted to involve them in the folder structure decision, as I wanted their input, but also, and this is something that is very important, I wanted to get them on board, because I didn't want to come from the outside and say: hey, this is what I'm going to do and that's that. No, I want them to feel that they are with me on this, right? So that's why I involved them in the decision, and basically wanted them to be on board and feel that they have a say, which is very important as well.

Okay, so let's talk a bit about the implementation. So we decided to go with the Kubernetes GitLab deployment, and I used their Helm chart, which you can see here. We use Terraform for everything, for the infrastructure and deployment, so I had to incorporate their Helm chart in our Terraform codebase. And GitLab was totally new to me.
I didn't know GitLab at all before that project, so I really had to rely on documentation a lot, since my tight schedule didn't really allow me enough time to play with the system and do some trial and error. So documentation was very important in this aspect. But also, since I had a tight schedule, I knew that I had to exercise some judgment and say: okay, this is a must-read right now, but this could wait. And that's why I put it in the read more section in the planning doc. So documentation is very important, and the documentation was what allowed me to choose some aspects of the implementation as well. So only because of the documentation I decided to use DB version X, which was not the default DB version in the chart, to prevent later maintenance, because they wrote that using that version will allow me to upgrade to version 14, which we used. And it also supports backward compatibility, which is awesome for me. So only because I read the documentation I knew that this is possible. So that's why I did it.

Another aspect is that I haven't implemented the high availability feature, because it wasn't GA yet; it wasn't in a general availability state. So I toggled it in the Terraform code. I set it to false, but I wrote a comment saying: please check it later on. Because as SREs we know that high availability is very important, especially for something crucial like managing a codebase. So I just added a comment saying: please check it out later on, and once it is GA, please enable this feature. And why didn't I enable it before? Because I didn't want to rock the boat too much and introduce instability to the environment.

About monitoring: monitoring wasn't really baked into their Kubernetes offering, so I basically did it myself. I went through the metrics documentation and added a dashboard according to what I thought was important, and alerts based on those metrics.
And also, fun fact, GitLab changed their official documentation due to an issue we raised during the implementation. The documentation said to set up limits A, B, C and stuff like that, and that's that. And then I set up the limits and we had some issues. So that's why they changed the documentation and wrote something like: at your own discretion, and you need to test it out first, and stuff like that. So it was nice to know that we had an impact on the official GitLab documentation.

Okay, let's talk a bit about the training for the R&D, right? So GitLab was a new tool for the R&D department. They didn't know GitLab at all. Also the glossary is different. They are used to pull requests, and now they need to think about merge requests, because that was actually a category in GitLab; in the UI it was called merge requests. So if you're not used to it, you could just overlook it when you're working with the UI. So they really need to be familiar with the tool. And basically I didn't want to spring it on them and expect them to feel comfortable using GitLab from day one. So that's why I had a training session to go over the basics of GitLab: the GitLab usage and main features, and basically have them be familiar with the tool before they actually need to start using it. I also encouraged them to play with GitLab itself on the dev environment, and that way they can really feel safe that on D-day they know what to do. I also created a GitLab onboarding doc with explanations of how to change their local repo copy to work with the remote repo URL of GitLab. So yeah, you can find it online, but I really wanted to give them the sense of support, especially since each time something happened with Bitbucket we used to say: yeah, Bitbucket, our hands are tied, we can't do anything about it. But since now we are using self-hosted GitLab, everything is on us, right?
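The onboarding step mentioned above, pointing an existing local clone at the new GitLab remote, comes down to `git remote set-url`. The following is a minimal sketch, assuming an SSH remote named `origin` and a GitLab group per team; the host `gitlab.example.com`, the group names and both helper functions are hypothetical and are not taken from the actual onboarding doc.

```python
import subprocess


def gitlab_remote_url(old_url, gitlab_host, gitlab_group):
    """Build an SSH remote URL on GitLab that keeps the repository name.

    Accepts URLs such as 'git@bitbucket.org:workspace/repo.git' or
    'https://bitbucket.org/workspace/repo'.
    """
    repo = old_url.rstrip("/").rsplit("/", 1)[-1]
    if not repo.endswith(".git"):
        repo += ".git"
    return f"git@{gitlab_host}:{gitlab_group}/{repo}"


def switch_origin(repo_dir, new_url):
    """Point the 'origin' remote of the local clone in repo_dir at new_url."""
    subprocess.run(
        ["git", "remote", "set-url", "origin", new_url],
        cwd=repo_dir,
        check=True,
    )


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    new_url = gitlab_remote_url(
        "git@bitbucket.org:acme/backend.git", "gitlab.example.com", "platform"
    )
    print(new_url)
    # switch_origin("/path/to/local/clone", new_url)  # run inside a real clone
```

Running `git remote -v` in the clone afterwards shows whether fetch and push now point at GitLab.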
So I wanted to give them the sense of support and have them know that for anything that they need, we are here for them. So this document helped me do that, because I even instructed them how to change the repo URL. This was the first step towards this sense of support. And also in the spirit of support, I created a dedicated Slack channel for the R&D to ask questions, raise bugs and get support for all GitLab-related issues. And I also put the GitLab issues that I've opened or found during the implementation and that are relevant for them as a pinned message in the channel, so they could follow up or escalate if needed, upvote the issues and stuff like that. So that's about the training for the R&D.

So we talked about the R&D. Let's talk a bit about my team, right? Because I am leaving the company. But even if I hadn't left the company, I wouldn't want them to feel or think that, hey, this is GitLab stuff, so only Hila is managing it. No, I wanted to make sure that they are not helpless and they know that they can manage it as much as me. So with the documentation, I think that I covered everything that I could possibly think of, from how to deploy a new version of GitLab, to how to manage and replace certificates, and the SAML integration, and what information to provide to GitLab support when you open a ticket with them, so you avoid the ping-pong and get a resolution as fast as possible. So everything that I could, I covered in this documentation, and I did the handoff with a training session to go over the documentation and answer questions, because I really wanted to make sure that they feel at ease, they know what to do once I leave the company, and they can take it from there.

Okay, so I talked about a lot of things, right? Planning and implementation, things specific to GitLab, things not specific to GitLab, and stuff like that.
So let's see what in general we can learn from this whole ordeal, which for me was really hectic. One month and a half for everything was pretty tight, and I had a lot of things to consider. So let's see what I learned from it, and maybe there's stuff that you can take from it for your project as well.

So planning is a must, especially for a long-term project. You have to plan ahead, understand the company's needs and why this project is important, and use this information to try and foresee any bottlenecks and plan how you are going to tackle them upfront. Derive deadlines based on all the information that you gathered and then execute accordingly. Because if you structure your plan, it will help you achieve things in a timely manner and literally progress according to plan.

Updates and collaboration matter. It doesn't matter if you're working on a project alone or with others; you should always involve the stakeholders and bring them up to speed: other managers or team leaders that are affected by this project and stuff like that, including regular updates and raising flags on issues.

And brainstorming and collaboration matter, point two. It's always best to showcase the technical implementation you're planning to your team members and manager, first of all to make them familiar with the project or with the system, because they need to support it later on, but also to see if they have remarks and can raise aspects that you might not have thought of. So we want to make sure that the implementation that we have will be as good and as suitable for your use case as possible. So that's why brainstorming and collaboration really matter.

Trade-offs are a given. You will always have a lot of things to consider and take into account. I really had a lot of things to think about, juggle and decide: this is important right now, this is not important.
So deadlines, mandatory and nice-to-have implementations, and a lot of other things: you need to make sure you execute based on that, because yeah, it's fun to play with the cool features that the UI offers, but if you're under a tight schedule, you should focus on what matters right now.

Documentation really is key: both the documentation that you read while implementing technical and logical aspects of the project, so it will help you make the right decisions and also defend those decisions whenever needed, and the documentation that you leave behind, because you want to make sure you share your knowledge and preserve it, because you don't want to be a single point of failure, right? You want to be able to go on vacation and not get a call saying: hey, we have an issue with GitLab, please help. Right? So documentation is very important, and that's why I think that it should be given the priority that it deserves.

And finally, last but really not least, change is hard when people are used to working in a certain way. You need to take that into consideration when you plan your project, especially if it's a migration project. So you need to make sure you leave time for training and familiarization with the tool.

So thank you. I know that I spoke fast at times, and I covered a lot of things, because I really believe in giving value. So if you have any other questions about managing a technical project, or specific questions about the GitLab implementation, you're more than welcome to reach out by mail, LinkedIn or Twitter, and I'll be more than happy to help you. Thank you.

Hila Fish

Senior DevOps Engineer @ Wix



