Conf42 DevOps 2023 - Online

Creating Products through DevOps: The Story of VSHN

Abstract

Since 2014, our slogan at VSHN has been The DevOps Company. Following that ethos, VSHN applies all of its DevOps, Agile, and Sociocracy practice to creating our services and products. How to take into account diverging opinions during product development? How to match ethical business practices with market needs? How can a company be respectful of its employees and profitable simultaneously? Is there a working alternative to hierarchies? In this session, you will discover how DevOps shaped all our decisions, from idea to market. In particular, we will describe the challenges, processes, and decisions we took while creating our latest product, “APPUiO Cloud.”

Summary

  • Adrian Kosmaczewski: I'm in developer relations at VSHN, the DevOps company. Today I'm going to talk about how we built APPUiO Cloud at VSHN and how we used DevOps to create it. And then I will give some technical details.
  • VSHN is a company of 50 people located in Zurich. It offers DevOps as a service with a full team of Kubernetes and OpenShift experts. All decisions happen through consensus among all VSHNeers. The company has been self-funded since day one in 2014 and profitable since 2017.
  • APPUiO Cloud is meant to be an entry-level product catering to OpenShift customers. It is a public platform, so it comes with some gotchas. Resource availability is not guaranteed. Community support is included, and support packages are available at extra cost.
  • APPUiO Cloud's architecture follows Conway's law. The company has created three different sets of documentation. Transparency is one of our values at VSHN. All the features of APPUiO Cloud are possible; they can be released either now or later.
  • APPUiO Cloud is not and will never be finished. It is a product that changes continuously, sometimes in small ways and sometimes in bigger ones. We realized that in our preparation we had not designed our CPU request pricing properly. We rectified our policies openly and communicated clearly with our customers.
  • The team that built APPUiO Cloud is called Aldebaran. Trust is the key ingredient for an asynchronous work culture. Can other organizations use a similar process to create a product? Yes, but with a few caveats.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Thank you so much for attending this talk. My name is Adrian Kosmaczewski, and I'm in developer relations at VSHN, the DevOps company. Thank you so much to the organizers of the Conf42 DevOps 2023 event for organizing this talk and for giving me the opportunity to talk about VSHN and APPUiO Cloud. Through this talk, we would like to give you an idea of how we built this new product using DevOps as a guiding philosophy.

Today I'm going to talk about how we built APPUiO Cloud at VSHN. I will first start by explaining a little bit of our culture, then what APPUiO Cloud is and how we used DevOps to create it. And then I will give you some technical details about how we created this new product.

But first of all, I would like to introduce VSHN to those who have never heard about it before. That's how you pronounce the name VSHN: just like "vision". That's why there's an I in the logo. We're a company of 50 people located in Zurich, and we are the DevOps company. The slogan of VSHN is actually "The DevOps Company". We embrace the DevOps philosophy completely, and as you'll see today, we use its principles and ideas in everything we do.

What does VSHN do? We provide various services and products. We offer DevOps as a service, with a full team of Kubernetes and OpenShift experts ready to monitor your clusters and your applications 24/7 in the cloud. We help companies become self-sufficient and cloud enabled. We help software engineers build bridges between dev and ops with CI/CD pipelines on various platforms like GitLab, OpenShift or Argo CD. And we are Kubernetes and OpenShift specialists, to the point that our strategy is 100% oriented towards Kubernetes. Everything we do runs on Kubernetes.

Aside from our technological choices, the important thing to know about VSHN is that we have chosen to drive the growth of our company in various ways that are completely non-standard.
First of all, we are the DevOps company, and as such, we embrace the DevOps mantra completely. Everything we do is as automated as possible, and that frees our brains to think. We have decided as a company not to grow through venture capital. Instead, we rely on the good old method of organic growth. We have been self-funded since day one in 2014, and we've been consistently profitable and growing since 2017.

We use sociocracy as a management growth framework. This means that all decisions, and I mean all of them, happen through consensus among all VSHNeers. And yes, that's what we call ourselves, by the way. We have also created a handbook, freely available online, which in printed form takes 573 pages and explains everything we are and what we do at VSHN with quite an incredible level of detail, I must say. I invite you to check it out at handbook.vshn.ch, and you will learn everything there is to learn about us.

Sociocracy is our evolution framework, and we have a small team of people at VSHN whose only job is to help us evolve within this framework continuously. In particular, any VSHNeer is able to raise issues and problems, and to ask for help to change procedures or situations that are hurting their happiness. At VSHN, VSHNeers are able to create VSHN Improvement Proposals, or VIPs as we call them. They are simply tickets in our Jira that explain a current situation or decision, its drawbacks and negative impact, and propose a solution to be discussed by everyone involved or interested in the issue. This simple mechanism has completely transformed the way we work in the past three years, and as a result, we all feel part of the structure we built, and we feel responsible for it at all times.

All of these choices have shaped our culture in ways that are really not common at all and have interesting consequences in our day-to-day operations. For example, our hiring policies are different from those of most IT companies.
Not only do we pay attention to the IT skills of those who want to join our team, but we also pay very close attention to the human factor. We want people to feel great at VSHN, and one of the primary factors we evaluate during the hiring process is the likability of the person. That is, how much would we like to work with them every day? As Steve Jobs once said, we don't hire smart people to tell them what to do. We hire smart people so that they can tell us what to do.

Another interesting consequence of how VSHN works is that we had embraced remote and asynchronous working well before the pandemic. When the Swiss government mandated everyone to work from home in March 2020, we simply stayed at home and continued working as if nothing had happened. The important word here is "asynchronous". It is not so much that we work remotely, but that we work in a non-synchronous way. And this particular mindset has shaped our company greatly.

But let's go back in time a little bit and see how APPUiO Cloud came to be. In 2016, VSHN and Puzzle ITC, a well-known Swiss IT and software consulting company, launched a joint venture called APPUiO. This is a word in Esperanto that means "support". APPUiO consists of a series of products built around Red Hat OpenShift. For those who haven't heard of it yet, Red Hat OpenShift is the most widely used Kubernetes-based platform in the enterprise world. It is quite popular with big companies, and it incorporates a hardened and highly available Kubernetes cluster surrounded by lots of relevant software, for example a container repository, a management console, and CI/CD pipelines, with a very nice and professional GUI on top.

We decided we wanted to be a part of the OpenShift market, but we also realized that installing and operating OpenShift is a huge endeavor, and many companies could not use OpenShift because of a lack of staff or budget. So we decided to join forces with Puzzle ITC.
So APPUiO is a response to the complexity of Red Hat OpenShift. With APPUiO, customers can get a ready-to-use cluster together with the know-how of these two companies. We at VSHN specialize in the setup and maintenance of OpenShift clusters, and we've been operating OpenShift clusters, by the way, since version 3. Puzzle, on the other hand, are specialists in the creation of software solutions and cloud native applications for OpenShift, which is something we don't do. Together, the APPUiO team can help companies make the most out of their OpenShift investment.

APPUiO has historically been available in various forms. First of all, there was APPUiO Public, based on OpenShift 3. It was the first Swiss-based shared OpenShift cluster available to customers. It was a shared platform where customers could run their projects without having to care about management or anything else. There were APPUiO Public clusters running on various cloud providers, like cloudscale in Switzerland and AWS in Germany.

Then there was APPUiO Managed, which is the next step. With APPUiO Managed, organizations get their own non-shared OpenShift clusters for their exclusive use, and Puzzle ITC and VSHN take care of the operations of the cluster transparently for their users.

Finally, APPUiO Self-Managed is the final step in the evolution of organizations. With it, organizations not only get an OpenShift cluster, keys in hand and ready to run, but we also teach their IT teams how to manage and maintain the cluster by themselves. We gradually fade into the background and provide help until at some point we become completely invisible and they become completely independent.

APPUiO Cloud is the latest offering in the APPUiO family, and it started as a product in September 2021. What is APPUiO Cloud? Simply put, APPUiO Cloud is to OpenShift 4 what APPUiO Public was to OpenShift 3.
Because, you see, given the major architectural changes between OpenShift 3 and 4, instead of migrating our APPUiO Public infrastructure to OpenShift 4, we decided to create a new project from scratch, and we gave it a different name and even a different visual identity. We notified our APPUiO Public customers of the upcoming phasing out of the service, with an offer to help them migrate their payloads to APPUiO Cloud. And in just one year, APPUiO Public was fully decommissioned. That happened in September 2022, only one year after APPUiO Cloud started operations.

As said previously, APPUiO Cloud is based exclusively on OpenShift 4, and at the moment we have two APPUiO Cloud zones available to our customers: one in Lupfig, in the canton of Aargau, in central Switzerland, and another one in Geneva, running in the Exoscale premises. We plan to open more regions in the future as required, following the demand of our customers.

We started working on APPUiO Cloud in spring 2021, and we released it to the public in autumn that year. We reused a lot of code and infrastructure we had created for our previous work. First of all, we reused K8up. K8up is a Kubernetes backup operator that has been accepted by the Cloud Native Computing Foundation as a sandbox project. You can find it at k8up.io. We also reused Project Syn, which is a suite of tools that allows developers to remotely manage lots of Kubernetes clusters from a central location using GitOps. Check it out at syn.tools.

Who is APPUiO Cloud for? APPUiO Cloud, just like its predecessor APPUiO Public, is meant to be an entry-level product catering to the long tail of OpenShift customers who might be interested in getting access to a working OpenShift cluster without the hassle of installing and operating it. As such, we identified a few target groups. First of all, startups.
They might be interested in having a namespace on APPUiO Cloud so that they can launch their MVP and get more venture capital. Then there are DevOps and CI/CD pipelines to deploy your projects, and mobile app backends for iOS and Android. There is education: for example, if you want to learn about OpenShift, APPUiO Cloud would be a great place to start. There are technology trials for companies who want to hedge the risk of getting into the OpenShift world and, of course, resellers who would like to offer OpenShift services to third parties.

What is included in APPUiO Cloud? First of all, you get instant-on. You sign up for APPUiO Cloud, you get an OpenShift namespace, and you are ready to deploy your applications. You only pay for the resources you actually use, and you can define users and groups in your organization so that everyone can work on that project. There are a few preinstalled operators. Among them is K8up, as I mentioned, so you can back up all of your work at any time, and there is also cert-manager for you to create and manage your X.509 certificates. And finally, we've got community support. If you need help, you can check out the APPUiO Cloud forums and the chat. And for those needing more help, there are support packages available at extra cost.

Now, APPUiO Cloud is a public platform, so it comes with some gotchas. First of all, the maintenance policies are mandatory and predefined, so you know when maintenance is going to happen, and there might be some interruption to your work. We communicate status information live on our status page. Resource availability is not guaranteed. You get what you get; we cannot guarantee much more than that. That means that the SLA is best effort and there's a fair use policy: all APPUiO Cloud users should behave sensibly, because otherwise they could impact or degrade the service level for other users. There are no privileged containers running on APPUiO Cloud.
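
To make the rule about privileged containers concrete, here is a minimal Python sketch of the kind of check an admission policy performs on a pod manifest. It is not the actual policy used on APPUiO Cloud; the function names `find_privileged_containers` and `admit` are hypothetical. The only Kubernetes details it relies on are the `containers`, `initContainers` and `ephemeralContainers` lists of a pod spec and the `securityContext.privileged` field of each container.

```python
from typing import Any


def find_privileged_containers(pod: dict[str, Any]) -> list[str]:
    """Return the names of containers in a pod manifest that request privileged mode."""
    spec = pod.get("spec", {})
    offenders: list[str] = []
    # All three container lists can carry an optional securityContext.
    for kind in ("containers", "initContainers", "ephemeralContainers"):
        for container in spec.get(kind) or []:
            security = container.get("securityContext") or {}
            if security.get("privileged") is True:
                offenders.append(container.get("name", "<unnamed>"))
    return offenders


def admit(pod: dict[str, Any]) -> tuple[bool, str]:
    """Hypothetical admission decision: reject any pod that asks for a privileged container."""
    offenders = find_privileged_containers(pod)
    if offenders:
        return False, "privileged containers are not allowed: " + ", ".join(offenders)
    return True, "ok"


if __name__ == "__main__":
    # Illustrative manifest, written as a plain dict instead of YAML.
    pod = {
        "metadata": {"name": "demo"},
        "spec": {
            "containers": [
                {"name": "app"},
                {"name": "debug", "securityContext": {"privileged": True}},
            ]
        },
    }
    print(admit(pod))
```

A real platform would enforce this inside the cluster at admission time; the sketch only shows the shape of the decision.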
It's a very secure platform, and the log retention is only three days. After that we clear the logs. You can download them, but we will clear them. For the moment it is not possible to have operators other than the ones that are defined and preinstalled in APPUiO Cloud, but we evaluate new ones regularly depending on the demands of our customers.

Now, this talk is about DevOps, so what do we mean by the DevOps way of working? Let's see what we mean by DevOps first, because it's one of those words that can mean anything and everything, depending on who you ask. For example, this is a famous cartoon about DevOps, and this is clearly not what we mean by DevOps, although to be honest, there's a lot of YAML involved in what we do.

Usually when people talk about DevOps, they think about this physical division between developers and operations teams, how they don't communicate anymore, and how much better it would be if they did. Then DevOps comes along with continuous integration, automation, cloud, containers, collaboration and infrastructure as code, and then somehow all barriers are destroyed and we can once again collaborate and work better together. For us at VSHN, this is a limited view of what DevOps is and can bring. It is an important part, but not all.

Instead, we prefer to think about DevOps as a set of three principles, following what some authors have written about it. In particular, we think the best person to talk about DevOps is the author of The DevOps Handbook and The Phoenix Project, Gene Kim himself. The latter is actually a modern rewriting and reinterpretation of a classic management book from the 80s called The Goal, by Eliyahu Goldratt, and it's quite faithful in spirit as well. In those books, DevOps is defined by the Three Ways: the principles of flow, the principles of feedback, and the principles of continual learning and experimentation. So let's see how these three principles have shaped APPUiO Cloud.
Let's start with the principles of flow and see what they mean for product development. The first thing we had to decide was where to start. That is, what was the value stream we wanted to provide first? We wanted to have actual results as early as possible, because seeing things happen and appear is one of the best ways to keep a team active, motivated and delivering. That work began with the product documentation. That's right: the first thing we created through discussion was written documentation of what we wanted to offer.

Why written? Because we work asynchronously. That means that some of us work better at night, while some work better in the morning. Having everything written down helped everyone work on drafts of the documents until there was agreement. Agreement from whom? From everyone, from the product owners to the DevOps engineers who, in the end, have to maintain the solution. This way, operations teams know exactly what's going to happen. There are no surprises down the road, and they feel empowered and listened to. All the features of APPUiO Cloud are, simply put, possible. They may be released either now or later, but they are possible.

The important thing here is that we started by applying Conway's law. That is, we first structured the team that would work on APPUiO Cloud, and then we got to create the system. The end result of this process is that the architecture of APPUiO Cloud, following Conway's law, strictly mirrors the structure of our team. We do not fight Conway's law. We embrace it.

The result of this work of architecture and documentation can be summarized in three different documentation websites for APPUiO Cloud. You've heard right. We have created three different sets of documentation, and we keep them updated every day. There's product owner documentation, system engineer documentation, and end user documentation. We've made all of the documentation publicly available and viewable, actually even editable.
Because transparency is one of our values at VSHN, we want all of our customers to know exactly why we are doing things the way we do. In turn, this generates trust in our existing customers and shows our know-how to prospective ones. These three documentation sites are, simply put, great marketing tools.

The principle of flow requires teams to make work visible, to reduce batch sizes and intervals of work, and to build quality in. We limited work in progress to the strict minimum and we automated as much as possible in the process. Talking about automation, this involves removing the human factor from the maintenance of those clusters as much as possible. One of the key factors for doing this was Project Syn, a suite of tools we started building in 2019 that allows our small team to manage hundreds of clusters from a central location. We created Project Syn as a way to operate our customers' assets with a reduced human footprint, but it turned out to be a great way to handle APPUiO Cloud as well.

Thanks to Project Syn, DevOps engineers can specify and deploy changes to lots of clusters from a central location using GitOps. Just commit your changes as infrastructure as code to a Git repo, wait a few seconds, and all of the clusters have those changes. We use Project Syn to deploy Kyverno security policies, for example, to our APPUiO Cloud clusters so that all regions conform to the same rulebook. We also configured each of the APPUiO Cloud zones with the mandatory differences between the cloud providers we use, because Exoscale and cloudscale do not offer exactly the same features, and being able to see those differences written down allows us to manage those systems in the best possible way and to take the best possible decisions.

We know that APPUiO Cloud is a complex system. It's built out of complex systems itself, and they are all prone to failure at any given time.
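
The idea described above, one shared rulebook plus written-down differences per cloud provider, can be illustrated with a small Python sketch. It is not how Project Syn is implemented; it only shows the general pattern of layering a common configuration, provider-specific overrides and zone-specific overrides. All names and values (`COMMON`, `PROVIDERS`, `ZONES`, the storage classes, the zone names) are invented for illustration.

```python
from copy import deepcopy
from typing import Any

Config = dict[str, Any]


def deep_merge(base: Config, override: Config) -> Config:
    """Return a new dict in which override wins; nested dicts are merged recursively."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def render_cluster_config(common: Config, providers: dict[str, Config], zone: Config) -> Config:
    """Layer common settings, then provider differences, then zone-specific overrides."""
    config = deep_merge(common, providers.get(zone["provider"], {}))
    return deep_merge(config, zone.get("overrides", {}))


# Hypothetical data: the same rulebook for every zone ...
COMMON: Config = {
    "policies": {"disallow_privileged": True},
    "logging": {"retention_days": 3},
    "storage": {"default_class": "standard"},
}
# ... and the differences between providers, written down explicitly.
PROVIDERS: dict[str, Config] = {
    "cloudscale": {"storage": {"default_class": "ssd"}},
    "exoscale": {"storage": {"default_class": "block"}},
}
ZONES: list[Config] = [
    {"name": "zone-a", "provider": "cloudscale"},
    {"name": "zone-b", "provider": "exoscale", "overrides": {"ingress": {"replicas": 3}}},
]

if __name__ == "__main__":
    for zone in ZONES:
        print(zone["name"], render_cluster_config(COMMON, PROVIDERS, zone))
```

Because the inputs are plain data, they can live in a Git repository and be reviewed like any other change, which is the point of the GitOps workflow described in the talk.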
So this is not a matter of if, but rather a matter of when things are going to go wrong. So we need observability, and we built observability and management tools right from the start. In our work on APPUiO Cloud, we have reused the management infrastructure provided by OpenShift, the same one we were using for our private customers. But we have built APPUiO Cloud specific tools on top of that, so that we have complete observability of the cluster at all times.

Using everything as code as a basis for our work means that every time we fix an issue on the platform, we have to change a configuration file somewhere. This information is then stored in a Git repo as part of the project history. Not only that, but we also update the required documentation files, both internal and external, so that everyone knows, asynchronously and at their own rhythm, what happened, when and, most importantly, why.

And when we say everything as code, we mean it: security policies, build configurations, general configuration, infrastructure, and documentation. All of this is described in corresponding files and versioned in Git repos. We use GitLab, and its integrated CI/CD pipelines are configured to automatically build, test, and eventually deploy changes as they happen, thanks to Project Syn. All of the feedback we bring back to the system is automatically deployed whenever possible, which reduces the amount of brain work required to keep things running.

Even our documentation is automated. We use the Antora documentation generator, which can automatically extract and integrate documentation from various sources into a single website, and we use GitLab pipelines for that as well. With this process, engineers only have to update these documentation sources using AsciiDoc, which is very similar to Markdown, and git push. These changes are immediately picked up and verified.
We actually have style and syntax checks built into our pipelines, and all of this is deployed automatically.

Now, regarding the principles of continual learning and experimentation, I'm going to share an anecdote with you. APPUiO Cloud is not and will never be finished, right? It is a product that changes continuously, sometimes in small ways and sometimes in bigger ones. This is a screenshot of the APPUiO Cloud console a few months ago, around the month of May. Can you see the red banner on top? That red banner indicates the result of us learning something interesting about APPUiO Cloud, something we did not know, a change we had to bring to the platform. Here is the text of the red banner in the previous screenshot: "Issue with CPU requests resolved. The resolution includes a slight change to the pricing model."

This, as you can imagine, is the result of a learning process. We realized that in our preparation we had not designed our CPU request pricing properly. As soon as the first users started using the platform last year, we realized that some of them were consuming disproportionate amounts of CPU, and this was a huge problem because they were not aware of it, and we as a company would have to cover a lot of extra costs at first. That, of course, was a disaster from a business point of view. So we modified the policies in our clusters, we made this clearly visible and communicated it to all of our customers, and we updated our documentation, as shown on the slide. The solution is exactly what you see right there; it's an extract of the documentation.

This was an unexpected and unplanned learning, right? A local discovery that brought a global improvement in APPUiO Cloud for all users, and also for us in terms of business. We did cover some of the costs, but we rectified our policies openly and communicated clearly with our customers.
The result is that not only did all of them acknowledge and understand the changes, but we didn't lose a single customer because of it. This level of cooperation with our customers is one of the things we're most proud of.

So let me now give you some details about the work we did, including team sizes, the tech stack used, and many other details. Transparency is one of our values, so we're very happy to tell you everything about it. The team that built APPUiO Cloud is called Aldebaran. At VSHN, it was the team mostly in charge of the design, deployment and operation of APPUiO Cloud. They also received help from people from other teams, in particular those with very good experience in the deployment and operation of OpenShift clusters, of course, and also from marketing and sales, to coordinate the communication and the marketing campaigns to get new users onto the platform. The project manager and main product manager of APPUiO Cloud is Tobias Brunner, one of the founders of VSHN and the current CTO, who provided a very strong vision, no pun intended, about how APPUiO Cloud should behave and look.

Let's go into some details about how APPUiO Cloud is built. This slide contains all the major components we've chosen for it. We've got Red Hat OpenShift 4.11, of course. For security policies we use Kyverno, and for identity management we use Keycloak. We store secrets in Vault, we use Rook as the storage plugin, and we use Isovalent Cilium Enterprise as the networking plugin. For backup, as I mentioned, we use K8up, and for all the GitOps operations we need, we use Project Syn. For the documentation websites we use Antora, which I strongly recommend you check out. It's a wonderful tool. For our day-to-day asynchronous communication and collaboration, we use the usual tools you need to keep in touch with your peers, for example Zoom, Rocket.Chat, Jira, Confluence, and so on.

Here is a short timeline of events that led to the release of APPUiO Cloud.
We started talking about APPUiO Public 2.0 around two years ago. By July 2021, we had chosen the product name and registered the domain. Things accelerated during the summer of 2021, and we made the public announcement of APPUiO Cloud in September. By October, users started migrating their apps from APPUiO Public to APPUiO Cloud. In December, we announced a partnership with Isovalent to use their CNI plugin on APPUiO Cloud. Last year, we opened a new region in Geneva and we released the APPUiO Cloud portal so that our users can manage their projects, users and groups autonomously. And finally, we released our new product, AppCat, which allows APPUiO Cloud users to specify dependencies such as S3 buckets, databases, message queues and other systems directly in YAML from their OpenShift projects. We also enabled vertical pod autoscaling and workload monitoring for all of our users.

So can other organizations use a similar process to create a product? We believe yes, it is possible. However, there are a few caveats, and to be honest, we know some companies would have to work on those items first in order to have a successful DevOps journey.

First of all, writing skills are fundamental. We need DevOps engineers to be writers and to put everything down in writing, not only as everything as code (security, infrastructure, business rules, et cetera), but also as documentation. That means making sure that both the engineers and the users are able to refer to a written document that explains the reasons why things happen, and keeping that written documentation updated. This part of the work is not a chore, it's a bonus. It's part of the deliverables. It must be updated, reviewed and proofread.

Second, cloud native technology has been designed to work faster than ever. Containers, Kubernetes, CI/CD pipelines, open source: the whole ecosystem of cloud native technology is the great enabler of our modern world.
The technological context constitutes a fantastic giant's shoulder we can stand on to go faster and better. We definitely could never have done this work without the ecosystem of open source cloud native technologies available today.

But third of all, trust is paramount. You have to trust your teams. We actually think that trust is more important than flat hierarchies, even though these have helped us. Without trust, there's no way we could have created APPUiO Cloud in such a short amount of time. Trust allows teams to work independently, moving fast and without the inherent fear typical of a blame culture. And trust is the key ingredient for an asynchronous work culture. You cannot really go full async if you do not trust your teams. We stress this point because this factor is a deal breaker for many teams in many places in the world.

These are, we think, the three important pillars of our DevOps culture: writing, technology, and trust. They are what has helped us shape APPUiO Cloud into the product it is right now, steadily growing and changing continuously.

Is it easy to work like this in a DevOps mode? Of course not. There are a lot of things that can go wrong. But is it worth it? Let's put it this way: after all this time, we have internalized this way of working so much that we couldn't do things any other way. We think it's totally worth it, and as a result, we just do things like this all the time. With APPUiO Cloud, VSHN demonstrates that we can deliver a world-class product in a short amount of time with a small team of experts and with fast cycles of feedback and experimentation baked into the process.

We regularly publish blog posts telling the story of the product and sharing news about upcoming features in development. You can find them at vshn.ch/blog. For example, we have blog posts about what happens behind the scenes, about our API, how billing works, and all of that.
So please check it out, and if you would like to try APPUiO Cloud for 30 days, please go to the APPUiO Cloud registration page, use the voucher code conf forty two, and get a 30-day OpenShift project for your team to use and to test.

Thank you so much for your attention. It's been great explaining all of this story to you. I hope it's been useful to you, and I hope you have questions. I will be around in the chat to answer some of them. Thank you so much.

Adrian Kosmaczewski

Developer Relations @ VSHN AG



