Conf42 DevSecOps 2023 - Online

Simplifying Kafka Governance for Developers

Abstract

Are you burdened by governing a multitude of Kafka topics, numbering in the thousands, uncertain whether security measures are adequately enforced, and struggling to monitor numerous producers and consumers? If any of these concerns resonate with you, this presentation is tailored to address your needs.

Summary

  • Muralidhar Basani will go through a few challenges companies are currently facing when using Apache Kafka. We will also explore a few tools and dive into Klaw, which is an open source project. We have about 30 minutes for this talk, so let's begin.
  • Kafka is widely adopted in companies of all sizes due to its unique features such as scalability, retention and reliability. However, adopting Kafka comes with its share of challenges. In the next slide we will see how bringing structure to an entity makes it more productive.
  • Klaw is a toolkit web application designed to automate the process of creating and managing Kafka topics, ACLs, schemas and connectors. It focuses on four main principles: governance, self-service, security and automation.
  • Klaw is developed with React, the new UI, as the front-end technology. Klaw has defined workflows for applying configurations to Apache Kafka clusters and also other types of clusters. In the next five demos we will have two users, William and Jennifer, where one makes requests and the other approves them.
  • Promotion is a key feature of Klaw that improves governance, administration, and control of the topics. A topic can be initially created in the lowest environment and then promoted to the next environments as needed. On approval, the topic is promoted to the next environment.
  • Klaw supports synchronization of the configuration between Klaw and Apache Kafka and other clusters. Klaw allows for seamless synchronization of topics and ACLs from these clusters into your new setup. Disaster recovery is a common phenomenon in most software projects. Klaw helps immensely in this recovery process.
  • Klaw is an open source solution under the Apache license, so you can download and deploy it in any of your environments for free. It basically consists of two Java processes and is also available as Docker images. Additionally, Klaw offers a rich React-based UI, which can be accessed once the npm assets are built.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hi, I am Muralidhar Basani, staff software engineer at Aiven. In this talk I will go through a few challenges companies are currently facing when using Apache Kafka. We will also explore a few tools and dive into Klaw, which is an open source project. I will also give a demo of it and see how it actually overcomes those problems. We have about 30 minutes for this talk, so let's begin.
Let's start with some background on Kafka itself. As we all know, Kafka is widely adopted in companies of all sizes due to its unique features such as scalability, retention and reliability, setting it apart from traditional messaging platforms. Kafka is similar to other messaging solutions, but its exceptional features definitely make it stand out. However, adopting Kafka comes with its share of challenges. Many companies include Kafka in their technology stack as it enables them to grow and scale effectively. Events are published to Kafka and stored in topics, which can grow significantly as the applications around them also grow. However, managing these topics and their authorizations can be a very big struggle for many organizations. Large companies might have hundreds or even thousands of these topics, and improper governance and authorization management can become a big headache. To address these challenges, organizations often turn to open source or commercial products, and some even develop their own automation frameworks using tools like Jenkins, Git, Confluence, or even Excel.
Now in our next slide, we will see how bringing structure to an entity makes it more productive. This is the image of a library where it is almost impossible to find the book you are looking for, and on the right-hand side we see how books are properly organized and categorized. Most importantly, it is also tracked who is actually renting those books, and for how long. We can compare these books to Kafka topics.
In our next slide, let's see how topics are actually represented in Kafka. Here we are seeing a typical Kafka environment and the provisioning of topics to different producer and consumer teams. Some of the problems include manual activities: creating a single topic, providing producer and consumer access, and even promoting it to higher environments can take at least ten email communications. Executing these steps, meaning the actual command execution, can take two to three hours, excluding the human response time. Do these users have the authority to request access to these topics? And there is no record of who created those topics or who executed those manual commands for topic and ACL creation. Basically, centralized governance is challenging when actions are initiated through emails or maintained in spreadsheets, and a manual release process can lead to errors and system outages, especially when we think about schemas and their evolution. Deployment of various schema versions becomes critical. Did you know that several Kafka consumer clients run into deserialization issues when a new schema version is released on a topic, and they are not aware of this unless the right compatibility is set on the topic schema?
Various questions emerge in the context of a Kafka implementation, posing challenges for developers. Who has the permission to produce and consume on a topic? We know it's an application from a team, but what if they are not authorized to get access? Who owns a topic, a Kafka connector or, most importantly, the schemas? How can topics be promoted seamlessly from lower to higher environments with tested configurations? Is security properly enforced with access controls? How can the Kafka configuration be backed up to manage the disaster recovery process? These challenges are typically not evident during the initial stages of the project.
However, as the application or the project expands and the number of topics grows, managing them really becomes cumbersome. There are a few tools available to address certain problems in this space, and I have mentioned a few below. Kafka Manager from Yahoo is quite a good one with a nice user-friendly interface, and if you are looking for metadata to be stored in Git with CI/CD pipelines, then JulieOps has good automation there. Provectus's UI for Apache Kafka is a simple tool that makes your data observable and also helps you find and troubleshoot issues faster. These tools are doing quite well, but are not exactly what we are looking for. For example, a user can get a topic or ACL directly created on the cluster without any approvals, and there wouldn't be any ownership on the topics, so it's very hard to find the right contact person. That's where we see the need for Klaw.
Klaw is a toolkit web application designed to automate the process of creating and managing Kafka topics, ACLs, schemas and connectors. It focuses on four main principles: governance, self-service, security and automation, which we shall look at in detail in the next slide.
Governance involves defining roles, responsibilities, ownership, auditing activities, and naming conventions, to name a few, and Klaw acts as the single source of truth, preventing manual manipulations on the clusters. However, Klaw can also identify such manual changes, notify administrators, and synchronize across the systems.
Self-service empowers teams to become independent of the infrastructure teams and manage their Kafka configurations. They can create, promote, edit and claim topics, provide producer and consumer access, and request schemas and connectors.
Security is crucial for preventing unauthorized manual changes on the cluster. Klaw supports various protocols like Kerberos and SSL for secure connections, and users can log in using their Active Directory credentials or existing SSO mechanisms.
And the last one, automation, is a core feature of Klaw, making it self-sufficient. It enables easy provisioning of configurations and allows smooth promotion of topics across different environments. Metadata in Klaw can be synchronized back and forth with the cluster, with users receiving notifications for every request or configuration update.
Now that we have seen the fundamentals, let's see the architecture of Klaw. Klaw is developed with React, the new UI, as the front-end technology, while it also has an Angular-based UI. Klaw has two Java-based Spring applications. Klaw has defined workflows for applying configurations to Apache Kafka clusters and also other types of clusters. Instead of directly creating configurations on the cluster, Klaw follows the four-eyes principle. This approach entails raising a request and obtaining approval before actually implementing any changes. The benefit of workflows is that they provide an additional layer of security by mitigating the risks associated with manual entries. Basically, they ensure a thorough review and verification of the request by another person, ensuring the sanity of the application, and we can easily track the configuration change history. All this ownership metadata and the actual topic, ACL and schema configurations are stored in the metastore. By default, Klaw uses H2 as the metastore. This means that there are no additional dependencies to get started with your project. If you prefer to use another RDBMS such as MySQL, we of course recommend using it. The second Spring application connects to clusters for cluster-related operations. Another nice aspect is that you can connect to any number of Kafka clusters; there is no limit there. As Klaw runs on the four-eyes principle, it is recommended to use Klaw with at least two users, to request and review topics, et cetera.
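The request-and-approve workflow described above can be sketched in a few lines of Python. This is only an illustration of the four-eyes rule, not Klaw's actual code: the class, field and function names are hypothetical, and the topic name is made up.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    DECLINED = "declined"


@dataclass
class TopicRequest:
    # Hypothetical request record; a real system would persist this in a metastore.
    topic: str
    environment: str
    requester: str
    status: Status = Status.PENDING
    reviewer: Optional[str] = None


def review(request: TopicRequest, reviewer: str, approve: bool) -> TopicRequest:
    """Record a review decision, enforcing that the reviewer is not the requester."""
    if request.status is not Status.PENDING:
        raise ValueError("request has already been reviewed")
    if reviewer == request.requester:
        raise PermissionError("requester cannot review their own request")
    request.reviewer = reviewer
    request.status = Status.APPROVED if approve else Status.DECLINED
    return request


# One user raises the request, a second user approves it.
req = TopicRequest(topic="demo-topic", environment="dev", requester="william")
review(req, reviewer="jennifer", approve=True)
print(req.status.value)
```

Only after the status becomes approved would the change be applied to the cluster, and the stored requester and reviewer give the change history mentioned above.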
Now, in the next five demos we will have two users, William and Jennifer, where one makes requests and the other approves them. I will be demonstrating the provisioning of topics, ACLs and schemas, and also the disaster recovery process.
This use case is to provision a Kafka topic. Here, William requests a topic, which is validated and stored in the database, and Jennifer reviews and approves the request. On approval, the topic is provisioned on the cluster. That's it. The same principle is applied for the other elements too. I have two browser sessions open, one with the William user and the other one with Jennifer. So this is William, the other one is Jennifer. So let's request a topic and then approve it with the other user. Go to dev and name it, let's say, test topic for demo conf. We can give a topic description, topic for demo, and we could also add some text for the approver to know what this topic is about and why we need it: needed for demo. We can see our topic request, topic for democonf. If we go to approve requests, then we should be able to see it as the last one. So here is the topic request. Let's view it and approve the request. And if you go to topics and search for conf, we should see the topic.
Now that we have seen topic creation, let's get consumer access on it. Usually there would be multiple applications producing to and consuming from the Kafka topics. These applications can be owned by different teams, and the consumer teams would have to request read access on the topics. If this process is not automated, it could take ages to get approvals, and it's very hard to track what's actually happening in the background. In Klaw, any team can view a topic and request access to it. It's the topic owner team who actually decides to approve or decline the request. Note that Klaw masks the ACL information for other teams. Only the owner team can view ACL details such as IP addresses, SSL details, or principals.
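The masking behaviour just described can be pictured with a small sketch. It assumes a simplified ACL record and a placeholder mask string; the field names and the mask text are hypothetical and are not taken from Klaw.

```python
from dataclasses import dataclass, replace

MASK = "***"  # hypothetical placeholder shown to teams that do not own the topic


@dataclass(frozen=True)
class Acl:
    topic: str
    owner_team: str      # team that owns the topic
    consumer_team: str   # team that was granted access
    principal: str
    ip_address: str


def view_acl(acl: Acl, viewer_team: str) -> Acl:
    """Return the ACL as the given team may see it: full details for the owner, masked otherwise."""
    if viewer_team == acl.owner_team:
        return acl
    return replace(acl, principal=MASK, ip_address=MASK)


acl = Acl(
    topic="demo-topic",
    owner_team="platform",
    consumer_team="devrel",
    principal="User:demo-app",
    ip_address="10.0.0.12",
)
print(view_acl(acl, "platform").principal)
print(view_acl(acl, "another-team").principal)
```

Other teams can still see that a subscription exists on the topic, but not the sensitive connection details.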
This is an added layer of security. In the demo here, William requests a consumer ACL on a topic which is actually owned by a different team, and a member of that team is going to approve the request. On approval, the ACL is directly created on the topic, which means the relevant application can already start consuming from the topic. Now let's get into the demo. For the purposes of this demo, I have moved Jennifer to a different team. We can see that here: Jennifer now belongs to the DevRel team. So let's get into the topic. If you go to subscriptions, currently there are no subscriptions on it. I will request a consumer ACL on it. Go to consumer, and then consumer ACL access. Now that the request is created, if we come to approve requests from the topic owner team and look in the ACLs, we should see this request. We can view it, and this is the ACL access. Now let's approve this request. Now that the request is approved, let's see the topic. We can see there is one consumer access on it. So we have now created the topic and also provided consumer access.
Schemas are created for events or messages. They provide a good structure to the event. In Kafka projects, it is very much needed to define these schemas in the initial stage of the project, else it would be hard to get things in the right direction as the project grows. Klaw relies on the REST API of the schema registry server for all of its schema operations. Klaw uses the concept of a team for schema ownership and management, where the topic owner team requests a schema for a specific topic, and the team that owns the topic is responsible for making the final decision on any schema-related request, such as approving or declining it. Klaw enforces the topic name strategy to ensure only one schema is applied per topic, using the topic name to identify the schema subject used for schema lookups. Klaw supports Aiven's Karapace and also Confluent Schema Registry.
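As a rough illustration of the topic name strategy: under this strategy the subject for a topic's value schema is the topic name followed by -value (and -key for the key schema), and the Schema Registry REST API registers schema versions under /subjects/{subject}/versions. The helper names, registry address and topic name below are assumptions made for the example; whether Klaw builds its requests exactly this way is not stated in the talk.

```python
from urllib.parse import quote


def subject_for(topic: str, is_key: bool = False) -> str:
    """Derive the schema subject from the topic name, as the topic name strategy does."""
    suffix = "key" if is_key else "value"
    return f"{topic}-{suffix}"


def register_schema_url(registry_url: str, topic: str) -> str:
    """Build the REST endpoint used to register a new value schema version for a topic."""
    subject = quote(subject_for(topic), safe="")
    return f"{registry_url.rstrip('/')}/subjects/{subject}/versions"


# Hypothetical registry address and topic name, for illustration only.
print(subject_for("demo-topic"))
print(register_schema_url("http://localhost:8081", "demo-topic"))
```

Because the subject is derived from the topic name alone, each topic maps to exactly one value subject, which is what keeps a single schema lineage per topic.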
In the demo here, William requests a schema on the topic which is owned by his team. The request is validated and stored, and the other user, who belongs to the same team, is going to approve it. On approval, the schema is directly created on the subject. Now let's get into the demo. To request a schema on the topic, go to the topic and the schema tab. There are no schemas here. Request a new schema, select dev, upload a new schema. Okay, so we have now selected a schema. Now that we have submitted the request, let's go to the other user. In the schemas tab you can see the request is already waiting. Let's approve it, and we can now see the schema exists.
Promotion is a key feature of Klaw that improves governance, administration, and control of the topics. With topic promotion, a topic can be initially created in the lowest environment and then promoted to the next environments as needed. Once a topic is created in the base environment, you can promote it. This will create a promotion request that your teammates can review, approve, or decline. The topic overview will show the environments where the topic is configured, including the environment to which you want to promote it. In the demo here, William requests the topic promotion. The request is validated and stored, and the other user, who belongs to the same team, is going to approve the request. On approval, the topic is promoted to the next environment.
Now, you might be wondering whether we need two people at all for these activities. I would say yes. All these topics and ACLs are nothing but infrastructure; we are creating infrastructure as code, and it has to be reviewed by peers. When I meet people, they keep asking me whether ownership is mandatory on topics or schemas. Yes, again. Imagine, without defining ownership, whom would you contact for any issue on the topic, or for permissions, usability, documentation and much more?
It is similar to the books in the library, which are either rented by people or sitting in the library. Basically, this ownership gives them responsibility. Let's now get to the demo. We will now promote the topic from dev to test. If you see here, this topic is only available in dev. So let's go inside and click on promote. We want to promote with the same configuration as provided, and submit the promotion request. For this demo, I have again moved Jennifer back to the same team. We can see that now. If we come back to approve requests, go to topics and search for conf, it is waiting for approval. View it and then approve. If we come back to topics, you can see the topic now exists in both dev and test.
Disaster recovery is a common phenomenon in most software projects, and so it is with Kafka. Klaw helps immensely in this recovery process. Klaw supports synchronization of the configuration between Klaw and Apache Kafka and other clusters. Note that it is only the configuration, and not the actual data lying on the topics. Klaw allows for seamless synchronization of topics and ACLs from these clusters into your new setup. If your Klaw instance is already up and running, restored from a backup, or unaffected by a cluster outage, you can leverage the synchronize option to reinstate or update topics and ACLs across the clusters, which ensures data consistency and uninterrupted operations. In the last demo here, a superadmin, who has the sync topics permission, logs in, fetches the topics from the cluster and syncs them to Klaw. Let's get into the demo now. Like I mentioned, synchronization of topics from the cluster can be done in two ways, with the individual option or with the bulk option. With the individual option you have to select each and every topic one by one, and with the bulk option you can select all of them in one go.
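Conceptually, finding topics that are out of sync amounts to comparing the topic names on the cluster with the ones the governance tool already knows about. The sketch below shows that idea with plain dictionaries; the function names, data shapes and team name are hypothetical, and only configuration is copied, never topic data.

```python
def find_out_of_sync(cluster_topics: dict, klaw_metadata: dict) -> list:
    """Return names of topics that exist on the cluster but have no metadata entry yet."""
    return sorted(set(cluster_topics) - set(klaw_metadata))


def bulk_sync(cluster_topics: dict, klaw_metadata: dict, team: str) -> dict:
    """Assign every out-of-sync topic to a team and copy its configuration into the metadata."""
    for name in find_out_of_sync(cluster_topics, klaw_metadata):
        klaw_metadata[name] = {"team": team, "config": dict(cluster_topics[name])}
    return klaw_metadata


# Topic configurations as they might be read from a cluster (made-up values).
cluster = {
    "orders": {"partitions": 3, "replication_factor": 2},
    "payments": {"partitions": 6, "replication_factor": 3},
}
metadata = {"orders": {"team": "platform", "config": {"partitions": 3, "replication_factor": 2}}}

print(find_out_of_sync(cluster, metadata))
bulk_sync(cluster, metadata, team="devrel")
print(find_out_of_sync(cluster, metadata))
```

The individual option corresponds to applying the same assignment to one selected topic at a time instead of the whole list.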
So here we are seeing about 162 topics which are out of sync, which means they don't exist in Klaw. We select them all, for example, and assign them to a particular team. Now all the topics are synchronized with Klaw. Note that we are not synchronizing any data which exists on the topics, only the topic configuration. Now all these topics exist in Klaw.
So does Klaw fit into your project? The answer is mostly yes. Klaw is an open source solution under the Apache license, so you can download and deploy it in any of your environments for free. It basically consists of two Java processes and is also available as Docker images, making it production-ready and deployable in high-availability mode. Additionally, Klaw offers a rich React-based UI, which can be accessed once the npm assets are built.
We are almost at the end of the talk here. We have a few useful links. If you have any technical inquiries regarding the project, please feel free to raise an issue in the Git repository. The project can be downloaded from Git or Docker, or you can also access it from the available releases. Thank you for watching, and I hope you enjoyed it.
Muralidhar Basani

Staff Software Engineer @ Aiven
