Conf42 Site Reliability Engineering (SRE) 2025 - Online

- premiere 5PM GMT

Reliability Patterns in Permissioned Blockchain Systems

Abstract

Unlock the secrets to rock-solid, high-performance permissioned blockchains! Learn how to bolster fault tolerance and secure dependable transactions in Hyperledger Fabric networks. From consensus resilience to disaster recovery, discover hands-on patterns that guarantee reliability at scale.

Summary

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hey everyone, and welcome to this session on reliability patterns in permissioned blockchain systems. Your host today is Umegbewe Nwebedu — I usually go by Great. I currently work as an SRE and I've spent several years focusing on reliability engineering in distributed systems, including the last three years building and operating permissioned blockchain networks like Hyperledger Fabric. Outside of work, I like philosophy and mathematics.

In this talk we're going to explore proven architectural and operational patterns that can help you keep your Fabric network highly available and resilient to failures. I'll walk you through everything from how to cluster nodes for fault tolerance to handling Byzantine scenarios and practical performance tuning, and at the end we'll talk about real-world incidents.

A quick session overview: first, a Fabric overview, so that we're aligned on the basic components of a Fabric network. Next we'll dive into fault-tolerant node clusters, focusing on peers and orderers. Then we'll look at Byzantine fault tolerance and how Fabric is evolving in that area. After that, we'll discuss network partitioning and how to handle partitions, because bad things do happen. Then we'll talk about performance tuning for high availability, and cover monitoring and alerting strategies so you can detect issues early. Finally, we'll review real-world failure scenarios and learn from them.

So, what is Fabric? Hyperledger Fabric is a modular, permissioned blockchain framework. Unlike public blockchains like Bitcoin, Ethereum, or Solana, it's designed for environments where participants are known or semi-trusted, so you'll find it in government, financial institutions, supply chains — that kind of thing. Fabric's main components: peers, which host ledgers and smart contracts (also called chaincode); the ordering service, which batches transactions into blocks — we'll see how Raft and BFT can be used there for reliability; the membership service providers and certificate authorities (CAs), which issue and manage identities; and chaincode, the smart contract logic. Chaincode is installed on peers, and what it does is endorse and validate transactions. We'll also talk about how to prevent downtime and data loss in the face of catastrophic failures in a system like this.

Fabric is designed with reliability in mind. It's a distributed system where every peer keeps its own copy of the ledger, so you get replication by default. Consensus is also configurable — you can run Raft or BFT (Solo and Kafka are deprecated) — and the peers sync with each other through the gossip protocol.

Here's an example diagram of a multi-organization Fabric network. In this network there are four organizations: the root organization, which runs the ordering service, plus three peer organizations — org one, two, and three. Each participating partner needs to have its own organization. Each of these organizations has at least one peer running in it, and a good network built for redundancy will have at least two peers per org. You can see organization one is running multiple peers while organizations two and three are running one peer each, and the root organization is running three orderers built on Raft — as the talk goes on, a lot of this will make sense. Notice also that each organization runs a CA to manage identities. The big picture here is that each org has multiple peers, those peers talk to the ordering nodes to get blocks, and the peers rely on the gossip protocol to share ledger data among themselves. A rough sketch of what that topology looks like in configuration follows below.
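To make the diagram concrete, here is a minimal, hypothetical configtx.yaml-style sketch of that topology — three peer organizations plus a root ordering organization running a three-node Raft cluster. The org names, hostnames, and paths are invented for illustration; only the key layout follows the standard Fabric sample config.

```yaml
# Hypothetical configtx.yaml excerpt mirroring the diagram: three peer orgs
# plus a root org that runs a three-node Raft ordering service.
Organizations:
  - &OrdererOrg
    Name: OrdererOrg
    ID: OrdererMSP
    MSPDir: crypto-config/ordererOrganizations/root.example.com/msp
  - &Org1
    Name: Org1
    ID: Org1MSP
    MSPDir: crypto-config/peerOrganizations/org1.example.com/msp
  - &Org2
    Name: Org2
    ID: Org2MSP
    MSPDir: crypto-config/peerOrganizations/org2.example.com/msp
  - &Org3
    Name: Org3
    ID: Org3MSP
    MSPDir: crypto-config/peerOrganizations/org3.example.com/msp

Orderer:
  OrdererType: etcdraft          # Raft-based ordering service
  EtcdRaft:
    Consenters:                  # three orderers -> tolerates one crash
      - Host: orderer0.root.example.com
        Port: 7050
        ClientTLSCert: crypto-config/ordererOrganizations/root.example.com/orderers/orderer0/tls/server.crt
        ServerTLSCert: crypto-config/ordererOrganizations/root.example.com/orderers/orderer0/tls/server.crt
      - Host: orderer1.root.example.com
        Port: 7050
        ClientTLSCert: crypto-config/ordererOrganizations/root.example.com/orderers/orderer1/tls/server.crt
        ServerTLSCert: crypto-config/ordererOrganizations/root.example.com/orderers/orderer1/tls/server.crt
      - Host: orderer2.root.example.com
        Port: 7050
        ClientTLSCert: crypto-config/ordererOrganizations/root.example.com/orderers/orderer2/tls/server.crt
        ServerTLSCert: crypto-config/ordererOrganizations/root.example.com/orderers/orderer2/tls/server.crt
```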
Like I said, some architectural patterns: each org should run two to three peers for redundancy, plus the gossip protocol for ledger sync. The ordering service should have three to five orderers, and you want an odd number because consensus runs on majorities — at least if you're running Raft; we'll get more into that. For CAs: if your CA goes down, that's probably not an immediate problem — the network will keep running — but in cases where clients have to renew certificates or you want to enroll a new identity, it becomes a problem, so you probably want a backup CA. And if you're running on Kubernetes, there's a feature called pod anti-affinity: you want to spread the peer and orderer pods across different nodes. You don't want all of them running on the same Kubernetes node, because if that node goes down, you have your whole network going down. A sketch of what that looks like follows below.
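As a concrete illustration of the anti-affinity point, here is a minimal Kubernetes sketch for a peer Deployment — the names and labels are hypothetical, but the podAntiAffinity stanza is standard Kubernetes and keeps replicas of the same org's peer off the same node.

```yaml
# Hypothetical Deployment fragment for an Org1 peer: required pod anti-affinity
# prevents two Org1 peer replicas from landing on the same Kubernetes node.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: org1-peer
spec:
  replicas: 2
  selector:
    matchLabels:
      app: fabric-peer
      org: org1
  template:
    metadata:
      labels:
        app: fabric-peer
        org: org1
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: fabric-peer
                  org: org1
              topologyKey: kubernetes.io/hostname   # or topology.kubernetes.io/zone to spread across zones
      containers:
        - name: peer
          image: hyperledger/fabric-peer:2.5
```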
Now, fault-tolerant node clusters: how you build node-level fault tolerance. Like I said, in Fabric you have two types of nodes, the peers and the ordering service. For peers, each org should have at least two or three; in a good network, if one fails, the others can endorse transactions or serve queries, and gossip ensures they keep consistent ledger copies. For the ordering service, you typically run three to five orderers if you're using Raft for crash fault tolerance. If one fails, the cluster will elect a new leader as long as there's still a majority up — Raft is leader-based consensus, so the cluster always needs a leader, and a leader election happens whenever the current leader crashes or dies. If you want higher fault tolerance, you can go as high as five nodes; with five nodes you can tolerate two crashes. Keep adding nodes for higher fault tolerance, but keep the count odd, and don't add so many nodes that you throw the cluster out of equilibrium — every extra node adds replication and voting overhead. This architecture is crucial, because if we had only one ordering node, or only a single peer, any downtime would just halt the entire network. You can see why this is important.

A little on Raft mechanics: Raft works by having one node as the leader. That leader proposes transaction batches, and when a majority of the followers acknowledge, the block is considered committed. If the leader fails, one of the followers steps up and becomes the leader. So in a three-node cluster, losing one node doesn't stop the show, because you still have two nodes forming a majority. That's how we keep ordering online despite hardware failures or crashes.

To summarize, the key is to avoid single points of failure. No single orderer — always a cluster. Multiple peers per organization — you want at least two, ideally three. If your identity issuance is mission critical, you want a backup CA. If you're running on Kubernetes, set up anti-affinity to make sure pods don't land on the same host. As a rule of thumb, consider your failure domains — region, data center, or even the rack — and spread the nodes accordingly.

Now let's talk about Byzantine fault tolerance. Fabric's default consensus, Raft, is crash fault tolerant — it handles nodes that crash. But what if your nodes could be outright malicious? For example, one of your nodes tries to reorder transactions. That's a failure mode Raft doesn't take care of. Byzantine fault tolerance means the system can tolerate nodes that actively misbehave — that attempt to fork the chain or do crazy things like reorder transactions. Fabric introduced a BFT ordering service, based on SmartBFT, in Fabric 3.0. This requires 3f+1 nodes to tolerate f failures: f is the number of faulty or malicious nodes you want to be able to tolerate, so to tolerate one faulty node you need four nodes (3×1+1), and to tolerate two faulty or malicious nodes you need 3×2+1, which is seven nodes. It's more resource intensive and has more message overhead, but you get a stronger guarantee that the network remains consistent even if some participants act maliciously. The use case is environments where you don't fully trust the other members — consortia of semi-trusted parties, very high-value finance environments. To summarize: Raft handles crash failures well, but not malicious ones; BFT can handle both. Most deployments won't immediately jump to BFT because it's still very new in Fabric, but if you have a use case with truly untrusted participants, it's worth evaluating.
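Since the 3f+1 sizing maps directly onto the orderer configuration, here is a rough, hypothetical sketch of what a four-consenter BFT orderer section can look like in configtx.yaml for Fabric 3.x. The hostnames and paths are invented, and the exact consenter-mapping field names vary by release, so treat this purely as an illustration of the sizing and check it against the sample configtx.yaml shipped with your Fabric version.

```yaml
# Hypothetical configtx.yaml excerpt for a SmartBFT ordering service (Fabric 3.x).
# Four consenters tolerate f = 1 Byzantine orderer (3f + 1); seven tolerate f = 2.
# Field names follow the Fabric 3.x sample config as best I recall — verify them
# against your release before use.
Orderer:
  OrdererType: BFT
  ConsenterMapping:
    - ID: 1
      Host: orderer1.root.example.com
      Port: 7050
      MSPID: OrdererMSP
      Identity: /path/to/orderer1/msp/signcerts/cert.pem
      ClientTLSCert: /path/to/orderer1/tls/client.crt
      ServerTLSCert: /path/to/orderer1/tls/server.crt
    # ...IDs 2-4 defined the same way, one per ordering node
  # SmartBFT:            # consensus tuning options (batch sizes, timeouts) go here
```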
Now let's look at handling network partitions. Real distributed systems must handle the case where the network itself fails or becomes segmented. Let's say the orderers get partitioned from each other: with Raft, only the partition that has a majority of orderers will continue producing blocks. The minority side stalls — it doesn't create a fork — and once the partition heals, the minority automatically catches up. For peer isolation: if peers are cut off, they stop receiving blocks; when connectivity returns, they do a gossip-based catch-up, pulling the missed blocks from healthy peers. As architects, we want to distribute nodes across different zones or data centers so that we can still reach a majority even if a region is cut off. The partition diagram here shows the majority partition keeping ordering going while the minority stalls. The big takeaway is that no fork happens — the minority side eventually syncs once the network is restored.

Now let's discuss performance tuning for high availability. Reliability isn't just about redundancy; it's also about ensuring the system can handle load without falling over. A few tips: increase block size for throughput, but watch latency — you have to tune this for your workload. If possible, use looser, more flexible endorsement policies — e.g. "any two of three orgs" keeps you running if one org is down, while strict policies can cause a stall if a required organization goes offline. For the state database, LevelDB is faster, but CouchDB supports rich queries, which adds some overhead. You also want multiple channels, for parallelizing workload, for scalability, and for isolating issues. Make sure you're beefy at the hardware level — enough CPU, memory, fast SSDs, and network bandwidth. And for rolling upgrades, update one node at a time to avoid a total outage.

Next is a sample Raft config — you can see in the excerpt how we define the consenters and options in configtx.yaml. The key parameters, like TickInterval, ElectionTick, and HeartbeatTick, can be tuned based on network latency. The default values are often fine, but in high-latency or global deployments you may want to increase them; you can tune the election tick, the heartbeat tick, and so on (a reconstructed sketch of this block appears at the end of this section).

Then there's an endorsement policy example — a policy snippet that requires any two of three orgs. That means one org can be completely offline and you'll still be able to get valid endorsements. This is much more resilient than a policy requiring all three orgs, where if one organization stalls, transactions on the whole channel stall (a sketch of this is below as well).

For monitoring and alerting: to proactively catch issues, we need monitoring. Key signals include node health and container restarts; block production rate and ledger height, to detect lagging nodes; transaction throughput and latency; resource usage; and certificate expiration. A popular combo is Prometheus plus Grafana, but you can also use something like ELK, Splunk, or Datadog for log analysis. You want to alert when no new blocks have been created, or on resource exhaustion. Proactive alerting lets you detect anomalies early: check whether the Raft leader has changed, alert on CA failures, and monitor things like that. And if you've been collecting these metrics for a long time, do historical trend analysis on them.
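Here is that Raft options block, reconstructed as a sketch using the stock defaults from the upstream sample configtx.yaml (the values shown are the shipped defaults, not tuned recommendations):

```yaml
# Sketch of the tunable Raft options under Orderer.EtcdRaft.Options in configtx.yaml.
Orderer:
  OrdererType: etcdraft
  EtcdRaft:
    Options:
      TickInterval: 500ms          # length of one Raft tick; raise on high-latency WAN links
      ElectionTick: 10             # ticks without a heartbeat before a follower starts an election
      HeartbeatTick: 1             # ticks between leader heartbeats (must be smaller than ElectionTick)
      MaxInflightBlocks: 5         # limit on in-flight append messages during optimistic replication
      SnapshotIntervalSize: 16 MB  # bytes of data written between Raft snapshots
```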
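And here is the "any two of three orgs" endorsement policy. The talk shows it as a JSON-style snippet; expressed as a Fabric signature policy in the configtx.yaml channel-default form, with hypothetical MSP IDs, it looks roughly like this:

```yaml
# Channel-default endorsement policy: any 2 of the 3 orgs can endorse,
# so one org can be completely offline and transactions still commit.
Application:
  Policies:
    Endorsement:
      Type: Signature
      Rule: "OutOf(2, 'Org1MSP.peer', 'Org2MSP.peer', 'Org3MSP.peer')"
```

The same rule can also be set per chaincode at approval time (via the --signature-policy flag of the peer lifecycle chaincode commands) instead of relying on the channel default.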
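Finally, for the monitoring side, a hedged sketch of two Prometheus alerting rules along the lines described above — this assumes the peers' and orderers' operations endpoints are scraped by Prometheus; the metric names (ledger_blockchain_height, consensus_etcdraft_leader_changes) come from Fabric's metrics reference, and the thresholds are placeholders to adjust for your network.

```yaml
# Sketch of Prometheus alerting rules for a Fabric network.
groups:
  - name: fabric-reliability
    rules:
      - alert: NoNewBlocks
        # Ledger height has not advanced on a peer for 10 minutes.
        expr: delta(ledger_blockchain_height[10m]) == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "No new blocks committed on {{ $labels.instance }} in 10m"
      - alert: RaftLeaderChurn
        # Repeated leader elections in a short window suggest cluster instability.
        expr: increase(consensus_etcdraft_leader_changes[15m]) > 3
        labels:
          severity: warning
        annotations:
          summary: "Frequent Raft leader changes on {{ $labels.instance }}"
```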
Even with all these patterns, something can go wrong, so let's look at some failure scenarios. An orderer node crash: Raft will quickly elect a new leader and the cluster heals itself, so this really needs no manual intervention. Peers out of sync: gossip can usually catch them up automatically; when it can't, the cause is probably network failures outside of Fabric. A CA being down isn't immediately fatal, but enrollments and renewals stop. A strict endorsement policy can freeze the network when you don't have enough organizations online. And then certificate expiration — I think the most famous case of this is what happened with the DCash CBDC when it went down for months. Also, when I worked at Gala, the chain there was based on Hyperledger Fabric, and there were instances where certificate expiration caused big issues. Sometimes things like chaincode bugs make a container panic, but the other peers keep going. There's also partial connectivity — a network partition — where only the majority side continues, so the minority is stalled until it reconnects. And sometimes you get denial-of-service-like behaviour, where a peer is flooded with proposals, leading to slow endorsement or even a stall; that's going to need rate limits. From these incidents we've seen that multi-node setups, flexible policies, automatic certificate renewals, and end-to-end monitoring are crucial. In all of these, I think monitoring is the really important piece.

There are some industry use cases and lessons for Fabric. We've seen Fabric in financial services, in supply chain, and in identity systems. Financial services deployments are typically multi-region, with three to five orderer nodes; they may also use BFT, zero-downtime updates, and they're very heavy on monitoring. Supply chain is high volume, so you need parallel channels, and you have to make sure data is always available even if some peers fail. Identity systems are very small, very security-focused networks that will possibly adopt BFT for malicious fault tolerance. In every one of these cases, the ops teams invest heavily in monitoring, scaling, and testing.

A quick recap of best practices: run redundant peers and orderer nodes — redundant components everywhere. Use flexible endorsement policies: avoid single-org endorsement if possible, and don't require every org. Do regular certificate management and back up admin credentials. Tune block sizes and endorsement policies for your workload. Actively monitor, and use failover and rolling upgrades to minimize downtime. And test your assumptions: shut down an orderer or simulate a partition to see whether the network recovers — do some chaos testing from time to time.

The conclusion here is that Hyperledger Fabric can be very reliable if we take advantage of its architecture: distribute the orderers, replicate the peers, set up thoughtful endorsement policies, and build a solid monitoring stack. We've looked at both crash fault tolerance and Byzantine fault tolerance, how to handle network partitions, how to tune performance, and how to avoid real-world pitfalls like certificate expiry. By following these reliability patterns, you'll be well on your way to a resilient, enterprise-ready Fabric network. Thank you for your attention — I hope this was a good talk and that it gives you some perspective on how to design reliability patterns in permissioned blockchain systems. Thank you. Bye.
...

Umegbewe Nwebedu

@ Botanix Labs
