Operating a distributed system like HBase/HDFS at petabyte scale took years to master in our private data centers. This talk describes our dramatic shift to running a mission-critical stateful application on Kubernetes in the public cloud, why we did it, and the challenges we had to overcome.
Salesforce runs a very large footprint of HBase and HDFS clusters in our data centers, with multiple petabytes of data and billions of queries per day across thousands of machines. After more than a decade of running our own data centers, we pivoted to the public cloud for its scalability and availability. As part of this foray, we made a bold decision to move our HBase clusters from staid bare-metal hosts to the dynamic and immutable world of containers and Kubernetes. This move brought a number of challenges that are likely to find echoes in other mature stateful applications adapting to the public cloud and Kubernetes. The challenges include:
1. Limitations in Kubernetes when deploying large-scale stateful applications
2. Failures in HBase/HDFS as DNS records keep changing in Kubernetes
3. Resilience of HBase/HDFS even when a whole availability zone fails in the public cloud
4. Introducing encryption for communication over untrusted public cloud networks without the application being aware
This talk will go over how we overcame these challenges and the benefits we are beginning to see from these efforts.