Conf42: Site Reliability Engineering 2021


Investigating Performance Issues In Microservices Arch. With Distributed Tracing

Dotan Horovits
Product Evangelist @

Running a multi-tenant SaaS at scale is no easy task. As we scaled out massively, we started encountering performance issues. Investigating them proved tricky in our microservices architecture, which runs on Kubernetes and Docker containers across multiple regions and multiple cloud providers. We run an Observability SaaS platform, which we both use internally and offer to others. Our system is instrumented inside and out with logging and metrics, but those proved not to be the right tools for the job. We needed another weapon for our performance blitz. That is how we arrived at distributed tracing, first as practitioners and later by offering it as part of our Observability platform.

In this talk I’ll share our journey to distributed tracing and the Jaeger open source project, how it helped us overcome performance issues across our application stack, from Node.js down to the Java and database backends, and how it has become an integral part of our daily routine. If you’re battling performance issues, or considering your first steps into distributed tracing, this talk is for you. I’ll show useful examples, best practices, and tips to make your life easier when battling performance issues and gaining better observability into your system. I’ll also cover how to make this a gradual and smooth journey, even in high-scale production systems.
