Conf42 Site Reliability Engineering (SRE) 2025 - Online

- premiere 5PM GMT

Unlocking Observability with React & Node.js

Abstract

This talk shares the secrets of observability with React and Node.js! We will discuss practical strategies for debugging, monitoring performance, and delivering seamless user experiences. The session will end with tips for reforming your dev workflow and ensuring your apps run flawlessly.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hello everyone. Good morning, good afternoon, and good evening, wherever you are. I am really excited and thrilled to be speaking in front of you today. Today I'm going to be talking about unlocking observability with React and Node.js applications. My name is Mohit Menghnani. I am currently a staff software engineer at Twilio, and I've been with them for almost four years. I'm part of the Phone Numbers product, and what my team is responsible for is procuring numbers and all of their management, so we can enable cross channels like email and messaging and also drive other communication products as well. But for today we'll be diving into a critical piece of full stack observability, specifically for applications built with React on the front end and Node.js on the backend. In today's digital landscape, understanding how these two key parts of our application work together has become more important than ever. So without further ado, let's dive in. Quickly going through the agenda: first I'll talk a little bit about why I think there is a need for a unified view, and then about what unified full stack observability means. Next we will look at how we can correlate the data we receive on the front end with the backend for observability, then some case studies, tools, and best practices, and we'll conclude with some of our findings. Our applications are no longer monoliths. We have sophisticated user interfaces built with React interacting with backend APIs powered by Node.js. This complexity makes it challenging to understand the root cause of issues when we treat the front end and backend as isolated systems. Traditional monitoring approaches, which look at these in silos, often fail to capture the crucial interactions between them. This is where full stack observability comes in, offering a way to see the entire picture.
One example that I really like to use: think of it like trying to understand a play only by watching the actors on stage, or only by reading the script. You need both to grasp the full narrative. Similarly, we need to see both the user experience and the server's operations to truly understand the state of our applications. So what exactly is unified full stack observability for React and Node.js applications? It's about having a single, comprehensive view of everything that's happening, right from the user's interaction on the React interface to the server-side logic, and then data handling at the data layer. This approach recognizes that the front end and backend are deeply connected and that their performance impacts each other. To achieve this, we need tools that can automatically link data from both sides, giving us a complete understanding without manual effort. This is a significant step up from traditional monitoring, where the front end and backend teams would look at these things in isolation: they would have their own dashboards, and then they would try to connect the dots. Imagine a slow-loading page. Is it the React code rendering, the network request, the Node.js server processing, or a database query? Unified observability should help us answer exactly this situation. It moves us beyond just knowing what is happening to understanding where in the system it is happening and why it is happening. So why should we strive for this unified, integrated observability? The benefits are substantial. First, when issues arise, we can pinpoint the cause much faster by looking at the dashboard, because we are looking at the complete picture, which significantly reduces MTTD and MTTR. Secondly, it fosters better collaboration between the front end and backend teams, as they have a shared view of the system's health.
Thirdly, we gain deeper insights into performance, which we can use to prepare for upcoming releases as well. This also helps us identify bottlenecks, which might span the front end, the backend, or something in between. This data then allows us to make informed decisions about how to optimize our applications and allocate resources more effectively. Finally, we can use anomaly detection across our entire stack, which often helps us catch potential problems before they can even affect our users. Ultimately, as we've discussed, this leads to much better user engagement, to the point I would call happiness for our users and customers, more efficient development cycles, and a more stable, more efficient, and more reliable application. Think also about the cost savings from reduced downtime and the increased user satisfaction from a consistently performing application. So next, the interesting part: how do we correlate the data coming from the front end and the backend? Because, as we just discussed, whatever action happens on the front end, we want to be able to track it through the backend down to the data layer. There are several techniques to achieve this. One is trace propagation, where a unique ID is generated for each user action and carried through the backend requests. We have standards for that: the W3C Trace Context headers, traceparent and tracestate, facilitate this. Another thing we can do is use correlation IDs or session IDs to link user activities; embedding these IDs in our logs allows us to trace a request's journey. And importantly, many observability platforms integrate real user monitoring (RUM) on the front end with application performance monitoring (APM) on the backend, automating this correlation.
OpenTelemetry provides a vendor-neutral way to implement this context propagation as well. As you can see, I've added a GET request with Host, traceparent, and tracestate headers. This ensures that when a user clicks a button on the front end, we can follow that specific action as it travels through the entire system. Without it, you are basically looking at fragmented pieces of information, and as discussed earlier, that makes debugging very difficult. Now, to effectively monitor our React and Node.js applications, we need to track specific KPIs, or key performance indicators. I've added some key performance indicators in this table; let's go through them one by one. I think they are some of the most important ones. For example, user-perceived latency tells us how fast the application feels. Error rates across both front end and backend tell us how stable the application is and what kinds of errors users are experiencing. Then we have end-to-end request latency, which shows us the total time a given request takes. Then we have throughput, which is how much load our application can handle. Resource utilization helps us identify potential bottlenecks: we can figure out which part of our resource chain is getting slow or acting up, and address it accordingly. User engagement metrics connect performance to user behavior, and Core Web Vitals are crucial for understanding the front-end user experience, which is often influenced by the backend. Tracking these KPIs together, in a dashboard setup, gives us a comprehensive understanding of our application's health on both the front end and the backend. The way I see it, these KPIs can also be conversation starters for front end and backend teams to discuss performance.
By monitoring these metrics holistically, we can understand how changes in one part of the stack affect the other, and ultimately the user experience and engagement. So, as we discussed, distributed tracing is a powerful technique for understanding how requests flow through the application. It involves breaking down each request into a series of operations, each called a span, and a series of spans forms a trace. The key here, as I mentioned, is context propagation: ensuring that each span, whether on the front end or the backend, is linked to the others using the unique trace ID. This allows us to visualize the entire journey of a request and pinpoint exactly where any slowdowns or errors are happening. OpenTelemetry provides a standard approach to implement this, and tools like Jaeger and Zipkin allow us to visualize these traces. Now, let's imagine a slow API call. Distributed tracing would allow us to see whether the delay is in the front end making the request, the backend receiving it, a specific function within the backend, a database query, or an external service call that is taking a really long time. This level of detail is invaluable for efficient debugging. Now let's look at some real-world scenarios. Imagine a sudden increase in errors on the React front end. With unified observability, you would be able to see that this coincides with increased latency in a specific backend API call, immediately pointing to the backend as the source of the problem. Or perhaps you notice that users taking a particular path on the front end cause high CPU usage on a backend service; this could be a specific endpoint or a specific route. By seeing both sides, you can optimize that specific user flow.
For instance, if a user has a payment failure, a trace might show you the front-end request, the call to your backend, and then a timeout when your backend tries to communicate with a third-party payment gateway. Similarly, slow loading of product images on the front end might be traced back to a slow API response from your Node.js server fetching that image data. For those using the MERN stack, observability solutions often provide specific integrations to monitor all of these components together. These examples highlight how unified observability transforms debugging from guesswork into a precise science. By connecting user behavior with backend performance, we can quickly understand the impact of technical issues on the user experience without losing critical time during an incident, and obviously saving a lot on the revenue side as well. Now let's look at some of the tooling and products available for implementing this unified observability. It's a very crowded space right now. You have platforms like Datadog and New Relic, which offer comprehensive solutions for both React and Node.js, allowing seamless correlation of data. We have Honeycomb, which focuses on providing a unified way to analyze all of our telemetry. Then we have Dynatrace, whose AI helps us detect and analyze issues across the stack. You have Elastic Observability and Grafana Cloud, providing robust platforms for unifying and visualizing data. Observe specifically focuses on connecting the front-end user experience with backend troubleshooting. We have OpenObserve, which is an open source alternative. And underpinning most of these solutions is OpenTelemetry, which provides a standardized way to instrument all our applications. At Twilio, we are considering OpenTelemetry right now. Again, I think there's no one-size-fits-all.
Choosing the right tool depends on your specific needs: what your scale is and what your exact requirements are based on your stack. Most of these companies offer free trials, so I would also recommend running a few POCs and taking feedback from the team to figure out which one fits your team's workflow best and provides the insights you need. Awesome. So let's talk about some best practices now. To truly leverage unified observability, we need to adopt some best practices. Creating dashboards that show key metrics and traces from both the front end and the backend in one place is essential for a holistic, high-level view. We should set up alerts that trigger based on correlated data, for example alerting us only when we see both a spike in front-end errors and an increase in backend latency. Then comes anomaly detection, which can help us catch issues we didn't even know to look for. Consistent tagging of our telemetry data also makes it easy to filter and correlate information. Defining clear observability goals and service level objectives helps us understand what good looks like and improves our monitoring setup. Finally, it is crucial to involve both front end and backend teams in defining what we monitor and how we are alerted, ensuring everyone has a shared understanding of what good looks like and making us a strong, cohesive unit. One more thing to remember: we are not just collecting data, we need to collect the right data, and then turn that data into actionable insights. Only then can we improve our application and the user experience. Regularly review your dashboards and alerts to ensure they are still relevant and provide value. To conclude this talk, embracing unified full stack observability for our React and Node.js applications offers significant advantages.
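The correlated-alert idea mentioned in the best practices above can be sketched as a simple predicate over two signals. The thresholds here are illustrative assumptions, not recommendations; in practice a monitoring platform would evaluate this kind of rule over windowed metrics.

```javascript
// Correlated alert rule: fire only when a front-end error spike AND
// elevated backend latency occur together, to cut alert noise.
// Both thresholds below are made-up example values.
function shouldAlert(frontendErrorRate, backendP95LatencyMs) {
  const ERROR_RATE_THRESHOLD = 0.05; // 5% of front-end requests failing
  const LATENCY_THRESHOLD_MS = 1000; // backend p95 above 1 second
  return (
    frontendErrorRate > ERROR_RATE_THRESHOLD &&
    backendP95LatencyMs > LATENCY_THRESHOLD_MS
  );
}
```

Either signal alone stays quiet; only the combination pages someone, which is the point of alerting on correlated data rather than on each silo separately.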
It allows us to resolve issues more quickly, gain deeper insight into our application's performance, and proactively address potential problems. By moving away from siloed monitoring, we can achieve a much more comprehensive understanding of our systems. With the help of modern tools and open standards like OpenTelemetry, this approach is becoming increasingly accessible and essential for building and maintaining reliable, high-performing applications that deliver excellent user experiences. To finish, I would like to say that we have the right tools; it's just a matter of how we use them to understand and manage these complex systems efficiently. Unified full stack observability is not just a trend; I think it's a fundamental shift towards building more resilient and user-centric applications. I want to thank all of you for listening to this talk. I had a really great time sharing these insights with you. If you have more questions, please don't hesitate to reach out, and I look forward to connecting with you. Bye-bye.

Mohit Menghnani

Staff Software Engineer @ Twilio



