Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hello everyone.
Good morning, good afternoon, or good evening, wherever you are.
I am really excited and thrilled to be speaking in front of you today.
Today I'm going to be talking about unlocking observability
with React and Node.js applications.
My name is Moi Ani.
I am currently a staff software engineer for Twilio and I've been
with them for almost four years.
I'm part of the phone numbers product, and what my team is basically
responsible for is procuring numbers and all of their management, so we
can enable our cross channels like email messaging and also drive other
communication products as well.
But today we'll be diving into a critical topic:
full stack observability, specifically for applications built with React
on the front end and Node.js on the backend. In today's digital landscape,
understanding how these two key parts of our application work together
has become more important than ever.
So without further ado, let's dive in.
Quickly going through the agenda: first I'll talk a little
bit about why I think there is a need for a unified view, and then
I will talk about what unified full stack observability means.
Next we will look at how we can correlate the data that
we receive on the front end with the backend for observability.
Then come some case studies, tools, and best practices, and we'll conclude
with some of our findings.
Our applications are no longer monoliths.
We have sophisticated user interfaces built with React, interacting with
backend APIs powered by Node.js.
This complexity makes it challenging to understand the root
cause of issues when we treat the front end and backend as isolated systems.
Traditional monitoring approaches, which look at these in silos,
often fail to capture the crucial interactions between them.
This is where full stack observability comes in, offering
a way to see the entire picture.
One analogy that I really like to use: think of it like trying to understand
a play only by watching the actors on stage, or only by reading the script.
You need both to grasp the full narrative.
Similarly, we need to see both the user experience and the server's
operations to truly understand the state of our applications.
So what exactly is unified full stack observability for
React and Node.js applications?
It's about having a single comprehensive view of everything that's happening, right
from the user's interaction on the React interface to the server side logic, and
then data handling at the data layer.
This approach recognizes that the front end and backend are deeply connected
and that their performance impacts each other.
To achieve this, we basically need tools that can automatically link data
from both sides, giving us a complete understanding without any manual effort.
This is a significant step up from traditional monitoring,
where the front end and backend
teams would look at these things in isolation.
They would have their own dashboards, and then they would try to connect the dots.
Okay.
Imagine a slow loading page.
Is it the React code rendering, the network request,
the Node.js server processing, or a database query?
Unified observability should help us answer exactly this question.
It moves us beyond just knowing what is happening to
understanding where in the system it is happening and why it is happening.
So let's talk about why should we strive for this unified,
integrated observability?
The benefits are substantial.
First, when issues arise, we can pinpoint the cause much faster
by looking at the dashboard, because we are looking at a complete picture, right?
So it leads to a significantly reduced MTTR or MTTD.
Secondly, it fosters better collaboration between the front end and backend
teams as they would have a shared, collected view of the system's health.
Thirdly, we can gain deeper insights into the performance and then use it to
prepare for ongoing releases as well.
This will also help us identify bottlenecks
that might span the front end and backend, or that sit somewhere
between the front end and backend.
This data then allows us to make informed decisions about how we can optimize
our applications and also allocate these resources more effectively.
Finally, we can use anomaly detection
across our entire stack, which will often help us
catch potential problems before they can even affect our users.
Ultimately, as we've discussed, this leads to much better user engagement,
to the point of what I would like to call happiness for our users and customers,
more efficient development cycles,
and a more stable, more efficient, and more reliable application.
Think also about the cost savings from reduced downtime and the
increased user satisfaction from a consistently performing application.
So next, the interesting part is how we correlate the data coming from
the front end and the backend, right?
Because as we just discussed, whatever action happens on the
front end, we want to be able to track it down through the backend to the data layer.
There are several techniques to achieve this.
One is trace propagation,
where whatever user action happens is
basically assigned a unique ID,
and that unique ID is carried through the backend requests.
We do have some standards for that, like Trace Context, and headers like
sentry-trace or traceparent facilitate this.
Another thing we can do is use correlation IDs or session IDs to link
these user activities. Embedding these IDs in our logs will allow
us to trace a request's journey.
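The correlation-ID idea described here can be sketched in a few lines. This is only an illustration under stated assumptions: the `LogEntry` shape, the `createCorrelatedLogger` and `traceJourney` helpers, and the ID values are all hypothetical names invented for this example, not part of any particular logging library.

```typescript
// Hypothetical shape of a structured log entry; field names are illustrative.
interface LogEntry {
  timestamp: string;
  level: "info" | "error";
  correlationId: string;
  source: "frontend" | "backend";
  message: string;
}

// Create a logger bound to one correlation ID so every line it emits carries it.
function createCorrelatedLogger(
  correlationId: string,
  source: LogEntry["source"],
  sink: (entry: LogEntry) => void,
) {
  const write = (level: LogEntry["level"], message: string): void =>
    sink({ timestamp: new Date().toISOString(), level, correlationId, source, message });
  return {
    info: (message: string): void => write("info", message),
    error: (message: string): void => write("error", message),
  };
}

// Reconstruct one request's journey by filtering all collected logs by ID.
function traceJourney(logs: LogEntry[], correlationId: string): string[] {
  return logs
    .filter((entry) => entry.correlationId === correlationId)
    .map((entry) => `${entry.source}: ${entry.message}`);
}

// Usage: the front end and backend share the same ID for one user action.
const collectedLogs: LogEntry[] = [];
const logSink = (entry: LogEntry): void => {
  collectedLogs.push(entry);
};
const frontendLog = createCorrelatedLogger("req-123", "frontend", logSink);
const backendLog = createCorrelatedLogger("req-123", "backend", logSink);
const unrelatedLog = createCorrelatedLogger("req-456", "backend", logSink);

frontendLog.info("user clicked Buy number");
unrelatedLog.info("unrelated request");
backendLog.error("number lookup timed out");

const journey = traceJourney(collectedLogs, "req-123");
```

In a real setup the ID would typically travel from the browser to the server in a request header, and the sink would be whatever log pipeline the team already uses.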
And importantly, many observability platforms integrate real user
monitoring (RUM) on the front end
with APM, application performance monitoring, on the backend,
automating this correlation.
OpenTelemetry provides a vendor-neutral way to implement this
context propagation as well.
As you can see, I've added a GET request with Host,
traceparent, and tracestate headers.
This ensures that when a user clicks a button on the front end, we
can follow that specific action as it travels through the entire system.
If we do not have this,
we are basically looking at fragmented pieces of information, and as
discussed earlier, this definitely makes debugging very difficult.
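To make the traceparent header concrete, here is a minimal hand-rolled sketch of the W3C Trace Context format (`version-traceid-parentid-flags`). In practice an OpenTelemetry SDK would normally handle this; the helper names (`randomHexId`, `formatTraceparent`, `parseTraceparent`, `continueTrace`) are assumptions made up for illustration, only version `00` is handled, and the ID generator uses `Math.random`, which is fine for a demo but not for production.

```typescript
interface TraceContext {
  traceId: string; // 32 lowercase hex characters, shared by the whole trace
  parentId: string; // 16 lowercase hex characters, the ID of the calling span
  sampled: boolean; // lowest bit of the trace-flags field
}

// Illustrative only: Math.random is not cryptographically strong.
function randomHexId(length: number): string {
  let out = "";
  while (out.length < length) {
    out += Math.floor(Math.random() * 16).toString(16);
  }
  return out;
}

// Serialize a context into a version-00 traceparent header value.
function formatTraceparent(ctx: TraceContext): string {
  return `00-${ctx.traceId}-${ctx.parentId}-${ctx.sampled ? "01" : "00"}`;
}

// Parse a version-00 traceparent header; returns null if it is malformed.
function parseTraceparent(header: string): TraceContext | null {
  const match = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/.exec(header.trim());
  if (!match) return null;
  const [, traceId, parentId, flags] = match;
  // All-zero IDs are invalid according to the specification.
  if (/^0+$/.test(traceId) || /^0+$/.test(parentId)) return null;
  return { traceId, parentId, sampled: (parseInt(flags, 16) & 1) === 1 };
}

// Backend side: keep the incoming trace ID, but mint a new span ID for our own work.
function continueTrace(incoming: string | undefined): TraceContext {
  const parent = incoming ? parseTraceparent(incoming) : null;
  return {
    traceId: parent ? parent.traceId : randomHexId(32),
    parentId: randomHexId(16),
    sampled: parent ? parent.sampled : true,
  };
}

// Front end side: start a trace for a user action and attach it to the request.
const browserContext: TraceContext = {
  traceId: randomHexId(32),
  parentId: randomHexId(16),
  sampled: true,
};
const outgoingHeaders = { traceparent: formatTraceparent(browserContext) };

// Backend side: the same trace ID continues in the server's own context.
const serverContext = continueTrace(outgoingHeaders.traceparent);
```

The point of the design is that the trace ID never changes along the way, so every span recorded on either side can be stitched into one request journey.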
Now, to effectively monitor our React and Node.js applications,
we need to track specific KPIs, or key performance indicators.
I've added some key performance indicators in this table.
Let's go through them one by one.
I think they are some of the most important ones.
For example, user-perceived latency tells us
how fast the application feels to the user.
Error rates across both front end and backend basically tell
us how stable the application is.
We can also monitor what kinds of errors are being experienced by the users.
Then we have end-to-end request latency, which basically shows us the
total time a given request is taking.
Then we have throughput, which is basically how much
our application can handle.
Then resource utilization helps us identify potential bottlenecks.
We can figure out which part of our resource chain is getting slow or acting
up, and we can address it accordingly.
Then user engagement metrics connect performance to user
behavior, and Core Web Vitals
are crucial for understanding the front end user experience, which
is often influenced by the backend.
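As a small illustration of the Core Web Vitals point, the sketch below rates a measurement against the commonly published thresholds (LCP 2.5 s / 4 s, INP 200 ms / 500 ms, CLS 0.1 / 0.25) and estimates how much of LCP was spent waiting for the server. The function names and the "backend share" calculation are assumptions introduced for this example, not something shown in the talk.

```typescript
type VitalRating = "good" | "needs-improvement" | "poor";
type VitalName = "LCP" | "INP" | "CLS";

// Thresholds: LCP and INP in milliseconds, CLS is unitless.
const VITAL_THRESHOLDS: Record<VitalName, { good: number; poor: number }> = {
  LCP: { good: 2500, poor: 4000 },
  INP: { good: 200, poor: 500 },
  CLS: { good: 0.1, poor: 0.25 },
};

// Classify a single measurement into one of the three rating buckets.
function rateVital(name: VitalName, value: number): VitalRating {
  const threshold = VITAL_THRESHOLDS[name];
  if (value <= threshold.good) return "good";
  if (value <= threshold.poor) return "needs-improvement";
  return "poor";
}

// Rough indicator of backend influence: the fraction of LCP that elapsed
// before the first byte of the response arrived (time to first byte).
function backendShareOfLcp(ttfbMs: number, lcpMs: number): number {
  if (lcpMs <= 0) return 0;
  return Math.min(1, ttfbMs / lcpMs);
}

// Usage: a page with a 3 s LCP where 1.2 s was spent waiting on the server.
const lcpRating = rateVital("LCP", 3000);
const serverShare = backendShareOfLcp(1200, 3000);
```

A high server share on a poorly rated LCP is one hint that the fix belongs on the backend rather than in the React code.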
Tracking these KPIs together, like in a dashboard setup, will give
us a comprehensive understanding of our application's health, both
on the front end and backend.
And the way I see it, these KPIs would basically be
conversation starters for front end and backend teams to discuss performance.
By monitoring these metrics holistically,
we can understand how changes in one part of the stack can affect
the other, and ultimately the user experience and the user engagement.
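Several of the KPIs above (throughput, error rate, end-to-end latency) can be derived from the same raw request samples. The following is a minimal sketch, assuming a hypothetical `RequestSample` record and a nearest-rank percentile; real platforms compute these from streaming data with their own aggregation methods.

```typescript
// Hypothetical record for one completed request, measured end to end.
interface RequestSample {
  durationMs: number;
  ok: boolean;
}

// Nearest-rank percentile: the smallest value with at least p% of samples at or below it.
function latencyPercentile(values: number[], p: number): number {
  if (values.length === 0) return NaN;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

// Summarize one observation window into dashboard-style KPIs.
function summarizeKpis(samples: RequestSample[], windowSeconds: number) {
  const durations = samples.map((sample) => sample.durationMs);
  const failures = samples.filter((sample) => !sample.ok).length;
  return {
    throughputPerSecond: samples.length / windowSeconds,
    errorRate: samples.length === 0 ? 0 : failures / samples.length,
    p50LatencyMs: latencyPercentile(durations, 50),
    p95LatencyMs: latencyPercentile(durations, 95),
  };
}

// Usage: ten requests over a five second window, one of which failed.
const windowSamples: RequestSample[] = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000].map(
  (durationMs) => ({ durationMs, ok: durationMs !== 1000 }),
);
const kpiSummary = summarizeKpis(windowSamples, 5);
```

Looking at p95 next to p50 matters because a healthy median can hide the slow tail that a subset of users actually experiences.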
So as we discussed, distributed tracing is a powerful technique for understanding
how requests flow through the application.
It involves breaking down each request into a series of operations.
Each operation is called a span,
and a series of spans forms a trace.
Now the key here, as I mentioned, is context propagation:
ensuring that each span, whether it's on the front end or the backend, is linked
together using the unique trace ID.
This allows us to visualize the entire journey of a request
and pinpoint exactly where any slowdowns or errors are happening.
OpenTelemetry provides a standard approach to implement this,
and tools like Jaeger and Zipkin allow us to visualize these traces.
Now let's imagine a slow API call. Distributed tracing would allow us to see
if the delay is in the front end making the request, the backend receiving
it, a specific function within the backend, a database query, or
an external service call that is taking a really long time.
This level of detail is invaluable for efficient debugging.
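The slow API call scenario can be sketched with a toy span model. Everything here is an assumption for illustration: the `Span` shape, the span names, and the timings are invented, and "self time" (a span's duration minus its direct children's durations) is just one simple way to locate where time is actually spent; it assumes child spans do not overlap.

```typescript
// Simplified span: one timed operation, linked to its parent within a trace.
interface Span {
  spanId: string;
  parentSpanId: string | null;
  traceId: string;
  name: string;
  service: "react-frontend" | "node-backend" | "database";
  startMs: number;
  endMs: number;
}

// Self time = own duration minus time covered by direct child spans.
function computeSelfTimes(spans: Span[]): Map<string, number> {
  const selfTimes = new Map<string, number>();
  for (const span of spans) {
    selfTimes.set(span.spanId, span.endMs - span.startMs);
  }
  for (const span of spans) {
    const parentId = span.parentSpanId;
    if (parentId !== null && selfTimes.has(parentId)) {
      selfTimes.set(parentId, selfTimes.get(parentId)! - (span.endMs - span.startMs));
    }
  }
  return selfTimes;
}

// The bottleneck is the span with the largest self time.
function findBottleneck(spans: Span[]): Span | undefined {
  const selfTimes = computeSelfTimes(spans);
  let worst: Span | undefined;
  for (const span of spans) {
    if (!worst || selfTimes.get(span.spanId)! > selfTimes.get(worst.spanId)!) {
      worst = span;
    }
  }
  return worst;
}

// Usage: one trace for a button click that turns out to be slow in the database.
const traceId = "4bf92f3577b34da6a3ce929d0e0e4736";
const exampleTrace: Span[] = [
  { spanId: "a", parentSpanId: null, traceId, name: "click: search numbers", service: "react-frontend", startMs: 0, endMs: 900 },
  { spanId: "b", parentSpanId: "a", traceId, name: "HTTP GET /api/numbers", service: "react-frontend", startMs: 50, endMs: 850 },
  { spanId: "c", parentSpanId: "b", traceId, name: "numbers handler", service: "node-backend", startMs: 100, endMs: 800 },
  { spanId: "d", parentSpanId: "c", traceId, name: "SELECT phone_numbers", service: "database", startMs: 150, endMs: 700 },
];
const bottleneck = findBottleneck(exampleTrace);
```

Tools such as Jaeger or Zipkin show the same information visually as a waterfall; the sketch only makes the underlying arithmetic explicit.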
Now let's look at some real world scenarios.
Imagine there was a sudden increase in errors on your React front
end. With unified observability,
you would be able to see that this might, let's say, coincide with increased
latency in a specific backend
API call, immediately pointing to the backend as the source of the problem. Or
perhaps you notice that users are taking a particular path on the front end
that leads to high CPU usage on a backend service.
This could be a specific endpoint or a specific route.
By seeing both sides, you can optimize that specific user flow.
For instance, if a user has a payment failure, a trace might show you the
front end request, the call to your backend, and then a timeout when
your backend is trying to communicate with a third-party payment gateway.
Okay.
Similarly, slow loading of product images on the front end might be traced back
to a slow API response from your Node.js server fetching that image data.
Now, for those using the MERN stack, observability solutions often
provide specific integrations to monitor all of these components
together. These examples highlight how unified observability
transforms debugging from guesswork into a very precise science.
By connecting this user behavior with backend performance, we can quickly
understand the impact of technical issues on the user experience without
losing that critical time during an incident, and obviously saving
a lot on the revenue side as well.
Okay.
Let's look at some of the tooling and products available to us for
implementing this unified observability.
Again, it's a very crowded space right now.
You have platforms like Datadog and New Relic, which offer comprehensive
solutions for both React and Node.js,
allowing for seamless correlation of data.
We have Honeycomb, which focuses on providing a unified way to
analyze all of our telemetry.
Then we have Dynatrace, whose AI
basically helps us detect and analyze issues across the stack as well.
Then you have Elastic Observability and Grafana Cloud providing robust platforms
for unifying and visualizing data.
Observe specifically focuses on connecting the front end user
experience with backend troubleshooting.
We have OpenObserve, which is
an open source alternative.
And then underpinning most of these solutions is OpenTelemetry, which
provides a standardized way to instrument all our applications.
As for us at Twilio, we are considering OpenTelemetry right now.
Again, I think there's no one size fits all.
Choosing the right tool depends on your specific needs:
what the scale is and what your exact requirements
are based on your stack.
Most of these companies offer free trials, so I would also recommend
doing a few POCs and gathering feedback from the team
to figure out which one fits your team's workflow best and
provides the insights that you need.
Awesome.
So let's talk about some best practices now.
To truly leverage unified observability, we need to adopt some best practices.
The first is creating dashboards that show key metrics
and traces from both our front end and Node.js backend in one place.
I think that's essential for a holistic, high-level view.
We should set up alerts that trigger based on correlated data,
for example alerting us only when we see both a spike in front end errors
and an increase in backend latency.
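The correlated alert described here, firing only when both signals move together, could look roughly like the sketch below. The `WindowStats` and `CorrelatedAlertRule` shapes and the multiplier values are assumptions chosen for illustration; a real platform would express this in its own alerting language.

```typescript
// Aggregated signals for one time window, one from each side of the stack.
interface WindowStats {
  frontendErrorRate: number; // fraction of sessions that hit a front end error
  backendP95LatencyMs: number;
}

// How far each signal must rise above its baseline before it counts.
interface CorrelatedAlertRule {
  errorSpikeFactor: number;
  latencyIncreaseFactor: number;
}

// Fire only when the front end error spike AND the backend latency increase coincide.
function shouldFireCorrelatedAlert(
  baseline: WindowStats,
  current: WindowStats,
  rule: CorrelatedAlertRule,
): boolean {
  const errorSpike =
    current.frontendErrorRate > baseline.frontendErrorRate * rule.errorSpikeFactor;
  const latencyIncrease =
    current.backendP95LatencyMs > baseline.backendP95LatencyMs * rule.latencyIncreaseFactor;
  return errorSpike && latencyIncrease;
}

// Usage: errors must triple and p95 latency must grow by half before we page anyone.
const baselineStats: WindowStats = { frontendErrorRate: 0.01, backendP95LatencyMs: 200 };
const alertRule: CorrelatedAlertRule = { errorSpikeFactor: 3, latencyIncreaseFactor: 1.5 };

const bothDegraded = shouldFireCorrelatedAlert(
  baselineStats,
  { frontendErrorRate: 0.05, backendP95LatencyMs: 450 },
  alertRule,
);
const onlyErrorsUp = shouldFireCorrelatedAlert(
  baselineStats,
  { frontendErrorRate: 0.05, backendP95LatencyMs: 210 },
  alertRule,
);
```

Requiring both conditions trades some sensitivity for fewer noisy pages, so single-signal alerts at a lower severity may still be worth keeping alongside it.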
Then comes anomaly detection,
which can help us catch issues we didn't even know to look for.
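As a rough idea of what anomaly detection does under the hood, the sketch below flags a new data point whose z-score against recent history exceeds a threshold. This is a deliberately simple stand-in technique, not how any specific observability product works; the function names and the default threshold of 3 are assumptions for illustration.

```typescript
// Arithmetic mean of a series.
function seriesMean(values: number[]): number {
  return values.reduce((sum, value) => sum + value, 0) / values.length;
}

// Sample standard deviation (divides by n - 1).
function seriesStdDev(values: number[], mean: number): number {
  const squaredDiffs = values.reduce((sum, value) => sum + (value - mean) ** 2, 0);
  return Math.sqrt(squaredDiffs / (values.length - 1));
}

// Flag the latest point if it sits more than zThreshold standard deviations from the mean.
function isAnomalous(history: number[], latest: number, zThreshold = 3): boolean {
  if (history.length < 2) return false; // not enough data to judge
  const mean = seriesMean(history);
  const stdDev = seriesStdDev(history, mean);
  if (stdDev === 0) return latest !== mean; // flat history: any change is unusual
  return Math.abs(latest - mean) / stdDev > zThreshold;
}

// Usage: recent p95 latencies in milliseconds, followed by two new readings.
const recentLatencies = [100, 102, 98, 101, 99, 100, 103, 97];
const normalReading = isAnomalous(recentLatencies, 101);
const spikeReading = isAnomalous(recentLatencies, 180);
```

Because the baseline is learned from the data itself, this kind of check can surface problems nobody thought to write a fixed threshold for.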
Consistent tagging of our telemetry data also makes it easy to filter and
correlate information.
Defining clear observability goals and service level objectives helps us
understand what good looks like and also improves our monitoring setup.
Finally, it is crucial to involve both our front end and backend teams in
defining what we monitor and how we are alerted, ensuring everyone has a shared
understanding of what good looks like and making us a very strong, cohesive unit.
One more thing to remember is that we are not just collecting data;
we need to collect the right data
and then turn that data into actionable insights.
Only then can we improve our application and the user experience.
Regularly review your dashboards and alerts to ensure they're
still relevant and provide value.
To conclude this talk, embracing unified full stack observability
for our React and Node.js applications offers significant advantages.
It allows us to resolve issues more quickly, gain deeper insight
into our application performance,
and proactively address potential problems.
By moving away from siloed monitoring, we can achieve a much more comprehensive
understanding of our systems.
With the help of modern tools and open standards like OpenTelemetry,
this approach is becoming increasingly accessible and essential for
building and maintaining reliable,
high-performing applications that deliver excellent user experiences.
To finish, I would like to say that we have the right tools.
It's just a matter of how we use them
to understand and manage these complex systems efficiently.
Unified full stack observability is not just a trend;
I think it's a fundamental shift towards building more resilient
and user-centric applications.
I want to thank all of you for listening to this talk.
I had a really great time sharing all these insights with you.
If you have more questions, please don't hesitate to reach out to me and
I look forward to connecting with you.
Bye-bye.