Conf42 Incident Management 2025 - Online

- premiere 5PM GMT

Accelerating Incident Response with Distributed Graph Technology: Leveraging Google Spanner Graph for Complex Security Incident Investigation

Abstract

Transform incident response with Google Spanner Graph! Trace complex attacks across distributed systems in real-time, eliminate data silos, and slash investigation times. See revolutionary graph technology turn chaotic incidents into clear attack stories.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hi everyone. My name is Pushap. Today I will be presenting Accelerating Incident Response with Distributed Graph Technology. Okay, let's begin. A quick disclaimer before we go further: the content presented here is based on my personal experience and publicly available information. It has not been officially approved or endorsed by any organization.
Let's discuss the limitations of traditional IR systems. I categorize the problems into four subcategories: data fragmentation, time synchronization issues, relationship blindness, and manual correlation overload. First, data fragmentation. In today's world, we have a lot of systems producing data. It is very hard to connect the data across those disconnected systems, which hinders timely aggregation and correlation. The next is time synchronization. Every system generates events according to its own local clock, so we don't get an accurate forensic timeline when we want to reconstruct everything. Then relationship blindness: traditional models fail to capture the complex relationships between various entities. The last one is manual correlation. I have seen a lot of SRE engineers looking at dashboards, which is a very time-consuming way to figure out what happened. These inherent limitations result in fundamental problems in security systems: extended detection and resolution times, incomplete investigations, and increased operational cost. We are trying to address these issues.
Now, graph technology for incident response. A graph can transform all the entities into a relationship model, which helps us debug the issue better. It organizes the data into nodes, edges, and properties. Nodes are the users, machines, IP addresses, and processes. Edges are relationships such as accessed, created, and communicated with. Properties are contextual information like timestamps, severity, and classifications.
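The node, edge, and property model described above can be illustrated with a minimal in-memory sketch. This is not the Spanner Graph API; the class names (`Node`, `Edge`, `IncidentGraph`) and the sample entities are hypothetical and only show how the three concepts relate.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Node:
    """An entity in the investigation: a user, machine, IP address, or process."""
    id: str
    kind: str


@dataclass
class Edge:
    """A relationship between two nodes, with contextual properties."""
    src: str
    dst: str
    relation: str  # e.g. "accessed", "created", "communicated_with"
    properties: Dict[str, object] = field(default_factory=dict)


class IncidentGraph:
    """Hypothetical in-memory graph, for illustration only."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Node] = {}
        self.edges: List[Edge] = []

    def add_node(self, node_id: str, kind: str) -> None:
        self.nodes[node_id] = Node(node_id, kind)

    def add_edge(self, src: str, dst: str, relation: str, **properties) -> None:
        # Refuse edges that point at entities the graph does not know about.
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges.append(Edge(src, dst, relation, dict(properties)))

    def outgoing(self, node_id: str) -> List[Edge]:
        return [e for e in self.edges if e.src == node_id]


# Made-up sample data: a user accesses a machine, which talks to an IP address.
graph = IncidentGraph()
graph.add_node("alice", "user")
graph.add_node("web-01", "machine")
graph.add_node("10.0.0.7", "ip")
graph.add_edge("alice", "web-01", "accessed",
               timestamp="2025-01-01T10:00:00Z", severity="low")
graph.add_edge("web-01", "10.0.0.7", "communicated_with",
               timestamp="2025-01-01T10:05:00Z", severity="high")
```

In a real deployment the nodes and edges would live in database tables rather than Python lists; the sketch only shows why properties such as timestamp and severity sit naturally on the edges.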
This structure enables relationship-based investigation, which follows the natural progression of a security incident across system boundaries. In a graph database, we can build a unified semantic layer across all siloed data sources, enabling comprehensive analysis. For example, we can put all the data from network, application, endpoint, cloud, and identity sources into one place and query across all of it.
For the problems we have discussed, Spanner Graph is a natural fit. There are a lot of SaaS applications available in the market, but with Spanner Graph we can use the same graph for day-to-day OLTP workloads and for analysis workloads, and build a huge graph. Spanner Graph is a globally distributed graph database that extends Spanner's capabilities for precise, reliable analysis of massive distributed datasets. For incident response, Spanner Graph offers four things. Global consistency: it ensures all the data is consistent and immediately available across all regions. TrueTime: it enables precise, globally synchronized timelines and eliminates timestamp discrepancies for accurate forensic reconstruction. Relationship model: we can build a massive graph by connecting many nodes with edges. Scalability and reliability: Spanner inherently has horizontal scalability for petabytes of data, with automatic sharding and replication for high availability. By combining these attributes, Spanner Graph transforms fragmented data into a cohesive, globally consistent, relationship-based database, which can accelerate investigations and strengthen security posture.
We have already mentioned two specific things that are especially interesting: the first is TrueTime, and the other is the relationship model. TrueTime is a fundamental construct in Spanner that addresses these timing problems. We get globally consistent timestamps across different systems. We get bounded uncertainty, so there is a maximum skew that can occur between events. And we get causal consistency: it guarantees that events are ordered across all regions, preserving the true sequence of actions.
Next, the relationship model. As we discussed earlier, a relationship model is defined by nodes, edges, and properties. What does this get us? It helps incident responders identify the things we usually look for first. It identifies the blast radius by traversing connections from the breach point to the affected assets. It can trace lateral movement patterns across diverse systems, revealing the attacker's path in detail. We can also discover hidden relationships across seemingly unrelated events, connecting the dots that traditional tools miss. And we can build comprehensive attack timelines with perfect sequencing, thanks to temporal consistency and causal mapping. This holistic view empowers incident responders to move beyond fragmented, alert-driven systems, leading to faster incident response and minimizing business impact.
I will briefly touch on the query capabilities that Spanner Graph has. We can do path analysis to find all the possible connections between compromised assets. We can do pattern matching. We can use centrality measures to find critical assets and identify choke points. We can do community detection, grouping related events to identify the scope of a campaign. These kinds of scenarios can be covered by Spanner Graph capabilities.
A very brief slide on implementation, for anyone who wants to onboard and needs to know what to do. I categorize it into five steps. The first is collection: we need to collect the data from all the sources. Then we need to normalize it. After that, we need to fit it into the graph model. Once it fits the graph model, Spanner storage is a natural fit for it.
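The normalization and graph-mapping steps just described could look roughly like the sketch below. It assumes two made-up log formats ("firewall" and "auth") with hypothetical field names; it converts each record into a common edge shape with a UTC timestamp. It does not reproduce TrueTime, it only puts source timestamps on a common scale before loading.

```python
from datetime import datetime, timezone


def normalize_firewall(record):
    """Hypothetical firewall record: epoch seconds plus source/destination IPs."""
    return {
        "src": record["src_ip"],
        "dst": record["dst_ip"],
        "relation": "communicated_with",
        "timestamp": datetime.fromtimestamp(record["epoch"], tz=timezone.utc),
    }


def normalize_auth(record):
    """Hypothetical auth record: ISO 8601 local time with a UTC offset."""
    return {
        "src": record["user"],
        "dst": record["host"],
        "relation": "accessed",
        "timestamp": datetime.fromisoformat(record["time"]).astimezone(timezone.utc),
    }


# One normalizer per data source (assumed source names).
NORMALIZERS = {"firewall": normalize_firewall, "auth": normalize_auth}


def to_edges(raw_events):
    """Map (source, record) pairs to graph edges, ordered by UTC timestamp."""
    edges = [NORMALIZERS[source](record) for source, record in raw_events]
    return sorted(edges, key=lambda edge: edge["timestamp"])


raw_events = [
    ("firewall", {"src_ip": "10.0.0.5", "dst_ip": "10.0.0.9", "epoch": 1735725900}),
    ("auth", {"user": "alice", "host": "10.0.0.5", "time": "2025-01-01T12:00:00+02:00"}),
]
edges = to_edges(raw_events)
for edge in edges:
    print(edge["timestamp"].isoformat(), edge["src"], edge["relation"], edge["dst"])
```

Once every source produces the same edge shape, loading the result into graph storage becomes a single, uniform write path.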
After that, we can investigate using the Spanner query model.
I have listed three technical challenges for anyone who wants to design such an application. The first is schema design: we need to see how to balance entities and relationships to get good performance. The second is the ingestion pipeline: we need to decide whether we are doing real-time or batch processing, how we handle out-of-order events, and how we enrich the data. The third is query performance: we need to ensure all the queries respond within the threshold. For that we might need some query tuning, either better planning or some caching, or we need to balance response time against the amount of data we are fetching.
Key takeaways: Spanner Graph transforms IR by providing a few capabilities that are unique to the system. First, it provides a unified, time-synchronized view across all security events, so we can trace attack progression through complex relationships. It provides precise time coordination that eliminates timestamp discrepancies. And it helps us build a relationship model that reveals the full scope of a compromise. By implementing this approach, organizations can achieve faster investigations. Thank you.
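The blast-radius traversal mentioned in the talk can be sketched as a breadth-first search from the breach point. This is a plain-Python illustration under assumed data, not a Spanner Graph query; the function name `blast_radius`, the `max_hops` parameter, and the asset names are hypothetical.

```python
from collections import deque


def blast_radius(edges, breach_point, max_hops=3):
    """Return {asset: hop_count} for assets reachable from breach_point.

    edges is an iterable of directed (src, dst) pairs; traversal stops
    expanding once an asset is max_hops away from the breach point.
    """
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)

    hops = {breach_point: 0}
    queue = deque([breach_point])
    while queue:
        current = queue.popleft()
        if hops[current] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in adjacency.get(current, []):
            if neighbor not in hops:  # first visit is the shortest path in BFS
                hops[neighbor] = hops[current] + 1
                queue.append(neighbor)
    return hops


# Made-up access relationships between assets.
edges = [
    ("phished-laptop", "file-server"),
    ("file-server", "db-01"),
    ("db-01", "backup-01"),
    ("hr-laptop", "file-server"),
]
reach = blast_radius(edges, "phished-laptop", max_hops=2)
print(reach)
```

With `max_hops=2`, the search reaches `file-server` and `db-01` but stops before `backup-01`, and `hr-laptop` is excluded because the edges are directed away from it.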

Pushap Goyal

Software Engineer @ Google



