Conf42 Rustlang 2025 - Online

- premiere 5PM GMT

Building Mission-Critical Emergency Response Systems in Rust: Memory Safety Meets Life-Critical Performance


Abstract

When 911 systems crash, people die. See how Rust’s borrow checker becomes a literal lifesaver in emergency response systems processing 100K msgs/sec. Live demos of compile-time safety preventing disasters + real performance gains where milliseconds matter.


Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hello everyone, welcome. This is Lakshmi. In this conference talk, I want to discuss building mission-critical emergency response systems in Rust.

I'm a highly experienced cloud DevOps lead with over 13 years in the IT industry, currently serving at Conduent. I specialize in cloud-native architectures, container orchestration, and infrastructure automation. I have extensive experience with AWS (Amazon Web Services) and Azure cloud platforms, Kubernetes, Docker, and continuous integration / continuous delivery / continuous deployment implementations using tools like Jenkins, plus configuration management through Ansible and Terraform. I hold a master's degree in computer science from the University of Houston-Clear Lake, and I combine strong technical capabilities with leadership skills, having led cloud migration projects and DevOps transformations at T-Mobile, AudioPro, and Conduent. This experience spans both traditional middleware administration and modern cloud-native technologies, making me a cloud technology professional who consistently delivers resilient, scalable infrastructure solutions.

So: building mission-critical emergency response systems in Rust. In an emergency, whether it is a natural disaster, a chemical spill, or a mass-casualty event, systems can't afford to fail. Every millisecond matters. Today I'll show how Rust can be a game changer for building life-critical applications where performance and safety are non-negotiable.

Imagine a major earthquake hits a city. Within minutes, emergency call volumes spike 800%. Systems must process over a hundred thousand messages a second without missing a beat. These systems don't get a second chance to respond. The operational environment is extreme: we aim for 99.99% uptime even when infrastructure is collapsing around us. These are not just performance requirements; they are survival requirements.

Now, the critical challenges of life-saving systems: traditional tech stacks fail here.
The average response delay in some legacy setups is 17 seconds, far too long. 84% of critical failures trace back to memory safety issues in C or C++ codebases. Java brings garbage collection pauses, sometimes with delays up to 35 seconds, right when you need instant responses. These are problems we can't accept in public safety.

Why Rust for emergency response systems? Rust gives us memory safety without a runtime cost, thanks to ownership and borrowing. We get zero-cost abstractions, fearless concurrency, and deterministic cleanup without GC pauses. In fact, our Rust implementation delivered 42% faster response times and 35% lower memory usage compared to the equivalent Java systems. Rust eliminates critical weaknesses while keeping speed: zero-cost abstractions maintain readability and scalability without runtime penalties, and fearless concurrency means data races are eliminated at compile time.

Rust's ownership model makes invalid states unrepresentable. The ownership rules sound simple, but they're powerful: one owner per value, and references are either exclusive or shared, never invalid. This means there are no leaks, no race conditions, and no null dereferences, even at high concurrency. For emergency systems, this means we remove entire classes of bugs before they even exist: invalid states simply can't happen, even at extreme concurrency. I want to offer an analogy here: it's like strict airport security. Nothing unsafe gets through.

Next, a case study: our Rust-based emergency dispatch system, a real system with real results. The dispatch system uses async/await for non-blocking I/O (input/output), Tokio for task scheduling, Kafka for reliable event streaming, and WebAssembly for secure edge computing. We process real-time reports with machine learning models.
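A minimal sketch (not from the talk's codebase) of the ownership and borrowing rules just described: one owner per value, and either shared or exclusive references, never both at once. The `annotate` function and its incident string are invented for illustration.

```rust
// Toy example of Rust's ownership rules: `annotate` takes ownership of
// the incident string, mutates it through an exclusive borrow, and
// returns it to the caller. No aliasing, no data race, no GC needed.
fn annotate(mut incident: String) -> String {
    // Shared (read-only) borrows may coexist...
    let a = &incident;
    let b = &incident;
    println!("dispatch sees: {a} / {b}");

    // ...but an exclusive borrow is only allowed once they are no longer used.
    let m = &mut incident;
    m.push_str(" -- units en route");

    // Using `a` again down here alongside `m` would be rejected at
    // compile time: whole classes of concurrency bugs never build.
    incident
}

fn main() {
    let report = annotate(String::from("Structure fire, 5th & Main"));
    println!("{report}");
}
```

The commented-out failure mode is the point: the compiler, not a 3 a.m. pager, catches the aliasing bug.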
Our models run at 93.7% accuracy, and Rust's guarantees mean zero catastrophic failures in production.

I want to discuss the architecture of the system in detail. The first component is async/await for non-blocking input/output. Compare this with the traditional blocking I/O model: a thread makes a network or file request, then the operating system (OS) puts the thread to sleep until the data arrives. The thread can't do anything else while it is waiting. If you have thousands of requests waiting on input/output (I/O), you need thousands of threads, which is memory-heavy and slow due to context switching.

The asynchronous approach is totally different. A function can pause when it's waiting for I/O without blocking the thread. While the function is waiting, the thread can run other ready tasks; when the I/O completes, the runtime, Tokio, resumes the function exactly where it left off.

I want to offer an analogy here. Think of a chef in a kitchen. With blocking I/O, the chef starts boiling pasta and stands there staring at the pot until it is done. With asynchronous I/O, the chef starts boiling pasta, sets a timer, and moves on to other prep while waiting. The moment the timer rings, they return to finish the pasta.

How it works: an asynchronous function returns a future, a value representing work that will complete later. If the future is not done yet, it says, in effect, "give control back to the runtime and resume me when it's ready." The Tokio runtime polls these futures and schedules them efficiently.

Coming to the next component: Tokio is the most widely used asynchronous runtime for Rust. It's a combination of a task scheduler, an I/O event loop, a timer system, and asynchronous utilities. The task scheduler decides when asynchronous tasks run.
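The poll-and-resume hand-off described above can be made concrete with a toy future and a deliberately minimal executor. This is a std-only illustration of the mechanism, not Tokio itself; `SimulatedIo` and `block_on` are invented for this sketch, and a real runtime would sleep on I/O events instead of busy-polling.

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// A future that reports Pending on its first poll ("I'm waiting on I/O,
// resume me later") and Ready on the second, mimicking the hand-off
// described above.
struct SimulatedIo {
    polled_once: bool,
}

impl Future for SimulatedIo {
    type Output = &'static str;
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.polled_once {
            Poll::Ready("caller geolocated")
        } else {
            self.polled_once = true;
            Poll::Pending // give control back to the runtime
        }
    }
}

// A minimal "runtime": poll the future in a loop until it completes.
fn block_on<F: Future>(mut fut: F) -> F::Output {
    // A no-op waker is enough here because we re-poll immediately.
    fn noop_raw_waker() -> RawWaker {
        fn clone(_: *const ()) -> RawWaker { noop_raw_waker() }
        fn noop(_: *const ()) {}
        RawWaker::new(std::ptr::null(), &RawWakerVTable::new(clone, noop, noop, noop))
    }
    let waker = unsafe { Waker::from_raw(noop_raw_waker()) };
    let mut cx = Context::from_waker(&waker);
    // Safe here: `fut` is a local that is never moved after pinning.
    let mut fut = unsafe { Pin::new_unchecked(&mut fut) };
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn main() {
    let result = block_on(SimulatedIo { polled_once: false });
    println!("{result}");
}
```

In production you would write `async fn` bodies and let Tokio do the polling; the loop above only shows what "resumes the function exactly where it left off" means mechanically.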
The I/O event loop reacts to network, file, and timer events without blocking, and the timer system handles delays, timeouts, and intervals.

Coming to Kafka for event streaming, the next component in this architecture. Apache Kafka is a distributed, fault-tolerant event streaming platform. It is used to send, store, and process streams of data at scale. It works like a durable message bus: producers send events, consumers read them, and Kafka keeps them in topics for a set retention period.

The core concepts: a topic is a named channel for events. A partition is a split of a topic for scalability; events within a partition are strictly ordered. The producer application writes events to Kafka, for example an API service publishing user actions. The consumer reads events from Kafka. The broker is the Kafka server that stores topic partitions; multiple brokers form a Kafka cluster.

Why use Kafka for streaming? Having covered the core concepts, topics, partitions, producers, consumers, let's discuss the main reasons. High throughput: Kafka can handle millions of events at the same time. Durability: events are persisted to disk and replicated across brokers. Scalability: partitions are processed in parallel. Real time: consumers can read data as it arrives, enabling low-latency pipelines. And decoupling: producers and consumers don't need to know about each other; they just read from or write to topics.

How it works in our architecture: producer services receive events from clients, then serialize the data and publish it to Kafka topics. Kafka brokers store these events durably and replicate them for fault tolerance. Consumer services subscribe to these topics. Some do enrichment or transformation; others update materialized views in a database; some forward events to WebAssembly edge modules for policy-based routing.
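To make the topic/partition/offset mechanics concrete, here is a toy in-memory model, an illustration of the concepts only, not the real Kafka client or its API. The `Broker` type and its simple hash-based partitioner are invented for this sketch; real producers use a more robust key hash, but the effect is the same: events sharing a key land in the same partition and stay ordered.

```rust
use std::collections::HashMap;

// Toy in-memory model of Kafka's core concepts: a topic is a set of
// partitions, each partition is an ordered log, and an offset is a
// position in that log.
struct Topic {
    partitions: Vec<Vec<String>>, // each inner Vec is one ordered log
}

struct Broker {
    topics: HashMap<String, Topic>,
}

impl Broker {
    fn new() -> Self {
        Broker { topics: HashMap::new() }
    }

    fn create_topic(&mut self, name: &str, partitions: usize) {
        self.topics.insert(
            name.to_string(),
            Topic { partitions: vec![Vec::new(); partitions] },
        );
    }

    // Producer side: append the event to a partition chosen from the key,
    // so events with the same key stay strictly ordered.
    fn produce(&mut self, topic: &str, key: &str, event: &str) -> (usize, usize) {
        let t = self.topics.get_mut(topic).expect("unknown topic");
        let p = key.bytes().map(|b| b as usize).sum::<usize>() % t.partitions.len();
        t.partitions[p].push(event.to_string());
        (p, t.partitions[p].len() - 1) // (partition, offset)
    }

    // Consumer side: read a partition from a given offset. A crashed
    // consumer resumes by replaying from its last committed offset.
    fn consume(&self, topic: &str, partition: usize, from_offset: usize) -> &[String] {
        &self.topics[topic].partitions[partition][from_offset..]
    }
}
```

The `consume(.., from_offset)` signature is the decoupling point: the broker keeps the log, and each consumer tracks only its own position in it.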
If a consumer fails, another in the same consumer group takes over, continuing from the last committed offset. The benefits in this context: loose coupling, since the API layer and analytics don't depend on each other's uptime; replayability, since we can reprocess historical events by replaying from an earlier offset; scalability, since we can add more partitions and consumers to increase throughput; and fault tolerance, since data is replicated and consumers can resume after crashes.

Next component: WebAssembly for secure edge computing. WebAssembly is a binary instruction format that runs in a sandboxed environment. It was built for browsers but is now widely used outside them. You can write code in languages like Rust, C, C++, and Go and compile it to WebAssembly. It is designed to be small, fast, and secure, ideal for edge deployments.

Why use it at the edge? Edge means running code on servers that are geographically close to the end user. The benefits of running WebAssembly there: low latency, since data can be processed locally instead of traveling to a central server; security, since WebAssembly's strong sandbox model means it can't access the host OS directly, only the APIs it is explicitly allowed; portability, since the same WebAssembly module can run on different edge providers without rewriting; and fast cold starts, since modules load almost instantly compared to cold-starting VMs.

Encoding domain safety in Rust's type system: with Rust, we can encode domain rules directly into types. A PatientId isn't just an integer; it's a type that can only be validly constructed. This valid-by-construction approach moves error detection from unpredictable runtime failures to predictable compile-time checks. This is where the business rules get baked into the compiler.
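The valid-by-construction idea above can be sketched with a newtype. `PatientId`, its zero-is-invalid rule, and `dispatch_record` are invented for this illustration; the real system's domain rules would differ, but the shape is the same: the only way to obtain the type is through a constructor that enforces the rule.

```rust
// Valid-by-construction newtype: an invalid PatientId cannot exist
// anywhere downstream, because the sole constructor rejects it.
#[derive(Debug, PartialEq)]
struct PatientId(u64);

#[derive(Debug, PartialEq)]
enum IdError {
    Zero,
}

impl PatientId {
    // The only constructor: zero is not a legal patient ID in this toy rule.
    fn new(raw: u64) -> Result<PatientId, IdError> {
        if raw == 0 {
            Err(IdError::Zero)
        } else {
            Ok(PatientId(raw))
        }
    }
}

// Any function taking a PatientId can rely on validity: the check
// happened once, at the boundary, and the type system enforces it.
fn dispatch_record(id: &PatientId) -> String {
    format!("record for patient {}", id.0)
}
```

Passing a bare `u64` to `dispatch_record` is a compile error, which is exactly the "business rules baked into the compiler" point from the talk.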
Next, zero-cost abstractions for IoT sensor integration. From environmental monitoring to traffic surveillance to building integrity checks, Rust's trait system lets us integrate varied IoT hardware without runtime penalties, which is vital when processing live data under load. Real-world examples include environmental monitoring for hazards, traffic surveillance for accidents, and structural integrity monitoring for buildings. Traits allow hardware diversity without a runtime penalty, crucial in real-time data environments. The core message of zero-cost abstractions for IoT sensor integration: different sensors, same reliability.

WebAssembly support for edge computing: with WebAssembly, we detect hazards 70 minutes faster than centralized systems, we keep running during network outages, we protect privacy by processing data locally, and we cut bandwidth during crisis surges. In a disaster, edge means survival. The benefits continue: working offline, keeping sensor data local, and reducing surge bandwidth. The core message of WebAssembly for edge computing is resilience when central infrastructure fails.

Coming to performance benchmarks, we compared Rust versus traditional systems. In this graph you can see response times, memory utilization, and worst-case pauses, that is, GC (garbage collection) pauses. The benchmarks show Rust delivers near-C++ speed but without its memory risks, and it outperforms Java while avoiding garbage collection pauses. For us, this means no compromise between speed and safety. The core message: Rust breaks the performance-versus-safety tradeoff.

Async/await for non-blocking emergency processing: here is a real emergency call flow. In Rust, we geolocate the caller, pull their info, and query historical incidents, all concurrently; then a machine learning model recommends resources and dispatch is triggered. Rust's async/await gives us this concurrency without callback hell.
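The trait-based sensor integration above can be sketched as follows. The `Sensor` trait, the two device types, and their thresholds are invented for this illustration; the point is the dispatch mechanism, not the hardware details.

```rust
// Zero-cost abstraction over diverse sensor hardware: the generic
// `check` function is monomorphized per sensor type at compile time,
// so the abstraction has no runtime cost (no vtable, no boxing).
trait Sensor {
    fn read(&self) -> f64;
    fn hazard_threshold(&self) -> f64;
}

struct AirQualitySensor {
    ppm: f64,
}

struct StrainGauge {
    microstrain: f64,
}

impl Sensor for AirQualitySensor {
    fn read(&self) -> f64 { self.ppm }
    fn hazard_threshold(&self) -> f64 { 50.0 }
}

impl Sensor for StrainGauge {
    fn read(&self) -> f64 { self.microstrain }
    fn hazard_threshold(&self) -> f64 { 2000.0 }
}

// Static dispatch: the compiler emits one specialized copy of this
// function per concrete sensor type, as if it were hand-written.
fn check<S: Sensor>(s: &S) -> bool {
    s.read() > s.hazard_threshold()
}
```

Adding a new sensor type means one `impl Sensor` block; every generic pipeline stage then works with it unchanged, and still without runtime penalty.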
Next, patterns for fault-tolerant distributed systems. In Rust, implementing circuit breakers that automatically isolate failing components prevents cascading failures that might otherwise disable the entire system, while fallback mechanisms ensure that critical operations continue even when optimal performance is impossible. These approaches have proven essential for maintaining emergency services functionality during infrastructure disruptions, whereas traditional systems often experience complete outages when key components fail.

Lessons from deploying Rust in public safety: in 18 months of operation, we had zero memory-related crashes, strong performance, easier compliance, and cost savings through reduced infrastructure costs. The challenges are the developer learning curve, some missing legacy integrations, and longer compile times, all solvable with training and ecosystem growth. The core message: the successes outweigh the challenges.

The key takeaways: Rust's ownership model prevents catastrophic bugs; it delivers C-level performance with safety; type-driven design eliminates entire error classes; and WebAssembly enables resilient edge computing. Rust is the right choice for mission-critical systems. Thank you.
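The circuit-breaker pattern described in the talk can be sketched minimally as follows. This is a toy, not the production implementation: the `CircuitBreaker` type, its consecutive-failure threshold, and the fallback-on-open behavior are invented for illustration, and a real breaker would also include a half-open state with a retry timeout.

```rust
// Minimal circuit breaker: after `max_failures` consecutive errors the
// breaker opens, and further calls are short-circuited to a fallback
// instead of hitting the failing component (preventing cascades).
enum BreakerState {
    Closed,
    Open,
}

struct CircuitBreaker {
    state: BreakerState,
    failures: u32,
    max_failures: u32,
}

impl CircuitBreaker {
    fn new(max_failures: u32) -> Self {
        CircuitBreaker { state: BreakerState::Closed, failures: 0, max_failures }
    }

    // Run `op`; on repeated failure, trip open and serve `fallback`.
    fn call<T>(&mut self, op: impl Fn() -> Result<T, ()>, fallback: T) -> T {
        if let BreakerState::Open = self.state {
            return fallback; // short-circuit: don't touch the failing component
        }
        match op() {
            Ok(v) => {
                self.failures = 0; // success resets the count
                v
            }
            Err(()) => {
                self.failures += 1;
                if self.failures >= self.max_failures {
                    self.state = BreakerState::Open; // isolate the component
                }
                fallback
            }
        }
    }
}
```

The fallback path is what keeps critical operations degraded-but-alive while the failing dependency is isolated.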

Lakshmi Vara Prasad Adusumilli

Senior Devops Lead/Cloud Lead @ Conduent



