Abstract
Learn how real-time data powers fraud detection, personalization, and instant insights! This talk dives into the tools, architectures, and battle-tested strategies behind high-speed, scalable systems. If you care about performance, this is where speed meets engineering mastery.
Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hello everyone.
This is LaMi Avi, and my topic today is real-time data systems: engineering for speed, scale, and resilience.
So let's talk about real-time in numbers: roughly 180 zettabytes is the expected global data volume by 2025, around 30% of it will demand real-time handling, and the latency target is less than one second.
Modern applications demand instant responsiveness, from fraud detection to personalized recommendations. Real-time data processing has become the backbone of mission-critical JavaScript experiences.
Where does real-time matter most? FinTech fraud detection: millisecond-level transaction analysis protecting billions in assets. Web personalization: instant content adaptation based on user behavior and preferences. IoT sensor networks: sub-second response to critical environmental or equipment changes.
Architecture overview: the real-time pipeline. Ingestion captures data at the source into event streams. Processing transforms and enriches data in flight. Storage provides fast query engines for instant access. Delivery brings results to applications and end users.
Event stream ingestion is about getting data in fast. Apache Kafka is a distributed log for high-throughput scenarios, with strong ordering guarantees per partition and a rich ecosystem of connectors. AWS Kinesis is a managed service with auto-scaling, seamless AWS integration, lower operational overhead, and built-in data retention and replay.
Choosing the right ingestion layer depends on throughput requirements, operational complexity, and existing infrastructure. Both solutions handle millions of events per second when configured correctly.
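As a concrete sketch of the ingestion step, here is a minimal Kafka producer using the kafkajs client. The broker address, topic name, and transaction shape are assumptions for illustration, not something from the talk.

```ts
import { Kafka } from "kafkajs";

// Broker address and topic name are placeholders for this sketch.
const kafka = new Kafka({ clientId: "payments-service", brokers: ["localhost:9092"] });
const producer = kafka.producer();

export async function start() {
  await producer.connect();
}

export async function publishTransaction(tx: { id: string; userId: string; amount: number }) {
  // Keying by userId keeps each user's events in a single partition,
  // preserving Kafka's per-partition ordering guarantee mentioned above.
  await producer.send({
    topic: "transactions",
    messages: [{ key: tx.userId, value: JSON.stringify(tx) }],
  });
}
```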
Stream processing: transformation in motion. Apache Flink is a true streaming engine with exactly-once semantics, stateful operations, and event-time processing for complex event patterns. Spark Structured Streaming offers micro-batch processing with a unified batch and streaming API, strong SQL support, and an excellent fit for existing Spark ecosystems.
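To make the event-time idea concrete, here is a toy tumbling-window aggregation in TypeScript. It only sketches the concept; in practice Flink or Spark manage the state, watermarks, checkpoints, and exactly-once guarantees for you, and the event shape below is invented for the example.

```ts
// Toy illustration of event-time tumbling windows (what Flink/Spark provide
// with state, watermarks, and exactly-once guarantees handled for you).
interface Event { userId: string; amountCents: number; eventTimeMs: number }

const WINDOW_MS = 10_000; // 10-second tumbling windows

// Aggregates a count and sum per (windowStart, userId) bucket.
function aggregateByWindow(events: Event[]): Map<string, { count: number; sumCents: number }> {
  const buckets = new Map<string, { count: number; sumCents: number }>();
  for (const e of events) {
    // Assign each event to the window containing its *event time*,
    // not the time we happened to process it.
    const windowStart = Math.floor(e.eventTimeMs / WINDOW_MS) * WINDOW_MS;
    const key = `${windowStart}:${e.userId}`;
    const agg = buckets.get(key) ?? { count: 0, sumCents: 0 };
    agg.count += 1;
    agg.sumCents += e.amountCents;
    buckets.set(key, agg);
  }
  return buckets;
}
```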
Fraud detection in milliseconds is our case study. Step one is event capture: transaction data is ingested via Kafka topics with sub-second, roughly 10 millisecond, latency. Step two is real-time enrichment: each event is joined in flight with user profiles, historical patterns, and risk scores. Step three is ML model scoring: lightweight models evaluate risk within 50 milliseconds using feature vectors. Step four is action and alert: high-risk transactions are blocked instantly and alerts are sent to security teams. End-to-end latency runs about 80 to 120 milliseconds from transaction to decision.
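A compressed sketch of steps two through four in TypeScript follows. The field names, the in-memory profile lookup, and the rule-based scorer are stand-ins invented for the example; a production system would read from a feature store and call a real ML model.

```ts
// Simplified sketch of the enrichment, scoring, and action steps; names and thresholds are made up.
interface Transaction { id: string; userId: string; amountCents: number; country: string }
type Profile = { avgAmountCents: number; homeCountry: string };
interface Enriched extends Transaction { avgAmountCents: number; homeCountry: string }

// Step 2: enrich in flight (in practice, a lookup against a cache or feature store).
async function enrich(tx: Transaction, profiles: Map<string, Profile>): Promise<Enriched> {
  const p = profiles.get(tx.userId) ?? { avgAmountCents: 0, homeCountry: tx.country };
  return { ...tx, ...p };
}

// Step 3: lightweight scoring (a stand-in for a real ML model).
function score(e: Enriched): number {
  let risk = 0;
  if (e.amountCents > 5 * e.avgAmountCents) risk += 0.5; // unusually large amount
  if (e.country !== e.homeCountry) risk += 0.4;          // unfamiliar location
  return risk;
}

// Step 4: act on the score within the latency budget.
async function handle(tx: Transaction, profiles: Map<string, Profile>) {
  const enriched = await enrich(tx, profiles);
  const risk = score(enriched);
  if (risk >= 0.8) {
    // Hypothetical downstream calls: block the transaction and notify security.
    // await blockTransaction(tx.id); await alertSecurityTeam(tx, risk);
  }
  return { transactionId: tx.id, risk, blocked: risk >= 0.8 };
}
```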
Fast query engines are the final mile. Apache Druid is optimized for time-series and event data, with columnar storage and sub-second aggregations across billions of events. It is perfect for real-time analytical dashboards: real-time ingestion alongside historical data, approximate algorithms for speed, and horizontal scalability built in. And ClickHouse is an OLAP database designed for analytical queries at scale, with exceptional compression ratios and query performance. It is ideal for high-cardinality data: vectorized query execution, native replication and sharding, and a SQL interface for developers.
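For the ClickHouse side, a query from Node.js might look like the sketch below, using the @clickhouse/client package. The connection URL, table, and column names are assumptions for the example.

```ts
import { createClient } from "@clickhouse/client";

// Connection details and the events table schema are assumed for this sketch.
const clickhouse = createClient({ url: "http://localhost:8123" });

// Per-minute event counts for a single user over the last hour:
// the kind of aggregation these engines answer in milliseconds.
export async function eventsPerMinute(userId: string) {
  const result = await clickhouse.query({
    query: `
      SELECT toStartOfMinute(event_time) AS minute, count() AS events
      FROM events
      WHERE user_id = {userId:String} AND event_time >= now() - INTERVAL 1 HOUR
      GROUP BY minute
      ORDER BY minute
    `,
    query_params: { userId },
    format: "JSONEachRow",
  });
  return result.json();
}
```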
Engineering for low latency: minimize network hops, optimize serialization, batch smart (not big), and cache strategically. When we say minimize network hops, that means colocating processing and storage when possible; each network round trip adds roughly one to five milliseconds, so design for data locality and reduce cross-data-center calls. Optimize serialization: choose efficient formats like Avro or Protobuf over JSON, since serialization can consume 20 to 30% of processing time in high-throughput systems.
Batch smart, not big: small micro-batches balance throughput and latency, and the sweet spot is often around 100 to 1,000 events or 10 to 50 millisecond windows, depending on the use case. Cache strategically: use in-memory caches for hot data and lookup tables; Redis or Memcached can serve reads in microseconds, not milliseconds.
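Here's what caching strategically can look like in practice: a cache-aside sketch with the ioredis client. The Redis address, key prefix, TTL, and the loadProfileFromDb callback are assumptions for illustration.

```ts
import Redis from "ioredis";

// Cache-aside sketch: serve hot lookups from Redis and fall back to the
// slower source of truth only on a miss. Address and key layout are assumed.
const redis = new Redis("redis://localhost:6379");

export async function getUserProfile(
  userId: string,
  loadProfileFromDb: (id: string) => Promise<object>
) {
  const key = `profile:${userId}`;
  const cached = await redis.get(key);
  if (cached) return JSON.parse(cached); // fast path: in-memory read

  const profile = await loadProfileFromDb(userId); // slow path: database or service call
  await redis.set(key, JSON.stringify(profile), "EX", 60); // expire after 60s to bound staleness
  return profile;
}
```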
Scaling horizontally: going beyond a single machine. Partition your data: distribute load across multiple nodes using consistent hashing or range partitioning (there is a small consistent-hashing sketch after this section). Go stateless when possible: stateless services scale effortlessly, and when state is required, use external stores like Redis or the distributed state management built into your stream processor.
Autoscaling strategies: monitor queue depth and processing lag, scale out when lag increases, and scale in during quiet periods to optimize costs without sacrificing performance.
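As promised, here is a minimal consistent-hashing partitioner in TypeScript. The node names and virtual-node count are arbitrary; real deployments typically lean on Kafka's partitioner or the data store's own sharding rather than a hand-rolled ring.

```ts
import { createHash } from "node:crypto";

// Minimal consistent-hash ring with virtual nodes. Adding or removing a node
// only remaps the keys that fall on that node's ring segments.
class HashRing {
  private ring: { point: number; node: string }[] = [];

  constructor(nodes: string[], private vnodes = 64) {
    for (const node of nodes) this.add(node);
  }

  private hash(value: string): number {
    return createHash("md5").update(value).digest().readUInt32BE(0);
  }

  add(node: string) {
    for (let i = 0; i < this.vnodes; i++) {
      this.ring.push({ point: this.hash(`${node}#${i}`), node });
    }
    this.ring.sort((a, b) => a.point - b.point);
  }

  // Route a key to the first ring point clockwise from its hash.
  nodeFor(key: string): string {
    const h = this.hash(key);
    const entry = this.ring.find((e) => e.point >= h) ?? this.ring[0];
    return entry.node;
  }
}

// Usage: route each userId to one of three processing nodes.
const ring = new HashRing(["node-a", "node-b", "node-c"]);
console.log(ring.nodeFor("user-42"));
```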
Consistency versus availability: the trade-off. Strong consistency: all readers see the same data immediately, at the cost of higher latency and reduced availability; use it for financial transactions and critical operations. Eventual consistency: data propagates over time, with lower latency and higher availability; acceptable for analytics, recommendations, and non-critical features.
Tunable consistency: adjust per operation, using quorum reads and writes to balance requirements dynamically based on the importance of each request. There is no one-size-fits-all answer. Choose based on the business requirements, not technical preference. Many systems use different consistency levels for different data types.
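The quorum idea mentioned above boils down to a small rule of thumb, sketched here with illustrative replica counts: with N replicas, a read of R and a write of W are guaranteed to overlap whenever R + W > N.

```ts
// Quorum rule of thumb: with N replicas, reads of R and writes of W overlap
// (and so observe the latest write) whenever R + W > N. Values below are examples.
function isStronglyConsistent(n: number, r: number, w: number): boolean {
  return r + w > n;
}

console.log(isStronglyConsistent(3, 2, 2)); // true: quorum reads and writes overlap
console.log(isStronglyConsistent(3, 1, 1)); // false: fast, but only eventually consistent
```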
Fault tolerance means building for failure. We have checkpointing, replication, retry logic, and circuit breakers, which prevent cascading failures by failing fast, monitoring downstream services, and halting requests when systems are degraded.
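A minimal circuit breaker sketch in TypeScript is below. The failure threshold and cooldown are illustrative, and production systems usually reach for a battle-tested library rather than hand-rolling this.

```ts
// Minimal circuit breaker: fail fast while a downstream dependency is degraded,
// then allow a trial request after a cooldown. Thresholds are illustrative.
class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;

  constructor(private maxFailures = 5, private cooldownMs = 10_000) {}

  async call<T>(fn: () => Promise<T>): Promise<T> {
    const open = this.failures >= this.maxFailures;
    if (open && Date.now() - this.openedAt < this.cooldownMs) {
      throw new Error("circuit open: failing fast"); // skip the downstream call entirely
    }
    try {
      const result = await fn();
      this.failures = 0; // downstream recovered, close the circuit
      return result;
    } catch (err) {
      this.failures += 1;
      if (this.failures >= this.maxFailures) this.openedAt = Date.now();
      throw err;
    }
  }
}

// Usage: wrap calls to a flaky downstream service (URL is hypothetical).
const breaker = new CircuitBreaker();
// await breaker.call(() => fetch("https://scoring.internal/score").then(r => r.json()));
```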
Let's talk about frontend integration: bringing real-time to the UI. WebSockets and Server-Sent Events enable bidirectional or unidirectional streaming from server to browser, perfect for live dashboards, notifications, and collaborative features.
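On the browser side, the standard EventSource API covers the Server-Sent Events case. The /events endpoint, message shape, and renderDashboard function below are assumptions for the sketch.

```ts
// Browser-side sketch using the standard EventSource API for Server-Sent Events.
declare function renderDashboard(update: unknown): void; // assumed to exist elsewhere in the app

const source = new EventSource("/events"); // hypothetical SSE endpoint

source.onmessage = (event: MessageEvent<string>) => {
  const update = JSON.parse(event.data); // one server-pushed update per message
  renderDashboard(update);
};

source.onerror = () => {
  // EventSource reconnects automatically; log so degraded connectivity is visible.
  console.warn("SSE connection lost, retrying...");
};
```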
GraphQL subscriptions add declarative real-time data to your existing GraphQL API: clients subscribe to specific data changes and receive updates automatically. For state management, use Redux, MobX, or Zustand to handle streaming updates in React applications, and normalize data to prevent unnecessary re-renders and maintain performance.
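Here is one way that normalization can look with Zustand; the Metric shape and store name are invented for the example, and the same pattern applies to Redux or MobX.

```ts
import { create } from "zustand";

// Sketch of a Zustand store that normalizes streaming updates by id, so a burst of
// WebSocket/SSE messages only re-renders components subscribed to the affected rows.
interface Metric { id: string; value: number; updatedAt: number }

interface MetricsState {
  byId: Record<string, Metric>;
  applyUpdate: (m: Metric) => void;
}

export const useMetricsStore = create<MetricsState>()((set) => ({
  byId: {},
  applyUpdate: (m) =>
    set((state) => ({ byId: { ...state.byId, [m.id]: m } })), // replace only the changed entry
}));

// In a React component, select just the metric you need so unrelated updates don't re-render it:
// const metric = useMetricsStore((s) => s.byId["cpu-usage"]);
```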
The takeaways for building real-time systems: start with the requirements. Define latency targets and consistency needs before choosing technology; not everything needs sub-second processing. Design for failure: assume components will fail and build resilience from day one; monitoring and alerting are not optional. Measure everything: instrument your pipeline with metrics at every stage, because you can't optimize what you don't measure. Scale incrementally: start simple and add complexity only when needed; over-engineering early creates technical debt. Think end to end: latency in one component affects the entire system, so optimize the full pipeline, not individual pieces.
Thank you.