Conf42 Cloud Native 2021 - Online

Global Active-Active Serverless

Be it for data sovereignty, latency, or resiliency, organizations expanding their businesses worldwide are adopting global multi-region serverless architectures.

In this session, you will learn how to improve customer experience on your global services by deploying into multiple regions to reduce latency, and by applying event-driven architectural patterns to increase performance.

See how to use path-based traffic routing to allow for gradual migrations of legacy API operations, how to route requests based on network latency, how to separate reads from writes with CQRS, how to use a Lake House to support multiple data-access patterns while keeping them in sync between regions, and how to simplify the complexity of deploying and maintaining a global active-active architecture by using serverless orchestration across regions.


  • Organizations today are looking to become more agile so that they can innovate. One of the main factors impacting global applications is end user network latency. Having locally available applications and content is becoming more and more important. But building and successfully running a multi-region active-active architecture is hard.
  • The goal is to have our applications deployed to several regions across the globe. These regions will communicate through inter-regional secure connections. The next step is to keep the data in sync between the AWS regions. There are at least two approaches for this cross-regional data replication.
  • Our last step is to ensure data consistency across the data persistence layers of the application's microservices. This is a failure management pattern to coordinate transactions between multiple microservices in AWS. There are some additional resources around multi-regional architectures at AWS for you to explore.


This transcript was autogenerated. To make changes, submit a PR.
Hi everyone, welcome to this session on global active-active serverless architectures. My name is Jorge Fonseca. I'm a Senior Solutions Architect at Amazon Web Services. Whatever the industry, organizations today are looking to become more agile so that they can innovate and respond to changes faster. To do that, organizations need to build applications that can scale to millions of users while still having global availability, responding in milliseconds and managing petabytes, if not exabytes, of data. We call them modern applications, and they cover use cases from web, mobile and IoT to machine learning applications, but also shared service platforms, microservice backends and more. One of the main factors impacting global applications is end user network latency. In the past, Amazon has reported a drop of 1% in sales for each additional 100 milliseconds in load time. Content delivery networks, or CDNs, have successfully been used to speed up the global delivery of static content, including images, videos and JavaScript libraries. But the dynamic calls still need to be sent back to the backends. For example, if you have users in Europe accessing a backend in Australia, those users will notice an additional 300 milliseconds in latency, and this isn't acceptable for most online games, but also banking or interactive applications. Therefore, having locally available applications and content is becoming more and more important these days. But bear in mind, building and successfully running a multi-region active-active architecture is hard. In this session we will address one common use case: an HTTP API processing relational data with heavy reads and fewer writes. Also, these writes need to be transactional across multiple microservices as well as third party services and on-premises components. 
Other use cases for multi-region deployment include disaster recovery, where multi-region is a standard practice to keep your disaster recovery environment in a different region. Also, data residency, where multi-region is a solution for compliance and regulation when you need to keep the data of your users within the regions of those users. There are also software-as-a-service applications, where multi-region is a standard practice for tenant isolation. And then there are the antipatterns. When should you avoid using multi-region deployment? Well, there is high availability inside a region. For these scenarios you should leverage availability zones, not multi-region deployment. Then there is the comparison between multi-region and AWS edge service solutions, and often an edge service is enough to address the latency without complicating your solution. But I am assuming that you have already analyzed the pros and cons of these solutions and have decided on a multi-region strategy. So let's move on. The final goal of the session is to have our applications deployed to several regions across the globe. These regions will communicate through inter-regional secure connections, and the on-premises data centers will have high throughput, low latency connectivity to the closest AWS region. Finally, our users will interact via the Internet with the application stack in the region with the lowest latency relative to those users. The first topic to approach is the connectivity between the AWS regions. For that, customers can use a managed network service called AWS Transit Gateway. Transit Gateway connects VPCs to on-premises networks through a central hub. This simplifies network peering and acts as a cloud router both within regions and across regions. 
This inter-region peering uses the AWS global network to connect the transit gateways together. This is a fully redundant fiber network backbone providing many terabytes of capacity between the regions at speeds of up to 100 Gbps. Furthermore, data is automatically encrypted and never travels over the public Internet. For our use case, customers can deploy a transit gateway in each target region and then connect pairs of regions through those transit gateways. Customers will then connect each transit gateway with all other transit gateways, and the result will be a full mesh of transit gateways across the globe. But how about connectivity from on-premises data centers and third party networks to AWS? For that, we recommend customers use AWS Direct Connect for all production workloads. Direct Connect is a cloud service solution that establishes private connectivity between AWS and your data center, office or colocation environment, with speeds of up to 100 Gbps. Alternatively, you may use AWS Site-to-Site VPN and AWS Client VPN to establish that secure connection over the public Internet. To improve the end user experience, customers can also expose the APIs using Amazon CloudFront. This is a fast, secure and programmable CDN. It is used by customers like Tinder and Slack to secure and accelerate dynamic API calls as well as WebSocket connections. It accelerates traffic by routing it from the edge locations where the users are to the AWS origins using AWS's dedicated network backbone. But when you deploy to multiple regions, CloudFront might not be enough. Users will have to be routed to the nearest AWS region, and the question is: how do you do that while keeping the same URL? The answer is Amazon Route 53. This is a highly available and scalable cloud DNS web service. 
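To make the full mesh concrete, here is a small Python sketch that enumerates one peering request per unordered region pair, shaped like the kwargs of boto3's `ec2.create_transit_gateway_peering_attachment`. The region names, transit gateway IDs and account ID are hypothetical, and no AWS call is made; this only illustrates that a mesh of n transit gateways needs n*(n-1)/2 peerings.

```python
from itertools import combinations

# Hypothetical regions and their transit gateway IDs.
regions = {
    "us-east-1": "tgw-0aaa", "eu-west-1": "tgw-0bbb",
    "ap-southeast-2": "tgw-0ccc", "sa-east-1": "tgw-0ddd",
}

def full_mesh_peering_requests(tgw_by_region, account_id):
    """Build one peering-attachment request per unordered region pair,
    shaped like ec2.create_transit_gateway_peering_attachment kwargs."""
    requests = []
    for (r1, tgw1), (r2, tgw2) in combinations(sorted(tgw_by_region.items()), 2):
        requests.append({
            "TransitGatewayId": tgw1,       # requester transit gateway (in r1)
            "PeerTransitGatewayId": tgw2,   # accepter transit gateway (in r2)
            "PeerAccountId": account_id,
            "PeerRegion": r2,
        })
    return requests

reqs = full_mesh_peering_requests(regions, "123456789012")
# A full mesh of n transit gateways needs n*(n-1)/2 peerings: 4 regions -> 6.
print(len(reqs))  # 6
```

In a real deployment each request would be sent from the requester region and the peering then accepted in the peer region, but the pair enumeration above is the part that grows with the number of regions.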
In particular, we want to look at its geolocation, geoproximity and latency routing policies so that we can route end users to the best application endpoints. For our multi-region active-active use case, customers can combine CloudFront and Route 53 to achieve domain name translation, but also dynamic content security and acceleration, path-based functional routing and, finally, latency-based routing across the regions. So let's recap the network architecture for our use case: customers can use Transit Gateway to establish secure, low latency connectivity between the AWS regions; use Direct Connect to establish secure, private, low latency connectivity between the data centers and the nearest AWS region; and then use CloudFront with Route 53 to route end users from the edge locations to the best AWS region based on network latency. With the network connectivity sorted, the next step is to keep the data in sync between the AWS regions, and for that there are at least two approaches to this cross-regional data replication: the synchronous solution and the asynchronous solution. With synchronous replication, write requests need to successfully replicate across regions before they can be acknowledged back to the application. This ensures consistency across regions, but creates a dependency on other regions, so if one region fails, they all fail. With asynchronous replication, write requests are successful if they are persisted locally only, and the cross-region replication is deferred by milliseconds or until the target regions are available. With this we overcome regional outages, but we can also cause write conflicts between regions. Nonetheless, we will follow the async approach for our use case, and we will see how to solve these issues. 
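The latency-based routing just described boils down to publishing one record per region under the same name, distinguished by a `SetIdentifier` and a `Region`. The sketch below builds a Route 53 `ChangeResourceRecordSets` change batch in Python; the domain, endpoint DNS names and hosted zone ID are placeholders, and nothing is sent to AWS.

```python
def latency_record(domain, region, api_dns, target_zone_id):
    """One latency-based alias record, shaped like a Route 53
    ChangeResourceRecordSets change entry (sketch only)."""
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": domain,
            "Type": "A",
            "SetIdentifier": f"api-{region}",  # must be unique per record
            "Region": region,                  # key for latency routing
            "AliasTarget": {
                "HostedZoneId": target_zone_id,  # zone of the regional endpoint
                "DNSName": api_dns,
                "EvaluateTargetHealth": True,    # enables automatic failover
            },
        },
    }

# Hypothetical regional API endpoints behind the same public name.
endpoints = {
    "us-east-1": "d-aaaa.execute-api.us-east-1.amazonaws.com",
    "eu-west-1": "d-bbbb.execute-api.eu-west-1.amazonaws.com",
}
change_batch = {
    "Comment": "Latency-based routing to regional API endpoints",
    "Changes": [latency_record("api.example.com", r, dns, "Z2ZZZZ")
                for r, dns in endpoints.items()],
}
print(len(change_batch["Changes"]))  # 2
```

Because every record shares the name `api.example.com`, clients keep a single URL while Route 53 answers with the endpoint that has the lowest measured network latency for the resolver; `EvaluateTargetHealth` is what lets a failed region drop out of the answers automatically.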
The first cross-regional database engine to consider is Amazon DynamoDB Global Tables. DynamoDB is a serverless key-value and document database that can handle up to 10 trillion requests per day and support peaks of up to 20 million requests per second. The Global Tables feature provides fully managed automatic table replication across AWS regions with single digit millisecond latency. Unfortunately, it does not fit our use case, because we have the requirement for relational and transactional data. The second cross-regional database engine to consider is Amazon Aurora Global Database. Aurora is a MySQL and PostgreSQL compatible relational database built for the cloud. The Global Database feature, which is available with Amazon Aurora Serverless v2, allows a single database to span multiple AWS regions with subsecond latency to read-only replicas in those regions. In the case of a regional outage, a secondary region can be promoted to primary in less than one minute. This results in an RTO (recovery time objective) of one minute and an RPO (recovery point objective) of one second. Finally, customers can consider Amazon S3 replication. Amazon S3 is an object storage service used to store and protect any amount of data for a range of use cases, and these use cases include data lakes, websites, enterprise applications, IoT and big data analytics, just to name a few. The S3 replication feature allows data to be replicated from one source bucket to multiple destination buckets, with most objects being replicated in seconds and 99.99% of objects being replicated within the first 15 minutes. So let's recap the considerations for cross-region data replication. We looked at Amazon DynamoDB, Amazon Aurora and Amazon S3. Each of these services offers cross-regional automatic replication features using serverless technology, so now it's time to put them to work. 
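As a small illustration of how little plumbing Global Tables require, the sketch below builds the kwargs for boto3's `dynamodb.update_table` call that attaches replica regions to an existing table (Global Tables version 2019.11.21). The table name and regions are hypothetical and the call itself is not executed here.

```python
def add_replicas(table_name, new_regions):
    """Kwargs for dynamodb.update_table that attach replicas to an
    existing table (Global Tables 2019.11.21), built without calling AWS."""
    return {
        "TableName": table_name,
        "ReplicaUpdates": [{"Create": {"RegionName": r}} for r in new_regions],
    }

kwargs = add_replicas("Orders", ["eu-west-1", "ap-southeast-2"])
print([u["Create"]["RegionName"] for u in kwargs["ReplicaUpdates"]])
# ['eu-west-1', 'ap-southeast-2']
```

Once applied, writes accepted in any replica region are propagated to the others asynchronously, which is exactly the async replication trade-off discussed above: regional independence in exchange for eventual consistency and possible write conflicts.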
For that, I will introduce the command and query responsibility segregation pattern, or CQRS for short. This architectural pattern involves separating the data mutation part of a system from the query part of that system. In traditional architectures, the same data model is used to query and update the database. That is simple and works well for basic CRUD operations. But our use case is asymmetrical, with heavy reads and fewer writes, so it has very different performance and scalability requirements between reads and writes. Customers can use CQRS to perform writes onto normalized models in relational databases and then perform queries against a denormalized database that stores the data in the same format required by the APIs. This will reduce the processing overhead of the reads while increasing the maintainability of complex business logic on the writes. In AWS, the CQRS pattern is typically implemented with Amazon API Gateway. In this scenario, mutations are POST requests processed by an AWS Lambda function that calls domain specific services. A denormalized version of the data is then mirrored onto DynamoDB, so that subsequent queries are performed by GET requests reading the denormalized objects directly from DynamoDB tables. The asynchronous version of the CQRS pattern implementation adds a queue to the write operation. This allows for long running writes with immediate API responses back to the clients, but now reads have become more complex, because the write duration of the requests is unknown. The solution is to notify the clients using the WebSockets feature of API Gateway, which is invoked by a Lambda function when the DynamoDB tables are updated. Another way to implement CQRS is to use AWS AppSync, exposing the API as a GraphQL API. 
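The read/write split just described can be boiled down to a few lines. The sketch below uses in-memory dictionaries as stand-ins for the normalized relational write store and the DynamoDB-like denormalized read store; all names are illustrative, and in AWS the projection step would typically be driven by streams or events rather than run inline.

```python
# In-memory stand-ins for the stores. In AWS these would be the relational
# database (writes) and a DynamoDB table (reads).
orders = {}        # normalized write model: order_id -> row
order_lines = []   # normalized write model: child rows
order_view = {}    # denormalized read model: order_id -> API-shaped document

def handle_mutation(order_id, customer, lines):
    """Write path: persist normalized rows, then project a denormalized
    document shaped exactly as the API returns it."""
    orders[order_id] = {"order_id": order_id, "customer": customer}
    for line in lines:
        order_lines.append({"order_id": order_id, **line})
    # Projection into the read model, pre-joined and pre-aggregated:
    order_view[order_id] = {
        "order_id": order_id,
        "customer": customer,
        "lines": lines,
        "total": sum(l["qty"] * l["price"] for l in lines),
    }

def handle_query(order_id):
    """Read path: a single key lookup, no joins or aggregation at query time."""
    return order_view[order_id]

handle_mutation("o-1", "alice", [{"sku": "a", "qty": 2, "price": 5.0}])
print(handle_query("o-1")["total"])  # 10.0
```

The point of the pattern is visible in `handle_query`: all the join and aggregation cost was paid once at write time, so the heavy read traffic is served by cheap key lookups.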
This definitely simplifies real time updates, because it allows for no-code data subscriptions directly from DynamoDB, and it also reduces implementation complexity by allowing front end developers to query multiple data entities with a single GraphQL endpoint. When applied to our use case, this results in a multi-regional architecture where Route 53 routes requests to a regional API endpoint based on latency, and the data is then queried or subscribed to from DynamoDB Global Tables kept in sync across the regions. As for the mutations, they are limited to the primary region only, and mutation requests on the secondary regions are forwarded to the primary region. This pattern is called read local, write global, and the advantage is that it removes the data conflicts across the regions at the cost of added write latency on the secondary regions, but that has minor relevance to our use case. Our last step is to ensure data consistency across the data persistence layers of the application's microservices, considering our use case's dependency on third party services and on-premises components. For that, I will introduce the saga pattern. This is a failure management pattern to coordinate transactions between multiple microservices. In AWS, the saga pattern is typically implemented with AWS Step Functions. This is a serverless orchestration service that lets you combine and sequence Lambda functions and other AWS services to build business critical applications. For example, for a shopping cart checkout operation, the application will first need to pre-authorize the credit card. If that is successful, then the application will actually charge the credit card, and only if that is successful will the customer information be updated. So by using Step Functions, each step can follow the single responsibility principle. 
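The checkout saga just described could be sketched as a Step Functions state machine in the Amazon States Language. The definition below is written as a Python dict for readability; the Lambda ARNs are placeholders, and the compensation steps (refund, release pre-authorization) are a plausible shape for this saga rather than the talk's exact implementation.

```python
import json

# Sketch of an Amazon States Language definition for the checkout saga.
# Each failure path triggers the compensations for the steps already done.
checkout_saga = {
    "StartAt": "PreAuthorizeCard",
    "States": {
        "PreAuthorizeCard": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:REGION:ACCOUNT:function:PreAuthorize",
            "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "CheckoutFailed"}],
            "Next": "ChargeCard",
        },
        "ChargeCard": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:REGION:ACCOUNT:function:Charge",
            # Compensation: release the pre-authorization if the charge fails.
            "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "ReleasePreAuth"}],
            "Next": "UpdateCustomer",
        },
        "UpdateCustomer": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:REGION:ACCOUNT:function:UpdateCustomer",
            # Compensation: refund the charge if the customer update fails.
            "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "RefundCharge"}],
            "End": True,
        },
        "RefundCharge": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:REGION:ACCOUNT:function:Refund",
            "Next": "ReleasePreAuth",
        },
        "ReleasePreAuth": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:REGION:ACCOUNT:function:Release",
            "Next": "CheckoutFailed",
        },
        "CheckoutFailed": {"Type": "Fail", "Error": "CheckoutSagaFailed"},
    },
}
print(json.dumps(checkout_saga, indent=2)[:40])
```

Note how each business step only does one thing, while the retry, catch and compensation wiring lives entirely in the state machine, which is the separation of concerns the saga pattern is after.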
Meanwhile, all of the plumbing, the failure management and the orchestration logic are kept separate. Going back to our use case architecture, where we had Lambda functions performing the mutations, Step Functions will now take their place. As for asynchronous mutations, customers can leverage Amazon EventBridge. This is a serverless event bus for building event driven applications at scale, directly integrating with over 130 event sources and over 35 targets. And it's now time to bring it all together into our final architecture. So in this application, Route 53 plus CloudFront will route the requests from the edge locations to the regions based on latency. AppSync will then query and subscribe to the denormalized data from DynamoDB. Step Functions will orchestrate the mutations, but only in the primary region, and EventBridge will serve the event driven asynchronous mutations. Finally, the data layer will be automatically synchronized between the regions. Upon a secondary region failure, Route 53 health checks will divert traffic automatically to other regions, so the affected users within the lost secondary region will only notice an additional network latency. And when that affected secondary region recovers, data will be automatically resynced and the secondary region will then be able to resume serving the users. On the other hand, upon a primary region failure, the application will enter a read-only mode. If the expected duration of the outage is acceptable to the business, then the application may continue to run with limited functionality until the primary region recovers. Otherwise, a secondary region should be promoted to primary at this point by activating the failover or disaster recovery procedures. This reference architecture is publicly available in the AWS Architecture Center. The PDF version includes further details, and you can download it by scanning the QR code on the bottom left corner of this slide. Finally, there are some additional resources around multi-regional architectures at AWS for you to explore. Take a look at these links, and I hope you have enjoyed this session. Thank you for your time.

Jorge Fonseca

Solutions Architect @ AWS

