Abstract
            
Be it for data sovereignty, latency, or resiliency, organizations expanding their businesses worldwide are adopting global multi-region serverless architectures.
In this session, you will learn how to improve customer experience on your global services by deploying into multiple regions to reduce latency, and by applying event-driven architectural patterns to increase performance.
See how to use path-based traffic routing to allow for gradual migrations of legacy API operations, how to route requests based on network latency, how to separate reads from writes with CQRS, how to use a Lake House to support multiple data-access patterns while keeping them in sync between regions, and how to simplify the complexity of deploying and maintaining a global active-active architecture by using serverless orchestration across regions.
           
          
          
          
            
              Transcript
            
            
              This transcript was autogenerated. To make changes, submit a PR.
            
            
            
            
              Hi everyone, welcome to this session on global active
            
            
            
              active serverless architectures. My name is George Fonseca.
            
            
            
              I'm a senior solution architect at Amazon Web Services.
            
            
            
              Whatever the industry, organizations today are looking to become more
            
            
            
              agile so that they can innovate and respond
            
            
            
              to changes faster. To do that, organizations need to build applications
            
            
            
              that can scale to millions of users while still having global
            
            
            
              availability, respond in milliseconds and manage
            
            
            
              petabytes, if not exabytes, of data. We call
            
            
            
              them modern applications and they cover use cases
            
            
            
              from web, mobile, and IoT to machine learning applications,
            
            
            
              but also shared service platforms and microservice backends
            
            
            
              and more. One of the main factors impacting global applications
            
            
            
              is the end user network latency. In the past,
            
            
            
              Amazon has reported a drop of 1% in sales
            
            
            
              for each additional 100 milliseconds in load time.
            
            
            
              Content delivery networks, or CDNs, have successfully been used
            
            
            
              to speed up the global delivery of static content,
            
            
            
              and these include images, videos and JavaScript
            
            
            
              libraries. But the dynamic calls, those still

              need to be sent back to the backends. For example,
            
            
            
              if you have users in Europe accessing a backend in Australia,
            
            
            
              those users will notice an additional 300 milliseconds
            
            
            
              in latency, and this isn't acceptable for most popular games,
            
            
            
              but also for banking or other interactive applications.
            
            
            
              Therefore, having locally available applications and
            
            
            
              content is becoming more and more important these days.
            
            
            
              But bear in mind, building and successfully running a multiregion
            
            
            
              active active architecture is hard. In this session we
            
            
            
              will address one common use case, an HTTP
            
            
            
              API processing relational data with heavy reads
            
            
            
              and fewer writes. Also, these writes need
            
            
            
              to be transactional across multiple microservices as
            
            
            
              well as third party services and on premise components.
            
            
            
              Other use cases for multiregion deployment include
            
            
            
              disaster recovery, where multiregion is a standard practice to
            
            
            
              keep your disaster recovery environment on a different region.
            
            
            
              Also, data residency, where multiregion is a solution
            
            
            
              for compliance and regulation when you need to keep the
            
            
            
              data of your users within the regions of those
            
            
            
              users. There's also software as a service applications,
            
            
            
              where multiregion is a standard practice for tenant isolation.
            
            
            
              And then there are the antipatterns. When should
            
            
            
              you avoid using multiregion deployment? Well,
            
            
            
              there is high availability inside a region. For these
            
            
            
              scenarios you should leverage availability zones,
            
            
            
              not multiregion deployment. Then there is the comparison between multiregion
            
            
            
              and AWS edge service solutions, and often an edge
            
            
            
              service is enough to address the latency without complicating
            
            
            
              your solution. But I am assuming that you have already analyzed
            
            
            
              the pros and cons of these solutions and you have decided
            
            
            
              for the strategy of multiregion. So let's move on. The final
            
            
            
              goal of the session is to have our applications deployed to
            
            
            
              several regions across the globe. These regions will communicate
            
            
            
              through interregional secure connections, and the
            
            
            
              on premise data centers will have a high throughput, low latency
            
            
            
              connectivity to the closest AWS region. Finally,
            
            
            
              our users will interact via the Internet with the
            
            
            
              application stack at the region with the lowest latency
            
            
            
              relative to those users. The first topic to
            
            
            
              approach is the connectivity between the AWS regions.
            
            
            
              For that, customers can use a managed network service
            
            
            
              called AWS Transit Gateway. Transit gateway
            
            
            
              connects VPCs to on-premises networks through a
            
            
            
              central hub. This simplifies network peering and
            
            
            
              acts as a cloud router both within the regions
            
            
            
              and across the regions. This interregion
            
            
            
              peering uses the AWS global network
            
            
            
              to connect the transit gateways together, and this is a fully
            
            
            
              redundant fiber network backbone providing many terabytes
            
            
            
              of capacity between the regions at speeds of up to
            
            
            
              100 gigs per second. Furthermore, data is automatically
            
            
            
              encrypted and never travels over the public Internet.
            
            
            
              For our use case, customers can deploy a transit gateway in
            
            
            
              each target region and then connect pairs of regions with
            
            
            
              the transit gateways. Customers will then connect each transit gateway
            
            
            
              with all other transit gateways and the result will be
            
            
            
              this full mesh of transit gateways across the globe.
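A full mesh over n regions needs n·(n−1)/2 peering attachments, one per pair of regions. A minimal sketch of planning those pairs (the region list is illustrative; in practice each pair becomes a create/accept transit gateway peering attachment):

```python
from itertools import combinations

# Hypothetical set of target regions for the global deployment.
regions = ["us-east-1", "eu-west-1", "ap-southeast-2", "sa-east-1"]

# A full mesh peers every region with every other region exactly once:
# n regions -> n * (n - 1) / 2 peering attachments.
peering_pairs = list(combinations(regions, 2))

for requester, accepter in peering_pairs:
    # Each pair would become one transit gateway peering attachment
    # (requested on one side, accepted on the other); here we just
    # print the plan.
    print(f"peer {requester} <-> {accepter}")

print(f"{len(peering_pairs)} attachments for {len(regions)} regions")
```

Note how the attachment count grows quadratically, which is one reason maintaining a global mesh by hand gets hard as regions are added.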
            
            
            
              But how about connectivity from on premise data
            
            
            
              centers and third party networks to AWS?
            
            
            
              For that, we recommend customers use AWS Direct

              Connect for all production workloads. Direct Connect is
            
            
            
              a cloud service solution that establishes a private
            
            
            
              connectivity between AWS and your data center or
            
            
            
              your office or your colocation environment with
            
            
            
              speeds up to 100 gigs per second.
            
            
            
              Alternatively, you may use AWS Site-to-Site

              VPN and AWS Client VPN
            
            
            
              to establish that secure connection over the public Internet.
            
            
            
              To improve the end user experience, customers can also
            
            
            
              expose the APIs using Amazon CloudFront. This is a
            
            
            
              fast, secure and programmable CDN. It is used by
            
            
            
              customers like Tinder and Slack to secure and accelerate
            
            
            
              the dynamic API calls as well as the websocket connections.
            
            
            
              It accelerates traffic by routing it from the edge locations
            
            
            
              where the users are to the AWS origins using
            
            
            
              AWS's dedicated network backbone.
            
            
            
              But when you deploy to multiregions,
            
            
            
              CloudFront might not be enough. Users will have to be routed
            
            
            
              to the nearest AWS region and the question is how
            
            
            
              do you do that while keeping the same URL? The answer
            
            
            
              is Amazon Route 53. This is a highly available and
            
            
            
              scalable cloud DNS web service. In particular,
            
            
            
              we want to look at its geolocation, geoproximity and
            
            
            
              latency routing policies so that we can route end users
            
            
            
              to the best application endpoints for our multiregion
            
            
            
              active-active use case. Customers can combine CloudFront
            
            
            
              and Route 53 to achieve this domain
            
            
            
              name translation, but also dynamic content security
            
            
            
              and acceleration. Also path based functional
            
            
            
              routing and finally latency based routing across
            
            
            
              the regions. So let's recap the network architecture
            
            
            
              for our use case. Customers can use Transit Gateway
            
            
            
              to establish the secure low latency connectivity between the AWS
            
            
            
              regions, use direct connect to establish a
            
            
            
              secure private low latency connectivity between the data centers
            
            
            
              and the nearest AWS region, and then use CloudFront

              with Route 53 to route end users
            
            
            
              from the edge locations to the best AWS region based on network
            
            
            
              latency and with the network connectivity sorted, the next
            
            
            
              step is to keep the data in sync between the AWS
            
            
            
              regions and for that there are at least two approaches for
            
            
            
              this cross-regional data replication: the synchronous
            
            
            
              solution and the asynchronous solution. With synchronous replication,
            
            
            
              write requests need to successfully replicate across regions so
            
            
            
              that they can be acknowledged back to the application. This ensures
            
            
            
              consistency across regions, but creates a dependency on
            
            
            
              other regions, so if one region fails, they all
            
            
            
              fail. With asynchronous replication, write requests are
            
            
            
              successful if they are persisted locally only
            
            
            
              and the cross-region replication is deferred by milliseconds or
            
            
            
              until the target regions are available.
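The asynchronous write path can be sketched as: persist locally, acknowledge immediately, and defer the cross-region copies to a queue that drains later (all names below are illustrative, not a real replication engine):

```python
from collections import deque

local_store = {}             # this region's durable store
replication_queue = deque()  # cross-region copies still pending

def write(key, value, other_regions):
    local_store[key] = value  # persist locally first
    for region in other_regions:
        # Defer the copy instead of waiting for the remote regions.
        replication_queue.append((region, key, value))
    return "ACK"              # acknowledged before any replication runs

def drain_queue(remote_stores):
    # Runs in the background, milliseconds later -- or whenever a
    # previously unreachable region becomes available again.
    while replication_queue:
        region, key, value = replication_queue.popleft()
        remote_stores[region][key] = value

remotes = {"eu-west-1": {}, "ap-southeast-2": {}}
print(write("order-1", {"total": 42}, list(remotes)))  # ACK returned first
drain_queue(remotes)                                   # copies land later
```

The write succeeds even if the remote regions are down, which is exactly the availability/consistency trade-off the next lines describe.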
            
            
            
              So with this we overcome the regional outages,
            
            
            
              but we can also cause write conflicts between regions.
            
            
            
              Nonetheless, we will follow the async approach for
            
            
            
              our use case and we will see how to solve these issues.
            
            
            
              The first cross-regional database engine to consider is
            
            
            
              Amazon DynamoDB Global Tables. DynamoDB

              is a serverless key-value and document database that
            
            
            
              can handle up to 10 trillion requests per day and support
            
            
            
              peaks of up to 20 million requests per second.
            
            
            
              The global tables feature provides a fully managed
            
            
            
              automatic table replication across AWS regions
            
            
            
              with single digit millisecond latency. Unfortunately, it does
            
            
            
              not fit our use case because we have the requirement for
            
            
            
              relational and transactional data. The second
            
            
            
              cross-regional database engine to consider is Amazon Aurora
            
            
            
              Global database. Aurora is a MySQL and PostgreSQL
            
            
            
              compatible relational database built for the cloud. The global
            
            
            
              database feature, which is available with Amazon Aurora serverless
            
            
            
              version two, allows a single database to span
            
            
            
              multiple AWS regions with subsecond latency to read-only
            
            
            
              replicas on those regions. In the case of a regional outage,
            
            
            
              a secondary region can be promoted to primary region
            
            
            
              in less than 1 minute. This results in an RTO

              of 1 minute, which is the recovery time objective, and

              an RPO of 1 second, which is the recovery
            
            
            
              point objective. Finally, customers can consider Amazon
            
            
            
              S3 replication. Amazon S3 is an object storage
            
            
            
              service used to store and protect any amount of data for a
            
            
            
              range of use cases, and these use cases include
            
            
            
              data lakes, websites, enterprise applications,
            
            
            
              IoT, and big data analytics, just to name a few.
            
            
            
              The S3 replication feature allows data to be replicated from

              one source bucket to multiple destination buckets, with most
            
            
            
              objects being replicated in seconds and 99.99%
            
            
            
              of objects being replicated within the first 15 minutes.
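Fan-out to multiple destination buckets is expressed as one replication rule per destination. A sketch of the configuration shape S3 expects (bucket names, the role ARN, and rule IDs are placeholders; in practice it would be applied with `put_bucket_replication` on the source bucket):

```python
# Hypothetical destinations for the replicated data.
destination_regions = ["eu-west-1", "ap-southeast-2"]

replication_configuration = {
    # IAM role S3 assumes to perform the replication (placeholder ARN).
    "Role": "arn:aws:iam::111122223333:role/s3-replication-role",
    "Rules": [
        {
            "ID": f"replicate-to-{region}",
            "Priority": priority,            # must be unique per rule
            "Status": "Enabled",
            "Filter": {},                    # empty filter = whole bucket
            "DeleteMarkerReplication": {"Status": "Disabled"},
            "Destination": {"Bucket": f"arn:aws:s3:::my-data-{region}"},
        }
        for priority, region in enumerate(destination_regions)
    ],
}

print([rule["Destination"]["Bucket"]
       for rule in replication_configuration["Rules"]])
```

One rule per destination keeps each region's copy independently enabled or disabled, which helps when a single region needs to be taken out of rotation.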
            
            
            
              So let's recap the considerations for cross-region
            
            
            
              data replication. We looked at Amazon DynamoDB, Amazon Aurora,
            
            
            
              and Amazon S3. Each of these services offers cross-regional
            
            
            
              automatic replication features using serverless technology,
            
            
            
              so now it's time to put them to work. For that,
            
            
            
              I will introduce the command and query responsibility segregation
            
            
            
              pattern, or CQRS for short. This architectural pattern involves
            
            
            
              separating the data mutation part of a system from the query part
            
            
            
              of that system. In traditional architectures, the same data
            
            
            
              model is used to query and update the database.

              That is simple and works well for basic CRUD operations.
            
            
            
              But our use case is asymmetrical with heavy reads
            
            
            
              and fewer writes. Therefore, it has very different performance
            
            
            
              and scalability requirements between reads and writes.
            
            
            
              Customers can use CQRS to perform writes onto normalized

              models in relational databases and then perform

              queries against a denormalized database that stores the
            
            
            
              data in the same format required by the APIs.
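The split can be sketched in a few lines: commands mutate the normalized (relational-style) model, and a projector maintains a denormalized read model already shaped the way the API returns it. All names are illustrative:

```python
orders = {}       # normalized "orders" table
order_items = []  # normalized "order_items" table
read_model = {}   # denormalized view, keyed the way the API queries it

def handle_create_order(order_id, customer, items):
    # Command side: normalized writes; business rules would live here.
    orders[order_id] = {"customer": customer}
    for sku, qty in items:
        order_items.append({"order_id": order_id, "sku": sku, "qty": qty})
    # In AWS this projection would be triggered asynchronously by an
    # event, not called inline; a direct call keeps the sketch short.
    project(order_id)

def project(order_id):
    # Query side: one self-contained document per order, served as-is
    # with no joins at read time.
    read_model[order_id] = {
        "customer": orders[order_id]["customer"],
        "items": [i for i in order_items if i["order_id"] == order_id],
    }

handle_create_order("o-1", "alice", [("sku-9", 2)])
print(read_model["o-1"])
```

Reads never touch the normalized tables, which is what lets the two sides scale and evolve independently.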
            
            
            
              This will reduce the processing overhead of the reads while
            
            
            
              increasing the maintainability of complex business logic on
            
            
            
              the writes. In AWS, the CQRS pattern is typically
            
            
            
              implemented with Amazon API gateway. In this scenario,
            
            
            
              mutations are POST requests processed by an AWS
            
            
            
              lambda function that calls domain specific services.
            
            
            
              A denormalized version of the data is then mirrored
            
            
            
              onto DynamoDB so that subsequent queries are
            
            
            
              performed by GET requests reading the denormalized objects
            
            
            
              directly from DynamoDB tables. The asynchronous version
            
            
            
              of CQRS pattern implementation adds a queue to
            
            
            
              the writing operation. This will allow for long running writes
            
            
            
              with the immediate API responses back to the clients,
            
            
            
              but now reads have become more complex because the write
            
            
            
              duration of the requests is unknown. The solution
            
            
            
              is to notify the clients using the websockets feature
            
            
            
              of API gateway, which is invoked by a lambda function when
            
            
            
              the DynamoDB tables are updated. Another way
            
            
            
              to implement CQRS is to use AWS AppSync,

              exposing the API as a GraphQL
            
            
            
              API. This definitely simplifies real time updates
            
            
            
              because it allows for no code data subscriptions
            
            
            
              directly from DynamoDB, and it also reduces the
            
            
            
              implementation complexity by allowing front end developers to
            
            
            
              query multiple data entities with a single GraphQL

              endpoint. When applied to our use case,
            
            
            
              this results in a multiregional architecture where
            
            
            
              Route 53 routes requests to a regional
            
            
            
              API endpoint based on latency, and the data is
            
            
            
              then queried or subscribed to from DynamoDB Global

              Tables, kept in sync across the regions.

              As for the mutations, they are limited to the

              primary region only, and mutation requests on

              the secondary regions are forwarded to the primary region.
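This routing rule can be sketched in a few lines: reads are served by whichever region received the request, and writes that arrive in a secondary region are forwarded to the primary (region names and the return strings are illustrative):

```python
PRIMARY = "us-east-1"  # hypothetical primary region

def route(request, local_region):
    if request["method"] == "GET":
        # Reads are always served locally, at local latency.
        return f"serve locally in {local_region}"
    if local_region == PRIMARY:
        return f"write in {PRIMARY}"
    # Secondary regions never write locally, so no cross-region write
    # conflicts can occur -- at the cost of extra write latency here.
    return f"forward write to {PRIMARY}"

print(route({"method": "GET"}, "eu-west-1"))
print(route({"method": "POST"}, "eu-west-1"))
print(route({"method": "POST"}, "us-east-1"))
```

Because every write lands in exactly one region, the asynchronous replication described earlier never has two conflicting versions of the same item to reconcile.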
            
            
            
              This pattern is called read local, write global
            
            
            
              and the advantage is that it removes the data conflicts across
            
            
            
              the regions at the cost of added write latency on the
            
            
            
              secondary regions, but that has minor relevance to
            
            
            
              our use case. Our last step is to ensure data
            
            
            
              consistency across the data persistence layers of
            
            
            
              the application's microservices, considering our use

              case's dependency on third-party services and on-premises

              components. For that, I will introduce the saga pattern.
            
            
            
              This is a failure management pattern to coordinate transactions between
            
            
            
              multiple microservices. In AWS, the saga
            
            
            
              pattern is typically implemented with AWS
            
            
            
              step functions. This is a serverless orchestration service
            
            
            
              that lets you combine and sequence lambda functions and
            
            
            
              other AWS services to build business critical applications.
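Such a coordination is written as an Amazon States Language definition. A hedged sketch, here modeling a checkout with a compensating refund step (all state names and Lambda ARNs below are hypothetical placeholders):

```python
import json

checkout_saga = {
    "StartAt": "PreAuthorizeCard",
    "States": {
        "PreAuthorizeCard": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:111122223333:function:preauth",
            "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "CheckoutFailed"}],
            "Next": "ChargeCard",
        },
        "ChargeCard": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:111122223333:function:charge",
            "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "CheckoutFailed"}],
            "Next": "UpdateCustomer",
        },
        "UpdateCustomer": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:111122223333:function:update",
            # Compensating transition: undo the charge if this step fails.
            "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "RefundCard"}],
            "End": True,
        },
        "RefundCard": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:111122223333:function:refund",
            "Next": "CheckoutFailed",
        },
        "CheckoutFailed": {"Type": "Fail", "Error": "CheckoutError"},
    },
}

# This JSON is the shape a Step Functions state machine definition takes.
print(list(checkout_saga["States"]))
```

Each `Catch` routes a failure to either a terminal failure state or a compensating step, so the distributed transaction never stops halfway with the card charged but the customer record stale.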
            
            
            
              For example, for a shopping cart checkout operation, the application will
            
            
            
              first need to pre-authorize the credit card. If that is successful,

              then the application will actually charge the credit card,
            
            
            
              and only if that is successful will the customer information be
            
            
            
              updated. So by using step functions, each step can
            
            
            
              follow the single responsibility principle, while all of the plumbing

              and the failure management and the orchestration logic are

              kept separated. Going back to our use case architecture,

              where we have the lambda functions performing the mutations, step

              functions will now take their place. As for asynchronous mutations,

              customers can leverage Amazon EventBridge. This is a serverless event
            
            
            
              bus for building event driven applications at scale,
            
            
            
              directly integrating with over 130 event sources
            
            
            
              and over 35 targets, and it's now time to bring
            
            
            
              it all together into our final architectures. So in this
            
            
            
              application, Route 53 plus CloudFront will
            
            
            
              route the requests from the edge locations to the regions based on
            
            
            
              latency. AppSync will then query and subscribe

              the denormalized data from DynamoDB.

              AppSync will also orchestrate the mutations, but only on the primary

              regions. And EventBridge will serve the event-driven asynchronous
            
            
            
              mutations. Finally, the data layer will be automatically
            
            
            
              synchronized between the regions. Upon a
            
            
            
              secondary region failure, Route 53 health checks
            
            
            
              will divert traffic automatically to other regions, so the
            
            
            
              affected users within the lost secondary
            
            
            
              region will only notice an additional network latency
            
            
            
              and when that affected secondary region recovers, data will be
            
            
            
              automatically resynced and the secondary regions
            
            
            
              will then be able to resume serving the users.
            
            
            
              On the other hand, upon a primary region failure,

              the application will enter a read-only mode. If the expected
            
            
            
              duration of the outage is acceptable by the business, then the
            
            
            
              application may continue to run with limited functionalities
            
            
            
              until the primary region recovers. Otherwise, a secondary region
            
            
            
              should be promoted to primary at this point by activating
            
            
            
              the failover or the disaster recovery procedures. This reference
            
            
            
              architecture is publicly available on the AWS Architecture
            
            
            
              center. The PDF version includes further details
            
            
            
              and you can download it by scanning the QR code on the bottom
            
            
            
              left corner of this slide. Finally, there are some additional
            
            
            
              resources around multi regional architectures at AWS
            
            
            
              for you to explore. Take a look at these links and I hope you
            
            
            
              have enjoyed this session. Thank you for your time.