Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hello everyone.
This is Shankar.
I'm here to present my topic, the Platform Engineer's Hidden
Network Challenge, and specifically to talk about multicast in
cloud native environments.
Traditionally, multicast is proven to reduce bandwidth consumption
in on-prem environments, but most of the major cloud providers
don't natively support multicast.
So I'm going to talk about the pros and cons of multicast and some
of the alternate options that could potentially be implemented within
cloud native environments.
So when I say multicast, what exactly does it mean?
It's replicating the same data to multiple recipients.
The same data is being received by multiple
end hosts, and that is typically where this helps.
If we consider unicast, it's one-to-one, right?
The sender sends the packet and the receiver gets it.
For that to happen, they have to establish a session first.
And if we assume hundreds or thousands of receivers and one sender, that
single sender has to build hundreds or thousands of TCP sessions, one with each
of them, which is CPU intensive.
So you're literally wasting
the resources of the sender just by establishing these
sessions, which is inefficient.
So the better approach is using multicast.
With multicast, one sender sends the packet
to the multicast group, and whoever joins that group, the receivers,
start receiving the packets.
There is no session between the sender and the receivers who join the group.
They get those packets, and if they leave, they're out of the stream.
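As a rough illustration of the join-and-receive model described here, the following is a minimal Python sketch using the standard socket module. The group address 239.1.1.1, the port 5007, and all function names are assumptions made up for this example; they do not come from the talk. It only works on a network that actually forwards multicast, which is exactly what most public clouds do not offer natively.

```python
import socket
import struct

GROUP = "239.1.1.1"  # assumed administratively scoped group address
PORT = 5007          # assumed UDP port


def membership_request(group: str, iface: str = "0.0.0.0") -> bytes:
    """Build the 8-byte ip_mreq value: group address + local interface."""
    return socket.inet_aton(group) + socket.inet_aton(iface)


def open_receiver(group: str = GROUP, port: int = PORT) -> socket.socket:
    """Join the group. No session with the sender is ever established."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock


def leave(sock: socket.socket, group: str = GROUP) -> None:
    """Leave the group. After this the receiver is out of the stream."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_DROP_MEMBERSHIP,
                    membership_request(group))


def send_once(payload: bytes, group: str = GROUP, port: int = PORT,
              ttl: int = 1) -> int:
    """Send one datagram to the group, however many receivers have joined."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM,
                       socket.IPPROTO_UDP) as sock:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL,
                        struct.pack("b", ttl))
        return sock.sendto(payload, (group, port))
```

Note that the sender never learns who the receivers are: it makes a single sendto call, and the network does the replication.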
So you could think of multicast like a radio
stream or a TV channel.
The broadcaster broadcasts the programs on the channel,
and if you tune into that channel, you start seeing
the picture on your TV. Whoever joins the channel will be able to
see the same thing at that given time.
Same thing with the radio: the radio broadcaster
streams on a certain frequency,
and if you tune the radio to that frequency,
you start hearing whatever is being broadcast.
Multicast is the same concept: one sender streams
data, and whoever joins the group starts receiving the data.
Though there's a dramatic efficiency gain, implementing that
in large-scale virtual environments
has been complex and difficult, which is why I think it's still
not natively supported by some of the major cloud providers.
So what are the critical barriers, right?
The first is multicast state explosion in software-defined networks.
When we talk about multicast, the elements in the network
have to maintain a forwarding table,
and it has to be consistent across the network
infrastructure in the cloud.
As you have multiple receivers
joining and leaving the group, there's a huge amount of recalculation that happens.
So this could stress out the control plane of the backend infrastructure.
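To make the state churn concrete, here is a toy Python model. It is not how any cloud provider's control plane actually works. The ForwardingTable class and its update counter are hypothetical names invented for this sketch, and each join or leave is simply counted as one state change that would have to be propagated.

```python
from collections import defaultdict


class ForwardingTable:
    """Toy model: group address -> set of ports that want the traffic."""

    def __init__(self) -> None:
        self.groups = defaultdict(set)
        self.updates = 0  # number of state changes the control plane must push

    def join(self, group: str, port: int) -> None:
        if port not in self.groups[group]:
            self.groups[group].add(port)
            self.updates += 1

    def leave(self, group: str, port: int) -> None:
        if port in self.groups.get(group, ()):
            self.groups[group].discard(port)
            self.updates += 1
            if not self.groups[group]:
                del self.groups[group]  # last receiver gone, drop group state

    def entries(self) -> int:
        """Total (group, port) pairs currently held in the table."""
        return sum(len(ports) for ports in self.groups.values())


def simulate_churn(receivers: int, group: str = "239.1.1.1") -> ForwardingTable:
    """Each short-lived receiver joins and then leaves the group."""
    table = ForwardingTable()
    for port in range(receivers):
        table.join(group, port)
        table.leave(group, port)
    return table
```

In this model, 100 short-lived receivers produce 200 table updates while leaving zero entries behind, which is the kind of churn without steady-state benefit that stresses a control plane.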
So that could be
one reason that could potentially prevent multicast from being implemented across the
wide network of a public cloud.
Security is another consideration.
Typically, cloud native platforms strictly depend on
multi-tenancy, with different clients'
or customers' workloads running on shared physical infrastructure in the
backend, but they're typically separated.
There is no leakage of data, and one customer's resources don't get shared into
another customer's environment.
But
multicast was designed to be operated under one administrative control.
So there is a potential that the data could leak into
another tenant's environment.
So that is probably another reason that multicast
has not been implemented.
The third thing is the operational complexity and automation challenges.
With unicast, you can define static routing.
But when it comes to multicast, the behavior changes dynamically,
which is hard to predict, and as I said, it potentially
causes a lot of recalculations in the forwarding table and a lot
of stress on the control plane.
So that's another reason. The fourth one is performance degradation
in virtual environments.
Theoretically, multicast reduces bandwidth consumption,
but it comes with the complexity of implementation.
And the overhead of processing multicast packets through the virtual environment
could potentially offset or even exceed the bandwidth savings.
And the fifth one is the architecture mismatch with cloud native patterns.
Microservices emphasize loose coupling
and explicit service boundaries, but when it comes to
multicast, there is implicit coupling between the publisher and the subscribers.
Those are the five constraints that could potentially have been
keeping multicast from being natively supported in the cloud.
And how
could having multicast restricted in the cloud
impact infrastructure automation?
One would be the migration challenges.
Many financial data, telecommunication, and media streaming services rely heavily
on multicast for efficient data distribution,
as most of their models depend on one or a couple of senders and multiple receivers.
But when a legacy application needs to be brought into the
cloud native environment, it
changes the whole infrastructure architecture, because multicast is not natively
supported. If you're moving an on-prem legacy app, which is multicast
based, into the cloud,
you would have
to employ some alternate solutions.
That becomes complex and could result in migration challenges.
And then infrastructure as code, right?
Like Terraform and CloudFormation.
As there is no natively supported resource, you cannot directly
implement it using the declarative approach.
Rather, you have to have some complex workarounds through custom scripts,
and that kind of breaks the process.
Platform engineers rely on consistent, repeatable code
and on the state files,
and that probably gets lost.
So those could be two of the impacts on infrastructure automation
from missing native multicast support in the cloud.
Next, container orchestration complications.
First, service discovery. Service discovery in Kubernetes typically depends
on DNS, which follows point-to-point communication patterns.
Applications that use multicast for dynamic service discovery
must be modified to work with the Kubernetes service abstraction model.
So that's one challenge: if you're going to go from a
legacy app to microservices, that could be a huge change that needs
to be considered. Next, security policy gaps.
You can apply security policies for pod-to-pod communications,
but the Kubernetes security policies do not control multicast traffic,
and that could be a
potential security gap that is left open. Then pod lifecycles.
With pod lifecycle management, pods can be created and destroyed
pretty quickly, even before the multicast recalculations happen.
So that could be one complication
if you want to implement
multicast within Kubernetes in a cloud native environment.
So does that mean we cannot do multicast?
I would say there are a couple of approaches that could be employed, though they are a little
complex and have their pros and cons.
But there are a few approaches that could be taken.
One is at the application level itself, where the developers
take that network layer into the application: they code the
group membership management, the message delivery, and
error handling within the application,
and they maintain the whole communication from sender to
receivers within the application.
Though it increases the complexity, it does provide an approach
where you could migrate your applications into the cloud.
But the overhead of networking now falls on the developers.
And with the networking layer added into the application,
there could be performance issues
as well.
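A minimal sketch of what application-level replication might look like in Python, assuming each receiver is represented by a delivery callable standing in for a real unicast connection. The AppLevelGroup class, its method names, and the retry count are all hypothetical. The point is only to show membership management, per-member delivery, and error handling living inside the application.

```python
from typing import Callable, Dict, List

Deliver = Callable[[bytes], None]  # stand-in for a unicast send to one receiver


class AppLevelGroup:
    """The application, not the network, tracks members and replicates data."""

    def __init__(self, max_attempts: int = 2) -> None:
        self.members: Dict[str, Deliver] = {}
        self.max_attempts = max_attempts

    def join(self, name: str, deliver: Deliver) -> None:
        self.members[name] = deliver

    def leave(self, name: str) -> None:
        self.members.pop(name, None)

    def publish(self, payload: bytes) -> List[str]:
        """Send one copy per member; return the members that could not be reached."""
        failed: List[str] = []
        for name, deliver in list(self.members.items()):
            for _ in range(self.max_attempts):
                try:
                    deliver(payload)
                    break  # delivered, move on to the next member
                except OSError:
                    continue  # retry this member
            else:
                failed.append(name)  # every attempt failed
        return failed
```

The cost is visible in publish: the sender loops over every member and sends one copy each, so the bandwidth and CPU savings of network-level multicast are gone, and retry policy becomes the developer's problem.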
The other approach that we can employ is creating overlay networks.
There are a couple of vendors that support overlay networks within
public cloud environments.
One way to build it, let's say:
you have a marketplace router that you can deploy, and you build
GRE tunnels from the end hosts, I mean from the
sender and receivers, to that cloud router, which acts as a central point.
Then you build BGP between the end hosts and that centralized router.
The sender sends the packets to the
router, and you configure PIM, which manages
the multicast routing.
That's one way it can be achieved.
As I said, it has its own complexity, but it is one way
to take that overhead off the application and bring it
back into the network layer,
compared to the application-level replication we discussed earlier.
There are also a couple of cloud native alternative approaches that
could potentially be reviewed as well,
mainly the messaging services like
Amazon SQS and Cloud Pub/Sub. They provide reliable, scalable message
delivery, and if properly planned and designed,
the network approach could potentially be avoided
and the application
can be built more with cloud native resources.
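The publish/subscribe pattern behind these messaging services can be sketched in memory with Python. This is only a stand-in to show the shape of the pattern; it is not the API of Amazon SQS or Cloud Pub/Sub, and the InMemoryBroker class and the topic name used with it are assumptions for illustration.

```python
import queue
from collections import defaultdict
from typing import DefaultDict, List


class InMemoryBroker:
    """Publishers and subscribers only know the topic, never each other."""

    def __init__(self) -> None:
        self._subs: DefaultDict[str, List[queue.Queue]] = defaultdict(list)

    def subscribe(self, topic: str) -> queue.Queue:
        """Give the subscriber its own queue, so a slow reader delays only itself."""
        q: queue.Queue = queue.Queue()
        self._subs[topic].append(q)
        return q

    def publish(self, topic: str, message: bytes) -> int:
        """Fan the message out to every subscriber; return how many got a copy."""
        subscribers = self._subs.get(topic, [])
        for q in subscribers:
            q.put(message)
        return len(subscribers)
```

Compared with application-level replication, the fan-out moves out of the sender and into a managed service, which is the part a cloud provider would operate and scale.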
Then there are the content delivery networks.
If the application relies on content distribution rather than
real-time data, every cloud provider provides CDN support,
so that could be one option to employ.
And there are software-defined solutions.
The networking industry is now coming up with
some solutions that could potentially be supported in the cloud
as well, like marketplace solutions.
So that could be one option that we could potentially see in the future.
Okay.
So though we have the workarounds, they have their own cons, and
the first one is the complex workarounds,
whether you try to do it with an overlay or implement it in the application.
The lack of native multicast support forces the development
teams to learn new things and build them into the application.
The whole developer experience gets affected, and it might slow down the
software development lifecycle as well.
Next, environment inconsistency.
You might have a legacy application that
works fine locally.
But with the alternate approach that we implement in the cloud,
there could be a huge disconnect between how the application works
on-prem and how it works now in the public cloud
with the alternate solutions that you implement.
And also the testing.
The testing on the legacy side would differ a lot
from testing the solution implemented in the cloud.
So these are some of the challenges that we might come across if
we implement multicast the alternate way in the cloud.
Then there is operational overhead and complexity.
So there is
monitoring complexity, capacity planning challenges, and
multiple communication patterns.
Monitoring and troubleshooting become complex
because, rather
than using a cloud native service, you would be
implementing an overlay network with different vendor elements in it.
So it'll be hard to figure out where the issue is, and
monitoring different elements centrally becomes a huge problem. As for capacity
planning, if you implement multicast in the application, it would
be hard to know the usage, and scaling becomes difficult if it is
managed entirely within the application.
So that's another challenge that could arise
from implementing the alternate solutions.
It typically results in higher costs, right?
Incident resolution takes longer, and there could be SLAs being missed,
service disruptions, and production issues.
You could implement multicast with alternate approaches, but there are
complexities and cons that come along with them.
As I mentioned, there are future prospects and emerging technologies.
First, SDN-native multicast protocols.
The networking vendors are probably going to be building
something that could be natively supported in the future within cloud environments,
or maybe something that is not implemented across the whole wide area network
in the cloud but limited to a certain scope for a specific purpose. That
could be an option.
And with the advancement of some of the NICs, like SmartNICs
and data processing units, that could potentially help reduce
the challenges that we see in virtualized environments
and the performance issues that multicast poses in
cloud environments. And maybe service mesh integrations like Istio and Consul.
There have been great advancements, and potentially that's one area
that could have multicast-like functionality in the future. One
cloud native option that could
be researched is serverless event-driven architectures.
With so many issues that we could come across with the alternate
approaches, like doing multicast within the application or doing overlay networks,
as each introduces complexity,
I think going with the cloud native approach, or redesigning your applications
to be more cloud native, would probably be the way forward, and
serverless event-driven architectures could be something to employ.
And yeah, as I mentioned, the cloud native alternatives are
one option that could be employed.
Sorry.
Given the operational complexity and architectural constraints in the cloud
environment, the total cost of ownership includes not just resource
consumption but also operational overhead and development complexity, and the future
of efficient group communication in cloud environments likely lies in solutions
designed specifically for software-defined, centrally controlled architectures
rather than adaptations of traditional distribution protocols.
That ends my topic.
Thank you all for listening.
Thanks.