Conf42 Cloud Native 2021 - Online

Pragmatic App Migration to the Cloud: Quarkus, Kotlin, Hazelcast and GraalVM in action

Abstract

With the Cloud becoming ubiquitous, it’s time to assess whether our traditional application stack is up to it.

Summary

  • Nicolas Frankel talks about pragmatic application migration to the cloud with Quarkus, Kotlin, Hazelcast, and GraalVM. He is currently a developer advocate for Hazelcast. Hazelcast has two products: one is an in-memory data grid and the other is in-memory stream processing.
  • Today I will talk to you about the cloud. There are good reasons to migrate to the cloud and there are not-so-good reasons. The first good reason to migrate is cost visibility. If you think you'll gain money, or at least waste less money, that's probably wrong.
  • There are twelve principles that cloud-native applications must follow. The first discussed is to declare all your dependencies. The second is to start up fast. The third is to have streaming logs in containers. There are a couple of issues if we want to rewrite the application.
  • In both cases we need additional people. They will maintain and handle changes on the legacy application, while your own workforce works on the new version. The middle path is to reuse the existing code, but change the way it is used.
  • GraalVM includes a JVM platform and is polyglot: you can use multiple languages in your application. There is an underlying framework called Truffle, which also allows you to create your own language.
  • GraalVM allows you to create a native executable from existing bytecode. Of course it has some limitations, including restricted reflection and the lack of a security manager. There is now a generation of cloud-native frameworks such as Micronaut and Quarkus.
  • Instead of Tomcat, we just replace everything with Quarkus. The best part happens when you deploy in a container: now we have a single process, and it's a native binary.
  • The idea is to build for the cloud. When possible, reuse your existing code and leverage cloud-native frameworks. The demo shown in the talk is publicly available. If you are interested in Hazelcast, please join our Slack.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hi folks, thanks for being here for this talk about pragmatic application migration to the cloud with Quarkus, Kotlin, Hazelcast, and GraalVM. I'm Nicolas Frankel. I'm a former developer and have held other technical roles; I worked mostly on the backend, mainly in Java with Java EE and Spring technologies. Right now I'm a developer advocate, and I work for a company called Hazelcast. Hazelcast has two products. The first one is an in-memory data grid, and you can think of an in-memory data grid as distributed data structures: you have a cluster of nodes and you can shard the data over several nodes or replicate it. The other one is called Jet, and it does in-memory stream processing.

Today I will talk to you about the cloud. There is no denying that today everybody goes to the cloud; you can think about it as a sort of gold rush. There are good reasons to migrate to the cloud and there are not-so-good reasons. Let's talk about the good reasons first.

The first reason why you would like to migrate is visibility, and that's very important. Some people mistakenly think that if they migrate to the cloud, it will be less expensive. It might be the case, but it probably won't. The point is not the exact sum, though. In traditional IT you know the cost of acquisition, and you might know the cost of maintenance because you have contracts, but beyond that it's very hard to compute exactly the cost of running a piece of infrastructure. There are people involved, and you don't really know how much time they spend; there are a lot of factors that influence the number, and it's very hard to compute. We used to talk about the total cost of ownership, and yes, there are metrics, diagrams, spreadsheets and so on, but it still takes a lot of effort, and in the end it's a best guess. If you migrate to the cloud, it's very simple: you get your bill, and it's broken down. You used this service, this much memory, this much data, this much CPU, and it costs you that. So transparency and visibility are the first argument for migrating to the cloud.

The second is flexibility. In traditional IT, when you buy your own hardware, you must size it according to the maximum peak of usage. If you are an e-commerce shop, you will probably buy your hardware and size it for Black Friday, Cyber Monday, this kind of stuff: a very high peak. During the rest of the year, the difference between normal load and peak load is just waste; you bought that hardware for the peak, and the rest of the year it sits idle. In the cloud, you pay for average capacity when you need average, and you pay much more only when you need much more. It's very flexible: you can scale nearly at will.

The last argument, one I was not really aware of because I have mainly worked in large or at least medium-sized companies, is this: if you are a small team, even a single developer, and you want to develop a product, you would normally need to acquire hardware. That's a big step at the beginning, when you have nothing and you just want to create your business. The cloud allows virtually anybody to start: people with wonderful ideas and the skills to implement them can begin their journey nearly on day one.

As I mentioned, those are the good reasons. Among the not-so-good reasons, doing it because everybody else does is the worst reason of all. And again, let me restate it:
if you think that by migrating to the cloud you will gain money, or at least waste less money, that's probably a very wrong assumption. Please be aware of that.

Now imagine you already have software. How do you migrate? How do you take this software that was made for the old world, for on-site hardware, to the cloud? There are three main paths. The first is to take it and move it to the cloud as-is; that's called lift and shift. The second is to say it's no good and rewrite everything. The third, which I would advise you to take, and that's the gist of this talk, is to walk the middle path between the two.

Let's first talk about lift and shift. Lift and shift is very easy: the cloud is just somebody else's computer, so instead of deploying on your own hardware, you deploy on somebody else's. Most of the time it's relatively straightforward, because every cloud provider offers some way to run containers, so you just containerize your application and deploy it. And it has a high chance of working as expected: you will be able to deploy it. Unfortunately, it might not be so good in the medium term. Worst case, it won't run at all. It will be deployed, but it won't run, because your application expected some hard-coded path or some local resource that is not available in the cloud, or is only available through another interface. Best case, it will run because you thought about everything, but your application was not designed to run in the cloud, so it wastes some CPU cycles and some memory. That was not so important on premise; in the cloud, everything counts, and this waste might cost you a lot.

Perhaps you have already stumbled upon the Twelve-Factor App. It lists twelve principles that cloud-native applications must follow. I won't go through all of them; here they are as a reminder. But I have been a Java developer, so let's check whether a standard JVM web application complies with those principles.

The second factor says your application should declare all its dependencies. A regular JVM web application is probably a WAR, and you expect it to be deployed on an application server, or at least a servlet container, but you don't declare that dependency anywhere. OK, let's remove that issue: we know how to build self-executable JARs that embed the servlet container, say an embedded Tomcat. Now it's completely self-contained, so I don't need to declare any dependency. And again, that's wrong: the JVM is a huge dependency, the JAR expects a JVM with a minimum version, and that is not declared. So that's one principle we don't follow.

The next one is configuration. Your application must be easily configurable, but with traditional WARs that's completely untrue. What we learned is to use an abstraction: for example, a data source that is available through a virtual name, typically a JNDI name, which is then mapped to the real URL. This is done in every environment. You keep your WAR the same and promote the artifact, which is good, but the configuration is done on every application server in every environment. That defeats the third principle.
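
As an illustration, here is a minimal Kotlin sketch of the two styles: the traditional lookup of an abstract name that each application server maps to the real resource, versus twelve-factor-style configuration read from the environment. The JNDI name and the environment variable are illustrative, not taken from the talk.

```kotlin
import javax.naming.InitialContext
import javax.sql.DataSource

// Traditional Java EE style: the application looks up an abstract JNDI name,
// and each application server maps that name to the real database behind the scenes.
fun lookupDataSource(): DataSource =
    InitialContext().lookup("java:comp/env/jdbc/AppDS") as DataSource // JNDI name is illustrative

// Twelve-factor style: configuration lives in the environment, so the same
// artifact runs unchanged in every environment.
fun databaseUrl(): String =
    System.getenv("DATABASE_URL") ?: error("DATABASE_URL is not set") // variable name is illustrative
```
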
Principle number nine says we must start up fast. The reason is this: imagine we are running on Kubernetes. Kubernetes is super great because when a pod starts misbehaving, you just kill the pod and start a new one. But if you start a new one, you assume that the new one will start fast enough. And guess what? The JVM was not made for that. The JVM actually starts quite slowly, and right after it has started it's slow, it has bad performance, it needs some warm-up time. So that's another principle that is defeated.

Finally, logging: we must have streaming logs in containers. Streaming logs means we just write everything to the console. But again, servlet containers are not so great here, because your application writes to one file and the servlet container writes to another file, or to multiple files; perhaps your application also writes to several different files. If it's containerized, how do we handle that? There is no single stream of logs that we can follow.

And we have even more issues, because it's not only about the JVM, it's about the frameworks we are using. Spring, Java EE, whatever: they use a lot of reflection, and that's a startup performance hit, because at the beginning they load classes through reflection. Not great. And if we are talking specifically about Spring and Java EE, they also do classpath scanning: they go through the whole classpath to find out which class carries which annotation. Not great either. So it seems like the JVM is not made for the cloud.

So the idea, in that case, is that we will rewrite the application. As engineers, we love to start from scratch. We love greenfield projects. We don't want to handle the mess that was made by previous developers, even if those previous developers were us. But there are a couple of issues if we want to rewrite the application.

The first is obviously the cost. If you want to rewrite the application, you need a non-trivial budget. So you go to your manager, the manager at some point goes to the business, and it will probably go like this. "Hey, I need x million." "What for?" "To rewrite the application." "And what competitive advantage does it bring us? What edge do we get on the market with that rewrite?" Nothing; it will be a rewrite, feature for feature. You can probably imagine the outcome of this conversation.

But imagine for a second that the business understands and agrees: you have the biggest advantage on the market, you have no competitors, everything is fine. Let's go into the details. When you start rewriting the application, let's imagine you start in January. It will take time, it will take months, until you have rewritten the application. And the target you are chasing, the version that existed in January, is not the same anymore, because the legacy version of the app has probably had upgrades; the business wants to add more features to it. So you will actually be developing toward a moving target, which is never good.

Of course there are risks involved. A legacy project might be legacy, but at least most of the bugs have already been solved, because people encountered them earlier on. With a greenfield project there will be bugs for sure, even with the best quality process and the biggest test harness in the world. Not great.

And finally, if you are a team lead or a manager, you must think about how you will organize your teams. A rewrite means that we will need additional workforce.
So either we recruit temporarily or we outsource. In both cases we need additional people, and the usual way to do it is that these new people maintain and handle changes on the legacy application while your own workforce works on the new version. But there is a high chance that, since the new people don't know the application that well, they will need support from the people who do. So there will be a lot of interactions and interruptions, and it won't be great. Those are four reasons why rewriting the application might not be such a good idea.

So if lift and shift is not a good idea, and rewriting the application is not a good idea, we don't have that much choice left: we take the middle path. The middle path is to reuse the existing code, especially the annotations from Spring, Java EE and so on, but to change the way they are used, so that the engine that uses them is not the traditional engine.

Before I go further, let me introduce GraalVM. GraalVM is actually a bag of many features; here are a couple of them. First, GraalVM is a JVM platform: instead of using the Oracle JDK or OpenJDK or whatever, you use the GraalVM JDK. That's fine. The other thing GraalVM brings is that it's polyglot: it can speak multiple languages, and you can use multiple languages in your application. For example, you can have a Java application, but at some point you need R because you need to do some statistics. It's very easy to integrate that R file into your Java application; at least GraalVM makes it easy. The reason is an underlying framework called Truffle, with which all those languages have been implemented; there is, for example, TruffleRuby. Truffle also allows you to create your own language, or at least an implementation of a language, with easier integration.

But what is of interest for this migration to the cloud is another feature of GraalVM called Substrate VM, which allows you to create a native executable from existing bytecode, whether JARs or classes, through an ahead-of-time compilation process. Of course it has some limitations. For example, I was talking about reflection. Reflection is the ability to say: I don't know which classes will be used at compile time, I will discover them at runtime. It's a great feature of Java, but it means you need to follow the execution path, and if you compile ahead of time while the class is only known at runtime, you understand there is a problem, because at build time the class won't be there. Fortunately, there are ways to cope with that: you can provide configuration files that say, keep this class and this class. It's not fun, but it works. Other limitations include the lack of a security manager, and it's not cross-platform: if you want an executable for, let's say, macOS, you need to build on macOS, for Windows you need to build on Windows, and so on and so forth. But at least it's something.
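
Going back to the reflection limitation for a second: the configuration file I mentioned is typically a reflect-config.json that lists the classes which must stay reachable through reflection in the native image. Here is a minimal example; the class name is illustrative, not taken from the demo.

```json
[
  {
    "name": "com.example.shortener.ShortenedUrl",
    "allDeclaredConstructors": true,
    "allDeclaredMethods": true,
    "allDeclaredFields": true
  }
]
```

In practice, frameworks like Quarkus generate most of this reflection registration for you at build time, which is one of the reasons they pair so well with GraalVM.
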
So let's have a small recap. On one side we have the JVM, on the other side native executables. The JVM's memory consumption is high; a native executable's is not. The JVM's startup time is long, even more so considering that performance at the beginning is not great; with a native executable you don't care, it starts quite fast. On the other hand, with the JVM you can write your program once and run it everywhere, because there is a JVM for every platform, which a native executable obviously cannot offer. There is also a reason why the JVM has always had very good performance once past the warm-up time: it adapts the native code it compiles to the workload. During the warm-up it analyzes the workload and creates the best possible native code for it, and for this reason the JVM has always been at least on par with native executables, with C and C++ programs. A native executable is statically compiled, so you must know about the workload at build time to choose the best compilation parameters possible. But in the cloud, these are pretty good advantages for the native executable, and the benefits of the JVM are not so decisive.

And so there is now a generation of cloud-native frameworks such as Micronaut and Quarkus. I don't know much about Helidon, but it seems to be part of the lot. Spring is there too: although Spring was not designed in a cloud-native way, because when it was designed there was no cloud, there are now ways to leverage GraalVM for Spring. I won't talk about Spring further in this session. All those frameworks basically take the same approach: they all use GraalVM, so in the end you get a native executable, and they handle reflection another way. For example, Micronaut creates dedicated classes at build time, at compile time, and skips traditional reflection altogether.

Now let's have a use case. Imagine I want a URL shortener. The traditional approach is: you have the space of all URLs and a small space of all possible shortened URLs, you need a projection from one to the other, and you need to handle collisions. Great, but I'm not a mathematician, so I prefer an alternative: I will generate random short links for a URL and then store the mapping between the long URL and the short one, and also the opposite, from the short URL to the long one. The trade-off is that instead of CPU time, I trade storage.

For the demo I will use the following stack. I have a legacy Java EE application; nowadays it's called Jakarta EE, but my application is Java EE, because it's legacy. I use Kotlin to write it, but that's not necessary: any JVM language will do, Java, Scala, whatever. I will be using JAX-RS, because it uses annotations, and I will be storing the data in Hazelcast IMDG.

My initial state is the following. I have the JVM running Tomcat, which runs my WAR. When a request comes in, Tomcat does its magic: it probably uses the Catalina JAR, which itself relies on the Servlet JAR and the JAX-RS JAR, it knows which servlet it needs to call, and it calls the servlet. The servlet itself wants to store or read data from Hazelcast, and it uses a dedicated JAR, the Hazelcast client JAR. That's the initial state.

Now I want to migrate to cloud native, and I will use, let's say, Quarkus. In the to-be state, in development, I still keep the JVM, because the JVM has nice features: you can debug, you can set breakpoints, this kind of stuff. We like it, and in development it's not an issue. So we still have the JVM, but instead of Tomcat we replace everything with Quarkus. The capabilities are the same, but implemented by Quarkus: there is no Catalina JAR, there is a Quarkus equivalent, and there is a Quarkus Hazelcast client. In the same way, when a new HTTP request comes in, Quarkus redirects the request to our resource.
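
The kind of JAX-RS resource this architecture revolves around might look roughly like the Kotlin sketch below. It is not the code from the demo repository: the class name, the map names, and the inline client creation are illustrative (the real project would configure the Hazelcast client through the Quarkus extension), and collision handling for the random short links is omitted.

```kotlin
import com.hazelcast.client.HazelcastClient
import com.hazelcast.core.HazelcastInstance
import javax.ws.rs.GET
import javax.ws.rs.NotFoundException
import javax.ws.rs.POST
import javax.ws.rs.Path
import javax.ws.rs.PathParam
import javax.ws.rs.Produces
import javax.ws.rs.core.MediaType
import kotlin.random.Random

@Path("/")
class UrlShortenerResource {

    // Illustrative: the real demo wires the client through the Quarkus Hazelcast extension
    private val hazelcast: HazelcastInstance = HazelcastClient.newHazelcastClient()

    // Two distributed maps, one per direction of the mapping (names are illustrative)
    private val shortToLong = hazelcast.getMap<String, String>("short-to-long")
    private val longToShort = hazelcast.getMap<String, String>("long-to-short")

    private val alphabet = "abcdefghijklmnopqrstuvwxyz0123456789"

    @POST
    @Produces(MediaType.TEXT_PLAIN)
    fun shorten(longUrl: String): String {
        // Reuse an existing short link if this URL was already shortened
        longToShort[longUrl]?.let { return it }
        // Otherwise generate a random short id (collision handling omitted in this sketch)
        val short = (1..6).map { alphabet[Random.nextInt(alphabet.length)] }.joinToString("")
        shortToLong[short] = longUrl
        longToShort[longUrl] = short
        return short
    }

    @GET
    @Path("{short}")
    @Produces(MediaType.TEXT_PLAIN)
    fun resolve(@PathParam("short") short: String): String =
        shortToLong[short] ?: throw NotFoundException()
}
```

The point of the middle path is that code like this, the JAX-RS annotations and the Hazelcast client calls, is exactly what you keep; only the engine underneath changes.
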
But the best part happens when you want to deploy in a container. We have the same mechanism underneath, or at least I infer it's the same mechanism, it behaves the same, but now we have a single process and it's a native binary.

I've talked a lot; now it's time for some demo. This is the project, and I must admit I cheated a bit: I didn't start from a legacy application, I created the project directly using the Quarkus Maven goals, so everything was set up for me. You need to write a couple of properties, and there are dependencies and plugins; you can achieve the same yourself, it will just be more time consuming to do it by hand. This is just to save some time. Here you can see everything has been configured, and I can already use it. I have this REST API, so you can see the JAX-RS annotations here, and you can see this is Kotlin as well. Here is more JAX-RS: here I respond to POST, here I have a path, and here I have the Produces annotation to tell what it returns. I don't want to delve too much into the code.

As it is now, I can start a Hazelcast instance and I can start the application as well. Here, in development, I'm running inside the JVM; it will compile, and after a few seconds it will run the application. So it builds the application and runs it. It takes a bit of time; my machine needs to wake up as well, so I will prepare the curl. It has started, so I want to store a new URL, let's say this one, fubar. So it has contacted Hazelcast and stored this into Hazelcast. Now I want to do the opposite, to get the long URL from the short one, so I curl this one and it returns fubar. Everything works as expected; I'm super happy.

Now the idea is that I want to build this for the cloud. When I scaffolded this project, Quarkus created two Dockerfiles for me: one for Docker native and one for Docker JVM. Here is the Docker native one. As you can see, it's pretty easy; you just need to follow the instructions and run mvn package. This takes a long, long time, and this is actually where the magic happens. If we have a look at the size of those files, you can see it's a bit big. Once it's done, you can build the Docker image; you don't do it explicitly, it does it for you, and that part is very fast. Once this is done, I've created a Docker Compose file. The Docker Compose file is very easy: it has one Hazelcast node and our application. Now I docker compose up, using the new experimental feature, because I like to try new stuff. Now it's running, and I can again do some curl. I curl to create a new shortened URL, and since it's randomly generated, it's a new one. Then I curl to check that everything works. And now it's only a native executable that runs underneath, and as you can see, it's a seamless experience. It's the same experience.
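
For reference, a Compose file along the lines described in the demo might look like the sketch below: one Hazelcast member plus the natively compiled application. The application image name, the port mappings, and the way the cluster address reaches the client are assumptions, not taken from the repository.

```yaml
version: "3.8"
services:
  hazelcast:
    image: hazelcast/hazelcast          # official Hazelcast image
    ports:
      - "5701:5701"                     # Hazelcast's default cluster port
  shortener:
    image: shortener:native             # hypothetical tag for the native-image build
    depends_on:
      - hazelcast
    environment:
      # How the client locates the cluster depends on the Hazelcast client
      # extension and its version; an address-style variable like this is an assumption.
      HAZELCAST_CLUSTER_MEMBERS: hazelcast:5701
    ports:
      - "8080:8080"                     # Quarkus's default HTTP port
```

Once the native image is built, docker compose up starts the whole stack, with the application running as a plain native process, and the curl experience is the same as before.
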
The demo is done; now we need a small recap. My advice would be: between rewriting everything and lift and shift, walk the middle path. When possible, reuse your existing code. It took you a lot of time to write it, to maintain it, to test it, so reuse it as much as possible, but leverage the new cloud-native frameworks that know how to make the best use of it, and think about return on investment. Thanks for your attention.

You can read my blog, you can follow me on Twitter, and you can read more about the Quarkus and Hazelcast integration. More interestingly, you can also check the Git repository: the demo that I've shown you is publicly available. And if you got interested in Hazelcast, please join our Slack. Thanks a lot and have a good day.

Nicolas Frankel

Developer Advocate @ Hazelcast



