Conf42 Enterprise Software 2021 - Online

Getting Reactive with Relational Databases and R2DBC


Abstract

Not too long ago, a reactive variant of the JDBC API was released, known as Reactive Relational Database Connectivity (R2DBC). While R2DBC started as an experiment to enable integration of SQL databases into systems that use reactive programming models, it has since grown into a robust specification that can be implemented to manage data in a fully reactive and completely non-blocking fashion.

In this session, we’ll briefly go over the fundamentals that make R2DBC so powerful. Then we’ll take a pragmatic look at the recently released R2DBC driver from MariaDB to shed some light on how you can take advantage of crucial concepts, like event-driven behavior and back pressure, that enable fully-reactive, non-blocking interactions with relational databases.

Summary

  • Session on building applications with R2DBC, a new database connectivity specification used to connect with relational databases in a fully reactive manner. If you have any questions or input on the session itself, please feel free to reach out to me.
  • Let's dive into reactive programming with relational databases. In a traditional thread-per-request design, as you increase memory usage you can see side effects like decreasing throughput. This is where reactive programming methodologies, or thinking reactively, step in to help alleviate these problems.
  • Reactive programming is a declarative programming paradigm concerned with data streams and the propagation of change. The key idea is using data streams combined with declarative programming to help with the propagation, or spread, of change.
  • What is a data stream? It starts at some point, completes at another, and the data sent along it can take one of two paths: successful processing or an error. These parts can be used to create standards that help with reactive development.
  • Back in 2013, a group of individuals from places like Netflix, Pivotal, and Lightbend got together and created a specification called Reactive Streams. It takes advantage of the anatomy of a data stream and is designed to be used across a variety of different libraries.
  • R2DBC stands for Reactive Relational Database Connectivity. The goals and design principles of the specification are simple and straightforward; one of them is for R2DBC to be completely open. If you want to contribute to R2DBC, please do.
  • The R2DBC SPI comes with two levels of compliance, and URL parsing is one of the first things it standardizes. The idea was to take advantage of hindsight and strip down or simplify things as much as possible.
  • R2DBC is really just a collection of mostly interfaces that have to be implemented by a driver. Reactive Streams is used directly throughout the specification.
  • The R2DBC specification provides a broad, standardized approach for running reactive interactions with the underlying database. How can you dive in and start taking advantage of R2DBC for the relational databases you're using?
  • A demonstration of how to use R2DBC within a Java application, using the Visual Studio Code editor. The application runs inside a single file and, because it uses asynchronous data streams, it is kept from exiting so the output can be seen.
  • Using Spring Data R2DBC, we want to take advantage of the idea of a repository, which lets us communicate in a very simple way with the store of information underneath. In this case it is tied directly to our task table.
  • R2DBC (r2dbc.io) is now a part of the Reactive Foundation and is expected to reach GA, version 1, this year (2021). If you'd like to check out MariaDB's implementation and more examples, visit mariadb.com/developers.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hello everyone, and welcome to this session on building applications with R2DBC, a new reactive database connectivity specification used to connect with relational databases in a fully reactive manner. My name is Rob Hedgpeth. A little bit about myself: I work for MariaDB as part of the Developer Relations group, which essentially means I do anything and everything I can to help improve the developer experience of using MariaDB products. If you're not aware, MariaDB is a relational database, and that's about as much time as I'll spend talking about MariaDB specifically. Some of the examples I use will, of course, use it, but what you want to take away is that I'm going to look at the R2DBC specification as a whole and how it can be used not only with MariaDB but with a variety of relational database solutions; mainly, we're going to be looking at R2DBC itself. As you go throughout the rest of the conference, and even after it, if you happen to have any questions or just some input on the session itself, please feel free to reach out to me at robh@mariadb.com, on Twitter at @probablyrealrob, or on GitHub, where I put a lot of samples, not only for R2DBC but for many other things dealing with relational databases.
Now let's dive into reactive programming with relational databases. The first thing to key into, just so we can get everybody on the same page, is the idea of reactive programming itself. What does that mean? For some of you this may be a refresher, and if you aren't familiar with reactive programming at all, not a problem; we'll get everybody up to speed. Let's start with a relatable example: a simple application design where a client communicates with a server, and more specifically with a server thread that processes whatever the client sends in as a request. The client sends in a request and the server thread picks it up. In this case, the request indicates that something needs to be done with the database, which of course makes sense if we're going to talk about reactive programming with relational databases. In the normal scenario, the thread takes that request, which might contain instructions to execute some SQL, some structured query language, against a relational database. While that's happening, the server thread is essentially just waiting for the query, or queries, to finish, and during that time it's simply sitting there. Once it gets the results back, it can go ahead and handle the response.
You can probably see where I'm going with this. If that server thread is stuck waiting for the database to finish its processing, then when another request comes in from the client, we can't do anything with server thread one, because it's still waiting on work from the database. So for that situation we do what we've been doing for a while now: make it multithreaded. We spin up another thread, handle things asynchronously, and use that thread for whatever the client sends in next. It follows the same pattern, whether that thread works with the database or the file system; it's dealing with something that makes it wait. When the third, fourth, or hundredth request comes in, we're dealing with more and more thread context. The problem is that as we add more threads, we make things more complex from a development standpoint, and also from a computational standpoint, in the sense that it can take more memory to manage all of that thread context and general thread management. We won't dive into the minutiae of why, but one of the side effects is that as you increase memory usage you can also decrease throughput. Increased memory use and reduced throughput are things you don't want in your application, and they cause more problems. This is where reactive programming methodologies, or thinking reactively, step in to help alleviate these issues. In a reactive solution we have a very similar setup, where the client communicates with a server thread, but instead of executing a query against the database and then waiting for that process to complete before doing anything else, the thread essentially throws the work over the fence to the database: "go ahead and do this; I'm going to continue processing other things," like the second, third, fourth, or fifth request. When the database is done and has results to give back, it lets the thread know, the thread handles them, and then it sends the response downstream to the client. This is a very simple explanation, but as you can see, rather than being blocked (which we'll get into), we're completely unblocked to handle other requests, like request two, until the database eventually sends us something back that we need to handle. And it's a pretty age-old approach; you've probably heard of things like the observer pattern or pub/sub. It's nothing necessarily new or revolutionary, but it is a way of handling things more efficiently.
So you're not necessarily looking to improve raw performance, although you can; you're looking to become much more efficient with the resources you have, for instance on the server. That's a large part of thinking reactively. Now, having explained this very broadly and abstractly, you're probably wondering how exactly it gets done. There are a variety of ways, and I'm going to examine one particular way and how it plays into R2DBC as we talk about reactive development and reactive solutions with relational data sources. First we need a definition. I like to start with the simplest spot we can build from, and that is the definition of reactive programming. It's chock full of computer-sciency words, so we'll piece it apart: reactive programming is a declarative programming paradigm concerned with data streams and the propagation of change. What does that mean? First, it's a declarative programming paradigm. You can think of declarative programming as not being concerned with the minutiae, the step-by-step process of what happens in a command or execution; you're just interested in asking for something and getting a result back. That's a bit different from something like imperative programming, where you really control the workflow step by step. Then there's the idea of data streams and, ultimately, the propagation of change. We'll dive much deeper into data streams, but the propagation of change you can think of as the dissemination or spread of changes in data, which is largely handled by those data streams. If we take nothing else from this definition, the key is how we can use data streams combined with declarative programming to help with the propagation, or spread, of change. Now, if I asked most of you what a data stream is, the picture that probably comes to mind is this: we don't have to use the words publisher and subscriber, but think of a point A and a point B where you're quite literally streaming data. You could say, "just send all of the data over." But most of us know that if that happens, the subscriber on the receiving end can become overwhelmed. If it can't handle the data coming in, either at the volume or the velocity at which it's arriving, it has to put it somewhere to handle later. You can think of that as a backlog: "I've got this operation to do, but I'll handle it later." The problem is that the backlog can really start to mount up as we keep sending more and more data without decreasing the velocity or the volume.
That doesn't help the subscriber, and the backlog just keeps building, which brings us back to the question: what is a data stream, really? It was pretty simple on the last slide, just streaming data, but we need to take a step deeper into the anatomy of a data stream to see what's available to make the process more efficient. Because ultimately, as we talk about reactive programming and then dive into R2DBC, we're honing in on this idea of efficiency. From an anatomical perspective, a data stream has to start at some point, and, we hope, it's not some infinite process: it completes, and at the end of it, we hope, all the data has been processed in some manner. All of this happens over the course of time, whether that's nanoseconds, hours, days, or weeks; it starts and it completes. What becomes more interesting is what we're sending. I'm calling them variables here, but they are really the changes in data, that propagation I mentioned before. Data gets sent along the stream, and ultimately it can take one of two paths: it can be processed successfully to completion, or it can fail, causing some error or exception. While this anatomy is straightforward and easy to wrap our minds around, it's actually very powerful, because as I build on top of it over the next couple of slides, we'll see that we can use all of these parts to create standards and specifications that really help with reactive development. The first thing to focus on is the idea of back pressure. One of the original problems was that we were taking in too much information, from a velocity perspective, a volume perspective, or both, like taking a fire hose to the face. We can't drink from a fire hose, and back pressure is a way to control that flow; you can imagine back pressure as the hand over the nozzle so the drinker doesn't get annihilated with water. Going back to the simple relationship between a publisher and a subscriber, back pressure means giving control to the subscriber to tell the publisher, "I only want this much information; this is how much I can handle." Then, at some point, whenever that may be, instantly, seconds later, or hours later, when the publisher is ready to do what the subscriber asked, it sends that information.
And this creates the idea of non-blocking back pressure: the subscriber says, "this is what I can handle, send it to me when you can," and after an undetermined amount of time the publisher says, "I'm going to go ahead and send this." It's a very non-blocking scenario in which we receive only the information the subscriber is set up to handle. Put concretely, if the subscriber requests a single element, a single piece of data, the publisher sends that single piece of information; if the subscriber says it can receive two more, the publisher sends two more, and so on. That's very simple, and that's good, but the problem is that there's any number of ways this could be implemented. We all know, from working at different shops over the years, that we often reinvent the wheel or grab some random package, and there's no real consensus on how this stuff is done. That creates problems for the longevity and maintenance of a project, and even for onboarding people who've never seen your approach and wonder, "what the heck, how are you doing this?", forcing you to spend cycles bringing everybody up to speed on exactly how you implemented non-blocking back pressure. So back in 2013, a group of individuals from places like Netflix, Pivotal, and Lightbend got together and created a specification called Reactive Streams. Essentially, it's a way to take advantage of the pieces of a data stream's anatomy to create a very efficient, reusable standard that can be used across a variety of different libraries and solutions, broadly disseminated and broadly adopted. It contains the pieces I showed before, the publisher and the subscriber, and uses the anatomical features of a data stream, sending elements, knowing whether something errored or caused an exception, and knowing whether things completed, to piece the puzzle together. Through the use of a subscription, we set up a relationship between a publisher and a subscriber: we can request a certain amount of information, we can cancel the stream altogether, and the publisher notifies the subscriber through methods that exist on the subscriber, "here's your next item," "this produced an error," or "I'm completely done." But again, this is very general. What it means to you as developers is that you can think of it very simply as a collection of interfaces. The Reactive Streams API is really a specification, which means it doesn't define exactly how, step by step, you should do all of this; that's up to whatever implementing libraries, which I'll get into a little later, decide to do.
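For reference, the entire Reactive Streams contract is small. Paraphrased from the published org.reactivestreams interfaces (documentation comments omitted), it looks like this:

    // The four Reactive Streams interfaces, abridged.
    public interface Publisher<T> {
        void subscribe(Subscriber<? super T> s);   // start the stream for one subscriber
    }

    public interface Subscriber<T> {
        void onSubscribe(Subscription s);          // receive the subscription used to request demand
        void onNext(T t);                          // the next element
        void onError(Throwable t);                 // the stream failed
        void onComplete();                         // the stream finished successfully
    }

    public interface Subscription {
        void request(long n);                      // non-blocking back pressure: "send me up to n elements"
        void cancel();                             // stop the stream altogether
    }

    public interface Processor<T, R> extends Subscriber<T>, Publisher<R> {
        // both a subscriber and a publisher; useful for intermediate stages
    }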
These interfaces basically just define the structure, and essentially the flow, of how these pieces can work together to create reactive solutions using the Reactive Streams API. Okay, but why is this important? I know you're thinking, "Rob, you started this session talking about reactive programming and how you can use it with relational databases, and we haven't really gotten into databases at all, relational or otherwise." Well, for reactive database interactions to happen, they need to be reactive top to bottom. To get the benefits of a reactive solution, the whole thing needs to be reactive. If, say, the server back end is reactive but your communication with the database is not, it's blocked, and if one part of an application or solution is blocked then the whole thing is effectively blocked, because it isn't fully communicating through asynchronous data streams and event-driven programming in that publisher/subscriber style. If it's not top to bottom, it's really not reactive. So reactive database interactions need to be fundamentally non-blocking and need to use this concept of back pressure, which is why I described it first. There are already libraries like RxJava and Project Reactor that use Reactive Streams; they have implementations of that specification, meaning they've defined the actual bits and processes needed to turn Reactive Streams into an actual library or solution. And because it's intended as the standard specification, it's something that plays well with how you would communicate with a database. Unfortunately, in the Java, or really the JVM, ecosystem, we're used to the Java Database Connectivity specification, JDBC, which you're probably very familiar with. As in the scenario I described before, you can create a reactive application using RxJava or Project Reactor, but when you hit the JDBC API, well, it was created back in 1997, and it wasn't designed with these more reactive kinds of interactions in mind. In fact, those weren't mainstream for applications in general, because we weren't yet focused on using the underlying hardware as efficiently as possible; we were mostly concerned with performance, which is where a lot of the threading conversation comes in, handling context and concurrency. By design, JDBC is a blocking API: it communicates over some wire protocol to the underlying database, and you have to wait for whatever it's doing. Workarounds are essentially done in an asynchronous manner, spinning up threads or threaded communication to get around the blocking.
While that may be fine for a lot of applications out there, if you're looking to maintain, or improve as much as you can, the efficiency of the hardware and horsepower underlying your database and your applications, you're going to need another solution. And that's where, finally, R2DBC enters the chat. But what is R2DBC? As you can probably guess, having heard JDBC spelled out, R2DBC stands for Reactive Relational Database Connectivity, and like JDBC and Reactive Streams, it is a specification. The goals and design principles of this specification are pretty simple and straightforward. First, we live in a day and age where everything really needs to be open. Coming from MariaDB, and before that from MySQL, we have our ears to the ground with the open source community, because we understand that the community as a whole can come together and solve a lot of problems. So one of the goals was for the R2DBC specification to be completely open. The group of people who originally started it didn't set out thinking they had all the solutions for every problem that may come up when communicating with relational data sources; quite the opposite. They wanted to open it up so that community members, who come from all walks, backgrounds, and types of solutions, can bring different solutions to the problems that exist out there. And I encourage you, after this, if you want to contribute to R2DBC, please do. Beyond that, a lot of the principles come from reactive development in general, such as being completely non-blocking. It's crucial that the database interaction for a reactive solution be completely non-blocking, because a fully reactive solution needs to be fully reactive. That's also why I dove into Reactive Streams: it sets a standard, a universal specification that can be used within your back-end API solution, or whatever it may be, that communicates with the database, and which may already be using Reactive Streams; we want to use that same specification within the database connectivity driver. Along with that, the specification needs a very small footprint: something lightweight that doesn't set out to make too many assumptions or carry too many opinions. One of the largest goals for R2DBC was to keep it small and simple, because while relational databases are very similar in the way they connect and execute queries, there are a lot of vendor differences between, say, MySQL, MariaDB, Microsoft SQL Server, Oracle, and Postgres. Each has things you can take advantage of that set it apart. R2DBC as a specification keeps this in mind so that, at the driver level, vendor features can be added without having to absorb opinions or assumptions baked into the R2DBC specification itself.
And ultimately, other clients and libraries can also be created in combination, or used in tandem, with the drivers, which I'll get into a little later. This harkens back to the diagram I showed a few slides ago: rather than having a green check mark only for the reactive back-end application that communicates with the database, we now have two green check marks, because the whole application is fully reactive. Diving a little deeper, we need to look at the specification itself and at how you can use something called the Service Provider Interface, the SPI, for R2DBC to actually create a reactive driver. That driver is the implementation you can then use within your applications. So first we'll look at the R2DBC SPI, how it's constructed and what its design principles are, and how it can then be used within a vendor's R2DBC driver to create something you can use against a database. Why a Service Provider Interface? Before I get into what it is, and when we look at it we'll see it's pretty simple, it's important to understand why. Part of this I touched on a moment ago: the idea of creating a very unopinionated, unbiased approach, not making too many assumptions about what needs to be done, essentially keeping the specification as light as possible. A lot of that comes from hindsight, looking back at things like JDBC. Not to pick on JDBC too much, because it has obviously stood the test of time and you can use it for a multitude of solutions, but if you've ever dealt with JDBC, either from the API side or by creating a driver, you know it can be very opinionated. It has done some things that ultimately make it difficult, requiring extra code to handle things like question-mark parameter binding, or forcing every driver that's built to parse URLs itself, even though URLs are fairly universal at this point and there's no need to reinvent that wheel. The idea was to take advantage of hindsight within R2DBC and strip down or simplify things as much as possible while keeping it broadly usable by implementing drivers. As I mentioned, URL parsing is one of the first things on that list, because URLs are pretty straightforward, and over years of use through things like JDBC, database connectivity has established a fairly set URL structure: you define the scheme, in this case identifying it as R2DBC; you specify a driver, whether that's MySQL or SQL Server; you identify the database through a host and a port number, perhaps with a default database; and then you can add query parameters for things like security or encryption to round out the profile of how you want to communicate with the targeted database. That's what the URL is for.
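As a concrete illustration (the host, credentials, and database name are placeholders), an R2DBC URL and the SPI's ConnectionFactories entry point look something like this:

    import io.r2dbc.spi.ConnectionFactories;
    import io.r2dbc.spi.ConnectionFactory;

    public class UrlExample {
        public static void main(String[] args) {
            // scheme : driver :// credentials @ host : port / default database
            // (additional options can be appended as ?key=value query parameters)
            String url = "r2dbc:mariadb://app_user:Password123@127.0.0.1:3306/todo";

            // The SPI parses the URL and discovers a matching driver for you.
            ConnectionFactory connectionFactory = ConnectionFactories.get(url);
            System.out.println(connectionFactory.getMetadata().getName());
        }
    }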
Within R2DBC, all of that URL parsing is handled for you, so the implementing driver doesn't have to worry about it; there's a set standard. Beyond that, the R2DBC SPI comes with two levels of compliance. There are interfaces in the API that have to be fully supported, meaning a complete implementation, and there are interfaces that only have to be partially supported. Part of that is to allow flexibility within the drivers to key into vendor-level or vendor-specific functionality. A lot of this should look very familiar as we go through it. We're looking at things like ConnectionFactory, which, as a factory, produces some kind of product, and in this case the product is a connection. We'll take a top-down approach through a very standard sequence of events needed to communicate with a database, execute a query, and parse some results, and that involves a handful of the interfaces that exist within the SPI: the connection factory and how it creates connections, then how statements can be executed from the connections that have been established, and then how results can be parsed using the Result and Row objects. So let's first look at the ConnectionFactory within the SPI. It's pretty simple; this is the entire interface, and it does pretty much exactly what I described, which is create connections. But one thing to look at, and this takes us back to the beginning of the session where I talked about how Reactive Streams is an integral part of R2DBC, is that Reactive Streams is used directly: that specification is used throughout the R2DBC specification. Again, like Reactive Streams, R2DBC is really just a collection of mostly interfaces; there are some classes and abstract classes in there, but it's mostly interfaces that have to be implemented by a driver, whether that driver comes from MySQL as a vendor, MariaDB, Postgres, or Microsoft SQL Server. What we want to key in on is the usage of Reactive Streams: specifically, we're importing the Publisher. And the reason is that, if we remember the relationship between a publisher and a subscriber in that first very simple exchange, "I want to request some information," and the publisher says, "okay, in whatever amount of time I need to prepare this, I'm going to send it back over to you", that is exactly what's happening here. Rather than just getting a connection back from the create method, we're actually getting back a publisher.
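Abridged (with the documentation comments removed), the interface is essentially just this:

    import org.reactivestreams.Publisher;

    // io.r2dbc.spi.ConnectionFactory, abridged
    public interface ConnectionFactory {

        // Nothing is opened here: you get back a Publisher that emits the
        // Connection only after you subscribe and the driver is ready.
        Publisher<? extends Connection> create();

        // Metadata about the implementing driver (for example, its name).
        ConnectionFactoryMetadata getMetadata();
    }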
So when we execute the create method, we receive a publisher that we can subscribe to, which says, "at a given time, I'm going to give you this connection when I'm ready." That could happen instantaneously, or nanoseconds or milliseconds down the road. We're tapping directly into Reactive Streams, and more importantly reactive programming, to make receiving this connection, or really the event that tells us we can have the connection, an asynchronous process. And this is a theme that plays throughout the interfaces. Not all of them interact with Reactive Streams, of course; some are enumerations and the like. But when it comes to interacting with the database itself, it's all rooted in, or built on a foundation of, Reactive Streams and, of course, reactive programming. So beginning a transaction is a process we can subscribe to, as are closing a connection and committing a transaction; these are all things we take advantage of in a reactive manner. Even executing a structured query language, or SQL, statement is done using Reactive Streams: we execute whatever our statement may be, and we subscribe to a publisher to receive the result. When the statement has been executed and the publisher sends that to the subscriber, that's when we receive the Result object. Nothing really new here: you can think of the Result as a collection of rows, the pieces of data that came back from whatever query or statement we executed. We can also look at things like how many rows were affected, and parse through that information in a reactive manner, all the way down to a Row, which is quite literally a row of information. It doesn't have to be tied to a specific table; that depends on your query, but ultimately we're trying to get to this row. If you're familiar with just about any connectivity driver out there, even outside the Java or JVM ecosystem, you know that when you're dealing with a relational database you're trying to take advantage of that tabular information, however you constructed your query. The difference here, of course, is that we're doing it in a very reactive manner on top of the Reactive Streams specification.
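For orientation, here are heavily abridged paraphrases of those core SPI interfaces; most methods, overloads, and generics are trimmed, so treat this as a sketch of their shape rather than the full io.r2dbc.spi definitions:

    import java.util.function.BiFunction;
    import org.reactivestreams.Publisher;

    public interface Connection {
        Publisher<Void> beginTransaction();        // subscribe to begin a transaction
        Publisher<Void> commitTransaction();       // subscribe to commit
        Publisher<Void> close();                   // subscribe to release the connection
        Statement createStatement(String sql);     // prepare a SQL statement to execute
    }

    public interface Statement {
        Statement bind(int index, Object value);   // bind a parameter value
        Publisher<? extends Result> execute();     // results arrive through a Publisher
    }

    public interface Result {
        // map each emitted Row (plus its metadata) into an application object
        <T> Publisher<T> map(BiFunction<Row, RowMetadata, ? extends T> mappingFunction);
    }

    public interface Row {
        <T> T get(int index, Class<T> type);       // column by position
        <T> T get(String name, Class<T> type);     // column by name
    }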
But of course, as I've described throughout, we really want to get to how you can use it: how can you dive in and start taking advantage of R2DBC for the relational databases you're using? Well, there are a variety of different R2DBC drivers, or as some places call them, connectors. Simply think of them as implementations of the R2DBC specification specific to whatever database, or whatever vendor of relational storage, happens to be communicated with. There's a variety of them out there and even more in the works: everything from distributed databases like Cloud Spanner to H2, MySQL, MariaDB, Microsoft SQL Server, and Postgres. The idea of the R2DBC specification is to provide a broad, standardized approach for running and managing reactive interactions with the underlying database, and I'm specifically going to look at how we can use the implementation from MariaDB. Of course, I mentioned that I'm from MariaDB, but a lot of these drivers, especially at the highest level, are very similar, and that's the entire point of the R2DBC specification: to tie together a lot of the things that relational databases all do. The first thing to think about is connecting to the database. That's going to take information like a location, the combination of a host address and a port number, so we can specify exactly where the database is, and then, at the simplest level, credentials. There are other things you can add on, such as security options and limits like timeouts, some of which may be vendor specific, but in the simplest case we can use the connection configuration implementation to do this. As you can see, it's prefixed with "Mariadb," which tells you this is the vendor's implementing class for the connection configuration, preceded by the name of the vendor. That's a pretty common pattern; for MariaDB, Microsoft SQL Server, and MySQL, the naming will look very similar.
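A minimal configuration along those lines, using the builder from the MariaDB connector (the host, credentials, and the helper class name here are placeholders; check the driver documentation for the full set of builder options):

    import org.mariadb.r2dbc.MariadbConnectionConfiguration;
    import org.mariadb.r2dbc.MariadbConnectionFactory;

    public class ConnectionFactorySetup {

        static MariadbConnectionFactory buildFactory() {
            // Build up the connection profile: where the database lives and how to log in.
            MariadbConnectionConfiguration conf = MariadbConnectionConfiguration.builder()
                    .host("127.0.0.1")
                    .port(3306)
                    .username("app_user")          // placeholder credentials
                    .password("Password123")
                    .database("todo")
                    .build();

            // The vendor's implementation of the R2DBC ConnectionFactory interface.
            return new MariadbConnectionFactory(conf);
        }
    }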
That's even more evident when we look at how we can take advantage of the configuration object from the last slide to create an instance of the MariadbConnectionFactory object, which is just an implementation of the ConnectionFactory interface within R2DBC. We can then use that connection factory to get hold of a connection. As I mentioned before, we do this using Reactive Streams: we first receive back a publisher, which at some undetermined time will publish, or send, a connection object to us that we can then use. But in the case of something like a connection, when you request it you most likely want to use it immediately. So in some cases, very small cases, you actually do want to block, and the Reactive Streams implementation you're using, in this case Project Reactor, comes with mechanisms that let you wait, or block, so we can wait for that connection object to be delivered. Here we use block, and now we've got a connection object we can work with. I understand I'm incrementally adding more code to each slide, and that's for a purpose I'll get into a little later, but don't worry, we're not going to dive into every detail. Just know that, starting from the top, we take the connection object and use the createStatement method to specify the SQL statement we want to execute. That returns a MariadbStatement, a class that implements the Statement interface which exists in R2DBC. From there we can use that select statement object and call the execute method, which, as I said before, just returns us a publisher, and ultimately we need to subscribe to it. At the very bottom we subscribe, and it gives us whatever we've mapped; in the middle there's some mapping where we parse through the Result and Row objects I described before. What's important is that we're using a Reactive Streams implementation, again Project Reactor, which takes our publisher of type MariadbResult and fills it out with Flux, an implementation of the Publisher interface. With that, we can do things like map: we take the data that exists in the database and map it to Java data types and a Java object, in this case Task. Whenever that's ready, the publisher sends it, we've of course subscribed to it, and then we do something with that task at the very bottom.
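Putting those pieces together, here is a condensed sketch of the flow just described, reusing the hypothetical ConnectionFactorySetup helper from the previous sketch and the Task entity defined later in the demo; column names and types depend on your schema, so treat it as an outline rather than the talk's exact code.

    import io.r2dbc.spi.Connection;
    import reactor.core.publisher.Flux;
    import reactor.core.publisher.Mono;

    public class SelectTasks {

        public static void main(String[] args) {
            // One of those rare cases where blocking is reasonable: we want the
            // connection in hand before continuing.
            Connection conn = Mono.from(ConnectionFactorySetup.buildFactory().create()).block();

            Flux.from(conn.createStatement("SELECT id, description, completed FROM task").execute())
                    // Map every row of every Result into a Task object.
                    .flatMap(result -> result.map((row, metadata) -> {
                        Task task = new Task(row.get("description", String.class));
                        task.setId(row.get("id", Integer.class));
                        task.setCompleted(row.get("completed", Boolean.class));
                        return task;
                    }))
                    // Nothing runs until something subscribes; here we simply print each task.
                    .subscribe(task -> System.out.println(task));
        }
    }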
I've got to say, that was a pretty simple implementation as we talked through the steps. But you and I both know that building a new data access layer, persisting and maintaining objects, and all of the steps in between can be a lot of work, which is why, over time, clients, libraries that abstract away many of those details, have come into existence. I touched on it earlier, but R2DBC was really designed with this in mind: it was kept lightweight not only so vendors could keep their drivers as lightweight as possible while still adhering to the specification, but also so it could be built upon at the client level. The client level is where more humane, more opinionated APIs are created, a layer of abstraction that handles things like creating the data access, managing connection factories and connections, so that you can do what you actually need to do: execute a query, or several queries, and put the results into an object on the application side, without worrying about all of the steps in between, such as the mapping I showed before. A lot of clients already exist, and more are in flight, that have been validated as official R2DBC clients, and the one we're going to look at today is Spring Data R2DBC. So right after this I'm going to jump into a live demonstration of how we can take advantage of an R2DBC driver being used by the Spring Data R2DBC client.
If you haven't really used Spring Data, you can think of it as that abstraction that helps us create the data access and persistence layer, the persisted objects we can use within our application, and we'll communicate through it from the application all the way to the database. So let's get started. For the demonstration I'm going to jump directly into an integrated development environment known as Visual Studio Code. It's a free code editor, and there are a variety of editors out there that will work with the type of project I'm going to show you, so the choice isn't particularly important, but just so you know, I'm using Visual Studio Code. On the left-hand side you can see the explorer, which shows everything within my solution, and in this case I have a Maven-based project; Maven is essentially a build management system that lets you specify what you want to build within a Java application. I started by going to start.spring.io to generate the project, and I did that because I'm using the Spring Framework: there are a lot of dependencies that come into play, and I didn't want to go through the steps of pinning all of that together and bringing those dependencies in individually. So I used the generator at start.spring.io to create a Spring Boot project. Spring Boot is something you would traditionally use for something like an API; we're not going to do that, we're going to keep it much simpler, but it made it very easy to set up all the dependencies so I can take advantage of Spring Data R2DBC, the client, and ultimately the MariaDB R2DBC connector. A Maven-based project starts with the project object model, the pom file, which indicates a bunch of things about your project: versioning, properties, the project name, and so on. What we want to key into is the dependencies section. Inside it we define the dependencies, the binaries we're pulling down from the Maven Central repository, setting up everything we need so we can go write code. What I want to point out is that the project pulls down the Spring Data R2DBC binaries, and then you simply need to pull down the MariaDB R2DBC connector. There are a couple of other things in here too: Spring Boot, which we won't really be using but which helps set up the project, and Project Lombok, which basically just means I'm lazy and I'm having Lombok generate some boilerplate code for me, getters, setters, and constructors, which we'll see in here. Now let's dive into the code and see how we can use R2DBC within a reactive application.
Of course, I've already created this application to save some time, and it all runs inside of a single file. We're going to communicate with a database that I've already set up using a Docker container, which is the easiest way to get started with a lot of these databases, and it just sits on my local machine, so I'll communicate with it at 127.0.0.1. I'm going to work with a single table, and I want to do that in a reactive fashion using R2DBC. Within this class I'll set up a couple of things to let me do that; in about 70 lines of code we can set up everything we need to show several different types of interactions with the database using R2DBC. As we dive into the application, something I want to point out is that I have a main method, and I've done a few things so I can take advantage of the running application itself. Normally, when you run an application, it runs and then it's done; but because we're using asynchronous data streams, these exchanges between publishers and subscribers, the application may finish before we're actually done processing the information, and we might not see anything in the output. So at the end I'm basically just preventing the application from exiting, so that we can see exactly what's happening in the console output. I think that's the easiest way to get a sense of what R2DBC is doing and how we can use it, and of course you wouldn't keep a real application open indefinitely, but this gives us a good starting point. First and foremost, especially in a Spring application, we want to dive into application.properties. This file lets us take advantage of the configuration that exists in Spring, specifically in Spring Data R2DBC, to specify things like the URL, the username, and the password. In the interest of time I'm going to paste these in, because that's a lot faster than me fat-fingering the code and ending up with something that won't compile, but I'll explain everything as I go. First I paste in the information that allows us to connect to the underlying database. As I covered before, that starts with an R2DBC URL, the URL that's parsed within the R2DBC specification: it starts with the scheme, then you specify the driver, in this case mariadb; I indicate that I'm using localhost on port 3306; and my default database is just called todo. Inside that database, which I'll bring up here in a console application, I have one table called task, and it contains, as you'd imagine, a list of tasks. It's very simple, but that's on purpose, so we can establish a very clear example. It's already been preloaded with a few rows, which will help in the demonstration to come, but know that this table exists within a local database on my machine. To gain access to it, I provide the username and the password, all within this application.properties file.
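Assuming the local setup just described (the credentials shown are placeholders), the properties pasted into application.properties look something like this:

    # application.properties
    # R2DBC URL: scheme, driver, host, port, and the default database
    spring.r2dbc.url=r2dbc:mariadb://127.0.0.1:3306/todo
    # Credentials for the database user (placeholders here)
    spring.r2dbc.username=app_user
    spring.r2dbc.password=Password123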
With that saved, we go back to our demo application, because now it's time to start writing the code that actually integrates and communicates with the database. Again, I'm going to paste some things in, but I'll explain them along the way. If we think about communicating with a database, as I mentioned before, we want to think about how we persist those things on the application side. So the first thing we'll do is create a class that mirrors the task table inside our database. You don't have to do it this way, and it can be much more complex, but this is a very simple application: as we saw before, I have an id field, a description, and a completed flag, all matching the table inside my database. There are a couple of annotations here that are useful to know. One is @Data, which comes from Project Lombok and builds out things like the getters and setters so I don't have to write them. I also require certain arguments in the constructor whenever I create a new object. And most importantly, there's the @Table annotation, which sets up the relationship between this Task class and the task table inside my MariaDB todo database. Then I have annotations that describe the primary key and the fact that a particular property cannot be null, so I don't accidentally insert a null value and get an error. With that, we've mapped things appropriately: we've set up the persistence for taking information from the task table and putting it into this Task object. Next, when we're using Spring Data R2DBC, we really want to take advantage of the idea of a repository. A repository allows us to communicate in a very simple way with the store of information underneath, which in this case is tied directly to our task table. I do this by using something already provided within Spring Data R2DBC, the ReactiveCrudRepository interface, which facilitates basic CRUD operations: create, read, update, and delete. All I have to do is specify that I'm using the Task object I previously mapped through the @Table annotation, and that the primary key, the id, is an Integer. That's all I need to do; I've now set up communication directly between my application and the table inside my database, all through the application properties I just added and the TaskRepository, which uses ReactiveCrudRepository and the Task class. Let's go ahead and save.
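A sketch of those two pieces, the entity and the repository, using the annotations described above (the field and class names mirror the task table from the demo; the original project's code may differ in detail):

    import lombok.Data;
    import lombok.NonNull;
    import lombok.RequiredArgsConstructor;
    import org.springframework.data.annotation.Id;
    import org.springframework.data.relational.core.mapping.Table;
    import org.springframework.data.repository.reactive.ReactiveCrudRepository;

    // Maps to the existing "task" table in the todo database.
    @Data
    @RequiredArgsConstructor
    @Table("task")
    class Task {
        @Id
        private Integer id;          // primary key

        @NonNull
        private String description;  // required: the only constructor argument

        private Boolean completed;
    }

    // Spring Data generates the reactive CRUD implementation at runtime;
    // no method bodies are needed for basic reads and writes.
    interface TaskRepository extends ReactiveCrudRepository<Task, Integer> {
    }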
Now I want to add in a couple of pieces. The first is autowiring, which uses inversion of control and dependency injection to give me an instance of my TaskRepository that I can then use to execute those simple CRUD methods, reading and writing information to the underlying task table. I'm glossing over some of the details there, so do look those up if you're not familiar, but ultimately we just want to get information from our database. After autowiring the repository, I add a couple of methods I've built for this demo: the ability to save a task and the ability to get a collection of tasks. I want to create a new task, and then retrieve and display the full list in the output. As you can see, I'm using the TaskRepository instance I previously established, and the methods return Mono<Task> and Flux<Task>. I already mentioned Flux, an implementation of a Publisher that exists within the Project Reactor library; Flux returns zero to many objects, and Mono returns zero to one. That's really all you need to know at this point, but those are essentially implemented publishers from Reactive Streams. Now I'll quickly show how I can take advantage of both of those methods. As I mentioned, I have four previously existing tasks, kind of like a magic trick: four tasks inside my table. What I'm going to do now is create a new task using the Task class at the bottom of the screen, supplying the only non-null property I described in the constructor, the description, and give it a value of "task five." Then, using the instance of the demo application, I save the task, subscribe to it, and output the result; after that, I output all of the tasks. These are very simple reads and writes that you could do with really any driver or connector, not just R2DBC. So let's go ahead and execute the save here. Let's clear this really quick and rerun it; I might have had a previously running instance. Well, I've obviously missed something, and it looks like I've got about three minutes left, so I'm just going to pull over my other application. I promise this has worked in the past, but I always tend to find a way to do this. What I want you to see in the output is that I have been able to create task five: that first insert returns, through the subscription to the original save publisher, the newly created task five. Then I simply read everything that exists within the table. After that, the plan was to jump into how you can use back pressure to modify the way the subscriber communicates with the publisher, as far as how it requests information.
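Here is a condensed sketch of the demo flow just described, including the bounded-demand subscriber that the next part walks through. It assumes the Task and TaskRepository types shown above; the class and method names are illustrative rather than the exact code from the talk, and it uses Spring Boot's CommandLineRunner instead of the keep-the-application-alive trick mentioned earlier.

    import org.reactivestreams.Subscriber;
    import org.reactivestreams.Subscription;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.boot.CommandLineRunner;
    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.SpringBootApplication;
    import reactor.core.publisher.Flux;
    import reactor.core.publisher.Mono;

    @SpringBootApplication
    public class DemoApplication implements CommandLineRunner {

        // Spring injects the repository implementation it generated at startup.
        @Autowired
        private TaskRepository repository;

        public static void main(String[] args) {
            SpringApplication.run(DemoApplication.class, args);
        }

        public Mono<Task> saveTask(Task task) {
            return repository.save(task);
        }

        public Flux<Task> getTasks() {
            return repository.findAll();
        }

        @Override
        public void run(String... args) {
            // Insert a fifth task; the saved row is printed once the database confirms it.
            saveTask(new Task("task five")).subscribe(saved -> System.out.println("Saved: " + saved));

            // Read everything back with a subscriber that requests two rows at a time,
            // demonstrating non-blocking back pressure.
            getTasks().subscribe(new Subscriber<Task>() {
                private Subscription subscription;
                private int received;

                @Override
                public void onSubscribe(Subscription s) {
                    subscription = s;
                    s.request(2);                   // initial demand: two elements
                }

                @Override
                public void onNext(Task task) {
                    System.out.println(task);
                    if (++received % 2 == 0) {
                        subscription.request(2);    // only ask for two more once these are handled
                    }
                }

                @Override
                public void onError(Throwable t) {
                    t.printStackTrace();
                }

                @Override
                public void onComplete() {
                    System.out.println("All tasks received");
                }
            });

            // As noted earlier, a short-lived application can exit before these asynchronous
            // results arrive, so in practice you block or keep the JVM alive to see the output.
        }
    }

With five rows in the table, the subscriber prints two tasks, requests two more, prints those, requests again, receives the final one, and completes.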
So I've kind of ruined the surprise, but I want to show you how easily that's done. Using the demo application instance from before, and again using getTasks, the method I previously created, I basically just create a custom subscriber. Inside of it, if we remember the Subscriber interface that exists inside Reactive Streams, there are a variety of methods we can override to take advantage of the data stream's anatomy: onSubscribe, to know when the subscription starts; onNext, to know when the next element, in this case the next task, has been sent over by the publisher; onError, to know whether an error or exception happened; and onComplete, to see when it's complete. Primarily, what we want to look at is using onNext to control, through back pressure, the amount of information that's sent. The way I've done this is very simple: I've set up an integer variable within the class to say I want to receive two at a time, and then, using modular division by two, every time I've received two elements, then and only then do I request two more. I've also added some print statements so we can follow along. Inside our application, when I run this, I receive two tasks and print them, and then, and only then, do I request two more elements, until the publisher has sent everything it has. In this instance there are five records in the task table, so at the end I've requested two, but the publisher can only provide one, and the process completes. And that concludes the demonstration. I'll put this last bit up here because I want to point out that if you have an interest in contributing to the R2DBC specification, please visit r2dbc.io. It is now part of the Reactive Foundation, so it's getting quite a bit of momentum, and it's expected to go completely GA, version 1, this year, 2021. If you'd like to see more implementation examples, go check out mariadb.com/developers; there are a bunch of open source, free examples for getting started with R2DBC using completely free database instances. If you'd like to check out the driver code for MariaDB R2DBC, you can find that on GitHub as well. I've also written a book, due to be published in April, called R2DBC Revealed; it's the first book on R2DBC, so if you're really interested in how R2DBC came together and how you can use it within your applications, please check it out. And again, please feel free to reach out to me at robh@mariadb.com, @probablyrealrob on Twitter, and rhedgpeth on GitHub. Thank you very much, and I hope you enjoy the rest of the conference. Have a great day.
...

Rob Hedgpeth

Director, Developer Relations @ MariaDB



