Conf42 Python 2023 - Online

The machine learning pipeline is a myth - build production ML systems with feature/training/inference pipelines


Abstract

Developers often erroneously talk about ML pipelines. But real-world ML systems have many moving components, not a single monolithic ML pipeline. In this talk, we show how to build ML systems as a composition of feature pipelines, training pipelines, and inference pipelines with a shared data layer.

Summary

  • Jim Dowling: How we can refactor the monolithic machine learning pipeline that many of you may have heard about into what we call feature, training, and inference pipelines. He says MLOps as it exists today is too complex to make anything easy for developers to put into production or for people to maintain.
  • In Python, we're not going to use infrastructure. We're going to talk about the way in which we advocate structuring these machine learning systems to make them easier to manage. What we typically have are three main components: the feature pipeline, the training pipeline, and the inference pipeline.
  • The feature pipeline will take your raw data and compute features from it: compressed signal that we're going to use to train our models with and also to make predictions with. Features can be stored online and offline and accessed via both the online and offline APIs.
  • We can do batch inference with the feature view as well. For feature views, we can specify specific features that we'd like to transform. This ensures consistency of these transformation functions between the training pipeline and the inference pipeline.
  • You can write Python programs in a very nice tool called Modal. It can schedule Python programs for you. This is a really nice way to build what we call serverless machine learning systems. There's a bunch of these on the Internet.
  • There's a ton of them in the Hopsworks tutorials. The one shown briefly here is called credit scores. Generating synthetic data is a really good way of building feature pipelines if you can't access the actual data.
  • app.hopsworks.ai is free to create an account on, and it's time-unlimited; you can build systems on it. The feature view you can create is a selection of features, and we can see any training data sets created from it. Two notebooks make up the training pipeline in the demo.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hi, welcome to this talk on the mythical machine learning pipeline. My name's Jim Dowling, and I'm going to talk about how we can refactor this monolithic machine learning pipeline that many of you may have heard about before into what we call feature, training, and inference pipelines. We're going to do it all in Python, and we're going to decompose the machine learning pipeline into these smaller, more manageable parts to build machine learning systems. So I'm going to start by making a claim that MLOps, or machine learning operations as it exists today, is too hard. It's kind of like this telecom tower here: a very brittle set of systems that we're plugging together to try and put machine learning in production. But really it's too complex to make anything which is going to be either easy for developers, and in particular Python developers, to put in production or for people to maintain. So if we look at MLOps according to Google on this diagram, we can see a lot of boxes. So this notion that you start at data and you go through many stages of processing the data, validating it, preparing it, training models, validating models, and all of this happens in one big, monolithic, orchestrated ML pipeline. I'm going to show you that we don't need to do it that way. Databricks are following a similar pattern for how to encourage people to, as they say, follow MLOps best practice. I see this really as being an overcomplication of something which is much easier to do, which is just to build machine learning systems, and to do it according to the principles of decomposing complexity and then putting those parts together again. And for MLOps in particular, which is really about building systems that we can incrementally improve and automatically test and version and so on, we don't need to go down to the infrastructure level; we can stay in Python. That's one of the key points I want to make here. You don't need to become a Docker or Kubernetes expert in order to do MLOps. This is classic TensorFlow Extended. This is what many people learn in courses that they take in machine learning operations: that you need to go from the very beginning to the very end in one big end-to-end ML pipeline. So the machine learning pipeline, I say it's mythical, because in practice I've never seen an end-to-end pipeline written like this, where the data comes in, it's validated, transformed into features, models are trained, models are analyzed, then they're validated, and then they're served, all in one big monolithic directed acyclic graph. That's not typically the way it works. And the reason why it doesn't work like that is because monolithic ML pipelines couple different natural stages of machine learning systems. So you have what's called feature engineering, where we take the raw data and turn it into the features that we're going to use to train models, but also to make predictions with. Now, if you couple that phase with the model training phase, training might require GPUs if you're using deep learning; feature engineering does not require GPUs. You put it all into one large system, and then suddenly you're maybe using GPUs for feature engineering, which is quite wasteful in terms of resources. If your feature engineering is done once a day, because your new data arrives every day, but you need to predict every hour, why would you couple those two things together? Inference should not be coupled to feature engineering when they run at different cadences.
By coupling them all together, you're adding development and operational complexity. You're also not reusing any of the data or features that have been created by your feature pipelines, and it's too hard to build these systems in such a way as to get to a minimal viable product. I'll give you an example of one, Kubeflow Pipelines, just to pick on one; I could have picked on many different systems. They claim on their website that it enables you to reuse components and pipelines quickly and easily to create end-to-end solutions without having to rebuild each time. I've never seen this happen in practice. So MLOps: we still want to follow the principles of MLOps, and that means testing your software. And you can think of it as being a hierarchy of needs. You have raw data, which needs to be tested to create features. Features are used to create models, so the features need to be tested in order to create the models, and the models need to be tested in order to be used by the ML-enabled applications. And typically your ML-enabled application will want to A/B test a model before it switches over to a new version of a model. That's where we want to get, to the top of the pyramid of needs, if you will, for MLOps. But we need to start somewhere. And the place to start, as any software developer will tell you, is with a working system. And in this case we're building a machine learning system. So we need to get to a working machine learning system as quickly as possible. And that means we need to have some code which will create the features, we need to have some code to train the model, and we need to have some code to make predictions, or inference, on new data that arrives, using our model. So MLOps as a set of principles helps you get to a working system, a baseline, as quickly as possible and iteratively improve it. And the reason why you should be able to iteratively improve your software, if you're following DevOps or MLOps principles, is that you're testing: we're testing the features and the models, and we're versioning the features and models. So if we're going to do upgrades, if we're going to deploy a new model that's connected to some new features, and they're both not versioned, we'll have a terrible time connecting those two things together and ensuring that things can be safely upgraded. So once you have testing and versioning in place for the two main assets that we see in machine learning systems, features and models, then you're able to move quickly, you're able to make small changes and improve your iteration speed, and you're testing your models and you're testing your features, so you're improving the quality of your software. And that's ultimately where you want to get to: where you can move more quickly, make small changes to iteratively improve your systems, and be confident that the changes you make will not break everything. So that's the goal of MLOps, and that's what we as developers would like to use to make our code better. Let's jump into MLOps in Python. We're not going to use infrastructure; there'll be no Docker or Kubernetes, and we're not going to talk about cloud infrastructure. We're going to talk about the way in which we advocate, and particularly I advocate, for structuring these machine learning systems to make it easier to manage them. So what we typically have are these three main components: the feature pipeline, the training pipeline, and the inference pipeline. I call this the FTI pattern, to make it easier to remember. You have one program,
in this case we're looking at Python programs, which take data from our data sources. This program will typically be an operational program, so it might need to be scheduled. It could be Airflow if it's Python, or it could be a Python program that's scheduled to run in the cloud. So there's some pretty nice tooling out there, like Modal, which allows you to schedule these Python programs with a cron-like hourly run or daily run. And those programs are basically going to read your raw data and compute features. If you've got supervised machine learning, you may need to create labels as well. And rather than store that data in an object store or in a distributed file system, we're going to look at a feature store as a way to store that data. And that's because the feature store will provide us with a very nice data frame API. Much feature engineering and many feature pipelines are written in frameworks like pandas. If you need more performance, you might move to Polars. If you have a lot of data, you might move to PySpark. But the output of all of those is a data frame. So you can just write that data frame, and you can obviously do that in a secure manner: we'll use API keys, for example, when we write to Hopsworks. Then you don't have to worry about the complexity, if you're in a cloud environment, of getting access to a particular bucket or an IAM role that gives you permissions to do that. It's basically going to be an API key, some privileges associated with it, and then you write your data frame. Your training pipeline should be able to use the feature store in the same way. We'll be able to say, okay, I'm going to select features from all of the available features, create some training data, select labels as well, train my model, and when my model has been trained, I'll need to store it somewhere. So typically where we would store models is in some place called a model registry. There are many model registries out there; some of them are hosted over the Internet. Hopsworks is also hosted over the Internet. We can call that a serverless model registry or serverless feature store. But it basically means, at least when you're getting started, you can write a Python program on your laptop or in Colab, and it can read from a feature store and write your model back into a model registry. So Hopsworks is also a model registry. We'll look at that later. And then finally, you have an inference pipeline. This is where you generate value with your model, and this is where most machine learning courses stop. They'll train a model and evaluate it on a static test set, a holdout data set, to see how the model generalizes on data it hasn't seen before, which is great. But if you want to see your model in the natural environment where it's creating value, you should write an end-to-end system so that new data can come in through the feature pipelines. And for a batch inference application, it can take that new data that's been written via the feature pipeline, read it up as features, read up the model from the model registry, generate the predictions, and store them somewhere where maybe a downstream dashboard or an operational system will consume those predictions to make those applications AI-enabled. Of course, this generates logs; you typically want to save those to help improve your observability and monitoring of the system. So I'm going to look at a little demo later on. The code is available on the link below. It's basically looking at credit scoring as a machine learning system.
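The code walkthrough below assumes a connection to Hopsworks has already been set up. A minimal sketch of that setup with the hopsworks Python library, assuming your API key is available in the HOPSWORKS_API_KEY environment variable:

```python
import hopsworks

# Assumes the HOPSWORKS_API_KEY environment variable holds an API key
# (created on app.hopsworks.ai); otherwise login() prompts for one.
project = hopsworks.login()

fs = project.get_feature_store()   # data layer shared by all three pipelines
mr = project.get_model_registry()  # where training pipelines store models
```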
This is quite a popular type of machine learning system we see at financial institutions. You may have information about people who would like to apply for credit, and we'd like to give them a score to decide whether we're going to give that person credit or not. So typically you'll need to have data from different sources, and you can see there are some sources on the left here. A feature pipeline will create the features from those data sources. The training pipeline will train your model; we look at an XGBoost model. And then the inference pipeline will take this XGBoost model and some new credit applications to score, and that could be done in a batch manner. You could have a batch of applications arrive, and once a day you score them, and then you send an email out the next day to say you're approved. But even better would be an interactive application: as the user goes to a website and fills in their details, we can have an online system which can read those features back, and the feature store, and Hopsworks in particular, enables that. We're going to look at the batch case today. The training pipeline can be run in any kind of training environment, any Python-enabled environment, and our pipelines can also be run in any Python environment. So today we'll look at running it in my notebook, but of course you can schedule them to run in any Python environment. So let's start and look at feature pipelines and have a look at a little bit of the code that we need to create a feature pipeline. The feature pipeline will take your raw data and compute features from it; that is, compressed signal that we're going to use to train our models with and also to make predictions with. So there's an example here where we're using pandas to do our transformations. We call these model-independent transformations, because the features we store in the feature store should be reusable across many different models. We can see here that we're doing a classic aggregation where we're counting the number of events happening in a four-hour window. We can see that when we've created a data frame, window_aggs_df here, with the features that we'd like to use, we'd like to store them in the feature store. And these features here are related to the credit card: the number of transactions, or the frequency of the transactions, in that four-hour period of time. Basically, with this window_aggs_df data frame, what we're going to do is insert it into the feature store via this thing called a feature group. So here we're creating the feature group. We're giving it a name and a version, a description. We're identifying which column is the primary key: the unique row-level identifier column that uniquely identifies each row in this data frame. If there is a timestamp column within that particular data frame, it doesn't need to be unique; of course, timestamps can be the same across many different rows. But if there is an event time at which that particular row was generated, as a column, you can specify it here. Because what we'll see is that in feature stores, if I have many different tables of features that have different event times on them, we need to line those up so that we don't get what we call data leakage. So we don't get future data when we join these columns of features together from different tables. We don't want to have data that's in the future, because then our model would be able to learn from future data, which we don't want. So that's called data leakage, and we want to avoid it.
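As a rough sketch of such a feature pipeline (not the exact demo code; the column and feature group names here are assumptions, and fs is the feature store handle from the login sketch above):

```python
import pandas as pd

# Assumed raw data: one row per credit card transaction,
# with columns datetime, cc_num, and amount.
df = df.sort_values("datetime").set_index("datetime")

# Model-independent transformation: per-card transaction count and mean
# amount over a rolling four-hour window.
window_aggs_df = (
    df.groupby("cc_num")["amount"]
      .rolling("4h")
      .agg(["count", "mean"])
      .rename(columns={"count": "trans_freq_4h", "mean": "trans_mean_4h"})
      .reset_index()
)

fg = fs.get_or_create_feature_group(
    name="transactions_4h_aggs",
    version=1,
    description="Transaction aggregates over a 4-hour window",
    primary_key=["cc_num"],   # unique row-level identifier
    event_time="datetime",    # enables point-in-time correct joins
)
fg.insert(window_aggs_df)     # write the DataFrame to the feature store
```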
So if you specify the event time column in your feature group, that can be used by the feature store to do what's called a point-in-time correct join, so there's no data leakage. Once you've created this feature group object, you can just insert your pandas data frame. So you just call insert on it, it'll write it to the feature store, and your data will end up there. Now, Python is great for that smaller scale of data. If you have larger volumes of data, you may want to use PySpark. It's great for scale, but it's obviously a bit more challenging to develop with, more challenging to debug and operate. SQL is also quite popular. SQL has low operational overhead, and it can scale quite well. But there are many features, as you know, that are very difficult to implement in SQL, even if you embed UDFs in your data warehouse, things like embeddings, and there are of course many libraries we use with pandas to compute features that are not particularly suited to SQL. Now, Python is obviously a great and popular language to develop features in, and it has low operational overhead. Pandas is very popular; Polars has become popular, and it scales to larger data volumes than pandas; but they're all good choices. Now if you need stream processing, so very low latency features that are computed on real-time data, then you might want to look at a streaming framework like Flink or Spark, which has Spark Streaming. If you want to write that code in Python, PySpark is probably your best bet, as Flink is very Java-centric. So in Hopsworks we can write the data to the feature store from Python via what we call a streaming API or a batch API. The batch API will store the data offline. It's only historical data, so it'll only be available via the offline API. The offline API is used to get training data or batch data that you'd like to do inference on. But if you have an online application, you need low latency access to your features. So if we have, for example, the credit scoring application that needs to look up your features within a few milliseconds because it's going to do a live prediction of your credit score, then you need the online API. In that case you write using the streaming API. Now in Hopsworks the default is the streaming API. So when I called insert on my feature group, it wrote the data via this API. It'll be stored in both the online and offline stores, and it'll be available via both the online and offline APIs. Now what we can see here is that the tables of features are called feature groups, and then we have something called a feature view. So in Hopsworks you're able to reuse features across many different models, and the way you do that is by selecting features into something called a feature view. You can select from many different feature groups, all updated potentially by different feature pipelines, and then the feature store will perform this point-in-time correct join for you. So if we're able to reuse features across many different models, we can see here one model at the top has a feature view selecting a certain set of features, another feature view down here selecting different features, and each one can create its own training data sets to train models on. Then if we're doing batch inference, you'll pull the features via the feature view, say the data that arrived in the last 24 hours, for example, and then you'll do inference on those with your model.
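A sketch of selecting features from two feature groups into a feature view; the names are illustrative, and the feature store performs the point-in-time correct join when the data is read:

```python
# Feature groups, potentially updated by different feature pipelines.
trans_fg = fs.get_feature_group("transactions", version=1)
window_fg = fs.get_feature_group("transactions_4h_aggs", version=1)

# pandas-like selection and join; returns a query object, not data.
query = trans_fg.select(["category", "amount", "fraud_label"]).join(
    window_fg.select(["trans_freq_4h", "trans_mean_4h"])
)

fv = fs.create_feature_view(
    name="transactions_view",
    version=1,
    query=query,             # the selection of features
    labels=["fraud_label"],  # the target column for supervised learning
)
```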
Now, training pipelines use these feature views to read data to train models with. So it uses this offline API. You can get your data back as files, maybe CSV or Parquet or even TFRecords, or you can get it back as a pandas data frame if the data is not too large, and you can train directly with your data. And it has support for nice things like random splits and time-series splits of your data, so you don't need to do that yourself; you can just read the ready-made split data frames with your features and labels. So the feature view itself is an API for model development and also for operations, in the online case. You basically select your features and say which column is the label for this particular feature view. One other thing to note is that features are typically shared untransformed. What that means is that if you have a categorical variable, you store it in the feature store unencoded, and then when your model is selected, it can encode it, because some models will need to encode categorical variables, for example gradient-descent-based models, but other models, CatBoost for example, can work directly with the categorical data. And the same is true for numerical variables. So we call those model-specific transformations. They can be associated with features in the feature view so that you don't have to write that code separately yourself. Because there is a potential problem: if you write the code in the training pipeline to transform features, to encode them or normalize numerical features, you need to do the same thing in the inference pipeline, and there is a potential for what we call skew there. So the feature view will help you avoid that potential problem. Once you have the feature view, it can apply the model-specific transformations on the data, create training data and batch inference data, and it'll apply those transformations consistently. So the point-in-time correct join I mentioned already: you take features from different tables and ensure that they're lined up on the correct transaction event time here. So for this value of this category at this point in time, we can look up the amount for that particular week and month, and we're not going to get future values in here. There'll be no data leakage. Now, if you were to write this point-in-time correct join yourself, it's going to be quite complex. For this example, we can see there's quite a lot of SQL code for it, but in Python you just write the code below. You select the features that you'd like from the different feature groups, you join them together in pandas-like syntax, and you then get back what's called a query object. This is our selection of features, and then we can create our feature view with that particular selection of features. Now once we've created our feature view, and we can see we've got a feature view object here, we can create training data from it, and we can split that training data randomly into a training set, a test set, and even a validation set as well, and you can say what file format you'd like to store that training data in. You can also get the training data back directly as pandas data frames. So here we get our features in the training, test, and validation sets, and the labels in the training, validation, and test sets, and I don't need to run it through a scikit-learn splitting function.
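In code, reading a ready-made random split from the feature view might look like this sketch (using the feature view created above):

```python
# Random 80/20 split, returned directly as pandas DataFrames with any
# model-specific transformations already applied; time-series splits
# take start/end times instead of a test fraction.
X_train, X_test, y_train, y_test = fv.train_test_split(test_size=0.2)
```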
This is going to be a random split, but there is also a time-series split: you can say, if I've got a time-series model and time-series data, I'll train on the data up to a particular point in time, and my test set will be the data that came after that point in time. Now what you can also do with a feature view is get training data, for example, for version one of your model, or get training data for version two. New data will keep arriving in the system, the feature pipelines will keep creating new features, and then you can read that data and make predictions on it. So we can do batch inference with the feature view as well; we create batch inference data. And finally, for feature views we can specify specific features that we'd like to transform. So here we're saying: for the amount of money last month, we're going to apply the standard scaler; the category variable, it's a categorical variable, we're going to encode it, applying a label encoder to it; and the same for the amount last week, the standard scaler. So when we read the training data, the batch inference data, or in this case the online inference data, the feature vector, it'll apply these transformations after it's read the data from the feature store, but before it's returned to the client. And this ensures consistency of these transformation functions between the training pipeline and the inference pipeline. You can do it yourself in scikit-learn pipelines, but then you need to make sure that both of these are consistent. So let's move over to inference pipelines and have a look at some of the code. This is a predictor class. So this is a model that will be deployed in a model serving server. We can see here the method called predict. But before we predict, we need to initialize this object. So when it's loaded into our model serving server, init will be called: we'll connect to the feature store, which we can see here, and we'll initialize the feature view. And what we need to do is tell it the version of the training data that we trained our model on, because we can create many different training data sets from a feature view. Here it was version number one, so we say, okay, I'm going to initialize my feature view on version number one. The reason we need to do that is that the transformations often need state from that training data set version. So if I'm normalizing a numerical feature, I'll need to know what the mean value of that feature was in the training data set the model was trained on. That information is captured in the feature view; you don't need to supply the training data set when you're deploying your model for serving here. And then you can see we're loading the model from the model registry, and it's now available in our predictor object. So when a request comes in to make a prediction, the model is loaded, the connection to the feature store has been established, and we can just call self.model.predict: we read the precomputed features from the feature store here and then send them to the model to make the prediction. So this is going to go to the feature store and return our precomputed features; we're going to cast them to a NumPy array, and then the model's predict method will be applied to that NumPy array to make our prediction, which is returned to the client.
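A sketch of that predictor class, assuming a scikit-learn-style model saved with joblib; the feature view, model, and key names here are illustrative:

```python
import os
import joblib
import numpy as np
import hopsworks

class Predict(object):
    def __init__(self):
        # Called once when the model is loaded into the serving server.
        project = hopsworks.login()
        fs = project.get_feature_store()
        self.fv = fs.get_feature_view("transactions_view", version=1)
        # Transformation state (e.g. the mean used by a standard scaler)
        # comes from the training dataset version the model was trained on.
        self.fv.init_serving(training_dataset_version=1)
        mr = project.get_model_registry()
        model_dir = mr.get_model("fraud_model", version=1).download()
        self.model = joblib.load(os.path.join(model_dir, "model.pkl"))

    def predict(self, inputs):
        # Look up precomputed features by primary key, cast to a NumPy
        # array, and return the model's prediction to the client.
        features = self.fv.get_feature_vector({"cc_num": inputs[0]})
        return self.model.predict(np.asarray(features).reshape(1, -1)).tolist()
```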
Let's have a quick look at a demo. So in this particular demo — the way that we often see a lot of these machine learning systems built when people are starting out and want to get started as quickly as possible — you can write Python programs in a very nice tool called Modal. Modal has a very generous free tier. It basically can schedule Python programs for you, and if you need to install libraries, you just annotate the functions and say pip install hopsworks, for example (there's a small sketch of such a scheduled function below). And in that Modal code we can write our feature engineering pipeline that we saw earlier. It'll read the data and write data frames to Hopsworks. And then Modal can also train. You may want to train elsewhere, because training is not an operational system: you can train on Colab and do it offline, if you will, or on demand, or when the model has become stale. You don't necessarily need to do it every day, as you would in a feature pipeline if new data were arriving every day. And then for inference: again, it's an operational system. So maybe it's going to be either on demand, when the user goes to the website in an interactive system and makes a prediction, or maybe it's going to be a dashboard that gets updated at some cadence. So Hugging Face is a really nice way of doing a lot of inference, because it has, again, a very generous free tier where you can use Hugging Face Spaces to build some really nice UIs for dashboards or for interactive applications, in this case reading from Hopsworks. They can read the models and the data frames; this is the data that you want to make your predictions with. And we can also even save the logs back to Hopsworks and even build user interfaces in Hugging Face for monitoring our models in production. Hopsworks in this case, again, has a free serverless tier. You don't need to install anything, and you get 25 GB of free storage. This is a really nice way to build what we call serverless machine learning systems. And here are some examples; there's a bunch of these on the Internet. I think there are about 40 or 50 of them now. Some really nice ones are air quality predictions in Poland. This is a Streamlit application. It runs the feature pipelines daily on Modal, and it then stores these features in Hopsworks. And when you go to the dashboard here, it can redraw with the predictions of the air quality in the different cities. So this is for today — this is actually early March — and you can look at different days, and it will connect to Hopsworks, read down new data to score, and redraw this particular UI in Streamlit. And there's another one here that will predict whether your post on Reddit will be liked or not, so it's using sentiment analysis, and it's very interesting. Another one on Tesla stock price prediction — I think it's using sentiment from Twitter, amongst other places — and then another one using New York electricity. A lot of services globally have been digitalized, and a lot of data is available now for us to build these prediction services with. So you can write a Python program which will go to the New York electricity market, and if you follow this link, you'll see where they get the data from. And you can see in many cases it's actually outperforming the EIA's daily forecast, which it's doing here. So there are a lot of cool things you can build. I'm going to show you briefly this one that I mentioned here. There's a ton of them in the Hopsworks tutorials. The one I'm going to show you briefly here is called credit scores.
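The scheduled-feature-pipeline idea mentioned above might be sketched like this with Modal's 2023-era Stub API (the secret name and pipeline body are assumptions):

```python
import modal

stub = modal.Stub("feature-pipeline")
image = modal.Image.debian_slim().pip_install("hopsworks", "pandas")

@stub.function(
    image=image,
    schedule=modal.Period(days=1),  # cron-like daily run
    secrets=[modal.Secret.from_name("hopsworks-api-key")],  # assumed name
)
def run_feature_pipeline():
    import hopsworks
    project = hopsworks.login()  # reads HOPSWORKS_API_KEY from the secret
    fs = project.get_feature_store()
    # ... read raw data, compute features, insert into a feature group ...

# Deploy the schedule with: modal deploy feature_pipeline.py
```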
Now, I've actually opened up the notebooks already, so this is just running on my laptop here. These are the Hopsworks tutorials that I've checked out locally. Do I need quick start? No, I don't. Let's move to the first one. So in this case we've got the advanced tutorials, credit scores. What this is going to do — I actually have two feature pipelines at the beginning. Sometimes you'll have two: one for historical data, to create your feature groups and add all the metadata. This is a backfill one. So what this one is doing is basically reading some CSV files with our loan applications, bureau balance, credit card data, payment installments, and a lot of other data, and doing some feature engineering. It's also showing you some EDA as well, which you typically would do, and then creating the feature groups. This is where I ran it, I think, earlier on, so there might be some diagrams in here. You can see we're trying to understand the data before we get into the process of modeling. But obviously the kind of things you do here is clean up the data and extract or create any features that you need. And then what we can have is a feature pipeline. This is a much simpler program. It's going to read the new data, and this should run on a cadence. Now, it's a notebook, so it's really just for learning. But it'll connect to the feature store; we can see here that it's getting a reference to the feature groups. And in this case, we're not reading new data from the Internet, because this is credit card loans and so on — that information is not available. So what we're doing is generating data. This is a really good way of doing synthetic feature pipelines if you can't access the actual data. So we're just generating some random data in this case — or not random, but based on the distribution of the historical data. So it's generating new feature values based on those distributions (there's a small sketch of that idea below), and then we're just going to write to the feature store. You basically just call insert on these data frames, and now you'll have some feature groups in here. So let's have a look at what the feature groups look like. We can see here, this is Hopsworks, and we've got a bunch of projects. This is app.hopsworks.ai, so it's free to create an account, and it's time-unlimited; you can build systems on it. And you can see there's a bunch of these feature groups that have been created. One of them is called application. We can see all the features that are in here; there are quite a lot of features, 72 in total. We can see how it's being used: there's a feature view created from it. You can have expectations, if you want to use Great Expectations to do data validation before you write to the feature store, and you can see the results of those expectations here. Likewise, you can create alerts if bad data is being ingested. We can preview some of the data that's in there, to just do some EDA. You can have a look at some of the data — quite a lot in there — so just give me a random sample of some rows. We can precompute statistics over the data so you can understand the distributions; in this case, we've got descriptive statistics of the features. And we can see what's happened in this feature group over time: we had some data written to it earlier, and then we wrote some more data to it later on. So the feature view you can create is the selection of features. This is what it looks like here, and I'll show you the notebook in a second.
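The synthetic-data generation can be sketched in a few lines of pandas and NumPy — sample each numeric column from a normal distribution fitted to the historical data, and each categorical column from its observed values (a simplification of what the notebook does):

```python
import numpy as np
import pandas as pd

def generate_synthetic(df_hist: pd.DataFrame, n_rows: int = 100) -> pd.DataFrame:
    """Generate new rows column by column from historical distributions."""
    synth = {}
    for col in df_hist.columns:
        if pd.api.types.is_numeric_dtype(df_hist[col]):
            # Fit a normal distribution to the historical values.
            synth[col] = np.random.normal(
                df_hist[col].mean(), df_hist[col].std(), n_rows
            )
        else:
            # Sample the observed categories uniformly at random.
            synth[col] = np.random.choice(df_hist[col].dropna().unique(), n_rows)
    return pd.DataFrame(synth)

# new_df = generate_synthetic(historical_df); then fg.insert(new_df)
```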
We can see the provenance of those features. This one came from this particular feature group, bureau; another one came from application. This is the target column that our supervised machine learning model is going to use as the label. And then we can see any training data sets created from it. So I only created one training data set from it, but let's look at the code for that. In our feature view notebook, what we do is we first get a reference to these feature groups, and then we need to select our features, and that's what's happening here. It's called query preparation, but really what it's doing is feature selection, so let's call it that: it's feature selection. We're selecting features from a bunch of different feature groups, or tables, and we join them together. You can add filters and things in here as well. Now, the point-in-time correct join is very complex, as you see, but the only code we had to write was this relatively straightforward code in Python. What we can do then is add transformation functions. And here we can see, we're saying, okay, in this particular case we're looking at a mapping transformer here, a label encoder and a standard scaler, and we're basically applying those transformation functions to the selected set of features that are defined in here. So basically what it's saying is: all the columns that are categorical, we're going to apply the label encoder to, and all the columns that are numerical, we're going to apply the standard scaler to. And that's all going to be captured in this mapping transformer object. So the feature view is created with this selection of features, called fg_query; we're applying these transformations to them; we've identified the target column, which is called target, funnily enough; and then we've given it a name and a version. So the feature views and the feature groups can both be versioned, to enable you to apply MLOps best practices. Of course, once we have a reference to it — and here we're just getting a reference to the feature view again; we don't need to, generally, it's just nice and handy if I start my notebook from here — then what I can do is create training data. I didn't specify the training data version, so it just defaulted to version one, and I've created the training data as files, with a random split of 20% test data and 80% training data. Now the training data set has been created, so I could have stayed in the same notebook and just gone ahead and read the training data and trained my model. I separated it into two notebooks, so these two notebooks really make up the training pipeline; in a production system you'd typically have a single pipeline. But what we can do here is get back our feature view and say, hey, that training data has been created as files, I just want to read it up. And I want to read it up split into my features in the training and test sets, and then my labels, y_train and y_test, from the training and test sets. So that's great. Now I can do some messing with the data here, just cleaning it up a little bit. Then I can use just a simple model: I can use a scikit-learn random forest classifier, or I can use XGBoost. I'm just going to fit to my training features — sorry, my features in the training set — and my labels in the training set. And now I've got my model, and I can predict on the test set, check the accuracy of the model, and save the model to the model registry.
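A sketch of those two training-pipeline notebooks, using the hsfs 3.x-style transformation function API from around the time of this talk (the column lists and model name are assumptions):

```python
import os
import joblib
import xgboost as xgb

# Notebook 1: attach built-in transformation functions and create the
# feature view. categorical_cols and numerical_cols are assumed lists.
label_encoder = fs.get_transformation_function(name="label_encoder")
standard_scaler = fs.get_transformation_function(name="standard_scaler")
mapping_transformers = {col: label_encoder for col in categorical_cols}
mapping_transformers.update({col: standard_scaler for col in numerical_cols})

fv = fs.create_feature_view(
    name="credit_scores",
    version=1,
    query=fg_query,  # the feature selection built above
    labels=["target"],
    transformation_functions=mapping_transformers,
)
fv.create_train_test_split(test_size=0.2)  # materialized as files, version 1

# Notebook 2: read the ready-split data, train, and register the model.
X_train, X_test, y_train, y_test = fv.get_train_test_split(training_dataset_version=1)
model = xgb.XGBClassifier().fit(X_train, y_train)
print("accuracy:", model.score(X_test, y_test))

os.makedirs("credit_scores_model", exist_ok=True)
joblib.dump(model, "credit_scores_model/model.pkl")
mr = project.get_model_registry()
mr.python.create_model(name="credit_scores").save("credit_scores_model")
```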
Then I have an inference pipeline, in this case the batch prediction pipeline. It's going to go to the feature store, connect to the model registry as well, download our model, and get some batch data that it's going to score. In this case it's getting all the data, but we can specify a time range or a set of IDs that we'd like to score for, and then we can go ahead and make our predictions, and then we can save those predictions. So what we're doing is downloading the model first. But once we have the model, then we can make predictions on our data frame, and we can then save those predictions to a data store, and we're done. So that's it. That's the kind of end-to-end machine learning system that you can build with this feature/training/inference pipeline architecture, with the feature store and model registry as the data layer pulling it all together. If you're curious to find out more, I work with Hopsworks, which is a core part of this new serverless machine learning stack. You can go to app.hopsworks.ai and create a free account, or join our Slack. But if you really want to learn more about building these serverless machine learning systems, I recommend you take a course that I developed called Serverless ML. It's all Python, pure Python. The course came out in the fall last year, so we weren't using Modal or Hugging Face — there is no Hugging Face there, I should admit — but it uses GitHub Actions instead of Modal, and they're both great tools for orchestrating and running Python programs. If you want to learn more, go to serverlessml.org, give it a whirl, and good luck on your journey in serverless machine learning.
...

Jim Dowling

CEO @ Hopsworks



