Conf42 Site Reliability Engineering 2023 - Online

GPT: Revolutionizing Monitoring and Logging Systems


Abstract

How you can use GPT to improve the performance of your logging and monitoring. We'll go into the specifics of constructing prompts, enriching logs, and automating the process through integration with ELK, so you can quickly build extractions in natural language and get more value from your logs.

Summary

  • GPT is revolutionizing monitoring and logging systems. Where GPT really shines is summarizing large amounts of data into something human readable. How can we use that knowledge to improve applications that we're building?
  • GPT-4 is by a wide margin the most capable large language model out today. GPT-4 32K is also fantastic. Always be careful when sending sensitive data to a third party. If you're looking for privacy, I highly recommend looking at Stability AI's open source models.
  • With many of the large language models out there today, data sent to them is used to train future models. Sending sensitive data to a third party is always a risk. If you are very concerned about the sensitivity of your data and who has it, I highly recommend looking at Stability AI.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hello everyone, and welcome to my talk titled GPT: Revolutionizing Monitoring and Logging Systems. So the AI you've been hearing about in the news mostly refers to LLMs, which include ChatGPT, Bard, My AI from Snapchat, and so on. LLMs are large language models, and they can be roughly thought of as systems that generate likely strings, in the same grammar, that follow an input string. It's important to note that these are not magic, they are not self-aware, and they don't retain any memory of anything that's been said to them beyond what was in the prompts you provided. At the time of this recording, the big LLM providers are OpenAI, which is the market leader in the space and offers the most sophisticated models available today; Google, with their Bard AI; and Stability AI, whose models are open source and can run on your own infrastructure. It's also important to note that LLMs are liars, often, or sometimes rather: while they generate likely strings, those aren't necessarily accurate strings, so they're best suited to workflows where there's some human review, and creating better prompts gives you dramatically better results.

So how do we create those better prompts? Most people use GPT and other LLMs by giving them what are called single-shot prompts, where you give the model a question and expect some output back. Here is an example of a single-shot prompt where we ask it to create a list of characters from the book Dune, in the format of the first letter of the first name concatenated with the last name, as you would usually do for usernames. As you can see, it didn't quite give us the results we wanted, though it gave us something that loosely looks like them. So how can we improve this? We can improve it by giving it some examples. Here we give it the example of Paul Atreides, for whom we want the username "patreides". As you can see, it then generated a list of names that better fit the output we're looking for.

Knowing that providing examples gives us better results, how can we use that knowledge to improve applications that we're building? With OpenAI, we can do what is called tuning as well. A tuned model can be thought of as many-shot: OpenAI allows you to provide JSON-formatted data of example prompts and the example outputs we want for those prompts, and we can send many, many prompts and example outputs this way to OpenAI to tune our model. It's highly recommended that you do this, as it creates much higher quality results and also reduces costs significantly, because we're no longer putting those examples into our prompts, where they count against our token usage.

Here's how we can use GPT to improve our monitoring and logging. Say we have a raw log file. As you can see, it's not too easy to work with as is, so we can give the model a handful of examples of the specific fields we want it to extract from that raw log event. As you can see, it pulled out all of those individual fields; ChatGPT's response is highlighted in green here. Once we have those, we can then say: let's write some regexes that pull out the field values for each of those fields. It didn't always do the best job, but this is a great jumping-off point for building regexes that we can then use in our application.
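To make the few-shot approach concrete, here is a minimal TypeScript sketch of the kind of request described above, calling OpenAI's chat completions endpoint directly with fetch. The log lines, field names, and prompt wording are hypothetical placeholders rather than anything from the talk; adapt them to your own events.

```typescript
// Minimal sketch: few-shot field extraction via the OpenAI chat completions API.
// The example events and field names below are made up for illustration.
const API_KEY = process.env.OPENAI_API_KEY;

const messages = [
  {
    role: "system",
    content:
      "Extract the fields timestamp, src_ip, user and action from raw log events. " +
      "Respond with JSON only.",
  },
  // Few-shot example: one input event and the exact output shape we expect for it,
  // the same trick as the "patreides" username example above.
  { role: "user", content: "2023-05-01T12:00:00Z 10.0.0.5 alice LOGIN_SUCCESS" },
  {
    role: "assistant",
    content:
      '{"timestamp":"2023-05-01T12:00:00Z","src_ip":"10.0.0.5","user":"alice","action":"LOGIN_SUCCESS"}',
  },
  // The real event we want parsed.
  { role: "user", content: "2023-05-01T12:03:41Z 10.0.0.9 bob LOGIN_FAILURE" },
];

const res = await fetch("https://api.openai.com/v1/chat/completions", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "gpt-3.5-turbo",
    messages,
    temperature: 0, // deterministic behaviour suits extraction tasks
  }),
});

const data = await res.json();
console.log(data.choices[0].message.content); // the extracted fields as JSON
```

The same pattern works for the follow-up step of asking for regexes: append the model's extraction answer and a new user message requesting a regex per field.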
Here's an example of where we can use GPT to help us out with our monitoring and logging: this is an event from Azure AD describing a user being disabled. Using that few-shot method discussed earlier, we can give the model some fields and their field values and ask it to get the rest of the fields and field values from that raw log event, and as you can see, it did that. We can then ask it to write regexes that extract each of those fields, which it did with mixed results, but it's a great jumping-off point for building the rest of your field extractions.

Where GPT really shines is summarizing large amounts of data into something human readable. So here's that raw log event from Azure AD that describes a user being disabled. As you can see, it's not very human readable as is, unless you've spent quite a bit of time looking at Azure logs. So what can we do to make it immediately recognizable to an analyst? We can ask GPT to summarize it for us, and as you can see, it did a fantastic job of that: it described the core directory service the log came from, it turned that ISO-format timestamp into something human readable, and it described who the acting user was, who the target user was, and the result of that operation. This was done using very little training, and it was immediately useful to us.

How can we do this automatically with Kibana? Kibana plugins are written in TypeScript, so they're pretty easy to work with, and Elastic offers a template plugin on their GitHub page. I highly recommend you take a look at that and, building off that template, add the API integrations you want to use. There are also helpful guides out there if you want to create your own Kibana plugin from scratch. With the Kibana plugin that we put together, we were able to give it that original log file and then have GPT add a description to that log, which we then stored. This is much more readable to a human analyst who would be reading through it. Some caveats with building plugins that interface with OpenAI: there is a token limit, which we'll talk about later, that basically means we can only send some amount of data to the API and get some amount back from OpenAI. So the raw event we're sending may not fit, and we want to be careful to trim what we send to only what's necessary. It's also worth noting that this raw event may contain information you don't want to send, including client IDs, specific usernames, and so on.
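Below is a minimal sketch of the kind of enrichment call such a plugin (or any ingest step in front of Elasticsearch) might make; it is not the plugin from the talk. The function name, prompt wording, model choice, and the crude length trim are assumptions for illustration.

```typescript
// Sketch: ask the model for a human-readable summary of a raw log event, the
// text a Kibana plugin could store as a "description" field on the document.
export async function describeLogEvent(rawEvent: string): Promise<string> {
  // Trim before sending: stay under the token limit and avoid forwarding more
  // of the raw event (client IDs, usernames, secrets) than necessary.
  const trimmed = rawEvent.slice(0, 4000);

  const res = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "gpt-3.5-turbo",
      messages: [
        {
          role: "system",
          content:
            "Summarize the following log event for a security analyst: " +
            "who acted, on what, and with what result.",
        },
        { role: "user", content: trimmed },
      ],
      temperature: 0.7, // summaries can tolerate a little more variety
      max_tokens: 200,  // cap the size of the description we store
    }),
  });

  const data = await res.json();
  return data.choices[0].message.content as string;
}
```

The returned text can then be stored alongside the original document, for example in a description field, so an analyst sees the plain-English summary next to the raw event.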
So, working with the OpenAI API, there are a number of different parameters we can look at: the model, the prompt itself, temperature, max tokens, top p, frequency penalty, and presence penalty. Okay, let's talk about those OpenAI parameters; for each of these, there are often analogous parameters for other LLMs. Temperature is a value that describes how random we want the model to behave. It's a float between zero and two, where zero instructs the model to behave completely deterministically, so one prompt will always give us the same answer back, and higher values give more random and varied answers back. It's advisable to use a low temperature for things where determinism is valuable, such as field extractions and creating regexes, and higher values where you want more human-readable responses back, such as summarizations. Tokens describe the maximum amount of data that can be sent to OpenAI and received in response; both the data you send and the data you get back are summed together to see whether they hit that max token limit. One token is loosely one word, though sometimes several words can be combined into one token. Top P restricts the model to the more probable answers: the lower the value, the more probable the answers that are returned. So, for example, 0.1 represents the top 10% of possible answers OpenAI might generate, and it will only give you results that came out of that top 10%. It's recommended that you set either top P or temperature if you want to increase or decrease the randomness of your responses, but not both. Frequency penalty decreases the likelihood of OpenAI repeating itself; it's a float value between zero and one. Presence penalty increases the likelihood of OpenAI talking about new topics. Finally, there's the model, of which OpenAI has many different ones worth taking a look at. Davinci (text-davinci-003) is the most sophisticated GPT-3 model out right now and can provide the most detailed and creative responses, though other models might be faster, lower cost, or more suited to specific subjects such as code generation.

All right, here are some OpenAI models that would be of interest to somebody viewing this presentation. GPT-4 is by a wide margin the most capable large language model out today. At the time of this recording it is a bit expensive to use, so I wouldn't recommend it for bulk tasks, maybe just summarization if a high degree of sophistication is needed in the response. For the most part, I recommend using GPT-3.5 Turbo, which is a very capable model; it's optimized for the sort of work this presentation has covered and can be used very cheaply. However, if you're looking to perform tasks that would require a large number of tokens, such as summarizing very large log files, or if you're looking for very detailed responses back, GPT-4 32K is also fantastic; as you can see in this table, you can use far more tokens with it than with any other model.

Finally, let's talk about some privacy and confidentiality considerations. With many of the large language models out there today, data sent to them is used to train future models, which you would not want happening with a lot of the data you'd be sending them for what we've talked about in this video. OpenAI, in their privacy agreement, do say that they don't use any data sent to them via their API for training new models, so you don't have to fear secrets or sensitive data being included in any training set for future models. However, they do not say that they don't retain logs or other properties about what was sent to them via the API. So always be careful when sending sensitive data to a third party; that might include keys, secrets, tokens, client IDs, usernames, and so on, if that's not something you wish to disclose. If you're looking for privacy, I highly recommend looking at Stability AI's open source models, which you can run locally on your own hardware.
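One practical way to act on that caution is to scrub obviously sensitive values before an event ever leaves your environment. The sketch below is purely illustrative and not from the talk; the patterns, placeholder tokens, and the reference to the hypothetical describeLogEvent helper above are assumptions you would replace with your own.

```typescript
// Illustrative scrubbing pass run before any event is sent to a third-party API.
// Extend the patterns to match the identifiers that appear in your own logs.
const REDACTIONS: Array<[RegExp, string]> = [
  // GUID-style client/tenant/object IDs (common in Azure AD events)
  [/[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/gi, "<guid>"],
  // Email-style user principal names
  [/[\w.+-]+@[\w-]+\.[\w.]+/g, "<upn>"],
  // Bearer tokens or API keys that slipped into the log
  [/Bearer\s+[A-Za-z0-9._-]+/g, "Bearer <redacted>"],
];

export function scrubEvent(rawEvent: string): string {
  return REDACTIONS.reduce(
    (text, [pattern, replacement]) => text.replace(pattern, replacement),
    rawEvent,
  );
}

// Usage: scrub first, then extract fields or summarize.
// const summary = await describeLogEvent(scrubEvent(rawAzureAdEvent));
```

Running something like this ahead of the extraction and summarization calls keeps GUIDs, user principal names, and credentials out of whatever the provider may log about your requests.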
And that is all. My contact info is clangston@oak9.io. Feel free to reach out to me if you have any questions about anything we talked about today, or if you just want to say hello. All right, thanks everyone.

Clay Langston

Security Engineer @ oak9

Clay Langston's LinkedIn account


