Conf42 Large Language Models (LLMs) 2024 - Online

The future of search

Abstract

Vector databases and semantic search have been around for a while, but the rise of LLMs opened up new horizons in the field of search. In this talk, we’ll explore a project codenamed SeSeDB, which allows searching through databases, text, and documents using human language (any language and even images).

Summary

  • Ben: In this talk, we're going to dive deep into semantic search and how to set it up. The building block of semantic search is the vector database. Gartner predicts that by 2026, roughly 30% of all businesses will have implemented one.
  • There are a lot of pitfalls when you set this up. The most important of them all, by far, is how to handle large datasets. You also have to choose the similarity algorithm that fits your use case best.
  • The demo is a database of roughly 100 BMWs that you can search using human language: "cheap car suitable for my dog", or "extravagant cars for successful managers". That's the idea behind semantic search.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hi, everyone. My name is Ben. I'm the co-founder of Semantee.ai. In this talk, we're going to dive deep into semantic search: how to set it up, what the technical difficulties are, and most importantly, where you would use it in a business sense. So without further ado, let's jump straight into it, starting with a little bit of technical detail. The building block of semantic search is the vector database. This is actually a technology that's been around for some time; in fact, it dates back to around 2005. But with the emergence of large language models, most obviously ChatGPT but many others as well, we've seen a revamp of this technology, and most importantly, a huge technological breakthrough. Semantic search is now technically much easier to implement than it was a couple of years ago, before large language models. Gartner predicts that by 2026 or so, roughly 30% of all businesses will have implemented a vector database, which is the thing that powers semantic search. So what are vector databases? Take the example on the bottom right, where you have a matrix that consists of four words: man, woman, king and queen. Obviously, those are words that are somehow connected. Man and woman are genders; king and queen are both types of royalty, but they're polar opposites within that subcategory. If you were to assign them coordinates, say king at (1, 1), queen at (2, 2), man at (1, 1.5) and woman at (2, 2.5), you can see on that matrix that man and king are much closer to each other than, for example, woman and king. Conversely, woman and queen are very close, but queen and man are a bit further apart. That is because semantically linked words sit closer to each other.
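The toy coordinates from the slide can be checked directly. This is a minimal sketch using the exact 2-D values the talk assigns to the four words; real embeddings would have hundreds or thousands of dimensions, but the distance logic is the same.

```python
import math

# Toy 2-D "embeddings" matching the talk's example:
# king=(1, 1), queen=(2, 2), man=(1, 1.5), woman=(2, 2.5)
vectors = {
    "king":  (1.0, 1.0),
    "queen": (2.0, 2.0),
    "man":   (1.0, 1.5),
    "woman": (2.0, 2.5),
}

def euclidean(a, b):
    """Straight-line distance between two coordinate tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Semantically related words sit closer together.
d_man_king = euclidean(vectors["man"], vectors["king"])      # 0.5
d_woman_king = euclidean(vectors["woman"], vectors["king"])  # ~1.80
print(d_man_king < d_woman_king)  # True: "man" is nearer to "king"
```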
So what happened in this example is that these words were stored as vector representations, which then had to be compared somehow. That comparison is done using one of any number of methods; just to name a few: cosine, Euclidean or Jaccard distance. You basically measure the distance between those coordinates, or vectors, and then you use an algorithm that finds the nearest neighbors when you have a query. That's the way vector databases power semantic search as its backbone. There are multiple solutions out there, and new ones are sprouting just about every day now. But what we found is that they usually only give you a part of the process. The most typical part they give you is the vector database itself, which basically says: okay, you have a bit of data here, it can be documents, structured data, audio, images, and what we're going to do for you is create vectors out of that data. So you put that data in via a REST API or whatever they have at their disposal, and they spit out some vectors for you. And you might be sitting there saying, okay, well, that does not seem like the full search solution, so we need a little bit more. That "more" is now offered by large language models. A large language model basically allows you to do two things: first and foremost, represent a user's query as a vector as well, and also generate an answer once you retrieve some information. But again, you might be saying: okay, so on one side I have the vectors from my data, and on the other side I have the vectors from what the user types in, so I still need to do a little bit of work here. What we decided to do is package up that little bit of work (which is actually a lot of work, as you're going to see in just a second) and offer it as a solution. That's what Semantee basically is.
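The retrieval step described above (embed the query, then find the nearest stored vectors) can be sketched as a brute-force search. The document ids, vectors and query values here are made up for illustration; production systems use approximate nearest-neighbour indexes to stay fast at scale.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def nearest(query_vec, index):
    """Brute-force nearest-neighbour search over a dict of id -> vector."""
    return max(index.items(), key=lambda kv: cosine_similarity(query_vec, kv[1]))

# Hypothetical document vectors, as a vector database would return them.
index = {
    "doc_cheap_car":    [0.90, 0.10, 0.00],
    "doc_luxury_suv":   [0.10, 0.90, 0.30],
    "doc_dog_supplies": [0.00, 0.20, 0.95],
}

query = [0.85, 0.15, 0.05]  # pretend this came from an embedding model
best_id, best_vec = nearest(query, index)
print(best_id)  # doc_cheap_car
```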
And the key difference here is, first and foremost, that similarity search is already built in. Everything is configured for you and packaged up, and everything is accessible as a set of REST APIs. So to use our technology, you just configure API endpoints and that's it. The next slide dives very deep into all kinds of problems we ran into, and let me tell you, there are a lot of pitfalls when you set this stuff up. The first pitfall, or the first question you have to ask yourself, is which LLM you want to use. Somebody would typically say, all right, let's just use GPT-4, it doesn't matter. But there are a lot of considerations to make. GPT-4 is not necessarily the cheapest model, and in fact it's not the best model for some use cases; there are use cases where a much, much less powerful model will actually be more optimal. Just to name an example, GPT-4 will translate text to English, so it does work in a multilingual fashion, but because everything is essentially translated to English, the results may differ depending on which language you ask in. If you want a truly multilingual solution, you're going to have to use a model that is language agnostic, meaning it doesn't actually matter which language the data and the query are in, because everything is just represented as vectors and the language is ultimately irrelevant. The second aspect is choosing the right similarity algorithm. There are many of them, each with its own pros and cons, and you have to choose the one that fits your use case best. But the most important aspect of them all, by far, is how to handle large datasets. The first implementation we did had 1.5 million records, so that one was a real test for us; that's a very large database, to give you a benchmark.
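Why the choice of similarity algorithm matters: different metrics can rank the same candidates differently. A small illustration with made-up 2-D vectors, where cosine similarity (which only looks at direction) and Euclidean distance (which looks at position) disagree about which candidate is "closest" to the query:

```python
import math

def cosine(a, b):
    """Direction-only similarity: ignores vector magnitude."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def euclidean(a, b):
    """Position-based distance: magnitude matters."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

query = (1.0, 0.0)
a = (10.0, 0.0)  # same direction as the query, but far away
b = (0.8, 0.6)   # different direction, but a nearby point

# Cosine ranks a first (identical direction) ...
assert cosine(query, a) > cosine(query, b)
# ... while Euclidean distance ranks b first (closer point).
assert euclidean(query, b) < euclidean(query, a)
```

So the "right" metric depends on how your embedding model encodes meaning; many text-embedding setups normalize vectors, in which case cosine and Euclidean rankings coincide.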
For 1.5 million records, using GPT-4 would have taken years just to get the vectors, and it would have cost hundreds of thousands of dollars to set everything up. That's not something we wanted to invest in. So with our solution, we had to optimize the whole process of creating vector embeddings, and the cost associated with it, and that's something we really tweaked and played around with very significantly. Secondly, you're going to want to curb the cost. Some things are handled very easily by much less powerful models than GPT-4: on the generation side, you might use GPT-4, while on the search side, you might use another language model. It's all your choice, but you'll have to make that choice if you want to get something up and running. Now, in the real world, your database very rarely remains static over long periods of time, right? Entries change, new entries are added. So every time something changes, you're going to have to re-index your database and go through that process again. We've taken care of this re-indexing and made sure it's not something you have to think about when you use our solution. And then, finally, there's implementing it in practice: you still have to integrate it into the customer's application landscape. This is why we make sure everything is accessible as an endpoint; essentially it's just an API, because we wanted it to be very easily integrated into the customer's landscape. Finally, you're sometimes going to run into very complex queries. A user might ask something like: I want a red house which is no bigger than 400 m², but also not smaller than 200, and it should have a garage. Those are very complicated queries. I guarantee you virtually no large language model other than GPT-4 will actually understand that query properly.
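One common way to avoid re-embedding the whole database on every change (this is a generic sketch, not Semantee's actual mechanism) is to store a content hash alongside each vector and only call the embedding model for records whose hash changed. The `embed` function below is a hypothetical stand-in for a paid embedding-model call.

```python
import hashlib

def content_hash(text):
    """Fingerprint of a record's content, used to detect changes."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def embed(text):
    """Placeholder for an embedding-model call (the expensive part)."""
    return [float(len(text))]

def reindex(records, index):
    """Re-embed only the records whose content actually changed.
    Returns the number of embedding calls made."""
    calls = 0
    for rec_id, text in records.items():
        h = content_hash(text)
        entry = index.get(rec_id)
        if entry is None or entry["hash"] != h:
            index[rec_id] = {"hash": h, "vector": embed(text)}
            calls += 1
    return calls

index = {}
records = {"1": "BMW X5, 2021", "2": "BMW 320d, 2019"}
print(reindex(records, index))  # 2: initial build embeds everything
records["2"] = "BMW 320d, 2019, sold"
print(reindex(records, index))  # 1: only the edited record is re-embedded
```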
But more importantly, some parts of that query can actually be handled by keyword search. Right? You have a bunch of parameters in there: a red house, 200 to 400 m², garage yes or no. Those things can be done in a hybrid fashion, where some parts are searched using keywords and other parts using semantic search. Another example is when a user just types in "red car"; it's probably overkill to use semantic search for that. So you're going to have to decide when to fall back to regular keyword search. And finally, when you have an e-shop or a product catalog, there's almost always going to be an underlying SQL database. So you're going to have to create some kind of mechanism to translate the AI search into an SQL search. This is something we've really tweaked and played around with quite a bit, but you're going to have to do it if you want to use semantic search on your e-shop, for example. I could go on and on about all the things we ran into, but these are just some of the more important technical intricacies you have to deal with. Now, on top of all of that, we've done something a little bit extra. This is obviously not a must-have, but we decided to also visualize data using a neat 3D map where you can see the semantic proximity of your words or queries. You can browse through it, click through it, and it's interactive. So that's just another feature we decided to implement into semantic search. So we have the technology down; what's the way you would use it in real life? The most obvious use is search in an e-shop or within your knowledge base: you can just search using human language and you don't have to use keyword search. This one is pretty easy to understand, and it's one of the most common ways semantic search is used. But there are other ways. For example, you could use categorization.
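The hybrid approach for the red-house query might look like the sketch below: the structured parameters become an SQL WHERE clause, and only the rows that survive it would then be ranked semantically. The filter extraction itself (an LLM or a rule-based parser producing the `filters` dict) is assumed to have already happened; the table, column names and rows are made up.

```python
# Structured filters assumed to be extracted from:
# "a red house between 200 and 400 m² with a garage"
filters = {"color": "red", "min_area": 200, "max_area": 400, "garage": True}

# The keyword/structured half of the hybrid search, as parameterized SQL.
sql = (
    "SELECT id FROM houses "
    "WHERE color = :color AND area BETWEEN :min_area AND :max_area "
    "AND has_garage = :garage"
)

# In-memory stand-in for the SQL table.
rows = [
    {"id": 1, "color": "red",  "area": 250, "has_garage": True},
    {"id": 2, "color": "red",  "area": 500, "has_garage": True},   # too big
    {"id": 3, "color": "blue", "area": 300, "has_garage": True},   # wrong color
]

def matches(row, f):
    """Apply the same predicate the WHERE clause would."""
    return (row["color"] == f["color"]
            and f["min_area"] <= row["area"] <= f["max_area"]
            and row["has_garage"] == f["garage"])

candidates = [r["id"] for r in rows if matches(r, filters)]
print(candidates)  # [1]: only these ids go on to semantic ranking
```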
An entire text can very easily be placed into one category or another, or you can use a generative model to suggest categories for that text. This is very useful in, for example, incident management, or when you're working with customer service, so that a request is immediately routed to the proper department because it deals with a certain type of request. Other things you can do include similar-image search: you can paste an image into an e-shop and it will spit out things that look similar. Chatbots are another example: you have the generative part, where the customer is actually talking to the chatbot, but then you have the retrieval aspect. What is the customer actually asking about? You can look in your knowledge base: is there something that's been answered before? Yes? Okay, let's generate that answer for the customer. That's a very powerful chatbot right there. Then there are recommendations. For example, rather than using a typical recommendation engine, you can say: okay, if a customer searches for, let's say, nose drops, you can also offer them tissues and allergy relief, things that are similar in terms of meaning. Anomaly detection is an interesting one, because we're dealing with patterns here. Banks are often searching for things that are out of pattern, and interestingly, semantic search offers a way of doing this very scalably. Matching engines are another thing: if you have a long text, one of the things you can do is match it to other long texts that are similar in terms of meaning. If you have a use case for that, you can use semantic search. This is all nice, but let's see a real-world application. Because I had to stick to PowerPoint, I had to create a GIF, and we're going to go through this GIF a few times. What you have here is an actual demo of a real-world application.
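The routing use case above can be sketched as nearest-centroid classification: average the embeddings of past tickets per department, then send each new ticket to the department whose centroid it is closest to. The centroids and the incoming ticket vector here are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Hypothetical category centroids (averaged embeddings of past tickets).
centroids = {
    "billing":   [0.9, 0.1, 0.0],
    "technical": [0.1, 0.9, 0.1],
    "shipping":  [0.0, 0.1, 0.9],
}

def route(ticket_vec):
    """Route a ticket to the department with the most similar centroid."""
    return max(centroids, key=lambda c: cosine(ticket_vec, centroids[c]))

print(route([0.2, 0.8, 0.1]))  # technical
```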
It's a database of roughly 100 BMWs, and you can search it using human language. So, for example, take "cheap car suitable for my dog". What we're looking for is basically two things: it has the "cheap" aspect and it has the "suitable for my dog" aspect. Cheap obviously refers to a low price, which is very easy to handle: we have to spit out cars that are cheap. But "suitable for my dog" is a semantic aspect, which means we're probably going to need a larger trunk. You can see all three of these cars have that extended trunk; they're Active Tourers. What about "extravagant cars for successful managers"? Again, there are two aspects here. One is that they're supposed to be expensive, because they're meant for successful managers. And extravagant means somehow unusual, right? So you're going to want cars that are flashy, maybe very expensive, maybe just specced to the maximum. When we type this in, we get the XM, which is the most expensive SUV on offer, and the other two have flashy colors, they're specced to the max, and they're very expensive. So that's the idea behind semantic search when you use it in the context of a product catalog like this. And that's actually it for this talk. I'm very thankful that I was able to be here; thanks so much for paying attention. If you have any questions whatsoever, just ping me a message and I'll be very happy to answer.

Ben Fistein

Co-Founder @ Semantee.ai



