This episode features Bob van Luijt, Co-founder and CEO of Weaviate—a prominent open-source vector database in the industry. At just 15 years of age, Bob started his own software company. He went on to study music at ArtEZ University of the Arts and Berklee College of Music, and completed the Harvard Business School Program of Management Excellence. Bob is also a TEDx speaker, exploring the fascinating relationship between software and language.
In this episode, Raja and Bob trace the evolution of AI over the years, the current LLM landscape, and the outlook for the future. They dive deep into LLM concepts such as RAG, fine-tuning, challenges in enterprise adoption, vector search, context windows, the potential of SLMs, generative feedback loops, and more. Lastly, Raja and Bob explore Artificial General Intelligence (AGI) and whether it could become a reality in the near future. This episode is a must-watch for anyone interested in a comprehensive outlook on the current state and future trajectory of AI.
Next Recommended Podcast: Building a Multi-million Dollar AI Business – AI Founders Reveal their Success Hacks
(AI-generated transcript)
Raja:
Welcome to the show, everyone. I’m Raja Iqbal. My guest today is Bob van Luijt. Bob is the founder of Weaviate, which I consider one of the foremost databases in the industry. One of the foremost vector databases, in fact. Welcome to the show, Bob.
Bob:
Well, thanks so much for having me, and for your kind words in the introduction.
Raja:
So let’s switch to the general LLM landscape. When you look at the overall landscape, there are many companies working in this area, and the tools and technology are emerging at every layer, starting from the models and embeddings, then LangChain, LlamaIndex, and Semantic Kernel.
Raja:
For lack of a better word, frameworks, or orchestration frameworks. Then you have vector databases, again with many players out there, then guardrails, fine-tuning, and papers coming out like RAFT, where you’re fine-tuning using some retrieval augmentation. So many things are happening.
Raja:
So a lot is happening, and the fascinating thing is that it is hard to keep up. We run the first LLM bootcamp in the industry and we have been playing catch-up, right? By the time we deliver a bootcamp, the next bootcamp is happening two months from now, and we will have to accommodate so many new things, right?
Raja:
So we are constantly playing catch-up. Someone on a customer call asked, “Hey, do you teach everything that is the latest and the greatest?” I said, no, that’s not possible for us, because we have to have a lead time, right? We cannot simply say something came out yesterday and we’ll teach it tomorrow in the bootcamp.
Raja:
Right? So we are living in very interesting times. Which company, besides Weaviate of course, or maybe a bunch of companies, do you think is going to disrupt things? Any ideas, any companies, any space in general? Where do you think the most juice is that needs to be squeezed, or who has already figured out where the juice is and is trying to squeeze it? Where do you think the biggest innovation is going to happen?
Bob:
Yeah. So let me start by answering this question on a meta level, if I may, because it’s an important question, right? For a long time, we, as in the industry at large, were asking ourselves a question, maybe you guys did as well.
Bob:
The question was: is this whole machine learning thing, and I’m using the term machine learning on purpose and not AI, just a great feature, right? Or is it really a new industry in its own right? Is something really new happening? And of course, as I run a business that sits in that space, I would make that argument.
Bob:
But I can actually make the argument agnostic of the business I’m running. If you look at the last couple of decades, the big things were the internet, mobile, and cloud. And the reason I would argue that the fourth wave is AI is because we can do some pattern matching.
Bob:
And one of the patterns we can match is that every time a new era erupts, there’s a big bang of stuff, right? Frameworks, infrastructure, apps, educational content, whatever you can think of. That’s what’s happening right now. We’re just after the big bang, right? It was ignited, the big bang happened, and that’s where we are right now.
Bob:
So now the particles in the big bang need to organize themselves, right? And here we can do some pattern matching again, because yes, it’s new, but not everything is new, right? And one thing we see is that a couple of things are the same.
Bob:
So, for example, infrastructure is the same. The role of data gravity seems to be the same. The role of frameworks seems to be the same. The way we educate developers and data scientists seems to be the same. Let me just say developers: everybody who writes code.
Bob:
So how we educate these people, how they learn, the curves they go through, and how businesses, enterprises, and startups adopt these new technologies is the same, right? Now all of a sudden these pillars start to emerge and things fit in there. So let me start with the first one: infrastructure, that’s where we sit.
Bob:
But what’s unique is the models, right? The models, model-serving infrastructure, the role that chips play in that, those kinds of things. That’s extremely important, but we don’t know if that will stay; I’m very bullish on CPU inference as well. And with the databases, like in our case with Weaviate, there’s data gravity, right?
Bob:
We still have the data that we enrich with vector embeddings, and there’s a form of gravity. Frameworks have a similar function. The problem, no, sorry, not problem, the challenge with a framework is that, because frameworks are stateless, it’s a little bit harder to build a business around them, right? So let’s go back in time and look at the previous epoch, if you will.
Bob:
In the cloud era, we also saw these frameworks starting to emerge. You might remember from the front-end days, right? You had Angular, React, those kinds of things. And these frameworks tend to come and go because things evolve. I remember how popular Angular was, right? But now not a lot of people use it anymore.
Bob:
And what makes it so hard is that you don’t keep state, right? The value sits in connecting things together, those kinds of things. So our friends at the framework companies need to somehow figure out how to do that. I mean, there are companies that are very successful at it, but it’s harder.
Bob:
As an infrastructure company, the gameplay is very traditional, right? The job to be done of infrastructure is to make sure there’s no data loss, that it’s fast enough, that it’s easy to operate, that kind of stuff. That’s what people pay you for. Then education is very similar. We’ve learned that the world today lives through a bottom-up motion.
Bob:
Through the developers you go into the enterprise, and education plays a crucial role in that. So that’s unchanged. But now there’s something else interesting, and that’s the models, right? What do we need to think about the models? Every day there’s a new foundational model being released. And my question about models is: are they stateful or are they stateless? They’re stateless.
Bob:
You download a model and it’s a snapshot in time, if you will. And what makes it very hard, the metaphor I often use is an MP3 file, is that it’s so hard to monetize arbitrary files themselves. Because if I listen to a great song and I send it to you and say, “Raja, you really need to listen to this song,” then you have value.
Bob:
I have value, but for the artist to capture that value twice is very difficult, right? That’s not the case with a database, because with open source, I can download the binary, you can download the binary, but the value you’re getting is not from that binary. The value sits in wanting to have it as a service, wanting it hosted some place, that kind of stuff.
Bob:
That’s the business model, right? So one thing that I expect, or predict, will happen is that these models start to merge more with the applications they ship with. And that also goes for the database. And that brings me to your point about RAG, right? This is a long preamble to get to RAG. If we look at the original RAG paper, the argument being made there is that we augment the generation by doing retrieval inside the model.
Bob:
Today, the way RAG is being done is actually very primitive. The reason the vector database plays an important role is that the input query is unstructured. But what we do after that is very primitive, because we just shoot it into the prompt; it’s very ugly. So bringing my preamble together with this RAG point: what I believe is happening, and we’ve done a lot of work on this at Weaviate and published about it, is that we, no pun intended, weave the database and the models more closely together. So for example, if somebody asks, “Did the large context window kill the vector database?”, I say no, because the vector database is becoming the context window.
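A minimal sketch of that primitive flow: embed the unstructured query, rank the stored documents by similarity, and stuff the top hits into a prompt. The bag-of-words embed() and the prompt template below are toy stand-ins for illustration, not Weaviate’s actual implementation:

```python
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; stands in for a real embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

documents = [
    "Weaviate is an open-source vector database.",
    "RAG augments generation with retrieved context.",
    "Ollama runs language models locally.",
]

def naive_rag_prompt(query: str, top_k: int = 2) -> str:
    q = embed(query)
    # Retrieve: rank stored documents by similarity to the unstructured query.
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    context = "\n".join(ranked[:top_k])
    # "Shoot it into the prompt": the primitive step being described.
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(naive_rag_prompt("What is a vector database?"))
```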
Bob:
But the data gravity part still stays the same. So they come together. And if you want to monetize a model, then the way is to use a Spotify-like model for that, right? You have it behind an API. That’s great, but there will be a point in time… Look at the great work that the people at Ollama are doing, right?
Bob:
I can now run these models locally on my laptop. I have an eight-gigabyte MacBook and I can just run Ollama on here, right? So we see where that’s going. Long story short, I think the current state is very fuzzy. It’s like a big bang where infrastructure, frameworks, now the models as well, and applications are just emerging and floating around in space.
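For readers who want to try this themselves: Ollama serves a local REST API once the daemon is running. A minimal sketch, assuming a model such as llama3 has already been pulled:

```python
import requests

# Assumes the Ollama daemon is running locally on its default port (11434)
# and that a model has been pulled (here "llama3"; any pulled model works).
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3", "prompt": "Why are vector databases useful?", "stream": False},
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])
```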
Bob:
But I predict that how they will land is very traditional. What we need to build as builders, and the applications we can build, that’s completely new. But the mechanics, the economic mechanics of how that will land, I think that’s just going to be very traditional.
Raja:
Okay. And you mentioned Ollama in the context of being able to run the model locally on your laptop, on your MacBook. That’s a very interesting point. Are you suggesting that, maybe not across the entire industry, but in many cases, we will have what we call small language models: fewer parameters, maybe more quantization, models that are smaller and perhaps more domain-specific? Is that what you’re hinting at?
Bob:
Yes. The short answer is yes. And the reason I believe that, at the risk of digressing too much, but you might find this interesting: there’s a researcher named Neri Oxman who did some work at MIT where she created something she calls the Krebs Cycle of Creativity. The Krebs cycle, you know, comes from proteins, how they evolve or whatever; I’m no expert on that. But the point she makes with that name is that you have art, science, engineering, and design, and they are interlinked. And if you go way back in history, everything that’s happening now with the models, even before large language models, machine learning in general, comes from analytical philosophy, right?
Bob:
So that’s a kind of artistic, creative way of thinking, and that plays into science. And the scientists had the goal of solving the problem of creating relations in language: proving that there’s a relation in the way words are structured in a sentence, or pixels in an image, but for the sake of argument let’s keep it to language. And out of that came the following.
Bob:
The scientists just wanted to prove: hey, we think that by applying attention we have a mechanism to predict a new embedding, to get to a new token, in a matrix or an array of embeddings. But the engineering part is different. For me, something like Ollama is not a scientific effort, it’s an engineering effort.
Bob:
And the engineering effort is very different. You have these models, there’s a bunch of weights, they sit in this binary file, and that’s the lens through which an engineer looks at it. A scientist looks at that and goes, “Holy cow, this is amazing.” But an engineer looks at it and goes, “This is really slow.”
Bob:
So the holy grail there is: how can I do this as fast, as cheap, and as optimized as possible? In the Krebs Cycle of Creativity, we’re now in the space of engineering, where we basically ask: can we make it faster? How can we make it more optimized?
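A back-of-envelope illustration of that engineering lens: weight memory scales with parameter count times bits per weight, which is why quantization is what makes laptop-scale inference plausible. This sketch ignores activations and KV-cache overhead, so treat the numbers as a floor:

```python
# Rough weight-memory footprint of a model at different quantization levels.
# Real memory use is higher (activations, KV cache), so this is a lower bound.
def weight_gb(params_billion: float, bits_per_weight: int) -> float:
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9  # gigabytes

for bits in (32, 16, 8, 4):
    print(f"7B model at {bits:>2}-bit: ~{weight_gb(7, bits):.1f} GB")

# 7B at 4-bit lands around 3.5 GB of weights, which is what puts these
# models within reach of an ordinary laptop.
```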
Bob:
That’s where we are right now. And based on your question, I would say that’s the focus. That’s why I’m so bullish on CPU inference. I predict that at some point somebody makes some kind of invention, by combining certain boosting algorithms together, and then we go: hey, it’s actually cheaper to run this stuff on CPUs than on a GPU, right?
Bob:
And then out of that, we go to design and then to applications: what is the stuff we can create with this? What does a truly AI-native world look like in the application layer? And then the cycle starts from the beginning again. So this is a long answer to your point: that engineering mindset is, in my opinion, very different from the scientific mindset.
Bob:
A scientist wants to solve the problem, and the engineer asks: how can we do this better, faster, cheaper, and so on and so forth? And that’s where I believe we are right now.
Raja:
So on a lighter note, you mentioned CPU inference, right? Do let me know when you think we are getting ready, so I can start selling my Nvidia stock and start buying Intel stock. It would be interesting. But do let me know ahead of time, when we’re getting closer, right?
Bob:
It’s funny, I was on another podcast and somebody asked me what my prediction was. I forgot how the question was phrased, but it was something like: what would you do now in the market? And I said, well, this is not market advice from me, but I would buy CPU stock, right?
Bob:
Of course, Intel is also a big enterprise. We at Weaviate collaborate with Intel, and we see stuff they’re working on that’s very exciting and that’s speeding things up significantly. And GPUs aren’t going anywhere, because we need them for fine-tuning, we need them for foundational model training.
Bob:
So they don’t go anywhere. But from an inference perspective, it’s a completely different story. This is a prediction, though; I could be completely off. Let’s do this again a year from now and see if I was right.
Raja:
Yeah, I will bookmark this discussion and remind you. So going back to your point about small language models being an engineering effort, there are these two things, right? Now we can actually build models that are more efficient, more optimized, smaller in size, and they are definitely needed. Most notably, Llama has a lighter, even lighter-than-7B version that was released recently, Microsoft has Phi, Apple also has its own, and I’m sure there are others.
Raja:
So that’s there, but there is also the more scientific effort, right? On the large language model side: increasing the context window, maybe potentially an infinite context window, or a very large context window. And philosophically, do you think it is the context window that is the constraint, or is it the surrounding engineering?
Raja:
I mean, is a larger context window always better? Is GPT-4 always better than GPT-3.5, just as an example? Is a model that has a bigger context window always better than a smaller-context model? What is your take on that?
Bob:
So please allow me to first start with the simplistic answer. The oversimplified answer is: a larger context window that is also cheaper and smaller to run and operate is better, right? That’s a given. So then the question becomes: what problem are we trying to solve?
Bob:
If you look at it from a scientific perspective, and this is something I often see go wrong when people talk about this from a business perspective, they go: yeah, but this bigger context window, how are we going to sell that? But the scientist doesn’t per se care about that.
Bob:
The scientist says the scientific challenge at hand is: how do I create a context window of infinite length, right? And we as engineers or business people might leverage the knowledge being invented by the scientific community to engineer with, or to build businesses around. And the things that need to be solved are a handful of things, right?
Bob:
The models are big, let’s say clunky. They’re clunky, so we know they need to be smaller. How small? I don’t know; that will be determined by the market. And the models are slow, relatively speaking, so if you run a production use case on them, they’re hard to use, right?
Bob:
The way that we interact with the context window is very primitive right now; there’s a lot of work we can do to integrate things more tightly together. But that’s not the problem our friends on the scientific side are solving. They are solving: what’s the least number of parameters we need to still score high on the benchmarks?
Bob:
What invention do we need to make the context window of a theoretically infinite length? Then the engineers, for example the people working on Weaviate, are asking questions like: how do we speed this up, how do we make this cheaper, how do we make this faster? And then maybe businesses might be able to adopt this and use it.
Bob:
So long story short, there are a couple of challenges and problems that we have with the models. But the scientists solving them right now have a different point of view on the problem than the engineers or the market adopting them. I see those as very distinct things. But where it all comes together: they need to become cheaper, faster, and better.
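One concrete example of the “primitive” interaction with the context window mentioned above: today, engineers often just greedily pack ranked chunks into a fixed token budget. A sketch, using a naive whitespace token count in place of a real tokenizer:

```python
def pack_context(chunks: list[str], budget_tokens: int) -> list[str]:
    """Greedily fit relevance-ranked chunks into a fixed context budget.

    Uses a naive whitespace token count as a stand-in for a real tokenizer.
    """
    packed, used = [], 0
    for chunk in chunks:  # assumed to arrive sorted by relevance
        cost = len(chunk.split())
        if used + cost > budget_tokens:
            break
        packed.append(chunk)
        used += cost
    return packed

ranked_chunks = ["chunk one ...", "chunk two ...", "chunk three ..."]
print(pack_context(ranked_chunks, budget_tokens=5))
```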
Raja:
Yeah. And then there is this discussion around AGI. I was watching a short clip of Sam Altman where he was saying, I don’t care how much money we spend, because we are getting closer to AGI. First of all, how do you see AGI? For anyone who knows AI under the hood, it is still pattern recognition, right?
Raja:
The principle is still the same. So what would AGI look like, in your opinion? Is it going to be, as the name suggests, general intelligence, or is it going to be AGI for a specific task? I know it’s a fairly vague question, because there is no agreed-upon definition right now, in my opinion.
Raja:
And what is the biggest barrier? Do you think it is going to be the scientific side that will achieve AGI? Is it going to be a scientific endeavor combined with the engineering effort, maybe more optimization? Because there are constraints on all sides. There are algorithmic or theoretical constraints.
Raja:
There are compute constraints, and all of that. So I would like to hear your thoughts in response to this very big question.
Bob:
Yeah, I love big questions, so I don’t mind it. But we need to plant a flag in the sand in the form of a definition, right? We need to say something about the definition of AGI, because otherwise we run the risk of running all over the place when we discuss this.
Bob:
So let me know if you agree with this; if you disagree, we can go for something else. But let’s just plant the flag in the sand where we say: AGI is a form of general intelligence that operates using the same level of energy, the same storage capacity, and the same amount of space to store that information as the brain.
Bob:
I mean, I don’t know these numbers by heart, but that’s okay. The brain can store, you know, a couple of petabytes of information and that kind of stuff, and do the operations to retrieve it and get access to it. And it’s the size of, what is it, a couple of hundred grams, right?
Bob:
The energy that our heart generates is enough to power that thing in our heads, right? I don’t know, it’s milliwatts, whatever. So let’s define it that way: if we get to that in a digital form, that’s AGI. It thinks the same as a human, learns the same as a human, stores information the same as a human.
Bob:
So, do you agree with that definition?
Raja:
That’s a very interesting definition, because I always thought of AGI from the capability side, not from, call it, the infrastructure side: how much power, how much storage, what would be the space occupied by it. I had not actually thought about it from that angle.
Raja:
That’s a very interesting viewpoint. And I won’t agree or disagree, because I’m myself still thinking about what AGI would look like. For many of us, everyone assumes it’s some generally intelligent machine, but what would that look like? So that’s a very interesting viewpoint, actually.
Raja:
I like that.
Bob:
Yes, well, thank you. And that’s why the flag in the sand is important, because of the things we need to bear in mind. Let’s take Sam’s argument, right? The reason we need the flag in the sand is that we need to somehow say things about two properties, and that’s time and space. And related to that, energy, I guess.
Bob:
You know the book or movie The Hitchhiker’s Guide to the Galaxy, where they had built a supercomputer and asked it, what’s the answer to everything? And the answer is 42, and then it needs many millions more years to explain why that is the answer. So let’s say, for the sake of argument, that I believe the number Sam mentioned is 50 billion.
Bob:
So, okay, we’re ready, we have the whole setup, and we now press enter. We have to wait, and we have to spend $50 billion on actually generating it, but then we will see a glimpse of AGI, because that’s what comes out of the system. That’s, you know, good for him, potentially bad for the environment.
Bob:
$50 billion spent, but we get to our answer, 42, right? And then it becomes so esoteric and philosophical that it’s great to talk about, but it’s very unpragmatic, right? That is why I like to use the brain, because it says something about energy, time, and space. You know, a couple of milliseconds.
Bob:
If you ask me a question, it takes me a couple of milliseconds to respond, and I do that based on stuff that’s stored in the equivalent of petabytes in my brain, and so on and so forth. So there’s a time, space, and energy element to the answers I’m providing you. And that’s useful. I mean, not everybody agrees that what comes out of my mouth is useful, but let’s, for the sake of argument, say that it is.
Bob:
And then having that on a machine, right? So going back to your original question, the problem is, if you take that Krebs cycle again: I would argue, with the limited knowledge that I have about neuroscience, and I have that just as an amateur interested in that kind of stuff, that we don’t know.
Bob:
We being humanity, right? So we probably need to go one step back in that Krebs cycle, look at the arts and the great creativity in the world, to come up with ideas for how the brain might be doing that. There’s research I find super interesting called MeshCODE. It comes from the University of Kent, and it’s called MeshCODE but is unrelated to software as we’re talking about it.
Bob:
It talks about how a protein called talin might be storing memory, and that there’s a binary representation in that. They’re trying to figure out if that’s actually how the brain is storing information. And if the answer to that question is yes, then we know how we can get to those petabytes of information, even though we can’t read it yet.
Bob:
We would just see lots of ones and zeros in the brain without knowing how to interpret them yet. But that might be one step forward, right? So long story short, getting to true AGI, based on the flag we planted in the sand, is going to take a long time, I think, because we’re just not there. But what we probably will see happening is that the term will be used for marketing purposes in the new industry that’s emerging.
Bob:
At some point, some GPT model, and we see some new architectures now that people are dabbling around with, will come out, and people will say AGI is reached. But it will probably just be statistical models similar to what we have now. From a business perspective, a lot of value will be created by it.
Bob:
I’m not concerned about that one bit. But if you really want to talk about AGI, I think that’s not it. And I think that’s also the reason Sam might be a little bit vague about the definition. And forgive me if somebody is listening to this and goes, no, no, no, you’re wrong, Bob.
Bob:
“Look at this page, it’s very clear what definition he means.” But as far as I’m aware, no such definition is out there. And what makes it difficult is that we’ve defined a label, but the definition of what the label means is not there, so we can slap it on anything we feel fits the bill in the moment.
Bob:
So I hope this is a helpful answer, but I think AGI is a term that will probably be consumed for marketing purposes before the scientists can really say something valuable about what the definition is, how that thing might work, and those kinds of things.
Raja:
And do you think it happens in our lifetime, like the next 30, 40, 50 years? Once again, I know there’s not enough evidence or data right now to come to a conclusion. Yes, we will keep seeing it for marketing reasons; companies will keep mentioning it. But the AGI that we would actually consider AGI, do you see that happening within the next 30, 40, 50 years, or perhaps not?
Bob:
Let me start with the optimistic answer: I hope so. I mean, that would be fantastic. The more pessimistic answer is that I do not know if the tools we as humans have available now are able to actually provide us with that answer. And that goes very deep. I have an amateur interest in the philosophy of language, so this is purely amateur, but I have an interest in it.
Bob:
One thing happening right now in philosophy is that a question is being asked, and there’s a lot of pushback from the scientific community to this philosophical question: are there limits and boundaries to the scientific method that we are currently applying? And this was kind of born, and I’ll bring this back to AI, out of the question of the multiverse.
Bob:
From a philosophical, mathematical point of view, you could say something about multiple universes, right? In philosophy, it’s very common to talk about multiple-worlds theories and those kinds of things. But the argument from the physicists was: yeah, but that’s not scientific, because we’re bounded by the universe we’re in.
Bob:
We don’t know how to look outside it to see if there’s another one. And that’s where the question came from: but if we can still say something about it, might the scientific method not be enough? Do we maybe need to revise the scientific method that we’re using? So I would not be surprised if, to get to true AGI, the first principles we need to go back to are as deep as we, together as humanity, can go.
Bob:
Asking ourselves the question: is the scientific method itself sufficient to get this question answered? And that will be a super exciting journey. So regardless of whether we get there in my lifetime or not, it’s going to be exciting. But my pessimistic view would be that I am not sure we’re going to make it in the time I have left, assuming nothing super interesting happens in the biological space, you know, with longevity and that kind of stuff.
Bob:
So we’ll see. The journey will be exciting, but I just don’t know.
Raja:
And the journey is already very exciting, right? When you see the updates and the news coming out every single day, it is exciting.
Bob:
The thing we just discussed is very, very esoteric, right? But the nice thing, if you bring it a little bit more to today, is that because so much money is being poured into what’s happening with AI and machine learning now, at the tail end of that, on the more philosophical, deep scientific end of this whole spectrum, more money flows there as well.
Bob:
So a lot of work is happening there too, and that is something I’m super excited about. Everybody is benefiting from what’s happening now in AI. But we really need to make a clear distinction between what we do now as a business, where we create value for our customers, on one end of the spectrum, versus what we have just been discussing, completely on the other end of the spectrum. Both super important, but very different.
Raja:
Thanks, Bob, I enjoyed that discussion; this was wonderful. Let’s steer the discussion back to the present, to the state of the art. Based on that, Weaviate or any other vector database is going to be at the center of a RAG application, right? The application cannot exist without a vector database.
Raja:
What are some of the actual, real challenges that you have seen in enterprises when they’re trying to adopt a RAG-based solution? And maybe I can even extend it to when they want to adopt a fine-tuning solution. What is the biggest barrier?
Raja:
What are the biggest barriers and risks in enterprise adoption of LLM applications, be it RAG or fine-tuning or perhaps a hybrid? And then a follow-up question is going to be: how does Weaviate help in overcoming those challenges?
Bob:
Yeah, so I think the simple and short answer is, and again, this is just history repeating itself: this stuff is not a silver bullet, right? We see people just throw data at it, expecting that all of a sudden it should do its magic and the problem is solved. And of course that’s not the case.
Bob:
Sometimes people struggle with chunking strategies and those kinds of things, or sometimes the queries people have just fail to find the right candidates to provide to the RAG pipeline. A RAG pipeline is successful if the candidates selected from the body of data you have actually contain the answer, right? So those are very pragmatic things.
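A sketch of one common chunking strategy, a sliding window of words with overlap. The size and overlap values below are illustrative; tuning them per corpus is exactly the struggle being described:

```python
def chunk_words(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into overlapping word windows, one common chunking strategy.

    `size` and `overlap` are illustrative defaults, not recommended values;
    the right settings depend on the corpus and the embedding model.
    """
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size])
            for i in range(0, max(len(words) - overlap, 1), step)]

doc = "word " * 500
for i, c in enumerate(chunk_words(doc)):
    print(i, len(c.split()))  # overlapping windows of up to 200 words
```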
Bob:
Right? So that’s what we see, and…
Raja:
If you allow me to interject, these are more engineering problems than scientific problems, right?
Bob:
Exactly. What the scientists have done, we can now take into our world of engineering and create businesses around it. That’s why all these businesses exist, and that is wonderful. For the sake of argument, we do not need more help from the scientists right now; we’re good there, right?
Bob:
We’re now in the engineering world, where a business says: hey, I have, I don’t know, 50 million contracts in my database. I want to efficiently search through them and understand what relations are in these contracts. And then you talk about chunking strategies, where the frameworks play a role, how you store it in the database, how you operate the database, those kinds of things.
Bob:
Those are the challenges you see today. And that goes back to the great work you guys are doing, of course: we need to help people learn and understand this, and educate them on how to do that kind of stuff. That’s why we all need to work together. And I would say that’s the biggest challenge right now.
Bob:
If I had to split that in two, the first is just ops, operational, right? The market now moves a little bit faster than where most of the pure-play vector databases are, because people start playing around thinking, you know, I have a thousand documents, and then they go: okay, now I have petabytes of stuff. That’s the operational side.
Bob:
And the second thing is education. How does a vector space actually work? How do we efficiently organize things in that space? Why do I need hybrid search? How do I need to organize my filters, and so on and so forth. Those are very basic day-to-day things we see at Weaviate that people are struggling with, but we’re also helping people, right?
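Hybrid search typically blends a vector-similarity score with a keyword (BM25) score. A generic weighted-fusion sketch; the alpha weighting loosely mirrors how systems such as Weaviate expose it, but this is not Weaviate’s exact scoring code:

```python
def hybrid_score(vector_score: float, keyword_score: float, alpha: float = 0.5) -> float:
    """Blend a normalized vector-similarity score with a normalized BM25 score.

    alpha=1.0 is pure vector search, alpha=0.0 pure keyword search, loosely
    mirroring the alpha knob in systems like Weaviate's hybrid search.
    """
    return alpha * vector_score + (1 - alpha) * keyword_score

# A query full of exact identifiers benefits from a lower alpha;
# a fuzzy natural-language query benefits from a higher one.
print(hybrid_score(vector_score=0.82, keyword_score=0.35, alpha=0.75))
```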
Bob:
It’s not a struggle leading to a negative outcome; it’s just the status quo. More and more material is coming out and people are learning, because it’s such a paradigm shift, so different from what we’re used to. Plus, from a product perspective, we are also doing things. I’m very, very bullish on this concept of generative feedback loops.
Bob:
I don’t know if you’re familiar with that, but it’s where we say: hey, the database can actually do more; the generative model can directly interact with the database to help the user organize their data in different ways, or to create or update data objects, those kinds of things. And the tongue-in-cheek joke I always make is that with generative feedback loops, we can turn chickenshit into chicken salad.
Bob:
And that is a completely new thing in our space, right? That did not exist. If you stored something in the database, regardless of whether it was good or bad, you just retrieved what you stored in it. Now, all of a sudden, by weaving the model and the database together, the model can have an opinion on what you’ve actually stored in your database.
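A minimal sketch of the shape of a generative feedback loop: read stored objects, let the model produce a derived field, and write the result back so the database ends up holding more than what was originally inserted. generate() here is a placeholder for any LLM call, and the in-memory list stands in for a real vector database:

```python
def generate(prompt: str) -> str:
    # Placeholder for a real LLM call (e.g. the Ollama request shown earlier).
    return f"[summary of: {prompt[:40]}...]"

# In-memory stand-in for objects stored in a vector database.
database = [
    {"id": 1, "body": "Long, messy support thread about login failures...", "summary": None},
    {"id": 2, "body": "Rambling bug report mentioning three unrelated issues...", "summary": None},
]

# Generative feedback loop: the model's output is written back as new data,
# so later queries can retrieve what the model produced, not just raw input.
for obj in database:
    obj["summary"] = generate(f"Summarize this ticket: {obj['body']}")

print(database[0]["summary"])
```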
Bob:
So all these kinds of things are contributing to this new stuff and this new market that’s emerging. But the things people struggle with today are: okay, we get it, we see it, we tried it out, we want this, and then in bringing that to production they hit some very basic challenges, like chunking strategies and operational questions.
Bob:
But we will overcome that. That will be fine.
Raja:
So you mentioned generative feedback loops. Do any of the mainstream vector databases offer this at this point? Because, as the engineer in me, or maybe more the architect in me, and with a product that we have built, this is something I can immediately relate to.
Raja:
How can we actually improve the retrieval, right? RAFT is probably one way, you can have a custom model for embeddings, and there are different approaches, but one thing would be directly impacting the retrieval. So is there anything you’re aware of that actually offers this kind of generative feedback loop?
Bob:
Yes, us, we do. But on a serious note, this is what I meant earlier in our conversation when we were talking about existing databases adopting the vector embedding as a data type. I’m saying this is great, because if the market wants that, that’s great. But this is where the big differentiating factor sits: between vector search as a feature and being AI-native.
Bob:
Let me explain this with a use case example where you can easily distinguish the two. Let’s say you have a use case where support tickets are coming in. Use case number one, where we see vector search as a feature: we might say we’re going to create an embedding for this ticket, and we’re going to try to predict what label should be attached to it, or what support engineer we want assigned to it.
Bob:
Because based on the vector space, we can see which similar tickets were handled by certain people, right? That is vector search as a feature, which is great; that’s a lot of value. And the test is: if I took that out of my support ticket system, the support ticket system would still work, it would just not have labels, right?
Bob:
But it would still work. Now you can make a distinction: do I want to use an existing database, or a pure-play vector database player? And that’s very simple; it depends on scale, operations, that kind of stuff. So you might say: I’m using database X, it works well, I’m just going to add this as a feature, right?
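A toy sketch of that “vector search as a feature” idea: embed an incoming ticket, find the nearest previously handled ticket, and reuse its label. As in the earlier RAG sketch, the bag-of-words embedding is a stand-in for a real embedding model:

```python
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    # Toy bag-of-words embedding standing in for a real model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Previously handled tickets with their labels.
handled = [
    ("cannot log in after password reset", "auth"),
    ("invoice total is wrong this month", "billing"),
    ("page crashes when uploading files", "bug"),
]

def suggest_label(ticket: str) -> str:
    # The nearest handled ticket in embedding space donates its label.
    q = embed(ticket)
    _, label = max(handled, key=lambda pair: cosine(q, embed(pair[0])))
    return label

print(suggest_label("login fails with new password"))  # -> "auth"
```

Removing this feature leaves the ticket system working, which is exactly the distinction drawn here: it is a feature, not the application itself.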
Bob:
For a bigger use case, I might want to use a pure-play one. Fine. But either way, it’s all vector search as a feature. Now let’s take the same use case, same situation: we have a support ticket system, but we do something else with it. We create a vector embedding for what’s stored in it, and we store it in a database like Weaviate, but now, because we have a generative feedback loop with a generative model, we instruct the model: use the database to find the answer to this question, and provide it yourself to the person who created the support ticket.
Bob:
Now there’s no labeling, no nothing; just answer it. That’s an AI-native use case, and there we need to have integrated RAG. We really need to have the database and the models working together. We need to make sure that the model has direct access to the database. How does it do that?
Bob:
It uses vector embeddings for that, right? And now, if you took that out of the application, my application would just die; nothing happens anymore. That’s an AI-native use case. So the big difference to think about is: is it a feature, where you just do something in vector space because vector space is better suited to it?
Bob:
If that’s your use case, then you can use a new player like what we’re doing, and there are certain reasons why you might want to do that, but you can also do it in an existing database. But if you say, no, I’m building the new stuff, the cool stuff, the AI-native stuff, then it’s a completely different story, because now you need a database that does this out of the box, that has all the integrations.
Bob:
That’s really fit for purpose to do that. And that is the big difference we’re seeing right now in the ecosystem. And I’m obviously very bullish on the second one, because I believe that five years from now, or less, two years from now, a year from now, we’re not even going to think about these models.
Bob:
We’re just going to go: of course we have a generative feedback loop in our application, right? Of course we do that. We’re not even going to talk about it; it’s just going to be part of how we build applications, because it’s so handy and so helpful. That is where I believe things are going.
Raja:
Yeah, and if I look back at my early days of machine learning, or perhaps the early days of industry adoption of machine learning, we used to spend a lot of time tuning hyperparameters, right? I see the parallels here: chunking and chunk optimization, hybrid search and tweaking the hybrid search and chunk sizes, all of that is pretty much the same as what used to be hyperparameter tuning.
Raja:
And then we started seeing these tools evolving, moving more into AutoML, or somewhat less intense, less human-intervention-type approaches. The libraries started becoming smarter as we were able to figure out what the real problems were that people were facing.
Bob:
Yeah, exactly. And I think what’s different right now compared to back then is that it seems we’re over the bump of the uncanny valley of what the models can do. Are you familiar with that concept of the uncanny valley? Yeah. Let me quickly define it, because I completely forgot to.
Raja:
For the audience, please.
Bob:
The listeners, yes, it will be listeners. Sorry, I was so into the conversation. So, for those listening who are not familiar with the uncanny valley: it basically says that if you create an old-fashioned robot, you go, what a cute little robot, and you can clearly see it’s a robot. Then they try to make it more like a humanoid robot, and at some point there’s a robot that looks like a human, but it’s just not quite a human.
Bob:
And then people are scared of it. They go, look how scary that is. That’s the uncanny valley we need to go through before we get to a truly humanoid robot. And that uncanny valley, in my opinion, also exists for these ML models, or generative AI models, in the sense that I remember when we saw the first models that were doing predictions.
Bob:
I believe that was a model from Microsoft, or something like that, where it was predicting new tokens, but it was kind of crappy, and you were like, that’s so sweet, right? It’s almost there, but it’s completely useless. It’s like a robot that just drives around.
Bob:
It doesn’t do anything, but it’s kind of sweet. And they improved these models. But you might remember, I believe it was also Microsoft who created this chatbot which basically said horrible things, right?
Raja:
After they released it on Twitter. Yeah.
Bob:
Yes, yes, yes. And that was the bottom of the uncanny valley: people were like, well, if this is what it can do, then no, thank you, right? And what happened with ChatGPT was that we came out of the uncanny valley. We started to anthropomorphize the models; people were saying, hey, the model is talking to me, right?
Bob:
I remember just after it was released, I was listening to a podcast where they were interviewing people who were very shy or didn’t have any friends, who were talking to the model, and the model was their friend. So, out of the uncanny valley. And from a business perspective, coming out of the uncanny valley meant that all the things we could now do with vector embeddings coming out of the models were not only a feature; it was a new thing on its own.
Bob:
We really could create, and I’m probably repeating myself, but I like to call it AI-native, because the models were out of the uncanny valley. They were good enough, and that is new. The new thing is not having a new data type we can search with, because we’ve been doing that for a couple of years already, and people got a lot of value from that.
Bob:
But that was just a feature. The fact that the models were out of the uncanny valley meant that we could build completely new, AI-native applications. And it happened to be the case that the infrastructure to support that, in the form of a vector database, existed. That is really the new thing here, how people are building, in my opinion. And that’s the big difference, I think.
Raja:
Thank you, this was very insightful.