Future of Data and AI / Hosted by Data Science Dojo

Jay Alammar on RAG, AI Education, and Industry Transformation - Future of AI

“What will you create now that AI puts more creative tools at your fingertips than Walt Disney had in its prime?”
Jay Alammar
Director, Engineering Fellow (NLP) at Cohere


In this episode, Raja Iqbal welcomes Jay Alammar, a renowned educator, researcher, and visual storyteller in machine learning. Jay shares his fascinating journey into simplifying complex AI concepts through visual storytelling and his passion for making AI education accessible to everyone.

Raja and Jay discuss the power of visual learning, the role of intuition in understanding AI, and the challenges and opportunities in enterprise AI adoption. Jay also explores how AI is reshaping industries, the importance of tools like Retrieval-Augmented Generation (RAG), and his experiences at Cohere, where he helps organizations harness the power of large language models for real-world applications.


Transcript

Raja Iqbal:
Hello everyone! I’m excited to welcome Jay Alammar to the podcast. Jay is a renowned machine learning educator, researcher, and visual storyteller, best known for The Illustrated Transformer and his other illustrated guides. Jay has a unique talent for simplifying complex AI concepts, and his work has educated and inspired learners around the globe. He is currently a director and engineering fellow at Cohere.

Raja Iqbal:
Beyond his technical expertise, Jay is deeply passionate about democratization of AI education and making this transformative technology accessible to all. Welcome to the podcast. It is an honor to have you with us.

Jay Alammar:
Thank you so much for having me. It’s really delightful to be able to come here and chat with you.

Raja Iqbal:
Okay, so let’s start with your background as an educator. Has teaching been something you’ve wanted to do for a long time? Tell us a bit about this.

Jay Alammar:
Absolutely. So, I mean, technically, it kind of started about ten years ago when I first wanted to start learning machine learning. I’m a computer science major, and I worked as a software engineer and sort of had a parallel career in starting and growing tech companies. But when TensorFlow came out, I was like, okay, this is the time for me to get into this field that I’ve always wanted to get into.

Jay Alammar:
This was around 2015, and, you know, I started to follow blog posts and tutorials. Three months into it, I was like, okay, how much did I learn? I needed artifacts to really guide me, to give me a sense of, you know, how much did I accomplish. So I started writing blog posts and posting them to Reddit, mainly to solidify my own understanding, because once I published one or two, it felt like I knew the topic much more comfortably.

Jay Alammar:
And because I would be very concerned about putting something out under my own name that would be, let’s say, incorrect, that would force me to learn even deeper. If I wasn’t writing, I would just go over a tutorial or go over a paper and think, oh, okay, I have the general sense. But if I had to explain it publicly to people on the internet and make it a narrative that really helps people understand the concept, that forced me to learn deeper, read the paper ten more times, and go over the code and make sure that it works.

Jay Alammar:
And so really, it started with that: writing as the best form of teaching yourself and deepening your understanding. Visuals are an important thing, but early on I was also very into interactive visualizations, and those take quite a bit of time, so I stepped back from that a little; I haven’t been able to invest in it as much. But some of the earlier intros to machine learning and machine learning math are very detailed, interactive web applications about, you know, the gradients and finding those.

Jay Alammar:
So the intersection of visualization and machine learning has been fascinating to me. But what really worked the most are these graphical, conceptual explanations, and some of the posts that you mentioned, which we can go a little bit deeper into. But that’s really how it started: putting out a couple of blog posts, getting a great reaction from them, and that sort of motivated my journey to learn deeper and deeper and follow the latest topics.

Jay Alammar:
Because when I started, it was one thing: deep learning. And then within deep learning there were multiple tracks, and I focused a little bit on language. RNNs at the time were the paradigm for neural networks that understand text. Some of your audience can go and look back at the Andrej Karpathy blog post called The Unreasonable Effectiveness of RNNs.

Jay Alammar:
It was the first time that I saw software generating text, and it felt like absolute magic. I’ve been hooked ever since. And this technology has been delivering ever since, just making fascinating advances into what’s possible with software.

Raja Iqbal:
And did you know that you had this in you even before, or were you surprised when you released those tutorials? Because I have watched those tutorials before; that’s how I even got to know you, just as a learner. So did you know that you had this in you, the teaching?

Jay Alammar:
I hadn’t done much of it in the past; I had invested in just learning the tools. But I was surprised and glad about people’s reception to the visuals, because I had invested so much time in them. It definitely evolved over time. After I wrote the first few blog posts, I talked with Udacity and sort of started building their deep learning curriculum.

Jay Alammar:
And that’s where I met one of your previous guests, the legend Luis Serrano. One of those times working with Luis, just over lunch, discussing a lesson that he had been working on and something that I had been working on, was maybe the most major boost in my entire way of doing things. Just him going over the tools, and seeing how he thinks about building a lesson or explaining a concept, just as it comes through in his videos.

Jay Alammar:
Just seeing that process, I think, for me was quite magical. That is also when I started adopting Apple Keynote, which unlocked me in a major way. Previously my tools were so slow that things took a long time, but with the right piece of software, making things visual became quick, iterating became quick, and creating animations became much quicker.

Jay Alammar:
And these are all very, very important. So in something like The Illustrated Transformer, for example, there may be a few dozen figures to explain the main intuitions of a model like the Transformer. It’s not everything, but it’s enough to build comfort with the visuals alone. Each one I would draw, and as soon as it was there on the screen, I could start to see flaws in it, maybe misunderstandings or ways that are not clear that could be made clearer.

Jay Alammar:
And so I would make a second version or a third one. Almost everything that comes out on the blog is like the sixth or seventh version of that specific visual. And that is oddly similar to how diffusion models work: they create multiple versions that get clearer and crisper with time. And you could say, why don’t I just create the right one from the beginning?

Jay Alammar:
And a lot of the time it’s really not possible. You have to iterate; you cannot get to version seven without going through versions one, two, three, and four. So that iterative process was really helped by having the right tools that are that quick. I have a video that explains my setup, my color swatches, and how I use them. And over time, working with a lot of incredible educators like Luis, or like Meor at Cohere, you start to develop a visual language for concepts that are coming into the domain.

Jay Alammar:
So in machine learning, for example, it was very difficult to explain Transformers or RNNs without having a visual representation of a vector or of a matrix. And so we come up with something, or let’s say we reuse something that appeared in some papers, for what a layer is, what a network is, what an attention layer is, what a vector is, what a matrix is.

Jay Alammar:
And you know, some people are very comfortable with the math. They can just read, you know, seven lines or two pages of math and be comfortable with the symbolic representation. For others, it’s more friendly if they actually see the vectors and how they transform and evolve through the process. And I’m definitely in that second bucket.

Jay Alammar:
So even when I’m reading papers, a lot of the time, for me to understand them at a level of depth I’m comfortable with, I would draw them and get comfortable with a visual representation before delving into, let’s say, the code or the math. And so some things in my style evolved with time, kind of like this: not getting to the math or the code without explaining the intuitions first, because when I was reading a paper, for example, I would be intimidated if somebody went in a little bit too deep without explaining the intuition, or touching on a basic intuition of the main idea or

Jay Alammar:
the main concept. And so around those, yeah, I developed some sort of a philosophy and a style through a lot of the courses and lessons at Udacity. And there was an incredible feedback mechanism in those Udacity courses, where students would write reviews all the time for the videos. So we had hundreds and hundreds of them; we made many lessons there covering unsupervised learning, transformers, and neural networks.

Jay Alammar:
And the feedback was really helpful. I kind of miss it. But you do get that from Twitter, and you get it from the responses on the blog. But yeah, that’s a little bit of how that started evolving, and how I built that visual language and that conceptual process for explaining these concepts.

Raja Iqbal:
Yeah, that’s great. I like the analogy that you gave with diffusion models, and I think any endeavor, any learning pursuit, is like a diffusion model. I mean, you don’t get there in the first go; it’s more iterative. So in terms of math, how much math do you think one should know? There’s always this question.

Raja Iqbal:
I have seen people say, hey, I won’t start doing machine learning until I know this much algebra and this much probability and statistics and all of that. What’s your take on that, basically?

Jay Alammar:
So there was a saying that software is eating the world, so every company in the economy has to adopt software. And then we’re seeing AI and machine learning sort of eating software in turn. And so there are so many people now, almost, you know, billions of people, who have to interact with AI models in different ways.

Jay Alammar:
And so for them there are different goals, which will dictate how much math they need. For the vast majority, they don’t really have to know any machine learning to do prompt engineering, for example. The models have come up to such a high level of abstraction that, you know, sometimes you can even write very complicated software.

Jay Alammar:
We’re still in the early stages for that, but you can have the models write complicated software for you just through natural language. But if you’re going into research, you know, you definitely have to be a little bit more comfortable with the research. A lot of the people that I speak with are software engineers, software developers, and they don’t necessarily need to know much more than what they would learn in computer science school.

Jay Alammar:
Yeah, a lot of it is mechanical. It would be helpful; the more they learn, the more it would benefit them, but the vast majority of people don’t need to train models. Now, if you’re maybe an advanced user, you might fine-tune a model, and for that you will use, you know, specific tools. If you’re planning to come up with new methods or new optimization methods, then diving deeper into the math would help you scan the literature out there and have a better understanding of, say, how to compare the different reinforcement learning methods, which is one topic that is very deeply mathematical. But other areas aren’t.

Jay Alammar:
There’s so much to be done with, you know, prompt engineering, with just manipulating software, or with paying attention to data. So data cleaning, data sourcing, data annotation, these are extremely deep fields that require different skill sets, and they’re all needed to create the best models and the best systems out there.

Raja Iqbal:
So if I understand it correctly, what you’re implying is it really depends upon the role that you’re in. If you’re a grad student publishing papers in machine learning, of course you need to understand the math part of it. But as an engineer, you can get by without understanding a lot of math if you’re just building products.

Jay Alammar:
Yeah, I mean, it would be helpful. As an engineer, in your education you’d cover calculus, and having a sense of what the models do under the hood is valuable. For me, it was very helpful; it’s always helpful to not think of the models as magic, and to have a sense of how they are learning, how they are improving.

Jay Alammar:
Really, a solid understanding of backpropagation and how that works just removes a lot of the mystery around these models. And I think that’s something that we, you know, educators and speakers, should do a lot: reduce how much of a black box this is to people and make it accessible. Because people make mistakes. When the general public first come across a system that surprises them, they don’t really know how to deal with it, and they might ascribe to it trust that it has not really earned.

Jay Alammar:
And we saw this with the rise of ChatGPT: people called these, you know, search engine killers, and they were relying on them for informational queries before they realized that there’s a problem called hallucination. These models aren’t always going to be, you know, factual. They’ll answer every question, but there are no guarantees that the answer is going to be correct. If you’re close to the math and close to the domain, you understand this is what they’re trained to do: just to model language, just to predict the next token, not to be factual. But when we don’t recognize that, we put on it, or project onto it, some abstraction of

Jay Alammar:
intelligence that we want it to have. So yeah, the deeper one understands and invests, the better, but I also don’t want to, you know, gatekeep the field from people who can have very productive careers developing AI and developing systems that use AI, without needing to invest, you know, a year or two in additional math.

Raja Iqbal:
So I think this would be a good segue to actually talk about real-world engineering using some of these large language models. In your opinion, what are some of the biggest challenges for an enterprise when it comes to adoption of, broadly speaking, what I call generative AI? And I don’t limit it to any one model; it could be any kind of multimodal system.

Raja Iqbal:
Right, so the general category of AI applications. In your opinion, based on your work, what are some of the biggest barriers that you have seen?

Jay Alammar:
So a lot of the time it’s getting the right talent that is able to translate the problems of the organization into actual solutions that the models can deliver. One case you come across is: what use cases does a company want out of AI? And a lot of the time there’s just a lot of excitement about chat.

Jay Alammar:
For example, back in 2023, a lot of companies were like, okay, we want public bots on our website, we want our customers to talk with these chatbots. And that was before these models were controllable enough that you could guarantee the tracks they stay on or the information that they give. So the right understanding of the capabilities, and mapping those capabilities to the needs of the organization, is one of the things that is developing.

Jay Alammar:
And that’s maybe the first thing that is helpful to an organization. Then from that comes understanding use cases and, let’s say, building blocks: which AI capabilities are reliable and robust and which ones are experimental, and even for the experimental ones, which use cases, scenarios, or workflows they are better suited for. And when you’re approaching AI or language model deployments, one of the most reliable and robust things that you can do is improve your search systems.

Jay Alammar:
That’s a very reliable use of AI; you have no worry about hallucinations in a use case like this. So using a reranker or an embedding model to improve internal search is usually one of the best first steps for using a model. And then people think about generative models, because a lot of this falls under the nomenclature of generative AI, which I think is maybe an unfortunate name, because a lot of these use cases that are reliable are not generative in nature, they’re representative.

Jay Alammar:
We have the boom that we have in AI because we have better representations of data, of text, of images, and generation is only one category of that. So when doing generative AI, I advise people to think beyond chat. Chat is one thing you can do, but it’s not the best thing to start with.

Jay Alammar:
There are other generative use cases to think about, and these tend to be things like summarization, extraction, and copywriting. This is where you’re relying on the model’s language ability, which is kind of measurable. You’re not relying on its informational content, and you’re not relying on its ability to steer a conversation one way or another if you’re, let’s say, building a customer service bot.

Jay Alammar:
And as you build these use cases, you build the internal maturity in the company to know how to, for example, generate test cases: this is a use case that we have, we’re trying these five models, and this is a small test set that we’ve developed internally of the type of output that we want. And so this helps us pick the best model.

Jay Alammar:
It helps us when we want to upgrade to a new model version; it helps us understand whether it has regressed on any of these use cases. And so AI adoption maturity follows from that, and a lot of it comes from just sound software engineering: these are all analogies to unit testing and regression testing. These, I would say, are things that are helpful for companies.
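
As a rough illustration of the kind of regression harness described here, the sketch below compares models against a tiny internal test set. Everything in it, the test cases, the `call_model` stub, and the pass/fail rule, is a hypothetical placeholder; a real setup would call whichever model APIs you are evaluating and use criteria that fit your own use case.

```python
# Minimal sketch of an internal regression/eval harness (hypothetical example).
# call_model() is a stand-in for whichever model APIs you are comparing.

def call_model(model_name: str, prompt: str) -> str:
    """Placeholder: send the prompt to the model under test and return its text output."""
    raise NotImplementedError("Wire this up to your model provider's SDK.")

# A tiny internal test set: each case pairs an input with a simple check on the output.
TEST_CASES = [
    {"prompt": "Summarize: The invoice is due on March 3.", "must_contain": "March 3"},
    {"prompt": "Extract the company name: 'Acme Corp signed the contract.'", "must_contain": "Acme"},
]

def evaluate(model_name: str) -> float:
    """Return the fraction of test cases the model passes."""
    passed = sum(
        case["must_contain"].lower() in call_model(model_name, case["prompt"]).lower()
        for case in TEST_CASES
    )
    return passed / len(TEST_CASES)

# Compare candidates, or re-run on a new model version to catch regressions:
# for model in ["model-a", "model-b"]:
#     print(model, evaluate(model))
```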

Jay Alammar:
So think beyond generative; think representation, with things like embeddings, reranking, and classification, and these kinds of use cases. And within generative, think beyond chat; think about these other building blocks. Because once you’re comfortable with generation and extraction, then you can start to build these very complex pipelines of AI systems that are, you know, taking things to the next level.

Raja Iqbal:
So yeah, you mentioned that some of the valuable use cases within a business are not generative in nature. Take search, for example: if you implement semantic search, or even hybrid search, it takes your results to the next level. So do you have any playbook in mind, maybe even a published playbook?

Raja Iqbal:
If an enterprise is trying to come on board and they don’t have the skills, and technology is the easier piece these days, right, you can go and get any service, but when an enterprise is trying to adopt generative AI for their business, would you recommend starting with the non-generative use cases, sort of a crawl, walk, run approach?

Raja Iqbal:
Do you have some playbook in mind? Would you recommend anything to companies? How should they start if they’re dealing with, you know, the skill gap, which is a major challenge? What would be your recommendation in this case?

Jay Alammar:
So it definitely depends on the needs of the organization, but as a general template that the vast majority of companies can adapt as a playbook: where you want to end up is retrieval-augmented generation. That’s the most commonly needed use case in a lot of enterprises at the moment, and that’s where you can interact with a chat model that is grounded on the information of the organization, for example, or on a specific data set that is relevant for that user within it.

Jay Alammar:
A retrieval-augmented generation system is made up of two kinds of subsystems: a search and retrieval system, and a generation system. And the retrieval system is the one that you can apply and build more robustly; you can technically deploy one of these within a week. So think of your internal search systems: every company has a lot of internal data and a lot of internal information, and a lot of the time they don’t have the ability to search across all of these sources. A lot of these systems tend to have existing search systems.

Jay Alammar:
And so the most straightforward and fastest way to inject the intelligence of a language model into the process is to use a reranker: a system that reorders the results of an existing search system. So if we have, you know, a query, we send it to, I don’t know, our SharePoint search system, and it gives us a set of results.

Jay Alammar:
We present that to something like Rerank from Cohere, or other rerankers that do this, and they would just change the order of those results. They still return the same results, but they say, okay, looking at the query, this result much further down the list is actually the most relevant, so I will make it number one.
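
To make this concrete, here is a minimal sketch of that reranking step, assuming the Cohere Python SDK’s rerank endpoint. The model name, placeholder API key, and exact response fields are assumptions based on the SDK at the time of writing, so treat it as illustrative rather than definitive.

```python
# Sketch: re-order results from an existing search system with a reranker.
# Assumes the Cohere Python SDK; check the current docs for exact model names and fields.
import cohere

co = cohere.Client("YOUR_API_KEY")  # placeholder key

query = "What is our parental leave policy?"
# Imagine these came back from an existing keyword search (e.g. an intranet/SharePoint index).
documents = [
    "Expense reports must be filed within 30 days.",
    "Parental leave is 16 weeks, paid, for all full-time employees.",
    "The cafeteria is open from 8am to 3pm.",
]

response = co.rerank(model="rerank-english-v3.0", query=query, documents=documents, top_n=3)

# Each result carries the index of the original document and a relevance score,
# so we simply re-order the existing results without changing their content.
for result in response.results:
    print(round(result.relevance_score, 3), documents[result.index])
```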

Jay Alammar:
And that tends to improve search results dramatically, and it is one of the major investments you can make to empower your future RAG system. Because retrieval-augmented generation works in this way: a search system gets the most relevant results to answer, you know, the question of a user, and if you have a robust search model, the right document is going to be in the top three or top five documents.

Jay Alammar:
It will be injected into the prompt and sent to the generation model. And so building in that way, I find, is how people can get a major win very quickly. It’s only one API call, and you can deploy it basically in a weekend; you would have a working language model deployment with no concerns for things like hallucination.
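
A minimal sketch of that grounding step, with the retrieval and generation calls stubbed out as hypothetical placeholders: the top-ranked documents are simply injected into the prompt that goes to the generative model.

```python
# Sketch: inject top-ranked documents into the prompt of a generative model.
# retrieve() and generate() are hypothetical stand-ins for your search system and LLM API.

def retrieve(query: str, k: int = 3) -> list[str]:
    """Placeholder: return the top-k documents from your (reranked) search system."""
    raise NotImplementedError

def generate(prompt: str) -> str:
    """Placeholder: call whatever text generation model you use."""
    raise NotImplementedError

def answer(query: str) -> str:
    docs = retrieve(query)
    context = "\n\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(docs))
    prompt = (
        "Answer the question using only the documents below. "
        "If the answer is not in the documents, say you don't know.\n\n"
        f"Documents:\n{context}\n\nQuestion: {query}\nAnswer:"
    )
    return generate(prompt)
```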

Jay Alammar:
Search is a very well-defined problem. It’s very well researched, and there are a lot of methods to measure it. So that’s a great first thing to build. These systems can then be improved by adding semantic search via embeddings, but a reranker is generally the best first step because it’s the quickest, and then you can add on top of that with the generative system.

Jay Alammar:
Again, you can build up to chat, but you’re probably better off doing a bunch of summarization tasks and a bunch of extraction. You know: here’s a long text, give me the names of the companies that are in this contract, for example. Things like that, extracting relevant text, models tend to be really good at those.
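
As an illustration of the extraction building block, here is a hypothetical sketch; the prompt wording and the `generate` stub are assumptions, not any specific product’s API.

```python
# Sketch: use a generative model for structured extraction rather than open-ended chat.
import json

def generate(prompt: str) -> str:
    """Placeholder for your LLM call; assumed to return the model's raw text output."""
    raise NotImplementedError

def extract_company_names(contract_text: str) -> list[str]:
    prompt = (
        "List the names of the companies mentioned in the contract below. "
        'Respond with a JSON array of strings only, e.g. ["Acme Corp"].\n\n'
        f"Contract:\n{contract_text}"
    )
    raw = generate(prompt)
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # Models sometimes add extra text around the JSON; fall back to an empty result.
        return []
```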

Jay Alammar:
And then, once you’re comfortable building with these building blocks and decomposing problems into them, you can build up towards more and more advanced systems like chat. That’s better than spending a year building a chat system and then figuring out what’s the best way to deal with hallucinations, and whether it’s reliable for this use case or not. But that tends to be what I would advise people: go through that process.

Raja Iqbal:
So one of the reasons I’ve seen adoption slow down in enterprises is fear. Right? Data governance and, broadly speaking, AI governance: what about the documents, the access controls, the sensitivity of documents, and even more so the generation part misfiring, what we commonly call hallucination, giving incorrect information, regulatory issues.

Raja Iqbal:
So what are your thoughts around that? Yes, you can build a very basic working RAG application in maybe a day or two, maybe in an hour, depending upon what you call a RAG, but deploying these systems at scale in an enterprise for extracting business value can come with some challenges.

Raja Iqbal:
Based on your work at Cohere and in industry, can you speak about some of those challenges?

Jay Alammar:
Yeah, 100%. There’s definitely friction for customers, for people and enterprises building these systems and rolling them out reliably, and that’s where the expertise of where to apply these models comes in. So say you’ve invested in search, improved your search systems, and built the internal capabilities for evaluating these systems, and then you start to build your first RAG system.

Jay Alammar:
One of the first questions you should ask is: who should the end user be? Should this be internal users of the organization, or is it ready for, let’s say, external users and the general public? At this stage in the cycle of evolving these systems, there are benefits to doing it internally: you capture those interactions and that data and use them to improve, let’s say, the RAG or chat system that is assisting internal employees, who can catch when a result is incorrect or needs work. That signal should also be captured and used to improve

Jay Alammar:
the system down the line. And so this is one important aspect to decide: who is the appropriate end user of our system? Internal employees or internal experts, who have the judgment to override the AI and improve it, are a good initial audience for this.
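
One lightweight way to capture that internal signal is simply to log each interaction together with the employee’s correction, so it can later feed evaluation sets or training data. A hypothetical sketch, with the file path and record fields chosen purely for illustration:

```python
# Sketch: append internal users' feedback on model outputs to a JSONL log.
import json
import time

def log_feedback(query: str, model_output: str, rating: str, correction: str = "",
                 path: str = "feedback.jsonl") -> None:
    """rating could be 'correct' / 'incorrect'; correction is the expert's fix, if any."""
    record = {
        "timestamp": time.time(),
        "query": query,
        "model_output": model_output,
        "rating": rating,
        "correction": correction,
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
```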

Raja Iqbal:
And, please correct me if I’m wrong, I’ve seen you mentioning RAG on multiple occasions, in different talks. Do you think it is going to be the preferred paradigm of choice when it comes to implementing enterprise applications? Do you see fine-tuning being anywhere in this picture?

Jay Alammar:
So this is a good question. Fundamentally it breaks down into this: when people chat with a system, they expect some sort of source for its information. With generation models on their own, you just interact with the model using its own information; you’re dealing with whatever was captured in its training data. A lot of the time that is the internet or something like that, and it’s not auditable, let’s say, in a very specific way.

Jay Alammar:
You can’t be extremely sure about what it will say in different situations. And so that’s why the RAG paradigm came about, which is: let’s inject the right information, and let’s keep the information stored somewhere. You know, we have incredible systems for storing data; they’re called databases, or other existing systems. And at query time, let’s get the right information and inject it into the model.

Jay Alammar:
This is a much better paradigm than storing all of the world’s information inside of a massive model, storing it all in very expensive GPU memory, and then, let’s say, being stuck with that information; if we want to update that information, we need to fine-tune the model for another month or two. Fine-tuning is a useful tool in the toolbox.

Jay Alammar:
But for, let’s say, injecting information, fine-tuning isn’t the first thing I would reach for. Definitely, including that information at query time with RAG-like systems is the best way to start with injecting information, or grounding a model in a specific data set that we want, which tends to be the use case that is in high demand.
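
The contrast drawn here, update the data rather than retrain the model, can be illustrated with a toy document store: new or corrected documents become available to the RAG system at the next query, with no training involved. Everything below is a simplified stand-in (naive keyword scoring in place of BM25, embeddings, or a vector database), not a real product.

```python
# Toy sketch: with RAG, updating knowledge means updating the document store,
# not retraining the model. A real system would use a search index or vector DB.

class DocumentStore:
    def __init__(self) -> None:
        self.docs: dict[str, str] = {}

    def upsert(self, doc_id: str, text: str) -> None:
        """Add or update a document; it is searchable immediately."""
        self.docs[doc_id] = text

    def search(self, query: str, k: int = 3) -> list[str]:
        """Naive keyword-overlap scoring, standing in for a real retrieval method."""
        q_terms = set(query.lower().split())
        ranked = sorted(
            self.docs.values(),
            key=lambda d: len(q_terms & set(d.lower().split())),
            reverse=True,
        )
        return ranked[:k]

store = DocumentStore()
store.upsert("policy-42", "Remote work is allowed up to 3 days per week.")
# Policy changes? Upsert again; the next query sees the new text with no retraining.
store.upsert("policy-42", "Remote work is allowed up to 2 days per week as of Q3.")
print(store.search("how many remote days are allowed"))
```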

Jay Alammar:
Where fine-tuning comes in is when, for example, you have a very deep domain that the model was not exposed to. You know, you have a company with millions of documents with product names, jargon, and, let’s say, abbreviations that are not commonly used outside. A system like this would benefit somewhat from being fine-tuned.

Jay Alammar:
Or if you want a very specific format that you’re not able to get from just prompting, that’s another high-level way to use fine-tuning. But if you want to instill information into a model, RAG is better for a lot of reasons. It’s cheaper, and you can update the information at any time without the need to keep retraining the model.

Jay Alammar:
So that’s why I would definitely reach towards RAG-like systems for a lot of these approaches, and leave fine-tuning for other, more advanced needs; after experimenting with solving the problem with prompt engineering, then think about things like fine-tuning.

Raja Iqbal:
And in terms of building AI applications at scale for an enterprise, how much of that is your traditional, classic software engineering at scale versus how much of that is AI?

Jay Alammar:
So my advice to people is to use good software judgment in general, and not think of AI and AI systems as the end-all, be-all. A lot of the time people want to use language models for something and we’re like, you know, other software is better for this use case. If this is the pattern you want to extract, yes, you can throw it at a huge, many-billion-parameter model.

Jay Alammar:
But it will be much faster if you use regular expressions for it; that will give you a better, faster result. Another thing is that we also advocate quite heavily for embedding models to solve things that people would reach to a generative model for. And yes, generative models can do some of these tasks, but they do it in a much more expensive way, a slower way, a way with more latency.
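
A trivial example of the point about using the simplest tool that works: if the pattern is regular, plain regular expressions extract it with no model call at all.

```python
# Sketch: extracting a well-defined pattern (email addresses) with a regular expression
# instead of sending the text to a large generative model.
import re

text = "Contact support@example.com or sales@example.org for details."
emails = re.findall(r"[\w.+-]+@[\w-]+\.[\w.-]+", text)
print(emails)  # ['support@example.com', 'sales@example.org']
```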

Jay Alammar:
And so the shininess of the latest text generation system shouldn’t take us away from using the best tool for the job. A lot of the time that comes down to: how expensive is it, and what is the latency? Within the language processing applications of AI, that often means a task can be served better with an embedding model than a generation model, or with a different piece of software, or with a pipeline where the model is only step number three or step number seven.

Jay Alammar:
But then you have, you know, ten other steps that are software processing via other software tools. So yeah, I’m a really major advocate for sound software engineering and using the best tool for the job, even if it’s not AI, or especially if it’s not AI, because that changes the profile of the system: how many GPUs it needs, how many API calls it needs, and things like latency and robustness, which are very important.
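
And as a sketch of reaching for a representation model instead of a generative one: classify text by embedding it and comparing it to embedded label descriptions. The `embed` stub below is a hypothetical placeholder for whatever embedding API or library you use; only the cosine-similarity comparison is spelled out.

```python
# Sketch: nearest-label classification with embeddings instead of a generative model.
# embed() is a hypothetical placeholder for your embedding model of choice.
import math

def embed(text: str) -> list[float]:
    """Placeholder: return an embedding vector for the text."""
    raise NotImplementedError

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def classify(text: str, labels: list[str]) -> str:
    text_vec = embed(text)
    # Pick the label whose description is closest to the text in embedding space.
    return max(labels, key=lambda label: cosine(text_vec, embed(label)))

# classify("My package never arrived",
#          ["shipping issue", "billing question", "product feedback"])
```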

Raja Iqbal:
And so let’s switch the discussion a bit to your current role at Cohere. I know what Cohere does, but for someone who hasn’t heard of Cohere: what does Cohere do, and what is your role there?

Jay Alammar:
So Cohere is one of the earliest large language model companies. The founders were some of the earliest engineers working on Transformers; one of the co-founders, Aidan, is one of the coauthors of the Attention Is All You Need paper. The company has been building and training large language models for 4 to 5 years now, and that includes two major families of models.

Jay Alammar:
There are the search models, so embedding and reranking models, but also text generation models for chat. This year we saw the launch of the text generation model series called Command R. These were also released as open weights, so people can download them, evaluate them, and hack around on them for noncommercial use cases. These models excel at things like retrieval-augmented generation, tool use, and support for lots of languages.

Jay Alammar:
So multilingual is a major, underappreciated need that so many companies have. We saw the first wave of language models really excel in English and English alone. And so Cohere, and its open research arm Cohere For AI, has developed lots of work and data sets and models that support multiple languages, which is really important for companies: if you’re covering the US, for example, you will most likely have a good number of customers who speak Spanish or who will search your website in Spanish.

Jay Alammar:
Do you build multiple systems, one per language, or do you use language models that are able to handle them? Or maybe you have a global footprint and you have to support, you know, tens or hundreds of languages. So these have been the major focus areas for Cohere: building these models, Embed, Rerank, and Command, all with multilingual support, and then open research, which resulted in models like Aya and Aya Expanse, the latest one, that cover a vast number of languages. As for my role, I’ve worn many hats. I joined mainly out of fascination with language models and wanting to see how they come out of the lab and become products and

Jay Alammar:
become features and become entire industries, and Cohere has been an incredible place to see that evolution and guide so many different companies to adopt and roll out these models into production. So I worked quite a bit for a while on education, on sitting with and advising enterprises on the best use cases for their needs and how to take specific capabilities to production.

Jay Alammar:
With colleagues like Luis Serrano and Meor Amer, we built something called LLM University, because there was an education gap: so many executives and even engineers had to rush to learn and deploy these models, but there weren’t enough accessible resources to teach all of these topics. And so, you know, I was very lucky to work with some of the most incredible visual communicators of these concepts.

Jay Alammar:
We have about seven modules there covering search, generation, tool use, RAG, and a lot of these concepts that are extremely important. Another part of that role is going out and speaking with developers at developer conferences and research conferences, to see what issues they face, but also to explain these paradigms and how things like retrieval-augmented generation work. More recently I have been more focused on research and developing the next generation of systems, which I’m excited about.

Jay Alammar:
So Command R7B is the model we released just in December, and we’re really excited about a lot of things that come after that. We have seen it excel at things like, if you look at some of the figures on the code generation side, doing really well at SQL, or even programming languages like COBOL, which are major enterprise pain points. These are a little bit different in terms of what enterprises need out of these models versus what individuals need.

Jay Alammar:
And that’s a little bit of my experience on it.

Raja Iqbal:
And you mentioned multilingual. Is that one of the scenarios where, if I’m interested in a different language, then Cohere should be my LLM of choice? Does Cohere support other languages better than, let’s say, the Llama series or GPT models?

Jay Alammar:
Yeah, so definitely. There’s broad language support: Embed covers a very large set of languages, and I think with the Command R and R+ refresh these cover a couple dozen languages, which expanded a little bit more with the latest Command R release. So multilingual is, yeah, one of the areas where we really see it making the difference for people.

Jay Alammar:
But another one is private deployments. And this goes back to the point that you mentioned earlier: for companies, really, the highest concern is things like data security and data privacy. So Cohere was built to focus on private deployments, allowing customers and companies to deploy the models so the model comes to the data, rather than the data going out to hit an API, and to do that across all the different cloud providers out there.

Jay Alammar:
So being cloud agnostic and being able to deploy to each cloud. This is the set of features, let’s say, that makes it geared towards enterprises.

Raja Iqbal:
And speaking of, you know, Cohere, Llama, GPT, OpenAI: Ilya gave a talk recently and said something to the effect that pre-training as we know it will end very soon. What are your thoughts on this? Because, you know, there’s a lot of data, but there is still a finite amount of data out there.

Raja Iqbal:
So what are your thoughts on this statement?

Jay Alammar:
So the area definitely develops over time, and let’s say the last two years have seen a lot more progress on the post-training side. While you can do enough pre-training to get the model to a specific level of capability, there is post-training: the two stages of supervised fine-tuning and then preference tuning, either with RL or without RL.

Jay Alammar:
It started out with RL, with reinforcement learning from human feedback, but that is very difficult to get right for a lot of organizations and developers, and so other methods that are a little bit more straightforward, where you can get similar results, like DPO, came along. So yeah, you have pre-training and then these two post-training steps, which have seen a lot more focus. And yes, natural data, what you might call fossil fuel data, as it exists out on the internet, is limited.
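
For readers who want the formula behind the DPO mention: given a preferred response $y_w$ and a rejected response $y_l$ for a prompt $x$, DPO optimizes the policy $\pi_\theta$ against a frozen reference model $\pi_{\text{ref}}$ with the loss below (as introduced in the Direct Preference Optimization paper), avoiding an explicit RL loop.

$$
\mathcal{L}_{\text{DPO}} = -\,\mathbb{E}_{(x,\,y_w,\,y_l)}\left[\log \sigma\!\left(\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\text{ref}}(y_w \mid x)} \;-\; \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\text{ref}}(y_l \mid x)}\right)\right]
$$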

Jay Alammar:
There’s been growth in these other steps from using things like, you know, human annotators or synthetic data generation. A lot of the cutting-edge models that came out this year benefited tremendously from things like synthetic data generation for supervised fine-tuning or for reinforcement learning, because the models, by maybe the beginning of this year, started to be good enough that the text they generate is high quality.

Jay Alammar:
If you give them the right information, the text that they generate is better than a large percentage of documents out there. And then the formula became: do you have the right recipe to generate the data in the right way and inject the right, you know, seed information into it? And so that’s where a lot of the development was this year.
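
A hypothetical sketch of that generate-then-filter recipe for synthetic training data; the `generate` and `passes_quality_check` functions are placeholders for a strong teacher model and whatever filtering (heuristics, LLM judges, deduplication) a real pipeline would use.

```python
# Sketch: seed info -> generate candidate training examples -> filter -> keep.
# generate() and passes_quality_check() are hypothetical placeholders.

def generate(prompt: str) -> str:
    """Placeholder: call a strong model to produce a candidate example."""
    raise NotImplementedError

def passes_quality_check(example: str) -> bool:
    """Placeholder: heuristics, an LLM judge, dedup, etc."""
    raise NotImplementedError

def synthesize_dataset(seed_facts: list[str], per_seed: int = 5) -> list[str]:
    dataset = []
    for fact in seed_facts:
        for _ in range(per_seed):
            candidate = generate(
                "Using this fact, write one question and a correct, well-written answer.\n"
                f"Fact: {fact}"
            )
            if passes_quality_check(candidate):
                dataset.append(candidate)
    return dataset
```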

Jay Alammar:
And that’s probably what Ilya was referring to. Another major theme we’ve seen take off in the last few months is reasoning: training the models to do more at inference time than just create one output to solve a problem. And this is most likely going to be one of the hottest areas for these models next year.

Jay Alammar:
So things like how the models reason, how they can reason across, you know, trees of reasoning, traversing them and backtracking to solve these complex reasoning problems. We saw at NeurIPS that François Chollet was there, and he was talking about one of his main focuses, the ARC-AGI challenge. And there have been some tremendous findings there on how to scale the intelligence that you can get out of a model by allowing it, during inference time, during test time, to do a lot more calculation and a lot more structured thinking than just producing one generation; letting the system do these multiple steps, which can be quite advanced

Jay Alammar:
and quite sophisticated. So some of these approaches are like, you know, at test time, at inference time, you generate synthetic examples, then fine-tune a model, then use that fine-tuned model to generate the answer that you want, and you check that against your test examples and see if it produces a good approach or not.
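
A toy sketch of the test-time compute idea: sample several candidate solutions and keep one that is consistent with the known examples, rather than trusting a single generation. The `propose_solution` and `run_candidate` functions are assumptions standing in for a model call and an execution/verification step.

```python
# Sketch: spend more compute at inference time by sampling N candidates
# and verifying them against known examples, instead of trusting one generation.

def propose_solution(task_description: str) -> str:
    """Placeholder: ask a model for one candidate solution (e.g. a program or answer)."""
    raise NotImplementedError

def run_candidate(candidate: str, example_input) -> object:
    """Placeholder: execute/apply the candidate to one example input."""
    raise NotImplementedError

def solve(task_description: str, examples: list[tuple], n_candidates: int = 16):
    for _ in range(n_candidates):
        candidate = propose_solution(task_description)
        # Keep the first candidate consistent with all known input/output examples.
        if all(run_candidate(candidate, x) == y for x, y in examples):
            return candidate
    return None  # none of the sampled candidates verified
```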

Jay Alammar:
There’s a lot on evolving chain-of-thought reasoning and verifying different approaches to reasoning. I would say the sense that I got from NeurIPS this year, and the context around Ilya’s talk, is that for me these are the two major themes I think about: synthetic data generation,

Jay Alammar:
and how it improved these models over the last year, and then what will happen next, which includes things like reasoning and test-time or inference-time compute and tuning.

Raja Iqbal:
And so I have seen these, let’s call them camps. There is this camp that says AGI is near, it’s about to happen, notably, you know, Elon Musk: hey, AGI is going to happen. And then there’s Yann LeCun: no, it’s not going to happen, this is not how AI works. So what are your thoughts on this?

Raja Iqbal:
How do you see two camps being so vocal about this? I mean, I have my own opinion; as a technologist, I have some background, but I would like to hear your thoughts on this. Maybe the reality is somewhere in the middle, I don’t know.

Raja Iqbal:
Yeah.

Jay Alammar:
Yeah. So I get people’s fascination with human-like AI, but I don’t follow that very closely. I come across these discussions, but my interest really is in a few other things that are adjacent to that. Take the existing AI systems right now: even if we don’t have any more developments, how can we get them more into production?

Jay Alammar:
People are, you know, competing to say how far in the future AGI is, how long until we have it, but how many systems are actually in production today that are solving real needs for a company? I tend to think about that a little bit more, which is to say, the deployment of these systems, making robust, reliable AI systems that make humanity more productive.

Jay Alammar:
If you think about AI systems, there’s a class of AI systems that is extremely important and that should probably get some of the fascination that people have around AGI, and that is things like recommender systems, which really are regulating the information flow to the feeds that we read and the content that we consume every day.

Jay Alammar:
Some of the most important AI systems in existence today are things like Google PageRank, and then the recommender systems of TikTok or Twitter, social media in general, that regulate the information flow. And people are, you know, glued for hours to these systems. That’s a subset of AI that I think is underappreciated; people don’t think about them as much as they should.

Jay Alammar:
And for me, they play a more major role in what I think AI is today. The problem with AGI is that it’s a little bit nebulous in terms of definition. The existing AI systems are already there in some ways: we’ve already ceded the ground that calculators are better than humans at calculation, and if you go back far enough, we would have thought of that as intelligence.

Jay Alammar:
But now it’s no longer intelligent, because we moved the goalposts on what intelligence is. You know, computers can read a million documents in a second, and we no longer really think of that as intelligence, because we are, again, moving the goalposts. So yes, the existing level of AI is way off the charts when compared to humans on a lot of different tasks, but we still want to see this, maybe, reasoning with memory, this very specific custom fit of intelligence that can be recognized as human.

Jay Alammar:
So I think about it a little bit, but not to the level that, you know, a lot of people out there do.

Raja Iqbal:
So you mentioned recommender systems. Using these models, they can actually perform substantially better than our classic, traditional content-based and collaborative filtering, and now we are introducing the semantic aspect and the generative aspect as well. Does this concern you? I mean, information flow is being controlled. As a human, does it bother you? Does it concern you at all?

Jay Alammar:
So I tend to be a tech optimist. Technology has improved the wealth of humanity and taken a lot of people out of poverty, and we need to continue to guide its evolution to ensure the best outcomes for everybody. It has the potential to continue to improve things, but that will only happen if good regulation and good decisions are made on things like safety, deployment, and awareness of what is safe to do with these models, what is a responsible way to do things with these models and what is not.

Jay Alammar:
And so yeah, we need to raise that awareness and learn where these models are good and where they’re not. The most major example is the one we talked about earlier: okay, we have chat systems, they can answer questions, but are they good search engines on their own? It took the media and the general public a good maybe six months before they realized that.

Jay Alammar:
Yeah, no, hallucination is a problem, and so we need another kind of component in the system to guide it. We saw something similar with recommenders optimizing for engagement. So the YouTube algorithm, for example, or social media: if they saw people were engaged with a piece of content, it would elevate that and recommend it to more people, which led to things like clickbait, where the title is very controversial.

Jay Alammar:
People click on it and it doesn’t really deliver. So that’s another learning, where the systems then need to evolve to account for this, because these systems are really now the gatekeepers for content. If you’re a new musician and you’re creating music and you want to be exposed to a new audience, what you’re being told is that TikTok is the way for people to discover it.

Jay Alammar:
And the TikTok algorithm, for example, or maybe the YouTube or Instagram algorithms, these are the new makers of the artists that really start to have grand and major appeal. So people will adapt to that system, and in the past we’ve dealt with things like clickbait, where the system creators then needed to change their systems to say, okay, if users click on it, that’s not the right signal.

Jay Alammar:
Maybe they need to be engaged with it for two minutes and watch the video, or listen to the song for two minutes, and that is maybe the right level of engagement. And then you see content creators coming up with other tools or mechanisms to game that metric. And so this continues to be a cat-and-mouse game.

Raja Iqbal:
It is that classic adversarial problem, right? You make some tweaks to the algorithm, and the other side, the adversary, adjusts and tweaks to beat the algorithm. That’s interesting. So, Jay, you recently published this book with a coauthor. Tell us more about the book. Why did you decide to write it?

Raja Iqbal:
And, who should be reading that book?

Jay Alammar:
So lots of people have had to switch and learn machine learning, or learn language models, very quickly. These can be machine learning people, data scientists, or general developers coming from back end or front end or other kinds of system development, as well as others who are not necessarily engineers. And what was the statistic? You know, ChatGPT grew to 100 million users in only, let’s say, a month or two, compared to technologies that came in the past.

Jay Alammar:
This is, let’s say, a meteoric rise, and so many people have to interact with these systems. The better they understand them, the more they will be empowered to use them correctly. So that was the goal of the book: to create an educational resource that is highly visual in nature, that is accessible to everybody, and that allows developers to build with the models.

Jay Alammar:
And so it’s built in three parts. The first part is just high level: a quick history and a breakdown of the technology, of what a language model is. This is basically a refreshed version of The Illustrated Transformer. The Illustrated Transformer came out seven years ago, when the model first came out. How have Transformers evolved since then?

Jay Alammar:
How is that different from the latest-era Transformer? There are some things that have changed and things that did not change. And so we describe today’s Transformer, as opposed to an old encoder-decoder model; we talk about today’s positional embeddings and today’s attention. But that’s only the first three chapters, and that gives people the main idea.

Jay Alammar:
So if somebody understands embeddings, tokenization, and the high-level view of how the models work, then they can start to go and build them into systems. And the vast majority of people really only need the second part of the book, which is how to use pre-trained LLMs for text classification, how to use them for clustering and topic modeling, how to use them to build RAG systems, how to do prompt engineering, and how to think about multimodal language models.

Jay Alammar:
The last part, part three, is about fine-tuning. That’s a little bit more for the advanced users who want to fine-tune an embedding model or text generation models for advanced use cases. The book is a few hundred pages, with hundreds of figures that were originally made just for this book.

Jay Alammar:
And they follow the same process: like I told you, The Illustrated Transformer visuals each had to go through seven iterations; the ones in this book each had to go through maybe ten different iterations, because there was a lot of work involved to get them to their final stage. It’s all in color, and collaborating with Maarten has been absolutely incredible.

Jay Alammar:
Maarten is the creator of popular open source tools like BERTopic that are used by hundreds of thousands of people. And I really wanted to work with Maarten because not only did he build, understand, and maintain these popular open source libraries, but these libraries have some of the best documentation out there: highly accessible, understandable, and visual.

Jay Alammar:
And so he really rose to the occasion with the incredible work he’s done for this. It includes, I think, chapter five, the one on topic modeling; as the creator of BERTopic, he’s the one individual in the entire world best suited to write about topic modeling with large language models. And I was really amazed that even after working on this book together for a year and a half, he continued to create incredible guides on his Substack about other topics that are super relevant, things like quantization, Mamba architectures, and mixture-of-experts models.

Jay Alammar:
So I highly recommend it, and we would love for people to go over the book and let us know what they think. We hope it’s an accessible introduction to a lot of these topics, and that it is very empathetic in how it introduces these intuitions. All of the code is on GitHub, and the repo has a good number of stars so far. We’re really excited about the feedback, and if people like it, we would love it if they leave a review as well, wherever, you know, Amazon or Goodreads or whatever; that really helps us out.

Jay Alammar:
But we hope ultimately it’s helpful to the largest number of people.

Raja Iqbal:
I have looked at the book, and of course I know your work in this space. How long did it take you to write this book? I mean, even with Maarten, and both of you are very well known in this space, there are some laws of physics, right? You have to have the time to actually write.

Raja Iqbal:
How long did it take you to write a book of that quality?

Jay Alammar:
Yeah, this was about a year and a half of actual writing time, but it really benefited from the previous nine years of developing a visual language to explain things. I didn’t really start from scratch, because we had already worked out, you know, how to describe these models, how to describe, like we talked about, matrices, and the visual language to describe things like attention.

Jay Alammar:
So that really benefited from the last decade of developing that visual language. But from starting to write the book, iterating on it, edits, and going over and redesigning all of the images, it took us about a year and a half until we submitted it, maybe in August. So we started a year and a half or so earlier, and then we sent it to the printers in September.

Jay Alammar:
And I think it came out in October, and it’s shipping around now.

Raja Iqbal:
And who do you think should be reading this? Is it people who are doing research, people who are building real-world applications, beginners, product managers, software engineers? Who did you keep in mind as the target audience of the book?

Jay Alammar:
So I have a philosophy here, which was helpful for my blog in general, which is that I want the general public to be able to get something out of it. I think with the best artifact of writing or explanation, if somebody comes in and they know nothing about the topic, they can still get something out of it; it just elevates their learning, the way it was put in a paper at an ICLR workshop.

Jay Alammar:
It talks a little bit about this, about communication, technical communication and scientific communication, and that definitely applies here. So I think the general public would be able to get the intuitions of how this model works: it’s not just a black box, it’s a tokenizer, then, you know, a series of transformer blocks, and then a language model.

Jay Alammar:
This is how they work, and this is the process they work under. So we targeted the most accessible category, but the core audience is software developers. If you know a little bit of Python, you’ll be able to run through all of the code examples, get the most out of the book, and go into the deeper topics like fine-tuning, which I don’t think the majority of people need to go into.

Jay Alammar:
But if you want to take it to that level, that also includes researchers working in their own fields. You will see that there are incredible computer vision researchers for whom maybe this is the first time they work with a Transformer, and so they need an accessible way to learn the various components of this model. They were in mind as well.

Jay Alammar:
Hopefully the visual language makes it accessible to the broadest possible audience.

Raja Iqbal:
So if I’m a decent software engineer working for a company and now we have to build an application, this book would be a great point to start.

Jay Alammar:
That’s my hope. Yeah, that’s what we built it to do. Specifically, if you have a use case that we have a chapter on, so if you want to do classification or topic modeling or writing, we will give you a very gentle introduction to the main intuitions that you’ll need to be a better builder of these systems.

Raja Iqbal:
Yeah. So when ChatGPT came out — and these transformer models existed even before ChatGPT came out — in most cases, when people look at generative AI, they think it is supposed to, you know, just generate things: write something here, write something there. I’m talking about an average layperson. But there’s this whole notion of an agentic framework.

Raja Iqbal:
Now, these agents — can you speak to that a little bit? Where do you see all of this being headed? Creating a social media post, writing about something, summarization — the models do a very good job, a decent job, on that. But then there could be assistants in a regulated industry — call them assistants,

Raja Iqbal:
right? Not really chat — but “I need a medical opinion on something very heavily regulated” versus “I need a social media post.” And then, in some industries, I may want my application to go and make a decision on my behalf. So, first of all, how do you see this evolution? And what are certain things that you think are going to become a norm around us in society?

Jay Alammar:
I find it fascinating. It’s maybe one of the most fascinating areas of technology today: this multi-step tool-use perspective on language models and AI. In my talks, I give people the example of RAG systems, how they evolved, and how people’s expectations evolved with them. Every time you have a system, people try to take it to the next level, and the system needs to evolve to tackle that use case.

Jay Alammar:
And that’s how you end up with these more and more advanced systems. People started with naive RAG systems that would only search: you ask any question, it sends that question to a search component, gets the top document, puts it in a prompt, and sends it to a generative model. But once people deploy that to production, they find that users actually send very complex queries — queries that aren’t the right thing to search for directly.

Jay Alammar:
You’ll find somebody who asks the model things like: “I have an essay due for work tomorrow. I need to write about animals. I can write about penguins, but maybe I should write about dolphins — where do they live, for example?” A question like this, if you throw it at any search system, will only confuse it.

Jay Alammar:
And the documents that you will retrieve will not be. And so one of the first ways to improve a writing system like this is to say, okay, let’s do a step of query, rewrite the beginning and so to find the right documents done this this question will rewrite this query to say let’s only search for where do penguins live.

Jay Alammar:
That is the search query — we extracted it from the user’s message. With that, you have a better RAG system that did one step of query rewriting, using a generative model in the retrieval step. Then people keep asking more advanced or more complicated questions, questions that maybe cannot be answered by searching for just one document.
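
A minimal sketch of that single query-rewriting step. Here call_llm and search_index are hypothetical placeholders for whatever generative model and search component a real RAG system would plug in:

```python
# Hypothetical sketch of one step of query rewriting before retrieval.
# call_llm and search_index are placeholders, not real APIs.

def call_llm(prompt: str) -> str:
    """Placeholder for a call to any generative model."""
    raise NotImplementedError

def search_index(query: str) -> list[str]:
    """Placeholder for a call to any search engine or vector index."""
    raise NotImplementedError

REWRITE_PROMPT = (
    "Extract a short search query that would find documents answering the "
    "user's underlying question.\n\nUser message: {message}\n\nSearch query:"
)

def answer_with_query_rewriting(user_message: str) -> str:
    # Step 1: rewrite the rambling message into a focused query, e.g. "where do penguins live".
    search_query = call_llm(REWRITE_PROMPT.format(message=user_message))

    # Step 2: retrieve documents for the rewritten query, not the raw message.
    documents = search_index(search_query)

    # Step 3: generate an answer grounded in the retrieved documents.
    context = "\n\n".join(documents)
    return call_llm(f"Answer using only this context:\n{context}\n\nQuestion: {user_message}")
```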

Jay Alammar:
So if I say, “compare for me the financial results of Nvidia across two years,” you might find one document that contains all of this information, but a more capable system would be able to say: okay, to answer this question I need to do two searches, one for each year. I get the best results for each, I synthesize them, and from there I answer. And then there are more advanced questions, like “what are the top car manufacturers in a given year, and do they make EVs or not?”

Jay Alammar:
There, you most likely will not be able to find one document that contains all of this information. So here you need to do what’s called multi-hop retrieval. First step: what are the top car manufacturers? Once you know that — okay, Toyota, Volkswagen — you search another set of queries: does Toyota make EVs? Does Volkswagen make EVs? And so on.
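
A sketch of that decomposition and multi-hop pattern, reusing the hypothetical call_llm and search_index placeholders from the previous sketch; the specific queries and entity names are illustrative only:

```python
# Hypothetical sketch of multi-hop retrieval: a broad first search surfaces entities,
# and each entity spawns its own focused follow-up search before the final answer.

def multi_hop_answer(question: str) -> str:
    # Hop 1: a broad query, e.g. "top car manufacturers".
    first_results = search_index("top car manufacturers")

    # Use the model to extract the entities worth following up on (Toyota, Volkswagen, ...).
    entities = call_llm(
        "List the manufacturers mentioned below, comma separated:\n" + "\n".join(first_results)
    ).split(",")

    # Hop 2: one focused query per entity, e.g. "does Toyota make EVs".
    evidence = {name.strip(): search_index(f"does {name.strip()} make EVs") for name in entities}

    # Synthesize a final answer from everything that was retrieved.
    return call_llm(f"Question: {question}\n\nEvidence: {evidence}\n\nAnswer:")
```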

Jay Alammar:
And so now you’re starting to branch out in capability. In this process of improving your system, you’ve really created an information agent, where the language model has taken on another type of role: it’s not just a generator of text, it’s somewhat of a reasoning engine that is able to issue commands to other systems.

Jay Alammar:
And that’s a gradual build-up — this is where agents started coming in. They improved as the models got better at generating code, for example. And this can be abstracted: the model can now call tools, or issue function calls, call any API out there, get that information, and solve problems across multiple steps.
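
A framework-agnostic sketch of that multi-step tool-use loop. Everything here is a hypothetical placeholder — call_llm_for_tool_use stands in for a model’s function-calling endpoint, and the two tools are stubs; real providers expose this pattern through their own tool-use APIs and schemas:

```python
# Hypothetical sketch of an agent loop: the model either requests a tool call or answers.
import json

def web_search(query: str) -> str:
    """Placeholder tool: a search API."""
    raise NotImplementedError

def run_python(code: str) -> str:
    """Placeholder tool: a sandboxed code runner."""
    raise NotImplementedError

TOOLS = {"web_search": web_search, "run_python": run_python}

def call_llm_for_tool_use(messages: list[dict]) -> dict:
    """Placeholder: returns either {"tool": name, "args": {...}} or {"answer": text}."""
    raise NotImplementedError

def run_agent(user_request: str, max_steps: int = 5) -> str:
    messages = [{"role": "user", "content": user_request}]
    for _ in range(max_steps):
        decision = call_llm_for_tool_use(messages)
        if "answer" in decision:                                 # the model decided it is done
            return decision["answer"]
        result = TOOLS[decision["tool"]](**decision["args"])     # execute the requested tool
        messages.append({"role": "tool", "content": json.dumps({"result": result})})
    return "Stopped after reaching the step limit."
```

The loop is the whole trick: each tool result is appended back into the conversation, and the model decides whether to call another tool or to answer.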

Jay Alammar:
And when you’re close to it, it’s very hard to look at this and not think that it’s one of the areas of highest potential in computing in the near future. To me it feels like maybe what it felt like at Xerox PARC in the 1970s, when the personal computer first emerged — when people figured out that computers could now be small, that we could have the mouse and the keyboard and the screen.

Jay Alammar:
So it feels like we’re on the precipice of a new era of computing, guided by language models that generate code, execute it, and solve problems across multiple steps. So I do advise people to keep an eye on that and start experimenting with it. It’s still early, and we still need a lot more work on the reliability of these systems.

Jay Alammar:
They will only get better with time. But to me it’s one of the areas with the most potential.

Raja Iqbal:
And in addition to reasoning, now we are also talking about actions, right? So an example — an actual example, one of the use cases that we were building for a customer: let’s say you want to search for clinical trials for a specific patient. “Tell me something about John Doe, who is a patient.”

Raja Iqbal:
Right — so maybe I pass a patient ID, and the system returns the patient information. Then you say, “find me all clinical trials for patient John Doe,” and now you have a tool — an API tool — connected to clinicaltrials.gov. You get back all the trials, say ten trials.

Raja Iqbal:
“Okay, give me information on that specific trial.” It goes back, does that again for you, and gives you the information. “Okay, now send an email to the main point of contact for this clinical trial to enroll this patient.” And now you’re connected to, maybe, Gmail or your Outlook, and it sends that email.

Raja Iqbal:
And so we have something like this that we actually built. Now, another level to this could be that it works overnight: every night there is a nightly run, and for all the patients in the system, based on their new MRIs, their imaging, their clinical notes, it just matches them to clinical trials.
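
A rough sketch of how that nightly matching run might be wired up. Every function name here — get_all_patients, find_trials, draft_enrollment_email, send_email — is a hypothetical placeholder for the hospital record system, a registry such as clinicaltrials.gov, an LLM call, and an email provider; this is not the actual system being described:

```python
# Hypothetical sketch of a nightly clinical-trial matching run; all functions are placeholders.

def get_all_patients() -> list[dict]:
    """Placeholder for the hospital's patient-record system."""
    raise NotImplementedError

def find_trials(patient: dict) -> list[dict]:
    """Placeholder for a query against a trial registry such as clinicaltrials.gov."""
    raise NotImplementedError

def draft_enrollment_email(patient: dict, trial: dict) -> str:
    """Placeholder for an LLM call that drafts the outreach email."""
    raise NotImplementedError

def send_email(to: str, body: str) -> None:
    """Placeholder for a Gmail / Outlook integration."""
    raise NotImplementedError

def nightly_run(require_human_approval: bool = True) -> None:
    for patient in get_all_patients():
        for trial in find_trials(patient):
            email = draft_enrollment_email(patient, trial)
            if require_human_approval:
                # Keep a person in the loop instead of firing off emails automatically.
                print(f"Needs review: patient {patient['id']} -> {trial['contact_email']}")
            else:
                send_email(trial["contact_email"], email)
```

The require_human_approval flag is one simple way to frame the autonomy question: whether the system should send those emails on its own or queue them for a human to approve.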

Raja Iqbal:
And in that mode you’re not even intervening before it sends those emails. I don’t know — should it just be firing off those emails? So what are your thoughts on this? Which industries are going to be impacted the most by these — call them agents? I mean, again, a lot of these things are very loosely defined and there is no generally accepted definition.

Raja Iqbal:
But with agents, we talk about the autonomy in them. So which industries, in your opinion, will have more fully autonomous agents, versus assistants that help you in real time?

Jay Alammar:
So an abstraction, or a perspective, that I find helps people understand this is that it’s software, and software keeps growing. There are so many tasks that used to be done by humans, by clerks — you had maybe two or three entire floors in a New York skyscraper, in some bank decades ago, doing something that is now done by an Excel sheet.

Jay Alammar:
You change one number, and you have a cascade of operations that updates all the other cells involved. This is something that used to be done manually. So software keeps expanding into being able to solve more problems, and AI takes a lot of what software is able to do and pushes it to the next level.

Jay Alammar:
So wherever software grows fast, AI will grow fast. And you can see this in venture capital. The common example is that a couple of decades ago, only two of the top ten companies were software companies — Microsoft and Intel; these were the big ones. Now the top seven or eight are all specifically IT or software companies, and every other non-software company has a tremendous amount of software and technology in it — otherwise

Jay Alammar:
it would not be competitive. So that growing adoption of software across industries is what also leads to AI, because AI is just the next version of software — software that does more things. Software was not good at understanding images, or classifying them, or generating them; now it’s able to do that.

Jay Alammar:
And so that’s a more positive and useful perspective. I find it useful to think of AI as just an extension of software — software that can now deal with complex things like reasoning, or with images and text data. It’s going to be able to do more and more advanced things, and you’ll see it propagate through those industries faster.

Raja Iqbal:
And do you see any industries, any roles, any professions — specifically any kind of knowledge work — that will cease to exist, or that will be impacted the most?

Jay Alammar:
It’s more useful to analyze this not at the level of a job but at the level of a task, where a job is a collection of many different tasks. These tasks are of different levels of complexity, and some of them will be automated sooner than others. But the nature of the job, a lot of the time, still needs to be there, even if many of the tasks are automated.

Jay Alammar:
And so that’s maybe a more useful level of analysis. But this field moves very quickly, and it does feel like the capabilities grow very quickly, so it’s hard to predict. But whatever your knowledge work is, you should invest in learning these new tools, in getting the right abstractions, and in becoming comfortable using these systems to improve your skill set.

Jay Alammar:
Because there are two sides to this coin. There was a lot of discussion about AI, machines, and creativity — very few people thought these models would ever be able to draw beautiful pictures or create new forms. But they were able to do it, and very quickly.

Jay Alammar:
And you can see this in two ways, but one way of seeing it is that an individual using these tools, with a subscription to a couple of them, now has more creative tools at their fingertips than Walt Disney probably had in its prime. So what will you do with this new potential? If you can write entire software using one prompt, or generate thousands of art pieces and music and video using only natural language, what will you create?

Jay Alammar:
What is the next thing that you’re only now able to do as an individual, without a major Hollywood studio behind you? I find it exciting to see how people think about what they’re now able to do as individuals.

Raja Iqbal:
I like the way you describe it — you’re looking at it more at the task level, right? So, my day-to-day job as a business analyst or a data analyst or an HR professional, perhaps — and I believe Mayfield Fund was estimating that some large share of knowledge work — but I mean, that’s just an estimate —

Raja Iqbal:
can be done reliably by an AI, right? So are you suggesting that, for all the roles — maybe some roles more than others, maybe tech writers, proposal writers, and so on — the way we should look at it is that some of my tasks as a knowledge worker can be automated by AI, as opposed to my entire job being eliminated by it?

Raja Iqbal:
Yeah.

Jay Alammar:
So I try to track it by seeing what capabilities the models have. One thing that has developed quickly, for example, is translation — the models are really good at translation. So how much of those tasks should still be done by professional translators?

Jay Alammar:
Is it the whole thing? Is it just a final check? That kind of evolution — we’ve seen it throughout history: technology always changes the economy and the composition of the various tasks out there. It used to be that elevators could not work without a human operator, and “computer” was a job title for people who actually solved mathematical equations by hand, until that work became better done by software.

Jay Alammar:
So there’s a lot of speculation. I try to think about it in terms of the capabilities that grow, the new possibilities that open up, and the ever-evolving nature of what people do. To have the best options in the future, always adopting the latest tools that make you better at what you do is a good edge to have.

Jay Alammar:
So keep learning, keep updating your individual skill set, rather than sticking to one definition of how a job existed at a particular snapshot in time and expecting that it will continue to be like that. The best thing people can do is keep learning, develop their skill sets, be early adopters of these tools, and understand not only where they’re good but also where they fail.

Jay Alammar:
And there are still failure modes; to be a more advanced user of these tools, you should know where they fail even more than where they succeed.

Raja Iqbal:
Yeah. And so, regarding our dependence on these tools — all of us use these tools quite a bit. Someone sent me this tweet by Greg Eisenberg: he met a student from Stanford, fairly smart — and I don’t know how much exaggeration there was in it — but he said the student was having a difficult time putting words together.

Raja Iqbal:
He would pause midstream. And after the conversation, the reason the student gave was that, well, he uses GPT all the time, and now he’s having difficulty putting sentences together. I was a bit skeptical when I read it. You know, I’ve used GPT and the internal tools that we’ve built quite a bit,

Raja Iqbal:
and so far I’m fine — I’m not the best of communicators, but I’m a decent communicator, right? But do you foresee — as things like Character.AI and ChatGPT become more prevalent, and we stop writing things, stop putting thoughts together altogether, and rely on these systems — that our language will be delegated to the large language models, and our own language abilities will be impacted?

Raja Iqbal:
This is more of a hypothetical question — I have my own skepticism about that tweet — more of a philosophical question. Do you see anything like that happening?

Jay Alammar:
So, regardless of whether we’re communicating predominantly with models or with people, we will still want to use language, either written or spoken. So I don’t necessarily see a major shift in how much we use language. I do see language barriers maybe breaking down, with better tools for immediate translation, for example, and the ability to speak practically any language you want with a model.

Jay Alammar:
And so, as a tech optimist, I always try to see where the highest benefit imaginable can be, but also to think through misuse — where some people might over-rely on these models for things they’re not suited to. And because LLMs are very fluent, people might ascribe more to them because of that fluency; they might trust them a little bit more than they should.

Jay Alammar:
And this is exactly what we talked about with factuality. But it’s also things like chat agents — language models used as chat avatars that you talk with on social media, for example — and you’ve seen examples of people building deeper and deeper relationships with some of these models.

Jay Alammar:
And that’s an area where I think we should be cautious and keep, let’s say, high awareness. This is a continuing theme in humans’ relationship with technology — other examples we talked about are clickbait, or spam before that. An incredible technology like email comes around: you’re able to send messages to anybody in the world,

Jay Alammar:
and it takes less than a second. And with that, you start to see misuse of it. In the beginning we don’t have the antibodies to really think about that misuse, and so we might actually fall for a message from somebody saying, you know, “send me a small amount of money and I’ll give you $1 million in return.”

Jay Alammar:
That’s a common spam theme. But over time people start to build internal, social antibodies to this misuse, and we still need to develop a bunch of those for these AI systems. Just like we did with spam, and with engagement bait and clickbait on social media, people need to have more awareness of what these systems are really capable of doing,

Jay Alammar:
what they can be trusted to do, and what they cannot be trusted to do. So there, I think, we still have the responsibility of not misusing these systems. And I get more concerned the closer they get to being, let’s say, human-like, where they can speak like humans. Very soon — or maybe right now — we’re starting to develop a little bit of antibodies toward deepfakes, but we don’t really have the tools for it yet.

Jay Alammar:
And you start to hear about these scam vectors — somebody calling you, sounding like your mother, for example, or you get an invoice, or you get a video. AI is a major part of the solution here as well, so we continue to need innovation in AI that catches this stuff, protects you, protects your attention, and protects you from misuse.

Jay Alammar:
So yes, I would say the discussion and conversation about safety and responsible use of AI is needed not only on the model creator and product creator side, but also on the end user and consumer side — people need to navigate how best to use these tools and not fall prey the way we did with spam in the past, or clickbait.

Raja Iqbal:
Yeah. And the last question around this, the societal part of it: do you think the way we educate in K-through-12 — elementary, middle school, high school — needs to change? Traditionally, the assignments that we give to students — well, generative AI can now do them within a few seconds, and that’s what kids are doing.

Raja Iqbal:
So do you think that teachers — that our education system — will need a complete overhaul due to these realities? That educators have to now reimagine and come up with new ways of delivering education?

Jay Alammar:
Yeah. So I’m an educator; a lot of what I do is teach people of different backgrounds, and this is technology I’m extremely excited about — the possibility of these models, these systems, to customize the learning for the learner. I already rely on these models to understand difficult concepts — what could be better? It’s such a magical world that we live in, where any question you have, you can ask a model.

Jay Alammar:
It will give you some answers. It doesn’t always give you ways to verify, so you have to ask for citations, so you know the source of the information. You still have the ultimate responsibility of verifying that information, and we still need better tools that show you the citations and the sources.

Jay Alammar:
But it’s just as magical as Wikipedia when it first came out, or as the search engine when it first came out — for the people who remember that. There are people like me — I don’t know about you, you look young — but I don’t know if you remember a time before the internet.

Raja Iqbal:
I did grad school when Google was actually, you know, picking up, so I know it, right? Yeah, absolutely. And then we had to adapt, right? Now Google is there, Wikipedia is there — it’s a reality, a fact. What do we do now?

Jay Alammar:
And that’s it. So yeah, for education, we’re really excited about the potential of education tools. Schools are a little bit different, because if there’s one thing I learned from Covid, it’s that schools aren’t really there just to dump information into people’s heads — that’s not all education is. A lot of it is the social angle, the social interaction of students with other students, which helps them learn to collaborate with other humans in the future.

Jay Alammar:
And so yes, I’m really excited about next-generation education tools for adults and young people alike. Science fiction tackles this: there’s a novel called The Diamond Age where a little girl has a kind of iPad-like device — this was written about 15 years before the iPad came out. So it’s one prophetic vision of how education could work, how it could guide somebody who doesn’t have the right resources through a very customized learning path — something that works for them and engages their learning process and the things they need to learn.

Jay Alammar:
Another thing I advise people to do is dive into science fiction. We’ve had smart people over the last century exploring so many possible futures — from utopia to dystopia and everything in between — and they’re all very useful for envisioning what is possible and what we should watch out for. Dystopian sci-fi is very important; it tells us what we should not do.

Jay Alammar:
You see The Terminator, you see The Matrix, and you see films like Dr. Strangelove, about nuclear weapons, for example — cautionary tales that are seen as self-preventing prophecies. So science fiction really helps bridge that gap; nobody can stop thinking about the future at this time of transformation, when software is able to do things it wasn’t able to do in the past.

Jay Alammar:
And so fiction provides an incredible canvas to think through these possibilities. I would recommend something like the Culture series — a series of maybe ten novels that envision futures of how humans and AI, and a future society of humans and AI, interact together — as one interesting set of ideas about possible futures.

Raja Iqbal:
Yeah. I mean, maybe the added benefit is that not everyone is lucky like me, getting a one-on-one conversation with you, right? You were talking about deepfakes, and your teaching style — maybe I can have my one-on-one mentor on demand, right? I could learn American history from a George Washington or an Abraham Lincoln.

Raja Iqbal:
Right. So, you know, those kinds of things — they’re not even far-fetched. We see them as very much possible from a technology standpoint.

Jay Alammar:
A hundred percent. And as an educator, I would love that. It would be incredible to scale educators — to scale the best teachers out there to be able to teach a hundred thousand students instead of just a hundred.

Raja Iqbal:
Yeah. So just one last thing: this segment is a rapid fire. For most of the topics that we have discussed, I will ask rapid-fire questions, and you have to give very brief, short answers — as short as you can, though sometimes you may want to go longer.

Raja Iqbal:
I will let you decide. But generally these are rapid-fire questions, right? Okay: how would you define yourself — are you an educator or an engineer?

Jay Alammar:
I’m an engineer, and I learn by writing and sharing.

Raja Iqbal:
In terms of building enterprise applications, is RAG really the way to go, or fine-tuning, or something else?

Jay Alammar:
Right — RAG, retrieval-augmented generation, is predominantly the most useful application pattern for enterprises at the moment, because it grounds a language model in a specific information source. Fine-tuning is a more advanced tool that you should only reach for after you’ve tried things like prompt engineering and RAG; if none of that works, you can start to think about fine-tuning for a specific subset of cases.
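
In the spirit of that answer, grounding a model in a specific information source can be sketched in a few lines — retrieve and generate below are hypothetical placeholders for an enterprise’s own search index and its preferred language model:

```python
# Minimal hypothetical sketch of retrieval-augmented generation (RAG).

def retrieve(query: str, k: int = 3) -> list[str]:
    """Placeholder for the enterprise's search engine or vector index."""
    raise NotImplementedError

def generate(prompt: str) -> str:
    """Placeholder for any hosted or local language model."""
    raise NotImplementedError

def grounded_answer(question: str) -> str:
    context = "\n\n".join(retrieve(question))
    return generate(
        "Answer the question using only the context below; say so if the context "
        f"is insufficient.\n\nContext:\n{context}\n\nQuestion: {question}"
    )
```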

Raja Iqbal:
And what is the biggest barrier to adoption of AI in the enterprise? Is it scale? Is it culture? Is it fear? Is it regulatory issues? Is it something else?

Jay Alammar:
Awareness and, let’s say, technical capability are the two things. There’s still a bit of friction in adopting models, deploying them safely, and getting to the right use case that serves a business need. A lot of companies are developing that know-how of connecting the dots between model capabilities and business needs. So awareness, and building the right skill set, is the area where a lot of companies are still developing.

Raja Iqbal:
Next: web search. We see that Google is concerned now — with GPT and OpenAI, Bing is there, and both Bing and Google have incorporated, you know, that generative answer before even their search results. Is web search dead?

Jay Alammar:
I’d say web search is dead — and it will be replaced by web search, maybe aided by the next level of software, aided by text generation, which will still depend a lot on web search. So it will continue to evolve; it will continue to incorporate more and more, like it did in the past — five years ago, ten years ago, it kept adding more and more AI.

Jay Alammar:
That stack will just continue to use heavier and heavier AI capabilities to improve its systems.

Raja Iqbal:
And is AGI going to happen within the next five years, within the next ten years, later, or something else?

Jay Alammar:
Yeah, I don’t really speculate on these things. I like to focus on how we can get the most companies benefiting from rolling out intelligent, useful systems to production fast, and how we can go beyond the hype and anticipation to actually building and solving real-world problems today. I tend to think more about that bucket of problems.

Raja Iqbal:
And in terms of productivity levels in society — those of us who have adopted these tools in our knowledge work are clearly more productive. Will these tools make us smarter or dumber?

Jay Alammar:
It depends on how you use them — you can use them for both, just as you can use the internet, or software, or a laptop for good or for evil. I am optimistic about the nature of humanity and about people’s tendency to play non-zero-sum games, to do things in a way that improves the world for everybody.

Jay Alammar:
And, you know, I advocate for more of us taking that as our approach to technology. Some of the best examples here are in open-source software: incredible people out there building incredible software and putting it out for everybody to use. That’s one of the most incredible models I can think of for how, in general, we can create a better future where everybody benefits from technology together.

Jay Alammar:
And AI has benefited tremendously from open knowledge, open sharing, open research, and open-source software.

Raja Iqbal:
In terms of large language models, do you think the answer is in bigger models — more parameters, bigger context lengths — or in small language models? Where do you think the biggest value is going to be?

Jay Alammar:
I’m really excited about smaller models; they keep getting better and better. Scale has gotten us to a good place, but it’s not where a lot of the major developments of the future will be. We still need better and higher-quality data, more specific tasks, and the right types of systems. So I’m really excited about small language models getting more and more capable, and about things like capable embedding models that can solve problems in specific cases

Jay Alammar:
with lower latency and a lower memory footprint than large models. So I’m definitely a big advocate for using the best tool for the job — whatever is faster, better, and more reliable is always the better approach.

Raja Iqbal:
Jay, thank you so much. It was a pleasure having you; I absolutely enjoyed the conversation. I know we have been talking for almost two hours now, but thank you so much for your time.

Jay Alammar:
Thank you so much. I really enjoyed the conversation, and I really love what you and the team do. Thank you so much for having me.
