
All the Weird Things About LLMs – The Curious Case of Anthropomorphism

Welcome to Data Science Dojo’s weekly newsletter, “The Data-Driven Dispatch”.

LLMs can be confusing. On one hand, they give you responses you would expect from a human prodigy. On the other, they fail at tasks a four-year-old child could handle with ease.

Let’s take the mayonnaise example.

LLMs cannot count

Why did the model behave this way?

The fact of the matter is that we often treat these language models as intelligent human beings answering our questions from the other side of the screen.

However, that is not the case. Because of how these models are designed, they sometimes behave in a very unpredictable manner.
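One concrete culprit behind failures like the counting example is tokenization: models read text as tokens, not individual letters. Here's a toy sketch; the token split below is purely illustrative, not a real tokenizer's output.

```python
# LLMs see text as a sequence of tokens, not characters.
# A toy, illustrative split of "mayonnaise":
tokens = ["may", "onna", "ise"]

# The model never directly observes single letters, so a question like
# "how many n's are in mayonnaise?" has no direct answer in its input.
# In plain Python, by contrast, counting characters is trivial:
n_count = "mayonnaise".count("n")
print(n_count)  # 2
```

Because the letters "n" are buried inside the token "onna", the model has to infer character-level facts it was never shown directly, which is why letter counting trips it up.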

We’ve compiled the weirdest of these behaviors for you. Time to dig in!

The Must-Read

While it is hard to fully understand how language models work, you can get a fair idea by learning about transformers and attention mechanisms. Read: The Brains Behind LLMs: An Introductory Guide to Transformers and Attention Mechanisms

But the idea here is that LLMs do not process information the way humans do. Put simply, they are large probabilistic models that predict the most likely next token, one token at a time.
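That next-token view can be sketched in a few lines. The vocabulary and scores below are invented for illustration; a real model computes logits over tens of thousands of tokens.

```python
import math

def softmax(logits):
    """Turn raw scores (logits) into a probability distribution."""
    m = max(logits)                      # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical next-token scores for the prompt "The sky is"
vocab = ["blue", "green", "falling", "mayonnaise"]
logits = [4.0, 1.0, 0.5, -2.0]
probs = softmax(logits)
prediction = vocab[probs.index(max(probs))]
print(prediction)  # "blue": the most probable next token
```

Everything an LLM "says" ultimately comes from distributions like `probs`; there is no separate reasoning module checking whether the most probable token is actually true.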

And because of how they process information and predict outputs, they often behave weirdly. We have compiled all the interesting ones for you!

Hallucinations in LLMs

LLMs hallucinate, a lot! If you ask about a situation that is completely made up, they will often answer you confidently. Time for some play:


Want to know a fun fact? The LLM Bootcamp has never been held in Singapore. Check our website to learn more about the bootcamp. The takeaway: LLMs hallucinate confidently about hypothetical events.

Read more: AI hallucinations: Risks associated with large language models

Training on Huge Amounts of Data

Did you know how many tokens GPT-4 was reportedly trained on? It’s about 1.76 trillion tokens. To put this staggering number into perspective: reading one trillion words at normal speed for 8 hours a day would take a person roughly 22,000 years.
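We can sanity-check that figure with quick arithmetic, assuming a typical reading speed of about 250 words per minute (our assumption; the original estimate likely used something similar):

```python
# Back-of-the-envelope check of the "22,000 years" claim.
words = 1_000_000_000_000            # one trillion words
words_per_day = 250 * 60 * 8         # 250 wpm * 60 min * 8 hours = 120,000 words/day
years = words / words_per_day / 365
print(round(years))                  # ~22,831 years
```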

This shows that LLMs are built on vast collections of data. Yet the nature of that data remains a mystery: companies typically keep the specifics of their training material under wraps, which is itself one of the stranger aspects of LLMs.

How Does Huge Training Data Affect Model Performance?

The popular discourse says the bigger the model, the better it works. In reality, that is a myth. And in this newsletter, we love busting myths.

  1. LLMs Memorize a lot and Hence They’re Easier to Trick

LLMs are more susceptible to deception due to their tendency to heavily rely on memorization. This leads to a phenomenon known as the ‘Memo Trap,’ where the model generates predictions based on its extensive prior training, rendering it inattentive to the actual question being asked.

It’s a bit like a person who is so confident in their own expertise that they stop paying attention to the actual problem at hand.

Memo trap - inverse scaling
Source: Inverse Scaling: When Bigger Isn’t Better

The graph shows a consistent pattern: as training compute (measured in FLOPs) increases, the models’ ability to avoid the Memo Trap gets worse. This aligns with the observation that bigger language models rely more heavily on memorization, which lowers their performance on such tasks.
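A toy caricature of the Memo Trap makes the failure mode concrete. The "model" below is just a dictionary lookup, not a real LLM, and the quote prompt echoes the memorized-completion examples used in the Inverse Scaling work:

```python
# Toy caricature of the Memo Trap: a "model" that completes prompts from
# memorized phrases, ignoring the instruction it was given.
MEMORIZED = {"Absence makes the heart grow": "fonder"}

def toy_model(instruction_word, prompt):
    """Return the memorized continuation if one exists; else obey the instruction."""
    if prompt in MEMORIZED:
        return MEMORIZED[prompt]   # memorization wins over the instruction
    return instruction_word

# Instruction: write a quote that ends in the word "heavy".
answer = toy_model("heavy", "Absence makes the heart grow")
print(answer)  # "fonder", despite the instruction asking for "heavy"
```

The more strongly a model has memorized a continuation, the more "compute" it effectively spends reproducing it instead of following the instruction, which is exactly the inverse-scaling pattern in the graph.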

Read more: Inverse scaling – Explore things that can go wrong when you increase the size of your language models

2. Deep Double Descent – More Data Leading to Lower Accuracy

Deep double descent means that performance doesn’t scale monotonically with size. As a model grows larger and more complex, its test accuracy can first improve, then temporarily get worse around the point where the model becomes complex enough to fit its training data exactly, and then improve again if it continues to scale up.


Read: Deep double descent – where data can damage performance

LLMs Change their Mind Under Pressure

Jamie Bernardi noted that ChatGPT often switches from a correct to an incorrect response when questioned with “Are you sure?”.

This tendency could stem from the model’s training: reinforcement learning from human feedback rewards responses that users approve of, which can push the model toward agreeing with the user rather than standing by a correct answer.
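A minimal sketch of such a confidence test is easy to write. The `ask` function below is a mock standing in for a real chat-model call; an actual harness would call a model API instead.

```python
# Minimal sketch of an "Are you sure?" test harness.
def ask(history):
    """Mock model: answers arithmetic correctly, then caves when challenged."""
    if history and history[-1] == "Are you sure?":
        return "Apologies, the answer is 13."
    return "7 + 5 = 12."

history = ["What is 7 + 5?"]
first = ask(history)
history += [first, "Are you sure?"]
second = ask(history)
flipped = first != second
print(flipped)  # True for this mock, as Bernardi observed with ChatGPT
```

Running the same loop against a real model, over many questions, gives a simple measure of how often a challenge flips a correct answer.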

Read more: Are you sure? Testing ChatGPT’s confidence

LLMs Spread Myths

The researchers behind the TruthfulQA paper discovered that the accuracy, or “truthfulness,” of responses given by large language models (LLMs) tends to decline as the size of the model increases. The study used a dataset named “TruthfulQA,” which consists of 817 questions across various subjects. These questions were not chosen at random; they were adversarially selected, crafted so that humans might also answer them falsely because of common misconceptions, making them a stress test for imitated falsehoods.


Recommended Read: Small Language Models: The Unsung Heroes of AI

Hear It from an Expert

Upcoming Live Talks

Here’s a great talk happening soon by Wojtek Kuberski, Founder of NannyML. He’ll be covering why and how to monitor LLMs deployed to production.

The talk will focus on state-of-the-art solutions for detecting hallucinations.

Hallucination Free LLMs Webinar

What Will You Learn?

You will build an intuitive understanding of LLM monitoring methods, their strengths and weaknesses, and learn how to easily set up an LLM monitoring system.
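One simple monitoring signal, in the spirit of self-consistency checks like SelfCheckGPT, is to sample the same prompt several times and flag low agreement among the answers. The sampled answers below are invented for illustration:

```python
from collections import Counter

def consistency_score(samples):
    """Fraction of sampled answers agreeing with the majority answer.
    Low agreement is a cheap hallucination signal."""
    _, count = Counter(samples).most_common(1)[0]
    return count / len(samples)

# Hypothetical answers sampled from the same prompt at temperature > 0
samples = ["Paris", "Lyon", "Paris", "Marseille", "Paris"]
score = consistency_score(samples)   # 3 of 5 agree -> 0.6
flag = score < 0.7                   # flag low-consistency responses for review
print(score, flag)
```

The 0.7 threshold is arbitrary here; in production you would tune it against labeled examples of good and hallucinated responses.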

Professional Playtime

Pop Quiz! Answer the question in the comments at the end of the newsletter.

The 4 Vs Quiz - Data Science Quiz

AI News Wrap

Finally, let’s end the week with some interesting headlines in the Gen-AI-Verse.

  1. Vision Pro is Apple’s first major push into AI, which will help drive a 30% stock gain this year, Wedbush says. Read more
  2. Google unveils mobile diffusion: A leap forward in on-the-go text-to-image generation technology. Read more
  3. OpenAI revolutionizes ChatGPT conversations with the new feature, GPT mentions. Read more
  4. Prophetic releases Morpheus 1, the world’s first multi-modal generative transformer designed to induce and stabilize lucid dreams. Read more
  5. Google Maps employs generative AI to enhance local guides. Read more
