The artificial intelligence community has a new champion in Falcon 180B, an open-source large language model (LLM) boasting a staggering 180 billion parameters, trained on a colossal dataset. This powerhouse newcomer has outperformed previous open-source LLMs on various fronts.

Falcon AI, particularly Falcon LLM 40B, represents a significant achievement by the UAE’s Technology Innovation Institute (TII). The “40B” designation indicates that this Large Language Model boasts an impressive 40 billion parameters.

Notably, TII has also developed a 7 billion parameter model, trained on a staggering 1500 billion tokens. In contrast, the Falcon LLM 40B model is trained on a dataset containing 1 trillion tokens from RefinedWeb. What sets this LLM apart is its transparency and open-source nature.

Falcon operates as an autoregressive decoder-only model and underwent extensive training on the AWS Cloud, spanning two months and employing 384 GPUs. The pretraining data predominantly comprises publicly available data, with some contributions from research papers and social media conversations.

Significance of Falcon AI

The performance of Large Language Models is intrinsically linked to the data they are trained on, making data quality crucial. Falcon’s training data was meticulously crafted, featuring extracts from high-quality websites, sourced from the RefinedWeb Dataset. This data underwent rigorous filtering and de-duplication processes, supplemented by readily accessible data sources.

Falcon’s architecture is optimized for inference, enabling it to outshine state-of-the-art models such as those from Google, Anthropic, Deepmind, and LLaMa, as evidenced by its ranking on the OpenLLM Leaderboard.

Beyond its impressive capabilities, Falcon AI distinguishes itself by being open-source, allowing for unrestricted commercial use. Users have the flexibility to fine-tune Falcon with their data, creating bespoke applications harnessing the power of this Large Language Model. Falcon also offers Instruct versions, including Falcon-7B-Instruct and Falcon-40B-Instruct, pre-trained on conversational data. These versions facilitate the development of chat applications with ease.

Hugging Face Hub Release

Announced through a blog post by the Hugging Face AI community, Falcon 180B is now available on Hugging Face Hub.

This latest-model architecture builds upon the earlier Falcon series of open-source LLMs, incorporating innovations like multiquery attention to scale up to its massive 180 billion parameters, trained on a mind-boggling 3.5 trillion tokens.

Unprecedented Training Effort

Falcon 180B represents a remarkable achievement in the world of open-source models, featuring the longest single-epoch pretraining to date. This milestone was reached using 4,096 GPUs working simultaneously for approximately 7 million GPU hours, with Amazon SageMaker facilitating the training and refinement process.

Surpassing LLaMA 2 & Commercial Models

To put Falcon 180B’s size in perspective, its parameters are 2.5 times larger than Meta’s LLaMA 2 model, previously considered one of the most capable open-source LLMs. Falcon 180B not only surpasses LLaMA 2 but also outperforms other models in terms of scale and benchmark performance across a spectrum of natural language processing (NLP) tasks.

It achieves a remarkable 68.74 points on the open-access model leaderboard and comes close to matching commercial models like Google’s PaLM-2, particularly on evaluations like the HellaSwag benchmark.

Falcon AI: A Strong Benchmark Performance

Falcon 180B consistently matches or surpasses PaLM-2 Medium on widely used benchmarks, including HellaSwag, LAMBADA, WebQuestions, Winogrande, and more. Its performance is especially noteworthy as an open-source model, competing admirably with solutions developed by industry giants.

Comparison with ChatGPT

When comparing Falcon 180B directly with ChatGPT, the distinctions become clear. The free version of ChatGPT is powered by GPT-3.5, and while it handles everyday queries well, Falcon 180B often delivers more precise and contextually rich responses. This is because Falcon 180B is engineered to offer enhanced natural language understanding, making it a step up from the free service.

On the other hand, ChatGPT Plus runs on GPT-4—a model known for its sophisticated reasoning and nuanced conversational abilities. In many evaluation benchmarks, Falcon 180B typically falls between GPT-3.5 and GPT-4, meaning it outperforms the free version but doesn’t quite match the advanced capabilities of the paid service.

This positioning makes Falcon 180B an exciting alternative for those seeking improved performance over GPT-3.5 without the premium commitment required for GPT-4, adding valuable diversity to the AI landscape.

Falcon AI with LangChain

LangChain is a Python library designed to facilitate the creation of applications utilizing Large Language Models (LLMs). It offers a specialized pipeline known as HuggingFacePipeline, tailored for models hosted on HuggingFace. This means that integrating Falcon with LangChain is not only feasible but also practical.

Installing LangChain package

Begin by installing the LangChain package using the following command:

This command will fetch and install the latest LangChain package, making it accessible for your use.

Creating a Pipeline for Falcon Model

Next, let’s create a pipeline for the Falcon model. You can do this by importing the required components and configuring the model parameters:

Here, we’ve utilized the HuggingFacePipeline object, specifying the desired pipeline and model parameters. The ‘temperature’ parameter is set to 0, reducing the model’s inclination to generate imaginative or off-topic responses. The resulting object, named ‘llm,’ stores our Large Language Model configuration.

You might also like: 6 best ChatGPT plugins for data science

PromptTemplate and LLMChain

LangChain offers tools like PromptTemplate and LLMChain to enhance the responses generated by the Large Language Model. Let’s integrate these components into our code:

In this section, we define a template for the PromptTemplate, outlining how our LLM should respond, emphasizing humor in this case. The template includes a question placeholder labeled {query}. This template is then passed to the PromptTemplate method and stored in the ‘prompt’ variable.

To finalize our setup, we combine the Large Language Model and the Prompt using the LLMChain method, creating an integrated model configured to generate humorous responses.

Putting It Into Action

Now that our model is configured, we can use it to provide humorous answers to user questions. Here’s an example code snippet:

In this example, we presented the query “How to reach the moon?” to the model, which generated a humorous response. The Falcon-7B-Instruct model followed the prompt’s instructions and produced an appropriate and amusing answer to the query.

This demonstrates just one of the many possibilities that this new open-source model, Falcon AI, can offer.

A Promising Future

Falcon 180B’s release marks a significant leap forward in the advancement of large language models. Beyond its immense parameter count, it showcases advanced natural language capabilities from the outset.

With its availability on Hugging Face, the model is poised to receive further enhancements and contributions from the community, promising a bright future for open-source AI.

LLM - Online Courses

Reviews

Consulting

Community

Open-Source AI

Data Science Dojo Staff

Falcon 180B Language Model Overtakes Meta and Google