Level up your AI game: Dive deep into Large Language Models with us!

Learn More


Orchestration frameworks: Simplify LLM-Apps with LangChain and Llama Index
Ruhma Khawaja
| September 14, 2023

In the dynamic realm of language models and data-driven apps, efficient orchestration frameworks are key. Explore LangChain and Llama Index, simplifying LLM-app interactions.

Large language models (LLMs) are becoming increasingly popular for a variety of tasks, such as natural language understanding, question answering, and text generation. However, LLMs can be complex and difficult to use, which is where orchestration frameworks come in.

Orchestration frameworks provide a way to manage and control LLMs. They can help to simplify the development and deployment of LLM-based applications, and they can also help to improve the performance and reliability of these applications.

There are a number of orchestration frameworks available, two of the most popular being LangChain and Llama Index.

What is Orchestration Frameworks
What are Orchestration Frameworks?

LangChain and Orchestration Frameworks

LangChain is an open-source orchestration framework that is designed to be easy to use and scalable. It provides a number of features that make it well-suited for managing LLMs, such as:

  • A simple API that makes it easy to interact with LLMs
  • A distributed architecture that can scale to handle large numbers of LLMs
  • A variety of features for managing LLMs, such as load balancing, fault tolerance, and security

Llama Index is another open-source orchestration framework that is designed for managing LLMs. It provides a number of features that are similar to LangChain, such as:

  • A simple API
  • A distributed architecture
  • A variety of features for managing LLMs

However, Llama Index also has some unique features that make it well-suited for certain applications, such as:

  • The ability to query LLMs in a distributed manner
  • The ability to index LLMs so that they can be searched more efficiently

Both LangChain and Llama Index are powerful orchestration frameworks that can be used to manage LLMs. The best framework for a particular application will depend on the specific requirements of that application.

In addition to LangChain and Llama Index, there are a number of other orchestration frameworks available, such as Bard, Megatron, Megatron-Turing NLG and OpenAI Five. These frameworks offer a variety of features and capabilities, so it is important to choose the one that best meets the needs of your application.

LangChain and Orchestration Frameworks
LangChain and Orchestration Frameworks – Source: TheNewsStack

LlamaIndex and LangChain: Orchestrating LLMs


The venture capital firm Andreessen Horowitz (a16z) identifies both LlamaIndex and LangChain as orchestration frameworks that abstract away the complexities of prompt chaining, enabling seamless data querying and management between applications and LLMs. This orchestration process encompasses interactions with external APIs, retrieval of contextual data from vector databases, and maintaining memory across multiple LLM calls.

LlamaIndex: A data framework for the future

LlamaIndex distinguishes itself by offering a unique approach to combining custom data with LLMs, all without the need for fine-tuning or in-context learning. It defines itself as a “simple, flexible data framework for connecting custom data sources to large language models.” Moreover, it accommodates a wide range of data types, making it an inclusive solution for diverse data needs.

Continuous evolution: LlamaIndex 0.7.0

LlamaIndex is a dynamic and evolving framework. Its creator, Jerry Liu, recently released version 0.7.0, which focuses on enhancing modularity and customizability to facilitate the development of LLM applications that leverage your data effectively. This release underscores the commitment to providing developers with tools to architect data structures for LLM applications.

The LlamaIndex Ecosystem: LlamaHub

At the core of LlamaIndex lies LlamaHub, a data ingestion platform that plays a pivotal role in getting started with the framework. LlamaHub offers a library of data loaders and readers, making data ingestion a seamless process. Notably, LlamaHub is not exclusive to LlamaIndex; it can also be integrated with LangChain, expanding its utility.



Navigating the LlamaIndex workflow

Users of LlamaIndex typically follow a structured workflow:

  1. Parsing Documents into Nodes
  2. Constructing an Index (from Nodes or Documents)
  3. Optional Advanced Step: Building Indices on Top of Other Indices
  4. Querying the Index

The querying aspect involves interactions with an LLM, where a “query” serves as an input. While this process can be complex, it forms the foundation of LlamaIndex’s functionality.

In essence, LlamaIndex empowers users to feed pertinent information into an LLM prompt selectively. Instead of overwhelming the LLM with all custom data, LlamaIndex allows users to extract relevant information for each query, streamlining the process.


Large language model bootcamp

Power of LlamaIndex and LangChain

LlamaIndex seamlessly integrates with LangChain, offering users flexibility in data retrieval and query management. It extends the functionality of data loaders by treating them as LangChain Tools and providing Tool abstractions to use LlamaIndex’s query engine alongside a LangChain agent.

Real-world applications: Context-augmented chatbots

LlamaIndex and LangChain join forces to create context-rich chatbots. Learn how these frameworks can be leveraged to build chatbots that provide enhanced contextual responses.

This comprehensive exploration unveils the potential of LlamaIndex, offering insights into its evolution, features, and practical applications.

Why are orchestration frameworks needed?

Data orchestration frameworks are essential for building applications on enterprise data because they help to:

  • Eliminate the need for foundation model retraining: Foundation models are large language models that are trained on massive datasets of text and code. They can be used to perform a variety of tasks, such as generating text, translating languages, and answering questions. However, foundation models can be expensive to train and retrain. Orchestration frameworks can help to reduce the need for retraining by allowing you to reuse trained models across multiple applications.


  • Overcome token limits: Foundation models often have token limits, which restrict the number of words or tokens that can be processed in a single request. Orchestration frameworks can help to overcome token limits by breaking down large tasks into smaller subtasks that can be processed separately.

  • Provide connectors for data sources: Orchestration frameworks typically provide connectors for a variety of data sources, such as databases, cloud storage, and APIs. This makes it easy to connect your data pipeline to the data sources that you need.

  • Reduce boilerplate code: Orchestration frameworks can help to reduce boilerplate code by providing a variety of pre-built components for common tasks, such as data extraction, transformation, and loading. This allows you to focus on the business logic of your application.

Popular orchestration frameworks

There are a number of popular orchestration frameworks available, including:

  • Prefect is an open-source orchestration framework that is written in Python. It is known for its ease of use and flexibility.

  • Airflow is an open-source orchestration framework that is written in Python. It is widely used in the enterprise and is known for its scalability and reliability.

  • Luigi is an open-source orchestration framework that is written in Python. It is known for its simplicity and performance.

  • Dagster is an open-source orchestration framework that is written in Python. It is known for its extensibility and modularity.


Read more –> FraudGPT: Evolution of ChatGPT into an AI weapon for cybercriminals in 2023


Choosing the right orchestration framework

When choosing an orchestration framework, there are a number of factors to consider, such as:

  1. Ease of use: The framework should be easy to use and learn, even for users with no prior experience with orchestration.
  2. Flexibility: The framework should be flexible enough to support a wide range of data pipelines and workflows.
  3. Scalability: The framework should be able to scale to meet the needs of your organization, even as your data volumes and processing requirements grow.
  4. Reliability: The framework should be reliable and stable, with minimal downtime.
  5. Community support: The framework should have a large and active community of users and contributors.


Orchestration frameworks are essential for building applications on enterprise data. They can help to eliminate the need for foundation model retraining, overcome token limits, connect to data sources, and reduce boilerplate code. When choosing an orchestration framework, consider factors such as ease of use, flexibility, scalability, reliability, and community support.

Learn to build LLM applications                                          

Unleash LlamaIndex: The key to uncovering deeper insights in text exploration
Muhammad Jan
| July 10, 2023

Before we understand LlamaIndex, let’s step back a bit. Imagine a futuristic landscape where machines possess an extraordinary ability to understand and produce human-like text effortlessly. LLMs have made this vision a reality. Armed with a vast ocean of training data, these marvels of innovation have become the crown jewels of the tech world.

There is no denying that LLMs (Large Language Models) are currently the talk of the town! From revolutionizing text generation and reasoning, LLMs are trained on massive datasets and have been making waves in the tech vicinity.

One particular LLM has emerged as a true superstar. Back in November 2022, ChatGPT, an LLM developed by OpenAI, attracted a staggering one million users within 5 days of its beta launch.

Source: Chart: ChatGPT Sprints to One Million Users | Statista  

When researchers and developers saw these stats they started thinking on how we can best feed/augment these LLMs with our own private data. They started thinking about different solutions.

Finetune your own LLM. You adapt an existing LLM by training your data. But, this is very costly and time-consuming.

Combining all the documents into a single large prompt for an LLM might be possible now with the increased token limit of 100k for models. However, this approach could result in slower processing times and higher computational costs.

Instead of inputting all the data, selectively provide relevant information to the LLM prompt. Choose the useful bits for each query instead of including everything.

Option 3 appears to be both relevant and feasible, but it requires the development of a specialized toolkit. Recognizing this need, efforts have already begun to create the necessary tools.

Introducing LlamaIndex

Recently a toolkit was launched for building applications using LLM, known as Langchain. LlamaIndex is built on top of Langchain to provide a central interface to connect your LLMs with external data.

Key Components of LlamaIndex:

The key components of LlamaIndex are as follows

  • Data Connectors: The data connector, known as the Reader, collects data from various sources and formats, converting it into a straightforward document format with textual content and basic metadata.
  • Data Index: It is a data structure facilitating efficient retrieval of pertinent information in response to user queries. At a broad level, Indices are constructed using Documents and serve as the foundation for Query Engines and Chat Engines, enabling seamless interactions and question-and-answer capabilities based on the underlying data. Internally, Indices store data within Node objects, which represent segments of the original documents.
  • Retrievers: Retrievers play a crucial role in obtaining the most pertinent information based on user queries or chat messages. They can be constructed based on Indices or as standalone components and serve as a fundamental element in Query Engines and Chat Engines for retrieving contextually relevant data.
  • Query Engines: A query engine is a versatile interface that enables users to pose questions regarding their data. By accepting natural language queries, the query engine provides comprehensive and informative responses.
  • Chat Engines: A chat engine serves as an advanced interface for engaging in interactive conversations with your data, allowing for multiple exchanges instead of a single question-and-answer format. Similar to ChatGPT but enhanced with access to a knowledge base, the chat engine maintains a contextual understanding by retaining the conversation history and can provide answers that consider the relevant past context.

Difference between query engine and chat engine:

It is important to note that there is a significant distinction between a query engine and a chat engine. Although they may appear similar at first glance, they serve different purposes:

A query engine operates as an independent system that handles individual questions over the data without maintaining a record of the conversation history.

On the other hand, a chat engine is designed to keep track of the entire conversation history, allowing users to query both the data and previous responses. This functionality resembles ChatGPT, where the chat engine leverages the context of past exchanges to provide more comprehensive and contextually relevant answers

  • Customization: LlamaIndex offers customization options where you can modify the default settings, such as the utilization of OpenAI’s text-davinci-003 model. Users have the flexibility to customize the underlying language model (LLM) and other settings used in LlamaIndex, with support for various integrations and LangChain’s LLM modules.
  • Analysis: LlamaIndex offers a diverse range of analysis tools for examining indices and queries. These tools include features for analyzing token usage and associated costs. Additionally, LlamaIndex provides a Playground module, which presents a visual interface for analyzing token usage across different index structures and evaluating performance metrics.
  • Structured Outputs: LlamaIndex offers an assortment of modules that empower language models (LLMs) to generate structured outputs. These modules are available at various levels of abstraction, providing flexibility and versatility in producing organized and formatted results.
  • Evaluation: LlamaIndex provides essential modules for assessing the quality of both document retrieval and response synthesis. These modules enable the evaluation of “hallucination,” which refers to situations where the generated response does not align with the retrieved sources. A hallucination occurs when the model generates an answer without effectively grounding it in the given contextual information from the prompt.
  • Integrations: LlamaIndex offers a wide array of integrations with various toolsets and storage providers. These integrations encompass features such as utilizing vector stores, integrating with ChatGPT plugins, compatibility with Langchain, and the capability to trace with Graphsignal. These integrations enhance the functionality and versatility of LlamaIndex by allowing seamless interaction with different tools and platforms.
  • Callbacks: LlamaIndex offers a callback feature that assists in debugging, tracking, and tracing the internal operations of the library. The callback manager allows for the addition of multiple callbacks as required. These callbacks not only log event-related data but also track the duration and frequency of each event occurrence. Moreover, a trace map of events is recorded, providing valuable information that callbacks can utilize in a manner that best suits their specific needs.
  • Storage: LlamaIndex offers a user-friendly interface that simplifies the process of ingesting, indexing, and querying external data. By abstracting away complexities, LlamaIndex allows users to query their data with just a few lines of code. Behind the scenes, LlamaIndex provides the flexibility to customize storage components for different purposes. This includes document stores for storing ingested documents (represented as Node objects), index stores for storing index metadata, and vector stores for storing embedding vectors.The document and index stores utilize a shared key-value store abstraction, providing a common framework for efficient storage and retrieval of data

Now that we have explored the key components of LlamaIndex, let’s delve into its operational mechanisms and understand how it functions.

How Llama-Index Works:

To begin, the first step is to import the documents into LlamaIndex, which provides various pre-existing readers for sources like databases, Discord, Slack, Google Sheets, Notion, and the one we will utilize today, the Simple Directory Reader, among others.[Text Wrapping Break][Text Wrapping Break]You can check for more here: Llama Hub (llama-hub-ui.vercel.app)

Once the documents are loaded, LlamaIndex proceeds to parse them into nodes, which are essentially segments of text. Subsequently, an index is constructed to enable quick retrieval of relevant data when querying the documents. The index can be stored in different formats, but we will opt for a Vector Store as it is typically the most useful when querying text documents without specific limitations.

LlamaIndex is built upon LangChain, which serves as the foundational framework for a wide range of LLM applications. While LangChain provides the fundamental building blocks, LlamaIndex is specifically designed to streamline the workflow described above.

Here is an example code showcasing the utilization of the SimpleDirectoryReader data loader in LlamaIndex, along with the integration of the OpenAI language model for natural language processing.

Installing the necessary libraries required to run the code.

Importing openai library and setting the secret API (Application Programming Interface) key.

Importing the SimpleDirectoryReader class from llama_index library and loading the data from it.

Importing SimpleNodeParser class from llama_index and parsing the documents into nodes – basically in chunks of text.

Importing VectorStoreIndex class from llama_index to create index from the chunks of text so that each time when a query is placed only relevant data is sent to OpenAI. In short, for the sake of cost effectiveness.


LlamaIndex, built on top of Langchain, offers a powerful toolkit for integrating external data with LLMs. By parsing documents into nodes, constructing an efficient index, and selectively querying relevant information, LlamaIndex enables cost-effective exploration of text data.

The provided code example demonstrates the utilization of LlamaIndex’s data loader and query engine, showcasing its potential for next-generation text exploration. For the notebook of the above code, refer to the source code available here.

Unleashing the power of LangChain: A comprehensive guide to building custom Q&A chatbots 
Syed Hyder Ali Zaidi
| May 22, 2023

The NLP landscape has been revolutionized by the advent of large language models (LLMs) like GPT-3 and GPT-4. These models have laid a strong foundation for creating powerful, scalable applications. However, the potential of these models is greatly influenced by the quality of the prompt, highlighting the importance of prompt engineering.

Furthermore, real-world NLP applications often require more complexity than a single ChatGPT session can provide. This is where LangChain comes into play! 


Get more information on Large Language models and its applications and tools by clicking below:

Large language model bootcamp


Harrison Chase’s brainchild, LangChain, is a Python library designed to help you leverage the power of LLMs to build custom NLP applications. As of May 2023, this game-changing library has already garnered almost 40,000 stars on GitHub. 



Interested in learning about Large Language Models and building custom ChatGPT like applications for your business? Click below

Learn More                  


This comprehensive beginner’s guide provides a thorough introduction to LangChain, offering a detailed exploration of its core features. It walks you through the process of building a basic application using LangChain and shares valuable tips and industry best practices to make the most of this powerful framework. Whether you’re new to Language Learning Models (LLMs) or looking for a more efficient way to develop language generation applications, this guide serves as a valuable resource to help you leverage the capabilities of LLMs with LangChain. 

Overview of LangChain Modules 

These modules serve as fundamental abstractions that form the foundation of any application powered by the Language Model (LLM). LangChain offers standardized and adaptable interfaces for each module. Additionally, LangChain provides external integrations and even ready-made implementations for seamless usage. Let’s delve deeper into these modules. 

Overview of LangChain Modules
Overview of LangChain Modules


LLM is the fundamental component of LangChain. It is essentially a wrapper around a large language model that helps use the functionality and capability of a specific large language model. 


As stated earlier, LLM (Language Model) serves as the fundamental unit within LangChain. However, in line with the “LangChain” concept, it offers the ability to link together multiple LLM calls to address specific objectives. 

For instance, you may have a need to retrieve data from a specific URL, summarize the retrieved text, and utilize the resulting summary to answer questions. 

On the other hand, chains can also be simpler in nature. For instance, you might want to gather user input, construct a prompt using that input, and generate a response based on the constructed prompt. 


Prompts have become a popular modeling approach in programming. It simplifies prompt creation and management with specialized classes and functions, including the essential PromptTemplate. 

Document Loaders and Utils: 

LangChain’s Document Loaders and Utils modules simplify data access and computation. Document loaders convert diverse data sources into text for processing, while the utils module offers interactive system sessions and code snippets for mathematical computations. 


The widely used index type involves generating numerical embeddings for each document using an Embedding Model. These embeddings, along with the associated documents, are stored in a vectorstore. This vectorstore enables efficient retrieval of relevant documents based on their embeddings. 


LangChain offers a flexible approach for tasks where the sequence of language model calls is not deterministic. Its “Agents” can act based on user input and previous responses. The library also integrates with vector databases and has memory capabilities to retain the state between calls, enabling more advanced interactions. 

Building Our App 

Now that we’ve gained an understanding of LangChain, let’s build a PDF Q/A Bot app using LangChain and OpenAI. Let me first show you the architecture diagram for our app and then we will start with our app creation. 

QA Chatbot Architecture
QA Chatbot Architecture


Below is an example code that demonstrates the architecture of a PDF Q&A chatbot powered by the new technology. This code utilizes the OpenAI language model for natural language processing, FAISS database for efficient similarity search, PyPDF2 for reading PDF files, and Streamlit for creating a web application interface. The chatbot leverages LangChain’s Conversational Retrieval Chain to find the most relevant answer from a document based on the user’s question. This integrated setup enables an interactive and accurate question-answering experience for the users. 

Get expert advice on how generative AI can improve your marketing, sales, or operations. Schedule a call now!

Importing necessary libraries 

Import Statements: These lines import the necessary libraries and functions required to run the application. 

  • PyPDF2: Python library used to read and manipulate PDF files. 
  • langchain: a framework for developing applications powered by language models. 
  • streamlit: A Python library used to create web applications quickly. 
Importing necessary libraries
Importing necessary libraries

If the LangChain and OpenAI are not installed already, you first need to run the following commands in the terminal. 

Install LangChain

Setting OpenAI API Key 

You will replace the placeholder with your OpenAI API key which you can access from OpenAI API. The above line sets the OpenAI API key, which you need to use OpenAI’s language models. 

Setting OpenAI API Key

Streamlit UI 

These lines of code create the web interface using Streamlit. The user is prompted to upload a PDF file.

Streamlit UI
Streamlit UI

Reading the PDF File 

If a file has been uploaded, this block reads the PDF file, extracts the text from each page, and concatenates it into a single string. 

Reading the PDF File
Reading the PDF File

Text Splitting 

Language Models are often limited by the amount of text that you can pass to them. Therefore, it is necessary to split them up into smaller chunks. It provides several utilities for doing so. 

Text Splitting 
Text Splitting

Using a Text Splitter can also help improve the results from vector store searches, as eg. smaller chunks may sometimes be more likely to match a query. Here we are splitting the text into 1k tokens with 200 tokens overlap. 


Here, the OpenAIEmbeddings function is used to download embeddings, which are vector representations of the text data. These embeddings are then used with FAISS to create an efficient search index from the chunks of text.  


Creating Conversational Retrieval Chain 

The chains developed are modular components that can be easily reused and connected. They consist of predefined sequences of actions encapsulated in a single line of code. With these chains, there’s no need to explicitly call the GPT model or define prompt properties. This specific chain allows you to engage in conversation while referencing documents and retains a history of interactions. 

Creating Conversational Retrieval Chain
Creating Conversational Retrieval Chain

Streamlit for Generating Responses and Displaying in the App 

This block prepares a response that includes the generated answer and the source documents and displays it on the web interface. 

Streamlit for Generating Responses and Displaying in the App
Streamlit for Generating Responses and Displaying in the App

Let’s Run Our App 

QA Chatbot
QA Chatbot

Here we uploaded a PDF, asked a question, and got our required answer with the source document. See, that is how the magic of LangChain works.  

You can find the code for this app on my GitHub repository LangChain-Custom-PDF-Chatbot.

Wrapping Up 

Concluding the journey! Mastering LangChain for creating a basic Q&A application has been a success. I trust you have acquired a fundamental comprehension of LangChain’s potential. Now, take the initiative to delve into LangChain further and construct even more captivating applications. Enjoy the coding adventure.

Want to see how generative AI can help your business grow? Get a free consultation today and give your business the edge it needs to succeed.

Related Topics

Machine Learning
Generative AI
Data Visualization
Data Security
Data Science
Data Engineering
Data Analytics
Computer Vision
Artificial Intelligence
DSD icon

Discover more from Data Science Dojo

Subscribe to get the latest updates on AI, Data Science, LLMs, and Machine Learning.

Up for a Weekly Dose of Data Science?

Subscribe to our weekly newsletter & stay up-to-date with current data science news, blogs, and resources.