For a hands-on learning experience building agentic AI applications, join our Agentic AI Bootcamp today and take advantage of the Early Bird Discount.

Large Language Models Bootcamp

In just one week, we will teach you how to build agentic AI applications. Learn the entire LLM application stack.

Technologies and Tools

  • 4.95 Switchup Rating
  • 12,000+ Alumni
  • 2,500+ Companies Trained
  • 1M+ Community Members

Who is this bootcamp for?

The Large Language Models Bootcamp is tailored for a diverse audience, including: 

Data professionals

Looking to enhance their skills with generative AI tools. Learn how to integrate LLMs into your data workflows and boost efficiency with real-world AI applications.

Product leaders

At enterprises or startups aiming to leverage LLMs to improve products, processes, or services. Discover how LLMs can drive innovation and streamline decision-making across your product lifecycle.

Beginners

Seeking a head start in understanding and working with LLMs. Build a solid foundation in generative AI with guided, beginner-friendly lessons and hands-on exercises.

Instructors and guest speakers

Learn from thought leaders at the forefront of building LLM applications


Zain Hasan

Senior DevRel Engineer, Together AI

Sage Elliot

AI Engineer, Union AI

Kartik Talamadupula

Head of AI, Wand AI

Luis Serrano

Founder, Serrano Academy

Raja Iqbal

Founder, Ejento AI

Hamza Farooq

Founder, Traversaal AI

Adam Cowley

Developer Advocate, Neo4j

Rehan Jalil

Co-Founder | CEO, Securiti AI

Jerry Liu

CEO/Co-founder, LlamaIndex

Sophie Daly

Staff Data Scientist, Stripe

Suman Debnath

Technical Lead, ML, Anyscale

Curriculum

Explore the bootcamp curriculum

Overview of the topics and practical exercises.

Understanding the LLM Ecosystem

Key Topics:

  • Fundamentals of the LLM Landscape: From visual perception and natural language understanding to tokenization, embeddings, and how large language models generate text; the canonical end-to-end architecture connecting every layer.
  • Vector Databases and Similarity Search: Why vector databases exist, how embeddings represent meaning as geometry, and how similarity search powers retrieval at scale.
  • Prompt Engineering Best Practices: Structuring prompts that produce consistent, grounded outputs — and the failure modes that undermine them.
  • Building Custom LLM Applications: Three distinct paths — training from scratch, fine-tuning foundation models, and in-context learning — with clear tradeoffs between each.
  • Retrieval-Augmented Generation (RAG): How the RAG pipeline connects retrieval to generation, where it breaks down, and what a production-ready implementation actually requires.
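The retrieval-to-generation hand-off at the heart of RAG can be sketched in a few lines. The bag-of-words "embedding" and two-document corpus below are stand-ins for a real embedding model and vector store, chosen only so the example is self-contained:

```python
import math

# Toy corpus: in practice, documents would be chunked and embedded by a
# real embedding model; word-count vectors are used purely for illustration.
CORPUS = {
    "refunds": "Refunds are issued within five business days of a request.",
    "shipping": "Orders ship within two days and arrive within a week.",
}

def embed(text: str) -> dict:
    """Stand-in 'embedding': a word-count vector (not a real model)."""
    vec = {}
    for word in text.lower().split():
        word = word.strip(".,?")
        vec[word] = vec.get(word, 0) + 1
    return vec

def cosine(a: dict, b: dict) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[w] * b.get(w, 0) for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str) -> str:
    """Return the corpus passage most similar to the query."""
    q = embed(query)
    return max(CORPUS.values(), key=lambda doc: cosine(q, embed(doc)))

def build_prompt(query: str) -> str:
    """'Augment' generation by grounding the prompt in retrieved context."""
    context = retrieve(query)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

A production pipeline replaces each piece (embedding model, vector index, prompt template) but keeps exactly this shape: embed, retrieve, then ground the generation step.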

Challenges in Enterprise Adoption

Key Topics:

  • Adoption Reality: Why most LLM projects stall between proof-of-concept and production; the competing pressures of cost, accuracy, latency, and infrastructure constraints that make scaling harder than building.
  • Technology Challenges: Context-window limits, dataset quality and PII handling, the real cost of fine-tuning (including PEFT approaches), and how inference costs compound with RAG and token volume.
  • Business and Risk Challenges: Compliance exposure, legal risk, misaligned customer expectations, and the organizational culture gaps that prevent meaningful KPI alignment.
  • Human Behavior Challenges: Fear of AI, resistance to change, biased prompting, and the subjectivity in feedback that makes evaluation harder than it looks.
  • Prompting Challenges: Prompt sensitivity, fatigue, overengineering, jailbreaking and injection risks, hallucinations from lack of grounding, and how ambiguity creates silent failures.
  • Best Practices: Guardrails and defensive UX design, LLM caching for cost control, structured feedback collection, and evaluation frameworks built around fairness and explainability.
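The LLM-caching idea in the best practices above reduces to a simple pattern: identical prompts should never pay for a second model call. This is a minimal exact-match sketch; production caches typically add TTLs, size limits, and semantic (embedding-based) matching:

```python
import hashlib

class LLMCache:
    """Exact-match prompt cache: identical prompts skip the model call."""

    def __init__(self, llm_fn):
        self.llm_fn = llm_fn  # the underlying (expensive) model call
        self.store = {}
        self.hits = 0

    def _key(self, prompt: str) -> str:
        return hashlib.sha256(prompt.encode()).hexdigest()

    def complete(self, prompt: str) -> str:
        key = self._key(prompt)
        if key in self.store:
            self.hits += 1          # served from cache, no cost incurred
            return self.store[key]
        result = self.llm_fn(prompt)  # only paid for on a cache miss
        self.store[key] = result
        return result
```

With repeated FAQ-style traffic, hit rates on a cache like this directly translate into inference cost savings.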

Challenges in RAG Applications

Key Topics:

  • Privacy, Security, and Compliance: PII-safe ingestion through anonymization and differential privacy; authenticated access and user filtering at the retrieval layer; logging and data retention for compliance.
  • Multimodal and Multilingual RAG: Vision models (CLIP, BLIP-2, Gemini), audio pipelines (Whisper, audio embeddings, noise-tolerant search), and cross-lingual retrieval with LaBSE, mBERT, and XLM-R.
  • Advanced Architectures: DSPy for prompt compilation and retrieval chain optimization; KG-RAG for graph-grounded, traceable answers; self-improving retrieval through feedback loops, ranking callbacks, and adaptive scoring.
  • Retrieval Layer and Index Optimization: FAISS, Qdrant, Weaviate, and Pinecone compared; flat, HNSW, IVF, and PQ index types; improving search quality with hybrid BM25 + dense retrieval and re-rankers like BGE and ColBERT.
  • Evaluation and Metrics: Retrieval metrics (Precision@K, Recall@K, Hit Rate@K), generation metrics (BLEU, ROUGE, LLM-as-a-Judge), and tooling with TruLens, Phoenix, and EvalGen.
  • Latency, Cost, and Scalability: Identifying bottlenecks across retrieval, re-ranking, and generation; caching strategies at the embedding, prompt, and disk layers; cost controls through batch queries, static prompts, and quantized models.
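Hybrid retrieval needs a way to merge a lexical (BM25) ranking with a dense-vector ranking before re-ranking. Reciprocal rank fusion is one common recipe; the k=60 constant is a conventional default and the toy rankings are invented for illustration:

```python
def rrf(rankings, k=60):
    """Reciprocal rank fusion: score each doc by the sum of 1/(k + rank)
    over every ranking it appears in, then sort by fused score."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# A doc ranked well by both retrievers outranks one ranked first by only one.
bm25  = ["d1", "d2", "d3"]   # lexical ranking
dense = ["d2", "d3", "d1"]   # vector-similarity ranking
fused = rrf([bm25, dense])
```

Because RRF uses only ranks, not raw scores, it sidesteps the problem of BM25 and cosine scores living on incompatible scales.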

Transformer Architecture and Attention Mechanisms

Key Topics:

  • Introduction to LLMs: Strengths and weaknesses of large language models; discriminative versus generative AI; how predictive and generative models differ in what they’re actually doing.
  • Transformer Architecture: Tokenization, embeddings, positional encoding, and the attention mechanism that holds it all together — explained from the ground up.
  • Embeddings and Similarity: How words become vectors, what proximity in that space actually means, and why it matters for retrieval and reasoning.
  • Attention Mechanism: Keys, queries, and values in self-attention; how the model decides what to focus on when generating each token.
  • Softmax and Probabilities: How raw attention scores become a probability distribution, and what that means for next-token prediction.
  • Training and Fine-Tuning: Adapting pre-trained models with curated data — what changes, what doesn’t, and where overfitting quietly creeps in.
  • Search and Retrieval: Building a semantic search engine with embeddings; connecting retrieval to generation for grounded, factual answers.
  • Hands-On Exercises: Sentence Transformers, semantic search, attention scoring, and implementing attention mechanisms directly.
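The softmax and attention bullets above fit together in one small function. This is a bare scaled dot-product attention on nested lists, with the batch, multi-head, and masking machinery of a real transformer deliberately omitted:

```python
import math

def softmax(xs):
    """Turn raw scores into a probability distribution."""
    m = max(xs)                      # subtract max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention: each query attends over all keys,
    and the output is the attention-weighted mix of the values."""
    d_k = len(K[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in K]
        weights = softmax(scores)    # each row of weights sums to 1
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

# The query aligned with the first key draws most of its output from
# the first value vector.
out = attention([[1.0, 0.0]],
                [[1.0, 0.0], [0.0, 1.0]],
                [[1.0, 0.0], [0.0, 1.0]])
```

The `1/sqrt(d_k)` scaling keeps dot products from saturating the softmax as key dimensionality grows, which is exactly the detail the module unpacks.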

Vector Databases

Key Topics:

  • Overview and Rationale: Why vector databases exist, how they differ from traditional search, and the role they play in grounding LLM applications.
  • Search Types: Vector search, text search, and hybrid search — when to use each and what you lose by picking the wrong one.
  • Indexing Techniques: Product Quantization (PQ), Locality Sensitive Hashing (LSH), and Hierarchical Navigable Small World (HNSW) — the tradeoffs between speed, memory, and recall.
  • Retrieval Techniques: Cosine similarity, nearest neighbor search, and how retrieval quality degrades at scale without the right index design.
  • Advanced RAG Techniques: Chunking and filtering strategies, query rewriting, hybrid search, auto-cut, and re-ranking for precision at the top of the result set.
  • Embedding and Model Selection: Domain-specific embeddings, fine-tuned retrieval models, and compression approaches including scalar quantization, product quantization, and binary and matryoshka methods.
  • Adaptive Retrieval and Multi-Tenancy: Multi-phase rescoring, preserving recall under compression, tenant isolation, and resource allocation patterns for shared environments.
  • Production Challenges: Scaling, reliability, and cost optimization — the problems that don’t surface until you’re running real query volumes.
  • Hands-On Exercises: Vector search, similarity search, hybrid search, generative search, Weaviate Query Agent, multi-tenancy, vector compression, and semantic caching.
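The compression tradeoff in the bullets above is easiest to see with scalar quantization: mapping float32 components to 8-bit integers shrinks vector memory roughly 4x while approximately preserving geometry. The value range and example vector below are invented for illustration:

```python
def quantize(vec, lo=-1.0, hi=1.0, bits=8):
    """Scalar quantization: map each float in [lo, hi] to an integer
    bucket in [0, 2^bits - 1]."""
    levels = (1 << bits) - 1
    return [round((min(max(x, lo), hi) - lo) / (hi - lo) * levels)
            for x in vec]

def dequantize(codes, lo=-1.0, hi=1.0, bits=8):
    """Approximate reconstruction; the error is the price of compression."""
    levels = (1 << bits) - 1
    return [lo + c / levels * (hi - lo) for c in codes]

# Round-tripping loses at most half a bucket width per component,
# which is why multi-phase rescoring on full-precision vectors can
# recover recall after a compressed first pass.
original = [0.5, -0.25, 1.0]
restored = dequantize(quantize(original))
```

Product quantization, binary quantization, and matryoshka embeddings push the same memory/recall tradeoff further along different axes.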

Mastering Langchain

Key Topics:

  • Introduction to LangChain: What LangChain is for, what it abstracts away, and the class of RAG challenges it was built to address.
  • Core Components: LLMs and chat models, prompt templates, example selectors, document loaders, and transformers as the building blocks of LLM-powered applications.
  • Output Parsers: Structured data extraction, consistent formatting across model responses, and error handling when outputs don’t conform.
  • Retrieval and Vector Stores: Embedding, vectorization, metadata filtering, parent document retrieval, and efficient similarity search optimized for large datasets.
  • Chains: Sequential prompt logic with pre- and post-LLM steps; integrating tools and retrieval into coherent, composable workflows.
  • Tool Use and Memory: Connecting APIs and external actions, passing results back into the workflow, managing conversation history, storing state between calls, and maintaining context persistence for agents.
  • Callbacks and Observability: Event hooks during runs, monitoring and logging, and custom actions on success or failure.
  • LCEL (LangChain Expression Language): Piping components with runnables, parallel branches, and modular workflow composition.
  • LangGraph and Agents: Graph-based workflows for complex agent orchestration, dynamic control flows, decision-making steps, and combining tools with context for non-linear reasoning.
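LCEL's central idea, composing steps with the `|` operator, can be imitated in a few lines of plain Python. This sketch is framework-free: it mimics the runnable-pipe pattern rather than reproducing LangChain's actual classes, and the three stages are toy stand-ins for a prompt template, model, and output parser:

```python
class Runnable:
    """Minimal pipe-composable step, in the spirit of LCEL runnables."""

    def __init__(self, fn):
        self.fn = fn

    def invoke(self, x):
        return self.fn(x)

    def __or__(self, other):
        # a | b returns a new Runnable that runs a, then feeds b
        return Runnable(lambda x: other.invoke(self.invoke(x)))

# Three stages standing in for prompt template, model, and output parser.
prompt = Runnable(lambda q: f"Q: {q}\nA:")
model  = Runnable(lambda p: p + " 42")              # fake LLM completion
parser = Runnable(lambda text: text.split("A:")[-1].strip())

chain = prompt | model | parser
```

One call, `chain.invoke("meaning of life")`, runs the whole pipeline; the real LCEL adds batching, streaming, and parallel branches on top of exactly this composition idea.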

Fine-Tuning LLMs

Key Topics:

  • Core Concepts: Transfer learning and why fine-tuning works; full fine-tuning versus LoRA and QLoRA; parameter-efficient tuning and quantization as practical paths to adapting large models without prohibitive compute.
  • Key Considerations: Data quality and relevance as the primary lever for fine-tuning outcomes; overfitting risks and the limitations fine-tuning cannot resolve; when fine-tuning is the right choice versus RAG.
  • Hands-On Exercises: Instruction fine-tuning, deploying, and evaluating a LLaMA2-7B 4-bit quantized model in class; fine-tuning and deploying OpenAI and Llama models on Azure AI Studio as a take-home project.
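The LoRA idea above reduces to a frozen weight matrix plus a trainable low-rank update: W' = W + (alpha/r) · B · A. A toy numeric version, with matrix sizes and scaling chosen purely for illustration:

```python
def matmul(A, B):
    """Plain-list matrix multiply."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def lora_update(W, A, B, alpha=1.0):
    """W stays frozen; only the low-rank factors A (r x d_in) and
    B (d_out x r) are trained. Effective weight: W + (alpha/r) * B @ A."""
    r = len(A)
    BA = matmul(B, A)
    scale = alpha / r
    return [[w + scale * d for w, d in zip(wr, dr)]
            for wr, dr in zip(W, BA)]

# 2x2 frozen weight, rank-1 update: 4 trainable numbers in A and B
# versus 4 in W here, but at model scale r << d makes the factors
# a tiny fraction of the full matrix.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 1.0]]        # r=1, d_in=2
B = [[0.5], [0.5]]      # d_out=2, r=1
W_eff = lora_update(W, A, B)
```

QLoRA applies the same low-rank update on top of a quantized (e.g. 4-bit) frozen base, which is what makes in-class fine-tuning of a 7B model feasible.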

Evaluation of LLMs

Key Topics:

  • Need for Evaluation: Why reliability, accuracy, and safety can’t be assumed — and how business alignment, ethical accountability, and user trust depend on structured measurement.
  • Challenges in Evaluation: Hallucinations, prompt sensitivity, and weak context handling; the difficulty of evaluating outputs where multiple valid answers exist; navigating tradeoffs between accuracy, fluency, and creativity.
  • Benchmarking Approaches: MMLU for multitask accuracy; HELM for holistic metrics across accuracy, robustness, and fairness; BBH and HotpotQA for reasoning and multi-hop question answering.
  • Text Quality Metrics: BLEU for n-gram precision, ROUGE for recall-based evaluation, BERTScore for semantic similarity, METEOR for synonym and stem alignment, and perplexity as a measure of prediction confidence.
  • RAG-Specific Evaluation (RAGAs): Faithfulness, answer relevance, context precision, and context recall — metrics that score retrieval and generation jointly rather than in isolation.
  • Open-Ended Output Evaluation (G-Eval): Fluency, faithfulness, answer relevance, and claim-level scoring for outputs that don’t have a single correct answer.
  • Additional Benchmarks and Metrics: GLUE for NLU tasks, TriviaQA for open-domain question answering, RealToxicityPrompts for safety evaluation, MRR and MAP for ranking performance, and ROSCOE for reasoning quality across semantic alignment, logical integrity, and commonsense coverage.
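The ranking and retrieval metrics above are short enough to implement directly. Precision@K, Recall@K, and MRR in a few lines, with the document IDs and relevance sets invented for the example:

```python
def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved items that are relevant."""
    top = retrieved[:k]
    return sum(1 for doc in top if doc in relevant) / k

def recall_at_k(retrieved, relevant, k):
    """Fraction of all relevant items that appear in the top k."""
    top = retrieved[:k]
    return sum(1 for doc in top if doc in relevant) / len(relevant)

def mrr(retrieved_per_query, relevant_per_query):
    """Mean reciprocal rank of the first relevant hit for each query."""
    total = 0.0
    for retrieved, relevant in zip(retrieved_per_query, relevant_per_query):
        for rank, doc in enumerate(retrieved, start=1):
            if doc in relevant:
                total += 1.0 / rank
                break
    return total / len(retrieved_per_query)
```

Generation-side metrics like BLEU, ROUGE, and LLM-as-a-Judge then score what the model writes; these retrieval metrics score whether it was shown the right evidence in the first place.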

Model Context Protocol (MCP)

Key Topics:

  • Origins & Motivation: Addressing fragmented integrations and brittle bespoke adapters; introducing a unified, interoperable interface — the “USB-C for AI.”
  • Protocol Structure: Client–server handshake model; defining resources, tools, and prompts; JSON-RPC transport with structured, schema-driven messages.
  • Context Exposure: How MCP surfaces tools, data, and metadata through a consistent schema to enable discoverability, governance, and controlled access.
  • Agentic Integration: Connecting MCP endpoints to reflection, planning, tool-use, and multi-agent coordination patterns for modular and scalable systems.
  • Hands-On Labs: Setting up an MCP client in Streamlit; discovering and registering tools; automating workflows through data retrieval and validation; logging traces for monitoring and review.
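MCP messages ride on JSON-RPC 2.0, so every exchange shares one envelope shape. The builder below shows that envelope; the `tools/call` method and the tool name and arguments are illustrative of the pattern rather than quoted from the specification:

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC requests carry a unique id for matching replies

def jsonrpc_request(method, params):
    """Build a JSON-RPC 2.0 request envelope with an auto-incremented id."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": method,
        "params": params,
    }

# A tool invocation in the MCP style: the client names a tool the server
# advertised and supplies schema-conforming arguments. Tool name and
# arguments here are hypothetical.
msg = jsonrpc_request("tools/call", {
    "name": "get_weather",
    "arguments": {"city": "Seattle"},
})
wire = json.dumps(msg)  # what actually travels over the transport
```

Because every resource, tool, and prompt is exposed through this one structured envelope, clients can discover and govern capabilities without bespoke adapters, which is the "USB-C" claim in practice.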

Build a Multi-Agent LLM Application

Project Tracks:

  • Conversational Workflow Orchestration: Design a multi-turn assistant coordinating tasks across specialized agents.
  • Knowledge-Enhanced Agent: Integrate search and APIs for grounding, fact-checking, and real-time data access.
  • Document-Aware Action Agent: Retrieve and reason over documents; trigger external tools or services based on insights.
  • Orchestrated Collaboration (MCP): Build coordinated multi-agent systems using the Model Context Protocol for seamless tool and enterprise integration.

Attendees Will Receive:

  • Comprehensive Datasets: Industry-spanning document collections for robust development and testing.
  • Step-by-Step Implementation Guides: Clear instructions from environment setup to deployment.
  • Ready-to-Use Code Templates: Prebuilt templates within Data Science Dojo’s sandbox for accelerated development.

Learners Can Choose to Implement:

  • Virtual Assistant
  • Content Generation (Marketing Co-pilot)
  • Conversational Agent (Legal & Compliance Assistant)
  • Content Personalizer
  • MCP Chatbot – AI agent with calendar, CRM, and API integrations

Outcome:

A production-ready multi-agent application demonstrating mastery of reasoning, retrieval, tool use, and protocol-driven interoperability.

Earn a verified certificate

Earn a verified certificate from The University of New Mexico Continuing Education:

  • 5 Continuing Education Units (CEUs)
  • Accepted by employers for tuition reimbursement
  • Valid for professional licensing renewal
  • Verifiable by The University of New Mexico Registrar’s office
  • Add to LinkedIn and share with your network

We accept tuition benefits

All of our programs are backed by a certificate from The University of New Mexico Continuing Education, so they may qualify under your employer's tuition benefits. This means you may be eligible to attend the bootcamp for FREE.

Upcoming sessions

Reserve your spot

Learn to build Large Language Model applications from leading industry experts.

Large Language Models Bootcamp

Use code LLM1500 for a USD 1,500 discount.

  • Seattle Cohort (Instructor Led): March 23rd to 27th, 2026, Mon to Fri, 9 AM to 5 PM PT. $5000 (now $3499)
  • Online Cohort (Instructor Led): March 23rd to 27th, 2026, Mon to Fri, 9 AM to 5 PM PT. $5000 (now $3499)
  • Seattle Cohort (Instructor Led): June 8th to 12th, 2026, Mon to Fri, 9 AM to 5 PM PT. $5000 (now $3499)
  • Online Cohort (Instructor Led): June 8th to 12th, 2026, Mon to Fri, 9 AM to 5 PM PT. $5000 (now $3499)
  • Seattle Cohort (Instructor Led): August 24th to 28th, 2026, Mon to Fri, 9 AM to 5 PM PT. $5000 (now $3499)
  • Online Cohort (Instructor Led): August 24th to 28th, 2026, Mon to Fri, 9 AM to 5 PM PT. $5000 (now $3499)
  • Seattle Cohort (Instructor Led): November 2nd to 6th, 2026, Mon to Fri, 9 AM to 5 PM PT. $5000 (now $3499)
  • Online Cohort (Instructor Led): November 2nd to 6th, 2026, Mon to Fri, 9 AM to 5 PM PT. $5000 (now $3499)

Recognition

A word from our alumni


Frequently asked questions, answered.

I am from a non-technical background. Is the LLM bootcamp also for me?

Our LLM bootcamp has been attended by many individuals from non-technical backgrounds, including those in business consulting and strategy roles. The program is designed to make Generative AI concepts accessible, regardless of your technical expertise.
You’ll gain practical insights into how LLMs are applied across industries, empowering you to advise clients better, lead AI initiatives, or collaborate effectively with technical teams.

What are the prerequisites?

You need a very basic level of Python programming for our LLM bootcamp.

I don't know Python. Can I still attend?

Yes! We offer a short introductory course to help you get comfortable with Python. Plus, all code is provided in Jupyter Notebooks, so you'll focus on understanding rather than writing code from scratch.

The address for the LLM bootcamp venue is given below:

Seattle Venue Address: Data Science Dojo, 5010 148th Ave NE, Redmond, WA 98052

Our LLM bootcamp is an immersive five-day, 40-hour learning experience, available both in-person (Seattle) and online. 

Are the sessions live and interactive?

Yes, these sessions are live and designed to be highly interactive.

Is the online cohort taught at the same time as the in-person one?

Yes, the online session is held at the same time, with the same instructors, as the in-person session.

By joining the LLM bootcamp, you will receive:

  • 40 hours of theory and hands-on learning
  • Live sessions with industry experts
  • 1-year access to dedicated learner and coding sandboxes
  • LLM tokens, GPU clusters, and other required subscriptions
  • A verified certificate from the University of New Mexico upon completion

Does the bootcamp carry continuing education credit?

Yes. You will receive a certificate from The University of New Mexico with 5 CEUs.

Yes, participants who complete the bootcamp will receive a certificate of completion in association with The University of New Mexico. This certificate can be a valuable addition to your professional portfolio and demonstrate your expertise in building large language model applications.

Are the live sessions recorded?

While we do not provide exact recordings of the live sessions, all key topics and content covered in class are available in our companion courses. We've created structured lesson clips that reflect the material discussed, allowing you to review everything at your convenience, even if you miss a live session.

Is there a price difference between in-person and online attendance?

No, the price for the Large Language Models Bootcamp is the same whether you attend in person or online.

What is your cancellation and refund policy?

If, for any reason, you decide to cancel, we will gladly refund your registration fee in full if you notify us at least five business days before the start of the training. We can also transfer your registration to another cohort if preferred.

However, refunds cannot be processed if you have transferred to a different cohort after registration. Additionally, once you have been added to the learning platform and have accessed the course materials, we are unable to issue a refund, as digital content access is considered program participation.

Transfers are allowed once with no penalty. Transfers requested more than once will incur a $200 processing fee.

Looking to upskill your team?