NFO
Mimir, Keeper of the Well of Wisdom
Datadog LLM Observability: Monitor & Trace AI in Production
https://www.udemy.com/course/datadog-llm-observability/
Year : 2026
Language : English
Level : Intermediate Level
Category : Development
Subcategory : Data Science
Duration : 4h 7m
Lectures : 32
Rating : 4.8/5 (5 reviews)
Students : 169
INSTRUCTOR(S)
HEADLINE
Master enterprise AI monitoring with tracing, evaluations,
cost control, and security compliance using the Datadog
platform
WHAT YOU'LL LEARN
* Instrument LLM applications with Datadog's ddtrace SDK for
full visibility into prompts, completions, and token usage
* Trace complex AI agent workflows including multi-turn
conversations, tool calls, and decision paths with
enterprise-grade debugging
* Implement production evaluations using managed checks
(toxicity, relevancy) and custom LLM-as-a-judge evaluators
* Monitor and optimize LLM costs with automated cost tracking,
budget alerts, and model comparison dashboards
* Run experiments to test prompt and model changes before
production deployment using Datadog's experimentation
framework
* Build secure AI systems with PII scrubbing, compliance
patterns, and security monitoring for enterprise
requirements
* Instrument RAG pipelines with custom spans for embedding,
retrieval, and generation steps for complete workflow
visibility
* Integrate LLM observability with existing Datadog APM,
infrastructure, and security tools for unified enterprise
monitoring
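To make the idea of spans in the bullets above concrete, here is a minimal sketch in plain Python of how a RAG pipeline can be wrapped in nested spans for its embedding, retrieval, and generation steps. It is an illustration only and does not use the ddtrace SDK: the `span` helper, the `SPANS` list, the `answer` function, and the kind labels are all hypothetical names, and the embedding, retrieval, and generation steps are stand-ins.

```python
import time
from contextlib import contextmanager

# Collected span records. A real SDK would export these to a backend;
# this list is a hypothetical stand-in for illustration.
SPANS = []


@contextmanager
def span(name, kind):
    """Record the name, kind, and duration of the wrapped block."""
    record = {"name": name, "kind": kind}
    start = time.perf_counter()
    try:
        yield record
    finally:
        record["duration_s"] = time.perf_counter() - start
        SPANS.append(record)


def answer(question):
    """Toy RAG pipeline with one span per step (all steps are stand-ins)."""
    with span("rag_pipeline", "workflow"):
        with span("embed_query", "embedding") as s:
            vector = [float(len(question))]  # stand-in for a real embedding
            s["output_dim"] = len(vector)
        with span("fetch_docs", "retrieval") as s:
            docs = ["doc-1", "doc-2"]  # stand-in for a vector store lookup
            s["doc_count"] = len(docs)
        with span("generate", "llm") as s:
            reply = f"Answer based on {len(docs)} documents"
            s["output"] = reply
    return reply
```

Calling `answer("Where is my order?")` appends four records to `SPANS`. Inner spans finish first, so the enclosing workflow span is recorded last.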
REQUIREMENTS
* Basic Python programming experience (intermediate level)
* Familiarity with LLM APIs (OpenAI, Anthropic, or similar)
* Datadog account (free trial available) or ability to create
one
* Basic understanding of LLM concepts (prompts, tokens,
completions)
* Optional: Experience with LangChain or similar LLM
frameworks
WHO IS THIS COURSE FOR
* Enterprise Teams already using Datadog for
APM/Infrastructure who want unified visibility into their AI
workloads
* Technical Leads & Architects evaluating or implementing LLM
observability solutions within existing Datadog ecosystems
* Platform Engineers building internal AI infrastructure who
need to provide observability standards for development
teams
* ML Engineers & AI Engineers building LLM-powered
applications who need production-grade monitoring and
debugging capabilities
DESCRIPTION
Are your LLM applications running blind in production? You've
deployed an AI agent, a RAG pipeline, or an LLM-powered
chatbot. But can you answer these questions?
* How much did that runaway agent loop cost before someone
noticed?
* Why did hallucination rates spike last Tuesday?
* Which step in your RAG pipeline is returning irrelevant
documents?
* How do you prove to compliance that you're protecting
customer PII in LLM conversations?
If you can't answer these questions with data, you have a
production problem. Traditional APM tools see your LLM as a
black box. They measure latency and error rates, but they
can't show you token flows, prompt effectiveness, or quality
degradation. LLMs are fundamentally different:
non-deterministic, multi-step, token-priced, and
quality-sensitive. You need LLM-native observability.
Introducing Datadog LLM Observability
This course is the definitive guide to Datadog's LLM
Observability platform for enterprise teams. If you're already
using Datadog for APM, infrastructure, or security, this
integrates directly into your existing stack: no new tools to
learn, no separate dashboards to monitor.
What you'll build: Throughout this course, you'll instrument a
production-grade Customer Support AI Agent with:
* Multi-turn conversation tracing
* Tool integration (order lookup, refund processing)
* Custom quality evaluations
* Cost monitoring dashboard
* PII scrubbing compliance
This isn't a toy example; it's the architecture real
enterprise teams deploy.
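As a rough illustration of the PII scrubbing mentioned above, the following is a minimal sketch of a redaction function that could run on prompts and completions before they are logged or traced. It uses only Python's standard `re` module. The patterns, the placeholder tokens, and the `scrub_pii` name are assumptions for illustration, not the course's or Datadog's implementation, and real deployments need far broader pattern coverage.

```python
import re

# Deliberately simple, hypothetical patterns: an email address and a
# 3-3-4 digit phone number separated by dashes, dots, or spaces.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def scrub_pii(text: str) -> str:
    """Replace matched emails and phone numbers with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text
```

For example, `scrub_pii("Contact jane.doe@example.com or 555-123-4567.")` returns `"Contact [EMAIL] or [PHONE]."`.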
COURSE CONTENT
Chapter 1: Introduction & Enterprise Value
1. What You'll Learn
2. Why LLM Observability Matters for Enterprises
3. Datadog LLM Observability - Core Capabilities and
Dashboard Demo
Chapter 2: Setting Up LLM Observability
4. Datadog Account Setup and Testing - Hands-on
5. Span Types and SDK Integrations Overview
6. First Traced LLM Call in a Local Environment - Hands-on
7. Quick Check-in
Chapter 3: Instrumenting LLM Applications
8. Creating LLM Spans with Annotations and Tags - Hands-on
9. Instrumenting Multi-step Workflows - Hands-on
10. RAG Pipeline with Full Observability - Hands-on
11. LangChain Integration - Part 1
12. LangChain RAG Pipeline Auto-Instrumentation - Hands-on
Chapter 4: Tracing Agentic AI Workflows
13. Tracing Agentic Workflows - Hands-on
14. Multi-Agent Systems - Hands-on
15. Debugging Agent Issues - Overview
Chapter 5: Evaluations & Quality Monitoring
16. LLM Experiments Overview and Dataset Creation - Hands-on
17. Generating a Golden Evaluation Set - Hands-on
18. Running LLM Experiments in Datadog - Dashboards and
Comparisons
19. A/B Testing Prompts - Full Workflow - Hands-on
20. Setting Up Evaluations and Quality Monitoring - Hands-on
21. Creating Custom Evaluations
22. Evaluations and Monitoring in Code Only - Hands-on
Chapter 6: Cost Monitoring & Optimization
23. Datadog Automatic Cost Tracking - Dashboard Walkthrough
24. Cost Optimization Strategies - Overview
Chapter 7: Security, Compliance & Production Patterns
25. Security, Compliance, and Production - Overview
26. Setting Up PII Redaction Function and Testing - Hands-on
27. Data Scanning Dashboard Overview
28. Testing a Custom PII Redaction Group in Datadog -
Hands-on
29. LLM Apps Security, Compliance, and Production -
Hands-on
30. Production Deployment Architecture and Checklist
Chapter 8: Bonus
31. Wrap-up and Next Steps
32. Bonus
DATES
Published : 2026-03-17
Last Updated : 2026-03-16
If you fear the truth, don't come to my well.
CRC32: 649c96a337b2fc3b9779d71b73921cf803d26576