Why Data Teams Are Pairing AI Agents With Analytics Platforms: The dbt and Gemini Experiment

AI agents are moving beyond chatbots into data engineering workflows, but only when they have built-in safety mechanisms. A hands-on exploration by dbt Labs shows how pairing AI reasoning with deterministic data validation creates a new category of autonomous analytics work. Instead of agents guessing whether their SQL is correct, they now get instant feedback from a compiler that enforces data rules.

What Makes an AI Agent Different From a Regular AI Tool?

The distinction matters because it changes what AI can actually accomplish in your data stack. Traditional AI tools like large language models (LLMs), which are the "brains" behind modern AI systems, excel at answering questions and generating text. They respond when prompted but don't take independent action. Agentic AI systems, by contrast, receive a goal, break it into steps, use available tools to execute those steps, and adapt when something goes wrong.

Think of it this way: ask an LLM to "fix the authentication bug," and it suggests solutions. Give an agentic system the same instruction, and it examines the codebase, implements a fix, runs tests and deploys the update without waiting for approval at each stage. The progression from observation to communication to action explains why agentic AI feels fundamentally different from what came before.

Four core capabilities define agentic systems:

  • Autonomy: The agent determines steps needed to reach a goal without constant human instruction or approval at each decision point
  • Reasoning: When something fails, the agent troubleshoots and tries alternative approaches instead of stopping at the first error
  • Multi-step execution: The agent chains multiple capabilities together into complete workflows where each step feeds context into the next
  • Environmental awareness: The agent interacts with real systems like codebases, databases, APIs and production infrastructure, not just text in a chat window

Why Is dbt Pairing With AI Agents Now?

The timing matters. For years, AI agents remained theoretical because the necessary pieces didn't align. Modern foundation models can now reason through complex, multi-step problems reliably. APIs and tool integrations have expanded across the software ecosystem, letting agents authenticate to multiple systems in a single workflow. Compute costs dropped while orchestration frameworks matured.

But data engineering presented a unique problem: agents could propose SQL or YAML configurations, but nobody knew if they were correct until they ran against the actual warehouse. That's where dbt's Fusion engine changed the equation. Real-time parsing and a smart, deterministic compiler mean AI no longer has to "hope" its output is correct. Every generated model, test, or metric can be validated immediately against the warehouse, the project graph, and dbt's rules.

"Instead of treating AI like a clever autocomplete, Fusion makes it possible to treat AI like a junior analytics engineer. It can propose models, tests, and metrics. Fusion can instantly tell us whether they compile, parse, and conform. Mistakes become feedback loops, not production risks," explained Stephen Robb, a data engineer at dbt Labs who built a working dbt agent using Google's tools.


How to Build an AI Agent for Your Data Stack

The practical path forward involves three layers working together:

  • The reasoning engine: Google's Gemini model handles multi-step problem solving and code generation, serving as the cognitive core that decides what to do next
  • The tool interface: The Model Context Protocol (MCP), a standard way for AI models to safely interact with tools and systems, exposes dbt capabilities like metadata, models, tests and commands as tools the agent can use without breaking anything
  • The orchestration framework: Google's Agent Development Kit (ADK) provides the structure for building serious agent-based systems, managing how agents think, what tools they can access, and how they interact with infrastructure safely

The dbt MCP server specifically grants agents access to most dbt functionality through a consistent interface. Instead of teaching an AI how to use dbt differently each time, MCP gives it a rulebook and toolbox. The agent can ask "What models exist?" or "Run this dbt command" without accidentally breaking the project.
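The "rulebook and toolbox" idea can be illustrated in miniature: every capability is exposed as a named tool with a declared parameter list, and every call goes through one validated entry point. The tool names, schemas, and handlers below are hypothetical stand-ins, not the dbt MCP server's actual tool catalog.

```python
# Miniature illustration of an MCP-style tool interface: named tools with
# declared parameters, dispatched through a single validated entry point.
# Tool names and handlers are hypothetical, not the dbt MCP server's real API.

TOOLS = {
    "list_models": {"params": [], "handler": lambda: ["stg_orders", "fct_orders"]},
    "run_command": {"params": ["command"],
                    "handler": lambda command: f"ran: dbt {command}"},
}

def call_tool(name: str, **kwargs):
    """Single entry point: check the call against the tool's schema, then dispatch."""
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    spec = TOOLS[name]
    if sorted(kwargs) != sorted(spec["params"]):
        raise ValueError(f"{name} expects params {spec['params']}")
    return spec["handler"](**kwargs)

print(call_tool("list_models"))  # the agent asking "what models exist?"
print(call_tool("run_command", command="build --select fct_orders"))
```

Because every call is schema-checked before dispatch, a malformed request fails loudly at the interface instead of quietly damaging the project, which is the safety property the article attributes to MCP.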

Getting started requires Python, Git, and either dbt Core or the dbt platform. Once your environment is set up, you can connect Claude, Cursor, or other AI clients to the dbt MCP server and confirm it's working. The tools then let you explore your dbt project, query metrics, analyze lineage, monitor job runs and troubleshoot issues.

What Problems Are Agents Actually Solving Today?

Agentic AI is already delivering measurable results in specific domains. In software development, GitHub's Copilot coding agent accepts a GitHub issue, researches the repository, writes an implementation, runs tests and opens a pull request for human review. The developer assigns the issue and returns to a finished PR ready for inspection. Google's Big Sleep agent, built by DeepMind and Project Zero, identified a zero-day SQLite memory corruption flaw that was already known to threat actors and cut off the attack before exploitation could begin.

In customer operations, agents resolve tickets by pulling customer data, checking order status across multiple systems and drafting responses that follow company guidelines. Data analysis workflows that once required manual orchestration across multiple tools now run autonomously from question to insight.

The pattern is consistent: any workflow that requires gathering information from multiple systems, making a judgment call and acting on it is a candidate for agentic automation. Agents are already coordinating software releases end-to-end, managing cloud infrastructure in response to real-time demand and running compliance audits that previously took a team days to complete manually.

What Challenges Still Need Solving?

The technology works, but deployment at scale requires solving four critical problems. Reliability remains the biggest hurdle: agents have to consistently achieve goals without creating more problems than they solve. A system that works 95% of the time still fails disruptively in production.
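The 95% figure compounds quickly across a multi-step workflow. Assuming each step succeeds independently with probability 0.95 (an illustrative simplification), a ten-step workflow completes end-to-end only about 60% of the time:

```python
# Per-step reliability compounds across an autonomous workflow.
# Assumes independent per-step success -- an illustrative simplification.
per_step = 0.95
for steps in (1, 5, 10, 20):
    print(f"{steps:2d} steps -> {per_step ** steps:.1%} end-to-end success")
```

A 95%-reliable step chained ten times fails roughly four runs in ten, which is why per-step accuracy that sounds high can still be disruptive in production.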

Oversight is equally important. Organizations need new ways for humans to monitor agent actions without micromanaging them. The goal is auditability: knowing what an agent did, why it did it and whether the outcome was correct, without requiring approval at every step.

Ethics and accountability frameworks don't yet exist for autonomous systems. Existing compliance structures assume a human made the decision. Agents break that assumption, raising questions about who is responsible when an agent takes an action that causes harm.

Finally, teams need new collaboration models where people set the goals and agents handle execution. This means defining which actions are safe to automate fully and which require human sign-off based on risk .

The convergence of mature foundation models, expanded API ecosystems, cheaper compute and orchestration frameworks has made agentic AI practical at scale. For data teams specifically, pairing agents with deterministic validation systems like dbt's Fusion engine removes the guesswork and transforms agents from experimental tools into reliable collaborators that can iterate on data logic in real time with confidence.