AI Observability Updated Apr 21 2026

AI-Assisted Code that’s Now Data-Aware, with MC Agent Toolkit

AUTHOR | Mor Ofir

In this new and rapidly changing reality of software, humans aren’t the only ones doing the work. Far from it, actually; it’s agents who are now writing code, opening pull requests, and modifying data pipelines. In most data stacks, they’re doing all of this without the critical knowledge of whether their inputs are reliable and trustworthy.

Agents can query your warehouse directly, generate a dbt model in minutes, and push a schema change without a reviewer in the loop. What they can’t do, without help, is know whether the data they’re reasoning over is actually reliable. They don’t know if the table they’re about to modify has 316 downstream dependents, for example, or if the freshness anomaly they’re investigating is a symptom of a bigger problem or the true root cause.

This is a consequential gap, and it is growing faster than most teams realize.

Think about it this way: when a human engineer hits bad data, they notice, stop, and investigate. When an agent does, however, nobody knows until the damage has compounded. The agent cannot recognize that something is “bad” in the intuitive way that humans can, largely because it lacks institutional memory. That context has to be supplied from outside.

The MC Agent Toolkit is how Monte Carlo closes this gap, putting years of institutional memory, lineage, and observability context directly inside the agents your team is already using.

How do Agents Really Need to Access Software?

Agents don’t need dashboards, navigation, visualizations, or export menus. What they need is direct access to the data and to your product’s capabilities and, critically, the context to know whether the data they’re acting on can be trusted.

That means, for example, knowing the current health of a table before touching it or the real impact of a schema change before it ships. Agents have to be able to understand which alerts are high-signal and which are noise.

What we are describing here is institutional memory. It’s what Monte Carlo stores, and it’s what no raw infrastructure tool can provide on its own because those tools are stateless. They show you what’s true right now. Monte Carlo stores what was true before, and why it changed.

The MC Agent Toolkit puts that memory in your agent’s hands. One install is all it takes to put a trust layer in front of your agents.

Every skill in the MC Agent Toolkit is powered by Monte Carlo’s MCP server — an implementation of the Model Context Protocol, the open standard that lets AI coding agents communicate with external tools and data sources in a structured, composable way.

The Full Lifecycle, Inside Your Coding Agent

The toolkit now covers four phases of the agent-powered data workflow. Each phase corresponds to a question agents need to answer before they can act with confidence.

Phase 1: Check before you touch

Asset Health is the simplest skill in the toolkit and, we predict, the most used. Ask your coding agent “how is table X doing?” and get a structured trust report: current status (healthy / degraded / unhealthy), active alerts, monitoring coverage, and upstream dependency health. It’s the trust check that should precede any change and, until now, it didn’t exist inside a coding agent.
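To make the idea of a structured trust report concrete, here is a minimal sketch of how an agent might represent one and decide whether to proceed. The TrustReport class, its field names, and the safe_to_modify rule are all assumptions made for illustration; they are not the toolkit’s actual response schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TrustReport:
    """Hypothetical shape for a trust report; field names are illustrative."""
    table: str
    status: str  # "healthy", "degraded", or "unhealthy"
    active_alerts: List[str] = field(default_factory=list)
    monitored: bool = False
    unhealthy_upstreams: List[str] = field(default_factory=list)


def safe_to_modify(report: TrustReport) -> bool:
    """Conservative gate: proceed only when every signal is clean."""
    return (
        report.status == "healthy"
        and not report.active_alerts
        and report.monitored
        and not report.unhealthy_upstreams
    )
```

A stricter or looser rule is a team decision; the point is that the check runs before the change, not after.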

MC Prevent goes further. It’s the only skill in the toolkit that’s automatic and enforced. The moment you open a dbt model, it pulls Monte Carlo context into your editor without being asked: table health, active alerts, downstream blast radius, monitor coverage. It assigns a risk tier and, for high-risk changes, blocks the agent from editing until a change impact assessment has run. When we tested this on our own dbt repo, it surfaced the full impact of a column rename across 316 downstream dependents before a single line was touched.
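The gating logic described above can be sketched in a few lines. This is an illustration only: the lineage dictionary, the thresholds in risk_tier, and the may_edit rule are hypothetical stand-ins, not how MC Prevent actually computes risk.

```python
from collections import deque
from typing import Dict, List, Set


def downstream_dependents(lineage: Dict[str, List[str]], table: str) -> Set[str]:
    """Breadth-first walk over a table -> direct children mapping."""
    seen: Set[str] = set()
    queue = deque([table])
    while queue:
        current = queue.popleft()
        for child in lineage.get(current, []):
            if child not in seen and child != table:
                seen.add(child)
                queue.append(child)
    return seen


def risk_tier(dependent_count: int, active_alerts: int) -> str:
    """Assign a tier; the thresholds here are illustrative assumptions."""
    if active_alerts > 0 or dependent_count >= 50:
        return "high"
    if dependent_count >= 5:
        return "medium"
    return "low"


def may_edit(tier: str, impact_assessment_done: bool) -> bool:
    """High-risk edits stay blocked until an impact assessment has run."""
    return tier != "high" or impact_assessment_done
```

Under these assumed thresholds, a table with 316 downstream dependents lands in the high tier, so may_edit returns False until the assessment is done.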

Stylized view of the MC Prevent skill operating in Claude Code

Phase 2: Change safely

Once you know it’s safe to proceed, the toolkit makes sure the change ships correctly.

Generate Validation Notebook produces a ready-to-run set of SQL validation queries — row counts, NULL checks, uniqueness, before/after comparisons — tailored to the specific models that changed. The PR Agent now auto-generates these notebooks for every qualifying pull request, so the validation work is done before anyone has to ask for it.
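As a rough picture of what such a notebook contains, the sketch below builds three of the checks named above (row count, NULL checks, uniqueness) for one table. The function name and its parameters are hypothetical, and the SQL is generic; the real skill tailors its queries to the models that changed.

```python
from typing import List


def validation_queries(table: str, key: str, not_null: List[str]) -> List[str]:
    """Build basic validation SQL for one table.

    Identifiers are interpolated as-is, so pass only trusted names.
    """
    # Row count, to compare before and after the change.
    queries = [f"SELECT COUNT(*) AS row_count FROM {table}"]
    # One NULL check per column that should always be populated.
    for column in not_null:
        queries.append(
            f"SELECT COUNT(*) AS null_count FROM {table} WHERE {column} IS NULL"
        )
    # Uniqueness: any row returned here is a duplicated key.
    queries.append(
        f"SELECT {key}, COUNT(*) AS n FROM {table} GROUP BY {key} HAVING COUNT(*) > 1"
    )
    return queries
```

Running the same list against the table before and after a change gives a simple before/after comparison.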

Monitor Creation ensures new logic doesn’t ship without coverage. AI agents creating monitors on their own tend to fail in predictable ways — invalid field names, wrong parameters, nonexistent tables. This skill validates first, then creates, outputting monitors-as-code YAML ready to deploy with the MC CLI.
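The validate-first pattern is easy to picture. In the sketch below, a proposed monitor is checked against a known schema before anything is created. The spec and schema shapes are assumptions for illustration, and the sketch deliberately stops short of emitting YAML because it does not reproduce Monte Carlo’s monitors-as-code format.

```python
from typing import Any, Dict, List, Set


def validate_monitor(spec: Dict[str, Any], schema: Dict[str, Set[str]]) -> List[str]:
    """Return a list of problems; an empty list means the spec can be created.

    `spec` and `schema` are hypothetical shapes used only in this example.
    """
    table = spec.get("table")
    if table not in schema:
        # No point checking fields on a table that does not exist.
        return [f"unknown table: {table}"]
    errors = []
    for name in spec.get("fields", []):
        if name not in schema[table]:
            errors.append(f"unknown field: {name}")
    return errors
```

Only a spec that comes back with no errors moves on to creation, which is what keeps a misspelled column from turning into a broken monitor.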

Stylized view of the MC Validation skill operating in Claude Code

Phase 3: Investigate and fix

When something breaks, the toolkit now covers the full incident lifecycle without leaving your coding agent.

Automated Triage fetches recent alerts, scores each one by confidence and impact, and runs deep troubleshooting on high-signal alerts. It can start from a built-in example workflow or be customized to match how your team responds — designed to move at your pace from manual review to scheduled recommendations to automated actions.
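One simple way to think about scoring by confidence and impact is shown below. The Alert fields, the multiplicative score, and the 0.5 threshold are all assumptions chosen for the example; the skill’s real scoring is not described here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Alert:
    alert_id: str
    confidence: float  # 0.0 to 1.0, how likely the alert is real
    impact: float      # 0.0 to 1.0, how much it matters if it is


def triage(alerts: List[Alert], threshold: float = 0.5) -> List[Alert]:
    """Keep alerts whose combined score clears the threshold, highest first."""
    high_signal = [a for a in alerts if a.confidence * a.impact >= threshold]
    return sorted(high_signal, key=lambda a: a.confidence * a.impact, reverse=True)
```

Only the alerts that survive this filter would go on to deeper troubleshooting; everything else is treated as noise for now.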

Root Cause Analysis walks the lineage chain upstream, checks ETL jobs across Airflow, dbt, and Databricks, detects query changes, and profiles actual data when a DB connector is available. It’s systematic investigation of the kind that currently requires jumping between four different tools and a lot of context held in someone’s head.
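The lineage walk is the part that lends itself to a small sketch. Assuming a hypothetical mapping from each table to its direct parents and a set of tables currently flagged unhealthy, the function below follows unhealthy parents upstream and returns the furthest-upstream unhealthy tables as suspects. It covers only the lineage step, not the ETL, query-change, or data-profiling checks.

```python
from typing import Dict, List, Set


def upstream_suspects(
    parents: Dict[str, List[str]], unhealthy: Set[str], start: str
) -> List[str]:
    """Return unhealthy tables reachable upstream that have no unhealthy parent."""
    suspects: List[str] = []
    seen: Set[str] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        bad_parents = [p for p in parents.get(node, []) if p in unhealthy]
        if node in unhealthy and not bad_parents:
            # Nothing upstream is broken, so the problem likely starts here.
            suspects.append(node)
        # Keep climbing only along the unhealthy branches.
        stack.extend(bad_parents)
    return sorted(suspects)
```

In this model, a broken report whose only unhealthy ancestor chain ends at a raw source table points the investigation at that source rather than at the report.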

Remediation picks up where investigation ends. It discovers what tools are available, proposes a fix with a risk assessment and rollback plan, executes with safety rails, and documents everything on the alert. Even when it can’t execute directly, it always produces a runnable plan.

Together, these three skills compose naturally: triage surfaces the alerts, root cause analysis investigates them, remediation fixes them. The full loop, in one place.

Stylized view of the MC Triage skill operating in Claude Code

Phase 4: Optimize and observe

The newest additions extend the toolkit beyond incident response into ongoing operational intelligence.

Monitoring Advisor brings Monte Carlo’s coverage gap detection directly into Claude Code. Instead of context-switching to the MC app, ask your coding agent to assess monitoring coverage (discovering warehouses, analyzing use cases, and detecting gaps) and then create monitors inline.

Agent Monitoring closes a loop that matters more as AI agents proliferate: who’s watching the agents? This skill discovers the AI agents in your account, analyzes their behavior, and recommends the right monitors — latency, token usage, error rates, quality — with one-click setup.

Performance Diagnosis and Cost-savings round out the set — surfacing your slowest queries and most wasteful assets so agents can help optimize the warehouse alongside the pipelines they’re building.

Getting to the right skill, automatically

Knowing the toolkit exists is one thing. Knowing which skill to reach for is another challenge, especially when you’re staring at a firing alert or trying to figure out what isn’t being monitored.

The mc-agent-toolkit now solves this with built-in skill discoverability and guided workflows. A context-detection router reads what you’re asking about — a stale table, a missing monitor, an unexplained pipeline failure — and routes to the right skill automatically, without requiring you to know its name in advance.

Two new workflow skills ship alongside the router. The first one, incident-response, walks you through triage and root cause analysis for a firing alert, end to end. The second, proactive-monitoring, helps you find coverage gaps before they become incidents. Both are designed to be reached through natural-language questions: “why did this pipeline break?” lands in incident-response; “what’s not being monitored?” lands in proactive-monitoring. Eval accuracy on these flows jumped from 50% to 80% with this routing layer in place.
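To show the shape of the routing idea, here is a deliberately naive keyword router. It is a stand-in, not the toolkit’s context-detection logic: the workflow names come from the text above, while the keyword lists and the "unknown" fallback are assumptions.

```python
from typing import Dict, Tuple

# Hypothetical keyword lists; the real router is not keyword-based as far as
# this post describes.
ROUTES: Dict[str, Tuple[str, ...]] = {
    "incident-response": ("break", "broke", "fail", "alert", "stale"),
    "proactive-monitoring": ("monitor", "coverage", "gap"),
}


def route(question: str) -> str:
    """Pick the workflow whose keywords appear most often in the question."""
    text = question.lower()
    scores = {
        skill: sum(word in text for word in words)
        for skill, words in ROUTES.items()
    }
    best = max(scores, key=lambda skill: scores[skill])
    return best if scores[best] > 0 else "unknown"
```

With these keyword lists, "why did this pipeline break?" routes to incident-response and "what's not being monitored?" routes to proactive-monitoring, matching the two examples above.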

New slash commands (/mc) make the entrypoints discoverable from inside your editor, and a welcome hook surfaces the full toolkit at session start — so the first time a new team member opens their coding agent, they know exactly what’s available.

One Install, Every Editor

The mc-agent-toolkit plugin is available for Claude Code, Cursor, OpenCode, GitHub Copilot CLI, and Codex. Install it once and get everything: MCP server, editor-specific hooks, and a growing set of skills that update automatically.

For Claude Code:

/plugin marketplace add monte-carlo-data/mc-agent-toolkit
/plugin install mc-agent-toolkit@mc-marketplace

When you install the mc-agent-toolkit plugin, it registers Monte Carlo’s MCP server with your coding agent, giving it a set of callable tools backed by Monte Carlo’s full observability graph: lineage, monitors, alerts, incident history, and more. The agent doesn’t need to know how to query Monte Carlo’s API. It just calls the right tool, and the MCP server handles authentication, data retrieval, and response formatting.
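For readers curious what “calling a tool” looks like on the wire, the Model Context Protocol uses JSON-RPC 2.0, and a tool invocation is a tools/call request carrying a tool name and its arguments. The sketch below builds such a request. The tool name get_asset_health and its table argument are hypothetical, used only to illustrate the message shape; consult the toolkit docs for the tools the server actually exposes.

```python
import json
from typing import Any, Dict


def tool_call_request(request_id: int, tool: str, arguments: Dict[str, Any]) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(payload)


# Hypothetical tool name and argument, for illustration only.
message = tool_call_request(1, "get_asset_health", {"table": "analytics.orders"})
```

In practice your coding agent’s MCP client builds and sends these messages for you, which is why the agent never has to touch the API directly.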

This is what makes skills like Asset Health or Root Cause Analysis feel native to your editor rather than bolted on: from the agent’s perspective, checking whether a table is healthy is no different from reading a local file. Monte Carlo’s MCP server is now available at mcp.getmontecarlo.com/mcp, with support for streaming responses, lower latency, and larger payloads — purpose-built for the kind of complex, multi-step workflows that agentic coding actually requires.

The toolkit is open source at github.com/monte-carlo-data/mc-agent-toolkit.

Full documentation is at docs.getmontecarlo.com/docs/agent-toolkit.

Why This Matters Now

Agents are becoming the primary users of your data stack, and they’re not waiting for you to build an agent strategy. They are already connected, querying, and acting with quite a bit of autonomy.

That brings us to a critical juncture: we need to ensure that the data agents rely on to carry out their tasks is worth trusting.

Monte Carlo has always been the system that knows what’s happening in your data. Now, with the Agent Toolkit, that same knowledge is available to every agent in your stack — before a change is made, while a PR is in review, the moment an alert fires, and at every step in between.

The trust layer should be a part of your coding agent, inextricably linked to the rest of its makeup, because data and AI are two halves of a whole. This way, trust also compounds over time; the longer you use it, the richer the institutional memory your agents can draw on.

If you’re interested in getting started with these skills, explore the toolkit on GitHub and be sure to check out the docs.

Learn more about Agent Observability at Monte Carlo here.
