Data Observability | Updated May 04, 2026

How to make Claude a trusted analyst for your whole company

AUTHOR | Lior Gavish

Every data leader is being asked the same question right now: “Can’t we just point Claude at our warehouse and let everyone ask their own questions?”

The naive version works for about a week. Then two execs ask the same question and get two different numbers, Claude joins the wrong tables and answers confidently anyway, and someone makes a call off a metric based on a pipeline that broke three days ago. The risk is that you replace a slow, accurate analytics function with a fast, unreliable one. That’s worse, not better.

Trusted self-serve is achievable — but it requires architecture, not just access. Here’s the pattern we run at Monte Carlo, generalized so any analytics team can adopt it. Today it covers 100% of internal data inquiries across our 150-person company.

The unit of leverage: an organization-level skill

Claude Skills are small packages of instructions, optionally with code and reference files, that Claude loads when a question matches the skill’s description. They don’t replace Claude’s reasoning — they constrain and guide it.

A single, well-designed organization-level analytics skill becomes the choke point through which every analytics question flows. Instead of N employees writing N versions of “please query our warehouse,” everyone benefits from one curated playbook owned by the data team and versioned like any production artifact. On Claude Team and Enterprise plans, an admin can provision a skill organization-wide from a single upload — deploy once and every employee picks up the latest version automatically.

That’s the unit of governance. Everything below — dashboards, semantic layer, health checks — is wired through it.

Step 1: Route to curated dashboards first

Counter-intuitive but critical: the first thing your analytics skill should do is not generate SQL. It should check whether a curated dashboard already answers the question.

Dashboards are already vetted, governed, and shareable. Ad-hoc SQL output is none of those things. A huge fraction of “analytics questions” are actually discovery questions — “is there a report that shows X?” — and the right answer is a link, not a query.

Encode this by shipping the skill with a table of your most important dashboards (Looker, Tableau, Mode, Hex — whatever you use), each tagged with the topics it covers. The skill instructs Claude to match the question against this table before doing anything else, and to return a link if there’s a clean match. A “what’s our weekly active users trend?” question routes straight to the WAU dashboard — no SQL, no chance of fragmenting the metric.
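As a rough illustration of the dashboard-first routing described above, here is a minimal Python sketch. The catalog entries, URLs, topic tags, and the `match_dashboard` helper are all hypothetical, and the whole-phrase matching is an assumption made for the example. In practice the skill would more likely hand Claude the table and let it do the matching itself.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Dashboard:
    name: str
    url: str
    topics: tuple  # lowercase topic phrases this dashboard covers


# Hypothetical catalog; a real one would list your own BI dashboards.
DASHBOARDS = [
    Dashboard(
        name="Weekly Active Users",
        url="https://bi.example.com/dashboards/wau",
        topics=("wau", "weekly active users", "active users"),
    ),
    Dashboard(
        name="Revenue Overview",
        url="https://bi.example.com/dashboards/revenue",
        topics=("revenue", "arr", "bookings"),
    ),
]


def match_dashboard(question: str, catalog=DASHBOARDS) -> Optional[Dashboard]:
    """Return the dashboard whose topics best match the question, or None."""
    # Normalize to space-separated lowercase words so topics match whole phrases only.
    words = " ".join(re.findall(r"[a-z0-9]+", question.lower()))
    padded = f" {words} "
    best, best_score = None, 0
    for dash in catalog:
        score = sum(1 for topic in dash.topics if f" {topic} " in padded)
        if score > best_score:
            best, best_score = dash, score
    return best  # None signals "no clean match, fall back to ad-hoc analysis"


hit = match_dashboard("What's our weekly active users trend?")
print(hit.url if hit else "no dashboard match")
```

Returning `None` rather than a weak guess keeps the fallback explicit: only a clean match short-circuits SQL generation.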

Step 2: Anchor ad-hoc analysis to a semantic layer

When there’s no dashboard match, the skill falls back to ad-hoc analysis. This is where most teams get burned: Claude on raw warehouse tables will invent joins, fabricate column meanings, and confidently mis-aggregate.

The fix is to route Claude through a semantic layer. We use Snowflake semantic views (the SQL reference covers logical tables, dimensions, metrics, and verified queries); the same idea works with dbt’s semantic layer, Cube, Atlan, or any modeled definition of metrics and entities. It constrains Claude to entities you’ve explicitly modeled, eliminates hallucinated joins, and gives it verified queries to adapt rather than write from scratch — so “active user” means the same thing every time.
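To make the constraint concrete, the sketch below models a tiny semantic layer as a Python dictionary. It is not Snowflake's semantic view syntax or any vendor API. The metric name, table name, dimensions, and the `build_query` helper are all assumptions for illustration. The point is only that a request outside the modeled entities is refused rather than improvised.

```python
from dataclasses import dataclass


class UnmodeledRequest(ValueError):
    """Raised when a question needs something the semantic layer does not define."""


@dataclass(frozen=True)
class Metric:
    name: str
    allowed_dimensions: frozenset
    verified_sql: str  # vetted template; {dimension} is filled from the allowlist only


# Hypothetical model with a single metric; the table name is made up.
SEMANTIC_MODEL = {
    "active_users": Metric(
        name="active_users",
        allowed_dimensions=frozenset({"week", "plan_tier"}),
        verified_sql=(
            "SELECT {dimension}, COUNT(DISTINCT user_id) AS active_users "
            "FROM analytics.fct_user_activity GROUP BY {dimension}"
        ),
    ),
}


def build_query(metric_name: str, dimension: str, model=SEMANTIC_MODEL) -> str:
    """Adapt a verified query instead of writing SQL from scratch."""
    metric = model.get(metric_name)
    if metric is None:
        raise UnmodeledRequest(f"metric not modeled: {metric_name}")
    if dimension not in metric.allowed_dimensions:
        raise UnmodeledRequest(f"{dimension} is not a modeled dimension of {metric_name}")
    # Safe to interpolate: the dimension was checked against the allowlist above.
    return metric.verified_sql.format(dimension=dimension)


print(build_query("active_users", "week"))
```

Because every caller goes through the same verified template, two people asking for active users by week get the same definition and the same query.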

The skill teaches Claude how to read the semantic layer and encodes presentation standards — chart aspect ratios, how to handle partial weeks, when to report median vs. mean. Boring, but it’s what separates a trustworthy answer from a misleading one.
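Partial weeks are a good example of a presentation standard worth encoding. The sketch below shows one possible rule, not necessarily the one we use: roll daily values into Monday-start weeks and flag any week that has not fully elapsed, so it can be excluded from a trend line or labeled as partial. The function name, the Monday week start, and the assumption that data for the `as_of` day is fully loaded are all illustrative choices.

```python
from collections import defaultdict
from datetime import date, timedelta


def weekly_totals(daily: dict, as_of: date) -> list:
    """Sum daily values into Monday-start weeks and flag incomplete weeks.

    `daily` maps a date to a numeric value. A week counts as complete only
    if its Sunday is on or before `as_of` (assumes `as_of` is fully loaded).
    """
    totals = defaultdict(float)
    for day, value in daily.items():
        week_start = day - timedelta(days=day.weekday())  # Monday of that week
        totals[week_start] += value
    return [
        {
            "week_start": week_start,
            "total": totals[week_start],
            "complete": week_start + timedelta(days=6) <= as_of,
        }
        for week_start in sorted(totals)
    ]


# Ten days of data starting on Monday 2026-04-27, evaluated on Wednesday 2026-05-06.
daily = {date(2026, 4, 27) + timedelta(days=i): 1.0 for i in range(10)}
for row in weekly_totals(daily, as_of=date(2026, 5, 6)):
    label = "" if row["complete"] else " (partial week)"
    print(row["week_start"], row["total"], label)
```

Without the flag, the second week's total of three days would read as a sharp drop next to a full seven-day week.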

Step 3: Validate data trust with live health checks

Even with a perfect semantic layer, you still have one fundamental problem: the underlying data might be broken at the time of analysis. A freshness issue, dropped records, or an insidious upstream schema change — any of these silently corrupts the answer Claude returns. The query is right; the data is wrong.

Think about how a seasoned analyst handles this today. Before they hand a number to an exec, they don’t just run the query — they pause and check. Did that pipeline get rebuilt last week? Is there an open thread in #data-quality about the accounts table? Did the Fivetran sync error overnight? They know which dashboards have been flaky, which sources are mid-migration, and which metrics to caveat. That instinct isn’t in the SQL — it’s in being a member of the data team, and it’s what makes their answers trustworthy.

Claude doesn’t have that membership. The Monte Carlo Agent Toolkit, described below, gives it the equivalent: the institutional awareness a long-tenured analyst would have, plus the tools to act on it. This is the gap between “AI analyst demo” and “AI analyst you trust in production.” Most teams skip it.

Close it by running a live data-health check before presenting results. After Claude generates and runs the SQL, the skill makes it:

  1. Extract the base tables from the FROM and JOIN clauses.
  2. Check the tables for any existing issues, including upstream problems that may not be immediately visible in the table.
  3. Surface any issues as part of the answer, with severity and status.
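The three steps above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the regular expressions only handle straightforward SQL (a proper SQL parser would be more robust, and constructs such as `EXTRACT(day FROM ts)` would confuse this version), and `get_table_issues` is a hypothetical callable standing in for whatever observability lookup you use. It is not the Agent Toolkit's actual interface.

```python
import re

# Matches the identifier after FROM or JOIN, allowing up to three dotted parts.
TABLE_REF = re.compile(
    r"\b(?:FROM|JOIN)\s+([A-Za-z_][\w$]*(?:\.[A-Za-z_][\w$]*){0,2})",
    re.IGNORECASE,
)
# Matches CTE names so they are not mistaken for base tables.
CTE_NAME = re.compile(r"(?:\bWITH|,)\s*([A-Za-z_]\w*)\s+AS\s*\(", re.IGNORECASE)


def extract_base_tables(sql: str) -> list:
    """Step 1: collect table names from FROM and JOIN clauses, skipping CTEs."""
    ctes = {name.lower() for name in CTE_NAME.findall(sql)}
    tables = []
    for ref in TABLE_REF.findall(sql):
        name = ref.lower()
        if name not in ctes and name not in tables:
            tables.append(name)
    return tables


def collect_issues(sql: str, get_table_issues) -> list:
    """Steps 2 and 3: look up each base table and gather issues to surface.

    `get_table_issues(table)` is assumed to return a list of dicts with
    "severity" and "status" keys, including any upstream problems.
    """
    report = []
    for table in extract_base_tables(sql):
        for issue in get_table_issues(table):
            report.append({"table": table, **issue})
    return report


sql = """
WITH recent AS (SELECT * FROM analytics.fct_orders WHERE order_date > '2026-01-01')
SELECT a.account_name, SUM(r.amount)
FROM recent r
JOIN analytics.dim_accounts a ON a.account_id = r.account_id
GROUP BY 1
"""
print(extract_base_tables(sql))
```

Passing the lookup in as a function keeps the extraction logic testable without a live observability backend.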

We do this with the Monte Carlo Agent Toolkit — an open-source bundle of skills and an MCP server that puts Monte Carlo’s full observability graph (lineage, monitors, alerts, incident history) inside the agent as callable tools. The skill we use for this step is Asset Health: the agent asks “how is this table doing?” and gets back a structured trust report — status, active alerts, monitoring coverage, and upstream dependency health.

The presentation pattern matters as much as the check itself:

  • Tables clean → present results normally, no caveat needed.
  • Minor alerts on non-critical tables → footnote.
  • Active SEV-1 or SEV-2 on a table feeding the key metric → lead with the warning. “⚠️ There’s an active freshness incident on dim_accounts. The numbers below should be treated as provisional until it’s resolved.”
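The presentation rules above map naturally onto a small decision function. The sketch below is one possible encoding. The issue dictionaries, the `key_tables` set, and the `Presentation` enum are hypothetical. It also assumes that only issues with status "active" count, and that any active issue not covered by the lead-warning rule falls back to a footnote, which the rules above do not spell out.

```python
from enum import Enum


class Presentation(Enum):
    NORMAL = "normal"              # tables clean: no caveat
    FOOTNOTE = "footnote"          # minor or non-critical issues
    LEAD_WARNING = "lead_warning"  # open the answer with the warning


HIGH_SEVERITIES = {"SEV-1", "SEV-2"}


def choose_presentation(issues: list, key_tables: set) -> Presentation:
    """Pick how to present results given table issues.

    Each issue is a dict with "table", "severity", and "status" keys.
    `key_tables` holds the tables feeding the metric being reported.
    """
    active = [i for i in issues if i.get("status") == "active"]
    if not active:
        return Presentation.NORMAL
    for issue in active:
        if issue["severity"] in HIGH_SEVERITIES and issue["table"] in key_tables:
            return Presentation.LEAD_WARNING
    # Assumption: every other active issue is surfaced as a footnote.
    return Presentation.FOOTNOTE


issues = [{"table": "dim_accounts", "severity": "SEV-2", "status": "active"}]
print(choose_presentation(issues, key_tables={"dim_accounts"}))
```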

The AI doesn’t just answer faster — it tells you when not to trust the answer. That’s the difference between an analyst people rely on and one they learn to ignore.

The full workflow

Trust is baked in and standardized, mimicking the actions a seasoned analyst would take. Steps 2 and 3 ensure the right data is used in the right manner and step 6 ensures there are no current issues with that data.

Try it yourself

The building blocks are open:

Wrap those tools in your own org-level analytics skill and you have a Claude that knows your business and knows when its answers should be trusted.

What this changes

Org-level skills, a semantic layer, and live data-health checks finally resolve the trade-off self-serve analytics has always faced: governance centralized in one artifact, access universal through Claude.

The transformation at Monte Carlo has been concrete. Our analysts no longer spend their days writing ad-hoc SQL for whoever asks loudest — they spend them on the deep, strategic work that actually moves the business. All 150 employees get answers to their data questions in seconds, on demand, at any hour. And because every answer carries its own trust signal, the people receiving those answers act on them with full confidence.

The headline metric isn’t “questions answered per day.” It’s trust — the share of answers a decision-maker can act on without checking with the analytics team first. That’s the number that tells you whether you’ve built an AI analyst or just an AI demo.
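If you want to track that share, the arithmetic is simple. The sketch below assumes you log, for each answer, whether the decision-maker acted on it without first checking with the analytics team. How that flag gets recorded is left open, and the function name is hypothetical.

```python
from typing import Optional


def trust_share(acted_without_checking: list) -> Optional[float]:
    """Share of answers acted on without first checking with the analytics team.

    Takes one boolean per answer; returns None when there are no answers yet.
    """
    if not acted_without_checking:
        return None
    return sum(1 for flag in acted_without_checking if flag) / len(acted_without_checking)


print(trust_share([True, True, False, True]))
```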

