Using EXPLAIN ANALYZE for Cost Attribution in MySQL

This page shows the Python needed to turn MySQL 8.0’s EXPLAIN ANALYZE iterator tree into a per-query cost record, parsed deterministically, that you can use for chargeback and quota enforcement.

Instance-level billing amortizes an entire MySQL bill across every workload that touched the engine, which hides the single report query quietly burning a tenant’s vCPU budget. MySQL 8.0.18+ closes that gap with EXPLAIN ANALYZE, which runs the statement and reports real per-iterator timing, row counts, and loop iterations rather than the optimizer’s static guesses. This page is the MySQL-engine counterpart to the broader work of attributing per-statement resource use back to a dollar figure; the parser you build here converts the tree into the CPU-seconds and rows-processed terms that a compute-versus-storage cost split prices, and the resulting signal feeds directly into hard and soft quota boundaries.

The unit cost of a single execution is a weighted sum over the two dimensions the tree exposes — elapsed CPU time and the volume of rows the plan pushed through its iterators:

$$C_{query} = (t_{ms} \cdot L_{root}) \cdot r_{cpu} + \left(\sum_i \text{rows}_i \cdot \text{loops}_i\right) \cdot r_{io}$$

where $t_{ms}$ is the root iterator’s per-loop last-row time in milliseconds, $L_{root}$ its loop count, each $r$ is the corresponding normalized cloud rate, and the summation is the total rows processed across every node in the plan. The whole task is to recover those terms honestly from the tree and price them against a rate model that stays comparable across engines.
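
As a quick worked instance of the formula, the sketch below plugs in the figures from the Step 1 tree (an 18.902 ms root time over one loop, plus each iterator's rows and loops) together with the sample rates used later in `main()`. The function name `query_cost` and its parameter names are illustrative only and are not part of the pipeline.

```python
def query_cost(t_ms: float, loops_root: int,
               node_rows_loops: list[tuple[int, int]],
               r_cpu: float, r_io: float) -> float:
    """C = (t_ms * L_root) * r_cpu + (sum_i rows_i * loops_i) * r_io."""
    cpu_term = (t_ms * loops_root) * r_cpu
    row_volume = sum(rows * loops for rows, loops in node_rows_loops)
    return cpu_term + row_volume * r_io


# (rows, loops) for each iterator line of the Step 1 tree, top to bottom.
step1_nodes = [(214, 1), (8901, 1), (8901, 1), (1, 8901)]
cost = query_cost(18.902, 1, step1_nodes, r_cpu=0.0000025, r_io=0.00000012)
print(f"{cost:.6f}")  # 0.003277
```

At these rates the row-volume term (26917 rows) dominates the CPU term, which is typical when a nested-loop inner side runs thousands of times.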

Prerequisites

Before running the pipeline, confirm the following are in place.

MySQL version: 8.0.18 or newer. EXPLAIN ANALYZE emits the TREE format by default; FORMAT=JSON output arrived in 8.3.0, but the parser below targets the universally available tree text.
MySQL grants: EXPLAIN ANALYZE executes the statement, so the attribution user needs the same SELECT privileges as the workload it inspects, plus read access to performance_schema to discover which digests are worth analyzing. Grant a dedicated read-only principal — never reuse an application or admin credential, in line with least-privilege access control for cost data.
```
CREATE USER 'cost_reader'@'%' IDENTIFIED BY '<rotated-secret>';
GRANT SELECT ON appdb.*             TO 'cost_reader'@'%';
GRANT SELECT ON performance_schema.* TO 'cost_reader'@'%';
-- No FLUSH PRIVILEGES needed: CREATE USER and GRANT take effect immediately.
```
Python: 3.10 or newer (the code uses built-in generic annotations such as dict[str, QueryCost] and current asyncio APIs).
Libraries: install the async MySQL driver.
```
pip install "asyncmy>=0.2.9"
```
Safety note: because the statement runs for real, restrict attribution to read-only SELECT workloads and route it to a read replica, never the primary. Running EXPLAIN ANALYZE on a mutating statement will apply its side effects.
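
One way to enforce the safety note in code is a conservative pre-flight check on the statement text. The sketch below is an assumption rather than part of the collector: `is_analyzable_select` is a hypothetical helper, and its rules (single statement, leading SELECT keyword, no FOR UPDATE or INTO OUTFILE/DUMPFILE) are deliberately stricter than necessary. It does not detect stored functions with side effects, so the read replica remains the real safeguard.

```python
import re

# Leading whitespace plus any /* ... */, -- ..., or # ... comments.
_LEADING_COMMENT = re.compile(
    r"^\s*(?:/\*.*?\*/\s*|--[^\n]*\n\s*|#[^\n]*\n\s*)*", re.DOTALL
)
# Clauses that take write locks or write files; rejected outright.
_FORBIDDEN = re.compile(
    r"\b(?:for\s+update|into\s+(?:outfile|dumpfile))\b", re.IGNORECASE
)


def is_analyzable_select(sql: str) -> bool:
    """Return True only for a single statement that starts with SELECT."""
    body = _LEADING_COMMENT.sub("", sql, count=1).rstrip().rstrip(";").rstrip()
    if ";" in body:  # refuse anything that looks like multiple statements
        return False
    if not re.match(r"select\b", body, re.IGNORECASE):
        return False
    return _FORBIDDEN.search(body) is None


print(is_analyzable_select("/* report */ SELECT id FROM orders;"))  # True
print(is_analyzable_select("DELETE FROM orders"))                   # False
```

The check errs on the side of refusal: a semicolon inside a string literal or a leading WITH clause is rejected even though the statement may be harmless.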

Step-by-Step Implementation

The pipeline sends EXPLAIN ANALYZE to a replica, parses the returned iterator tree into physical metrics, prices those metrics with a pure function, and fans the work out concurrently with a per-query timeout and a cached fallback.

Step 1 — Read a real EXPLAIN ANALYZE tree

Run the statement manually first so you know the shape the parser must handle. Each iterator line carries an estimate block (cost=… rows=…) and, critically, an actual block (actual time=first..last rows=R loops=L); the sample output here shows only the actual blocks:

```
EXPLAIN ANALYZE
SELECT o.customer_id, SUM(o.total)
FROM orders o JOIN customers c ON c.id = o.customer_id
WHERE o.created_at >= '2026-06-01'
GROUP BY o.customer_id;
```

```
-> Group aggregate  (actual time=12.481..18.902 rows=214 loops=1)
    -> Nested loop inner join  (actual time=0.211..15.640 rows=8901 loops=1)
        -> Index range scan on o using idx_created  (actual time=0.088..4.902 rows=8901 loops=1)
        -> Single-row index lookup on c using PRIMARY  (actual time=0.001..0.001 rows=1 loops=8901)
```

The two facts the cost model needs: the root iterator’s last time multiplied by its loops is total wall time in milliseconds, and rows × loops summed over every line is the total row volume the plan moved (note the inner lookup ran 8901 times).

Step 2 — Parse the tree into metrics

The parser is pure and side-effect-free so it can be unit-tested against captured tree text. It extracts every actual-block, treats the first (outermost) line as the root for timing, and sums rows × loops across all lines as the I/O proxy.

```
import re
from dataclasses import dataclass

# Matches the "(actual time=first..last rows=R loops=L)" block on each iterator line.
_ACTUAL_RE = re.compile(
    r"actual time=(?P<first>\d+(?:\.\d+)?)\.\.(?P<last>\d+(?:\.\d+)?)"
    r" rows=(?P<rows>\d+) loops=(?P<loops>\d+)"
)


@dataclass(frozen=True)
class PlanMetrics:
    total_time_ms: float   # root last-row time * root loops = wall time
    rows_processed: int    # sum(rows * loops) across all iterators
    node_count: int        # iterators seen (0 => no actual block => not analyzed)


def parse_explain_tree(raw_text: str) -> PlanMetrics:
    """Reduce a TREE-format EXPLAIN ANALYZE output to physical cost metrics."""
    matches = list(_ACTUAL_RE.finditer(raw_text))
    if not matches:
        return PlanMetrics(total_time_ms=0.0, rows_processed=0, node_count=0)

    root = matches[0]  # outermost iterator is printed first
    total_time_ms = float(root["last"]) * int(root["loops"])
    rows_processed = sum(int(m["rows"]) * int(m["loops"]) for m in matches)
    return PlanMetrics(
        total_time_ms=total_time_ms,
        rows_processed=rows_processed,
        node_count=len(matches),
    )


# Expected against the Step 1 tree:
#   total_time_ms  -> 18.902  (18.902 * 1 loop)
#   rows_processed -> 214 + 8901 + 8901 + (1 * 8901) = 26917
#   node_count     -> 4
```
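
The parser above collapses the tree into totals. When you need to see which iterator is responsible for the row volume, a per-node breakdown helps. The sketch below is an illustrative companion, not part of the pipeline: `NodeVolume` and `node_volumes` are hypothetical names, depth is inferred by assuming the four-space indentation seen in the Step 1 output, and the label is simply the text before the first parenthesis, so labels that themselves contain parentheses are truncated.

```python
import re
from dataclasses import dataclass

_ACTUAL_RE = re.compile(
    r"actual time=(?P<first>\d+(?:\.\d+)?)\.\.(?P<last>\d+(?:\.\d+)?)"
    r" rows=(?P<rows>\d+) loops=(?P<loops>\d+)"
)

STEP1_TREE = """\
-> Group aggregate  (actual time=12.481..18.902 rows=214 loops=1)
    -> Nested loop inner join  (actual time=0.211..15.640 rows=8901 loops=1)
        -> Index range scan on o using idx_created  (actual time=0.088..4.902 rows=8901 loops=1)
        -> Single-row index lookup on c using PRIMARY  (actual time=0.001..0.001 rows=1 loops=8901)
"""


@dataclass(frozen=True)
class NodeVolume:
    depth: int        # nesting level, assuming 4 spaces per level
    label: str        # iterator description up to the first "("
    row_volume: int   # rows * loops for this iterator


def node_volumes(raw_text: str) -> list[NodeVolume]:
    """Return one entry per analyzed iterator, largest row volume first."""
    nodes = []
    for line in raw_text.splitlines():
        arrow = line.find("->")
        m = _ACTUAL_RE.search(line)
        if arrow < 0 or m is None:
            continue  # e.g. nodes reported as "(never executed)"
        label = line[arrow + 2:].split("(", 1)[0].strip()
        nodes.append(NodeVolume(arrow // 4, label, int(m["rows"]) * int(m["loops"])))
    return sorted(nodes, key=lambda n: n.row_volume, reverse=True)


for node in node_volumes(STEP1_TREE):
    print(node.depth, node.row_volume, node.label)
```

Sorting is stable, so iterators with equal volume keep their order in the tree.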

Step 3 — Price a parsed plan

Pricing is a second pure function so it can be unit-tested against known metrics and cached rates. The rates come from the platform’s normalized rate model, not a hard-coded constant — the same source that keeps a MySQL-on-RDS cost comparable to Postgres, via normalizing provider billing exports into a unified schema.

```
from dataclasses import dataclass


@dataclass(frozen=True)
class RateModel:
    cpu_per_ms: float   # $ per millisecond of query wall time
    io_per_row: float   # $ per row processed through the plan


@dataclass(frozen=True)
class QueryCost:
    digest: str
    total_time_ms: float
    rows_processed: int
    cost_usd: float
    source: str  # "live_explain" or "cached_fallback"


def price_plan(digest: str, metrics: PlanMetrics, rates: RateModel,
               source: str = "live_explain") -> QueryCost:
    """Apply C = (t_ms * L_root)*r_cpu + (sum rows*loops)*r_io."""
    cpu_cost = metrics.total_time_ms * rates.cpu_per_ms
    io_cost = metrics.rows_processed * rates.io_per_row
    return QueryCost(
        digest=digest,
        total_time_ms=metrics.total_time_ms,
        rows_processed=metrics.rows_processed,
        cost_usd=round(cpu_cost + io_cost, 8),
        source=source,
    )
```
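
Since the rates are meant to come from the normalized rate model rather than a constant, the sketch below shows one way they might be loaded and validated. It is an assumption throughout: the JSON shape (`cpu_per_ms` and `io_per_row` keys) and the helper name `rate_model_from_json` are hypothetical, and `RateModel` is repeated only so the block runs on its own.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class RateModel:
    cpu_per_ms: float   # $ per millisecond of query wall time
    io_per_row: float   # $ per row processed through the plan


def rate_model_from_json(text: str) -> RateModel:
    """Build a RateModel from a JSON document with the assumed two keys."""
    doc = json.loads(text)
    rates = RateModel(
        cpu_per_ms=float(doc["cpu_per_ms"]),
        io_per_row=float(doc["io_per_row"]),
    )
    if rates.cpu_per_ms < 0 or rates.io_per_row < 0:
        raise ValueError("rates must be non-negative")
    return rates


rates = rate_model_from_json('{"cpu_per_ms": 0.0000025, "io_per_row": 0.00000012}')
print(rates)
```

Rejecting negative rates at load time keeps a bad export from producing negative cost records downstream.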

Step 4 — Execute concurrently with timeout and fallback

Running EXPLAIN ANALYZE synchronously across a batch adds unacceptable tail latency, so the collector fans out with asyncmy, bounds each execution with asyncio.wait_for, and degrades to a cached-cost record instead of stalling the loop when a query times out. That circuit-breaker posture is the same one described in fallback routing for cost APIs, and the bounded-concurrency shape mirrors the async, semaphore-controlled parsing workflows used across the rest of the platform.

```
import asyncio
import dataclasses
import logging
from typing import Optional

import asyncmy

logging.basicConfig(level=logging.INFO, format="%(asctime)s | %(levelname)s | %(message)s")
logger = logging.getLogger("mysql_cost_attribution")


class ExplainCostCollector:
    def __init__(self, rates: RateModel, timeout_s: float = 5.0, **conn_kwargs):
        self.rates = rates
        self.timeout_s = timeout_s
        self.conn_kwargs = conn_kwargs  # host, port, user, password, database
        self.pool: Optional[asyncmy.Pool] = None
        self._cost_cache: dict[str, QueryCost] = {}  # last good QueryCost per digest

    async def initialize(self) -> None:
        self.pool = await asyncmy.create_pool(minsize=2, maxsize=8, **self.conn_kwargs)

    async def close(self) -> None:
        if self.pool:
            self.pool.close()
            await self.pool.wait_closed()

    async def _analyze_one(self, digest: str, sql: str) -> QueryCost:
        assert self.pool is not None, "call initialize() first"
        try:
            async with self.pool.acquire() as conn:
                async with conn.cursor() as cur:
                    # sql is interpolated verbatim: pass only trusted, SELECT-only text.
                    # The timeout is client-side; it does not kill the statement on the server.
                    await asyncio.wait_for(
                        cur.execute(f"EXPLAIN ANALYZE {sql}"),
                        timeout=self.timeout_s,
                    )
                    rows = await cur.fetchall()
                    raw = "\n".join(str(r[0]) for r in rows)
            cost = price_plan(digest, parse_explain_tree(raw), self.rates)
            self._cost_cache[digest] = cost  # warm the fallback cache
            return cost
        except asyncio.TimeoutError:
            logger.warning("digest %s exceeded %.1fs; using cached fallback", digest, self.timeout_s)
        except Exception as exc:  # driver / SQL errors are isolated, not fatal
            logger.error("EXPLAIN ANALYZE failed for %s: %s", digest, exc)
        return self._fallback(digest)

    def _fallback(self, digest: str) -> QueryCost:
        cached = self._cost_cache.get(digest)
        if cached is not None:
            # Copy the frozen record, changing only its provenance tag.
            return dataclasses.replace(cached, source="cached_fallback")
        # Nothing cached yet: emit a zero-cost record tagged as fallback.
        return QueryCost(digest=digest, total_time_ms=0.0, rows_processed=0,
                         cost_usd=0.0, source="cached_fallback")

    async def analyze_batch(self, workload: list[tuple[str, str]],
                            max_concurrency: int = 4) -> list[QueryCost]:
        sem = asyncio.Semaphore(max_concurrency)  # cap live EXPLAIN executions

        async def guarded(digest: str, sql: str) -> QueryCost:
            async with sem:
                return await self._analyze_one(digest, sql)

        return await asyncio.gather(*(guarded(d, s) for d, s in workload))


async def main() -> None:
    collector = ExplainCostCollector(
        rates=RateModel(cpu_per_ms=0.0000025, io_per_row=0.00000012),
        host="replica.internal", port=3306,
        user="cost_reader", password="<rotated-secret>", database="appdb",
    )
    await collector.initialize()
    try:
        workload = [
            # The Step 1 query, keyed by its digest.
            ("d1f3", "SELECT o.customer_id, SUM(o.total) "
                     "FROM orders o JOIN customers c ON c.id = o.customer_id "
                     "WHERE o.created_at >= '2026-06-01' GROUP BY o.customer_id"),
        ]
        for cost in await collector.analyze_batch(workload):
            logger.info("digest=%s cost=$%.6f rows=%d source=%s",
                        cost.digest, cost.cost_usd, cost.rows_processed, cost.source)
    finally:
        await collector.close()


if __name__ == "__main__":
    asyncio.run(main())
```

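Illustrative log line, assuming the analyzed statement returns the Step 1 tree, at the rates above (18.902 ms × 0.0000025 + 26917 rows × 0.00000012 ≈ $0.003277):

```
2026-07-05 09:14:22,061 | INFO | digest=d1f3 cost=$0.003277 rows=26917 source=live_explain
```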
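
The timeout-and-fallback behaviour can be exercised without a database. The sketch below reduces the collector's pattern to a single coroutine so the three outcomes (live result, cached fallback, empty fallback) are easy to see. The names `cost_with_fallback`, `fake_fast`, and `fake_slow` are hypothetical stand-ins, and the costs are plain floats rather than QueryCost records.

```python
import asyncio
from typing import Awaitable, Callable


async def cost_with_fallback(digest: str,
                             run_live: Callable[[], Awaitable[float]],
                             cache: dict[str, float],
                             timeout_s: float) -> tuple[float, str]:
    """Return (cost, source); fall back to the cache when the live call is slow."""
    try:
        cost = await asyncio.wait_for(run_live(), timeout=timeout_s)
    except asyncio.TimeoutError:
        # Serve the last known cost, or zero when nothing has been cached yet.
        return cache.get(digest, 0.0), "cached_fallback"
    cache[digest] = cost  # warm the cache for later fallbacks
    return cost, "live_explain"


async def fake_fast() -> float:
    return 0.5


async def fake_slow() -> float:
    await asyncio.sleep(1)  # longer than the timeout used in demo()
    return 9.9


async def demo() -> list[tuple[float, str]]:
    cache: dict[str, float] = {}
    return [
        await cost_with_fallback("d1", fake_fast, cache, timeout_s=0.05),
        await cost_with_fallback("d1", fake_slow, cache, timeout_s=0.05),
        await cost_with_fallback("d2", fake_slow, cache, timeout_s=0.05),
    ]


results = asyncio.run(demo())
print(results)
# [(0.5, 'live_explain'), (0.5, 'cached_fallback'), (0.0, 'cached_fallback')]
```

The third result shows why a zero cost tagged cached_fallback deserves an alert: it means the digest has never been analyzed successfully.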

Verification

Confirm the attribution is correct before wiring it into any dashboard or quota policy.

Assert the parser against a captured tree. Parsing is deterministic, so pin it with a fixture:

```
tree = (
    "-> Group aggregate  (actual time=12.481..18.902 rows=214 loops=1)\n"
    "    -> Index range scan on o  (actual time=0.088..4.902 rows=8901 loops=1)\n"
    "    -> Single-row index lookup on c  (actual time=0.001..0.001 rows=1 loops=8901)"
)
m = parse_explain_tree(tree)
assert m.total_time_ms == 18.902
assert m.rows_processed == 214 + 8901 + 8901  # 18016
assert m.node_count == 3
```

Reconcile wall time against performance_schema. The parsed total_time_ms should track the digest’s average latency. Cross-check:
```
SELECT DIGEST_TEXT,
       ROUND(AVG_TIMER_WAIT / 1e9, 3) AS avg_ms   -- picoseconds -> ms
FROM performance_schema.events_statements_summary_by_digest
WHERE DIGEST = '<digest>' \G
```
A parsed time within roughly 2x of avg_ms is healthy; a large gap means the analyzed run hit a cold buffer pool (see Gotchas).
Check the record shape. Every emitted QueryCost should carry a non-empty source of either live_explain or cached_fallback, a non-negative cost_usd, and a rows_processed that is zero only when no actual block was parsed or when a fallback fired with nothing cached for that digest.

Gotchas & Edge Cases

rows is emitted-per-loop, not “rows examined.” Each iterator’s rows is the average count it produced on a single loop, which is why the model multiplies by loops. True examined-row totals (including rows filtered out) come from the session’s Handler_read* status counters, not the tree — reach for those if your I/O attribution must count discarded rows.
Actual time is cumulative and per-loop. The last value is the time to the final row on one iteration; a child node’s time is included in its parent’s. Summing times across nodes double-counts — only the root’s last × loops is total wall time.
Cold cache inflates a single run. The first execution after a restart reads from disk (Innodb_buffer_pool_reads); the next reads from the buffer pool. Report cost as a windowed average across several runs, or warm the query once before the measured run, so a cache miss does not look like a permanently expensive query — the same cache-warmth discipline that governs per-query cost modeling generally.
It runs the statement for real. EXPLAIN ANALYZE INSERT/UPDATE/DELETE executes the mutation. Guard the collector to SELECT-only workloads and point it at a read replica.
Output format drift. MySQL has adjusted iterator wording between minor versions, so node descriptions such as Index range scan on … are not a stable match target. Pin the engine version in your pipeline and match on the stable actual time=… block, as the regex here does, rather than on node prose.
Serverless and burst pricing. On Aurora MySQL Serverless v2 the effective cpu_per_ms tracks ACU scaling, so the rate is time-varying; sample the rate at analysis time rather than assuming a constant floor rate.
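
For the cold-cache case, one simple way to report a windowed average is to discard the first run as a warm-up and average the rest. The sketch below assumes that policy; `windowed_cost` and the `warmup_runs` parameter are hypothetical, and the sample costs are made-up values chosen to show a cold first run.

```python
from statistics import mean


def windowed_cost(samples: list[float], warmup_runs: int = 1) -> float:
    """Average per-run costs after dropping the leading warm-up runs."""
    if warmup_runs < 0:
        raise ValueError("warmup_runs must be non-negative")
    measured = samples[warmup_runs:]
    if not measured:
        raise ValueError("need at least one run after warm-up")
    return mean(measured)


# Illustrative per-run costs in USD: the first run hit a cold buffer pool.
runs = [0.0120, 0.0033, 0.0032, 0.0034]
print(f"{windowed_cost(runs):.4f}")  # 0.0033
```

Whether to discard one run or several depends on how quickly the working set settles into the buffer pool, so treat `warmup_runs` as a tunable.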

Frequently Asked Questions

How is EXPLAIN ANALYZE different from a plain EXPLAIN for costing?

Plain EXPLAIN returns the optimizer’s estimated cost and row counts without running the query — useful for plan shape, useless for real spend because the estimates drift from runtime. EXPLAIN ANALYZE executes the statement and reports measured time, actual rows, and loop counts per iterator, which are the numbers you can defensibly turn into a dollar figure.

Does running EXPLAIN ANALYZE change my data?

Only if the analyzed statement itself mutates data. For a SELECT it is read-only. For INSERT, UPDATE, or DELETE, EXPLAIN ANALYZE runs the statement and applies its effects, so restrict the collector to read workloads and run it against a replica.

Why multiply rows by loops instead of using the top-line rows?

Because a nested-loop inner side runs once per outer row. A single-row lookup showing rows=1 loops=8901 touched 8901 rows in total, not one. Multiplying rows × loops per node and summing captures the true volume the plan pushed through, which is the honest I/O and CPU-work proxy.

Which MySQL privileges does the attribution user actually need?

The same SELECT privileges as the workloads it analyzes (because the statement runs), plus SELECT on performance_schema to enumerate digests. It needs no write, PROCESS, or admin rights. Assign it to a dedicated rotated credential, never an application or superuser account.

How do I keep quota enforcement running when a query times out?

The collector caches the last successful QueryCost per digest and returns it tagged cached_fallback when a live EXPLAIN ANALYZE exceeds its timeout or errors. Enforcement keeps evaluating against the last known cost instead of stalling, and the source field lets you alert when fallback usage climbs — the same graceful-degradation contract as fallback routing for cost APIs.

Related

Modeling CPU time vs query cost in PostgreSQL — the Postgres-side equivalent, correcting the dimensionless planner cost against real vCPU-seconds.
Query Execution Cost Modeling — the parent topic covering per-statement attribution across engines.
Database Quota Boundary Design — turning the per-query cost this page produces into hard and soft enforcement tiers.
Multi-Cloud Cost Normalization — the canonical rate model that makes the MySQL cost comparable to other engines.
