How to Separate Compute and Storage Costs in Azure SQL

Azure SQL Database and Managed Instance invoices arrive as a single blended figure, and this page walks through the exact Python you need to split that figure into deterministic compute and storage buckets from Cost Management meter records.

In Azure, compute cost scales with provisioned vCores, DTU allocations, or serverless compute-seconds, while storage cost accrues from data files, transaction logs, and automated backups. The Cost Management API exposes each of these as a distinct meter, but raw exports frequently roll them up under identical resource IDs, so the split only becomes reliable once you filter deterministically on meterCategory, meterSubCategory, and meterName. This is the same compute-versus-storage disaggregation that underpins every downstream chargeback model — and the classifier you build here is a reusable template for the broader work of normalizing provider billing exports into a unified schema.

At a billing level the two dimensions compose linearly, which is what makes a clean split worth the effort:

$$C_{db} = \sum (\text{vCore-hours} \times r_{compute}) + \sum (\text{GB-month} \times r_{storage}) + C_{backup}$$

The whole task is to recover the first term and the second-plus-third terms separately from records that Azure delivers pre-summed.
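To make the linear composition concrete, here is a minimal sketch that evaluates the formula for a single database. The rates and usage figures are made-up placeholders rather than Azure list prices, and `DbUsage` and `blended_cost` are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class DbUsage:
    vcore_hours: float        # compute consumption for the period
    storage_gb_months: float  # data and log storage, in GB-months
    backup_cost: float        # C_backup, already expressed in currency units


def blended_cost(usage: DbUsage, r_compute: float, r_storage: float) -> dict:
    """Evaluate C_db and keep the compute and storage terms separate."""
    compute = usage.vcore_hours * r_compute
    # Backup is grouped with storage, matching the "second-plus-third terms".
    storage = usage.storage_gb_months * r_storage + usage.backup_cost
    return {
        "compute": round(compute, 2),
        "storage": round(storage, 2),
        "total": round(compute + storage, 2),
    }


# Placeholder inputs: 1488 vCore-hours, 250 GB-months, 12.50 of backup.
usage = DbUsage(vcore_hours=1488.0, storage_gb_months=250.0, backup_cost=12.5)
split = blended_cost(usage, r_compute=0.5, r_storage=0.125)
print(split)  # {'compute': 744.0, 'storage': 43.75, 'total': 787.75}
```

The invoice only shows the `total` value; the rest of this page is about recovering the other two from meter records.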

Prerequisites

Before running the pipeline, confirm the following are in place.

Azure RBAC: the identity running the job needs read access to Cost Management for the target scope. Assign the built-in Cost Management Reader role (or Billing Reader at the billing-account scope). Least-privilege enforcement here is part of broader access control for cost data; never reuse an Owner or Contributor credential for read-only extraction.
```
{
  "properties": {
    "roleDefinitionId": "/providers/Microsoft.Authorization/roleDefinitions/72fafb9e-0641-4937-9268-a91bfd8191a3",
    "principalId": "<managed-identity-object-id>",
    "principalType": "ServicePrincipal"
  }
}
```
The roleDefinitionId GUID above is the fixed ID of Cost Management Reader; scope the assignment to /subscriptions/<sub-id>.
Python: 3.10 or newer (the code uses type hints and asyncio.run).
Libraries: install the async HTTP client and the Azure identity package.
```
pip install "aiohttp>=3.9" "azure-identity>=1.16"
```
Auth context: in production, attach a user-assigned Managed Identity to the host (Azure Function, Container App, or VM). DefaultAzureCredential will pick it up automatically; locally it falls back to your az login session.
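The version requirements above can be checked before the pipeline starts. The following is a rough preflight sketch, not part of the original pipeline: `preflight` and `major_minor` are hypothetical helpers, and the check covers only the Python and package versions, not RBAC or credentials.

```python
import re
import sys
from importlib import metadata
from typing import Dict, List, Optional, Tuple

MIN_PYTHON = (3, 10)
# Distribution names and minimum (major, minor) versions from the pip command.
REQUIRED = {"aiohttp": (3, 9), "azure-identity": (1, 16)}


def major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '3.9.5'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if not match:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def preflight(
    python_version: Optional[Tuple[int, int]] = None,
    installed: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Return a list of problems; an empty list means the host looks ready.

    Both arguments exist so the check can be exercised without touching the
    real environment; by default it inspects the running interpreter.
    """
    problems: List[str] = []
    current = tuple(python_version or sys.version_info[:2])
    if current < MIN_PYTHON:
        problems.append(f"Python {current[0]}.{current[1]} is older than 3.10")
    for dist, minimum in REQUIRED.items():
        if installed is not None:
            found = installed.get(dist)
        else:
            try:
                found = metadata.version(dist)
            except metadata.PackageNotFoundError:
                found = None
        if found is None:
            problems.append(f"{dist} is not installed")
        elif major_minor(found) < minimum:
            problems.append(f"{dist} {found} is older than {minimum[0]}.{minimum[1]}")
    return problems


print(preflight())  # lists anything missing on the current host
```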

Step-by-Step Implementation

The pipeline posts a grouped query to the Cost Management Query API, classifies each returned row by meter, and aggregates into compute, storage, and other buckets. It handles token acquisition, nextLink pagination, and exponential backoff on throttling — the same resilience posture described in fallback routing for cost APIs.

Step 1 — Classify a meter deterministically

The classifier is pure and side-effect-free so it can be unit-tested in isolation. Compute keywords are checked before storage keywords because the string "Storage" appears in both "Data Storage" and, occasionally, compute-adjacent meters.

```python
def classify_meter(meter_name: str, meter_category: str) -> str:
    """Route an Azure SQL meter into compute / storage / other."""
    if meter_category != "SQL Database":
        return "other"
    name = meter_name.lower()
    compute_keywords = ("vcore", "dtu", "serverless", "compute")
    # "data stored" is listed explicitly: it does not contain "storage".
    storage_keywords = ("data stored", "data storage", "log storage", "backup storage", "storage")
    # Compute is checked first, so a name matching both lists lands in compute.
    if any(kw in name for kw in compute_keywords):
        return "compute"
    if any(kw in name for kw in storage_keywords):
        return "storage"
    return "other"

# Expected:
#   classify_meter("vCore", "SQL Database")        -> "compute"
#   classify_meter("Data Stored", "SQL Database")  -> "storage"
#   classify_meter("PITR Backup Storage", "SQL Database") -> "storage"
```
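Because the classifier is pure, a table-driven check is enough to pin its behaviour. The sketch below repeats a compact copy of the classifier so that it runs standalone (with "data stored" listed explicitly, since that meter name does not contain "storage"). The meter names in `CASES` are illustrative samples, not an exhaustive list of Azure meters, and `run_cases` is a hypothetical helper.

```python
def classify_meter(meter_name: str, meter_category: str) -> str:
    """Compact standalone copy of the compute / storage / other classifier."""
    if meter_category != "SQL Database":
        return "other"
    name = meter_name.lower()
    if any(kw in name for kw in ("vcore", "dtu", "serverless", "compute")):
        return "compute"
    if any(kw in name for kw in ("data stored", "storage")):
        return "storage"
    return "other"


# (meter name, meter category, expected bucket); sample values only.
CASES = [
    ("vCore", "SQL Database", "compute"),
    ("10 DTUs", "SQL Database", "compute"),
    ("Data Stored", "SQL Database", "storage"),
    ("PITR Backup Storage", "SQL Database", "storage"),
    ("vCore", "Virtual Machines", "other"),
    ("Standard IPv4 Static Public IP", "Virtual Network", "other"),
]


def run_cases(classify=classify_meter) -> list:
    """Return every case whose actual bucket differs from the expected one."""
    failures = []
    for name, category, expected in CASES:
        actual = classify(name, category)
        if actual != expected:
            failures.append((name, category, expected, actual))
    return failures


print(run_cases())  # an empty list means every case matched
```

Adding a row to `CASES` whenever a new meter name shows up in a real export keeps the keyword lists honest over time.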

Step 2 — Fetch a page with backoff

Cost Management returns HTTP 429 with a Retry-After hint under load and 5xx during transient outages. Both are retried with exponential backoff; everything else fails fast.

```python
import asyncio
import logging
import aiohttp

MAX_RETRIES = 3      # retries allowed after the first attempt
RETRY_DELAY = 2.0    # base delay in seconds; doubles on each retry

async def fetch_cost_page(
    session: aiohttp.ClientSession,
    token: str,
    url: str,
    payload: dict,
    retry_count: int = 0,
) -> dict:
    """POST one page to the Query API, retrying on 429/5xx."""
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    timeout = aiohttp.ClientTimeout(total=30)
    async with session.post(url, headers=headers, json=payload, timeout=timeout) as resp:
        if resp.status == 429 or resp.status >= 500:
            if retry_count < MAX_RETRIES:
                # Honor a numeric Retry-After hint; otherwise back off exponentially.
                hint = resp.headers.get("Retry-After", "")
                delay = float(hint) if hint.isdigit() else RETRY_DELAY * (2 ** retry_count)
                logging.warning("Throttled/5xx (%s). Retrying in %.1fs", resp.status, delay)
                await asyncio.sleep(delay)
                return await fetch_cost_page(session, token, url, payload, retry_count + 1)
            raise RuntimeError(f"Max retries exceeded ({resp.status}) for {url}")
        # Any other 4xx is a caller error: fail fast instead of retrying.
        resp.raise_for_status()
        return await resp.json()
```
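The delay calculation is easier to reason about when it is pulled out into a pure function. This is a sketch under a few assumptions that are not in the pipeline above: `backoff_delay` is a hypothetical helper, and the 60-second cap and optional jitter are additions for illustration.

```python
import random
from typing import Optional


def backoff_delay(
    retry_count: int,
    base: float = 2.0,
    retry_after: Optional[str] = None,
    cap: float = 60.0,
    jitter: float = 0.0,
) -> float:
    """Seconds to wait before retry number `retry_count` (0-based).

    A numeric Retry-After value wins over the computed delay. `cap` and
    `jitter` are illustrative extras, not behaviour of the Query API.
    """
    if retry_after is not None and retry_after.strip().isdigit():
        return min(float(retry_after), cap)
    delay = min(base * (2 ** retry_count), cap)
    if jitter:
        # Spread concurrent clients out by up to `jitter` * delay seconds.
        delay += random.uniform(0, jitter * delay)
    return delay


schedule = [backoff_delay(n) for n in range(3)]
print(schedule)  # [2.0, 4.0, 8.0]
```

With a base of 2 seconds and three retries, the worst case adds 14 seconds of waiting per page before the request is abandoned.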

Step 3 — Orchestrate query, pagination, and aggregation

The Query API groups by ResourceId, MeterCategory, and MeterName, so each row carries the meter fields the classifier needs. Pagination follows properties.nextLink until it is absent.

The retry-and-paginate flow below shows the request lifecycle from token acquisition through the final bucketed total.

```python
import os
from urllib.parse import urlencode
from typing import Dict, Optional
from azure.identity import DefaultAzureCredential

AZURE_COST_MGMT_BASE = "https://management.azure.com"
API_VERSION = "2025-03-01"

async def run_cost_attribution_pipeline(
    subscription_id: str,
    start_date: str,
    end_date: str,
) -> Dict[str, float]:
    """Extract, classify, and aggregate Azure SQL cost by dimension."""
    credential = DefaultAzureCredential()
    token = credential.get_token("https://management.azure.com/.default").token

    query_payload = {
        "type": "ActualCost",
        "timeframe": "Custom",
        "timePeriod": {"from": f"{start_date}T00:00:00Z", "to": f"{end_date}T00:00:00Z"},
        "dataset": {
            "granularity": "Daily",
            "aggregation": {"totalCost": {"name": "PreTaxCost", "function": "Sum"}},
            "grouping": [
                {"type": "Dimension", "name": "ResourceId"},
                {"type": "Dimension", "name": "MeterCategory"},
                {"type": "Dimension", "name": "MeterName"},
            ],
        },
    }

    base_url = (
        f"{AZURE_COST_MGMT_BASE}/subscriptions/{subscription_id}"
        f"/providers/Microsoft.CostManagement/query"
    )
    query_url = f"{base_url}?{urlencode({'api-version': API_VERSION})}"
    breakdown: Dict[str, float] = {"compute": 0.0, "storage": 0.0, "other": 0.0}

    async with aiohttp.ClientSession() as session:
        next_url: Optional[str] = query_url
        while next_url:
            response = await fetch_cost_page(session, token, next_url, query_payload)
            props = response.get("properties", {})
            # Columns are looked up by name, never by position. With Daily
            # granularity the response also carries a UsageDate column next to
            # PreTaxCost, ResourceId, MeterCategory, MeterName, and Currency.
            cols = [c["name"] for c in props.get("columns", [])]
            idx = {name: i for i, name in enumerate(cols)}
            for row in props.get("rows", []):
                category = row[idx["MeterCategory"]]
                name = row[idx["MeterName"]]
                cost = float(row[idx["PreTaxCost"]])
                breakdown[classify_meter(name, category)] += cost

            # An absent or empty nextLink ends the loop.
            next_url = props.get("nextLink") or None
            logging.info("Running totals: %s", breakdown)

    return breakdown

if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    sub = os.environ["AZURE_SUBSCRIPTION_ID"]
    result = asyncio.run(run_cost_attribution_pipeline(sub, "2026-05-01", "2026-06-01"))
    print(result)
```

Reading the column index from properties.columns rather than a hard-coded position is deliberate: Azure has reordered Query API columns between API versions, and positional indexing silently misattributes cost when that happens.
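The effect of name-based lookup can be shown on two hand-written payloads that differ only in column order. This is a minimal sketch: the payloads are fabricated samples shaped like the `properties` object, and `sum_by_meter` is a hypothetical helper rather than part of the pipeline.

```python
def sum_by_meter(properties: dict) -> dict:
    """Total cost per MeterName, locating columns by name instead of position."""
    cols = [c["name"] for c in properties.get("columns", [])]
    idx = {name: i for i, name in enumerate(cols)}
    missing = {"PreTaxCost", "MeterName"} - set(idx)
    if missing:
        # Fail loudly rather than read the wrong column.
        raise KeyError(f"response is missing columns: {sorted(missing)}")
    totals: dict = {}
    for row in properties.get("rows", []):
        meter = row[idx["MeterName"]]
        totals[meter] = totals.get(meter, 0.0) + float(row[idx["PreTaxCost"]])
    return totals


# Same data, two column orders (sample values only).
page_a = {
    "columns": [{"name": "PreTaxCost"}, {"name": "MeterName"}],
    "rows": [[10.0, "vCore"], [2.5, "Data Stored"]],
}
page_b = {
    "columns": [{"name": "MeterName"}, {"name": "PreTaxCost"}],
    "rows": [["vCore", 10.0], ["Data Stored", 2.5]],
}

print(sum_by_meter(page_a) == sum_by_meter(page_b))  # True
```

A positional reader written against `page_a` would try to convert "vCore" to a float on `page_b`; the name-based version either works or raises a clear `KeyError`.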

Expected output for a subscription with two provisioned databases and one serverless database:

```
{'compute': 1843.52, 'storage': 412.17, 'other': 6.40}
```

Verification

Confirm the split is correct before wiring it into any dashboard.

Reconcile the total. The three buckets must sum to the invoice subtotal for the same scope and period. A quick assertion:

```python
result = asyncio.run(run_cost_attribution_pipeline(sub, "2026-05-01", "2026-06-01"))
assert abs(sum(result.values()) - 2262.09) < 0.01, "buckets do not reconcile to invoice"
```

Cross-check against the portal. In Cost Management > Cost analysis, group by Meter and filter Service name = SQL Database. The sum of vCore/Serverless/DTU meters should equal your compute bucket; the Data Stored, LTR Backup Storage, and PITR Backup Storage meters should equal storage.
Inspect the other bucket. Because the query is not filtered to SQL meters, other absorbs every non-SQL meter in the scope as well as any SQL meter the keyword lists miss. Print the raw MeterCategory and MeterName values landing in other, and extend the keyword lists for any that belong to Azure SQL. In a scope that holds only Azure SQL resources, other should be near zero (it captures things like IPv4 charges, not database resource consumption).
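The reconciliation and the inspection of other can both be scripted. The sketch below is one possible shape, assuming rows and a name-to-position index like those built in Step 3; `reconciles` and `unclassified_meters` are hypothetical helpers, and the sample rows are fabricated.

```python
import math
from collections import defaultdict
from typing import Callable, Dict, Iterable, Sequence


def reconciles(breakdown: Dict[str, float], invoice_total: float,
               tolerance: float = 0.01) -> bool:
    """True when the buckets sum to the invoice figure within `tolerance`."""
    return math.isclose(sum(breakdown.values()), invoice_total, abs_tol=tolerance)


def unclassified_meters(
    rows: Iterable[Sequence],
    idx: Dict[str, int],
    classify: Callable[[str, str], str],
) -> Dict[str, float]:
    """Cost per 'category / meter' for every row the classifier sends to other."""
    leftovers: Dict[str, float] = defaultdict(float)
    for row in rows:
        category = row[idx["MeterCategory"]]
        name = row[idx["MeterName"]]
        if classify(name, category) == "other":
            leftovers[f"{category} / {name}"] += float(row[idx["PreTaxCost"]])
    # Largest leftovers first, so the most expensive gaps are seen first.
    return dict(sorted(leftovers.items(), key=lambda kv: kv[1], reverse=True))


# Fabricated sample rows and a stand-in classifier for demonstration.
idx = {"PreTaxCost": 0, "MeterCategory": 1, "MeterName": 2}
rows = [
    [100.0, "SQL Database", "vCore"],
    [4.0, "Virtual Network", "Standard IPv4 Static Public IP"],
    [2.4, "Virtual Network", "Standard IPv4 Static Public IP"],
    [9.0, "SQL Database", "Mystery Meter"],
]


def stand_in(name: str, category: str) -> str:
    """Stand-in classifier: only the vCore meter is recognised."""
    return "compute" if name == "vCore" else "other"


print(unclassified_meters(rows, idx, stand_in))
print(reconciles({"compute": 1843.52, "storage": 412.17, "other": 6.40}, 2262.09))
```

Any SQL Database entry in the first output is a candidate for a new keyword; entries from other categories are expected whenever the scope contains non-SQL resources.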

Gotchas & Edge Cases

Data Stored vs Data Storage. Single-database and elastic-pool storage meters are named Data Stored, not Data Storage. The lowercase string "data stored" does not contain the substring "storage", so a keyword list built only around "storage" sends it to other; "data stored" has to be listed as a keyword in its own right. The same applies if you tighten the match to exact strings: omit Data Stored and pool storage silently drops into other.
Backup storage is billed separately from data storage. Point-in-time restore (PITR Backup Storage) and long-term retention (LTR Backup Storage) are distinct meters. Both belong in storage, but if your chargeback model treats backup as a separate line, split it out with a dedicated bucket before the compute/storage rollup.
Serverless auto-pause. A paused serverless database bills zero compute but continues to bill storage. Never infer “database is idle, downsize compute” from a low storage figure; correlate against actual compute-seconds — the same discipline used in query execution cost modeling.
Managed Instance uses SQL Managed Instance, not SQL Database. If your estate includes Managed Instance, add that meterCategory to the guard clause in classify_meter, or its cost lands entirely in other.
Elastic pools bill at the pool, not the member database. Cost attributes to the pool resource ID; per-database allocation inside a pool requires a second apportionment step against each database’s DTU/vCore usage — feed that split into your quota boundary policies so a noisy tenant cannot silently consume the pool’s budget.
Meter latency. Cost Management data can lag actual consumption by 8–24 hours. Run reconciliation against a closed period (yesterday or earlier), never the current day.
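Two of the gotchas above, a separate backup line and Managed Instance coverage, can be handled by a variant of the classifier. The following is a sketch, not the pipeline's classifier: `classify_meter_detailed` and `rollup` are hypothetical names, the category string "SQL Managed Instance" is taken from the note above, and checking "backup" before the other keywords is a design assumption made here.

```python
from typing import Dict

SQL_CATEGORIES = {"SQL Database", "SQL Managed Instance"}


def classify_meter_detailed(meter_name: str, meter_category: str) -> str:
    """Four-way split: compute / storage / backup / other."""
    if meter_category not in SQL_CATEGORIES:
        return "other"
    name = meter_name.lower()
    # Backup is checked first so PITR and LTR meters get their own bucket.
    if "backup" in name:
        return "backup"
    if any(kw in name for kw in ("vcore", "dtu", "serverless", "compute")):
        return "compute"
    if any(kw in name for kw in ("data stored", "storage")):
        return "storage"
    return "other"


def rollup(detailed: Dict[str, float]) -> Dict[str, float]:
    """Fold the backup bucket into storage for the two-way report."""
    return {
        "compute": detailed.get("compute", 0.0),
        "storage": detailed.get("storage", 0.0) + detailed.get("backup", 0.0),
        "other": detailed.get("other", 0.0),
    }


print(classify_meter_detailed("LTR Backup Storage", "SQL Database"))  # backup
print(classify_meter_detailed("vCore", "SQL Managed Instance"))        # compute
print(rollup({"compute": 10.0, "storage": 3.0, "backup": 1.5, "other": 0.0}))
```

Keeping the four-way totals and deriving the two-way view with `rollup` means the chargeback model can change its mind about backup without reprocessing the export.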

Frequently Asked Questions

Why not use the UsageDetails API instead of the Query API?

Both work. UsageDetails returns one record per meter per day and is ideal for line-item audit, but it is chattier and paginates heavily. The Query API pre-aggregates server-side, which is faster for a compute-versus-storage rollup. Use UsageDetails only when you need per-resource forensic detail the grouped query cannot provide.

How do I attribute cost inside a shared elastic pool?

The invoice charges the pool, so start from the pool total and apportion it across member databases by their share of consumed DTUs or vCore-seconds (from sys.dm_db_resource_stats). Multiply each database’s usage fraction by the pool’s compute cost; storage is apportioned by allocated Data Stored per database.
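The apportionment described above can be sketched as a small function. This assumes the per-database usage and stored-GB figures have already been collected elsewhere; `apportion_pool_cost`, the tenant names, and all numbers are hypothetical.

```python
from typing import Dict


def apportion_pool_cost(
    pool_compute_cost: float,
    pool_storage_cost: float,
    compute_usage: Dict[str, float],
    storage_gb: Dict[str, float],
) -> Dict[str, Dict[str, float]]:
    """Split a pool's compute and storage cost across its member databases.

    Compute follows each database's share of consumed DTUs or vCore-seconds;
    storage follows each database's share of stored GB.
    """
    total_usage = sum(compute_usage.values())
    total_gb = sum(storage_gb.values())
    result: Dict[str, Dict[str, float]] = {}
    for db in sorted(set(compute_usage) | set(storage_gb)):
        # Guard against division by zero when a pool was idle or empty.
        usage_share = compute_usage.get(db, 0.0) / total_usage if total_usage else 0.0
        gb_share = storage_gb.get(db, 0.0) / total_gb if total_gb else 0.0
        result[db] = {
            "compute": round(pool_compute_cost * usage_share, 2),
            "storage": round(pool_storage_cost * gb_share, 2),
        }
    return result


shares = apportion_pool_cost(
    pool_compute_cost=1000.0,
    pool_storage_cost=200.0,
    compute_usage={"tenant_a": 600.0, "tenant_b": 300.0, "tenant_c": 100.0},
    storage_gb={"tenant_a": 50.0, "tenant_b": 30.0, "tenant_c": 20.0},
)
print(shares["tenant_a"])  # {'compute': 600.0, 'storage': 100.0}
```

Rounding each share to cents can leave a residue of a cent or two against the pool total; assigning that residue to the largest consumer is one common way to keep the apportionment reconcilable.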

Does serverless compute show up as compute or storage?

Compute. Serverless billing appears under meters containing Serverless, which the classifier maps to the compute bucket. The storage a serverless database occupies is billed independently and continues even while compute is auto-paused.

Which role is the minimum to run this in production?

Cost Management Reader scoped to the subscription (or resource group) you are attributing. It grants read-only access to cost and usage data and nothing else — no ability to read database contents or modify resources. Assign it to a managed identity, not a personal account.

Why does my other bucket keep growing month over month?

New meter names appear when Azure introduces SKUs or renames existing ones (Hyperscale named replicas and zone-redundant backup are common culprits). Treat a rising other as a classifier-drift alarm: log the distinct MeterName values landing there and extend the keyword lists, rather than letting real cost hide in the fallback bucket.

Related

Tracking IOPS vs baseline storage spend in RDS: the AWS-side equivalent, correlating provisioned IOPS against actual storage spend.
Building async Python parsers for AWS Cost Explorer — the same async extraction pattern applied to Cost Explorer.
Graceful degradation when billing APIs are down — hardening the pipeline against Cost Management outages.
Compute vs Storage Cost Breakdowns — the parent topic covering disaggregation across all managed database engines.
