Automating ETRM sync with Python requests

A settlement batch that silently drops half its rows because an access token expired on page 7 of a paginated pull is the exact failure this page is built to prevent. It describes a synchronous requests-based sync of ETRM settlement records that keeps its credentials fresh, retries only idempotent reads, and hashes every page for reconciliation before a single row reaches the ledger. This is a transport-layer concern within the ETRM API Integration Patterns component, which itself sits under the broader Trade Ingestion & Matching Workflows domain. Where that parent page covers the async, circuit-broken fetcher, this page owns the synchronous request-per-page loop that most desks reach for first, along with the credential and pagination discipline that keeps it from stranding trades.

In the paginated sync loop, the client re-resolves the auth header on each page so that a proactively refreshed token replaces an expiring one, then hashes and yields each batch until the API reports no next page.

Prerequisites

Before running the sync, provision the following. The credential and network-perimeter concerns underneath — token rotation, mTLS, and scope enforcement — are owned by Building secure API gateways for ETRM sync; this page assumes the gateway is already in front of the vendor endpoint.

Dependency	Minimum version	Purpose
`requests`	2.31	HTTP client and connection-pooled session
`urllib3`	2.0	`Retry` strategy mounted on the `HTTPAdapter`
Python stdlib	3.10+	`decimal`, `hashlib`, `logging`, `datetime`

API keys / permissions: an OAuth 2.0 client-credentials pair (client_id + client_secret) scoped to trade:read settlement:write. The secret must be pulled from a vault at runtime, never committed.
Endpoints: the token URL (/oauth2/token) and a paginated settlements endpoint (/api/v2/settlements) that accepts start_date, end_date, page, and page_size query parameters and returns a results array plus a next_page flag.
Data contract: confirm which fields carry monetary and volumetric quantities (volume_mwh, price_usd) so they can be parsed as Decimal before any settlement arithmetic runs — floats are prohibited on the financial path. The full modeling rules live in Schema Validation Frameworks.
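One way to keep floats off the financial path at parse time is sketched below. It assumes the vendor sends volume_mwh and price_usd as JSON numbers rather than strings, and relies on the standard-library json.loads accepting a parse_float hook. The sample payload and the trade_id field are hypothetical.

```python
import json
from decimal import Decimal

# Hypothetical page body; field names follow the data contract above,
# the values and trade_id are made up for illustration.
raw_page = '{"results": [{"trade_id": "T-1", "volume_mwh": 12.5, "price_usd": 41.10}]}'

# Default parsing turns JSON numbers with a fractional part into binary floats.
float_row = json.loads(raw_page)["results"][0]

# parse_float receives the literal text of each JSON float, so Decimal
# is built from "41.10" directly and no float is created along the way.
decimal_row = json.loads(raw_page, parse_float=Decimal)["results"][0]

print(type(float_row["price_usd"]).__name__)    # float
print(type(decimal_row["price_usd"]).__name__)  # Decimal
print(decimal_row["price_usd"] * decimal_row["volume_mwh"])
```

Two caveats apply to this sketch. JSON integers such as 41 still parse as int unless parse_int is also supplied, and json.dumps cannot serialize Decimal by default, so any hashing step that re-serializes the parsed page would need something like default=str. Response.json() in requests forwards keyword arguments to the JSON decoder, so resp.json(parse_float=Decimal) should behave the same way, but confirm that against the requests version you deploy.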

Credential lifecycle and session persistence

The most frequent failure vector here is improper credential lifecycle management. Vendors enforce OAuth 2.0 or short-lived JWTs with strict expiration windows and rotating scopes. Hardcoding a static token leads to 401 Unauthorized or 403 Forbidden responses once the token expires or its scopes rotate, typically in the middle of a high-volume settlement window, and instantiating a new requests.Session() per call discards the pooled connections and cached credential that would have avoided the problem. A production sync needs a token-aware session that caches the credential in memory, parses expires_in, and refreshes proactively once the token crosses 80% of its lifespan, before it expires mid-pagination rather than after a page fails.

Mounting a persistent HTTPAdapter with connection pooling and keep-alive reduces TCP handshake overhead and prevents socket exhaustion when a long-running sync walks thousands of pages. This is the synchronous counterpart to the concurrency-limited approach in Handling rate limits in async trade ingestion: fewer moving parts, same non-negotiable discipline on tokens and backoff.
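The 80% rule can be exercised without a network by injecting the clock and the token fetcher. The sketch below is illustrative only: ProactiveToken, fake_fetch, and the 100-second TTL are hypothetical names and values, and time.monotonic is used as the default clock (the client in the Implementation section uses time.time()).

```python
import time
from typing import Callable, Dict, List, Optional


class ProactiveToken:
    """Caches a token and refreshes once a fraction of its lifetime has elapsed."""

    def __init__(self, fetch: Callable[[], Dict],
                 clock: Callable[[], float] = time.monotonic,
                 refresh_fraction: float = 0.8):
        self._fetch = fetch              # hypothetical callable returning the token response
        self._clock = clock
        self._fraction = refresh_fraction
        self._token: Optional[str] = None
        self._refresh_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._refresh_at:
            data = self._fetch()
            self._token = data["access_token"]
            # Schedule the next refresh at 80% of the advertised lifetime.
            self._refresh_at = now + data["expires_in"] * self._fraction
        return self._token


# Simulated run: a fake clock and a fake token endpoint with a 100-second TTL.
fake_now = [0.0]
issued: List[str] = []


def fake_fetch() -> Dict:
    issued.append(f"tok-{len(issued) + 1}")
    return {"access_token": issued[-1], "expires_in": 100}


cache = ProactiveToken(fake_fetch, clock=lambda: fake_now[0])
seen = []
for t in (0.0, 50.0, 79.0, 81.0, 120.0, 165.0):
    fake_now[0] = t
    seen.append(cache.get())
print(seen)
```

In this simulated run the token is replaced at the first call past the 80-second mark, while the old token would still have been valid for roughly another 20 seconds.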

Idempotency and resilient retry architecture

Network latency and vendor-side rate limiting are operational constants, not anomalies. A naive try/except around requests.get() silently drops settlement records or triggers duplicate postings when retries lack idempotency guards. Exponential backoff with jitter is mandatory, targeting 429 Too Many Requests, 502, 503, 504, and cloud-edge errors (520–524). The urllib3.util.Retry class integrates directly with requests.adapters.HTTPAdapter and must retry only idempotent methods (GET, HEAD, OPTIONS).

For settlement analysts this distinction is non-negotiable: blindly retrying a non-idempotent POST or PUT without a vendor-guaranteed idempotency key can double-book a trade, corrupt the general ledger, and trigger SOX or FERC compliance violations. Enforce strict method-level retry policies and log every retry attempt with a correlation ID for the downstream audit trail. Refer to the urllib3 Retry documentation for jitter configuration.
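To make the shape of exponential backoff with jitter concrete, here is a minimal standalone sketch. It is not urllib3's internal formula: backoff_delays and its parameters are hypothetical, and the numbers only echo the backoff factor of 1.5 and four retries used on this page.

```python
import random
from typing import List, Optional


def backoff_delays(retries: int, factor: float, cap: float, jitter: float,
                   rng: Optional[random.Random] = None) -> List[float]:
    """Delay before each retry: factor * 2**attempt plus uniform jitter, capped."""
    rng = rng or random.Random()
    delays = []
    for attempt in range(retries):
        base = factor * (2 ** attempt)
        # Jitter spreads out clients that failed at the same moment so they
        # do not all retry in lockstep against a rate-limited endpoint.
        delays.append(min(cap, base + rng.uniform(0.0, jitter)))
    return delays


delays = backoff_delays(retries=4, factor=1.5, cap=120.0, jitter=0.5,
                        rng=random.Random(7))
for n, delay in enumerate(delays, start=1):
    print(f"retry {n}: sleep {delay:.2f}s")
```

The base delays double from 1.5 s to 12 s, and each one carries up to half a second of random offset. When the vendor sends a Retry-After header on a 429, that value should generally take precedence over a locally computed delay.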

Implementation

The client below is complete and copy-pasteable. It composes token refresh, connection pooling, exponential backoff, cursor-durable pagination, and per-page sha256 hashing for reconciliation integrity.

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry
import logging
import hashlib
import json
from decimal import Decimal
from typing import Dict, List, Optional, Generator
import time

# Configure structured logging for audit compliance.
# Emit UTC timestamps so the trailing 'Z' is accurate (logging defaults to local time).
logging.Formatter.converter = time.gmtime
logging.basicConfig(
    level=logging.INFO,
    format='{"timestamp":"%(asctime)s","level":"%(levelname)s","module":"%(module)s","message":"%(message)s"}',
    datefmt='%Y-%m-%dT%H:%M:%SZ'
)
logger = logging.getLogger("etrm_sync")


class ETRMSyncClient:
    def __init__(self, base_url: str, client_id: str, client_secret: str, token_url: str):
        self.base_url = base_url.rstrip("/")
        self.client_id = client_id
        self.client_secret = client_secret
        self.token_url = token_url
        self.session = self._build_session()
        self._token_cache: Optional[Dict] = None
        self._token_expiry: float = 0.0

    def _build_session(self) -> requests.Session:
        session = requests.Session()
        retry_strategy = Retry(
            total=4,
            backoff_factor=1.5,
            backoff_jitter=0.5,  # adds up to 0.5s of random jitter per sleep; requires urllib3 >= 2.0
            status_forcelist=[429, 502, 503, 504, 520, 521, 522, 523, 524],
            allowed_methods=["GET", "HEAD", "OPTIONS"],  # idempotent reads only
            respect_retry_after_header=True
        )
        adapter = HTTPAdapter(max_retries=retry_strategy, pool_connections=10, pool_maxsize=20)
        session.mount("https://", adapter)
        session.mount("http://", adapter)
        return session

    def _refresh_token(self) -> str:
        logger.info("Initiating OAuth2 token refresh")
        payload = {
            "grant_type": "client_credentials",
            "client_id": self.client_id,
            "client_secret": self.client_secret,
            "scope": "trade:read settlement:write"
        }
        resp = self.session.post(self.token_url, data=payload, timeout=10)
        resp.raise_for_status()
        token_data = resp.json()
        self._token_cache = token_data
        # Proactive refresh at 80% of token lifespan so the credential never
        # expires mid-pagination.
        self._token_expiry = time.time() + (token_data.get("expires_in", 3600) * 0.8)
        return token_data["access_token"]

    def _get_auth_header(self) -> Dict[str, str]:
        if time.time() >= self._token_expiry or not self._token_cache:
            self._refresh_token()
        return {"Authorization": f"Bearer {self._token_cache['access_token']}"}

    def fetch_settlement_batches(self, start_date: str, end_date: str) -> Generator[List[Dict], None, None]:
        """Paginated sync for settlement records with per-page audit hashing."""
        endpoint = f"{self.base_url}/api/v2/settlements"
        params = {"start_date": start_date, "end_date": end_date, "page_size": 1000, "page": 1}

        while True:
            # Re-resolve the auth header each page so long paginated syncs pick up
            # a proactively refreshed token instead of carrying a stale one.
            headers = self._get_auth_header()
            logger.info(f"Fetching page {params['page']} from {endpoint}")
            resp = self.session.get(endpoint, params=params, headers=headers, timeout=30)
            resp.raise_for_status()
            data = resp.json()

            if not data.get("results"):
                break

            # Deterministic hash over sorted keys — the reconciliation fingerprint
            # for this page. A replayed page reproduces the same digest.
            settlement_batch = data["results"]
            payload_hash = hashlib.sha256(
                json.dumps(settlement_batch, sort_keys=True).encode()
            ).hexdigest()
            logger.info(
                f"Page {params['page']} ingested | records={len(settlement_batch)} | sha256={payload_hash}"
            )

            yield settlement_batch

            # Cursor durability: only advance when the API confirms a next page,
            # so a truncated response cannot skip records.
            if not data.get("next_page"):
                break
            params["page"] += 1


def to_decimal_ledger_row(record: Dict) -> Dict:
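    """Coerce monetary/volumetric fields to Decimal before any settlement math.

    Route vendor values through str() so Decimal is built from the value's
    decimal text rather than a float's full binary expansion (Decimal(0.1)
    is not Decimal("0.1")). This is what prevents sub-cent month-end drift.
    Note: if the vendor sends JSON numbers instead of strings, resp.json()
    has already parsed them as floats before this function runs.
    """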
    return {
        **record,
        "volume_mwh": Decimal(str(record["volume_mwh"])),
        "price_usd": Decimal(str(record["price_usd"])),
    }


# Usage pattern for utility ops & settlement analysts
if __name__ == "__main__":
    client = ETRMSyncClient(
        base_url="https://etrm-vendor.example.com",
        client_id="svc_settlement_sync",
        client_secret="REPLACE_WITH_VAULT_SECRET",
        token_url="https://etrm-vendor.example.com/oauth2/token"
    )
    for settlement_batch in client.fetch_settlement_batches("2024-01-01", "2024-01-31"):
        ledger_rows = [to_decimal_ledger_row(r) for r in settlement_batch]
        # Pass ledger_rows to the downstream matching engine or settlement ledger.

Verification steps

Confirm the sync is correct before wiring it into month-end close:

Record count reconciliation. The vendor’s report portal exposes a total record count for the queried window. Sum len(settlement_batch) across every yielded page and assert it equals that control total — a mismatch means pagination truncated or a page was skipped.
Hash stability on replay. Re-run the identical start_date/end_date window. Each page’s sha256 in the logs must match the prior run byte-for-byte. A changed digest on unchanged source data signals non-deterministic ordering upstream and blocks a clean reconciliation diff.
Decimal integrity. After to_decimal_ledger_row, assert all(isinstance(r["price_usd"], Decimal) for r in ledger_rows). Any surviving float on the financial path is a defect, not a rounding curiosity.
Token refresh under load. Set the vendor token TTL low in a staging tenant and page past the 80% threshold. Beyond the initial token fetch at startup, the log must show at least one "Initiating OAuth2 token refresh" event mid-run with zero 401 responses, which shows the proactive refresh fired before expiry rather than after a failed page.
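The first two checks above can be scripted once page sizes and digests have been collected from the log lines. The following is a minimal sketch under that assumption; check_control_total and changed_pages are hypothetical helper names, and the digests are shortened placeholder strings rather than real sha256 values.

```python
from typing import Dict, List


def check_control_total(page_sizes: List[int], control_total: int) -> None:
    """Raise if the summed page sizes do not equal the vendor's control total."""
    synced = sum(page_sizes)
    if synced != control_total:
        raise ValueError(
            f"record count mismatch: synced={synced} control={control_total}"
        )


def changed_pages(first_run: Dict[int, str], replay: Dict[int, str]) -> List[int]:
    """Return page numbers whose digest differs or that appear in only one run."""
    pages = sorted(set(first_run) | set(replay))
    return [p for p in pages if first_run.get(p) != replay.get(p)]


# Hypothetical values, as they might be collected from the sync's log output.
check_control_total([1000, 1000, 412], control_total=2412)

run_a = {1: "9f2c", 2: "a41b", 3: "07de"}
run_b = {1: "9f2c", 2: "ffff", 3: "07de"}
print(changed_pages(run_a, run_b))  # [2]
```

An empty list from changed_pages on unchanged source data is the condition for a clean replay. Any page number it returns is where the reconciliation diff should start.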

The reconciliation diff itself — matching this synced fingerprint against the operating-day ledger — depends on the batch landing in the correct bucket, which is governed by Settlement Cycle Mapping.

Compliance note

This implementation is built to satisfy the audit-lineage expectations of FERC EQR quarterly reporting and the data-integrity controls under NERC CIP: every request emits a UTC-stamped structured log line, every page is fingerprinted with a reproducible sha256 for immutable audit trails, and retries are confined to idempotent methods so a replayed request cannot double-book a wholesale transaction and breach SOX controls. Validate the retained logs and hashes against your desk’s own REMIT/EMIR record-keeping obligations before treating the sync as the system of record. Downstream, the same guarantees feed the platform-level view in ETRM System Architecture.

Frequently Asked Questions

Why refresh the token at 80% of its lifespan instead of on a 401?

Refreshing reactively on a 401 means at least one page already failed, and during a settlement window that failed page may be silently dropped or force a full restart. Refreshing at 80% of expires_in guarantees the credential is renewed while it is still valid, so pagination never stalls on an expired token.

Can I reuse this retry strategy for the POST that writes settlements back?

No. The Retry here is deliberately scoped to GET, HEAD, and OPTIONS. Retrying a non-idempotent POST or PUT without a vendor-guaranteed idempotency key can double-book a trade. Writes need their own path with an explicit idempotency key on each request.
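As a rough illustration of what an explicit idempotency key could look like, the sketch below derives the key from the canonical request body, so a resend of the same payload carries the same key. The Idempotency-Key header name, the build_write_request helper, and the record fields are all assumptions. The key is only useful if the vendor actually deduplicates on it, so confirm the header and its semantics with the vendor.

```python
import hashlib
import json
from typing import Dict, Tuple


def build_write_request(record: Dict) -> Tuple[Dict[str, str], bytes]:
    """Return (headers, body) for a settlement write with a content-derived key."""
    # Canonical serialization: sorted keys and fixed separators, so the same
    # record always produces the same bytes regardless of dict ordering.
    body = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    key = hashlib.sha256(body).hexdigest()
    headers = {
        "Content-Type": "application/json",
        "Idempotency-Key": key,  # assumed header name; confirm with the vendor
    }
    return headers, body


# Hypothetical settlement rows; amounts are kept as strings to stay off floats.
first = {"trade_id": "T-1", "amount_usd": "513.75"}
resend = {"amount_usd": "513.75", "trade_id": "T-1"}   # same content, different key order
amended = {"trade_id": "T-1", "amount_usd": "513.76"}

key_first = build_write_request(first)[0]["Idempotency-Key"]
key_resend = build_write_request(resend)[0]["Idempotency-Key"]
key_amended = build_write_request(amended)[0]["Idempotency-Key"]
print(key_first == key_resend)    # True
print(key_first == key_amended)   # False
```

A content-derived key treats two identical payloads as one write. If the desk can legitimately post the same settlement twice, a caller-generated unique key per business event would be the safer choice.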

Why hash each page with sha256 instead of trusting record counts?

Counts detect missing rows but not silently mutated ones — a changed price or reversed sign leaves the count intact. A deterministic sha256 over sorted keys is a content fingerprint: it changes the instant any field in the batch changes, which is what makes a reconciliation replay meaningful.
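A small demonstration of that difference, using the same json.dumps(..., sort_keys=True) plus sha256 recipe as the client. The records are invented for illustration, and page_digest is a hypothetical helper name.

```python
import hashlib
import json
from typing import Dict, List


def page_digest(batch: List[Dict]) -> str:
    """sha256 over the batch serialized with sorted keys."""
    return hashlib.sha256(json.dumps(batch, sort_keys=True).encode()).hexdigest()


# Invented rows for illustration only.
original = [
    {"trade_id": "T-1", "volume_mwh": "12.5", "price_usd": "41.10"},
    {"trade_id": "T-2", "volume_mwh": "8.0", "price_usd": "39.95"},
]
# Same number of rows, but one price has its sign reversed.
mutated = [dict(original[0]), {**original[1], "price_usd": "-39.95"}]
# Same rows and values, but delivered in a different order.
reordered = list(reversed(original))

print(len(original) == len(mutated))                      # True
print(page_digest(original) == page_digest(mutated))      # False
print(page_digest(original) == page_digest(reordered))    # False
```

The last comparison shows that sort_keys normalizes key order within each record but not the order of rows in the page, which is why a replay only reproduces the digest when the vendor returns rows in a stable order.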

Is `requests` sufficient, or should I move to async?

For a scheduled batch sync of bounded pages, a connection-pooled requests session is simpler and fully sufficient. Move to the async, circuit-broken pattern when you need high concurrency or real-time confirmation streams — the trade-offs are covered in the parent component and its async siblings.
