feat(providers): catalog refresh + image_gen.model + OpenRouter caching (WS-6)

Surfaces the v0.13 provider catalog work in Scarf v2.8.0.

Five new model IDs (deepseek/deepseek-v4-pro, x-ai/grok-4.3, openrouter/owl-alpha, tencent/hy3-preview, arcee/trinity-large-thinking) flow through models_dev_cache.json on next refresh. No manual catalog entries are needed; the picker reaches them automatically.

The grok-4.20-beta → grok-4.20 rename is handled via a new ModelCatalogService.modelAliases map plus a resolveModelAlias() helper, called from validateModel(), model(providerID:modelID:), and provider(for:) at read time. Lossless: stored configs are never rewritten.

Vercel AI Gateway is demoted to the bottom of the picker via a new demotedProviders set plus a sort-comparator axis, evaluated after the subscription-gated check and before the alphabetical tiebreak. The axis is always on, with no capability gate, so sort order stays consistent across Hermes versions.

image_gen.model (top-level v0.13 YAML key) and openrouter.response_cache.enabled (provisional key shape per TODO(WS-6-Q1)) are surfaced as new SettingsSection rows in AuxiliaryTab, capability-gated on hasImageGenModel + hasOpenRouterResponseCache so pre-v0.13 hosts hide them. The image-gen picker has a curated 7-entry allowlist (HermesImageGenModel) plus a free-form Custom model ID entry.

CLAUDE.md gains two schema-drift bullets next to the existing overlayOnlyProviders requirement (modelAliases + demotedProviders mirror with hermes_cli/providers.py).

Tests: 4 new M0cServicesTests (sort axis, alias resolution + cross-provider isolation, image-gen allowlist, demoted-set sentinel) and 2 new M6ConfigCronTests (YAML round-trip + empty-default).

Implements WS-6 of Scarf v2.8.0 (Hermes v0.13.0 catch-up). Plan: scarf/docs/v2.8/WS-6-providers-v0.13-plan.md (on coordination/v2.8.0-plans).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
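
The alias lookup and the picker ordering described in this message can be sketched in isolation. This is a minimal, self-contained illustration under stated assumptions, not the code in the diff: `ProviderRow`, `resolveAlias`, `pickerOrder`, and the `nous` and `zonk` provider IDs are hypothetical stand-ins, and only two of the alias entries are reproduced.

```swift
import Foundation

// Hypothetical stand-in for the rows the picker sorts. The real type in
// the diff is `HermesProviderInfo`, which carries more fields.
struct ProviderRow {
    let providerID: String
    let providerName: String
    let subscriptionGated: Bool
}

// Alias keys are "providerID/modelID"; values are the bare successor ID.
let aliases: [String: String] = [
    "openrouter/x-ai/grok-4.20-beta": "x-ai/grok-4.20",
    "xai/grok-4.20-beta": "grok-4.20",
]

// Provider IDs that always sort to the tail of the list.
let demoted: Set<String> = ["vercel"]

/// Returns the successor ID when an alias exists for this provider,
/// otherwise returns the input unchanged. Nothing is written back.
func resolveAlias(providerID: String, modelID: String) -> String {
    aliases["\(providerID)/\(modelID)"] ?? modelID
}

/// Three-axis ordering: subscription-gated first, demoted last, and
/// case-insensitive alphabetical by display name within each group.
func pickerOrder(_ rows: [ProviderRow]) -> [ProviderRow] {
    rows.sorted { lhs, rhs in
        if lhs.subscriptionGated != rhs.subscriptionGated {
            return lhs.subscriptionGated
        }
        let lDemoted = demoted.contains(lhs.providerID)
        let rDemoted = demoted.contains(rhs.providerID)
        if lDemoted != rDemoted {
            return !lDemoted
        }
        return lhs.providerName
            .localizedCaseInsensitiveCompare(rhs.providerName) == .orderedAscending
    }
}

let rows = [
    ProviderRow(providerID: "vercel", providerName: "Vercel AI Gateway", subscriptionGated: false),
    ProviderRow(providerID: "zonk", providerName: "Zonk Provider", subscriptionGated: false),
    ProviderRow(providerID: "nous", providerName: "Nous Portal", subscriptionGated: true),
    ProviderRow(providerID: "anthropic", providerName: "Anthropic", subscriptionGated: false),
]
let orderedIDs = pickerOrder(rows).map(\.providerID)
print(orderedIDs)  // ["nous", "anthropic", "zonk", "vercel"]

// The alias fires only for the provider it was registered under.
print(resolveAlias(providerID: "openrouter", modelID: "x-ai/grok-4.20-beta"))  // x-ai/grok-4.20
print(resolveAlias(providerID: "fictional", modelID: "x-ai/grok-4.20-beta"))   // x-ai/grok-4.20-beta
```

Because the alias is applied on lookup rather than on save, the stored config keeps the old ID, and removing an entry from the map later restores the original behavior without any migration.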
2026-05-10 18:44:45 +00:00 · 2026-05-09 19:02:45 +02:00
parent 3e470c7155
commit 57a6340985
8 changed files with 340 additions and 8 deletions
@@ -173,6 +173,10 @@ v0.10.0 introduced the **Tool Gateway** — paid Nous Portal subscribers route w

 **Keep `ModelCatalogService.overlayOnlyProviders` in sync** with `HERMES_OVERLAYS` in `~/.hermes/hermes-agent/hermes_cli/providers.py`. When Hermes adds a new overlay-only provider, mirror the entry (display name, base URL, auth type, subscription-gated flag, doc URL) or the picker won't reach it.

+**Keep `ModelCatalogService.modelAliases` in sync** with Hermes's deprecated-model-ID renames (currently documented only in upstream release notes; the canonical map would live in `hermes_cli/providers.py` if and when upstream tracks renames in code). Drift here means a user's old model ID stops resolving in the picker even though Hermes still accepts it at runtime.
+
+**Keep `ModelCatalogService.demotedProviders` in sync** with the deprioritized-provider list in `hermes-agent/hermes_cli/providers.py`. Drift means Vercel AI Gateway (or any future demoted provider) sorts in the wrong position in Scarf's picker.
+
 ## Kanban v3: drag-and-drop board + per-project tenants (v2.7.5)

 Scarf v2.7.5 promotes Kanban from a read-only list to a full board with drag-and-drop, every Hermes write verb wired up, and per-project boards bound to a Scarf-minted tenant slug. The list view is preserved as a `Board | List` toggle for accessibility / narrow-window fallback.
@@ -667,6 +667,27 @@ public struct HermesConfig: Sendable {
    /// useful for cost auditing and screen-recording demos.
    public var runtimeMetadataFooter: Bool

+    // -- Hermes v0.13 additions ----------------------------------------
+
+    /// `image_gen.model` (v0.13+) — overrides the per-provider default
+    /// image-gen model. Empty string means "let Hermes pick the
+    /// provider default". Hermes v0.12 advertised this key but ignored
+    /// it; Scarf's `AuxiliaryTab` only renders the picker when
+    /// `HermesCapabilities.hasImageGenModel` is `true`.
+    public var imageGenModel: String
+
+    /// `openrouter.response_cache.enabled` (v0.13+) — when true, Hermes
+    /// asks OpenRouter to cache responses for repeat prompts within a
+    /// session. Off by default in Scarf's parser per WS-6 plan
+    /// recommendation. UI gated on
+    /// `HermesCapabilities.hasOpenRouterResponseCache`.
+    // TODO(WS-6-Q1): the exact YAML key shape is provisional. Verify
+    // against a v0.13 host's `hermes config check` output before
+    // shipping (see WS-6-plan §Open Questions #1). Candidate alternative
+    // shapes: `providers.openrouter.response_cache_enabled` or
+    // `prompt_caching.openrouter.enabled`.
+    public var openrouterResponseCacheEnabled: Bool
+
    // Grouped blocks
    public var display: DisplaySettings
    public var terminal: TerminalSettings
@@ -747,11 +768,15 @@ public struct HermesConfig: Sendable {
        homeAssistant: HomeAssistantSettings,
        cacheTTL: String = "5m",
        redactionEnabled: Bool = false,
-        runtimeMetadataFooter: Bool = false
+        runtimeMetadataFooter: Bool = false,
+        imageGenModel: String = "",
+        openrouterResponseCacheEnabled: Bool = false
    ) {
        self.cacheTTL = cacheTTL
        self.redactionEnabled = redactionEnabled
        self.runtimeMetadataFooter = runtimeMetadataFooter
+        self.imageGenModel = imageGenModel
+        self.openrouterResponseCacheEnabled = openrouterResponseCacheEnabled
        self.model = model
        self.provider = provider
        self.maxTurns = maxTurns
@@ -284,7 +284,18 @@ public extension HermesConfig {
            homeAssistant: homeAssistant,
            cacheTTL: str("prompt_caching.cache_ttl", default: "5m"),
            redactionEnabled: bool("redaction.enabled", default: false),
-            runtimeMetadataFooter: bool("agent.runtime_metadata_footer", default: false)
+            runtimeMetadataFooter: bool("agent.runtime_metadata_footer", default: false),
+            // -- v0.13 additions -------------------------------------
+            imageGenModel: str("image_gen.model", default: ""),
+            // TODO(WS-6-Q1): the `openrouter.response_cache.enabled`
+            // key shape is provisional pending verification against a
+            // v0.13 `hermes config check`. If upstream uses a different
+            // path (e.g. `providers.openrouter.response_cache_enabled`
+            // or nested under `prompt_caching`), update the line below
+            // + the matching `setSetting` key in
+            // `SettingsViewModel.setOpenRouterResponseCache`. Default
+            // is `false` per WS-6-plan §Open Questions #2.
+            openrouterResponseCacheEnabled: bool("openrouter.response_cache.enabled", default: false)
        )
    }
 }
@@ -155,9 +155,20 @@ public struct ModelCatalogService: Sendable {
            )
        }
        return byID.values.sorted { lhs, rhs in
+            // Subscription-gated first (Nous Portal).
            if lhs.subscriptionGated != rhs.subscriptionGated {
                return lhs.subscriptionGated
            }
+            // Demoted last (Vercel AI Gateway, per Hermes v0.13). The
+            // axis is unconditional — we don't gate on the Hermes
+            // version because "Vercel mid-alphabet on v0.12, bottom on
+            // v0.13" would be more confusing than the consistent
+            // "Vercel last" treatment for everyone.
+            let lDemoted = Self.demotedProviders.contains(lhs.providerID)
+            let rDemoted = Self.demotedProviders.contains(rhs.providerID)
+            if lDemoted != rDemoted {
+                return !lDemoted
+            }
            return lhs.providerName.localizedCaseInsensitiveCompare(rhs.providerName) == .orderedAscending
        }
    }
@@ -235,7 +246,10 @@ public struct ModelCatalogService: Sendable {
    public func provider(for modelID: String) -> HermesProviderInfo? {
        guard let catalog = loadCatalog() else { return nil }
        for (providerID, p) in catalog {
-            if p.models?[modelID] != nil {
+            // Resolve any model-rename alias for this provider before
+            // checking the catalog — see `modelAliases` for rationale.
+            let resolved = resolveModelAlias(providerID: providerID, modelID: modelID)
+            if p.models?[resolved] != nil {
                return HermesProviderInfo(
                    providerID: providerID,
                    providerName: p.name ?? providerID,
@@ -299,14 +313,17 @@ public struct ModelCatalogService: Sendable {
    /// Look up a specific model by provider + ID. Returns nil if not in the
    /// catalog (e.g., free-typed custom model).
    public func model(providerID: String, modelID: String) -> HermesModelInfo? {
+        // Resolve any model-rename alias for this provider before
+        // checking the catalog — see `modelAliases` for rationale.
+        let resolved = resolveModelAlias(providerID: providerID, modelID: modelID)
        guard let catalog = loadCatalog(),
              let provider = catalog[providerID],
-              let raw = provider.models?[modelID] else { return nil }
+              let raw = provider.models?[resolved] else { return nil }
        return HermesModelInfo(
            providerID: providerID,
            providerName: provider.name ?? providerID,
-            modelID: modelID,
-            modelName: raw.name ?? modelID,
+            modelID: resolved,
+            modelName: raw.name ?? resolved,
            contextWindow: raw.limit?.context,
            maxOutput: raw.limit?.output,
            costInput: raw.cost?.input,
@@ -344,10 +361,14 @@ public struct ModelCatalogService: Sendable {
    /// HTTP 404 at runtime. Catch that at save time, not 6 hours later.
    public func validateModel(_ modelID: String, for providerID: String) -> ModelValidation {
        ScarfMon.measure(.diskIO, "modelCatalog.validateModel") {
-            let trimmed = modelID.trimmingCharacters(in: .whitespacesAndNewlines)
-            guard !trimmed.isEmpty else {
+            let raw = modelID.trimmingCharacters(in: .whitespacesAndNewlines)
+            guard !raw.isEmpty else {
                return .invalid(providerName: providerID, suggestions: [])
            }
+            // Resolve any model-rename alias before lookup so configs
+            // referencing a deprecated ID (e.g. `x-ai/grok-4.20-beta`)
+            // validate against the canonical successor.
+            let trimmed = resolveModelAlias(providerID: providerID, modelID: raw)

            // Overlay-only providers (Nous Portal, OpenAI Codex, Qwen
            // OAuth, …) serve their own catalogs that aren't mirrored to
@@ -433,6 +454,78 @@ public struct ModelCatalogService: Sendable {
        let output: Int?
    }

+    // MARK: - Model aliases (model rename resolution)
+
+    /// Hermes deprecates model IDs across releases. When a stored config
+    /// `model.default` references a deprecated ID, resolve to its
+    /// canonical successor. Lossless — we never rewrite the user's
+    /// `config.yaml`; the alias just lets `validateModel` /
+    /// `model(providerID:modelID:)` / `provider(for:)` succeed against
+    /// the new ID.
+    ///
+    /// Keys are slash-joined `providerID/modelID` to disambiguate
+    /// across providers: each provider gets its own entry, so the
+    /// `openrouter` alias never fires for a different provider that
+    /// happens to use the same model ID.
+    /// Values are the bare resolved model ID (no provider prefix).
+    ///
+    /// **Schema is Swift-primary.** Mirror new entries into Hermes's
+    /// upstream deprecation map in `hermes_cli/providers.py` if/when
+    /// upstream tracks renames in code (today they're release-notes
+    /// only).
+    public static let modelAliases: [String: String] = [
+        // v0.13: x-ai dropped the `-beta` suffix once Grok 4.20 GA'd.
+        // The model is the same one served at the same OpenRouter slot;
+        // only the marketing identifier changed.
+        // TODO(WS-6-Q4): verify whether OpenRouter retired the
+        // `x-ai/grok-4.20-beta` slot entirely. Either way the alias is
+        // correct (cosmetic if old slot stays live, load-bearing if it
+        // 404s).
+        "openrouter/x-ai/grok-4.20-beta": "x-ai/grok-4.20",
+        "xai/grok-4.20-beta": "grok-4.20",
+        "vercel/xai/grok-4.20-beta": "xai/grok-4.20",
+    ]
+
+    /// Resolve a stored model identifier through the alias map. Returns
+    /// the input unchanged when no alias exists. Pure function; used at
+    /// read time everywhere a configured model ID is rendered,
+    /// validated, or sent to Hermes.
+    public func resolveModelAlias(providerID: String, modelID: String) -> String {
+        let composite = "\(providerID)/\(modelID)"
+        return Self.modelAliases[composite] ?? modelID
+    }
+
+    // MARK: - Demoted providers (sort tail)
+
+    /// Provider IDs that Hermes v0.13 explicitly deprioritizes in the
+    /// picker. `loadProviders()` sorts these to the tail of the list,
+    /// after the alphabetical group, so users who haven't manually
+    /// chosen Vercel as their gateway don't end up there by default.
+    /// Mirrors Hermes's deprioritized-provider list in
+    /// `hermes-agent/hermes_cli/providers.py`.
+    public static let demotedProviders: Set<String> = [
+        "vercel",
+    ]
+
+    // MARK: - Image-generation model allowlist (curated)
+
+    /// Known image-generation models, used to pre-populate the
+    /// `image_gen.model` picker on the Auxiliary tab. The list is
+    /// curated — `models_dev_cache.json` doesn't tag image-capable
+    /// models, so we maintain this by hand on Hermes version bumps.
+    /// Always free-form-typeable on the picker too, so missing entries
+    /// don't block users with non-listed image providers.
+    ///
+    /// Order: most-likely-to-be-chosen first.
+    public static let imageGenModels: [HermesImageGenModel] = [
+        .init(modelID: "openai/gpt-image-1", display: "OpenAI · gpt-image-1", providerHint: "openai"),
+        .init(modelID: "google/imagen-4", display: "Google · Imagen 4", providerHint: "google-vertex"),
+        .init(modelID: "google/imagen-3", display: "Google · Imagen 3", providerHint: "google-vertex"),
+        .init(modelID: "stability/stable-image-ultra", display: "Stability · Stable Image Ultra", providerHint: "stability"),
+        .init(modelID: "fal-ai/flux-pro-1.1", display: "fal · FLUX 1.1 Pro", providerHint: "fal"),
+        .init(modelID: "black-forest-labs/flux-1.1-pro", display: "Black Forest Labs · FLUX 1.1 Pro", providerHint: "openrouter"),
+        .init(modelID: "openai/dall-e-3", display: "OpenAI · DALL·E 3", providerHint: "openai"),
+    ]
+
    // MARK: - Hermes overlay providers

    /// The 11 providers Hermes surfaces via `hermes model` that have no
@@ -538,6 +631,27 @@ public struct ModelCatalogService: Sendable {
    ]
 }

+/// Curated entry for the `image_gen.model` picker on the Auxiliary
+/// tab. Hermes v0.13 honors a top-level `image_gen.model` key but the
+/// models.dev catalog has no `image: true` tag, so we maintain a
+/// short hand-curated allowlist, listed in display order. The picker
+/// always allows free-form typing too, so any provider's model ID
+/// works regardless of whether it appears here.
+public struct HermesImageGenModel: Sendable, Identifiable, Hashable {
+    public let modelID: String
+    public let display: String
+    /// Hint at which provider serves this model — surfaced as a
+    /// "Configure provider X first" advisory but never enforced.
+    public let providerHint: String?
+    public var id: String { modelID }
+
+    public init(modelID: String, display: String, providerHint: String?) {
+        self.modelID = modelID
+        self.display = display
+        self.providerHint = providerHint
+    }
+}
+
 /// Scarf-side mirror of `HermesOverlay` from hermes-agent's
 /// `hermes_cli/providers.py`. Describes a provider that isn't in the
 /// models.dev catalog.
@@ -310,6 +310,74 @@ import Foundation
        }
    }

+    // MARK: - ModelCatalogService — WS-6 (v0.13)
+
+    @Test func vercelAIGatewayDemotedToBottom() throws {
+        // Build a minimal catalog with vercel + alphabetically-later
+        // providers, then assert vercel sorts after them. Locks the
+        // demoted-axis sort comparator added in WS-6.
+        let json = """
+        {
+          "anthropic": { "name": "Anthropic", "models": {} },
+          "vercel":    { "name": "Vercel AI Gateway", "models": {} },
+          "zonk":      { "name": "Zonk Provider", "models": {} }
+        }
+        """
+        let tmp = FileManager.default.temporaryDirectory
+            .appendingPathComponent("scarf-models-\(UUID().uuidString).json")
+        try json.write(to: tmp, atomically: true, encoding: .utf8)
+        defer { try? FileManager.default.removeItem(at: tmp) }
+        let svc = ModelCatalogService(path: tmp.path)
+        let providers = svc.loadProviders().filter { !$0.isOverlay }
+        let names = providers.map(\.providerName)
+        // anthropic first (alpha), zonk next (alpha), vercel last
+        // (demoted) — even though `vercel` < `zonk` alphabetically.
+        #expect(names.last == "Vercel AI Gateway")
+        let vercelIdx = names.firstIndex(of: "Vercel AI Gateway") ?? -1
+        let zonkIdx = names.firstIndex(of: "Zonk Provider") ?? -1
+        #expect(vercelIdx > zonkIdx)
+    }
+
+    @Test func grok420BetaAliasResolvesToGrok420() {
+        let svc = ModelCatalogService(path: "/tmp/scarf-nonexistent-\(UUID().uuidString).json")
+        // OpenRouter's old `-beta` ID resolves to the GA name.
+        #expect(svc.resolveModelAlias(providerID: "openrouter", modelID: "x-ai/grok-4.20-beta")
+                == "x-ai/grok-4.20")
+        // xAI direct provider keeps the same shape minus prefix.
+        #expect(svc.resolveModelAlias(providerID: "xai", modelID: "grok-4.20-beta")
+                == "grok-4.20")
+        // Non-aliased ID passes through unchanged.
+        #expect(svc.resolveModelAlias(providerID: "anthropic", modelID: "claude-4.7-opus")
+                == "claude-4.7-opus")
+        // Cross-provider isolation: same modelID on a different
+        // provider isn't aliased — composite key in `modelAliases`
+        // disambiguates by providerID.
+        #expect(svc.resolveModelAlias(providerID: "fictional", modelID: "x-ai/grok-4.20-beta")
+                == "x-ai/grok-4.20-beta")
+    }
+
+    @Test func imageGenModelAllowlistShape() {
+        // Lock a minimum size for the curated list + a few sentinel
+        // entries so unintentional removals get caught in review.
+        // Free-form typing bypasses the allowlist, so additions and
+        // removals here are purely UX (which models surface as picker
+        // rows).
+        let models = ModelCatalogService.imageGenModels
+        #expect(models.count >= 5)
+        #expect(models.contains(where: { $0.modelID == "openai/gpt-image-1" }))
+        #expect(models.contains(where: { $0.modelID == "google/imagen-4" }))
+        // Every entry has a non-empty display + a non-empty modelID.
+        for m in models {
+            #expect(!m.modelID.isEmpty)
+            #expect(!m.display.isEmpty)
+        }
+    }
+
+    @Test func demotedProvidersContainsVercel() {
+        // Minimal lock-in for the demoted-providers static set. Mirrors
+        // Hermes's deprioritized-provider list in providers.py.
+        #expect(ModelCatalogService.demotedProviders.contains("vercel"))
+    }
+
    // MARK: - ProjectDashboardService

    @Test func projectDashboardServiceRegistryRoundTrip() throws {
@@ -92,6 +92,27 @@ import Foundation
        #expect(c.security.redactSecrets == true)
        #expect(c.compression.enabled == true)
        #expect(c.voice.ttsProvider == "edge")
+        // v0.13 additions default to empty / off when the YAML omits
+        // them — pre-v0.13 hosts produce this exact shape.
+        #expect(c.imageGenModel == "")
+        #expect(c.openrouterResponseCacheEnabled == false)
+    }
+
+    @Test func parsesImageGenAndOpenRouterCache() {
+        // WS-6: parse the two new top-level v0.13 keys. If the
+        // OpenRouter key shape changes upstream (see TODO(WS-6-Q1)),
+        // this test pins the shape the parser expects; update it in
+        // lockstep with the parser line and the setter key.
+        let yaml = """
+        image_gen:
+          model: openai/gpt-image-1
+        openrouter:
+          response_cache:
+            enabled: true
+        """
+        let c = HermesConfig(yaml: yaml)
+        #expect(c.imageGenModel == "openai/gpt-image-1")
+        #expect(c.openrouterResponseCacheEnabled == true)
    }

    @Test func parsesTopLevelModel() {
@@ -195,6 +195,24 @@ final class SettingsViewModel {
        setSetting("auxiliary.\(task).timeout", value: String(value))
    }

+    // MARK: - Image generation (v0.13+)
+
+    /// `image_gen.model` — overrides the per-provider default image
+    /// model (Hermes v0.13+). Empty string clears the override.
+    /// Capability-gated in `AuxiliaryTab` so pre-v0.13 hosts never
+    /// invoke this setter.
+    func setImageGenModel(_ value: String) { setSetting("image_gen.model", value: value) }
+
+    /// `openrouter.response_cache.enabled` — toggles OpenRouter
+    /// response caching for repeat prompts (Hermes v0.13+).
+    /// Capability-gated in `AuxiliaryTab` so pre-v0.13 hosts never
+    /// invoke this setter.
+    // TODO(WS-6-Q1): the YAML key path is provisional — keep in lockstep
+    // with `HermesConfig+YAML.swift`'s parser line.
+    func setOpenRouterResponseCache(_ value: Bool) {
+        setSetting("openrouter.response_cache.enabled", value: value ? "true" : "false")
+    }
+
    // MARK: - Security / Privacy

    func setRedactSecrets(_ value: Bool) { setSetting("security.redact_secrets", value: value ? "true" : "false") }
@@ -139,6 +139,23 @@ struct AuxiliaryTab: View {
                auxRows(for: task.key)
            }
        }
+        // -- Hermes v0.13 additions ---------------------------------
+        // Image-gen model picker. Hermes v0.13 honors `image_gen.model`
+        // as a top-level YAML key; pre-v0.13 hosts ignore it silently.
+        // Hide the section on pre-v0.13 hosts to spare users a
+        // "I set this and nothing happened" trap.
+        if capabilitiesStore?.capabilities.hasImageGenModel ?? false {
+            SettingsSection(title: "Image Generation", icon: "photo") {
+                imageGenRow
+            }
+        }
+        // OpenRouter response caching toggle (v0.13+). Same hide-on-
+        // pre-v0.13 rationale: the toggle no-ops on older Hermes hosts.
+        if capabilitiesStore?.capabilities.hasOpenRouterResponseCache ?? false {
+            SettingsSection(title: "OpenRouter", icon: "shippingbox") {
+                openRouterResponseCacheRow
+            }
+        }
        // Unknown / unrecognised aux tasks present in config.yaml.
        // Shown only when at least one such key is present so the
        // typical user with a clean config never sees this section.
@@ -225,6 +242,60 @@ struct AuxiliaryTab: View {
        }
    }

+    // MARK: - v0.13 surfaces
+
+    /// Image-gen model picker — curated allowlist + free-form custom
+    /// entry. Capability-gated by the caller; this view assumes the
+    /// host honors `image_gen.model` (Hermes v0.13+).
+    @ViewBuilder
+    private var imageGenRow: some View {
+        let value = viewModel.config.imageGenModel
+        Picker("Model", selection: Binding(
+            get: { value },
+            set: { viewModel.setImageGenModel($0) }
+        )) {
+            Text("Provider default").tag("")
+            Divider()
+            ForEach(ModelCatalogService.imageGenModels) { model in
+                Text(model.display).tag(model.modelID)
+            }
+            // User has set a custom value not in the curated list;
+            // preserve it as a tagged option so the picker renders the
+            // actual selection rather than collapsing to "Provider
+            // default".
+            if !value.isEmpty
+                && !ModelCatalogService.imageGenModels.contains(where: { $0.modelID == value }) {
+                Divider()
+                Text(value + "  (custom)").tag(value)
+            }
+        }
+        .pickerStyle(.menu)
+        EditableTextField(label: "Custom model ID", value: value) { newValue in
+            viewModel.setImageGenModel(newValue.trimmingCharacters(in: .whitespaces))
+        }
+        Text("Used for image generation calls. Leave as Provider default unless your provider documents a specific model ID for image-gen.")
+            .font(.caption2)
+            .foregroundStyle(.tertiary)
+            .padding(.horizontal, 12)
+            .padding(.bottom, 4)
+    }
+
+    /// OpenRouter response-caching toggle (Hermes v0.13+). Off by
+    /// default; surfaced for users with highly repeated prompts who
+    /// want OpenRouter to cache identical-prompt responses.
+    @ViewBuilder
+    private var openRouterResponseCacheRow: some View {
+        let isOn = viewModel.config.openrouterResponseCacheEnabled
+        ToggleRow(label: "Response caching", isOn: isOn) { newValue in
+            viewModel.setOpenRouterResponseCache(newValue)
+        }
+        Text("OpenRouter caches identical prompts within a session to reduce token costs. Off by default — enable when your workload has highly repeated prompts.")
+            .font(.caption2)
+            .foregroundStyle(.tertiary)
+            .padding(.horizontal, 12)
+            .padding(.bottom, 4)
+    }
+
    private func auxModel(for key: String) -> AuxiliaryModel {
        switch key {
        case "vision": return viewModel.config.auxiliary.vision