M7 #2: non-retryable ACP errors surface as chat error banner

Pass-1 hit HTTP 404 from Nous Portal (misconfigured model), the
agent reported it via ACP stderr + stopReason="error", and ScarfGo
showed nothing — users saw only the perpetual working spinner. Mac
had an errorBanner for this pattern; ScarfGo didn't.

Promotes the error-banner state and helpers from Mac's ChatViewModel
(Mac target) into RichChatViewModel (ScarfCore) so both apps share:

- `acpError`, `acpErrorHint`, `acpErrorDetails` — the banner triplet.
- `clearACPErrorState()` — called on reset() and addUserMessage()
  so stale errors don't linger across prompts.
- `recordACPFailure(_:client:)` — populate triplet from a thrown
  error + stderr tail, using the existing `ACPErrorHint.classify`.
- `recordPromptStopFailure(stopReason:client:)` — populate triplet
  from a non-retryable ACP `promptComplete` stopReason. Provides a
  fallback hint per stopReason when classify doesn't match.
- `acpStderrProvider: () async -> String` — closure the controller
  sets once so `handlePromptComplete` (called from the event stream)
  can pull recent stderr without the VM holding a direct ACPClient
  reference.

Mac ChatViewModel's local triplet becomes forwarding properties to
richChatViewModel.* — call sites (~15 in ChatViewModel) stay
unchanged. `recordACPFailure` + `clearACPErrorState` become one-line
forwarders.

ScarfGo ChatView gains an `errorBanner` modeled on the Mac one:
- Orange triangle + hint + raw error
- Expand/collapse "Details" button showing stderr tail (monospaced,
  scrollable, max ~140pt tall)
- Copy-all button via `UIPasteboard.general.string` (Mac uses
  NSPasteboard; same structure otherwise)
- Rendered above the message list so it's always visible

ChatController wires `acpStderrProvider` to
`{ await client?.recentStderr ?? "" }` before the handshake and
calls `recordACPFailure` on ACP client start / newSession /
sendPrompt failure paths. `handlePromptComplete` already handles
the common provider-404 case via `recordPromptStopFailureUsingProvider`.

Both schemes build green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Alan Wizemann
2026-04-24 13:17:51 +02:00
parent 8e14e0e776
commit 96f60a176d
3 changed files with 207 additions and 23 deletions
@@ -79,6 +79,94 @@ public final class RichChatViewModel {
return last.isAssistant && !last.content.isEmpty return last.isAssistant && !last.content.isEmpty
} }
// MARK: - Error banner state (shared macOS + iOS)
/// Human-readable error message shown in the chat's error banner.
/// Nil = no active error. Populated from `recordACPFailure(...)`
/// (throws from ACP ops) and from `handlePromptComplete` when the
/// response's `stopReason` is `"error"` (non-retryable provider
/// failures like Nous Portal HTTP 404 for an unknown model
/// pass-1 M7 #2).
public var acpError: String?
/// Short hint derived from the error + stderr tail (e.g.
/// "set ANTHROPIC_API_KEY" or "pick a different model this
/// one isn't in the provider's catalog"). Shown above the raw
/// error in the banner when present. Classified by
/// `ACPErrorHint.classify(errorMessage:stderrTail:)`.
public var acpErrorHint: String?
/// Tail of stderr captured from `hermes acp` at the time of the
/// failure. Shown in a collapsible "Show details" section so
/// users can copy-paste the raw output into a bug report.
public var acpErrorDetails: String?
/// Optional stderr-tail provider the controller can hook up when it
/// creates the ACPClient. Used by `handlePromptComplete` to enrich
/// the error banner on non-retryable stopReasons. The closure is
/// called async so callers can await `ACPClient.recentStderr`
/// without blocking the MainActor. Defaults to nil (no stderr in
/// banner, just the hint fallback).
public var acpStderrProvider: (@Sendable () async -> String)?
/// Clear the error triplet. Call on session reset / new chat /
/// successful new prompt so stale errors don't linger.
public func clearACPErrorState() {
acpError = nil
acpErrorHint = nil
acpErrorDetails = nil
}
/// Populate the error triplet from a thrown Error + the ACPClient
/// we can query for recent stderr. Safe to call from anywhere
/// that catches an ACP op failure.
public func recordACPFailure(_ error: Error, client: ACPClient?) async {
let msg = error.localizedDescription
let stderrTail = await client?.recentStderr ?? ""
let hint = ACPErrorHint.classify(errorMessage: msg, stderrTail: stderrTail)
acpError = msg
acpErrorHint = hint
acpErrorDetails = stderrTail.isEmpty ? nil : stderrTail
}
/// Populate the error triplet when `handlePromptComplete` sees a
/// non-`end_turn` stopReason (i.e. the provider rejected the
/// prompt and Hermes correctly surfaced it via ACP). The hint
/// classifier reads the stderr tail; for stopReason="error" cases
/// the tail typically contains the provider's HTTP status + reason.
public func recordPromptStopFailure(stopReason: String, client: ACPClient?) async {
let msg = "Prompt ended without a response (stopReason: \(stopReason))."
let stderrTail = await client?.recentStderr ?? ""
let hint = ACPErrorHint.classify(errorMessage: msg, stderrTail: stderrTail)
?? Self.fallbackHint(for: stopReason)
acpError = msg
acpErrorHint = hint
acpErrorDetails = stderrTail.isEmpty ? nil : stderrTail
}
/// Same as `recordPromptStopFailure` but pulls stderr from the
/// `acpStderrProvider` closure the controller registered. Used by
/// `handlePromptComplete` where we don't have direct ACPClient
/// access.
private func recordPromptStopFailureUsingProvider(stopReason: String) async {
let msg = "Prompt ended without a response (stopReason: \(stopReason))."
let stderrTail = await acpStderrProvider?() ?? ""
let hint = ACPErrorHint.classify(errorMessage: msg, stderrTail: stderrTail)
?? Self.fallbackHint(for: stopReason)
acpError = msg
acpErrorHint = hint
acpErrorDetails = stderrTail.isEmpty ? nil : stderrTail
}
private static func fallbackHint(for stopReason: String) -> String? {
switch stopReason {
case "error": return "The provider returned an error. Check the details below — often the configured model isn't in the provider's catalog."
case "refusal": return "The session may have been cleared on the server. Start a new chat to continue."
case "max_tokens": return "The response was cut off before any content was produced. Try a shorter prompt or raise the max-tokens limit in Settings."
default: return nil
}
}
// Cumulative ACP token tracking (ACP returns tokens per prompt but DB has none) // Cumulative ACP token tracking (ACP returns tokens per prompt but DB has none)
public private(set) var acpInputTokens = 0 public private(set) var acpInputTokens = 0
public private(set) var acpOutputTokens = 0 public private(set) var acpOutputTokens = 0
@@ -165,6 +253,9 @@ public final class RichChatViewModel {
acpInputTokens = 0 acpInputTokens = 0
acpOutputTokens = 0 acpOutputTokens = 0
acpThoughtTokens = 0 acpThoughtTokens = 0
acpError = nil
acpErrorHint = nil
acpErrorDetails = nil
acpCachedReadTokens = 0 acpCachedReadTokens = 0
acpCommands = [] acpCommands = []
pendingPermission = nil pendingPermission = nil
@@ -197,6 +288,10 @@ public final class RichChatViewModel {
/// Add a user message immediately (before DB write) for instant UI feedback. /// Add a user message immediately (before DB write) for instant UI feedback.
public func addUserMessage(text: String) { public func addUserMessage(text: String) {
// Fresh prompt clear any stale error banner from a prior
// failed attempt so we don't show "old error" + "still thinking"
// simultaneously. Matches the Mac ChatViewModel pattern.
clearACPErrorState()
let id = nextLocalId let id = nextLocalId
nextLocalId -= 1 nextLocalId -= 1
let message = HermesMessage( let message = HermesMessage(
@@ -408,6 +503,13 @@ public final class RichChatViewModel {
finishReason: response.stopReason, finishReason: response.stopReason,
reasoning: nil reasoning: nil
)) ))
// Pass-1 M7 #2: surface the same failure as a top-of-chat
// error banner with the stderr tail, so users don't have
// to rely solely on the system-message to understand why
// nothing happened. The controller registers
// `acpStderrProvider`; if absent, the banner still shows
// with the hint fallback.
Task { await self.recordPromptStopFailureUsingProvider(stopReason: response.stopReason) }
} }
// Accumulate token usage from this prompt // Accumulate token usage from this prompt
+84 -1
View File
@@ -38,6 +38,7 @@ struct ChatView: View {
var body: some View { var body: some View {
VStack(spacing: 0) { VStack(spacing: 0) {
errorBanner
messageList messageList
Divider() Divider()
composer composer
@@ -176,6 +177,75 @@ struct ChatView: View {
.background(.regularMaterial) .background(.regularMaterial)
} }
@State private var showErrorDetails: Bool = false
/// Inline error banner rendered above the message list when the
/// ACP layer signals a non-retryable failure (provider HTTP 4xx,
/// malformed model, missing credentials). Mirrors the Mac pattern
/// in scarf/scarf/Features/Chat/Views/ChatView.swift:errorBanner;
/// both now pull from RichChatViewModel's shared error triplet.
/// Pass-1 M7 #2 previously errors vanished into stderr and the
/// user saw a perpetual spinner.
@ViewBuilder
private var errorBanner: some View {
if let err = controller.vm.acpError {
VStack(alignment: .leading, spacing: 6) {
HStack(alignment: .top, spacing: 8) {
Image(systemName: "exclamationmark.triangle.fill")
.foregroundStyle(.orange)
VStack(alignment: .leading, spacing: 2) {
if let hint = controller.vm.acpErrorHint {
Text(hint)
.font(.callout)
.textSelection(.enabled)
}
Text(err)
.font(.caption)
.foregroundStyle(.secondary)
.textSelection(.enabled)
.lineLimit(showErrorDetails ? nil : 2)
}
Spacer(minLength: 4)
if controller.vm.acpErrorDetails != nil {
Button(showErrorDetails ? "Hide" : "Details") {
showErrorDetails.toggle()
}
.buttonStyle(.borderless)
.controlSize(.small)
}
Button {
let payload = [
controller.vm.acpErrorHint,
err,
controller.vm.acpErrorDetails
]
.compactMap { $0 }
.joined(separator: "\n\n")
UIPasteboard.general.string = payload
} label: {
Image(systemName: "doc.on.doc")
}
.buttonStyle(.borderless)
.controlSize(.small)
}
if showErrorDetails, let details = controller.vm.acpErrorDetails {
ScrollView(.vertical) {
Text(details)
.font(.caption2.monospaced())
.foregroundStyle(.secondary)
.textSelection(.enabled)
.frame(maxWidth: .infinity, alignment: .leading)
}
.frame(maxHeight: 140)
}
}
.padding(.horizontal, 12)
.padding(.vertical, 8)
.frame(maxWidth: .infinity, alignment: .leading)
.background(.orange.opacity(0.12))
}
}
/// Shown while we're opening the SSH exec channel + spawning /// Shown while we're opening the SSH exec channel + spawning
/// `hermes acp` + creating the ACP session. Typically ~0.51.5 s /// `hermes acp` + creating the ACP session. Typically ~0.51.5 s
/// on a warm network silent before this overlay existed, which /// on a warm network silent before this overlay existed, which
@@ -269,10 +339,19 @@ final class ChatController {
) )
self.client = client self.client = client
// Hand the VM a closure that can fetch the ACPClient's recent
// stderr when it needs to enrich the error banner on a non-
// retryable `promptComplete` (pass-1 M7 #2). The VM caches
// this; we only need to set it once per client.
vm.acpStderrProvider = { [weak client] in
await client?.recentStderr ?? ""
}
do { do {
try await client.start() try await client.start()
} catch { } catch {
state = .failed(error.localizedDescription) state = .failed(error.localizedDescription)
await vm.recordACPFailure(error, client: client)
return return
} }
@@ -299,6 +378,7 @@ final class ChatController {
state = .ready state = .ready
} catch { } catch {
state = .failed(error.localizedDescription) state = .failed(error.localizedDescription)
await vm.recordACPFailure(error, client: client)
await stop() await stop()
} }
} }
@@ -319,7 +399,10 @@ final class ChatController {
} catch { } catch {
// The event task may already have surfaced a // The event task may already have surfaced a
// .connectionLost; show the send-time error only if the // .connectionLost; show the send-time error only if the
// state didn't already fail. // state didn't already fail. Always populate the error
// banner so the user sees actionable detail regardless
// of which path raised first (M7 #2).
await vm.recordACPFailure(error, client: client)
if case .ready = state { if case .ready = state {
state = .failed("Prompt failed: \(error.localizedDescription)") state = .failed("Prompt failed: \(error.localizedDescription)")
} }
@@ -82,13 +82,21 @@ final class ChatViewModel {
return acpStatus.hasPrefix("Reconnecting") return acpStatus.hasPrefix("Reconnecting")
} }
} }
var acpError: String? /// Error triplet moved to RichChatViewModel in M7 #2 so ScarfGo can
/// Human-readable hint derived from error + stderr (e.g. "set ANTHROPIC_API_KEY"). /// share the same banner. These are forwarding accessors to keep
/// Shown above the raw error in the UI when present. /// the many existing call sites in this file unchanged.
var acpErrorHint: String? var acpError: String? {
/// Tail of stderr captured from `hermes acp` at the time of the last get { richChatViewModel.acpError }
/// failure shown in a collapsible details section so users can copy/paste. set { richChatViewModel.acpError = newValue }
var acpErrorDetails: String? }
var acpErrorHint: String? {
get { richChatViewModel.acpErrorHint }
set { richChatViewModel.acpErrorHint = newValue }
}
var acpErrorDetails: String? {
get { richChatViewModel.acpErrorDetails }
set { richChatViewModel.acpErrorDetails = newValue }
}
/// True when `hasAnyAICredential()` returned false at last preflight. /// True when `hasAnyAICredential()` returned false at last preflight.
var missingCredentials: Bool = false var missingCredentials: Bool = false
@@ -109,26 +117,17 @@ final class ChatViewModel {
missingCredentials = !fileService.hasAnyAICredential() missingCredentials = !fileService.hasAnyAICredential()
} }
/// Clears the error/hint/details triplet so future failures overwrite /// Forwarders to the ScarfCore implementation so the error-banner
/// cleanly instead of stacking on top of stale state. /// state lives in one place (M7 #2). The per-site logging label
/// stays here only the storage is shared.
private func clearACPErrorState() { private func clearACPErrorState() {
acpError = nil richChatViewModel.clearACPErrorState()
acpErrorHint = nil
acpErrorDetails = nil
} }
/// Populates acpError, acpErrorHint, acpErrorDetails from an error + the
/// stderr tail the ACP client captured, and logs the failure with a
/// site-specific context label. Call on any failure path.
@MainActor @MainActor
private func recordACPFailure(_ error: Error, client: ACPClient?, context: String) async { private func recordACPFailure(_ error: Error, client: ACPClient?, context: String) async {
let msg = error.localizedDescription logger.error("\(context): \(error.localizedDescription)")
logger.error("\(context): \(msg)") await richChatViewModel.recordACPFailure(error, client: client)
let stderrTail = await client?.recentStderr ?? ""
let hint = ACPErrorHint.classify(errorMessage: msg, stderrTail: stderrTail)
acpError = msg
acpErrorHint = hint
acpErrorDetails = stderrTail.isEmpty ? nil : stderrTail
} }
// MARK: - Session Lifecycle // MARK: - Session Lifecycle