Compare commits

...

60 Commits

Author SHA1 Message Date
Alan Wizemann 97ec4d2882 chore: Bump version to 2.7.0 2026-05-05 20:41:39 +02:00
Alan Wizemann cd5bb32a21 release: prep v2.7.0 — consolidated notes + in-app Sparkle release notes
Rolls up everything since v2.6.5 (36 commits across remote-perf,
project wizard, dashboard widgets, OAuth resilience, ScarfMon
instrumentation, and the v2.7 skeleton-then-hydrate redesign) into
a single 2.7.0 release.

* releases/v2.7.0/RELEASE_NOTES.md — full consolidated notes,
  reorganized around the throughline (slow-remote performance) with
  five thematic sections: skeleton-then-hydrate loaders, SSH
  cancellation, project wizard + Keychain cron secrets, dashboard
  widgets, OAuth resilience, and ScarfMon. Replaces the previously-
  drafted dashboard-only v2.7.0 stub and the separate v2.8 wizard
  stub (both unreleased).
* releases/v2.8/ — deleted; folded into v2.7.
* README.md — "What's New in 2.6" → "What's New in 2.7" with the
  five-section summary linking out to the full notes.

* tools/render-release-notes.py — stdlib-only Markdown → HTML
  renderer covering the subset of GitHub-flavored markdown that
  release notes use (## / ### headings, paragraphs, ul lists,
  fenced code, inline code/bold/italic/links, hr). Output includes
  a small <style> block tuned for Sparkle's update alert WebKit
  view (light + dark variants via prefers-color-scheme).
* scripts/release.sh — render the active RELEASE_NOTES.md and
  inject the result as <description><![CDATA[...]]></description>
  on the appcast item. Sparkle's standard updater renders this in
  the in-app update sheet so users see release-specific "what's
  new" alongside the version number, not just the bare version.
  Falls back to a "see GitHub release page" placeholder when the
  notes file is missing.

User runs ./scripts/release.sh 2.7.0 to ship.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 20:31:27 +02:00
Alan Wizemann 5e23b59697 test(model-preflight): cover detect-mismatch + fix newline-trim bug
* New ModelPreflightTests suite (19 tests) covering both `check(_:)`
  and the v2.8 `detectMismatch(_:)` paths. Pins the dogfooding
  scenario (anthropic-prefixed model + nous active provider after
  Credential Pools OAuth swap), the case-insensitive prefix match,
  empty-prefix / empty-bare-model edge cases, and multi-slash model
  ids (OpenRouter style).

* Bug fix surfaced by the tests: `ModelPreflight` was using
  `trimmingCharacters(in: .whitespaces)` which doesn't strip
  newlines. A stray `\n` in a hand-edited config.yaml would either
  miss the missing-fields classifier OR false-positive the mismatch
  banner (showing "anthropic" vs "anthropic\n"). Switched both
  trims to `.whitespacesAndNewlines`.
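
A minimal illustration of the trim difference behind the fix above (Foundation's `.whitespaces` set covers spaces and tabs but not newlines):

```swift
import Foundation

// A hand-edited config.yaml value that picked up a stray newline.
let provider = "anthropic\n"

// .whitespaces does not include "\n", so the newline survives the trim
// and "anthropic\n" no longer equals "anthropic".
let badTrim = provider.trimmingCharacters(in: .whitespaces)              // "anthropic\n"

// .whitespacesAndNewlines strips it, restoring the expected comparison.
let goodTrim = provider.trimmingCharacters(in: .whitespacesAndNewlines)  // "anthropic"

assert(badTrim != "anthropic")
assert(goodTrim == "anthropic")
```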

perf(observability): instrument Tier C load paths + fetchSessionPreviews

No behavior change — adds ScarfMon coverage so future captures show
how often Memory/Skills/Cron/Curator/SessionPreviews load paths fire
and what they cost on remote (each is multiple sequential SFTP RTTs
that pre-fix were invisible). New events:

* `mac.fetchSessionPreviews` / `.rows` / `.transportError`
* `memory.load` / `.bytes`
* `cron.load` / `.jobs`
* `skills.load` / `.count`
* `curator.load` / `.bytes`

All 321 ScarfCore tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 20:03:35 +02:00
Alan Wizemann 09e33b2999 perf(chat,activity,transport): skeleton-then-hydrate loaders + SSH cancellation propagation
Major perf overhaul for slow-remote contexts. Chats and Activity now
render in <2s instead of timing out at 30s; abandoned SSH work is
killed within 100ms instead of pinning a ControlMaster session.

* Skeleton-then-hydrate chat loader. New `fetchSkeletonMessages`
  selects user+assistant rows only (skips role='tool', NULLs
  tool_calls + reasoning at the SQL level). Wire payload bounded by
  conversational text alone — sub-second on remote regardless of
  underlying tool result blob sizes. Background `startToolHydration`
  pages through `hydrateAssistantToolCalls` (5-id batches) to splice
  tool calls in. Tool-result CONTENT is opt-in via Settings → Display
  → "Load tool results in past chats" (default off); inspector pane
  lazy-fetches per-result via `fetchToolResult(callId:)` on expand.

* Skeleton-then-hydrate Activity loader. New
  `fetchRecentToolCallSkeleton` returns metadata-only rows in ~3 KB
  for 50 entries; placeholder ActivityRows render immediately, real
  per-call entries swap in as paged hydration completes. Loading
  pill in the page header, orange transport-error banner replaces
  the pre-fix silent empty state.

* SSH cancellation propagation. `Task.detached` and unstructured
  `Task<...> { ... }` don't inherit cancellation from awaiting
  parents — without bridging, killing a Swift Task left the ssh
  subprocess running for the full 30s deadline, pinning a remote
  sqlite query and a ControlMaster session. Wired
  `withTaskCancellationHandler` through `SSHScriptRunner.run` and
  `RemoteSQLiteBackend.query`; cancellation now reaches `Process`
  within ~100ms (see the sketch after this list). New
  `ssh.cancelled` ScarfMon event.

* L1 single-id retry. When a 5-id `hydrateAssistantToolCalls` page
  trips the 30s timeout (one row carries an oversized tool_calls
  blob — long Edit args, big diffs), fall back to single-id queries
  to isolate the whale. Non-whale rows in the same batch hydrate
  normally; whale row stays bare. New `mac.hydrateToolCalls.singleTimeout`
  event tracks how often the recovery fires.

* L2 in-flight coalescing for `loadRecentSessions`. File-watcher
  deltas during streaming used to stack 2-3 parallel sessions-list
  reload tasks; subsequent callers now await the active one. New
  `mac.loadRecentSessions.coalesced` event tracks dedup hits.

* Loading-state UX hardening. New `isStartingSession` flag flips
  synchronously on user click so the chat sidebar greys + disables
  immediately instead of waiting for `client.start()` to return
  (5-7s on remote). Phase-typed status: "Spawning hermes acp…" →
  "Authenticating…" → "Loading session…" → "Loading history…" →
  "Ready". `ChatSessionListPane` overlays a ProgressView showing
  the current phase.

* Partial-result detection. `fetchMessagesOutcome` distinguishes a
  transport failure from a genuine empty result; `loadSessionHistory`
  surfaces "Couldn't load full chat history — connection timed out"
  through the existing acpError triplet so the user sees what
  happened instead of a silent empty transcript.

* Model/provider mismatch banner. `ModelPreflight.detectMismatch`
  recognizes when `model.default` carries a `<provider>/...` prefix
  that disagrees with `model.provider` (e.g. anthropic prefix +
  nous active provider after switching OAuth via Credential Pools).
  Banner offers one-click fix in either direction. Companion: ACP
  error classifier recognizes `model_not_found` / `404 messages`
  and surfaces "Hermes pins each session to its original model —
  start a new chat" so the pinned-model failure mode has a clear
  recovery path.

* OAuth-completion provider swap prompt. After successful OAuth in
  Credential Pools, if the just-authed provider differs from
  `model.provider` in config.yaml, surface "Switch active provider
  to <name>?" with [Switch] / [Keep current] instead of
  auto-dismissing.
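
Below is a minimal, self-contained sketch of the cancellation bridging named in the SSH-cancellation bullet above. It is not the actual `SSHScriptRunner` code; process setup, output capture, and strict-concurrency annotations are elided, and the function name is illustrative.

```swift
import Foundation

// Run a subprocess and terminate it promptly when the awaiting Swift Task is
// cancelled, instead of letting it run out its own deadline.
func runCancellable(_ executable: String, _ arguments: [String]) async throws -> Int32 {
    let process = Process()
    process.executableURL = URL(fileURLWithPath: executable)
    process.arguments = arguments

    return try await withTaskCancellationHandler {
        try await withCheckedThrowingContinuation { (continuation: CheckedContinuation<Int32, Error>) in
            process.terminationHandler = { continuation.resume(returning: $0.terminationStatus) }
            do {
                try process.run()
            } catch {
                continuation.resume(throwing: error)
            }
        }
    } onCancel: {
        // Without this bridge, cancelling the Task leaves the subprocess
        // running until its own timeout expires.
        if process.isRunning { process.terminate() }
    }
}
```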

All 302 ScarfCore tests pass. New ScarfMon events documented in the
Performance-Monitoring wiki page.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 19:43:53 +02:00
Alan Wizemann 9f2e2ecfcd perf(chat): exclude reasoning_content from initial fetch + drop page size to 25
The 160-message thinking-model session still timed out at the 30s
ceiling even after dropping page size 200→50 in commit a193003.
ScarfMon trace:

  mac.fetchMessages    30,105,329,125 ns ← 30s timeout fired
  mac.hydrateMessages.rows  count=1     ← 1 partial row only

Root cause: `reasoning_content` is huge on thinking models (20+
KB per row). Even 50 rows × 30 KB = 1.5 MB JSON shipping over a
420ms-RTT remote SSH channel exceeds the budget. The chat
appeared empty AGAIN.

Three cuts:

1. **`messageColumnsLight`** — same as messageColumns but omits
   `reasoning_content`. Used by `fetchMessages` so the bulk
   wire payload is small. `messageFromRow` reads
   reasoning_content via `row.optionalString(at: 11)` which
   gracefully returns nil when the column isn't present, so the
   shape change is transparent.

2. **`fetchReasoningContent(for:)`** — single-row lazy fetch
   the inspector pane calls when the user expands a thinking
   disclosure. One small SSH round-trip per inspection vs. paying
   for ALL reasoning content on every session boot.

3. **`HistoryPageSize.initial` 50 → 25** — sized for the lite
   column shape with margin for sessions that include some heavy
   tool-call payloads. The "Load earlier" affordance still
   pages back through older messages.

Net effect on the user-reported case: 160-message session loads
the most-recent 25 messages in ~5-10s (one SSH round-trip ~420ms
plus ~3 KB × 25 = 75 KB wire). The remaining 135 are reachable
via Load earlier.
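
For reference, the column split from items 1 and 2 above amounts to two query shapes along these lines (table and column names are assumptions based on this log, not the actual schema):

```swift
// Bulk page: messageColumnsLight omits the heavy reasoning_content column,
// so 25 rows stay small on the wire.
let lightPageSQL = """
SELECT id, session_id, role, content, tool_calls, created_at
FROM messages
WHERE session_id = ?
ORDER BY created_at DESC
LIMIT 25;
"""

// Lazy single-row fetch when the user expands a thinking disclosure.
let reasoningSQL = "SELECT reasoning_content FROM messages WHERE id = ? LIMIT 1;"
```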

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 13:28:40 +02:00
Alan Wizemann 1eb5c92f6a fix(aux-tab): correct nested-YAML parser so unknown-task surface works on remote
Bug 1 — the previous parser collected every indented child under
`auxiliary:` as if it were a task name, including leaf fields
(provider, model, base_url, api_key, timeout). Result: bogus rows
on local where the parser happened to fire, plus field names
polluting the unknown-tasks set (subtracting the known task names
leaves them behind as bogus "unknown" entries).

Bug 2 — the flat-dot-path branch (`auxiliary.X.Y:`) was dead
code. config.yaml is always nested YAML; the dot-path form only
appears in interactive `hermes config get` output, never on
disk. Removing it.

User reported the unknown-tasks section showed on local but not
on remote. Most likely root cause: the buggy parser surfaced
junk on local (where their config has nested-form aux settings)
while the dead flat-path branch never fired on remote either,
so remote silently rendered nothing. With the parser fixed both
contexts now surface real unknown task names if any are
present.

Rewrite as a clean two-pass walker:
- First nested line inside the block locks taskIndent.
- Only collect at exactly taskIndent (skip leaf fields deeper).
- Tolerate CRLF line endings, blank lines, and YAML comments
  without resetting block state.
- Handles 2-space and 4-space indent equally.
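
Condensed to a single pass for brevity, the taskIndent-locking walk looks roughly like this (a sketch, not the shipped parser):

```swift
import Foundation

func auxiliaryTaskNames(in yaml: String) -> [String] {
    var names: [String] = []
    var insideAux = false
    var auxIndent = 0
    var taskIndent: Int?

    for rawLine in yaml.split(separator: "\n", omittingEmptySubsequences: false) {
        let line = rawLine.replacingOccurrences(of: "\r", with: "")          // tolerate CRLF
        let trimmed = line.trimmingCharacters(in: .whitespaces)
        if trimmed.isEmpty || trimmed.hasPrefix("#") { continue }            // blanks/comments keep state

        let indent = line.prefix(while: { $0 == " " }).count
        if !insideAux {
            if trimmed == "auxiliary:" { insideAux = true; auxIndent = indent }
            continue
        }
        if indent <= auxIndent { break }                                      // left the auxiliary: block
        if taskIndent == nil { taskIndent = indent }                          // first nested line locks it
        guard indent == taskIndent, trimmed.hasSuffix(":") else { continue }  // skip deeper leaf fields
        names.append(String(trimmed.dropLast()))
    }
    return names
}
```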

Verified manually with four fixture shapes: 2-space, 4-space,
with-comments-and-blanks, no-aux-block. All correct.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 13:12:55 +02:00
Alan Wizemann bccaba0742 feat(acp,aux): classify resolve_provider_client errors + surface unknown aux tasks
Two fixes for the user-reported "ACP -32603 Internal error" after
removing a Nous OAuth provider while config.yaml still referenced
nous for an auxiliary task. The actual stderr was clear:

  agent.auxiliary_client: resolve_provider_client: nous requested
    but Nous Portal not configured

But Scarf's chat banner showed only the bare JSON-RPC code and
the user had no actionable path through the UI.

**ACPErrorHint.classify** now pattern-matches the
`resolve_provider_client: <name> requested but` stderr line and
extracts the provider name. Surfaces:

  An auxiliary task is configured to use `<name>` but that
  provider isn't authenticated. Open Settings → Aux Models, or
  check ~/.hermes/config.yaml for auxiliary.<task>.provider: <name>
  and switch it to your active provider (or set it to `auto`).

Routed through the existing chat-banner pipeline that already
catches OAuth revocation and missing-credentials errors.
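
The stderr match that feeds that hint is roughly shaped like this (a sketch; the real `ACPErrorHint.classify` covers more patterns and wording):

```swift
import Foundation

// Extracts "nous" from a line like:
//   "agent.auxiliary_client: resolve_provider_client: nous requested but Nous Portal not configured"
func providerFromResolveError(_ stderr: String) -> String? {
    guard let marker = stderr.range(of: "resolve_provider_client: ") else { return nil }
    let tail = stderr[marker.upperBound...]
    guard let end = tail.range(of: " requested but") else { return nil }
    let name = tail[..<end.lowerBound].trimmingCharacters(in: .whitespaces)
    return name.isEmpty ? nil : name
}
```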

**AuxiliaryTab** gains an "Other tasks in config.yaml" section
that surfaces aux task keys present in YAML but not in Scarf's
typed list (vision, web_extract, compression, session_search,
skills_hub, approval, mcp, flush_memories, curator). Common
case: `auxiliary.summarization.provider: nous` left over from
older Hermes versions or hand-edited configs. Each unknown task
gets a one-click "Reset provider" button that writes
`auxiliary.<key>.provider: auto` — the most-actionable fix
for the OAuth-removal failure mode. Detection scans both
flat-dot-path and nested YAML shapes so it works regardless of
how Hermes dumped the file.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 13:00:48 +02:00
Alan Wizemann 4684b9deed feat(credential-pools): OAuth remove button + auto-refresh on auth.json change
User reports the Nous OAuth provider still showed in the
credential pool after they 'removed' it, and Reload didn't help.
Two underlying bugs:

**Bug 1 — no UI path to remove OAuth providers.** The pool view
had a Re-authenticate button on each OAuth row but no remove.
Users who switched active provider thought that removed Nous;
the OAuth tokens stayed in auth.json and the row kept rendering.
Add a trash icon next to Re-authenticate that calls
`hermes auth logout <provider>` after a confirmation dialog.
ViewModel route is `removeOAuthProvider` mirroring
`removeCredential`.

**Bug 2 — view didn't refresh on external auth.json changes.**
Pool view subscribed only to .onAppear and sheet-dismiss. A
terminal `hermes auth logout` or another window's OAuth flow
left the view stale until manually re-entered. Wire up
`fileWatcher.lastChangeDate` so any auth.json mtime tick
triggers a reload (the file watcher already polls auth.json
on the remote SSH path).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 12:46:41 +02:00
Alan Wizemann f6dc45b397 feat(scarfmon): track empty-assistant turns + document Nous quirk
User reports chats "dying" on Nous models — screenshot shows the
assistant bubble stuck with `(°□°) deliberating...` and a
1.7s turn-duration pill (turn DID complete; the content is the
problem). The literal placeholder string isn't in Scarf's source;
it's coming from Hermes or Nous itself when the model emits a
brief thought stream and then fails to produce any visible
output.

ScarfMon trace confirms the failure mode:
  mac.sendViaACP    →  firstThoughtByte (25 bytes)
  mac.handleACPEvent  ✓
  mac.sendPrompt     ✓ (1.7s, normal)
  finalizeStreamingMessage  ✓ (turn cleanly closed)

So Scarf sees no transport error — the turn finalized normally
with empty assistant text plus a small thought stream. The
visible "deliberating" text is content Hermes/Nous chose to
substitute for the missing response.

Adds `mac.emptyAssistantTurn` event (category .chatStream) that
fires whenever a turn finalizes with empty `streamingAssistantText`
and empty `streamingToolCalls`. Bytes carry the thinking-text
length so we can distinguish:
  - bytes=0: total empty turn (model produced nothing)
  - bytes>0: thoughts-only turn (model thought but didn't answer)

Both are user-visible failures. The fix is upstream — Hermes
should refuse to finalize a turn with no response and surface
an error, OR Nous should not return empty responses with the
placeholder string. Document this finding so a future capture
that shows multiple `mac.emptyAssistantTurn` events confirms
the rate / model-correlation.

For now Scarf surfaces the same UX as before (no UI change in
this commit). A follow-on commit could intercept this case and
replace the bubble with a clearer "Model returned no response"
banner, but that requires a confident heuristic for which
empty-finalize cases are real failures vs. legitimate
no-response turns.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 12:40:21 +02:00
Alan Wizemann f2ddcbbd60 feat(model-picker): add search filter to Nous overlay model list
Nous returned 402 models in the recent perf capture (~496 KB of
JSON). The picker's existing top-bar search field already filters
the catalog list (`filteredModels`) but the Nous overlay path
showed all 402 unfiltered, making it nearly unusable.

Add `filteredNousModels` mirroring the `filteredModels` shape:
filters `nousModels` by case-insensitive substring match against
both `id` and `owned_by`. Updates the empty-state overlay so
"no matches" surfaces a different message from "no models
loaded" — the user knows the catalog is fine, the search just
didn't match.
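
The overlay filter amounts to something like the following (`NousModel` here is a stand-in type; the shipped struct has more fields):

```swift
import Foundation

struct NousModel { let id: String; let ownedBy: String }

func filteredNousModels(_ models: [NousModel], query: String) -> [NousModel] {
    let q = query.trimmingCharacters(in: .whitespaces)
    guard !q.isEmpty else { return models }                 // empty search shows the full catalog
    return models.filter {
        $0.id.localizedCaseInsensitiveContains(q) ||
        $0.ownedBy.localizedCaseInsensitiveContains(q)
    }
}
```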

User feedback: "we need a search in the model picker, some of
these lists are large and unorganized."

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 12:38:30 +02:00
Alan Wizemann a193003842 fix(chat): paginate session-load + race-guard against session switch
Two related bugs from remote-context perf captures.

**Bug 1 — 30s timeout fetching the 157-message session.** The
initial page size was 200 messages. For a session including
`reasoning_content` from a thinking model, that produces enough
JSON over `sqlite3 -json | ssh` to time out at exactly 30s on a
420ms-RTT remote, returning 0 rows. Bumping queryTimeout further
just trades latency for stalls.

Drop `HistoryPageSize.initial` from 200 → 50. Sized to fit
comfortably inside the 30s queryTimeout; the existing "Load
earlier" affordance pages back through older messages on demand.

**Bug 2 — session-switch race silently swaps transcripts.** When
the user picks a small chat while a slow fetch for a different
chat is still in flight, the slow fetch finishes second and its
`messages = …` assignment overwrites the small chat's transcript.
User sees the small chat "jump back" to the big one. ScarfMon
trace: parallel `mac.fetchMessages` events at t=641870 (small,
425ms, 2 rows) and t=643316 (big, 30,028ms timeout) — last
write won.

Add a `loadingForSession` capture and three guards: after the
DB refresh, after the primary fetch, after the ACP-fork fetch.
Each compares `self.sessionId` against the captured id; on
mismatch fire `mac.hydrateMessages.dropped` and return without
assigning. Race is silent in normal usage but visible in traces.
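
The guard shape, reduced to one fetch for illustration (names and types are simplified stand-ins, not the RichChatViewModel code):

```swift
@MainActor
final class ChatHistoryLoader {
    var sessionId = ""
    var messages: [String] = []

    func loadHistory(for id: String) async {
        let loadingForSession = id                          // capture before the slow fetch
        let fetched = await fetchMessages(for: id)
        guard sessionId == loadingForSession else {
            // The user switched sessions while this fetch was in flight:
            // drop the stale result instead of overwriting the transcript.
            return
        }
        messages = fetched
    }

    private func fetchMessages(for id: String) async -> [String] { [] }   // placeholder
}
```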

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 12:38:19 +02:00
Alan Wizemann 93a64e3e82 fix(nous-picker): kill 120s beach-ball — dedupe readCache + 5s timeout
Two stacking bugs in the Nous-overlay branch of the model picker
caused a 120-second beach-ball on remote contexts.

**Bug 1 — duplicated readCache.** ModelPickerSheet.refreshNousModels
called `service.readCache()` directly (for instant first-paint),
then called `service.loadModels(forceRefresh: false)` which calls
`readCache()` AGAIN as its first step. Two SSH round-trips per
picker open. Drop the inline call; loadModels is already cache-first
on its happy path (returns `.cache(...)` when fresh). One read
per open.

**Bug 2 — 60s readFile timeout for a hint.** `readCache()` goes
through SSHTransport.readFile which has a 60s default timeout. On
a remote with a corrupted or oversized cache file, `cat` never
returns and we wait the full 60s — twice, due to bug 1, for a
total 120s picker stall. ScarfMon perf capture (commit 00a1bbd's
diagnostic split) localized this precisely:

  nous.readCache.fileExists  =   251 ms  ✓
  nous.readCache.readFile    = 60,011 ms   (60s timeout)

Cache is an optimization, not a requirement. Added
`readCacheWithTimeout(seconds: 5)` that races readCache against
a 5-second sleep via withTaskGroup. On timeout returns nil; caller
treats that as no-cache and falls through to the network fetch
(which succeeded in 2s in the offending capture, returning 402
models). The runaway `cat` keeps running on its own 60s transport
timeout but no longer blocks the picker.
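
A sketch of that race, assuming (as the later SSHScriptRunner fixes ensure) that the underlying read cooperates with cancellation; with a fully non-cooperative read the task group would still wait for it at scope exit:

```swift
import Foundation

func readCacheWithTimeout(seconds: UInt64,
                          readCache: @escaping @Sendable () async -> Data?) async -> Data? {
    await withTaskGroup(of: Data?.self) { group -> Data? in
        group.addTask { await readCache() }                            // the real (possibly slow) read
        group.addTask {
            try? await Task.sleep(nanoseconds: seconds * 1_000_000_000)
            return nil                                                 // deadline branch: treat as no cache
        }
        let winner = await group.next() ?? nil                         // first finisher wins
        group.cancelAll()                                              // ask the loser to stop
        return winner
    }
}
```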

New ScarfMon event: `nous.readCache.timeoutFired` surfaces hits
in traces so we can tell whether the timeout is being exercised
in the wild.

The underlying `cat` hang on the cache file is still unexplained;
the file size (~500KB) shouldn't take 60s on a 420ms-RTT SSH link.
For now: deleting the cache file (`rm ~/.hermes/scarf/nous_models_cache.json`
on the remote) is the workaround. The next picker open will rebuild it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 12:17:45 +02:00
Alan Wizemann 00a1bbd109 feat(scarfmon): split nous.readCache into fileExists/readFile/decode/bytes
Last perf capture showed nous.readCache as a single 60-second
interval — but the function does three things (transport.fileExists,
transport.readFile, JSONDecoder). Splitting the measure points so
the next capture localizes which step actually owns the wall-clock.

Adds:
- nous.readCache.fileExists (interval) — SSH `test -e` round-trip
- nous.readCache.readFile (interval) — SSH `cat` round-trip
- nous.readCache.bytes (event) — payload size of the cache file
- nous.readCache.decode (interval) — JSON parsing cost

If the next 60-second beach ball localizes to readFile, we know
the cache file is somehow huge or the SSH read is hung; if it's
fileExists, the path resolution is the issue; if decode, we have
malformed JSON. All three wear the same outer wrapper so the
existing nous.readCache total stays for trend comparison.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 12:07:43 +02:00
Alan Wizemann 20cc3a2985 perf(sessions): fold sessions+previews into one batched SSH round-trip
Audit Finding 1 — ChatViewModel.loadRecentSessions and
SessionsViewModel.load each fired two sequential `await
dataService.fetch*` calls (sessions + previews), paying the 420 ms
SSH RTT twice on every reload. Visible in ScarfMon traces as
back-to-back `ssh.run` intervals, totaling ~840 ms minimum
overhead per sidebar refresh.

Adds HermesDataService.sessionListSnapshot(limit:) — same shape
as the existing dashboardSnapshot, folds both queries into a
single backend.queryBatch() call. Both call sites switched.

Halves the SSH round-trips for every sidebar load. With Finding 5's
coalescing, redundant parallel reloads also become free. Together,
the 9× redundant queries-per-minute observed in baseline captures
should drop substantially.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 12:07:31 +02:00
Alan Wizemann 432d5b0b52 fix(remote-sqlite): bump query timeout 15s→30s + add in-flight coalescing
Two issues from the perf capture:

1. fetchMessages on a 157-message session timed out at exactly 15.06 s
   (`mac.fetchMessages` interval = 15,062,646,042 ns), then silently
   returned 0 rows. The chat appeared empty but the session had data;
   the timeout was firing before sqlite3 -json could ship the ~50KB
   payload over a 420 ms-RTT SSH link. Bumped queryTimeout to 30 s.
   The streamScript transport-level timeout still fires on truly
   wedged hosts.

2. mac.loadRecentSessions fired twice in parallel at t=960450 +
   t=960584, finishing 134 ms apart — two independent watcher ticks
   each spawning a full 3-query SSH load for the same data. Added
   in-flight request coalescing keyed on the inlined SQL text:
   when a query with the exact same SQL is already pending, second
   caller awaits the first task instead of spawning a new
   subprocess. New ScarfMon event `sqlite.query.coalesced`
   surfaces hits in traces.

Coalescing is surgical — applies to single `query` calls only,
not `queryBatch` (different timeout scaling, concurrent-same-batch
is rare). Avoids serializing independent work.
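
The coalescing from item 2 above boils down to a keyed in-flight map along these lines (row types simplified for the sketch):

```swift
actor QueryCoalescer {
    private var inFlight: [String: Task<[String], Error>] = [:]

    func rows(for sql: String,
              run: @escaping @Sendable () async throws -> [String]) async throws -> [String] {
        if let pending = inFlight[sql] {
            return try await pending.value              // second caller piggybacks on the first
        }
        let task = Task { try await run() }
        inFlight[sql] = task
        defer { inFlight[sql] = nil }                    // clear the slot once the result lands
        return try await task.value
    }
}
```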

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 12:07:19 +02:00
Alan Wizemann 12e152bfea perf(ssh): replace Thread.sleep spin with kernel-wait for runLocal timeout
Audit Finding 3 — every SSH operation funnels through SSHTransport.runLocal,
which used a 100ms Thread.sleep loop while waiting for the timeout. Each
call held one cooperative-pool thread for the full timeout duration with
spin-poll overhead, AND had 100ms granularity on the deadline.

Replace with proc.terminationHandler + DispatchGroup wait — kernel-wakeup
when the process exits (or the deadline fires), no spin. Same one-thread
blocking footprint, but eliminates the per-operation spin work that
inflated query latency 60-70% under concurrent SSH load (visible in
ScarfMon as 7-second mac.loadRecentSessions outliers when sidebar reload +
chat finalize + watcher poll all fired together).
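
The replacement wait looks roughly like this (launch and error handling trimmed; not the full runLocal):

```swift
import Foundation

// Block one thread until the process exits or the deadline fires,
// with a kernel wakeup instead of a 100 ms sleep-poll loop.
func waitForExit(_ process: Process, timeout: TimeInterval) -> Int32? {
    let group = DispatchGroup()
    group.enter()
    process.terminationHandler = { _ in group.leave() }
    do {
        try process.run()
    } catch {
        group.leave()
        return nil                                       // failed to launch at all
    }
    switch group.wait(timeout: .now() + timeout) {
    case .success:
        return process.terminationStatus                 // exited before the deadline
    case .timedOut:
        process.terminate()                              // deadline fired: stop the subprocess
        return nil
    }
}
```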

Minimum-touch fix; full async migration of runLocal documented for
follow-up. The bigger refactor would let cooperative-pool threads
park on a true async suspension during the wait, but requires
propagating async through every ServerTransport caller.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 12:06:58 +02:00
Alan Wizemann 099d73dde8 feat(scarfmon): instrument Nous model catalog + subscription path (beach-ball investigation)
User reported a remote-context beach-ball when opening the model
picker with Nous as the active provider. Existing measure points
showed loadProviders + loadModels at ~315ms each (fast). The
beach-ball must be in the uninstrumented Nous-overlay branch the
picker fires when nous is selected.

Adds four measure points covering every blocking call in that path:

- nous.subscription.loadState (interval, .diskIO) — auth.json read
  via NousSubscriptionService.loadState. Already known to do an SSH
  read; now precisely measurable.
- nous.readCache (interval, .diskIO) — nous_models cache read,
  TWO sequential SSH ops (fileExists + readFile).
- nous.bearerToken (interval, .diskIO) — auth.json read AGAIN inside
  fetchModels. **This is a duplicate read** — loadState already
  parsed the same file moments earlier. Comment-flagged as a
  caching candidate.
- nous.fetchModels (interval, .transport) + .bytes (event) — HTTP
  GET against the Nous /v1/models endpoint with the body byte count
  attached. The most likely beach-ball culprit if the endpoint is
  slow or hung.

After the next capture we'll know which of the four owns the user's
wall-clock; if `nous.bearerToken` shows up alongside
`nous.subscription.loadState` with similar duration, the duplicate
read is also a real cost worth fixing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 11:50:51 +02:00
Alan Wizemann 4efd84c119 feat(projects,cron): new project wizard + keychain env mirror + #75 fix
Three coordinated additions to the project surface:

1. New Project from Scratch wizard. Toolbar entry that scaffolds a
   Scarf-standard project skeleton (`<project>/.scarf/dashboard.json`
   placeholder + `AGENTS.md` marker block), registers it, opens an ACP
   chat session in the project's cwd, and auto-sends a kickoff prompt
   that activates the bundled `scarf-template-author` skill. The skill
   drives the substantive setup conversationally — widgets, optional
   config schema, optional cron, AGENTS.md content.

2. Keychain secrets mirror into ~/.hermes/.env. Cron jobs can now
   reference Keychain-backed config values via env vars named
   `SCARF_<UPPER_SLUG>_<UPPER_FIELDKEY>`. Hermes reloads .env per cron
   tick (cron/scheduler.py:897-903), so credential rotation is free.
   Source of truth stays in the Keychain — config.json keeps
   `keychain://` URIs unchanged. Mirror runs at install, post-install
   Configuration save, uninstall, "Remove from List", and on app
   launch (reconcileAll). Mode 0600 on `.env` enforced by
   LocalTransport's existing `.env` heuristic.

3. Configuration form layout recursion fix (issue #75). Per-stage
   frame sizes on `ConfigEditorSheet` triggered
   `_NSDetectedLayoutRecursion` for projects with manifest.json.
   Stabilized the outer frame at the editing stage's intrinsic size so
   transitions only swap content, never resize the container.

New services:
- `ProjectScaffolder` (Mac) — bare-shell project + AGENTS.md marker
- `SkillBootstrapService` (Mac) — copies bundled skills into ~/.hermes/skills/
- `KeychainEnvMirror` (Mac) — splice/unmirror/reconcileAll over ~/.hermes/.env
- `SecretsEnvBlock` (ScarfCore) — pure marker-block helpers

Bundled skill `scarf-template-author` v1.1.0 ships in
`Resources/BuiltinSkills.bundle/`; SkillBootstrapService copies it
into `~/.hermes/skills/scarf-template-author/` on launch (idempotent +
version-gated). The skill grew a "Using secrets in cron prompts"
section documenting the env-var convention.

Migration: launch reconciler auto-populates .env on first v2.8 launch.
Users with cron prompts authored against the old (broken) pattern need
to update them to use $SCARF_… references — see release notes.

Tests:
- SecretsEnvBlockTests: 24/24 (`swift test --filter SecretsEnvBlock`)
- KeychainEnvMirrorTests: 11/11 (`xcodebuild ... -only-testing:scarfTests/KeychainEnvMirror`)

The idempotent-mirror test caught a real bug: applyBlock's replace
path consumed the trailing newline from blockRange but didn't restore
it, breaking the no-op-when-unchanged contract that the launch
reconciler relies on. Fixed.

v2.8 RELEASE_NOTES.md committed but no release cut yet.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 11:44:23 +02:00
Alan Wizemann bd9bacb8b3 feat(scarfmon): B2 + B3 + iOS dashboard — file watcher, message hydration, dashboard load
Three areas instrumented in this batch. Both targets build clean.

B2 — Mac HermesFileWatcher (FSEvents + remote SSH poll)
- mac.fileWatcher.localFire (event) — every FSEvents change on a
  watched core or project path. High counts during streaming chats
  are normal (state.db-wal ticks per persisted message); high counts
  during idle suggest a runaway watcher install.
- mac.fileWatcher.remoteRestart (event, bytes=path-count) — fires
  once per SSH poller restart, with the union path count attached.
  Frequent restarts mean the project-list update path is churning.
- mac.fileWatcher.remoteDelta (event) — fires per non-empty change
  detected on the SSH poll. Pair with `ssh.streamScript` cadence to
  see actual poll latency.

B3 — Chat session boot + message hydration
- mac.fetchMessages (interval) + .rows (event) — bounded SQL
  fetch from HermesDataService. Catches slow paginated scrolls
  back through long sessions.
- mac.refreshSessionFromDB (interval) — RichChatViewModel's
  post-promptComplete refresh that picks up cost/token data.
- mac.hydrateMessages (interval) + .rows (event) — full session-boot
  hydration in RichChatViewModel.loadSessionHistory. Was the suspected
  trigger of the 22-bubble session-start storms in the Phase 3a
  baseline; now precisely measurable.

iOS Dashboard (resolves the original "out of sync" mystery)
- ios.loadDashboard (interval) — wraps the four dataService.fetch*
  Citadel SFTP round-trips in IOSDashboardViewModel.load().
- ios.allSessions.count (event) — sidebar list size after each
  load, correlates load latency with list growth.
- ios.dashboardRefresh.trigger (event) — fires only on
  pull-to-refresh, separates that entry path from initial appear.

**Architectural finding:** the original v2.6.0 user feedback
("chat out of sync iOS↔Mac on fast LAN") is now firmly attributable
to this — iOS does NOT subscribe to a file watcher. The dashboard
refresh path is appear-time + pull-to-refresh only.
`CitadelServerTransport.watchPaths()` is effectively dead code on
iOS today; nobody calls it. Earlier A1 instrumentation (commit
9df7142) put measure points on it, which is why captures showed
zero `ios.fileWatcher.tick` events. Future work: either add a
foregrounded poll loop to iOS, or thread the file watcher into
the dashboard subscription. Documented in the ScarfMon roadmap
memory.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 23:52:11 +02:00
Alan Wizemann 96af545e66 feat(scarfmon): Tier A2/A3/B1/B4 — sessions, model catalog, dashboard widgets, image encoder
Four parallel instrumentation drops orchestrated by the perf roadmap.
All adds; no logic changes; both targets build clean.

A2 — Mac sessions list reload
- mac.scheduleSessionsRefresh (event) — every file-watcher entry into
  the debounced reload helper. Pair with mac.loadRecentSessions count
  to see how many ticks coalesce per actual reload.
- mac.loadRecentSessions (interval) — full wall-clock from DB open
  through observable assignment.
- mac.recentSessions.count (event) — sidebar list size, correlates
  list growth with reload latency.

A3 — ModelCatalogService loads
- modelCatalog.loadProviders (interval) + .providers.count (event).
- modelCatalog.loadModels (interval) + .models.count (event).
- modelCatalog.validateModel (interval) — covers loadCatalog ->
  transport.readFile, hits disk on every call.
Sync wrap (not measureAsync): the inner Task.detached body is
synchronous; the detached hop is the async boundary.

B1 — Dashboard render
- mac.dashboard.body (event) — ProjectsView body re-eval count.
- dashboard.loadRegistry (interval) — projects.json read + decode.
- widget.markdown_file.load / widget.log_tail.load /
  widget.image.load / widget.cron_status.load (intervals) —
  one per v2.7 file-reading widget. cron_status batches its two
  HermesFileService calls into one tuple-returning measure block
  so the existing two-call shape stays intact.

B4 — Image encoder
- imageEncoder.input.bytes (event) — raw input size.
- imageEncoder.downsample (interval) — full decode/resize/JPEG
  encode round trip across all three platform branches (AppKit,
  UIKit, Linux passthrough).
- imageEncoder.bytes (event) — final encoded JPEG size, lets us
  spot blowup cases.
Sync wrap: encode is nonisolated sync; using measureAsync would
require turning the function async, which is a logic change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 23:38:50 +02:00
Alan Wizemann 9df7142f49 feat(scarfmon): A1 — instrument iOS file-watcher polling cadence
Adds three measure points to CitadelServerTransport.watchPaths:

- ios.fileWatcher.tick (interval) — full poll cycle latency including
  the SSH stat round-trips. > 1500ms here is what 'out of sync' feels
  like — the channel is congested or the host is slow.
- ios.fileWatcher.delta (event) — fires only when the signature
  actually changed. Low delta/tick ratio means we can safely drop
  the 3-second cadence; high ratio means we'd just burn bandwidth.
- ios.fileWatcher.paths (event, bytes=count) — number of paths watched
  per cycle. Explains slow ticks as the project list grows.

Surgical addition; existing 3-second cadence + signature-diff logic
unchanged. With Full mode on, a few minutes of usage on LAN will
tell us empirically whether the cadence can drop to 1s — the
original v2.6.0 user feedback complained 'chat is out of sync'
between iOS and Mac on a fast LAN.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 23:33:30 +02:00
Alan Wizemann 9ff9a018e7 feat(scarfmon,chat): Phase 3b — dampen finalize bursts + Thinking… status + wider loadConfig stack
Three targeted fixes from the Phase 3a baseline.

Bubble-burst dampening (Phase 3b-1):
- RichChatViewModel.finalizeStreamingMessage wraps both the
  streaming-id rewrite and the empty-finalize remove() in a
  no-animation Transaction. The id flip from 0 → permanent value
  was the load-bearing trigger of the 5–8 RichMessageBubble.body
  fires we were seeing 1–2 ms after every `finalizeStreamingMessage`
  interval; SwiftUI ran an animated diff against neighbors and
  re-evaluated their bodies. The new message is content-equal to
  the streaming one — there is no animation worth running.

Thinking… status promotion (Phase 3b-2):
- RichChatViewModel exposes `isStreamingThoughtsOnly` — true while
  a turn is in flight, has emitted thought-stream bytes, and has not
  yet produced any visible assistant text. The Phase 3a baseline
  showed this is where most of the user-perceived "feels slow" lives:
  reasoning models commonly take 3–8 s before producing visible
  output, and Scarf surfaced no specific signal during that window.
- Mac ChatView.displayedStatus promotes the toolbar pill to
  "Thinking…" when the flag is true.
- iOS connectionBanner gains a transient "Thinking…" strip with
  spinner, same trigger condition.

Phase 3a fix-up:
- HermesFileService.loadConfig stack-trace logging widened from
  one frame to a 10-frame window prefixed with "#N", so the actual
  caller is visible past inlined ScarfMon wrappers (the prior log
  surfaced ScarfMon.measure itself, not the loadConfig caller).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 23:14:03 +02:00
Alan Wizemann 0a4f8de492 feat(scarfmon): Phase 3a — diagnostic measure points for chat-render bursts
Adds four targeted measure points so the next baseline capture can
attribute the bubble-re-render storm and the slow sendPrompt to a
specific cause:

- mac.RichChatMessageList.body — distinguishes "the parent is
  re-issuing the ForEach" from "the bubbles are re-rendering on their
  own". If list.body fires once and bubble.body fires N times, churn
  is in the bubbles; if list.body fires N times, the ForEach itself
  is being rebuilt.
- finalizeStreamingMessage (interval) — pinpoints the end-of-stream
  burst trigger. The 20-bubble re-eval burst we saw at the close of
  each turn lines up with this call; measuring it surfaces whether
  it's the streaming-id rewrite, the turn-duration assignment, or
  something downstream.
- firstByte / firstThoughtByte (event) — fires once per turn on the
  first chunk after currentTurnStart is set. Splits user-tap → first
  byte (network + Hermes thinking, the dominant component of the 7-11s
  sendPrompt) from first byte → turn end (Scarf streaming render).
- loadConfig caller hint via os.Logger — when ScarfMon is in Full mode,
  logs the first stack frame above each loadConfig call to the
  com.scarf.mon subsystem so mystery callers (the read at t=264282
  with no apparent trigger in the prior baseline) become traceable
  via `log stream`. Symbol-only, no PII, free outside Full mode.

All four are pure additions — no behavior change, same zero-cost
default-off semantics as Phase 2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 22:47:29 +02:00
Alan Wizemann 3126c34561 feat(scarfmon): chat + transport + sqlite measure points (Phase 2)
Wires ScarfMon measure points into the chat hot path on both targets,
plus the underlying SSH transport and remote-SQLite backend. All
callsites are surgical adds — no behavior change. Cost when ScarfMon
is in `.signpostOnly` (default) is one os_signpost emit per call,
elided by the runtime outside an Instruments session. In `.full` mode
the same callsites also push samples into the in-memory ring buffer.

Render counters (event):
- mac.ChatView.body / ios.ChatView.body — full transcript pane re-evals
- mac.RichMessageBubble.body / ios.MessageBubble.body — per-bubble re-evals

Stream + session (event + interval):
- mac.sendViaACP, mac.sendPrompt — user tap → first-byte
- mac.acpEvent, mac.handleACPEvent — per-event delivery + handle cost
- mac.startACPSession — session boot
- ios.send, ios.startResuming — same shape on iOS
- ios.acpEvent, ios.handleACPEvent — same per-event split on iOS

Transport + SQLite (interval, with byte counts on rows):
- ssh.streamScript (Citadel iOS) — SSH round-trip
- ssh.run (SSHScriptRunner Mac) — SSH round-trip
- sqlite.query, sqlite.queryBatch — Remote SQLite per-call
- sqlite.query.rows — row count + stdout bytes per query

Disk I/O (interval):
- diskIO.loadConfig — config.yaml read + parse
- diskIO.loadCronJobs — cron jobs.json decode

Body counters use the `let _: Void = ScarfMon.event(...)` pattern at
the top of `body` — works inside `@ViewBuilder` and fires on every
re-eval, which is exactly the signal we want.
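
The pattern, shown with a stand-in for the ScarfMon call (the real event signature is not in this log):

```swift
import SwiftUI

enum Mon { static func event(_ name: StaticString) { /* signpost / ring-buffer sample */ } }

struct ExampleBubble: View {
    let text: String
    var body: some View {
        // `let _: Void = ...` is accepted inside @ViewBuilder and runs on
        // every body re-evaluation, which is exactly the counter we want.
        let _: Void = Mon.event("mac.ExampleBubble.body")
        Text(text)
    }
}
```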

To use:
  Mac: Settings → Advanced → Performance Diagnostics → Full
  iOS: Settings → Diagnostics → Performance → Full
Both panels auto-aggregate by (category, name), surface top 20 by
p95, and offer Copy as JSON for sharing in feedback threads.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 22:18:06 +02:00
Alan Wizemann 6cf59c8a44 feat(scarfmon): perf instrumentation plumbing for iOS + Mac (Phase 1)
ScarfMon lands the always-on perf instrumentation harness. Phase 1 ships
the plumbing only; Phase 2 wires the chat measure points.

Core (ScarfCore/Diagnostics/):
- ScarfMon — public API: measure / measureAsync / event with @inline(__always)
  short-circuit when the backend set is empty so the off path is one
  branch + return. Categories are an enum, names are StaticString so
  user content cannot leak through metric tags.
- ScarfMonRingBuffer — fixed-capacity (4096) lock-protected ring; one
  os_unfair_lock per record; summary() aggregates by (category, name)
  with nearest-rank p50/p95; exportJSON() emits a one-line-per-sample
  dump for the Copy as JSON button.
- ScarfMonSignpostBackend — emits os_signpost into a dedicated
  com.scarf.mon subsystem so Instruments → Points of Interest shows
  Scarf's own measure points without a debug build.
- ScarfMonLoggerBackend — Logger(.debug) sink for users running
  `log stream --predicate 'subsystem == \"com.scarf.mon\"'`.
- ScarfMonBoot — three modes (off / signpostOnly / full); persists the
  user's choice in UserDefaults under ScarfMonMode; configure() is
  idempotent and replaces the active backend set atomically.

Tests: 11 cases covering ring ordering / wrap / reset, summary
aggregation, p95 percentiles, event vs interval semantics, install /
isActive, measure + measureAsync (including the throw path), boot
mode transitions, and JSON export round-trip. @Suite(.serialized)
because the suite mutates process-wide backend state.

App wiring:
- ScarfIOSApp.init + ScarfApp.init call ScarfMonBoot.configure(mode:)
  with the persisted mode (default .signpostOnly).
- iOS Settings → Diagnostics → Performance row leads to a list-style
  panel with the segmented mode picker, top-20 stat rows by p95, Copy
  as JSON, and Reset.
- Mac Settings → Advanced gains a ScarfMonDiagnosticsSection with the
  same shape (NSPasteboard for copy).

Open-source by design — no remote upload, no analytics. The ring buffer
never leaves the device unless the user explicitly taps Copy as JSON.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 22:08:21 +02:00
Alan Wizemann 272da6a915 fix(transport,widgets): code-review fixes for v2.7 + iOS Citadel transport
- CronStatusWidgetView: include jobId + lineCount in `.task(id:)` so widget reload fires when dashboard.json changes either field, not only when the file watcher ticks
- CitadelServerTransport.runScript: enforce the timeout via withThrowingTaskGroup race; propagate transport-level Citadel errors as TransportError.other (so RemoteSQLiteBackend.query maps them to BackendError.transport instead of misclassifying as BackendError.sqlite via a fake -1 exit code); throw TransportError.timeout on the deadline branch with partial stdout preserved
- SSHScriptRunner: close fileHandleForReading on stdout/stderr Pipes in the timeout branch (success path already did); check Task.isCancelled inside the busy-wait so a cancelled parent task terminates the subprocess early instead of waiting out the full timeout. Both runOverSSH and runLocally fixed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 21:40:07 +02:00
Alan Wizemann c7bcfd8655 feat(dashboards): v2.7 widget catalog — file-reading widgets, sparkline, typed status, project-wide watch
Major project-dashboard release. Five new widget types (markdown_file, log_tail,
cron_status, image, status_grid), inline sparkline on stat, typed status enum
shared by list + status_grid, structured WidgetErrorCard, and a project-wide
.scarf/ directory watch that picks up files cron jobs write next to dashboard.json.

- ProjectDashboard: extend DashboardWidget with path/lines/jobId/cells/gridColumns/sparkline; add StatusGridCell + ListItemStatus (lenient parse with synonyms)
- HermesFileWatcher: watch each project's .scarf/ dir alongside dashboard.json (local FSEvents + remote SSH mtime poll); updateProjectWatches signature now takes dashboardPaths + scarfDirs
- New widget views: CronStatus, Image, LogTail, MarkdownFile, StatusGrid, plus WidgetErrorCard for structured failure messaging; legacy "Unknown" placeholder replaced everywhere
- WidgetPathResolver: project-root-anchored path resolution that rejects absolute paths + ".." escapes pre and post canonicalization (see the sketch after this list)
- Stat widget gains optional inline sparkline (pure SwiftUI Path, no Charts dep); list widget rows route through typed status with semantic icons + ScarfColor tints
- iOS list widget + unsupported card adopt typed status + warning-toned error card (parity with Mac error styling); new widget types remain Mac-only
- Site mirror: widgets.js renders all five new types (file-reading widgets show annotated catalog placeholders), sparkline SVG, status-grid grid; styles.css adds typed-status palette + error-card + sparkline + grid styles
- Catalog validator: tools/widget-schema.json is the single source of truth; build-catalog.py loads it and enforces per-type required fields. 8 new test cases in test_build_catalog.py covering schema load, v2.7 additions, and missing-required rejection
- Template-author skill (SKILL.md) gains v2.7 Widget Catalog section + canonical status guidance; CONTRIBUTING.md points authors at widget-schema.json; template-author bundle rebuilt
- Localizable.xcstrings picks up auto-extracted strings for the previously-shipped OAuth keepalive feature
- Release notes drafted at releases/v2.7.0/RELEASE_NOTES.md
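
A sketch of the root-anchored resolution mentioned in the WidgetPathResolver line above (rejection rules only; the shipped resolver's exact checks may differ):

```swift
import Foundation

func resolveWidgetPath(_ relative: String, projectRoot: URL) -> URL? {
    // Reject absolute paths and ".." components before touching the filesystem.
    guard !relative.hasPrefix("/"),
          !relative.split(separator: "/").contains("..") else { return nil }

    let candidate = projectRoot.appendingPathComponent(relative)

    // Post-canonicalization check: the resolved path must still live under the project root.
    let root = projectRoot.standardizedFileURL.resolvingSymlinksInPath().path
    let resolved = candidate.standardizedFileURL.resolvingSymlinksInPath().path
    return resolved.hasPrefix(root + "/") ? candidate : nil
}
```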

Backwards compatible — existing dashboard.json renders byte-identically, status synonyms (ok/up/down/active/etc.) keep working.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 21:16:29 +02:00
Alan Wizemann 9d945150e0 fix(chat): suppress 'stop' badge in metadata footer for normal turn ends
Every text-bearing assistant turn finalizes with `finishReason="stop"`
(set by `RichChatViewModel.finalizeStreamingMessage` line 881 — the
standard end-of-turn signal Hermes/ACP/OpenAI all emit). The
`metadataFooter` in `RichMessageBubble` was rendering it
unconditionally, so every assistant bubble carried a `· stop · TIME`
footer. Combined with terse model output (e.g. deepseek-v4-flash
emitting only a brief status line before ending the turn), the
badge created a misleading "the agent gave up" impression — there
was no warning, error, or actual failure.

Match the convention used by ChatGPT, Claude.ai, Cursor, etc.:
suppress the badge for normal end-of-turn (`stop` / `end_turn`),
reserve it for abnormal terminations the user actually wants to
see (`max_tokens`, `length`, `error`, `refusal`, `content_filter`,
…). When it does render, color it with severity tone — warning
yellow for "response cut short" cases, danger red for failures
and refusals, muted otherwise.
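
As a convention, the mapping is roughly (tone names stand in for whatever ScarfColor roles the footer uses):

```swift
enum BadgeTone { case warning, danger, muted }

func badgeTone(for finishReason: String?) -> BadgeTone? {
    switch finishReason {
    case nil, "stop"?, "end_turn"?:
        return nil                                   // normal end of turn: no badge at all
    case "max_tokens"?, "length"?:
        return .warning                              // response cut short
    case "error"?, "refusal"?, "content_filter"?:
        return .danger                               // failures and refusals
    default:
        return .muted                                // anything unexpected stays visible but quiet
    }
}
```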

The existing `handlePromptComplete` system-message-injection path
(line 725-751) for non-`end_turn` stops still surfaces those cases
explicitly at the top of the chat — this change only trims the
always-on badge from the per-message footer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 15:40:31 +02:00
Alan Wizemann fa15634381 fix(oauth-keepalive): drop unsupported --silent flag from cron create
`hermes cron create` only accepts --name, --deliver, --repeat,
--skill, --script, --workdir. The `silent: Bool?` field on
HermesCronJob exists in the JSON model but isn't exposed through
the CLI's create verb today — argparse rejected the unknown flag
with a non-zero exit, so the toggle failed with the generic CLI hint.

Drops the flag; the keepalive runs with Hermes's default delivery.
Token-refresh side effect during session boot is unaffected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 15:33:25 +02:00
Alan Wizemann 3271391506 fix(chat): debounce sidebar reloads so sessions list doesn't flicker mid-stream
ChatView's `.onChange(of: fileWatcher.lastChangeDate)` fired an
unconditional `Task { await viewModel.loadRecentSessions() }` on
every file-watcher tick. During an ACP message stream the watcher
fires 5–10 times per second (every message Hermes persists bumps
`state.db-wal`'s mtime), and each spawned task re-fetched sessions +
previews + project attribution and reassigned `recentSessions` even
though the data was identical. Each reassignment triggered an
@Observable re-render of the chat sidebar; the user saw the chats
list visibly disappear and reappear several times while typing the
first message in a new chat.

Two changes:

* Add `scheduleSessionsRefresh()` to ChatViewModel — coalesces rapid
  ticks into one trailing `loadRecentSessions()` ~500 ms after the
  last tick. ChatView's onChange now calls this instead (see the
  sketch after this list). The 500 ms
  window is short enough that idle external changes (a session
  created from another `hermes` invocation, a rename from a
  different window) still appear "soon", and long enough to absorb
  a streaming-response burst.
* Add an explicit `await loadRecentSessions()` to
  `autoStartACPAndSend` after the new session id resolves — the
  debounce would otherwise delay the just-created chat from
  appearing in the sidebar by 500 ms after first send. Mirrors what
  `startACPSession` already does at line 619 for the explicit New /
  Resume paths.
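
The coalescing helper from the first bullet amounts to a trailing-edge debounce along these lines (names simplified; not the ChatViewModel code):

```swift
@MainActor
final class SessionsRefreshScheduler {
    private var pending: Task<Void, Never>?
    private let reload: () async -> Void

    init(reload: @escaping () async -> Void) { self.reload = reload }

    func scheduleSessionsRefresh() {
        pending?.cancel()                                    // drop the previous tick's pending reload
        pending = Task {
            try? await Task.sleep(nanoseconds: 500_000_000)  // wait out the burst
            guard !Task.isCancelled else { return }
            await reload()
        }
    }
}
```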

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 14:56:59 +02:00
Alan Wizemann 5afd391838 feat(sidebar): promote Projects to first section + move profile chip under server name
Two small UX tweaks to the macOS sidebar:

* Reorder sections so Projects is the top section above Monitor.
  Reflects how users actually start sessions in Scarf — they pick a
  project first, then drill into chat / sessions / etc. The previous
  order put the read-mostly Dashboard at the top, which made
  Projects feel like a secondary surface.
* Move the active-profile chip out of the top header HStack (where
  it competed for horizontal space with the server-name pill) and
  drop it into a second row right-aligned under the server name.
  Top row stays clean: `[icon] Scarf       <server>`. Second row:
  `                              profile: <name>` only on local
  contexts. Same click target, same .help, just better-anchored.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 14:37:29 +02:00
Alan Wizemann 2a368a04f7 feat(window): persist window size + position across app launches
SwiftUI's WindowGroup exposes `.defaultSize` and `.windowResizability`
but no built-in autosave for window frame across launches. The
documented escape hatch is AppKit's
`NSWindow.setFrameAutosaveName(_:)`, which writes the frame to
UserDefaults on resize/move and restores it on next open.

Add a small `WindowFrameAutosave` NSViewRepresentable that finds its
hosting NSWindow on first appear and stamps the autosave name. Apply
it to `ContextBoundRoot` keyed off `context.id` so each open server
window remembers its own geometry. New servers fall back to the
WindowGroup's `.defaultSize(1100, 700)` until the user resizes once.
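
A sketch of that bridge (the real view is keyed off the server context id; edge cases are omitted):

```swift
import SwiftUI
import AppKit

struct WindowFrameAutosave: NSViewRepresentable {
    let name: String

    final class AutosaveView: NSView {
        var autosaveName = ""
        override func viewDidMoveToWindow() {
            super.viewDidMoveToWindow()
            // window is non-nil once the view lands in a window hierarchy;
            // setFrameAutosaveName persists and restores the frame via UserDefaults.
            _ = window?.setFrameAutosaveName(autosaveName)
        }
    }

    func makeNSView(context: Context) -> AutosaveView {
        let view = AutosaveView()
        view.autosaveName = name
        return view
    }

    func updateNSView(_ nsView: AutosaveView, context: Context) {
        nsView.autosaveName = name
    }
}
```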

A previous WIP attempt (dd4a61f) tried to use a fictional
`.windowFrameAutosaveName(...)` SwiftUI modifier that doesn't exist —
which is why it was never merged. This works because we go through
AppKit directly.

Also picks up Xcode's auto-extracted cron-related Localizable.xcstrings
entries that had been pending.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 14:34:08 +02:00
Alan Wizemann 9aa901a286 fix(credential-pools): refresh view after OAuth sheet dismiss
The sheet auto-closes 0.8s after `oauthFlow.succeeded` flips, but
the parent view didn't reload — so the expiry badge stayed red and
the `tokenTail` stayed stale until the user hit Reload. Hook
`viewModel.load()` + `probeKeepalive()` into the sheet's
`onDismiss` so the freshly-written `auth.json` lands on screen
immediately. Runs on every dismiss (success or cancel) — `load()`
is cheap and idempotent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 14:33:22 +02:00
Alan Wizemann 111fe9bb67 feat(oauth): unblock remote re-auth + daily keepalive to prevent expiry
Two related fixes for OAuth subscriptions (Nous Portal, Anthropic
Claude OAuth, etc.):

- **Remote re-auth stall**: Both `NousAuthFlow` and
  `OAuthFlowController` set `PYTHONUNBUFFERED=1` only on local
  contexts. On remote, setting `proc.environment` only affects the
  local-side ssh process — not the remote python interpreter. ssh
  doesn't forward arbitrary env vars without `SendEnv` configured on
  both sides, so remote hermes ran with default block-buffered stdout
  and the device-code prompt never reached Scarf — the sheet hung at
  "Contacting Nous Portal" forever. Fix: when remote, wrap the
  command in `env PYTHONUNBUFFERED=1 …` to inject the var on the
  remote side regardless of ssh config.
- **Daily keepalive**: Hermes refreshes OAuth access tokens on agent
  startup but never proactively. If the user goes longer than the
  refresh-token lifetime (~30 days for Nous) without starting a
  session, the refresh token itself expires and full re-auth is
  required. New `OAuthKeepaliveCronService` registers a Scarf-owned
  daily cron job (`[scarf:oauth-keepalive] OAuth token refresh`) at
  4am that runs a minimal one-token prompt — booting the session is
  what triggers `resolve_nous_runtime_credentials()`. Wired as an
  opt-in toggle in the OAuth providers section of CredentialPoolsView.
  When `hermes auth refresh <provider>` lands upstream we'll swap the
  prompt for that verb; the surrounding wiring stays unchanged.
- **Stale-refresh nudge**: `NousSubscriptionState` gains
  `daysSinceLastRefresh()` + `hasStaleRefresh` (>= 14 days, half of
  Nous's 30-day refresh-token window). The keepalive section
  surfaces an inline orange warning when stale and the toggle is
  off — points the user at the toggle that would have prevented the
  problem.

Verification: scarfCore 263/263; Mac app builds clean. Manual repro
of remote stall against Digital Ocean droplet pending user test.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 14:32:06 +02:00
Alan Wizemann 6191c9f19f fix(remote-backend): pre-expand ~/ in Swift via resolvedUserHome
The previous fix (b8b426e) rewrote `~/.hermes/state.db` to
`"$HOME/.hermes/state.db"` and relied on the remote shell to expand
$HOME. That works on Mac SSHTransport (login shell with $HOME set in
the environment) but not reliably through Citadel's exec channel +
base64-decode + inner-/bin/sh pipeline on iOS — the user reports
"unable to open database \"~/.hermes/state.db\"" connecting from
ScarfGo (iOS Simulator) to 127.0.0.1, meaning the literal `~`
character reached sqlite3 untouched.

Switch to client-side expansion: probe remote $HOME once at
RemoteSQLiteBackend.open() via the existing
ServerContext.resolvedUserHome() helper (which uses transport.runProcess
to `echo $HOME` — same code path Hermes CLI calls already exercise
successfully on iOS). Cache the result. quoteForRemoteShell then
substitutes `~/` with the absolute path in Swift before single-
quoting, so sqlite3 receives `/Users/alan/.hermes/state.db` directly
— no nested-shell expansion required.
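
The substitution is essentially the following (a sketch; the shipped helper also falls back to the `"$HOME/..."` double-quoted form when the home probe fails, which is omitted here):

```swift
import Foundation

func quoteForRemoteShell(_ path: String, resolvedHome: String?) -> String {
    var expanded = path
    if let home = resolvedHome {
        if expanded == "~" { expanded = home }
        else if expanded.hasPrefix("~/") { expanded = home + expanded.dropFirst(1) }
    }
    // Single-quote for /bin/sh; embedded single quotes become '\'' so the
    // literal survives intact and no remote-side expansion is needed.
    return "'" + expanded.replacingOccurrences(of: "'", with: "'\\''") + "'"
}

// quoteForRemoteShell("~/.hermes/state.db", resolvedHome: "/Users/alan")
//   -> '/Users/alan/.hermes/state.db'
```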

Falls back to the previous "$HOME/..."-quoted form when the home
probe fails (rare; covers the case where runProcess can't reach the
remote but the user happens to have a working streamScript path).

Mirrors how RemoteBackupService.expandTilde already handles the same
problem upstream.

Refs #74

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 13:40:33 +02:00
Alan Wizemann b8b426ed75 fix(remote-backend): expand ~/ to $HOME so sqlite3 finds the DB
Default-config remotes (Hetzner, Digital Ocean, anything where the
user hasn't overridden remoteHome on the SSHConfig) have
`paths.stateDB == "~/.hermes/state.db"`. The streaming backend was
single-quoting that path, which suppresses tilde expansion, and
sqlite3 itself doesn't expand `~` (that's a shell affordance). Result:
"Error: unable to open database \"~/.hermes/state.db\": unable to open
database file" — the path was reaching sqlite3 with a literal `~`
that it tried to interpret as a directory name.

Replace the single-quote-only `escape(_:)` with `quoteForRemoteShell(_:)`
that mirrors `SSHTransport.remotePathArg`'s pattern: rewrite leading
`~/` to `"$HOME/..."` (double-quoted so the shell expands `$HOME`,
backslash-escaping any embedded backslash, `"`, `$`, or backtick to
keep the literal
intact), bare `~` to `"$HOME"`, and absolute paths get the standard
single-quote-with-`'\''`-escape treatment.

Adds a regression test (`openWithDefaultTildeHomeExpands`) that
exercises the tilde-rewrite end-to-end against a real /bin/sh: places
a fixture state.db at `~/.hermes/state.db` (backing up the user's
real DB if present) and verifies open() + a query both succeed
through the streaming path.

Refs #74

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 13:34:20 +02:00
Alan Wizemann 593b4e62cb feat(remote): replace SQLite snapshot pipeline with SSH query streaming
The remote-DB pipeline pulled the entire state.db down via scp on
every refresh tick. For the issue #74 user (4.87 GB DB) that meant
~7-min per-snapshot wall time even with the size-aware-timeout fix,
~30 GB/hour upload, and data permanently 5–10 minutes stale. This
isn't a bug to patch — it's the wrong architecture for any non-trivial
remote DB.

Replace it with per-query streaming over SSH. Each SQL statement
becomes one ssh round-trip running `sqlite3 -readonly -json` against
the live remote DB. ControlMaster keeps the channel warm at ~5 ms
overhead; sqlite3 cold-start adds ~30–50 ms; total ~50–100 ms per
query vs. the old multi-minute snapshot. Bandwidth scales with query
result size, not DB size.

What changed:

* New `HermesQueryBackend` protocol and two implementations:
  `LocalSQLiteBackend` (libsqlite3 in-process — local performance
  unchanged) and `RemoteSQLiteBackend` (sqlite3 over SSH per query
  with batched-statement support for multi-query view loads); a rough
  sketch of the protocol shape follows this list.
* `SQLValue` and `Row` types as the typed boundary between backends
  and the row parsers. `SQLValueInliner` substitutes `?` placeholders
  with SQLite-escaped literals for the remote-CLI codepath (local
  backend keeps real `sqlite3_bind_*`).
* `ServerTransport` swaps `snapshotSQLite` + `cachedSnapshotPath` for
  `streamScript(_:timeout:)`. SSHTransport delegates to the existing
  `SSHScriptRunner`; CitadelServerTransport (iOS) base64-encodes the
  script + decodes remotely via Citadel's exec channel since stdin
  pipes aren't supported there yet.
* `HermesDataService` becomes a thin facade — every fetch* method
  routes through `backend.query(...)`. Public API is unchanged for
  view-model callers; `lastSnapshotMtime`/`isUsingStaleSnapshot`/
  `staleAge` removed (had zero UI consumers).
* New `dashboardSnapshot()` and `insightsSnapshot(since:)` batched
  calls turn Dashboard's 4-query and Insights' 5-query view loads
  into one SSH round-trip each (~80–100 ms total instead of ~280 ms
  naive). DashboardViewModel and InsightsViewModel updated to use
  them.
* One-time launch migration in `scarfApp` wipes the orphaned
  `~/Library/Caches/scarf/snapshots/` directory (could be 5 GB+ for
  the issue #74 user).
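
A rough sketch of the backend seam those bullets describe — the names come
from the commit, the exact signatures are assumptions:

```swift
import Foundation

// Sketch only — the shipped protocol likely also carries timeouts and batching.
enum SQLValue {
    case null
    case integer(Int64)
    case real(Double)
    case text(String)
    case blob(Data)
}

struct Row {
    let values: [SQLValue]               // positional, in SELECT column order

    func string(at index: Int) -> String? {
        guard case let .text(s) = values[index] else { return nil }
        return s
    }
}

protocol HermesQueryBackend {
    /// One statement -> rows. The local backend binds parameters via sqlite3_bind_*;
    /// the remote backend inlines them as SQLite-escaped literals and ships the SQL
    /// over SSH to `sqlite3 -readonly -json`.
    func query(_ sql: String, parameters: [SQLValue]) async throws -> [Row]
}
```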

JSON parsing detail: sqlite3 -json preserves SELECT column order in
the raw bytes, but `[String: Any]` from NSJSONSerialization doesn't.
The remote backend extracts column ordering by walking the first
object's literal bytes — without this, every positional row read
(`row.string(at: 0)`) would silently return wrong columns.
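
A minimal sketch of that column-order recovery, assuming the flat objects
`sqlite3 -json` emits (string / number / null values, no nesting):

```swift
import Foundation

// Sketch only — recovers SELECT column order from the first row object's raw
// text, since JSONSerialization dictionaries don't preserve key order.
func columnOrder(fromRawJSON data: Data) -> [String] {
    guard let text = String(data: data, encoding: .utf8),
          let start = text.firstIndex(of: "{") else { return [] }

    var columns: [String] = []
    var i = text.index(after: start)

    // Reads a JSON string starting just past its opening quote, honoring \" escapes.
    func readString() -> String {
        var out = ""
        while i < text.endIndex {
            let c = text[i]
            i = text.index(after: i)
            if c == "\\", i < text.endIndex {
                out.append(text[i])                  // keep escaped char, don't terminate
                i = text.index(after: i)
            } else if c == "\"" {
                break
            } else {
                out.append(c)
            }
        }
        return out
    }

    var expectingKey = true                           // true after '{' and after each ','
    while i < text.endIndex {
        let c = text[i]
        if c == "\"" {
            i = text.index(after: i)
            let s = readString()
            if expectingKey { columns.append(s); expectingKey = false }
        } else if c == "," {
            expectingKey = true
            i = text.index(after: i)
        } else if c == "}" {
            break                                     // end of the first row object
        } else {
            i = text.index(after: i)                  // ':' / values / whitespace
        }
    }
    return columns
}
```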

Tests: 41 new across `SQLValueInlinerTests`, `HermesDataServiceBackendTests`
(mock backend) and `RemoteSQLiteBackendTests` (integration via local
sqlite3 binary). Full suite 262/262 passing.

Builds clean on Mac and iOS. Ships as part of v2.7.

Refs #74

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 13:09:06 +02:00
Alan Wizemann de36411a8d fix(remote): size-aware snapshot timeouts and partial-file cleanup (#74)
The remote-DB snapshot pipeline was hardcoded to a 120s scp timeout and
a 60s remote-backup timeout. For users with a multi-GB state.db (the
report cites 4.87 GB), 120s is wildly insufficient — at typical home
upload speeds (5-50 Mbps) a 5GB transfer takes 13 minutes to several
hours. scp gets killed mid-transfer, leaves a partially-written .db at
the cache path, and every subsequent attempt opens that corrupt file
with sqlite_open returning garbage. Symptom: SSH connects, all
diagnostics pass, but Dashboard / Sessions / Memory show no data.

Changes to SSHTransport.snapshotSQLite:

* Probe `stat` on the remote DB before starting. Drives both the
  timeout budget and a local-disk-space pre-flight (refuses to start
  if local Caches volume can't hold size + 500MB margin).
* Adaptive timeouts based on remote size (sketched after this list):
  - backup: 60s base + 1s per 100MB, capped at 600s.
  - scp:    300s base + 0.5s per MB (≈2 MB/s minimum throughput),
            capped at 3600s.
  Defaults of 60s/300s when stat fails (still up from 120s on scp).
* Add `-C` to scp args. SQLite DBs have lots of zero-padded empty
  pages and typically compress 30-50% in transit.
* On any failure path, remove the partial local snapshot file so the
  next attempt starts fresh instead of opening a corrupt DB.
* Rewrite the generic "Command timed out after Ns" error into a
  specific "Snapshot transfer timed out after Ns pulling X.X GB
  state.db from <host>" so users on slow links know what hit the
  wall instead of seeing a meaningless number.
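
A minimal sketch of those budgets (helper name is an assumption; the shipped
logic lives inline in `SSHTransport.snapshotSQLite`):

```swift
import Foundation

// Sketch only — mirrors the formulas listed above.
func snapshotTimeouts(remoteSizeBytes: Int64?) -> (backup: TimeInterval, scp: TimeInterval) {
    guard let bytes = remoteSizeBytes else {
        return (backup: 60, scp: 300)                 // stat failed — keep the raised defaults
    }
    let megabytes = Double(bytes) / 1_048_576
    let backup = min(600, 60 + megabytes / 100)       // +1s per 100 MB, capped at 10 min
    let scp    = min(3600, 300 + 0.5 * megabytes)     // ≈2 MB/s floor, capped at 1 hour
    return (backup: backup, scp: scp)
}
```

For the 4.87 GB database from the report this works out to roughly a 110s
backup budget and about 46 minutes for the transfer.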

Cannot reproduce locally (no 5GB state.db on hand), but the failure
mode is unambiguous from code reading: hardcoded 120s vs. real-world
multi-GB transfer durations.

Closes #74

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 11:25:38 +02:00
Alan Wizemann 6a7ac21ebe chore: Bump version to 2.6.5 2026-05-03 22:15:05 +02:00
Alan Wizemann 5be67282d8 test(layer-b): full Install → Configure → Open → Uninstall journey XCUITest (#73)
Closes the deferred Layer B install-drive that v2.7's smoke test
left as future work. The new test
(`testFullCatalogToInstallToDashboardJourney`) drives the full
install/uninstall pipeline end-to-end and validates 9 assertion
points along the way:

- Window surfaces under `--scarf-test-mode`
- Sidebar navigation to Projects
- Install sheet appears (URL handoff via launch arg)
- Parent-dir field accepts custom path + Continue
- Configure sheet renders + commit clicks
- Confirm Install runs the install pipeline
- Open Project advances to success view
- Project row appears in sidebar with uniquified name
- Right-click Uninstall + confirm Remove + Done removes the row

Runs in ~30s green on the dev Mac.

## What needed wiring up

**SwiftUI Menu / NSToolbarItem accessibility-bridging.** macOS
toolbar Menus don't propagate `.accessibilityIdentifier` through to
XCUITest — neither the menu trigger NOR the popup contents are
queryable by ID. Verified by tree-dump diagnostics. The test
sidesteps this entirely by routing the install URL through a new
`--scarf-test-install-url <https-url>` launch arg that calls
`TemplateURLRouter.shared.handle(scarf://install?url=...)` at App
init, gated on `TestModeFlags.shared.isTestMode`. Production
launches (no flag) untouched.

**Accessibility IDs added** on the new install/uninstall path:
- `templateConfig.commitButton`, `templateConfig.cancelButton`
- `projects.row.<name>`, `sidebar.section.<rawValue>`
- `projects.contextMenu.uninstallTemplate`
- `templateUninstall.confirmRemove`
- `templateInstall.success.openProject`
- `templateUninstall.success.done`

**Sandboxed-runner caveat.** The XCUITest runner's `/tmp` is
sandbox-protected (createDirectory throws EPERM); we use
`NSTemporaryDirectory()` which resolves to the runner's container
tmp (`~/Library/Containers/com.scarfUITests.xctrunner/Data/tmp/`),
which the unsandboxed Scarf app can read since it has full disk
access.

## Known cohabitation hazard (pre-existing uninstaller bug)

If the dev Mac already has a project from the same template
installed, the install pipeline uniquifies the new project's name
("HackerNews Daily Digest 2") but BOTH projects' cron jobs get
registered under the same `[tmpl:awizemann/hackernews-digest] Daily
HN digest` name. `ProjectTemplateUninstaller.loadUninstallPlan`
resolves cron jobs to remove by NAME and can target the wrong
project's job. The Layer B test surfaces this — manifests as: test
passes, the dev's real project's cron job disappears.

**Fix (separate work):** store cron-job IDs in
`<project>/.scarf/template.lock.json` at install time and resolve
by ID at uninstall time. Until then, the test docstring warns
about cohabitation; recovery is `hermes cron create` to recreate
the lost job.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 22:09:50 +02:00
Alan Wizemann c661945a1f feat(cron): auth-error banner + running indicator + per-job log tail (#72)
Cron rows now surface the same OAuth-refresh-revoked recovery flow as
chat instead of a generic red dot, plus three previously-missing
observability cues:

- ACPErrorHint.classify is reused on `job.lastError`. When it returns
  `oauthRefreshRevoked(provider)` the detail pane shows the human hint
  + a "Re-authenticate" button that drops the user into Credential
  Pools via `coordinator.pendingOAuthReauth = provider` — same wiring
  ChatView's banner uses. Unrecognized errors fall back to the legacy
  red `lastError` text (no regression).
- Row dot turns blue + pulses when `state == "running"` (taking
  precedence over disabled / error / success); the detail header gains
  a `ScarfBadge("running…", kind: .info)` next to active/paused. No new
  polling — `HermesFileWatcher.lastChangeDate` (already wired into
  ActivityView/Logs) drives `CronViewModel.load()` so state flips
  surface within a watcher tick.
- "LAST RUN OUTPUT" replaces the inline `LAST OUTPUT` block with a
  collapsible panel: a one-line summary (`<timestamp> — ok|error|running…`)
  always visible, full monospaced terminal-style scroll view on
  expand, auto-scrolls to bottom when new runs land.

Also fixes a pre-existing bug in `HermesFileService.loadCronOutput`:
Hermes nests per-run output under `~/.hermes/cron/output/<jobId>/<ts>.md`
but the loader treated the dir as flat, so the cron output panel never
rendered any content. The fix walks the per-job subdir + keeps the
legacy flat-file fallback for older Hermes layouts.
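
A minimal sketch of that lookup (the legacy flat filename here is an
assumption; the shipped loader also sorts runs and parses timestamps):

```swift
import Foundation

// Sketch only — walks the per-job subdir first, then falls back to a flat file.
func latestCronOutput(hermesHome: URL, jobId: String) -> String? {
    let fm = FileManager.default
    let outputDir = hermesHome.appendingPathComponent("cron/output", isDirectory: true)

    // Current layout: cron/output/<jobId>/<timestamp>.md — newest run wins.
    let jobDir = outputDir.appendingPathComponent(jobId, isDirectory: true)
    if let runs = try? fm.contentsOfDirectory(at: jobDir, includingPropertiesForKeys: nil),
       let newest = runs.filter({ $0.pathExtension == "md" })
                        .max(by: { $0.lastPathComponent < $1.lastPathComponent }) {
        return try? String(contentsOf: newest, encoding: .utf8)
    }

    // Legacy flat layout (assumed filename): cron/output/<jobId>.md
    let flat = outputDir.appendingPathComponent("\(jobId).md")
    return try? String(contentsOf: flat, encoding: .utf8)
}
```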

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 22:09:21 +02:00
Alan Wizemann f5f8dc30b6 Dogfooding templates: HN Digest + in-app catalog browser + test harness (#71)
* feat(templates): hackernews-digest template + dogfooding test harness

First pass of the dogfooding-templates initiative. Each pre-release cycle
ships one new official `.scarftemplate` and uses installing/exercising
that template as the regression test. v1 lands the harness scaffolding
plus the first template under it.

- HackerNews Daily Digest template (`templates/awizemann/hackernews-digest/`):
  config-driven (min_score / max_items / topics) cron-only template.
  No secrets — keeps the harness minimal until the fake-Keychain shim
  lands. Bundle validates against `tools/build-catalog.py`; entry added
  to `templates/catalog.json`.
- `SCARF_HERMES_HOME` env-var override at `HermesProfileResolver` —
  the seam every Layer-B test relies on to drive Scarf against an
  isolated Hermes home. Bypasses cache + active_profile lookup; rejects
  relative paths. 5 unit tests + 3 ServerContext integration tests.
- `TestModeFlags.shared.isTestMode` — reads `--scarf-test-mode` once
  from `CommandLine.arguments`. Wiring only; gating sites (Sparkle,
  capability probe, first-run walkthrough) land as Layer-B exercises
  them.
- Layer A (`scarf/scarfTests/TemplateE2ETests.swift`): parses + plans
  the shipped HN bundle the way the app does at install time;
  asserts manifest, config schema, dashboard widgets, and cron prompt
  contract. Mirrors the existing site-status-checker coverage.
- Layer B scaffold (`scarf/scarfUITests/TemplateInstallUITests.swift`):
  proves the launch-arg + env-var plumbing reaches Scarf. Full install
  click-through deferred until fixture-Hermes-home and accessibility
  IDs land.

Wiki pages added separately on the `.wiki-worktree` branch:
- `Template-Ideas.md` — backlog of 9 v1-feasible templates +
  full-spec v3 epic for Project-Site-as-Living-Surface (eBay listings
  use case).
- `Test-Harness.md` — contributor guide for extending the harness.

Verification: scarfTests 124/124, ScarfCore 220/220, new Layer A 3/3,
Layer B scaffold 1/1, build-catalog.py + its 28 unit tests all green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(test-harness): Layer B pivot to real ~/.hermes + a11y IDs + Sparkle gating

Discovered during Layer B work that XCUITest runners are sandboxed:
they can read ~/.hermes/ but writes throw NSFileWriteNoPermissionError.
That kills the SCARF_HERMES_HOME-based isolation pattern for UI tests —
snapshot/restore from inside the runner can't work. Pivot:

- Layer B drives the real ~/.hermes the dev Mac is already running
  against. The harness assumes a working Hermes install (XCTSkip if
  the binary isn't there). Cleanup is via the app's own UI flows
  (which have full disk access), not direct file I/O. Layer A keeps
  its env-var seam — those tests run inside the host app's address
  space and write freely.
- SwiftUI's WindowGroup(for: ServerID.self) doesn't auto-surface a
  window on a fresh XCUIApplication.launch(). The harness sends ⌘1
  (the "Open Server → Local" menu shortcut wired in scarfApp.swift's
  OpenServerCommands) to take the same code path real users hit via
  Dock click.
- Real user home resolved via getpwuid(getuid()) rather than
  NSHomeDirectory(), which inside the sandboxed runner returns
  ~/Library/Containers/com.scarfUITests.xctrunner/Data.
- 8 accessibility IDs added on the install path so the next iteration
  can drive the full Templates → Install from URL → Parent dir →
  Confirm Install flow without depending on view-tree label scraping:
  templates.toolbar.menu, templates.installFromFile,
  templates.installFromURL, templates.installURL.field,
  templates.installURL.confirm, templateInstall.parentDir.field,
  templateInstall.parentDir.continue, templateInstall.confirmInstall.
- TestModeFlags.shared.isTestMode now gates UpdaterService —
  --scarf-test-mode launches Sparkle inert so update prompts don't
  pop on top of an XCUITest-driven window. Production launches
  unchanged.

FixtureHermesHome.swift removed — the fixture-tmpdir approach is
abandoned in favour of using the real installation. Layer A's
SCARF_HERMES_HOME tests still pass; they just don't need a populated
home to exercise path derivation.

Verification: scarfTests 124/124, ScarfCore 220/220, Layer B smoke
1/1 (after fresh build — XCUITest is sensitive to stale binaries).
catalog.py --check still green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(chat): clip placeholder to TextEditor bounds and clear it on focus

Two related bugs in the Mac chat composer's placeholder overlay:

* The "Message Hermes… / for commands · drag images to attach" hint had
  no width constraint, so on narrower window geometries it visibly
  overflowed past the rounded TextEditor boundary. Add `lineLimit(1)`,
  `truncationMode(.tail)`, and `frame(maxWidth: .infinity, alignment:
  .leading)` so it ellipsizes inside the field instead.
* The opacity formula `text.isEmpty ? 1 : 0` only hid the placeholder
  once content was typed, not when the field gained focus. Standard
  NSTextField / UITextField semantics clear the placeholder on focus.
  Switch to `(text.isEmpty && !isFocused) ? 1 : 0` so the hint
  disappears the moment the user clicks into the field.

The opaque-background ghosting mitigation from #65 is preserved
unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(chat): surface OAuth refresh-revoked errors with in-app re-auth

When an OAuth provider's refresh token was revoked, Hermes printed
"Refresh session has been revoked. Run `hermes model` to re-authenticate."
to stderr but Scarf swallowed it — the user saw a typing indicator that
silently disappeared with no banner, no system message, no actionable
hint. The error classifier had no pattern for OAuth revocation.

- `ACPErrorHint.classify` now returns a `Classification` struct
  carrying the hint plus an optional `oauthProvider` name. New
  patterns match "Refresh session has been revoked", "re-authenticate",
  and 401-with-OAuth-provider-name (whole-word so `anthropicapi`
  doesn't false-match `anthropic`). Provider extraction lets the UI
  dispatch the right re-auth flow.
- Chat error banner ([ChatView.swift]) gains a "Re-authenticate" button
  when an OAuth provider was identified — sets
  `AppCoordinator.pendingOAuthReauth` and routes to Credential Pools.
- Credential Pools view consumes the hand-off slot to auto-present
  AddCredentialSheet seeded with the affected provider, AND adds a
  per-row "Re-authenticate" button on every OAuth provider so users
  who go straight there don't have to retype the provider name.
- `AddCredentialSheet` accepts an optional `initialProvider` that
  pre-fills providerID + authType=.oauth; the existing Nous-vs-PKCE-
  vs-CLI gate dispatches re-auth identically to first-time setup —
  reuses the same `OAuthFlowController` / `NousSignInSheet` plumbing,
  no new flow code.

Verification: ScarfCore 221/221 (incl. new
errorHintsClassifyOAuthRefreshRevoked covering the four patterns +
word-boundary guard); Mac app builds clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(catalog): in-app template catalog browser + sentinel-marker test isolation

The v2.8 catalog browser surfaces every shipped .scarftemplate from
awizemann.github.io/scarf/templates/catalog.json directly in Scarf.
Users now discover and install templates without leaving the app.
Closes the gap that publishing the catalog updated the website but
nothing inside Scarf.

Architecture mirrors NousModelCatalogService 1:1: cache-first fetch,
24h TTL at ~/.hermes/scarf/catalog_cache.json, result enum (fresh /
cache / fallback) with bundled fallback so a fresh-install / offline
user still sees something. Search + category filter + sort
(awizemann official first). Detail page renders entry.config schema
preview without separate README fetch — what's in catalog.json is
what we render. Install hands the HTTPS URL to the existing
TemplateInstallerViewModel.openRemoteURL flow; nothing about the
installer itself changes.

Files:
- Core/Models/CatalogEntry.swift — Decodable mirror of catalog.json
  per-template shape. Identity-based Equatable/Hashable on `id`.
- Core/Services/CatalogService.swift — fetch + cache + fallback
- Core/Services/InstalledTemplatesIndex.swift — walks projects.json +
  template.lock.json to build [templateId: version] map; classify()
  helper for Installed / Update available / Not installed badges
- Features/Templates/ViewModels/CatalogViewModel.swift — @Observable
- Features/Templates/Views/{CatalogView,CatalogRowView,CatalogDetailView,CatalogCategoryFilter}.swift
- Packages/ScarfCore/.../HermesPathSet.swift — adds catalogCache path
- Features/Projects/Views/ProjectsView.swift — Templates toolbar
  menu now opens with "Browse Catalog…"; sheet binding.

Tests (20 new, all passing in isolation):
- CatalogServiceTests (6) — live catalog.json snapshot, cache lifecycle,
  staleness boundary, schema-version mismatch rejection, bundled fallback
- InstalledTemplatesIndexTests (5) — empty registry, templated project,
  ad-hoc project skip, corrupt lock skip, classify() branches
- CatalogViewModelTests (6) — search filter, category filter, official-first
  sort, deduped categories, install state, install URL pass-through

Accessibility IDs (6, on the catalog path): templates.browseCatalog,
catalog.searchField, catalog.refreshButton, catalog.row.<detailSlug>,
catalog.categoryFilter, catalogDetail.installButton.

## Sentinel-marker hardening on SCARF_HERMES_HOME (incident response)

While iterating on v2.8 tests, the env-var override pattern racing
under Swift Testing's parallel-suite scheduler caused
~/.hermes/scarf/projects.json to be overwritten with fixture data
from ProjectsViewModelTests. Recovered the user's projects from the
on-disk dirs they referenced + cron-job prompt paths (6 projects
restored).

To make this class of incident impossible going forward:
HermesProfileResolver.scarfHermesHomeOverride() now requires the
override path to contain a sentinel marker file
(`.scarf-test-home-marker`). Without the marker, the override is
ignored and Scarf falls through to the real ~/.hermes/. Even if a
test crashes mid-teardown leaving the env var set, even if the var
leaks to a non-test process, even if a misconfigured launchctl plist
exports it — the override only activates against directories that
explicitly opt in by carrying the marker. Tests drop the marker in
their tmpdir setUp; production never carries it.
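
A minimal sketch of that guard (the real resolver also handles caching,
logging, and the active-profile lookup):

```swift
import Foundation

// Sketch only — the override is honored only when the sentinel marker exists.
func scarfHermesHomeOverride(
    environment: [String: String] = ProcessInfo.processInfo.environment
) -> URL? {
    guard let raw = environment["SCARF_HERMES_HOME"], raw.hasPrefix("/") else {
        return nil                                    // unset or relative — ignore
    }
    let home = URL(fileURLWithPath: raw, isDirectory: true)
    let marker = home.appendingPathComponent(".scarf-test-home-marker")
    // A leaked env var pointing at a directory without the marker (including the
    // real ~/.hermes/) falls through to default resolution.
    guard FileManager.default.fileExists(atPath: marker.path) else { return nil }
    return home
}
```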

HermesProfileResolverTests gains overrideIsIgnoredWhenMarkerMissing
which verifies the guard is load-bearing. All test files using
SCARF_HERMES_HOME (CatalogServiceTests, CatalogViewModelTests,
InstalledTemplatesIndexTests, TemplateE2ETests) now drop the marker
before setenv.

Verification: 20/20 v2.8 + v2.7 hardened tests pass; 45/45 adjacent
existing tests pass; ScarfCore package tests pass (221/221); catalog
validator clean (3 templates); wiki secret-scan clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(swift6): retroactive conformance + verbatim help text + xcstrings refresh

Three small Swift 6 compile-cleanups that landed during the
dogfooding-templates iteration:

- MessageSpeechService — drop `@preconcurrency` on the
  AVSpeechSynthesizerDelegate conformance now that the protocol's
  Sendable annotations are upstreamed.
- ChatView — mark `RichChatViewModel.PendingPermission: Identifiable`
  as `@retroactive`. We don't own either the type or the protocol; the
  Swift 6 compiler flags this so downstream breakage is loud if
  ScarfCore ever adds the conformance upstream.
- CredentialPoolsView — wrap the `.help(...)` string in
  `Text(verbatim:)` so the backticks render literally instead of being
  interpreted as markdown inline-code by the LocalizedStringKey
  overload (which `.help(_:)` rejects styled).

Localizable.xcstrings: auto-generated catalog refresh picking up the
new active-profile + chat error-hint strings landed in earlier
commits on this branch (acd3692, 301806d).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(catalog): error logging + MainActor I/O + semver pre-release + decoder fault tolerance

- InstalledTemplatesIndex: replace bare `try?` reads/decodes with logged
  do/catch so corrupt registry/lock files leave a breadcrumb instead of a
  silent nil.
- InstalledTemplatesIndex.isVersionNewer: handle pre-release suffixes per
  semver §11 — `1.0.0-beta` no longer reports as newer than `1.0.0`,
  preventing a ghost "Update available" that would downgrade users.
- CatalogViewModel.refresh: dispatch the synchronous index walk through
  `Task.detached` so registry + N lock-file reads don't run on
  @MainActor.
- Catalog decoder: per-element fault tolerance via custom `init(from:)` —
  one malformed catalog entry is dropped with a logged warning instead
  of failing the whole catalog decode (honors the per-entry doc-comment
  contract).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 20:04:13 +02:00
Alan Wizemann 34d315793b fix(chat): clip placeholder to TextEditor bounds and clear it on focus
Two related bugs in the Mac chat composer's placeholder overlay:

* The "Message Hermes… / for commands · drag images to attach" hint had
  no width constraint, so on narrower window geometries it visibly
  overflowed past the rounded TextEditor boundary. Add `lineLimit(1)`,
  `truncationMode(.tail)`, and `frame(maxWidth: .infinity, alignment:
  .leading)` so it ellipsizes inside the field instead.
* The opacity formula `text.isEmpty ? 1 : 0` only hid the placeholder
  once content was typed, not when the field gained focus. Standard
  NSTextField / UITextField semantics clear the placeholder on focus.
  Switch to `(text.isEmpty && !isFocused) ? 1 : 0` so the hint
  disappears the moment the user clicks into the field.

The opaque-background ghosting mitigation from #65 is preserved
unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 16:48:32 +02:00
Alan Wizemann acd3692faf fix(profiles): switch-and-relaunch flow + active-profile chip + structured logs
Profile selection had no apparent effect on Webhooks/Sessions/SOUL.md/Memory
even after restart in some user setups. The path-resolution code reads
~/.hermes/active_profile correctly on paper, so the failure mode is likely
environment-specific (HERMES_HOME exported in the shell, in-process state
that didn't reset on what the user perceived as a restart, etc). Layer a
defense that's correct regardless of root cause:

* New AppRelauncher helper spawns a fresh `open -n <bundleURL>` and asks
  the current process to terminate after a 250ms delay. Refuses to fire
  from Xcode/DerivedData (the .debugBuild guard) so debug sessions don't
  lose their attached debugger.
* ProfilesViewModel.switchAndRelaunch runs `hermes profile use`, calls
  HermesProfileResolver.invalidateCache(), then relaunches via the helper.
  Existing switchTo() also gains the cache-invalidation step so the
  context-menu "Set Active (no relaunch)" path stays self-consistent.
* ProfilesView replaces the passive "Restart Scarf after switching" text
  with a confirmation-gated `Switch & Relaunch` primary button on the
  detail pane plus the same item in each row's context menu. Confirmation
  dialog flags that all Scarf windows will close.
* SidebarView header gains a brand-tinted ScarfBadge showing the
  currently-active profile on local contexts. Click to jump to the
  Profiles tab. The chip refreshes on `selectedSection` change so a
  terminal-side `hermes profile use` is visible after the next nav.
* HermesProfileResolver success logs gain `name=…, home=…, source=…`
  key=value structure across all three resolution paths (file / file-default /
  default-no-file). `log show … | grep ProfileResolver` now answers
  "what did the resolver decide?" unambiguously for support requests.

Closes #70

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 13:18:10 +02:00
Alan Wizemann ab615f0c28 feat(ios-chat): redesign composer with HIG touch targets and clear disabled state
Send button is now a 44pt circular target with an explicit color swap
(rust accent → background-tertiary) on disable, instead of relying on
SwiftUI's default opacity dim — addresses the "first tap doesn't
register" complaint by making the inactive state visibly different in
both light and dark mode. Paperclip and text field both gain a 44pt
minimum height so the row feels modern and roomy.

The text field swaps `.roundedBorder` for a plain field with a
ScarfRadius.xl rounded fill (ScarfColor.backgroundSecondary) and a
borderStrong stroke. Outer paddings and HStack spacing migrate from
magic numbers to ScarfSpace tokens.

Preserves verbatim: the `.toolbar { ToolbarItemGroup(placement: .keyboard) }`
keyboard-dismiss chevron (issue #51), draft persistence, .submitLabel,
@FocusState, photo-picker wiring, attachment-strip rendering, and every
.disabled() predicate.

Closes #69

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 13:14:09 +02:00
Alan Wizemann 982ed7da92 chore: bump iOS build to 30 for TestFlight
iOS-only patch carrying the rotation lock + chat-start preflight
off-MainActor fixes from cb164f0. Mac side stays on the v2.6.0
binary already shipped (build 29 archive); this build number bump
only affects future Mac archives, not the one already notarized.

Uploaded to App Store Connect via altool — Apple processing now,
will land in TestFlight once the binary clears the post-upload
scan (typically 5–15 min).
2026-05-01 16:20:13 +02:00
Alan Wizemann cb164f07f9 fix(ios): lock iPhone to portrait + move chat-start preflight off MainActor
Two iOS-specific crash classes from the v2.5.1 TestFlight feedback
round:

**Rotation crash** — locked the iPhone target to
`UIInterfaceOrientationPortrait` only (was Portrait + LandscapeLeft
+ LandscapeRight). The phone can't rotate the app at all anymore,
so any layout path that wasn't audited for size-class transitions
is no longer reachable. iPad orientation list left alone (target
device family is iPhone-only anyway).

**"Crash while typing" / "trying to continue an existing
conversation"** — `ChatController.passModelPreflight()` was doing
a synchronous SSH read (`context.readText(configYAML)`) on
`@MainActor` during chat-start. On a remote ScarfGo context that
blocks the main thread for seconds; iOS's non-responsive-app
watchdog kills the process around 10s. To the user this surfaces
as a "crash" while they're typing — they kept tapping the keyboard
while the connect was hung. Move the read to `Task.detached` and
await it; the UI stays responsive while the SSH I/O drains. Three
callers (`start`, `start(projectPath:)`, `startResuming`) updated
to `await passModelPreflight(...)` — they were already async.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 16:03:28 +02:00
Alan Wizemann 1dbdf9d079 chore: ignore local crashes/ triage directory
TestFlight feedback / crash JSONs land here while we're working
through an iOS fix round. They carry tester PII (emails, carriers,
locales) and aren't meant for the public repo. Kept local-only;
deleted after the round closes.
2026-05-01 15:57:41 +02:00
Alan Wizemann 101488cd0d docs(readme): bump What's New to v2.6.0 + Hermes v0.12 catch-up
Replaces the 2.5 "What's New" block with a 2.6 summary that
covers the Hermes v0.12 surfaces (Curator, multimodal images, 5
new providers, Teams + Yuanbao, Kanban, Skills v0.12, cron
--workdir, settings deltas, ScarfGo Webhooks/Plugins/Profiles)
and the post-merge chat fix round (#67/#68/#65/#62/#63/#64/#66/
#61). Verified-versions table gains v0.12.0 as the current target;
recommended-Hermes line points at v0.12.0+ for full feature
support. ScarfGo block kept but de-emphasised since it shipped
in 2.5.
2026-05-01 15:55:16 +02:00
Alan Wizemann 03c996ee80 chore: Bump version to 2.6.0 2026-05-01 15:42:48 +02:00
Alan Wizemann 8428cbff10 docs(v2.6.0): document post-merge issue fixes in RELEASE_NOTES
Adds a "Chat composer + transcript (post-merge round)" subsection
to the bug-fixes block covering #67, #68, #65, #62, #63, #64,
#66, and the partial #61 ACP-timeout bump. The pre-merge
test-target / iOS-build fixes stay grouped under "Pre-merge".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 15:41:48 +02:00
Alan Wizemann 381adfd925 fix(acp): bump control-message timeout 30s→60s for db-contended hosts (#61)
Field-reported (#61): under realistic concurrency where the
Hermes gateway is also running, state.db lock contention
(Discord sync / skill registration / cron scheduling all
holding write locks) stalls ACP's `initialize` / `session/new` /
`session/load` past the previous 30s watchdog, surfacing as
"Starting…" indefinitely or an opaque timeout error.

SQLite contention on a healthy host clears in seconds, so 60s
gives the lock-resolution path room to breathe while still
surfacing genuinely broken transports promptly. `session/prompt`
remains untimed (it streams events and can run for minutes).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 15:40:33 +02:00
Alan Wizemann 254af46e93 feat(chat): per-message TTS playback in assistant bubbles (#66)
Adds a small speaker glyph to the metadata footer of each settled
assistant bubble. Tap to read the reply aloud through
`AVSpeechSynthesizer`; tap again (or any other bubble's button) to
stop. Picks up the user's macOS Spoken Content default voice
automatically — no Hermes dependency, works offline.

- New `MessageSpeechService` (`Core/Services/`) — shared
  `@Observable` synthesizer; `playingMessageId` drives icon
  state. Markdown control characters (asterisks, backticks,
  link syntax) are stripped before speech so the user doesn't
  hear "asterisk asterisk bold".
- `SpeakMessageButton` lives outside `RichMessageBubble.==` so
  the bubble's Equatable short-circuit doesn't freeze the icon
  when playback flips between messages.

The full Hermes-provider TTS pipeline (Edge / ElevenLabs /
OpenAI / NeuTTS / Piper from Settings → Voice) is a much bigger
follow-up — wiring per-provider audio fetching, caching, and
streamed playback is its own quarter. v2.6.0 ships the immediate
"listen while doing something else" affordance.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 15:38:22 +02:00
Alan Wizemann 596c844da5 feat(chat): notify when Hermes finishes a prompt in the background (#64)
Sending a long prompt and switching to other work — the canonical
async-agent flow — required polling the chat to know when the
response landed. Wire a local UNUserNotificationCenter notification
to fire when an ACP prompt completes while Scarf isn't the
foreground app.

- New `ChatNotificationService` (Core/Services) handles lazy
  authorization, foreground gating, and post.
- `ChatViewModel.sendViaACP` calls it on successful prompt
  completion with the assistant's first-line preview and the
  active session title.
- Settings → Display → Feedback adds a "Notify when Hermes finishes"
  toggle, default on. Skipped for `/steer`-style mid-run sends —
  those don't end a turn.

Dock badges and per-session unread state from the issue are
worthwhile follow-ups but out of scope for v2.6.0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 15:35:55 +02:00
Alan Wizemann ec47d191a1 fix(chat): preserve local user messages across resume cycles (#63)
When a user sent a prompt and immediately switched to a different
session before Hermes flushed the row to state.db, `resumeSession`
ran `reset()` (which clears `messages`) and then
`loadSessionHistory` read the un-persisted DB and replaced the
array with an empty result. The user's bubble came back blank or
disappeared on return.

Hold local-only user messages (negative ids) in a per-session
cache that survives `reset()`. `loadSessionHistory` re-injects any
still-pending entries for the loaded session, dedups against any
DB row that finally caught up (matching content with persisted id
≥ 0), and clears the cache as the DB confirms each entry.

Cache growth is bounded by the sessions the user sent prompts in
during one app run; entries clean themselves out as Hermes persists
each row, and orphaned entries (deleted sessions etc.) are tiny and
never re-surface because the cache is keyed by session id.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 15:33:37 +02:00
Alan Wizemann 31e6c31acf fix(chat): scope composer state to active session id (#62)
`RichChatInputBar`'s `@State` `text` and `attachments` survived
session switches because the surrounding view tree is structurally
identical across sessions — SwiftUI happily reused the same
instance and leaked the previous session's unsent draft into the
new one.

Bind the composer's identity to `richChat.sessionId` so SwiftUI
rebuilds the view (and its `@State`) on session change. A stable
fallback string covers the brief "no session selected" window;
using `UUID()` here would mint a fresh id on every render and
trash the composer per body re-eval.
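
A minimal sketch of that binding (`RichChatInputBar` is stubbed here purely to
show the `@State` that gets reset; the host view is illustrative):

```swift
import SwiftUI

// Stand-in for the real composer, just to show the @State that gets rebuilt.
struct RichChatInputBar: View {
    @State private var text = ""
    var body: some View { TextEditor(text: $text) }
}

// Sketch only — rebinding .id rebuilds the composer (and its @State draft) on
// session change; the stable fallback avoids minting a new identity per render
// the way UUID() would.
struct ComposerHost: View {
    var sessionId: String?               // nil during the "no session selected" window

    var body: some View {
        RichChatInputBar()
            .id(sessionId ?? "no-session-selected")
    }
}
```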

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 15:28:59 +02:00
Alan Wizemann fcfe1c89d6 fix(chat): stop placeholder ghosting in chat composer (#65)
`TextEditor`'s NSTextView surfaces a typed glyph one frame before
the SwiftUI binding propagates, so the bare `if text.isEmpty`
overlay rendered the translucent placeholder text directly on top
of the just-typed character — the "behind or around" ghost the
reporter described.

Two mitigations:

- Pin an opaque `ScarfColor.backgroundSecondary` rect behind the
  placeholder Text. During any single-frame binding lag the user
  now sees a clean placeholder rather than layered glyphs.
- Switch the conditional to `.opacity(text.isEmpty ? 1 : 0)` so the
  view tree stays stable per keystroke. Pairs with the composer
  perf fix from #67.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 15:27:53 +02:00
Alan Wizemann df1b9caabf fix(chat): scale rich chat content with the font-size slider (#68)
The chat font-size slider only set `\.dynamicTypeSize` on the chat
root, but ScarfFont tokens are fixed-point (`Font.system(size: 14, …)`)
so dynamic type didn't reach bubble text, reasoning, tool chips, code
blocks, or markdown headings. Slider moved between 85%–130% with
little visible effect.

Plumb a separate `\.chatFontScale: Double` env value from
`RichChatView` and have the chat content views read it:

- `RichMessageBubble` — user bubble body, reasoning (disclosure +
  inline), REASONING label, token chip, tool-chip name, metadata
  footer.
- `MarkdownContentView` — paragraphs (now pinned to a scaled body
  font instead of inheriting), headings (1..5), inline-rendered code
  blocks, code-language label.
- `CodeBlockView` — code body and language label.

`ChatFontScale.{body, callout, caption, captionStrong, caption2,
mono, monoSmall, codeBlock, codeInline}(_ scale:)` helpers mirror
`ScarfFont`'s base sizes so scale = 1.0 is byte-for-byte identical
to today's UI; the slider now actually moves the visible chat text.
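
A minimal sketch of that plumbing (base sizes are assumptions mirroring the
fixed ScarfFont tokens):

```swift
import SwiftUI

// Sketch only — the shipped ChatFontScale covers more tokens than shown here.
private struct ChatFontScaleKey: EnvironmentKey {
    static let defaultValue: Double = 1.0             // 100% == today's fixed sizes
}

extension EnvironmentValues {
    var chatFontScale: Double {
        get { self[ChatFontScaleKey.self] }
        set { self[ChatFontScaleKey.self] = newValue }
    }
}

enum ChatFontScale {
    static func body(_ scale: Double) -> Font { .system(size: 14 * scale) }
    static func mono(_ scale: Double) -> Font { .system(size: 12 * scale, design: .monospaced) }
}
```

RichChatView sets the value once at the chat root (`.environment(\.chatFontScale, …)`)
and the content views read it via `@Environment`.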

Other surfaces (settings, sidebar, etc.) still use the static
ScarfFont tokens — chat scaling stays scoped to the chat surface.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 15:24:45 +02:00
Alan Wizemann a41c81c048 fix(chat): coalesce composer onChange writes to stop typing lag (#67)
Typing in the chat composer became unusably laggy because
`updateMenuState()` ran on every keystroke and unconditionally wrote
both `showMenu` and `selectedIndex`. Two state writes inside one
`onChange(of: text)` handler tripped SwiftUI's "action tried to
update multiple times per frame" warning, and each redundant write
forced a full body re-eval — visible as the slow-HID stalls and the
main-thread layout churn the reporter captured in sampling.

Two changes:

- Compute the new selection up front and write only the deltas. Same
  semantics; no spurious mutations.
- Short-circuit the whole handler when the user is composing normal
  text (no `/` prefix) and the menu is already hidden — the common
  case. Stops paying for `SlashCommandMenu.filter` on every keystroke
  of regular prose.
- Replace `.onChange(of: commands.map(\.id))` with
  `.onChange(of: commands.count)`. The mapped form allocated a fresh
  `[String]` on every body re-eval; counting is one int read.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 15:20:15 +02:00
Alan Wizemann 88add62997 Merge branch 'v12-updates'
Hermes v2026.4.30 (v0.12.0) compatibility — autonomous Curator (Mac +
iOS), multimodal image input in chat, 5 new inference providers,
Microsoft Teams + Yuanbao gateway platforms, read-only Kanban view,
Skills v0.12 surface (URL install / reload / pin / disable), Cron
--workdir flag, Settings deltas (cache TTL, redaction, runtime footer,
Piper, Vercel), iOS read-only Webhooks/Plugins/Profiles, and a
pre-v0.12 Hermes-version banner. All new surfaces capability-gated so
older Hermes hosts see the v2.5 surface unchanged.

Release notes: releases/v2.6.0/RELEASE_NOTES.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 15:17:36 +02:00
159 changed files with 17244 additions and 1488 deletions
+5
@@ -61,3 +61,8 @@ releases/v*/appcast-entry.xml
# Wiki helper: personal patterns (hostnames, IPs) blocked from the wiki push.
scripts/wiki-blocklist.txt
# TestFlight feedback / crash JSONs downloaded for triage. PII (emails,
# carriers, locales) and never meant for the public repo — kept local
# while a fix round is in progress, deleted afterward.
crashes/
+52 -21
@@ -19,11 +19,56 @@
<a href="https://www.buymeacoffee.com/awizemann"><img src="https://cdn.buymeacoffee.com/buttons/v2/default-yellow.png" alt="Buy Me a Coffee" height="28"></a>
</p>
## What's New in 2.5
## What's New in 2.7
### ScarfGo — the iPhone companion ships in public TestFlight
The biggest release since 2.6 — six weeks of work focused on **remote-context performance**, a **new project authoring flow**, **dashboard widgets**, **OAuth resilience**, and a top-to-bottom **performance instrumentation harness** that drove the bulk of the rest. 36 commits, no schema bump, no Hermes capability bump.
Same Hermes server you've been running on your Mac — now reachable from your phone over SSH. Multi-server, project-scoped chat, session resume, memory editor, cron list, skills tree, settings (read), all native iOS. Pure-Swift SSH (Citadel under the hood — no `ssh` binary needed on iOS). Per-project chat writes the same Scarf-managed `AGENTS.md` block the Mac app does, so the agent boots with the same project context regardless of which client opened the session.
### Remote chats and Activity in seconds, not 30s timeouts
Resuming a chat or opening Activity on a slow remote (a 420ms-RTT droplet, an underprovisioned VPS, a tunnel through 4G) used to fetch the full message column set in one shot, which routinely tripped the 30s SSH timeout on chats with multi-page tool result blobs. v2.7 introduces a **skeleton-then-hydrate pattern** that bounds the wire payload by what the user actually needs to see RIGHT NOW, then fills in the heavy stuff in the background.
- **Chat skeleton** — user + assistant rows only (skips `role='tool'`), `tool_calls` / `reasoning` hard-NULLed at SQL level. Wire payload bounded by conversational text. The chat appears in seconds. Background hydration pages tool calls in 5-id batches; tool-result CONTENT is opt-in (Settings → Display → "Load tool results in past chats", default off) with per-card lazy-fetch in the inspector pane.
- **Activity skeleton** — metadata-only fetch (~3 KB for 50 rows). Placeholder rows render immediately; real per-call entries swap in as paged hydration completes.
- **Single-id whale recovery** — when a 5-id batch trips the 30s timeout (one row carries an oversized `tool_calls` blob), an L1 single-id retry isolates the offender so the rest of the batch still hydrates.
### SSH cancellation that actually cancels
`Task.detached` doesn't inherit cancellation from the awaiting parent. Pre-fix, navigating away from a chat left the underlying ssh subprocess running for the full 30s, pinning a remote sqlite query and a ControlMaster session — the "third chat hangs" / "dashboard spins after rapid switching" symptom. v2.7 wires `withTaskCancellationHandler` through `SSHScriptRunner.run` and `RemoteSQLiteBackend.query`; cancellation now reaches the `Process` within ~100ms.
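A minimal sketch of that wiring on the Mac side, where the SSH hop is a child `Process` (signatures are assumptions, not the shipped `SSHScriptRunner` API):

```swift
import Foundation

// Sketch only — terminate the ssh child as soon as the awaiting task is cancelled,
// instead of letting it spin for the full timeout; a fuller version would also
// re-check Task.isCancelled after the handler fires.
func runCancellable(_ process: Process) async throws -> Data {
    try await withTaskCancellationHandler {
        try await withCheckedThrowingContinuation { (continuation: CheckedContinuation<Data, Error>) in
            let stdout = Pipe()
            process.standardOutput = stdout
            process.terminationHandler = { _ in
                continuation.resume(returning: stdout.fileHandleForReading.readDataToEndOfFile())
            }
            do { try process.run() } catch { continuation.resume(throwing: error) }
        }
    } onCancel: {
        if process.isRunning { process.terminate() }
    }
}
```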
### New Project from Scratch wizard + Keychain-backed cron secrets
A third project entry point alongside Browse Catalog and Add Existing Project. Scaffolds a Scarf-standard skeleton, registers it, and hands off to a chat session that auto-activates the bundled `scarf-template-author` skill. The skill drives the rest conversationally — widgets, optional config schema, optional cron — and writes the final files itself.
**Cron + Keychain.** Cron prompts that referenced `secret`-typed config fields used to get the literal `keychain://...` URI back, producing 401s. v2.7 mirrors resolved Keychain values into `~/.hermes/.env` under `$SCARF_<UPPER_SLUG>_<UPPER_FIELD>` env vars. Hermes already reloads `.env` per cron tick — credential rotation is automatic.
### Project dashboards — file-reading widgets, sparklines, typed status
Five new widget types and project-wide auto-refresh. **Backwards-compatible** — every existing `dashboard.json` renders byte-identically.
- **`markdown_file`** / **`log_tail`** / **`cron_status`** / **`image`** / **`status_grid`** — file-reading widgets that auto-refresh when the underlying file changes. By convention, place files inside `<project>/.scarf/`.
- **`stat` widget gains inline sparklines** via optional `sparkline: [Number]`. SVG-only render; dozens per dashboard cost nothing.
- **Typed status badges** with lenient decode (`ok`/`up` → success, `down`/`error` → danger). Unknown strings render as plain text rather than crashing.
- **Structured widget error card** replaces the legacy "Unknown: \<type\>" placeholder.
### OAuth resilience + Credential Pools
- **Daily OAuth keepalive cron** prevents Anthropic OAuth refresh tokens from expiring after weeks of inactivity.
- **Remote re-auth** unblocked — OAuth flow drives a remote `hermes auth add` correctly with stdin forwarded.
- **OAuth remove button** + auto-refresh of Credential Pools on `auth.json` change.
- **`resolve_provider_client` errors** (auxiliary task references an unauthenticated provider) classified into a clear hint with a one-click jump to Settings → Aux Models.
- **Model/provider mismatch banner** detects when `model.default` carries a `<provider>/...` prefix that disagrees with `model.provider`, with one-click fix in either direction.
### ScarfMon — performance instrumentation harness
The diagnostic surface that drove the bulk of the v2.7 perf work. Off by default; signpost-only mode (Instruments-friendly) is free; Full mode keeps a 4096-entry in-memory ring buffer you can copy as JSON for paste-into-issue diagnosis. Wiki: [Performance-Monitoring](https://github.com/awizemann/scarf/wiki/Performance-Monitoring).
See the full [v2.7.0 release notes](https://github.com/awizemann/scarf/releases/tag/v2.7.0) for the complete list (36 commits, including: in-flight coalescing for `loadRecentSessions`, snapshot pipeline rewrite from `sqlite3 .backup` to direct SSH-streamed queries [#74](https://github.com/awizemann/scarf/issues/74), per-message TTS, window-position persistence, sidebar reorder, and many other fixes).
**Previous releases:** see the [Release Notes Index](https://github.com/awizemann/scarf/wiki/Release-Notes-Index) on the wiki for v2.6, v2.5, v2.3, v2.2, v2.0, v1.6, and earlier.
## ScarfGo — the iPhone companion
Same Hermes server you've been running on your Mac — reachable from your phone over SSH. Multi-server, project-scoped chat, session resume, memory editor, cron list, skills tree, settings (read), all native iOS. Pure-Swift SSH (Citadel under the hood — no `ssh` binary needed on iOS). Per-project chat writes the same Scarf-managed `AGENTS.md` block the Mac app does, so the agent boots with the same project context regardless of which client opened the session.
**[Join the public TestFlight](https://testflight.apple.com/join/qCrRpcTz)** — the link is live now but only accepts new beta testers once Apple's Beta Review approves the first build. If you hit a "not accepting testers" splash, bookmark it and try again in 24–48h.
@@ -39,21 +84,6 @@ Same Hermes server you've been running on your Mac — now reachable from your p
See the [ScarfGo wiki page](https://github.com/awizemann/scarf/wiki/ScarfGo) for the full feature tour, [ScarfGo Onboarding](https://github.com/awizemann/scarf/wiki/ScarfGo-Onboarding) for the SSH-key setup walkthrough, and [Platform Differences](https://github.com/awizemann/scarf/wiki/Platform-Differences) for what is and isn't shared between Mac and iOS.
### Everything else in 2.5
- **Portable project-scoped slash commands.** Author reusable prompt templates as Markdown files at `<project>/.scarf/slash-commands/<name>.md` with YAML frontmatter (name, description, argumentHint, optional model override). Invoke as `/<name> [args]` from chat — Scarf substitutes `{{argument}}` (with optional `default:` fallback) in the body and sends the expanded prompt to Hermes. Mac authoring tab + iOS read-only browser. Templates carry them via the new `slash-commands/` block in `.scarftemplate` bundles (schemaVersion 3). See [Slash Commands](https://github.com/awizemann/scarf/wiki/Slash-Commands) for the full schema.
- **Hermes v2026.4.23 chat parity.** `/steer` non-interruptive guidance command, per-turn stopwatch on assistant bubbles, numbered keyboard shortcuts (19) on the permission sheet, git branch chip in the chat header. The new `messages.reasoning_content` and `sessions.api_call_count` columns surface as a richer reasoning disclosure + an "API" chip on session rows.
- **Spotify + design-md skills.** Mac ships an in-app Spotify OAuth sheet (mirrors the v2.3 Nous Portal pattern); design-md gets a host-side `npx` prereq check on both platforms. SKILL.md frontmatter (`allowed_tools`, `related_skills`, `dependencies`) renders as chip rows. A "What's New" pill on the Skills tab tells you when remote skills changed since you last looked.
- **Mac global Sessions: project filter + project badges** — parity with ScarfGo's Sessions tab. The list grows a filter Menu (All projects / Unattributed / each registered project) and each row carries a tinted folder chip with the project name when attributed.
- **Human-readable cron schedules everywhere.** New `CronScheduleFormatter` in ScarfCore translates the common cron shapes into English phrases and falls back to the raw expression on anything custom. Mac and iOS render the same.
- **Mac design-system overhaul.** Rust palette, typed token bundle (`ScarfColor`, `ScarfFont`, `ScarfSpace`, `ScarfRadius`), reusable components (`ScarfPageHeader`, `ScarfCard`, `ScarfBadge`, `ScarfTextField`, four button styles), redesigned 3-pane chat. iOS adopts the same tokens with a hybrid Dynamic Type policy so accessibility scaling on body text is preserved. See [Design System](https://github.com/awizemann/scarf/wiki/Design-System) for the full reference.
- **Under the hood** — `SessionAttributionService`, `ProjectContextBlock`, `CronScheduleFormatter`, `GitBranchService`, `SkillPrereqService`, `SkillSnapshotService`, `ProjectSlashCommandService`, and the ACP error triplet (`acpError` / `acpErrorHint` / `acpErrorDetails`) consolidated into ScarfCore so Mac and iOS consume one source of truth. 179 tests across 13 suites, three consecutive green runs. Several `try?` swallows in iOS lifecycle code now surface real failures (Keychain unlock errors no longer drop people into onboarding; partial Forget operations report what failed).
- **iOS push notifications skeleton** — `NotificationRouter` ships with foreground presentation + a lock-screen "Approve / Deny" action category gated by `apnsEnabled = false`. Lights up when Hermes ships a server-side push sender + an APNs cert.
See the full [v2.5.0 release notes](https://github.com/awizemann/scarf/releases/tag/v2.5.0).
**Previous releases:** see the [Release Notes Index](https://github.com/awizemann/scarf/wiki/Release-Notes-Index) on the wiki for v2.3, v2.2, v2.0, v1.6, and earlier.
## Connect ScarfGo to your Hermes server
ScarfGo speaks SSH directly — no companion service, no developer-controlled server in between. Onboarding takes about a minute:
@@ -145,7 +175,7 @@ Custom, agent-generated dashboards for any project. Define stat boxes, charts, t
- macOS 14.6+ (Sonoma) for Scarf
- iOS 18.0+ for [ScarfGo](https://github.com/awizemann/scarf/wiki/ScarfGo) (the iPhone companion, public TestFlight from v2.5)
- Xcode 16.0+ to build from source
- [Hermes agent](https://github.com/hermes-ai/hermes-agent) v0.6.0+ installed at `~/.hermes/` on each target host (v0.11.0+ recommended for full v2.5 feature support — `/steer`, new state.db columns, design-md/spotify skills, SKILL.md frontmatter chips)
- [Hermes agent](https://github.com/hermes-ai/hermes-agent) v0.6.0+ installed at `~/.hermes/` on each target host (v0.12.0+ recommended for full v2.6 feature support — autonomous Curator, multimodal image input, 5 new providers, Microsoft Teams + Yuanbao gateways, Kanban, Skills v0.12 surface, cron `--workdir`, prompt-cache TTL, Piper TTS, Vercel terminal)
- For remote servers: SSH access (key-based), `sqlite3` on the remote (for atomic DB snapshots), and the `hermes` CLI resolvable from the remote user's `PATH` or at a path you specify per server. ScarfGo requires the same on every Hermes host it connects to.
### Compatibility
@@ -159,9 +189,10 @@ Scarf reads Hermes's SQLite database and parses CLI output from `hermes status`,
| v0.8.0 (2026-04-08) | Verified |
| v0.9.0 (2026-04-13) | Verified |
| v0.10.0 (2026-04-16) | Verified (Tool Gateway introduced) |
| v0.11.0 (2026-04-23) | **Verified — current target (recommended for full v2.5 feature support)** |
| v0.11.0 (2026-04-23) | Verified |
| v0.12.0 (2026-04-30) | **Verified — current target (recommended for full v2.6 feature support)** |
Scarf 2.5 targets Hermes v0.11.0 for `/steer`, the new state.db columns (`messages.reasoning_content`, `sessions.api_call_count`), the new skills (design-md, spotify), the SKILL.md frontmatter chip surfaces, and the `hermes memory reset` toolbar action. Earlier Hermes versions remain supported for monitoring, sessions, file-based features, and ACP chat; v0.11-specific behavior degrades gracefully on older agents (`/steer` is harmless, new columns silently nil out).
Scarf 2.6 targets Hermes v0.12.0 for the autonomous Curator, multimodal ACP image content blocks, the 5 new inference providers, Microsoft Teams + Yuanbao gateways, the read-only Kanban view, the Skills v0.12 surface (URL install / reload / disable badges / curator pin), cron `--workdir`, `auxiliary.curator`, `prompt_caching.cache_ttl`, the redaction toggle, the runtime metadata footer, Piper TTS, and the Vercel terminal backend. Every v0.12 surface is **capability-gated** — Scarf detects the host's Hermes version once per server connection (`hermes --version` → semver + `YYYY.M.D` parse) and hides v0.12-only UI on older hosts. v0.11.0 hosts keep the full v2.5 surface (`/steer`, `messages.reasoning_content`, `sessions.api_call_count`, design-md/spotify skills, SKILL.md frontmatter chips, `hermes memory reset`). Earlier Hermes versions remain supported for monitoring, sessions, file-based features, and ACP chat; new behavior degrades gracefully on older agents.
If a Hermes update changes the database schema or CLI output format, Scarf may need to be updated. Check the [Health](#features) view for compatibility warnings.
+13
@@ -106,6 +106,19 @@ The foundation of every gated surface above:
### Bug fixes
#### Chat composer + transcript (post-merge round)
- **Typing lag in the chat composer (#67)** — `RichChatInputBar.updateMenuState()` ran on every keystroke and unconditionally wrote both `showMenu` and `selectedIndex`, tripping SwiftUI's "action tried to update multiple times per frame" warning and stalling input. Composer now coalesces writes to deltas, short-circuits when not in slash mode (the common case), and watches `commands.count` instead of re-allocating `commands.map(\.id)` per keystroke.
- **Chat font-size slider had no visible effect (#68)** — `RichChatView` only set `\.dynamicTypeSize`, but `ScarfFont` tokens are fixed-point (`Font.system(size: 14, …)`) so dynamic type didn't reach bubble text, reasoning, tool chips, code blocks, or markdown headings. New `\.chatFontScale` env value plumbed through `RichMessageBubble`, `MarkdownContentView`, and `CodeBlockView`; `ChatFontScale.{body, caption, captionStrong, caption2, mono, monoSmall, codeBlock, codeInline}(_:)` helpers mirror the ScarfFont base sizes so 100% is byte-for-byte identical to today's UI.
- **Placeholder ghosting on first keystroke (#65)** — `TextEditor`'s NSTextView surfaces a typed glyph one frame before the SwiftUI binding propagates, so the bare `if text.isEmpty` overlay rendered the translucent placeholder text on top of the just-typed character. Pinned an opaque background behind the placeholder rect and switched the conditional to `.opacity(...)` so the view tree stays stable per keystroke.
- **Draft text leaked between conversations (#62)** — composer `@State` survived session switches because the surrounding view tree was structurally identical. Bound `RichChatInputBar`'s identity to `richChat.sessionId` so SwiftUI rebuilds the view (and its `@State`) on session change. Stable fallback string for the "no session selected" window — `UUID()` would have minted a new id per body re-eval and trashed the composer mid-typing.
- **Sent message rendered blank after navigating away (#63)** — when a user sent a prompt and immediately resumed a different session before Hermes flushed the row to state.db, `resumeSession`'s `reset()` cleared `messages` and `loadSessionHistory` then read an as-yet-empty DB. New per-session pending-user-messages cache survives `reset()` and re-injects still-pending entries on load; entries clear themselves as soon as a matching DB row catches up.
- **No completion notification (#64)** — sending a long prompt and switching to other work required polling the chat to know when the response landed. New `ChatNotificationService` fires a local `UNUserNotificationCenter` banner on prompt completion when Scarf isn't the foreground app. Settings → Display → Feedback → "Notify when Hermes finishes" toggle, default on.
- **Per-message TTS playback (#66)** — small speaker glyph in each settled assistant bubble's metadata footer; uses `AVSpeechSynthesizer` with the user's macOS Spoken Content default voice, works offline. Markdown control characters stripped before speech. The deeper Settings → Voice provider integration (Edge / ElevenLabs / OpenAI / NeuTTS / Piper) is queued as a v2.7 follow-up.
- **ACP control-message timeout under gateway concurrency (#61)** — bumped 30s → 60s. State.db lock contention on a healthy host clears in seconds, but the previous 30s watchdog tripped under realistic gateway+ACP concurrency (Discord sync / skill registration / cron scheduling holding write locks during ACP `initialize` / `session/new` / `session/load`). 60s gives lock resolution headroom while still surfacing genuinely broken transports.
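A minimal sketch of the identity binding behind the draft-leak fix (#62), with illustrative view names; the real `RichChatInputBar` takes more parameters:
```swift
import SwiftUI

struct ChatScreen: View {
    let sessionId: String?

    var body: some View {
        VStack {
            // transcript …
            RichChatInputBar()
                // Tying identity to the session id makes SwiftUI rebuild the
                // input bar (and its @State draft) on session change. The
                // fallback must be a stable literal: UUID().uuidString here
                // would mint a new identity on every body evaluation and
                // reset the composer mid-typing.
                .id(sessionId ?? "no-session")
        }
    }
}

struct RichChatInputBar: View {
    @State private var draft = ""
    var body: some View { TextEditor(text: $draft) }
}
```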
#### Pre-merge
- **Test target compile** — `M5FeatureVMTests.ScriptedTransport` had drifted off the `ServerTransport` protocol after `cachedSnapshotPath` landed in v2.5.2; added the missing stub. `M0dViewModelsTests` got the `ConnectionStatusViewModel.Status.degraded` argument-name update. `CredentialPoolsGatingTests` got the missing `import ScarfCore`. The full `swift test` suite now runs (and passes — 215 tests across 17 suites).
- **iOS package compile** — `RemoteBackupService.zipDirectory` and `RemoteRestoreService.unzipArchive` used `Foundation.Process` unconditionally, breaking the iOS build entirely (Process is unavailable on the iOS SDK). Wrapped in `#if !os(iOS)` with iOS stubs that throw — backup/restore is Mac-only by design.
@@ -0,0 +1,78 @@
## What's in 2.6.5
A patch release that ships **template discoverability**, **cron observability**, and an **end-to-end UI test harness** that locks the new install path against regression. No breaking changes; every Hermes capability target is unchanged from 2.6.0.
### In-app Template Catalog
The catalog is no longer web-only. **Templates → Browse Catalog…** opens a sheet that fetches the live catalog from `awizemann.github.io/scarf/templates/`, renders one row per published template with name + version + tags, and one-click installs through the existing flow. Search filters across name / description / tags; the category picker constrains to whatever categories the loaded catalog actually carries.
- **Install-state badges** — each row shows "Installed v1.2.0" (green) or "Update v1.3.0" (amber) when the catalog version is newer than what's in `~/.hermes/scarf/projects.json`. Update is "uninstall + reinstall" today; in-place upgrade is on the v3 backlog.
- **24h cache** at `~/.hermes/scarf/catalog_cache.json` so opening the sheet repeatedly doesn't re-hit the network. Refresh icon force-fetches.
- **Bundled fallback** — fresh-install / offline users still see the official templates as a hardcoded list. Network failures serve stale cache with a "refresh failed" hint.
- **Catalog-schema decoder fault tolerance** — one malformed entry on the live catalog can't bring down the whole list. The bad row is dropped with a logged warning; the rest survive.
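A hedged sketch of that fault-tolerant decode, with illustrative type and field names; the real catalog model carries more metadata:
```swift
struct CatalogTemplate: Decodable {
    let name: String
    let version: String
    let tags: [String]
}

struct LenientTemplateList: Decodable {
    var templates: [CatalogTemplate] = []

    private struct Skipped: Decodable {
        init(from decoder: Decoder) throws {}   // always succeeds, just advances the cursor
    }

    init(from decoder: Decoder) throws {
        var container = try decoder.unkeyedContainer()
        while !container.isAtEnd {
            do {
                templates.append(try container.decode(CatalogTemplate.self))
            } catch {
                // Drop the malformed row (and log a warning in the real
                // service) instead of failing the whole catalog.
                _ = try? container.decode(Skipped.self)
            }
        }
    }
}
```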
### HackerNews Daily Digest template
First template added under the new dogfooding-templates loop. Configurable `min_score`, `max_items`, `topics`; one daily-at-08:00 cron job (paused on install) that pulls the HN Firebase API, filters, and prepends a markdown digest to the project's `digest.md`. No API keys required. Live at the catalog URL above.
### Cron observability — auth-error banner + running indicator + log tail
Cron rows now surface the same OAuth-refresh-revoked recovery flow as Chat instead of a generic red dot, plus three previously-missing observability cues:
- **OAuth re-auth.** `ACPErrorHint.classify` runs on `job.lastError`; when it returns `oauthRefreshRevoked(provider)` the detail pane shows the human-readable hint + a **Re-authenticate** button that drops the user into Credential Pools — same wiring ChatView's banner uses. Unrecognized errors fall back to the legacy red `lastError` text.
- **Running indicator.** The row dot turns blue + pulses when `state == "running"` (precedence over disabled / error / success); the detail header gains a "running…" badge next to active/paused. No new polling — `HermesFileWatcher.lastChangeDate` already drives `CronViewModel.load()`.
- **Last run output.** Collapsible panel replacing the inline log: a one-line summary (`<timestamp> — ok|error|running…`) always visible, full monospaced terminal-style scroll on expand, auto-scrolls to bottom when new runs land.
Also fixes a pre-existing bug in `HermesFileService.loadCronOutput` that returned the wrong file under Hermes's per-job-id output nesting.
### Layer B install-drive XCUITest harness
The dogfooding-templates initiative ships its first end-to-end UI test that drives the install pipeline:
```
Launch with --scarf-test-mode → Sidebar → Projects → Install sheet
(via --scarf-test-install-url launch arg) → Configure → Open Project
→ Right-click → Uninstall Template → Confirm Remove → Done
```
Runs ~30 s green on the dev Mac, validates 9 assertion points across the user journey. Covers the new accessibility identifiers wired in this release: `templateConfig.commitButton`, `projects.row.<name>`, `sidebar.section.<rawValue>`, `projects.contextMenu.uninstallTemplate`, `templateUninstall.confirmRemove`, `templateInstall.success.openProject`, `templateUninstall.success.done`. The `--scarf-test-install-url` launch arg + `TestModeFlags.isTestMode` gating lets XCUITest skip SwiftUI Menu / NSToolbarItem accessibility-bridging quirks that otherwise block toolbar-menu driving.
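For orientation, a skeletal XCUITest in the shape the harness uses; the launch arguments and accessibility identifiers are the ones listed above, while the specific assertions and the install URL are illustrative:
```swift
import XCTest

final class TemplateInstallJourneyTests: XCTestCase {
    func testInstallConfigureUninstall() throws {
        let app = XCUIApplication()
        app.launchArguments = [
            "--scarf-test-mode",
            "--scarf-test-install-url", "https://example.com/demo.scarftemplate",
        ]
        app.launch()

        // Configure step commits the install…
        XCTAssertTrue(app.buttons["templateConfig.commitButton"].waitForExistence(timeout: 30))
        app.buttons["templateConfig.commitButton"].click()

        // …and the success sheet offers to open the new project.
        XCTAssertTrue(app.buttons["templateInstall.success.openProject"].waitForExistence(timeout: 60))
    }
}
```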
Wiki [Test-Harness](https://github.com/awizemann/scarf/wiki/Test-Harness) documents how to extend the harness for the next template.
### Sentinel-marker test isolation (incident-response hardening)
`SCARF_HERMES_HOME` override now requires the path to contain a `.scarf-test-home-marker` file to activate. Without the marker, production code falls through to the user's real `~/.hermes/`. Lands belt-and-braces protection for cases where a test crashes mid-teardown leaving the env var set, an env var inherits from a parent shell, or a misconfigured launchctl plist exports the variable. The override remains the seam every E2E test relies on; the marker file ensures it can't accidentally pivot a non-test process off the user's data.
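A hedged sketch of how such a marker-gated override can be resolved; the function and property names are illustrative, not the real ScarfCore API:
```swift
import Foundation

func resolvedHermesHome() -> URL {
    let real = FileManager.default.homeDirectoryForCurrentUser
        .appendingPathComponent(".hermes")
    guard let override = ProcessInfo.processInfo.environment["SCARF_HERMES_HOME"] else {
        return real
    }
    let candidate = URL(fileURLWithPath: override, isDirectory: true)
    let marker = candidate.appendingPathComponent(".scarf-test-home-marker")
    // The override only activates when the sentinel marker exists; a stale
    // env var from a crashed test or an inherited shell is ignored and the
    // process stays on the user's real ~/.hermes/.
    guard FileManager.default.fileExists(atPath: marker.path) else { return real }
    return candidate
}
```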
### Chat fixes
- **OAuth refresh-revoked surface.** Chat-side error banner now classifies the message via `ACPErrorHint.classify` and offers an in-app **Re-authenticate** button that routes through Credential Pools (#65). Same primitive the new cron banner reuses.
- **Placeholder ghosting fix.** TextEditor's placeholder now clips to the editor's bounds and clears on focus instead of bleeding past the cursor area when the user types fast (#67).
### Profile chip + structured logs
- **Active-profile chip in the sidebar header.** Click → routes to Profiles. Local contexts only (remote SSH would mislead).
- **Switch & Relaunch** flow now writes `~/.hermes/active_profile` and relaunches Scarf in a single click instead of asking the user to quit+reopen.
- Profile-resolver logs are now structured (key=value form) so `log show … | grep ProfileResolver` can pull "which profile did Scarf resolve to and why" out of support requests.
### Swift 6 cleanup
- `MessageSpeechService` — drop `@preconcurrency` on the AVSpeechSynthesizerDelegate conformance now that the protocol's Sendable annotations are upstreamed.
- `ChatView` now declares `RichChatViewModel.PendingPermission: @retroactive Identifiable`. Quiets the Swift 6 compiler so downstream breakage would be loud if ScarfCore ever adds the conformance upstream.
- `CredentialPoolsView` uses `.help(Text(verbatim:))` so backticks render literally instead of being treated as markdown inline-code.
### iOS
- Composer redesigned with HIG touch targets + clear disabled state.
- Portrait lock retained.
- Chat-start preflight moved off MainActor.
### Known caveats
- **Cron-job-uninstall by name is ambiguous** when two projects share the same template id. The Layer B test surfaced this — manifests as: the test passes, but if you've manually installed the same template before running the test, your real cron job can disappear. Recovery is `hermes cron create`. Fix is queued: store cron-job IDs in `<project>/.scarf/template.lock.json` at install time and resolve by ID at uninstall time.
- **Full-suite parallel test runs intermittently hang** — pre-existing flaky test infrastructure unrelated to this release. Individual suites all pass; the hang only manifests on `xcodebuild test` with everything concurrent. The sentinel-marker hardening prevents user-data damage from any race.
### Compatibility
- **Hermes target unchanged from 2.6.0**: v2026.4.30 (v0.12.0). Pre-v0.12 Hermes hosts continue to work — no new capability gates added in this release.
- **Min macOS unchanged**: 14.6.
- **No schema changes** to anything in `~/.hermes/`. The two new Scarf-owned files (`scarf/catalog_cache.json` and the template-installer's `.scarf-test-home-marker` for tests) are additive.
@@ -0,0 +1,155 @@
## What's in 2.7.0
The biggest release since 2.6.0 — a six-week stretch covering **remote-context performance**, a **new project authoring flow**, **dashboard widgets**, **OAuth resilience**, and a top-to-bottom **performance instrumentation harness** that drove the bulk of the rest. 36 commits, no schema bump, no Hermes capability bump.
The throughline: Scarf got materially faster and more honest on slow remote SSH links, where 30-second sqlite timeouts and silently-empty UI used to be common. The skeleton-then-hydrate pattern, SSH cancellation propagation, and ScarfMon-driven diagnosis are the shape of how that work gets done now.
---
### Remote-context performance — chats and Activity in seconds, not 30s timeouts
Resuming a chat on a slow remote (a 420ms-RTT droplet, an underprovisioned VPS, a tunnel through 4G) used to fetch the full message column set in one shot, which routinely tripped the 30s SSH timeout on chats with multi-page tool result blobs. The 160-message session was broken; the 30-message session was broken too. Activity didn't load at all.
v2.7 introduces a **skeleton-then-hydrate pattern** that bounds the wire payload by what the user actually needs to see RIGHT NOW, then fills in the heavy stuff in the background:
- **Chat skeleton.** [`fetchSkeletonMessages`](https://github.com/awizemann/scarf/blob/main/scarf/Packages/ScarfCore/Sources/ScarfCore/Services/HermesDataService.swift) selects user + assistant rows only (skips `role='tool'`) with `tool_calls` / `reasoning` / `reasoning_content` hard-NULLed at the SQL level. Wire payload bounded by conversational text alone — typically a few KB. The chat appears in seconds. Background `startToolHydration` pages through `hydrateAssistantToolCalls` in 5-id batches to splice tool calls in. Tool-result CONTENT is **opt-in** via Settings → Display → "Load tool results in past chats" (default off); the inspector pane lazy-fetches per-result content via `fetchToolResult(callId:)` when you open a card.
- **Activity skeleton.** [`fetchRecentToolCallSkeleton`](https://github.com/awizemann/scarf/blob/main/scarf/Packages/ScarfCore/Sources/ScarfCore/Services/HermesDataService.swift) returns metadata-only rows (id + session_id + role + timestamp; everything else NULLed). Activity opens in <1s on remote with placeholder rows; real per-call entries swap in as paged hydration completes. New "Loading tool details…" pill in the page header surfaces hydration progress.
- **Single-id whale recovery.** When a 5-id batch trips the 30s timeout (one row carries an oversized `tool_calls` blob — a long Edit's args, a big diff), an L1 single-id retry isolates the offending row so the rest of the batch still hydrates. Whale row stays bare; assistant message stays readable.
- **Lazy tool result loading in the inspector.** Default-off avoids the bulk fetch. When you focus a tool call card, ChatInspectorPane fires `loadToolResultIfMissing(callId:)` which splices a single result into the message stream without re-fetching anything else.
Effect: a 160-message thinking-model session that used to time out at exactly 30s now opens in under 2 seconds with placeholder cards filling in over the next few. Activity loads in 500-800ms.
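To make the pattern concrete, a hedged sketch of the skeleton query shape and the batched hydration loop; the SQL column names and the `fetch`/`splice` closures are stand-ins for the real `HermesDataService` and view-model calls:
```swift
// Illustrative column names; heavy columns are hard-NULLed at the SQL level
// so the wire payload is bounded by conversational text alone.
let skeletonSQL = """
SELECT id, session_id, role, content, created_at,
       NULL AS tool_calls, NULL AS reasoning, NULL AS reasoning_content
FROM   messages
WHERE  session_id = ? AND role IN ('user', 'assistant')
ORDER  BY created_at
"""

// Background hydration pages assistant ids through in 5-id batches so no
// single round trip carries an unbounded tool_calls payload.
func hydrateToolCalls(
    assistantIds: [String],
    fetch: ([String]) async throws -> [String: String],
    splice: ([String: String]) async -> Void
) async throws {
    for start in stride(from: 0, to: assistantIds.count, by: 5) {
        let batch = Array(assistantIds[start..<min(start + 5, assistantIds.count)])
        let parsed = try await fetch(batch)      // one bounded round trip per batch
        await splice(parsed)                     // splice results into the message stream
    }
}
```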
#### SSH cancellation that actually cancels
`Task.detached { … }` doesn't inherit cancellation from the awaiting parent, and `Task<…> { … }` (unstructured) also drops the signal. Without explicit bridging, cancelling a chat-load Task only unwinds Swift state — the underlying ssh subprocess kept running for the full 30s, pinning a remote sqlite query and a ControlMaster session slot. This produced the "third chat hangs" / "dashboard spins after rapid switching" symptom.
v2.7 wires `withTaskCancellationHandler` through [`SSHScriptRunner.run`](https://github.com/awizemann/scarf/blob/main/scarf/Packages/ScarfCore/Sources/ScarfCore/Transport/SSHScriptRunner.swift) and [`RemoteSQLiteBackend.query`](https://github.com/awizemann/scarf/blob/main/scarf/Packages/ScarfCore/Sources/ScarfCore/Services/Backends/RemoteSQLiteBackend.swift) so parent cancellation reaches the `Process` and calls `proc.terminate()` within 100ms. New `ssh.cancelled` ScarfMon event surfaces this.
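A minimal sketch of that bridging, assuming a bare `Process`-based runner; the real `SSHScriptRunner` adds stderr capture, timeouts, and ControlMaster handling:
```swift
import Foundation

enum SSHRunError: Error { case nonZeroExit(Int32) }

func runSSH(_ arguments: [String]) async throws -> Data {
    let process = Process()
    process.executableURL = URL(fileURLWithPath: "/usr/bin/ssh")
    process.arguments = arguments
    let stdout = Pipe()
    process.standardOutput = stdout

    return try await withTaskCancellationHandler {
        try await withCheckedThrowingContinuation { continuation in
            process.terminationHandler = { proc in
                let data = stdout.fileHandleForReading.readDataToEndOfFile()
                if proc.terminationStatus == 0 {
                    continuation.resume(returning: data)
                } else {
                    continuation.resume(throwing: SSHRunError.nonZeroExit(proc.terminationStatus))
                }
            }
            do { try process.run() } catch { continuation.resume(throwing: error) }
        }
    } onCancel: {
        // Parent-task cancellation reaches the subprocess instead of letting
        // it run out the full timeout against the remote sqlite query.
        if process.isRunning { process.terminate() }
    }
}
```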
#### In-flight coalescing for `loadRecentSessions`
File-watcher deltas during an active stream used to stack 2-3 parallel sessions-list reload tasks (the 500ms `scheduleSessionsRefresh` debounce only suppresses a pending tick, not one already executing). Subsequent callers now await the in-flight load instead of spawning a parallel SSH subprocess. New `mac.loadRecentSessions.coalesced` event tracks dedup hits.
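A hedged sketch of the coalescing shape, with illustrative names; the real view model also layers the debounce described above on top:
```swift
actor RecentSessionsLoader {
    private var inFlight: Task<[String], Error>?

    func loadRecentSessions() async throws -> [String] {
        if let running = inFlight {
            return try await running.value       // coalesced: reuse the in-flight fetch
        }
        let task = Task { try await fetchOverSSH() }
        inFlight = task
        defer { inFlight = nil }
        return try await task.value
    }

    private func fetchOverSSH() async throws -> [String] {
        []                                       // one SSH round trip in the real service
    }
}
```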
#### Loading-state UX hardening
The Mac chat sidebar greys out and disables row taps the moment a session-switch is initiated (synchronously, before `client.start()` returns), with a floating ProgressView showing the current phase: **"Spawning hermes acp…"** → **"Authenticating…"** → **"Loading session…"** → **"Loading history…"** → **"Ready"**. Pre-fix the sidebar looked engageable while the 5-7 second SSH+ACP boot was still in flight, and the user could queue up a second session-switch behind the first. New `isStartingSession` flag flips on user click for instant feedback.
#### Partial-result + mismatch + pinned-model banners
- **Partial-result banner.** When the skeleton fetch trips an SSH transport failure (rather than a clean empty result), the chat surfaces "Couldn't load full chat history — the connection to *server* timed out" through the existing `acpError` triplet, plus forces `hasMoreHistory = true` so the "Load earlier" affordance shows up. Replaces the pre-fix silent empty transcript.
- **Model/provider mismatch banner.** [`ModelPreflight.detectMismatch`](https://github.com/awizemann/scarf/blob/main/scarf/Packages/ScarfCore/Sources/ScarfCore/Services/ModelPreflight.swift) recognizes when `model.default` carries a `<provider>/...` prefix that disagrees with `model.provider` (e.g. `anthropic/claude-sonnet-4.6` + `provider: nous` after switching OAuth via Credential Pools). Banner offers one-click fix in either direction. A sketch of the check follows this list.
- **Pinned-model failure hint.** ACP error classifier now recognizes `model_not_found` / `404 messages` / `model is not available` and surfaces "This session was created with a model the provider no longer offers — start a new chat to use your current model" so the pinned-model failure mode has a clear recovery path.
- **OAuth-completion provider swap.** After a successful OAuth in Credential Pools, if the just-authed provider differs from `model.provider`, surface "Switch active provider to *name*?" with [Switch] / [Keep current] instead of auto-dismissing.
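A hedged sketch of the mismatch check itself, with illustrative parameter and type names; the real `ModelPreflight` reads these values out of config.yaml:
```swift
import Foundation

struct ProviderMismatch { let modelProvider: String; let activeProvider: String }

func detectMismatch(defaultModel: String, provider: String) -> ProviderMismatch? {
    // .whitespacesAndNewlines, not .whitespaces: a stray "\n" in a
    // hand-edited config must not false-positive the banner.
    let model = defaultModel.trimmingCharacters(in: .whitespacesAndNewlines)
    let active = provider.trimmingCharacters(in: .whitespacesAndNewlines)
    // Prefix is everything before the first slash, so OpenRouter-style
    // multi-slash model ids still resolve to their provider segment.
    guard let slash = model.firstIndex(of: "/"), slash != model.startIndex,
          !active.isEmpty else { return nil }
    let prefix = String(model[..<slash])
    guard prefix.lowercased() != active.lowercased() else { return nil }
    return ProviderMismatch(modelProvider: prefix, activeProvider: active)
}
```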
---
### New Project from Scratch wizard + Keychain-backed cron secrets
A **third project entry point** alongside Browse Catalog and Add Existing Project: a wizard that scaffolds a Scarf-standard project skeleton (`<project>/.scarf/dashboard.json` + AGENTS.md marker block), registers it, and hands off to a chat session that auto-activates the bundled `scarf-template-author` skill. The skill drives the rest conversationally — widgets, optional config schema, optional cron — and writes the final files itself. Wizard stays minimal because the agent does configuration better than a multi-step form. The skill ships bundled inside `Scarf.app/Contents/Resources/BuiltinSkills.bundle/` and copies into `~/.hermes/skills/` on launch (idempotent + version-gated).
**Cron + Keychain — `$SCARF_<SLUG>_<FIELD>` env vars.** Cron prompts that referenced `secret`-typed config fields used to get the literal `keychain://...` URI back when reading `config.json`, producing 401s. v2.7 mirrors resolved Keychain values into `~/.hermes/.env` under a marker-bounded block keyed by template slug:
```sh
# scarf-secrets:begin local-news-aggregator
SCARF_LOCAL_NEWS_AGGREGATOR_API_TOKEN=actual-value
SCARF_LOCAL_NEWS_AGGREGATOR_RSS_URL=https://example.com/feed
# scarf-secrets:end local-news-aggregator
```
Hermes already reloads `~/.hermes/.env` per cron tick, so credential rotation is automatic — just edit the value in Configuration → next tick sees it. The mirror runs at every state-change point: install, post-install Configuration save, uninstall, "Remove from List", and on app launch (reconciliation pass over registered projects). Source of truth stays in the Keychain — `config.json` keeps `keychain://` URIs unchanged. Mode 0600 enforced on `~/.hermes/.env`.
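A hedged sketch of the marker-bounded rewrite; the function name and shapes are illustrative, and the real service also creates the file and enforces the 0600 mode described above:
```swift
import Foundation

func mirrorSecrets(into env: String, slug: String, values: [String: String]) -> String {
    let begin = "# scarf-secrets:begin \(slug)"
    let end = "# scarf-secrets:end \(slug)"
    var lines = env.components(separatedBy: "\n")

    // Drop the previous managed block for this slug, if present.
    if let b = lines.firstIndex(of: begin),
       let e = lines.firstIndex(of: end), b <= e {
        lines.removeSubrange(b...e)
    }

    // SCARF_<SLUG>_<FIELD>: slug and field uppercased, hyphens folded to
    // underscores, matching the block shown above.
    let prefix = "SCARF_" + slug.uppercased().replacingOccurrences(of: "-", with: "_")
    var block = [begin]
    for (field, value) in values.sorted(by: { $0.key < $1.key }) {
        block.append("\(prefix)_\(field.uppercased())=\(value)")
    }
    block.append(end)

    // Everything outside the markers is preserved byte-for-byte; the managed
    // block is re-appended at the end of the file.
    return (lines + block).joined(separator: "\n")
}
```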
Cron prompts now reference these env vars directly:
```json
{
"prompt": "Use the terminal: curl -sS -H \"Authorization: Bearer $SCARF_LOCAL_NEWS_AGGREGATOR_API_TOKEN\" \"$SCARF_LOCAL_NEWS_AGGREGATOR_RSS_URL\" -o {{PROJECT_DIR}}/.scarf/feed.xml"
}
```
**Migration.** First launch of v2.7 walks the project registry and writes the managed block per schemaful project — automatic. Existing cron prompts you wrote against the old (broken) `config.json` pattern still need updating: open the cron job in Scarf's Cron sidebar and edit the prompt, or ask the agent in chat ("Update my Local News cron job's prompt to use the new env var convention") — the bundled `scarf-template-author` skill (now v1.1.0) documents the convention with worked examples.
Also fixes [#75](https://github.com/awizemann/scarf/issues/75) — `_NSDetectedLayoutRecursion` on the Configuration form for projects whose form transitioned between stages with different intrinsic heights.
---
### Project dashboards — file-reading widgets, sparklines, typed status
Five new widget types, project-wide auto-refresh, and a structured error card for unknown widgets. Backwards-compatible — every existing `dashboard.json` renders byte-identically.
- **Project-wide auto-refresh.** [`HermesFileWatcher`](https://github.com/awizemann/scarf/blob/main/scarf/scarf/Core/Services/HermesFileWatcher.swift) used to watch each project's `dashboard.json` specifically. v2.7 promotes that to a watch on the entire `<project>/.scarf/` directory. A `markdown_file` or `log_tail` widget pointing at `<project>/.scarf/reports/foo.md` refreshes the moment a cron job rewrites the file. **By convention, place files the dashboard reads inside `.scarf/`** so the watch picks them up.
- **`markdown_file`** — renders a markdown file from disk through the same `MarkdownContentView` pipeline used by inline `text` widgets.
- **`log_tail`** — last `lines` of a file (default 20, max 200), monospaced, ANSI codes stripped.
- **`cron_status`** — last run / next run / state for one Hermes cron job by `jobId`, plus a small inline log tail. Read-only — Run/Pause/Resume controls stay on the Cron tab.
- **`image`** — local file (`path` relative to project root) or remote `url`. Optional `height` cap. Useful for matplotlib/Plotly PNGs the cron job generates.
- **`status_grid`** — compact NxM grid of colored cells, one per service / item, with hover labels.
- **`stat` widget gains inline sparklines.** Optional `sparkline: [Number]` field. SVG-only render, dozens per dashboard cost nothing.
- **Typed status badges.** `list` items and `status_grid` cells share a typed enum (`success`, `warning`, `danger`, `info`, `pending`, `done`, `neutral`) with lenient decode for synonyms (`ok`/`up` → success, `down`/`error` → danger). Unknown strings render as plain text.
- **Structured widget error card.** Replaces the legacy "Unknown: \<type\>" placeholder with a card surfacing the title, specific reason, and a hint.
- **Schema mirror.** The widget vocabulary lives once at [`tools/widget-schema.json`](https://github.com/awizemann/scarf/blob/main/tools/widget-schema.json); the catalog validator reads from it and enforces per-type required fields.
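A hedged sketch of the lenient status decode behind the typed badges above; the synonym table here only covers the examples named in these notes:
```swift
enum WidgetStatus: String, Decodable {
    case success, warning, danger, info, pending, done, neutral
}

// Known synonyms fold onto the typed cases; anything the map doesn't
// recognize returns nil and the caller renders the raw string as plain text.
func decodeStatus(_ raw: String) -> WidgetStatus? {
    switch raw.lowercased() {
    case "ok", "up":      return .success
    case "down", "error": return .danger
    default:              return WidgetStatus(rawValue: raw.lowercased())
    }
}
```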
---
### OAuth resilience + Credential Pools
- **Daily OAuth keepalive cron.** Prevents Anthropic OAuth refresh tokens from expiring after weeks of inactivity. New cron job `[scarf:oauth-keepalive]` (managed by Scarf) pings Hermes on a daily cadence; the in-app Refresh All Sessions action mirrors the same path on demand.
- **Remote re-auth.** Re-authenticating against a remote droplet's OAuth provider used to be blocked by the lack of a stdin path through SSHTransport. The OAuth flow now drives a remote `hermes auth add` correctly with stdin forwarded.
- **OAuth remove button.** Per-provider remove action in Credential Pools (auth.json edit), with confirmation dialog. Companion auto-refresh of the view when `auth.json` changes externally (file-watcher).
- **`resolve_provider_client` error classification.** When an auxiliary task references a provider whose credentials aren't loaded, Hermes prints `resolve_provider_client: <name> requested but <Display Name> not configured` to stderr — pre-fix this surfaced in chat as the opaque `-32603 Internal error` with no actionable detail. Now classified into a clear hint pointing at Settings → Aux Models.
- **Aux Tab unknown-task surface.** When `config.yaml` has an `auxiliary.<task>` block for a task Scarf doesn't know about (newer Hermes added it; Scarf hasn't caught up), render it as a plain row with the raw provider/model values instead of dropping it silently.
- **Credential Pools refresh after OAuth sheet dismiss.** Closing the OAuth sheet after a successful add now refreshes the list immediately instead of leaving the just-added pool hidden until the next file-watcher tick.
---
### ScarfMon — performance instrumentation harness
The diagnostic surface that drove the bulk of the v2.7 perf work. Off by default; signpost-only mode (Instruments-friendly) is free; Full mode (4096-entry in-memory ring buffer + os.Logger) is a click away in Settings → Diagnostics → Performance. Wiki: https://github.com/awizemann/scarf/wiki/Performance-Monitoring
- **Phases 1-3** built the core: dispatcher + ring buffer + 3 backends, chat / transport / sqlite measure points, diagnostic counters for chat-render bursts, finalize-burst dampening.
- **Tier A + B** added per-feature instrumentation: iOS file watcher, sessions list, model catalog, dashboard widgets, image encoder, message hydration.
- **Nous picker investigation** localized a 60s + 120s beach-ball to a specific path (Nous catalog `readCache`), then killed the 120s one with dedupe + 5s timeout.
- **Tier C catch-up** (this release): instrumented Memory / Skills / Cron / Curator load paths so future captures show how often these tabs cost multiple sequential SFTP RTTs on remote.
- **Per-call bytes recorded** on transport + sqlite events so captures show payload sizes alongside latencies.
- **`mac.emptyAssistantTurn` event** documents the Nous quirk where the model returns a thought stream with no body (the bubble looks like Hermes is "still thinking" but the turn already finished).
Adding a new measure point is two lines. The harness covers Mac and iOS uniformly. The "Copy as JSON" button exports the ring buffer for paste-into-issue diagnosis.
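The call-site shape, for reference; `ScarfMon.measure` and `ScarfMon.event` appear in the ScarfMonCore diff further down, while the event name and the work being timed here are illustrative stand-ins:
```swift
// Stand-in for whatever load path is being instrumented.
func loadExampleRows() throws -> [String] { [] }

// One line to time the work, one line to record the row count.
let rows = try ScarfMon.measure(.sqlite, "example.load") {
    try loadExampleRows()
}
ScarfMon.event(.sqlite, "example.load.rows", count: rows.count)
```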
---
### Other fixes + polish
- **Sessions sidebar reload debounce** — file-watcher deltas during streaming used to flicker the sessions list. Coalesced into one trailing fetch ~500ms after the last tick.
- **Session-load pagination + race guard** — switching to a small chat while a larger one is mid-fetch could last-write-wins the small chat away. Three race-checks against `self.sessionId` prevent the stale fetch from overwriting.
- **Sessions + previews batched** — two separate SSH calls folded into one `queryBatch` round trip, halving the round-trips for every sidebar refresh.
- **Remote SQLite query timeout** bumped 15→30s to better tolerate slow links; in-flight query coalescing dedupes concurrent identical queries.
- **`Thread.sleep` spin replaced** with a kernel-wait via `DispatchGroup` for `runLocal` timeout; under concurrent SSH load the old loop accumulated spin-blocked threads and produced 7-second outliers in `loadRecentSessions`.
- **Window position + size** persists across launches.
- **Sidebar reorder** — Projects promoted to first section; profile chip moved under server name.
- **`stop` badge suppressed** on metadata footer for normal turn ends (it was firing for every clean completion, looking like an error).
- **Nous picker search field** + `model-picker` filter for the long Nous overlay model list.
- **`oauth-keepalive` cron create** — drop the `--silent` flag Hermes doesn't accept.
- **Snapshot pipeline rewritten** — replaced the `sqlite3 .backup`-then-download pipeline with direct SSH-streamed query execution (issue [#74](https://github.com/awizemann/scarf/issues/74)). Eliminates the multi-minute snapshot wait on multi-GB state.db files. Companion fix: pre-expand `~/` in Swift via `resolvedUserHome` so sqlite3 finds the DB without depending on the remote shell's tilde expansion.
- **Aux nested-YAML parser** — corrected the parser so the unknown-task surface works on remote (was previously dropping aux blocks whose `provider:` value lived on a separate line).
- **`ModelPreflight` newline trim bug** — `.whitespaces` doesn't strip newlines; switched both trims to `.whitespacesAndNewlines` so a stray `\n` in a hand-edited config.yaml doesn't false-positive the mismatch banner.
---
### What's measured today
321 ScarfCore tests pass (302 prior + 19 new ModelPreflight). New ScarfMon events documented in the [Performance-Monitoring wiki](https://github.com/awizemann/scarf/wiki/Performance-Monitoring).
### Compatibility
- macOS 14+ (unchanged).
- Hermes target: still **v2026.4.30 (v0.12.0)**. No new Hermes capability gates added.
- Existing `dashboard.json` files render unchanged.
- Existing `.scarftemplate` bundles install unchanged. Catalog manifest schemaVersion stays at 1/2/3 — no bump.
- Existing `~/.hermes/.env` content is preserved byte-identically — Scarf only writes inside its `# scarf-secrets:begin <slug>` / `# scarf-secrets:end <slug>` regions.
- The skeleton-then-hydrate chat loader and SSH cancellation propagation are **Mac-only** in this release; ScarfGo (iOS) keeps its existing chat path.
### What's deferred
- **Per-widget data sources + per-widget refresh granularity.** The general "widget points at a typed data source" abstraction is the next-largest win in dashboards but materially expands the model + JS mirror + validator surface. The project-wide watch covers the common cron-driven workflow without it.
- **Cross-project health digest sidebar rollup.** Counting attention-needed projects across the registry — scoped but didn't pull its weight. The typed status enum makes it cheap to add later.
- **Automatic cron-prompt rewriter on upgrade.** Heuristic rewrites of free-form prompts are risky; the docs + agent-assisted path ships in v2.7. Revisit a "scan + fix" UI in v2.8 if real users miss the migration.
- **iOS New Project wizard + iOS Keychain-env mirror.** ScarfGo's project surface is read-only; the wizard's chat-handoff pattern depends on Mac-only ACP plumbing.
- **iOS skeleton-then-hydrate loaders.** Same data-service surfaces are public, but the iOS chat lifecycle is structured differently. Defer until iOS dogfooding shows the same payload-size pain.
- **Tier C redesigns (Memory/Skills/Cron/Curator).** Instrumented in v2.7; redesign waits for capture data showing which path actually needs the skeleton-then-hydrate treatment.
@@ -362,10 +362,17 @@ public actor ACPClient {
#endif
// session/prompt streams events and can run for minutes; no hard
// timeout. Control messages get a 30s watchdog.
// timeout. Control messages get a 60s watchdog. Older versions
// capped at 30s, which field reports (#61) showed tripping
// under realistic gateway+ACP concurrency: the gateway holds
// state.db locks for Discord sync / skill registration / cron
// scheduling, and ACP's `initialize` / `session/new` /
// `session/load` stall waiting for the lock. SQLite contention
// on a healthy host clears in seconds; 60s gives that headroom
// while still surfacing genuinely broken transports promptly.
let timeoutTask: Task<Void, Error>? = if method != "session/prompt" {
Task { [weak self] in
try await Task.sleep(nanoseconds: 30 * 1_000_000_000)
try await Task.sleep(nanoseconds: 60 * 1_000_000_000)
await self?.timeoutRequest(id: requestId, method: method)
}
} else {
@@ -586,7 +593,30 @@ public enum ACPClientError: Error, LocalizedError {
/// human-readable hint for the chat UI. Pattern-matches the most common
/// fresh-install failure modes. Returns nil when no known pattern matches.
public enum ACPErrorHint {
public static func classify(errorMessage: String, stderrTail: String) -> String? {
/// Result of a classifier hit. `hint` is the user-facing copy; when
/// the failure is an OAuth refresh-revocation, `oauthProvider` names
/// the affected provider (lowercase, matching `auth.json` keys) so
/// the UI can offer a one-click re-authenticate affordance. `nil`
/// `oauthProvider` means "we matched a non-OAuth failure mode, or
/// we matched OAuth but couldn't identify which provider."
public struct Classification: Sendable, Equatable {
public let hint: String
public let oauthProvider: String?
public init(hint: String, oauthProvider: String? = nil) {
self.hint = hint
self.oauthProvider = oauthProvider
}
}
/// Known OAuth-authed providers Hermes ships. Listed lowercase to
/// match `auth.json.providers.<key>` and the values
/// `OAuthFlowController.start(provider:)` accepts.
private static let oauthProviders = [
"nous", "claude", "anthropic", "qwen", "gemini", "google", "copilot", "github",
]
public static func classify(errorMessage: String, stderrTail: String) -> Classification? {
let haystack = errorMessage + "\n" + stderrTail
let haystack = errorMessage + "\n" + stderrTail
// SSH-level failures come first; they apply only to remote
@@ -596,30 +626,86 @@ public enum ACPErrorHint {
// all surface as opaque "ACP process terminated" / "request
// timed out", and the user has no idea where to look.
if haystack.contains("Connection refused") {
return "Couldn't reach the remote host — the SSH port is closed or the droplet is down. Check the host is running and reachable."
return Classification(hint: "Couldn't reach the remote host — the SSH port is closed or the droplet is down. Check the host is running and reachable.")
}
if haystack.localizedCaseInsensitiveContains("Operation timed out")
|| haystack.localizedCaseInsensitiveContains("Connection timed out")
|| haystack.contains("Network is unreachable")
|| haystack.contains("No route to host") {
return "Couldn't reach the remote host — the network connection timed out. Check the host is running and your network is up."
return Classification(hint: "Couldn't reach the remote host — the network connection timed out. Check the host is running and your network is up.")
}
if haystack.contains("Permission denied (publickey")
|| haystack.contains("Permission denied, please try again") {
return "SSH rejected the key. Make sure the right identity file is selected and that ssh-agent has the key loaded — open Terminal and run `ssh-add -l`."
return Classification(hint: "SSH rejected the key. Make sure the right identity file is selected and that ssh-agent has the key loaded — open Terminal and run `ssh-add -l`.")
}
if haystack.contains("Host key verification failed")
|| haystack.contains("REMOTE HOST IDENTIFICATION HAS CHANGED") {
return "The remote host's SSH key changed. If you just rebuilt the droplet, remove the old entry with `ssh-keygen -R <host>`, then try again."
return Classification(hint: "The remote host's SSH key changed. If you just rebuilt the droplet, remove the old entry with `ssh-keygen -R <host>`, then try again.")
}
if haystack.contains("Could not resolve hostname")
|| haystack.contains("Name or service not known") {
return "Couldn't resolve the host name. Check the host in this server's settings."
return Classification(hint: "Couldn't resolve the host name. Check the host in this server's settings.")
}
if haystack.localizedCaseInsensitiveContains("command not found")
|| haystack.contains("hermes: not found")
|| haystack.contains("exit 127") {
return "The remote shell couldn't find `hermes`. Either install Hermes on the remote (`pipx install hermes-agent`) or set an absolute binary path in this server's settings."
return Classification(hint: "The remote shell couldn't find `hermes`. Either install Hermes on the remote (`pipx install hermes-agent`) or set an absolute binary path in this server's settings.")
}
// OAuth refresh-token revocation. Hermes prints
// "Refresh session has been revoked. Run `hermes model` to
// re-authenticate." to stderr/stdout when an OAuth-authed
// provider's refresh token can no longer mint access tokens
// (user revoked, server rotated keys, etc.). We can't drive
// `hermes model` interactively, but `hermes auth add <provider>
// --type oauth` is the same code path Scarf already drives via
// `OAuthFlowController` for first-time setup, so we surface a
// re-authenticate affordance instead. Checked BEFORE the
// generic "no credentials found" path because the message
// contains the word "credentials" via the surrounding context.
if haystack.localizedCaseInsensitiveContains("refresh session has been revoked")
|| haystack.range(of: #"refresh.*revoked"#, options: [.regularExpression, .caseInsensitive]) != nil
|| haystack.localizedCaseInsensitiveContains("re-authenticate")
|| haystack.localizedCaseInsensitiveContains("reauthenticate")
|| (haystack.contains("401") && oauthProvider(in: haystack) != nil)
|| (haystack.localizedCaseInsensitiveContains("unauthorized") && oauthProvider(in: haystack) != nil) {
let provider = oauthProvider(in: haystack)
let suffix = provider.map { " (affected provider: \($0))." } ?? "."
return Classification(
hint: "Your OAuth session has expired or been revoked\(suffix) Click Re-authenticate below to sign in again.",
oauthProvider: provider
)
}
// Auxiliary task references a provider that isn't authenticated.
// Hermes prints `resolve_provider_client: <name> requested but
// <Display Name> not configured` when an aux task (compression,
// summarization, memory_flush, curator, vision, web_extract,
// session_search, skills_hub) has `provider: <name>` set in
// config.yaml but that provider's credentials aren't loaded.
// Common after a user removes one OAuth provider while their
// existing config.yaml still names it for an aux task. The
// chat banner used to surface this as `-32603 Internal error`
// with no actionable detail; surface a clear path now.
if let match = haystack.range(
of: #"resolve_provider_client:\s*([a-zA-Z0-9_-]+)\s+requested\s+but"#,
options: .regularExpression
) {
let line = String(haystack[match])
// Pull the captured provider name out of the matched line.
// First word after "resolve_provider_client:" is the value.
let provider: String = {
let parts = line.split(whereSeparator: { $0.isWhitespace })
if let idx = parts.firstIndex(where: { $0.contains("resolve_provider_client") }),
parts.index(after: idx) < parts.endIndex {
let candidate = parts[parts.index(after: idx)]
return String(candidate)
}
return "an unauthenticated provider"
}()
return Classification(
hint: "An auxiliary task is configured to use `\(provider)` but that provider isn't authenticated. Open Settings → Aux Models, or check `~/.hermes/config.yaml` for `auxiliary.<task>.provider: \(provider)` and switch it to your active provider (or set it to `auto`)."
)
}
if haystack.range(of: #"No\s+(Anthropic|OpenAI|OpenRouter|Gemini|Google|Groq|Mistral|XAI)?\s*credentials\s+found"#,
@@ -628,7 +714,7 @@ public enum ACPErrorHint {
|| haystack.contains("ANTHROPIC_TOKEN")
|| haystack.contains("claude setup-token")
|| haystack.contains("claude /login") {
return "Hermes can't find your AI provider credentials. Set `ANTHROPIC_API_KEY` (or similar) in `~/.hermes/.env` or your shell profile, then restart Scarf."
return Classification(hint: "Hermes can't find your AI provider credentials. Set `ANTHROPIC_API_KEY` (or similar) in `~/.hermes/.env` or your shell profile, then restart Scarf.")
}
if let match = haystack.range(of: #"No such file or directory:\s*'([^']+)'"#,
options: .regularExpression) {
@@ -636,13 +722,47 @@ public enum ACPErrorHint {
if let nameStart = matched.range(of: "'"),
let nameEnd = matched.range(of: "'", range: nameStart.upperBound..<matched.endIndex) {
let name = String(matched[nameStart.upperBound..<nameEnd.lowerBound])
return "Hermes couldn't find `\(name)` on PATH. If you use nvm/asdf/mise, make sure it's exported in `~/.zprofile` (not only `~/.zshrc`), then restart Scarf."
return Classification(hint: "Hermes couldn't find `\(name)` on PATH. If you use nvm/asdf/mise, make sure it's exported in `~/.zprofile` (not only `~/.zshrc`), then restart Scarf.")
}
return "Hermes couldn't find a required binary on PATH. Check that your shell's PATH is exported in `~/.zprofile`, then restart Scarf."
return Classification(hint: "Hermes couldn't find a required binary on PATH. Check that your shell's PATH is exported in `~/.zprofile`, then restart Scarf.")
}
if haystack.localizedCaseInsensitiveContains("rate limit")
|| haystack.localizedCaseInsensitiveContains("429") {
return "Your AI provider returned a rate-limit error. Try again in a moment."
return Classification(hint: "Your AI provider returned a rate-limit error. Try again in a moment.")
}
// Model-availability failure. Hermes pins each session to the
// model that opened it, so resuming an old session whose model
// is no longer available (provider deprecation, OAuth swapped
// to a different provider, model name changed) returns a 404
// / model_not_found from the upstream provider surfaced as
// an opaque "-32603 Internal error" in chat. v2.8 surfaces a
// clear "session is pinned" hint with the recovery path.
if haystack.localizedCaseInsensitiveContains("model_not_found")
|| haystack.localizedCaseInsensitiveContains("model not found")
|| haystack.localizedCaseInsensitiveContains("invalid_model")
|| haystack.localizedCaseInsensitiveContains("model is not available")
|| haystack.localizedCaseInsensitiveContains("unknown model")
|| (haystack.contains("404") && (haystack.localizedCaseInsensitiveContains("model")
|| haystack.localizedCaseInsensitiveContains("messages"))) {
return Classification(hint: "This session was created with a model the provider no longer offers. Hermes pins each session to its original model — start a new chat to use your current model, or run `hermes sessions clone` in Terminal to copy this conversation onto the new model.")
}
return nil
}
/// Best-effort extraction of an OAuth provider name from raw error
/// text. Returns the lowercase provider key (`"nous"`, `"claude"`,
/// etc.) when one of the known OAuth providers appears as a whole
/// word. The first match wins; Hermes typically logs the active
/// provider name once, near the failure.
private static func oauthProvider(in haystack: String) -> String? {
let lowered = haystack.lowercased()
for provider in oauthProviders {
// Whole-word match so substrings like "anthropicapi" don't
// false-trigger on "anthropic".
let pattern = "\\b" + NSRegularExpression.escapedPattern(for: provider) + "\\b"
if lowered.range(of: pattern, options: .regularExpression) != nil {
return provider
}
}
return nil
}
@@ -0,0 +1,277 @@
import Foundation
#if canImport(os)
import os
import os.signpost
#endif
/// Lightweight performance instrumentation for the Scarf app family.
///
/// Three primitives, `measure(...)`, `measureAsync(...)`, and `event(...)`, drop
/// timing samples through whatever set of backends is currently active.
/// Backends are pluggable: an always-on `os_signpost` backend (free outside
/// Instruments), an in-memory ring buffer (drives the in-app panel), and an
/// `os.Logger` debug backend (off by default).
///
/// **Cost when off.** When no backends are registered, every entry point is
/// `@inline(__always)` and short-circuits to the body call without taking the
/// `ContinuousClock.now` reading. The open-source build defaults to "signpost
/// only"; that backend pays one signpost emit per call, which Apple's runtime
/// elides when no Instruments session is recording.
///
/// **Privacy.** Names are `StaticString` so we cannot accidentally pass user
/// content through a metric tag. Optional `bytes:` field on `event` tracks
/// payload size, never payload contents. The ring buffer never leaves the
/// device unless the user explicitly hits "Copy as JSON" in the Diagnostics
/// panel.
public enum ScarfMon {
// MARK: - Public API
/// Synchronous timing wrapper. The body's return value flows through
/// untouched; the time it took plus `(category, name)` are recorded.
@inline(__always)
public static func measure<T>(
_ category: Category,
_ name: StaticString,
_ body: () throws -> T
) rethrows -> T {
guard isActive else { return try body() }
let start = ContinuousClock.now
defer { record(category, name, start: start, end: ContinuousClock.now) }
return try body()
}
/// Async variant. Same shape; the `defer` block fires after the body
/// returns whether or not it threw, so cancelled / failed work still
/// records its duration.
@inline(__always)
public static func measureAsync<T>(
_ category: Category,
_ name: StaticString,
_ body: () async throws -> T
) async rethrows -> T {
guard isActive else { return try await body() }
let start = ContinuousClock.now
defer { record(category, name, start: start, end: ContinuousClock.now) }
return try await body()
}
/// Single-shot timestamped event. Use for things that aren't intervals
/// (token arrivals, buffer flushes) where count + optional payload size
/// is the useful signal.
@inline(__always)
public static func event(
_ category: Category,
_ name: StaticString,
count: Int = 1,
bytes: Int? = nil
) {
guard isActive else { return }
recordEvent(category, name, count: count, bytes: bytes)
}
// MARK: - Backend management
/// Install the desired backend set. Replaces the current set atomically.
/// Call once at app boot from the launch sequence; safe to call again
/// when the user toggles a setting on or off.
public static func install(_ backends: [ScarfMonBackend]) {
lock.lock()
defer { lock.unlock() }
installed = backends
cachedActive = !backends.isEmpty
}
/// Currently-installed backends. Test-only; callers should not iterate
/// this in production.
public static var currentBackends: [ScarfMonBackend] {
lock.lock()
defer { lock.unlock() }
return installed
}
/// Cheap "are we recording anything?" check. The flag is updated only
/// when `install(...)` runs, so the hot path doesn't take the lock.
@inline(__always)
public static var isActive: Bool { cachedActive }
// MARK: - Internals
private static let lock = ScarfMonLock()
nonisolated(unsafe) private static var installed: [ScarfMonBackend] = []
nonisolated(unsafe) private static var cachedActive: Bool = false
@inline(__always)
private static func record(
_ category: Category,
_ name: StaticString,
start: ContinuousClock.Instant,
end: ContinuousClock.Instant
) {
let duration = end - start
let nanos = nanoseconds(of: duration)
let backends = snapshotBackends()
let sample = Sample(
category: category,
name: name,
kind: .interval,
timestamp: Date(),
durationNanos: nanos,
count: 1,
bytes: nil
)
for backend in backends {
backend.record(sample)
}
}
@inline(__always)
private static func recordEvent(
_ category: Category,
_ name: StaticString,
count: Int,
bytes: Int?
) {
let backends = snapshotBackends()
let sample = Sample(
category: category,
name: name,
kind: .event,
timestamp: Date(),
durationNanos: 0,
count: count,
bytes: bytes
)
for backend in backends {
backend.record(sample)
}
}
private static func snapshotBackends() -> [ScarfMonBackend] {
lock.lock()
defer { lock.unlock() }
return installed
}
private static func nanoseconds(of duration: Duration) -> UInt64 {
// Duration is (seconds: Int64, attoseconds: Int64). Avoid Double
// for the seconds term to keep precision on long intervals.
let comps = duration.components
let secondsAsNanos = UInt64(max(0, comps.seconds)) &* 1_000_000_000
let attoAsNanos = UInt64(max(0, comps.attoseconds) / 1_000_000_000)
return secondsAsNanos &+ attoAsNanos
}
}
// MARK: - Categories
extension ScarfMon {
/// Stable category vocabulary. Add cases here when new subsystems get
/// instrumented; renames are breaking changes for any saved JSON dumps
/// users have shared, so prefer adding over renaming.
public enum Category: String, CaseIterable, Sendable, Codable {
case chatRender
case chatStream
case sessionLoad
case transport
case sqlite
case diskIO
case render
case other
}
}
// MARK: - Sample
/// One recorded sample. All fields are value types so the struct is trivially
/// `Sendable` across backend queues without locks.
public struct ScarfMonSample: Sendable, Hashable {
public enum Kind: String, Sendable, Codable {
case interval
case event
}
public let category: ScarfMon.Category
/// Static name string captured at the call site. Not a `String`; keeping
/// it `StaticString` proves at compile time that names cannot leak user
/// data through this channel.
public let name: StaticString
public let kind: Kind
public let timestamp: Date
public let durationNanos: UInt64
public let count: Int
public let bytes: Int?
public init(
category: ScarfMon.Category,
name: StaticString,
kind: Kind,
timestamp: Date,
durationNanos: UInt64,
count: Int,
bytes: Int?
) {
self.category = category
self.name = name
self.kind = kind
self.timestamp = timestamp
self.durationNanos = durationNanos
self.count = count
self.bytes = bytes
}
/// `StaticString` does not conform to `Hashable` natively (it doesn't
/// promise a stable hash). We hash via its UTF-8 representation so two
/// samples with the same source-literal name compare equal.
public static func == (lhs: ScarfMonSample, rhs: ScarfMonSample) -> Bool {
lhs.category == rhs.category
&& lhs.kind == rhs.kind
&& lhs.timestamp == rhs.timestamp
&& lhs.durationNanos == rhs.durationNanos
&& lhs.count == rhs.count
&& lhs.bytes == rhs.bytes
&& lhs.name.description == rhs.name.description
}
public func hash(into hasher: inout Hasher) {
hasher.combine(category)
hasher.combine(kind)
hasher.combine(timestamp)
hasher.combine(durationNanos)
hasher.combine(count)
hasher.combine(bytes)
hasher.combine(name.description)
}
}
extension ScarfMon {
public typealias Sample = ScarfMonSample
}
// MARK: - Backend protocol
/// One sink for samples. Implementations must be cheap on the hot path:
/// callers hold no lock while invoking `record`, but the hot path runs from
/// every instrumented site, so allocations and disk I/O are off-limits here.
public protocol ScarfMonBackend: Sendable {
func record(_ sample: ScarfMon.Sample)
}
// MARK: - Lock
/// Tiny `os_unfair_lock` wrapper. CLAUDE.md says "Use os_unfair_lock (not
/// NSLock) for simple boolean flags accessed from multiple threads."
@usableFromInline
final class ScarfMonLock: @unchecked Sendable {
private let _lock: UnsafeMutablePointer<os_unfair_lock>
init() {
_lock = .allocate(capacity: 1)
_lock.initialize(to: os_unfair_lock())
}
deinit {
_lock.deinitialize(count: 1)
_lock.deallocate()
}
@usableFromInline func lock() { os_unfair_lock_lock(_lock) }
@usableFromInline func unlock() { os_unfair_lock_unlock(_lock) }
}
@@ -0,0 +1,76 @@
import Foundation
/// Boot-time wiring for ScarfMon. Both app targets call
/// `ScarfMonBoot.configure(...)` at launch and again whenever the user
/// flips the Diagnostics → Performance toggle.
///
/// Three modes:
/// - `.off`: nothing is recorded. Hot path is one branch + return.
/// - `.signpostOnly`: Instruments-only. Default in the open-source build.
///   Free outside an Instruments session.
/// - `.full`: signpost + ring buffer + os.Logger debug stream. Drives the
/// in-app panel and the "Copy as JSON" button. Opt-in.
public enum ScarfMonBoot {
public enum Mode: String, Sendable, CaseIterable {
case off
case signpostOnly
case full
}
/// User-defaults key for the persisted toggle. Same key on iOS + Mac
/// so `defaults read com.scarf.app ScarfMonMode` works on either.
public static let userDefaultsKey = "ScarfMonMode"
/// Read the persisted mode, defaulting to `.signpostOnly` so users
/// always get Instruments-visible signposts unless they explicitly
/// turn them off.
public static func currentMode(_ defaults: UserDefaults = .standard) -> Mode {
if let raw = defaults.string(forKey: userDefaultsKey),
let mode = Mode(rawValue: raw) {
return mode
}
return .signpostOnly
}
/// Persist a new mode and reinstall the backend set.
public static func setMode(_ mode: Mode, _ defaults: UserDefaults = .standard) {
defaults.set(mode.rawValue, forKey: userDefaultsKey)
configure(mode: mode)
}
/// Install the backend set for a given mode. Returns the active ring
/// buffer (if any) so the in-app Diagnostics panel can read from it.
@discardableResult
public static func configure(mode: Mode) -> ScarfMonRingBuffer? {
switch mode {
case .off:
ScarfMon.install([])
sharedRingBuffer = nil
return nil
case .signpostOnly:
ScarfMon.install([ScarfMonSignpostBackend()])
sharedRingBuffer = nil
return nil
case .full:
let ring = ScarfMonRingBuffer()
sharedRingBuffer = ring
ScarfMon.install([
ScarfMonSignpostBackend(),
ring,
ScarfMonLoggerBackend()
])
return ring
}
}
/// Process-wide ring buffer when running in `.full` mode. Nil otherwise.
/// Read by the Diagnostics panel; writes happen through the backend
/// dispatcher so this property is read-only.
///
/// `nonisolated(unsafe)` because the value is only mutated by
/// `configure(...)` (which itself runs on whichever actor invokes
/// the boot helper at app launch; single-writer in practice) and
/// read from the panel UI on the main actor. Adding a lock here
/// would just add overhead with no real safety win.
nonisolated(unsafe) public private(set) static var sharedRingBuffer: ScarfMonRingBuffer?
}
@@ -0,0 +1,41 @@
import Foundation
#if canImport(os)
import os
#endif
/// `os.Logger`-backed sink. Off by default; opt-in via the Diagnostics
/// settings toggle. Writes one `.debug` line per sample at the
/// `com.scarf.mon` subsystem, so users can stream the output via
/// `log stream --predicate 'subsystem == "com.scarf.mon"'` without
/// enabling private-data redaction overrides.
///
/// Only meaningful for users running their own debug build or with the
/// "verbose performance logging" toggle on.
public final class ScarfMonLoggerBackend: ScarfMonBackend, @unchecked Sendable {
#if canImport(os)
private let logger: Logger
public init(category: String = "perf") {
self.logger = Logger(subsystem: "com.scarf.mon", category: category)
}
public func record(_ sample: ScarfMon.Sample) {
switch sample.kind {
case .interval:
// Interpolating with `privacy: .public` keeps the name and duration
// out of the private-data redaction path: names are public, durations
// are public, the user's content never touches this channel.
logger.debug(
"\(sample.category.rawValue, privacy: .public) \(sample.name.description, privacy: .public) ms=\(Double(sample.durationNanos) / 1_000_000.0, privacy: .public)"
)
case .event:
logger.debug(
"\(sample.category.rawValue, privacy: .public) \(sample.name.description, privacy: .public) count=\(sample.count, privacy: .public) bytes=\(sample.bytes ?? -1, privacy: .public)"
)
}
}
#else
public init(category: String = "perf") {}
public func record(_ sample: ScarfMon.Sample) { /* no-op off-Apple */ }
#endif
}
@@ -0,0 +1,176 @@
import Foundation
/// Fixed-size, lock-protected ring of recent samples. Drives the in-app
/// Diagnostics panel and the export-as-JSON button.
///
/// Capacity is a compile-time choice; 4096 entries × ~80 bytes per sample =
/// ~320 KB resident. That's enough for several minutes of streaming-chat
/// activity at 200 samples/s without overwriting interesting context.
///
/// The hot path takes one `os_unfair_lock` per `record`. Aggregation (the
/// `summary(...)` reader) builds a fresh dictionary each call; it's only invoked
/// from the panel UI, which polls at a human cadence.
public final class ScarfMonRingBuffer: ScarfMonBackend, @unchecked Sendable {
public let capacity: Int
private let lock = ScarfMonLock()
private var storage: [ScarfMon.Sample?]
/// Next write index. Wraps around `capacity` so the buffer never grows.
private var head: Int = 0
/// True once we've wrapped at least once; switches the read order from
/// `[0..<head]` to `[head..<capacity] + [0..<head]`.
private var didWrap: Bool = false
public init(capacity: Int = 4096) {
precondition(capacity > 0, "ring buffer needs a positive capacity")
self.capacity = capacity
self.storage = Array(repeating: nil, count: capacity)
}
public func record(_ sample: ScarfMon.Sample) {
lock.lock()
defer { lock.unlock() }
storage[head] = sample
head += 1
if head >= capacity {
head = 0
didWrap = true
}
}
/// Snapshot of all currently-resident samples in chronological order.
public func samples() -> [ScarfMon.Sample] {
lock.lock()
defer { lock.unlock() }
if !didWrap {
return storage[0..<head].compactMap { $0 }
}
let tail = storage[head..<capacity].compactMap { $0 }
let leading = storage[0..<head].compactMap { $0 }
return tail + leading
}
/// Wipe the buffer. Used by the "Reset" button in the Diagnostics
/// panel and at the top of every test case.
public func reset() {
lock.lock()
defer { lock.unlock() }
for i in 0..<capacity { storage[i] = nil }
head = 0
didWrap = false
}
/// Aggregated stats over the current buffer. Buckets by
/// `(category, name)`; computes count, total nanos, mean, p50, p95.
public func summary() -> [ScarfMonStat] {
let snapshot = samples()
var buckets: [BucketKey: [UInt64]] = [:]
var counts: [BucketKey: Int] = [:]
var byteTotals: [BucketKey: Int] = [:]
var kinds: [BucketKey: ScarfMon.Sample.Kind] = [:]
for sample in snapshot {
let key = BucketKey(category: sample.category, name: sample.name.description)
kinds[key] = sample.kind
counts[key, default: 0] += sample.count
if let b = sample.bytes { byteTotals[key, default: 0] += b }
if sample.kind == .interval {
buckets[key, default: []].append(sample.durationNanos)
}
}
var stats: [ScarfMonStat] = []
for (key, _) in counts {
let durations = buckets[key] ?? []
let kind = kinds[key] ?? .event
stats.append(ScarfMonStat(
category: key.category,
name: key.name,
kind: kind,
count: counts[key] ?? 0,
totalNanos: durations.reduce(0, &+),
p50Nanos: percentile(durations, 0.50),
p95Nanos: percentile(durations, 0.95),
maxNanos: durations.max() ?? 0,
totalBytes: byteTotals[key] ?? 0
))
}
stats.sort { $0.p95Nanos > $1.p95Nanos }
return stats
}
private struct BucketKey: Hashable {
let category: ScarfMon.Category
let name: String
}
private func percentile(_ values: [UInt64], _ p: Double) -> UInt64 {
guard !values.isEmpty else { return 0 }
let sorted = values.sorted()
// Nearest-rank percentile: good enough for triage and avoids
// interpolation edge cases on tiny samples.
let rank = max(1, min(sorted.count, Int((p * Double(sorted.count)).rounded(.up))))
return sorted[rank - 1]
}
}
/// Per-bucket stats surfaced to the in-app panel.
public struct ScarfMonStat: Sendable, Hashable, Codable {
public let category: ScarfMon.Category
public let name: String
public let kind: ScarfMon.Sample.Kind
public let count: Int
public let totalNanos: UInt64
public let p50Nanos: UInt64
public let p95Nanos: UInt64
public let maxNanos: UInt64
public let totalBytes: Int
public var totalMs: Double { Double(totalNanos) / 1_000_000.0 }
public var p50Ms: Double { Double(p50Nanos) / 1_000_000.0 }
public var p95Ms: Double { Double(p95Nanos) / 1_000_000.0 }
public var maxMs: Double { Double(maxNanos) / 1_000_000.0 }
}
// MARK: - JSON export
extension ScarfMonRingBuffer {
/// Compact JSON dump for the "Copy as JSON" button. One line per sample
/// keeps the output greppable when the user pastes it into a feedback
/// thread.
public func exportJSON() -> String {
struct Wire: Codable {
let category: String
let name: String
let kind: String
let timestampMs: Double
let durationNanos: UInt64
let count: Int
let bytes: Int?
}
let snapshot = samples()
let encoder = JSONEncoder()
encoder.outputFormatting = [.sortedKeys]
var lines: [String] = []
lines.reserveCapacity(snapshot.count + 1)
lines.append("[")
for (i, s) in snapshot.enumerated() {
let wire = Wire(
category: s.category.rawValue,
name: s.name.description,
kind: s.kind.rawValue,
timestampMs: s.timestamp.timeIntervalSince1970 * 1000,
durationNanos: s.durationNanos,
count: s.count,
bytes: s.bytes
)
if let data = try? encoder.encode(wire),
let line = String(data: data, encoding: .utf8) {
let suffix = i == snapshot.count - 1 ? "" : ","
lines.append(" " + line + suffix)
}
}
lines.append("]")
return lines.joined(separator: "\n")
}
}
@@ -0,0 +1,54 @@
import Foundation
#if canImport(os)
import os
import os.signpost
#endif
/// Always-on signpost backend. Emits an `os_signpost` event per sample so
/// users can attach Instruments and see Scarf's instrumentation in the
/// Points of Interest track without a debug build.
///
/// `os_signpost` is elided by the runtime when no Instruments session is
/// recording the relevant subsystem; the backend pays the cost of one
/// `OSLog` lookup per emit and nothing else.
public final class ScarfMonSignpostBackend: ScarfMonBackend, @unchecked Sendable {
#if canImport(os)
private let log: OSLog
public init(subsystem: String = "com.scarf.mon") {
self.log = OSLog(subsystem: subsystem, category: .pointsOfInterest)
}
public func record(_ sample: ScarfMon.Sample) {
// Signposts want a `StaticString` name; we already require
// exactly that on the API. Format string is also static; the
// dynamic values flow as printf-style args, so no allocations
// for the event name itself.
switch sample.kind {
case .interval:
os_signpost(
.event,
log: log,
name: sample.name,
"category=%{public}@ ms=%{public}.3f count=%d",
sample.category.rawValue,
Double(sample.durationNanos) / 1_000_000.0,
sample.count
)
case .event:
os_signpost(
.event,
log: log,
name: sample.name,
"category=%{public}@ count=%d bytes=%d",
sample.category.rawValue,
sample.count,
sample.bytes ?? -1
)
}
}
#else
public init(subsystem: String = "com.scarf.mon") {}
public func record(_ sample: ScarfMon.Sample) { /* no-op off-Apple */ }
#endif
}
@@ -32,10 +32,21 @@ public enum QueryDefaults: Sendable {
/// consistent budget and so we have one knob to retune if perf
/// concerns shift.
public enum HistoryPageSize: Sendable {
/// Initial chat-history load: covers the vast majority of
/// sessions in one fetch while keeping the snapshot read bounded
/// for the rare 1000+-message session.
public nonisolated static let initial = 200
/// Initial chat-history load. **Sized to fit the SSH wire payload
/// inside a 30-second `RemoteSQLiteBackend.queryTimeout`.** A
/// 157-message session at 200-row page size produced enough
/// JSON (with `reasoning_content` for thinking models) to time
/// out at exactly 30 s on a 420 ms-RTT remote. Dropped to 50,
/// then to 25 in v2.7 after a 160-message session still timed
/// out at 50: `reasoning_content` for thinking-model turns can
/// run 20+ KB per row, so 50 rows × 30 KB = 1.5 MB of JSON, which
/// over a slow SSH channel still trips the 30 s budget. Pair
/// with `messageColumnsLight` (excludes `reasoning_content`)
/// so the on-wire payload is small even at this size; the
/// inspector pane lazy-loads via `fetchReasoningContent(for:)`
/// when the user expands a disclosure. The "Load earlier"
/// affordance pages back through older messages on demand.
public nonisolated static let initial = 25
/// Reconnection reconcile against the DB. 200 rows is plenty;
/// disconnects don't generate hundreds of unseen messages.
public nonisolated static let reconcile = 200
@@ -64,6 +64,28 @@ public struct HermesMessage: Identifiable, Sendable {
if let rc = reasoningContent, !rc.isEmpty { return rc }
return reasoning
}
/// Return a copy of this message with `toolCalls` replaced. Used
/// by the v2.8 two-phase chat loader: skeleton fetch returns
/// messages with empty `toolCalls`; the background hydrate splices
/// the parsed values in without re-fetching the conversational
/// columns.
public func withToolCalls(_ newCalls: [HermesToolCall]) -> HermesMessage {
HermesMessage(
id: id,
sessionId: sessionId,
role: role,
content: content,
toolCallId: toolCallId,
toolCalls: newCalls,
toolName: toolName,
timestamp: timestamp,
tokenCount: tokenCount,
finishReason: finishReason,
reasoning: reasoning,
reasoningContent: reasoningContent
)
}
}
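// Illustrative only (not part of the diff): the hydrate splice described
// above. `skeleton` is the skeleton-fetch result (empty `toolCalls`) and
// `hydrated` maps message id → parsed calls; both names are hypothetical,
// only `withToolCalls(_:)` comes from the model.
func splicingToolCalls(
into skeleton: [HermesMessage],
from hydrated: [HermesMessage.ID: [HermesToolCall]]
) -> [HermesMessage] {
skeleton.map { message in
guard let calls = hydrated[message.id], !calls.isEmpty else { return message }
return message.withToolCalls(calls)
}
}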
public struct HermesToolCall: Identifiable, Sendable, Codable {
@@ -210,3 +232,23 @@ public enum ToolKind: String, Sendable, CaseIterable {
}
}
}
/// Outcome of a `fetchMessagesOutcome` call. `transportError` is non-nil
/// only when the underlying SSH/SQLite call hit a transport-layer
/// failure (timeout, ControlMaster drop), which distinguishes a genuine
/// empty session from a silent partial load. The chat resume path uses
/// it to surface a "couldn't load full history" banner.
public struct MessageFetchOutcome: Sendable {
public let messages: [HermesMessage]
public let transportError: String?
public init(messages: [HermesMessage], transportError: String?) {
self.messages = messages
self.transportError = transportError
}
/// True when the fetch tripped a transport failure. Distinct from
/// `messages.isEmpty`: an empty session is a successful zero-row
/// result, while a transport error is "we don't know what's there."
public var didTimeOut: Bool { transportError != nil }
}
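// Illustrative only (not part of the diff): how the resume path might use
// the outcome. Returns the banner text when history may be incomplete,
// nil when the (possibly empty) result is trustworthy.
func historyBanner(for outcome: MessageFetchOutcome) -> String? {
guard outcome.didTimeOut else { return nil }
return "Couldn't load full history: \(outcome.transportError ?? "transport error")"
}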
@@ -98,6 +98,12 @@ public struct HermesPathSet: Sendable, Hashable {
/// on user request from the model picker. Survives offline runs so
/// the picker still has something to render.
public nonisolated var nousModelsCache: String { scarfDir + "/nous_models_cache.json" }
/// Cached `templates/catalog.json` from awizemann.github.io. Populated
/// by `CatalogService` on first sheet-open and refreshed on a 24h TTL
/// or on explicit user click. Mirrors `nousModelsCache` exactly:
/// JSON, scarf-owned, survives offline runs so the catalog browser
/// still has something to render. Wiped by a Hermes home reset.
public nonisolated var catalogCache: String { scarfDir + "/catalog_cache.json" }
public nonisolated var mcpTokensDir: String { home + "/mcp-tokens" }
// MARK: - Binary resolution
@@ -39,6 +39,13 @@ public struct ProjectEntry: Codable, Sendable, Identifiable, Hashable {
public var dashboardPath: String { path + "/.scarf/dashboard.json" }
/// Directory holding the project's Scarf-managed sidecar files
/// (dashboard.json, manifest.json, template.lock.json, config.json,
/// plus any cron-job-written reports the dashboard widgets reference).
/// Watched as a unit by `HermesFileWatcher` so any file added /
/// removed / renamed inside refreshes the dashboard automatically.
public var scarfDir: String { path + "/.scarf" }
// MARK: - Codable (custom for backward compat)
private enum CodingKeys: String, CodingKey {
@@ -152,29 +159,54 @@ public struct DashboardWidget: Codable, Sendable, Identifiable {
// List
public let items: [ListItem]?
// Webview
// Webview / Image (image reuses `url` for remote, `path` for local)
public let url: String?
public let height: Double?
// v2.7 file-reading widgets (markdown_file, log_tail, image-local).
// `path` is resolved relative to the project root (the directory that
// contains `.scarf/`). Renderers must reject `..` segments after
// normalization to prevent escape from the project boundary.
public let path: String?
public let lines: Int?
// v2.7 cron_status widget; `jobId` matches HermesCronJob.id.
public let jobId: String?
// v2.7 status_grid widget; `cells` carries label + status per square,
// `gridColumns` overrides the auto-fit column count (keep distinct
// from `columns` which is the table-widget header list).
public let cells: [StatusGridCell]?
public let gridColumns: Int?
// v2.7 optional sparkline trend on `stat` widgets.
public let sparkline: [Double]?
public init(
type: String,
title: String,
value: WidgetValue?,
icon: String?,
color: String?,
subtitle: String?,
label: String?,
content: String?,
format: String?,
columns: [String]?,
rows: [[String]]?,
chartType: String?,
xLabel: String?,
yLabel: String?,
series: [ChartSeries]?,
items: [ListItem]?,
url: String?,
height: Double?
value: WidgetValue? = nil,
icon: String? = nil,
color: String? = nil,
subtitle: String? = nil,
label: String? = nil,
content: String? = nil,
format: String? = nil,
columns: [String]? = nil,
rows: [[String]]? = nil,
chartType: String? = nil,
xLabel: String? = nil,
yLabel: String? = nil,
series: [ChartSeries]? = nil,
items: [ListItem]? = nil,
url: String? = nil,
height: Double? = nil,
path: String? = nil,
lines: Int? = nil,
jobId: String? = nil,
cells: [StatusGridCell]? = nil,
gridColumns: Int? = nil,
sparkline: [Double]? = nil
) {
self.type = type
self.title = title
@@ -194,6 +226,29 @@ public struct DashboardWidget: Codable, Sendable, Identifiable {
self.items = items
self.url = url
self.height = height
self.path = path
self.lines = lines
self.jobId = jobId
self.cells = cells
self.gridColumns = gridColumns
self.sparkline = sparkline
}
}
// MARK: - Status Grid Data (v2.7)
/// One cell of a `status_grid` widget. Status semantics match `ListItem.status`
/// parsed via `ListItemStatus(raw:)` so the same vocabulary + synonyms apply.
public struct StatusGridCell: Codable, Sendable, Identifiable, Hashable {
public var id: String { label }
public let label: String
public let status: String?
public let tooltip: String?
public init(label: String, status: String? = nil, tooltip: String? = nil) {
self.label = label
self.status = status
self.tooltip = tooltip
}
}
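// Illustrative only (not part of the diff): a v2.7 status_grid widget built
// in Swift with the defaulted initializer above. The labels and statuses
// are made up; real dashboards supply the same fields via dashboard.json.
let exampleStatusGrid = DashboardWidget(
type: "status_grid",
title: "Services",
cells: [
StatusGridCell(label: "api", status: "up"),
StatusGridCell(label: "worker", status: "degraded", tooltip: "queue backlog"),
StatusGridCell(label: "cron", status: "failed")
],
gridColumns: 3
)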
@@ -284,3 +339,47 @@ public struct ListItem: Codable, Sendable, Identifiable {
self.status = status
}
}
/// Typed semantic status for `ListItem` (and `status_grid` cells in v2.7+).
///
/// Wire format stays a free `String?` on `ListItem` for backwards compatibility,
/// so pre-existing dashboards never break. Renderers call `ListItemStatus(raw:)`
/// to map known values + synonyms to a canonical case; unknown values return
/// `nil` and render as plain neutral text.
public enum ListItemStatus: String, Sendable, Hashable, CaseIterable {
case success
case warning
case danger
case info
case pending
case done
case neutral
/// Lenient parse accepts canonical names plus common synonyms seen in
/// real-world dashboards (`ok`/`up` → success, `down`/`error`/`failed`
/// → danger, `active` → info). Returns `nil` for unrecognized strings so
/// the renderer can fall back to plain text.
public init?(raw: String?) {
guard let raw = raw?.trimmingCharacters(in: .whitespaces).lowercased(), !raw.isEmpty else {
return nil
}
switch raw {
case "success", "ok", "up", "green", "passing":
self = .success
case "warning", "warn", "yellow", "degraded":
self = .warning
case "danger", "down", "error", "failed", "failure", "red", "critical":
self = .danger
case "info", "active", "blue":
self = .info
case "pending", "queued", "waiting", "scheduled":
self = .pending
case "done", "complete", "completed", "finished":
self = .done
case "neutral", "muted", "gray":
self = .neutral
default:
return nil
}
}
}
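// Illustrative only (not part of the diff): the lenient parse in action.
// Unknown strings fall through to nil and render as plain neutral text.
let exampleStatuses = ["ok", "DOWN", " pending ", "mystery"].map { ListItemStatus(raw: $0) }
// → [.success, .danger, .pending, nil]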
@@ -0,0 +1,113 @@
import Foundation
/// Pluggable query engine for `HermesDataService`. Two implementations
/// today:
///
/// * `LocalSQLiteBackend` opens the local `~/.hermes/state.db` via
/// libsqlite3 and runs queries in-process. Microseconds per query.
/// * `RemoteSQLiteBackend` invokes `sqlite3 -readonly -json` over an
/// SSH session (ControlMaster keeps the channel warm), parses the
/// JSON response into `Row`s. ~50–100 ms per query.
///
/// The data service picks one based on `ServerContext.isRemote`. View
/// models are oblivious; they keep calling `await dataService.fetch`
/// like before.
///
/// **Why a protocol, not a class hierarchy.** Backends have very
/// different internals (libsqlite3 handles vs. SSH script piping) but
/// the call-site shape is identical. A protocol lets us hand the data
/// service either backend through one stored property without
/// abstract-class ceremony, and keeps the test mock (see
/// `MockHermesQueryBackend` in tests) free of inheritance baggage.
///
/// **Sendable.** Concrete impls are actors, so they're trivially
/// `Sendable`. The protocol conforms to `Sendable` to satisfy Swift 6
/// strict-concurrency for the data-service stored property.
public protocol HermesQueryBackend: Sendable {
/// True iff the connected DB has the v0.7 columns (`reasoning_tokens`,
/// `actual_cost_usd`, `cost_status`, `billing_provider` on
/// `sessions` plus `reasoning` on `messages`). Detected once at
/// `open()` time.
var hasV07Schema: Bool { get async }
/// True iff the connected DB has the v0.11 columns
/// (`api_call_count` on `sessions`, `reasoning_content` on
/// `messages`). Belt-and-braces: BOTH must be present (a
/// partially-migrated DB stays on the v0.7 path to avoid "no such
/// column" failures).
var hasV011Schema: Bool { get async }
/// User-presentable error from the most recent `open()` (or the
/// most recent failed query for the remote backend's
/// connectivity-loss codepath). `nil` means everything is healthy.
var lastOpenError: String? { get async }
/// One-time setup. Local: `sqlite3_open_v2` + `PRAGMA table_info`
/// schema detection. Remote: one SSH round-trip running
/// `sqlite3 --version` plus the two PRAGMA queries.
///
/// Returns `false` on any failure; detail is in `lastOpenError`.
/// Calling `open()` on an already-open backend is a no-op that
/// returns `true`.
func open() async -> Bool
/// Local backend: `close()` then `open()` re-pulls
/// the SQLite handle so a Hermes-side migration becomes visible.
/// Remote backend: a no-op when `forceFresh: false` (every query
/// is already fresh, so there's nothing to refresh). `forceFresh:
/// true` re-runs the schema preflight, covering the rare "user
/// upgraded Hermes on the remote, my schema flags are stale" case.
@discardableResult
func refresh(forceFresh: Bool) async -> Bool
/// Drop any persistent resources. Idempotent.
func close() async
/// Run a single SQL statement and collect every row before
/// returning. SQL uses `?` placeholders; `params` is bound
/// positionally (one entry per `?`).
///
/// Local backend: `sqlite3_prepare_v2` + `sqlite3_bind_*` +
/// `sqlite3_step` loop, materialising each row into a `Row`.
/// Remote backend: inlines params via `SQLValueInliner` to produce
/// a final SQL string, runs `sqlite3 -readonly -json` over SSH,
/// parses the resulting JSON array.
///
/// Throws `BackendError` on any failure. The data-service façade
/// generally catches and returns empty results to preserve the
/// existing "show empty UI on error" behaviour.
func query(_ sql: String, params: [SQLValue]) async throws -> [Row]
/// Run several statements in one round-trip, returning each
/// statement's row set in order. Lets multi-query view loads
/// (Dashboard's 4-query pattern, Insights' 5-query pattern)
/// amortise the SSH/sqlite3 cold-start cost.
///
/// Each `(sql, params)` pair has the same shape as in `query`:
/// `?` placeholders bound positionally per pair.
func queryBatch(_ statements: [(sql: String, params: [SQLValue])]) async throws -> [[Row]]
}
/// Errors that backends raise. Mapped into user-facing messages by the
/// `humanize` helper that lives alongside `HermesDataService`.
public enum BackendError: Error, Sendable, Equatable {
/// Backend is not open; caller should `open()` first.
case notOpen
/// Connectivity failure (SSH down, ControlMaster dead, transport
/// can't reach the host). Carries a short human-readable reason.
/// Triggers the data-service's `lastOpenError` populate path.
case transport(String)
/// sqlite3 itself reported an error: non-zero exit, parse failure,
/// schema mismatch. `exitCode` is the sqlite3 process exit (or
/// libsqlite3 result code on the local backend); `stderr` is the
/// sqlite3-emitted message (already user-readable in most cases).
case sqlite(exitCode: Int32, stderr: String)
/// JSON-parsing failed on remote-backend output. Indicates either a
/// sqlite3 binary that didn't honour `-json`, or output corruption
/// (rare). Carries the first 200 bytes of stdout for diagnostics.
case parseFailure(stdoutHead: String)
}
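// Illustrative only (not part of the diff): a minimal canned-result test
// double conforming to the protocol, in the spirit of (but not identical
// to) the in-tree MockHermesQueryBackend mentioned above.
actor CannedQueryBackend: HermesQueryBackend {
var hasV07Schema: Bool { true }
var hasV011Schema: Bool { true }
var lastOpenError: String? { nil }
private let cannedRows: [Row]
init(cannedRows: [Row] = []) { self.cannedRows = cannedRows }
func open() async -> Bool { true }
@discardableResult
func refresh(forceFresh: Bool) async -> Bool { true }
func close() async {}
func query(_ sql: String, params: [SQLValue]) async throws -> [Row] { cannedRows }
func queryBatch(_ statements: [(sql: String, params: [SQLValue])]) async throws -> [[Row]] {
statements.map { _ in cannedRows }
}
}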
@@ -0,0 +1,254 @@
// MARK: - Platform gate
//
// libsqlite3 is a system module on macOS/iOS but not on swift-corelibs
// foundation. Gate the entire backend so ScarfCore still compiles for
// any future Linux target. Apple platforms (the runtime targets) get
// the full implementation.
#if canImport(SQLite3)
import Foundation
import SQLite3
#if canImport(os)
import os
#endif
/// `HermesQueryBackend` that opens a local SQLite file via libsqlite3
/// and runs queries in-process. Microseconds per query.
///
/// Used for `ServerContext.local` (the user's own `~/.hermes/state.db`);
/// this is the previous behaviour of `HermesDataService`, lifted out unchanged.
/// For `.ssh` contexts the data service constructs `RemoteSQLiteBackend`
/// instead.
///
/// Actor isolation matches the parent `HermesDataService` actor: queries
/// serialise on this backend's executor, and the data service hops once
/// (`await backend.query`) per public method call.
public actor LocalSQLiteBackend: HermesQueryBackend {
#if canImport(os)
private static let logger = Logger(subsystem: "com.scarf", category: "LocalSQLiteBackend")
#endif
private var db: OpaquePointer?
private var openedAtPath: String?
private(set) public var hasV07Schema = false
private(set) public var hasV011Schema = false
private(set) public var lastOpenError: String?
private let context: ServerContext
public init(context: ServerContext) {
self.context = context
}
// MARK: - Lifecycle
public func open() async -> Bool {
if db != nil { return true }
let path = context.paths.stateDB
guard FileManager.default.fileExists(atPath: path) else {
lastOpenError = "Hermes state database not found at \(path)."
return false
}
let flags: Int32 = SQLITE_OPEN_READONLY | SQLITE_OPEN_NOMUTEX
let rc = sqlite3_open_v2(path, &db, flags, nil)
guard rc == SQLITE_OK else {
let msg: String
if let db {
msg = String(cString: sqlite3_errmsg(db))
} else {
msg = "sqlite3_open_v2 returned \(rc)"
}
lastOpenError = "Couldn't open state.db: \(msg)"
#if canImport(os)
Self.logger.warning("sqlite3_open_v2 failed (\(rc)) at \(path, privacy: .public): \(msg, privacy: .public)")
#endif
db = nil
return false
}
openedAtPath = path
lastOpenError = nil
detectSchema()
return true
}
@discardableResult
public func refresh(forceFresh: Bool) async -> Bool {
// Local: always close-and-reopen; the file may have been swapped
// by Hermes (rare) or we want to pick up a schema migration.
// `forceFresh` is irrelevant locally; included for protocol
// parity with the remote backend.
await close()
return await open()
}
public func close() async {
if let db {
sqlite3_close(db)
}
db = nil
openedAtPath = nil
}
// MARK: - Schema detection
private func detectSchema() {
guard let db else { return }
// sessions schema
var stmt: OpaquePointer?
if sqlite3_prepare_v2(db, "PRAGMA table_info(sessions)", -1, &stmt, nil) == SQLITE_OK {
defer { sqlite3_finalize(stmt) }
while sqlite3_step(stmt) == SQLITE_ROW {
if let name = sqlite3_column_text(stmt, 1) {
let column = String(cString: name)
if column == "reasoning_tokens" {
hasV07Schema = true
}
if column == "api_call_count" {
hasV011Schema = true
}
}
}
}
// messages schema: confirm `reasoning_content` is present too.
// Belt-and-braces: a partially-migrated DB (sessions migrated,
// messages not) shouldn't blow up reads with "no such column".
if hasV011Schema {
var msgStmt: OpaquePointer?
var sawReasoningContent = false
if sqlite3_prepare_v2(db, "PRAGMA table_info(messages)", -1, &msgStmt, nil) == SQLITE_OK {
defer { sqlite3_finalize(msgStmt) }
while sqlite3_step(msgStmt) == SQLITE_ROW {
if let name = sqlite3_column_text(msgStmt, 1),
String(cString: name) == "reasoning_content" {
sawReasoningContent = true
break
}
}
}
if !sawReasoningContent {
hasV011Schema = false
}
}
}
// MARK: - Queries
public func query(_ sql: String, params: [SQLValue]) async throws -> [Row] {
guard let db else { throw BackendError.notOpen }
return try executeOne(db: db, sql: sql, params: params)
}
public func queryBatch(_ statements: [(sql: String, params: [SQLValue])]) async throws -> [[Row]] {
guard let db else { throw BackendError.notOpen }
// Local backend has no SSH/process round-trip cost, so running
// sequentially against the open handle is exactly equivalent
// to running each via `query`. The protocol method exists for
// remote-backend amortisation; locally we just satisfy the
// signature.
var out: [[Row]] = []
out.reserveCapacity(statements.count)
for (sql, params) in statements {
out.append(try executeOne(db: db, sql: sql, params: params))
}
return out
}
// MARK: - Internals
private func executeOne(db: OpaquePointer, sql: String, params: [SQLValue]) throws -> [Row] {
var stmt: OpaquePointer?
let prepRC = sqlite3_prepare_v2(db, sql, -1, &stmt, nil)
guard prepRC == SQLITE_OK, let stmt else {
let msg = String(cString: sqlite3_errmsg(db))
throw BackendError.sqlite(exitCode: prepRC, stderr: msg)
}
defer { sqlite3_finalize(stmt) }
for (i, value) in params.enumerated() {
let col = Int32(i + 1)
let rc: Int32
switch value {
case .null:
rc = sqlite3_bind_null(stmt, col)
case .integer(let n):
rc = sqlite3_bind_int64(stmt, col, n)
case .real(let d):
rc = sqlite3_bind_double(stmt, col, d)
case .text(let s):
rc = sqlite3_bind_text(stmt, col, s, -1, sqliteTransient)
case .blob(let d):
rc = d.withUnsafeBytes { buf -> Int32 in
guard let base = buf.baseAddress else {
return sqlite3_bind_zeroblob(stmt, col, 0)
}
return sqlite3_bind_blob(stmt, col, base, Int32(buf.count), sqliteTransient)
}
}
if rc != SQLITE_OK {
let msg = String(cString: sqlite3_errmsg(db))
throw BackendError.sqlite(exitCode: rc, stderr: msg)
}
}
// Build the column-name index map once per result set, up front
// (sqlite3_column_name needs the prepared stmt; cheap either way).
// For a 0-row result set we still build it so callers that read
// column names from the first hypothetical row don't error,
// though `Row.columnIndex` on an empty `[Row]` is moot.
let columnCount = Int(sqlite3_column_count(stmt))
var columnIndex: [String: Int] = [:]
columnIndex.reserveCapacity(columnCount)
for i in 0..<columnCount {
if let cstr = sqlite3_column_name(stmt, Int32(i)) {
columnIndex[String(cString: cstr)] = i
}
}
var rows: [Row] = []
while true {
let stepRC = sqlite3_step(stmt)
if stepRC == SQLITE_DONE { break }
if stepRC != SQLITE_ROW {
let msg = String(cString: sqlite3_errmsg(db))
throw BackendError.sqlite(exitCode: stepRC, stderr: msg)
}
var values: [SQLValue] = []
values.reserveCapacity(columnCount)
for i in 0..<columnCount {
let col = Int32(i)
let type = sqlite3_column_type(stmt, col)
switch type {
case SQLITE_NULL:
values.append(.null)
case SQLITE_INTEGER:
values.append(.integer(sqlite3_column_int64(stmt, col)))
case SQLITE_FLOAT:
values.append(.real(sqlite3_column_double(stmt, col)))
case SQLITE_TEXT:
if let cstr = sqlite3_column_text(stmt, col) {
values.append(.text(String(cString: cstr)))
} else {
values.append(.text(""))
}
case SQLITE_BLOB:
let n = Int(sqlite3_column_bytes(stmt, col))
if n > 0, let p = sqlite3_column_blob(stmt, col) {
values.append(.blob(Data(bytes: p, count: n)))
} else {
values.append(.blob(Data()))
}
default:
values.append(.null)
}
}
rows.append(Row(values: values, columnIndex: columnIndex))
}
return rows
}
}
#endif // canImport(SQLite3)
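// Illustrative only (not part of the diff): the call-site shape the data
// service uses with either backend. `context` is an assumed ServerContext;
// the SQL and column layout are hypothetical.
#if canImport(SQLite3)
func latestSessionIds(context: ServerContext) async throws -> [String] {
let backend = LocalSQLiteBackend(context: context)
guard await backend.open() else { return [] }
let rows = try await backend.query(
"SELECT id FROM sessions ORDER BY created_at DESC LIMIT ?",
params: [.integer(20)]
)
return rows.map { $0.string(at: 0) }
}
#endif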
@@ -0,0 +1,651 @@
#if canImport(SQLite3)
import Foundation
#if canImport(os)
import os
#endif
/// `HermesQueryBackend` that runs `sqlite3 -readonly -json` over an
/// SSH session per query. Replaces the old snapshot-then-open pipeline
/// (issue #74): no full-DB transfers, no local cache, every query
/// against the live remote DB.
///
/// **Why one round-trip per query is OK.** ControlMaster keeps the SSH
/// session warm: the first connect spins up the master socket; subsequent
/// queries reuse it at ~5 ms overhead. sqlite3 cold-start is ~30–50 ms,
/// query execution is sub-millisecond for indexed queries, JSON
/// serialisation is small. End-to-end ~50–100 ms per query, dominated
/// by sqlite3 process spawn. Multi-query view loads (Dashboard,
/// Insights) batch via `queryBatch`: one cold-start, all statements
/// in a single sqlite3 invocation, ~80–100 ms total.
///
/// **Result format**. `sqlite3 -json` emits one JSON array per
/// statement that returns rows: `[{"col":val,...}, ...]`. Multi-statement
/// scripts emit each array on its own. We separate batched queries
/// with a `SELECT '__SCARF_RS_BEGIN__N' AS marker;` synthesised line so
/// the parser can split on the markers; sqlite3's marker rows
/// preserve order and let us pair each result set with the originating
/// statement index.
public actor RemoteSQLiteBackend: HermesQueryBackend {
#if canImport(os)
private static let logger = Logger(subsystem: "com.scarf", category: "RemoteSQLiteBackend")
#endif
private let context: ServerContext
private let transport: any ServerTransport
private(set) public var hasV07Schema = false
private(set) public var hasV011Schema = false
private(set) public var lastOpenError: String?
private var isOpen = false
/// Captured `sqlite3 --version` line from the most recent preflight.
/// Stashed for diagnostic logs and a future "remote sqlite3 too old"
/// error path.
private var sqliteVersion: String?
/// Resolved absolute remote `$HOME`, populated on `open()` via
/// `context.resolvedUserHome()` so that `~/` paths can be expanded
/// in Swift up front rather than relying on shell expansion across
/// the streamScript pipeline. The base64 + pipe path through
/// Citadel does not reliably propagate `$HOME` into the inner
/// `/bin/sh` on every host; keeping this client-side avoids the
/// issue (and matches how `RemoteBackupService.expandTilde` already
/// handles the same problem). `nil` only when the probe failed,
/// in which case `quoteForRemoteShell` falls back to `"$HOME/..."`
/// shell expansion.
private var resolvedHome: String?
/// In-flight query coalescing keyed on the inlined SQL text,
/// value is the Task currently fetching that exact result set.
/// When two concurrent callers ask for the same query (common
/// pattern: file watcher tick + chat-finalize debounce both
/// firing `loadRecentSessions` within ~100 ms), the second
/// caller awaits the first call's task instead of spawning a
/// fresh SSH subprocess. Cleared on task completion. Drops
/// duplicate `mac.loadRecentSessions` traces observed at
/// t=960450 / t=960584 in the perf capture (two parallel 3-s
/// loads for the same data, finishing 134 ms apart).
///
/// Coalescing is *only* applied to single `query` calls, not
/// `queryBatch`; batches are larger payloads with caller-
/// specific timeout scaling, and concurrent callers wanting
/// "the same batch" is rare in practice. Keep coalescing
/// surgical so we don't accidentally serialize independent
/// work that just happens to match.
private var inFlightQueries: [String: Task<[Row], Error>] = [:]
/// Per-query timeout for `query`. Healthy local queries are
/// <100 ms; remote ones over 420 ms-RTT SSH amortize one round
/// trip per call PLUS the wire payload time. A `fetchMessages`
/// over a 157-message session (~50KB JSON encoded) exceeded
/// the previous 15 s ceiling, silently returned 0 rows, and the
/// chat appeared empty a worse failure than the wait it was
/// guarding against. Bumped to 30 s; the `streamScript`
/// transport-level timeout still fires on truly wedged hosts.
private let queryTimeout: TimeInterval = 30
/// Preflight timeout. First SSH round-trip may include cold
/// ControlMaster establishment (~1–3 s) plus the schema PRAGMA
/// queries; 30 s is generous.
private let preflightTimeout: TimeInterval = 30
/// Marker prefix used to split `queryBatch` result sets. Picked to
/// be very unlikely to collide with a real session_id, role string,
/// or content fragment.
private static let batchMarkerPrefix = "__SCARF_RS_BEGIN__"
public init(context: ServerContext, transport: any ServerTransport) {
self.context = context
self.transport = transport
}
// MARK: - Lifecycle
public func open() async -> Bool {
if isOpen { return true }
// Resolve remote $HOME once (cached process-wide via
// ServerContext.UserHomeCache so concurrent backends share
// the probe result). Lets us hand sqlite3 absolute paths and
// skip the unreliable nested-shell expansion altogether. A
// probe failure leaves `resolvedHome == nil` and falls back
// to "$HOME/..."-quoted args; the data-service open() will
// surface whatever sqlite3 errors out with.
let probedHome = await context.resolvedUserHome()
if probedHome != "~" && !probedHome.isEmpty {
resolvedHome = probedHome
}
let dbPath = context.paths.stateDB
// One SSH round-trip running:
// 1. sqlite3 --version (sanity + capture for diagnostics)
// 2. PRAGMA table_info(sessions) → sessions schema
// 3. PRAGMA table_info(messages) → messages schema
// sqlite3 -json emits two arrays back-to-back for the two PRAGMA
// statements; we parse them as separate result sets.
let preflight = """
set -e
sqlite3 --version
sqlite3 -readonly -json \(quoteForRemoteShell(dbPath)) "PRAGMA table_info(sessions); PRAGMA table_info(messages);"
"""
do {
let result = try await transport.streamScript(preflight, timeout: preflightTimeout)
if result.exitCode != 0 {
lastOpenError = errorMessage(stderr: result.stderrString, stdout: result.stdoutString, exitCode: result.exitCode)
#if canImport(os)
Self.logger.warning("Remote preflight failed (exit \(result.exitCode)): \(self.lastOpenError ?? "", privacy: .public)")
#endif
return false
}
try parsePreflightOutput(result.stdoutString)
lastOpenError = nil
isOpen = true
#if canImport(os)
Self.logger.info("Remote SQLite backend ready: sqlite3=\(self.sqliteVersion ?? "?", privacy: .public), v0.7=\(self.hasV07Schema), v0.11=\(self.hasV011Schema)")
#endif
return true
} catch {
lastOpenError = error.localizedDescription
#if canImport(os)
Self.logger.warning("Remote preflight transport error: \(error.localizedDescription, privacy: .public)")
#endif
return false
}
}
@discardableResult
public func refresh(forceFresh: Bool) async -> Bool {
// Streaming queries are always fresh. The watcher tick still
// fires `dataService.refresh()` on every observed file change;
// locally that re-opens the SQLite handle, here it's a
// no-op. `forceFresh: true` is the escape hatch for when the
// user explicitly wants a re-preflight (e.g. they upgraded
// Hermes on the remote). Drop the open state and re-run.
if forceFresh {
isOpen = false
return await open()
}
return isOpen ? true : await open()
}
public func close() async {
isOpen = false
}
// MARK: - Queries
public func query(_ sql: String, params: [SQLValue]) async throws -> [Row] {
guard isOpen else { throw BackendError.notOpen }
let inlined = SQLValueInliner.inline(sql, params: params)
// In-flight coalescing: if a query with the exact same
// inlined SQL is already pending, await its task instead
// of spawning a new SSH subprocess. Surfaces in ScarfMon as
// a `sqlite.query.coalesced` event so we can see how often
// the dedup actually fires in the wild.
if let existing = inFlightQueries[inlined] {
ScarfMon.event(.sqlite, "query.coalesced", count: 1)
return try await withTaskCancellationHandler(
operation: { try await existing.value },
onCancel: { existing.cancel() }
)
}
let task = Task<[Row], Error> { [self] in
try await ScarfMon.measureAsync(.sqlite, "query") {
let dbPath = context.paths.stateDB
let script = """
sqlite3 -readonly -json \(quoteForRemoteShell(dbPath)) <<'__SCARF_SQL__'
\(inlined)
__SCARF_SQL__
"""
let result: ProcessResult
do {
result = try await transport.streamScript(script, timeout: queryTimeout)
} catch {
throw BackendError.transport(error.localizedDescription)
}
if result.exitCode != 0 {
throw BackendError.sqlite(exitCode: result.exitCode, stderr: result.stderrString)
}
let rows = try parseSingleResultSet(result.stdoutString)
ScarfMon.event(.sqlite, "query.rows", count: rows.count, bytes: result.stdout.count)
return rows
}
}
inFlightQueries[inlined] = task
defer { inFlightQueries[inlined] = nil }
// v2.8: propagate parent task cancellation INTO the
// unstructured `task`. `Task<...>{ ... }` doesn't inherit
// cancellation from the awaiting context, so without this a
// cancelled chat-hydration / dashboard-refresh would keep
// the ssh subprocess alive for the full 30 s queryTimeout,
// pinning a remote sqlite query and a ControlMaster
// session slot. With the bridge, the inner task's awaits
// see a cancelled parent and `SSHScriptRunner.run`'s own
// cancellation handler (v2.8) kills the ssh process inside
// the next 100ms poll.
return try await withTaskCancellationHandler(
operation: { try await task.value },
onCancel: { task.cancel() }
)
}
public func queryBatch(_ statements: [(sql: String, params: [SQLValue])]) async throws -> [[Row]] {
try await ScarfMon.measureAsync(.sqlite, "queryBatch") {
try await _queryBatchImpl(statements)
}
}
private func _queryBatchImpl(_ statements: [(sql: String, params: [SQLValue])]) async throws -> [[Row]] {
guard isOpen else { throw BackendError.notOpen }
if statements.isEmpty { return [] }
// Build one sqlite3 invocation with marker SELECTs separating
// each statement's result set. `SELECT '__SCARF_RS_BEGIN__N'`
// emits a one-row JSON array we use as a sentinel.
var sqlBlocks: [String] = []
for (i, stmt) in statements.enumerated() {
let inlined = SQLValueInliner.inline(stmt.sql, params: stmt.params)
// Marker first (so we know which result-set follows even
// if a query returns zero rows; sqlite3 -json prints
// nothing for empty result sets, which would otherwise
// make the parser drift).
sqlBlocks.append("SELECT '\(Self.batchMarkerPrefix)\(i)' AS marker;")
sqlBlocks.append(ensureTrailingSemicolon(inlined))
}
let combined = sqlBlocks.joined(separator: "\n")
let dbPath = context.paths.stateDB
let script = """
sqlite3 -readonly -json \(quoteForRemoteShell(dbPath)) <<'__SCARF_SQL__'
\(combined)
__SCARF_SQL__
"""
let result: ProcessResult
do {
// Batched timeout: nominally scales with statement count but
// is capped at the same 30 s ceiling as `queryTimeout`, so the
// cap always applies in practice. Most batches are 4–5 statements.
let timeout = min(30, queryTimeout + Double(statements.count) * 2)
result = try await transport.streamScript(script, timeout: timeout)
} catch {
throw BackendError.transport(error.localizedDescription)
}
if result.exitCode != 0 {
throw BackendError.sqlite(exitCode: result.exitCode, stderr: result.stderrString)
}
return try parseBatchResultSets(result.stdoutString, expectedCount: statements.count)
}
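// Illustrative only: for a two-statement batch the generated heredoc
// body looks like
//
//   SELECT '__SCARF_RS_BEGIN__0' AS marker;
//   SELECT count(*) AS n FROM sessions;
//   SELECT '__SCARF_RS_BEGIN__1' AS marker;
//   SELECT id FROM sessions LIMIT 2;
//
// and sqlite3 -json answers with one top-level array per row-returning
// statement, e.g.
//
//   [{"marker":"__SCARF_RS_BEGIN__0"}]
//   [{"n":42}]
//   [{"marker":"__SCARF_RS_BEGIN__1"}]
//   [{"id":"a1"},{"id":"b2"}]
//
// which parseBatchResultSets splits back into [[Row]] by marker index.
// The statements and values above are made up.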
// MARK: - Preflight parsing
private func parsePreflightOutput(_ stdout: String) throws {
// Expected output:
// <sqlite3 version line>
// [<sessions PRAGMA result>]
// [<messages PRAGMA result>]
let lines = stdout.split(separator: "\n", omittingEmptySubsequences: false)
guard let firstLine = lines.first, !firstLine.isEmpty else {
throw BackendError.parseFailure(stdoutHead: String(stdout.prefix(200)))
}
sqliteVersion = String(firstLine).trimmingCharacters(in: .whitespacesAndNewlines)
// The remaining lines should contain two JSON arrays. sqlite3
// -json emits each on its own line, though it can wrap long arrays
// across multiple lines. Walk the stream with the depth-tracking
// splitter, looking for two top-level arrays.
let rest = lines.dropFirst().joined(separator: "\n")
let arrays = splitTopLevelJSONArrays(rest)
guard arrays.count >= 2 else {
throw BackendError.parseFailure(stdoutHead: String(stdout.prefix(200)))
}
let sessionsTable = try parseTableInfo(arrays[0])
let messagesTable = try parseTableInfo(arrays[1])
// v0.7: sessions has `reasoning_tokens`.
hasV07Schema = sessionsTable.contains("reasoning_tokens")
// v0.11: BOTH sessions has `api_call_count` AND messages has
// `reasoning_content`. Belt-and-braces against partial migrations.
let sessionsHasV011 = sessionsTable.contains("api_call_count")
let messagesHasV011 = messagesTable.contains("reasoning_content")
hasV011Schema = sessionsHasV011 && messagesHasV011
}
/// Extract column names from a `PRAGMA table_info(...)` result set.
private func parseTableInfo(_ json: String) throws -> Set<String> {
guard let data = json.data(using: .utf8),
let arr = try? JSONSerialization.jsonObject(with: data) as? [[String: Any]] else {
throw BackendError.parseFailure(stdoutHead: String(json.prefix(200)))
}
var names: Set<String> = []
for row in arr {
if let name = row["name"] as? String {
names.insert(name)
}
}
return names
}
// MARK: - Result-set parsing
private func parseSingleResultSet(_ stdout: String) throws -> [Row] {
// sqlite3 -json prints nothing for empty result sets, so an
// empty stdout is valid and means "0 rows".
let trimmed = stdout.trimmingCharacters(in: .whitespacesAndNewlines)
if trimmed.isEmpty { return [] }
return try rowsFromJSONArray(trimmed)
}
private func parseBatchResultSets(_ stdout: String, expectedCount: Int) throws -> [[Row]] {
// Scan the output as a sequence of JSON arrays. Each marker
// SELECT emits a one-row array `[{"marker":"__SCARF_RS_BEGIN__N"}]`;
// the following array (if present) is statement N's result set.
let arrays = splitTopLevelJSONArrays(stdout)
var result: [[Row]] = Array(repeating: [], count: expectedCount)
var i = 0
while i < arrays.count {
let chunk = arrays[i]
// Try to read this chunk as a marker. A marker row is one
// object with exactly the `marker` field. Anything else
// is a real result set (which we attribute to the most
// recent marker we saw).
if let idx = markerIndex(in: chunk) {
// Next array (if any) is this statement's result set.
// If the next array is ALSO a marker, the current
// statement returned zero rows.
let next = i + 1
if next < arrays.count, markerIndex(in: arrays[next]) == nil {
result[idx] = try rowsFromJSONArray(arrays[next])
i = next + 1
} else {
// Empty result set for this statement.
i = next
}
} else {
// Stray array (no preceding marker). Skip it; shouldn't
// happen in practice given how we build the script.
i += 1
}
}
return result
}
/// If the array's single row is a marker `{"marker":"__SCARF_RS_BEGIN__N"}`,
/// return N. Otherwise nil.
private func markerIndex(in json: String) -> Int? {
guard let data = json.data(using: .utf8),
let arr = try? JSONSerialization.jsonObject(with: data) as? [[String: Any]],
arr.count == 1,
let marker = arr[0]["marker"] as? String,
marker.hasPrefix(Self.batchMarkerPrefix) else { return nil }
let suffix = marker.dropFirst(Self.batchMarkerPrefix.count)
return Int(suffix)
}
private func rowsFromJSONArray(_ json: String) throws -> [Row] {
guard let data = json.data(using: .utf8),
let arr = try? JSONSerialization.jsonObject(with: data) as? [[String: Any]] else {
throw BackendError.parseFailure(stdoutHead: String(json.prefix(200)))
}
if arr.isEmpty { return [] }
// `[String: Any]` does NOT preserve insertion order on macOS
// (NSDictionary backing). To keep the SELECT column order
// intact (the data-service row parsers depend on it:
// `row.string(at: 0)` for `id`, etc.), we extract the key
// order from the FIRST object's raw JSON bytes. Subsequent
// rows reuse that key list to look up values by name from
// their parsed dictionaries.
let firstObjectRaw = extractFirstJSONObject(from: json)
let orderedKeys = firstObjectRaw.flatMap(extractKeysInOrder) ?? Array(arr[0].keys)
var columnIndex: [String: Int] = [:]
columnIndex.reserveCapacity(orderedKeys.count)
for (i, k) in orderedKeys.enumerated() { columnIndex[k] = i }
var rows: [Row] = []
rows.reserveCapacity(arr.count)
for obj in arr {
var values: [SQLValue] = []
values.reserveCapacity(orderedKeys.count)
for key in orderedKeys {
values.append(decode(obj[key]))
}
rows.append(Row(values: values, columnIndex: columnIndex))
}
return rows
}
/// Extract the substring of the first `{...}` object in a JSON
/// array string. Used so we can scan its keys in original order
/// before NSJSONSerialization's hash-table conversion strips the
/// ordering. Tolerates nested objects/arrays via depth tracking.
private func extractFirstJSONObject(from json: String) -> String? {
guard let openIdx = json.firstIndex(of: "{") else { return nil }
var depth = 0
var inString = false
var escape = false
var i = openIdx
while i < json.endIndex {
let c = json[i]
if inString {
if escape { escape = false }
else if c == "\\" { escape = true }
else if c == "\"" { inString = false }
i = json.index(after: i)
continue
}
switch c {
case "\"":
inString = true
case "{":
depth += 1
case "}":
depth -= 1
if depth == 0 {
let end = json.index(after: i)
return String(json[openIdx..<end])
}
default:
break
}
i = json.index(after: i)
}
return nil
}
/// Walk an object literal `{"k1": v1, "k2": v2, ...}` and return
/// the keys in their literal order. Doesn't decode the values;
/// that's what NSJSONSerialization handles. Just extracts
/// `["k1", "k2", ...]` so we know the column ordering.
private func extractKeysInOrder(_ objectJSON: String) -> [String] {
var keys: [String] = []
var i = objectJSON.startIndex
// Skip past the leading `{`.
while i < objectJSON.endIndex, objectJSON[i] != "{" {
i = objectJSON.index(after: i)
}
if i < objectJSON.endIndex { i = objectJSON.index(after: i) }
var depth = 0
var inString = false
var escape = false
var keyStart: String.Index?
// We're at the start of object body. Looking for `"key":` patterns
// at depth 0. Toggle `expectingKey` after each `:`/`,`.
var expectingKey = true
while i < objectJSON.endIndex {
let c = objectJSON[i]
if inString {
if escape {
escape = false
} else if c == "\\" {
escape = true
} else if c == "\"" {
inString = false
if expectingKey && depth == 0, let start = keyStart {
keys.append(String(objectJSON[start..<i]))
expectingKey = false
keyStart = nil
}
}
i = objectJSON.index(after: i)
continue
}
switch c {
case "\"":
inString = true
if expectingKey && depth == 0 {
keyStart = objectJSON.index(after: i)
}
case "{", "[":
depth += 1
case "}", "]":
if depth == 0 { return keys } // end of outer object
depth -= 1
case ",":
if depth == 0 { expectingKey = true }
case ":":
if depth == 0 { expectingKey = false }
default:
break
}
i = objectJSON.index(after: i)
}
return keys
}
private func decode(_ v: Any?) -> SQLValue {
guard let v else { return .null }
if v is NSNull { return .null }
if let n = v as? NSNumber {
// NSJSONSerialization decodes both ints and doubles into
// NSNumber. Distinguish: if it round-trips through Int64
// unchanged, treat as integer; else real.
// An integral Double like 1.0 still has
// .doubleValue == 1.0 and Int64(1.0) == 1, so the round-
// trip check correctly bins integral doubles as integer
// (which sqlite3 -json does too: `1` in JSON, not `1.0`).
let asInt64 = n.int64Value
if Double(asInt64) == n.doubleValue {
return .integer(asInt64)
}
return .real(n.doubleValue)
}
if let s = v as? String {
return .text(s)
}
// Fall-through: stringify whatever it is so we don't lose data
// silently. SQLite -json doesn't emit booleans or nested
// objects from PRAGMA / SELECT outputs in our usage.
return .text(String(describing: v))
}
// MARK: - JSON helpers
/// Walk a string of one or more concatenated JSON arrays at the top
/// level (sqlite3 -json's batched output) and return each array as
/// a separate substring. Tolerates whitespace/newlines between
/// arrays.
private func splitTopLevelJSONArrays(_ s: String) -> [String] {
var out: [String] = []
var depth = 0
var inString = false
var escape = false
var start: String.Index?
var i = s.startIndex
while i < s.endIndex {
let c = s[i]
if inString {
if escape {
escape = false
} else if c == "\\" {
escape = true
} else if c == "\"" {
inString = false
}
i = s.index(after: i)
continue
}
switch c {
case "\"":
inString = true
case "[":
if depth == 0 { start = i }
depth += 1
case "]":
depth -= 1
if depth == 0, let begin = start {
let end = s.index(after: i)
out.append(String(s[begin..<end]))
start = nil
}
default:
break
}
i = s.index(after: i)
}
return out
}
private func ensureTrailingSemicolon(_ sql: String) -> String {
let trimmed = sql.trimmingCharacters(in: .whitespacesAndNewlines)
if trimmed.hasSuffix(";") { return trimmed }
return trimmed + ";"
}
// MARK: - Quoting + error mapping
/// Build the shell argument that the remote `sh -c` will see for
/// the SQLite path. Three cases, in priority order:
///
/// 1. **`~`-prefixed AND we have a `resolvedHome`**: the common
/// case. Pre-expand to an absolute path in Swift, then single-
/// quote. Sqlite3 receives a literal absolute path; no shell
/// expansion needed.
/// 2. **`~`-prefixed AND no `resolvedHome`** (probe failed):
/// fall back to `"$HOME/..."` and hope the remote shell expands
/// it. Works on Mac SSHTransport (login shell with $HOME set);
/// less reliable through Citadel's exec-channel + base64 +
/// inner-`/bin/sh` pipeline on iOS, which is precisely why
/// we prefer the resolved-home path above.
/// 3. **Absolute** (`/home/agent/.hermes/state.db`): single-quote
/// with the standard sh escape for any embedded single-quote.
///
/// sqlite3 doesn't expand `~` itself (that's a shell affordance),
/// so a default-config remote with `paths.stateDB ==
/// "~/.hermes/state.db"` would produce `unable to open database
/// "~/.hermes/state.db"` without one of these rewrites issue
/// reported on iOS Citadel against `127.0.0.1`.
private func quoteForRemoteShell(_ path: String) -> String {
if let home = resolvedHome {
let expanded: String
if path == "~" {
expanded = home
} else if path.hasPrefix("~/") {
expanded = home + "/" + String(path.dropFirst(2))
} else {
expanded = path
}
return "'" + expanded.replacingOccurrences(of: "'", with: "'\\''") + "'"
}
// Probe-failed fallback: rely on remote-shell `$HOME` expansion.
if path == "~" {
return "\"$HOME\""
}
if path.hasPrefix("~/") {
let rest = String(path.dropFirst(2))
let escaped = rest
.replacingOccurrences(of: "\\", with: "\\\\")
.replacingOccurrences(of: "\"", with: "\\\"")
.replacingOccurrences(of: "$", with: "\\$")
.replacingOccurrences(of: "`", with: "\\`")
return "\"$HOME/\(escaped)\""
}
return "'" + path.replacingOccurrences(of: "'", with: "'\\''") + "'"
}
/// Translate a non-zero sqlite3 exit into a user-presentable
/// message. Mirrors substrings that `HermesDataService.humanize`
/// keys off so the existing dashboard banner renders correctly.
private func errorMessage(stderr: String, stdout: String, exitCode: Int32) -> String {
let combined = (stderr.isEmpty ? stdout : stderr).trimmingCharacters(in: .whitespacesAndNewlines)
if combined.isEmpty {
return "sqlite3 exited \(exitCode) with no output"
}
return combined
}
}
#endif // canImport(SQLite3)
@@ -0,0 +1,136 @@
import Foundation
/// Typed SQLite column value. Mirrors SQLite's storage classes
/// (`SQLITE_NULL`, `SQLITE_INTEGER`, `SQLITE_FLOAT`, `SQLITE_TEXT`,
/// `SQLITE_BLOB`) so both backends libsqlite3 (`LocalSQLiteBackend`)
/// and remote `sqlite3 -json` parsing (`RemoteSQLiteBackend`) can
/// produce and consume the same `Row` shape.
///
/// Used in two places:
///
/// 1. **Bound parameters**: callers hand `[SQLValue]` to
/// `HermesQueryBackend.query(_:params:)`. The local backend feeds
/// them into `sqlite3_bind_*`; the remote backend inlines them as
/// SQLite literals via `SQLValueInliner.inline(_:params:)`.
/// 2. **Result columns**: each `Row.values` entry is one of these.
/// Parsers (`sessionFromRow`, `messageFromRow` in HermesDataService)
/// read positional accessors like `row.string(at: 3)` to get the
/// typed value.
public enum SQLValue: Sendable, Equatable {
case null
case integer(Int64)
case real(Double)
case text(String)
case blob(Data)
}
/// One result row from a query. Indexable both by position (matching the
/// libsqlite3 `sqlite3_column_*` ergonomics that `HermesDataService`'s
/// existing parsers expect) and by name (more readable for new code).
///
/// `columnIndex` is built once per result-set, not per row, so the
/// per-row overhead is just the `[SQLValue]` allocation.
public struct Row: Sendable {
/// Ordered column values, indexable by their position in the
/// underlying SELECT.
public let values: [SQLValue]
/// Column-name position map. Built once per result-set by the
/// backend, then shared (by reference) across every row in the
/// set. Lookups are case-sensitive; match SQLite's default column casing.
public let columnIndex: [String: Int]
public init(values: [SQLValue], columnIndex: [String: Int]) {
self.values = values
self.columnIndex = columnIndex
}
public subscript(_ position: Int) -> SQLValue {
guard position >= 0, position < values.count else { return .null }
return values[position]
}
public subscript(_ name: String) -> SQLValue {
guard let i = columnIndex[name] else { return .null }
return values[i]
}
// MARK: - Typed positional accessors
//
// These mirror the `columnText(stmt, i)` / `columnDate(stmt, i)`
// helpers that lived in HermesDataService so the row-parser
// migrations from `OpaquePointer` to `Row` are line-for-line.
public func string(at i: Int) -> String {
if case .text(let s) = self[i] { return s }
return ""
}
public func optionalString(at i: Int) -> String? {
switch self[i] {
case .text(let s): return s
case .null: return nil
default: return nil
}
}
public func int(at i: Int) -> Int {
switch self[i] {
case .integer(let n): return Int(n)
case .real(let d): return Int(d)
case .text(let s): return Int(s) ?? 0
default: return 0
}
}
public func optionalInt(at i: Int) -> Int? {
switch self[i] {
case .integer(let n): return Int(n)
case .real(let d): return Int(d)
case .text(let s): return Int(s)
case .null: return nil
default: return nil
}
}
public func int64(at i: Int) -> Int64 {
switch self[i] {
case .integer(let n): return n
case .real(let d): return Int64(d)
case .text(let s): return Int64(s) ?? 0
default: return 0
}
}
public func double(at i: Int) -> Double {
switch self[i] {
case .real(let d): return d
case .integer(let n): return Double(n)
case .text(let s): return Double(s) ?? 0
default: return 0
}
}
public func optionalDouble(at i: Int) -> Double? {
switch self[i] {
case .real(let d): return d
case .integer(let n): return Double(n)
case .text(let s): return Double(s)
case .null: return nil
default: return nil
}
}
/// Interpret the column as a Unix-epoch timestamp (seconds, fractional
/// allowed). Returns `nil` when the column is NULL or unparseable.
/// Mirrors the existing `columnDate` helper exactly.
public func date(at i: Int) -> Date? {
guard let secs = optionalDouble(at: i) else { return nil }
return Date(timeIntervalSince1970: secs)
}
public func isNull(at i: Int) -> Bool {
if case .null = self[i] { return true }
return false
}
}
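// Illustrative only (not part of the diff): the two ways a parser can read
// the same row. The column layout here is made up.
let exampleRow = Row(
values: [.text("sess-1"), .integer(42), .real(1714000000.5)],
columnIndex: ["id": 0, "message_count": 1, "updated_at": 2]
)
let sessionId = exampleRow.string(at: 0)        // "sess-1"
let messageCount = exampleRow["message_count"]  // .integer(42); unknown names return .null
let updatedAt = exampleRow.date(at: 2)          // Date(timeIntervalSince1970: 1714000000.5)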
@@ -0,0 +1,107 @@
import Foundation
/// Replaces `?` placeholders in a SQL string with SQLite-escaped
/// literal values, in order. Used by `RemoteSQLiteBackend` because
/// the `sqlite3` CLI doesn't accept `?`-bound parameters on the
/// command line; it would need stdin `.parameter set @name` dot-
/// commands, which require a multi-line script for every query and
/// add round-trip overhead with no upside for our use case.
///
/// **Trust model.** This is a literal-encoder for in-tree, trusted
/// callers: every current param source is either an integer (`limit`,
/// `before`, `since.timeIntervalSince1970`), a Hermes-internal ID
/// (UUID-shaped session/tool IDs that come back from the same DB), or
/// a search query that already passes through `sanitizeFTSQuery` in
/// HermesDataService. It is **NOT** a general SQL-injection defense.
/// Don't extend the data-service surface with methods that accept raw
/// untrusted user input as a `.text` param without first validating
/// upstream. The local backend skips inlining entirely (uses
/// `sqlite3_bind_*`) so this only affects the remote path.
///
/// Escape rules mirror SQLite's literal syntax:
/// * `.null` → `NULL`
/// * `.integer(n)` → `<n>` (no quoting)
/// * `.real(d)` → `%.17g`-formatted (round-trips Double via decimal)
/// * `.text(s)` → `'<s with single-quotes doubled>'`
/// * `.blob(d)` → `X'<hex>'`
public enum SQLValueInliner {
/// Walk `sql`, replacing each `?` (outside SQL string literals) with
/// the corresponding `params` entry's encoded form. Traps via
/// `fatalError` if the placeholder count doesn't match `params.count`;
/// that's a programmer error, not a runtime condition.
///
/// `?` inside string literals (e.g. `WHERE name = '?'`) is preserved
/// unchanged. We track quote state with a tiny scanner so existing
/// SQL with literal `?` chars in strings doesn't get mis-bound.
public static func inline(_ sql: String, params: [SQLValue]) -> String {
var out = ""
out.reserveCapacity(sql.count + params.count * 16)
var paramIndex = 0
var inSingleQuote = false
var inDoubleQuote = false
var i = sql.startIndex
while i < sql.endIndex {
let c = sql[i]
if c == "'" && !inDoubleQuote {
// Check for SQL's `''` escape (a doubled single-quote
// INSIDE a string literal stays inside; we don't toggle
// out). The next char being another `'` keeps us in.
let next = sql.index(after: i)
if inSingleQuote && next < sql.endIndex && sql[next] == "'" {
out.append("'")
out.append("'")
i = sql.index(after: next)
continue
}
inSingleQuote.toggle()
out.append(c)
i = sql.index(after: i)
continue
}
if c == "\"" && !inSingleQuote {
inDoubleQuote.toggle()
out.append(c)
i = sql.index(after: i)
continue
}
if c == "?" && !inSingleQuote && !inDoubleQuote {
// Bind placeholder.
if paramIndex >= params.count {
fatalError("SQLValueInliner: more `?` placeholders in SQL than provided params (\(params.count)). SQL: \(sql)")
}
out.append(encode(params[paramIndex]))
paramIndex += 1
i = sql.index(after: i)
continue
}
out.append(c)
i = sql.index(after: i)
}
if paramIndex != params.count {
fatalError("SQLValueInliner: \(params.count) params provided but only \(paramIndex) `?` placeholders consumed. SQL: \(sql)")
}
return out
}
/// Encode a single value as a SQLite literal. Public so callers
/// that build SQL strings by hand (rare; prefer `inline`) can
/// reuse the same escape rules.
public static func encode(_ value: SQLValue) -> String {
switch value {
case .null:
return "NULL"
case .integer(let n):
return String(n)
case .real(let d):
// %.17g round-trips a Double precisely as a decimal.
return String(format: "%.17g", d)
case .text(let s):
return "'" + s.replacingOccurrences(of: "'", with: "''") + "'"
case .blob(let d):
// SQLite blob literal: X'<hex>' (case-insensitive prefix).
let hex = d.map { String(format: "%02x", $0) }.joined()
return "X'\(hex)'"
}
}
}
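// Illustrative only (not part of the diff): what the remote backend ends up
// sending for a typical parametrised query. Table and column names are
// made up.
let exampleSQL = SQLValueInliner.inline(
"SELECT id FROM messages WHERE session_id = ? AND timestamp > ? LIMIT ?",
params: [.text("sess-1"), .real(1714000000.5), .integer(25)]
)
// → SELECT id FROM messages WHERE session_id = 'sess-1' AND timestamp > 1714000000.5 LIMIT 25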
File diff suppressed because it is too large
@@ -51,7 +51,19 @@ public enum HermesProfileResolver {
/// Returns the default `~/.hermes` when no profile is active OR when
/// the configured profile is invalid (logged) so the worst-case
/// failure mode is "Scarf shows what it always showed before."
///
/// **Test override.** Setting `SCARF_HERMES_HOME` in the environment
/// pins this resolver to the supplied absolute path and bypasses both
/// the cache and the `active_profile` lookup. Used by the E2E test
/// harness (`TemplateE2ETests`, `TemplateInstallUITests`) to drive
/// Scarf against an isolated tmpdir Hermes home so the user's real
/// `~/.hermes` is never touched. Read on every call (cheap; a single
/// `ProcessInfo` lookup) so tests can flip it across test methods
/// without stale-cache surprises.
public static func resolveLocalHome() -> String {
if let override = scarfHermesHomeOverride() {
return override
}
return refreshIfNeeded().home
}
@@ -60,9 +72,55 @@ public enum HermesProfileResolver {
/// reading from (issue #50 follow-up: prevents the next variant
/// of "where's my data wrong profile" by making it visible).
public static func activeProfileName() -> String {
if scarfHermesHomeOverride() != nil {
return "test-override"
}
return refreshIfNeeded().name
}
/// Sentinel filename that the override path MUST contain for the
/// override to be honored. Without it, production code refuses to
/// pivot off the user's real `~/.hermes` even if the env var is
/// set. This is the "even if a test leaks the env var, even if
/// some non-test process inherits it, the user's data is safe"
/// belt-and-braces guard. Tests create this marker before
/// `setenv("SCARF_HERMES_HOME", ...)`.
public static let testHomeMarkerFilename = ".scarf-test-home-marker"
/// Read `SCARF_HERMES_HOME` from the environment. Returns `nil` when
/// unset or empty so production callers fall through to the profile
/// resolver. The override must:
/// 1. Be an absolute path; relative paths are rejected (they'd
/// land relative to the cwd of whatever process happened to
/// invoke the resolver, which is not what tests want).
/// 2. Contain the sentinel marker file
/// `<path>/<testHomeMarkerFilename>`. Without the marker we
/// treat the env var as untrusted and ignore it. This protects
/// the user's real `~/.hermes/` from any code path that
/// accidentally exports `SCARF_HERMES_HOME` to the wrong value
/// (e.g. a test crashed mid-teardown, an env var inherited
/// from a parent shell, a misconfigured launchctl plist).
/// Both checks are cheap: `FileManager.fileExists` against a
/// known path is microseconds. The override is hot but not
/// hot-hot, so an extra stat per call is negligible.
private static func scarfHermesHomeOverride() -> String? {
guard let raw = ProcessInfo.processInfo.environment["SCARF_HERMES_HOME"] else {
return nil
}
let trimmed = raw.trimmingCharacters(in: .whitespacesAndNewlines)
guard !trimmed.isEmpty else { return nil }
guard trimmed.hasPrefix("/") else {
logger.warning("SCARF_HERMES_HOME=\(trimmed, privacy: .public) is not absolute; ignoring.")
return nil
}
let markerPath = trimmed + "/" + testHomeMarkerFilename
guard FileManager.default.fileExists(atPath: markerPath) else {
logger.warning("SCARF_HERMES_HOME=\(trimmed, privacy: .public) lacks sentinel marker (\(testHomeMarkerFilename, privacy: .public)); ignoring to protect real ~/.hermes.")
return nil
}
return trimmed
}
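A minimal sketch of how a test harness can arm this override, using only the `testHomeMarkerFilename` and `invalidateCache()` APIs above; the helper name and tmpdir layout are illustrative, not part of the module.
```
import Foundation

// Illustrative test-side setup, not shipped code: create an isolated Hermes
// home, drop the sentinel marker, then point SCARF_HERMES_HOME at it.
func makeIsolatedHermesHome() throws -> String {
    let dir = NSTemporaryDirectory() + "scarf-test-home-" + UUID().uuidString
    try FileManager.default.createDirectory(atPath: dir, withIntermediateDirectories: true)
    // Without the marker the resolver ignores the env var entirely.
    FileManager.default.createFile(
        atPath: dir + "/" + HermesProfileResolver.testHomeMarkerFilename,
        contents: nil
    )
    setenv("SCARF_HERMES_HOME", dir, 1)
    // Not strictly required (the override bypasses the cache), but keeps teardown symmetric.
    HermesProfileResolver.invalidateCache()
    return dir
}
```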
/// Force a re-read on the next call, regardless of TTL. Test helper.
public static func invalidateCache() {
lock.withLock { $0.resolvedAt = .distantPast }
@@ -95,15 +153,20 @@ public enum HermesProfileResolver {
let defaultHome = defaultRootHome()
let activeFile = defaultHome + "/active_profile"
// Absent file → default profile. This is the common case for users
// who haven't run `hermes profile use ...` and shouldn't generate
// any log noise.
// Absent file → default profile. Common case for users who
// haven't run `hermes profile use ...`. We still log at
// `.info` (key=value, not warning) so support requests can
// pull `log show | grep ProfileResolver` and confirm the
// resolver IS running and IS resolving to the default,
// distinguishing "feature didn't fire" from "feature fired
// and chose default" (issue #70).
guard FileManager.default.fileExists(atPath: activeFile) else {
logger.info("Resolved active Hermes profile: name=default, home=\(defaultHome, privacy: .public), source=default-no-file")
return ("default", defaultHome)
}
guard let raw = try? String(contentsOfFile: activeFile, encoding: .utf8) else {
logger.warning("Found active_profile but could not read it; falling back to default profile.")
logger.warning("Found active_profile but could not read it; falling back to default. home=\(defaultHome, privacy: .public)")
return ("default", defaultHome)
}
@@ -111,6 +174,7 @@ public enum HermesProfileResolver {
// Empty file or explicit "default" → default profile.
if trimmed.isEmpty || trimmed == "default" {
logger.info("Resolved active Hermes profile: name=default, home=\(defaultHome, privacy: .public), source=file-default")
return ("default", defaultHome)
}
@@ -129,7 +193,7 @@ public enum HermesProfileResolver {
return ("default", defaultHome)
}
logger.info("Resolved active Hermes profile to \(trimmed, privacy: .public) at \(profileHome, privacy: .public).")
logger.info("Resolved active Hermes profile: name=\(trimmed, privacy: .public), home=\(profileHome, privacy: .public), source=file")
return (trimmed, profileHome)
}
@@ -65,13 +65,15 @@ public struct ImageEncoder: Sendable {
sourceFilename: String? = nil
) throws -> ChatImageAttachment {
guard !rawBytes.isEmpty else { throw EncoderError.empty }
ScarfMon.event(.render, "imageEncoder.input.bytes", count: 1, bytes: rawBytes.count)
return try ScarfMon.measure(.render, "imageEncoder.downsample") {
#if canImport(AppKit)
guard let nsImage = NSImage(data: rawBytes) else { throw EncoderError.decodeFailed }
let targetSize = Self.fittedSize(for: nsImage.size, maxLongEdge: Self.maxLongEdge)
let mainData = try Self.jpegBytes(from: nsImage, size: targetSize)
let thumbSize = Self.fittedSize(for: nsImage.size, maxLongEdge: Self.thumbnailLongEdge)
let thumbData = try? Self.jpegBytes(from: nsImage, size: thumbSize)
ScarfMon.event(.render, "imageEncoder.bytes", count: 1, bytes: mainData.count)
return ChatImageAttachment(
mimeType: "image/jpeg",
base64Data: mainData.base64EncodedString(),
@@ -86,6 +88,7 @@ public struct ImageEncoder: Sendable {
let mainData = try Self.jpegBytes(from: uiImage, size: targetSize)
let thumbSize = Self.fittedSize(for: uiImage.size, maxLongEdge: Self.thumbnailLongEdge)
let thumbData = try? Self.jpegBytes(from: uiImage, size: thumbSize)
ScarfMon.event(.render, "imageEncoder.bytes", count: 1, bytes: mainData.count)
return ChatImageAttachment(
mimeType: "image/jpeg",
base64Data: mainData.base64EncodedString(),
@@ -99,6 +102,7 @@ public struct ImageEncoder: Sendable {
// input already looks like a JPEG, else refuse. Keeps the
// package compiling without a hard AppKit/UIKit dep.
if rawBytes.starts(with: [0xFF, 0xD8]) {
ScarfMon.event(.render, "imageEncoder.bytes", count: 1, bytes: rawBytes.count)
return ChatImageAttachment(
mimeType: "image/jpeg",
base64Data: rawBytes.base64EncodedString(),
@@ -109,6 +113,7 @@ public struct ImageEncoder: Sendable {
}
throw EncoderError.unsupportedFormat
#endif
}
}
nonisolated private static func fittedSize(for source: CGSize, maxLongEdge: CGFloat) -> CGSize {
@@ -178,7 +178,11 @@ public struct ModelCatalogService: Sendable {
/// can keep using the sync method.
public nonisolated func loadProvidersAsync() async -> [HermesProviderInfo] {
await Task.detached { [self] in
self.loadProviders()
let providers = ScarfMon.measure(.diskIO, "modelCatalog.loadProviders") {
self.loadProviders()
}
ScarfMon.event(.diskIO, "modelCatalog.providers.count", count: providers.count)
return providers
}.value
}
@@ -218,7 +222,11 @@ public struct ModelCatalogService: Sendable {
/// Issue #59.
public nonisolated func loadModelsAsync(for providerID: String) async -> [HermesModelInfo] {
await Task.detached { [self] in
self.loadModels(for: providerID)
let models = ScarfMon.measure(.diskIO, "modelCatalog.loadModels") {
self.loadModels(for: providerID)
}
ScarfMon.event(.diskIO, "modelCatalog.models.count", count: models.count)
return models
}.value
}
@@ -335,47 +343,49 @@ public struct ModelCatalogService: Sendable {
/// Nous's catalog has no such model and Hermes later failed with
/// HTTP 404 at runtime. Catch that at save time, not 6 hours later.
public func validateModel(_ modelID: String, for providerID: String) -> ModelValidation {
let trimmed = modelID.trimmingCharacters(in: .whitespacesAndNewlines)
guard !trimmed.isEmpty else {
return .invalid(providerName: providerID, suggestions: [])
}
ScarfMon.measure(.diskIO, "modelCatalog.validateModel") {
let trimmed = modelID.trimmingCharacters(in: .whitespacesAndNewlines)
guard !trimmed.isEmpty else {
return .invalid(providerName: providerID, suggestions: [])
}
// Overlay-only providers (Nous Portal, OpenAI Codex, Qwen
// OAuth, …) serve their own catalogs that aren't mirrored to
// models.dev, so we don't have a reliable way to check model
// IDs locally. Treat any non-empty value as provisionally
// valid; the worst case is the runtime 404 we hit in pass-1,
// but the UI has the error banner now (M7 #2) to surface that
// cleanly.
//
// Exception: if an overlay-only provider DOES appear in the
// models.dev cache (unlikely but possible as catalogs evolve),
// we fall through to the real check below.
let models = loadModels(for: providerID)
if models.isEmpty {
if Self.overlayOnlyProviders[providerID] != nil {
// Overlay-only providers (Nous Portal, OpenAI Codex, Qwen
// OAuth, …) serve their own catalogs that aren't mirrored to
// models.dev, so we don't have a reliable way to check model
// IDs locally. Treat any non-empty value as provisionally
// valid; the worst case is the runtime 404 we hit in pass-1,
// but the UI has the error banner now (M7 #2) to surface that
// cleanly.
//
// Exception: if an overlay-only provider DOES appear in the
// models.dev cache (unlikely but possible as catalogs evolve),
// we fall through to the real check below.
let models = loadModels(for: providerID)
if models.isEmpty {
if Self.overlayOnlyProviders[providerID] != nil {
return .valid
}
return .unknownProvider(providerID: providerID)
}
if models.contains(where: { $0.modelID == trimmed }) {
return .valid
}
return .unknownProvider(providerID: providerID)
}
if models.contains(where: { $0.modelID == trimmed }) {
return .valid
// No exact match: offer the closest names (by prefix) as
// suggestions. Up to 5, ordered by release date (newest
// first already the sort order of loadModels).
let lowerTrimmed = trimmed.lowercased()
let byPrefix = models
.filter { $0.modelID.lowercased().hasPrefix(String(lowerTrimmed.prefix(3))) }
.prefix(5)
.map(\.modelID)
let suggestions = byPrefix.isEmpty
? Array(models.prefix(5).map(\.modelID))
: Array(byPrefix)
let providerName = providerByID(providerID)?.providerName ?? providerID
return .invalid(providerName: providerName, suggestions: suggestions)
}
// No exact match: offer the closest names (by prefix) as
// suggestions. Up to 5, ordered by release date (newest
// first already the sort order of loadModels).
let lowerTrimmed = trimmed.lowercased()
let byPrefix = models
.filter { $0.modelID.lowercased().hasPrefix(String(lowerTrimmed.prefix(3))) }
.prefix(5)
.map(\.modelID)
let suggestions = byPrefix.isEmpty
? Array(models.prefix(5).map(\.modelID))
: Array(byPrefix)
let providerName = providerByID(providerID)?.providerName ?? providerID
return .invalid(providerName: providerName, suggestions: suggestions)
}
// MARK: - Decoding
@@ -50,7 +50,48 @@ public enum ModelPreflight: Sendable {
}
private static func isUnset(_ value: String) -> Bool {
let trimmed = value.trimmingCharacters(in: .whitespaces).lowercased()
let trimmed = value.trimmingCharacters(in: .whitespacesAndNewlines).lowercased()
return trimmed.isEmpty || trimmed == "unknown"
}
/// Result of a `model.default` / `model.provider` mismatch check.
/// Captures the case where `model.default` carries a `<provider>/...`
/// prefix that doesn't match the standalone `model.provider` key.
/// Observed in 2026-05-05 dogfooding: switching OAuth providers
/// via Credential Pools left the prior provider's model name
/// stranded in `model.default`. Hermes can't reconcile the two and
/// chats die with an opaque `-32603 Internal error` at first prompt.
public struct Mismatch: Sendable, Equatable {
/// The provider prefix found in `model.default` (e.g. `"anthropic"`).
public let prefixProvider: String
/// The standalone `model.provider` value (e.g. `"nous"`).
public let activeProvider: String
/// The full `model.default` string as configured.
public let modelDefault: String
/// The bare model id (with the prefix stripped): what the user
/// would see if Scarf rewrites `model.default` for them.
public let bareModel: String
}
/// Detect a `model.default` / `model.provider` mismatch. Returns
/// `nil` when there's no provider prefix on `model.default`, when
/// either field is unset, or when the prefix matches the provider.
/// Uses case-insensitive comparison; Hermes accepts both
/// `Anthropic/...` and `anthropic/...` casings in the wild.
public static func detectMismatch(_ config: HermesConfig) -> Mismatch? {
let modelDefault = config.model.trimmingCharacters(in: .whitespacesAndNewlines)
let activeProvider = config.provider.trimmingCharacters(in: .whitespacesAndNewlines)
guard !isUnset(modelDefault), !isUnset(activeProvider) else { return nil }
guard let slash = modelDefault.firstIndex(of: "/") else { return nil }
let prefix = String(modelDefault[..<slash])
let bare = String(modelDefault[modelDefault.index(after: slash)...])
guard !prefix.isEmpty, !bare.isEmpty else { return nil }
guard prefix.caseInsensitiveCompare(activeProvider) != .orderedSame else { return nil }
return Mismatch(
prefixProvider: prefix,
activeProvider: activeProvider,
modelDefault: modelDefault,
bareModel: bare
)
}
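A hedged sketch of one way a caller could turn a detected mismatch into user-facing copy; `HermesConfig` and the `Mismatch` fields come from the code above, while the helper name and wording are illustrative.
```
// Illustrative only: surface a detected mismatch as a banner string.
func mismatchWarning(for config: HermesConfig) -> String? {
    guard let m = ModelPreflight.detectMismatch(config) else { return nil }
    // Dogfooding case: model.default = "anthropic/...", model.provider = "nous".
    return """
    model.default ("\(m.modelDefault)") is prefixed with "\(m.prefixProvider)" \
    but the active provider is "\(m.activeProvider)". Use "\(m.bareModel)" as \
    model.default, or switch the provider back to "\(m.prefixProvider)".
    """
}
```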
}
@@ -95,22 +95,68 @@ public struct NousModelCatalogService: Sendable {
/// cache lands on the droplet, not the user's Mac). Missing or
/// malformed cache → nil; the loader treats that as "no cache" and
/// kicks off a fresh fetch.
public func readCache() -> NousModelsCache? {
let transport = context.makeTransport()
guard transport.fileExists(cachePath) else { return nil }
do {
let data = try transport.readFile(cachePath)
let decoder = JSONDecoder()
decoder.dateDecodingStrategy = .iso8601
let cache = try decoder.decode(NousModelsCache.self, from: data)
guard cache.version == NousModelsCache.currentVersion else {
Self.logger.info("nous models cache schema mismatch (got v\(cache.version), expected v\(NousModelsCache.currentVersion)); ignoring")
/// Race readCache against a sleep so a hung remote `cat` doesn't
/// stall the picker for the full transport-level timeout (60 s).
/// On timeout returns nil; the caller treats that as "no usable
/// cache" and falls through to the network fetch.
public func readCacheWithTimeout(seconds: TimeInterval) async -> NousModelsCache? {
await withTaskGroup(of: NousModelsCache?.self) { group in
group.addTask { [self] in
// Detached because readCache is sync + does blocking
// SSH I/O; running on the cooperative pool is fine
// for one task but we don't want to fight executor
// scheduling with the timer task below.
await Task.detached { [self] in
readCache()
}.value
}
group.addTask {
try? await Task.sleep(nanoseconds: UInt64(seconds * 1_000_000_000))
ScarfMon.event(.diskIO, "nous.readCache.timeoutFired", count: 1)
return nil
}
// First completion wins; cancel the other.
let first = await group.next() ?? nil
group.cancelAll()
return first
}
}
public func readCache() -> NousModelsCache? {
ScarfMon.measure(.diskIO, "nous.readCache") {
let transport = context.makeTransport()
// Split into separate measure points so the next perf
// capture localizes the observed 60-second beach ball:
// was it the fileExists probe, the read itself, or
// the JSON decode? Each on its own ScarfMon row.
let exists = ScarfMon.measure(.diskIO, "nous.readCache.fileExists") {
transport.fileExists(cachePath)
}
guard exists else { return nil }
do {
let data = try ScarfMon.measure(.diskIO, "nous.readCache.readFile") {
try transport.readFile(cachePath)
}
ScarfMon.event(.diskIO, "nous.readCache.bytes", count: 1, bytes: data.count)
return ScarfMon.measure(.diskIO, "nous.readCache.decode") {
let decoder = JSONDecoder()
decoder.dateDecodingStrategy = .iso8601
do {
let cache = try decoder.decode(NousModelsCache.self, from: data)
guard cache.version == NousModelsCache.currentVersion else {
Self.logger.info("nous models cache schema mismatch (got v\(cache.version), expected v\(NousModelsCache.currentVersion)); ignoring")
return Optional<NousModelsCache>.none
}
return cache
} catch {
Self.logger.warning("couldn't decode nous models cache: \(error.localizedDescription, privacy: .public)")
return Optional<NousModelsCache>.none
}
}
} catch {
Self.logger.warning("couldn't read nous models cache: \(error.localizedDescription, privacy: .public)")
return nil
}
return cache
} catch {
Self.logger.warning("couldn't decode nous models cache: \(error.localizedDescription, privacy: .public)")
return nil
}
}
@@ -148,15 +194,22 @@ public struct NousModelCatalogService: Sendable {
// The subscription service already checks for `present`; we
// re-read the raw token here because we need the actual string,
// not just a Bool. Mirrors the SubscriptionService parse path.
let transport = context.makeTransport()
guard transport.fileExists(context.paths.authJSON) else { return nil }
guard let data = try? transport.readFile(context.paths.authJSON) else { return nil }
guard let root = try? JSONSerialization.jsonObject(with: data) as? [String: Any] else { return nil }
let providers = root["providers"] as? [String: Any] ?? [:]
let nous = providers["nous"] as? [String: Any]
let token = nous?["access_token"] as? String
guard let token, !token.isEmpty else { return nil }
return token
// ScarfMon: separate `nous.bearerToken` measure point because
// this is the second auth.json read of the picker's open
// sequence (subscriptionService.loadState() did the first).
// Together with `nous.subscription.loadState`, that's two SSH
// round-trips to the same file; a candidate for caching.
ScarfMon.measure(.diskIO, "nous.bearerToken") {
let transport = context.makeTransport()
guard transport.fileExists(context.paths.authJSON) else { return nil }
guard let data = try? transport.readFile(context.paths.authJSON) else { return nil }
guard let root = try? JSONSerialization.jsonObject(with: data) as? [String: Any] else { return nil }
let providers = root["providers"] as? [String: Any] ?? [:]
let nous = providers["nous"] as? [String: Any]
let token = nous?["access_token"] as? String
guard let token, !token.isEmpty else { return nil }
return token
}
}
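The caching the comment points at isn't implemented in this change, but a minimal sketch could look roughly like this; the actor, TTL, and wiring are all assumptions.
```
import Foundation

// Illustrative sketch of the suggested caching (assumption, not shipped):
// remember the last token for a few seconds so the picker's open sequence
// reads auth.json over SSH once instead of twice.
actor BearerTokenMemo {
    private var cached: (token: String, readAt: Date)?
    private let ttl: TimeInterval

    init(ttl: TimeInterval = 10) { self.ttl = ttl }

    func token(readingWith read: @Sendable () -> String?) -> String? {
        if let cached, Date().timeIntervalSince(cached.readAt) < ttl {
            return cached.token
        }
        guard let fresh = read() else { return nil }
        cached = (fresh, Date())
        return fresh
    }
}
```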
/// Make the API call. Times out after `requestTimeout` so a hung
@@ -164,25 +217,28 @@ public struct NousModelCatalogService: Sendable {
/// `[NousModel]` on success, throws on any HTTP / decode error so
/// the caller can log + fall back.
public func fetchModels() async throws -> [NousModel] {
guard let token = bearerToken() else {
throw NousModelCatalogError.notAuthenticated
}
var request = URLRequest(url: Self.baseURL)
request.httpMethod = "GET"
request.timeoutInterval = Self.requestTimeout
request.setValue("Bearer \(token)", forHTTPHeaderField: "Authorization")
request.setValue("application/json", forHTTPHeaderField: "Accept")
try await ScarfMon.measureAsync(.transport, "nous.fetchModels") {
guard let token = bearerToken() else {
throw NousModelCatalogError.notAuthenticated
}
var request = URLRequest(url: Self.baseURL)
request.httpMethod = "GET"
request.timeoutInterval = Self.requestTimeout
request.setValue("Bearer \(token)", forHTTPHeaderField: "Authorization")
request.setValue("application/json", forHTTPHeaderField: "Accept")
let (data, response) = try await session.data(for: request)
guard let http = response as? HTTPURLResponse else {
throw NousModelCatalogError.transport("non-HTTP response")
let (data, response) = try await session.data(for: request)
guard let http = response as? HTTPURLResponse else {
throw NousModelCatalogError.transport("non-HTTP response")
}
guard (200..<300).contains(http.statusCode) else {
throw NousModelCatalogError.http(status: http.statusCode)
}
struct Envelope: Decodable { let data: [NousModel] }
let envelope = try JSONDecoder().decode(Envelope.self, from: data)
ScarfMon.event(.transport, "nous.fetchModels.bytes", count: envelope.data.count, bytes: data.count)
return envelope.data
}
guard (200..<300).contains(http.statusCode) else {
throw NousModelCatalogError.http(status: http.statusCode)
}
struct Envelope: Decodable { let data: [NousModel] }
let envelope = try JSONDecoder().decode(Envelope.self, from: data)
return envelope.data
}
// MARK: - Public entry
@@ -193,7 +249,17 @@ public struct NousModelCatalogService: Sendable {
/// based on the case so it can show a "could not refresh" hint
/// next to a stale-but-still-useful list.
public func loadModels(forceRefresh: Bool = false) async -> NousModelsLoadResult {
let cached = readCache()
// Cache-read with a short timeout. The underlying SSH `cat`
// can hang on a corrupted or oversized cache file (a
// 120-second picker stall observed in the wild: two 60 s
// timeouts stacked from a duplicated read; perf capture
// localized to `nous.readCache.readFile`). Cache is a
// performance hint, not a correctness requirement; if it
// doesn't return in 5 s, fall through to the network fetch
// and let writeCache rebuild it. The runaway `cat` keeps
// running on its own 60 s transport timeout but no longer
// blocks the picker.
let cached = await readCacheWithTimeout(seconds: 5)
if let cached, !forceRefresh, !isCacheStale(cached) {
return .cache(models: cached.models, fetchedAt: cached.fetchedAt, refreshError: nil)
@@ -15,14 +15,18 @@ public struct ProjectDashboardService: Sendable {
// MARK: - Registry
public func loadRegistry() -> ProjectRegistry {
guard let data = try? transport.readFile(context.paths.projectsRegistry) else {
return ProjectRegistry(projects: [])
}
do {
return try JSONDecoder().decode(ProjectRegistry.self, from: data)
} catch {
Self.logger.error("Failed to decode project registry: \(error.localizedDescription, privacy: .public)")
return ProjectRegistry(projects: [])
// Tracks time spent reading + decoding projects.json from the transport
// (local file or SSH). Helps spot slow remote round-trips.
ScarfMon.measure(.diskIO, "dashboard.loadRegistry") {
guard let data = try? transport.readFile(context.paths.projectsRegistry) else {
return ProjectRegistry(projects: [])
}
do {
return try JSONDecoder().decode(ProjectRegistry.self, from: data)
} catch {
Self.logger.error("Failed to decode project registry: \(error.localizedDescription, privacy: .public)")
return ProjectRegistry(projects: [])
}
}
}
@@ -0,0 +1,251 @@
import Foundation
/// Pure block-splice logic for Scarf's managed regions inside
/// `~/.hermes/.env`. Each registered project that has at least one
/// resolved secret carries one block, bounded by:
///
/// ```
/// # scarf-secrets:begin <slug>
/// SCARF_<UPPER_SLUG>_<UPPER_FIELDKEY>=<value>
/// ...
/// # scarf-secrets:end <slug>
/// ```
///
/// The Mac wraps this in `KeychainEnvMirror` (Keychain-aware, atomic
/// write, mode-0600 enforcement). This file handles only the marker
/// contract + key naming + splice logic that's testable in isolation
/// against an in-memory string and shared across hosts.
///
/// **Why `~/.hermes/.env`.** Hermes's cron scheduler reloads that file
/// fresh on every tick (cron/scheduler.py:897-903), so values become
/// available to the agent's tool-invoked subprocesses (terminal,
/// code_exec) without any Hermes-side change. Per-project `.env` is
/// not loaded at cron time today, hence we mirror into the global
/// file with namespaced keys.
///
/// **Marker contract is load-bearing.** Both markers carry the slug on
/// the same line so a multi-project file is parsed deterministically
/// and one project's edits can't disturb another's block.
public enum SecretsEnvBlock {
/// Stable across releases: entries on disk reference these
/// strings and a marker change would orphan every existing block.
public static let beginMarkerPrefix = "# scarf-secrets:begin "
public static let endMarkerPrefix = "# scarf-secrets:end "
// MARK: - Key naming
/// Build the env-var name for a (slug, fieldKey) pair. Uppercases,
/// replaces every non-alphanumeric character with `_`, prefixes
/// `SCARF_`. Stable: rotating a value writes to the same key.
public static func envKeyName(slug: String, fieldKey: String) -> String {
"SCARF_" + sanitize(slug) + "_" + sanitize(fieldKey)
}
private static func sanitize(_ s: String) -> String {
var out = ""
for scalar in s.unicodeScalars {
let c = Character(scalar)
let isAlpha = ("A"..."Z").contains(c) || ("a"..."z").contains(c)
let isDigit = ("0"..."9").contains(c)
if isAlpha || isDigit {
out.append(Character(scalar.properties.uppercaseMapping))
} else {
out.append("_")
}
}
// Collapse runs of underscores so `foo--bar` doesn't become
// `FOO__BAR` (two underscores trip dotenv parsers more often
// than one). Trim leading/trailing underscores too.
while out.contains("__") {
out = out.replacingOccurrences(of: "__", with: "_")
}
while out.hasPrefix("_") { out.removeFirst() }
while out.hasSuffix("_") { out.removeLast() }
return out.isEmpty ? "UNNAMED" : out
}
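A few expected mappings under the naming and collapsing rules above, written as a hypothetical test; the slugs and field keys are made up.
```
// Hypothetical expectations for the key-naming rules above.
func envKeyNamingExamples() {
    assert(SecretsEnvBlock.envKeyName(slug: "my-project", fieldKey: "api_key") == "SCARF_MY_PROJECT_API_KEY")
    assert(SecretsEnvBlock.envKeyName(slug: "foo--bar", fieldKey: "token") == "SCARF_FOO_BAR_TOKEN")   // "__" run collapsed
    assert(SecretsEnvBlock.envKeyName(slug: "---", fieldKey: "x") == "SCARF_UNNAMED_X")                // all-symbol slug falls back to UNNAMED
}
```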
// MARK: - Block render
/// Render the bounded block for a single project. Empty `entries`
/// produces an empty string; callers should treat that as
/// "remove the project's block" rather than "write an empty
/// block." `entries` are emitted in stable sort order so two
/// runs with the same input produce byte-identical output.
public static func renderBlock(
slug: String,
entries: [(key: String, value: String)]
) -> String {
guard !entries.isEmpty else { return "" }
let sorted = entries.sorted { $0.key < $1.key }
var lines: [String] = []
lines.append(beginMarkerPrefix + slug)
for entry in sorted {
lines.append("\(entry.key)=\(escape(entry.value))")
}
lines.append(endMarkerPrefix + slug)
return lines.joined(separator: "\n")
}
/// Quote values that would confuse python-dotenv: anything with
/// whitespace, `#`, `$`, or quote characters. Single quotes around
/// the value are dotenv-canonical and preserve `$`-style
/// references literally (no shell expansion). Backslash-escape
/// embedded single quotes by closing+reopening: `'foo'\''bar'`.
private static func escape(_ value: String) -> String {
let needsQuoting = value.contains(where: { c in
c.isWhitespace || c == "#" || c == "$" || c == "\"" || c == "'" || c == "\\"
})
if !needsQuoting { return value }
let escaped = value.replacingOccurrences(of: "'", with: "'\\''")
return "'" + escaped + "'"
}
// MARK: - Splice
/// Splice `block` (already-rendered, with markers) into `existing`
/// for the named `slug`. Three cases:
/// 1. `existing` already has a `# scarf-secrets:begin <slug>` /
/// `# scarf-secrets:end <slug>` pair → replace the inclusive
/// region. Other slugs' blocks are preserved byte-identically.
/// 2. `existing` has no block for this slug → append after a
/// blank line at the end of file.
/// 3. `block` is empty → behave like `removeBlock`.
///
/// Idempotent: feeding the output of one call back through
/// `applyBlock` with the same inputs produces the same string.
public static func applyBlock(
_ block: String,
forSlug slug: String,
to existing: String
) -> String {
if block.isEmpty {
return removeBlock(forSlug: slug, from: existing)
}
if let region = blockRange(forSlug: slug, in: existing) {
// Replace the inclusive region. `blockRange` covers the
// begin marker line through the end marker line plus any
// trailing newline so `removeBlock` doesn't leave a
// dangling blank line; but for `applyBlock`, we need to
// re-emit that trailing newline so a round-trip
// (mirror → read → mirror with identical entries) produces
// byte-identical output. Without this, the second mirror
// would write a file shorter by one newline byte and
// bump the file's mtime, breaking the
// no-op-when-unchanged contract that the launch
// reconciler relies on.
let before = String(existing[existing.startIndex..<region.lowerBound])
let after = String(existing[region.upperBound..<existing.endIndex])
// Restore a trailing newline only when the consumed region
// had one (i.e., the block wasn't at end-of-string with
// no terminating newline).
let consumedTrailingNewline = region.upperBound > existing.startIndex
&& existing[existing.index(before: region.upperBound)] == "\n"
let separator = consumedTrailingNewline ? "\n" : ""
return before + block + separator + after
}
// Append at end of file, separated from preceding content by
// a blank line. Empty-or-whitespace files just become the
// block plus a trailing newline.
let trimmed = existing.trimmingCharacters(in: .whitespacesAndNewlines)
if trimmed.isEmpty {
return block + "\n"
}
let normalized = trimmingRightNewlines(existing)
return normalized + "\n\n" + block + "\n"
}
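A hedged round-trip sketch of the splice contract: render one project's block (including a value that needs dotenv quoting) and apply it twice. The slug, keys, and values are made up; the idempotence expectation is the contract documented above.
```
// Illustrative only: made-up slug and values, exercising renderBlock + applyBlock.
let entries = [
    (key: SecretsEnvBlock.envKeyName(slug: "demo", fieldKey: "api_key"), value: "plain-token"),
    (key: SecretsEnvBlock.envKeyName(slug: "demo", fieldKey: "note"), value: "it's $ecret")  // will be single-quoted
]
let block = SecretsEnvBlock.renderBlock(slug: "demo", entries: entries)
let original = "EXISTING_VAR=1\n"
let once  = SecretsEnvBlock.applyBlock(block, forSlug: "demo", to: original)
let twice = SecretsEnvBlock.applyBlock(block, forSlug: "demo", to: once)
assert(once == twice)   // re-applying an unchanged block is a byte-identical no-op
```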
/// Strip the bounded block for `slug` from `existing`. No-op when
/// absent. Preserves all other slugs' blocks and user-authored
/// content byte-identically.
public static func removeBlock(forSlug slug: String, from existing: String) -> String {
guard let region = blockRange(forSlug: slug, in: existing) else {
return existing
}
let before = String(existing[existing.startIndex..<region.lowerBound])
let after = String(existing[region.upperBound..<existing.endIndex])
// Collapse the blank line we may have inserted at append time
// so repeated install/uninstall cycles don't accumulate
// blank lines. Specifically: if `before` ends in `\n\n` and
// `after` starts with `\n`, drop one of the newlines.
var trimmedBefore = before
var trimmedAfter = after
if trimmedBefore.hasSuffix("\n\n") && trimmedAfter.hasPrefix("\n") {
trimmedAfter.removeFirst()
} else if trimmedBefore.hasSuffix("\n\n") {
trimmedBefore.removeLast()
}
return trimmedBefore + trimmedAfter
}
// MARK: - Range scan
/// Locate the inclusive character range covering one project's
/// block, including a trailing newline if present so removal
/// doesn't leave a dangling empty line. Returns nil when the
/// block isn't present.
private static func blockRange(
forSlug slug: String,
in existing: String
) -> Range<String.Index>? {
let beginLine = beginMarkerPrefix + slug
let endLine = endMarkerPrefix + slug
// Match begin marker as a full line to guard against false
// positives where a slug is a prefix of another slug
// (e.g. "foo" vs "foo-bar"). Require the marker to be
// followed immediately by `\n` or end-of-string.
guard let beginRange = lineRange(of: beginLine, in: existing) else {
return nil
}
// Search for the matching end marker AFTER the begin range;
// we can't use a leading-anchor scan because there may be other
// slugs' end markers between begin and the matching end.
let searchStart = beginRange.upperBound
guard let endRange = lineRange(of: endLine, in: existing, startingAt: searchStart) else {
return nil
}
// Include a trailing newline if the file has one immediately
// after the end marker; this keeps the file shape clean across
// remove operations.
var upper = endRange.upperBound
if upper < existing.endIndex, existing[upper] == "\n" {
upper = existing.index(after: upper)
}
return beginRange.lowerBound..<upper
}
/// Find a substring that appears as a complete line bounded by
/// start-of-string or `\n` on the left and `\n` or end-of-string
/// on the right. Returns the range of the substring itself, not
/// including any surrounding newlines.
private static func lineRange(
of needle: String,
in haystack: String,
startingAt start: String.Index? = nil
) -> Range<String.Index>? {
var searchStart = start ?? haystack.startIndex
while searchStart <= haystack.endIndex {
guard let range = haystack.range(of: needle, range: searchStart..<haystack.endIndex) else {
return nil
}
let leftOK = range.lowerBound == haystack.startIndex
|| haystack[haystack.index(before: range.lowerBound)] == "\n"
let rightOK = range.upperBound == haystack.endIndex
|| haystack[range.upperBound] == "\n"
if leftOK && rightOK {
return range
}
// Advance past this false positive and keep searching.
searchStart = range.upperBound
}
return nil
}
private static func trimmingRightNewlines(_ s: String) -> String {
var result = s
while let last = result.last, last.isNewline {
result.removeLast()
}
return result
}
}
@@ -0,0 +1,34 @@
import Foundation
/// Process-wide toggles for test-mode launches.
///
/// Read `CommandLine.arguments` once at first access and cache the result so
/// any code path can ask `TestModeFlags.shared.isTestMode` without paying for
/// a re-scan. The harness sets `--scarf-test-mode` from XCUITest's
/// `XCUIApplication.launchArguments` and pairs it with `SCARF_HERMES_HOME`
/// (read by `HermesProfileResolver`) to drive Scarf against an isolated
/// Hermes home.
///
/// The flags themselves don't do anything on their own; they're hook points
/// for production code paths to gate behavior. v1 lands the wiring; the
/// gating sites (Sparkle update prompt, capability live-probe, first-run
/// walkthrough) are added incrementally as the harness exercises them and
/// surfaces flakes.
public struct TestModeFlags: Sendable {
/// True when the process was launched with `--scarf-test-mode`. Read
/// once from `CommandLine.arguments`; never mutated.
public let isTestMode: Bool
/// Default singleton cached on first access. Production code reads
/// this; tests that need a different shape construct their own value.
public static let shared: TestModeFlags = TestModeFlags(
arguments: CommandLine.arguments
)
/// Constructor exposed for tests so a synthetic argv can be passed
/// without involving the real `CommandLine`. Production callers use
/// `.shared`.
public init(arguments: [String]) {
self.isTestMode = arguments.contains("--scarf-test-mode")
}
}
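One of the gating sites the comment anticipates could look roughly like this; the function name is hypothetical and the Sparkle wiring is an assumption.
```
// Illustrative gating site (hypothetical name, not shipped code):
// skip the Sparkle update prompt when the XCUITest harness drives the app.
func shouldOfferSparkleUpdatePrompt() -> Bool {
    !TestModeFlags.shared.isTestMode
}
```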
@@ -289,18 +289,35 @@ public struct LocalTransport: ServerTransport {
#endif
}
// MARK: - SQLite
// MARK: - Script streaming
public func snapshotSQLite(remotePath: String) throws -> URL {
// Local case: no copy needed. Services open the path directly.
URL(fileURLWithPath: remotePath)
/// Run `script` through `/bin/sh -c` locally. The local data path
/// doesn't actually call this in production (the data service
/// hands `LocalSQLiteBackend` the libsqlite3-direct path); kept
/// for protocol parity and for tooling that wants a uniform
/// "run a script" entry on either context kind.
public func streamScript(_ script: String, timeout: TimeInterval) async throws -> ProcessResult {
#if os(iOS)
throw TransportError.other(message: "LocalTransport.streamScript is unavailable on iOS")
#else
let outcome = await SSHScriptRunner.run(
script: script,
context: ServerContext(id: contextID, displayName: "Local", kind: .local),
timeout: timeout
)
switch outcome {
case .connectFailure(let reason):
throw TransportError.other(message: reason)
case .completed(let stdout, let stderr, let exitCode):
return ProcessResult(
exitCode: exitCode,
stdout: Data(stdout.utf8),
stderr: Data(stderr.utf8)
)
}
#endif
}
/// Local transport reads the live DB directly there's no cached
/// snapshot to fall back to (and no failure mode where falling back
/// would help, since a missing local file is missing both ways).
public var cachedSnapshotPath: URL? { nil }
// MARK: - Watching
#if canImport(Darwin)
@@ -25,6 +25,30 @@ import Foundation
/// callers can treat both uniformly.
public enum SSHScriptRunner {
/// Thread-safe boolean flag used to bridge parent-task cancellation
/// into the detached `Task` body that owns the ssh subprocess.
/// `Task.detached { ... }` does NOT inherit cancellation from the
/// awaiting parent; without this flag, cancelling a chat-load /
/// hydration / activity-fetch Task only throws `CancellationError`
/// at the chat layer while the ssh subprocess keeps running until
/// its 30s timeout fires pinning a remote sqlite query (and a
/// ControlMaster session slot) for the full deadline. v2.8 fix
/// observed in 2026-05-05 dogfooding: rapid chat-switching left a
/// chain of stale 30s ssh subprocesses behind, blocking the
/// dashboard's queryBatch and producing a "spinning" load.
private final class CancelFlag: @unchecked Sendable {
private let lock = NSLock()
private var _cancelled = false
var isCancelled: Bool {
lock.lock(); defer { lock.unlock() }
return _cancelled
}
func cancel() {
lock.lock(); defer { lock.unlock() }
_cancelled = true
}
}
public enum Outcome: Sendable {
/// Couldn't even reach the remote (process spawn failed,
/// timeout before any output, network refused). Carries the
@@ -46,22 +70,38 @@ public enum SSHScriptRunner {
/// cross-platform we return a connect failure on non-macOS so
/// the file compiles everywhere.
public static func run(script: String, context: ServerContext, timeout: TimeInterval = 30) async -> Outcome {
#if os(macOS)
switch context.kind {
case .local:
return await runLocally(script: script, timeout: timeout)
case .ssh(let config):
return await runOverSSH(script: script, config: config, timeout: timeout)
await ScarfMon.measureAsync(.transport, "ssh.run") {
// Bridge parent cancellation into the detached subprocess
// task. Without this, killing a chat-hydration Task on a
// session switch only unwinds Swift state; the ssh
// subprocess keeps holding a remote sqlite query + a
// ControlMaster session for the full 30s timeout. v2.8.
let cancelFlag = CancelFlag()
return await withTaskCancellationHandler(
operation: {
#if os(macOS)
switch context.kind {
case .local:
return await runLocally(script: script, timeout: timeout, cancelFlag: cancelFlag)
case .ssh(let config):
return await runOverSSH(script: script, config: config, timeout: timeout, cancelFlag: cancelFlag)
}
#else
return .connectFailure("SSHScriptRunner is only available on macOS")
#endif
},
onCancel: {
cancelFlag.cancel()
ScarfMon.event(.transport, "ssh.cancelled", count: 1)
}
)
}
#else
return .connectFailure("SSHScriptRunner is only available on macOS")
#endif
}
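A hedged caller-side sketch of what the handler buys: cancelling the awaiting Task now tears down the ssh subprocess instead of leaving it to run out the 30 s timeout. `serverContext` is an assumed, already-built `ServerContext` for the remote host.
```
// Illustrative only; `serverContext` is assumed to exist for the remote host.
let loadTask = Task {
    await SSHScriptRunner.run(
        script: #"sqlite3 -readonly -json "$HOME/.hermes/state.db" "SELECT 1;""#,
        context: serverContext,
        timeout: 30
    )
}
// The user switches chats before the query returns:
loadTask.cancel()   // cancelFlag fires, proc.terminate() runs, result is .connectFailure("Script cancelled")
```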
// MARK: - SSH path
#if os(macOS)
private static func runOverSSH(script: String, config: SSHConfig, timeout: TimeInterval) async -> Outcome {
private static func runOverSSH(script: String, config: SSHConfig, timeout: TimeInterval, cancelFlag: CancelFlag) async -> Outcome {
var sshArgv: [String] = [
"-o", "ControlMaster=auto",
"-o", "ControlPath=\(SSHTransport.controlDirPath())/%C",
@@ -124,10 +164,28 @@ public enum SSHScriptRunner {
let deadline = Date().addingTimeInterval(timeout)
while proc.isRunning && Date() < deadline {
// Honor BOTH the detached-task's own cancellation flag
// (set by the parent's `withTaskCancellationHandler`)
// and the legacy `Task.isCancelled` check in case the
// detached body gets cancelled directly. The flag is
// the load-bearing path; Task.isCancelled is harmless
// belt-and-suspenders.
if cancelFlag.isCancelled || Task.isCancelled {
proc.terminate()
try? stdoutPipe.fileHandleForReading.close()
try? stderrPipe.fileHandleForReading.close()
return .connectFailure("Script cancelled")
}
try? await Task.sleep(nanoseconds: 100_000_000)
}
if proc.isRunning {
proc.terminate()
// Pipe fds leak otherwise; closing on the timeout branch
// matches the success-path discipline (see CLAUDE.md
// "Always close both fileHandleForReading and
// fileHandleForWriting on Pipe objects").
try? stdoutPipe.fileHandleForReading.close()
try? stderrPipe.fileHandleForReading.close()
return .connectFailure("Script timed out after \(Int(timeout))s")
}
let out = (try? stdoutPipe.fileHandleForReading.readToEnd()) ?? Data()
@@ -145,7 +203,7 @@ public enum SSHScriptRunner {
// MARK: - Local path
private static func runLocally(script: String, timeout: TimeInterval) async -> Outcome {
private static func runLocally(script: String, timeout: TimeInterval, cancelFlag: CancelFlag) async -> Outcome {
return await Task.detached { () -> Outcome in
let proc = Process()
proc.executableURL = URL(fileURLWithPath: "/bin/sh")
@@ -162,10 +220,18 @@ public enum SSHScriptRunner {
}
let deadline = Date().addingTimeInterval(timeout)
while proc.isRunning && Date() < deadline {
if cancelFlag.isCancelled || Task.isCancelled {
proc.terminate()
try? stdoutPipe.fileHandleForReading.close()
try? stderrPipe.fileHandleForReading.close()
return .connectFailure("Script cancelled")
}
try? await Task.sleep(nanoseconds: 100_000_000)
}
if proc.isRunning {
proc.terminate()
try? stdoutPipe.fileHandleForReading.close()
try? stderrPipe.fileHandleForReading.close()
return .connectFailure("Script timed out after \(Int(timeout))s")
}
let out = (try? stdoutPipe.fileHandleForReading.readToEnd()) ?? Data()
@@ -620,67 +620,26 @@ public struct SSHTransport: ServerTransport {
return env
}
// MARK: - SQLite snapshot
// MARK: - Script streaming
public func snapshotSQLite(remotePath: String) throws -> URL {
try? FileManager.default.createDirectory(atPath: snapshotDir, withIntermediateDirectories: true)
let localPath = snapshotDir + "/state.db"
// `.backup` is WAL-safe: sqlite takes a consistent snapshot without
// blocking writers. A plain `cp` of a WAL-mode DB could corrupt.
let remoteTmp = "/tmp/scarf-snapshot-\(UUID().uuidString).db"
// sqlite3's `.backup` is a dot-command, not a CLI arg. The whole
// dot-command must be one shell argument (double-quoted) so sqlite3
// receives it as a single command; the backup path inside it is
// single-quoted so sqlite3 parses it correctly. The DB path is a
// separate shell argument and goes through `remotePathArg`
// (double-quoted, $HOME-aware) so `~/.hermes/state.db` actually
// resolves on the remote.
//
// The second sqlite3 invocation flips the snapshot out of WAL mode
// so the scp'd file is self-contained: `.backup` preserves the
// source's journal_mode in the destination header, so without this
// step the client would need the `-wal`/`-shm` sidecars too, and
// every read would fail with "unable to open database file".
//
// Final shell command on the remote:
// sqlite3 "$HOME/.hermes/state.db" ".backup '/tmp/scarf-snapshot-XYZ.db'" \
// && sqlite3 '/tmp/scarf-snapshot-XYZ.db' "PRAGMA journal_mode=DELETE;"
let backupScript = #"sqlite3 \#(Self.remotePathArg(remotePath)) ".backup '\#(remoteTmp)'" && sqlite3 '\#(remoteTmp)' "PRAGMA journal_mode=DELETE;" > /dev/null"#
let backup = try runRemoteShell(backupScript)
if backup.exitCode != 0 {
throw TransportError.classifySSHFailure(host: config.host, exitCode: backup.exitCode, stderr: backup.stderrString)
/// Pipe `script` to `/bin/sh -s` over the ControlMaster-shared SSH
/// channel. Used by `RemoteSQLiteBackend` to invoke `sqlite3 -json`
/// per query without the per-arg quoting that `runProcess` would
/// apply. Delegates to `SSHScriptRunner` which already implements
/// the ssh-stdin-pipe pattern correctly.
public func streamScript(_ script: String, timeout: TimeInterval) async throws -> ProcessResult {
let context = ServerContext(id: contextID, displayName: displayName, kind: .ssh(config))
let outcome = await SSHScriptRunner.run(script: script, context: context, timeout: timeout)
switch outcome {
case .connectFailure(let reason):
throw TransportError.other(message: reason)
case .completed(let stdout, let stderr, let exitCode):
return ProcessResult(
exitCode: exitCode,
stdout: Data(stdout.utf8),
stderr: Data(stderr.utf8)
)
}
// scp the backup down. scp/sftp expands `~` natively (it goes
// through the SSH file-transfer protocol, not a remote shell), so
// remoteTmp's `/tmp/...` absolute path round-trips as-is.
ensureControlDir()
var scpArgs: [String] = [
"-o", "ControlMaster=auto",
"-o", "ControlPath=\(controlDir)/%C",
"-o", "ControlPersist=600",
"-o", "StrictHostKeyChecking=accept-new",
"-o", "LogLevel=QUIET",
"-o", "BatchMode=yes"
]
if let port = config.port { scpArgs += ["-P", String(port)] }
if let id = config.identityFile, !id.isEmpty { scpArgs += ["-i", id] }
scpArgs.append("\(hostSpec):\(remoteTmp)")
scpArgs.append(localPath)
let pull = try runLocal(executable: scpBinary, args: scpArgs, stdin: nil, timeout: 120)
// Regardless of pull outcome, try to clean up the remote tmp.
_ = try? runRemoteShell("rm -f \(Self.remotePathArg(remoteTmp))")
if pull.exitCode != 0 {
throw TransportError.classifySSHFailure(host: config.host, exitCode: pull.exitCode, stderr: pull.stderrString)
}
return URL(fileURLWithPath: localPath)
}
/// Path where the most recent successful snapshot was written;
/// returned even when the remote is currently unreachable. The
/// data service falls back to this when `snapshotSQLite` throws so
/// Dashboard / Sessions / Chat-history stay viewable offline.
public var cachedSnapshotPath: URL? {
URL(fileURLWithPath: snapshotDir + "/state.db")
}
// MARK: - Watching
@@ -765,12 +724,28 @@ public struct SSHTransport: ServerTransport {
try? stdinPipe.fileHandleForWriting.close()
}
if let timeout {
let deadline = Date().addingTimeInterval(timeout)
while proc.isRunning && Date() < deadline {
Thread.sleep(forTimeInterval: 0.1)
}
if proc.isRunning {
// Kernel-wait via DispatchGroup + terminationHandler instead
// of a 100ms Thread.sleep spin loop. The old loop burned a
// cooperative-pool thread for the full timeout duration AND
// had 100ms granularity on the deadline; this version blocks
// once on a semaphore that the OS wakes when the process
// terminates (or when the timeout fires). Net effect: under
// concurrent SSH load (sidebar reload + chat finalize +
// watcher poll all firing together) we don't accumulate
// multiple spin-blocked threads, which was the mechanism
// behind the 7-second `loadRecentSessions` outliers
// observed in remote-context perf captures.
let waitGroup = DispatchGroup()
waitGroup.enter()
proc.terminationHandler = { _ in waitGroup.leave() }
let outcome = waitGroup.wait(timeout: .now() + timeout)
proc.terminationHandler = nil
if outcome == .timedOut {
proc.terminate()
// Brief block until the kill actually lands so we can
// collect partial stdout. terminate() is async; without
// this wait the readToEnd below could race the close.
proc.waitUntilExit()
let partial = (try? stdoutPipe.fileHandleForReading.readToEnd()) ?? Data()
try? stdoutPipe.fileHandleForReading.close()
try? stderrPipe.fileHandleForReading.close()
@@ -96,27 +96,25 @@ public protocol ServerTransport: Sendable {
args: [String]
) -> AsyncThrowingStream<Data, Error>
// MARK: - SQLite
/// Return a local filesystem URL pointing at a fresh, consistent copy of
/// the SQLite database at `remotePath`. For local transports this is
/// just the remote path unchanged. For SSH transports this performs
/// `sqlite3 .backup` on the remote side and scp's the backup into
/// `~/Library/Caches/scarf/<serverID>/state.db`, returning that URL.
nonisolated func snapshotSQLite(remotePath: String) throws -> URL
/// Local filesystem URL where this transport caches its SQLite snapshot,
/// returned even when the remote is unreachable. Callers should
/// `FileManager.default.fileExists(atPath:)` before reading; the
/// transport can't atomically check existence and return the URL
/// in one step without TOCTOU. Local transports return `nil`
/// (their data is the live DB, not a cache).
/// Pipe a multi-line shell script through `/bin/sh -s` on the
/// target and return its captured output. The script travels as a
/// single opaque byte stream: no per-line shell interpolation,
/// no per-arg quoting, so `"$VAR"` references, here-docs, and
/// nested quotes survive untouched.
///
/// Used by `HermesDataService.open()` to fall back to the last
/// successful snapshot when a fresh `snapshotSQLite` call fails,
/// so the app keeps showing data with a "Last updated X ago"
/// affordance instead of a blank screen.
nonisolated var cachedSnapshotPath: URL? { get }
/// Replaces the old `snapshotSQLite` + scp pipeline. Used by
/// `RemoteSQLiteBackend` to invoke `sqlite3 -readonly -json` over
/// SSH per query (or per batch). Local transport runs the script
/// in-process via `/bin/sh -c`. SSH transport delegates to
/// `SSHScriptRunner` (ControlMaster-shared channel). Citadel
/// transport (iOS) base64-encodes the script + decodes remotely
/// to skirt Citadel's missing-stdin support.
///
/// Throws on transport failures (host unreachable, ssh exit 255,
/// timeout). Returns `ProcessResult` with the script's exit code
/// + stdout + stderr on completion; a non-zero exit is NOT a
/// throw; callers inspect `exitCode` and decide.
nonisolated func streamScript(_ script: String, timeout: TimeInterval) async throws -> ProcessResult
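A hedged sketch of a per-query call through this entry point, along the lines the comment describes; the row shape, table name, and query are assumptions, and only the `streamScript` / `ProcessResult` contract comes from the protocol.
```
import Foundation

// Illustrative only: row type, table, and query are assumptions.
struct SessionRow: Decodable { let id: String; let title: String? }

func fetchSessionRows(via transport: any ServerTransport) async throws -> [SessionRow] {
    let script = #"sqlite3 -readonly -json "$HOME/.hermes/state.db" "SELECT id, title FROM sessions LIMIT 50;""#
    let result = try await transport.streamScript(script, timeout: 30)
    guard result.exitCode == 0 else { return [] }      // non-zero exit is not a throw; the caller decides
    guard !result.stdout.isEmpty else { return [] }    // sqlite3 -json prints nothing for zero rows
    return try JSONDecoder().decode([SessionRow].self, from: result.stdout)
}
```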
// MARK: - Watching
@@ -23,6 +23,13 @@ public final class ActivityViewModel {
public var toolResult: String?
public var sessionPreviews: [String: String] = [:]
public var isLoading = true
/// True while the Phase 2 background fill is paging through
/// `hydrateAssistantToolCalls`. Drives a "Loading tool details"
/// pill in the page header so the user knows the placeholder
/// rows on screen will fill in. v2.8.
public var isHydratingToolCalls = false
@ObservationIgnored
private var hydrationTask: Task<Void, Never>?
public var availableSessions: [(id: String, label: String)] {
var seen = Set<String>()
@@ -34,8 +41,29 @@ public final class ActivityViewModel {
}
public var filteredActivity: [ActivityEntry] {
let entries = toolMessages.flatMap { message in
message.toolCalls.map { call in
let entries = toolMessages.flatMap { message -> [ActivityEntry] in
// v2.8 emit a single "Loading tool calls" placeholder
// entry per skeleton message (one whose tool_calls JSON
// hasn't been hydrated yet). The user sees the timeline
// shape immediately; real entries replace the placeholder
// in-place when `hydrateAssistantToolCalls` returns.
// Filtering still works: the session filter applies
// below, and placeholders bypass the kind filter
// (they haven't resolved to a real kind yet).
guard !message.toolCalls.isEmpty else {
return [ActivityEntry(
id: "skeleton-\(message.id)",
sessionId: message.sessionId,
toolName: "Loading tool details…",
kind: .other,
summary: "",
arguments: "",
messageContent: "",
timestamp: message.timestamp,
isPlaceholder: true
)]
}
return message.toolCalls.map { call in
ActivityEntry(
id: call.callId,
sessionId: message.sessionId,
@@ -49,14 +77,34 @@ public final class ActivityViewModel {
}
}
return entries.filter { entry in
let kindOk = filterKind == nil || entry.kind == filterKind
// Placeholders bypass the kind filter so they don't all
// disappear when the user picks a non-`.other` filter
// chip; they still represent rows that may resolve to
// the matching kind once hydrated.
let kindOk = filterKind == nil || entry.isPlaceholder || entry.kind == filterKind
let sessionOk = filterSessionId == nil || entry.sessionId == filterSessionId
return kindOk && sessionOk
}
}
/// Last load's transport-failure reason, if any. Activity surfaces
/// this to the user instead of leaving the empty-state visible
/// (which the user reads as "no activity" rather than "couldn't
/// reach the host"). v2.8.
public var loadError: String?
public func load() async {
// Cancel any in-flight hydration from a prior load (e.g. a
// file-watcher delta firing while the prior pass was still
// paging). The new skeleton replaces the message set, so
// hydrating against the old ids would just splice into rows
// that no longer exist.
hydrationTask?.cancel()
hydrationTask = nil
isHydratingToolCalls = false
isLoading = true
loadError = nil
// refresh() = close + reopen, which forces a fresh snapshot pull on
// remote contexts. Using open() here would short-circuit after the
// first load and show stale data for the view's lifetime. The DB
@@ -64,12 +112,68 @@ public final class ActivityViewModel {
// results without re-opening cleanup() closes on disappear.
let opened = await dataService.refresh()
guard opened else {
loadError = "Couldn't reach \(context.displayName) — check the SSH connection and pull-to-refresh to retry."
isLoading = false
return
}
toolMessages = await dataService.fetchRecentToolCalls(limit: 200)
sessionPreviews = await dataService.fetchSessionPreviews(limit: 200)
// v2.8 Phase L skeleton-then-hydrate. Phase 1 metadata
// fetch is bounded by 50 rows × ~50 bytes (id + session_id +
// role + timestamp; tool_calls JSON is NULLed at the SQL
// level) 3 KB on the wire regardless of how big the
// underlying tool_calls blobs are. Comes back in
// sub-second on healthy remotes; placeholder rows render
// immediately. Phase 2 (paged hydrate) fills the real
// tool details in via 5-id batches in the background.
let outcome = await dataService.fetchRecentToolCallSkeleton(limit: 50)
toolMessages = outcome.messages
if let reason = outcome.transportError {
loadError = "Couldn't load activity from \(context.displayName) — the connection timed out (\(reason)). Pull to refresh to retry."
isLoading = false
return
}
sessionPreviews = await dataService.fetchSessionPreviews(limit: 50)
isLoading = false
// Phase 2 background hydrate. Mirrors the chat path's
// `startToolHydration`. Newest-first (the splice happens in
// batch order), cancellable via `cleanup()` / next `load()`.
startToolCallHydration()
}
/// Phase 2 of the v2.8 Activity loader. Pages through
/// `hydrateAssistantToolCalls` in batches of 5 ids and splices
/// the parsed `[HermesToolCall]` arrays into the existing
/// `toolMessages` skeleton. Once a message has its tool calls,
/// `filteredActivity` swaps the placeholder entry for the real
/// per-call entries on the next observation tick.
private func startToolCallHydration() {
let messageIds = toolMessages
.filter { $0.toolCalls.isEmpty && $0.id > 0 }
.map(\.id)
guard !messageIds.isEmpty else {
isHydratingToolCalls = false
return
}
isHydratingToolCalls = true
let dataService = self.dataService
hydrationTask = Task { @MainActor [weak self] in
defer { self?.isHydratingToolCalls = false }
// Page in 5-id batches, matching the chat path:
// hydrateAssistantToolCalls already does the paging
// internally; here we just hand it all the ids and
// let it return whatever it could pull. Parent task
// cancellation propagates down via the v2.8 SSH
// cancellation handler we wired through SSHScriptRunner.
let map = await dataService.hydrateAssistantToolCalls(messageIds: messageIds)
guard let self else { return }
if Task.isCancelled { return }
if !map.isEmpty {
self.toolMessages = self.toolMessages.map { msg in
guard msg.toolCalls.isEmpty, let calls = map[msg.id] else { return msg }
return msg.withToolCalls(calls)
}
}
}
}
public func selectEntry(_ entry: ActivityEntry?) async {
@@ -82,6 +186,9 @@ public final class ActivityViewModel {
}
public func cleanup() async {
hydrationTask?.cancel()
hydrationTask = nil
isHydratingToolCalls = false
await dataService.close()
}
}
@@ -95,6 +202,13 @@ public struct ActivityEntry: Identifiable, Sendable {
public let arguments: String
public let messageContent: String
public let timestamp: Date?
/// True for skeleton entries emitted while the v2.8 two-phase
/// loader is still hydrating tool_calls JSON for the underlying
/// message. ActivityRow renders these as greyed "Loading" rows
/// so the user sees the timeline shape without the per-call
/// detail. The splice happens in-place when hydration completes:
/// the placeholder vanishes and the real entries take its slot.
public let isPlaceholder: Bool
public init(
id: String,
@@ -104,7 +218,8 @@ public struct ActivityEntry: Identifiable, Sendable {
summary: String,
arguments: String,
messageContent: String,
timestamp: Date?
timestamp: Date?,
isPlaceholder: Bool = false
) {
self.id = id
self.sessionId = sessionId
@@ -114,6 +229,7 @@ public struct ActivityEntry: Identifiable, Sendable {
self.arguments = arguments
self.messageContent = messageContent
self.timestamp = timestamp
self.isPlaceholder = isPlaceholder
}
public var prettyArguments: String {
@@ -37,22 +37,34 @@ public final class CuratorViewModel {
isLoading = true
defer { isLoading = false }
let context = self.context
let parsed = await Task.detached(priority: .userInitiated) { () -> (HermesCuratorStatus, String?) in
let textResult = Self.runCuratorStatus(context: context)
let stateData = context.readData(context.paths.curatorStateFile)
let parsed = HermesCuratorStatusParser.parse(text: textResult, stateFileJSON: stateData)
// Best-effort markdown report: the state file points at the
// most recent <YYYYMMDD-HHMMSS>/ dir; load REPORT.md from
// there. Missing on first run, which is fine.
var report: String?
if let reportDir = parsed.lastReportPath {
let reportPath = reportDir.hasSuffix("/")
? "\(reportDir)REPORT.md"
: "\(reportDir)/REPORT.md"
report = context.readText(reportPath)
}
return (parsed, report)
}.value
// v2.8: instrumented. Curator load fires `hermes curator
// status` (CLI subprocess) plus 1-2 file reads; on remote
// each is a separate SSH RTT. Visibility lets future captures
// show how often the report file is missing or oversized.
let parsed = await ScarfMon.measureAsync(.diskIO, "curator.load") {
await Task.detached(priority: .userInitiated) { () -> (HermesCuratorStatus, String?) in
let textResult = Self.runCuratorStatus(context: context)
let stateData = context.readData(context.paths.curatorStateFile)
let parsed = HermesCuratorStatusParser.parse(text: textResult, stateFileJSON: stateData)
// Best-effort markdown report: the state file points at the
// most recent <YYYYMMDD-HHMMSS>/ dir; load REPORT.md from
// there. Missing on first run, which is fine.
var report: String?
if let reportDir = parsed.lastReportPath {
let reportPath = reportDir.hasSuffix("/")
? "\(reportDir)REPORT.md"
: "\(reportDir)/REPORT.md"
report = context.readText(reportPath)
}
return (parsed, report)
}.value
}
ScarfMon.event(
.diskIO,
"curator.load.bytes",
count: 0,
bytes: parsed.1?.utf8.count ?? 0
)
self.status = parsed.0
self.lastReportMarkdown = parsed.1
}
@@ -117,12 +117,19 @@ public final class InsightsViewModel {
}
let since = period.sinceDate
// The four insights queries (user-message count, tool usage,
// hourly + daily activity histograms) batch through one
// `insightsSnapshot` round-trip. Sessions and session-previews
// stay separate; they're large result sets and stay on their
// own calls. For remote contexts this turns ~5 SSH round-trips
// into 3.
sessions = await dataService.fetchSessionsInPeriod(since: since)
sessionPreviews = await dataService.fetchSessionPreviews(limit: 500)
userMessageCount = await dataService.fetchUserMessageCount(since: since)
let tools = await dataService.fetchToolUsage(since: since)
hourlyActivity = await dataService.fetchSessionStartHours(since: since)
dailyActivity = await dataService.fetchSessionDaysOfWeek(since: since)
let snapshot = await dataService.insightsSnapshot(since: since)
userMessageCount = snapshot.userMessageCount
let tools = snapshot.toolUsage
hourlyActivity = snapshot.startHours
dailyActivity = snapshot.daysOfWeek
await dataService.close()
@@ -164,6 +164,16 @@ public final class ProjectsViewModel {
projects.map(\.dashboardPath)
}
/// Per-project `.scarf/` directories watched alongside `dashboardPaths`
/// so that file-reading widgets (markdown_file, log_tail, image) refresh
/// when their underlying files are added / removed / renamed inside the
/// directory by a cron job. In-place file appends within an existing
/// file are NOT detected here; the cron job should write atomically
/// (write-then-rename) or `touch` dashboard.json after each run.
public var projectScarfDirs: [String] {
projects.map(\.scarfDir)
}
private func loadDashboard(for project: ProjectEntry) {
dashboardError = nil
if !service.dashboardExists(for: project) {
@@ -5,6 +5,7 @@
import Foundation
import Observation
import SwiftUI
public enum ChatDisplayMode: String, CaseIterable {
case terminal
@@ -63,6 +64,23 @@ public final class RichChatViewModel {
public var messages: [HermesMessage] = []
public var currentSession: HermesSession?
public var messageGroups: [MessageGroup] = []
/// True while the v2.8 two-phase loader's background hydration
/// (tool_calls JSON + tool result rows) is in flight. Chat header
/// shows "Loading tool details" so the user knows the bare
/// transcript they're looking at will fill in. Cleared once both
/// hydration passes finish or the session-id changes underneath.
public var isHydratingTools: Bool = false
@ObservationIgnored
private var hydrationTask: Task<Void, Never>?
/// UserDefaults key controlling whether the chat resume path
/// auto-fetches the CONTENT of tool result rows (`role='tool'`) for
/// past messages. Defaults to false: a single tool result blob
/// (file dump, stack trace) can be hundreds of KB; bulk-fetching
/// all of them during chat resume on a slow remote can blow past
/// the 30s SSH timeout. The Mac Settings → Display tab exposes
/// the toggle (mirror string in `ChatDensityKeys`).
public static let loadHistoricalToolResultsKey = "scarf.chat.loadHistoricalToolResults"
/// True from the moment the user sends a prompt until the ACP
/// `promptComplete` event arrives. Covers the whole round-trip
/// including auxiliary post-processing (title generation, usage
@@ -120,6 +138,12 @@ public final class RichChatViewModel {
/// users can copy-paste the raw output into a bug report.
public var acpErrorDetails: String?
/// Lowercase OAuth provider name (`"nous"`, `"claude"`, …) when the
/// most recent failure was an OAuth refresh-revocation that Hermes
/// asked the user to fix via re-authentication. Drives the chat banner's
/// "Re-authenticate" button. Nil for any other failure mode.
public var acpErrorOAuthProvider: String?
/// Optional stderr-tail provider the controller can hook up when it
/// creates the ACPClient. Used by `handlePromptComplete` to enrich
/// the error banner on non-retryable stopReasons. The closure is
@@ -134,6 +158,7 @@ public final class RichChatViewModel {
acpError = nil
acpErrorHint = nil
acpErrorDetails = nil
acpErrorOAuthProvider = nil
}
/// Populate the error triplet from a thrown Error + the ACPClient
@@ -154,10 +179,11 @@ public final class RichChatViewModel {
}
let msg = error.localizedDescription
let stderrTail = await client?.recentStderr ?? ""
let hint = ACPErrorHint.classify(errorMessage: msg, stderrTail: stderrTail)
let cls = ACPErrorHint.classify(errorMessage: msg, stderrTail: stderrTail)
acpError = msg
acpErrorHint = hint
acpErrorHint = cls?.hint
acpErrorDetails = stderrTail.isEmpty ? nil : stderrTail
acpErrorOAuthProvider = cls?.oauthProvider
}
/// Populate the error triplet when `handlePromptComplete` sees a
@@ -168,11 +194,11 @@ public final class RichChatViewModel {
public func recordPromptStopFailure(stopReason: String, client: ACPClient?) async {
let msg = "Prompt ended without a response (stopReason: \(stopReason))."
let stderrTail = await client?.recentStderr ?? ""
let hint = ACPErrorHint.classify(errorMessage: msg, stderrTail: stderrTail)
?? Self.fallbackHint(for: stopReason)
let cls = ACPErrorHint.classify(errorMessage: msg, stderrTail: stderrTail)
acpError = msg
acpErrorHint = hint
acpErrorHint = cls?.hint ?? Self.fallbackHint(for: stopReason)
acpErrorDetails = stderrTail.isEmpty ? nil : stderrTail
acpErrorOAuthProvider = cls?.oauthProvider
}
/// Same as `recordPromptStopFailure` but pulls stderr from the
@@ -182,11 +208,11 @@ public final class RichChatViewModel {
private func recordPromptStopFailureUsingProvider(stopReason: String) async {
let msg = "Prompt ended without a response (stopReason: \(stopReason))."
let stderrTail = await acpStderrProvider?() ?? ""
let hint = ACPErrorHint.classify(errorMessage: msg, stderrTail: stderrTail)
?? Self.fallbackHint(for: stopReason)
let cls = ACPErrorHint.classify(errorMessage: msg, stderrTail: stderrTail)
acpError = msg
acpErrorHint = hint
acpErrorHint = cls?.hint ?? Self.fallbackHint(for: stopReason)
acpErrorDetails = stderrTail.isEmpty ? nil : stderrTail
acpErrorOAuthProvider = cls?.oauthProvider
}
private static func fallbackHint(for stopReason: String) -> String? {
@@ -354,10 +380,36 @@ public final class RichChatViewModel {
/// spinner and we don't fan out duplicate page requests.
public private(set) var isLoadingEarlier: Bool = false
private var nextLocalId = -1
/// Issue #63: locally-created user messages awaiting state.db
/// persistence, keyed by session id. ACP roundtrips Hermes' DB
/// write asynchronously, so a user who sends a prompt and
/// immediately switches to another session triggers `reset()`
/// before Hermes flushes the row; `loadSessionHistory` then reads
/// from a DB that doesn't have the message yet, and the bubble
/// renders blank or vanishes on return. We hold a per-session
/// copy here that survives `reset()` so `loadSessionHistory` can
/// re-inject anything still in flight, and clean entries out as
/// soon as a matching DB row appears.
private var pendingLocalUserMessages: [String: [HermesMessage]] = [:]
private var streamingAssistantText = ""
private var streamingThinkingText = ""
private var streamingToolCalls: [HermesToolCall] = []
/// True while a turn is in flight, has emitted thought-stream
/// bytes, but has NOT yet produced any visible assistant text.
/// Surfaces the user-facing "Thinking" status promotion (the
/// model is reasoning before answering; Hermes reasoning models
/// commonly take 38 s here, which the ScarfMon `firstThoughtByte`
/// vs `firstByte` split makes visible). Becomes false the moment
/// the first message chunk arrives or the turn ends.
public var isStreamingThoughtsOnly: Bool {
currentTurnStart != nil
&& !streamingThinkingText.isEmpty
&& streamingAssistantText.isEmpty
}
// DB polling state (used in terminal mode fallback)
private var lastKnownFingerprint: HermesDataService.MessageFingerprint?
private var debounceTask: Task<Void, Never>?
@@ -388,6 +440,9 @@ public final class RichChatViewModel {
public func reset() {
debounceTask?.cancel()
hydrationTask?.cancel()
hydrationTask = nil
isHydratingTools = false
stopActivePolling()
Task { await dataService.close() }
messages = []
@@ -435,13 +490,15 @@ public final class RichChatViewModel {
/// Re-fetch session metadata from DB to pick up cost/token updates.
public func refreshSessionFromDB() async {
guard let sessionId else { return }
let opened = await dataService.open()
guard opened else { return }
if let session = await dataService.fetchSession(id: sessionId) {
currentSession = session
await ScarfMon.measureAsync(.sessionLoad, "mac.refreshSessionFromDB") {
guard let sessionId else { return }
let opened = await dataService.open()
guard opened else { return }
if let session = await dataService.fetchSession(id: sessionId) {
currentSession = session
}
await dataService.close()
}
await dataService.close()
}
// MARK: - ACP Event Handling
@@ -468,6 +525,12 @@ public final class RichChatViewModel {
reasoning: nil
)
messages.append(message)
// Track the local message in the pending-user-messages cache
// so a reset/resume cycle on this session before Hermes
// persists the row can still re-inject it on return (#63).
if let sid = sessionId {
pendingLocalUserMessages[sid, default: []].append(message)
}
// Per-turn stopwatch (v2.5): record the start time only when
// we're entering a fresh agent turn. /steer-style mid-run sends
// arrive while isAgentWorking is already true; preserve the
@@ -614,11 +677,23 @@ public final class RichChatViewModel {
}
private func appendMessageChunk(text: String) {
// ScarfMon "first byte" fires once per turn, on the first
// visible message chunk. Splits "user tap → first byte"
// (network + Hermes thinking) from "first byte → turn end"
// (streaming + Scarf rendering) so we can attribute slow-feel
// bugs to the right side. `bytes` carries the first chunk's
// size, not the full turn.
if streamingAssistantText.isEmpty && currentTurnStart != nil {
ScarfMon.event(.chatStream, "firstByte", count: 1, bytes: text.utf8.count)
}
streamingAssistantText += text
upsertStreamingMessage()
}
private func appendThoughtChunk(text: String) {
if streamingThinkingText.isEmpty && currentTurnStart != nil {
ScarfMon.event(.chatStream, "firstThoughtByte", count: 1, bytes: text.utf8.count)
}
streamingThinkingText += text
upsertStreamingMessage()
}
@@ -831,6 +906,12 @@ public final class RichChatViewModel {
/// Convert the streaming message (id=0) into a permanent message and reset streaming state.
private func finalizeStreamingMessage() {
ScarfMon.measure(.chatStream, "finalizeStreamingMessage") {
_finalizeStreamingMessageImpl()
}
}
private func _finalizeStreamingMessageImpl() {
guard let idx = messages.firstIndex(where: { $0.id == Self.streamingId }) else { return }
// Only finalize if there's actual content
@@ -838,22 +919,52 @@ public final class RichChatViewModel {
|| !streamingThinkingText.isEmpty
|| !streamingToolCalls.isEmpty
// ScarfMon: surface turns that finalize with NO visible
// assistant text. Common Nous-model failure mode: model
// emits a few thought-stream bytes then falls silent;
// Hermes finalizes with empty content; the user sees a
// stuck "(°°) deliberating..." placeholder bubble. The
// event fires for both the all-empty case (which gets
// removed below) and the thoughts-only case (which is
// kept as a permanent message with empty body); both
// are user-visible failures worth tracking.
if streamingAssistantText.isEmpty && streamingToolCalls.isEmpty {
ScarfMon.event(
.chatStream,
"emptyAssistantTurn",
count: 1,
bytes: streamingThinkingText.utf8.count
)
}
if hasContent {
let id = nextLocalId
nextLocalId -= 1
messages[idx] = HermesMessage(
id: id,
sessionId: sessionId ?? "",
role: "assistant",
content: streamingAssistantText,
toolCallId: nil,
toolCalls: streamingToolCalls,
toolName: nil,
timestamp: Date(),
tokenCount: nil,
finishReason: streamingToolCalls.isEmpty ? "stop" : nil,
reasoning: streamingThinkingText.isEmpty ? nil : streamingThinkingText
)
// Wrap the streaming-id rewrite in a no-animation
// transaction. Without this SwiftUI sees an identity
// change for the streaming ForEach element (id 0 → new
// permanent id) and runs an animated diff against
// adjacent elements, which costs ~58 RichMessageBubble
// body re-evaluations per turn-end (visible in the
// ScarfMon ring as a 12 ms burst right after every
// `finalizeStreamingMessage` interval). The new message
// is content-equal to the streaming one; there is no
// animation worth running.
withTransaction(Transaction(animation: nil)) {
messages[idx] = HermesMessage(
id: id,
sessionId: sessionId ?? "",
role: "assistant",
content: streamingAssistantText,
toolCallId: nil,
toolCalls: streamingToolCalls,
toolName: nil,
timestamp: Date(),
tokenCount: nil,
finishReason: streamingToolCalls.isEmpty ? "stop" : nil,
reasoning: streamingThinkingText.isEmpty ? nil : streamingThinkingText
)
}
// Capture per-turn duration so the chat UI can render the
// stopwatch pill (v2.5). Skips assistants we don't have a
// start time for; e.g., the .promptComplete fired but the
@@ -864,8 +975,12 @@ public final class RichChatViewModel {
currentTurnStart = nil
}
} else {
// Remove empty streaming placeholder
messages.remove(at: idx)
// Remove empty streaming placeholder. Same no-animation
// transaction pattern; empty-finalize used to ripple the
// ForEach diff to every following bubble.
withTransaction(Transaction(animation: nil)) {
messages.remove(at: idx)
}
}
// Reset streaming state for next chunk
@@ -940,7 +1055,20 @@ public final class RichChatViewModel {
/// Load message history from the DB, optionally combining an origin session
/// (e.g., CLI session) with the current ACP session.
public func loadSessionHistory(sessionId: String, acpSessionId: String? = nil) async {
await ScarfMon.measureAsync(.sessionLoad, "mac.hydrateMessages") {
self.sessionId = sessionId
// Capture the session-id we're loading FOR so we can verify
// it's still the active one before assigning to `messages`.
// Without this guard, switching to a small chat while a
// larger one is mid-fetch can result in last-write-wins:
// the slow fetch finishes after the small chat's, drops
// the user back into the big chat's transcript, and the
// user has to reselect the small one. Observed in remote
// perf captures (parallel fetchMessages calls, one timing
// out at 30s for a 157-message session, the other 2-message
// chat completing in 425ms; the 30s one's assignment
// overwrote the small chat).
let loadingForSession = sessionId
// Force a fresh snapshot pull on remote contexts. An earlier open()
// would have cached a stale copy on resume; we need whatever
// Hermes has actually persisted since then, or the resumed session
@@ -950,9 +1078,30 @@ public final class RichChatViewModel {
// messages the agent streamed during the user's offline window.
let opened = await dataService.refresh(forceFresh: true)
guard opened else { return }
// Race-check #1: session id may have changed during refresh.
guard self.sessionId == loadingForSession else {
ScarfMon.event(.sessionLoad, "mac.hydrateMessages.dropped", count: 1)
return
}
// v2.8 two-phase loader. Phase 1 skeleton: user + assistant
// rows only, no tool_calls JSON, no reasoning, no
// reasoning_content. Wire payload bounded by conversational
// text alone so chats with multi-page tool result blobs (the
// 30s-timeout case) come up in seconds. Phase 2 (kicked off
// below via `startToolHydration`) fills tool calls + tool results
// in the background; the chat is usable while it runs.
let pageSize = HistoryPageSize.initial
var allMessages = await dataService.fetchMessages(sessionId: sessionId, limit: pageSize)
let originOutcome = await dataService.fetchSkeletonMessages(sessionId: sessionId, limit: pageSize)
var allMessages = originOutcome.messages
var transportFailure: String? = originOutcome.transportError
// Race-check #2: session id may have changed during the
// long fetch (the most common race: a 30s timeout on a
// big session lets the user switch to a small one and back).
guard self.sessionId == loadingForSession else {
ScarfMon.event(.sessionLoad, "mac.hydrateMessages.dropped", count: 1)
return
}
// The DB has more on-disk history when the initial fetch
// saturated the limit. The "Load earlier" affordance reads
// this flag.
@@ -964,17 +1113,63 @@ public final class RichChatViewModel {
if let acpId = acpSessionId, acpId != sessionId {
originSessionId = sessionId
self.sessionId = acpId
let acpMessages = await dataService.fetchMessages(sessionId: acpId, limit: pageSize)
if !acpMessages.isEmpty {
allMessages.append(contentsOf: acpMessages)
let acpOutcome = await dataService.fetchSkeletonMessages(sessionId: acpId, limit: pageSize)
// Race-check #3: same guard, after the second fetch.
guard self.sessionId == acpId else {
ScarfMon.event(.sessionLoad, "mac.hydrateMessages.dropped", count: 1)
return
}
if let acpErr = acpOutcome.transportError, transportFailure == nil {
transportFailure = acpErr
}
if !acpOutcome.messages.isEmpty {
allMessages.append(contentsOf: acpOutcome.messages)
allMessages.sort { ($0.timestamp ?? .distantPast) < ($1.timestamp ?? .distantPast) }
moreHistory = moreHistory || acpMessages.count >= pageSize
moreHistory = moreHistory || acpOutcome.messages.count >= pageSize
}
}
messages = allMessages
// Issue #63: re-inject any locally-created user messages
// we still have on file for this session that haven't yet
// shown up in state.db. Covers two paths:
// 1. The user just sent a prompt then resumed a different
// session before Hermes persisted the row. `reset()` had
// cleared `messages` but the per-session pending cache
// survived; restore the row here so the bubble doesn't
// come back blank.
// 2. The DB-resume path on first load: a previously-pending
// message Hermes is still mid-write may not appear in
// this fetch. We merge it in, and drop it from the cache
// as soon as a matching DB row (same content, persisted
// id >= 0) shows up.
let pendingForSession = pendingLocalUserMessages[sessionId] ?? []
if pendingForSession.isEmpty {
messages = allMessages
} else {
var merged = allMessages
var stillPending: [HermesMessage] = []
for local in pendingForSession {
let persisted = merged.contains { msg in
msg.isUser && msg.id >= 0 && msg.content == local.content
}
if persisted {
continue // DB caught up; drop the local copy
}
if !merged.contains(where: { $0.id == local.id }) {
merged.append(local)
}
stillPending.append(local)
}
merged.sort { ($0.timestamp ?? .distantPast) < ($1.timestamp ?? .distantPast) }
messages = merged
if stillPending.isEmpty {
pendingLocalUserMessages.removeValue(forKey: sessionId)
} else {
pendingLocalUserMessages[sessionId] = stillPending
}
}
currentSession = session
let minId = allMessages.map(\.id).min() ?? 0
let minId = messages.map(\.id).min() ?? 0
nextLocalId = min(minId - 1, -1)
// Track the oldest loaded id from THIS session (not the merged
// origin) so `loadEarlier()` pages back through the live ACP
@@ -987,7 +1182,182 @@ public final class RichChatViewModel {
.map(\.id)
.min()
hasMoreHistory = moreHistory
ScarfMon.event(.sessionLoad, "mac.hydrateMessages.rows", count: messages.count)
buildMessageGroups()
// Partial-result detection: if a fetch tripped a transport
// failure (SSH timeout / ControlMaster drop), the user is now
// looking at zero or near-zero messages with no idea why. The
// pre-v2.8 behavior was a silent empty transcript. Surface a
// banner via the existing acpError triplet so the user sees
// "couldn't load full history, connection slow." We assume
// more history exists (so the "Load earlier" affordance is
// honest about the gap); the caller can retry by reopening the
// session.
if let reason = transportFailure {
acpError = "Couldn't load full chat history — the connection to \(dataService.context.displayName) timed out."
acpErrorHint = "Reopen the session to retry, or check the SSH link if this keeps happening."
acpErrorDetails = reason
acpErrorOAuthProvider = nil
hasMoreHistory = true
} else {
// v2.8: kick off background hydration of tool_calls JSON
// and tool result rows for the just-loaded skeleton.
// Non-blocking on the main load path (chat is usable).
startToolHydration(loadingForSession: self.sessionId ?? sessionId)
}
} // end measureAsync(.sessionLoad, "mac.hydrateMessages")
}
/// Phase 2 of the two-phase chat loader. Pulls `tool_calls` JSON
/// for the loaded assistant rows, then fetches `role='tool'` rows
/// in the loaded id range and splices both into `messages` /
/// `messageGroups` without disturbing what the user is already
/// reading. Cancellable: restarting (a session switch, a
/// `reset()`) drops any in-flight pass.
///
/// Tool calls go in first because they live ON the existing
/// assistant message and surface the most-visible UI affordance
/// (the tool card chips). Tool result content rows go in second
/// because they're the heaviest payload and the UI degrades
/// gracefully without them (the cards still show "running" /
/// "complete" state; only the result body is missing).
private func startToolHydration(loadingForSession: String) {
hydrationTask?.cancel()
let sessionForLoad = loadingForSession
let dataService = self.dataService
hydrationTask = Task { @MainActor [weak self] in
guard let self else { return }
self.isHydratingTools = true
defer { self.isHydratingTools = false }
// Snapshot the assistant ids + id range from the messages
// we just loaded. Doing this on MainActor keeps us in step
// with the observable view of `messages`; the actual
// SQL calls happen in `await` slots that release the actor.
let assistantIds = self.messages
.filter { $0.isAssistant && $0.id > 0 }
.map(\.id)
guard let minId = self.messages.map(\.id).min(),
let maxId = self.messages.map(\.id).max(),
!assistantIds.isEmpty || minId < maxId else {
return
}
// Phase 2a: tool_calls JSON. Splice parsed values into
// each assistant message that has them.
let toolCallMap = await dataService.hydrateAssistantToolCalls(messageIds: assistantIds)
if Task.isCancelled || self.sessionId != sessionForLoad {
ScarfMon.event(.sessionLoad, "mac.hydrateTools.dropped", count: 1)
return
}
if !toolCallMap.isEmpty {
self.messages = self.messages.map { msg in
guard msg.isAssistant, let calls = toolCallMap[msg.id] else { return msg }
return msg.withToolCalls(calls)
}
self.buildMessageGroups()
}
// Phase 2b: tool result rows. Default OFF (v2.8). A
// single tool result blob (file dump, stack trace) can run
// hundreds of KB; bulk-fetching all of them during chat
// resume on a slow remote was the cause of the 30s timeout
// observed in 2026-05-05 dogfooding. Users can opt in via
// Settings → Display → "Load tool results in past chats"
// when bandwidth is plentiful. Tool call CARDS still
// render either way (`tool_calls` JSON loads in Phase 2a);
// only the inspector pane's "Output" section is empty
// until the user opens a card, at which point a per-call
// lazy fetch fills it in.
let loadResults = UserDefaults.standard.bool(
forKey: Self.loadHistoricalToolResultsKey
)
guard loadResults else {
ScarfMon.event(.sessionLoad, "mac.hydrateTools.skippedToolResults", count: 1)
return
}
let toolResults = await dataService.fetchToolResultsInRange(
sessionId: sessionForLoad,
minId: minId,
maxId: maxId
)
if Task.isCancelled || self.sessionId != sessionForLoad {
ScarfMon.event(.sessionLoad, "mac.hydrateTools.dropped", count: 1)
return
}
if !toolResults.isEmpty {
var merged = self.messages
let existingIds = Set(merged.map(\.id))
for tr in toolResults where !existingIds.contains(tr.id) {
merged.append(tr)
}
merged.sort { lhs, rhs in
let lt = lhs.timestamp ?? .distantPast
let rt = rhs.timestamp ?? .distantPast
if lt != rt { return lt < rt }
return lhs.id < rhs.id
}
self.messages = merged
self.buildMessageGroups()
}
ScarfMon.event(.sessionLoad, "mac.hydrateTools.complete", count: 1)
}
}
/// Lazy-load the content of a single tool result by call id and
/// splice it into `messages` / `messageGroups` as a synthetic
/// `role='tool'` row. Used by `ChatInspectorPane` when the user
/// opens a tool call card whose result hasn't been hydrated yet
/// (auto-hydrate is opt-in via `loadHistoricalToolResultsKey`).
/// No-op when the result is already present in the transcript or
/// the session id has changed underneath us.
@MainActor
public func loadToolResultIfMissing(callId: String) async {
guard let sessionForLoad = sessionId else { return }
// Already in the transcript? Done.
if messages.contains(where: { $0.toolCallId == callId && $0.isToolResult }) {
return
}
guard let content = await dataService.fetchToolResult(callId: callId) else {
return
}
guard self.sessionId == sessionForLoad else { return }
// Build a synthetic tool result row. We don't have the original
// row id (would need a second SELECT) so we use a negative
// local id that won't collide with persisted rows. The bubble
// and inspector both key on `toolCallId`, not `id`, for tool
// results, so this is enough to render correctly.
let placeholderId = nextLocalId
nextLocalId -= 1
let synthetic = HermesMessage(
id: placeholderId,
sessionId: sessionForLoad,
role: "tool",
content: content,
toolCallId: callId,
toolCalls: [],
toolName: nil,
timestamp: Date(),
tokenCount: nil,
finishReason: nil,
reasoning: nil,
reasoningContent: nil
)
messages.append(synthetic)
// Re-sort so the tool result lands next to its assistant
// parent. ID-based ordering preserves the chronological order
// of all the persisted rows; the synthetic placeholder uses a
// negative id so it slots in last, which is fine for inspector
// display since the inspector keys on toolCallId.
messages.sort { lhs, rhs in
let lt = lhs.timestamp ?? .distantPast
let rt = rhs.timestamp ?? .distantPast
if lt != rt { return lt < rt }
return lhs.id < rhs.id
}
buildMessageGroups()
ScarfMon.event(.sessionLoad, "mac.lazyToolResult.fetched", count: 1)
}
// MARK: - Load Earlier (pagination)
@@ -82,16 +82,23 @@ public final class SkillsViewModel {
let ctx = context
let xport = transport
let pins = pinnedNames
let cats: [HermesSkillCategory] = await Task.detached {
let disabled = Self.readDisabledSkillNames(context: ctx)
let pinned = pins ?? Self.readPinnedSkillNames(context: ctx)
return SkillsScanner.scan(
context: ctx,
transport: xport,
disabledNames: disabled,
pinnedNames: pinned
)
}.value
// v2.8: instrumented so future captures show how many SSH
// RTTs the SkillsScanner walk costs on remote (it stats
// every ~/.hermes/skills/* directory + reads each one's SKILL.md).
let cats: [HermesSkillCategory] = await ScarfMon.measureAsync(.diskIO, "skills.load") {
await Task.detached {
let disabled = Self.readDisabledSkillNames(context: ctx)
let pinned = pins ?? Self.readPinnedSkillNames(context: ctx)
return SkillsScanner.scan(
context: ctx,
transport: xport,
disabledNames: disabled,
pinnedNames: pinned
)
}.value
}
let totalSkills = cats.reduce(0) { $0 + $1.skills.count }
ScarfMon.event(.diskIO, "skills.load.count", count: totalSkills)
categories = cats
isLoading = false
}
@@ -0,0 +1,150 @@
#if canImport(SQLite3)
import Foundation
@testable import ScarfCore
/// Test double for `HermesQueryBackend`. Lets the data-service-façade
/// tests assert which SQL gets emitted, with which params, and feed
/// scripted result rows back.
///
/// Implemented as an `actor` to satisfy the protocol's `Sendable`
/// requirement and to mirror how the real backends serialize state.
/// Marked `final` to prevent accidental subclassing; Swift Testing
/// instances are short-lived per-`@Test`, but a stray subclass could
/// hide override quirks.
final actor MockHermesQueryBackend: HermesQueryBackend {
// MARK: - Knobs
var openShouldSucceed: Bool = true
var hasV07Schema: Bool = false
var hasV011Schema: Bool = false
var lastOpenError: String? = nil
/// Map of SQL prefix → rows. Lookup picks the longest matching
/// prefix, so callers can register both broad ("SELECT") and
/// narrow ("SELECT id, source FROM sessions") matchers without
/// the broad one swallowing the narrow one.
private var scriptedResults: [String: [Row]] = [:]
/// Map of SQL prefix → backend error to throw instead of returning
/// rows. Used to test the data-service's error-swallowing paths.
private var scriptedFailures: [String: BackendError] = [:]
/// Every `query(_:params:)` call lands here in order: assertion
/// material for "did the façade emit the SQL we expected".
private(set) var queryLog: [(sql: String, params: [SQLValue])] = []
/// Every `queryBatch` call lands here in order, one outer entry
/// per call, inner entries for each statement in that batch.
private(set) var batchLog: [[(sql: String, params: [SQLValue])]] = []
/// Track open/refresh/close lifecycle for a couple of tests that
/// want to assert "façade really did call open()".
private(set) var openCallCount = 0
private(set) var refreshCallCount = 0
private(set) var closeCallCount = 0
// MARK: - Knob mutators (called from tests)
func setOpenShouldSucceed(_ value: Bool) { openShouldSucceed = value }
func setHasV07Schema(_ value: Bool) { hasV07Schema = value }
func setHasV011Schema(_ value: Bool) { hasV011Schema = value }
func setLastOpenError(_ value: String?) { lastOpenError = value }
/// Build a one-row result keyed on `prefix`. `columns` is the
/// column-name position map; `values` must be the same length.
func _seedRow(forSQLPrefix prefix: String, columns: [String: Int], values: [SQLValue]) {
let row = Row(values: values, columnIndex: columns)
scriptedResults[prefix] = [row]
}
/// Seed an arbitrary row sequence for queries that share `prefix`.
func _seedRows(forSQLPrefix prefix: String, _ rows: [Row]) {
scriptedResults[prefix] = rows
}
/// Make `query` throw the specified `error` whenever it sees a SQL
/// that begins with `prefix`.
func _seedFailure(forSQLPrefix prefix: String, error: BackendError) {
scriptedFailures[prefix] = error
}
// MARK: - HermesQueryBackend conformance
func open() async -> Bool {
openCallCount += 1
return openShouldSucceed
}
@discardableResult
func refresh(forceFresh: Bool) async -> Bool {
refreshCallCount += 1
return openShouldSucceed
}
func close() async {
closeCallCount += 1
}
func query(_ sql: String, params: [SQLValue]) async throws -> [Row] {
queryLog.append((sql: sql, params: params))
if let failure = longestMatchingFailure(for: sql) {
throw failure
}
return longestMatchingRows(for: sql) ?? []
}
func queryBatch(_ statements: [(sql: String, params: [SQLValue])]) async throws -> [[Row]] {
batchLog.append(statements)
var out: [[Row]] = []
out.reserveCapacity(statements.count)
for stmt in statements {
if let failure = longestMatchingFailure(for: stmt.sql) {
throw failure
}
out.append(longestMatchingRows(for: stmt.sql) ?? [])
}
return out
}
// MARK: - Internals
/// Pick the longest registered prefix that `sql` starts with.
/// Ties go to whichever ordering Dictionary iteration produced;
/// callers should not register two equal-length matchers for the
/// same SQL because the resolution order is undefined.
private func longestMatchingRows(for sql: String) -> [Row]? {
var bestMatch: (key: String, rows: [Row])?
for (prefix, rows) in scriptedResults {
if sql.hasPrefix(prefix) {
if let current = bestMatch {
if prefix.count > current.key.count {
bestMatch = (prefix, rows)
}
} else {
bestMatch = (prefix, rows)
}
}
}
return bestMatch?.rows
}
private func longestMatchingFailure(for sql: String) -> BackendError? {
var bestMatch: (key: String, error: BackendError)?
for (prefix, error) in scriptedFailures {
if sql.hasPrefix(prefix) {
if let current = bestMatch {
if prefix.count > current.key.count {
bestMatch = (prefix, error)
}
} else {
bestMatch = (prefix, error)
}
}
}
return bestMatch?.error
}
}
#endif // canImport(SQLite3)
@@ -0,0 +1,338 @@
#if canImport(SQLite3)
import Testing
import Foundation
@testable import ScarfCore
/// Exercises the `HermesDataService` façade against a `MockHermesQueryBackend`
/// via the `internal init(context:backend:)` test seam. Focus is the SQL
/// the façade emits + how it consumes the rows that come back.
@Suite struct HermesDataServiceBackendTests {
// MARK: - Helpers
/// Build a `Row` from `(name, value)` pairs in column order.
/// Mirrors the shape `LocalSQLiteBackend.executeOne` produces.
private func makeRow(_ pairs: [(String, SQLValue)]) -> Row {
var values: [SQLValue] = []
var columnIndex: [String: Int] = [:]
values.reserveCapacity(pairs.count)
for (i, pair) in pairs.enumerated() {
values.append(pair.1)
columnIndex[pair.0] = i
}
return Row(values: values, columnIndex: columnIndex)
}
/// Default 16-column session row matching `sessionColumns` for
/// the bare base schema. Uses `.text("s1")` for id by default.
private func makeBaseSessionRow(id: String = "s1") -> Row {
makeRow([
("id", .text(id)),
("source", .text("acp")),
("user_id", .null),
("model", .text("gpt-5")),
("title", .text("hello")),
("parent_session_id", .null),
("started_at", .real(1_700_000_000.0)),
("ended_at", .null),
("end_reason", .null),
("message_count", .integer(5)),
("tool_call_count", .integer(2)),
("input_tokens", .integer(100)),
("output_tokens", .integer(200)),
("cache_read_tokens", .integer(0)),
("cache_write_tokens", .integer(0)),
("estimated_cost_usd", .real(0.05))
])
}
/// 10-column message row matching `messageColumns` for the bare base schema.
private func makeBaseMessageRow(id: Int, sessionId: String = "s1", timestamp: Double = 1_700_000_001.0) -> Row {
makeRow([
("id", .integer(Int64(id))),
("session_id", .text(sessionId)),
("role", .text("user")),
("content", .text("hi #\(id)")),
("tool_call_id", .null),
("tool_calls", .null),
("tool_name", .null),
("timestamp", .real(timestamp)),
("token_count", .integer(10)),
("finish_reason", .null)
])
}
/// Use a real `ServerContext.local` so the data service has a
/// transport to construct (it's never used by these tests; every
/// I/O path goes through the injected backend).
private let context: ServerContext = .local
// MARK: - fetchSessions
@Test func fetchSessionsEmitsExpectedSQLPrefixAndDefaultLimit() async {
let mock = MockHermesQueryBackend()
let service = HermesDataService(context: context, backend: mock)
_ = await service.open()
_ = await service.fetchSessions()
let log = await mock.queryLog
#expect(log.count == 1)
let first = log[0]
#expect(first.sql.hasPrefix("SELECT id, source"))
#expect(first.sql.contains("FROM sessions WHERE parent_session_id IS NULL ORDER BY started_at DESC LIMIT ?"))
// QueryDefaults.sessionLimit == 100.
#expect(first.params == [.integer(100)])
}
@Test func fetchSessionsBareSchemaUsesBaseColumnList() async {
let mock = MockHermesQueryBackend()
// Both schema flags off: neither v0.7 nor v0.11 columns selected.
await mock.setHasV07Schema(false)
await mock.setHasV011Schema(false)
let service = HermesDataService(context: context, backend: mock)
_ = await service.open()
_ = await service.fetchSessions()
let sql = await mock.queryLog[0].sql
#expect(!sql.contains("reasoning_tokens"))
#expect(!sql.contains("api_call_count"))
// Sanity: base columns are still all there.
#expect(sql.contains("estimated_cost_usd"))
}
@Test func fetchSessionsWithV07SchemaIncludesReasoningTokens() async {
let mock = MockHermesQueryBackend()
await mock.setHasV07Schema(true)
await mock.setHasV011Schema(false)
let service = HermesDataService(context: context, backend: mock)
_ = await service.open()
_ = await service.fetchSessions()
let sql = await mock.queryLog[0].sql
#expect(sql.contains("reasoning_tokens"))
#expect(sql.contains("actual_cost_usd"))
#expect(sql.contains("cost_status"))
#expect(sql.contains("billing_provider"))
#expect(!sql.contains("api_call_count"))
}
@Test func fetchSessionsWithV011SchemaIncludesApiCallCount() async {
let mock = MockHermesQueryBackend()
await mock.setHasV07Schema(true)
await mock.setHasV011Schema(true)
let service = HermesDataService(context: context, backend: mock)
_ = await service.open()
_ = await service.fetchSessions()
let sql = await mock.queryLog[0].sql
#expect(sql.contains("reasoning_tokens"))
#expect(sql.contains("api_call_count"))
}
// MARK: - fetchSession(id:)
@Test func fetchSessionByIdBindsTextParam() async {
let mock = MockHermesQueryBackend()
let service = HermesDataService(context: context, backend: mock)
_ = await service.open()
await mock._seedRow(
forSQLPrefix: "SELECT id, source",
columns: makeBaseSessionRow().columnIndex,
values: makeBaseSessionRow().values
)
let session = await service.fetchSession(id: "abc-123")
#expect(session?.id == "s1") // From the seeded row.
let log = await mock.queryLog
#expect(log.count == 1)
#expect(log[0].sql.contains("FROM sessions WHERE id = ? LIMIT 1"))
#expect(log[0].params == [.text("abc-123")])
}
// MARK: - fetchMessages
@Test func fetchMessagesWithoutBeforeBindsSessionAndLimit() async {
let mock = MockHermesQueryBackend()
let service = HermesDataService(context: context, backend: mock)
_ = await service.open()
_ = await service.fetchMessages(sessionId: "s1", limit: 25, before: nil)
let log = await mock.queryLog
#expect(log.count == 1)
#expect(!log[0].sql.contains("id < ?"))
#expect(log[0].sql.contains("WHERE session_id = ? ORDER BY id DESC LIMIT ?"))
#expect(log[0].params == [.text("s1"), .integer(25)])
}
@Test func fetchMessagesWithBeforeIncludesIdLessThanClause() async {
let mock = MockHermesQueryBackend()
let service = HermesDataService(context: context, backend: mock)
_ = await service.open()
_ = await service.fetchMessages(sessionId: "s1", limit: 25, before: 999)
let log = await mock.queryLog
#expect(log.count == 1)
#expect(log[0].sql.contains("WHERE session_id = ? AND id < ? ORDER BY id DESC LIMIT ?"))
#expect(log[0].params == [.text("s1"), .integer(999), .integer(25)])
}
@Test func fetchMessagesReversesDescResultsToChronological() async {
let mock = MockHermesQueryBackend()
let service = HermesDataService(context: context, backend: mock)
_ = await service.open()
// Backend returns DESC (newest first); service should reverse to
// chronological (oldest first) for display.
let row3 = makeBaseMessageRow(id: 3, timestamp: 1_700_000_003.0)
let row2 = makeBaseMessageRow(id: 2, timestamp: 1_700_000_002.0)
let row1 = makeBaseMessageRow(id: 1, timestamp: 1_700_000_001.0)
await mock._seedRows(forSQLPrefix: "SELECT id, session_id", [row3, row2, row1])
let result = await service.fetchMessages(sessionId: "s1", limit: 10, before: nil)
#expect(result.count == 3)
#expect(result.map { $0.id } == [1, 2, 3])
}
// MARK: - dashboardSnapshot
@Test func dashboardSnapshotUsesQueryBatchNotIndividualQueries() async {
let mock = MockHermesQueryBackend()
let service = HermesDataService(context: context, backend: mock)
_ = await service.open()
_ = await service.dashboardSnapshot()
let queries = await mock.queryLog
let batches = await mock.batchLog
#expect(queries.isEmpty)
#expect(batches.count == 1)
#expect(batches[0].count == 4)
}
@Test func dashboardSnapshotBatchOrderIsStatsRecentSessionsPreviewsToolCalls() async {
let mock = MockHermesQueryBackend()
let service = HermesDataService(context: context, backend: mock)
_ = await service.open()
_ = await service.dashboardSnapshot()
let batches = await mock.batchLog
#expect(batches.count == 1)
let stmts = batches[0]
// 0: stats selects COUNT(*), SUM(...) from sessions.
#expect(stmts[0].sql.contains("COUNT(*)"))
#expect(stmts[0].sql.contains("FROM sessions"))
// 1: recent sessions selects session columns with a LIMIT param.
#expect(stmts[1].sql.hasPrefix("SELECT id, source"))
#expect(stmts[1].sql.contains("ORDER BY started_at DESC LIMIT ?"))
// 2: session previews joins messages with first user message.
#expect(stmts[2].sql.contains("INNER JOIN"))
#expect(stmts[2].sql.contains("MIN(id)"))
// 3: recent tool calls selects messages WHERE tool_calls IS NOT NULL.
#expect(stmts[3].sql.contains("WHERE tool_calls IS NOT NULL"))
}
@Test func dashboardSnapshotAssemblesDataFromFourResultSets() async {
let mock = MockHermesQueryBackend()
let service = HermesDataService(context: context, backend: mock)
_ = await service.open()
// Stats row (6 cols on bare schema).
let statsRow = makeRow([
("c0", .integer(7)), // totalSessions
("c1", .integer(50)), // totalMessages
("c2", .integer(12)), // totalToolCalls
("c3", .integer(1000)), // totalInputTokens
("c4", .integer(2000)), // totalOutputTokens
("c5", .real(1.25)) // totalCostUSD
])
await mock._seedRow(forSQLPrefix: "SELECT COUNT(*),", columns: statsRow.columnIndex, values: statsRow.values)
// Recent sessions: one base session row.
await mock._seedRows(forSQLPrefix: "SELECT id, source", [makeBaseSessionRow(id: "sess-A")])
// Previews: two-column rows (session_id, content slice).
let p1 = makeRow([("session_id", .text("sess-A")), ("preview", .text("first user msg"))])
await mock._seedRows(forSQLPrefix: "SELECT m.session_id", [p1])
// Recent tool calls: one message row with non-empty tool_calls.
var toolRow = makeBaseMessageRow(id: 99, sessionId: "sess-A")
// Manually rewrite tool_calls column (idx 5) to non-null/non-empty.
let toolRowValues: [SQLValue] = [
.integer(99), .text("sess-A"), .text("assistant"), .text("Calling tool"),
.null, .text("[{\"id\":\"t1\",\"name\":\"bash\"}]"), .text("bash"),
.real(1_700_000_010.0), .integer(15), .text("stop")
]
toolRow = Row(values: toolRowValues, columnIndex: toolRow.columnIndex)
// Both `fetchRecentToolCalls` and the dashboard batch slot start
// with the same `messageColumns` prefix; match on a shorter
// common prefix that's whitespace-stable across the two
// SQL builders.
await mock._seedRows(forSQLPrefix: "SELECT id, session_id, role, content, tool_call_id, tool_calls,\ntool_name", [toolRow])
let snapshot = await service.dashboardSnapshot()
#expect(snapshot.stats.totalSessions == 7)
#expect(snapshot.stats.totalMessages == 50)
#expect(snapshot.recentSessions.map { $0.id } == ["sess-A"])
#expect(snapshot.sessionPreviews["sess-A"] == "first user msg")
#expect(snapshot.recentToolCalls.count == 1)
#expect(snapshot.recentToolCalls[0].id == 99)
}
// MARK: - searchMessages
@Test func searchMessagesEmptyInputReturnsEmptyAndSkipsBackend() async {
let mock = MockHermesQueryBackend()
let service = HermesDataService(context: context, backend: mock)
_ = await service.open()
let result = await service.searchMessages(query: " ")
#expect(result.isEmpty)
let log = await mock.queryLog
#expect(log.isEmpty)
}
@Test func searchMessagesWrapsTokensInDoubleQuotes() async {
let mock = MockHermesQueryBackend()
let service = HermesDataService(context: context, backend: mock)
_ = await service.open()
_ = await service.searchMessages(query: "config.yaml v0.7.0")
let log = await mock.queryLog
#expect(log.count == 1)
// FTS query is the first param.
guard case .text(let fts) = log[0].params[0] else {
Issue.record("Expected first FTS search param to be .text")
return
}
// Each whitespace-delimited token gets wrapped in double-quotes
// and joined with spaces.
#expect(fts == "\"config.yaml\" \"v0.7.0\"")
}
// MARK: - Error swallowing
@Test func fetchSessionsReturnsEmptyOnBackendTransportError() async {
let mock = MockHermesQueryBackend()
let service = HermesDataService(context: context, backend: mock)
_ = await service.open()
await mock._seedFailure(forSQLPrefix: "SELECT id, source", error: .transport("ssh dropped"))
let result = await service.fetchSessions()
#expect(result.isEmpty)
// Sanity: the error reached the backend (the call was made).
let log = await mock.queryLog
#expect(log.count == 1)
}
}
#endif // canImport(SQLite3)
@@ -0,0 +1,119 @@
import Testing
import Foundation
@testable import ScarfCore
/// Exercises the `SCARF_HERMES_HOME` test-mode override on `HermesProfileResolver`.
/// The override is the seam every E2E test relies on; without it, tests would
/// touch the user's real `~/.hermes`. Serialized because we mutate process-wide
/// environment.
///
/// **Marker file requirement.** As of v2.8 the override only activates when the
/// path contains the sentinel `HermesProfileResolver.testHomeMarkerFilename`.
/// Tests that want the override active drop the marker before `setenv`. Tests
/// that want to verify the override is rejected (relative path, missing
/// marker, empty value) skip the marker. The hardening prevents a leaked env
/// var from ever pivoting Scarf off the user's real `~/.hermes`.
@Suite(.serialized)
struct HermesProfileResolverOverrideTests {
private static let envKey = "SCARF_HERMES_HOME"
@Test func absoluteOverrideTakesPrecedenceWhenMarkerPresent() throws {
let saved = ProcessInfo.processInfo.environment[Self.envKey]
defer { restore(saved) }
let tmp = NSTemporaryDirectory().appending("scarf-test-home-\(UUID().uuidString)")
try FileManager.default.createDirectory(atPath: tmp, withIntermediateDirectories: true)
try Data().write(to: URL(fileURLWithPath: tmp + "/" + HermesProfileResolver.testHomeMarkerFilename))
defer { try? FileManager.default.removeItem(atPath: tmp) }
setenv(Self.envKey, tmp, 1)
#expect(HermesProfileResolver.resolveLocalHome() == tmp)
#expect(HermesProfileResolver.activeProfileName() == "test-override")
}
@Test func overrideIsIgnoredWhenMarkerMissing() throws {
let saved = ProcessInfo.processInfo.environment[Self.envKey]
defer { restore(saved) }
// Real-looking dir, no marker: exactly the shape a leaked env
// var or misconfigured launchctl plist would produce. Must NOT
// override; must fall through to the real resolver.
let tmp = NSTemporaryDirectory().appending("scarf-no-marker-\(UUID().uuidString)")
try FileManager.default.createDirectory(atPath: tmp, withIntermediateDirectories: true)
defer { try? FileManager.default.removeItem(atPath: tmp) }
setenv(Self.envKey, tmp, 1)
HermesProfileResolver.invalidateCache()
let resolved = HermesProfileResolver.resolveLocalHome()
#expect(resolved != tmp)
#expect(resolved.hasSuffix("/.hermes") || resolved.contains("/.hermes/profiles/"))
}
@Test func emptyOverrideFallsThrough() {
let saved = ProcessInfo.processInfo.environment[Self.envKey]
defer { restore(saved) }
setenv(Self.envKey, "", 1)
HermesProfileResolver.invalidateCache()
let resolved = HermesProfileResolver.resolveLocalHome()
#expect(!resolved.isEmpty)
#expect(resolved.hasSuffix("/.hermes") || resolved.contains("/.hermes/profiles/"))
}
@Test func relativeOverrideIsRejected() {
let saved = ProcessInfo.processInfo.environment[Self.envKey]
defer { restore(saved) }
setenv(Self.envKey, "relative/path", 1)
HermesProfileResolver.invalidateCache()
let resolved = HermesProfileResolver.resolveLocalHome()
#expect(!resolved.hasSuffix("relative/path"))
}
@Test func unsetOverrideUsesProfileResolver() {
let saved = ProcessInfo.processInfo.environment[Self.envKey]
defer { restore(saved) }
unsetenv(Self.envKey)
HermesProfileResolver.invalidateCache()
let resolved = HermesProfileResolver.resolveLocalHome()
#expect(!resolved.isEmpty)
}
@Test func overrideBypassesCacheWhenMarkerPresent() throws {
let saved = ProcessInfo.processInfo.environment[Self.envKey]
defer { restore(saved) }
let first = NSTemporaryDirectory().appending("scarf-cache-bypass-1-\(UUID().uuidString)")
let second = NSTemporaryDirectory().appending("scarf-cache-bypass-2-\(UUID().uuidString)")
try FileManager.default.createDirectory(atPath: first, withIntermediateDirectories: true)
try FileManager.default.createDirectory(atPath: second, withIntermediateDirectories: true)
try Data().write(to: URL(fileURLWithPath: first + "/" + HermesProfileResolver.testHomeMarkerFilename))
try Data().write(to: URL(fileURLWithPath: second + "/" + HermesProfileResolver.testHomeMarkerFilename))
defer {
try? FileManager.default.removeItem(atPath: first)
try? FileManager.default.removeItem(atPath: second)
}
setenv(Self.envKey, first, 1)
#expect(HermesProfileResolver.resolveLocalHome() == first)
// Flip env var without invalidating the cache. Override is read
// fresh on every call, so the new value takes effect immediately.
setenv(Self.envKey, second, 1)
#expect(HermesProfileResolver.resolveLocalHome() == second)
}
private func restore(_ saved: String?) {
if let saved {
setenv(Self.envKey, saved, 1)
} else {
unsetenv(Self.envKey)
}
HermesProfileResolver.invalidateCache()
}
}
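// A hedged sketch of the override gate these tests exercise. The real
// resolver does more (profile selection, caching of the non-override
// path); only the env-var checks pinned above are shown, and the
// function name is illustrative.
func testHomeOverrideSketch(markerFilename: String) -> String? {
    // Read via getenv on every call so flipping the env var mid-process
    // takes effect without invalidating the resolver cache.
    guard let cString = getenv("SCARF_HERMES_HOME") else { return nil }
    let raw = String(cString: cString)
    // Empty or relative values fall through to the normal resolver.
    guard !raw.isEmpty, raw.hasPrefix("/") else { return nil }
    // The marker file is the v2.8 hardening: a leaked env var pointing
    // at a real directory without the sentinel must not win.
    let markerPath = raw + "/" + markerFilename
    guard FileManager.default.fileExists(atPath: markerPath) else { return nil }
    return raw
}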
@@ -0,0 +1,48 @@
import Testing
import Foundation
@testable import ScarfCore
/// Verifies the lenient `ListItemStatus(raw:)` parser. Real dashboards on
/// disk use a mix of canonical names + synonyms (`done`, `info`, `ok`,
/// `pending`, `up` are seen on the dev's machine today); the parser must
/// fold those onto the canonical case set without throwing or returning nil
/// for the common synonyms. Unknown strings map to nil so the renderer can fall
/// back to plain text without losing the original.
@Suite struct ListItemStatusTests {
@Test func canonicalNamesParse() {
for c in ListItemStatus.allCases {
#expect(ListItemStatus(raw: c.rawValue) == c)
}
}
@Test func synonymsCollapseToCanonical() {
#expect(ListItemStatus(raw: "ok") == .success)
#expect(ListItemStatus(raw: "OK") == .success) // case-insensitive
#expect(ListItemStatus(raw: " up ") == .success) // whitespace trim
#expect(ListItemStatus(raw: "down") == .danger)
#expect(ListItemStatus(raw: "error") == .danger)
#expect(ListItemStatus(raw: "failed") == .danger)
#expect(ListItemStatus(raw: "warn") == .warning)
#expect(ListItemStatus(raw: "degraded") == .warning)
#expect(ListItemStatus(raw: "active") == .info)
#expect(ListItemStatus(raw: "queued") == .pending)
#expect(ListItemStatus(raw: "complete") == .done)
}
@Test func unknownReturnsNilNotThrows() {
#expect(ListItemStatus(raw: "hologram") == nil)
#expect(ListItemStatus(raw: "") == nil)
#expect(ListItemStatus(raw: nil) == nil)
#expect(ListItemStatus(raw: " ") == nil)
}
@Test func listItemStillDecodesUnknownStatusString() throws {
// Backwards-compat invariant: `ListItem.status` stays a free String? on
// the wire. Decoding a v2.6 dashboard with a non-canonical status must
// succeed and preserve the original string (renderer falls back).
let json = #"{"text":"foo","status":"weird"}"#.data(using: .utf8)!
let item = try JSONDecoder().decode(ListItem.self, from: json)
#expect(item.status == "weird")
#expect(ListItemStatus(raw: item.status) == nil)
}
}
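// A hedged sketch of the lenient init these tests pin. The shipped enum
// may carry more cases or synonyms; the type name here is illustrative
// and only the fold-to-canonical pattern is the point.
enum ListItemStatusSketch: String, CaseIterable {
    case success, danger, warning, info, pending, done

    init?(raw: String?) {
        guard let raw else { return nil }
        let key = raw.trimmingCharacters(in: .whitespacesAndNewlines).lowercased()
        guard !key.isEmpty else { return nil }
        switch key {
        case "ok", "up":                self = .success
        case "down", "error", "failed": self = .danger
        case "warn", "degraded":        self = .warning
        case "active":                  self = .info
        case "queued":                  self = .pending
        case "complete":                self = .done
        default:
            // Canonical names parse via rawValue; anything else is nil so
            // the renderer falls back to plain text with the original string.
            guard let canonical = ListItemStatusSketch(rawValue: key) else { return nil }
            self = canonical
        }
    }
}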
@@ -219,12 +219,6 @@ import Foundation
try transport.removeFile(tmp.path)
}
@Test func localTransportSnapshotSQLiteReturnsPathUnchanged() throws {
let transport = LocalTransport()
let url = try transport.snapshotSQLite(remotePath: "/tmp/some/state.db")
#expect(url.path == "/tmp/some/state.db")
}
/// The Mac target wires `SSHTransport.environmentEnricher` at launch to
/// `HermesFileService.enrichedEnvironment()` so SSH subprocesses
/// inherit SSH_AUTH_SOCK from the user's login shell (1Password /
@@ -265,19 +265,20 @@ import Foundation
errorMessage: "No Anthropic credentials found",
stderrTail: ""
)
#expect(noCreds?.contains("ANTHROPIC_API_KEY") == true)
#expect(noCreds?.hint.contains("ANTHROPIC_API_KEY") == true)
#expect(noCreds?.oauthProvider == nil)
let missingBinary = ACPErrorHint.classify(
errorMessage: "",
stderrTail: "No such file or directory: 'npx'"
)
#expect(missingBinary?.contains("npx") == true)
#expect(missingBinary?.hint.contains("npx") == true)
let rateLimit = ACPErrorHint.classify(
errorMessage: "",
stderrTail: "HTTP 429 Too Many Requests: rate limit"
)
#expect(rateLimit?.contains("rate-limit") == true)
#expect(rateLimit?.hint.contains("rate-limit") == true)
let unknown = ACPErrorHint.classify(
errorMessage: "weird thing",
@@ -286,6 +287,53 @@ import Foundation
#expect(unknown == nil)
}
@Test func errorHintsClassifyOAuthRefreshRevoked() {
// Primary trigger: Hermes's verbatim message when an OAuth
// refresh token can't mint a new access token. Provider name
// appears alongside; classifier should extract it.
let revoked = ACPErrorHint.classify(
errorMessage: "",
stderrTail: "Refresh session has been revoked. Run `hermes model` to re-authenticate."
)
#expect(revoked?.hint.contains("Re-authenticate") == true)
// With provider context: surfaces the affected provider name
// so the chat banner can offer a one-click re-auth that targets
// the right OAuth flow.
let revokedWithProvider = ACPErrorHint.classify(
errorMessage: "",
stderrTail: "Provider claude: Refresh session has been revoked. Run `hermes model` to re-authenticate."
)
#expect(revokedWithProvider?.oauthProvider == "claude")
// 401 + OAuth provider name: broader catchall for providers
// that don't print the verbatim "revoked" string.
let unauthorized = ACPErrorHint.classify(
errorMessage: "",
stderrTail: "HTTP 401 Unauthorized from nous portal"
)
#expect(unauthorized?.oauthProvider == "nous")
#expect(unauthorized?.hint.contains("OAuth") == true)
// Unauthorized on a non-OAuth provider (API-key based) should
// NOT classify as OAuth revocation; there's no `oauthProvider`
// to dispatch the re-auth flow against.
let unauthorizedNonOAuth = ACPErrorHint.classify(
errorMessage: "",
stderrTail: "HTTP 401 Unauthorized for groq"
)
#expect(unauthorizedNonOAuth?.oauthProvider == nil)
// Word-boundary check: "anthropicapi" must not false-trigger
// on "anthropic". Without word boundaries this catches the
// wrong cases.
let substringNoMatch = ACPErrorHint.classify(
errorMessage: "",
stderrTail: "401 unauthorized: anthropicapi.example.com"
)
#expect(substringNoMatch?.oauthProvider != "anthropic")
}
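// The assertions above pin the new return shape of `classify`: a hint
// string plus an optional OAuth provider. A minimal sketch of that shape
// and of the revocation branch; the names `ClassificationSketch` /
// `classifySketch` and the provider list are illustrative assumptions,
// not the shipped classifier.
struct ClassificationSketch {
    let hint: String
    let oauthProvider: String?   // drives the banner's "Re-authenticate" button
}

func classifySketch(stderrTail: String) -> ClassificationSketch? {
    let lower = stderrTail.lowercased()
    guard lower.contains("refresh session has been revoked")
        || (lower.contains("401") && lower.contains("unauthorized")) else { return nil }
    // Word-boundary provider match so "anthropicapi" doesn't read as "anthropic".
    let knownOAuthProviders = ["nous", "claude", "anthropic", "openai"]
    let words = Set(lower.split(whereSeparator: { !$0.isLetter }).map(String.init))
    let provider = knownOAuthProviders.first { words.contains($0) }
    return ClassificationSketch(
        hint: "Re-authenticate: run `hermes model` or use the banner button to refresh the OAuth credentials.",
        oauthProvider: provider
    )
}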
// MARK: - Helpers
/// Poll `predicate` every ~20ms up to `timeout` seconds. Fails if
@@ -455,8 +455,9 @@ import Foundation
}
}
}
func snapshotSQLite(remotePath: String) throws -> URL { URL(fileURLWithPath: remotePath) }
var cachedSnapshotPath: URL? { nil }
func streamScript(_ script: String, timeout: TimeInterval) async throws -> ProcessResult {
ProcessResult(exitCode: 0, stdout: Data(), stderr: Data())
}
func watchPaths(_ paths: [String]) -> AsyncStream<WatchEvent> {
AsyncStream { $0.finish() }
}
@@ -0,0 +1,182 @@
import Testing
import Foundation
@testable import ScarfCore
/// Pure tests for `ModelPreflight`: both the `check(_:)` configured-vs-
/// missing classifier and the v2.8 `detectMismatch(_:)` provider/prefix
/// reconciliation. The mismatch path is what surfaces the orange
/// "Model/provider mismatch in config.yaml" banner in ChatView when the
/// user switches OAuth providers via Credential Pools and `model.default`
/// is left carrying the old provider's prefix.
@Suite struct ModelPreflightTests {
// MARK: - check(_:) missing-field classifier
@Test func bothModelAndProviderEmptyReportsMissingBoth() {
var cfg = HermesConfig.empty
cfg.model = ""
cfg.provider = ""
#expect(ModelPreflight.check(cfg) == .missingBoth)
}
@Test func bothModelAndProviderUnknownReportsMissingBoth() {
// `HermesConfig.empty` defaults model/provider to the literal
// "unknown" the classifier must treat that the same as "".
let cfg = HermesConfig.empty
#expect(ModelPreflight.check(cfg) == .missingBoth)
}
@Test func providerSetButModelEmptyReportsMissingModel() {
var cfg = HermesConfig.empty
cfg.model = ""
cfg.provider = "anthropic"
#expect(ModelPreflight.check(cfg) == .missingModel)
}
@Test func modelSetButProviderEmptyReportsMissingProvider() {
var cfg = HermesConfig.empty
cfg.model = "claude-sonnet-4.6"
cfg.provider = ""
#expect(ModelPreflight.check(cfg) == .missingProvider)
}
@Test func bothSetReportsConfigured() {
var cfg = HermesConfig.empty
cfg.model = "claude-sonnet-4.6"
cfg.provider = "anthropic"
#expect(ModelPreflight.check(cfg) == .configured)
}
@Test func whitespaceTreatedAsUnsetForBothFields() {
var cfg = HermesConfig.empty
cfg.model = " "
cfg.provider = "\n"
#expect(ModelPreflight.check(cfg) == .missingBoth)
}
@Test func resultIsConfiguredOnlyForConfiguredCase() {
#expect(ModelPreflight.Result.configured.isConfigured)
#expect(!ModelPreflight.Result.missingBoth.isConfigured)
#expect(!ModelPreflight.Result.missingModel.isConfigured)
#expect(!ModelPreflight.Result.missingProvider.isConfigured)
}
// MARK: - detectMismatch(_:)
@Test func detectMismatchReturnsNilWhenNoPrefixOnModelDefault() {
var cfg = HermesConfig.empty
cfg.model = "claude-sonnet-4.6"
cfg.provider = "anthropic"
#expect(ModelPreflight.detectMismatch(cfg) == nil)
}
@Test func detectMismatchReturnsNilWhenPrefixMatchesProvider() {
var cfg = HermesConfig.empty
cfg.model = "anthropic/claude-sonnet-4.6"
cfg.provider = "anthropic"
#expect(ModelPreflight.detectMismatch(cfg) == nil)
}
@Test func detectMismatchReturnsNilWhenModelDefaultIsUnset() {
var cfg = HermesConfig.empty
cfg.model = ""
cfg.provider = "nous"
#expect(ModelPreflight.detectMismatch(cfg) == nil)
}
@Test func detectMismatchReturnsNilWhenProviderIsUnset() {
var cfg = HermesConfig.empty
cfg.model = "anthropic/claude-sonnet-4.6"
cfg.provider = ""
#expect(ModelPreflight.detectMismatch(cfg) == nil)
}
@Test func detectMismatchReturnsNilWhenBothUnknown() {
// The literal "unknown" sentinel from the YAML parser fallback
// counts as unset on both sides; no mismatch to report.
let cfg = HermesConfig.empty // model + provider both "unknown"
#expect(ModelPreflight.detectMismatch(cfg) == nil)
}
@Test func detectMismatchSurfacesPrefixVsActiveProvider() {
// The dogfooding scenario: Anthropic-prefixed model still sitting
// in config.yaml after the user OAuth'd into Nous via Credential
// Pools. Hermes can't reconcile and chats die with -32603 at
// first prompt. The banner offers a one-click fix in either
// direction; this test pins the data the banner reads.
var cfg = HermesConfig.empty
cfg.model = "anthropic/claude-sonnet-4.6"
cfg.provider = "nous"
let mismatch = ModelPreflight.detectMismatch(cfg)
#expect(mismatch != nil)
#expect(mismatch?.prefixProvider == "anthropic")
#expect(mismatch?.activeProvider == "nous")
#expect(mismatch?.modelDefault == "anthropic/claude-sonnet-4.6")
#expect(mismatch?.bareModel == "claude-sonnet-4.6")
}
@Test func detectMismatchIsCaseInsensitiveOnPrefixMatch() {
// Hermes accepts both `Anthropic/...` and `anthropic/...` casings
// in the wild; case-only differences must NOT surface as a
// mismatch (would be a false-positive banner).
var cfg = HermesConfig.empty
cfg.model = "Anthropic/claude-sonnet-4.6"
cfg.provider = "anthropic"
#expect(ModelPreflight.detectMismatch(cfg) == nil)
}
@Test func detectMismatchHandlesNonAnthropicProviders() {
// The mismatch banner needs to work for any provider pair,
// not just the dogfooding case. Pin the openai+nous shape.
var cfg = HermesConfig.empty
cfg.model = "openai/gpt-5"
cfg.provider = "nous"
let mismatch = ModelPreflight.detectMismatch(cfg)
#expect(mismatch?.prefixProvider == "openai")
#expect(mismatch?.activeProvider == "nous")
#expect(mismatch?.bareModel == "gpt-5")
}
@Test func detectMismatchReturnsNilForEmptyBareModel() {
// A pathological "anthropic/" with no model name after the
// slash isn't a valid mismatch; the caller has no bare model to
// write back. The classifier should refuse to surface it
// rather than emit a useless fix button.
var cfg = HermesConfig.empty
cfg.model = "anthropic/"
cfg.provider = "nous"
#expect(ModelPreflight.detectMismatch(cfg) == nil)
}
@Test func detectMismatchReturnsNilForEmptyPrefix() {
// Symmetric pathological case: leading slash, no provider
// prefix. Don't fire.
var cfg = HermesConfig.empty
cfg.model = "/claude-sonnet-4.6"
cfg.provider = "nous"
#expect(ModelPreflight.detectMismatch(cfg) == nil)
}
@Test func detectMismatchHandlesModelsWithMultipleSlashes() {
// Some provider/model strings carry path-style segments after
// the first slash (e.g. an OpenRouter style path). The first
// slash separates prefix from bare model; the rest of the
// string is the bare model verbatim.
var cfg = HermesConfig.empty
cfg.model = "openrouter/anthropic/claude-sonnet-4.6"
cfg.provider = "anthropic"
let mismatch = ModelPreflight.detectMismatch(cfg)
#expect(mismatch?.prefixProvider == "openrouter")
#expect(mismatch?.activeProvider == "anthropic")
#expect(mismatch?.bareModel == "anthropic/claude-sonnet-4.6")
}
@Test func detectMismatchTrimsWhitespaceBeforeComparing() {
// A stray newline in a hand-edited config.yaml shouldn't read
// as a mismatch when the trimmed values agree.
var cfg = HermesConfig.empty
cfg.model = "anthropic/claude-sonnet-4.6 "
cfg.provider = " anthropic\n"
#expect(ModelPreflight.detectMismatch(cfg) == nil)
}
}
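import Foundation

// Illustrative sketch only, not the shipped ModelPreflight implementation.
// It restates the classification rule the tests above pin down; the field
// names mirror what the test expectations read (prefixProvider /
// activeProvider / modelDefault / bareModel), and "unknown" is the
// YAML-parser fallback sentinel described in the both-unknown test.
struct SketchMismatch {
    let prefixProvider: String
    let activeProvider: String
    let modelDefault: String
    let bareModel: String
}

func sketchDetectMismatch(model rawModel: String, provider rawProvider: String) -> SketchMismatch? {
    // .whitespacesAndNewlines, not .whitespaces: a stray "\n" in a
    // hand-edited config.yaml must not defeat the comparison.
    let model = rawModel.trimmingCharacters(in: .whitespacesAndNewlines)
    let provider = rawProvider.trimmingCharacters(in: .whitespacesAndNewlines)
    // Unset or parser-fallback values never mismatch.
    guard !model.isEmpty, model != "unknown",
          !provider.isEmpty, provider != "unknown" else { return nil }
    // The first slash splits the provider prefix from the bare model; later
    // slashes stay inside the bare model (OpenRouter-style ids).
    guard let slash = model.firstIndex(of: "/") else { return nil }
    let prefix = String(model[..<slash])
    let bare = String(model[model.index(after: slash)...])
    guard !prefix.isEmpty, !bare.isEmpty else { return nil }
    // Case-only differences are not a mismatch.
    guard prefix.lowercased() != provider.lowercased() else { return nil }
    return SketchMismatch(
        prefixProvider: prefix,
        activeProvider: provider,
        modelDefault: model,
        bareModel: bare
    )
}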
@@ -0,0 +1,565 @@
#if canImport(SQLite3)
import Testing
import Foundation
import SQLite3
@testable import ScarfCore
// MARK: - LocalSQLite3Transport
/// Test-only transport that runs the script through `/bin/sh -c` on the
/// local machine. Lets `RemoteSQLiteBackend`'s production codepath
/// (which calls `transport.streamScript`) drive a real local sqlite3
/// invocation against a tmp fixture DB. No SSH, no Citadel; the
/// backend doesn't care how `streamScript` gets its bytes.
private struct LocalSQLite3Transport: ServerTransport {
let contextID: ServerID
let isRemote: Bool = false
init(contextID: ServerID = ServerContext.local.id) {
self.contextID = contextID
}
func readFile(_ path: String) throws -> Data {
try Data(contentsOf: URL(fileURLWithPath: path))
}
func writeFile(_ path: String, data: Data) throws {
try data.write(to: URL(fileURLWithPath: path), options: .atomic)
}
func fileExists(_ path: String) -> Bool {
FileManager.default.fileExists(atPath: path)
}
func stat(_ path: String) -> FileStat? {
guard let attrs = try? FileManager.default.attributesOfItem(atPath: path) else { return nil }
let size = (attrs[.size] as? Int64) ?? Int64((attrs[.size] as? Int) ?? 0)
let mtime = (attrs[.modificationDate] as? Date) ?? Date(timeIntervalSince1970: 0)
let isDir = (attrs[.type] as? FileAttributeType) == .typeDirectory
return FileStat(size: size, mtime: mtime, isDirectory: isDir)
}
func listDirectory(_ path: String) throws -> [String] {
try FileManager.default.contentsOfDirectory(atPath: path)
}
func createDirectory(_ path: String) throws {
try FileManager.default.createDirectory(atPath: path, withIntermediateDirectories: true)
}
func removeFile(_ path: String) throws {
guard FileManager.default.fileExists(atPath: path) else { return }
try FileManager.default.removeItem(atPath: path)
}
func runProcess(executable: String, args: [String], stdin: Data?, timeout: TimeInterval?) throws -> ProcessResult {
throw TransportError.other(message: "LocalSQLite3Transport.runProcess unused in tests")
}
#if !os(iOS)
func makeProcess(executable: String, args: [String]) -> Process {
let p = Process()
p.executableURL = URL(fileURLWithPath: executable)
p.arguments = args
return p
}
#endif
func streamLines(executable: String, args: [String]) -> AsyncThrowingStream<String, Error> {
AsyncThrowingStream { $0.finish() }
}
/// The actual workhorse: feed the script to `/bin/sh -c` so heredocs
/// and command substitution behave exactly as they would on the
/// remote end of an SSH session. Capture stdout / stderr / exit
/// code into a `ProcessResult`.
func streamScript(_ script: String, timeout: TimeInterval) async throws -> ProcessResult {
return try await withCheckedThrowingContinuation { continuation in
DispatchQueue.global().async {
let proc = Process()
proc.executableURL = URL(fileURLWithPath: "/bin/sh")
proc.arguments = ["-c", script]
let outPipe = Pipe()
let errPipe = Pipe()
proc.standardOutput = outPipe
proc.standardError = errPipe
do {
try proc.run()
} catch {
continuation.resume(throwing: TransportError.other(
message: "Failed to launch /bin/sh: \(error.localizedDescription)"
))
return
}
try? outPipe.fileHandleForWriting.close()
try? errPipe.fileHandleForWriting.close()
proc.waitUntilExit()
let stdout = (try? outPipe.fileHandleForReading.readToEnd()) ?? Data()
let stderr = (try? errPipe.fileHandleForReading.readToEnd()) ?? Data()
try? outPipe.fileHandleForReading.close()
try? errPipe.fileHandleForReading.close()
continuation.resume(returning: ProcessResult(
exitCode: proc.terminationStatus,
stdout: stdout,
stderr: stderr
))
}
}
}
func watchPaths(_ paths: [String]) -> AsyncStream<WatchEvent> {
AsyncStream { $0.finish() }
}
}
// MARK: - Suite
/// Integration tests for `RemoteSQLiteBackend`. Drives the real backend
/// against a local sqlite3 binary (via `LocalSQLite3Transport`) and a
/// per-test fixture state.db on disk.
@Suite struct RemoteSQLiteBackendTests {
// MARK: - Fixture builders
/// Build a minimal v0.6 baseline state.db (no v0.7, no v0.11 columns).
/// Each test takes ownership of cleanup via `defer`.
private func makeFixtureStateDB(
addV07Columns: Bool = false,
addV011SessionsColumn: Bool = false,
addV011MessagesColumn: Bool = false
) throws -> URL {
// Each test gets its own isolated parent dir. We can't dump the
// fixture directly into `temporaryDirectory` because the symlink
// we create alongside (`<parent>/state.db`) would clobber a
// sibling test's symlink when the suite runs in parallel.
let testDir = FileManager.default.temporaryDirectory
.appendingPathComponent("scarf-test-\(UUID().uuidString)", isDirectory: true)
try FileManager.default.createDirectory(at: testDir, withIntermediateDirectories: true)
let url = testDir.appendingPathComponent("fixture.db")
var db: OpaquePointer?
guard sqlite3_open_v2(url.path, &db, SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE, nil) == SQLITE_OK else {
throw TransportError.other(message: "sqlite3_open_v2 failed")
}
defer { sqlite3_close(db) }
var sessionsExtra = ""
if addV07Columns {
sessionsExtra += ", reasoning_tokens INTEGER, actual_cost_usd REAL, cost_status TEXT, billing_provider TEXT"
}
if addV011SessionsColumn {
sessionsExtra += ", api_call_count INTEGER"
}
var messagesExtra = ""
if addV011MessagesColumn {
messagesExtra += ", reasoning_content TEXT"
}
let schema = """
CREATE TABLE sessions (
id TEXT PRIMARY KEY,
source TEXT,
user_id TEXT,
model TEXT,
title TEXT,
parent_session_id TEXT,
started_at REAL,
ended_at REAL,
end_reason TEXT,
message_count INTEGER,
tool_call_count INTEGER,
input_tokens INTEGER,
output_tokens INTEGER,
cache_read_tokens INTEGER,
cache_write_tokens INTEGER,
estimated_cost_usd REAL\(sessionsExtra)
);
INSERT INTO sessions (id, source, user_id, model, title, parent_session_id, started_at, ended_at, end_reason, message_count, tool_call_count, input_tokens, output_tokens, cache_read_tokens, cache_write_tokens, estimated_cost_usd)
VALUES ('s1', 'acp', 'u1', 'gpt-5', 'Test', NULL, 1700000000.0, NULL, NULL, 5, 2, 100, 200, 0, 0, 0.05);
CREATE TABLE messages (
id INTEGER PRIMARY KEY,
session_id TEXT,
role TEXT,
content TEXT,
tool_call_id TEXT,
tool_calls TEXT,
tool_name TEXT,
timestamp REAL,
token_count INTEGER,
finish_reason TEXT\(messagesExtra)
);
INSERT INTO messages (id, session_id, role, content, tool_call_id, tool_calls, tool_name, timestamp, token_count, finish_reason)
VALUES (1, 's1', 'user', 'hi', NULL, NULL, NULL, 1700000001.0, NULL, NULL);
"""
var errMsg: UnsafeMutablePointer<CChar>?
let rc = sqlite3_exec(db, schema, nil, nil, &errMsg)
if rc != SQLITE_OK {
let msg = errMsg.flatMap { String(cString: $0) } ?? "unknown"
sqlite3_free(errMsg)
throw TransportError.other(message: "sqlite3_exec failed: \(msg)")
}
return url
}
/// Construct a remote-shaped context whose `paths.stateDB` points at
/// the fixture file. We embed the absolute path under a fake
/// `remoteHome` whose final `/.hermes/state.db` resolves to our
/// real DB on disk.
private func makeFixtureContext(dbURL: URL) -> ServerContext {
// The DB the backend opens is `<paths.home>/state.db`. We point
// `remoteHome` at the parent dir of the fixture file and then
// symlink `state.db` to the fixture so the backend's resolved
// path lands on it.
let parent = dbURL.deletingLastPathComponent()
let stateLink = parent.appendingPathComponent("state.db")
// Replace any prior symlink/file at the canonical "state.db" path.
try? FileManager.default.removeItem(at: stateLink)
try? FileManager.default.createSymbolicLink(at: stateLink, withDestinationURL: dbURL)
return ServerContext(
id: UUID(),
displayName: "fixture",
kind: .ssh(SSHConfig(host: "fake.invalid", remoteHome: parent.path))
)
}
/// Construct a remote-shaped context that uses the default
/// `~/.hermes` remote home exercises the tilde-expansion path
/// in `RemoteSQLiteBackend.quoteForRemoteShell`. The fixture DB
/// is symlinked at `$HOME/.hermes/state.db` so the shell-expanded
/// path resolves correctly. Cleanup restores anything we move.
/// Returns the original-symlink (or absent state) so the caller
/// can restore on teardown.
private struct DefaultHomeFixture {
let dbURL: URL
let stateLink: URL
let backupURL: URL?
let context: ServerContext
}
private func makeDefaultHomeFixtureContext(dbURL: URL) throws -> DefaultHomeFixture {
let homeURL = URL(fileURLWithPath: NSHomeDirectory())
let hermesDir = homeURL.appendingPathComponent(".hermes", isDirectory: true)
try FileManager.default.createDirectory(at: hermesDir, withIntermediateDirectories: true)
let stateLink = hermesDir.appendingPathComponent("state.db")
// If something is already at ~/.hermes/state.db (the user's
// real Hermes install on dev machines), move it aside so we
// can put our fixture in its place. Restore on teardown.
var backupURL: URL?
if FileManager.default.fileExists(atPath: stateLink.path) {
let bak = hermesDir.appendingPathComponent("state.db.scarf-test-bak-\(UUID().uuidString)")
try FileManager.default.moveItem(at: stateLink, to: bak)
backupURL = bak
}
try FileManager.default.createSymbolicLink(at: stateLink, withDestinationURL: dbURL)
let ctx = ServerContext(
id: UUID(),
displayName: "fixture",
kind: .ssh(SSHConfig(host: "fake.invalid"))
// No remoteHome override; defaults to "~/.hermes".
)
return DefaultHomeFixture(dbURL: dbURL, stateLink: stateLink, backupURL: backupURL, context: ctx)
}
private func cleanupDefaultHomeFixture(_ fixture: DefaultHomeFixture) {
try? FileManager.default.removeItem(at: fixture.stateLink)
if let bak = fixture.backupURL {
try? FileManager.default.moveItem(at: bak, to: fixture.stateLink)
}
}
/// Skip the test if /usr/bin/sqlite3 isn't available. Mirrors how
/// other Apple-only tests gate on system tooling.
private func requireSqlite3() throws {
let path = "/usr/bin/sqlite3"
let exists = FileManager.default.isExecutableFile(atPath: path)
try #require(exists, "Test requires /usr/bin/sqlite3")
}
// MARK: - open() / schema detection
/// Regression: a default-config remote with `paths.stateDB ==
/// "~/.hermes/state.db"` previously hit `unable to open database
/// "~/.hermes/state.db"` because the backend single-quoted the
/// path and sqlite3 doesn't expand `~` itself. Verify the
/// $HOME-rewrite path works against a real shell.
@Test func openWithDefaultTildeHomeExpands() async throws {
try requireSqlite3()
let dbURL = try makeFixtureStateDB()
let fixture = try makeDefaultHomeFixtureContext(dbURL: dbURL)
defer {
cleanupDefaultHomeFixture(fixture)
try? FileManager.default.removeItem(at: dbURL)
try? FileManager.default.removeItem(at: dbURL.deletingLastPathComponent())
}
let backend = RemoteSQLiteBackend(context: fixture.context, transport: LocalSQLite3Transport())
let opened = await backend.open()
#expect(opened)
let err = await backend.lastOpenError
#expect(err == nil)
// And actually run a query through the same expansion path.
let rows = try await backend.query("SELECT id FROM sessions", params: [])
#expect(rows.count == 1)
}
@Test func openProbesSchemaSuccessfully() async throws {
try requireSqlite3()
let dbURL = try makeFixtureStateDB()
defer {
try? FileManager.default.removeItem(at: dbURL)
try? FileManager.default.removeItem(at: dbURL.deletingLastPathComponent().appendingPathComponent("state.db"))
}
let ctx = makeFixtureContext(dbURL: dbURL)
let backend = RemoteSQLiteBackend(context: ctx, transport: LocalSQLite3Transport())
let opened = await backend.open()
#expect(opened)
let v07 = await backend.hasV07Schema
let v011 = await backend.hasV011Schema
#expect(v07 == false)
#expect(v011 == false)
let err = await backend.lastOpenError
#expect(err == nil)
}
@Test func openOnV07SchemaDB() async throws {
try requireSqlite3()
let dbURL = try makeFixtureStateDB(addV07Columns: true)
defer {
try? FileManager.default.removeItem(at: dbURL)
try? FileManager.default.removeItem(at: dbURL.deletingLastPathComponent().appendingPathComponent("state.db"))
}
let ctx = makeFixtureContext(dbURL: dbURL)
let backend = RemoteSQLiteBackend(context: ctx, transport: LocalSQLite3Transport())
let opened = await backend.open()
#expect(opened)
let v07 = await backend.hasV07Schema
let v011 = await backend.hasV011Schema
#expect(v07 == true)
#expect(v011 == false)
}
@Test func openOnV011SchemaDB() async throws {
try requireSqlite3()
let dbURL = try makeFixtureStateDB(
addV07Columns: true,
addV011SessionsColumn: true,
addV011MessagesColumn: true
)
defer {
try? FileManager.default.removeItem(at: dbURL)
try? FileManager.default.removeItem(at: dbURL.deletingLastPathComponent().appendingPathComponent("state.db"))
}
let ctx = makeFixtureContext(dbURL: dbURL)
let backend = RemoteSQLiteBackend(context: ctx, transport: LocalSQLite3Transport())
let opened = await backend.open()
#expect(opened)
let v011 = await backend.hasV011Schema
#expect(v011 == true)
}
@Test func partialMigrationStaysOnV07() async throws {
try requireSqlite3()
// sessions has api_call_count but messages lacks reasoning_content;
// the belt-and-braces guard should keep hasV011Schema false.
let dbURL = try makeFixtureStateDB(
addV07Columns: true,
addV011SessionsColumn: true,
addV011MessagesColumn: false
)
defer {
try? FileManager.default.removeItem(at: dbURL)
try? FileManager.default.removeItem(at: dbURL.deletingLastPathComponent().appendingPathComponent("state.db"))
}
let ctx = makeFixtureContext(dbURL: dbURL)
let backend = RemoteSQLiteBackend(context: ctx, transport: LocalSQLite3Transport())
let opened = await backend.open()
#expect(opened)
let v011 = await backend.hasV011Schema
#expect(v011 == false)
let v07 = await backend.hasV07Schema
#expect(v07 == true)
}
// MARK: - query()
@Test func queryReturnsRows() async throws {
try requireSqlite3()
let dbURL = try makeFixtureStateDB()
defer {
try? FileManager.default.removeItem(at: dbURL)
try? FileManager.default.removeItem(at: dbURL.deletingLastPathComponent().appendingPathComponent("state.db"))
}
let ctx = makeFixtureContext(dbURL: dbURL)
let backend = RemoteSQLiteBackend(context: ctx, transport: LocalSQLite3Transport())
_ = await backend.open()
let rows = try await backend.query("SELECT id FROM sessions", params: [])
#expect(rows.count == 1)
if case .text(let id) = rows[0][0] {
#expect(id == "s1")
} else {
Issue.record("Expected .text id, got \(rows[0][0])")
}
}
@Test func queryWithIntParam() async throws {
try requireSqlite3()
let dbURL = try makeFixtureStateDB()
defer {
try? FileManager.default.removeItem(at: dbURL)
try? FileManager.default.removeItem(at: dbURL.deletingLastPathComponent().appendingPathComponent("state.db"))
}
let ctx = makeFixtureContext(dbURL: dbURL)
let backend = RemoteSQLiteBackend(context: ctx, transport: LocalSQLite3Transport())
_ = await backend.open()
let rows = try await backend.query(
"SELECT id FROM sessions WHERE message_count >= ?",
params: [.integer(5)]
)
#expect(rows.count == 1)
}
@Test func queryWithTextParamEscapesQuotes() async throws {
try requireSqlite3()
let dbURL = try makeFixtureStateDB()
defer {
try? FileManager.default.removeItem(at: dbURL)
try? FileManager.default.removeItem(at: dbURL.deletingLastPathComponent().appendingPathComponent("state.db"))
}
let ctx = makeFixtureContext(dbURL: dbURL)
let backend = RemoteSQLiteBackend(context: ctx, transport: LocalSQLite3Transport())
_ = await backend.open()
// Injection-shaped value should be escaped to a harmless literal,
// matching nothing in the fixture.
let rows = try await backend.query(
"SELECT id FROM sessions WHERE id = ?",
params: [.text("s' OR 1=1 --")]
)
#expect(rows.isEmpty)
}
@Test func queryEmptyResultSet() async throws {
try requireSqlite3()
let dbURL = try makeFixtureStateDB()
defer {
try? FileManager.default.removeItem(at: dbURL)
try? FileManager.default.removeItem(at: dbURL.deletingLastPathComponent().appendingPathComponent("state.db"))
}
let ctx = makeFixtureContext(dbURL: dbURL)
let backend = RemoteSQLiteBackend(context: ctx, transport: LocalSQLite3Transport())
_ = await backend.open()
let rows = try await backend.query(
"SELECT id FROM sessions WHERE id = ?",
params: [.text("does-not-exist")]
)
#expect(rows.isEmpty)
}
@Test func queryNullValuesPreserved() async throws {
try requireSqlite3()
let dbURL = try makeFixtureStateDB()
defer {
try? FileManager.default.removeItem(at: dbURL)
try? FileManager.default.removeItem(at: dbURL.deletingLastPathComponent().appendingPathComponent("state.db"))
}
let ctx = makeFixtureContext(dbURL: dbURL)
let backend = RemoteSQLiteBackend(context: ctx, transport: LocalSQLite3Transport())
_ = await backend.open()
let rows = try await backend.query(
"SELECT id, ended_at, end_reason FROM sessions WHERE id = ?",
params: [.text("s1")]
)
#expect(rows.count == 1)
// ended_at and end_reason are NULL in the fixture row.
#expect(rows[0].isNull(at: 1))
#expect(rows[0].isNull(at: 2))
}
// MARK: - queryBatch()
@Test func queryBatchSplitsResultsCorrectly() async throws {
try requireSqlite3()
let dbURL = try makeFixtureStateDB()
defer {
try? FileManager.default.removeItem(at: dbURL)
try? FileManager.default.removeItem(at: dbURL.deletingLastPathComponent().appendingPathComponent("state.db"))
}
let ctx = makeFixtureContext(dbURL: dbURL)
let backend = RemoteSQLiteBackend(context: ctx, transport: LocalSQLite3Transport())
_ = await backend.open()
let results = try await backend.queryBatch([
(sql: "SELECT id FROM sessions", params: []),
(sql: "SELECT id FROM messages WHERE session_id = ?", params: [.text("s1")]),
(sql: "SELECT COUNT(*) FROM sessions", params: [])
])
#expect(results.count == 3)
// Slot 0: one session row.
#expect(results[0].count == 1)
if case .text(let sid) = results[0][0][0] {
#expect(sid == "s1")
} else {
Issue.record("Expected .text in slot 0")
}
// Slot 1: one message row.
#expect(results[1].count == 1)
// Slot 2: one count row with integer 1.
#expect(results[2].count == 1)
if case .integer(let n) = results[2][0][0] {
#expect(n == 1)
} else {
Issue.record("Expected .integer in slot 2")
}
}
@Test func queryBatchHandlesEmptyResultSets() async throws {
try requireSqlite3()
let dbURL = try makeFixtureStateDB()
defer {
try? FileManager.default.removeItem(at: dbURL)
try? FileManager.default.removeItem(at: dbURL.deletingLastPathComponent().appendingPathComponent("state.db"))
}
let ctx = makeFixtureContext(dbURL: dbURL)
let backend = RemoteSQLiteBackend(context: ctx, transport: LocalSQLite3Transport())
_ = await backend.open()
// Middle statement returns 0 rows; outer slots should still be
// populated correctly.
let results = try await backend.queryBatch([
(sql: "SELECT id FROM sessions", params: []),
(sql: "SELECT id FROM messages WHERE session_id = ?", params: [.text("does-not-exist")]),
(sql: "SELECT COUNT(*) FROM messages", params: [])
])
#expect(results.count == 3)
#expect(results[0].count == 1)
#expect(results[1].isEmpty)
#expect(results[2].count == 1)
}
// MARK: - Failure paths
@Test func nonZeroExitThrowsSqliteError() async throws {
try requireSqlite3()
// Point at a parent dir with no state.db symlink; sqlite3 will
// open a brand-new empty DB, so the schema PRAGMAs return empty
// tables. That actually succeeds. Instead, point remoteHome at
// a path under a non-existent directory so sqlite3 can't open
// the file at all.
let nonExistentParent = "/var/empty/scarf-test-no-such-dir-\(UUID().uuidString)"
let ctx = ServerContext(
id: UUID(),
displayName: "broken",
kind: .ssh(SSHConfig(host: "fake.invalid", remoteHome: nonExistentParent))
)
let backend = RemoteSQLiteBackend(context: ctx, transport: LocalSQLite3Transport())
let opened = await backend.open()
#expect(opened == false)
let err = await backend.lastOpenError
#expect(err != nil)
#expect(!(err ?? "").isEmpty)
}
}
#endif // canImport(SQLite3)
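// Illustrative sketch only: the tilde handling the regression test above
// exercises, assuming a simple rewrite-then-double-quote shape. sqlite3 does
// not expand "~" itself, so a default "~/.hermes/state.db" path has to be
// rewritten so the shell expands it; double quotes keep the $HOME expansion
// while still protecting spaces. The production helpers
// (RemoteSQLiteBackend.quoteForRemoteShell / rewriteHomeRelative) may differ.
func sketchQuoteForRemoteShell(_ path: String) -> String {
    if path == "~" { return "\"$HOME\"" }
    if path.hasPrefix("~/") {
        return "\"$HOME/\(path.dropFirst(2))\""
    }
    return "\"\(path)\""
}
// sketchQuoteForRemoteShell("~/.hermes/state.db") == "\"$HOME/.hermes/state.db\""
// sketchQuoteForRemoteShell("/srv/hermes/state.db") == "\"/srv/hermes/state.db\""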
@@ -0,0 +1,147 @@
import Testing
import Foundation
@testable import ScarfCore
/// Pure unit tests on `SQLValueInliner.inline(_:params:)` and
/// `SQLValueInliner.encode(_:)`. No backend, no transport, no actor
/// these are the lexical-substitution rules that drive the remote
/// SQLite backend's `?` literal pipeline.
@Suite struct SQLValueInlinerTests {
// MARK: - encode(_:) per SQLValue case
@Test func encodeNullProducesNULL() {
#expect(SQLValueInliner.encode(.null) == "NULL")
}
@Test func encodeIntegerProducesUnquotedDigits() {
#expect(SQLValueInliner.encode(.integer(42)) == "42")
#expect(SQLValueInliner.encode(.integer(-7)) == "-7")
#expect(SQLValueInliner.encode(.integer(0)) == "0")
#expect(SQLValueInliner.encode(.integer(Int64.max)) == "9223372036854775807")
}
@Test func encodeRealUsesPercent17gFormat() {
// %.17g round-trips a Double precisely as decimal. Verify the
// formatted string parses back to the exact same Double.
let original: Double = 3.14
let encoded = SQLValueInliner.encode(.real(original))
#expect(encoded == String(format: "%.17g", original))
// Round-trip: encoded value re-parsed must equal the source.
#expect(Double(encoded) == original)
// Tricky case: 0.1 + 0.2 has imprecise binary representation.
let imprecise = 0.1 + 0.2
let encodedImprecise = SQLValueInliner.encode(.real(imprecise))
#expect(Double(encodedImprecise) == imprecise)
}
@Test func encodeTextWrapsInSingleQuotes() {
#expect(SQLValueInliner.encode(.text("hi")) == "'hi'")
#expect(SQLValueInliner.encode(.text("")) == "''")
}
@Test func encodeTextDoublesEmbeddedSingleQuotes() {
// SQL literal escape: `it's` becomes `'it''s'`.
#expect(SQLValueInliner.encode(.text("it's")) == "'it''s'")
// Multiple embedded quotes: each one is doubled.
#expect(SQLValueInliner.encode(.text("a'b'c")) == "'a''b''c'")
// The classic injection-shaped value gets escaped to harmless.
#expect(SQLValueInliner.encode(.text("' OR 1=1 --")) == "''' OR 1=1 --'")
}
@Test func encodeBlobProducesHexLiteral() {
// Two-byte blob: `X'dead'`.
#expect(SQLValueInliner.encode(.blob(Data([0xde, 0xad]))) == "X'dead'")
// Empty blob: `X''`.
#expect(SQLValueInliner.encode(.blob(Data())) == "X''")
// Lowercase hex, full byte range, with leading zero preserved.
#expect(SQLValueInliner.encode(.blob(Data([0x00, 0x0f, 0xff]))) == "X'000fff'")
}
// MARK: - inline(_:params:) substitution rules
@Test func inlineSubstitutesPlaceholdersInOrder() {
let out = SQLValueInliner.inline(
"INSERT INTO t VALUES (?, ?, ?)",
params: [.integer(1), .text("two"), .real(3.0)]
)
// Order is preserved: integer 1, text 'two', real 3.0.
#expect(out.hasPrefix("INSERT INTO t VALUES ("))
#expect(out.contains("1"))
#expect(out.contains("'two'"))
// Real 3.0 should round-trip via %.17g.
let real3 = String(format: "%.17g", 3.0)
#expect(out.contains(real3))
}
@Test func inlineSkipsPlaceholderInsideStringLiteral() {
// The `?` inside `'?'` is part of a string and must not be bound.
// Only the trailing `?` (outside the quotes) consumes the param.
let out = SQLValueInliner.inline(
"WHERE name = '?' AND id = ?",
params: [.integer(7)]
)
#expect(out == "WHERE name = '?' AND id = 7")
}
@Test func inlineSkipsPlaceholderInsideDoubleQuotedIdentifier() {
// Double-quoted identifiers (column / table names with special chars)
// are also a quoted region; a `?` inside them is literal.
let out = SQLValueInliner.inline(
"SELECT \"col?\" FROM t WHERE x = ?",
params: [.integer(1)]
)
#expect(out == "SELECT \"col?\" FROM t WHERE x = 1")
}
@Test func inlineHandlesDoubledSingleQuoteEscapeInString() {
// `'it''s ?'` is a single SQL string literal containing `it's ?`.
// The doubled single-quote is the SQL escape for an embedded
// apostrophe; the scanner must NOT toggle out of string state
// at the doubled quote, and the trailing `?` is inside the string.
// No params consumed.
let out = SQLValueInliner.inline(
"WHERE x = 'it''s ?'",
params: []
)
#expect(out == "WHERE x = 'it''s ?'")
}
@Test func inlineSelectShapeMatchesDataServicePattern() {
// Sanity check: the SELECT shape that `HermesDataService.fetchSessions`
// generates inlines cleanly for the typical `[.integer(100)]`
// limit param.
let sql = "SELECT id, source FROM sessions WHERE parent_session_id IS NULL ORDER BY started_at DESC LIMIT ?"
let out = SQLValueInliner.inline(sql, params: [.integer(100)])
#expect(out == "SELECT id, source FROM sessions WHERE parent_session_id IS NULL ORDER BY started_at DESC LIMIT 100")
}
@Test func inlineWithNoPlaceholdersReturnsInputUnchanged() {
let sql = "SELECT COUNT(*) FROM messages"
#expect(SQLValueInliner.inline(sql, params: []) == sql)
}
@Test func inlinePreservesAllOtherCharacters() {
// Make sure we're not mangling whitespace, semicolons, parens.
let sql = " SELECT *\n FROM t WHERE id = ? ; "
let out = SQLValueInliner.inline(sql, params: [.integer(5)])
#expect(out == " SELECT *\n FROM t WHERE id = 5 ; ")
}
@Test func inlineSubstitutesNullPlaceholder() {
let out = SQLValueInliner.inline(
"UPDATE t SET col = ? WHERE id = ?",
params: [.null, .integer(1)]
)
#expect(out == "UPDATE t SET col = NULL WHERE id = 1")
}
@Test func inlineSubstitutesBlobPlaceholder() {
let out = SQLValueInliner.inline(
"INSERT INTO t (data) VALUES (?)",
params: [.blob(Data([0x01, 0x02, 0x03]))]
)
#expect(out == "INSERT INTO t (data) VALUES (X'010203')")
}
}
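import Foundation

// Illustrative sketch only: the quote-aware "?" substitution the tests above
// describe, with SketchSQLValue standing in for the real SQLValue enum. The
// encode rules follow what the encode(_:) tests pin (NULL / bare digits /
// %.17g / doubled single quotes / X'..' hex); the production SQLValueInliner
// may be structured differently. Assumes one param per unquoted "?".
enum SketchSQLValue {
    case null, integer(Int64), real(Double), text(String), blob(Data)

    var literal: String {
        switch self {
        case .null: return "NULL"
        case .integer(let n): return String(n)
        case .real(let d): return String(format: "%.17g", d)
        case .text(let s): return "'" + s.replacingOccurrences(of: "'", with: "''") + "'"
        case .blob(let data): return "X'" + data.map { String(format: "%02x", $0) }.joined() + "'"
        }
    }
}

func sketchInline(_ sql: String, params: [SketchSQLValue]) -> String {
    var out = ""
    var next = 0
    var inSingle = false
    var inDouble = false
    var i = sql.startIndex
    while i < sql.endIndex {
        let c = sql[i]
        let after = sql.index(after: i)
        if inSingle {
            // A doubled '' is an escaped quote: emit both, stay in the string.
            if c == "'", after < sql.endIndex, sql[after] == "'" {
                out += "''"
                i = sql.index(after: after)
                continue
            }
            if c == "'" { inSingle = false }
            out.append(c)
        } else if inDouble {
            if c == "\"" { inDouble = false }
            out.append(c)
        } else if c == "'" {
            inSingle = true
            out.append(c)
        } else if c == "\"" {
            inDouble = true
            out.append(c)
        } else if c == "?" {
            // Only placeholders outside quoted regions consume a param.
            out += params[next].literal
            next += 1
        } else {
            out.append(c)
        }
        i = sql.index(after: i)
    }
    return out
}
// sketchInline("WHERE name = '?' AND id = ?", params: [.integer(7)])
//   == "WHERE name = '?' AND id = 7"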
@@ -0,0 +1,202 @@
import Testing
import Foundation
@testable import ScarfCore
/// `.serialized` because every test that exercises the wrappers
/// (`measure`, `measureAsync`, `event`) installs and uninstalls the
/// process-wide backend set, and parallel tests would race on that
/// shared state. Tests of the ring buffer in isolation don't need
/// serialization, but the suite-level annotation is the simplest way
/// to keep the global-state ones honest.
@Suite(.serialized) struct ScarfMonTests {
/// Ring-buffer ordering: fewer than capacity, no wrap.
@Test func ringBufferKeepsOrderBeforeWrap() {
let ring = ScarfMonRingBuffer(capacity: 8)
ring.record(.fixture(name: "a"))
ring.record(.fixture(name: "b"))
ring.record(.fixture(name: "c"))
let names = ring.samples().map { $0.name.description }
#expect(names == ["a", "b", "c"])
}
/// Ring-buffer wrap-around: the oldest entries are dropped, the
/// newest entries appear at the end.
@Test func ringBufferWrapsCorrectly() {
let ring = ScarfMonRingBuffer(capacity: 4)
ring.record(.fixture(name: "a"))
ring.record(.fixture(name: "b"))
ring.record(.fixture(name: "c"))
ring.record(.fixture(name: "d"))
ring.record(.fixture(name: "e"))
ring.record(.fixture(name: "f"))
let names = ring.samples().map { $0.name.description }
#expect(names == ["c", "d", "e", "f"])
}
/// Reset clears the buffer and resets wrap state; subsequent reads
/// see only post-reset entries.
@Test func ringBufferResetClearsState() {
let ring = ScarfMonRingBuffer(capacity: 4)
ring.record(.fixture(name: "a"))
ring.record(.fixture(name: "b"))
ring.record(.fixture(name: "c"))
ring.record(.fixture(name: "d"))
ring.record(.fixture(name: "e"))
ring.reset()
ring.record(.fixture(name: "x"))
let names = ring.samples().map { $0.name.description }
#expect(names == ["x"])
}
/// Summary aggregates per (category, name) and computes percentiles.
@Test func summaryAggregatesByCategoryAndName() {
let ring = ScarfMonRingBuffer(capacity: 16)
// Three "fast" intervals + two "slow" intervals on the same key.
for nanos: UInt64 in [1_000_000, 2_000_000, 3_000_000, 50_000_000, 100_000_000] {
ring.record(.fixture(name: "render", durationNanos: nanos))
}
let stats = ring.summary()
#expect(stats.count == 1)
let s = stats[0]
#expect(s.count == 5)
#expect(s.totalNanos == 156_000_000)
// Nearest-rank p95 with 5 samples picks the 5th sorted value
// (rank = ceil(5 * 0.95) = 5).
#expect(s.p95Nanos == 100_000_000)
// p50 with 5 samples picks the 3rd sorted value.
#expect(s.p50Nanos == 3_000_000)
}
/// Events accumulate count + bytes without contributing to interval
/// percentiles.
@Test func eventsAccumulateBytesNotDuration() {
let ring = ScarfMonRingBuffer(capacity: 16)
ring.record(ScarfMon.Sample(
category: .chatStream, name: "token", kind: .event,
timestamp: Date(), durationNanos: 0, count: 1, bytes: 256
))
ring.record(ScarfMon.Sample(
category: .chatStream, name: "token", kind: .event,
timestamp: Date(), durationNanos: 0, count: 1, bytes: 128
))
let stats = ring.summary()
#expect(stats.count == 1)
#expect(stats[0].count == 2)
#expect(stats[0].totalBytes == 384)
#expect(stats[0].p95Nanos == 0)
}
/// `isActive` flips off when the backend set is empty so the
/// hot-path short-circuit kicks in.
@Test func installEmptyBackendsDeactivates() {
ScarfMon.install([])
#expect(ScarfMon.isActive == false)
ScarfMon.install([ScarfMonRingBuffer(capacity: 4)])
#expect(ScarfMon.isActive == true)
ScarfMon.install([])
}
/// `measure` records a duration into every installed backend.
@Test func measureFlowsThroughInstalledBackends() throws {
let ring = ScarfMonRingBuffer(capacity: 8)
ScarfMon.install([ring])
defer { ScarfMon.install([]) }
let result: Int = ScarfMon.measure(.render, "unit") {
return 42
}
#expect(result == 42)
let samples = ring.samples()
#expect(samples.count == 1)
#expect(samples[0].kind == .interval)
#expect(samples[0].name.description == "unit")
}
/// `measureAsync` records duration even when the body throws; the
/// `defer` in the wrapper must fire on rethrow.
@Test func measureAsyncRecordsDurationEvenOnThrow() async {
struct Boom: Error {}
let ring = ScarfMonRingBuffer(capacity: 8)
ScarfMon.install([ring])
defer { ScarfMon.install([]) }
await #expect(throws: Boom.self) {
try await ScarfMon.measureAsync(.chatStream, "throws") {
throw Boom()
}
}
let samples = ring.samples()
#expect(samples.count == 1)
#expect(samples[0].name.description == "throws")
}
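// Illustrative sketch only, not the production ScarfMon API: why the
// duration still lands in the backends when the body rethrows. The
// recording sits in a `defer`, so it fires on both the return and the
// throw path; `record` stands in for "append to every installed backend".
private func sketchMeasureAsync<T>(
    _ name: String,
    record: (String, TimeInterval) -> Void,
    _ body: () async throws -> T
) async rethrows -> T {
    let start = Date()
    defer { record(name, Date().timeIntervalSince(start)) }
    return try await body()
}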
/// `event(...)` records a count entry without taking a clock reading.
@Test func eventRecordsCountSample() {
let ring = ScarfMonRingBuffer(capacity: 8)
ScarfMon.install([ring])
defer { ScarfMon.install([]) }
ScarfMon.event(.chatStream, "token", count: 1, bytes: 32)
let samples = ring.samples()
#expect(samples.count == 1)
#expect(samples[0].kind == .event)
#expect(samples[0].count == 1)
#expect(samples[0].bytes == 32)
#expect(samples[0].durationNanos == 0)
}
/// Boot configure flips the active backend set without leaking
/// across tests.
@Test func bootConfigureModesInstallExpectedBackends() {
defer { ScarfMon.install([]) }
ScarfMonBoot.configure(mode: .off)
#expect(ScarfMon.currentBackends.isEmpty)
#expect(ScarfMonBoot.sharedRingBuffer == nil)
ScarfMonBoot.configure(mode: .signpostOnly)
#expect(ScarfMon.currentBackends.count == 1)
#expect(ScarfMonBoot.sharedRingBuffer == nil)
let ring = ScarfMonBoot.configure(mode: .full)
#expect(ring != nil)
#expect(ScarfMon.currentBackends.count == 3)
#expect(ScarfMonBoot.sharedRingBuffer === ring)
}
/// JSON export round-trips through `JSONSerialization`, which proves the
/// per-line format is valid JSON the user can paste into a feedback
/// tool.
@Test func exportJSONIsParseable() throws {
let ring = ScarfMonRingBuffer(capacity: 8)
ring.record(.fixture(name: "a", durationNanos: 1_500_000))
ring.record(ScarfMon.Sample(
category: .chatStream, name: "token", kind: .event,
timestamp: Date(), durationNanos: 0, count: 1, bytes: 64
))
let json = ring.exportJSON()
let data = json.data(using: .utf8)!
let parsed = try JSONSerialization.jsonObject(with: data, options: [])
let arr = parsed as? [[String: Any]]
#expect(arr?.count == 2)
}
}
private extension ScarfMon.Sample {
static func fixture(
category: ScarfMon.Category = .render,
name: StaticString,
durationNanos: UInt64 = 1_000_000
) -> ScarfMon.Sample {
ScarfMon.Sample(
category: category,
name: name,
kind: .interval,
timestamp: Date(),
durationNanos: durationNanos,
count: 1,
bytes: nil
)
}
}
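// Illustrative sketch only: the nearest-rank percentile rule the summary
// test above relies on (rank = ceil(n * p), 1-based into the sorted
// samples). The production ScarfMonRingBuffer.summary() may compute this
// differently; the sketch just shows why p95 of five samples is the 5th
// sorted value and p50 is the 3rd.
func sketchNearestRank(_ sortedNanos: [UInt64], percentile: Double) -> UInt64 {
    precondition(!sortedNanos.isEmpty)
    let rank = Int((Double(sortedNanos.count) * percentile).rounded(.up))
    return sortedNanos[max(rank, 1) - 1]
}
// sketchNearestRank([1_000_000, 2_000_000, 3_000_000, 50_000_000, 100_000_000], percentile: 0.95)
//   == 100_000_000   // rank ceil(5 * 0.95) = 5, so the 5th sorted value
// sketchNearestRank([1_000_000, 2_000_000, 3_000_000, 50_000_000, 100_000_000], percentile: 0.50)
//   == 3_000_000     // rank ceil(5 * 0.50) = 3, so the 3rd sorted value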
@@ -0,0 +1,312 @@
import Testing
import Foundation
@testable import ScarfCore
/// Pure-logic tests for the marker-block splice helpers in
/// `SecretsEnvBlock`. No Keychain access, no filesystem I/O just
/// strings in, strings out. The Mac-side `KeychainEnvMirror` wraps
/// these with Keychain resolution + transport-aware writes; that
/// integration is covered separately in `KeychainEnvMirrorTests`.
@Suite("SecretsEnvBlock")
struct SecretsEnvBlockTests {
// MARK: - envKeyName
@Test func envKeyNameStandardCase() {
#expect(
SecretsEnvBlock.envKeyName(slug: "local-news", fieldKey: "api_token")
== "SCARF_LOCAL_NEWS_API_TOKEN"
)
}
@Test func envKeyNameNonAlphanumericChars() {
// Dashes, underscores, dots, spaces all fold to single underscores.
#expect(
SecretsEnvBlock.envKeyName(slug: "foo.bar baz", fieldKey: "x-y-z")
== "SCARF_FOO_BAR_BAZ_X_Y_Z"
)
}
@Test func envKeyNameRunsCollapse() {
// Three consecutive special chars produce a single underscore,
// not three.
#expect(
SecretsEnvBlock.envKeyName(slug: "foo---bar", fieldKey: "a__b")
== "SCARF_FOO_BAR_A_B"
)
}
@Test func envKeyNameLeadingTrailingTrim() {
// Leading/trailing dashes on the slug shouldn't produce
// SCARF__... or trailing _ in the result.
let key = SecretsEnvBlock.envKeyName(slug: "-foo-", fieldKey: "-bar-")
#expect(key == "SCARF_FOO_BAR")
#expect(!key.hasSuffix("_"))
#expect(!key.contains("__"))
}
@Test func envKeyNameAllSymbolsFallsBackToUnnamed() {
// Pathological input: slug is all special chars. Sanitizer
// emits `UNNAMED` rather than the empty string, so the env
// var name is still parseable.
#expect(
SecretsEnvBlock.envKeyName(slug: "!!!", fieldKey: "...")
== "SCARF_UNNAMED_UNNAMED"
)
}
// MARK: - renderBlock
@Test func renderBlockEmptyEntriesReturnsEmpty() {
// An empty entries list is the documented "use removeBlock instead"
// sentinel; renderBlock should not produce a block with
// dangling markers.
let result = SecretsEnvBlock.renderBlock(slug: "foo", entries: [])
#expect(result.isEmpty)
}
@Test func renderBlockSortsEntries() {
// Output is deterministic regardless of input order, so two
// runs with the same logical content produce byte-identical
// bytes; load-bearing for the no-op-when-unchanged check
// in the mirror's writeIfChanged.
let aFirst = SecretsEnvBlock.renderBlock(
slug: "foo",
entries: [("ALPHA", "1"), ("BRAVO", "2")]
)
let bFirst = SecretsEnvBlock.renderBlock(
slug: "foo",
entries: [("BRAVO", "2"), ("ALPHA", "1")]
)
#expect(aFirst == bFirst)
// Sanity: ALPHA precedes BRAVO in the output regardless of
// insertion order.
let alphaIdx = aFirst.range(of: "ALPHA")
let bravoIdx = aFirst.range(of: "BRAVO")
#expect(alphaIdx != nil && bravoIdx != nil)
#expect(alphaIdx!.lowerBound < bravoIdx!.lowerBound)
}
@Test func renderBlockEmitsMarkersAroundEntries() {
let result = SecretsEnvBlock.renderBlock(
slug: "site-status-checker",
entries: [("SCARF_SITE_STATUS_CHECKER_TOKEN", "abc")]
)
#expect(result.hasPrefix("# scarf-secrets:begin site-status-checker"))
#expect(result.hasSuffix("# scarf-secrets:end site-status-checker"))
#expect(result.contains("SCARF_SITE_STATUS_CHECKER_TOKEN=abc"))
}
@Test func renderBlockQuotesValuesWithWhitespace() {
let result = SecretsEnvBlock.renderBlock(
slug: "x",
entries: [("KEY", "hello world")]
)
// Whitespace forces single-quoting (dotenv canonical) so the
// value survives shell expansion and dotenv parsing.
#expect(result.contains("KEY='hello world'"))
}
@Test func renderBlockQuotesValuesWithSpecialChars() {
let cases: [(input: String, mustContain: String)] = [
("a#b", "KEY='a#b'"), // # is dotenv comment marker
("a$b", "KEY='a$b'"), // $ is shell expansion
("a\"b", "KEY='a\"b'"), // " conflicts with double-quote literal
("a\\b", "KEY='a\\b'"), // backslash needs escaping
]
for (input, mustContain) in cases {
let result = SecretsEnvBlock.renderBlock(
slug: "x",
entries: [("KEY", input)]
)
#expect(
result.contains(mustContain),
"value '\(input)' produced wrong escaping: \(result)"
)
}
}
@Test func renderBlockEscapesSingleQuotesViaCloseReopen() {
// A literal single quote inside a single-quoted string is
// dotenv-encoded as `'\''` (close, escape, reopen), the
// canonical sh/dotenv pattern.
let result = SecretsEnvBlock.renderBlock(
slug: "x",
entries: [("KEY", "it's fine")]
)
#expect(result.contains("KEY='it'\\''s fine'"))
}
@Test func renderBlockLeavesPlainValuesUnquoted() {
// No-special-chars values stay unquoted: readability + matches
// the convention Hermes's existing ANTHROPIC_API_KEY entries
// follow.
let result = SecretsEnvBlock.renderBlock(
slug: "x",
entries: [("KEY", "abc-123_def")]
)
#expect(result.contains("\nKEY=abc-123_def\n"))
#expect(!result.contains("KEY='abc-123_def'"))
}
// MARK: - applyBlock
@Test func applyBlockToEmptyFile() {
let block = sampleBlock(slug: "foo", entries: [("KEY", "value")])
let result = SecretsEnvBlock.applyBlock(block, forSlug: "foo", to: "")
#expect(result == block + "\n")
}
@Test func applyBlockToWhitespaceOnlyFile() {
let block = sampleBlock(slug: "foo", entries: [("KEY", "value")])
let result = SecretsEnvBlock.applyBlock(block, forSlug: "foo", to: " \n \n")
// Whitespace-only is treated like empty: block + newline, no
// attempt to preserve the leading whitespace.
#expect(result == block + "\n")
}
@Test func applyBlockAppendsToFileWithUserContent() {
let existing = "ANTHROPIC_API_KEY=sk-test\nOPENAI_API_KEY=sk-other\n"
let block = sampleBlock(slug: "foo", entries: [("KEY", "value")])
let result = SecretsEnvBlock.applyBlock(block, forSlug: "foo", to: existing)
// User content is preserved at the top.
#expect(result.hasPrefix("ANTHROPIC_API_KEY=sk-test"))
#expect(result.contains("OPENAI_API_KEY=sk-other"))
// Block appended after a blank-line separator.
#expect(result.contains("OPENAI_API_KEY=sk-other\n\n# scarf-secrets:begin foo"))
// And ends with a trailing newline.
#expect(result.hasSuffix("\n"))
}
@Test func applyBlockReplacesExistingBlockForSameSlug() {
let oldBlock = sampleBlock(slug: "foo", entries: [("KEY", "old")])
let newBlock = sampleBlock(slug: "foo", entries: [("KEY", "new")])
let existing = "USER_VAR=something\n\n" + oldBlock + "\n"
let result = SecretsEnvBlock.applyBlock(newBlock, forSlug: "foo", to: existing)
#expect(result.contains("KEY=new"))
#expect(!result.contains("KEY=old"))
// User content above the block is preserved.
#expect(result.contains("USER_VAR=something"))
}
@Test func applyBlockPreservesOtherSlugBlocks() {
// The most important invariant: multiple project blocks
// coexist in one file and editing one mustn't disturb the
// other.
let blockA = sampleBlock(slug: "alpha", entries: [("A_KEY", "1")])
let blockB = sampleBlock(slug: "bravo", entries: [("B_KEY", "2")])
let existing = blockA + "\n\n" + blockB + "\n"
let updatedA = sampleBlock(slug: "alpha", entries: [("A_KEY", "1-updated")])
let result = SecretsEnvBlock.applyBlock(updatedA, forSlug: "alpha", to: existing)
// A was updated.
#expect(result.contains("A_KEY=1-updated"))
#expect(!result.contains("A_KEY=1\n"))
// B is byte-identical.
#expect(result.contains(blockB))
}
@Test func applyBlockIdempotent() {
// Applying the output of one call back through applyBlock
// with the same inputs produces the same string. Critical
// for the launch reconciler; a no-op pass shouldn't keep
// mutating the file.
let block = sampleBlock(slug: "foo", entries: [("KEY", "value")])
let existing = "USER_VAR=x\n"
let once = SecretsEnvBlock.applyBlock(block, forSlug: "foo", to: existing)
let twice = SecretsEnvBlock.applyBlock(block, forSlug: "foo", to: once)
#expect(once == twice)
}
@Test func applyBlockEmptyBlockBehavesLikeRemove() {
// Documented behaviour: passing an empty block is the same as
// calling removeBlock; the splice path uses this when a
// project's secrets are all cleared.
let block = sampleBlock(slug: "foo", entries: [("KEY", "value")])
let withBlock = "USER=x\n\n" + block + "\n"
let viaApply = SecretsEnvBlock.applyBlock("", forSlug: "foo", to: withBlock)
let viaRemove = SecretsEnvBlock.removeBlock(forSlug: "foo", from: withBlock)
#expect(viaApply == viaRemove)
}
// MARK: - removeBlock
@Test func removeBlockNoOpWhenAbsent() {
let existing = "USER_VAR=hello\nOTHER=world\n"
let result = SecretsEnvBlock.removeBlock(forSlug: "foo", from: existing)
#expect(result == existing)
}
@Test func removeBlockStripsBlockOnly() {
let block = sampleBlock(slug: "foo", entries: [("KEY", "value")])
let existing = "USER_VAR=x\n\n" + block + "\n\nMORE_USER=y\n"
let result = SecretsEnvBlock.removeBlock(forSlug: "foo", from: existing)
#expect(!result.contains("scarf-secrets"))
#expect(result.contains("USER_VAR=x"))
#expect(result.contains("MORE_USER=y"))
}
@Test func removeBlockCollapsesAppendedBlankLineSeparator() {
// Round-trip: append a block, then remove it. The blank line
// we inserted at append time should be absorbed so repeated
// install/uninstall cycles don't accumulate blank lines.
let block = sampleBlock(slug: "foo", entries: [("KEY", "value")])
let original = "USER_VAR=x\n"
let appended = SecretsEnvBlock.applyBlock(block, forSlug: "foo", to: original)
let removed = SecretsEnvBlock.removeBlock(forSlug: "foo", from: appended)
// Removed content should be very close to the original; at
// most one trailing newline difference. No accumulation of
// blank lines across the cycle.
#expect(removed.trimmingCharacters(in: .whitespacesAndNewlines)
== original.trimmingCharacters(in: .whitespacesAndNewlines))
}
// MARK: - Slug-prefix collision
@Test func slugPrefixCollisionIsolated() {
// A file with both `foo` and `foo-bar` blocks; editing `foo`
// must not match the `foo-bar` markers as a prefix-substring
// of the begin-line.
let blockShort = sampleBlock(slug: "foo", entries: [("SHORT", "1")])
let blockLong = sampleBlock(slug: "foo-bar", entries: [("LONG", "2")])
let existing = blockShort + "\n\n" + blockLong + "\n"
let updatedShort = sampleBlock(slug: "foo", entries: [("SHORT", "1-updated")])
let result = SecretsEnvBlock.applyBlock(updatedShort, forSlug: "foo", to: existing)
// Short was updated.
#expect(result.contains("SHORT=1-updated"))
#expect(!result.contains("SHORT=1\n"))
// Long block is byte-identical.
#expect(result.contains(blockLong))
// Both markers still present, exactly once each.
#expect(occurrences(of: "# scarf-secrets:begin foo\n", in: result) == 1)
#expect(occurrences(of: "# scarf-secrets:begin foo-bar\n", in: result) == 1)
}
@Test func removeBlockRespectsSlugPrefixIsolation() {
let blockShort = sampleBlock(slug: "foo", entries: [("SHORT", "1")])
let blockLong = sampleBlock(slug: "foo-bar", entries: [("LONG", "2")])
let existing = blockShort + "\n\n" + blockLong + "\n"
let result = SecretsEnvBlock.removeBlock(forSlug: "foo", from: existing)
// foo gone, foo-bar preserved byte-identically.
#expect(!result.contains("SHORT=1"))
#expect(result.contains(blockLong))
}
// MARK: - Helpers
private func sampleBlock(
slug: String,
entries: [(key: String, value: String)]
) -> String {
SecretsEnvBlock.renderBlock(slug: slug, entries: entries)
}
private func occurrences(of needle: String, in haystack: String) -> Int {
var count = 0
var search = haystack.startIndex
while let range = haystack.range(of: needle, range: search..<haystack.endIndex) {
count += 1
search = range.upperBound
}
return count
}
}
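// Illustrative sketch only: the env-var name sanitization the envKeyName
// tests above pin (uppercase, fold every run of non-alphanumerics to a
// single "_", trim leading/trailing "_", fall back to "UNNAMED" for
// all-symbol input, prefix with SCARF_). The production SecretsEnvBlock
// helper may be implemented differently.
func sketchEnvKeyName(slug: String, fieldKey: String) -> String {
    func sanitize(_ s: String) -> String {
        var out = ""
        var lastWasSeparator = true   // also swallows a leading separator run
        for ch in s.uppercased() {
            if ch.isLetter || ch.isNumber {
                out.append(ch)
                lastWasSeparator = false
            } else if !lastWasSeparator {
                out.append("_")
                lastWasSeparator = true
            }
        }
        while out.hasSuffix("_") { out.removeLast() }
        return out.isEmpty ? "UNNAMED" : out
    }
    return "SCARF_\(sanitize(slug))_\(sanitize(fieldKey))"
}
// sketchEnvKeyName(slug: "local-news", fieldKey: "api_token") == "SCARF_LOCAL_NEWS_API_TOKEN"
// sketchEnvKeyName(slug: "!!!", fieldKey: "...")              == "SCARF_UNNAMED_UNNAMED"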
@@ -58,6 +58,9 @@ public final class CitadelServerTransport: ServerTransport, @unchecked Sendable
/// Shared directory under which cached SQLite snapshots land. On
/// iOS this maps to `<Caches>/scarf/snapshots/<server-id>/`.
/// Stable per-server cache directory. Was used by the snapshot
/// pipeline pre-v2.7; kept for the cache-cleanup migration that
/// purges old snapshot files at first launch on the new build.
private let snapshotBaseDir: URL
/// Actor-serialized access to the one shared `SSHClient`. Opens
@@ -159,19 +162,108 @@ public final class CitadelServerTransport: ServerTransport, @unchecked Sendable
AsyncThrowingStream { $0.finish() }
}
// MARK: - ServerTransport: SQLite snapshot
// MARK: - ServerTransport: script streaming
public func snapshotSQLite(remotePath: String) throws -> URL {
try runSync { try await self.asyncSnapshotSQLite(remotePath: remotePath) }
/// Pipe `script` into `/bin/sh` over Citadel's exec channel.
///
/// **Why base64.** Citadel's `executeCommandStream` doesn't expose
/// stdin in the version we're on, so we can't just open `sh -s` and
/// write the script. Instead we encode the script as base64, decode
/// it on the remote inline, and pipe the result into `sh`:
///
/// printf '%s' '<b64>' | base64 -d | /bin/sh
///
/// `base64 -d` is universally available on Linux/macOS. The base64
/// blob travels as a single shell-safe argv token, so multi-line
/// scripts with `"$VAR"` references and nested quotes survive
/// untouched; same correctness guarantee as `SSHScriptRunner`'s
/// stdin-pipe approach.
public func streamScript(_ script: String, timeout: TimeInterval) async throws -> ProcessResult {
try await ScarfMon.measureAsync(.transport, "ssh.streamScript") {
try await _streamScriptImpl(script, timeout: timeout)
}
}
/// Path where the most recent successful snapshot was written;
/// returned even when the SSH connection is currently down. The
/// data service falls back to this when `snapshotSQLite` throws so
/// Dashboard / Sessions / Chat-history stay viewable while the
/// phone is offline.
public var cachedSnapshotPath: URL? {
snapshotBaseDir.appendingPathComponent("state.db")
private func _streamScriptImpl(_ script: String, timeout: TimeInterval) async throws -> ProcessResult {
let scriptBytes = Data(script.utf8)
let b64 = scriptBytes.base64EncodedString()
// Prepend the same PATH guard that `asyncRunProcess` uses so
// base64 + sh resolve on hosts where they live in non-default
// prefixes. Most distros have base64 in /usr/bin but
// homebrew-installed coreutils in /opt/homebrew/bin would
// otherwise be invisible from a stripped-PATH exec channel.
let cmd = "PATH=\"$HOME/.local/bin:/opt/homebrew/bin:/usr/local/bin:$PATH\" "
+ "printf '%s' '\(b64)' | base64 -d | /bin/sh"
return try await runScript(cmd, timeout: timeout)
}
private func runScript(_ cmd: String, timeout: TimeInterval) async throws -> ProcessResult {
let client = try await connectionHolder.ssh()
let stream: AsyncThrowingStream<ExecCommandOutput, Error>
do {
stream = try await client.executeCommandStream(cmd)
} catch {
throw TransportError.other(message: "Failed to start exec stream: \(error.localizedDescription)")
}
// Drain in a child task and race against a sleep so a wedged remote
// sqlite3 (or a mid-stream Citadel transport failure) can't hang the
// caller indefinitely. Mirrors the busy-wait deadline that
// SSHScriptRunner enforces on Mac.
return try await withThrowingTaskGroup(of: ProcessResult?.self) { group in
group.addTask {
var stdout = Data()
var stderr = Data()
var exitCode: Int32 = 0
do {
for try await chunk in stream {
try Task.checkCancellation()
switch chunk {
case .stdout(var buf):
if let s = buf.readString(length: buf.readableBytes) {
stdout.append(Data(s.utf8))
}
case .stderr(var buf):
if let s = buf.readString(length: buf.readableBytes) {
stderr.append(Data(s.utf8))
}
}
}
} catch let failed as SSHClient.CommandFailed {
// A genuine remote non-zero exit surfaces as a
// ProcessResult so the caller's existing exit-code
// handling fires (mapped to BackendError.sqlite by
// RemoteSQLiteBackend).
exitCode = Int32(failed.exitCode)
} catch is CancellationError {
throw TransportError.timeout(seconds: timeout, partialStdout: stdout)
} catch {
// Transport-level failure (host unreachable, channel
// dropped, ControlMaster died, NIO read error). Throw
// as a typed TransportError so RemoteSQLiteBackend
// routes it to BackendError.transport rather than
// misclassifying as a sqlite crash via a fake -1 exit.
throw TransportError.other(
message: "SSH stream failed: \(error.localizedDescription)"
)
}
return ProcessResult(exitCode: exitCode, stdout: stdout, stderr: stderr)
}
group.addTask {
try await Task.sleep(nanoseconds: UInt64(timeout * 1_000_000_000))
return nil
}
guard let first = try await group.next() else {
group.cancelAll()
throw TransportError.other(message: "SSH stream produced no result")
}
group.cancelAll()
if let result = first {
return result
}
// Timeout fired first; the drain task gets cancelled by the
// group cancel above; surface as a typed timeout.
throw TransportError.timeout(seconds: timeout, partialStdout: Data())
}
}
// MARK: - ServerTransport: watching
@@ -180,14 +272,32 @@ public final class CitadelServerTransport: ServerTransport, @unchecked Sendable
// Polling-based, identical in shape to `SSHTransport`'s remote-
// watch fallback: stat each path, yield `.anyChanged` when any
// mtime shifts. 3s tick keeps bandwidth low.
//
// ScarfMon A1 instrumentation:
// - `ios.fileWatcher.tick` (interval): full poll cycle latency,
// includes the SSH stat round-trips. Pre-fix, this is what an
// "out of sync" user is feeling: anything > 1500 ms means
// the channel is congested or the host is slow.
// - `ios.fileWatcher.delta` (event) fires only when the
// signature actually changed. Low ratio (delta count / tick
// count) means we're polling more aggressively than the
// change rate warrants, which opens the door to dropping the 3s
// cadence on LAN.
// - `ios.fileWatcher.paths` (event with bytes=count): number
// of paths watched per cycle, helps explain a slow tick when
// the project list grows.
AsyncStream { continuation in
let task = Task.detached { [weak self] in
var lastSignature = ""
while !Task.isCancelled {
guard let self else { break }
let current = await self.buildWatchSignature(for: paths)
ScarfMon.event(.transport, "ios.fileWatcher.paths", count: 1, bytes: paths.count)
let current = await ScarfMon.measureAsync(.transport, "ios.fileWatcher.tick") {
await self.buildWatchSignature(for: paths)
}
if !current.isEmpty, current != lastSignature {
if !lastSignature.isEmpty {
ScarfMon.event(.transport, "ios.fileWatcher.delta", count: 1)
continuation.yield(.anyChanged)
}
lastSignature = current
@@ -397,101 +507,6 @@ public final class CitadelServerTransport: ServerTransport, @unchecked Sendable
return ProcessResult(exitCode: exitCode, stdout: stdout, stderr: stderr)
}
private func asyncSnapshotSQLite(remotePath: String) async throws -> URL {
// Same flow as SSHTransport: run `sqlite3 .backup` on the remote
// (WAL-safe), flip out of WAL mode on the snapshot, then SFTP
// the backup file down to the local cache.
try? FileManager.default.createDirectory(at: snapshotBaseDir, withIntermediateDirectories: true)
let localURL = snapshotBaseDir.appendingPathComponent("state.db")
let client = try await connectionHolder.ssh()
let remoteTmp = "/tmp/scarf-snapshot-\(UUID().uuidString).db"
// Double-quote paths; $HOME expansion happens inside double quotes.
let rewritten = Self.rewriteHomeRelative(remotePath)
// Prepend the same PATH prefix `asyncRunProcess` uses so `sqlite3`
// resolves on hosts where it lives in /usr/local/bin or
// /opt/homebrew/bin (issue #56). Citadel's bare exec channel
// inherits a stripped PATH (typically `/usr/bin:/bin` on Linux);
// without this, statically-linked or custom-prefix sqlite3
// installs fail with "command not found" at exit 127.
let backupScript =
#"PATH="$HOME/.local/bin:/opt/homebrew/bin:/usr/local/bin:$PATH" "#
+ #"sqlite3 "\#(rewritten)" ".backup '\#(remoteTmp)'" && sqlite3 '\#(remoteTmp)' "PRAGMA journal_mode=DELETE;" > /dev/null"#
// Drive `executeCommandStream` instead of `executeCommand` so we
// capture stderr regardless of exit code (issue #56). Pre-fix
// a non-zero exit threw `CommandFailed` and discarded the buffer
// surfaced as the unhelpful "Citadel.SSHClient.CommandFailed
// error 1" banner. Now we propagate the real stderr so
// `HermesDataService.humanize` can translate "sqlite3: command
// not found" / "no such file" / "permission denied" into the
// dashboard banner with actionable copy.
let stream: AsyncThrowingStream<ExecCommandOutput, Error>
do {
stream = try await client.executeCommandStream(backupScript)
} catch {
throw NSError(
domain: "CitadelServerTransport",
code: -1,
userInfo: [NSLocalizedDescriptionKey: "Failed to start snapshot stream: \(error.localizedDescription)"]
)
}
var stdout = Data()
var stderr = Data()
var exitCode: Int32 = 0
do {
for try await chunk in stream {
switch chunk {
case .stdout(var buf):
if let s = buf.readString(length: buf.readableBytes) {
stdout.append(Data(s.utf8))
}
case .stderr(var buf):
if let s = buf.readString(length: buf.readableBytes) {
stderr.append(Data(s.utf8))
}
}
}
} catch let failed as SSHClient.CommandFailed {
exitCode = Int32(failed.exitCode)
} catch {
stderr.append(Data(error.localizedDescription.utf8))
exitCode = -1
}
if exitCode != 0 {
// Combine stdout + stderr into the error message; sqlite3
// sometimes prints "Error: ..." on stdout depending on the
// remote shell. HermesDataService.humanize keys off
// substrings like "sqlite3: command not found",
// "permission denied", "no such file", so as long as one of
// them ends up in the message we get a useful banner.
let messageBytes = stderr.isEmpty ? stdout : stderr
let message = String(data: messageBytes, encoding: .utf8)?.trimmingCharacters(in: .whitespacesAndNewlines) ?? ""
throw NSError(
domain: "CitadelServerTransport",
code: Int(exitCode),
userInfo: [
NSLocalizedDescriptionKey: message.isEmpty
? "Snapshot exited \(exitCode) with no output (likely sqlite3 missing on remote)"
: message
]
)
}
// SFTP-download the remote tmp into our local snapshot cache.
let sftp = try await connectionHolder.sftp()
let data: Data = try await sftp.withFile(filePath: remoteTmp, flags: [.read]) { file in
let buf = try await file.readAll()
return Data(buffer: buf)
}
try data.write(to: localURL, options: .atomic)
// Best-effort cleanup of the remote tmp.
_ = try? await client.executeCommand("rm -f '\(remoteTmp)'")
return localURL
}
// MARK: - Shell helpers
/// Minimal shell-argument joiner. Handles spaces + quotes; sufficient
@@ -70,10 +70,13 @@ public final class IOSDashboardViewModel {
return
}
stats = await dataService.fetchStats()
recentSessions = await dataService.fetchSessions(limit: 5)
allSessions = await dataService.fetchSessions(limit: 25)
sessionPreviews = await dataService.fetchSessionPreviews(limit: 25)
await ScarfMon.measureAsync(.sessionLoad, "ios.loadDashboard") {
stats = await dataService.fetchStats()
recentSessions = await dataService.fetchSessions(limit: 5)
allSessions = await dataService.fetchSessions(limit: 25)
sessionPreviews = await dataService.fetchSessionPreviews(limit: 25)
}
ScarfMon.event(.sessionLoad, "ios.allSessions.count", count: allSessions.count)
// Attribution lookup (pass-2 UX): load the session→project
// sidecar + project registry once so Dashboard rows can show
@@ -126,6 +129,7 @@ public final class IOSDashboardViewModel {
/// Called from the pull-to-refresh gesture.
public func refresh() async {
ScarfMon.event(.sessionLoad, "ios.dashboardRefresh.trigger", count: 1)
await load()
}
}
@@ -14,6 +14,14 @@ struct ScarfIOSApp: App {
)
init() {
// ScarfMon open-source perf instrumentation. Reads the
// user-toggled mode from UserDefaults and installs the
// matching backend set. Default is `.signpostOnly` so
// Instruments-attached profiling works without users having
// to opt in. The Diagnostics Performance row in Settings
// flips this between off / signpost-only / full.
ScarfMonBoot.configure(mode: ScarfMonBoot.currentMode())
// Wire ScarfCore's transport factory to produce Citadel-backed
// `ServerTransport`s for every `.ssh` context. Without this,
// `ServerContext.makeTransport()` would fall back to the
@@ -66,7 +66,12 @@ struct ChatView: View {
)!
var body: some View {
VStack(spacing: 0) {
// ScarfMon body-evaluation counter. Re-render churn during
// streaming is one of the load-bearing perf signals; rendering
// here costs ~one signpost emit + ring-buffer append (off the
// hot path otherwise).
let _: Void = ScarfMon.event(.chatRender, "ios.ChatView.body")
return VStack(spacing: 0) {
connectionBanner
errorBanner
projectContextBar
@@ -395,7 +400,21 @@ struct ChatView: View {
showSpinner: false
)
default:
EmptyView()
// v2.7: surface "Thinking" while the agent's thought
// stream is in flight without any visible message bytes.
// Hermes reasoning models commonly take 3-8 s here and
// the streaming bubble has nothing to render; the user
// would otherwise see a stalled transcript. Disappears
// the moment the first message chunk arrives.
if controller.vm.isStreamingThoughtsOnly {
connectionBannerStrip(
text: "Thinking…",
tint: ScarfColor.info,
showSpinner: true
)
} else {
EmptyView()
}
}
}
@@ -448,14 +467,15 @@ struct ChatView: View {
}
private var composer: some View {
VStack(alignment: .leading, spacing: 4) {
VStack(alignment: .leading, spacing: ScarfSpace.s2) {
if !controller.attachments.isEmpty || isEncodingAttachment || attachmentError != nil {
attachmentStrip
}
composerRow
}
.padding(.horizontal, 12)
.padding(.vertical, 8)
.padding(.horizontal, ScarfSpace.s3)
.padding(.top, ScarfSpace.s2)
.padding(.bottom, ScarfSpace.s2)
.background(.regularMaterial)
#if canImport(PhotosUI)
.photosPicker(
@@ -536,18 +556,23 @@ struct ChatView: View {
}
private var composerRow: some View {
HStack(alignment: .bottom, spacing: 8) {
HStack(alignment: .bottom, spacing: ScarfSpace.s2) {
if supportsImagePrompts {
Button {
showPhotoPicker = true
} label: {
Image(systemName: "paperclip")
.font(.system(size: 22))
.foregroundStyle(.secondary)
.padding(.bottom, 4)
.font(.system(size: 20, weight: .regular))
.foregroundStyle(
attachDisabled
? ScarfColor.foregroundFaint
: ScarfColor.foregroundMuted
)
.frame(width: 44, height: 44)
.contentShape(Rectangle())
}
.buttonStyle(.plain)
.disabled(controller.state != .ready || controller.attachments.count >= Self.maxAttachments)
.disabled(attachDisabled)
.accessibilityLabel("Attach image")
}
TextField(
@@ -555,8 +580,19 @@ struct ChatView: View {
text: $controller.draft,
axis: .vertical
)
.textFieldStyle(.roundedBorder)
.textFieldStyle(.plain)
.lineLimit(1...5)
.padding(.horizontal, ScarfSpace.s3)
.padding(.vertical, ScarfSpace.s2)
.frame(minHeight: 44)
.background(
RoundedRectangle(cornerRadius: ScarfRadius.xl, style: .continuous)
.fill(ScarfColor.backgroundSecondary)
)
.overlay(
RoundedRectangle(cornerRadius: ScarfRadius.xl, style: .continuous)
.strokeBorder(ScarfColor.borderStrong, lineWidth: 1)
)
.disabled(controller.state != .ready)
.submitLabel(.send)
.focused($composerFocused)
@@ -592,13 +628,32 @@ struct ChatView: View {
}
}
// Big circular send button. Filled with the brand accent when
// ready, swapped to a flat gray when disabled. Opacity dimming
// alone read as "not quite tappable" (issue #69); the explicit
// color swap makes the state unambiguous in both light and
// dark mode.
Button {
Task { await controller.send() }
} label: {
Image(systemName: "arrow.up.circle.fill")
.font(.system(size: 28))
ZStack {
Circle()
.fill(canSendComposer
? ScarfColor.accent
: ScarfColor.backgroundTertiary)
Image(systemName: "arrow.up")
.font(.system(size: 18, weight: .semibold))
.foregroundStyle(canSendComposer
? ScarfColor.onAccent
: ScarfColor.foregroundFaint)
}
.frame(width: 44, height: 44)
.contentShape(Circle())
.animation(ScarfAnimation.fast, value: canSendComposer)
}
.buttonStyle(.plain)
.disabled(!canSendComposer)
.accessibilityLabel("Send message")
}
}
@@ -610,6 +665,12 @@ struct ChatView: View {
return !controller.draft.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty
}
/// Mirror of the `.disabled(...)` predicate on the paperclip button.
/// Pulled out so the button's foreground branch reads cleanly.
private var attachDisabled: Bool {
controller.state != .ready || controller.attachments.count >= Self.maxAttachments
}
/// Pull JPEG/PNG bytes out of each PhotosPickerItem and feed them
/// through ImageEncoder. Detached so the heavyweight resize +
/// JPEG-encode work doesn't block MainActor; the resulting
@@ -1041,10 +1102,21 @@ final class ChatController {
/// the start intent so the preflight sheet can replay it after the
/// user picks a model. Reads via `context.readText` (transport-
/// aware) and parses with the ScarfCore YAML parser same path
/// `IOSSettingsViewModel.load` uses, just synchronous because the
/// preflight runs before any `state = .connecting` UI transition.
private func passModelPreflight(intent: PendingStart) -> Bool {
let raw = context.readText(context.paths.configYAML) ?? ""
/// `IOSSettingsViewModel.load` uses.
///
/// **Off MainActor.** `context.readText` synchronously calls
/// `transport.fileExists` + `transport.readFile`; on a remote
/// ScarfGo context that's a blocking SSH round-trip that, before
/// this fix, ran on the controller's `@MainActor` and stalled the
/// UI for seconds during connect, long enough for iOS's
/// non-responsive-app watchdog to kill the process if the user
/// kept tapping (the typing TestFlight crash report). Reading
/// detached pushes the I/O off MainActor; the result and the
/// `pendingStartIntent` / `modelPreflightReason` writes hop back.
private func passModelPreflight(intent: PendingStart) async -> Bool {
let path = context.paths.configYAML
let ctx = context
let raw = await Task.detached { ctx.readText(path) ?? "" }.value
let config = HermesConfig(yaml: raw)
let result = ModelPreflight.check(config)
if result.isConfigured { return true }
@@ -1138,7 +1210,7 @@ final class ChatController {
/// can type and hit send immediately.
func start() async {
if state == .connecting || state == .ready { return }
guard passModelPreflight(intent: .fresh) else { return }
guard await passModelPreflight(intent: .fresh) else { return }
state = .connecting
vm.reset()
let client = ACPClient.forIOSApp(
@@ -1201,6 +1273,12 @@ final class ChatController {
/// assistant reply streams back as ACP notifications handled by
/// the event task.
func send() async {
await ScarfMon.measureAsync(.chatStream, "ios.send") {
await _sendImpl()
}
}
private func _sendImpl() async {
guard state == .ready, let client else { return }
let text = draft.trimmingCharacters(in: .whitespacesAndNewlines)
// v0.12+ allows image-only sends; vision models accept "describe
@@ -1305,7 +1383,10 @@ final class ChatController {
let stream = await client.events
for await event in stream {
guard !Task.isCancelled else { break }
self?.vm.handleACPEvent(event)
ScarfMon.event(.chatStream, "ios.acpEvent", count: 1)
ScarfMon.measure(.chatStream, "ios.handleACPEvent") {
self?.vm.handleACPEvent(event)
}
}
// Stream ended. If we weren't explicitly cancelled, the
// channel died (EOF on stdin/out, write to dead pipe,
@@ -1651,7 +1732,7 @@ final class ChatController {
} else {
intent = .fresh
}
guard passModelPreflight(intent: intent) else { return }
guard await passModelPreflight(intent: intent) else { return }
state = .connecting
let client = ACPClient.forIOSApp(
context: context,
@@ -1735,7 +1816,13 @@ final class ChatController {
/// to `session/load` if the remote doesn't support `session/resume`
/// (Hermes < 0.9.x).
func startResuming(sessionID: String) async {
guard passModelPreflight(intent: .resume(sessionID: sessionID)) else { return }
await ScarfMon.measureAsync(.sessionLoad, "ios.startResuming") {
await _startResumingImpl(sessionID: sessionID)
}
}
private func _startResumingImpl(sessionID: String) async {
guard await passModelPreflight(intent: .resume(sessionID: sessionID)) else { return }
await stop()
vm.reset()
// Clear eagerly so a lingering project name from a prior
@@ -1899,6 +1986,11 @@ private struct MessageBubble: View, Equatable {
}
var body: some View {
// Per-bubble render counter. The streaming bubble
// (`message.id == 0`) re-renders on every chunk; tracking the
// count here is what tells us if a slow chat is bottlenecked
// on body re-eval vs. event-loop delivery.
let _: Void = ScarfMon.event(.chatRender, "ios.MessageBubble.body")
if message.isToolResult {
ToolResultRow(message: message)
} else {
@@ -102,17 +102,31 @@ struct WidgetView: View {
}
private var unsupportedView: some View {
VStack(alignment: .leading, spacing: 4) {
Label(widget.title, systemImage: "questionmark.app.dashed")
.font(.caption)
.foregroundStyle(ScarfColor.foregroundMuted)
Text("Widget type \"\(widget.type)\" isn't supported in this version of Scarf yet.")
VStack(alignment: .leading, spacing: 6) {
HStack(spacing: 6) {
Image(systemName: "exclamationmark.triangle.fill")
.font(.caption)
.foregroundStyle(ScarfColor.warning)
Text(widget.title.isEmpty ? "Widget error" : widget.title)
.font(.caption)
.foregroundStyle(ScarfColor.foregroundMuted)
}
Text("Unknown widget type: \"\(widget.type)\"")
.font(.callout)
.foregroundStyle(.primary)
.fixedSize(horizontal: false, vertical: true)
Text("This Scarf build doesn't render this widget type. Update Scarf or change the widget type in dashboard.json.")
.font(.caption2)
.foregroundStyle(.tertiary)
.fixedSize(horizontal: false, vertical: true)
}
.frame(maxWidth: .infinity, alignment: .leading)
.padding(12)
.background(.quaternary.opacity(0.5))
.background(ScarfColor.warning.opacity(0.08))
.overlay(
RoundedRectangle(cornerRadius: 8)
.strokeBorder(ScarfColor.warning.opacity(0.3), lineWidth: 1)
)
.clipShape(RoundedRectangle(cornerRadius: 8))
}
}
@@ -19,15 +19,7 @@ struct ListWidgetView: View {
}
if let items = widget.items {
ForEach(items) { item in
HStack(spacing: 6) {
Image(systemName: statusIcon(item.status))
.font(.caption2)
.foregroundStyle(statusColor(item.status))
Text(item.text)
.font(.callout)
.strikethrough(item.status == "done")
.foregroundStyle(item.status == "done" ? .secondary : .primary)
}
ListItemRow(item: item)
}
}
}
@@ -36,21 +28,52 @@ struct ListWidgetView: View {
.background(.quaternary.opacity(0.5))
.clipShape(RoundedRectangle(cornerRadius: 8))
}
}
private func statusIcon(_ status: String?) -> String {
switch status {
case "done": return "checkmark.circle.fill"
case "active": return "circle.inset.filled"
case "pending": return "circle"
default: return "circle"
private struct ListItemRow: View {
let item: ListItem
private var typedStatus: ListItemStatus? { ListItemStatus(raw: item.status) }
var body: some View {
HStack(spacing: 6) {
Image(systemName: iconName)
.font(.caption2)
.foregroundStyle(tint)
Text(item.text)
.font(.callout)
.strikethrough(typedStatus == .done)
.foregroundStyle(typedStatus == .done ? .secondary : .primary)
if typedStatus == nil, let raw = item.status, !raw.isEmpty {
Text(raw)
.font(.caption2)
.foregroundStyle(.secondary)
.padding(.horizontal, 6)
.padding(.vertical, 2)
.background(.quaternary.opacity(0.5))
.clipShape(Capsule())
}
}
}
private func statusColor(_ status: String?) -> Color {
switch status {
case "done": return .green
case "active": return .blue
default: return .secondary
private var iconName: String {
switch typedStatus {
case .success, .done: return "checkmark.circle.fill"
case .warning: return "exclamationmark.triangle.fill"
case .danger: return "xmark.octagon.fill"
case .info: return "info.circle.fill"
case .pending: return "circle.dashed"
case .neutral, nil: return "circle"
}
}
private var tint: Color {
switch typedStatus {
case .success, .done: return ScarfColor.success
case .warning: return ScarfColor.warning
case .danger: return ScarfColor.danger
case .info: return ScarfColor.info
case .pending, .neutral, nil: return .secondary
}
}
}
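`ListItemStatus(raw:)` isn't defined anywhere in this diff; a plausible shape inferred from the cases the row uses above (an assumption, not the actual ScarfCore type):
// Inferred sketch; the real ScarfCore definition may differ.
enum ListItemStatus: String {
    case success, done, warning, danger, info, pending, neutral

    /// nil for a missing, empty, or unrecognized raw status so the row
    /// falls back to the neutral icon and shows the raw value as a pill.
    init?(raw: String?) {
        guard let raw = raw?.lowercased(), !raw.isEmpty else { return nil }
        self.init(rawValue: raw)
    }
}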
@@ -0,0 +1,176 @@
import SwiftUI
import ScarfCore
import ScarfDesign
import UIKit
/// In-app Diagnostics Performance panel. Lets users flip the
/// ScarfMon backend mode, watch live aggregated stats from the ring
/// buffer, and copy a JSON dump to paste into a feedback thread.
///
/// Data never leaves the device unless the user taps "Copy as JSON":
/// no remote upload, no analytics. Same source-of-truth as the Mac
/// panel; both sides read `ScarfMonBoot.sharedRingBuffer`.
struct ScarfMonDiagnosticsView: View {
@State private var mode: ScarfMonBoot.Mode = ScarfMonBoot.currentMode()
@State private var stats: [ScarfMonStat] = []
@State private var copiedToast: Bool = false
/// Ring buffer is process-wide; we read from it on a 1s timer
/// while the panel is foregrounded. No live tail; this view only
/// re-aggregates the in-memory snapshot.
private let refreshInterval: TimeInterval = 1.0
var body: some View {
List {
modeSection
if mode == .full {
summarySection
actionsSection
} else {
Section {
Text("Switch to **Full** above to see live stats and copy a JSON dump. Off and Signpost-only modes don't keep an in-memory ring buffer.")
.font(.callout)
.foregroundStyle(.secondary)
}
}
}
.navigationTitle("Performance")
.navigationBarTitleDisplayMode(.inline)
.task(id: mode) {
// Re-aggregate while the view is visible. SwiftUI cancels
// this task on disappear, so the timer stops eating cycles
// when the user backs out.
guard mode == .full else { return }
while !Task.isCancelled {
refresh()
try? await Task.sleep(nanoseconds: UInt64(refreshInterval * 1_000_000_000))
}
}
.overlay(alignment: .top) {
if copiedToast {
Text("Copied to clipboard")
.font(.caption)
.padding(.horizontal, 12)
.padding(.vertical, 6)
.background(.regularMaterial)
.clipShape(Capsule())
.padding(.top, 8)
}
}
}
@ViewBuilder
private var modeSection: some View {
Section {
Picker("Mode", selection: $mode) {
Text("Off").tag(ScarfMonBoot.Mode.off)
Text("Signpost only").tag(ScarfMonBoot.Mode.signpostOnly)
Text("Full").tag(ScarfMonBoot.Mode.full)
}
.pickerStyle(.segmented)
.onChange(of: mode) { _, newValue in
ScarfMonBoot.setMode(newValue)
}
} header: {
Text("Recording mode")
} footer: {
Text("**Signpost only** is the default — Instruments can attach and read the Points of Interest track without any other overhead. **Full** also keeps a 4096-entry in-memory ring you can browse below and copy as JSON.")
.font(.caption)
}
}
@ViewBuilder
private var summarySection: some View {
Section {
if stats.isEmpty {
Text("No samples yet. Use the app for a few seconds and the table will populate.")
.font(.caption)
.foregroundStyle(.secondary)
} else {
ForEach(stats.prefix(20), id: \.self) { stat in
StatRow(stat: stat)
}
}
} header: {
Text("Top 20 by p95")
} footer: {
Text("Sorted by 95th-percentile duration. Counts include events; intervals are everything wrapped in `ScarfMon.measure`.")
.font(.caption)
}
}
@ViewBuilder
private var actionsSection: some View {
Section {
Button {
copyJSON()
} label: {
Label("Copy ring buffer as JSON", systemImage: "doc.on.clipboard")
}
Button(role: .destructive) {
ScarfMonBoot.sharedRingBuffer?.reset()
refresh()
} label: {
Label("Reset ring buffer", systemImage: "trash")
}
}
}
private func refresh() {
stats = ScarfMonBoot.sharedRingBuffer?.summary() ?? []
}
private func copyJSON() {
guard let json = ScarfMonBoot.sharedRingBuffer?.exportJSON() else { return }
UIPasteboard.general.string = json
copiedToast = true
Task { @MainActor in
try? await Task.sleep(nanoseconds: 1_500_000_000)
copiedToast = false
}
}
}
private struct StatRow: View {
let stat: ScarfMonStat
var body: some View {
VStack(alignment: .leading, spacing: 2) {
HStack {
Text(stat.name)
.font(.system(.body, design: .monospaced))
Spacer()
Text("p95 \(formatMs(stat.p95Ms))")
.font(.caption.monospaced())
.foregroundStyle(.secondary)
}
HStack(spacing: 12) {
Text(stat.category.rawValue)
.font(.caption2)
.foregroundStyle(.tertiary)
Text("count \(stat.count)")
.font(.caption2.monospaced())
.foregroundStyle(.tertiary)
if stat.kind == .interval {
Text("p50 \(formatMs(stat.p50Ms))")
.font(.caption2.monospaced())
.foregroundStyle(.tertiary)
Text("max \(formatMs(stat.maxMs))")
.font(.caption2.monospaced())
.foregroundStyle(.tertiary)
}
if stat.totalBytes > 0 {
Text("bytes \(stat.totalBytes)")
.font(.caption2.monospaced())
.foregroundStyle(.tertiary)
}
}
}
}
private func formatMs(_ ms: Double) -> String {
if ms >= 100 { return String(format: "%.0fms", ms) }
if ms >= 1 { return String(format: "%.1fms", ms) }
return String(format: "%.2fms", ms)
}
}
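`ScarfMonStat` is also not shown in this diff; an inferred shape based on the fields StatRow reads (assumption only, the real type may differ):
// Inferred sketch from the usage above.
struct ScarfMonStat: Hashable {
    enum Kind: Hashable { case event, interval }
    enum Category: String, Hashable { case sessionLoad, chatStream, chatRender, diskIO, transport }

    let name: String
    let category: Category
    let kind: Kind
    let count: Int
    let p50Ms: Double
    let p95Ms: Double
    let maxMs: Double
    let totalBytes: Int
}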
@@ -45,6 +45,7 @@ struct SettingsView: View {
compressionSection
loggingSection
platformsSection
diagnosticsSection
rawYAMLToggleSection
}
}
@@ -257,6 +258,27 @@ struct SettingsView: View {
}
}
/// Diagnostics → Performance entry point. Hidden from the
/// `quickEditsSection` flow because it doesn't touch config.yaml;
/// it controls the in-process ScarfMon backend set instead. Full is
/// off by default; users still get Instruments-visible signposts. Flip
/// to Full when investigating a specific perf complaint.
@ViewBuilder
private var diagnosticsSection: some View {
Section {
NavigationLink {
ScarfMonDiagnosticsView()
} label: {
Label("Performance", systemImage: "speedometer")
}
} header: {
Text("Diagnostics")
} footer: {
Text("Performance instrumentation. Default mode emits Instruments signposts only; Full mode also keeps a 4096-entry in-memory ring you can copy as JSON.")
.font(.caption)
}
}
@ViewBuilder
private var rawYAMLToggleSection: some View {
Section {
+22 -22
View File
@@ -529,7 +529,7 @@
ASSETCATALOG_COMPILER_GLOBAL_ACCENT_COLOR_NAME = AccentColor;
CODE_SIGN_ENTITLEMENTS = "Scarf iOS/Scarf_iOS.entitlements";
CODE_SIGN_STYLE = Automatic;
CURRENT_PROJECT_VERSION = 28;
CURRENT_PROJECT_VERSION = 32;
DEVELOPMENT_TEAM = 3Q6X2L86C4;
ENABLE_PREVIEWS = YES;
GENERATE_INFOPLIST_FILE = YES;
@@ -540,13 +540,13 @@
INFOPLIST_KEY_UIApplicationSupportsIndirectInputEvents = YES;
INFOPLIST_KEY_UILaunchScreen_Generation = YES;
INFOPLIST_KEY_UISupportedInterfaceOrientations_iPad = "UIInterfaceOrientationPortrait UIInterfaceOrientationPortraitUpsideDown UIInterfaceOrientationLandscapeLeft UIInterfaceOrientationLandscapeRight";
INFOPLIST_KEY_UISupportedInterfaceOrientations_iPhone = "UIInterfaceOrientationPortrait UIInterfaceOrientationLandscapeLeft UIInterfaceOrientationLandscapeRight";
INFOPLIST_KEY_UISupportedInterfaceOrientations_iPhone = UIInterfaceOrientationPortrait;
IPHONEOS_DEPLOYMENT_TARGET = 18.6;
LD_RUNPATH_SEARCH_PATHS = (
"$(inherited)",
"@executable_path/Frameworks",
);
MARKETING_VERSION = 2.5.2;
MARKETING_VERSION = 2.7.0;
PRODUCT_BUNDLE_IDENTIFIER = com.scarfgo.app;
PRODUCT_NAME = "$(TARGET_NAME)";
SDKROOT = iphoneos;
@@ -571,7 +571,7 @@
ASSETCATALOG_COMPILER_GLOBAL_ACCENT_COLOR_NAME = AccentColor;
CODE_SIGN_ENTITLEMENTS = "Scarf iOS/Scarf_iOS.entitlements";
CODE_SIGN_STYLE = Automatic;
CURRENT_PROJECT_VERSION = 28;
CURRENT_PROJECT_VERSION = 32;
DEVELOPMENT_TEAM = 3Q6X2L86C4;
ENABLE_PREVIEWS = YES;
GENERATE_INFOPLIST_FILE = YES;
@@ -582,13 +582,13 @@
INFOPLIST_KEY_UIApplicationSupportsIndirectInputEvents = YES;
INFOPLIST_KEY_UILaunchScreen_Generation = YES;
INFOPLIST_KEY_UISupportedInterfaceOrientations_iPad = "UIInterfaceOrientationPortrait UIInterfaceOrientationPortraitUpsideDown UIInterfaceOrientationLandscapeLeft UIInterfaceOrientationLandscapeRight";
INFOPLIST_KEY_UISupportedInterfaceOrientations_iPhone = "UIInterfaceOrientationPortrait UIInterfaceOrientationLandscapeLeft UIInterfaceOrientationLandscapeRight";
INFOPLIST_KEY_UISupportedInterfaceOrientations_iPhone = UIInterfaceOrientationPortrait;
IPHONEOS_DEPLOYMENT_TARGET = 18.6;
LD_RUNPATH_SEARCH_PATHS = (
"$(inherited)",
"@executable_path/Frameworks",
);
MARKETING_VERSION = 2.5.2;
MARKETING_VERSION = 2.7.0;
PRODUCT_BUNDLE_IDENTIFIER = com.scarfgo.app;
PRODUCT_NAME = "$(TARGET_NAME)";
SDKROOT = iphoneos;
@@ -612,7 +612,7 @@
buildSettings = {
BUNDLE_LOADER = "$(TEST_HOST)";
CODE_SIGN_STYLE = Automatic;
CURRENT_PROJECT_VERSION = 28;
CURRENT_PROJECT_VERSION = 32;
DEVELOPMENT_TEAM = 3Q6X2L86C4;
GENERATE_INFOPLIST_FILE = YES;
IPHONEOS_DEPLOYMENT_TARGET = 26.2;
@@ -635,7 +635,7 @@
buildSettings = {
BUNDLE_LOADER = "$(TEST_HOST)";
CODE_SIGN_STYLE = Automatic;
CURRENT_PROJECT_VERSION = 28;
CURRENT_PROJECT_VERSION = 32;
DEVELOPMENT_TEAM = 3Q6X2L86C4;
GENERATE_INFOPLIST_FILE = YES;
IPHONEOS_DEPLOYMENT_TARGET = 26.2;
@@ -658,7 +658,7 @@
isa = XCBuildConfiguration;
buildSettings = {
CODE_SIGN_STYLE = Automatic;
CURRENT_PROJECT_VERSION = 28;
CURRENT_PROJECT_VERSION = 32;
DEVELOPMENT_TEAM = 3Q6X2L86C4;
GENERATE_INFOPLIST_FILE = YES;
IPHONEOS_DEPLOYMENT_TARGET = 26.2;
@@ -680,7 +680,7 @@
isa = XCBuildConfiguration;
buildSettings = {
CODE_SIGN_STYLE = Automatic;
CURRENT_PROJECT_VERSION = 28;
CURRENT_PROJECT_VERSION = 32;
DEVELOPMENT_TEAM = 3Q6X2L86C4;
GENERATE_INFOPLIST_FILE = YES;
IPHONEOS_DEPLOYMENT_TARGET = 26.2;
@@ -834,7 +834,7 @@
CODE_SIGN_ENTITLEMENTS = scarf/scarf.entitlements;
CODE_SIGN_STYLE = Automatic;
COMBINE_HIDPI_IMAGES = YES;
CURRENT_PROJECT_VERSION = 28;
CURRENT_PROJECT_VERSION = 32;
DEAD_CODE_STRIPPING = YES;
DEVELOPMENT_TEAM = 3Q6X2L86C4;
ENABLE_APP_SANDBOX = NO;
@@ -848,7 +848,7 @@
"@executable_path/../Frameworks",
);
MACOSX_DEPLOYMENT_TARGET = 14.6;
MARKETING_VERSION = 2.5.2;
MARKETING_VERSION = 2.7.0;
PRODUCT_BUNDLE_IDENTIFIER = com.scarf.app;
PRODUCT_NAME = "$(TARGET_NAME)";
REGISTER_APP_GROUPS = YES;
@@ -870,7 +870,7 @@
CODE_SIGN_ENTITLEMENTS = scarf/scarf.entitlements;
CODE_SIGN_STYLE = Automatic;
COMBINE_HIDPI_IMAGES = YES;
CURRENT_PROJECT_VERSION = 28;
CURRENT_PROJECT_VERSION = 32;
DEAD_CODE_STRIPPING = YES;
DEVELOPMENT_TEAM = 3Q6X2L86C4;
ENABLE_APP_SANDBOX = NO;
@@ -884,7 +884,7 @@
"@executable_path/../Frameworks",
);
MACOSX_DEPLOYMENT_TARGET = 14.6;
MARKETING_VERSION = 2.5.2;
MARKETING_VERSION = 2.7.0;
PRODUCT_BUNDLE_IDENTIFIER = com.scarf.app;
PRODUCT_NAME = "$(TARGET_NAME)";
REGISTER_APP_GROUPS = YES;
@@ -902,12 +902,12 @@
buildSettings = {
BUNDLE_LOADER = "$(TEST_HOST)";
CODE_SIGN_STYLE = Automatic;
CURRENT_PROJECT_VERSION = 28;
CURRENT_PROJECT_VERSION = 32;
DEAD_CODE_STRIPPING = YES;
DEVELOPMENT_TEAM = 3Q6X2L86C4;
GENERATE_INFOPLIST_FILE = YES;
MACOSX_DEPLOYMENT_TARGET = 26.2;
MARKETING_VERSION = 2.5.2;
MARKETING_VERSION = 2.7.0;
PRODUCT_BUNDLE_IDENTIFIER = com.scarfTests;
PRODUCT_NAME = "$(TARGET_NAME)";
STRING_CATALOG_GENERATE_SYMBOLS = NO;
@@ -924,12 +924,12 @@
buildSettings = {
BUNDLE_LOADER = "$(TEST_HOST)";
CODE_SIGN_STYLE = Automatic;
CURRENT_PROJECT_VERSION = 28;
CURRENT_PROJECT_VERSION = 32;
DEAD_CODE_STRIPPING = YES;
DEVELOPMENT_TEAM = 3Q6X2L86C4;
GENERATE_INFOPLIST_FILE = YES;
MACOSX_DEPLOYMENT_TARGET = 26.2;
MARKETING_VERSION = 2.5.2;
MARKETING_VERSION = 2.7.0;
PRODUCT_BUNDLE_IDENTIFIER = com.scarfTests;
PRODUCT_NAME = "$(TARGET_NAME)";
STRING_CATALOG_GENERATE_SYMBOLS = NO;
@@ -945,11 +945,11 @@
isa = XCBuildConfiguration;
buildSettings = {
CODE_SIGN_STYLE = Automatic;
CURRENT_PROJECT_VERSION = 28;
CURRENT_PROJECT_VERSION = 32;
DEAD_CODE_STRIPPING = YES;
DEVELOPMENT_TEAM = 3Q6X2L86C4;
GENERATE_INFOPLIST_FILE = YES;
MARKETING_VERSION = 2.5.2;
MARKETING_VERSION = 2.7.0;
PRODUCT_BUNDLE_IDENTIFIER = com.scarfUITests;
PRODUCT_NAME = "$(TARGET_NAME)";
STRING_CATALOG_GENERATE_SYMBOLS = NO;
@@ -965,11 +965,11 @@
isa = XCBuildConfiguration;
buildSettings = {
CODE_SIGN_STYLE = Automatic;
CURRENT_PROJECT_VERSION = 28;
CURRENT_PROJECT_VERSION = 32;
DEAD_CODE_STRIPPING = YES;
DEVELOPMENT_TEAM = 3Q6X2L86C4;
GENERATE_INFOPLIST_FILE = YES;
MARKETING_VERSION = 2.5.2;
MARKETING_VERSION = 2.7.0;
PRODUCT_BUNDLE_IDENTIFIER = com.scarfUITests;
PRODUCT_NAME = "$(TARGET_NAME)";
STRING_CATALOG_GENERATE_SYMBOLS = NO;
+159
View File
@@ -0,0 +1,159 @@
import Foundation
import ScarfCore
import os
/// One template entry as exposed by `awizemann.github.io/scarf/templates/catalog.json`.
/// Mirrors the per-template shape `tools/build-catalog.py` emits; the
/// validator is the source of truth on the schema, this struct is the
/// Swift consumer. **Do not add fields here that aren't in `catalog.json`
/// today.** Keeping the surface 1:1 means we can't accidentally render
/// something the catalog doesn't actually carry.
///
/// Most fields are required-from-the-validator's-perspective but
/// expressed as optionals here so a single-template typo on the
/// website doesn't bring down the whole list; we drop the malformed
/// entry and keep going (handled by the decoder in `CatalogService`).
struct CatalogEntry: Codable, Sendable, Identifiable, Hashable {
// Hashable + Equatable conformance is identity-based on `id`:
// `TemplateConfigSchema` only conforms to Equatable, so we can't
// synthesize Hashable, and a content-based equality wouldn't be
// useful anyway (the same template re-fetched from cache vs. fresh
// is "the same entry" even if a description was edited upstream).
static func == (lhs: CatalogEntry, rhs: CatalogEntry) -> Bool {
lhs.id == rhs.id
}
func hash(into hasher: inout Hasher) {
hasher.combine(id)
}
/// Stable identifier `<author>/<template-name>`, e.g.
/// `awizemann/hackernews-digest`. Matches the value in
/// `template.json`'s `id` field.
let id: String
/// Human-readable name shown in the catalog list.
let name: String
/// Semver. Compared against the installed version from
/// `InstalledTemplatesIndex` to detect "Update available".
let version: String
let description: String?
let category: String?
let tags: [String]
let author: Author
let minScarfVersion: String?
let minHermesVersion: String?
/// HTTPS URL the install flow consumes.
/// `TemplateInstallerViewModel.openRemoteURL(_:)` accepts this
/// directly. The catalog itself only ships HTTPS URLs (validator
/// enforced).
let installUrl: String
/// Bundle metadata for size warnings and integrity checks. Optional
/// because pre-v2 catalogs didn't carry these.
let bundleSize: Int?
let bundleSha256: String?
/// Slug used by the static-site generator for detail-page URLs.
/// Reused as a stable accessibility-ID suffix so XCUITest can find
/// rows even if the human-readable id contains slashes.
let detailSlug: String?
/// What's inside the bundle, mirrored from `template.json`'s
/// `contents` claim. Drives the "what will be installed" preview
/// on the detail page.
let contents: Contents?
/// Config schema + model recommendation if the template declares
/// one. Using the existing `TemplateConfigSchema` decoder keeps
/// parsing aligned with the install sheet's config form.
let config: TemplateConfigSchema?
struct Author: Codable, Sendable, Equatable {
let name: String
let url: String?
}
/// `template.json`'s `contents` object. All counts are optional;
/// `nil` means "not declared," which the catalog renders as zero.
struct Contents: Codable, Sendable, Equatable {
let dashboard: Bool?
let agentsMd: Bool?
let cron: Int?
let config: Int?
let memory: Bool?
let skills: [String]?
}
}
/// Top-level shape of `catalog.json`. Only carries what the Swift
/// catalog browser actually uses: `templates` is the list itself,
/// `schemaVersion` lets us reject incompatible future formats.
///
/// **The validator's `generated` field is intentionally NOT decoded.**
/// It ships as a boolean (`true`) per `tools/build-catalog.py`'s
/// "human reminder; a timestamp would churn the diff every run"
/// comment. The catalog UI uses the cache file's `fetchedAt` for the
/// "last refreshed" string, not anything from `catalog.json`.
///
/// **Per-element fault tolerance.** `templates` is decoded entry by
/// entry through an unkeyed container; a single malformed entry
/// (missing `tags`, `author`, etc.) is dropped with a logged warning
/// rather than failing the whole catalog decode. Honors the contract
/// the per-entry doc-comment promises.
struct Catalog: Codable, Sendable {
let schemaVersion: Int?
let templates: [CatalogEntry]
init(schemaVersion: Int?, templates: [CatalogEntry]) {
self.schemaVersion = schemaVersion
self.templates = templates
}
/// Custom decoder that drops every key other than `schemaVersion`
/// and `templates`. Without this, `generated: true` would surface
/// as a typeMismatch on `String?`.
enum CodingKeys: String, CodingKey {
case schemaVersion
case templates
}
private static let decodeLogger = Logger(subsystem: "com.scarf", category: "CatalogDecoder")
init(from decoder: Decoder) throws {
let container = try decoder.container(keyedBy: CodingKeys.self)
self.schemaVersion = try container.decodeIfPresent(Int.self, forKey: .schemaVersion)
var entries: [CatalogEntry] = []
if container.contains(.templates) {
var unkeyed = try container.nestedUnkeyedContainer(forKey: .templates)
entries.reserveCapacity(unkeyed.count ?? 0)
while !unkeyed.isAtEnd {
do {
entries.append(try unkeyed.decode(CatalogEntry.self))
} catch {
Self.decodeLogger.warning("dropping malformed catalog entry at index \(unkeyed.currentIndex - 1): \(error.localizedDescription, privacy: .public)")
// Advance past the bad element so the loop terminates.
// Decoding into a permissive `JSONValue` placeholder
// would also work, but Foundation's Decoder API has
// no built-in skip; `_Skip` consumes one element.
_ = try? unkeyed.decode(_Skip.self)
}
}
}
self.templates = entries
}
/// Placeholder type used to consume a malformed array element after
/// the real decode threw. Decodes anything by ignoring it.
private struct _Skip: Decodable {
init(from decoder: Decoder) throws {
_ = try decoder.singleValueContainer()
}
}
}
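A quick illustration of the per-element fault tolerance (hypothetical test-style snippet; the field values are made up):
// One well-formed entry plus one malformed entry.
let json = """
{
  "schemaVersion": 1,
  "generated": true,
  "templates": [
    { "id": "awizemann/example", "name": "Example", "version": "1.0.0",
      "tags": [], "author": { "name": "Alan Wizemann" },
      "installUrl": "https://example.com/example.scarftemplate" },
    { "id": "broken/entry-missing-required-fields" }
  ]
}
""".data(using: .utf8)!
let catalog = try JSONDecoder().decode(Catalog.self, from: json)
// catalog.templates.count == 1: the malformed second entry is dropped with a
// logged warning, and the unknown `generated` key is ignored entirely.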
@@ -0,0 +1,99 @@
import AppKit
import Foundation
import os
/// Quits the running app and brings up a fresh instance of the same
/// bundle. Used by the Profile-switching flow (issue #70) so the new
/// active profile lands in a process that has never observed the old
/// one; this sidesteps any in-process cache or service-state bug that
/// might still be reading from the previous profile's home directory.
///
/// The pairing is intentional:
/// 1. Caller invokes `try AppRelauncher.relaunch()`. That spawns a
/// fresh `open -n <bundleURL>`, captures stderr/exitCode, returns
/// success once the launcher has acknowledged the dispatch.
/// 2. Caller schedules `NSApp.terminate(nil)` 250ms later. The
/// 250ms gives macOS time to begin launching the second PID so
/// the dock-icon hand-off looks smooth (no flash of missing
/// icon). Without the gap, macOS can briefly show zero Scarf
/// icons in the dock.
///
/// Refuses to relaunch when the running bundle is under
/// `DerivedData/` or `Build/Products/Debug`; that's an Xcode
/// debug session, and `terminate(nil)` would kill the run mid-debug
/// without giving the new instance any way to attach. The caller
/// surfaces a "restart manually" toast in that case.
@MainActor
enum AppRelauncher {
static let logger = Logger(subsystem: "com.scarf.app", category: "AppRelauncher")
enum RelaunchError: Error, LocalizedError {
case debugBuild
case openFailed(exitCode: Int32, stderr: String)
var errorDescription: String? {
switch self {
case .debugBuild:
return "Refusing to relaunch from an Xcode debug build."
case .openFailed(let code, let stderr):
return "open(1) exited \(code): \(stderr)"
}
}
}
/// Spawns a fresh instance of the running app via `/usr/bin/open -n
/// <bundleURL>` and returns once the launcher process has dispatched
/// the new instance. The caller is responsible for the subsequent
/// `NSApp.terminate(nil)` (deferred ~250ms for a smooth dock hand-off).
/// Throws `.debugBuild` when launched from Xcode/DerivedData;
/// `.openFailed` when `open` itself errored.
static func relaunch() throws {
let bundleURL = Bundle.main.bundleURL
let path = bundleURL.path
if path.contains("/DerivedData/")
|| path.contains("/Build/Products/Debug")
|| path.contains("/Build/Products/Debug-")
{
logger.warning("Refusing relaunch — running from Xcode build (\(path, privacy: .public))")
throw RelaunchError.debugBuild
}
let proc = Process()
proc.executableURL = URL(fileURLWithPath: "/usr/bin/open")
// -n: force a NEW instance (without it, `open` activates the
// running app and we'd never get a fresh process).
// Pass the bundle URL directly (not -a <bundleId>) so signed
// dev clones in `~/Applications` still resolve correctly.
// No -W: we want `open` to return immediately after dispatch,
// not block until the spawned app exits.
proc.arguments = ["-n", path]
let stderrPipe = Pipe()
let stdoutPipe = Pipe()
proc.standardError = stderrPipe
proc.standardOutput = stdoutPipe
do {
try proc.run()
} catch {
throw RelaunchError.openFailed(exitCode: -1, stderr: error.localizedDescription)
}
proc.waitUntilExit()
// Drain both streams BEFORE inspecting exit code so we don't leak fds.
let errData = (try? stderrPipe.fileHandleForReading.readToEnd()) ?? Data()
_ = try? stdoutPipe.fileHandleForReading.readToEnd()
try? stderrPipe.fileHandleForReading.close()
try? stdoutPipe.fileHandleForReading.close()
guard proc.terminationStatus == 0 else {
let stderr = String(data: errData, encoding: .utf8)?
.trimmingCharacters(in: .whitespacesAndNewlines) ?? ""
logger.warning("open(1) failed (\(proc.terminationStatus)): \(stderr, privacy: .public)")
throw RelaunchError.openFailed(exitCode: proc.terminationStatus, stderr: stderr)
}
logger.info("Relaunch dispatched for \(path, privacy: .public)")
}
}
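The caller-side pairing the doc comment describes might look roughly like this (a sketch under the assumptions above; `showToast` is a hypothetical stand-in for whatever surface the caller uses):
import AppKit

// Hypothetical call site for the relaunch + deferred-terminate pairing.
@MainActor
func applyProfileSwitchAndRelaunch() {
    do {
        try AppRelauncher.relaunch()
        // Give macOS ~250ms to start the new PID so the Dock hand-off
        // doesn't flash an empty spot, then terminate this instance.
        DispatchQueue.main.asyncAfter(deadline: .now() + 0.25) {
            NSApp.terminate(nil)
        }
    } catch {
        // e.g. RelaunchError.debugBuild during an Xcode run.
        showToast("Profile saved. Restart Scarf manually to apply it.")  // hypothetical helper
    }
}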
@@ -0,0 +1,228 @@
import Foundation
import ScarfCore
import os
/// On-disk cache shape. Versioned so a future schema change can lift
/// stale caches gracefully: bump `version` and the loader rejects
/// anything older without trying to migrate. Stored next to the
/// projects registry so a Hermes wipe takes it with the rest of the
/// Scarf-owned state.
struct CatalogCache: Codable, Sendable {
static let currentVersion = 1
let version: Int
let fetchedAt: Date
let catalog: Catalog
init(version: Int = CatalogCache.currentVersion, fetchedAt: Date, catalog: Catalog) {
self.version = version
self.fetchedAt = fetchedAt
self.catalog = catalog
}
}
/// Result of a `loadCatalog` call. Distinguishes "fetched fresh" from
/// "cache served, network failed" so the catalog UI can surface a
/// "could not refresh" hint next to a stale-but-useful list.
enum CatalogLoadResult: Sendable {
case fresh(catalog: Catalog, fetchedAt: Date)
case cache(catalog: Catalog, fetchedAt: Date, refreshError: String?)
case fallback(catalog: Catalog, reason: String)
}
enum CatalogServiceError: LocalizedError, Sendable {
case transport(String)
case http(status: Int)
case decode(String)
var errorDescription: String? {
switch self {
case .transport(let m): return "Catalog transport: \(m)"
case .http(let status): return "Catalog HTTP \(status)"
case .decode(let m): return "Catalog decode: \(m)"
}
}
}
/// Fetches + caches the public template catalog from
/// awizemann.github.io. Mirrors `NousModelCatalogService` 1:1 in
/// shape: cache-first, 24h TTL, fallback when both cache and fetch
/// fail. The catalog is unauthenticated (a public static file on
/// GitHub Pages), so no bearer-token plumbing.
struct CatalogService: Sendable {
/// Where the catalog lives in production. The static-site builder
/// publishes here on `./scripts/catalog.sh publish`. **Versioned
/// constant**: if we ever move this URL, every old Scarf install
/// pegs at its bundled fallback until the user updates Scarf, so
/// keep it stable. Settings-configurable in v2.9 only if anyone
/// asks.
static let baseURL = URL(string: "https://awizemann.github.io/scarf/templates/catalog.json")!
static let cacheTTL: TimeInterval = 24 * 60 * 60 // 24h
static let requestTimeout: TimeInterval = 10 // seconds
/// Hard-coded fallback for offline-with-no-cache. Keeps the picker
/// non-empty on a fresh install so the user sees *something* even
/// before the first network call. **Update on every release that
/// adds a template**: the validator's `tools/check-catalog-fallback-sync.py`
/// (TODO) catches drift between this list and `templates/`.
static let fallbackCatalog: Catalog = Catalog(
schemaVersion: 1,
templates: [
CatalogEntry(
id: "awizemann/site-status-checker",
name: "Site Status Checker",
version: "1.1.0",
description: "Daily uptime check for a list of URLs you configure on install.",
category: "monitoring",
tags: ["monitoring", "uptime", "cron", "starter"],
author: .init(name: "Alan Wizemann", url: "https://github.com/awizemann"),
minScarfVersion: "2.3.0",
minHermesVersion: "0.9.0",
installUrl: "https://raw.githubusercontent.com/awizemann/scarf/main/templates/awizemann/site-status-checker/site-status-checker.scarftemplate",
bundleSize: nil,
bundleSha256: nil,
detailSlug: "awizemann-site-status-checker",
contents: .init(dashboard: true, agentsMd: true, cron: 1, config: 2, memory: nil, skills: nil),
config: nil
),
CatalogEntry(
id: "awizemann/hackernews-digest",
name: "HackerNews Daily Digest",
version: "1.0.0",
description: "A daily digest of HackerNews top stories. No API keys required.",
category: "news",
tags: ["news", "digest", "hackernews", "cron", "starter"],
author: .init(name: "Alan Wizemann", url: "https://github.com/awizemann"),
minScarfVersion: "2.3.0",
minHermesVersion: "0.9.0",
installUrl: "https://raw.githubusercontent.com/awizemann/scarf/main/templates/awizemann/hackernews-digest/hackernews-digest.scarftemplate",
bundleSize: nil,
bundleSha256: nil,
detailSlug: "awizemann-hackernews-digest",
contents: .init(dashboard: true, agentsMd: true, cron: 1, config: 3, memory: nil, skills: nil),
config: nil
)
]
)
private static let logger = Logger(subsystem: "com.scarf", category: "CatalogService")
let context: ServerContext
private let session: URLSession
private let cachePath: String
init(context: ServerContext = .local, session: URLSession = .shared) {
self.context = context
self.session = session
self.cachePath = context.paths.catalogCache
}
// MARK: - Cache I/O
/// Read the cache via the active transport so a remote droplet's
/// cache lands on the droplet, not the user's Mac. Missing or
/// malformed cache returns nil; the loader treats that as "no cache" and
/// kicks off a fresh fetch.
func readCache() -> CatalogCache? {
let transport = context.makeTransport()
guard transport.fileExists(cachePath) else { return nil }
do {
let data = try transport.readFile(cachePath)
let decoder = JSONDecoder()
decoder.dateDecodingStrategy = .iso8601
let cache = try decoder.decode(CatalogCache.self, from: data)
guard cache.version == CatalogCache.currentVersion else {
Self.logger.info("catalog cache schema mismatch (got v\(cache.version), expected v\(CatalogCache.currentVersion)); ignoring")
return nil
}
return cache
} catch {
Self.logger.warning("couldn't decode catalog cache: \(error.localizedDescription, privacy: .public)")
return nil
}
}
private func writeCache(_ cache: CatalogCache) {
let transport = context.makeTransport()
do {
let encoder = JSONEncoder()
encoder.dateEncodingStrategy = .iso8601
encoder.outputFormatting = [.prettyPrinted, .sortedKeys]
let data = try encoder.encode(cache)
// Make sure the parent dir exists; fresh remote installs
// may not yet have `~/.hermes/scarf/`. mkdir -p is cheap
// and idempotent on both transports.
let parent = (cachePath as NSString).deletingLastPathComponent
if !parent.isEmpty {
try? transport.createDirectory(parent)
}
try transport.writeFile(cachePath, data: data)
} catch {
Self.logger.warning("couldn't write catalog cache: \(error.localizedDescription, privacy: .public)")
}
}
func isCacheStale(_ cache: CatalogCache) -> Bool {
Date().timeIntervalSince(cache.fetchedAt) > Self.cacheTTL
}
// MARK: - Network fetch
/// Make the catalog GET. Times out after `requestTimeout` so a
/// hung network doesn't block the picker indefinitely. Returns the
/// parsed catalog on success, throws on any HTTP / decode error.
func fetchCatalog() async throws -> Catalog {
var request = URLRequest(url: Self.baseURL)
request.httpMethod = "GET"
request.timeoutInterval = Self.requestTimeout
request.setValue("application/json", forHTTPHeaderField: "Accept")
request.cachePolicy = .reloadIgnoringLocalCacheData
let (data, response) = try await session.data(for: request)
guard let http = response as? HTTPURLResponse else {
throw CatalogServiceError.transport("non-HTTP response")
}
guard (200..<300).contains(http.statusCode) else {
throw CatalogServiceError.http(status: http.statusCode)
}
do {
return try JSONDecoder().decode(Catalog.self, from: data)
} catch {
throw CatalogServiceError.decode(error.localizedDescription)
}
}
// MARK: - Public entry
/// Top-level "give me the catalog" entry point. Cache-first: serve
/// from cache if fresh, fetch + write through if stale or empty,
/// fall back to the hard-coded list when both fail. The caller
/// renders based on the case so it can show a "could not refresh"
/// hint next to a stale-but-still-useful list.
func loadCatalog(forceRefresh: Bool = false) async -> CatalogLoadResult {
let cached = readCache()
if let cached, !forceRefresh, !isCacheStale(cached) {
return .cache(catalog: cached.catalog, fetchedAt: cached.fetchedAt, refreshError: nil)
}
do {
let catalog = try await fetchCatalog()
let now = Date()
writeCache(CatalogCache(fetchedAt: now, catalog: catalog))
return .fresh(catalog: catalog, fetchedAt: now)
} catch let error as CatalogServiceError {
if let cached {
Self.logger.warning("catalog refresh failed (\(error.localizedDescription, privacy: .public)); serving stale cache")
return .cache(catalog: cached.catalog, fetchedAt: cached.fetchedAt, refreshError: error.localizedDescription)
}
Self.logger.warning("catalog refresh failed and no cache; serving fallback (\(error.localizedDescription, privacy: .public))")
return .fallback(catalog: Self.fallbackCatalog, reason: error.localizedDescription)
} catch {
if let cached {
return .cache(catalog: cached.catalog, fetchedAt: cached.fetchedAt, refreshError: error.localizedDescription)
}
return .fallback(catalog: Self.fallbackCatalog, reason: error.localizedDescription)
}
}
}
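How a caller might consume `CatalogLoadResult` (a sketch; `render(_:refreshedAt:hint:)` is a hypothetical view-model method):
// Hypothetical consumption sketch for the three load-result cases.
let result = await CatalogService().loadCatalog()
switch result {
case .fresh(let catalog, let fetchedAt):
    render(catalog, refreshedAt: fetchedAt, hint: nil)
case .cache(let catalog, let fetchedAt, let refreshError):
    render(catalog, refreshedAt: fetchedAt,
           hint: refreshError.map { "Couldn't refresh: \($0)" })
case .fallback(let catalog, let reason):
    render(catalog, refreshedAt: nil, hint: "Showing bundled fallback (\(reason))")
}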
@@ -0,0 +1,105 @@
import Foundation
import UserNotifications
import os
#if canImport(AppKit)
import AppKit
#endif
/// Posts a "Hermes finished responding" local notification when an
/// agent prompt completes while Scarf is not in the foreground
/// (issue #64). Users can switch to other work and learn when their
/// prompt has landed without polling the chat pane.
///
/// Authorization is requested lazily on first use. The user's global
/// toggle (`scarf.chat.notifyOnComplete`, default on) gates posting,
/// and notifications are suppressed when `NSApp.isActive` so users
/// who happen to be looking at the chat aren't pinged for nothing.
@MainActor
final class ChatNotificationService {
static let shared = ChatNotificationService()
private let logger = Logger(subsystem: "com.scarf", category: "ChatNotifications")
private let center = UNUserNotificationCenter.current()
private var hasRequestedAuthorization = false
private var isAuthorized = false
/// AppStorage-shared key for the "notify on completion" toggle.
/// Default true; the toggle lives under Settings → Display.
static let toggleKey = "scarf.chat.notifyOnComplete"
private init() {}
/// Post a local notification announcing prompt completion. Quietly
/// no-ops when:
/// - The user has disabled the toggle.
/// - Scarf is the foreground app (the in-chat status indicator
/// is sufficient).
/// - The system has not yet granted (or has denied) notification
/// authorization.
/// `preview` is the first line of the assistant's reply, truncated
/// to a sensible length for the lock-screen / notification center.
func postPromptCompleted(sessionTitle: String?, preview: String) {
let enabled = UserDefaults.standard.object(forKey: Self.toggleKey) as? Bool ?? true
guard enabled else { return }
#if canImport(AppKit)
if NSApp?.isActive == true { return }
#endif
Task { [weak self] in
guard let self else { return }
let granted = await self.ensureAuthorized()
guard granted else { return }
let content = UNMutableNotificationContent()
content.title = sessionTitle?.isEmpty == false
? "Hermes finished — \(sessionTitle ?? "")"
: "Hermes finished responding"
content.body = Self.trimmedPreview(preview)
content.sound = .default
let request = UNNotificationRequest(
identifier: UUID().uuidString,
content: content,
trigger: nil
)
do {
try await self.center.add(request)
} catch {
self.logger.warning("Notification post failed: \(error.localizedDescription, privacy: .public)")
}
}
}
private func ensureAuthorized() async -> Bool {
if isAuthorized { return true }
if hasRequestedAuthorization {
// Already asked once this run; respect the current settings.
let settings = await center.notificationSettings()
isAuthorized = settings.authorizationStatus == .authorized
return isAuthorized
}
hasRequestedAuthorization = true
do {
let granted = try await center.requestAuthorization(options: [.alert, .sound])
isAuthorized = granted
return granted
} catch {
logger.warning("Notification authorization failed: \(error.localizedDescription, privacy: .public)")
return false
}
}
/// First non-empty line, capped at ~140 chars so the notification
/// surface stays readable on every macOS notification style.
static func trimmedPreview(_ raw: String) -> String {
let firstLine = raw
.split(whereSeparator: \.isNewline)
.first
.map(String.init) ?? raw
let trimmed = firstLine.trimmingCharacters(in: .whitespacesAndNewlines)
if trimmed.count <= 140 { return trimmed }
let prefix = trimmed.prefix(140).trimmingCharacters(in: .whitespacesAndNewlines)
return prefix + "…"
}
}
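A likely call-site shape once the final assistant chunk lands (hypothetical names; the real handler's locals aren't shown in this diff):
// Hypothetical call site; sessionTitle / finalReplyText are stand-ins for
// whatever the ACP event handler has on hand when the prompt completes.
ChatNotificationService.shared.postPromptCompleted(
    sessionTitle: sessionTitle,
    preview: finalReplyText
)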
@@ -17,10 +17,32 @@ struct HermesFileService: Sendable {
// MARK: - Config
nonisolated func loadConfig() -> HermesConfig {
guard let content = readFile(context.paths.configYAML) else { return .empty }
return parseConfig(content)
// ScarfMon: when Full mode is on, log a window of stack
// frames above this call so mystery callers (e.g. config
// reads with no user action) can be identified by tailing
// `log stream --predicate 'subsystem == "com.scarf.mon"'`.
// The window spans frames 1..8: SwiftUI / ObservableObject
// body re-eval chains burn 4-6 frames before reaching the
// user code, so a shallower window hides the real
// caller. Each frame is prefixed with "#N" and joined with " | ",
// so a single `log stream` line carries the full breadcrumb.
// Symbol-only: no addresses, no PII. Backtrace alloc is
// gated on isActive so it's free outside Full mode.
if ScarfMon.isActive {
let frames = Thread.callStackSymbols.prefix(10)
.enumerated()
.map { "#\($0.offset) \($0.element)" }
.joined(separator: " | ")
Self.perfLogger.debug("loadConfig stack: \(frames, privacy: .public)")
}
return ScarfMon.measure(.diskIO, "loadConfig") {
guard let content = readFile(context.paths.configYAML) else { return .empty }
return parseConfig(content)
}
}
private static let perfLogger = Logger(subsystem: "com.scarf.mon", category: "HermesFileService")
/// Error-surfacing config load. Used by Dashboard to show the user a
/// specific reason when config.yaml can't be read on a remote host
/// (permission denied, missing file, sqlite3 not installed, etc.)
@@ -480,22 +502,47 @@ struct HermesFileService: Sendable {
// MARK: - Cron
nonisolated func loadCronJobs() -> [HermesCronJob] {
guard let data = readFileData(context.paths.cronJobsJSON) else { return [] }
do {
let file = try JSONDecoder().decode(CronJobsFile.self, from: data)
return file.jobs
} catch {
print("[Scarf] Failed to decode cron jobs: \(error.localizedDescription)")
return []
ScarfMon.measure(.diskIO, "loadCronJobs") {
guard let data = readFileData(context.paths.cronJobsJSON) else { return [] }
do {
let file = try JSONDecoder().decode(CronJobsFile.self, from: data)
return file.jobs
} catch {
print("[Scarf] Failed to decode cron jobs: \(error.localizedDescription)")
return []
}
}
}
/// Read the most-recent run output for a cron job. Hermes writes
/// `~/.hermes/cron/output/<jobId>/<YYYY-MM-DD_HH-MM-SS>.md` per run
/// (one file per execution); we resolve the per-job subdir, take
/// the lexicographically-last filename (which is the newest given
/// the timestamp prefix), and return its contents. Returns nil
/// when the subdir is missing, empty, or the read fails; the cron
/// detail surface treats nil as "no output yet."
///
/// A legacy flat-file layout (`<dir>/<filename containing jobId>`)
/// is checked as a fallback so older Hermes installs that used a
/// non-nested layout still surface their last run.
nonisolated func loadCronOutput(jobId: String) -> String? {
let dir = context.paths.cronOutputDir
guard let files = try? transport.listDirectory(dir) else { return nil }
let matching = files.filter { $0.contains(jobId) }.sorted().last
guard let filename = matching else { return nil }
return readFile(dir + "/" + filename)
let perJobDir = dir + "/" + jobId
if let runs = try? transport.listDirectory(perJobDir),
let latest = runs.sorted().last {
if let content = readFile(perJobDir + "/" + latest) {
return content
}
}
// Legacy fallback: pre-subdir layouts had files like
// `<jobId>-<timestamp>.log` directly under cronOutputDir. Keep
// matching them so users on older Hermes versions still see
// their tail.
if let files = try? transport.listDirectory(dir),
let matching = files.filter({ $0.contains(jobId) }).sorted().last {
return readFile(dir + "/" + matching)
}
return nil
}
// MARK: - Skills
@@ -10,6 +10,10 @@ final class HermesFileWatcher {
/// Remote polling task. Non-nil only when `context.isRemote`. Cancelled
/// on `stopWatching()`.
private var remotePollTask: Task<Void, Never>?
/// Project directory paths fed to the SSH poller alongside `watchedCorePaths`.
/// Updated by `updateProjectWatches` so the remote stream restarts whenever
/// the project list changes.
private var remoteProjectPaths: [String] = []
let context: ServerContext
private let transport: any ServerTransport
@@ -52,17 +56,7 @@ final class HermesFileWatcher {
func startWatching() {
if context.isRemote {
// FSEvents doesn't reach across SSH. Drive lastChangeDate off
// the transport's AsyncStream, which polls stat mtime on a
// shared ControlMaster channel (~5ms per tick).
let stream = transport.watchPaths(watchedCorePaths)
remotePollTask = Task { [weak self] in
for await _ in stream {
await MainActor.run { [weak self] in
self?.lastChangeDate = Date()
}
}
}
startRemotePoller()
return
}
@@ -79,6 +73,31 @@ final class HermesFileWatcher {
// touches `gateway_state.json` which the watcher catches.
}
/// (Re)start the SSH polling stream over the union of `watchedCorePaths`
/// and the current `remoteProjectPaths`. Called on initial start and
/// whenever `updateProjectWatches` changes the project set.
///
/// ScarfMon `mac.fileWatcher.remoteRestart` (event) fires once per
/// poller restart with `bytes` carrying the path count. Frequent
/// restarts mean the project-list update path is churning; pair
/// with `mac.fileWatcher.remoteTick` from the upstream transport
/// (`ssh.streamScript` / `transport.watchPaths`) to see actual
/// poll cadence.
private func startRemotePoller() {
remotePollTask?.cancel()
let pathSet = watchedCorePaths + remoteProjectPaths
ScarfMon.event(.transport, "mac.fileWatcher.remoteRestart", count: 1, bytes: pathSet.count)
let stream = transport.watchPaths(pathSet)
remotePollTask = Task { [weak self] in
for await _ in stream {
ScarfMon.event(.transport, "mac.fileWatcher.remoteDelta", count: 1)
await MainActor.run { [weak self] in
self?.lastChangeDate = Date()
}
}
}
}
func stopWatching() {
for source in coreSources + projectSources {
source.cancel()
@@ -91,11 +110,26 @@ final class HermesFileWatcher {
remotePollTask = nil
}
func updateProjectWatches(_ dashboardPaths: [String]) {
// Remote contexts don't support per-project FSEvents watches today;
// the shared mtime poll covers the core set. Adding per-project
// polling is a Phase 4 polish item.
guard !context.isRemote else { return }
/// Watch each project's `dashboard.json` AND its enclosing `.scarf/`
/// directory. Watching both is what lets file-reading widgets
/// (markdown_file, log_tail, image) refresh when a cron job rewrites
/// a sidecar file: dir-level FSEvents fire on add/remove/rename inside
/// `.scarf/`, file-level FSEvents fire on dashboard.json content
/// changes. In-place writes to an existing sidecar file (e.g., `>>` log
/// append) are NOT detected; by convention the cron job should write
/// atomically (write-then-rename) or `touch dashboard.json` after each
/// run.
func updateProjectWatches(dashboardPaths: [String], scarfDirs: [String]) {
if context.isRemote {
// Restart the SSH poller with the union of core + project dir
// paths. `stat -c %Y` on a directory tracks mtime, which ticks
// on add/remove/rename inside the dir; same coverage as the
// local FSEvents directory watch below.
let union = Array(Set(dashboardPaths + scarfDirs))
remoteProjectPaths = union.sorted()
startRemotePoller()
return
}
for source in projectSources {
source.cancel()
}
@@ -105,6 +139,11 @@ final class HermesFileWatcher {
projectSources.append(source)
}
}
for dir in scarfDirs {
if let source = makeSource(for: dir) {
projectSources.append(source)
}
}
}
private func makeSource(for path: String) -> DispatchSourceFileSystemObject? {
@@ -117,6 +156,12 @@ final class HermesFileWatcher {
queue: .main
)
source.setEventHandler { [weak self] in
// ScarfMon fires every time FSEvents detects a change on
// a watched core or project path. High counts during
// streaming chats are normal (state.db-wal ticks per
// message persisted); high counts when nothing's happening
// suggest a runaway watcher install.
ScarfMon.event(.transport, "mac.fileWatcher.localFire", count: 1)
self?.lastChangeDate = Date()
}
source.setCancelHandler {
@@ -0,0 +1,158 @@
import Foundation
import ScarfCore
import os
/// Maps `templateId → installedVersion` for every project the user has
/// installed via a template. Used by the catalog browser to render
/// each row's "Installed" / "Update available" / "Not installed" badge.
///
/// **Read-only.** This service walks the projects registry + each
/// project's `.scarf/template.lock.json`. It never writes anything.
///
/// **Per-call rebuild.** The index is cheap to compute (a registry
/// read + N lock-file reads, each a few hundred bytes) and changes
/// infrequently from the user's perspective. We rebuild on every
/// catalog-sheet open instead of caching with invalidation rules;
/// a stale "Installed" badge would surprise users far more than
/// one extra `[String:Data]` walk on each refresh costs.
struct InstalledTemplatesIndex: Sendable {
private static let logger = Logger(subsystem: "com.scarf", category: "InstalledTemplatesIndex")
let context: ServerContext
init(context: ServerContext = .local) {
self.context = context
}
/// Build the index. Returns `[templateId: version]`. Projects
/// without a lock file (ad-hoc projects added via "Add Project")
/// are skipped silently; they aren't template-installed and don't
/// belong in the index.
func build() -> [String: String] {
let transport = context.makeTransport()
let registryPath = context.paths.projectsRegistry
guard transport.fileExists(registryPath) else { return [:] }
let data: Data
do {
data = try transport.readFile(registryPath)
} catch {
Self.logger.warning("couldn't read projects registry at \(registryPath, privacy: .public): \(error.localizedDescription, privacy: .public)")
return [:]
}
let registry: ProjectRegistry
do {
registry = try JSONDecoder().decode(ProjectRegistry.self, from: data)
} catch {
Self.logger.warning("couldn't decode projects registry: \(error.localizedDescription, privacy: .public)")
return [:]
}
var index: [String: String] = [:]
for project in registry.projects {
guard let lock = readLock(for: project) else { continue }
// Last-write-wins on duplicates. Two installs of the same
// template id at different versions is rare but possible
// (user installed it in two project dirs); the catalog
// doesn't need to render which version, just that
// *something* is installed.
index[lock.templateId] = lock.templateVersion
}
return index
}
/// Update-availability classification for a single catalog entry.
/// `installedVersion == nil` → not installed. Equal versions →
/// `.installed`. Catalog version newer than installed → `.updateAvailable`.
/// Catalog version older or equal-but-different format → `.installed`
/// (we trust the catalog; semver-noise comparisons aren't worth a
/// full parse here).
static func classify(catalogVersion: String, installedVersion: String?) -> InstallState {
guard let installedVersion else { return .notInstalled }
if catalogVersion == installedVersion {
return .installed(version: installedVersion)
}
if isVersionNewer(catalogVersion, than: installedVersion) {
return .updateAvailable(installedVersion: installedVersion, catalogVersion: catalogVersion)
}
return .installed(version: installedVersion)
}
enum InstallState: Sendable, Equatable {
case notInstalled
case installed(version: String)
case updateAvailable(installedVersion: String, catalogVersion: String)
}
// MARK: - Internals
/// Read `<project>/.scarf/template.lock.json`. Returns nil for
/// ad-hoc (non-templated) projects, malformed JSON, or any I/O
/// failure; the catalog shouldn't crash because one project's
/// lock file got corrupted.
private func readLock(for project: ProjectEntry) -> TemplateLock? {
let path = project.path + "/.scarf/template.lock.json"
let transport = context.makeTransport()
guard transport.fileExists(path) else { return nil }
let data: Data
do {
data = try transport.readFile(path)
} catch {
Self.logger.warning("couldn't read template lock at \(path, privacy: .public): \(error.localizedDescription, privacy: .public)")
return nil
}
do {
return try JSONDecoder().decode(TemplateLock.self, from: data)
} catch {
Self.logger.warning("couldn't decode template lock at \(path, privacy: .public): \(error.localizedDescription, privacy: .public)")
return nil
}
}
/// Plain semver-ish comparison: split on `.`, compare numerically
/// from major down. Pre-release suffixes (anything after `-` in a
/// segment) make that release *older* than the same numeric prefix
/// without a suffix, which matches semver §11 ("a pre-release version has
/// lower precedence than the associated normal version"), so
/// `1.0.0-beta` is *not* newer than `1.0.0`. Two pre-releases on the
/// same numeric prefix fall back to lexicographic compare on the
/// suffix. Good enough for "is the catalog ahead?"; this isn't a
/// package manager.
static func isVersionNewer(_ candidate: String, than other: String) -> Bool {
let (aCore, aPre) = splitPrerelease(candidate)
let (bCore, bPre) = splitPrerelease(other)
let a = aCore.split(separator: ".").map(String.init)
let b = bCore.split(separator: ".").map(String.init)
for i in 0..<max(a.count, b.count) {
let ai = i < a.count ? a[i] : "0"
let bi = i < b.count ? b[i] : "0"
if let an = Int(ai), let bn = Int(bi) {
if an != bn { return an > bn }
} else if ai != bi {
return ai > bi
}
}
// Numeric cores match. Pre-release tiebreak: an absent pre-release
// outranks any present pre-release.
switch (aPre, bPre) {
case (nil, nil): return false
case (nil, _): return true // candidate has no pre-release; the other has one → candidate is newer
case (_, nil): return false // candidate has a pre-release; the other is the release → candidate is older
case (let ap?, let bp?): return ap > bp
}
}
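// Spot-checks of the comparison above (illustrative; mirrors the doc
// comment, including the lexicographic pre-release caveat):
//   isVersionNewer("1.1.0", than: "1.0.9")                -> true
//   isVersionNewer("1.0.0", than: "1.0.0-beta.2")         -> true   (release outranks pre-release)
//   isVersionNewer("1.0.0-beta", than: "1.0.0")           -> false
//   isVersionNewer("1.0.0-beta.2", than: "1.0.0-beta.10") -> true   ("beta.2" > "beta.10" lexicographically)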
/// Split a version string into its numeric core and pre-release
/// suffix on the first `-`. `"1.0.0-beta.2"` -> `("1.0.0", "beta.2")`;
/// `"1.0.0"` -> `("1.0.0", nil)`.
private static func splitPrerelease(_ version: String) -> (core: String, pre: String?) {
if let dash = version.firstIndex(of: "-") {
return (String(version[..<dash]), String(version[version.index(after: dash)...]))
}
return (version, nil)
}
}
@@ -0,0 +1,253 @@
import Foundation
import os
import ScarfCore
/// Mirrors a project's resolved Keychain secrets into a managed region
/// of `~/.hermes/.env` so Hermes cron jobs (and any other agent
/// process Hermes spawns) can use them via `os.environ`.
///
/// **Why this exists.** Hermes has no `keychain://` URI resolver. When
/// a cron prompt says *"read config.json, get values.api_token, call
/// the API,"* Hermes reads the literal `keychain://...` string and
/// forwards it as the token producing 401s. By mirroring resolved
/// values into `~/.hermes/.env` (which the cron scheduler reloads
/// fresh on every tick at `cron/scheduler.py:897-903`), the agent can
/// reference them via shell expansion (`$SCARF_<SLUG>_<FIELD>`) when
/// it invokes the terminal or code_exec tool.
///
/// **Source of truth stays in the Keychain.** This service derives
/// content; it never accepts plaintext values from callers. config.json
/// continues to store `keychain://` URIs unchanged.
///
/// **Marker contract.** One block per project, slug-namespaced:
/// `# scarf-secrets:begin <slug>` / `# scarf-secrets:end <slug>`. The
/// splice logic lives in ScarfCore's `SecretsEnvBlock`. Other slugs'
/// blocks and user-authored content outside any block are preserved
/// byte-identically.
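///
/// A hypothetical managed region (slug and field names are
/// illustrative; the env-key shape follows `$SCARF_<SLUG>_<FIELD>`):
/// ```
/// # scarf-secrets:begin home-ops
/// SCARF_HOME_OPS_API_TOKEN=...
/// # scarf-secrets:end home-ops
/// ```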
///
/// **Trust boundary.** Mode 0600 on `~/.hermes/.env` is enforced by
/// `LocalTransport.writeFile`'s heuristic for `.env` paths. Plaintext
/// on disk matches the existing trust model for `ANTHROPIC_API_KEY`
/// and other Hermes-side credentials in the same file.
struct KeychainEnvMirror: Sendable {
private static let logger = Logger(subsystem: "com.scarf", category: "KeychainEnvMirror")
let context: ServerContext
nonisolated init(context: ServerContext = .local) {
self.context = context
}
// MARK: - Public
/// Resolve every `secret`-typed config field for `project` and
/// splice the result into `~/.hermes/.env` under a marker-bounded
/// block keyed by the template's slug. No-op when the project
/// has no cached manifest (schema-less project) or no secret
/// fields.
nonisolated func mirror(project: ProjectEntry) throws {
guard let resolved = try resolveSecrets(for: project) else {
// No manifest cache or no secret fields: nothing to mirror.
// Don't write an empty block; that would leave dangling
// markers if a project briefly had secrets and then dropped
// them. Use unmirror() in that path instead.
return
}
try mirror(
slug: resolved.slug,
entries: resolved.entries,
envPath: context.paths.envFile
)
}
/// Splice-only seam: takes pre-resolved entries and writes the
/// block to `envPath`. Used by `mirror(project:)` after Keychain
/// resolution; also exposed for unit tests that don't want to
/// touch the user's real Keychain or `~/.hermes/.env`.
///
/// - Empty `entries` removes the block (idempotent; no error
/// when the block isn't there). This is the single sentinel for
/// "project briefly had secrets, no longer does."
/// - Path is checked for `.env`-suffix before writing so the
/// `LocalTransport` mode-0600 heuristic kicks in.
/// - No-op when the rewritten output equals the existing file;
/// avoids file-watcher churn from idempotent reconciles.
nonisolated func mirror(
slug: String,
entries: [(key: String, value: String)],
envPath: String
) throws {
let transport = context.makeTransport()
if entries.isEmpty {
try unmirrorBlock(slug: slug, envPath: envPath, transport: transport)
return
}
let block = SecretsEnvBlock.renderBlock(slug: slug, entries: entries)
let existing = try readExisting(at: envPath, transport: transport)
let rewritten = SecretsEnvBlock.applyBlock(block, forSlug: slug, to: existing)
try writeIfChanged(path: envPath, existing: existing, rewritten: rewritten, transport: transport)
}
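// Sketch of how a unit test might drive this seam (paths and values
// are hypothetical; no Keychain access involved):
//   let mirror = KeychainEnvMirror()
//   try mirror.mirror(slug: "home-ops", entries: [(key: "SCARF_HOME_OPS_API_TOKEN", value: "test-value")], envPath: tmp + "/.env")
//   try mirror.mirror(slug: "home-ops", entries: [], envPath: tmp + "/.env") // strips the block again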
/// Strip the project's block from `~/.hermes/.env`. Reads the
/// project's cached manifest to recover its slug; the slug is
/// the only key the env file knows. When the manifest is absent
/// (uninstall path may have deleted it before we run), we fall
/// back to `derivedSlug(forProject:)`.
nonisolated func unmirror(project: ProjectEntry) throws {
let slug = cachedSlug(for: project) ?? Self.derivedSlug(forProject: project)
try unmirror(slug: slug, envPath: context.paths.envFile)
}
/// Splice-only unmirror: strips the block for `slug` from `envPath`.
/// Symmetric with `mirror(slug:entries:envPath:)`; no Keychain
/// access, suitable for unit tests.
nonisolated func unmirror(slug: String, envPath: String) throws {
let transport = context.makeTransport()
try unmirrorBlock(slug: slug, envPath: envPath, transport: transport)
}
/// Walk the project registry and call `mirror(project:)` on each
/// entry. Idempotent: projects whose blocks are already current
/// produce no write. Used at app launch to catch the case where
/// the user upgraded from a pre-mirror Scarf version.
nonisolated func reconcileAll() throws {
let registry = ProjectDashboardService(context: context).loadRegistry()
for project in registry.projects {
do {
try mirror(project: project)
} catch {
Self.logger.warning(
"reconcile failed for \(project.name, privacy: .public): \(error.localizedDescription, privacy: .public)"
)
}
}
}
// MARK: - Resolution
private struct ResolvedSecrets {
let slug: String
let entries: [(key: String, value: String)]
}
/// Read the project's cached manifest + config, resolve every
/// secret field's Keychain value, return KEY=VALUE pairs ready
/// for `SecretsEnvBlock.renderBlock`. Nil when the project has
/// no manifest cache or no secret-typed fields in its schema.
nonisolated private func resolveSecrets(
for project: ProjectEntry
) throws -> ResolvedSecrets? {
let configService = ProjectConfigService(context: context)
guard let manifest = try configService.loadCachedManifest(project: project) else {
return nil
}
guard let schema = manifest.config else { return nil }
let secretFields = schema.fields.filter { $0.type == .secret }
guard !secretFields.isEmpty else { return nil }
let configFile = try configService.load(project: project)
let values = configFile?.values ?? [:]
var entries: [(key: String, value: String)] = []
for field in secretFields {
guard let value = values[field.key] else { continue }
let resolved: Data?
do {
resolved = try configService.resolveSecret(ref: value)
} catch {
Self.logger.warning(
"couldn't resolve secret \(field.key, privacy: .public) for \(project.name, privacy: .public): \(error.localizedDescription, privacy: .public)"
)
continue
}
guard let data = resolved,
let str = String(data: data, encoding: .utf8) else {
continue
}
let key = SecretsEnvBlock.envKeyName(slug: manifest.slug, fieldKey: field.key)
entries.append((key: key, value: str))
}
return ResolvedSecrets(slug: manifest.slug, entries: entries)
}
// MARK: - File I/O
nonisolated private func unmirrorBlock(
slug: String,
envPath: String,
transport: any ServerTransport
) throws {
guard transport.fileExists(envPath) else { return }
let existing = try readExisting(at: envPath, transport: transport)
let rewritten = SecretsEnvBlock.removeBlock(forSlug: slug, from: existing)
try writeIfChanged(path: envPath, existing: existing, rewritten: rewritten, transport: transport)
}
nonisolated private func readExisting(
at path: String,
transport: any ServerTransport
) throws -> String {
guard transport.fileExists(path) else { return "" }
let data = try transport.readFile(path)
return String(data: data, encoding: .utf8) ?? ""
}
nonisolated private func writeIfChanged(
path: String,
existing: String,
rewritten: String,
transport: any ServerTransport
) throws {
guard rewritten != existing else { return }
guard let outData = rewritten.data(using: .utf8) else {
throw NSError(
domain: "com.scarf.keychain-env-mirror",
code: -1,
userInfo: [NSLocalizedDescriptionKey: "Couldn't UTF-8 encode env file"]
)
}
// LocalTransport's writeFile preserves 0600 for paths that match
// `.env` conventions (see ServerTransport.writeFile docstring).
// The hermes home is ensured by Hermes itself; we don't mkdir
// here.
try transport.writeFile(path, data: outData)
Self.logger.info("rewrote \(path, privacy: .public)\(outData.count) bytes")
}
// MARK: - Slug helpers
/// Read the project's cached manifest to recover its slug. Used
/// by `unmirror` since the slug is the only key the env file
/// knows. Nil when the manifest cache is absent (schema-less
/// project, or uninstall path that already deleted it).
nonisolated private func cachedSlug(for project: ProjectEntry) -> String? {
let configService = ProjectConfigService(context: context)
guard let manifest = try? configService.loadCachedManifest(project: project) else {
return nil
}
return manifest.slug
}
/// Fallback slug derivation when the cached manifest is gone.
/// Mirrors `ProjectScaffolder.suggestedSlug` so a from-scratch
/// project has a stable slug shape too, though scratch
/// projects don't have schemas, so they shouldn't reach the
/// mirror path in practice.
nonisolated static func derivedSlug(forProject project: ProjectEntry) -> String {
let lowered = project.name.lowercased()
var slug = ""
var lastWasDash = false
for scalar in lowered.unicodeScalars {
let c = Character(scalar)
if c.isLetter || c.isNumber {
slug.append(c)
lastWasDash = false
} else if !slug.isEmpty && !lastWasDash {
slug.append("-")
lastWasDash = true
}
}
while slug.hasSuffix("-") { slug.removeLast() }
return slug.isEmpty ? "project" : slug
}
}
@@ -0,0 +1,110 @@
import Foundation
import AVFoundation
import os
import Observation
/// Per-message text-to-speech for assistant chat replies (issue #66).
/// Uses `AVSpeechSynthesizer` with the system voice: no Hermes
/// dependency, works offline, picks up the user's macOS Spoken Content
/// voice selection automatically.
///
/// One synthesizer is shared across the app so starting a second
/// message's playback automatically interrupts the first. The
/// per-message speaker button reads `playingMessageId` to render
/// play vs. stop state.
///
/// The full Hermes-provider TTS pipeline (Edge / ElevenLabs / OpenAI
/// / NeuTTS / Piper from Settings → Voice) is deferred to a follow-up;
/// wiring per-provider audio fetching, caching, and interruption
/// is a much bigger surface than what's needed to give users a
/// listen-while-doing-other-work affordance today.
@MainActor
@Observable
final class MessageSpeechService: NSObject {
static let shared = MessageSpeechService()
/// The message id currently being spoken, or `nil` when idle.
/// Bubbles read this to flip their speaker icon to a stop glyph.
private(set) var playingMessageId: Int?
private let synthesizer = AVSpeechSynthesizer()
private let logger = Logger(subsystem: "com.scarf", category: "MessageSpeech")
private override init() {
super.init()
synthesizer.delegate = self
}
/// Speak `content`. If a different message is currently playing,
/// interrupt it. If the same message is currently playing, this
/// stops playback (toggle behavior).
func toggle(messageId: Int, content: String) {
if playingMessageId == messageId {
stop()
return
}
if synthesizer.isSpeaking {
synthesizer.stopSpeaking(at: .immediate)
}
let cleaned = Self.strippedForSpeech(content)
guard !cleaned.isEmpty else { return }
let utterance = AVSpeechUtterance(string: cleaned)
// AVSpeechUtterance honors the user's Spoken Content default
// voice when `voice` is `nil`, which is the right behavior:
// users who configured a specific macOS voice get it
// automatically.
utterance.rate = AVSpeechUtteranceDefaultSpeechRate
playingMessageId = messageId
synthesizer.speak(utterance)
}
/// Stop any in-progress speech and clear `playingMessageId`.
func stop() {
guard playingMessageId != nil else { return }
synthesizer.stopSpeaking(at: .immediate)
playingMessageId = nil
}
/// Strip markdown control characters before speech so the user
/// doesn't hear "asterisk asterisk bold". Code fences and inline
/// code are spoken verbatim minus the backticks. Keeps URLs
/// readable but drops square-bracket link wrappers.
static func strippedForSpeech(_ raw: String) -> String {
var out = raw
// Fenced code blocks: keep contents
out = out.replacingOccurrences(of: "```", with: "")
// Inline code: drop backticks
out = out.replacingOccurrences(of: "`", with: "")
// Bold/italic markers
out = out.replacingOccurrences(of: "**", with: "")
out = out.replacingOccurrences(of: "__", with: "")
// Link syntax: [text](url) -> text
if let regex = try? NSRegularExpression(
pattern: #"\[([^\]]+)\]\([^)]+\)"#,
options: []
) {
let range = NSRange(out.startIndex..., in: out)
out = regex.stringByReplacingMatches(
in: out,
options: [],
range: range,
withTemplate: "$1"
)
}
return out.trimmingCharacters(in: .whitespacesAndNewlines)
}
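// Example of the stripping above (illustrative input):
//   strippedForSpeech("Run **`make`** then read [the guide](https://example.com)")
//   -> "Run make then read the guide"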
}
extension MessageSpeechService: AVSpeechSynthesizerDelegate {
nonisolated func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didFinish utterance: AVSpeechUtterance) {
Task { @MainActor in
self.playingMessageId = nil
}
}
nonisolated func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didCancel utterance: AVSpeechUtterance) {
Task { @MainActor in
self.playingMessageId = nil
}
}
}
@@ -62,27 +62,36 @@ final class NousAuthFlow {
output = ""
state = .starting
let proc = context.makeTransport().makeProcess(
executable: context.paths.hermesBinary,
args: ["auth", "add", "nous", "--no-browser"]
)
if !context.isRemote {
// Only enrich env locally; remote ssh gets the remote login env
// naturally, and exporting our local keys into it would be wrong.
// Python block-buffers stdout when it's a pipe (not a TTY). The
// device-code flow prints the verification URL + user code, then
// enters a ~15-minute polling loop that never hits `input()`,
// so nothing flushes and our readability handler never sees the
// output. Users see the sheet spinning forever while hermes is
// actually waiting for approval.
//
// PKCE doesn't have this problem because `input("Authorization
// code: ")` flushes stdout before blocking, which is why
// OAuthFlowController works without this setting.
//
// Local: set on `proc.environment`. Remote: setting
// `proc.environment` would only configure the local-side ssh
// process, NOT the remote python interpreter; ssh doesn't
// forward arbitrary env without `SendEnv` configured on both
// sides. So for remote we wrap the command in `env
// PYTHONUNBUFFERED=1 ...`, which prefixes the var into the
// remote command's environment regardless of ssh config.
let proc: Process
if context.isRemote {
proc = context.makeTransport().makeProcess(
executable: "env",
args: ["PYTHONUNBUFFERED=1", context.paths.hermesBinary, "auth", "add", "nous", "--no-browser"]
)
} else {
proc = context.makeTransport().makeProcess(
executable: context.paths.hermesBinary,
args: ["auth", "add", "nous", "--no-browser"]
)
var env = HermesFileService.enrichedEnvironment()
// Python block-buffers stdout when it's a pipe (not a TTY). The
// device-code flow prints the verification URL + user code, then
// enters a ~15-minute polling loop that never hits `input()`,
// so nothing flushes and our readability handler never sees the
// output. Users see the sheet spinning forever while hermes is
// actually waiting for approval.
//
// PKCE doesn't have this problem because `input("Authorization
// code: ")` flushes stdout before blocking, which is why
// OAuthFlowController works without this setting.
//
// PYTHONUNBUFFERED forces line-buffered stdout for the whole
// subprocess; tiny perf cost, huge UX win for device-code.
env["PYTHONUNBUFFERED"] = "1"
proc.environment = env
}
@@ -25,6 +25,31 @@ struct NousSubscriptionState: Sendable, Hashable {
/// to line up: auth record present *and* `nous` is the active provider.
/// Mirrors `NousSubscriptionFeatures.subscribed` on the Python side.
var subscribed: Bool { present && providerIsNous }
/// Days since the auth record was last touched (refreshed by Hermes
/// or re-authed by the user). Hermes refreshes on every agent boot,
/// so a large value here means the user hasn't started a session
/// recently, which is exactly when the refresh token is at risk
/// of expiring (typical ~30 day lifetime). Returns nil when
/// `updatedAt` is unknown (older Hermes versions). Capped at
/// `Int.max` to avoid overflow on absurd inputs.
func daysSinceLastRefresh(now: Date = Date()) -> Int? {
guard let updatedAt else { return nil }
let seconds = now.timeIntervalSince(updatedAt)
guard seconds > 0 else { return 0 }
return Int(seconds / 86_400)
}
/// True when we haven't seen a Hermes refresh in 14 days, half
/// the typical 30-day Nous refresh-token lifetime. This is the
/// trigger for the "enable keepalive" nudge: still recoverable
/// (refresh token hasn't expired yet) but heading there. Returns
/// false when `updatedAt` is unknown; we don't nudge on missing
/// data, only on confirmed staleness.
var hasStaleRefresh: Bool {
guard let days = daysSinceLastRefresh() else { return false }
return days >= 14
}
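// Illustrative reading of the two helpers above: with `updatedAt`
// 20 days in the past, daysSinceLastRefresh() returns 20 and
// hasStaleRefresh is true; at 5 days, 5 and false.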
}
/// Reads `auth.json` to detect Nous Portal subscription state. Delegates file
@@ -57,30 +82,32 @@ struct NousSubscriptionService: Sendable {
/// on any read or parse failure; callers treat "absent" and "can't
/// read" the same in UI (show a "not subscribed" CTA).
nonisolated func loadState() -> NousSubscriptionState {
guard let data = try? transport.readFile(authJSONPath) else {
return .absent
ScarfMon.measure(.diskIO, "nous.subscription.loadState") {
guard let data = try? transport.readFile(authJSONPath) else {
return .absent
}
guard let root = try? JSONSerialization.jsonObject(with: data) as? [String: Any] else {
logger.warning("auth.json is not a JSON object; assuming no Nous subscription")
return .absent
}
let providers = root["providers"] as? [String: Any] ?? [:]
let nous = providers["nous"] as? [String: Any]
let token = nous?["access_token"] as? String
let present = (token?.isEmpty == false)
let activeProvider = root["active_provider"] as? String
let providerIsNous = (activeProvider == "nous")
let updatedAt: Date? = {
guard let raw = root["updated_at"] as? String else { return nil }
return ISO8601DateFormatter().date(from: raw)
}()
return NousSubscriptionState(
present: present,
providerIsNous: providerIsNous,
updatedAt: updatedAt
)
}
guard let root = try? JSONSerialization.jsonObject(with: data) as? [String: Any] else {
logger.warning("auth.json is not a JSON object; assuming no Nous subscription")
return .absent
}
let providers = root["providers"] as? [String: Any] ?? [:]
let nous = providers["nous"] as? [String: Any]
let token = nous?["access_token"] as? String
let present = (token?.isEmpty == false)
let activeProvider = root["active_provider"] as? String
let providerIsNous = (activeProvider == "nous")
let updatedAt: Date? = {
guard let raw = root["updated_at"] as? String else { return nil }
return ISO8601DateFormatter().date(from: raw)
}()
return NousSubscriptionState(
present: present,
providerIsNous: providerIsNous,
updatedAt: updatedAt
)
}
}
@@ -0,0 +1,128 @@
import Foundation
import ScarfCore
import os
/// Manages a Scarf-owned cron job that keeps OAuth refresh tokens
/// alive by booting a trivial Hermes session on a daily cadence.
///
/// **Why this exists.** Hermes refreshes OAuth access tokens on
/// agent startup (via `resolve_nous_runtime_credentials()` and
/// equivalents), but never proactively. If the user goes longer than
/// the *refresh*-token lifetime without starting a session, the
/// refresh token itself expires and only a full re-auth recovers it.
/// Refresh-token lifetimes are typically ~30 days; a 24-hour
/// heartbeat keeps the window from closing for users who go quiet.
///
/// **What it runs.** A single cron job with a stable name
/// (`Self.jobName`) and a minimal one-token prompt. Executing the
/// job boots `hermes acp` end-to-end, which is what triggers the
/// refresh. There is no public Hermes CLI verb to refresh a token in
/// isolation today (no `hermes auth refresh <provider>`), so booting
/// a session is the only mechanism we have. When Hermes adds a
/// dedicated refresh verb, swap the prompt for a `--script` that
/// invokes it and the surrounding wiring stays unchanged.
///
/// **Identification.** The job is found by exact-match on
/// `Self.jobName`. Users can edit the schedule from the Cron tab
/// without breaking detection; only the name is load-bearing here.
@MainActor
final class OAuthKeepaliveCronService {
/// Stable job name. The leading `[scarf:oauth-keepalive]` prefix
/// follows the convention `ProjectTemplateInstaller` uses for
/// template-installed cron jobs (`[tmpl:<id>] `) so future
/// inspection tools can distinguish Scarf-owned schedules from
/// user-authored ones at a glance.
static let jobName = "[scarf:oauth-keepalive] OAuth token refresh"
/// 4am local daily. Off-peak avoids contending with interactive
/// usage and is a reasonable default; users can reschedule from
/// the Cron tab if they prefer a different cadence. The cron
/// window must stay <= the shortest refresh-token lifetime among
/// the user's configured OAuth providers (~30d for Nous).
static let defaultSchedule = "0 4 * * *"
/// Minimal prompt. The point is to boot a session, not to do
/// useful work, so we want the LLM call to terminate fast. A
/// one-word prompt + a one-word reply is the cheapest end-to-end
/// turn. Subscription-routed providers (Nous) bear zero
/// per-call cost; for API-key users, a single trivial turn per
/// day is negligible compared to the alternative of full re-auth
/// every month.
static let defaultPrompt = "Reply with the single word 'ok'."
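// With these defaults, enable() below shells out to roughly:
//   hermes cron create --name "[scarf:oauth-keepalive] OAuth token refresh" \
//   "0 4 * * *" "Reply with the single word 'ok'."
// (argument order mirrors the args array in enable(); the rendered
// command line is illustrative)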
private let logger = Logger(subsystem: "com.scarf", category: "OAuthKeepaliveCronService")
let context: ServerContext
private let fileService: HermesFileService
init(context: ServerContext = .local) {
self.context = context
self.fileService = HermesFileService(context: context)
}
// MARK: - Read
/// Returns the keepalive job if one is currently registered, nil
/// otherwise. Reads `~/.hermes/cron/jobs.json` synchronously via
/// the existing `loadCronJobs()` path.
nonisolated func currentJob() -> HermesCronJob? {
fileService.loadCronJobs().first { $0.name == Self.jobName }
}
nonisolated func isEnabled() -> Bool {
currentJob() != nil
}
// MARK: - Mutate
/// Register the keepalive job via `hermes cron create`. No-op when
/// a job with the same name already exists; toggle semantics
/// stay idempotent so a double-tap doesn't duplicate the entry.
/// Returns true on success or no-op, false on CLI failure.
@discardableResult
nonisolated func enable() async -> Bool {
if isEnabled() { return true }
// `hermes cron create` only accepts: --name, --deliver,
// --repeat, --skill, --script, --workdir. The `silent: Bool?`
// field on HermesCronJob is JSON-only (Hermes can write it,
// but the CLI's create verb doesn't expose a flag for it).
// Pass any unknown flag and argparse rejects the whole
// command, so stick to the supported surface and let Hermes
// pick its default delivery target; the side effect we care
// about (token refresh during session boot) fires regardless.
let result = await Task.detached { [fileService] in
fileService.runHermesCLI(
args: [
"cron", "create",
"--name", Self.jobName,
Self.defaultSchedule,
Self.defaultPrompt,
],
timeout: 60
)
}.value
if result.exitCode != 0 {
logger.warning("oauth-keepalive enable failed: exit=\(result.exitCode) output=\(result.output, privacy: .public)")
return false
}
return true
}
/// Remove the keepalive job. Idempotent: when no job exists
/// today, the call is a no-op success. Returns true on success
/// or no-op, false on CLI failure.
@discardableResult
nonisolated func disable() async -> Bool {
guard let job = currentJob() else { return true }
let result = await Task.detached { [fileService] in
fileService.runHermesCLI(
args: ["cron", "remove", job.id],
timeout: 30
)
}.value
if result.exitCode != 0 {
logger.warning("oauth-keepalive disable failed: exit=\(result.exitCode) output=\(result.output, privacy: .public)")
return false
}
return true
}
}
@@ -0,0 +1,280 @@
import Foundation
import os
import ScarfCore
/// Creates a Scarf-standard project from scratch: a minimal directory
/// tree with a placeholder `dashboard.json` + a stub `AGENTS.md` (just
/// the Scarf-managed marker block) and registers it. The
/// counterpart to `ProjectTemplateInstaller`: that one synthesizes a
/// project from a `.scarftemplate` plan; this one synthesizes a bare
/// shell that the agent fills in conversationally via the
/// `scarf-template-author` skill.
///
/// **Why this exists.** `AddProjectSheet` registers an existing
/// directory but doesn't create one; `ProjectTemplateInstaller`
/// creates a directory but only from a manifest. Neither produces a
/// fresh, hand-rolled, Scarf-standard project.
///
/// **What lands on disk.**
/// ```
/// <parent>/<slug>/
/// .scarf/
/// dashboard.json # placeholder single text widget
/// AGENTS.md # marker block only; refresh() populates it
/// ```
///
/// No `manifest.json`: scratch projects don't have a config schema,
/// so the Configuration sheet correctly degrades when missing.
/// No `template.lock.json`: there's no template install to undo.
struct ProjectScaffolder: Sendable {
private static let logger = Logger(subsystem: "com.scarf", category: "ProjectScaffolder")
let context: ServerContext
nonisolated init(context: ServerContext = .local) {
self.context = context
}
// MARK: - Public
/// Scaffold a new project at `<parentDir>/<slug>` and register it.
/// On any failure after the project dir is created, deletes the
/// dir and rethrows so the user isn't left with a half-created
/// project that doesn't show in the sidebar.
nonisolated func scaffold(
name: String,
slug: String,
parentDir: String,
description: String?
) throws -> ProjectEntry {
let cleanedName = name.trimmingCharacters(in: .whitespacesAndNewlines)
let cleanedSlug = slug.trimmingCharacters(in: .whitespacesAndNewlines)
let cleanedParent = Self.normalizeDirectoryPath(parentDir)
let cleanedDescription = description?.trimmingCharacters(in: .whitespacesAndNewlines)
guard !cleanedName.isEmpty else { throw ProjectScaffolderError.invalidName }
guard Self.isValidSlug(cleanedSlug) else {
throw ProjectScaffolderError.invalidSlug(cleanedSlug)
}
let transport = context.makeTransport()
// 1. Validate parent + collisions.
guard transport.fileExists(cleanedParent) else {
throw ProjectScaffolderError.parentDirMissing(cleanedParent)
}
let projectDir = cleanedParent + "/" + cleanedSlug
if transport.fileExists(projectDir) {
throw ProjectScaffolderError.projectDirExists(projectDir)
}
let dashboardService = ProjectDashboardService(context: context)
let registry = dashboardService.loadRegistry()
if registry.projects.contains(where: { $0.name == cleanedName }) {
throw ProjectScaffolderError.nameAlreadyRegistered(cleanedName)
}
if registry.projects.contains(where: { $0.path == projectDir }) {
throw ProjectScaffolderError.pathAlreadyRegistered(projectDir)
}
// 2. Create project + .scarf/ dir.
do {
try transport.createDirectory(projectDir + "/.scarf")
} catch {
// No partial state to clean up; createDirectory is the
// first write. Surface the error directly.
throw ProjectScaffolderError.createFailed(error.localizedDescription)
}
// From here on, on any failure, we clean up the project dir
// before rethrowing so the user can retry without bumping
// into the collision check.
do {
// 3. Write placeholder dashboard.json.
let dashboardData = try Self.makePlaceholderDashboard(
name: cleanedName,
description: cleanedDescription
)
try transport.writeFile(
projectDir + "/.scarf/dashboard.json",
data: dashboardData
)
// 4. Write AGENTS.md with just the marker block; the
// refresh() call below populates the content between the markers.
let agentsMd = ProjectContextBlock.beginMarker + "\n"
+ ProjectContextBlock.endMarker + "\n"
try transport.writeFile(
projectDir + "/AGENTS.md",
data: Data(agentsMd.utf8)
)
// 5. Register the project.
let entry = ProjectEntry(name: cleanedName, path: projectDir)
var nextRegistry = registry
nextRegistry.projects.append(entry)
try dashboardService.saveRegistry(nextRegistry)
// 6. Populate the marker block with project identity.
// Non-fatal: the chat handoff calls refresh() again
// anyway via startACPSession's project-prep step. Logging
// the failure here is enough.
do {
try ProjectAgentContextService(context: context).refresh(for: entry)
} catch {
Self.logger.warning(
"couldn't populate AGENTS.md marker block for \(entry.name, privacy: .public): \(error.localizedDescription, privacy: .public)"
)
}
Self.logger.info(
"scaffolded project \(cleanedName, privacy: .public) at \(projectDir, privacy: .public)"
)
return entry
} catch {
// Roll back the project dir. `LocalTransport.removeFile` is
// backed by `FileManager.removeItem` which is recursive for
// directories, so this cleans the dir + its `.scarf/` child
// in one call on local. SSH's `rm -f` is non-recursive, but
// the wizard's NSOpenPanel only browses local filesystems
// anyway; remote scaffolding isn't a supported entry point
// today. Best-effort either way: a failed cleanup logs but
// doesn't mask the original failure.
do {
try transport.removeFile(projectDir)
} catch {
Self.logger.warning(
"cleanup after scaffold failure left files at \(projectDir, privacy: .public): \(error.localizedDescription, privacy: .public)"
)
}
throw error
}
}
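// Typical call from the wizard (illustrative values):
//   let entry = try ProjectScaffolder().scaffold(
//   name: "Home Ops", slug: "home-ops",
//   parentDir: "/Users/alice/Projects", description: nil)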
// MARK: - Slug helpers
/// Default slug derivation from a project's display name. Used
/// by the wizard to pre-fill the editable "Folder Name" field.
/// Lowercases, replaces whitespace runs with `-`, strips any
/// character outside `[a-z0-9-]`, collapses `--` -> `-`, trims
/// leading/trailing `-`.
nonisolated static func suggestedSlug(from name: String) -> String {
let lowered = name.lowercased()
var slug = ""
var lastWasDash = false
for scalar in lowered.unicodeScalars {
let c = Character(scalar)
if c.isLetter || c.isNumber {
slug.append(c)
lastWasDash = false
} else if c.isWhitespace || c == "-" || c == "_" || c == "." {
if !lastWasDash && !slug.isEmpty {
slug.append("-")
lastWasDash = true
}
}
// Other characters (emoji, punctuation) silently dropped.
}
// Trim trailing dash.
while slug.hasSuffix("-") { slug.removeLast() }
return slug
}
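// Illustrative transformations (inputs hypothetical):
//   suggestedSlug(from: "My Cool Project")  -> "my-cool-project"
//   suggestedSlug(from: "Home Ops 2.0")     -> "home-ops-2-0"
//   suggestedSlug(from: "  __Weird  Name ") -> "weird-name"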
/// Validate a slug: at least one character, every character in
/// `[a-z0-9-]`, no leading/trailing `-`, no consecutive `--`.
nonisolated static func isValidSlug(_ slug: String) -> Bool {
guard !slug.isEmpty else { return false }
guard !slug.hasPrefix("-"), !slug.hasSuffix("-") else { return false }
if slug.contains("--") { return false }
for scalar in slug.unicodeScalars {
let c = Character(scalar)
let isLowerAlpha = ("a"..."z").contains(c)
let isDigit = ("0"..."9").contains(c)
let isDash = c == "-"
if !(isLowerAlpha || isDigit || isDash) {
return false
}
}
return true
}
// MARK: - Dashboard placeholder
nonisolated static func makePlaceholderDashboard(
name: String,
description: String?
) throws -> Data {
let placeholderWidget = DashboardWidget(
type: "text",
title: "Configure this project",
content: """
This project was just scaffolded by Scarf. \
Chat with the agent to add widgets, schedule jobs, and write \
instructions for future sessions. The `scarf-template-author` \
skill knows the project standard end-to-end.
""",
format: "markdown"
)
let section = DashboardSection(
title: "Setup",
columns: 1,
widgets: [placeholderWidget]
)
let dashboard = ProjectDashboard(
version: 1,
title: name,
description: description?.isEmpty == false ? description : nil,
updatedAt: ISO8601DateFormatter().string(from: Date()),
theme: nil,
sections: [section]
)
// Pretty-print so the file is readable when the user
// opens it in an editor; matches the dashboard.json
// shape produced by template installs.
let encoder = JSONEncoder()
encoder.outputFormatting = [.prettyPrinted, .sortedKeys]
return try encoder.encode(dashboard)
}
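// Rough shape of the resulting .scarf/dashboard.json (sketch only;
// assumes the synthesized Codable keys of the types above and that
// nil optionals are omitted):
//   { "sections" : [ { "columns" : 1, "title" : "Setup", "widgets" : [ ... ] } ],
//   "title" : "<name>", "updatedAt" : "<ISO8601 timestamp>", "version" : 1 }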
// MARK: - Helpers
/// Strip trailing `/` characters from a path so subsequent
/// `parent + "/" + slug` joins don't produce a `//` segment.
nonisolated static func normalizeDirectoryPath(_ path: String) -> String {
var p = path.trimmingCharacters(in: .whitespacesAndNewlines)
while p.count > 1 && p.hasSuffix("/") {
p.removeLast()
}
return p
}
}
enum ProjectScaffolderError: Error, LocalizedError {
case invalidName
case invalidSlug(String)
case parentDirMissing(String)
case projectDirExists(String)
case nameAlreadyRegistered(String)
case pathAlreadyRegistered(String)
case createFailed(String)
var errorDescription: String? {
switch self {
case .invalidName:
return "Project name can't be empty."
case .invalidSlug(let s):
return "Folder name \"\(s)\" must be lowercase letters, numbers, and dashes only — no leading/trailing or doubled dashes."
case .parentDirMissing(let p):
return "Parent directory doesn't exist: \(p)"
case .projectDirExists(let p):
return "A folder already exists at \(p). Pick a different name."
case .nameAlreadyRegistered(let n):
return "A project named \"\(n)\" is already registered."
case .pathAlreadyRegistered(let p):
return "A project at \(p) is already registered."
case .createFailed(let msg):
return "Couldn't create the project directory: \(msg)"
}
}
}
@@ -29,6 +29,20 @@ struct ProjectTemplateInstaller: Sendable {
let cronJobNames = try createCronJobs(plan: plan)
let entry = try registerProject(plan: plan)
try writeLockFile(plan: plan, cronJobNames: cronJobNames)
// Mirror resolved Keychain secrets into ~/.hermes/.env so the
// template's cron jobs (and any other agent process Hermes
// spawns) can use them via $SCARF_<SLUG>_<FIELD>. Hermes
// reloads .env fresh on every cron tick, so this takes effect
// without a restart. Failure is non-fatal: the install
// itself succeeded; the launch-time reconciler retries on
// next app start.
do {
try KeychainEnvMirror(context: context).mirror(project: entry)
} catch {
Self.logger.warning("install couldn't mirror secrets to ~/.hermes/.env: \(error.localizedDescription, privacy: .public)")
}
Self.logger.info("installed template \(plan.manifest.id, privacy: .public) v\(plan.manifest.version, privacy: .public) into \(plan.projectDir, privacy: .public)")
return entry
}
@@ -141,6 +141,21 @@ struct ProjectTemplateUninstaller: Sendable {
nonisolated func uninstall(plan: TemplateUninstallPlan) throws {
let transport = context.makeTransport()
// 0. Strip the project's block from ~/.hermes/.env BEFORE we
// delete project files; KeychainEnvMirror.unmirror reads the
// cached manifest at <project>/.scarf/manifest.json to recover
// the slug. After step 1 deletes that file, the slug is only
// recoverable by name, which is fine but more brittle. Run
// first while the cached manifest is still around. Failure is
// non-fatal: a stale block in .env is benign (env vars
// referencing a deleted project just sit there) and a fresh
// install at the same slug will overwrite it.
do {
try KeychainEnvMirror(context: context).unmirror(project: plan.project)
} catch {
Self.logger.warning("uninstall couldn't strip secrets block from ~/.hermes/.env: \(error.localizedDescription, privacy: .public)")
}
// 1. Project files (tracked only; user additions untouched).
for file in plan.projectFilesToRemove {
do {
@@ -0,0 +1,197 @@
import Foundation
import os
import ScarfCore
/// Copies skills shipped inside the app bundle into the user's
/// `~/.hermes/skills/` so they're always available without the user
/// having to install a template first. Idempotent + version-gated:
/// skips when the destination is the same version, copies on missing
/// or older, leaves a user-edited newer destination alone.
///
/// **Why this exists.** The "New Project from Scratch" wizard hands
/// off to the agent and expects it to invoke `scarf-template-author`,
/// which is the comprehensive interview-and-scaffold skill. That skill
/// is currently distributed as part of the `awizemann/template-author`
/// template, so gating the wizard's skill story behind "first install
/// this template" would be a worse first-run experience than today's.
/// Bootstrapping it from the app bundle decouples the skill's
/// availability from any one template install.
///
/// **What gets bootstrapped.** Every subdirectory of
/// `Bundle.main/Resources/Skills/` is treated as one skill (its name
/// is the directory name). Currently that's just
/// `scarf-template-author`; future built-in skills can drop their dir
/// next to it and be picked up automatically.
struct SkillBootstrapService: Sendable {
private static let logger = Logger(subsystem: "com.scarf", category: "SkillBootstrapService")
let context: ServerContext
nonisolated init(context: ServerContext = .local) {
self.context = context
}
/// Walk every skill in the app bundle and ensure its installed
/// copy at `~/.hermes/skills/<name>/` is at least the bundled
/// version. Throws on transport failures (e.g. a missing
/// `~/.hermes` for a remote without one set up); callers should
/// log and continue a failed bootstrap shouldn't block app
/// launch.
nonisolated func ensureBundledSkillsInstalled() throws {
guard let bundleSkillsDir = Self.bundleSkillsDir() else {
Self.logger.info("no bundled Skills/ directory; skipping bootstrap")
return
}
let fm = FileManager.default
let entries: [URL]
do {
entries = try fm.contentsOfDirectory(
at: bundleSkillsDir,
includingPropertiesForKeys: [.isDirectoryKey],
options: [.skipsHiddenFiles]
)
} catch {
Self.logger.warning("couldn't list bundled skills dir: \(error.localizedDescription, privacy: .public)")
return
}
let transport = context.makeTransport()
let destRoot = context.paths.skillsDir
try transport.createDirectory(destRoot)
for skillDir in entries {
var isDir: ObjCBool = false
guard fm.fileExists(atPath: skillDir.path, isDirectory: &isDir), isDir.boolValue else {
continue
}
let skillName = skillDir.lastPathComponent
do {
try installSkill(from: skillDir, named: skillName, transport: transport)
} catch {
Self.logger.warning("couldn't bootstrap skill \(skillName, privacy: .public): \(error.localizedDescription, privacy: .public)")
}
}
}
// MARK: - Per-skill install
private nonisolated func installSkill(
from sourceDir: URL,
named skillName: String,
transport: any ServerTransport
) throws {
let destDir = context.paths.skillsDir + "/" + skillName
let destSkillMd = destDir + "/SKILL.md"
let bundledSkillMd = sourceDir.appendingPathComponent("SKILL.md")
let bundledData = try Data(contentsOf: bundledSkillMd)
let bundledVersion = Self.parseVersion(bundledData) ?? "0.0.0"
let installedVersion: String? = {
guard transport.fileExists(destSkillMd) else { return nil }
guard let data = try? transport.readFile(destSkillMd) else { return nil }
return Self.parseVersion(data)
}()
// Only copy when the destination is missing OR older than the
// bundled copy. A user with a newer hand-edited skill keeps
// their version untouched.
if let installed = installedVersion,
Self.semverCompare(installed, bundledVersion) >= 0 {
Self.logger.info(
"skill \(skillName, privacy: .public) at \(installed, privacy: .public) is current (bundled: \(bundledVersion, privacy: .public)); skipping"
)
return
}
try transport.createDirectory(destDir)
try transport.writeFile(destSkillMd, data: bundledData)
// Carry any companion files (assets, examples, etc.) the skill
// ships alongside SKILL.md. Walks one level deep; skills don't
// ship deep trees today, and wider compat for that can wait
// until a use case appears.
if let extras = try? FileManager.default.contentsOfDirectory(
at: sourceDir,
includingPropertiesForKeys: nil,
options: [.skipsHiddenFiles]
) {
for url in extras where url.lastPathComponent != "SKILL.md" {
let data = try Data(contentsOf: url)
let dest = destDir + "/" + url.lastPathComponent
try transport.writeFile(dest, data: data)
}
}
Self.logger.info(
"bootstrapped skill \(skillName, privacy: .public) at v\(bundledVersion, privacy: .public) (was: \(installedVersion ?? "missing", privacy: .public))"
)
}
// MARK: - Frontmatter version parse
/// Pull the `version: X.Y.Z` value from a SKILL.md's YAML
/// frontmatter. Returns nil when no version line is present, so
/// the caller can treat the destination as "unknown" and replace
/// it with the bundled copy on the safe side.
nonisolated static func parseVersion(_ data: Data) -> String? {
guard let text = String(data: data, encoding: .utf8) else { return nil }
var inFrontmatter = false
for rawLine in text.split(separator: "\n", omittingEmptySubsequences: false) {
let line = String(rawLine)
let trimmed = line.trimmingCharacters(in: .whitespaces)
if trimmed == "---" {
if !inFrontmatter {
inFrontmatter = true
continue
} else {
return nil
}
}
guard inFrontmatter else { return nil }
if trimmed.hasPrefix("version:") {
let value = trimmed
.dropFirst("version:".count)
.trimmingCharacters(in: .whitespaces)
.trimmingCharacters(in: CharacterSet(charactersIn: "\"'"))
return value.isEmpty ? nil : value
}
}
return nil
}
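// Example frontmatter this accepts (illustrative):
//   ---
//   name: scarf-template-author
//   version: "1.2.0"
//   ---
// parseVersion on that data returns "1.2.0"; a SKILL.md whose
// frontmatter has no `version:` line returns nil.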
/// Three-component numeric semver compare. Returns -1, 0, +1.
/// Non-numeric components fall back to lexicographic; fine for
/// the conservative "skip if installed >= bundled" use case.
nonisolated static func semverCompare(_ a: String, _ b: String) -> Int {
let lhs = a.split(separator: ".").map { String($0) }
let rhs = b.split(separator: ".").map { String($0) }
let count = max(lhs.count, rhs.count)
for i in 0..<count {
let l = i < lhs.count ? lhs[i] : "0"
let r = i < rhs.count ? rhs[i] : "0"
if let li = Int(l), let ri = Int(r) {
if li < ri { return -1 }
if li > ri { return 1 }
} else {
if l < r { return -1 }
if l > r { return 1 }
}
}
return 0
}
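// Spot-checks (illustrative): semverCompare("1.2.0", "1.10.0") -> -1
// (numeric, not lexicographic); semverCompare("1.2", "1.2.0") -> 0
// (missing components read as "0").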
// MARK: - Bundle access
/// Locate the bundled-skills directory inside the app bundle.
/// We ship skills inside a `.bundle` folder so Xcode preserves the
/// internal directory structure (a plain folder of resources gets
/// flattened by `PBXFileSystemSynchronizedRootGroup`). The
/// `BuiltinSkills.bundle` is then walked at runtime exactly like
/// any directory of `<skill-name>/SKILL.md` entries. Returns nil
/// when the app wasn't bundled with skills (unit test hosts,
/// local dev runs against a stripped-down bundle).
nonisolated private static func bundleSkillsDir() -> URL? {
Bundle.main.url(forResource: "BuiltinSkills", withExtension: "bundle")
}
}
@@ -1,4 +1,5 @@
import Foundation
import ScarfCore
import Sparkle
/// Thin wrapper around Sparkle's `SPUStandardUpdaterController`.
@@ -24,9 +25,15 @@ final class UpdaterService: NSObject {
override init() {
// startingUpdater: true -> Sparkle scans for updates on launch per Info.plist schedule.
// Default delegates are sufficient for a non-sandboxed app.
// Under `--scarf-test-mode` we keep Sparkle inert so XCUITest runs
// never see an "update is available" sheet pop on top of the
// window the test is trying to drive. The controller still
// initializes; `automaticallyChecksForUpdates` reads/writes
// continue to work; it just doesn't fire the on-launch check
// or surface UI.
let startUpdater = !TestModeFlags.shared.isTestMode
self.controller = SPUStandardUpdaterController(
startingUpdater: true,
startingUpdater: startUpdater,
updaterDelegate: nil,
userDriverDelegate: nil
)
@@ -0,0 +1,73 @@
import AppKit
import SwiftUI
/// Persist a SwiftUI `WindowGroup` window's frame (size + position) across
/// app launches by hooking into AppKit's `NSWindow.setFrameAutosaveName`.
///
/// **Why this exists.** SwiftUI's `WindowGroup` exposes `.defaultSize`,
/// `.windowResizability`, and (on macOS Sonoma+) various scene modifiers
/// but not a "remember this window's size between launches" affordance.
/// Apple's documented escape hatch is AppKit's `setFrameAutosaveName(_:)`,
/// which writes the window's frame to UserDefaults on resize/move and
/// reads it back on next `makeKey`. We bridge into it from SwiftUI via an
/// invisible `NSViewRepresentable` that finds the hosting `NSWindow`
/// and stamps the autosave name once it appears.
///
/// **Usage.**
/// ContentView()
/// .windowFrameAutosave("Scarf.\(context.id)")
///
/// Pass a stable identifier per logical window. Different identifiers per
/// window are required by AppKit ("no two windows can be associated with
/// the same name simultaneously", per the `NSWindow.setFrameAutosaveName(_:)`
/// docs). For Scarf's multi-window-per-server model, keying off
/// `ServerID` gives each server window its own remembered frame.
///
/// **First-launch behaviour.** When no saved frame exists, AppKit leaves the
/// window at whatever frame SwiftUI's `.defaultSize` produced. After the
/// first user resize, AppKit autosaves and subsequent opens restore the
/// new frame.
///
/// **What it doesn't do.** Doesn't capture/restore fullscreen state
/// (AppKit handles that separately and reasonably). Doesn't try to
/// override window state restoration when the user has the system-level
/// "Close windows when quitting an application" setting OFF that
/// pathway runs first and we just ride alongside.
struct WindowFrameAutosave: NSViewRepresentable {
let name: String
func makeNSView(context: Context) -> NSView {
let view = NSView(frame: .zero)
// The hosting NSWindow isn't attached to this view yet at
// makeNSView time; SwiftUI mounts the AppKit view hierarchy
// before the window assignment propagates. Defer one runloop
// iteration so `view.window` is non-nil when we stamp.
DispatchQueue.main.async { [weak view] in
view?.window?.setFrameAutosaveName(name)
}
return view
}
func updateNSView(_ nsView: NSView, context: Context) {
// SwiftUI may swap the host window in rare cases (window
// restoration after a relaunch, scene reuse). Re-stamp on
// update so we don't lose the autosave binding silently.
// setFrameAutosaveName is idempotent for the same name on
// the same window; assigning the same name twice is a no-op.
DispatchQueue.main.async { [weak nsView] in
guard let window = nsView?.window else { return }
if window.frameAutosaveName != name {
window.setFrameAutosaveName(name)
}
}
}
}
extension View {
/// Persist this view's hosting window's frame (size + position)
/// across launches under `name`. See `WindowFrameAutosave` for
/// details.
func windowFrameAutosave(_ name: String) -> some View {
background(WindowFrameAutosave(name: name))
}
}
@@ -3,12 +3,22 @@ import SwiftUI
struct MarkdownContentView: View {
let content: String
/// Chat font scale plumbed from `RichChatView` (issue #68). Defaults
/// to 1.0 when this view is used outside the chat surface so other
/// callers see the un-scaled rendering.
@Environment(\.chatFontScale) private var chatFontScale: Double
var body: some View {
VStack(alignment: .leading, spacing: 6) {
ForEach(Array(parseBlocks().enumerated()), id: \.offset) { _, block in
blockView(block)
}
}
// Paragraphs are rendered as plain `Text(AttributedString)` and
// inherit whatever font is set on the enclosing scope. Pin the
// scope to the scaled body font so the chat slider actually
// moves the visible text.
.font(ChatFontScale.body(chatFontScale))
}
@ViewBuilder
@@ -37,15 +47,19 @@ struct MarkdownContentView: View {
// MARK: - Block Views
private func headingView(level: Int, text: String) -> some View {
let font: Font = switch level {
case 1: .title.bold()
case 2: .title2.bold()
case 3: .title3.bold()
case 4: .headline
default: .subheadline.bold()
// Heading sizes scale with `chatFontScale` (issue #68). Bases
// mirror the SwiftUI semantic tokens we used previously
// (`.title` = 28, `.title2` = 22, `.title3` = 20,
// `.headline` = 17, `.subheadline` = 15) so 100% matches today's UI.
let baseSize: CGFloat = switch level {
case 1: 28
case 2: 22
case 3: 20
case 4: 17
default: 15
}
return Text(MarkdownRenderer.inlineAttributedString(text))
.font(font)
.font(.system(size: baseSize * chatFontScale, weight: .semibold))
.textSelection(.enabled)
.padding(.top, level <= 2 ? 8 : 4)
}
@@ -54,11 +68,11 @@ struct MarkdownContentView: View {
VStack(alignment: .leading, spacing: 4) {
if let lang = language, !lang.isEmpty {
Text(lang)
.font(.caption2.bold())
.font(ChatFontScale.caption2(chatFontScale).bold())
.foregroundStyle(.secondary)
}
Text(code)
.font(.system(.callout, design: .monospaced))
.font(ChatFontScale.codeInline(chatFontScale))
.textSelection(.enabled)
.frame(maxWidth: .infinity, alignment: .leading)
}
@@ -19,12 +19,17 @@ struct ActivityView: View {
VStack(spacing: 0) {
pageHeader
filterStrip
if let err = viewModel.loadError {
loadErrorBanner(err)
}
ScrollView {
LazyVStack(alignment: .leading, spacing: ScarfSpace.s5) {
ForEach(groupedByDay) { group in
dayGroup(group)
}
if viewModel.filteredActivity.isEmpty && !viewModel.isLoading {
if viewModel.isLoading && viewModel.filteredActivity.isEmpty {
loadingState
} else if viewModel.filteredActivity.isEmpty && viewModel.loadError == nil {
emptyState
}
}
@@ -43,6 +48,53 @@ struct ActivityView: View {
.sheet(isPresented: detailSheetBinding) { detailSheet }
}
/// Spinner + label rendered while the first load is in flight and
/// the feed is still empty. v2.8 fix: pre-fix, `isLoading=true`
/// rendered nothing because the empty-state was gated on
/// `!isLoading`, leaving the user staring at a blank pane during
/// the SSH round-trip.
private var loadingState: some View {
HStack(spacing: ScarfSpace.s3) {
ProgressView().controlSize(.small)
Text("Loading activity…")
.scarfStyle(.body)
.foregroundStyle(ScarfColor.foregroundMuted)
}
.frame(maxWidth: .infinity)
.padding(ScarfSpace.s6)
}
/// Orange banner shown above the feed when the most recent load
/// hit a transport failure. Replaces the silent empty-state that
/// pre-v2.8 left users thinking Activity was broken.
private func loadErrorBanner(_ message: String) -> some View {
HStack(alignment: .top, spacing: ScarfSpace.s2) {
Image(systemName: "exclamationmark.triangle.fill")
.foregroundStyle(.orange)
VStack(alignment: .leading, spacing: 2) {
Text("Couldn't load activity")
.scarfStyle(.bodyEmph)
.foregroundStyle(ScarfColor.foregroundPrimary)
Text(message)
.scarfStyle(.caption)
.foregroundStyle(ScarfColor.foregroundMuted)
.textSelection(.enabled)
}
Spacer()
Button("Retry") {
Task { await viewModel.load() }
}
.buttonStyle(.bordered)
.controlSize(.small)
}
.padding(ScarfSpace.s3)
.background(Color.orange.opacity(0.08))
.overlay(
Rectangle().fill(Color.orange.opacity(0.25)).frame(height: 1),
alignment: .bottom
)
}
// MARK: - Page header
private var pageHeader: some View {
@@ -56,6 +108,17 @@ struct ActivityView: View {
.foregroundStyle(ScarfColor.foregroundMuted)
}
Spacer()
if viewModel.isHydratingToolCalls {
HStack(spacing: 6) {
ProgressView().controlSize(.small)
Text("Loading tool details…")
.scarfStyle(.caption)
.foregroundStyle(ScarfColor.foregroundMuted)
}
.padding(.horizontal, ScarfSpace.s3)
.padding(.vertical, 4)
.background(.thinMaterial, in: Capsule())
}
}
.padding(.horizontal, ScarfSpace.s6)
.padding(.top, ScarfSpace.s5)
@@ -321,19 +384,25 @@ private struct ActivityRow: View {
ZStack {
RoundedRectangle(cornerRadius: 6, style: .continuous)
.fill(toneBackground)
Image(systemName: entry.kind.icon)
.font(.system(size: 12))
.foregroundStyle(toneForeground)
if entry.isPlaceholder {
ProgressView().controlSize(.mini)
} else {
Image(systemName: entry.kind.icon)
.font(.system(size: 12))
.foregroundStyle(toneForeground)
}
}
.frame(width: 26, height: 26)
VStack(alignment: .leading, spacing: 1) {
Text(entry.toolName)
.scarfStyle(.body)
.foregroundStyle(ScarfColor.foregroundPrimary)
.foregroundStyle(entry.isPlaceholder ? ScarfColor.foregroundMuted : ScarfColor.foregroundPrimary)
.lineLimit(1)
Group {
if entry.summary.isEmpty {
if entry.isPlaceholder {
Text("Tool calls hydrating in the background…")
} else if entry.summary.isEmpty {
Text(entry.kind.displayName)
} else {
Text(entry.summary)
@@ -345,16 +414,20 @@ private struct ActivityRow: View {
.truncationMode(.middle)
}
Spacer(minLength: 8)
Image(systemName: "chevron.right")
.font(.system(size: 11))
.foregroundStyle(ScarfColor.foregroundFaint)
if !entry.isPlaceholder {
Image(systemName: "chevron.right")
.font(.system(size: 11))
.foregroundStyle(ScarfColor.foregroundFaint)
}
}
.padding(.horizontal, ScarfSpace.s4)
.padding(.vertical, ScarfSpace.s3 - 2)
.background(hover ? ScarfColor.backgroundTertiary.opacity(0.6) : Color.clear)
.background(hover && !entry.isPlaceholder ? ScarfColor.backgroundTertiary.opacity(0.6) : Color.clear)
.opacity(entry.isPlaceholder ? 0.65 : 1.0)
.contentShape(Rectangle())
}
.buttonStyle(.plain)
.disabled(entry.isPlaceholder)
.onHover { hover = $0 }
}
@@ -1,4 +1,5 @@
import SwiftUI
import ScarfCore
/// Scarf-local chat rendering preferences (issues #47 / #48).
///
@@ -22,6 +23,16 @@ enum ChatDensityKeys {
/// When hidden, clicking a tool card auto-flips it back on so the
/// click does what the user expects (`ToolCallCard.onFocus`). Issue #58.
static let showInspector = "scarf.chat.showInspector"
/// v2.8 opt-in auto-fetch of tool result CONTENT in past chats.
/// Defaults FALSE because a single tool result blob (file dump,
/// stack trace) can be hundreds of KB; bulk-fetching all of them
/// during chat resume on a slow remote can blow past the 30s SSH
/// timeout (observed in 2026-05-05 dogfooding). When false, tool
/// CALL cards still render (the `tool_calls` JSON path is bounded
/// and fast); only the inspector pane's "Output" section is empty
/// until the user expands a card, at which point we lazy-fetch
/// just that single result via `fetchToolResult(callId:)`.
static let loadHistoricalToolResults = RichChatViewModel.loadHistoricalToolResultsKey
}
/// How `RichMessageBubble` renders the per-call tool widgets.
@@ -106,4 +117,74 @@ enum ChatFontScale {
let pct = Int((scale * 100).rounded())
return "\(pct)%"
}
// MARK: - Scaled font helpers
//
// ScarfFont's tokens are fixed-point (`Font.system(size: 14, ...)`),
// so `.environment(\.dynamicTypeSize, ...)` doesn't reach them; the
// Mac chat slider had no visible effect on bubbles, reasoning,
// tool chips, or code blocks (issue #68). These helpers mirror the
// ScarfFont base sizes, multiplied by the user's chat scale, and
// are used by `RichMessageBubble`, `MarkdownContentView`, and
// `CodeBlockView` in place of the static tokens. At scale = 1.0
// they're byte-for-byte identical to ScarfFont so the default UI
// is unchanged.
static func body(_ scale: Double) -> Font {
.system(size: 14 * scale, weight: .regular)
}
static func bodyEmph(_ scale: Double) -> Font {
.system(size: 14 * scale, weight: .medium)
}
static func callout(_ scale: Double) -> Font {
.system(size: 15 * scale, weight: .regular)
}
static func caption(_ scale: Double) -> Font {
.system(size: 12 * scale, weight: .regular)
}
static func captionStrong(_ scale: Double) -> Font {
.system(size: 12 * scale, weight: .semibold)
}
static func caption2(_ scale: Double) -> Font {
.system(size: 10 * scale, weight: .medium)
}
static func mono(_ scale: Double) -> Font {
.system(size: 13 * scale, weight: .regular, design: .monospaced)
}
static func monoSmall(_ scale: Double) -> Font {
.system(size: 12 * scale, weight: .regular, design: .monospaced)
}
/// Code-block body matches `CodeBlockView`'s 12pt mono.
static func codeBlock(_ scale: Double) -> Font {
.system(size: 12 * scale, weight: .regular, design: .monospaced)
}
/// Inline code in markdown paragraphs: `.callout` (15pt) mono.
static func codeInline(_ scale: Double) -> Font {
.system(size: 15 * scale, weight: .regular, design: .monospaced)
}
}
// MARK: - Environment plumbing
private struct ChatFontScaleKey: EnvironmentKey {
static let defaultValue: Double = ChatFontScale.default
}
extension EnvironmentValues {
/// Multiplier applied to chat content fonts. Set once on
/// `RichChatView`'s root so message bubbles, markdown paragraphs,
/// and code blocks scale together. Default 1.0 = today's UI.
var chatFontScale: Double {
get { self[ChatFontScaleKey.self] }
set { self[ChatFontScaleKey.self] = newValue }
}
}
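// Sketch of the intended wiring (names beyond chatFontScale are
// hypothetical): the chat root sets the environment once,
//   RichChatView().environment(\.chatFontScale, userScale)
// and leaf views read it via
//   @Environment(\.chatFontScale) private var chatFontScale
// exactly as MarkdownContentView does above.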
@@ -34,6 +34,31 @@ final class ChatViewModel {
var recentSessions: [HermesSession] = []
var sessionPreviews: [String: String] = [:]
/// Debounce handle for watcher-driven `loadRecentSessions` calls.
/// During an active ACP conversation the file watcher fires many
/// times per second (every message Hermes persists writes to
/// `state.db-wal`); without this, every tick spawned a fresh
/// reload task whose `recentSessions = ` reassignment re-rendered
/// the chat sidebar and caused the list to visibly disappear /
/// reappear during a streaming response. The debounce coalesces
/// rapid bursts into one trailing fetch ~500 ms after the last
/// tick. Created/resumed sessions still appear immediately because
/// `startACPSession` and `autoStartACPAndSend` call
/// `loadRecentSessions()` directly outside this path.
@ObservationIgnored
private var sessionsRefreshTask: Task<Void, Never>?
/// L2 (v2.8) in-flight coalescing handle for `loadRecentSessions`.
/// On a slow remote each load is a 1.5-2.5s SSH round-trip; the
/// 500 ms `scheduleSessionsRefresh` debounce only suppresses a
/// pending tick, not one that's already executing. Without this
/// guard, file-watcher deltas during a stream stack 2-3 parallel
/// loadRecentSessions tasks (observed at t=305844 in 2026-05-05
/// dogfooding). The in-flight pointer lets a second caller await
/// the active task instead of spawning another SSH subprocess.
@ObservationIgnored
private var inFlightSessionLoad: Task<Void, Never>?
/// Per-recent-session project attribution. Keyed by `HermesSession.id`,
/// value is the project's display name. Populated alongside
/// `recentSessions` via a single batched read in `loadRecentSessions()`.
@@ -108,21 +133,57 @@ final class ChatViewModel {
var isACPConnected: Bool { acpClient != nil && hasActiveProcess }
var acpStatus: String = ""
/// User-facing status strings that all map to "the session is in
/// the middle of being established." Centralized so the toolbar
/// status pill, the chat-pane loader, and `ChatSessionListPane`'s
/// click-gating stay in sync. v2.8 added `loadingHistory` after
/// the user reported the chat looked engageable while the
/// 30-second `fetchMessages` was still in flight on a slow remote.
static let preparingPhases: Set<String> = [
ACPPhase.spawning,
ACPPhase.authenticating,
ACPPhase.creatingSession,
ACPPhase.creatingNewSession,
ACPPhase.loadingSession,
ACPPhase.loadingHistory
]
enum ACPPhase {
static let spawning = "Spawning hermes acp…"
static let authenticating = "Authenticating…"
static let creatingSession = "Creating session…"
static let creatingNewSession = "Creating new session…"
static let loadingSession = "Loading session…"
static let loadingHistory = "Loading history…"
static let ready = "Ready"
static let agentWorking = "Agent working…"
static let cancelled = "Cancelled"
static let failed = "Failed"
static let error = "Error"
static let connectionLost = "Connection lost"
}
/// Set true the moment the user kicks off a session-start path
/// (resume / new / continue), cleared when the ACP session is
/// fully ready or has failed. Decoupled from `hasActiveProcess`:
/// that flag only flips true AFTER `client.start()` succeeds,
/// which on remote contexts is a 5–7 s window where the user sees
/// nothing happening even though they've just clicked. v2.8
/// fixes the gap between row-click and overlay-appears that
/// the user reported in 2026-05-05 dogfooding.
var isStartingSession: Bool = false
/// True while a session is being established or restored from the user
/// kicking off "start chat" or "resume session" until the ACP session is
/// ready for messages. The chat pane uses this to show a loader in place
/// of the empty-state placeholder.
/// of the empty-state placeholder; `ChatSessionListPane` uses it to
/// disable session-row taps so the user can't queue up a second
/// switch while the first is still mid-boot (v2.8).
var isPreparingSession: Bool {
if isStartingSession { return true }
guard hasActiveProcess else { return false }
switch acpStatus {
case "Starting...",
"Creating session...",
"Creating new session...",
"Loading session...":
return true
default:
return acpStatus.hasPrefix("Reconnecting")
}
if Self.preparingPhases.contains(acpStatus) { return true }
return acpStatus.hasPrefix("Reconnecting")
}
/// Error triplet moved to RichChatViewModel in M7 #2 so ScarfGo can
/// share the same banner. These are forwarding accessors to keep
@@ -139,9 +200,23 @@ final class ChatViewModel {
get { richChatViewModel.acpErrorDetails }
set { richChatViewModel.acpErrorDetails = newValue }
}
var acpErrorOAuthProvider: String? {
get { richChatViewModel.acpErrorOAuthProvider }
set { richChatViewModel.acpErrorOAuthProvider = newValue }
}
/// True when `hasAnyAICredential()` returned false at last preflight.
var missingCredentials: Bool = false
/// `model.default` / `model.provider` mismatch detected by the
/// last `refreshConfigDiagnostics` pass. Drives the "Configuration
/// mismatch" banner in `errorBanner`. Nil when config is coherent
/// or unset. v2.8: observed in dogfooding when switching OAuth
/// providers via Credential Pools left a stale model prefix
/// behind (e.g. `model.default: anthropic/...` with
/// `model.provider: nous`); chats died with `-32603 Internal error`
/// at first prompt with no diagnostic.
var modelProviderMismatch: ModelPreflight.Mismatch?
/// Set when chat-start is blocked because the active server's
/// `config.yaml` has no `model.default` / `model.provider`. The chat
/// view observes this and presents `ChatModelPreflightSheet`; on
@@ -154,7 +229,7 @@ final class ChatViewModel {
/// for the user to pick a model. Replayed verbatim once
/// `confirmModelPreflight` writes the chosen model+provider to
/// config.yaml. Cleared on cancel or after replay.
private var pendingStartArgs: (sessionId: String?, projectPath: String?)?
private var pendingStartArgs: (sessionId: String?, projectPath: String?, initialPrompt: String?)?
private static let maxReconnectAttempts = 5
private static let reconnectBaseDelay: UInt64 = 1_000_000_000 // 1 second
@@ -173,6 +248,72 @@ final class ChatViewModel {
missingCredentials = !fileService.hasAnyAICredential()
}
/// Re-reads config.yaml and refreshes the
/// `model.default` / `model.provider` mismatch state. Off-MainActor
/// because `loadConfig()` is a synchronous file read (and an SSH
/// round-trip on remote contexts). Safe to call from `.task` or
/// after a write that would have changed config.
func refreshConfigDiagnostics() {
let svc = fileService
Task.detached { [weak self] in
let config = svc.loadConfig()
let mismatch = ModelPreflight.detectMismatch(config)
await MainActor.run { [weak self] in
self?.modelProviderMismatch = mismatch
}
}
}
/// Persist a one-click mismatch fix. Aligns `model.provider` to the
/// prefix carried in `model.default` (the user's "I just authed
/// against this provider, that's what the prefix means" intent).
/// Triggers a config-diagnostics refresh on completion to clear the
/// banner if the write took. Failures fall through to the existing
/// `acpError` banner so the user sees something happened.
func alignProviderToModelPrefix(_ mismatch: ModelPreflight.Mismatch) {
let svc = fileService
Task.detached { [weak self] in
// We pass the bare model so config.yaml ends up with a
// clean (provider-prefix-free) model name alongside the
// matching provider; this matches what `confirmModelPreflight`
// writes for a fresh setup.
let ok = svc.setModelAndProvider(
model: mismatch.bareModel,
provider: mismatch.prefixProvider
)
await MainActor.run { [weak self] in
guard let self else { return }
if ok {
self.modelProviderMismatch = nil
} else {
self.acpError = "Couldn't write the new provider to config.yaml. Open Settings to fix manually."
}
}
}
}
/// Persist the inverse mismatch fix: strip the provider prefix
/// off `model.default` and keep `model.provider` as the active
/// authoritative value. Use case: the user genuinely intended to
/// switch their active provider and the stale prefix is the bug.
func stripPrefixFromModelDefault(_ mismatch: ModelPreflight.Mismatch) {
let svc = fileService
Task.detached { [weak self] in
let ok = svc.setModelAndProvider(
model: mismatch.bareModel,
provider: mismatch.activeProvider
)
await MainActor.run { [weak self] in
guard let self else { return }
if ok {
self.modelProviderMismatch = nil
} else {
self.acpError = "Couldn't rewrite model.default in config.yaml. Open Settings to fix manually."
}
}
}
}
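// Hedged sketch of the mismatch shape the two fixes above consume. The
// shipped `ModelPreflight.Mismatch` / `detectMismatch(_:)` aren't shown in
// this diff; the field names are inferred from the call sites, and the
// `ModelPreflightSketch` type, its signature, and its parsing rules are
// illustrative assumptions (assumes Foundation for the trimming call).
enum ModelPreflightSketch {
    struct Mismatch {
        let modelDefault: String    // raw model.default, e.g. "anthropic/claude-x"
        let bareModel: String       // model id with the provider prefix stripped
        let prefixProvider: String  // provider named by the prefix, e.g. "anthropic"
        let activeProvider: String  // model.provider as configured
    }

    /// Nil when config is coherent: no prefix, or prefix == provider.
    static func detectMismatch(modelDefault: String?, provider: String?) -> Mismatch? {
        guard
            let raw = modelDefault?.trimmingCharacters(in: .whitespacesAndNewlines),
            let active = provider?.trimmingCharacters(in: .whitespacesAndNewlines),
            !raw.isEmpty, !active.isEmpty,
            let slash = raw.firstIndex(of: "/")
        else { return nil }
        let prefix = String(raw[..<slash])
        let bare = String(raw[raw.index(after: slash)...])
        guard !prefix.isEmpty, !bare.isEmpty,
              prefix.lowercased() != active.lowercased()
        else { return nil }
        return Mismatch(modelDefault: raw, bareModel: bare,
                        prefixProvider: prefix, activeProvider: active)
    }
}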
/// Forwarders to the ScarfCore implementation so the error-banner
/// state lives in one place (M7 #2). The per-site logging label
/// stays here; only the storage is shared.
@@ -189,13 +330,30 @@ final class ChatViewModel {
// MARK: - Session Lifecycle
func startNewSession(projectPath: String? = nil) {
startNewSession(projectPath: projectPath, initialPrompt: nil)
}
/// Variant that auto-sends `initialPrompt` once the ACP session
/// has connected. Used by the "New Project from Scratch" wizard
/// (v2.8) to kick the conversation off with a message the agent
/// recognizes as a `scarf-template-author` invocation, so the user
/// doesn't have to type anything to begin the interview.
/// Terminal mode ignores the prompt; the wizard runs in rich-chat
/// only.
func startNewSession(projectPath: String?, initialPrompt: String?) {
// Flip the loading flag synchronously on the user's tap so
// SwiftUI paints the session-list overlay on the same tick;
// `startACPSession` won't reach `acpStatus = .spawning`
// until the Task body runs, which on remote contexts is
// multiple seconds after the click. v2.8.
isStartingSession = true
voiceEnabled = false
ttsEnabled = false
isRecording = false
richChatViewModel.reset()
if displayMode == .richChat {
startACPSession(resume: nil, projectPath: projectPath)
startACPSession(resume: nil, projectPath: projectPath, initialPrompt: initialPrompt)
} else {
// Terminal mode doesn't surface project attribution today;
// `hermes chat` uses the shell's cwd, so starting a terminal
@@ -206,7 +364,20 @@ final class ChatViewModel {
}
}
/// Start a new project-scoped ACP session and send `text` as the
/// first prompt once connected. Thin wrapper named for the
/// wizard's call site to make intent obvious; behaves identically
/// to `startNewSession(projectPath:initialPrompt:)`.
func startNewSessionAndSend(projectPath: String, text: String) {
// Force rich-chat: the wizard handoff doesn't make sense in
// terminal mode, and we'd silently swallow the initial prompt
// if the user happened to be on the terminal segment.
displayMode = .richChat
startNewSession(projectPath: projectPath, initialPrompt: text)
}
func resumeSession(_ sessionId: String) {
isStartingSession = true
voiceEnabled = false
ttsEnabled = false
isRecording = false
@@ -221,6 +392,7 @@ final class ChatViewModel {
}
func continueLastSession() {
isStartingSession = true
voiceEnabled = false
ttsEnabled = false
isRecording = false
@@ -231,6 +403,7 @@ final class ChatViewModel {
Task { @MainActor in
let opened = await dataService.open()
if !opened {
isStartingSession = false
acpError = context.isRemote
? "Couldn't reach \(context.displayName). Check the SSH connection and try again."
: "Couldn't open the Hermes state database."
@@ -293,6 +466,7 @@ final class ChatViewModel {
/// between the DB read and ACP `session/load`, producing a silent prompt
/// failure with no UI feedback.
private func autoStartACPAndSend(text: String, images: [ChatImageAttachment] = []) {
isStartingSession = true
// Show the user message immediately
richChatViewModel.addUserMessage(text: text)
@@ -303,8 +477,9 @@ final class ChatViewModel {
self.acpClient = client
do {
acpStatus = ACPPhase.spawning
try await client.start()
acpStatus = await client.statusMessage
acpStatus = ACPPhase.authenticating
startACPEventLoop(client: client)
startHealthMonitor(client: client)
@@ -314,26 +489,36 @@ final class ChatViewModel {
let resolvedSessionId: String
if let existing = sessionToResume {
acpStatus = "Loading session..."
acpStatus = ACPPhase.loadingSession
do {
resolvedSessionId = try await client.loadSession(cwd: cwd, sessionId: existing)
} catch {
logger.info("Session \(existing) not found in ACP, creating new session")
acpStatus = "Creating new session..."
acpStatus = ACPPhase.creatingNewSession
resolvedSessionId = try await client.newSession(cwd: cwd)
}
} else {
acpStatus = "Creating session..."
acpStatus = ACPPhase.creatingSession
resolvedSessionId = try await client.newSession(cwd: cwd)
}
richChatViewModel.setSessionId(resolvedSessionId)
acpStatus = "Connected (\(resolvedSessionId.prefix(12)))"
acpStatus = ACPPhase.ready
isStartingSession = false
// Surface the freshly-created session in the chat
// sidebar immediately. We can't lean on the file
// watcher to do this; it fires unconditionally
// through `scheduleSessionsRefresh` which has a
// 500 ms debounce. An explicit call here keeps the
// "type → see new chat in the list" feedback prompt.
await loadRecentSessions()
// Now send the queued prompt
sendViaACP(client: client, text: text, images: images)
} catch {
acpStatus = "Failed"
acpStatus = ACPPhase.failed
isStartingSession = false
await recordACPFailure(error, client: client, context: "Auto-start ACP failed")
hasActiveProcess = false
acpClient = nil
@@ -369,6 +554,7 @@ final class ChatViewModel {
}
private func sendViaACP(client: ACPClient, text: String, images: [ChatImageAttachment] = []) {
ScarfMon.event(.chatStream, "mac.sendViaACP", count: 1, bytes: text.utf8.count)
guard let sessionId = richChatViewModel.sessionId else {
clearACPErrorState()
acpError = "No session ID — cannot send"
@@ -404,21 +590,38 @@ final class ChatViewModel {
}
}
} else {
acpStatus = "Agent working..."
acpStatus = ACPPhase.agentWorking
}
acpPromptTask = Task { @MainActor in
do {
let result = try await client.sendPrompt(sessionId: sessionId, text: wireText, images: images)
acpStatus = "Ready"
let result = try await ScarfMon.measureAsync(.chatStream, "mac.sendPrompt") {
try await client.sendPrompt(sessionId: sessionId, text: wireText, images: images)
}
acpStatus = ACPPhase.ready
richChatViewModel.handleACPEvent(
.promptComplete(sessionId: sessionId, response: result)
)
// Re-fetch session from DB to pick up cost/token data Hermes may have written
await richChatViewModel.refreshSessionFromDB()
// Issue #64: notify the user that Hermes has
// finished if Scarf isn't the foreground app. The
// notifier handles the foreground/disabled gating;
// we just hand it the latest assistant text and
// session title for the body line.
if !isSteer {
let preview = richChatViewModel.messages
.last(where: { $0.isAssistant })?
.content ?? ""
let title = richChatViewModel.currentSession?.title
ChatNotificationService.shared.postPromptCompleted(
sessionTitle: title,
preview: preview
)
}
} catch is CancellationError {
acpStatus = "Cancelled"
acpStatus = ACPPhase.cancelled
} catch {
acpStatus = "Error"
acpStatus = ACPPhase.error
await recordACPFailure(error, client: client, context: "ACP prompt failed")
richChatViewModel.handleACPEvent(
.promptComplete(sessionId: sessionId, response: ACPPromptResult(
@@ -433,9 +636,18 @@ final class ChatViewModel {
// MARK: - ACP Session Management
private func startACPSession(resume sessionId: String?, projectPath: String? = nil) {
private func startACPSession(
resume sessionId: String?,
projectPath: String? = nil,
initialPrompt: String? = nil
) {
ScarfMon.event(.sessionLoad, "mac.startACPSession", count: 1)
stopACP()
clearACPErrorState()
// stopACP() clears `isStartingSession` (it's a generic teardown
// helper used by disconnect paths too). Re-arm it here so the
// session-list overlay stays up through the entire boot.
isStartingSession = true
// Pre-flight: bail before opening any ACP plumbing if the
// active server's `config.yaml` has no primary model or
@@ -446,14 +658,15 @@ final class ChatViewModel {
// unchanged after the user picks a model.
let preflight = ModelPreflight.check(fileService.loadConfig())
if !preflight.isConfigured {
pendingStartArgs = (sessionId, projectPath)
pendingStartArgs = (sessionId, projectPath, initialPrompt)
modelPreflightReason = preflight.reason
acpStatus = ""
hasActiveProcess = false
isStartingSession = false
return
}
acpStatus = "Starting..."
acpStatus = ACPPhase.spawning
let client = ACPClient.forMacApp(context: context)
self.acpClient = client
@@ -492,7 +705,7 @@ final class ChatViewModel {
do {
// Start ACP process and event loop FIRST
try await client.start()
acpStatus = await client.statusMessage
acpStatus = ACPPhase.authenticating
startACPEventLoop(client: client)
startHealthMonitor(client: client)
@@ -516,26 +729,34 @@ final class ChatViewModel {
let resolvedSessionId: String
if let sessionId {
acpStatus = "Loading session..."
acpStatus = ACPPhase.loadingSession
do {
resolvedSessionId = try await client.loadSession(cwd: cwd, sessionId: sessionId)
} catch {
logger.info("Session \(sessionId) not found in ACP, creating new session with history")
acpStatus = "Creating new session..."
acpStatus = ACPPhase.creatingNewSession
resolvedSessionId = try await client.newSession(cwd: cwd)
}
// Load messages from both origin CLI session and ACP session
// Surface "Loading history" before the (potentially
// 30s) message-history fetch fires. Pre-fix the user
// saw "Loading session" through start(), then jumped
// straight to "Ready" the moment the bytes hit the
// pane, even though the actual hydrate is the slowest step
// on a remote and the pane looked engageable while
// the SQLite query was still pending. v2.8.
acpStatus = ACPPhase.loadingHistory
await richChatViewModel.loadSessionHistory(
sessionId: sessionId,
acpSessionId: resolvedSessionId
)
} else {
acpStatus = "Creating session..."
acpStatus = ACPPhase.creatingSession
resolvedSessionId = try await client.newSession(cwd: cwd)
}
richChatViewModel.setSessionId(resolvedSessionId)
acpStatus = "Connected (\(resolvedSessionId.prefix(12)))"
acpStatus = ACPPhase.ready
isStartingSession = false
// Attribute this session to the project it was started
// under, so the per-project Sessions tab can surface it
@@ -600,8 +821,21 @@ final class ChatViewModel {
await loadRecentSessions()
logger.info("ACP session ready: \(resolvedSessionId)")
// v2.8 wizard handoff: auto-send the kickoff prompt now
// that the session is connected. Renders as a normal user
// bubble (matches the user's intent they triggered this
// flow via the New Project sheet) and routes through the
// same `sendViaACP` path that typed messages use, so the
// event loop, attribution, and streaming are identical.
if let prompt = initialPrompt,
!prompt.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty {
richChatViewModel.addUserMessage(text: prompt)
sendViaACP(client: client, text: prompt, images: [])
}
} catch {
acpStatus = "Failed"
acpStatus = ACPPhase.failed
isStartingSession = false
await recordACPFailure(error, client: client, context: "Failed to start ACP session")
hasActiveProcess = false
acpClient = nil
@@ -614,8 +848,16 @@ final class ChatViewModel {
let eventStream = await client.events
for await event in eventStream {
guard !Task.isCancelled else { break }
self?.richChatViewModel.handleACPEvent(event)
self?.acpStatus = await client.statusMessage
ScarfMon.event(.chatStream, "mac.acpEvent", count: 1)
ScarfMon.measure(.chatStream, "mac.handleACPEvent") {
self?.richChatViewModel.handleACPEvent(event)
}
// Don't overwrite a phase-typed acpStatus with the
// ACP-side "Connected" string mid-stream; we promote
// to ready/agentWorking from the call sites that own
// the lifecycle. The event-loop side-effect is
// the heartbeat; leave acpStatus alone here.
_ = await client.statusMessage
}
// Stream ended if we weren't cancelled, the connection died
if !Task.isCancelled {
@@ -681,7 +923,7 @@ final class ChatViewModel {
for attempt in 1...Self.maxReconnectAttempts {
guard !Task.isCancelled else { return }
acpStatus = "Reconnecting (\(attempt)/\(Self.maxReconnectAttempts))..."
acpStatus = "Reconnecting (\(attempt)/\(Self.maxReconnectAttempts))"
logger.info("Reconnect attempt \(attempt)/\(Self.maxReconnectAttempts) for session \(sessionId)")
// Backoff delay (skip on first attempt for fast recovery)
@@ -718,7 +960,7 @@ final class ChatViewModel {
// Reconcile in-memory messages with what Hermes persisted to DB
await richChatViewModel.reconcileWithDB(sessionId: resolvedSessionId)
acpStatus = "Reconnected (\(resolvedSessionId.prefix(12)))"
acpStatus = ACPPhase.ready
clearACPErrorState()
startACPEventLoop(client: client)
@@ -743,7 +985,7 @@ final class ChatViewModel {
private func showConnectionFailure() {
richChatViewModel.handleACPEvent(.connectionLost(reason: "The ACP process terminated unexpectedly"))
acpStatus = "Connection lost"
acpStatus = ACPPhase.connectionLost
clearACPErrorState()
acpError = "Connection lost. Use the Session menu to reconnect."
}
@@ -763,6 +1005,7 @@ final class ChatViewModel {
acpClient = nil
hasActiveProcess = false
isHandlingDisconnect = false
isStartingSession = false
}
// MARK: - Model preflight
@@ -785,7 +1028,11 @@ final class ChatViewModel {
guard let self else { return }
if ok {
if let pending {
self.startACPSession(resume: pending.sessionId, projectPath: pending.projectPath)
self.startACPSession(
resume: pending.sessionId,
projectPath: pending.projectPath,
initialPrompt: pending.initialPrompt
)
}
} else {
self.acpError = "Couldn't save model+provider to config.yaml. Open Settings to retry."
@@ -815,44 +1062,109 @@ final class ChatViewModel {
// MARK: - Recent Sessions
/// Coalesce rapid `loadRecentSessions` triggers into one trailing
/// fetch. Hooked up to the file-watcher tick in `ChatView`; during
/// an ACP message stream the watcher fires 5–10 times per second
/// as Hermes appends to `state.db-wal`, and an unconditional
/// reload on each tick would visibly flicker the chat sidebar
/// while the response streams in.
///
/// The 500 ms window is short enough that idle external changes
/// (a session created from another `hermes` invocation, a rename
/// from another window) still appear "soon" without explicit user
/// action, and long enough to absorb a streaming-response burst.
/// Newly created / resumed sessions in *this* window don't depend
/// on the debounce; `startACPSession` and `autoStartACPAndSend`
/// call `loadRecentSessions()` synchronously after the session id
/// resolves, so the chat sidebar updates immediately.
func scheduleSessionsRefresh() {
// Track every file-watcher-driven debounce entry. During an ACP
// stream this fires many times per second; the count helps us see
// how often the watcher fires vs. how often a real reload executes.
ScarfMon.event(.sessionLoad, "mac.scheduleSessionsRefresh", count: 1)
sessionsRefreshTask?.cancel()
sessionsRefreshTask = Task { @MainActor [weak self] in
try? await Task.sleep(nanoseconds: 500_000_000)
if Task.isCancelled { return }
await self?.loadRecentSessions()
}
}
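// Hedged sketch of the ScarfMon surface these call sites assume. The real
// instrumentation type isn't in this diff; the categories and signatures
// below are inferred from usage (event with optional count/bytes,
// measure / measureAsync wrapping a block and returning its value), and
// the print sink is a stand-in for the real capture buffer.
enum ScarfMonSketch {
    enum Category: String { case sessionLoad, chatStream, chatRender }

    static func event(_ category: Category, _ name: String, count: Int = 1, bytes: Int = 0) {
        print("[\(category.rawValue)] \(name) count=\(count) bytes=\(bytes)")
    }

    @discardableResult
    static func measure<T>(_ category: Category, _ name: String, _ body: () -> T) -> T {
        let clock = ContinuousClock()
        let start = clock.now
        defer { print("[\(category.rawValue)] \(name) took \(start.duration(to: clock.now))") }
        return body()
    }

    @discardableResult
    static func measureAsync<T>(_ category: Category, _ name: String,
                                _ body: () async throws -> T) async rethrows -> T {
        let clock = ContinuousClock()
        let start = clock.now
        defer { print("[\(category.rawValue)] \(name) took \(start.duration(to: clock.now))") }
        return try await body()
    }
}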
func loadRecentSessions() async {
let opened = await dataService.open()
guard opened else { return }
// Bumped from 10 → 50 so the project filter has enough data to
// surface attributed sessions (older attributed sessions were
// getting truncated out of the original limit). Sessions feature
// loads 500; the chat sidebar doesn't need that, but 50 keeps
// the project filter useful without measurable cost.
let fetchedSessions = await dataService.fetchSessions(limit: 50)
let fetchedPreviews = await dataService.fetchSessionPreviews(limit: 50)
await dataService.close()
// L2 (v2.8) coalesce against an in-flight load. If one's
// already running, await its completion instead of spawning a
// parallel one. Drops the 2-3× contention seen during file-
// watcher streams.
if let existing = inFlightSessionLoad {
ScarfMon.event(.sessionLoad, "mac.loadRecentSessions.coalesced", count: 1)
await existing.value
return
}
let task = Task { @MainActor [weak self] in
guard let self else { return }
await self.performLoadRecentSessions()
}
inFlightSessionLoad = task
await task.value
inFlightSessionLoad = nil
}
// Project attribution + registry: single batched off-main read.
let ctx = context
let bundle: (names: [String: String], projects: [ProjectEntry]) = await Task.detached {
let attribution = SessionAttributionService(context: ctx)
let registry = ProjectDashboardService(context: ctx).loadRegistry()
let pathToName = Dictionary(
uniqueKeysWithValues: registry.projects.map { ($0.path, $0.name) }
)
let map = attribution.load().mappings
var names: [String: String] = [:]
for (sessionID, path) in map {
if let name = pathToName[path] {
names[sessionID] = name
private func performLoadRecentSessions() async {
// Measure the full wall-clock cost of a sessions sidebar reload,
// from DB open through the off-main attribution read to the final
// observable assignment. Surfaces fetch regressions and SQLite
// latency spikes in the ScarfMon trace.
await ScarfMon.measureAsync(.sessionLoad, "mac.loadRecentSessions") {
let opened = await dataService.open()
guard opened else { return }
// Bumped from 10 → 50 so the project filter has enough data to
// surface attributed sessions (older attributed sessions were
// getting truncated out of the original limit). Sessions feature
// loads 500; the chat sidebar doesn't need that, but 50 keeps
// the project filter useful without measurable cost.
//
// v2.7: folded sessions + previews into one queryBatch round
// trip via sessionListSnapshot. Pre-fix the two awaits below
// were serialized SSH calls, paying the 420 ms RTT twice
// every time the file watcher fired (~2.2 s baseline reload).
// sessionListSnapshot halves the round-trips for every
// sidebar refresh.
let snapshot = await dataService.sessionListSnapshot(limit: 50)
let fetchedSessions = snapshot.sessions
let fetchedPreviews = snapshot.previews
await dataService.close()
// Project attribution + registry: single batched off-main read.
let ctx = context
let bundle: (names: [String: String], projects: [ProjectEntry]) = await Task.detached {
let attribution = SessionAttributionService(context: ctx)
let registry = ProjectDashboardService(context: ctx).loadRegistry()
let pathToName = Dictionary(
uniqueKeysWithValues: registry.projects.map { ($0.path, $0.name) }
)
let map = attribution.load().mappings
var names: [String: String] = [:]
for (sessionID, path) in map {
if let name = pathToName[path] {
names[sessionID] = name
}
}
}
return (names: names, projects: registry.projects)
}.value
return (names: names, projects: registry.projects)
}.value
// Single batched commit assigning all four observables at once
// means SwiftUI sees one update rather than four staggered ones.
// Eliminates the brief "list flashes / project chips appear
// late" reload artifact during session switches.
recentSessions = fetchedSessions
sessionPreviews = fetchedPreviews
sessionProjectNames = bundle.names
allProjects = bundle.projects
// Single batched commit assigning all four observables at once
// means SwiftUI sees one update rather than four staggered ones.
// Eliminates the brief "list flashes / project chips appear
// late" reload artifact during session switches.
recentSessions = fetchedSessions
sessionPreviews = fetchedPreviews
sessionProjectNames = bundle.names
allProjects = bundle.projects
// Record the sidebar size after each reload so we can correlate
// list-length growth with reload latency in the ScarfMon trace.
ScarfMon.event(.sessionLoad, "mac.recentSessions.count", count: recentSessions.count)
}
}
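// Illustrative cost sketch for the batching comment above; self-contained
// and hypothetical (`RemoteStoreSketch` is not the real data service). Two
// serialized awaits pay the SSH round trip twice per sidebar refresh; a
// single batched snapshot call pays it once and hands back both lists.
actor RemoteStoreSketch {
    let rttNanos: UInt64 = 420_000_000   // ballpark per-call RTT quoted in the comment above

    private func roundTrip() async {
        try? await Task.sleep(nanoseconds: rttNanos)
    }

    // Pre-fix shape: two calls, two round trips (~2x RTT of wall clock).
    func fetchSessions(limit: Int) async -> [String] { await roundTrip(); return [] }
    func fetchPreviews(limit: Int) async -> [String: String] { await roundTrip(); return [:] }

    // Batched shape: both result sets ride one round trip.
    func sessionListSnapshot(limit: Int) async -> (sessions: [String], previews: [String: String]) {
        await roundTrip()
        return ([], [:])
    }
}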
/// Resolved project display name for a recent session, or nil for
@@ -44,6 +44,16 @@ struct ChatInspectorPane: View {
}
}
.background(ScarfColor.backgroundSecondary)
// v2.8: lazy-load the tool result content when the inspector
// opens for a call whose result wasn't auto-hydrated. The
// chat-resume path skips Phase 2b by default (the bulk fetch
// can blow past the 30s SSH timeout on remote contexts), so
// the inspector is the user-initiated lazy path.
.task(id: chatViewModel.focusedToolCallId) {
guard let id = chatViewModel.focusedToolCallId,
chatViewModel.focusedToolCall?.result == nil else { return }
await chatViewModel.richChatViewModel.loadToolResultIfMissing(callId: id)
}
}
// MARK: - Header
@@ -55,6 +55,31 @@ struct ChatSessionListPane: View {
.padding(.horizontal, 6)
.padding(.bottom, ScarfSpace.s2)
}
// While a session is mid-boot the SSH tunnel is bottlenecked
// on the in-flight start/load; letting the user queue up a
// second session-switch ends with both racing for
// the same backend (we've seen the small fast chat lose to
// a 30s timeout from the prior big chat). Disable the
// entire pane (taps + visual) during prep, plus a
// ProgressView so the cause is obvious. v2.8.
.disabled(chatViewModel.isPreparingSession)
.opacity(chatViewModel.isPreparingSession ? 0.55 : 1.0)
.overlay {
if chatViewModel.isPreparingSession {
HStack(spacing: 6) {
ProgressView().controlSize(.small)
Text(chatViewModel.acpStatus.isEmpty ? "Loading…" : chatViewModel.acpStatus)
.scarfStyle(.caption)
.foregroundStyle(ScarfColor.foregroundMuted)
}
.padding(.horizontal, ScarfSpace.s3)
.padding(.vertical, ScarfSpace.s2)
.background(.thinMaterial, in: Capsule())
.padding(.bottom, ScarfSpace.s5)
.frame(maxWidth: .infinity, maxHeight: .infinity, alignment: .bottom)
.allowsHitTesting(false)
}
}
footer
}
.background(ScarfColor.backgroundTertiary)
@@ -44,12 +44,23 @@ struct ChatTranscriptPane: View {
if let hint = richChat.transientHint {
steeringToast(hint)
}
// Issue #62: bind composer identity to the active session
// ID so SwiftUI rebuilds `RichChatInputBar` (and its
// `@State` `text`/`attachments`) when the user switches
// conversations. Without this the composer is structurally
// identical across sessions and SwiftUI happily reuses the
// instance, leaking the unsent draft into the new session.
// A stable fallback id covers the brief "no session
// selected" window using `UUID()` here would mint a
// fresh value per render and trash the composer on every
// body re-eval.
RichChatInputBar(
onSend: onSend,
isEnabled: isEnabled,
commands: richChat.availableCommands,
showCompressButton: richChat.supportsCompress && !richChat.hasBroaderCommandMenu
)
.id(richChat.sessionId ?? "scarf.chat.no-session")
}
.background(ScarfColor.backgroundPrimary)
}
@@ -17,6 +17,12 @@ struct ChatView: View {
private var showInspector: Bool = true
var body: some View {
// ScarfMon body-evaluation counter tracks how many times
// SwiftUI re-evaluates this view per second during streaming.
// High counts here usually mean state is fanning out further
// than necessary; pair with `mac.RichMessageBubble.body` to
// see whether the churn lives in the parent or the bubbles.
let _: Void = ScarfMon.event(.chatRender, "mac.ChatView.body")
@Bindable var vm = viewModel
@Bindable var coord = coordinator
VStack(spacing: 0) {
@@ -42,13 +48,20 @@ struct ChatView: View {
.task {
await viewModel.loadRecentSessions()
viewModel.refreshCredentialPreflight()
viewModel.refreshConfigDiagnostics()
// Cold-launch handoff: if the user clicked "New Chat" on
// a project before ChatView had a chance to render, the
// coordinator was already populated. Consume the request
// here. The onChange below handles the live case.
if let pending = coordinator.pendingProjectChat {
let prompt = coordinator.pendingInitialPrompt
coordinator.pendingProjectChat = nil
viewModel.startNewSession(projectPath: pending)
coordinator.pendingInitialPrompt = nil
if let prompt {
viewModel.startNewSessionAndSend(projectPath: pending, text: prompt)
} else {
viewModel.startNewSession(projectPath: pending)
}
}
// Same story for resume-session handoff: the user clicked
// a session in the Projects → Sessions tab (routes to `.chat`
@@ -65,7 +78,17 @@ struct ChatView: View {
}
}
.onChange(of: fileWatcher.lastChangeDate) {
Task { await viewModel.loadRecentSessions() }
// Debounced rather than immediate. During an active ACP
// message stream the watcher fires many times per second
// (every persisted message bumps `state.db-wal`'s mtime);
// an unconditional reload on each tick caused the chat
// sidebar to visibly flicker as `recentSessions` was
// reassigned over and over with the same data. The
// debounced helper coalesces bursts into one trailing
// fetch ~500 ms after the last tick. New sessions still
// appear immediately because the create/resume paths
// call `loadRecentSessions()` synchronously themselves.
viewModel.scheduleSessionsRefresh()
viewModel.refreshCredentialPreflight()
}
// Live handoff from the per-project Sessions tab: the tab
@@ -73,10 +96,22 @@ struct ChatView: View {
// `.chat`; this view consumes the path and starts a fresh
// session with cwd=projectPath. Attribution happens inside
// ChatViewModel on successful session creation.
//
// The "New Project from Scratch" wizard (v2.8) sets the
// sister slot `pendingInitialPrompt` alongside the project
// path so the agent receives a kickoff prompt without the
// user having to type one. We drain both atomically and
// route to `startNewSessionAndSend` when present.
.onChange(of: coord.pendingProjectChat) { _, new in
if let projectPath = new {
let prompt = coordinator.pendingInitialPrompt
coordinator.pendingProjectChat = nil
viewModel.startNewSession(projectPath: projectPath)
coordinator.pendingInitialPrompt = nil
if let prompt {
viewModel.startNewSessionAndSend(projectPath: projectPath, text: prompt)
} else {
viewModel.startNewSession(projectPath: projectPath)
}
}
}
// Live handoff for resume: user clicked an existing session in
@@ -93,6 +128,27 @@ struct ChatView: View {
}
/// Banner rendered between the toolbar and the chat area when either
/// Status string surfaced in the toolbar pill. When the agent's
/// thought stream is in flight without any visible message bytes
/// (Hermes reasoning models routinely take 3–8 s here), promote
/// the generic "Agent working..." to "Thinking" so the user
/// sees the model is reasoning rather than stalled. v2.7.
private var displayedStatus: String {
if viewModel.richChatViewModel.isStreamingThoughtsOnly {
return "Thinking…"
}
// v2.8: promote the otherwise-ready status to a more honest
// "Loading tool details" while the two-phase loader's
// background hydration is still pulling tool_calls JSON and
// tool result rows. The bare conversation transcript is
// already on screen; this just tells the user that the
// missing tool cards / result bodies are on their way.
if viewModel.richChatViewModel.isHydratingTools {
return "Loading tool details…"
}
return viewModel.acpStatus.isEmpty ? "Active" : viewModel.acpStatus
}
/// (a) a preflight credential check failed, or (b) the ACP subprocess
/// returned an error we captured. Shows a short hint + expandable raw
/// details (stderr tail) that the user can copy to the clipboard.
@@ -116,6 +172,15 @@ struct ChatView: View {
.lineLimit(showErrorDetails ? nil : 2)
}
Spacer()
if let provider = viewModel.acpErrorOAuthProvider {
Button("Re-authenticate") {
coordinator.pendingOAuthReauth = provider
coordinator.selectedSection = .credentialPools
}
.buttonStyle(.borderedProminent)
.controlSize(.small)
.help("Open Credential Pools and re-authenticate \(provider).")
}
if viewModel.acpErrorDetails != nil {
Button(showErrorDetails ? "Hide details" : "Show details") {
showErrorDetails.toggle()
@@ -178,6 +243,50 @@ struct ChatView: View {
.frame(height: 1),
alignment: .bottom
)
} else if let mismatch = viewModel.modelProviderMismatch, !viewModel.hasActiveProcess {
// Provider/model mismatch: `model.default` carries one
// provider prefix while `model.provider` names another.
// Hermes can't reconcile and the chat dies with -32603 at
// first prompt. v2.8 surfaces a one-click fix for both
// directions: align provider to the model's prefix
// (likely the user just authed against `prefixProvider`),
// or strip the prefix to keep the active provider intact.
HStack(alignment: .top, spacing: 8) {
Image(systemName: "exclamationmark.triangle.fill")
.foregroundStyle(.orange)
VStack(alignment: .leading, spacing: 4) {
Text("Model/provider mismatch in config.yaml")
.font(.callout)
Text("`model.default` is `\(mismatch.modelDefault)` but `model.provider` is `\(mismatch.activeProvider)`. Chats will fail at first prompt until this is reconciled.")
.font(.caption)
.foregroundStyle(.secondary)
.textSelection(.enabled)
HStack(spacing: 6) {
Button("Use \(mismatch.prefixProvider)") {
viewModel.alignProviderToModelPrefix(mismatch)
}
.buttonStyle(.borderedProminent)
.controlSize(.small)
.help("Set model.provider = \(mismatch.prefixProvider) and model.default = \(mismatch.bareModel).")
Button("Keep \(mismatch.activeProvider)") {
viewModel.stripPrefixFromModelDefault(mismatch)
}
.buttonStyle(.bordered)
.controlSize(.small)
.help("Strip the prefix from model.default, leaving model.provider = \(mismatch.activeProvider).")
}
.padding(.top, 2)
}
Spacer()
}
.padding(10)
.background(Color.orange.opacity(0.08))
.overlay(
Rectangle()
.fill(Color.orange.opacity(0.25))
.frame(height: 1),
alignment: .bottom
)
}
}
@@ -190,7 +299,14 @@ struct ChatView: View {
Circle()
.fill(.green)
.frame(width: 6, height: 6)
(viewModel.acpStatus.isEmpty ? Text("Active") : Text(viewModel.acpStatus))
// Promote the generic "Agent working..." status to
// "Thinking" the moment the thought stream starts
// arriving without visible message bytes; the user
// gets a more honest signal that the model is
// reasoning, not stalled. Falls back to whatever
// status string the VM has when no thought stream
// is in flight.
Text(displayedStatus)
.font(.caption)
.foregroundStyle(.secondary)
.lineLimit(1)
@@ -457,7 +573,11 @@ struct ChatView: View {
// MARK: - Permission Approval View
extension RichChatViewModel.PendingPermission: Identifiable {
// `@retroactive` acknowledges that we're declaring conformance for a
// type (`PendingPermission`) and protocol (`Identifiable`) we don't own;
// the Swift 6 compiler flags this otherwise, so that downstream
// breakage is loud if `ScarfCore` ever adds the conformance upstream.
extension RichChatViewModel.PendingPermission: @retroactive Identifiable {
public var id: Int { requestId }
}
@@ -7,12 +7,16 @@ struct CodeBlockView: View {
@State private var copied = false
/// Chat font scale plumbed from `RichChatView` (issue #68). Defaults
/// to 1.0 outside the chat surface.
@Environment(\.chatFontScale) private var chatFontScale: Double
var body: some View {
VStack(alignment: .leading, spacing: 0) {
if let language, !language.isEmpty {
HStack {
Text(language)
.font(.caption2.bold())
.font(ChatFontScale.caption2(chatFontScale).bold())
.foregroundStyle(.secondary)
Spacer()
copyButton
@@ -31,7 +35,7 @@ struct CodeBlockView: View {
ScrollView(.horizontal, showsIndicators: false) {
Text(code)
.font(.system(size: 12, design: .monospaced))
.font(ChatFontScale.codeBlock(chatFontScale))
.foregroundStyle(Color(nsColor: NSColor(red: 0.85, green: 0.87, blue: 0.91, alpha: 1.0)))
.textSelection(.enabled)
.padding(.horizontal, 10)
@@ -108,16 +108,47 @@ struct RichChatInputBar: View {
)
)
.overlay(alignment: .topLeading) {
if text.isEmpty {
Text(supportsImagePrompts
? "Message Hermes… / for commands · drag images to attach"
: "Message Hermes… / for commands")
.scarfStyle(.body)
.foregroundStyle(ScarfColor.foregroundFaint)
.padding(.horizontal, 14)
.padding(.vertical, 10)
.allowsHitTesting(false)
}
// Placeholder ghosting (#65): TextEditor's
// NSTextView updates the visible glyphs a frame
// before the SwiftUI binding propagates, so a
// bare `if text.isEmpty` overlay renders the
// translucent placeholder text on top of the
// just-typed character, visible as a "behind
// or around" ghost. Three mitigations:
//
// 1. Pin an opaque rectangle behind the
// placeholder text. During any single-
// frame lag the user sees a clean
// placeholder, never layered glyphs.
// 2. Use `.opacity(...)` instead of an `if`.
// Keeps the view tree stable per
// keystroke (removes the per-keystroke
// view-mutation churn the composer was
// already paying for).
// 3. Constrain to a single line with
// `frame(maxWidth: .infinity)` and
// `truncationMode(.tail)` so the long-form
// hint can't escape the rounded
// TextEditor bounds when the sidebar /
// detail-pane geometry compresses the
// composer (was visibly overflowing).
Text(supportsImagePrompts
? "Message Hermes… / for commands · drag images to attach"
: "Message Hermes… / for commands")
.scarfStyle(.body)
.foregroundStyle(ScarfColor.foregroundFaint)
.lineLimit(1)
.truncationMode(.tail)
.frame(maxWidth: .infinity, alignment: .leading)
.padding(.horizontal, 14)
.padding(.vertical, 10)
.background(ScarfColor.backgroundSecondary)
// Hide once the field has any content OR
// the user is actively focused matches
// standard NSTextField / UITextField
// placeholder semantics.
.opacity((text.isEmpty && !isFocused) ? 1 : 0)
.allowsHitTesting(false)
}
// Drag-drop image attachments. Receives both file URLs
// (from Finder) and raw image bitmap data (from
@@ -200,7 +231,12 @@ struct RichChatInputBar: View {
.onChange(of: text) { _, _ in
updateMenuState()
}
.onChange(of: commands.map(\.id)) { _, _ in
// Watch `commands.count` rather than `commands.map(\.id)`; the
// mapped form allocates a fresh `[String]` on every body
// re-eval (i.e. every keystroke), which is wasted work even
// when the array compares equal. The count proxy fires when
// the agent advertises new commands.
.onChange(of: commands.count) { _, _ in
updateMenuState()
}
.sheet(isPresented: $showCompressSheet) {
@@ -358,17 +394,37 @@ struct RichChatInputBar: View {
private func updateMenuState() {
let shouldShow = shouldShowMenu
// Common case: user is composing normal text and the menu is
// already hidden. Skip the filter computation + state writes
// entirely so onChange stays cheap. Without this guard typing
// recomputes `filteredCommands` on every keystroke even when
// the menu can't possibly appear.
guard shouldShow || showMenu else { return }
// Compute desired selection, then only write what changed.
// SwiftUI emits "onChange action tried to update multiple
// times per frame" when an onChange handler mutates more than
// one piece of state per frame; the warning correlates with
// unusable typing lag because each redundant write triggers
// another body re-eval.
let count = filteredCommands.count
let newSelection: Int
if count == 0 {
newSelection = 0
} else if selectedIndex >= count {
newSelection = count - 1
} else if selectedIndex < 0 {
newSelection = 0
} else {
newSelection = selectedIndex
}
if shouldShow != showMenu {
showMenu = shouldShow
}
// Re-clamp selection whenever the filtered list may have shrunk.
let count = filteredCommands.count
if count == 0 {
selectedIndex = 0
} else if selectedIndex >= count {
selectedIndex = count - 1
} else if selectedIndex < 0 {
selectedIndex = 0
if newSelection != selectedIndex {
selectedIndex = newSelection
}
}
@@ -41,7 +41,12 @@ struct RichChatMessageList: View {
/// we can reintroduce lazy with a preference-key-based height
/// measurement, but that's a much larger change.
var body: some View {
ScrollViewReader { proxy in
// ScarfMon confirms whether the parent re-issues the
// ForEach. If this fires once and we still see RichMessageBubble.body
// burst N times, churn lives inside the bubbles (or in their inputs).
// If this fires N times, the ForEach itself is being rebuilt.
let _: Void = ScarfMon.event(.chatRender, "mac.RichChatMessageList.body")
return ScrollViewReader { proxy in
ScrollView {
VStack(alignment: .leading, spacing: 16) {
if groups.isEmpty && !isWorking {
@@ -64,6 +64,11 @@ struct RichChatView: View {
}
.frame(minHeight: 0, idealHeight: 500, maxHeight: .infinity)
.environment(\.dynamicTypeSize, ChatFontScale.dynamicTypeSize(for: fontScale))
// ScarfFont tokens are fixed-point so dynamicTypeSize alone
// doesn't move bubble / markdown / code-block text. Plumb the
// raw scale via `\.chatFontScale` so chat content views can
// read it and scale their explicit sizes too (issue #68).
.environment(\.chatFontScale, fontScale)
// Animate side-pane shows/hides so the transcript reflows
// smoothly rather than snapping. ~180ms feels responsive
// without being jarring.
@@ -14,6 +14,11 @@ struct RichMessageBubble: View, Equatable {
@Environment(ChatViewModel.self) private var chatViewModel
/// Chat-only font scale set on `RichChatView`. Chat content uses
/// these multiplied sizes (issue #68); other surfaces still see
/// the static ScarfFont tokens at scale = 1.0.
@Environment(\.chatFontScale) private var chatFontScale: Double
/// Scarf-local chat density preferences (issues #47 / #48). All
/// three default to today's UI. Read here so the reasoning + tool-
/// call switches don't have to thread the values through every
@@ -53,6 +58,11 @@ struct RichMessageBubble: View, Equatable {
}
var body: some View {
// Per-bubble render counter. The streaming bubble re-renders
// per token; cross-reference with `mac.ChatView.body` and
// `chatStream.handleACPEvent` to see whether streaming churn
// lives in the parent, the bubble, or the event handler.
let _: Void = ScarfMon.event(.chatRender, "mac.RichMessageBubble.body")
if message.isUser {
userBubble
} else if message.isAssistant {
@@ -68,7 +78,7 @@ struct RichMessageBubble: View, Equatable {
HStack {
Spacer(minLength: 80)
Text(message.content)
.scarfStyle(.body)
.font(ChatFontScale.body(chatFontScale))
.foregroundStyle(ScarfColor.onAccent)
.textSelection(.enabled)
.padding(.horizontal, 14)
@@ -91,7 +101,7 @@ struct RichMessageBubble: View, Equatable {
.font(.system(size: 9))
.foregroundStyle(ScarfColor.success)
Text(time, style: .time)
.font(ScarfFont.caption2)
.font(ChatFontScale.caption2(chatFontScale))
.foregroundStyle(ScarfColor.foregroundFaint)
}
.padding(.trailing, 4)
@@ -183,7 +193,7 @@ struct RichMessageBubble: View, Equatable {
private var reasoningDisclosure: some View {
DisclosureGroup {
Text(message.preferredReasoning ?? "")
.font(ScarfFont.monoSmall)
.font(ChatFontScale.monoSmall(chatFontScale))
.foregroundStyle(ScarfColor.foregroundMuted)
.italic()
.textSelection(.enabled)
@@ -194,11 +204,11 @@ struct RichMessageBubble: View, Equatable {
Image(systemName: "brain")
.font(.system(size: 11))
Text("REASONING")
.scarfStyle(.captionStrong)
.font(ChatFontScale.captionStrong(chatFontScale))
.tracking(0.5)
if let tokens = message.tokenCount, tokens > 0 {
Text("· \(tokens) tok")
.font(ScarfFont.monoSmall)
.font(ChatFontScale.monoSmall(chatFontScale))
.foregroundStyle(ScarfColor.foregroundFaint)
}
}
@@ -222,7 +232,7 @@ struct RichMessageBubble: View, Equatable {
.font(.system(size: 9))
.foregroundStyle(ScarfColor.warning)
Text(message.preferredReasoning ?? "")
.font(ScarfFont.caption)
.font(ChatFontScale.caption(chatFontScale))
.italic()
.foregroundStyle(ScarfColor.foregroundFaint)
.textSelection(.enabled)
@@ -281,7 +291,7 @@ struct RichMessageBubble: View, Equatable {
.font(.system(size: 10))
.foregroundStyle(color)
Text(call.functionName)
.font(ScarfFont.monoSmall)
.font(ChatFontScale.monoSmall(chatFontScale))
.fontWeight(.medium)
.foregroundStyle(ScarfColor.foregroundPrimary)
.lineLimit(1)
@@ -341,28 +351,103 @@ struct RichMessageBubble: View, Equatable {
HStack(spacing: 8) {
if let tokens = message.tokenCount, tokens > 0 {
Text("\(tokens) tok")
.font(ScarfFont.monoSmall)
.font(ChatFontScale.monoSmall(chatFontScale))
}
if let reason = message.finishReason, !reason.isEmpty {
if let reason = message.finishReason,
Self.shouldShowFinishReason(reason)
{
Text("·")
Text(reason)
.scarfStyle(.caption)
.font(ChatFontScale.caption(chatFontScale))
.foregroundStyle(Self.finishReasonTone(reason))
}
if let time = message.timestamp {
Text("·")
Text(time, style: .time)
.scarfStyle(.caption)
.font(ChatFontScale.caption(chatFontScale))
}
if let seconds = turnDuration {
Text("·")
Text(RichChatViewModel.formatTurnDuration(seconds))
.font(ScarfFont.monoSmall)
.font(ChatFontScale.monoSmall(chatFontScale))
.help("Wall-clock duration of this turn")
}
// Per-message TTS playback toggle (issue #66). Only on
// settled assistant bubbles; the streaming bubble (id == 0)
// would speak partial text. Empty content has nothing to
// speak.
if message.id != 0, !message.content.isEmpty {
speakButton
}
}
.font(ChatFontScale.caption(chatFontScale))
.foregroundStyle(ScarfColor.foregroundFaint)
.padding(.leading, 4)
}
/// Whether `finishReason` should render as a visible badge in the
/// message footer. `stop` and `end_turn` are normal end-of-turn
/// signals; `RichChatViewModel.finalizeStreamingMessage` stamps
/// `"stop"` on every text-bearing turn-final assistant message,
/// so showing them creates the impression that something stopped
/// the agent prematurely. We suppress them and reserve the badge
/// for abnormal terminations (max_tokens, error, refusal,
/// content_filter, …) the user actually wants to see. Matches
/// the conventions in ChatGPT, Claude.ai, Cursor, etc.
private static func shouldShowFinishReason(_ reason: String) -> Bool {
let normalized = reason.trimmingCharacters(in: .whitespaces).lowercased()
return !["stop", "end_turn", "end-turn", ""].contains(normalized)
}
/// Visual tone for an abnormal finish-reason badge. Severity
/// scales: warning (yellow) for "the response was cut short" cases
/// the user can usually retry, danger (red) for outright failures
/// or refusals, muted otherwise so unrecognized reasons stay
/// readable but un-alarming.
private static func finishReasonTone(_ reason: String) -> Color {
switch reason.lowercased() {
case "max_tokens", "length", "content_filter":
return ScarfColor.warning
case "error", "refusal":
return ScarfColor.danger
default:
return ScarfColor.foregroundMuted
}
}
/// Speaker glyph that toggles `AVSpeechSynthesizer` playback for
/// the assistant reply. Lives in its own view so the
/// `MessageSpeechService` observation doesn't fight the bubble's
/// `Equatable` short-circuit; the parent only needs to pass
/// stable id + content; this view re-renders on its own when
/// playback state flips.
private var speakButton: some View {
SpeakMessageButton(messageId: message.id, content: message.content)
}
}
/// Stand-alone speaker button so the `MessageSpeechService`
/// observation doesn't get short-circuited by `RichMessageBubble`'s
/// `Equatable`. Only the button re-renders when playback flips;
/// the bubble itself stays optimised.
private struct SpeakMessageButton: View {
let messageId: Int
let content: String
@State private var speech = MessageSpeechService.shared
var body: some View {
let isPlaying = speech.playingMessageId == messageId
Button {
speech.toggle(messageId: messageId, content: content)
} label: {
Image(systemName: isPlaying ? "stop.circle.fill" : "speaker.wave.2")
.font(.system(size: 11))
.foregroundStyle(isPlaying ? ScarfColor.accent : ScarfColor.foregroundFaint)
}
.buttonStyle(.plain)
.help(isPlaying ? "Stop speaking" : "Read this reply aloud")
}
}
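// Hedged sketch of the shared speech service the button toggles. The real
// MessageSpeechService isn't in this diff; only `playingMessageId` and
// `toggle(messageId:content:)` are visible from the call sites, so this is
// an illustrative AVSpeechSynthesizer-backed stand-in (assumes AVFoundation
// and the Observation framework are importable here).
@Observable
final class MessageSpeechServiceSketch: NSObject, AVSpeechSynthesizerDelegate {
    static let shared = MessageSpeechServiceSketch()
    private let synthesizer = AVSpeechSynthesizer()
    private(set) var playingMessageId: Int?

    override init() {
        super.init()
        synthesizer.delegate = self
    }

    func toggle(messageId: Int, content: String) {
        // Tapping the playing message stops it; tapping another message
        // stops the current utterance and starts the new one.
        _ = synthesizer.stopSpeaking(at: .immediate)
        if playingMessageId == messageId {
            playingMessageId = nil
            return
        }
        playingMessageId = messageId
        synthesizer.speak(AVSpeechUtterance(string: content))
    }

    // Clear the flag when the utterance finishes on its own so the
    // speaker glyph flips back without user action.
    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer,
                           didFinish utterance: AVSpeechUtterance) {
        playingMessageId = nil
    }
}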
// MARK: - Content Block Parsing
@@ -342,6 +342,34 @@ final class CredentialPoolsViewModel {
}
}
/// Remove an OAuth provider from `auth.json`. Maps to
/// `hermes auth logout <provider>`, Hermes' canonical verb for
/// dropping the access + refresh token entries from
/// `providers.<name>` while leaving the upstream account intact.
/// User-initiated; the credential pool view's trash button on
/// each OAuth row routes here after a confirmation dialog.
func removeOAuthProvider(_ provider: String) {
let result = runHermes(["auth", "logout", provider])
if result.exitCode == 0 {
message = "Removed OAuth provider \(provider)"
load()
} else {
// Surface the first output line in the toast so the user
// can tell whether the verb is missing on this Hermes
// version (older builds may not have `auth logout`) vs.
// an actual failure. `runHermes` returns combined output
// (stdout + stderr) in `output`; first non-empty line is
// the most useful tail.
let detail = result.output
.split(separator: "\n", omittingEmptySubsequences: true)
.first.map(String.init) ?? "exit \(result.exitCode)"
message = "Remove failed: \(detail)"
}
DispatchQueue.main.asyncAfter(deadline: .now() + 3) { [weak self] in
self?.message = nil
}
}
func resetProvider(_ provider: String) {
let result = runHermes(["auth", "reset", provider])
message = result.exitCode == 0 ? "Cooldowns cleared for \(provider)" : "Reset failed"
@@ -93,15 +93,29 @@ final class OAuthFlowController {
// local spawns hermes directly, remote rounds through ssh -T while
// preserving stdin (for the auth-code prompt) and stdout (for the
// URL parser).
let proc = context.makeTransport().makeProcess(
executable: context.paths.hermesBinary,
args: args
)
if !context.isRemote {
// Only enrich env locally; the remote ssh process gets the
// remote login env naturally, and exporting our local API keys
// into it would be wrong.
proc.environment = HermesFileService.enrichedEnvironment()
//
// PYTHONUNBUFFERED forces line-buffered Python stdout so the URL
// banner reaches us before `input("Authorization code: ")`
// blocks. PKCE *usually* recovers because input() flushes, but
// certain providers print preamble lines AFTER the prompt that
// we still want streamed in real time. Local: set on
// `proc.environment`. Remote: ssh doesn't forward arbitrary env
// vars without `SendEnv` configured, so wrap the command in
// `env PYTHONUNBUFFERED=1 …` to inject it on the remote side.
let proc: Process
if context.isRemote {
proc = context.makeTransport().makeProcess(
executable: "env",
args: ["PYTHONUNBUFFERED=1", context.paths.hermesBinary] + args
)
} else {
proc = context.makeTransport().makeProcess(
executable: context.paths.hermesBinary,
args: args
)
var env = HermesFileService.enrichedEnvironment()
env["PYTHONUNBUFFERED"] = "1"
proc.environment = env
}
let outPipe = Pipe()
@@ -6,9 +6,40 @@ struct CredentialPoolsView: View {
@State private var viewModel: CredentialPoolsViewModel
@State private var showAddSheet = false
@State private var pendingRemove: HermesCredential?
/// Mirrors `pendingRemove` for OAuth providers: different model
/// type, separate confirmation. Non-nil while the dialog is up.
@State private var pendingOAuthRemove: HermesOAuthProvider?
/// When non-nil, `AddCredentialSheet` opens pre-seeded with this
/// provider name + OAuth type, driven by the chat banner's
/// "Re-authenticate" button via `AppCoordinator.pendingOAuthReauth`,
/// or by clicking the per-row "Re-authenticate" button in this
/// view. Reset to nil when the sheet dismisses so the next plain
/// "Add Credential" press doesn't accidentally inherit it.
@State private var reauthInitialProvider: String?
@Environment(AppCoordinator.self) private var coordinator
@Environment(HermesFileWatcher.self) private var fileWatcher
/// Mirror of `OAuthKeepaliveCronService.isEnabled()` so the
/// toggle reads from local @State (instant) instead of hitting
/// disk on every render. `nil` while the initial probe is in
/// flight; reloaded on appear and after every enable/disable.
@State private var keepaliveEnabled: Bool?
@State private var keepaliveBusy: Bool = false
@State private var keepaliveError: String?
/// Cached Nous subscription state. Used by `keepaliveSection` to
/// surface a contextual nudge when the auth record hasn't been
/// refreshed in 14 days; that's exactly when enabling the
/// keepalive cron is highest-value. Loaded async on appear; the
/// section renders without the nudge while this is `.absent`.
@State private var nousSubscription: NousSubscriptionState = .absent
private let keepalive: OAuthKeepaliveCronService
private let nousService: NousSubscriptionService
init(context: ServerContext) {
_viewModel = State(initialValue: CredentialPoolsViewModel(context: context))
self.keepalive = OAuthKeepaliveCronService(context: context)
self.nousService = NousSubscriptionService(context: context)
}
@@ -24,6 +55,7 @@ struct CredentialPoolsView: View {
emptyState
} else {
if !viewModel.oauthProviders.isEmpty {
keepaliveSection
oauthProvidersSection
}
ForEach(viewModel.pools) { pool in
@@ -42,9 +74,37 @@ struct CredentialPoolsView: View {
label: "Loading credentials…",
isEmpty: viewModel.pools.isEmpty && viewModel.oauthProviders.isEmpty
)
.onAppear { viewModel.load() }
.sheet(isPresented: $showAddSheet) {
AddCredentialSheet(viewModel: viewModel) {
.onAppear {
viewModel.load()
consumePendingReauth()
probeKeepalive()
}
.onChange(of: coordinator.pendingOAuthReauth) { _, _ in
consumePendingReauth()
}
// Pick up external changes to auth.json: terminal
// `hermes auth logout`, OAuth flows from another window,
// OAuth keepalive cron rewriting tokens. Without this the
// pool only refreshes on appear / sheet-dismiss, so users
// who removed a provider via CLI saw stale rows after
// Reload (the file watcher already polls auth.json on the
// remote SSH path; here we just subscribe to its tick).
.onChange(of: fileWatcher.lastChangeDate) {
viewModel.load()
probeKeepalive()
}
.sheet(isPresented: $showAddSheet, onDismiss: {
// Refresh after every dismiss: the OAuth flow rewrites
// `auth.json` on success, but the sheet self-closes
// before SwiftUI re-renders the parent. Without this,
// users had to hit Reload manually after a successful
// re-auth to see the expiry badge clear and the new
// `tokenTail` populate.
reauthInitialProvider = nil
viewModel.load()
probeKeepalive()
}) {
AddCredentialSheet(viewModel: viewModel, initialProvider: reauthInitialProvider) {
showAddSheet = false
}
}
@@ -62,6 +122,131 @@ struct CredentialPoolsView: View {
} message: {
Text("This removes the credential from hermes. The upstream provider key is not revoked.")
}
.confirmationDialog(
pendingOAuthRemove.map { "Remove OAuth provider \($0.provider.capitalized)?" } ?? "",
isPresented: Binding(get: { pendingOAuthRemove != nil }, set: { if !$0 { pendingOAuthRemove = nil } })
) {
Button("Remove", role: .destructive) {
if let target = pendingOAuthRemove {
viewModel.removeOAuthProvider(target.provider)
}
pendingOAuthRemove = nil
}
Button("Cancel", role: .cancel) { pendingOAuthRemove = nil }
} message: {
Text("Removes this OAuth provider from auth.json. You'll need to re-authenticate before Hermes can use it again. The upstream provider account is not revoked.")
}
}
/// Drain any pending re-auth hand-off from the chat banner: the
/// banner's "Re-authenticate" button writes to
/// `coordinator.pendingOAuthReauth` and switches to this view; we
/// pick the value up here, seed the sheet's initial provider, and
/// clear the slot so navigating back to this view doesn't re-open
/// the sheet.
private func consumePendingReauth() {
guard let pending = coordinator.pendingOAuthReauth else { return }
reauthInitialProvider = pending
showAddSheet = true
coordinator.pendingOAuthReauth = nil
}
/// Read the current keepalive cron job state off the main
/// thread. Disk reads on remote contexts can take 100–300ms
/// (one SFTP round-trip for `~/.hermes/cron/jobs.json`) so this
/// hops to a detached task and only flips `keepaliveEnabled` on
/// MainActor when the result lands. Concurrently loads the Nous
/// subscription record so the staleness nudge is computed off
/// the same probe.
private func probeKeepalive() {
let svc = keepalive
let nous = nousService
Task.detached {
let enabled = svc.isEnabled()
let state = nous.loadState()
await MainActor.run {
keepaliveEnabled = enabled
nousSubscription = state
}
}
}
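`hasStaleRefresh` and `daysSinceLastRefresh()` are the only parts of `NousSubscriptionState` this view touches. A minimal sketch of how that state could be modeled, assuming a `lastRefreshedAt` field and the 14-day threshold from the comment above; neither is confirmed by this diff:

import Foundation

/// Sketch only: the case and field names are assumptions.
enum NousSubscriptionStateSketch {
    case absent
    case present(lastRefreshedAt: Date)

    /// Whole days since the auth record was last refreshed; nil when absent.
    func daysSinceLastRefresh(now: Date = Date()) -> Int? {
        guard case .present(let refreshed) = self else { return nil }
        return Calendar.current.dateComponents([.day], from: refreshed, to: now).day
    }

    /// Stale once the record is 14+ days old, the point where enabling the
    /// keepalive cron matters most.
    var hasStaleRefresh: Bool {
        (daysSinceLastRefresh() ?? 0) >= 14
    }
}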
/// Section above the OAuth providers list with a single toggle
/// that registers / removes a Scarf-owned daily cron job. The
/// job's only purpose is to boot a Hermes session, which is what
/// causes Hermes to refresh OAuth access tokens (no standalone
/// CLI verb for refresh exists today). Hidden until we know the
/// current state: flickering the toggle off→on on view appear
/// would be confusing.
@ViewBuilder
private var keepaliveSection: some View {
let isOn = keepaliveEnabled ?? false
let stale = nousSubscription.hasStaleRefresh && keepaliveEnabled == false
SettingsSection(title: LocalizedStringKey("Keep tokens fresh"), icon: "arrow.clockwise") {
HStack(alignment: .top, spacing: 12) {
Image(systemName: "arrow.clockwise.circle")
.foregroundStyle(.secondary)
VStack(alignment: .leading, spacing: 4) {
Toggle(isOn: Binding(
get: { isOn },
set: { newValue in toggleKeepalive(to: newValue) }
)) {
Text("Auto-refresh OAuth tokens daily")
.font(.system(.body, weight: .medium))
}
.toggleStyle(.switch)
.disabled(keepaliveEnabled == nil || keepaliveBusy)
Text("Registers a `\(OAuthKeepaliveCronService.jobName)` cron job that runs at 4am daily. Booting a Hermes session is what triggers token refresh — without this, refresh tokens silently expire if you go ~30 days without using Scarf.")
.font(.caption)
.foregroundStyle(.secondary)
.fixedSize(horizontal: false, vertical: true)
if stale, let days = nousSubscription.daysSinceLastRefresh() {
HStack(spacing: 6) {
Image(systemName: "exclamationmark.triangle.fill")
.foregroundStyle(.orange)
Text("Your Nous subscription was last refreshed \(days) days ago. Enable the toggle above to prevent the refresh token from expiring.")
.font(.caption)
.foregroundStyle(.orange)
.fixedSize(horizontal: false, vertical: true)
}
.padding(.top, 4)
}
if let err = keepaliveError {
Text(err)
.font(.caption)
.foregroundStyle(.red)
.textSelection(.enabled)
}
}
Spacer(minLength: 0)
if keepaliveBusy {
ProgressView().controlSize(.small)
}
}
.padding(.horizontal, 12)
.padding(.vertical, 8)
.background(.quaternary.opacity(0.3))
}
}
private func toggleKeepalive(to newValue: Bool) {
guard !keepaliveBusy else { return }
keepaliveBusy = true
keepaliveError = nil
let svc = keepalive
Task.detached {
let ok = newValue ? await svc.enable() : await svc.disable()
let actualState = svc.isEnabled()
await MainActor.run {
keepaliveBusy = false
keepaliveEnabled = actualState
if !ok {
keepaliveError = newValue
? "Couldn't register the keepalive cron job. Check `hermes cron` works in a terminal."
: "Couldn't remove the keepalive cron job. Check `hermes cron remove` works in a terminal."
}
}
}
}
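`toggleKeepalive` and `probeKeepalive` lean on an `isEnabled()/enable()/disable()` trio that this hunk doesn't show. A minimal sketch of the likely shape, assuming it shells out to the `hermes cron` CLI; the verbs, flags, schedule, and job-name value below are guesses, since the view only references the `jobName` constant:

import Foundation

/// Sketch only: registers/removes a daily cron job via `hermes cron`.
/// Every CLI argument here is an assumption.
struct KeepaliveCronServiceSketch {
    static let jobName = "scarf-oauth-keepalive"

    /// Runs `hermes <args>` and returns its exit code plus combined output.
    let runHermes: ([String]) -> (code: Int32, output: String)

    func isEnabled() -> Bool {
        let result = runHermes(["cron", "list"])
        return result.code == 0 && result.output.contains(Self.jobName)
    }

    func enable() async -> Bool {
        // Daily 4am job: booting a Hermes session is what refreshes OAuth tokens.
        runHermes(["cron", "add", Self.jobName, "--schedule", "0 4 * * *"]).code == 0
    }

    func disable() async -> Bool {
        runHermes(["cron", "remove", Self.jobName]).code == 0
    }
}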
private var header: some View {
@@ -166,13 +351,32 @@ struct CredentialPoolsView: View {
}
}
Spacer()
Button("Re-authenticate") {
reauthInitialProvider = provider.provider
showAddSheet = true
}
.controlSize(.small)
// `Text(verbatim:)` skips the LocalizedStringKey
// overload that would interpret the backticks as
// markdown inline-code styling; `.help(_:)` rejects
// styled Text. Plain string preserves the backticks
// literally.
.help(Text(verbatim: "Run `hermes auth add \(provider.provider) --type oauth` again to refresh this provider's tokens."))
Button(role: .destructive) {
pendingOAuthRemove = provider
} label: {
Image(systemName: "trash")
}
.controlSize(.small)
.buttonStyle(.borderless)
.help(Text(verbatim: "Remove this OAuth provider from auth.json. Hermes will need to be re-authenticated to use it again."))
}
.padding(.horizontal, 12)
.padding(.vertical, 6)
.background(.quaternary.opacity(0.3))
}
HStack {
Text("Managed by `hermes auth add <provider>` — Scarf is read-only here.")
Text("Re-authenticate refreshes tokens; the trash icon removes the provider from auth.json.")
.font(.caption2)
.foregroundStyle(.tertiary)
Spacer()
@@ -337,8 +541,25 @@ struct CredentialPoolsView: View {
/// OAuth flow so the user can paste the authorization code back.
private struct AddCredentialSheet: View {
@Bindable var viewModel: CredentialPoolsViewModel
/// Optional pre-fill from the re-auth path. When non-nil, the sheet
/// opens with this provider name + OAuth selected, mirroring the
/// state the user would otherwise have to type. Plain "Add
/// Credential" presses leave it nil.
let initialProvider: String?
let onDismiss: () -> Void
init(
viewModel: CredentialPoolsViewModel,
initialProvider: String? = nil,
onDismiss: @escaping () -> Void
) {
self.viewModel = viewModel
self.initialProvider = initialProvider
self.onDismiss = onDismiss
_providerID = State(initialValue: initialProvider ?? "")
_authType = State(initialValue: initialProvider == nil ? .apiKey : .oauth)
}
enum AuthType: String, CaseIterable, Identifiable {
case apiKey = "API Key"
case oauth = "OAuth"
@@ -352,8 +573,8 @@ private struct AddCredentialSheet: View {
}
}
@State private var providerID: String = ""
@State private var authType: AuthType = .apiKey
@State private var providerID: String
@State private var authType: AuthType
@State private var apiKey: String = ""
@State private var label: String = ""
@State private var providers: [HermesProviderInfo] = []
@@ -369,6 +590,22 @@ private struct AddCredentialSheet: View {
/// regular `OAuthFlowController` silently stalls, so we route Nous
/// through ``NousSignInSheet`` instead.
@State private var showNousSignIn: Bool = false
/// Provider/model swap prompt presented after a successful OAuth.
/// Captures the just-authed provider and the active config so the
/// confirm sheet can show the user what's about to change. Nil
/// when no swap is offered (already aligned, or user dismissed).
@State private var pendingProviderSwap: PendingProviderSwap?
/// Snapshot of the post-OAuth state used to render the
/// "Switch active provider?" sheet. Frozen at the moment OAuth
/// succeeded so the sheet stays consistent if config.yaml is
/// edited concurrently.
private struct PendingProviderSwap: Identifiable {
let id = UUID()
let newProvider: String
let currentProvider: String
let currentModelDefault: String
}
private var catalog: ModelCatalogService { ModelCatalogService(context: viewModel.context) }
@@ -412,12 +649,44 @@ private struct AddCredentialSheet: View {
// off `succeeded` which the controller sets only when hermes exited
// zero AND the output has no failure markers. The 0.8s delay lets the
// user see the success banner before the sheet disappears.
//
// v2.8: before auto-dismissing, check whether the just-authed
// provider matches `model.provider` in config.yaml. If they
// disagree, surface the "Switch active provider?" sheet so the
// user doesn't have to dig into Settings to make the new
// credentials actually drive chats. Detected entirely on the
// detached read; only the present-sheet branch suppresses the
// 0.8s auto-dismiss.
.onChange(of: viewModel.oauthFlow.succeeded) { _, newValue in
guard newValue else { return }
DispatchQueue.main.asyncAfter(deadline: .now() + 0.8) {
onDismiss()
let trimmedProvider = providerID.trimmingCharacters(in: .whitespaces)
let ctx = viewModel.context
Task.detached {
let svc = HermesFileService(context: ctx)
let config = svc.loadConfig()
let activeProvider = config.provider.trimmingCharacters(in: .whitespaces)
let modelDefault = config.model.trimmingCharacters(in: .whitespaces)
let needsSwap = !trimmedProvider.isEmpty
&& !activeProvider.isEmpty
&& trimmedProvider.caseInsensitiveCompare(activeProvider) != .orderedSame
await MainActor.run {
if needsSwap {
pendingProviderSwap = PendingProviderSwap(
newProvider: trimmedProvider,
currentProvider: activeProvider,
currentModelDefault: modelDefault
)
} else {
DispatchQueue.main.asyncAfter(deadline: .now() + 0.8) {
onDismiss()
}
}
}
}
}
.sheet(item: $pendingProviderSwap) { swap in
providerSwapSheet(swap: swap)
}
// Nous sign-in is a parallel flow that bypasses OAuthFlowController.
// When it completes, the parent list refreshes from auth.json just
// like it does after a regular OAuth add so we dismiss the
@@ -717,6 +986,61 @@ private struct AddCredentialSheet: View {
}
}
/// "Switch active provider?" sheet shown after a successful OAuth
/// when the just-authed provider doesn't match `model.provider` in
/// config.yaml. Without this, the user has to remember to open
/// Settings and swap the provider manually; they'd otherwise hit
/// the v2.8 mismatch banner on the very next chat. New in v2.8.
private func providerSwapSheet(swap: PendingProviderSwap) -> some View {
VStack(alignment: .leading, spacing: 14) {
HStack(alignment: .top, spacing: 10) {
Image(systemName: "arrow.triangle.2.circlepath")
.font(.title2)
.foregroundStyle(.tint)
VStack(alignment: .leading, spacing: 4) {
Text("Switch active provider to \(swap.newProvider)?")
.font(.headline)
Text("`\(swap.newProvider)` is now authenticated, but `model.provider` in config.yaml is still `\(swap.currentProvider)`.")
.font(.callout)
.foregroundStyle(.secondary)
.textSelection(.enabled)
}
}
if !swap.currentModelDefault.isEmpty {
Text("Current `model.default`: `\(swap.currentModelDefault)` — Hermes will pick a default for `\(swap.newProvider)` if you switch.")
.font(.caption)
.foregroundStyle(.secondary)
.textSelection(.enabled)
}
HStack {
Button("Keep \(swap.currentProvider)") {
pendingProviderSwap = nil
DispatchQueue.main.asyncAfter(deadline: .now() + 0.4) { onDismiss() }
}
Spacer()
Button("Switch to \(swap.newProvider)") {
let target = swap.newProvider
let ctx = viewModel.context
pendingProviderSwap = nil
Task.detached {
let svc = HermesFileService(context: ctx)
// Empty model lets Hermes pick its own default
// for the new provider; this matches the Nous Portal
// path and avoids re-introducing a stale prefix.
_ = svc.setModelAndProvider(model: "", provider: target)
await MainActor.run {
DispatchQueue.main.asyncAfter(deadline: .now() + 0.4) { onDismiss() }
}
}
}
.buttonStyle(.borderedProminent)
.keyboardShortcut(.defaultAction)
}
}
.padding(20)
.frame(minWidth: 460)
}
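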
/// Gate-aware OAuth primary action. For PKCE providers it's the
/// unchanged "Start OAuth" button; for Nous it's "Sign in to Nous
/// Portal" (opens ``NousSignInSheet``); for other device-code /
@@ -24,23 +24,38 @@ final class CronViewModel {
var editingJob: HermesCronJob?
var isLoading = false
/// Classified hint for the selected job's `lastError`, computed via
/// `ACPErrorHint.classify` so cron rows surface the same OAuth-revoked
/// affordance that ChatView's banner offers. `nil` when the selected
/// job has no error or the error doesn't match a known pattern; the
/// detail pane falls back to rendering `lastError` raw.
var selectedErrorClassification: ACPErrorHint.Classification? {
guard let job = selectedJob, let lastError = job.lastError, !lastError.isEmpty else { return nil }
return ACPErrorHint.classify(errorMessage: lastError, stderrTail: "")
}
func load() {
isLoading = true
let svc = fileService
let selectedID = selectedJob?.id
Task.detached { [weak self] in
// Three sync transport ops on remote; keep them off main.
let jobs = svc.loadCronJobs()
let skills = svc.loadSkills().flatMap { $0.skills.map(\.id) }.sorted()
let refreshed = selectedID.flatMap { id in jobs.first(where: { $0.id == id }) }
let output = refreshed.flatMap { svc.loadCronOutput(jobId: $0.id) }
await MainActor.run { [weak self] in
guard let self else { return }
self.jobs = jobs
self.availableSkills = skills
if let refreshed { self.selectedJob = refreshed }
if output != nil { self.jobOutput = output }
self.isLoading = false
// v2.8: instrumented so we can see how many SSH RTTs the
// Cron tab actually costs in captures.
await ScarfMon.measureAsync(.diskIO, "cron.load") {
let jobs = svc.loadCronJobs()
let skills = svc.loadSkills().flatMap { $0.skills.map(\.id) }.sorted()
let refreshed = selectedID.flatMap { id in jobs.first(where: { $0.id == id }) }
let output = refreshed.flatMap { svc.loadCronOutput(jobId: $0.id) }
ScarfMon.event(.diskIO, "cron.load.jobs", count: jobs.count)
await MainActor.run { [weak self] in
guard let self else { return }
self.jobs = jobs
self.availableSkills = skills
if let refreshed { self.selectedJob = refreshed }
if output != nil { self.jobOutput = output }
self.isLoading = false
}
}
}
}
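`ACPErrorHint.classify` isn't part of this hunk; below is a minimal sketch of a classifier with the shape `selectedErrorClassification` and the cron/chat banners rely on. The match patterns and provider extraction are illustrative assumptions, not the shipped matcher:

/// Sketch only: maps a raw error string to a human hint plus, for OAuth
/// revocation errors, the provider that needs re-authentication.
enum ErrorHintSketch {
    struct Classification {
        let hint: String
        /// Non-nil when a "Re-authenticate" button should be offered.
        let oauthProvider: String?
    }

    static func classify(errorMessage: String, stderrTail: String) -> Classification? {
        let haystack = (errorMessage + "\n" + stderrTail).lowercased()
        if haystack.contains("refresh token"), haystack.contains("revoked") {
            // The real classifier presumably parses the provider out of the
            // payload; hard-coding a guess keeps the sketch short.
            let provider = haystack.contains("nous") ? "nous" : "anthropic"
            return Classification(
                hint: "The OAuth refresh token was revoked. Re-authenticate to restore access.",
                oauthProvider: provider
            )
        }
        if haystack.contains("no credentials") || haystack.contains("missing api key") {
            return Classification(
                hint: "No credentials are configured for this provider.",
                oauthProvider: nil
            )
        }
        return nil   // unrecognized: callers render the raw error instead
    }
}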
@@ -12,7 +12,10 @@ import ScarfDesign
struct CronView: View {
@State private var viewModel: CronViewModel
@State private var pendingDelete: HermesCronJob?
@State private var showOutputPanel: Bool = false
@Environment(\.hermesCapabilities) private var capabilitiesStore
@Environment(AppCoordinator.self) private var coordinator
@Environment(HermesFileWatcher.self) private var fileWatcher
init(context: ServerContext) {
_viewModel = State(initialValue: CronViewModel(context: context))
@@ -36,6 +39,13 @@ struct CronView: View {
.navigationTitle("Cron Jobs")
.loadingOverlay(viewModel.isLoading, label: "Loading cron jobs…", isEmpty: viewModel.jobs.isEmpty)
.onAppear { viewModel.load() }
// Reload on Hermes file mutations: Hermes flips `state` between
// "scheduled" and "running" inside `~/.hermes/cron/jobs.json`
// when a job starts/finishes, and writes a new run-output file
// under `~/.hermes/cron/output/`. The watcher gives us the
// running indicator + log tail refresh "for free" without a
// polling timer. Same wiring ActivityView uses.
.onChange(of: fileWatcher.lastChangeDate) { viewModel.load() }
.sheet(isPresented: $viewModel.showCreateSheet) {
CronJobEditor(mode: .create, availableSkills: viewModel.availableSkills, supportsWorkdir: hasCronWorkdir) { form in
viewModel.createJob(
@@ -172,6 +182,13 @@ struct CronView: View {
Circle()
.fill(statusDotColor(job))
.frame(width: 7, height: 7)
.opacity(job.state == "running" ? 0.55 : 1.0)
.animation(
job.state == "running"
? .easeInOut(duration: 0.9).repeatForever(autoreverses: true)
: .default,
value: job.state
)
}
HStack(spacing: 10) {
Text(job.schedule.expression ?? job.schedule.display ?? "")
@@ -221,7 +238,13 @@ struct CronView: View {
}
private func statusDotColor(_ job: HermesCronJob) -> Color {
// Order matters: a currently-running job overrides a stale
// lastError so the user sees "yes, retrying right now" rather
// than "still showing the old failure." Disabled wins over
// everything else: a paused job isn't running, regardless
// of state-field churn.
if !job.enabled { return ScarfColor.foregroundFaint }
if job.state == "running" { return ScarfColor.info }
if job.lastError != nil { return ScarfColor.danger }
return ScarfColor.success
}
@@ -272,6 +295,9 @@ struct CronView: View {
.foregroundStyle(ScarfColor.foregroundPrimary)
ScarfBadge(job.enabled ? "active" : "paused",
kind: job.enabled ? .success : .neutral)
if job.state == "running" {
ScarfBadge("running…", kind: .info)
}
}
Text(CronScheduleFormatter.humanReadable(from: job.schedule))
.scarfStyle(.footnote)
@@ -420,26 +446,165 @@ struct CronView: View {
}
if let error = job.lastError {
errorBanner(job: job, error: error)
}
outputPanel(job: job)
}
/// Last-error surface. When `ACPErrorHint` recognizes the message
/// (OAuth refresh-revoked, missing credentials, SSH failure, etc.),
/// it renders the human hint + raw error + a re-auth button when
/// applicable. Otherwise falls back to the legacy single-line
/// red text, the same chrome the view used pre-PR for unrecognized
/// errors. Mirrors `ChatView.errorBanner` so the recovery flow is
/// identical between cron and chat.
@ViewBuilder
private func errorBanner(job: HermesCronJob, error: String) -> some View {
if let classification = viewModel.selectedErrorClassification {
VStack(alignment: .leading, spacing: 6) {
HStack(alignment: .top, spacing: 8) {
Image(systemName: "exclamationmark.triangle.fill")
.foregroundStyle(ScarfColor.warning)
VStack(alignment: .leading, spacing: 4) {
Text(classification.hint)
.scarfStyle(.body)
.foregroundStyle(ScarfColor.foregroundPrimary)
.textSelection(.enabled)
Text(error)
.scarfStyle(.caption)
.foregroundStyle(ScarfColor.foregroundMuted)
.textSelection(.enabled)
.lineLimit(2)
}
Spacer(minLength: ScarfSpace.s2)
if let provider = classification.oauthProvider {
Button("Re-authenticate") {
coordinator.pendingOAuthReauth = provider
coordinator.selectedSection = .credentialPools
}
.buttonStyle(ScarfPrimaryButton())
.help("Open Credential Pools and re-authenticate \(provider).")
}
}
}
.padding(ScarfSpace.s3)
.background(
RoundedRectangle(cornerRadius: ScarfRadius.lg, style: .continuous)
.fill(ScarfColor.warning.opacity(0.08))
)
.overlay(
RoundedRectangle(cornerRadius: ScarfRadius.lg, style: .continuous)
.strokeBorder(ScarfColor.warning.opacity(0.25), lineWidth: 1)
)
} else {
HStack(spacing: 6) {
Image(systemName: "exclamationmark.triangle.fill")
Text(error)
.scarfStyle(.caption)
.textSelection(.enabled)
}
.foregroundStyle(ScarfColor.danger)
}
}
if let output = viewModel.jobOutput {
sectionBlock("LAST OUTPUT") {
Text(output)
.font(ScarfFont.monoSmall)
.foregroundStyle(ScarfColor.foregroundPrimary)
.textSelection(.enabled)
.padding(ScarfSpace.s3)
.frame(maxWidth: .infinity, alignment: .leading)
/// Per-job run-output panel. Always visible; collapsed by default
/// with a one-line summary so the detail pane stays scannable when
/// the user has dozens of cron jobs. Expanded body mirrors the
/// dark monospaced tail layout `LogsView` uses, fed by
/// `HermesFileService.loadCronOutput` (Hermes writes per-run files
/// under `~/.hermes/cron/output/<jobId>-*`). Reload happens via the
/// outer `HermesFileWatcher` `.onChange`: when a fresh run lands a
/// new output file, the VM re-reads on the next mtime tick.
@ViewBuilder
private func outputPanel(job: HermesCronJob) -> some View {
let summary = outputSummary(job)
VStack(alignment: .leading, spacing: ScarfSpace.s2) {
Button {
showOutputPanel.toggle()
} label: {
HStack(spacing: ScarfSpace.s2) {
Image(systemName: showOutputPanel ? "chevron.down" : "chevron.right")
.font(.system(size: 10, weight: .semibold))
.foregroundStyle(ScarfColor.foregroundMuted)
Text("LAST RUN OUTPUT")
.scarfStyle(.captionUppercase)
.foregroundStyle(ScarfColor.foregroundMuted)
Text(summary)
.font(ScarfFont.monoSmall)
.foregroundStyle(ScarfColor.foregroundFaint)
.lineLimit(1)
Spacer()
}
.contentShape(Rectangle())
}
.buttonStyle(.plain)
if showOutputPanel {
if let output = viewModel.jobOutput, !output.isEmpty {
ScrollViewReader { proxy in
ScrollView {
Text(output)
.font(ScarfFont.monoSmall)
.foregroundStyle(ScarfColor.foregroundPrimary)
.textSelection(.enabled)
.frame(maxWidth: .infinity, alignment: .leading)
.padding(ScarfSpace.s3)
.id("cron-output-bottom")
}
.frame(maxHeight: 320)
.background(
RoundedRectangle(cornerRadius: ScarfRadius.lg, style: .continuous)
.fill(Color(red: 0.07, green: 0.06, blue: 0.05))
)
.overlay(
RoundedRectangle(cornerRadius: ScarfRadius.lg, style: .continuous)
.strokeBorder(ScarfColor.border, lineWidth: 1)
)
// Auto-scroll to the latest line whenever the
// output content changes (a new run lands).
.onChange(of: output) {
withAnimation(.easeOut(duration: 0.18)) {
proxy.scrollTo("cron-output-bottom", anchor: .bottom)
}
}
.onAppear {
proxy.scrollTo("cron-output-bottom", anchor: .bottom)
}
}
} else {
Text("No output yet — this job hasn't run, or its output file is gone.")
.scarfStyle(.caption)
.foregroundStyle(ScarfColor.foregroundMuted)
.frame(maxWidth: .infinity, alignment: .leading)
.padding(ScarfSpace.s3)
.background(
RoundedRectangle(cornerRadius: ScarfRadius.lg, style: .continuous)
.fill(ScarfColor.backgroundSecondary)
)
.overlay(
RoundedRectangle(cornerRadius: ScarfRadius.lg, style: .continuous)
.strokeBorder(ScarfColor.border, lineWidth: 1)
)
}
}
}
}
/// One-line summary rendered next to the LAST RUN OUTPUT chevron
/// when the panel is collapsed. Gives a quick "yes there's content"
/// (or "no output yet") read without expanding.
private func outputSummary(_ job: HermesCronJob) -> String {
let timestamp = job.lastRunAt.map { CronScheduleFormatter.formatNextRun(iso: $0) } ?? "never"
let status: String = {
if job.state == "running" { return "running…" }
if job.lastError != nil { return "error" }
if job.lastRunAt != nil { return "ok" }
return "no runs yet"
}()
return "\(timestamp)\(status)"
}
@ViewBuilder
private func sectionBlock<Content: View>(_ title: String, @ViewBuilder _ content: () -> Content) -> some View {
VStack(alignment: .leading, spacing: ScarfSpace.s2) {
@@ -43,16 +43,23 @@ final class DashboardViewModel {
func load() async {
isLoading = true
// refresh() = close + reopen, forces a fresh remote snapshot. Cheap
// on local (live DB reopen).
// refresh() is essentially free for the streaming remote backend
// (no transfer; every query is fresh) and a cheap reopen for
// local. The four data-service queries below are batched
// through `dashboardSnapshot` so a remote load is one SSH
// round-trip instead of four.
let opened = await dataService.refresh()
var collectedErrors: [String] = []
if opened {
stats = await dataService.fetchStats()
recentSessions = await dataService.fetchSessions(limit: 5)
sessionPreviews = await dataService.fetchSessionPreviews(limit: 5)
let activityMessages = await dataService.fetchRecentToolCalls(limit: 8)
recentActivity = activityMessages.flatMap { msg in
let snapshot = await dataService.dashboardSnapshot(
sessionLimit: 5,
previewLimit: 5,
toolCallLimit: 8
)
stats = snapshot.stats
recentSessions = snapshot.recentSessions
sessionPreviews = snapshot.sessionPreviews
recentActivity = snapshot.recentToolCalls.flatMap { msg in
msg.toolCalls.map { call in
ActivityEntry(
id: call.callId,
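The `dashboardSnapshot` call above replaces four sequential data-service queries with one batched request. A minimal sketch of that bundling, with placeholder record types so it stands alone; every name other than the three limit parameters is an assumption:

// Placeholder record types so the sketch compiles on its own; the app
// would use its real ScarfCore models here.
struct StatsRecordSketch {}
struct SessionRecordSketch {}
struct PreviewRecordSketch {}
struct MessageRecordSketch {}

/// Sketch only: one payload carrying all four dashboard result sets, so a
/// remote load costs a single round-trip instead of four.
struct DashboardSnapshotSketch {
    var stats: StatsRecordSketch
    var recentSessions: [SessionRecordSketch]
    var sessionPreviews: [PreviewRecordSketch]
    var recentToolCalls: [MessageRecordSketch]
}

protocol DashboardDataSourceSketch {
    /// A remote backend would run all four queries in one command and
    /// stream back a combined payload; local backends can simply compose
    /// the existing per-query calls.
    func dashboardSnapshot(sessionLimit: Int, previewLimit: Int, toolCallLimit: Int) async -> DashboardSnapshotSketch
}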
@@ -43,21 +43,26 @@ final class MemoryViewModel {
let svc = fileService
let currentProfile = activeProfile
// Sync transport calls would beach-ball the UI on remote; dispatch
// off main, then commit results back on MainActor.
// off main, then commit results back on MainActor. v2.8: wrapped
// in ScarfMon so we can see how many SSH RTTs this load actually
// costs (4 sequential SFTP reads on the slow path).
Task.detached { [weak self] in
let config = svc.loadConfig()
let profiles = svc.loadMemoryProfiles()
let profile = currentProfile.isEmpty ? config.memoryProfile : currentProfile
let memory = svc.loadMemory(profile: profile)
let user = svc.loadUserProfile(profile: profile)
await MainActor.run { [weak self] in
guard let self else { return }
self.memoryProvider = config.memoryProvider
self.profiles = profiles
self.activeProfile = profile
self.memoryContent = memory
self.userContent = user
self.isLoading = false
await ScarfMon.measureAsync(.diskIO, "memory.load") {
let config = svc.loadConfig()
let profiles = svc.loadMemoryProfiles()
let profile = currentProfile.isEmpty ? config.memoryProfile : currentProfile
let memory = svc.loadMemory(profile: profile)
let user = svc.loadUserProfile(profile: profile)
ScarfMon.event(.diskIO, "memory.load.bytes", count: 0, bytes: memory.utf8.count + user.utf8.count)
await MainActor.run { [weak self] in
guard let self else { return }
self.memoryProvider = config.memoryProvider
self.profiles = profiles
self.activeProfile = profile
self.memoryContent = memory
self.userContent = user
self.isLoading = false
}
}
}
}
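Both wrapped load paths use the same two ScarfMon entry points. A minimal sketch of a facade with those shapes, inferred from the call sites above; the real ScarfCore API, categories, and sink may differ:

import Foundation
import os

/// Sketch only: mirrors the call shapes used above. Emission goes to
/// os.Logger here; the real implementation presumably feeds captures.
enum ScarfMonSketch {
    enum Category: String { case diskIO, transport }

    private static let log = Logger(subsystem: "app.scarf.mon", category: "perf")

    /// Times an async block and records its wall-clock duration.
    static func measureAsync<T>(_ category: Category, _ name: String, _ body: () async -> T) async -> T {
        let start = Date()
        let value = await body()
        let ms = Date().timeIntervalSince(start) * 1000
        log.debug("\(category.rawValue, privacy: .public)/\(name, privacy: .public) took \(ms, format: .fixed(precision: 1)) ms")
        return value
    }

    /// Records a one-off counter, optionally with a byte size.
    static func event(_ category: Category, _ name: String, count: Int, bytes: Int = 0) {
        log.debug("\(category.rawValue, privacy: .public)/\(name, privacy: .public) count=\(count) bytes=\(bytes)")
    }
}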
@@ -1,3 +1,4 @@
import AppKit
import Foundation
import ScarfCore
import os
@@ -50,8 +51,65 @@ final class ProfilesViewModel {
}
}
/// Set the active profile via `hermes profile use <name>` without
/// relaunching Scarf. Most users will reach for `switchAndRelaunch`
/// instead; kept here so the context-menu "Use" item stays
/// functional and so callers that genuinely want a no-relaunch
/// switch (tests, scripted setups) have a path. Invalidates the
/// resolver cache on success so the next `context.paths` access
/// picks up the new home directory.
func switchTo(_ profile: HermesProfile) {
runAndReload(["profile", "use", profile.name], success: "Active profile set to \(profile.name)")
Task.detached { [fileService] in
let result = fileService.runHermesCLI(args: ["profile", "use", profile.name], timeout: 60)
await MainActor.run {
if result.exitCode == 0 {
HermesProfileResolver.invalidateCache()
self.message = "Active profile set to \(profile.name) — restart Scarf to refresh."
} else {
self.message = "Failed: \(result.output.prefix(120))"
}
self.load()
DispatchQueue.main.asyncAfter(deadline: .now() + 3) { [weak self] in
self?.message = nil
}
}
}
}
/// Set the active profile and immediately relaunch Scarf. The
/// canonical user-facing switch path (issue #70): a fresh process
/// guarantees every service constructs from the new
/// `~/.hermes/active_profile` value, sidestepping any in-process
/// state that might still be holding the previous profile's
/// data. Failures fall back to a "restart manually" toast.
@MainActor
func switchAndRelaunch(_ profile: HermesProfile) {
Task.detached { [fileService] in
let result = fileService.runHermesCLI(args: ["profile", "use", profile.name], timeout: 30)
await MainActor.run {
guard result.exitCode == 0 else {
self.message = "Failed: \(result.output.prefix(120))"
self.load()
DispatchQueue.main.asyncAfter(deadline: .now() + 3) { [weak self] in
self?.message = nil
}
return
}
HermesProfileResolver.invalidateCache()
do {
try AppRelauncher.relaunch()
DispatchQueue.main.asyncAfter(deadline: .now() + 0.25) {
NSApp.terminate(nil)
}
} catch AppRelauncher.RelaunchError.debugBuild {
self.message = "Profile switched to \(profile.name). Restart Scarf manually (Xcode-launched instance)."
self.load()
} catch {
self.message = "Profile switched to \(profile.name). Please quit and reopen Scarf manually."
self.load()
}
}
}
}
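`AppRelauncher.relaunch()` and its `debugBuild` error are referenced above but not shown. A minimal sketch of one way to implement it, assuming a DEBUG compile-time check stands in for "Xcode-launched" and `/usr/bin/open -n` respawns the bundle; the shipped implementation may differ:

import AppKit

/// Sketch only: spawns a second instance of the running bundle and leaves
/// termination to the caller (which quits ~0.25s later above).
enum AppRelauncherSketch {
    enum RelaunchError: Error { case debugBuild, notABundle }

    static func relaunch() throws {
        #if DEBUG
        // Debug/Xcode-launched builds: respawning would orphan the debugger
        // session, so callers fall back to a "restart manually" toast.
        throw RelaunchError.debugBuild
        #else
        let bundleURL = Bundle.main.bundleURL
        guard bundleURL.pathExtension == "app" else { throw RelaunchError.notABundle }
        let open = Process()
        open.executableURL = URL(fileURLWithPath: "/usr/bin/open")
        open.arguments = ["-n", bundleURL.path]   // -n: new instance even while this one is alive
        try open.run()
        #endif
    }
}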
func create(name: String, cloneConfig: Bool, cloneAll: Bool) {
@@ -20,6 +20,12 @@ struct ProfilesView: View {
@State private var renameTarget: HermesProfile?
@State private var renameNewName = ""
@State private var pendingDelete: HermesProfile?
/// Profile the user has clicked "Switch & Relaunch" on, awaiting
/// confirmation before we run `hermes profile use` and exit. The
/// confirmation step is load-bearing: relaunching closes every
/// open Scarf window in the process, so the user needs an explicit
/// agreement.
@State private var pendingSwitch: HermesProfile?
/// Remote-import sheet visibility. Local imports use `NSOpenPanel`
/// inline; remote imports route through `RemoteProfilePathSheet`
/// because the zip the user wants to import lives on the remote
@@ -63,6 +69,18 @@ struct ProfilesView: View {
} message: {
Text("This removes the profile directory and all data within it. This cannot be undone.")
}
.confirmationDialog(
pendingSwitch.map { "Switch to '\($0.name)' and relaunch Scarf?" } ?? "",
isPresented: Binding(get: { pendingSwitch != nil }, set: { if !$0 { pendingSwitch = nil } })
) {
Button("Switch & Relaunch") {
if let profile = pendingSwitch { viewModel.switchAndRelaunch(profile) }
pendingSwitch = nil
}
Button("Cancel", role: .cancel) { pendingSwitch = nil }
} message: {
Text("All Scarf windows will close and reopen. Unsaved chat input may be lost.")
}
.sheet(isPresented: $showRemoteImportSheet) {
RemoteProfilePathSheet(
context: viewModel.context,
@@ -160,7 +178,9 @@ struct ProfilesView: View {
}
.tag(profile.id)
.contextMenu {
Button("Use") { viewModel.switchTo(profile) }
Button("Switch & Relaunch") { pendingSwitch = profile }
.disabled(profile.isActive)
Button("Set Active (no relaunch)") { viewModel.switchTo(profile) }
.disabled(profile.isActive)
Button("Rename") {
renameTarget = profile
@@ -215,16 +235,17 @@ struct ProfilesView: View {
Spacer()
if !profile.isActive {
Button {
viewModel.switchTo(profile)
pendingSwitch = profile
} label: {
Label("Switch to This Profile", systemImage: "arrow.triangle.swap")
Label("Switch & Relaunch", systemImage: "arrow.triangle.2.circlepath")
}
.buttonStyle(.borderedProminent)
.controlSize(.small)
.help("Set as active profile and relaunch Scarf so every tab loads from \(profile.name)")
}
}
if !profile.isActive {
profileSwitchWarning
profileSwitchInfo
}
SettingsSection(title: "Details", icon: "info.circle") {
if !profile.path.isEmpty {
@@ -255,16 +276,16 @@ struct ProfilesView: View {
}
}
private var profileSwitchWarning: some View {
private var profileSwitchInfo: some View {
HStack(alignment: .top, spacing: 8) {
Image(systemName: "exclamationmark.triangle")
.foregroundStyle(.orange)
Text("Switching the active profile changes the `~/.hermes` directory hermes uses. Restart Scarf after switching so it re-reads from the new profile's files.")
Image(systemName: "info.circle")
.foregroundStyle(.secondary)
Text("**Switch & Relaunch** sets this as the active profile (writes `~/.hermes/active_profile`) and relaunches Scarf so every tab — Webhooks, Sessions, SOUL.md, Memory — reloads from the new profile's `~/.hermes/profiles/<name>/` directory.")
.font(.caption)
.foregroundStyle(.secondary)
}
.padding(10)
.background(.orange.opacity(0.1))
.background(ScarfColor.backgroundSecondary)
.clipShape(RoundedRectangle(cornerRadius: 6))
}

Some files were not shown because too many files have changed in this diff.