Compare commits


129 Commits

Author SHA1 Message Date
Alan Wizemann a8cdb3e663 feat(ios): v0.13 read-only catch-up — goal pill, queue chip, Kanban diagnostics, Curator archived, Platforms (WS-9)
Mirrors the v0.13 surfaces from WS-2 (Persistent Goals + ACP /queue),
WS-3 (Kanban diagnostics + hallucination gate), WS-4 (Curator archive),
and WS-5 (Google Chat platform + cross-platform allowlists + behavior
toggles) onto ScarfGo. Per Phase H precedent, every iOS surface is
strictly read-only — write verbs (Verify / Reject, /goal --clear, queue
send, allowlist editing, archive Restore / Prune) live on Mac in v2.8.0
and are deferred to v2.8.x.

Five iOS additions, all capability-gated so pre-v0.13 hosts see the
v2.7.5 layout unchanged:

1. Chat — goal pill ("Goal: <text>") and queue chip ("N queued") render
   inside `projectContextBar` whenever a project, goal, or queue is
   present. The bar is no longer project-only; goal/queue chips render
   even outside a project chat. Goal text scales with Dynamic Type
   (semantic `.subheadline`); the full untruncated text rides VoiceOver
   via the chip's accessibility label.
2. Kanban — `ScarfGoKanbanDetailSheet` gains a `retries: N` chip in the
   header `FlowLayout`, a yellow "Worker-created — verify on Mac" badge
   for `pending` hallucination state, a red "Auto-blocked" banner with
   the server-supplied `auto_blocked_reason`, and tappable diagnostics
   chip-lists (task-level + per-run) that present a new
   `DiagnosticDetailSheet` with kind / severity / message / timestamp.
   No Verify or Reject buttons; the badge copy points users to the Mac
   app.
3. Curator — `CuratorView` appends a read-only "Archived" section that
   loads via `viewModel.loadArchive()` on appear and pull-to-refresh.
   Per-row name + category badge + reason + archived-at + size; footer
   signposts users to the Mac app for Restore / Prune.
4. Settings → Platforms — adds a Google Chat status row (configured /
   not configured), busy-ack and restart-notification rows summarized
   across `gatewayPlatforms` (yes / no / mixed (N platforms)), and
   collapsed DisclosureGroups for allowed channels / chats / rooms with
   monospaced "platform: id" entries when expanded. No editor.
5. Settings — green "v0.13 features active" `ScarfBadge` above the
   quick-edits section when `caps.isV013OrLater`. Tap presents a new
   `V013FeaturesSheet` listing the six v0.13 surfaces with one-sentence
   summaries; the section footer is explicit that editing lives on Mac.
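
The "yes / no / mixed (N platforms)" summary in item 4 can be sketched as a small pure function. This is only an illustration: `summarizeToggle` is a hypothetical name, and it assumes N counts every platform in the list and that an empty list reads as "no".

```swift
/// Hypothetical helper (not from the Scarf source): collapses one Bool per
/// platform into a single read-only label.
func summarizeToggle(_ values: [Bool]) -> String {
    // Assumption: with no platforms configured the row reads "no".
    guard let first = values.first else { return "no" }
    // All platforms agree: report the shared value.
    if values.allSatisfy({ $0 == first }) {
        return first ? "yes" : "no"
    }
    // Assumption: N is the total number of platforms, not the number that differ.
    return "mixed (\(values.count) platforms)"
}
```

For example, `summarizeToggle([true, false, true])` yields `mixed (3 platforms)` under these assumptions.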

Implements WS-9 of Scarf v2.8.0 (Hermes v0.13.0 catch-up).
Plan: scarf/docs/v2.8/WS-9-ios-v0.13-plan.md (on coordination/v2.8.0-plans).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 19:25:16 +02:00
Alan Wizemann 441d11404f Merge remote-tracking branch 'origin/ws-8-ux-v0.13' into integration/v2.8.0
# Conflicts:
#	scarf/scarf/Features/Chat/Views/ChatTranscriptPane.swift
#	scarf/scarf/Features/Chat/Views/SessionInfoBar.swift
2026-05-09 19:12:15 +02:00
Alan Wizemann 6e8480411a Merge remote-tracking branch 'origin/ws-7-settings-v0.13' into integration/v2.8.0
# Conflicts:
#	scarf/Packages/ScarfCore/Sources/ScarfCore/Models/HermesConfig.swift
#	scarf/Packages/ScarfCore/Sources/ScarfCore/Parsing/HermesConfig+YAML.swift
2026-05-09 19:11:29 +02:00
Alan Wizemann 3a764e81e0 Merge remote-tracking branch 'origin/ws-6-providers-v0.13' into integration/v2.8.0
# Conflicts:
#	scarf/Packages/ScarfCore/Sources/ScarfCore/Models/HermesConfig.swift
#	scarf/Packages/ScarfCore/Sources/ScarfCore/Parsing/HermesConfig+YAML.swift
2026-05-09 19:10:43 +02:00
Alan Wizemann 6e90741a17 Merge remote-tracking branch 'origin/ws-5-gateway-v0.13' into integration/v2.8.0 2026-05-09 19:09:38 +02:00
Alan Wizemann 93a3b40a67 Merge remote-tracking branch 'origin/ws-4-curator-archive' into integration/v2.8.0 2026-05-09 19:09:38 +02:00
Alan Wizemann 52f0ddb36c Merge remote-tracking branch 'origin/ws-3-kanban-v0.13' into integration/v2.8.0 2026-05-09 19:09:38 +02:00
Alan Wizemann cedee04f2a feat(kanban): v0.13 diagnostics + recovery UX (WS-3)
Layers Hermes v0.13's reliability + recovery affordances on top of the
v2.7.5 Kanban v3 board. New surface — gated end-to-end on
`HermesCapabilities.hasKanbanDiagnostics` (>= v0.13.0):

- **Hallucination gate.** Worker-created cards land in `pending` until
  the user verifies the underlying work exists. Inspector renders a
  yellow Verify / Reject banner above the body; cards dim to 0.6 with
  a question-mark glyph. Verify is optimistic — banner clears
  immediately, polling confirms. Reject routes through
  `comment` + `archive` so there's an audit trail.
- **Generic diagnostics engine.** `HermesKanbanDiagnostic` (new model +
  typed-mirror enum `KanbanDiagnosticKind`) renders cross-run signals
  on the inspector header and per-run signals under each Runs row.
  Card footer gains a stethoscope dot when any signal is attached.
- **`max_retries` create-time field + inspector chip.** Toggle-gated
  Stepper in the create sheet sends `--max-retries N`; chip on the
  inspector header reads it back read-only with a tooltip explaining
  there's no update verb.
- **Multi-line title input.** Create sheet's title becomes a
  `TextField(axis: .vertical, lineLimit: 1...4)`. Newlines are stripped
  client-side on pre-v0.13 hosts (which truncate at the first `\n`).
- **Auto-blocked reason banner.** When `task.auto_blocked_reason` is
  set, replaces the generic "Last run: blocked" with a red banner
  rendering the server reason verbatim. Card footer shows a 1-line
  truncated copy in red.
- **Tolerant decode contract.** Every new field is `Optional` with
  `decodeIfPresent`; diagnostics arrays use `try?` so a single
  malformed entry doesn't poison the row. v0.12 hosts decode unchanged.
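
A minimal sketch of the tolerant decode contract described in the last bullet, assuming a JSON payload. The type names (`Diagnostic`, `TaskRow`, `Lossy`) and the `diagnostics` key are hypothetical stand-ins, not the actual Scarf models; only `auto_blocked_reason` comes from the message above.

```swift
import Foundation

// Hypothetical stand-in for a diagnostics entry.
struct Diagnostic: Decodable, Equatable {
    let kind: String
    let message: String
}

/// Decodes to nil instead of throwing, so one bad array element
/// cannot fail the whole array.
struct Lossy<T: Decodable>: Decodable {
    let value: T?
    init(from decoder: Decoder) throws {
        value = try? T(from: decoder)
    }
}

// Hypothetical stand-in for a Kanban task row.
struct TaskRow: Decodable {
    let id: String
    let autoBlockedReason: String?   // absent on v0.12 hosts
    let diagnostics: [Diagnostic]    // empty when absent or unreadable

    enum CodingKeys: String, CodingKey {
        case id
        case autoBlockedReason = "auto_blocked_reason"
        case diagnostics                 // assumed key name
    }

    init(from decoder: Decoder) throws {
        let c = try decoder.container(keyedBy: CodingKeys.self)
        id = try c.decode(String.self, forKey: .id)
        // Optional field: a missing key decodes as nil.
        autoBlockedReason = try c.decodeIfPresent(String.self, forKey: .autoBlockedReason)
        // `try?` at the array level, Lossy at the element level.
        let entries = (try? c.decode([Lossy<Diagnostic>].self, forKey: .diagnostics)) ?? []
        diagnostics = entries.compactMap { $0.value }
    }
}
```

With this shape, a payload that carries only `id` still decodes, and a malformed diagnostics entry is dropped while its valid siblings survive.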

Implements WS-3 of Scarf v2.8.0 (Hermes v0.13.0 catch-up).
Plan: scarf/docs/v2.8/WS-3-kanban-v0.13-plan.md (on
coordination/v2.8.0-plans).

TODOs marked inline pending integration against a live v0.13 binary:
WS-3-Q1 (verify verb name), WS-3-Q2 (diagnostics envelope vs task),
WS-3-Q4 (failure_count placement), WS-3-Q5 (darwin-zombie kind
string), WS-3-Q6 (max_retries default).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 19:06:38 +02:00
Alan Wizemann b4482e5ee7 feat(gateway): Google Chat platform + cross-platform allowlists + behavior toggles (WS-5)
Catches the Mac Messaging Gateway and Platforms surfaces up to Hermes
v0.13.0. Adds Google Chat as the 20th platform under Settings → Platforms,
gated on `hasGoogleChatPlatform`. Adds a per-platform "Gateway behavior"
subsection to the six platforms Hermes added v0.13 allowlist support to
(Slack, Mattermost, Google Chat, Telegram, WhatsApp, Matrix) — each
exposes the `allowed_channels` / `allowed_chats` / `allowed_rooms` editor
plus three new toggles (`busy_ack_enabled`, `gateway_restart_notification`,
`slash_command_notice_ttl_seconds`). The Messaging Gateway page header
gains a one-line cross-profile digest sourced from `hermes gateway list
--json`. SkillsView surfaces an informational row on skills whose body
contains the v0.13 `[[as_document]]` directive.

New ScarfCore types: `GatewayAllowlistKind` (channels/chats/rooms +
platform mapping), `GatewayPlatformSettings` (per-platform v0.13 bundle),
`GatewayConfigWriter` (pure YAML list-block editor — `hermes config set`
can't write lists; tested with 15 cases incl. round-trip + idempotence +
quoting + scalar-sibling preservation), `HermesGatewayListService`
(`hermes gateway list --json` parser tolerant of unknown keys + alt
field names; 13 tests), `HermesConfig.gatewayPlatforms` field. Mac VM
renamed to `MessagingGatewayViewModel` (single-feature local rename;
CLAUDE.md "the SidebarSection.gateway enum case stays" invariant
upheld). All 22 new tests pass; full ScarfCore suite green except 3
pre-existing `RemoteSQLiteBackendTests` failures unrelated to WS-5.

Capability-gated end-to-end. Pre-v0.13 hosts see no Google Chat row,
no cross-profile digest, no v0.13 toggles, and no `[[as_document]]`
info row — the v2.7.5 surface is byte-for-byte unchanged. Q1-Q3 wire-
shape unknowns (Google Chat identifier, YAML key path,
`gateway list --json` shape) are marked with `// TODO(WS-5-Q<N>)` and
defended by tolerant parsers + dual-spelling lookups.

Implements WS-5 of Scarf v2.8.0 (Hermes v0.13.0 catch-up).
Plan: scarf/docs/v2.8/WS-5-gateway-v0.13-plan.md (on coordination/v2.8.0-plans).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 19:05:55 +02:00
Alan Wizemann 4757b5ae49 feat(curator): archive + prune + list-archived (WS-4)
Catches the Curator surface up to Hermes v0.13's new write-side verbs
(`archive <skill>`, `prune`, `list-archived`, synchronous `run`). Adds
a new `CuratorService` actor in ScarfCore mirroring `KanbanService`'s
pattern (Sendable, pure I/O, `Task.detached(priority: .utility)` per
verb), tolerantly-decoded `HermesCuratorArchivedSkill` /
`CuratorPruneSummary` models, and `CuratorError` for inline-banner
surfacing.

Mac UX gains an "Archived" section between the leaderboards and the
last-report block (per-row Restore button), an "archivebox" button on
every active-skill leaderboard row to manually archive, a destructive
"Prune Archived…" confirm sheet enumerating each skill (template-
uninstall pattern — Cancel owns `.defaultAction`, Prune is on the red
`ScarfDestructiveButton`), and a synchronous-with-progress "Run Now"
on v0.13+ hosts (600s timeout, `ProgressView` while in-flight).
Failure path routes through a yellow inline error banner instead of a
modal alert. The legacy `CuratorRestoreSheet` stays accessible from
the overflow menu but only on pre-v0.13 hosts; on v0.13+ the per-row
Restore in the new Archived section replaces it.

All new surfaces gate on `HermesCapabilities.hasCuratorArchive` —
pre-v0.13 hosts see the v2.7.x layout unchanged. iOS picks up the new
`runNow(synchronous:)` signature with the v0.13 capability flag; the
read-only Archived section + WS-9 marker is left for the next stream.
14 new parser tests in `HermesCuratorParserTests` cover the JSON
happy path, the `{"archived": [...]}` envelope, the text fallback
(`--json` not supported), `"no archived skills"` sentinel folding,
prune-dry-run with both wrapper + bare-array shapes, and zero-skill
prune. All 369 ScarfCore tests pass; `xcodebuild` for the `scarf`
scheme succeeds.

Wire-shape unknowns (CLI flag presence on real v0.13) carry
`// TODO(WS-4-Q<N>)` markers in `CuratorService` and fall back
defensively when a flag isn't recognized. Implements WS-4 of Scarf
v2.8.0 (Hermes v0.13.0 catch-up). Plan:
scarf/docs/v2.8/WS-4-curator-archive-plan.md (on
coordination/v2.8.0-plans).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 19:03:13 +02:00
Alan Wizemann 0070441243 feat(profiles): add --no-skills toggle to create-profile sheet
Adds an "Empty profile (no skills)" toggle to the Mac create-profile
sheet, gated on `hasProfileNoSkills` (v0.13+). When ON, the create
flow appends `--no-skills` to `hermes profile create`. The toggle is
disabled (greyed out) when "Full copy of active profile" is on, per
WS-7 plan Decision H — a full clone copies skills wholesale, so
`--no-skills` would be a contradiction at the UX layer. The wire
itself stays permissive: a user can stack `--clone --no-skills` to
clone config but skip skills, which is a plausible workflow.

Defensive write-strip: even though the toggle is hidden on pre-v0.13
hosts, the call site reads `createNoSkills` through the capability
gate so a stale state value can't sneak `--no-skills` past argparse
on a CLI that doesn't know it.
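
The defensive write-strip can be sketched as follows. `profileCreateArguments` is a hypothetical helper, and passing the profile name as a positional argument is an assumption; only `hermes profile create` and `--no-skills` come from the message above.

```swift
/// Hypothetical argv builder (not from the Scarf source).
/// Assumption: the profile name is a positional argument.
func profileCreateArguments(
    name: String,
    noSkillsRequested: Bool,
    hostSupportsNoSkills: Bool
) -> [String] {
    var args = ["profile", "create", name]
    // Read the UI state through the capability gate, so a stale `true`
    // left over from another host never reaches an older CLI.
    if hostSupportsNoSkills && noSkillsRequested {
        args.append("--no-skills")
    }
    return args
}
```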

iOS Profiles is read-only (per CLAUDE.md "v0.12 iOS catch-up
Phase H") so no toggle there.

TODO marker (WS-7-Q8) flags the assumed `--clone-all` interaction —
verify Hermes's behaviour with both flags during integration.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 19:03:06 +02:00
Alan Wizemann 57a6340985 feat(providers): catalog refresh + image_gen.model + OpenRouter caching (WS-6)
Surfaces the v0.13 provider catalog work in Scarf v2.8.0. Five new model IDs
(deepseek/deepseek-v4-pro, x-ai/grok-4.3, openrouter/owl-alpha,
tencent/hy3-preview, arcee/trinity-large-thinking) flow through
models_dev_cache.json on next refresh — no manual catalog entries
needed; the picker reaches them automatically. The grok-4.20-beta →
grok-4.20 rename is handled via a new ModelCatalogService.modelAliases
map plus resolveModelAlias() helper, called from validateModel(),
model(_:_:), and provider(for:) at read time. Lossless: stored configs
are never rewritten.
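
A minimal sketch of read-time alias resolution. The table contents and the `StoredConfig` type are hypothetical, and the real map may be keyed per provider; the point is only that the stored value is left alone and the alias is applied when the value is read.

```swift
// Hypothetical alias table; the real one lives in ModelCatalogService
// and may be scoped per provider.
let modelAliases: [String: String] = ["grok-4.20-beta": "grok-4.20"]

/// Returns the current ID for a renamed model, or the input unchanged.
func resolveModelAlias(_ modelID: String) -> String {
    modelAliases[modelID] ?? modelID
}

// Hypothetical config wrapper used only for this illustration.
struct StoredConfig {
    /// Persisted value; never rewritten.
    let model: String
    /// Resolved on every read, so old configs keep working.
    var effectiveModel: String { resolveModelAlias(model) }
}
```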

Vercel AI Gateway is demoted to the bottom of the picker via a new
demotedProviders set + sort-comparator axis (between subscription-gated
and alphabetical). Always-on, no capability gate — sort-order
consistency across Hermes versions.
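
The demotion axis might look like the comparator below. This sketch leaves out the subscription-gated axis, since its direction is not stated here, and it assumes the set is keyed by display name; `ProviderEntry` and `pickerPrecedes` are hypothetical names.

```swift
// Hypothetical picker entry.
struct ProviderEntry {
    let name: String
}

// Assumption: the demoted set is keyed by display name.
let demotedProviders: Set<String> = ["Vercel AI Gateway"]

/// Returns true when `a` should appear above `b` in the picker.
func pickerPrecedes(_ a: ProviderEntry, _ b: ProviderEntry) -> Bool {
    let aDemoted = demotedProviders.contains(a.name)
    let bDemoted = demotedProviders.contains(b.name)
    // Axis 1: non-demoted providers sort ahead of demoted ones.
    if aDemoted != bDemoted { return !aDemoted }
    // Axis 2: case-insensitive alphabetical order.
    return a.name.lowercased() < b.name.lowercased()
}
```

Sorting with `entries.sorted(by: pickerPrecedes)` keeps the alphabetical order intact and moves demoted entries to the end.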

image_gen.model (top-level v0.13 YAML key) and
openrouter.response_cache.enabled (provisional key shape per
TODO(WS-6-Q1)) are surfaced as new SettingsSection rows in
AuxiliaryTab, capability-gated on hasImageGenModel +
hasOpenRouterResponseCache so pre-v0.13 hosts hide them. Image-gen
picker has a curated 7-entry allowlist (HermesImageGenModel) plus
free-form Custom model ID entry.

CLAUDE.md gains two schema-drift bullets next to the existing
overlayOnlyProviders requirement (modelAliases + demotedProviders
mirror with hermes_cli/providers.py).

Tests: 4 new M0cServicesTests (sort axis, alias resolution + cross-
provider isolation, image-gen allowlist, demoted-set sentinel) and 2
new M6ConfigCronTests (YAML round-trip + empty-default).

Implements WS-6 of Scarf v2.8.0 (Hermes v0.13.0 catch-up).
Plan: scarf/docs/v2.8/WS-6-providers-v0.13-plan.md
(on coordination/v2.8.0-plans).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 19:02:45 +02:00
Alan Wizemann 0f78856e6e feat(settings): v0.13 polish — redaction hint, display.language picker, xAI cloning badge (WS-8)
Three Settings-tab surfaces tracking v0.13 release notes:

- **Redaction default-flip awareness** (Advanced → Caching & Redaction):
  inline hint below the existing toggle whose copy depends on
  `HermesCapabilities.isV013OrLater`. v0.13 flipped the server-side
  default from OFF (v0.12) to ON, but Scarf's parser still treats
  "absent key" as `false`. Hint disambiguates so users on v0.13 hosts
  understand redaction is on server-side even when the toggle reads OFF.

- **`display.language` picker** (General → Locale): 8-option enum (`""`
  default + en/zh/ja/de/es/fr/uk/tr) capability-gated on
  `hasDisplayLanguage`. Persists via `hermes config set
  display.language <code>`. Empty string preserves "no key" semantics
  (Hermes-default English); explicit `en` pins it. Required a small
  `optionLabel:` overload on `PickerRow` so non-English labels
  (中文 / 日本語 / etc.) render alongside their codes.

- **xAI Custom Voices badge** (Voice → Text-to-Speech): adds `xai`
  to the TTS provider picker (un-gated — xAI TTS shipped earlier),
  exposes Voice ID + Model fields, and renders a "Cloning supported"
  ScarfBadge gated on `hasXAIVoiceCloning`. Hint copy points at
  `hermes voice` for cloned-voice management since Scarf has no
  in-app surface for that yet (out-of-scope for v2.8).

Capability gates: `isV013OrLater` (hint discriminator),
`hasDisplayLanguage` (picker), `hasXAIVoiceCloning` (badge). Pre-v0.13
hosts see the v2.7.5 layout unchanged.

`TODO(WS-8-Q2)` flags the assumed xAI YAML keys (`tts.xai.voice_id` /
`tts.xai.model` mirroring elevenlabs) for grep-verify against
`~/.hermes/hermes-agent/hermes_cli/voice/tts.py`.

iOS deferred to v2.9 (Q4): `Scarf iOS` Settings is read-mostly and
doesn't have a write surface for either the language picker or the
xAI fields.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 18:59:38 +02:00
Alan Wizemann 5877bf6519 feat(updater): forward-compat HermesUpdaterCommandBuilder for hermes update --yes (WS-8)
Pure-function helper that builds argv arrays for `hermes update`,
gated on `HermesCapabilities`. Pre-v0.12 → bare `update`; v0.12+
honors `--check`; v0.13+ honors `--yes` for unattended runs.

No in-app "Update Hermes" affordance ships in v2.7.5 — Sparkle handles
Scarf-self-update and `hermes update` is invoked by users in their
terminal. This is forward-compat plumbing so the eventual UI surface
shares flag selection across Mac / iOS / remote without re-deriving
from scratch.

Test matrix in `M0eUpdaterTests` covers all six combinations
(pre-v0.12, v0.12 ± unattended ± check, v0.13 ± unattended ± check)
plus an empty-capabilities fallback.
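
The flag selection can be sketched as a pure function. `UpdaterCapabilities` and `updateArguments` are hypothetical names standing in for `HermesCapabilities` and the real builder; the flag order and the choice to allow `--check` and `--yes` together are assumptions.

```swift
// Hypothetical capability subset; both flags default to "unsupported",
// which matches an empty or undetected host.
struct UpdaterCapabilities {
    var supportsCheck = false   // v0.12+
    var supportsYes = false     // v0.13+
}

/// Builds the argv for `hermes update`, dropping any flag the host
/// is not known to understand.
func updateArguments(
    caps: UpdaterCapabilities,
    checkOnly: Bool,
    unattended: Bool
) -> [String] {
    var args = ["update"]
    if checkOnly && caps.supportsCheck { args.append("--check") }
    if unattended && caps.supportsYes { args.append("--yes") }
    return args
}
```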

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 18:59:12 +02:00
Alan Wizemann f19f19cd56 feat(chat): surface v0.13 compression count + bracket-aware slash hint (WS-8)
Two small chat-surface additions tracking Hermes v0.13:

- Plumb a `compressionCount` field through `ACPPromptResult` and
  `RichChatViewModel.acpCompressionCount` so `SessionInfoBar` can render
  a `🗜 ×N` chip next to the token counter when the agent has performed
  context compactions. Capability-gated on
  `HermesCapabilities.hasContextCompressionCount` and `count > 0` so
  pre-v0.13 hosts (which always emit 0) and fresh sessions never see
  the chip. Wire decode tolerates camelCase + snake_case;
  `TODO(WS-8-Q1)` flags the assumption that the field rides on
  `usage` — if v0.13 emits via a separate `session/update` notification
  the bigger fix is described in the WS-8 plan.

- Slash-menu argument hint is now bracket-aware: hints starting with
  `<` or `[` pass through verbatim, others wrap as `<hint>`. v0.13's
  `/new [name]` ships through unchanged without rendering as
  `<[name]>`. No flag check at the renderer — agent payload is the
  source of truth.
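
The bracket-aware rule from the second bullet reduces to a prefix check. `renderArgumentHint` is a hypothetical name, and returning an empty string for an empty hint is an assumption.

```swift
/// Hypothetical renderer for slash-menu argument hints.
func renderArgumentHint(_ hint: String) -> String {
    // Assumption: an empty hint renders as nothing rather than "<>".
    guard !hint.isEmpty else { return "" }
    // Hints that already carry their own brackets pass through verbatim.
    if hint.hasPrefix("<") || hint.hasPrefix("[") {
        return hint
    }
    // Everything else is wrapped as a required argument.
    return "<\(hint)>"
}
```

With this rule `[name]` stays `[name]`, while a bare `text` becomes `<text>`.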

Coordination with WS-2: both work-streams touch `SessionInfoBar`. WS-2 owns
the queue chip on the left half; this WS owns the compression chip on
the right half. The added `capabilities` parameter is shared — kept
additive so WS-2's later merge produces no file-level conflict.

Tests: extends `M0dViewModelsTests` (compression count tracking +
reset semantics) and `ScarfCoreSmokeTests` (decode default + explicit
v0.13 init path).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 18:58:58 +02:00
Alan Wizemann 6c96fcfa43 feat(settings): add Web Tools tab with v0.13 search/extract split
Adds a new "Web Tools" Settings tab (between Browser and Voice) with
two distinct shapes that share the same chrome:

- Pre-v0.13: a single "Backend" picker writing the legacy
  `web_tools.backend` key (so v0.12 users still configure web tools).
- v0.13+: two pickers — Search backend writes
  `web_tools.search.backend` (SearXNG appears here only — Hermes
  registers it as a search-only dispatch), Extract backend writes
  `web_tools.extract.backend`.

Capability gate: `hasWebToolsBackendSplit` chooses which shape
renders. The tab itself is always visible — pre-v0.13 users would
otherwise lose access to the legacy combined-backend picker.

Model layer:
- `HermesConfig.webToolsBackend` / `webToolsSearchBackend` /
  `webToolsExtractBackend` — three fields, each round-tripping its
  own YAML key. Defaults: `duckduckgo` / `duckduckgo` / `reader`.
- YAML parser reads all three keys via the existing `str(...)`
  helper. Pre-v0.13 hosts populate only `webToolsBackend`; the
  split keys default to the same backend so the picker shows the
  same value the user already had.

TODO markers (WS-7-Q6/Q7) flag the inline backend lists + legacy
fallback semantics — verify against `~/.hermes/hermes-agent/
hermes_cli/web_tools.py` during integration.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 18:56:08 +02:00
Alan Wizemann edac142d08 feat(chat): add /goal and /queue slash commands (WS-2)
Adds Hermes v0.13's Persistent Goals and ACP /queue surfaces to the
rich-chat composer. /goal <text> locks the agent on a target across
turns (rendered as an info-tinted "Goal locked" pill in the chat
header, with a context-menu Clear action that dispatches /goal --clear);
/queue <text> queues a prompt to run after the current turn (rendered
as a warning-tinted chip with a popover listing queued prompts +
relative timestamps). Both ride .acpNonInterruptive so the chat keeps
"Agent working…" off, and both surface a 4-second transient toast
mirroring /steer's existing UX.

Capability-gated end-to-end: the rich-chat slash menu reads through
RichChatViewModel.capabilitiesGate (a new @ObservationIgnored field
fed by ChatViewModel.attachCapabilitiesStore on Mac and a parallel
.task(id:) on iOS), so pre-v0.13 hosts never see /goal or /queue.
/steer is greyed-out on idle sessions when hasACPSteerOnIdle is off
(pre-v0.13 hosts only). The "Clear all" queue-popover button is
intentionally absent in v2.8.0 — Hermes' wire-shape for /queue --clear
isn't verified yet, so a button that lies about server-side state is
worse than no button (per WS-2 plan Q2 decision).

Optimistic-only: there is no authoritative read-back path for the
active goal in v2.8.0. The pill paints synchronously off the
optimistic write the moment the user sends /goal …; cross-session
resume won't re-paint it until the user types /goal again. A
TODO(WS-2-Q1) marker in RichChatViewModel.recordActiveGoal points at
the read-back hook for v2.8.1; TODO(WS-2-Q5) flags the verbatim
/queue argument shape for coordinator wire-verification; TODO(WS-2-Q7)
flags the /goal non-interruptive classification. TODO(v2.8.1) in
handlePromptComplete is the deferred "auto-resumed from checkpoint"
indicator (WS-2 plan Q3 decision).

iOS surfaces no UI yet (deferred to WS-9), but the iOS controller's
_sendImpl mirrors the dispatch so the shared RichChatViewModel state
stays aligned across platforms — otherwise an iOS user who ran /goal
then opened the same session on Mac would see an empty pill.

Tests: extends M9SlashCommandTests with 13 new cases covering the
non-interruptive list contents, capability-gated availableCommands
filtering on v0.12 vs v0.13, parseGoalArgument variants, optimistic
mutators (recordActiveGoal / recordQueuedPrompt / popQueuedPrompt),
isNonInterruptiveSlash recognition, and reset() drainage.

Implements WS-2 of Scarf v2.8.0 (Hermes v0.13.0 catch-up).
Plan: scarf/docs/v2.8/WS-2-goals-and-queue-plan.md (on coordination/v2.8.0-plans).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 18:55:54 +02:00
Alan Wizemann fd33b714e3 feat(cron): add --no-agent watchdog toggle gated on hasCronNoAgent
Adds a "Run script only (no agent call)" toggle to the cron job
editor. When ON, the prompt + skills sections dim + disable
visually but stay rendered (no layout shift mid-edit), the
script field stays fully active, and the form passes
`noAgent: true` to `createJob`/`updateJob`. The toggle is hidden
on pre-v0.13 hosts via `supportsNoAgent: hasCronNoAgent` and
defensively stripped at the call site (`hasCronNoAgent ?
form.noAgent : false` on create, `: nil` on edit) — same shape
as the v0.12 `workdir` strip.

Read-side: `HermesCronJob.noAgent: Bool?` is decoded via
`decodeIfPresent` so pre-v0.13 jobs.json files round-trip
unchanged. The display rule `job.noAgent == true` treats
`nil` and `false` identically — a script-only job must opt in.
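
A sketch of the read-side rule and the call-site strip, assuming a JSON jobs file. `CronJob` is a hypothetical stand-in for `HermesCronJob`, the `no_agent` key spelling is an assumption, and the two helper functions are illustrative names.

```swift
import Foundation

// Hypothetical stand-in for HermesCronJob.
struct CronJob: Codable {
    let name: String
    /// Optional, so synthesized Codable uses decodeIfPresent and
    /// omits the key on encode when it is nil.
    let noAgent: Bool?

    enum CodingKeys: String, CodingKey {
        case name
        case noAgent = "no_agent"   // assumed key spelling
    }

    /// Display rule: nil and false both mean "normal agent job".
    var isScriptOnly: Bool { noAgent == true }
}

/// Create path: unsupported hosts always get `false`.
func noAgentForCreate(formValue: Bool, hasCronNoAgent: Bool) -> Bool {
    hasCronNoAgent ? formValue : false
}

/// Edit path: unsupported hosts get `nil`, meaning "send no flag at all".
func noAgentForUpdate(formValue: Bool, hasCronNoAgent: Bool) -> Bool? {
    hasCronNoAgent ? formValue : nil
}
```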

Write-side:
- `createJob` appends `--no-agent` and passes an empty positional
  prompt (per WS-7-Q5) to keep argparse happy when the prompt is
  the trailing positional.
- `updateJob` sends `--no-agent` / `--agent` to flip the flag in
  edit mode (per WS-7-Q4 — verify the toggle-off spelling on
  integration; if Hermes is one-way, disable the toggle in edit
  mode with a tooltip).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 18:43:03 +02:00
Alan Wizemann c81a8a56e8 feat(mcp): add SSE transport support gated on hasMCPSSETransport
Extends MCPTransport with a third .sse case (alongside stdio + http),
plumbed through the YAML parser, add-server form, list view, detail
view, and editor. The add-server form filters .sse out of the segmented
picker on pre-v0.13 hosts (capability-gated on hasMCPSSETransport) so
Hermes never sees a transport flag it can't parse. The editor renders
a third numeric "SSE read timeout" field only for .sse servers.

YAML layer:
- HermesMCPServer.sseReadTimeout: Int? — defaulted in init, decoded
  from `sse_read_timeout` scalar.
- parseMCPServersBlock: 3-way transport discriminator — `transport: sse`
  scalar wins, then url-bearing entries default to .http (v0.12 shape),
  command-bearing to .stdio. Pre-v0.13 entries are byte-for-byte
  unaffected.
- HermesFileService.addMCPServerSSE writes via `hermes mcp add --url
  <u> --transport sse [--sse-read-timeout <t>]`.
- HermesFileService.setMCPServerSSETimeout patches the scalar via the
  same surgical patcher used by setMCPServerTimeouts.
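
The three-way discriminator can be sketched independently of the YAML parser. `inferTransport` is a hypothetical function that takes the already-extracted scalars; returning nil for an entry with neither `url` nor `command` is an assumption.

```swift
enum MCPTransport: String {
    case stdio, http, sse
}

/// Hypothetical discriminator over scalars already pulled from one
/// server entry.
func inferTransport(explicit: String?, url: String?, command: String?) -> MCPTransport? {
    // 1. An explicit `transport: sse` scalar wins.
    if explicit == "sse" { return .sse }
    // 2. URL-bearing entries keep their v0.12 meaning.
    if url != nil { return .http }
    // 3. Command-bearing entries are stdio servers.
    if command != nil { return .stdio }
    // Assumption: an entry with neither field is left unclassified.
    return nil
}
```

Because the explicit scalar is checked first and everything else follows the older rules, entries written before v0.13 classify exactly as they did.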

TODO markers (WS-7-Q1/Q2/Q3) flag the wire-format unknowns the plan
called out — verify against a v0.13 Hermes install during integration.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 18:34:27 +02:00
Alan Wizemann 3e470c7155 Merge pull request #80 from awizemann/ws-1-capabilities-v0.13
feat(capabilities): add Hermes v0.13 capability flags + version bump (WS-1)
2026-05-09 18:17:51 +02:00
Alan Wizemann 963d0e1a5c feat(capabilities): add isV013OrLater convenience predicate
Surfaces a v0.12 → v0.13 boundary check that doesn't proxy through any
specific feature flag. Used by WS-8 (redaction default-state hint copy,
"v0.13 features active" Settings badge in iOS WS-9) where the call site
isn't actually about a specific feature — it's about whether the host is
on the v0.13 line.

Equivalent to any individual v0.13 flag (e.g. `hasGoals`); both resolve
to the same `>= 0.13.0` threshold. Convenience exists to keep call sites
honest: `caps.isV013OrLater` reads better than `caps.hasGoals` when the
context isn't goal-related.

Tests: 4 new fixtures covering v0.13 host (true), v0.12 host (false),
empty/undetected (false), and v0.14 host (true). 19 total tests in the
suite, all passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 18:08:14 +02:00
Alan Wizemann 52c802676f feat(capabilities): add Hermes v0.13 capability flags + version bump
Adds 22 new capability flags grouped under a v0.13 (v2026.5.7) MARK
section in HermesCapabilities, covering Persistent Goals, ACP /queue
+ /steer-on-idle, Kanban diagnostics + recovery UX, Curator archive
+ prune, Google Chat (20th platform), cross-platform allowlists,
MCP SSE transport, Cron --no-agent, Web Tools backend split, Profiles
--no-skills, context compression count, /new <name>, OpenRouter cache,
image_gen.model, display.language, xAI voice cloning, video_analyze,
and the transform_llm_output plugin hook.

Each flag gates on >= 0.13.0 so v0.13 patch releases (0.13.4 etc.)
still light up every flag. Existing v0.12 flags unchanged. Test suite
extends with v0.13.0/2026.5.7 fixtures, a v0.13.4 patch-release case,
explicit "v0.13 flags off on v0.12 host" coverage, and updates the
future-version test to v0.14.0.
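
A minimal sketch of the `>= 0.13.0` gate, showing why patch releases and later minors still light up a flag. `HostVersion` and `Capabilities` are hypothetical stand-ins for the real `HermesCapabilities` internals; treating an undetected version as "all flags off" follows the test description above.

```swift
// Hypothetical three-part version value.
struct HostVersion: Comparable {
    let major: Int, minor: Int, patch: Int

    init(major: Int, minor: Int, patch: Int) {
        self.major = major
        self.minor = minor
        self.patch = patch
    }

    /// Parses "0.13.4" or "v0.13.4"; returns nil for anything else.
    init?(_ text: String) {
        let trimmed = text.hasPrefix("v") ? String(text.dropFirst()) : text
        let parts = trimmed.split(separator: ".").map { Int(String($0)) }
        guard parts.count == 3,
              let major = parts[0], let minor = parts[1], let patch = parts[2] else {
            return nil
        }
        self.init(major: major, minor: minor, patch: patch)
    }

    static func < (lhs: HostVersion, rhs: HostVersion) -> Bool {
        // Tuple comparison is lexicographic: major, then minor, then patch.
        (lhs.major, lhs.minor, lhs.patch) < (rhs.major, rhs.minor, rhs.patch)
    }
}

// Hypothetical capability holder.
struct Capabilities {
    let version: HostVersion?   // nil when the host version is undetected

    private static let v013 = HostVersion(major: 0, minor: 13, patch: 0)

    var isV013OrLater: Bool {
        guard let version = version else { return false }
        return version >= Capabilities.v013
    }

    /// Every v0.13 flag resolves to the same threshold.
    var hasGoals: Bool { isV013OrLater }
}
```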

CLAUDE.md target line bumps to v2026.5.7 (v0.13.0); a new v2026.5.7
section mirrors the v0.12 / v0.11 scaffolding describing the Scarf-
relevant subset. The v0.12 + v0.11 historical sections remain intact
since pre-v0.13 hosts still consume those flags.

Foundation for the v2.8.0 Scarf release — every subsequent work-stream
(WS-2 through WS-9) consumes flags added here.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 17:31:51 +02:00
Alan Wizemann 5d8873d305 chore: Bump version to 2.7.5 2026-05-08 13:59:21 +02:00
Alan Wizemann 49bc4efe83 fix(kanban): enrich LocalTransport subprocess env so kanban dispatcher can spawn workers
GUI-launched Scarf inherits macOS's launch-services PATH
(`/usr/bin:/bin:/usr/sbin:/sbin`). Scarf itself finds `hermes` via
absolute-path resolution in `HermesPathSet.hermesBinaryCandidates`,
but when the kanban dispatcher (a child of Scarf) tries to spawn a
worker, the worker inherits the same stripped PATH and Hermes's spawn
machinery prints `\`hermes\` executable not found on PATH. Install
Hermes Agent or activate its venv before running the kanban
dispatcher.` — recording `outcome=spawn_failed` on the run.

`LocalTransport` now mirrors `SSHTransport.environmentEnricher`:
adds an `environmentEnricher: (() -> [String: String])?` static, and
applies it to every subprocess. `scarfApp.swift` wires it at launch
to the same `HermesFileService.enrichedEnvironment()` login-shell
probe (`zsh -l -i` → `zsh -l` fallback) the SSH transport already
uses, so subprocesses see `~/.local/bin`, `/opt/homebrew/bin`, and
the user's credential env vars.

Defense-in-depth: `subprocessEnvironment(forExecutable:)` always
prepends the executable's own directory to PATH if missing — covers
early-startup paths and test harnesses where the enricher hasn't
been wired yet.
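
The enricher merge and the PATH fallback can be sketched together. The signature below is an illustration only: the extra `base` and `enricher` parameters, the `directory(of:)` helper, and the rule of skipping empty enricher values are all assumptions.

```swift
/// Hypothetical helper: the directory part of an absolute executable path.
func directory(of executable: String) -> String? {
    guard let slash = executable.lastIndex(of: "/") else { return nil }
    let dir = String(executable[..<slash])
    return dir.isEmpty ? "/" : dir
}

/// Sketch of the environment handed to a child process.
func subprocessEnvironment(
    forExecutable executable: String,
    base: [String: String],
    enricher: (() -> [String: String])? = nil
) -> [String: String] {
    var env = base
    // Enricher values win over the inherited ones, including keys the
    // process env set to an empty string. Assumption: empty enricher
    // values are skipped.
    if let extra = enricher?() {
        for (key, value) in extra where !value.isEmpty {
            env[key] = value
        }
    }
    // Fallback: make sure the executable's own directory is on PATH.
    if let dir = directory(of: executable) {
        let entries = (env["PATH"] ?? "").split(separator: ":").map(String.init)
        if !entries.contains(dir) {
            env["PATH"] = ([dir] + entries).joined(separator: ":")
        }
    }
    return env
}
```

Running the PATH step after the enricher means the executable's directory is present even when the enricher replaces PATH wholesale.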

Two new tests in `KanbanModelsTests` lock in:
1. The fallback (no enricher → executable's dir lands on PATH)
2. The enricher win for PATH + the empty-string-aware copy semantics
   for credential env vars (in some environments the process env sets
   `ANTHROPIC_API_KEY=""`, an empty string; the enricher's non-empty
   value must still take effect)

Release notes for v2.7.5 updated to document the fix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 13:59:21 +02:00
Alan Wizemann adcc984091 feat(kanban): full read/write board with per-project tenants
Lifts Scarf's Kanban surface from the v2.6 read-only list to a
drag-and-drop board with the complete Hermes v0.12 mutation surface
wired up, plus per-project boards bound to a Scarf-minted tenant slug
and a read-only board on iOS.

Why now: the v2.6 list was a placeholder shipped while upstream Kanban
collab was still mid-rework. v0.12 stabilized the 27-verb CLI; this
release makes Scarf a real GUI client for it. Driving real tasks
end-to-end exposed and closed a connected bug pattern (claim vs
dispatch, silent skipped_unassigned, integer-vs-ISO timestamps,
parser-leaked "(no" sentinel) that would have shipped as latent UX
papercuts otherwise.

ScarfCore: KanbanService actor (Sendable, pure I/O) wrapping every
verb; KanbanTenantReader cross-platform manifest projection; eight
new model types (TaskDetail, Comment, Event, Run, Stats, Assignee,
CreateRequest, Filters); KanbanError; pure transition planner that
maps drag-drop column changes to verb sequences, tested against
canonical Hermes JSON fixtures.

Mac: KanbanBoardView orchestrator with five-column drag-drop layout,
optimistic-merge state, KanbanInspectorPane side-pane (Comments /
Events / Runs / Log tabs, Log streams worker stdout every 2s while
running), inline assignee picker, health banner for unassigned and
last-failed-run states. New Task sheet defaults to active profile
and auto-fires kanban dispatch on submit. Sidebar moved Kanban from
Manage to Monitor. Read-only KanbanListView preserved as Board|List
toggle for narrow windows / accessibility.

Per-project: DashboardTab.kanban tab on every project gated on
hasKanban; KanbanTenantResolver mints scarf:<slug> tenants on first
interaction and persists to .scarf/manifest.json (immutable across
rename); ProjectAgentContextService surfaces the tenant in the
AGENTS.md scarf-managed block so agents pass --tenant <slug> on
kanban create. New kanban_summary dashboard widget; vocabulary
mirrored in tools/widget-schema.json and site/widgets.js.

iOS: read-only board on the project tab via paged single-column
Picker, modal detail sheet with Comments / Events / Runs. Mutations
+ drag-drop deferred to v2.8.

Tests: 19 new pure-logic tests covering decoding, planner verb
mapping, argv assembly, glance string formatting, and parser
rejection of the kanban assignees empty-state sentinel. All 348
ScarfCore tests pass.

Constraints documented in CLAUDE.md: no within-column reorder
(Hermes has no update --priority verb); no live watch streaming
yet (5s polling for board, 2s for log); no bulk re-tag for legacy
NULL-tenant tasks. Pre-v0.12 Hermes hosts gracefully hide the
surface end-to-end.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 13:59:21 +02:00
Alan Wizemann fd80f4f95a Create FUNDING.yml 2026-05-07 12:55:53 +02:00
Alan Wizemann 9f240ae291 chore: Bump version to 2.7.1 2026-05-07 12:46:11 +02:00
Alan Wizemann 9c149b288b fix(docs): restore Sonoma compatibility messaging in BUILDING.md + CONTRIBUTING.md
Scarf's `MACOSX_DEPLOYMENT_TARGET` is `14.6` (Sonoma) on the main
`scarf` target, set in 86762ea. Sonoma support is intentional —
several users dogfood on macOS 14.x and we want to keep them on the
release channel. Yesterday's BUILDING.md and the long-stale
CONTRIBUTING.md statement both claimed macOS/Xcode 26.x as minimums,
which would have steered Sonoma contributors and users away from a
build that actually runs on their box.

Correct values:

- Runtime min: **macOS 14.6 (Sonoma)** — matches the deployment target.
- Build min: **Xcode 16.0** — needed for Swift 6 strict-concurrency
  features the codebase uses.

Add a load-bearing callout to BUILDING.md so future doc edits don't
silently raise the floor again.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 12:13:18 +02:00
Alan Wizemann 37afbdeffc feat(build): contributor-friendly local-build.sh + BUILDING.md
Adds `scripts/local-build.sh` for unsigned command-line Debug builds
so contributors without an Apple Developer account can clone, build,
and run without provisioning gymnastics. The script:

- Detects arm64 / x86_64
- Verifies xcode-select, xcrun, xcodebuild are present
- Probes the Metal toolchain and offers an interactive install (gated
  on `[[ -t 0 && -z "${CI:-}" ]]` — CI never gets prompted)
- Resolves Swift packages, builds Debug with signing disabled
- Optionally `ditto`s the result to /Applications/scarf.app on
  explicit y/N

`BUILDING.md` documents prerequisites alongside the script. Existing
canonical Release universal CLI in README stays — `local-build.sh`
is an alternative for contributors, not a replacement for the
shipping build.

Cherry-picked from #76 with thanks to @unixwzrd. BUILDING.md's
prerequisites are corrected to match the actual deployment target
(macOS 26.2, Xcode 26.2+).

Co-Authored-By: M S <unixwzrd.register@mac.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 12:08:33 +02:00
Alan Wizemann bfd9bab9a0 fix(health): stop external dashboards by port, not pkill -f
`stopDashboard()` used to fall back to `pkill -f "hermes dashboard"`
when the running dashboard wasn't a Scarf-spawned subprocess. That's
broad enough to match shell history, log tails, README readers, and
this very source file — anything with the substring "hermes
dashboard" in its argv was a kill target.

Replace with a port-anchored lookup: `lsof -tiTCP:<port> -sTCP:LISTEN`
returns the PID actually bound to the dashboard port, then we
`SIGTERM` only that one process. Trusting the port is correct here:
Scarf owns the configured port and the user-visible intent is "stop
the thing on this port."
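
A minimal sketch of the port-anchored lookup, assuming the `lsof`
output is captured elsewhere and handed in as a string. The helper
names (`lsofListenerArguments`, `parseTersePIDs`) are hypothetical and
are not the names used in `stopDashboard()`; only the `lsof` flags come
from this commit.

```swift
import Foundation

/// Arguments for `lsof` that print only the PID(s) listening on a TCP port.
/// `-t` is terse (PID-only) output; `-sTCP:LISTEN` keeps listeners only.
func lsofListenerArguments(port: Int) -> [String] {
    ["-tiTCP:\(port)", "-sTCP:LISTEN"]
}

/// Parses terse `lsof -t` output (one PID per line) into PIDs.
/// Lines that are not plain integers are skipped rather than guessed at.
func parseTersePIDs(_ output: String) -> [Int32] {
    output
        .split(whereSeparator: \.isNewline)
        .compactMap { Int32($0.trimmingCharacters(in: .whitespaces)) }
}
```

Each parsed PID would then receive `SIGTERM`. Empty output means
nothing is bound to the port, so there is nothing to stop.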

We deliberately omit `lsof -c hermes`. Hermes installs as a Python
shebang script (verified locally — `file ~/.local/bin/hermes` →
"a python3 script text executable"), so the kernel COMM is `python` /
`python3`, never `hermes`. A `-c hermes` filter would silently miss
every standard install.

Cherry-picked from #76 with thanks to @unixwzrd for the direction;
this version drops the `-c hermes` filter to actually fire on real
Hermes installs.

Co-Authored-By: M S <unixwzrd.register@mac.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 12:08:23 +02:00
Alan Wizemann 2e0eb63ea4 fix(health): tighten Hermes gateway pgrep so unrelated commands don't match
`hermesPIDResult()` was running `pgrep -f hermes`, which matched any
process with "hermes" anywhere in its argv — `hermes acp` chat
sessions Scarf itself spawns, `hermes -z` one-shots, log tails, even
this very file in an editor. The Dashboard "Hermes is running" badge
read true even when the gateway daemon was down.

Narrow the match to the gateway shape specifically. Two alternations
cover both invocation forms used in the wild:

- `python -m hermes_cli.main gateway run …` (the launchctl form)
- `/path/to/hermes gateway run …` (the script-path form)

Verified locally against an actual gateway PID:

    cmd=/Users/.../python -m hermes_cli.main gateway run --replace

The first alternation matches via the `-m hermes_cli.main gateway run`
boundary. All callers — `stopHermes()`, `DashboardViewModel`,
`HealthViewModel`, `SettingsViewModel`, `scarfApp` — semantically
want the gateway PID specifically, so the narrower match is the
right shape, not a behavior change.
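
For illustration only, a pattern with the two alternations described
above might look like the sketch below. The exact regex shipped in
`hermesPIDResult()` is not reproduced in this message, so
`gatewayPattern` and `looksLikeGateway` are assumptions that follow the
two documented invocation forms.

```swift
import Foundation

/// Hypothetical pattern: the `-m hermes_cli.main` form or the
/// script-path form, each followed by ` gateway run`.
let gatewayPattern = #"(-m hermes_cli\.main|/hermes) gateway run"#

/// Returns true when a command line has the gateway shape.
func looksLikeGateway(_ commandLine: String) -> Bool {
    commandLine.range(of: gatewayPattern, options: .regularExpression) != nil
}
```

Under this sketch `hermes acp`, `hermes -z`, and a log tail no longer
match, while both gateway forms do.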

Cherry-picked from #76 with thanks to @unixwzrd for the diagnosis
and the regex.

Co-Authored-By: M S <unixwzrd.register@mac.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 12:08:11 +02:00
Alan Wizemann 3a3c87e033 fix(skills): scope What's New pill to Installed tab + reword updated→changed
Issue #78 — The "What's New" pill at the top of the Skills page
announced "18 new, 3 updated since you last looked" while the Updates
sub-tab simultaneously said "No Updates / All skills are up to date."
Two surfaces measuring two different things both used the word
"update": the pill counts local file deltas since the user last
clicked "Mark as seen", while the Updates body runs `hermes skills
check` to find skills with newer upstream versions available. From
the user's seat the screen contradicted itself.

Two changes:

1. Render the pill only on the Installed sub-tab (Mac + ScarfGo).
   Local file deltas are contextually meaningful only on the tab
   that surfaces installed skills; showing them above Browse Hub or
   Updates was misleading.

2. Reword the pill: "X updated since you last looked" → "X changed
   since you last looked". Keeps `SkillSnapshotDiff.updatedCount` as
   the field name (it's still about file changes, not version bumps);
   only the user-visible string changes. Removes the vocabulary
   collision with the Updates tab's separate upstream-update check.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 11:51:05 +02:00
Alan Wizemann f9e3cd38f5 fix(skills): client-side filter for All-Sources hub search
Issue #79 — Browse Hub clearly listed "honcho" but searching for
"honcho" with the source picker on "All Sources" returned nothing.
Root cause is on the Hermes side: `hermes skills search <query>`
without a `--source` flag routes through the centralized
`hermes-index` source and skips the external API sources
(skills-sh, github, clawhub, lobehub, well-known, claude-marketplace).
Browse aggregates those sources too, so any skill that lives only in
the API tier shows up in browse but disappears in search. Same picker,
same query, contradictory results.

Rather than chase Hermes's index gaps, redefine "All Sources" search
in Scarf to mean filter-what-you-see — the canonical type-to-filter
UX users already expect on a list. Source-specific searches keep the
CLI shell-out for full upstream search semantics on that registry.

Implementation:
- New `lastBrowseResults` cache populated on every successful
  `browseHub()`. Setter is `internal` so the test suite can seed
  without invoking the live CLI; out-of-module callers can still
  only read.
- `searchHub()` now branches on `hubSource`. The "all" branch filters
  the cache via `localizedCaseInsensitiveContains` against name,
  description, and identifier. It runs synchronously on the calling
  actor (UI invocations are already on MainActor), so the user sees
  the narrowed list without a render-tick gap.
- If the cache is empty (search-before-browse), `browseHubThenFilter`
  performs one CLI fetch, populates the cache, then applies the
  filter — failure surfaces a "Search failed" banner instead of a
  silent empty state.
- Source-specific search still shells out to
  `hermes skills search <query> --source <s> --limit 40`.

Adds five regression tests covering name match, description match,
case-insensitive folding, no-match message state, and the empty-query
fallthrough to browse.
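
A rough sketch of the "All Sources" filter, assuming a simplified
`HubSkill` model with just the three fields the filter inspects. The
type and the `filterBrowseResults` function are hypothetical stand-ins
for the real `lastBrowseResults` cache and `searchHub()` branch.

```swift
import Foundation

/// Hypothetical, pared-down model of one Browse Hub row.
struct HubSkill: Equatable {
    let name: String
    let description: String
    let identifier: String
}

/// Type-to-filter over the cached browse results.
/// An empty query returns the cache unchanged, standing in for the
/// empty-query fallthrough to browse.
func filterBrowseResults(_ cache: [HubSkill], query: String) -> [HubSkill] {
    let needle = query.trimmingCharacters(in: .whitespacesAndNewlines)
    guard !needle.isEmpty else { return cache }
    return cache.filter { skill in
        skill.name.localizedCaseInsensitiveContains(needle)
            || skill.description.localizedCaseInsensitiveContains(needle)
            || skill.identifier.localizedCaseInsensitiveContains(needle)
    }
}
```

Because the filter only touches the in-memory cache, it needs no CLI
round-trip and can run synchronously on the calling actor.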

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 11:50:52 +02:00
Alan Wizemann a6a8cae8ff fix(transport): drain ssh stdout/stderr concurrently to unwedge >64KB payloads
Issue #77 — Sessions screen rendered empty even though Dashboard
reported 161 sessions and Activity reported 116. Root cause was a
classic pipe-buffer deadlock in SSHScriptRunner: stdout was read via
`readToEnd()` AFTER the subprocess had exited. macOS pipes default to
a 16–64 KB kernel buffer; once the remote `sqlite3 -json` script wrote
more than that to its stdout, ssh back-pressured across the wire,
sshd back-pressured sqlite3, sqlite3 blocked, the script never
finished, the 30-second timeout fired, `streamScript` threw, and
`HermesDataService.sessionListSnapshot()` swallowed the failure into
an empty array. Empty Sessions list. Dashboard kept working because
its smaller LIMIT 5 payload fit under the threshold.

Why this was a v2.7 regression specifically: 20cc3a2 folded the
previously-separate sessions + previews queries into a single batched
round-trip (perf win for remote users). The new combined payload for
~150+ sessions crossed the buffer threshold for the first time.

Fix: drain stdout/stderr concurrently with the running process via
Foundation's `FileHandle.readabilityHandler`, accumulating chunks
into an NSLock-guarded `Data` buffer. The kernel pipe never fills,
the subprocess never blocks, the script returns the full payload.
Same change applied to both the SSH path (`runOverSSH`) and the
local path (`runLocally`) — they had identical bug shapes.
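
The drain-while-running pattern can be sketched as follows. This is
not the SSHScriptRunner code: `LockedBuffer` and `runDraining` are
hypothetical names, only stdout is drained here (stderr would get the
same treatment), and the sketch assumes the handler is invoked with
empty data at end-of-file.

```swift
import Foundation

/// Thread-safe accumulator; `readabilityHandler` fires on a background queue.
final class LockedBuffer: @unchecked Sendable {
    private let lock = NSLock()
    private var storage = Data()

    func append(_ chunk: Data) {
        lock.lock()
        storage.append(chunk)
        lock.unlock()
    }

    var data: Data {
        lock.lock()
        defer { lock.unlock() }
        return storage
    }
}

/// Runs a subprocess and drains stdout while it is still running,
/// so the kernel pipe buffer never fills and the child never blocks.
func runDraining(_ executable: String, _ arguments: [String]) throws -> (status: Int32, stdout: Data) {
    let process = Process()
    process.executableURL = URL(fileURLWithPath: executable)
    process.arguments = arguments

    let pipe = Pipe()
    process.standardOutput = pipe

    let buffer = LockedBuffer()
    let reachedEOF = DispatchSemaphore(value: 0)
    pipe.fileHandleForReading.readabilityHandler = { handle in
        let chunk = handle.availableData
        if chunk.isEmpty {
            // Empty read means end-of-file: stop observing and unblock the caller.
            handle.readabilityHandler = nil
            reachedEOF.signal()
        } else {
            buffer.append(chunk)
        }
    }

    try process.run()
    process.waitUntilExit()
    reachedEOF.wait() // make sure the tail of the output has been collected
    return (process.terminationStatus, buffer.data)
}
```

The pre-fix shape, reading with `readToEnd()` only after exit, can
wedge as soon as the child writes more than the pipe buffer holds.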

Adds SSHScriptRunnerTests with three regression checks: a 256 KB
synthetic payload that would have wedged pre-fix, a small-payload
sanity round-trip, and a non-zero exit propagation check.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 11:50:34 +02:00
Alan Wizemann 6b66b1c96f perf(ios): wire v2.7 perf parity — instrument iOS-only VMs + surface hydration banner + opt-in toggle
Most of the v2.7 perf work was already covered on iOS via shared
code in ScarfCore — `RichChatViewModel.loadSessionHistory` (and
its skeleton-then-hydrate path), `hydrateAssistantToolCalls`,
`fetchSkeletonMessages`, `fetchRecentToolCallSkeleton`,
`ModelPreflight.detectMismatch`, and the `RemoteSQLiteBackend`
cancellation handler all flow through to the ScarfGo chat
unchanged. `CitadelServerTransport.streamScript` already
honors `Task.isCancelled` correctly via `withThrowingTaskGroup` +
`Task.checkCancellation()`, so the SSH-cancellation-on-nav-away
chain works on iOS without the Mac-side `SSHScriptRunner` fix.

Three iOS-specific gaps closed:

* IOSCronViewModel.load + IOSMemoryViewModel.load wrapped in
  `ScarfMon.measureAsync(.diskIO, "ios.cron.load")` /
  `"ios.memory.load"` — parity with the Mac `cron.load` /
  `memory.load` events. `ios.memory.load.bytes` records the
  payload size for the loaded file.
* iOS Settings → "Chat (Scarf)" section gains a toggle bound to
  `RichChatViewModel.loadHistoricalToolResultsKey` so iOS users
  can opt into Phase 2b bulk tool-result hydration, same as the
  Mac DisplayTab. The shared key means the gate inside
  `startToolHydration` reads the right value automatically — no
  extra plumbing needed.
* iOS ChatView surfaces `isHydratingTools` as a "Loading tool
  details…" connection banner (mirrors the Mac toolbar pill
  added in v2.7 perf work). Sits between the existing
  "Thinking…" banner and the empty-view fallback so chat status
  is always honest about what the agent and Scarf are doing.

Both Mac and iOS targets build clean; all 321 ScarfCore tests
pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 21:26:25 +02:00
Alan Wizemann 97ec4d2882 chore: Bump version to 2.7.0 2026-05-05 20:41:39 +02:00
Alan Wizemann cd5bb32a21 release: prep v2.7.0 — consolidated notes + in-app Sparkle release notes
Rolls up everything since v2.6.5 (36 commits across remote-perf,
project wizard, dashboard widgets, OAuth resilience, ScarfMon
instrumentation, and the v2.7 skeleton-then-hydrate redesign) into
a single 2.7.0 release.

* releases/v2.7.0/RELEASE_NOTES.md — full consolidated notes,
  reorganized around the throughline (slow-remote performance) with
  five thematic sections: skeleton-then-hydrate loaders, SSH
  cancellation, project wizard + Keychain cron secrets, dashboard
  widgets, OAuth resilience, and ScarfMon. Replaces the previously-
  drafted dashboard-only v2.7.0 stub and the separate v2.8 wizard
  stub (both unreleased).
* releases/v2.8/ — deleted; folded into v2.7.
* README.md — "What's New in 2.6" → "What's New in 2.7" with the
  five-section summary linking out to the full notes.

* tools/render-release-notes.py — stdlib-only Markdown → HTML
  renderer covering the subset of GitHub-flavored markdown that
  release notes use (## / ### headings, paragraphs, ul lists,
  fenced code, inline code/bold/italic/links, hr). Output includes
  a small <style> block tuned for Sparkle's update alert WebKit
  view (light + dark variants via prefers-color-scheme).
* scripts/release.sh — render the active RELEASE_NOTES.md and
  inject the result as <description><![CDATA[...]]></description>
  on the appcast item. Sparkle's standard updater renders this in
  the in-app update sheet so users see release-specific "what's
  new" alongside the version number, not just the bare version.
  Falls back to a "see GitHub release page" placeholder when the
  notes file is missing.

User runs ./scripts/release.sh 2.7.0 to ship.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 20:31:27 +02:00
Alan Wizemann 5e23b59697 test(model-preflight): cover detect-mismatch + fix newline-trim bug
* New ModelPreflightTests suite (19 tests) covering both `check(_:)`
  and the v2.8 `detectMismatch(_:)` paths. Pins the dogfooding
  scenario (anthropic-prefixed model + nous active provider after
  Credential Pools OAuth swap), the case-insensitive prefix match,
  empty-prefix / empty-bare-model edge cases, and multi-slash model
  ids (OpenRouter style).

* Bug fix surfaced by the tests: `ModelPreflight` was using
  `trimmingCharacters(in: .whitespaces)` which doesn't strip
  newlines. A stray `\n` in a hand-edited config.yaml would either
  miss the missing-fields classifier OR false-positive the mismatch
  banner (showing "anthropic" vs "anthropic\n"). Switched both
  trims to `.whitespacesAndNewlines`.
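
The difference between the two character sets is easy to reproduce in
isolation. A small sketch (the function names are illustrative and do
not come from `ModelPreflight`):

```swift
import Foundation

/// Pre-fix behavior: `.whitespaces` covers spaces and tabs, not newlines.
func trimSpacesOnly(_ value: String) -> String {
    value.trimmingCharacters(in: .whitespaces)
}

/// Post-fix behavior: newlines are stripped as well.
func trimSpacesAndNewlines(_ value: String) -> String {
    value.trimmingCharacters(in: .whitespacesAndNewlines)
}
```

With a hand-edited value such as `"anthropic\n"`, the first helper
leaves the trailing newline in place, so a comparison against
`"anthropic"` fails. The second returns `"anthropic"`.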

perf(observability): instrument Tier C load paths + fetchSessionPreviews

No behavior change — adds ScarfMon coverage so future captures show
how often Memory/Skills/Cron/Curator/SessionPreviews load paths fire
and what they cost on remote (each is multiple sequential SFTP RTTs
that pre-fix were invisible). New events:

* `mac.fetchSessionPreviews` / `.rows` / `.transportError`
* `memory.load` / `.bytes`
* `cron.load` / `.jobs`
* `skills.load` / `.count`
* `curator.load` / `.bytes`

All 321 ScarfCore tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 20:03:35 +02:00
Alan Wizemann 09e33b2999 perf(chat,activity,transport): skeleton-then-hydrate loaders + SSH cancellation propagation
Major perf overhaul for slow-remote contexts. Chats and Activity now
render in <2s instead of timing out at 30s; abandoned SSH work is
killed within 100ms instead of pinning a ControlMaster session.

* Skeleton-then-hydrate chat loader. New `fetchSkeletonMessages`
  selects user+assistant rows only (skips role='tool', NULLs
  tool_calls + reasoning at the SQL level). Wire payload bounded by
  conversational text alone — sub-second on remote regardless of
  underlying tool result blob sizes. Background `startToolHydration`
  pages through `hydrateAssistantToolCalls` (5-id batches) to splice
  tool calls in. Tool-result CONTENT is opt-in via Settings → Display
  → "Load tool results in past chats" (default off); inspector pane
  lazy-fetches per-result via `fetchToolResult(callId:)` on expand.

* Skeleton-then-hydrate Activity loader. New
  `fetchRecentToolCallSkeleton` returns metadata-only rows in ~3 KB
  for 50 entries; placeholder ActivityRows render immediately, real
  per-call entries swap in as paged hydration completes. Loading
  pill in the page header, orange transport-error banner replaces
  the pre-fix silent empty state.

* SSH cancellation propagation. `Task.detached` and unstructured
  `Task<...> { ... }` don't inherit cancellation from awaiting
  parents — without bridging, killing a Swift Task left the ssh
  subprocess running for the full 30s deadline, pinning a remote
  sqlite query and a ControlMaster session. Wired
  `withTaskCancellationHandler` through `SSHScriptRunner.run` and
  `RemoteSQLiteBackend.query`; cancellation now reaches `Process`
  within ~100ms. New `ssh.cancelled` ScarfMon event.

* L1 single-id retry. When a 5-id `hydrateAssistantToolCalls` page
  trips the 30s timeout (one row carries an oversized tool_calls
  blob — long Edit args, big diffs), fall back to single-id queries
  to isolate the whale. Non-whale rows in the same batch hydrate
  normally; whale row stays bare. New `mac.hydrateToolCalls.singleTimeout`
  event tracks how often the recovery fires.

* L2 in-flight coalescing for `loadRecentSessions`. File-watcher
  deltas during streaming used to stack 2-3 parallel sessions-list
  reload tasks; subsequent callers now await the active one. New
  `mac.loadRecentSessions.coalesced` event tracks dedup hits.

* Loading-state UX hardening. New `isStartingSession` flag flips
  synchronously on user click so the chat sidebar greys + disables
  immediately instead of waiting for `client.start()` to return
  (5-7s on remote). Phase-typed status: "Spawning hermes acp…" →
  "Authenticating…" → "Loading session…" → "Loading history…" →
  "Ready". `ChatSessionListPane` overlays a ProgressView showing
  the current phase.

* Partial-result detection. `fetchMessagesOutcome` distinguishes a
  transport failure from a genuine empty result; `loadSessionHistory`
  surfaces "Couldn't load full chat history — connection timed out"
  through the existing acpError triplet so the user sees what
  happened instead of a silent empty transcript.

* Model/provider mismatch banner. `ModelPreflight.detectMismatch`
  recognizes when `model.default` carries a `<provider>/...` prefix
  that disagrees with `model.provider` (e.g. anthropic prefix +
  nous active provider after switching OAuth via Credential Pools).
  Banner offers one-click fix in either direction. Companion: ACP
  error classifier recognizes `model_not_found` / `404 messages`
  and surfaces "Hermes pins each session to its original model —
  start a new chat" so the pinned-model failure mode has a clear
  recovery path.

* OAuth-completion provider swap prompt. After successful OAuth in
  Credential Pools, if the just-authed provider differs from
  `model.provider` in config.yaml, surface "Switch active provider
  to <name>?" with [Switch] / [Keep current] instead of
  auto-dismissing.
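
As a rough illustration of the model/provider mismatch check from the
list above, assuming the prefix is everything before the first slash
and that empty pieces never count as a mismatch. `ProviderMismatch` and
`providerMismatch` are hypothetical names, and the edge-case handling
in the real `ModelPreflight.detectMismatch` may differ.

```swift
import Foundation

/// Hypothetical result type: the provider implied by the model id
/// versus the provider that is actually active.
struct ProviderMismatch: Equatable {
    let prefixProvider: String
    let activeProvider: String
}

/// Returns a mismatch when `modelDefault` carries a `<provider>/...`
/// prefix that disagrees (case-insensitively) with `activeProvider`.
func providerMismatch(modelDefault: String, activeProvider: String) -> ProviderMismatch? {
    let model = modelDefault.trimmingCharacters(in: .whitespacesAndNewlines)
    let active = activeProvider.trimmingCharacters(in: .whitespacesAndNewlines)

    // No slash means no provider prefix, so there is nothing to compare.
    guard let slash = model.firstIndex(of: "/") else { return nil }
    let prefix = String(model[..<slash])
    let bareModel = model[model.index(after: slash)...]

    // Assumption: empty prefix, empty bare model, or empty provider is not a mismatch.
    guard !prefix.isEmpty, !bareModel.isEmpty, !active.isEmpty else { return nil }
    guard prefix.caseInsensitiveCompare(active) != .orderedSame else { return nil }
    return ProviderMismatch(prefixProvider: prefix, activeProvider: active)
}
```

Splitting on the first slash means a multi-slash id keeps everything
after the prefix as the bare model.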

All 302 ScarfCore tests pass. New ScarfMon events documented in the
Performance-Monitoring wiki page.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 19:43:53 +02:00
Alan Wizemann 9f2e2ecfcd perf(chat): exclude reasoning_content from initial fetch + drop page size to 25
The 160-message thinking-model session still timed out at the 30s
ceiling even after dropping page size 200→50 in commit a193003.
ScarfMon trace:

  mac.fetchMessages    30,105,329,125 ns ← 30s timeout fired
  mac.hydrateMessages.rows  count=1     ← 1 partial row only

Root cause: `reasoning_content` is huge on thinking models (20+
KB per row). Even 50 rows × 30 KB = 1.5 MB JSON shipping over a
420ms-RTT remote SSH channel exceeds the budget. The chat
appeared empty AGAIN.

Three cuts:

1. **`messageColumnsLight`** — same as messageColumns but omits
   `reasoning_content`. Used by `fetchMessages` so the bulk
   wire payload is small. `messageFromRow` reads
   reasoning_content via `row.optionalString(at: 11)` which
   gracefully returns nil when the column isn't present, so the
   shape change is transparent.

2. **`fetchReasoningContent(for:)`** — single-row lazy fetch
   the inspector pane calls when the user expands a thinking
   disclosure. One small SSH round-trip per inspection vs. paying
   for ALL reasoning content on every session boot.

3. **`HistoryPageSize.initial` 50 → 25** — sized for the lite
   column shape with margin for sessions that include some heavy
   tool-call payloads. The "Load earlier" affordance still
   pages back through older messages.

Net effect on the user-reported case: 160-message session loads
the most-recent 25 messages in ~5-10s (one SSH round-trip ~420ms
plus ~3 KB × 25 = 75 KB wire). The remaining 135 are reachable
via Load earlier.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 13:28:40 +02:00
Alan Wizemann 1eb5c92f6a fix(aux-tab): correct nested-YAML parser so unknown-task surface works on remote
Bug 1 — the previous parser collected every indented child under
`auxiliary:` as if it were a task name, including leaf fields
(provider, model, base_url, api_key, timeout). Result: bogus rows
on local where the parser happened to fire, plus pollution of
the unknown-tasks set with field names that subtractFrom-known
left orphaned.

Bug 2 — the flat-dot-path branch (`auxiliary.X.Y:`) was dead
code. config.yaml is always nested YAML; the dot-path form only
appears in interactive `hermes config get` output, never on
disk. Removing it.

User reported the unknown-tasks section showed on local but not
on remote. Most likely root cause: the buggy parser surfaced
junk on local (where their config has nested-form aux settings)
while the dead flat-path branch never fired on remote either,
so remote silently rendered nothing. With the parser fixed both
contexts now surface real unknown task names if any are
present.

Rewrite as a clean two-pass walker:
- First nested line inside the block locks taskIndent.
- Only collect at exactly taskIndent (skip leaf fields deeper).
- Tolerate CRLF line endings, blank lines, and YAML comments
  without resetting block state.
- Handles 2-space and 4-space indent equally.
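
The indent-locking rule above can be sketched as follows. This is a
simplified single-pass version, not the shipped two-pass walker:
`auxiliaryTaskNames` is a hypothetical name, only space indentation is
handled, and a task key is assumed to be a bare `name:` line.

```swift
import Foundation

/// Collects the direct children of a top-level `auxiliary:` block.
/// The first nested line locks the task indent; deeper lines are leaf
/// fields (provider, model, ...) and are skipped.
func auxiliaryTaskNames(in yaml: String) -> [String] {
    var names: [String] = []
    var inBlock = false
    var taskIndent: Int? = nil

    // Splitting on `isNewline` treats LF and CRLF alike and drops blank lines.
    for line in yaml.split(whereSeparator: \.isNewline) {
        let trimmed = line.trimmingCharacters(in: .whitespaces)
        // Whitespace-only lines and comments never reset block state.
        if trimmed.isEmpty || trimmed.hasPrefix("#") { continue }

        let indent = line.prefix(while: { $0 == " " }).count
        if indent == 0 {
            // Any top-level key either opens or closes the block.
            inBlock = (trimmed == "auxiliary:")
            taskIndent = nil
            continue
        }
        guard inBlock else { continue }
        if taskIndent == nil { taskIndent = indent }
        guard indent == taskIndent, trimmed.hasSuffix(":") else { continue }
        names.append(String(trimmed.dropLast()))
    }
    return names
}
```

Locking on the first nested line is what lets 2-space and 4-space
files share one code path.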

Verified manually with four fixture shapes: 2-space, 4-space,
with-comments-and-blanks, no-aux-block. All correct.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 13:12:55 +02:00
Alan Wizemann bccaba0742 feat(acp,aux): classify resolve_provider_client errors + surface unknown aux tasks
Two fixes for the user-reported "ACP -32603 Internal error" after
removing a Nous OAuth provider while config.yaml still referenced
nous for an auxiliary task. The actual stderr was clear:

  agent.auxiliary_client: resolve_provider_client: nous requested
    but Nous Portal not configured

But Scarf's chat banner showed only the bare JSON-RPC code and
the user had no actionable path through the UI.

**ACPErrorHint.classify** now pattern-matches the
`resolve_provider_client: <name> requested but` stderr line and
extracts the provider name. Surfaces:

  An auxiliary task is configured to use `<name>` but that
  provider isn't authenticated. Open Settings → Aux Models, or
  check ~/.hermes/config.yaml for auxiliary.<task>.provider: <name>
  and switch it to your active provider (or set it to `auto`).
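
A minimal sketch of the extraction step, assuming the stderr line
arrives unwrapped on a single line. `unconfiguredProvider` is a
hypothetical helper, not the body of `ACPErrorHint.classify`.

```swift
import Foundation

/// Pulls the provider name out of a
/// `resolve_provider_client: <name> requested but ...` stderr line.
/// Returns nil when the line does not have that shape.
func unconfiguredProvider(inStderr stderr: String) -> String? {
    let pattern = #"resolve_provider_client: (\S+) requested but"#
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return nil }
    let fullRange = NSRange(stderr.startIndex..., in: stderr)
    guard let match = regex.firstMatch(in: stderr, range: fullRange),
          let nameRange = Range(match.range(at: 1), in: stderr) else { return nil }
    return String(stderr[nameRange])
}
```

The captured name is what the banner text would interpolate in place
of `<name>`.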

Routed through the existing chat-banner pipeline that already
catches OAuth revocation and missing-credentials errors.

**AuxiliaryTab** gains an "Other tasks in config.yaml" section
that surfaces aux task keys present in YAML but not in Scarf's
typed list (vision, web_extract, compression, session_search,
skills_hub, approval, mcp, flush_memories, curator). Common
case: `auxiliary.summarization.provider: nous` left over from
older Hermes versions or hand-edited configs. Each unknown task
gets a one-click "Reset provider" button that writes
`auxiliary.<key>.provider: auto` — the most-actionable fix
for the OAuth-removal failure mode. Detection scans both
flat-dot-path and nested YAML shapes so it works regardless of
how Hermes dumped the file.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 13:00:48 +02:00
Alan Wizemann 4684b9deed feat(credential-pools): OAuth remove button + auto-refresh on auth.json change
User reports the Nous OAuth provider still showed in the
credential pool after they 'removed' it, and Reload didn't help.
Two underlying bugs:

**Bug 1 — no UI path to remove OAuth providers.** The pool view
had a Re-authenticate button on each OAuth row but no remove.
Users who switched active provider thought that removed Nous;
the OAuth tokens stayed in auth.json and the row kept rendering.
Add a trash icon next to Re-authenticate that calls
`hermes auth logout <provider>` after a confirmation dialog.
ViewModel route is `removeOAuthProvider` mirroring
`removeCredential`.

**Bug 2 — view didn't refresh on external auth.json changes.**
Pool view subscribed only to .onAppear and sheet-dismiss. A
terminal `hermes auth logout` or another window's OAuth flow
left the view stale until manually re-entered. Wire up
`fileWatcher.lastChangeDate` so any auth.json mtime tick
triggers a reload (the file watcher already polls auth.json
on the remote SSH path).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 12:46:41 +02:00
Alan Wizemann f6dc45b397 feat(scarfmon): track empty-assistant turns + document Nous quirk
User reports chats "dying" on Nous models — screenshot shows the
assistant bubble stuck with `(°□°) deliberating...` and a
1.7s turn-duration pill (turn DID complete; the content is the
problem). The literal placeholder string isn't in Scarf's source;
it's coming from Hermes or Nous itself when the model emits a
brief thought stream and then fails to produce any visible
output.

ScarfMon trace confirms the failure mode:
  mac.sendViaACP    →  firstThoughtByte (25 bytes)
  mac.handleACPEvent  ✓
  mac.sendPrompt     ✓ (1.7s, normal)
  finalizeStreamingMessage  ✓ (turn cleanly closed)

So Scarf sees no transport error — the turn finalized normally
with empty assistant text plus a small thought stream. The
visible "deliberating" text is content Hermes/Nous chose to
substitute for the missing response.

Adds `mac.emptyAssistantTurn` event (category .chatStream) that
fires whenever a turn finalizes with empty `streamingAssistantText`
and empty `streamingToolCalls`. Bytes carry the thinking-text
length so we can distinguish:
  - bytes=0: total empty turn (model produced nothing)
  - bytes>0: thoughts-only turn (model thought but didn't answer)
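
The classification can be sketched as below. `TurnOutcome` and
`classifyFinalizedTurn` are hypothetical names for illustration, and
the byte count is assumed to be the UTF-8 length of the thinking text.

```swift
/// Hypothetical outcome of a finalized turn.
enum TurnOutcome: Equatable {
    case answered                  // visible text or at least one tool call
    case emptyTurn                 // bytes=0: model produced nothing
    case thoughtsOnly(bytes: Int)  // bytes>0: thoughts but no answer
}

/// Mirrors the event condition: empty assistant text and no tool calls.
func classifyFinalizedTurn(assistantText: String,
                           toolCallCount: Int,
                           thoughtText: String) -> TurnOutcome {
    guard assistantText.isEmpty, toolCallCount == 0 else { return .answered }
    let bytes = thoughtText.utf8.count
    return bytes == 0 ? .emptyTurn : .thoughtsOnly(bytes: bytes)
}
```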

Both are user-visible failures. The fix is upstream — Hermes
should refuse to finalize a turn with no response and surface
an error, OR Nous should not return empty responses with the
placeholder string. Document this finding so a future capture
that shows multiple `mac.emptyAssistantTurn` events confirms
the rate / model-correlation.

For now Scarf surfaces the same UX as before (no UI change in
this commit). A follow-on commit could intercept this case and
replace the bubble with a clearer "Model returned no response"
banner, but that requires a confident heuristic for which
empty-finalize cases are real failures vs. legitimate
no-response turns.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 12:40:21 +02:00
Alan Wizemann f2ddcbbd60 feat(model-picker): add search filter to Nous overlay model list
Nous returned 402 models in the recent perf capture (~496 KB of
JSON). The picker's existing top-bar search field already filters
the catalog list (`filteredModels`) but the Nous overlay path
showed all 402 unfiltered, making it nearly unusable.

Add `filteredNousModels` mirroring the `filteredModels` shape:
filters `nousModels` by case-insensitive substring match against
both `id` and `owned_by`. Updates the empty-state overlay so
"no matches" surfaces a different message from "no models
loaded" — the user knows the catalog is fine, the search just
didn't match.

User feedback: "we need a search in the model picker, some of
these lists are large and unorganized."

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 12:38:30 +02:00
Alan Wizemann a193003842 fix(chat): paginate session-load + race-guard against session switch
Two related bugs from remote-context perf captures.

**Bug 1 — 30s timeout fetching the 157-message session.** The
initial page size was 200 messages. For a session including
`reasoning_content` from a thinking model, that produces enough
JSON over `sqlite3 -json | ssh` to time out at exactly 30s on a
420ms-RTT remote, returning 0 rows. Bumping queryTimeout further
just trades latency for stalls.

Drop `HistoryPageSize.initial` from 200 → 50. Sized to fit
comfortably inside the 30s queryTimeout; the existing "Load
earlier" affordance pages back through older messages on demand.

**Bug 2 — session-switch race silently swaps transcripts.** When
the user picks a small chat while a slow fetch for a different
chat is still in flight, the slow fetch finishes second and its
`messages = …` assignment overwrites the small chat's transcript.
User sees the small chat "jump back" to the big one. ScarfMon
trace: parallel `mac.fetchMessages` events at t=641870 (small,
425ms, 2 rows) and t=643316 (big, 30,028ms timeout) — last
write won.

Add a `loadingForSession` capture and three guards: after the
DB refresh, after the primary fetch, after the ACP-fork fetch.
Each compares `self.sessionId` against the captured id; on
mismatch fire `mac.hydrateMessages.dropped` and return without
assigning. Race is silent in normal usage but visible in traces.
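
The guard can be sketched in a synchronous form, assuming a
pared-down model with string messages. `TranscriptModel`, `apply`, and
`droppedLoads` are hypothetical; the real code performs this check at
three points inside an async load and emits a ScarfMon event instead
of bumping a counter.

```swift
/// Hypothetical, simplified stand-in for the chat view model.
final class TranscriptModel {
    var sessionId: String
    var messages: [String] = []
    var droppedLoads = 0   // stands in for `mac.hydrateMessages.dropped`

    init(sessionId: String) {
        self.sessionId = sessionId
    }

    /// Applies a fetch result only if the user is still on the session
    /// that was current when the load started.
    func apply(fetched: [String], loadingForSession: String) {
        guard sessionId == loadingForSession else {
            droppedLoads += 1
            return
        }
        messages = fetched
    }
}
```

A slow fetch that completes after the user has switched chats is then
dropped instead of overwriting the newer transcript.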

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 12:38:19 +02:00
Alan Wizemann 93a64e3e82 fix(nous-picker): kill 120s beach-ball — dedupe readCache + 5s timeout
Two stacking bugs in the Nous-overlay branch of the model picker
caused a 120-second beach-ball on remote contexts.

**Bug 1 — duplicated readCache.** ModelPickerSheet.refreshNousModels
called `service.readCache()` directly (for instant first-paint),
then called `service.loadModels(forceRefresh: false)` which calls
`readCache()` AGAIN as its first step. Two SSH round-trips per
picker open. Drop the inline call; loadModels is already cache-first
on its happy path (returns `.cache(...)` when fresh). One read
per open.

**Bug 2 — 60s readFile timeout for a hint.** `readCache()` goes
through SSHTransport.readFile which has a 60s default timeout. On
a remote with a corrupted or oversized cache file, `cat` never
returns and we wait the full 60s — twice, due to bug 1, for a
total 120s picker stall. ScarfMon perf capture (commit 00a1bbd's
diagnostic split) localized this precisely:

  nous.readCache.fileExists  =   251 ms  ✓
  nous.readCache.readFile    = 60,011 ms   (60s timeout)

Cache is an optimization, not a requirement. Added
`readCacheWithTimeout(seconds: 5)` that races readCache against
a 5-second sleep via withTaskGroup. On timeout returns nil; caller
treats that as no-cache and falls through to the network fetch
(which succeeded in 2s in the offending capture, returning 402
models). The runaway `cat` keeps running on its own 60s transport
timeout but no longer blocks the picker.

New ScarfMon event: `nous.readCache.timeoutFired` surfaces hits
in traces so we can tell whether the timeout is being exercised
in the wild.

The underlying `cat` hang on the cache file is still unexplained;
the file size (~500KB) shouldn't take 60s on a 420ms-RTT SSH link.
For now: deleting the cache file (`rm ~/.hermes/scarf/nous_models_cache.json`
on the remote) is the workaround. The next picker open will rebuild it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 12:17:45 +02:00
Alan Wizemann 00a1bbd109 feat(scarfmon): split nous.readCache into fileExists/readFile/decode/bytes
Last perf capture showed nous.readCache as a single 60-second
interval — but the function does three things (transport.fileExists,
transport.readFile, JSONDecoder). Splitting the measure points so
the next capture localizes which step actually owns the wall-clock.

Adds:
- nous.readCache.fileExists (interval) — SSH `test -e` round-trip
- nous.readCache.readFile (interval) — SSH `cat` round-trip
- nous.readCache.bytes (event) — payload size of the cache file
- nous.readCache.decode (interval) — JSON parsing cost

If the next 60-second beach ball localizes to readFile, we know
the cache file is somehow huge or the SSH read is hung; if it's
fileExists, the path resolution is the issue; if decode, we have
malformed JSON. All three sit inside the same outer wrapper so the
existing nous.readCache total stays for trend comparison.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 12:07:43 +02:00
Alan Wizemann 20cc3a2985 perf(sessions): fold sessions+previews into one batched SSH round-trip
Audit Finding 1 — ChatViewModel.loadRecentSessions and
SessionsViewModel.load each fired two sequential `await
dataService.fetch*` calls (sessions + previews), paying the 420 ms
SSH RTT twice on every reload. Visible in ScarfMon traces as
back-to-back `ssh.run` intervals, totaling ~840 ms minimum
overhead per sidebar refresh.

Adds HermesDataService.sessionListSnapshot(limit:) — same shape
as the existing dashboardSnapshot, folds both queries into a
single backend.queryBatch() call. Both call sites switched.

Halves the SSH round-trips for every sidebar load. With Finding 5's
coalescing, redundant parallel reloads also become free. Together,
the 9× redundant queries-per-minute observed in baseline captures
should drop substantially.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 12:07:31 +02:00
Alan Wizemann 432d5b0b52 fix(remote-sqlite): bump query timeout 15s→30s + add in-flight coalescing
Two issues from the perf capture:

1. fetchMessages on a 157-message session timed out at exactly 15.06 s
   (`mac.fetchMessages` interval = 15,062,646,042 ns), then silently
   returned 0 rows. The chat appeared empty but the session had data;
   the timeout was firing before sqlite3 -json could ship the ~50KB
   payload over a 420 ms-RTT SSH link. Bumped queryTimeout to 30 s.
   The streamScript transport-level timeout still fires on truly
   wedged hosts.

2. mac.loadRecentSessions fired twice in parallel at t=960450 +
   t=960584, finishing 134 ms apart — two independent watcher ticks
   each spawning a full 3-query SSH load for the same data. Added
   in-flight request coalescing keyed on the inlined SQL text:
   when a query with the exact same SQL is already pending, the second
   caller awaits the first task instead of spawning a new
   subprocess. New ScarfMon event `sqlite.query.coalesced`
   surfaces hits in traces.

Coalescing is surgical — applies to single `query` calls only,
not `queryBatch` (different timeout scaling, concurrent-same-batch
is rare). Avoids serializing independent work.
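
The coalescing idea can be sketched roughly as follows. `QueryCoalescer`
and its `[String]` row type are illustrative assumptions, not the
backend's real types; the only detail taken from the text above is that
the key is the exact SQL string.

```swift
/// Hypothetical sketch: callers with identical SQL share one in-flight task.
actor QueryCoalescer {
    private var inFlight: [String: Task<[String], Never>] = [:]
    private(set) var coalescedCount = 0

    func run(
        sql: String,
        _ work: @escaping @Sendable () async -> [String]
    ) async -> [String] {
        if let pending = inFlight[sql] {
            // Same SQL already running: await its result instead of re-running.
            coalescedCount += 1
            return await pending.value
        }
        let task = Task { await work() }
        inFlight[sql] = task
        let rows = await task.value
        inFlight[sql] = nil   // Later callers start a fresh query.
        return rows
    }
}
```

The entry is cleared once the first caller resumes, so only overlapping
requests are shared; a later identical query still hits the database.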

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 12:07:19 +02:00
Alan Wizemann 12e152bfea perf(ssh): replace Thread.sleep spin with kernel-wait for runLocal timeout
Audit Finding 3 — every SSH operation funnels through SSHTransport.runLocal,
which used a 100ms Thread.sleep loop while waiting for the timeout. Each
call held one cooperative-pool thread for the full timeout duration with
spin-poll overhead, AND had 100ms granularity on the deadline.

Replace with proc.terminationHandler + DispatchGroup wait — kernel-wakeup
when the process exits (or the deadline fires), no spin. Same one-thread
blocking footprint, but eliminates the per-operation spin work that
inflated query latency 60-70% under concurrent SSH load (visible in
ScarfMon as 7-second mac.loadRecentSessions outliers when sidebar reload +
chat finalize + watcher poll all fired together).
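
A rough sketch of the terminationHandler + DispatchGroup shape named
above. `runWithDeadline` is a hypothetical name and signature; the real
`runLocal` also handles pipes and output, which are left out here.

```swift
import Foundation

/// Hypothetical sketch: returns the exit status, or nil if the deadline fired.
func runWithDeadline(
    _ executable: String,
    _ arguments: [String],
    timeout: TimeInterval
) throws -> Int32? {
    let proc = Process()
    proc.executableURL = URL(fileURLWithPath: executable)
    proc.arguments = arguments

    let done = DispatchGroup()
    done.enter()
    proc.terminationHandler = { _ in done.leave() }
    do {
        try proc.run()
    } catch {
        done.leave()          // Keep enter/leave balanced if launch fails.
        throw error
    }

    // Blocks this thread, but the wake-up comes from the kernel: no polling loop.
    if done.wait(timeout: .now() + timeout) == .timedOut {
        proc.terminate()      // SIGTERM, then wait for the handler to fire.
        done.wait()
        return nil
    }
    return proc.terminationStatus
}
```

The thread is still parked for the duration of the call, which matches
the "same one-thread blocking footprint" noted above.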

Minimum-touch fix; full async migration of runLocal documented for
follow-up. The bigger refactor would let cooperative-pool threads
park on a true async suspension during the wait, but requires
propagating async through every ServerTransport caller.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 12:06:58 +02:00
Alan Wizemann 099d73dde8 feat(scarfmon): instrument Nous model catalog + subscription path (beach-ball investigation)
User reported a remote-context beach-ball when opening the model
picker with Nous as the active provider. Existing measure points
showed loadProviders + loadModels at ~315ms each (fast). The
beach-ball must be in the uninstrumented Nous-overlay branch the
picker fires when nous is selected.

Adds four measure points covering every blocking call in that path:

- nous.subscription.loadState (interval, .diskIO) — auth.json read
  via NousSubscriptionService.loadState. Already known to do an SSH
  read; now precisely measurable.
- nous.readCache (interval, .diskIO) — nous_models cache read,
  TWO sequential SSH ops (fileExists + readFile).
- nous.bearerToken (interval, .diskIO) — auth.json read AGAIN inside
  fetchModels. **This is a duplicate read** — loadState already
  parsed the same file moments earlier. Comment-flagged as a
  caching candidate.
- nous.fetchModels (interval, .transport) + .bytes (event) — HTTP
  GET against the Nous /v1/models endpoint with the body byte count
  attached. The most likely beach-ball culprit if the endpoint is
  slow or hung.

After the next capture we'll know which of the four owns the user's
wall-clock; if `nous.bearerToken` shows up alongside
`nous.subscription.loadState` with similar duration, the duplicate
read is also a real cost worth fixing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 11:50:51 +02:00
Alan Wizemann 4efd84c119 feat(projects,cron): new project wizard + keychain env mirror + #75 fix
Three coordinated additions to the project surface:

1. New Project from Scratch wizard. Toolbar entry that scaffolds a
   Scarf-standard project skeleton (`<project>/.scarf/dashboard.json`
   placeholder + `AGENTS.md` marker block), registers it, opens an ACP
   chat session in the project's cwd, and auto-sends a kickoff prompt
   that activates the bundled `scarf-template-author` skill. The skill
   drives the substantive setup conversationally — widgets, optional
   config schema, optional cron, AGENTS.md content.

2. Keychain secrets mirror into ~/.hermes/.env. Cron jobs can now
   reference Keychain-backed config values via env vars named
   `SCARF_<UPPER_SLUG>_<UPPER_FIELDKEY>`. Hermes reloads .env per cron
   tick (cron/scheduler.py:897-903), so credential rotation is free.
   Source of truth stays in the Keychain — config.json keeps
   `keychain://` URIs unchanged. Mirror runs at install, post-install
   Configuration save, uninstall, "Remove from List", and on app
   launch (reconcileAll). Mode 0600 on `.env` enforced by
   LocalTransport's existing `.env` heuristic.

3. Configuration form layout recursion fix (issue #75). Per-stage
   frame sizes on `ConfigEditorSheet` triggered
   `_NSDetectedLayoutRecursion` for projects with manifest.json.
   Stabilized the outer frame at the editing stage's intrinsic size so
   transitions only swap content, never resize the container.
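
As an illustration of the `SCARF_<UPPER_SLUG>_<UPPER_FIELDKEY>` naming
in item 2, here is a small sketch. The function name is hypothetical,
and mapping every non-alphanumeric character to `_` is an assumption;
the source only states the overall pattern.

```swift
/// Hypothetical helper; the sanitizing rule below is an assumption.
func envVarName(slug: String, fieldKey: String) -> String {
    func upperSnake(_ text: String) -> String {
        // Uppercase, then replace anything outside A-Z / 0-9 with "_".
        String(text.uppercased().map { ch -> Character in
            (ch.isASCII && (ch.isLetter || ch.isNumber)) ? ch : "_"
        })
    }
    return "SCARF_\(upperSnake(slug))_\(upperSnake(fieldKey))"
}
```

A cron prompt would then reference the value through the resulting
variable name rather than through the `keychain://` URI.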

New services:
- `ProjectScaffolder` (Mac) — bare-shell project + AGENTS.md marker
- `SkillBootstrapService` (Mac) — copies bundled skills into ~/.hermes/skills/
- `KeychainEnvMirror` (Mac) — splice/unmirror/reconcileAll over ~/.hermes/.env
- `SecretsEnvBlock` (ScarfCore) — pure marker-block helpers

Bundled skill `scarf-template-author` v1.1.0 ships in
`Resources/BuiltinSkills.bundle/`; SkillBootstrapService copies it
into `~/.hermes/skills/scarf-template-author/` on launch (idempotent +
version-gated). The skill grew a "Using secrets in cron prompts"
section documenting the env-var convention.

Migration: launch reconciler auto-populates .env on first v2.8 launch.
Users with cron prompts authored against the old (broken) pattern need
to update them to use $SCARF_… references — see release notes.

Tests:
- SecretsEnvBlockTests: 24/24 (`swift test --filter SecretsEnvBlock`)
- KeychainEnvMirrorTests: 11/11 (`xcodebuild ... -only-testing:scarfTests/KeychainEnvMirror`)

The idempotent-mirror test caught a real bug: applyBlock's replace
path consumed the trailing newline from blockRange but didn't restore
it, breaking the no-op-when-unchanged contract that the launch
reconciler relies on. Fixed.
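
The no-op-when-unchanged contract can be sketched like this. The marker
strings and the `applyBlock` signature are placeholders chosen for the
example, not the ones `SecretsEnvBlock` uses.

```swift
import Foundation

// Placeholder markers for illustration only.
let beginMarker = "# >>> example managed block >>>"
let endMarker = "# <<< example managed block <<<"

/// Replaces the managed block if present, otherwise appends it.
func applyBlock(_ body: [String], to file: String) -> String {
    let block = ([beginMarker] + body + [endMarker]).joined(separator: "\n") + "\n"
    if let start = file.range(of: beginMarker),
       let end = file.range(of: endMarker, range: start.upperBound..<file.endIndex) {
        // Consume the newline after the end marker; the new block restores it,
        // so re-applying identical content leaves the file unchanged.
        var upper = end.upperBound
        if upper < file.endIndex, file[upper] == "\n" {
            upper = file.index(after: upper)
        }
        var result = file
        result.replaceSubrange(start.lowerBound..<upper, with: block)
        return result
    }
    let separator = (file.isEmpty || file.hasSuffix("\n")) ? "" : "\n"
    return file + separator + block
}
```

Applying the same body twice returns the same string, which is the
property a launch-time reconciler can rely on to skip needless writes.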

v2.8 RELEASE_NOTES.md committed but no release cut yet.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 11:44:23 +02:00
Alan Wizemann bd9bacb8b3 feat(scarfmon): B2 + B3 + iOS dashboard — file watcher, message hydration, dashboard load
Three areas instrumented in this batch. Both targets build clean.

B2 — Mac HermesFileWatcher (FSEvents + remote SSH poll)
- mac.fileWatcher.localFire (event) — every FSEvents change on a
  watched core or project path. High counts during streaming chats
  are normal (state.db-wal ticks per persisted message); high counts
  during idle suggest a runaway watcher install.
- mac.fileWatcher.remoteRestart (event, bytes=path-count) — fires
  once per SSH poller restart, with the union path count attached.
  Frequent restarts mean the project-list update path is churning.
- mac.fileWatcher.remoteDelta (event) — fires per non-empty change
  detected on the SSH poll. Pair with `ssh.streamScript` cadence to
  see actual poll latency.

B3 — Chat session boot + message hydration
- mac.fetchMessages (interval) + .rows (event) — bounded SQL
  fetch from HermesDataService. Catches slow paginated scrolls
  back through long sessions.
- mac.refreshSessionFromDB (interval) — RichChatViewModel's
  post-promptComplete refresh that picks up cost/token data.
- mac.hydrateMessages (interval) + .rows (event) — full session-boot
  hydration in RichChatViewModel.loadSessionHistory. Was the suspected
  trigger of the 22-bubble session-start storms in the Phase 3a
  baseline; now precisely measurable.

iOS Dashboard (resolves the original "out of sync" mystery)
- ios.loadDashboard (interval) — wraps the four dataService.fetch*
  Citadel SFTP round-trips in IOSDashboardViewModel.load().
- ios.allSessions.count (event) — sidebar list size after each
  load, correlates load latency with list growth.
- ios.dashboardRefresh.trigger (event) — fires only on
  pull-to-refresh, separates that entry path from initial appear.

**Architectural finding:** the original v2.6.0 user feedback
("chat out of sync iOS↔Mac on fast LAN") is now firmly attributable
to this — iOS does NOT subscribe to a file watcher. The dashboard
refresh path is appear-time + pull-to-refresh only.
`CitadelServerTransport.watchPaths()` is effectively dead code on
iOS today; nobody calls it. The earlier A1 instrumentation (commit
9df7142) put its measure points on that uncalled path, which is why
captures showed zero `ios.fileWatcher.tick` events. Future work:
either add a
foregrounded poll loop to iOS, or thread the file watcher into
the dashboard subscription. Documented in the ScarfMon roadmap
memory.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 23:52:11 +02:00
Alan Wizemann 96af545e66 feat(scarfmon): Tier A2/A3/B1/B4 — sessions, model catalog, dashboard widgets, image encoder
Four parallel instrumentation drops orchestrated by the perf roadmap.
All adds; no logic changes; both targets build clean.

A2 — Mac sessions list reload
- mac.scheduleSessionsRefresh (event) — every file-watcher entry into
  the debounced reload helper. Pair with mac.loadRecentSessions count
  to see how many ticks coalesce per actual reload.
- mac.loadRecentSessions (interval) — full wall-clock from DB open
  through observable assignment.
- mac.recentSessions.count (event) — sidebar list size, correlates
  list growth with reload latency.

A3 — ModelCatalogService loads
- modelCatalog.loadProviders (interval) + .providers.count (event).
- modelCatalog.loadModels (interval) + .models.count (event).
- modelCatalog.validateModel (interval) — covers loadCatalog ->
  transport.readFile, hits disk on every call.
Sync wrap (not measureAsync): the inner Task.detached body is
synchronous; the detached hop is the async boundary.

B1 — Dashboard render
- mac.dashboard.body (event) — ProjectsView body re-eval count.
- dashboard.loadRegistry (interval) — projects.json read + decode.
- widget.markdown_file.load / widget.log_tail.load /
  widget.image.load / widget.cron_status.load (intervals) —
  one per v2.7 file-reading widget. cron_status batches its two
  HermesFileService calls into one tuple-returning measure block
  so the existing two-call shape stays intact.

B4 — Image encoder
- imageEncoder.input.bytes (event) — raw input size.
- imageEncoder.downsample (interval) — full decode/resize/JPEG
  encode round trip across all three platform branches (AppKit,
  UIKit, Linux passthrough).
- imageEncoder.bytes (event) — final encoded JPEG size, lets us
  spot blowup cases.
Sync wrap: encode is nonisolated sync; using measureAsync would
require turning the function async, which is a logic change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 23:38:50 +02:00
Alan Wizemann 9df7142f49 feat(scarfmon): A1 — instrument iOS file-watcher polling cadence
Adds three measure points to CitadelServerTransport.watchPaths:

- ios.fileWatcher.tick (interval) — full poll cycle latency including
  the SSH stat round-trips. > 1500ms here is what 'out of sync' feels
  like — the channel is congested or the host is slow.
- ios.fileWatcher.delta (event) — fires only when the signature
  actually changed. Low delta/tick ratio means we can safely drop
  the 3-second cadence; high ratio means we'd just burn bandwidth.
- ios.fileWatcher.paths (event, bytes=count) — number of paths watched
  per cycle. Explains slow ticks as the project list grows.

Surgical addition; existing 3-second cadence + signature-diff logic
unchanged. With Full mode on, a few minutes of usage on LAN will
tell us empirically whether the cadence can drop to 1s — the
original v2.6.0 user feedback complained 'chat is out of sync'
between iOS and Mac on a fast LAN.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 23:33:30 +02:00
Alan Wizemann 9ff9a018e7 feat(scarfmon,chat): Phase 3b — dampen finalize bursts + Thinking… status + wider loadConfig stack
Three targeted fixes from the Phase 3a baseline.

Bubble-burst dampening (Phase 3b-1):
- RichChatViewModel.finalizeStreamingMessage wraps both the
  streaming-id rewrite and the empty-finalize remove() in a
  no-animation Transaction. The id flip from 0 → permanent value
  was the load-bearing trigger of the 5–8 RichMessageBubble.body
  fires we were seeing 1–2 ms after every `finalizeStreamingMessage`
  interval; SwiftUI ran an animated diff against neighbors and
  re-evaluated their bodies. The new message is content-equal to
  the streaming one — there is no animation worth running.

Thinking… status promotion (Phase 3b-2):
- RichChatViewModel exposes `isStreamingThoughtsOnly` — true while
  a turn is in flight, has emitted thought-stream bytes, and has not
  yet produced any visible assistant text. The Phase 3a baseline
  showed this is where most of the user-perceived "feels slow" lives:
  reasoning models commonly take 3–8 s before producing visible
  output, and Scarf surfaced no specific signal during that window.
- Mac ChatView.displayedStatus promotes the toolbar pill to
  "Thinking…" when the flag is true.
- iOS connectionBanner gains a transient "Thinking…" strip with
  spinner, same trigger condition.

Phase 3a fix-up:
- HermesFileService.loadConfig stack-trace logging widened from
  one frame to a 10-frame window prefixed with "#N", so the actual
  caller is visible past inlined ScarfMon wrappers (the prior log
  surfaced ScarfMon.measure itself, not the loadConfig caller).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 23:14:03 +02:00
Alan Wizemann 0a4f8de492 feat(scarfmon): Phase 3a — diagnostic measure points for chat-render bursts
Adds four targeted measure points so the next baseline capture can
attribute the bubble-re-render storm and the slow sendPrompt to a
specific cause:

- mac.RichChatMessageList.body — distinguishes "the parent is
  re-issuing the ForEach" from "the bubbles are re-rendering on their
  own". If list.body fires once and bubble.body fires N times, churn
  is in the bubbles; if list.body fires N times, the ForEach itself
  is being rebuilt.
- finalizeStreamingMessage (interval) — pinpoints the end-of-stream
  burst trigger. The 20-bubble re-eval burst we saw at the close of
  each turn lines up with this call; measuring it surfaces whether
  it's the streaming-id rewrite, the turn-duration assignment, or
  something downstream.
- firstByte / firstThoughtByte (event) — fires once per turn on the
  first chunk after currentTurnStart is set. Splits user-tap → first
  byte (network + Hermes thinking, the dominant component of the 7-11s
  sendPrompt) from first byte → turn end (Scarf streaming render).
- loadConfig caller hint via os.Logger — when ScarfMon is in Full mode,
  logs the first stack frame above each loadConfig call to the
  com.scarf.mon subsystem so mystery callers (the read at t=264282
  with no apparent trigger in the prior baseline) become traceable
  via `log stream`. Symbol-only, no PII, free outside Full mode.

All four are pure additions — no behavior change, same zero-cost
default-off semantics as Phase 2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 22:47:29 +02:00
Alan Wizemann 3126c34561 feat(scarfmon): chat + transport + sqlite measure points (Phase 2)
Wires ScarfMon measure points into the chat hot path on both targets,
plus the underlying SSH transport and remote-SQLite backend. All
callsites are surgical adds — no behavior change. Cost when ScarfMon
is in `.signpostOnly` (default) is one os_signpost emit per call,
elided by the runtime outside an Instruments session. In `.full` mode
the same callsites also push samples into the in-memory ring buffer.
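
To make the cost model above concrete, here is a simplified sketch of a
measure wrapper that short-circuits when no sink is installed.
`MonitorSketch` and its `Sink` closure are hypothetical stand-ins, not
ScarfMon's real API, and the class is not thread-safe.

```swift
import Dispatch

/// Hypothetical stand-in for a measure API.
final class MonitorSketch {
    typealias Sink = (_ name: StaticString, _ nanos: UInt64) -> Void
    private var sinks: [Sink] = []

    func install(_ sink: @escaping Sink) { sinks.append(sink) }

    @inline(__always)
    func measure<T>(_ name: StaticString, _ body: () throws -> T) rethrows -> T {
        // Off path: one branch, then straight into the caller's work.
        guard !sinks.isEmpty else { return try body() }
        let start = DispatchTime.now().uptimeNanoseconds
        defer {
            let elapsed = DispatchTime.now().uptimeNanoseconds - start
            for sink in sinks { sink(name, elapsed) }
        }
        return try body()
    }
}
```

Taking the name as `StaticString` means only compile-time literals can
be used as metric names.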

Render counters (event):
- mac.ChatView.body / ios.ChatView.body — full transcript pane re-evals
- mac.RichMessageBubble.body / ios.MessageBubble.body — per-bubble re-evals

Stream + session (event + interval):
- mac.sendViaACP, mac.sendPrompt — user tap → first-byte
- mac.acpEvent, mac.handleACPEvent — per-event delivery + handle cost
- mac.startACPSession — session boot
- ios.send, ios.startResuming — same shape on iOS
- ios.acpEvent, ios.handleACPEvent — same per-event split on iOS

Transport + SQLite (interval, with byte counts on rows):
- ssh.streamScript (Citadel iOS) — SSH round-trip
- ssh.run (SSHScriptRunner Mac) — SSH round-trip
- sqlite.query, sqlite.queryBatch — Remote SQLite per-call
- sqlite.query.rows — row count + stdout bytes per query

Disk I/O (interval):
- diskIO.loadConfig — config.yaml read + parse
- diskIO.loadCronJobs — cron jobs.json decode

Body counters use the `let _: Void = ScarfMon.event(...)` pattern at
the top of `body` — works inside `@ViewBuilder` and fires on every
re-eval, which is exactly the signal we want.

To use:
  Mac: Settings → Advanced → Performance Diagnostics → Full
  iOS: Settings → Diagnostics → Performance → Full
Both panels auto-aggregate by (category, name), surface top 20 by
p95, and offer Copy as JSON for sharing in feedback threads.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 22:18:06 +02:00
Alan Wizemann 6cf59c8a44 feat(scarfmon): perf instrumentation plumbing for iOS + Mac (Phase 1)
ScarfMon lands the always-on perf instrumentation harness. Phase 1 ships
the plumbing only; Phase 2 wires the chat measure points.

Core (ScarfCore/Diagnostics/):
- ScarfMon — public API: measure / measureAsync / event with @inline(__always)
  short-circuit when the backend set is empty so the off path is one
  branch + return. Categories are an enum, names are StaticString so
  user content cannot leak through metric tags.
- ScarfMonRingBuffer — fixed-capacity (4096) lock-protected ring; one
  os_unfair_lock per record; summary() aggregates by (category, name)
  with nearest-rank p50/p95; exportJSON() emits a one-line-per-sample
  dump for the Copy as JSON button.
- ScarfMonSignpostBackend — emits os_signpost into a dedicated
  com.scarf.mon subsystem so Instruments → Points of Interest shows
  Scarf's own measure points without a debug build.
- ScarfMonLoggerBackend — Logger(.debug) sink for users running
  `log stream --predicate 'subsystem == "com.scarf.mon"'`.
- ScarfMonBoot — three modes (off / signpostOnly / full); persists the
  user's choice in UserDefaults under ScarfMonMode; configure() is
  idempotent and replaces the active backend set atomically.
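
For readers unfamiliar with the ring + nearest-rank combination listed
under ScarfMonRingBuffer, a simplified sketch follows. `SampleRing` is a
hypothetical type; it omits the lock and the (category, name) grouping
and stores bare durations.

```swift
/// Hypothetical fixed-capacity ring of durations (no locking).
struct SampleRing {
    private var storage: [UInt64]
    private var next = 0
    private(set) var count = 0

    init(capacity: Int) {
        precondition(capacity > 0, "capacity must be positive")
        storage = Array(repeating: 0, count: capacity)
    }

    /// Overwrites the oldest sample once the ring is full.
    mutating func record(_ value: UInt64) {
        storage[next] = value
        next = (next + 1) % storage.count
        count = min(count + 1, storage.count)
    }

    var oldestFirst: [UInt64] {
        count < storage.count
            ? Array(storage[..<count])
            : Array(storage[next...]) + Array(storage[..<next])
    }

    /// Nearest-rank percentile: the value at rank ceil(percent/100 * n).
    func percentile(_ percent: Int) -> UInt64? {
        guard count > 0, (1...100).contains(percent) else { return nil }
        let sorted = oldestFirst.sorted()
        let rank = (percent * count + 99) / 100   // integer ceiling
        return sorted[rank - 1]
    }
}
```

Nearest-rank always returns a value that was actually recorded, with no
interpolation between samples.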

Tests: 11 cases covering ring ordering / wrap / reset, summary
aggregation, p95 percentiles, event vs interval semantics, install /
isActive, measure + measureAsync (including the throw path), boot
mode transitions, and JSON export round-trip. @Suite(.serialized)
because the suite mutates process-wide backend state.

App wiring:
- ScarfIOSApp.init + ScarfApp.init call ScarfMonBoot.configure(mode:)
  with the persisted mode (default .signpostOnly).
- iOS Settings → Diagnostics → Performance row leads to a list-style
  panel with the segmented mode picker, top-20 stat rows by p95, Copy
  as JSON, and Reset.
- Mac Settings → Advanced gains a ScarfMonDiagnosticsSection with the
  same shape (NSPasteboard for copy).

Open-source by design — no remote upload, no analytics. The ring buffer
never leaves the device unless the user explicitly taps Copy as JSON.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 22:08:21 +02:00
Alan Wizemann 272da6a915 fix(transport,widgets): code-review fixes for v2.7 + iOS Citadel transport
- CronStatusWidgetView: include jobId + lineCount in `.task(id:)` so widget reload fires when dashboard.json changes either field, not only when the file watcher ticks
- CitadelServerTransport.runScript: enforce the timeout via withThrowingTaskGroup race; propagate transport-level Citadel errors as TransportError.other (so RemoteSQLiteBackend.query maps them to BackendError.transport instead of misclassifying as BackendError.sqlite via a fake -1 exit code); throw TransportError.timeout on the deadline branch with partial stdout preserved
- SSHScriptRunner: close fileHandleForReading on stdout/stderr Pipes in the timeout branch (success path already did); check Task.isCancelled inside the busy-wait so a cancelled parent task terminates the subprocess early instead of waiting out the full timeout. Both runOverSSH and runLocally fixed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 21:40:07 +02:00
Alan Wizemann c7bcfd8655 feat(dashboards): v2.7 widget catalog — file-reading widgets, sparkline, typed status, project-wide watch
Major project-dashboard release. Five new widget types (markdown_file, log_tail,
cron_status, image, status_grid), inline sparkline on stat, typed status enum
shared by list + status_grid, structured WidgetErrorCard, and a project-wide
.scarf/ directory watch that picks up files cron jobs write next to dashboard.json.

- ProjectDashboard: extend DashboardWidget with path/lines/jobId/cells/gridColumns/sparkline; add StatusGridCell + ListItemStatus (lenient parse with synonyms)
- HermesFileWatcher: watch each project's .scarf/ dir alongside dashboard.json (local FSEvents + remote SSH mtime poll); updateProjectWatches signature now takes dashboardPaths + scarfDirs
- New widget views: CronStatus, Image, LogTail, MarkdownFile, StatusGrid, plus WidgetErrorCard for structured failure messaging; legacy "Unknown" placeholder replaced everywhere
- WidgetPathResolver: project-root-anchored path resolution that rejects absolute paths + ".." escapes pre and post canonicalization
- Stat widget gains optional inline sparkline (pure SwiftUI Path, no Charts dep); list widget rows route through typed status with semantic icons + ScarfColor tints
- iOS list widget + unsupported card adopt typed status + warning-toned error card (parity with Mac error styling); new widget types remain Mac-only
- Site mirror: widgets.js renders all five new types (file-reading widgets show annotated catalog placeholders), sparkline SVG, status-grid grid; styles.css adds typed-status palette + error-card + sparkline + grid styles
- Catalog validator: tools/widget-schema.json is the single source of truth; build-catalog.py loads it and enforces per-type required fields. 8 new test cases in test_build_catalog.py covering schema load, v2.7 additions, and missing-required rejection
- Template-author skill (SKILL.md) gains v2.7 Widget Catalog section + canonical status guidance; CONTRIBUTING.md points authors at widget-schema.json; template-author bundle rebuilt
- Localizable.xcstrings picks up auto-extracted strings for the previously-shipped OAuth keepalive feature
- Release notes drafted at releases/v2.7.0/RELEASE_NOTES.md
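
The WidgetPathResolver rule above (reject absolute paths and ".."
before canonicalization, then re-check after) might look roughly like
this. The function name and signature are assumptions made for the
example.

```swift
import Foundation

/// Hypothetical sketch: resolves `relative` under `projectRoot`, or
/// returns nil if the path would escape the root.
func resolveWidgetPath(_ relative: String, projectRoot: URL) -> URL? {
    // Pre-canonicalization: no empty or absolute paths, no ".." components.
    guard !relative.isEmpty,
          !relative.hasPrefix("/"),
          !relative.split(separator: "/").contains("..") else { return nil }

    let root = projectRoot.standardizedFileURL.resolvingSymlinksInPath()
    let candidate = root.appendingPathComponent(relative)
        .standardizedFileURL
        .resolvingSymlinksInPath()

    // Post-canonicalization: a symlink inside the project could still
    // point somewhere outside it.
    guard candidate.path.hasPrefix(root.path + "/") else { return nil }
    return candidate
}
```

The second check is what catches a symlink that looks like a harmless
relative path but resolves outside the project.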

Backwards compatible — existing dashboard.json renders byte-identically, status synonyms (ok/up/down/active/etc.) keep working.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 21:16:29 +02:00
Alan Wizemann 9d945150e0 fix(chat): suppress 'stop' badge in metadata footer for normal turn ends
Every text-bearing assistant turn finalizes with `finishReason="stop"`
(set by `RichChatViewModel.finalizeStreamingMessage` line 881 — the
standard end-of-turn signal Hermes/ACP/OpenAI all emit). The
`metadataFooter` in `RichMessageBubble` was rendering it
unconditionally, so every assistant bubble carried a `· stop · TIME`
footer. Combined with terse model output (e.g. deepseek-v4-flash
emitting only a brief status line before ending the turn), the
badge created a misleading "the agent gave up" impression — there
was no warning, error, or actual failure.

Match the convention used by ChatGPT, Claude.ai, Cursor, etc.:
suppress the badge for normal end-of-turn (`stop` / `end_turn`),
reserve it for abnormal terminations the user actually wants to
see (`max_tokens`, `length`, `error`, `refusal`, `content_filter`,
…). When it does render, color it with severity tone — warning
yellow for "response cut short" cases, danger red for failures
and refusals, muted otherwise.
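
The badge rule above reduces to a small mapping. `BadgeTone` and
`badgeTone(for:)` are hypothetical names, and placing `content_filter`
under danger is an assumption; the text only says failures and refusals
are red.

```swift
/// Hypothetical tone for the footer badge; nil means "render no badge".
enum BadgeTone { case warning, danger, muted }

func badgeTone(for finishReason: String?) -> BadgeTone? {
    guard let reason = finishReason else { return nil }
    switch reason {
    case "stop", "end_turn":
        return nil            // Normal end of turn: suppress the badge.
    case "max_tokens", "length":
        return .warning       // Response was cut short.
    case "error", "refusal", "content_filter":
        return .danger        // Failures and refusals (grouping assumed).
    default:
        return .muted         // Unknown reasons still render, quietly.
    }
}
```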

The existing `handlePromptComplete` system-message-injection path
(lines 725-751) for non-`end_turn` stops still surfaces those cases
explicitly at the top of the chat — this change only trims the
always-on badge from the per-message footer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 15:40:31 +02:00
Alan Wizemann fa15634381 fix(oauth-keepalive): drop unsupported --silent flag from cron create
`hermes cron create` only accepts --name, --deliver, --repeat,
--skill, --script, --workdir. The `silent: Bool?` field on
HermesCronJob exists in the JSON model but isn't exposed through
the CLI's create verb today — argparse rejected the unknown flag,
non-zero exit, toggle failed with the generic CLI hint.

Drops the flag; the keepalive runs with Hermes's default delivery.
Token-refresh side effect during session boot is unaffected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 15:33:25 +02:00
Alan Wizemann 3271391506 fix(chat): debounce sidebar reloads so sessions list doesn't flicker mid-stream
ChatView's `.onChange(of: fileWatcher.lastChangeDate)` fired an
unconditional `Task { await viewModel.loadRecentSessions() }` on
every file-watcher tick. During an ACP message stream the watcher
fires 5–10 times per second (every message Hermes persists bumps
`state.db-wal`'s mtime), and each spawned task re-fetched sessions +
previews + project attribution and reassigned `recentSessions` even
though the data was identical. Each reassignment triggered an
@Observable re-render of the chat sidebar; the user saw the chats
list visibly disappear and reappear several times while typing the
first message in a new chat.

Two changes:

* Add `scheduleSessionsRefresh()` to ChatViewModel — coalesces rapid
  ticks into one trailing `loadRecentSessions()` ~500 ms after the
  last tick. ChatView's onChange now calls this instead. The 500 ms
  window is short enough that idle external changes (a session
  created from another `hermes` invocation, a rename from a
  different window) still appear "soon", and long enough to absorb
  a streaming-response burst.
* Add an explicit `await loadRecentSessions()` to
  `autoStartACPAndSend` after the new session id resolves — the
  debounce would otherwise delay the just-created chat from
  appearing in the sidebar by 500 ms after first send. Mirrors what
  `startACPSession` already does at line 619 for the explicit New /
  Resume paths.
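
A trailing debounce of the kind described in the first bullet can be
sketched as below. `TrailingDebouncer` is a hypothetical standalone
actor; the real `scheduleSessionsRefresh()` lives on ChatViewModel and
its body is not shown here.

```swift
/// Hypothetical sketch: runs `action` once, `milliseconds` after the
/// most recent call to `schedule`.
actor TrailingDebouncer {
    private var pending: Task<Void, Never>?
    private let delayNanos: UInt64

    init(milliseconds: UInt64) {
        delayNanos = milliseconds * 1_000_000
    }

    func schedule(_ action: @escaping @Sendable () async -> Void) {
        // A newer tick supersedes the one still waiting.
        pending?.cancel()
        let delay = delayNanos
        pending = Task {
            do { try await Task.sleep(nanoseconds: delay) } catch { return }
            await action()
        }
    }
}
```

A burst of watcher ticks therefore produces a single reload after the
burst goes quiet, at the cost of one delay window of latency.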

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 14:56:59 +02:00
Alan Wizemann 5afd391838 feat(sidebar): promote Projects to first section + move profile chip under server name
Two small UX tweaks to the macOS sidebar:

* Reorder sections so Projects is the top section above Monitor.
  Reflects how users actually start sessions in Scarf — they pick a
  project first, then drill into chat / sessions / etc. The previous
  order put the read-mostly Dashboard at the top, which made
  Projects feel like a secondary surface.
* Move the active-profile chip out of the top header HStack (where
  it competed for horizontal space with the server-name pill) and
  drop it into a second row right-aligned under the server name.
  Top row stays clean: `[icon] Scarf       <server>`. Second row:
  `                              profile: <name>` only on local
  contexts. Same click target, same .help, just better-anchored.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 14:37:29 +02:00
Alan Wizemann 2a368a04f7 feat(window): persist window size + position across app launches
SwiftUI's WindowGroup exposes `.defaultSize` and `.windowResizability`
but no built-in autosave for window frame across launches. The
documented escape hatch is AppKit's
`NSWindow.setFrameAutosaveName(_:)`, which writes the frame to
UserDefaults on resize/move and restores it on next open.

Add a small `WindowFrameAutosave` NSViewRepresentable that finds its
hosting NSWindow on first appear and stamps the autosave name. Apply
it to `ContextBoundRoot` keyed off `context.id` so each open server
window remembers its own geometry. New servers fall back to the
WindowGroup's `.defaultSize(1100, 700)` until the user resizes once.

A previous WIP attempt (dd4a61f) tried to use a fictional
`.windowFrameAutosaveName(...)` SwiftUI modifier that doesn't exist —
which is why it was never merged. This works because we go through
AppKit directly.

Also picks up Xcode's auto-extracted cron-related Localizable.xcstrings
entries that had been pending.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 14:34:08 +02:00
Alan Wizemann 9aa901a286 fix(credential-pools): refresh view after OAuth sheet dismiss
The sheet auto-closes 0.8s after `oauthFlow.succeeded` flips, but
the parent view didn't reload — so the expiry badge stayed red and
the `tokenTail` stayed stale until the user hit Reload. Hook
`viewModel.load()` + `probeKeepalive()` into the sheet's
`onDismiss` so the freshly-written `auth.json` lands on screen
immediately. Runs on every dismiss (success or cancel) — `load()`
is cheap and idempotent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 14:33:22 +02:00
Alan Wizemann 111fe9bb67 feat(oauth): unblock remote re-auth + daily keepalive to prevent expiry
Three related changes for OAuth subscriptions (Nous Portal, Anthropic
Claude OAuth, etc.):

- **Remote re-auth stall**: Both `NousAuthFlow` and
  `OAuthFlowController` set `PYTHONUNBUFFERED=1` only on local
  contexts. On remote, setting `proc.environment` only affects the
  local-side ssh process — not the remote python interpreter. ssh
  doesn't forward arbitrary env vars without `SendEnv` configured on
  both sides, so remote hermes ran with default block-buffered stdout
  and the device-code prompt never reached Scarf — the sheet hung at
  "Contacting Nous Portal" forever. Fix: when remote, wrap the
  command in `env PYTHONUNBUFFERED=1 …` to inject the var on the
  remote side regardless of ssh config.
- **Daily keepalive**: Hermes refreshes OAuth access tokens on agent
  startup but never proactively. If the user goes longer than the
  refresh-token lifetime (~30 days for Nous) without starting a
  session, the refresh token itself expires and full re-auth is
  required. New `OAuthKeepaliveCronService` registers a Scarf-owned
  daily cron job (`[scarf:oauth-keepalive] OAuth token refresh`) at
  4am that runs a minimal one-token prompt — booting the session is
  what triggers `resolve_nous_runtime_credentials()`. Wired as an
  opt-in toggle in the OAuth providers section of CredentialPoolsView.
  When `hermes auth refresh <provider>` lands upstream we'll swap the
  prompt for that verb; the surrounding wiring stays unchanged.
- **Stale-refresh nudge**: `NousSubscriptionState` gains
  `daysSinceLastRefresh()` + `hasStaleRefresh` (>= 14 days, half of
  Nous's 30-day refresh-token window). The keepalive section
  surfaces an inline orange warning when stale and the toggle is
  off — points the user at the toggle that would have prevented the
  problem.
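
The remote re-auth fix above can be sketched as follows. This is a minimal
Python illustration of the idea, not Scarf's Swift code: `build_auth_argv`
is a hypothetical helper, and `hermes model` is used only because it is the
re-auth command quoted elsewhere in this log.

```python
import shlex

def build_auth_argv(command, remote_host=None):
    """Return (argv, extra_env) for launching an auth command.

    Hypothetical helper for illustration only.
    """
    if remote_host is None:
        # Local: the child process inherits the variable directly.
        return list(command), {"PYTHONUNBUFFERED": "1"}
    # Remote: an env var set on the local ssh process is not forwarded
    # unless SendEnv/AcceptEnv are configured, so prefix the remote
    # command with `env VAR=value` instead.
    remote_cmd = "env PYTHONUNBUFFERED=1 " + " ".join(
        shlex.quote(arg) for arg in command
    )
    return ["ssh", remote_host, remote_cmd], {}
```

For a remote context the variable travels inside the command string, so the
result does not depend on how sshd is configured on the far side.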

Verification: ScarfCore 263/263; Mac app builds clean. Manual repro
of remote stall against Digital Ocean droplet pending user test.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 14:32:06 +02:00
Alan Wizemann 6191c9f19f fix(remote-backend): pre-expand ~/ in Swift via resolvedUserHome
The previous fix (b8b426e) rewrote `~/.hermes/state.db` to
`"$HOME/.hermes/state.db"` and relied on the remote shell to expand
$HOME. That works on Mac SSHTransport (login shell with $HOME set in
the environment) but not reliably through Citadel's exec channel +
base64-decode + inner-/bin/sh pipeline on iOS — the user reports
"unable to open database \"~/.hermes/state.db\"" connecting from
ScarfGo (iOS Simulator) to 127.0.0.1, meaning the literal `~`
character reached sqlite3 untouched.

Switch to client-side expansion: probe remote $HOME once at
RemoteSQLiteBackend.open() via the existing
ServerContext.resolvedUserHome() helper (which uses transport.runProcess
to `echo $HOME` — same code path Hermes CLI calls already exercise
successfully on iOS). Cache the result. quoteForRemoteShell then
substitutes `~/` with the absolute path in Swift before single-
quoting, so sqlite3 receives `/Users/alan/.hermes/state.db` directly
— no nested-shell expansion required.

Falls back to the previous "$HOME/..."-quoted form when the home
probe fails (rare; covers the case where runProcess can't reach the
remote but the user happens to have a working streamScript path).
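
A rough Python sketch of the client-side expansion with its fallback,
assuming the home directory has already been probed (or is `None` when the
probe failed). The function names are hypothetical and the real logic lives
in Swift.

```python
def single_quote(text):
    # Standard POSIX single-quote escaping: close, escaped quote, reopen.
    return "'" + text.replace("'", "'\\''") + "'"

def quote_state_db_path(path, remote_home):
    """Quote a remote path, expanding a leading ~ on the client side."""
    is_tilde = path == "~" or path.startswith("~/")
    if not is_tilde:
        return single_quote(path)
    if remote_home:
        # Substitute the probed home before quoting, so no shell on the
        # remote side has to expand anything.
        return single_quote(remote_home.rstrip("/") + path[1:])
    # Probe failed: fall back to letting the remote shell expand $HOME.
    rest = "".join("\\" + ch if ch in '\\"$`' else ch for ch in path[1:])
    return '"$HOME' + rest + '"'
```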

Mirrors how RemoteBackupService.expandTilde already handles the same
problem upstream.

Refs #74

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 13:40:33 +02:00
Alan Wizemann b8b426ed75 fix(remote-backend): expand ~/ to $HOME so sqlite3 finds the DB
Default-config remotes (Hetzner, Digital Ocean, anything where the
user hasn't overridden remoteHome on the SSHConfig) have
`paths.stateDB == "~/.hermes/state.db"`. The streaming backend was
single-quoting that path, which suppresses tilde expansion, and
sqlite3 itself doesn't expand `~` (that's a shell affordance). Result:
"Error: unable to open database \"~/.hermes/state.db\": unable to open
database file" — the path was reaching sqlite3 with a literal `~`
that it tried to interpret as a directory name.

Replace the single-quote-only `escape(_:)` with `quoteForRemoteShell(_:)`
that mirrors `SSHTransport.remotePathArg`'s pattern: rewrite leading
`~/` to `"$HOME/..."` (double-quoted so the shell expands `$HOME`,
backslash-escaping any embedded backslash, double quote, `$`, or
backtick to keep the literal intact), bare `~` to `"$HOME"`, and give
absolute paths the standard single-quote-with-`'\''`-escape treatment.
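
The quoting rules described above, as a small Python sketch. It mirrors the
described behaviour only; `quote_for_remote_shell` here is an illustrative
reimplementation, not the Swift function.

```python
def quote_for_remote_shell(path):
    """Quote a path for a POSIX shell, keeping a leading ~ expandable."""
    if path == "~":
        return '"$HOME"'
    if path.startswith("~/"):
        rest = path[2:]
        # Inside double quotes these four characters stay special,
        # so each one gets a backslash.
        escaped = "".join("\\" + ch if ch in '\\"$`' else ch for ch in rest)
        return '"$HOME/' + escaped + '"'
    # Everything else: single quotes, with ' rewritten as '\''.
    return "'" + path.replace("'", "'\\''") + "'"
```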

Adds a regression test (`openWithDefaultTildeHomeExpands`) that
exercises the tilde-rewrite end-to-end against a real /bin/sh: places
a fixture state.db at `~/.hermes/state.db` (backing up the user's
real DB if present) and verifies open() + a query both succeed
through the streaming path.

Refs #74

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 13:34:20 +02:00
Alan Wizemann 593b4e62cb feat(remote): replace SQLite snapshot pipeline with SSH query streaming
The remote-DB pipeline pulled the entire state.db down via scp on
every refresh tick. For the issue #74 user (4.87 GB DB) that meant
~7-min per-snapshot wall time even with the size-aware-timeout fix,
~30 GB/hour upload, and data permanently 5–10 minutes stale. This
isn't a bug to patch — it's the wrong architecture for any non-trivial
remote DB.

Replace it with per-query streaming over SSH. Each SQL statement
becomes one ssh round-trip running `sqlite3 -readonly -json` against
the live remote DB. ControlMaster keeps the channel warm at ~5 ms
overhead; sqlite3 cold-start adds ~30–50 ms; total ~50–100 ms per
query vs. the old multi-minute snapshot. Bandwidth scales with query
result size, not DB size.

What changed:

* New `HermesQueryBackend` protocol and two implementations:
  `LocalSQLiteBackend` (libsqlite3 in-process — local performance
  unchanged) and `RemoteSQLiteBackend` (sqlite3 over SSH per query
  with batched-statement support for multi-query view loads).
* `SQLValue` and `Row` types as the typed boundary between backends
  and the row parsers. `SQLValueInliner` substitutes `?` placeholders
  with SQLite-escaped literals for the remote-CLI codepath (local
  backend keeps real `sqlite3_bind_*`).
* `ServerTransport` swaps `snapshotSQLite` + `cachedSnapshotPath` for
  `streamScript(_:timeout:)`. SSHTransport delegates to the existing
  `SSHScriptRunner`; CitadelServerTransport (iOS) base64-encodes the
  script + decodes remotely via Citadel's exec channel since stdin
  pipes aren't supported there yet.
* `HermesDataService` becomes a thin facade — every fetch* method
  routes through `backend.query(...)`. Public API is unchanged for
  view-model callers; `lastSnapshotMtime`/`isUsingStaleSnapshot`/
  `staleAge` removed (had zero UI consumers).
* New `dashboardSnapshot()` and `insightsSnapshot(since:)` batched
  calls turn Dashboard's 4-query and Insights' 5-query view loads
  into one SSH round-trip each (~80–100 ms total instead of ~280 ms
  naive). DashboardViewModel and InsightsViewModel updated to use
  them.
* One-time launch migration in `scarfApp` wipes the orphaned
  `~/Library/Caches/scarf/snapshots/` directory (could be 5 GB+ for
  the issue #74 user).
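
To illustrate the placeholder-inlining idea from the list above, here is a
simplified Python sketch. It assumes `?` placeholders only and tracks
single-quoted string literals; double-quoted identifiers, comments, and
non-finite floats are not handled, and the table name in the usage note is
made up.

```python
def sqlite_literal(value):
    """Render a Python value as a SQLite literal."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):
        return "1" if value else "0"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, bytes):
        return "X'" + value.hex() + "'"
    # Text: double any embedded single quote.
    return "'" + str(value).replace("'", "''") + "'"

def inline_placeholders(sql, params):
    """Replace each ? outside a string literal with the next parameter."""
    out, index, in_string = [], 0, False
    for ch in sql:
        if ch == "'":
            in_string = not in_string  # '' toggles twice, staying inside
        if ch == "?" and not in_string:
            if index >= len(params):
                raise ValueError("more placeholders than parameters")
            out.append(sqlite_literal(params[index]))
            index += 1
        else:
            out.append(ch)
    if index != len(params):
        raise ValueError("more parameters than placeholders")
    return "".join(out)
```

For example, `inline_placeholders("SELECT * FROM sessions WHERE id = ?", [7])`
yields a statement that can be handed to the `sqlite3` CLI as plain text.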

JSON parsing detail: sqlite3 -json preserves SELECT column order in
the raw bytes, but `[String: Any]` from NSJSONSerialization doesn't.
The remote backend extracts column ordering by walking the first
object's literal bytes — without this, every positional row read
(`row.string(at: 0)`) would silently return wrong columns.
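
The column-order concern can be shown with a short Python sketch. Python's
`json` module can hand back key/value pairs in document order, so this uses
`object_pairs_hook` rather than walking raw bytes as the commit describes;
it is an analogy for the problem, not the Swift technique. `parse_rows` is
a hypothetical name.

```python
import json

def parse_rows(raw_json):
    """Return (columns, rows) with columns in SELECT order."""
    if not raw_json.strip():
        return [], []  # treat empty output as "no rows"
    # Each JSON object arrives as a list of (column, value) pairs,
    # in the order they appear in the text.
    objects = json.loads(raw_json, object_pairs_hook=list)
    if not objects:
        return [], []
    columns = [name for name, _ in objects[0]]
    rows = [[value for _, value in obj] for obj in objects]
    return columns, rows
```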

Tests: 41 new across `SQLValueInlinerTests`, `HermesDataServiceBackendTests`
(mock backend) and `RemoteSQLiteBackendTests` (integration via local
sqlite3 binary). Full suite 262/262 passing.

Builds clean on Mac and iOS. Ships as part of v2.7.

Refs #74

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 13:09:06 +02:00
Alan Wizemann de36411a8d fix(remote): size-aware snapshot timeouts and partial-file cleanup (#74)
The remote-DB snapshot pipeline was hardcoded to a 120s scp timeout and
a 60s remote-backup timeout. For users with a multi-GB state.db (the
report cites 4.87 GB), 120s is wildly insufficient — at typical home
upload speeds (5-50 Mbps) a 5 GB transfer takes roughly 13 minutes to
over two hours. scp gets killed mid-transfer, leaves a partially-written .db at
the cache path, and every subsequent attempt opens that corrupt file
with `sqlite3_open` returning garbage. Symptom: SSH connects, all
diagnostics pass, but Dashboard / Sessions / Memory show no data.

Changes to SSHTransport.snapshotSQLite:

* Probe `stat` on the remote DB before starting. Drives both the
  timeout budget and a local-disk-space pre-flight (refuses to start
  if local Caches volume can't hold size + 500MB margin).
* Adaptive timeouts based on remote size:
  - backup: 60s base + 1s per 100MB, capped at 600s.
  - scp:    300s base + 0.5s per MB (≈2 MB/s minimum throughput),
            capped at 3600s.
  Defaults of 60s/300s when stat fails (still up from 120s on scp).
* Add `-C` to scp args. SQLite DBs have lots of zero-padded empty
  pages and typically compress 30-50% in transit.
* On any failure path, remove the partial local snapshot file so the
  next attempt starts fresh instead of opening a corrupt DB.
* Rewrite the generic "Command timed out after Ns" error into a
  specific "Snapshot transfer timed out after Ns pulling X.X GB
  state.db from <host>" so users on slow links know what hit the
  wall instead of seeing a meaningless number.
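
The timeout arithmetic above, worked as a small Python sketch. It assumes
decimal megabytes (1 MB = 1,000,000 bytes), which the commit does not
specify; `snapshot_timeouts` is an illustrative name.

```python
def snapshot_timeouts(remote_size_bytes):
    """Return (backup_timeout_s, scp_timeout_s) for a remote DB size.

    Pass None when the remote `stat` probe failed.
    """
    if remote_size_bytes is None:
        return 60.0, 300.0  # defaults when the size is unknown
    size_mb = remote_size_bytes / 1_000_000  # assumption: decimal MB
    backup = min(60 + size_mb / 100, 600)    # 1s per 100 MB, capped
    scp = min(300 + 0.5 * size_mb, 3600)     # 0.5s per MB, capped
    return backup, scp
```

A 1 GB database gets 70s for the backup and 800s for scp; the scp cap of
3600s is reached at about 6.6 GB.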

Cannot reproduce locally (no 5GB state.db on hand), but the failure
mode is unambiguous from code reading: hardcoded 120s vs. real-world
multi-GB transfer durations.

Closes #74

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 11:25:38 +02:00
Alan Wizemann 6a7ac21ebe chore: Bump version to 2.6.5 2026-05-03 22:15:05 +02:00
Alan Wizemann 5be67282d8 test(layer-b): full Install → Configure → Open → Uninstall journey XCUITest (#73)
Closes the deferred Layer B install-drive that v2.7's smoke test
left as future work. The new test
(`testFullCatalogToInstallToDashboardJourney`) drives the full
install/uninstall pipeline end-to-end and validates 9 assertion
points along the way:

- Window surfaces under `--scarf-test-mode`
- Sidebar navigation to Projects
- Install sheet appears (URL handoff via launch arg)
- Parent-dir field accepts custom path + Continue
- Configure sheet renders + commit clicks
- Confirm Install runs the install pipeline
- Open Project advances to success view
- Project row appears in sidebar with uniquified name
- Right-click Uninstall + confirm Remove + Done removes the row

Runs in ~30s green on the dev Mac.

## What needed wiring up

**SwiftUI Menu / NSToolbarItem accessibility-bridging.** macOS
toolbar Menus don't propagate `.accessibilityIdentifier` through to
XCUITest — neither the menu trigger NOR the popup contents are
queryable by ID. Verified by tree-dump diagnostics. The test
sidesteps this entirely by routing the install URL through a new
`--scarf-test-install-url <https-url>` launch arg that calls
`TemplateURLRouter.shared.handle(scarf://install?url=...)` at App
init, gated on `TestModeFlags.shared.isTestMode`. Production
launches (no flag) untouched.

**Accessibility IDs added** on the new install/uninstall path:
- `templateConfig.commitButton`, `templateConfig.cancelButton`
- `projects.row.<name>`, `sidebar.section.<rawValue>`
- `projects.contextMenu.uninstallTemplate`
- `templateUninstall.confirmRemove`
- `templateInstall.success.openProject`
- `templateUninstall.success.done`

**Sandboxed-runner caveat.** The XCUITest runner's `/tmp` is
sandbox-protected (createDirectory throws EPERM); we use
`NSTemporaryDirectory()` which resolves to the runner's container
tmp (`~/Library/Containers/com.scarfUITests.xctrunner/Data/tmp/`),
which the unsandboxed Scarf app can read since it has full disk
access.

## Known cohabitation hazard (pre-existing uninstaller bug)

If the dev Mac already has a project from the same template
installed, the install pipeline uniquifies the new project's name
("HackerNews Daily Digest 2") but BOTH projects' cron jobs get
registered under the same `[tmpl:awizemann/hackernews-digest] Daily
HN digest` name. `ProjectTemplateUninstaller.loadUninstallPlan`
resolves cron jobs to remove by NAME and can target the wrong
project's job. The Layer B test surfaces this — manifests as: test
passes, the dev's real project's cron job disappears.

**Fix (separate work):** store cron-job IDs in
`<project>/.scarf/template.lock.json` at install time and resolve
by ID at uninstall time. Until then, the test docstring warns
about cohabitation; recovery is `hermes cron create` to recreate
the lost job.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 22:09:50 +02:00
Alan Wizemann c661945a1f feat(cron): auth-error banner + running indicator + per-job log tail (#72)
Cron rows now surface the same OAuth-refresh-revoked recovery flow as
chat instead of a generic red dot, plus three previously-missing
observability cues:

- ACPErrorHint.classify is reused on `job.lastError`. When it returns
  `oauthRefreshRevoked(provider)` the detail pane shows the human hint
  + a "Re-authenticate" button that drops the user into Credential
  Pools via `coordinator.pendingOAuthReauth = provider` — same wiring
  ChatView's banner uses. Unrecognized errors fall back to the legacy
  red `lastError` text (no regression).
- Row dot turns blue + pulses when `state == "running"` (taking
  precedence over disabled / error / success); the detail header gains
  a `ScarfBadge("running…", kind: .info)` next to active/paused. No new
  polling — `HermesFileWatcher.lastChangeDate` (already wired into
  ActivityView/Logs) drives `CronViewModel.load()` so state flips
  surface within a watcher tick.
- "LAST RUN OUTPUT" replaces the inline `LAST OUTPUT` block with a
  collapsible panel: a one-line summary (`<timestamp> — ok|error|running…`)
  always visible, full monospaced terminal-style scroll view on
  expand, auto-scrolls to bottom when new runs land.

Also fixes a pre-existing bug in `HermesFileService.loadCronOutput`:
Hermes nests per-run output under `~/.hermes/cron/output/<jobId>/<ts>.md`
but the loader treated the dir as flat, so the cron output panel never
rendered any content. The fix walks the per-job subdir + keeps the
legacy flat-file fallback for older Hermes layouts.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 22:09:21 +02:00
Alan Wizemann f5f8dc30b6 Dogfooding templates: HN Digest + in-app catalog browser + test harness (#71)
* feat(templates): hackernews-digest template + dogfooding test harness

First pass of the dogfooding-templates initiative. Each pre-release cycle
ships one new official `.scarftemplate` and uses installing/exercising
that template as the regression test. v1 lands the harness scaffolding
plus the first template under it.

- HackerNews Daily Digest template (`templates/awizemann/hackernews-digest/`):
  config-driven (min_score / max_items / topics) cron-only template.
  No secrets — keeps the harness minimal until the fake-Keychain shim
  lands. Bundle validates against `tools/build-catalog.py`; entry added
  to `templates/catalog.json`.
- `SCARF_HERMES_HOME` env-var override at `HermesProfileResolver` —
  the seam every Layer-B test relies on to drive Scarf against an
  isolated Hermes home. Bypasses cache + active_profile lookup; rejects
  relative paths. 5 unit tests + 3 ServerContext integration tests.
- `TestModeFlags.shared.isTestMode` — reads `--scarf-test-mode` once
  from `CommandLine.arguments`. Wiring only; gating sites (Sparkle,
  capability probe, first-run walkthrough) land as Layer-B exercises
  them.
- Layer A (`scarf/scarfTests/TemplateE2ETests.swift`): parses + plans
  the shipped HN bundle the way the app does at install time;
  asserts manifest, config schema, dashboard widgets, and cron prompt
  contract. Mirrors the existing site-status-checker coverage.
- Layer B scaffold (`scarf/scarfUITests/TemplateInstallUITests.swift`):
  proves the launch-arg + env-var plumbing reaches Scarf. Full install
  click-through deferred until fixture-Hermes-home and accessibility
  IDs land.

Wiki pages added separately on the `.wiki-worktree` branch:
- `Template-Ideas.md` — backlog of 9 v1-feasible templates +
  full-spec v3 epic for Project-Site-as-Living-Surface (eBay listings
  use case).
- `Test-Harness.md` — contributor guide for extending the harness.

Verification: scarfTests 124/124, ScarfCore 220/220, new Layer A 3/3,
Layer B scaffold 1/1, build-catalog.py + its 28 unit tests all green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(test-harness): Layer B pivot to real ~/.hermes + a11y IDs + Sparkle gating

Discovered during Layer B work that XCUITest runners are sandboxed:
they can read ~/.hermes/ but writes throw NSFileWriteNoPermissionError.
That kills the SCARF_HERMES_HOME-based isolation pattern for UI tests —
snapshot/restore from inside the runner can't work. Pivot:

- Layer B drives the real ~/.hermes the dev Mac is already running
  against. The harness assumes a working Hermes install (XCTSkip if
  the binary isn't there). Cleanup is via the app's own UI flows
  (which have full disk access), not direct file I/O. Layer A keeps
  its env-var seam — those tests run inside the host app's address
  space and write freely.
- SwiftUI's WindowGroup(for: ServerID.self) doesn't auto-surface a
  window on a fresh XCUIApplication.launch(). The harness sends ⌘1
  (the "Open Server → Local" menu shortcut wired in scarfApp.swift's
  OpenServerCommands) to take the same code path real users hit via
  Dock click.
- Real user home resolved via getpwuid(getuid()) rather than
  NSHomeDirectory(), which inside the sandboxed runner returns
  ~/Library/Containers/com.scarfUITests.xctrunner/Data.
- 8 accessibility IDs added on the install path so the next iteration
  can drive the full Templates → Install from URL → Parent dir →
  Confirm Install flow without depending on view-tree label scraping:
  templates.toolbar.menu, templates.installFromFile,
  templates.installFromURL, templates.installURL.field,
  templates.installURL.confirm, templateInstall.parentDir.field,
  templateInstall.parentDir.continue, templateInstall.confirmInstall.
- TestModeFlags.shared.isTestMode now gates UpdaterService —
  --scarf-test-mode launches Sparkle inert so update prompts don't
  pop on top of an XCUITest-driven window. Production launches
  unchanged.

FixtureHermesHome.swift removed — the fixture-tmpdir approach is
abandoned in favour of using the real installation. Layer A's
SCARF_HERMES_HOME tests still pass; they just don't need a populated
home to exercise path derivation.

Verification: scarfTests 124/124, ScarfCore 220/220, Layer B smoke
1/1 (after fresh build — XCUITest is sensitive to stale binaries).
catalog.py --check still green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(chat): clip placeholder to TextEditor bounds and clear it on focus

Two related bugs in the Mac chat composer's placeholder overlay:

* The "Message Hermes… / for commands · drag images to attach" hint had
  no width constraint, so on narrower window geometries it visibly
  overflowed past the rounded TextEditor boundary. Add `lineLimit(1)`,
  `truncationMode(.tail)`, and `frame(maxWidth: .infinity, alignment:
  .leading)` so it ellipsizes inside the field instead.
* The opacity formula `text.isEmpty ? 1 : 0` only hid the placeholder
  once content was typed, not when the field gained focus. Standard
  NSTextField / UITextField semantics clear the placeholder on focus.
  Switch to `(text.isEmpty && !isFocused) ? 1 : 0` so the hint
  disappears the moment the user clicks into the field.

The opaque-background ghosting mitigation from #65 is preserved
unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(chat): surface OAuth refresh-revoked errors with in-app re-auth

When an OAuth provider's refresh token was revoked, Hermes printed
"Refresh session has been revoked. Run `hermes model` to re-authenticate."
to stderr but Scarf swallowed it — the user saw a typing indicator that
silently disappeared with no banner, no system message, no actionable
hint. The error classifier had no pattern for OAuth revocation.

- `ACPErrorHint.classify` now returns a `Classification` struct
  carrying the hint plus an optional `oauthProvider` name. New
  patterns match "Refresh session has been revoked", "re-authenticate",
  and 401-with-OAuth-provider-name (whole-word so `anthropicapi`
  doesn't false-match `anthropic`). Provider extraction lets the UI
  dispatch the right re-auth flow.
- Chat error banner ([ChatView.swift]) gains a "Re-authenticate" button
  when an OAuth provider was identified — sets
  `AppCoordinator.pendingOAuthReauth` and routes to Credential Pools.
- Credential Pools view consumes the hand-off slot to auto-present
  AddCredentialSheet seeded with the affected provider, AND adds a
  per-row "Re-authenticate" button on every OAuth provider so users
  who go straight there don't have to retype the provider name.
- `AddCredentialSheet` accepts an optional `initialProvider` that
  pre-fills providerID + authType=.oauth; the existing Nous-vs-PKCE-
  vs-CLI gate dispatches re-auth identically to first-time setup —
  reuses the same `OAuthFlowController` / `NousSignInSheet` plumbing,
  no new flow code.
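
The whole-word provider match mentioned above can be sketched in Python as
follows. The provider list and function name are hypothetical; only the
word-boundary behaviour is taken from the description.

```python
import re

# Hypothetical provider list, for illustration only.
KNOWN_OAUTH_PROVIDERS = ("nous", "anthropic")

def extract_oauth_provider(error_text):
    """Return the first known provider named as a whole word, else None."""
    lowered = error_text.lower()
    for provider in KNOWN_OAUTH_PROVIDERS:
        # \b on both sides keeps `anthropicapi` from matching `anthropic`.
        if re.search(r"\b" + re.escape(provider) + r"\b", lowered):
            return provider
    return None
```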

Verification: ScarfCore 221/221 (incl. new
errorHintsClassifyOAuthRefreshRevoked covering the four patterns +
word-boundary guard); Mac app builds clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(catalog): in-app template catalog browser + sentinel-marker test isolation

The v2.8 catalog browser surfaces every shipped .scarftemplate from
awizemann.github.io/scarf/templates/catalog.json directly in Scarf.
Users now discover and install templates without leaving the app.
Closes the gap where publishing the catalog updated the website but
changed nothing inside Scarf.

Architecture mirrors NousModelCatalogService 1:1: cache-first fetch,
24h TTL at ~/.hermes/scarf/catalog_cache.json, result enum (fresh /
cache / fallback) with bundled fallback so a fresh-install / offline
user still sees something. Search + category filter + sort
(awizemann official first). Detail page renders entry.config schema
preview without separate README fetch — what's in catalog.json is
what we render. Install hands the HTTPS URL to the existing
TemplateInstallerViewModel.openRemoteURL flow; nothing about the
installer itself changes.
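
A minimal Python sketch of the cache-first flow with the three result
kinds. Several details are assumptions made for the example: freshness is
judged by file mtime, a stale cache is reused when the fetch fails, and
`load_catalog` and its parameters are hypothetical names.

```python
import json
import time
from pathlib import Path

TTL_SECONDS = 24 * 60 * 60  # 24h TTL

def load_catalog(cache_path, fetch, bundled, now=None):
    """Return (kind, catalog) where kind is fresh, cache, or fallback."""
    now = time.time() if now is None else now
    cache_path = Path(cache_path)
    cached = None
    if cache_path.exists():
        try:
            cached = json.loads(cache_path.read_text())
        except (OSError, ValueError):
            cached = None  # unreadable or corrupt cache counts as missing
    if cached is not None and now - cache_path.stat().st_mtime < TTL_SECONDS:
        return "cache", cached
    try:
        fresh = fetch()
    except Exception:
        # Offline or server error: prefer a stale cache over the bundle.
        if cached is not None:
            return "cache", cached
        return "fallback", bundled
    cache_path.write_text(json.dumps(fresh))
    return "fresh", fresh
```

Passing `fetch` in as a callable keeps the network out of the function, so
tests can simulate an offline user by raising from it.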

Files:
- Core/Models/CatalogEntry.swift — Decodable mirror of catalog.json
  per-template shape. Identity-based Equatable/Hashable on `id`.
- Core/Services/CatalogService.swift — fetch + cache + fallback
- Core/Services/InstalledTemplatesIndex.swift — walks projects.json +
  template.lock.json to build [templateId: version] map; classify()
  helper for Installed / Update available / Not installed badges
- Features/Templates/ViewModels/CatalogViewModel.swift — @Observable
- Features/Templates/Views/{CatalogView,CatalogRowView,CatalogDetailView,CatalogCategoryFilter}.swift
- Packages/ScarfCore/.../HermesPathSet.swift — adds catalogCache path
- Features/Projects/Views/ProjectsView.swift — Templates toolbar
  menu now opens with "Browse Catalog…"; sheet binding.

Tests (20 new, all passing in isolation):
- CatalogServiceTests (6) — live catalog.json snapshot, cache lifecycle,
  staleness boundary, schema-version mismatch rejection, bundled fallback
- InstalledTemplatesIndexTests (5) — empty registry, templated project,
  ad-hoc project skip, corrupt lock skip, classify() branches
- CatalogViewModelTests (6) — search filter, category filter, official-first
  sort, deduped categories, install state, install URL pass-through

Accessibility IDs (6, on the catalog path): templates.browseCatalog,
catalog.searchField, catalog.refreshButton, catalog.row.<detailSlug>,
catalog.categoryFilter, catalogDetail.installButton.

## Sentinel-marker hardening on SCARF_HERMES_HOME (incident response)

While iterating on v2.8 tests, the env-var override pattern raced
under Swift Testing's parallel-suite scheduler, which caused
~/.hermes/scarf/projects.json to be overwritten with fixture data
from ProjectsViewModelTests. Recovered the user's projects from the
on-disk dirs they referenced + cron-job prompt paths (6 projects
restored).

To make this class of incident impossible going forward:
HermesProfileResolver.scarfHermesHomeOverride() now requires the
override path to contain a sentinel marker file
(`.scarf-test-home-marker`). Without the marker, the override is
ignored and Scarf falls through to the real ~/.hermes/. Even if a
test crashes mid-teardown leaving the env var set, even if the var
leaks to a non-test process, even if a misconfigured launchctl plist
exports it — the override only activates against directories that
explicitly opt in by carrying the marker. Tests drop the marker in
their tmpdir setUp; production never carries it.
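
The guard can be sketched in a few lines of Python. This is an
illustration of the described behaviour, with a hypothetical function name;
the relative-path rejection comes from the earlier `SCARF_HERMES_HOME`
commit.

```python
import os
from pathlib import Path

MARKER = ".scarf-test-home-marker"

def resolve_hermes_home(environ, real_home):
    """Honor SCARF_HERMES_HOME only for directories carrying the marker."""
    override = environ.get("SCARF_HERMES_HOME")
    if (
        override
        and os.path.isabs(override)               # relative paths rejected
        and (Path(override) / MARKER).is_file()   # directory must opt in
    ):
        return Path(override)
    # No override, or override without marker: use the real home.
    return Path(real_home) / ".hermes"
```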

HermesProfileResolverTests gains overrideIsIgnoredWhenMarkerMissing
which verifies the guard is load-bearing. All test files using
SCARF_HERMES_HOME (CatalogServiceTests, CatalogViewModelTests,
InstalledTemplatesIndexTests, TemplateE2ETests) now drop the marker
before setenv.

Verification: 20/20 v2.8 + v2.7 hardened tests pass; 45/45 adjacent
existing tests pass; ScarfCore package tests pass (221/221); catalog
validator clean (3 templates); wiki secret-scan clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(swift6): retroactive conformance + verbatim help text + xcstrings refresh

Three small Swift 6 compile-cleanups that landed during the
dogfooding-templates iteration:

- MessageSpeechService — drop `@preconcurrency` on the
  AVSpeechSynthesizerDelegate conformance now that the protocol's
  Sendable annotations are upstreamed.
- ChatView — mark `RichChatViewModel.PendingPermission: Identifiable`
  as `@retroactive`. We don't own either the type or the protocol; the
  Swift 6 compiler flags this so downstream breakage is loud if
  ScarfCore ever adds the conformance upstream.
- CredentialPoolsView — wrap the `.help(...)` string in
  `Text(verbatim:)` so the backticks render literally instead of being
  interpreted as markdown inline-code by the LocalizedStringKey
  overload (which `.help(_:)` rejects styled).

Localizable.xcstrings: auto-generated catalog refresh picking up the
new active-profile + chat error-hint strings landed in earlier
commits on this branch (acd3692, 301806d).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(catalog): error logging + MainActor I/O + semver pre-release + decoder fault tolerance

- InstalledTemplatesIndex: replace bare `try?` reads/decodes with logged
  do/catch so corrupt registry/lock files leave a breadcrumb instead of a
  silent nil.
- InstalledTemplatesIndex.isVersionNewer: handle pre-release suffixes per
  semver §11 — `1.0.0-beta` no longer reports as newer than `1.0.0`,
  preventing a ghost "Update available" that would downgrade users.
- CatalogViewModel.refresh: dispatch the synchronous index walk through
  `Task.detached` so registry + N lock-file reads don't run on
  @MainActor.
- Catalog decoder: per-element fault tolerance via custom `init(from:)` —
  one malformed catalog entry is dropped with a logged warning instead
  of failing the whole catalog decode (honors the per-entry doc-comment
  contract).
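
For reference, the semver §11 precedence rule behind the `isVersionNewer`
fix can be sketched in Python like this. It is an independent illustration
of the spec's ordering, not a port of the Swift code, and it assumes
well-formed numeric core versions.

```python
def _split(version):
    version = version.split("+", 1)[0]        # build metadata is ignored
    core, _, pre = version.partition("-")
    return tuple(int(part) for part in core.split(".")), pre

def _pre_key(pre):
    # Numeric identifiers sort below alphanumeric ones; numeric ones
    # compare as integers, alphanumeric ones as ASCII strings.
    return [
        (0, int(ident), "") if ident.isdigit() else (1, 0, ident)
        for ident in pre.split(".")
    ]

def is_version_newer(candidate, installed):
    """True when candidate has higher semver precedence than installed."""
    c_core, c_pre = _split(candidate)
    i_core, i_pre = _split(installed)
    if c_core != i_core:
        return c_core > i_core
    if c_pre == i_pre:
        return False
    if not c_pre:
        return True    # a release outranks any of its pre-releases
    if not i_pre:
        return False   # a pre-release never outranks its release
    return _pre_key(c_pre) > _pre_key(i_pre)
```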

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 20:04:13 +02:00
Alan Wizemann 34d315793b fix(chat): clip placeholder to TextEditor bounds and clear it on focus
Two related bugs in the Mac chat composer's placeholder overlay:

* The "Message Hermes… / for commands · drag images to attach" hint had
  no width constraint, so on narrower window geometries it visibly
  overflowed past the rounded TextEditor boundary. Add `lineLimit(1)`,
  `truncationMode(.tail)`, and `frame(maxWidth: .infinity, alignment:
  .leading)` so it ellipsizes inside the field instead.
* The opacity formula `text.isEmpty ? 1 : 0` only hid the placeholder
  once content was typed, not when the field gained focus. Standard
  NSTextField / UITextField semantics clear the placeholder on focus.
  Switch to `(text.isEmpty && !isFocused) ? 1 : 0` so the hint
  disappears the moment the user clicks into the field.

The opaque-background ghosting mitigation from #65 is preserved
unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 16:48:32 +02:00
Alan Wizemann acd3692faf fix(profiles): switch-and-relaunch flow + active-profile chip + structured logs
Profile selection had no apparent effect on Webhooks/Sessions/SOUL.md/Memory
even after restart in some user setups. The path-resolution code reads
~/.hermes/active_profile correctly on paper, so the failure mode is likely
environment-specific (HERMES_HOME exported in the shell, in-process state
that didn't reset on what the user perceived as a restart, etc). Layer a
defense that's correct regardless of root cause:

* New AppRelauncher helper spawns a fresh `open -n <bundleURL>` and asks
  the current process to terminate after a 250ms delay. Refuses to fire
  from Xcode/DerivedData (the .debugBuild guard) so debug sessions don't
  lose their attached debugger.
* ProfilesViewModel.switchAndRelaunch runs `hermes profile use`, calls
  HermesProfileResolver.invalidateCache(), then relaunches via the helper.
  Existing switchTo() also gains the cache-invalidation step so the
  context-menu "Set Active (no relaunch)" path stays self-consistent.
* ProfilesView replaces the passive "Restart Scarf after switching" text
  with a confirmation-gated `Switch & Relaunch` primary button on the
  detail pane plus the same item in each row's context menu. Confirmation
  dialog flags that all Scarf windows will close.
* SidebarView header gains a brand-tinted ScarfBadge showing the
  currently-active profile on local contexts. Click to jump to the
  Profiles tab. The chip refreshes on `selectedSection` change so a
  terminal-side `hermes profile use` is visible after the next nav.
* HermesProfileResolver success logs gain `name=…, home=…, source=…`
  key=value structure across all three resolution paths (file / file-default /
  default-no-file). `log show … | grep ProfileResolver` now answers
  "what did the resolver decide?" unambiguously for support requests.

Closes #70

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 13:18:10 +02:00
Alan Wizemann ab615f0c28 feat(ios-chat): redesign composer with HIG touch targets and clear disabled state
Send button is now a 44pt circular target with an explicit color swap
(rust accent → background-tertiary) on disable, instead of relying on
SwiftUI's default opacity dim — addresses the "first tap doesn't
register" complaint by making the inactive state visibly different in
both light and dark mode. Paperclip and text field both gain a 44pt
minimum height so the row feels modern and roomy.

The text field swaps `.roundedBorder` for a plain field with a
ScarfRadius.xl rounded fill (ScarfColor.backgroundSecondary) and a
borderStrong stroke. Outer paddings and HStack spacing migrate from
magic numbers to ScarfSpace tokens.

Preserves verbatim: the `.toolbar { ToolbarItemGroup(placement: .keyboard) }`
keyboard-dismiss chevron (issue #51), draft persistence, .submitLabel,
@FocusState, photo-picker wiring, attachment-strip rendering, and every
.disabled() predicate.

Closes #69

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 13:14:09 +02:00
Alan Wizemann 982ed7da92 chore: bump iOS build to 30 for TestFlight
iOS-only patch carrying the rotation lock + chat-start preflight
off-MainActor fixes from cb164f0. Mac side stays on the v2.6.0
binary already shipped (build 29 archive); this build number bump
only affects future Mac archives, not the one already notarized.

Uploaded to App Store Connect via altool — Apple processing now,
will land in TestFlight once the binary clears the post-upload
scan (typically 5–15 min).
2026-05-01 16:20:13 +02:00
Alan Wizemann cb164f07f9 fix(ios): lock iPhone to portrait + move chat-start preflight off MainActor
Two iOS-specific crash classes from the v2.5.1 TestFlight feedback
round:

**Rotation crash** — locked the iPhone target to
`UIInterfaceOrientationPortrait` only (was Portrait + LandscapeLeft
+ LandscapeRight). The phone can't rotate the app at all anymore,
so any layout path that wasn't audited for size-class transitions
is no longer reachable. iPad orientation list left alone (target
device family is iPhone-only anyway).

**"Crash while typing" / "trying to continue an existing
conversation"** — `ChatController.passModelPreflight()` was doing
a synchronous SSH read (`context.readText(configYAML)`) on
`@MainActor` during chat-start. On a remote ScarfGo context that
blocks the main thread for seconds; iOS's non-responsive-app
watchdog kills the process around 10s. To the user this surfaces
as a "crash" while they're typing — they kept tapping the keyboard
while the connect was hung. Move the read to `Task.detached` and
await it; the UI stays responsive while the SSH I/O drains. Three
callers (`start`, `start(projectPath:)`, `startResuming`) updated
to `await passModelPreflight(...)` — they were already async.
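
A minimal sketch of the off-main pattern described above, not the
shipped code. `readConfigBlocking()` is a hypothetical stand-in for the
synchronous `context.readText(configYAML)` call, and the `model:` check
is an illustrative assumption about what the preflight looks for.

```swift
import Foundation

// Hypothetical stand-in for the blocking SSH read.
func readConfigBlocking() -> String {
    Thread.sleep(forTimeInterval: 0.05) // simulate slow remote I/O
    return "model: example"
}

@MainActor
func passModelPreflight() async -> Bool {
    // The blocking call runs in a detached task, so the main actor is
    // suspended (not blocked) while the read drains.
    let text = await Task.detached(priority: .userInitiated) {
        readConfigBlocking()
    }.value
    return text.contains("model:")
}
```

Callers that are already async only need to add `await` at the call
site.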

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 16:03:28 +02:00
Alan Wizemann 1dbdf9d079 chore: ignore local crashes/ triage directory
TestFlight feedback / crash JSONs land here while we're working
through an iOS fix round. They carry tester PII (emails, carriers,
locales) and aren't meant for the public repo. Kept local-only;
deleted after the round closes.
2026-05-01 15:57:41 +02:00
Alan Wizemann 101488cd0d docs(readme): bump What's New to v2.6.0 + Hermes v0.12 catch-up
Replaces the 2.5 "What's New" block with a 2.6 summary that
covers the Hermes v0.12 surfaces (Curator, multimodal images, 5
new providers, Teams + Yuanbao, Kanban, Skills v0.12, cron
--workdir, settings deltas, ScarfGo Webhooks/Plugins/Profiles)
and the post-merge chat fix round (#67/#68/#65/#62/#63/#64/#66/
#61). Verified-versions table gains v0.12.0 as the current target;
recommended-Hermes line points at v0.12.0+ for full feature
support. ScarfGo block kept but de-emphasised since it shipped
in 2.5.
2026-05-01 15:55:16 +02:00
Alan Wizemann 03c996ee80 chore: Bump version to 2.6.0 2026-05-01 15:42:48 +02:00
Alan Wizemann 8428cbff10 docs(v2.6.0): document post-merge issue fixes in RELEASE_NOTES
Adds a "Chat composer + transcript (post-merge round)" subsection
to the bug-fixes block covering #67, #68, #65, #62, #63, #64,
#66, and the partial #61 ACP-timeout bump. The pre-merge
test-target / iOS-build fixes stay grouped under "Pre-merge".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 15:41:48 +02:00
Alan Wizemann 381adfd925 fix(acp): bump control-message timeout 30s→60s for db-contended hosts (#61)
Field-reported (#61): under realistic concurrency where the
Hermes gateway is also running, state.db lock contention
(Discord sync / skill registration / cron scheduling all
holding write locks) stalls ACP's `initialize` / `session/new` /
`session/load` past the previous 30s watchdog, surfacing as
"Starting…" indefinitely or an opaque timeout error.

SQLite contention on a healthy host clears in seconds, so 60s
gives the lock-resolution path room to breathe while still
surfacing genuinely broken transports promptly. `session/prompt`
remains untimed (it streams events and can run for minutes).
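
For illustration, one common way to put a deadline on a control
message is to race it against a sleeping task. This is a sketch under
assumptions: `withTimeout` and `ControlTimeout` are hypothetical names,
and the actual ACP client may implement its watchdog differently.

```swift
import Foundation

struct ControlTimeout: Error {}

// Races `operation` against a timer; whichever child finishes first wins.
func withTimeout<T: Sendable>(
    seconds: Double,
    _ operation: @escaping @Sendable () async throws -> T
) async throws -> T {
    try await withThrowingTaskGroup(of: T.self) { group in
        group.addTask { try await operation() }
        group.addTask {
            try await Task.sleep(nanoseconds: UInt64(seconds * 1_000_000_000))
            throw ControlTimeout()
        }
        // Cancel the loser on every exit path.
        defer { group.cancelAll() }
        guard let first = try await group.next() else { throw ControlTimeout() }
        return first
    }
}
```

A request such as `initialize` would be wrapped with `seconds: 60`,
while a streaming call like `session/prompt` would simply not be
wrapped. Note that the group still waits for the losing child, so the
operation must respond to cancellation.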

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 15:40:33 +02:00
Alan Wizemann 254af46e93 feat(chat): per-message TTS playback in assistant bubbles (#66)
Adds a small speaker glyph to the metadata footer of each settled
assistant bubble. Tap to read the reply aloud through
`AVSpeechSynthesizer`; tap again (or any other bubble's button) to
stop. Picks up the user's macOS Spoken Content default voice
automatically — no Hermes dependency, works offline.

- New `MessageSpeechService` (`Core/Services/`) — shared
  `@Observable` synthesizer; `playingMessageId` drives icon
  state. Markdown control characters (asterisks, backticks,
  link syntax) are stripped before speech so the user doesn't
  hear "asterisk asterisk bold".
- `SpeakMessageButton` lives outside `RichMessageBubble.==` so
  the bubble's Equatable short-circuit doesn't freeze the icon
  when playback flips between messages.
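
The markdown stripping mentioned above can be sketched as a pure
function. `speakableText(from:)` is a hypothetical name and the two
patterns are assumptions; the real service may strip more syntax than
asterisks, backticks, and links.

```swift
import Foundation

// Removes the markdown control characters a synthesizer would otherwise
// read aloud. Only links, asterisks, and backticks are handled here.
func speakableText(from markdown: String) -> String {
    var text = markdown
    // [label](url) -> label
    text = text.replacingOccurrences(
        of: #"\[([^\]]+)\]\([^)]*\)"#,
        with: "$1",
        options: .regularExpression)
    // Drop emphasis and inline-code markers.
    text = text.replacingOccurrences(
        of: #"[*`]+"#,
        with: "",
        options: .regularExpression)
    return text
}
```

The cleaned string is what would be handed to `AVSpeechUtterance`.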

The full Hermes-provider TTS pipeline (Edge / ElevenLabs /
OpenAI / NeuTTS / Piper from Settings → Voice) is a much bigger
follow-up — wiring per-provider audio fetching, caching, and
streamed playback is its own quarter. v2.6.0 ships the immediate
"listen while doing something else" affordance.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 15:38:22 +02:00
Alan Wizemann 596c844da5 feat(chat): notify when Hermes finishes a prompt in the background (#64)
Sending a long prompt and switching to other work — the canonical
async-agent flow — required polling the chat to know when the
response landed. Wire a local UNUserNotificationCenter notification
to fire when an ACP prompt completes while Scarf isn't the
foreground app.

- New `ChatNotificationService` (Core/Services) handles lazy
  authorization, foreground gating, and post.
- `ChatViewModel.sendViaACP` calls it on successful prompt
  completion with the assistant's first-line preview and the
  active session title.
- Settings → Display → Feedback adds a "Notify when Hermes finishes"
  toggle, default on. Skipped for `/steer`-style mid-run sends —
  those don't end a turn.

Dock badges and per-session unread state from the issue are
worthwhile follow-ups but out of scope for v2.6.0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 15:35:55 +02:00
Alan Wizemann ec47d191a1 fix(chat): preserve local user messages across resume cycles (#63)
When a user sent a prompt and immediately switched to a different
session before Hermes flushed the row to state.db, `resumeSession`
ran `reset()` (which clears `messages`) and then
`loadSessionHistory` read the un-persisted DB and replaced the
array with an empty result. The user's bubble came back blank or
disappeared on return.

Hold local-only user messages (negative ids) in a per-session
cache that survives `reset()`. `loadSessionHistory` re-injects any
still-pending entries for the loaded session, dedups against any
DB row that finally caught up (matching content with persisted id
≥ 0), and clears the cache as the DB confirms each entry.

The cache is bounded by the number of sessions the user sends messages
in during one app run. Entries clear themselves as Hermes persists
them, and orphaned entries (deleted sessions etc.) are tiny and never
resurface because session ids are unique.
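
A rough sketch of the cache-and-merge idea, assuming simplified
hypothetical types (`Message`, `PendingMessageCache`); the real view
model has more fields and may dedup differently.

```swift
struct Message: Equatable {
    let id: Int          // negative = local-only, >= 0 = persisted
    let content: String
}

struct PendingMessageCache {
    private var pending: [String: [Message]] = [:]

    // Remember a local-only message for its session.
    mutating func remember(_ message: Message, sessionId: String) {
        precondition(message.id < 0, "only local (negative-id) messages are cached")
        pending[sessionId, default: []].append(message)
    }

    // Merge DB rows with still-pending local messages, dropping any
    // pending entry whose content the DB has caught up with.
    mutating func merge(persisted: [Message], sessionId: String) -> [Message] {
        let confirmed = Set(persisted.filter { $0.id >= 0 }.map(\.content))
        let stillPending = (pending[sessionId] ?? [])
            .filter { !confirmed.contains($0.content) }
        pending[sessionId] = stillPending.isEmpty ? nil : stillPending
        return persisted + stillPending
    }
}
```

Matching on content alone is the simplification here: two identical
prompts sent back to back would be confirmed by a single DB row.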

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 15:33:37 +02:00
Alan Wizemann 31e6c31acf fix(chat): scope composer state to active session id (#62)
`RichChatInputBar`'s `@State` `text` and `attachments` survived
session switches because the surrounding view tree is structurally
identical across sessions — SwiftUI happily reused the same
instance and leaked the previous session's unsent draft into the
new one.

Bind the composer's identity to `richChat.sessionId` so SwiftUI
rebuilds the view (and its `@State`) on session change. A stable
fallback string covers the brief "no session selected" window;
using `UUID()` here would mint a fresh id on every render and
trash the composer per body re-eval.
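
A sketch of the identity binding, with hypothetical view names
(`ComposerSketch`, `ChatSketch`) and a hypothetical fallback string;
only the `.id(_:)` idea is taken from the change above.

```swift
#if canImport(SwiftUI)
import SwiftUI
#endif

// Derives the identity handed to SwiftUI's `.id(_:)`.
func composerIdentity(for sessionId: String?) -> String {
    // A constant fallback stays stable across renders; UUID() would not.
    sessionId ?? "composer-no-session"
}

#if canImport(SwiftUI)
struct ComposerSketch: View {
    @State private var text = ""
    var body: some View { TextField("Message", text: $text) }
}

struct ChatSketch: View {
    let sessionId: String?
    var body: some View {
        // A new identity makes SwiftUI discard the old @State.
        ComposerSketch().id(composerIdentity(for: sessionId))
    }
}
#endif
```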

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 15:28:59 +02:00
Alan Wizemann fcfe1c89d6 fix(chat): stop placeholder ghosting in chat composer (#65)
`TextEditor`'s NSTextView surfaces a typed glyph one frame before
the SwiftUI binding propagates, so the bare `if text.isEmpty`
overlay rendered the translucent placeholder text directly on top
of the just-typed character — the "behind or around" ghost the
reporter described.

Two mitigations:

- Pin an opaque `ScarfColor.backgroundSecondary` rect behind the
  placeholder Text. During any single-frame binding lag the user
  now sees a clean placeholder rather than layered glyphs.
- Switch the conditional to `.opacity(text.isEmpty ? 1 : 0)` so the
  view tree stays stable per keystroke. Pairs with the composer
  perf fix from #67.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 15:27:53 +02:00
Alan Wizemann df1b9caabf fix(chat): scale rich chat content with the font-size slider (#68)
The chat font-size slider only set `\.dynamicTypeSize` on the chat
root, but ScarfFont tokens are fixed-point (`Font.system(size: 14, …)`)
so dynamic type didn't reach bubble text, reasoning, tool chips, code
blocks, or markdown headings. Slider moved between 85%–130% with
little visible effect.

Plumb a separate `\.chatFontScale: Double` env value from
`RichChatView` and have the chat content views read it:

- `RichMessageBubble` — user bubble body, reasoning (disclosure +
  inline), REASONING label, token chip, tool-chip name, metadata
  footer.
- `MarkdownContentView` — paragraphs (now pinned to a scaled body
  font instead of inheriting), headings (1..5), inline-rendered code
  blocks, code-language label.
- `CodeBlockView` — code body and language label.

`ChatFontScale.{body, callout, caption, captionStrong, caption2,
mono, monoSmall, codeBlock, codeInline}(_ scale:)` helpers mirror
`ScarfFont`'s base sizes so scale = 1.0 is byte-for-byte identical
to today's UI; the slider now actually moves the visible chat text.

Other surfaces (settings, sidebar, etc.) still use the static
ScarfFont tokens — chat scaling stays scoped to the chat surface.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 15:24:45 +02:00
Alan Wizemann a41c81c048 fix(chat): coalesce composer onChange writes to stop typing lag (#67)
Typing in the chat composer became unusably laggy because
`updateMenuState()` ran on every keystroke and unconditionally wrote
both `showMenu` and `selectedIndex`. Two state writes inside one
`onChange(of: text)` handler tripped SwiftUI's "action tried to
update multiple times per frame" warning, and each redundant write
forced a full body re-eval — visible as the slow-HID stalls and the
main-thread layout churn the reporter captured in sampling.

Three changes:

- Compute the new selection up front and write only the deltas. Same
  semantics; no spurious mutations.
- Short-circuit the whole handler when the user is composing normal
  text (no `/` prefix) and the menu is already hidden — the common
  case. Stops paying for `SlashCommandMenu.filter` on every keystroke
  of regular prose.
- Replace `.onChange(of: commands.map(\.id))` with
  `.onChange(of: commands.count)`. The mapped form allocated a fresh
  `[String]` on every body re-eval; counting is one int read.
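
The first two changes can be sketched without SwiftUI. Everything
below is illustrative: `ComposerMenuModel`, the prefix filter, and the
`writeCount` counter are assumptions used to make the redundant writes
visible, not the real `updateMenuState()`.

```swift
final class ComposerMenuModel {
    private(set) var showMenu = false
    private(set) var selectedIndex = 0
    private(set) var writeCount = 0   // counts actual state mutations
    let commands: [String]

    init(commands: [String]) { self.commands = commands }

    func textDidChange(_ text: String) {
        // Fast path: normal prose with the menu already hidden.
        if !text.hasPrefix("/") && !showMenu { return }

        // Compute the new state up front.
        let matches = text.hasPrefix("/")
            ? commands.filter { $0.hasPrefix(text) }
            : []
        let newShow = !matches.isEmpty
        let newIndex = newShow ? min(selectedIndex, matches.count - 1) : 0

        // Write only the values that actually changed.
        if showMenu != newShow { showMenu = newShow; writeCount += 1 }
        if selectedIndex != newIndex { selectedIndex = newIndex; writeCount += 1 }
    }
}
```

With observable state, every skipped write is a body re-evaluation
that never gets scheduled.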

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 15:20:15 +02:00
Alan Wizemann 88add62997 Merge branch 'v12-updates'
Hermes v2026.4.30 (v0.12.0) compatibility — autonomous Curator (Mac +
iOS), multimodal image input in chat, 5 new inference providers,
Microsoft Teams + Yuanbao gateway platforms, read-only Kanban view,
Skills v0.12 surface (URL install / reload / pin / disable), Cron
--workdir flag, Settings deltas (cache TTL, redaction, runtime footer,
Piper, Vercel), iOS read-only Webhooks/Plugins/Profiles, and a
pre-v0.12 Hermes-version banner. All new surfaces capability-gated so
older Hermes hosts see the v2.5 surface unchanged.

Release notes: releases/v2.6.0/RELEASE_NOTES.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 15:17:36 +02:00
Alan Wizemann 80589b3f23 chore(i18n): pick up autogenerated v0.12 string keys
Xcode-autogenerated strings for the v12 surface — curator chip labels,
image attachment button + counter, archived-skill banner — that the
extractor produced while the v12-updates branch was being authored.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 15:17:11 +02:00
Alan Wizemann 13f89e309b docs(claude-md): correct Hermes v0.12 surface drift after review fixes
CLAUDE.md was rewritten in 3d85b91 to describe the new v0.12 surfaces
but several claims drifted from what actually shipped (or have since
been walked back during the review-fix pass):

- Curator iOS panel was described as "read-only"; it ships Run Now /
  Pause / Resume actions and inline pin toggles.
- Curator path symbols were named `curatorReportJSON` / `curatorReportMD`;
  the actual additions to `HermesPathSet` are `curatorLogsDir` and
  `curatorStateFile`, with the per-cycle `run.json` / `REPORT.md`
  resolved at runtime via the state file's `last_report_path`.
- The `flush_memories` bullet claimed Scarf had dropped the field; it's
  preserved on pre-v0.12 hosts via `hasFlushMemoriesAux` (restored in
  commit 33022ae).
- The cron `--workdir` bullet didn't mention the capability gating that
  landed in commit 4a2ef74, nor the empty-string clear gesture from
  commit 46cec81.
- The v0.12 surface list omitted the iOS Phase H catch-up
  (Webhooks/Plugins/Profiles read-only tabs + HermesVersionBanner)
  shipped in commit 799332f.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 14:15:34 +02:00
Alan Wizemann c055081ba3 perf(chat-ios): ingest picker items in parallel via TaskGroup
`ingestPickerItems` ran loadTransferable + encode sequentially per
selected image. PhotosPickerItem.loadTransferable is async and hops
off MainActor (nonisolated), but for 5+ iCloud-backed PHAssets the
sequential pipeline meant five round-trips back-to-back instead of
five concurrent ones.

Switched to `withTaskGroup` keyed by selection index so:
- Slot cap is computed once up front and items past the cap are
  dropped (previously the loop broke at the first item over the cap).
- Each item's loadTransferable + ImageEncoder runs concurrently.
- Results land back in selection order via index sort, so the
  attachment chip row matches what the user picked.

Errors carry a Sendable `String` message rather than the raw `Error`,
which isn't Sendable under strict concurrency.
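
The index-keyed group can be sketched generically. `loadInOrder` is a
hypothetical helper; in the real code the `load` closure would wrap
`loadTransferable` plus the encoder, and failures would be mapped to
`String` messages.

```swift
// Loads up to `limit` items concurrently and returns results in the
// original selection order.
func loadInOrder<Item: Sendable, Output: Sendable>(
    _ items: [Item],
    limit: Int,
    load: @escaping @Sendable (Item) async -> Output
) async -> [Output] {
    // Apply the slot cap once, before any work starts.
    let accepted = Array(items.prefix(max(0, limit)))
    return await withTaskGroup(of: (Int, Output).self) { group in
        for (index, item) in accepted.enumerated() {
            group.addTask { (index, await load(item)) }
        }
        var indexed: [(Int, Output)] = []
        for await pair in group { indexed.append(pair) }
        // Children finish in arbitrary order; restore selection order.
        return indexed.sorted { $0.0 < $1.0 }.map { $0.1 }
    }
}
```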

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 14:12:41 +02:00
Alan Wizemann bd05e01d1c fix(webhooks-ios): surface parse failure in lastError
The post-load assignment was a true no-op:
`self.lastError = parsed.isEmpty && !result.isEmpty ? nil : nil` —
both ternary branches assigned `nil`. The intent (visible from the
condition shape) was to set an error message when the CLI returned
text but the parser produced no webhooks.

Now that branch sets a "Couldn't parse webhook list output" message
which the existing banner at line 33 renders. Normal flow (parse
succeeds, or empty output) still clears the error.
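
As a sketch, the corrected branch reduces to a small pure function.
`webhookLoadError` is a hypothetical name; the condition mirrors the
`parsed.isEmpty && !result.isEmpty` shape quoted above.

```swift
// Returns an error message only when the CLI printed something but the
// parser produced no webhooks; otherwise clears the error.
func webhookLoadError(rawOutput: String, parsedCount: Int) -> String? {
    (parsedCount == 0 && !rawOutput.isEmpty)
        ? "Couldn't parse webhook list output"
        : nil
}
```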

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 14:11:25 +02:00
Alan Wizemann b66ed7e8d7 fix(kanban): show stderr-only in error banner, parse stdout-only as JSON
`KanbanViewModel.load` previously assigned the combined stdout+stderr
output of `runHermesCLI` into both the JSON-parse `data` and the
`stderr` slot of its result tuple. Two consequences:

- On non-zero exit, the error banner showed combined output (often
  stdout usage text concatenated with the actual error), reducing the
  signal-to-noise ratio when troubleshooting.
- On non-zero exit with mixed output, JSON decoding could fail because
  stderr text was prepended to the JSON body.

Added `HermesFileService.runHermesCLISplit` — a sibling of `runHermesCLI`
that returns `(exitCode, stdout, stderr)` separately, leaning on the
already-separated `stdoutString` / `stderrString` from the transport
layer. KanbanViewModel now uses it: stdout is the JSON parse target,
stderr is the error-banner source. Existing `runHermesCLI` callers are
untouched.
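
For a local process, the stdout/stderr split could look like the
sketch below. This is an assumption-heavy illustration: `runSplit` is
a hypothetical helper built on Foundation's `Process`, whereas
`runHermesCLISplit` goes through Scarf's transport layer (including
SSH), which is not shown here.

```swift
import Foundation

func runSplit(_ executable: String, _ arguments: [String]) throws
    -> (exitCode: Int32, stdout: String, stderr: String) {
    let process = Process()
    process.executableURL = URL(fileURLWithPath: executable)
    process.arguments = arguments
    // Separate pipes keep error text out of the JSON parse target.
    let outPipe = Pipe()
    let errPipe = Pipe()
    process.standardOutput = outPipe
    process.standardError = errPipe
    try process.run()
    // Fine for small outputs; very large stderr would need concurrent reads.
    let outData = outPipe.fileHandleForReading.readDataToEndOfFile()
    let errData = errPipe.fileHandleForReading.readDataToEndOfFile()
    process.waitUntilExit()
    return (process.terminationStatus,
            String(decoding: outData, as: UTF8.self),
            String(decoding: errData, as: UTF8.self))
}
```

The caller decodes `stdout` as JSON and shows `stderr` in the banner
only when `exitCode` is non-zero.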

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 13:29:16 +02:00
Alan Wizemann 46cec816ec fix(cron): allow clearing an existing workdir on edit
`updateJob` only emitted `--workdir <path>` when the value was non-empty,
so once a workdir was set on a job, the user had no way to remove it
through Scarf — clearing the TextField and saving was a silent no-op.

Hermes' `cron edit --workdir` argparse documents passing an empty string
as the explicit clear gesture (mirroring the existing `--script` shape,
which already passes empty through here). Drop the `!isEmpty` predicate
so a non-nil value — including "" — reaches the CLI.

The previous capability gate keeps this safe on pre-v0.12 hosts: CronView
passes `workdir: nil` there, so the flag is omitted and v0.11 argparse
is never asked about an unknown arg.
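
The resulting argument logic can be sketched as follows.
`workdirArguments` is a hypothetical helper, not the actual
`updateJob` body; it only shows how nil, empty, and non-empty values
are meant to differ.

```swift
// nil        -> flag omitted (pre-v0.12 hosts, or field not applicable)
// ""         -> flag emitted with an empty value (explicit clear)
// "/some/dir" -> flag emitted with the path
func workdirArguments(workdir: String?, supportsWorkdir: Bool) -> [String] {
    guard supportsWorkdir, let value = workdir else { return [] }
    return ["--workdir", value]
}
```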

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 13:27:49 +02:00
Alan Wizemann 681fa40c3c fix(skills): use ScarfFont token for OFF pill badge
The disabled-skill row's "OFF" pill used `.font(.system(size: 9, weight:
.semibold))`, which the project CLAUDE.md explicitly flags ("bypass
the type scale… is a code smell"). The design system documents
`scarfStyle(.captionUppercase)` as the canonical badge font; switching
to it picks up the matching tracking + uppercase casing as a bonus.

The pin glyph above (`Image(systemName: "pin.fill").font(.system(size:
9))`) is left as-is — that's intentional glyph sizing on an `Image`,
which the design rule explicitly excludes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 13:27:07 +02:00
Alan Wizemann 15642d37cf fix(skills): parse equal-indent disabled list in skills config
`readDisabledSkillNames` broke out of the loop on `leading <= baseIndent`,
but PyYAML's default `yaml.dump` (what Hermes uses to write the disabled
list) emits list items at the SAME indent as the parent key:

    skills:
      disabled:
      - foo
      - bar

Here `disabled:` is at indent 2 and `- foo` is also at indent 2, so the
old check terminated before any item was appended — every disabled skill
written by Hermes would have appeared enabled in the UI.

Now the loop only breaks when the indent is strictly shallower than the
`disabled:` line, or when a same-indent line isn't a list item (sibling
key — that's still the end of the block). The deeper-indent layout still
parses correctly.
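
The termination rule can be sketched as a small line scanner. This is
a simplified illustration, not the shipped `readDisabledSkillNames`:
it matches the first `disabled:` key without checking its parent and
does not strip YAML quotes.

```swift
import Foundation

func readDisabledSkillNames(fromYAML yaml: String) -> [String] {
    var names: [String] = []
    var baseIndent: Int? = nil   // indent of the `disabled:` line
    for line in yaml.split(separator: "\n") {
        let trimmed = line.trimmingCharacters(in: .whitespaces)
        if trimmed.isEmpty || trimmed.hasPrefix("#") { continue }
        let indent = line.prefix(while: { $0 == " " }).count
        guard let base = baseIndent else {
            if trimmed == "disabled:" { baseIndent = indent }
            continue
        }
        let isItem = trimmed.hasPrefix("- ")
        if indent < base { break }             // strictly shallower: block ended
        if indent == base && !isItem { break } // sibling key: block ended
        if isItem {
            names.append(trimmed.dropFirst(2).trimmingCharacters(in: .whitespaces))
        }
    }
    return names
}
```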

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 13:23:01 +02:00
Alan Wizemann 33022aeb92 fix(settings): restore flush_memories aux row on pre-v0.12 hosts
Phase B removed the `flushMemories` field from `AuxiliarySettings`,
the `aux("flush_memories")` reader from the YAML parser, and the
"Flush Memories" row from `AuxiliaryTab.tasks` outright. But
`HermesCapabilities.hasFlushMemoriesAux` still claims (with inverse
semantics) that the row should stay visible on pre-v0.12 hosts where
the task is alive. Project CLAUDE.md documents the same contract.

Restored:
- `AuxiliarySettings.flushMemories: AuxiliaryModel` (and `.empty`).
- `aux("flush_memories")` in both YAML readers
  (`HermesConfig+YAML.swift` and the `HermesFileService` mirror).
- `AuxiliaryTab.tasks` appends the Flush Memories row when
  `hasFlushMemoriesAux` is true, mirroring how `curator` is appended
  on the v0.12+ branch.

On v0.12+ hosts the flag is `false` so the field stays `.empty` and
the row is hidden — no behaviour change for current users.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 13:22:41 +02:00
Alan Wizemann 4a2ef74b74 fix(cron): gate --workdir flag on hasCronWorkdir capability
`HermesCapabilities.hasCronWorkdir` was added but never consumed: the
editor sheet always rendered the Workdir TextField and the view model
unconditionally appended `--workdir <path>` whenever the field was
non-empty. On a pre-v0.12 host argparse rejects the unknown flag and
the entire `cron create`/`cron edit` call fails.

Two-layer gate:
- CronJobEditor takes a `supportsWorkdir` flag and hides the field on
  pre-v0.12 hosts.
- CronView reads `\.hermesCapabilities` and forces the workdir argument
  to "" / nil when the capability is absent, so the edit path for an
  existing job, which hydrates `form.workdir` from a pre-existing
  value, can't smuggle the flag through.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 13:21:35 +02:00
Alan Wizemann 11bb2bd0c3 fix(chat): detach NSOpenPanel image read off MainActor
`presentImagePicker()` ran `Data(contentsOf: url)` synchronously on
MainActor inside the URL loop before the detached `encode()`. A 24 MP
HEIC at 8-15 MB stalled the chat composer per file. The drag/drop and
paste paths already read off-main via `loadObject`/`loadDataRepresentation`
callbacks; this brings the open-panel branch in line by capturing the
URLs into a `Task.detached` and reading bytes there.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 13:20:50 +02:00
Alan Wizemann 3d85b91392 docs(hermes-v12): release notes + CLAUDE.md polish (Phase I)
Adds releases/v2.6.0/RELEASE_NOTES.md covering every Phase A-H surface
(Curator, multimodal image input, 5 new providers, Skills v0.12,
Settings deltas, Cron workdir, Teams + Yuanbao, read-only Kanban, iOS
read-only Webhooks/Plugins/Profiles, version banner, internal
capability detector). Adds a paragraph at the top noting that Hermes
v0.11 hosts continue to work: every new surface is gated on
HermesCapabilities, so v2.6 against v0.11 looks identical to v2.5.2
against v0.11.

Polishes CLAUDE.md inaccuracies introduced in Phase A's first pass:

- ACP image wire shape: corrected to {"type":"image","data":...,"mimeType":...}
  (matches acp.schema.ImageContentBlock); previous Anthropic-style
  source: {type: base64, ...} sketch was wrong.
- Cron --context-from: clarified that Hermes hasn't exposed it as a
  CLI flag yet (read-only via HermesCronJob.contextFrom), only
  --workdir is writable.
- hermes memory setup: noted that the interactive verb stays in
  Terminal (no in-app shellout); Settings → Memory just exposes the
  provider picker.
- Skills surface: more precise about which CLI verbs back the Mac UI
  affordances and why the disable-toggle is deferred to v2.7.
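
As an illustration of the corrected image wire shape, a Codable value
with exactly those three keys could look like this. The struct is a
hypothetical sketch for documentation purposes; only the key names
come from the note above.

```swift
import Foundation

struct ImageContentBlock: Codable, Equatable {
    var type = "image"
    let data: String      // base64-encoded image bytes
    let mimeType: String
}

// Encodes a block and returns it as a dictionary for inspection.
func encodedKeys(of block: ImageContentBlock) throws -> [String: String] {
    let json = try JSONEncoder().encode(block)
    return try JSONSerialization.jsonObject(with: json) as? [String: String] ?? [:]
}
```

The flat layout means there is no nested `source` object to build.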

215 ScarfCore tests green; both Mac and iOS schemes build clean. Wiki
update + the actual release.sh ship are deferred to the user's
typical release-prep flow (the wiki repo is a separate worktree
that needs scripts/wiki.sh pull/commit/push, and release.sh expects
a clean working tree pointed at main).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 13:01:43 +02:00
Alan Wizemann 799332fbcd feat(hermes-v12): iOS catch-up — Webhooks/Plugins/Profiles read-only + version banner (Phase H)
Closes the iOS read-only inspection gap on three CLI-driven Hermes
surfaces and adds a Hermes-version banner so mobile users on remote
v0.11 hosts see the upgrade nudge inline.

Components:

- Scarf iOS/Components/HermesVersionBanner.swift — yellow banner shown
  on the Dashboard when the active server's HermesCapabilities returns
  detected==true && hasCurator==false. One-tap session dismiss; comes
  back on next app open. Lists the v0.12 capabilities the user is
  missing out on (curator, multimodal, new providers).

- Scarf iOS/Webhooks/WebhooksView.swift — read-only list rendered from
  `hermes webhook list`. The tolerant block parser mirrors the Mac
  WebhooksViewModel shape so future drift can be fixed in one canonical
  place if/when the parser is promoted into ScarfCore. Detects the
  "platform not enabled"
  state and shows a setup-required pane instead of synthesizing rows
  from instructional text.

- Scarf iOS/Plugins/PluginsView.swift — filesystem-first scan over
  `~/.hermes/plugins/<name>/` with plugin.json / plugin.yaml manifest
  reads (mirrors the Mac VM). Enabled/disabled badge, version, source.
  Uses HermesYAML.parseNestedYAML / stripYAMLQuotes from ScarfCore
  (already public).

- Scarf iOS/Profiles/ProfilesView.swift — `hermes profile list` text
  parser with active-profile highlighting from
  `~/.hermes/active_profile`. Defensively handles both Rich box-drawn
  table output and plain-text fallback.

ScarfGoTabRoot's System tab gains an "Inspect" section with the three
new NavigationLinks. None are capability-gated — the underlying
list verbs exist on both v0.11 and v0.12, so the read views work
against either Hermes version without surprises.

Tests: 215 ScarfCore tests pass; both Mac and iOS schemes build clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 12:58:28 +02:00
Alan Wizemann 7a833b6c5a feat(hermes-v12): Cron workdir + Microsoft Teams + Yuanbao + read-only Kanban (Phase G)
Mac-only Phase G surfaces. Three additions:

Cron — `--workdir` flag (v0.12+):

- HermesCronJob carries `workdir: String?` and `contextFrom: [String]?`
  fields (the latter is read-only from CLI today; YAML-only chaining).
- FormState.workdir; CronJobEditor adds an absolute-path field;
  CronViewModel.createJob/updateJob forward `--workdir` when set,
  omit it when blank so v0.11 hosts (which don't know the flag) keep
  working unchanged.

Platforms — Microsoft Teams + Yuanbao (v0.12+):

- KnownPlatforms gains the two new platform identifiers + icons.
- PlatformsView adds inline read-only setup panels for each since the
  full setup flow lives outside Scarf (OAuth dance for Yuanbao, plugin
  install for Teams). Both panels surface the type, the recommended
  setup command, and the current configured/connected status the
  existing connectivity probe already understands.

Kanban — read-only list (v0.12+):

- HermesKanbanTask Sendable Codable model mirroring
  `_task_to_dict` in hermes_cli/kanban.py.
- KanbanViewModel polls `hermes kanban list --json` every 5s while the
  view is foregrounded; status filter dropdown maps to `--status`.
  Empty list and "no matching tasks" text outputs both render the
  empty state cleanly.
- KanbanView: page header + status badges + meta chips
  (id/assignee/workspace/skills) per row. No create/claim/dispatch UI
  — multi-profile collaboration was reverted upstream while the
  design is reworked, so v2.6 ships read-only and defers the editor
  to v2.7+.
- AppCoordinator.SidebarSection.kanban + ContentView routing.
  SidebarView's capability-aware `sections` filters out the row when
  `HermesCapabilities.hasKanban` is false.

Tests: 215 ScarfCore tests pass; both Mac and iOS schemes build clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 12:54:38 +02:00
Alan Wizemann 6954f0276a feat(hermes-v12): Settings deltas — cache TTL, redaction, runtime footer, Piper, Vercel (Phase F)
Surfaces the v0.12 config knobs that landed without their own dedicated
UI elsewhere:

- prompt_caching.cache_ttl picker (5m default, 1h opt-in) — reduces
  cache writes on long agent loops with stable system prompts.
- redaction.enabled toggle — Hermes flipped this off by default in
  v0.12 because the substitution corrupted patches; security-sensitive
  users can flip it back on here.
- agent.runtime_metadata_footer toggle — opt-in compact footer on each
  final reply (provider/model/cost/turn count).
- TTS provider list gains "piper" — native local TTS engine new in
  v0.12.
- Terminal backend list gains "vercel" — Vercel Sandbox backend for
  execute_code/terminal added in v0.12.

The new "Caching & Redaction" section in AdvancedTab is gated on
HermesCapabilities.hasPromptCacheTTL — pre-v0.12 hosts don't see
toggles that would write keys Hermes ignores. The Piper + Vercel
options ride along unconditionally because Hermes silently accepts
unknown values and falls back to safe defaults.

Model + parser:

- HermesConfig grows three optional scalar fields (cacheTTL: String,
  redactionEnabled: Bool, runtimeMetadataFooter: Bool). All three
  have init defaults so existing call sites — including
  HermesConfig.empty — keep compiling.
- Both YAML readers (HermesFileService for Mac, HermesConfig+YAML for
  the package) now parse the new keys with v0.12-defaults.

Tests: 215 ScarfCore tests pass; both Mac and iOS schemes build clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 12:47:54 +02:00
Alan Wizemann ee3791a1b2 feat(hermes-v12): Skills v0.12 surface — URL install + reload + pin/disable badges (Phase E)
Hermes v0.12 added four skills surfaces Scarf can now reach:

- Direct-URL install: `hermes skills install <https://...>` lets users
  pull a one-off skill without going through a registry. Mac SkillsView
  grew an "Install from URL…" toolbar button (capability-gated on
  HermesCapabilities.hasSkillURLInstall) opening a sheet with the URL
  field plus optional --category / --name overrides.
- Reload: `hermes skills audit` rescans `~/.hermes/skills/` and refreshes
  the agent's view of available skills without restarting. Wired to a
  "Reload" toolbar button next to the install button on Mac.
- Enabled state: skills.disabled in config.yaml is now read at scan time
  (SkillsViewModel.readDisabledSkillNames). Disabled skills render
  strikethrough + an "OFF" pill on Mac and iOS rows so users see what
  Hermes won't load. iOS detail view explains the state in plain text.
- Curator pin badge: pinned-skill names from
  `~/.hermes/skills/.curator_state` (SkillsViewModel.readPinnedSkillNames)
  surface as a pin glyph on each row. Mac sidebar + iOS list both show
  it; iOS detail view explains "pinned by curator — won't auto-archive."

Model + scanner:

- HermesSkill gains `enabled: Bool` (default true) and `pinned: Bool`
  (default false). Both default to backwards-compatible values so
  unmodified call sites keep compiling.
- SkillsScanner.scan now takes optional `disabledNames` and
  `pinnedNames` sets and applies them per skill at scan time.
- SkillsViewModel.load auto-fetches both sets internally so Mac/iOS
  callers don't have to plumb curator state manually; an opt-in
  `pinnedNames` override is available for the Curator screen which
  has a fresher snapshot in hand.

Tests: 215 ScarfCore tests pass; both Mac and iOS schemes build clean.

Note: the disable-toggle path (writing the array back into
config.yaml) is deferred to v2.7 — Hermes ships
`hermes skills config` as an interactive verb only, and we'd rather
read accurately than risk clobbering the user's list with a
half-tested write path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 12:44:15 +02:00
Alan Wizemann 686fb37630 feat(hermes-v12): Curator feature module on Mac + iOS (Phase D)
Hermes v0.12 ships an autonomous Curator that prunes / consolidates
agent-created skills on a 7-day cycle. This phase brings that surface
into Scarf so users can see status, trigger runs, pin protected skills,
and restore archived ones.

Pipeline:

- HermesCuratorStatus + HermesCuratorSkillRow: Sendable value types for
  parsed status + per-skill leaderboard rows.
- HermesCuratorStatusParser: pure text parser for `hermes curator status`
  stdout (no `--json` flag exists upstream). Tolerates Hermes's
  whitespace-padded leaderboard layout (`activity=  0` with N spaces
  between `=` and the value) by slicing between known key positions
  rather than splitting on whitespace. State-file JSON overrides
  text-parsed values for last_run_at / last_run_summary /
  last_report_path because the file carries full ISO timestamps the
  text output may have rounded.
- CuratorViewModel: @Observable @MainActor, drives the CLI verbs
  (status / run / pause / resume / pin / unpin / restore) via
  transport.runProcess so it works equally over local and Citadel SSH.
- HermesPathSet: adds curatorLogsDir + curatorStateFile (the latter
  is `.curator_state` with no extension despite holding JSON).
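
The "slice between known key positions" idea from the parser bullet
can be sketched like this. `sliceFields` is a hypothetical helper, and
the `uses` key and the sample row in the test are invented for
illustration; only the `activity=  0` padding comes from the
description above.

```swift
import Foundation

// Finds each `key=` marker, then takes the text between one marker and
// the next as that key's value. Text before the first marker is the name.
func sliceFields(_ line: String, keys: [String]) -> [String: String] {
    var found: [(key: String, range: Range<String.Index>)] = []
    for key in keys {
        if let range = line.range(of: key + "=") {
            found.append((key: key, range: range))
        }
    }
    found.sort { $0.range.lowerBound < $1.range.lowerBound }

    var result: [String: String] = [:]
    for (i, entry) in found.enumerated() {
        let end = i + 1 < found.count ? found[i + 1].range.lowerBound : line.endIndex
        result[entry.key] = line[entry.range.upperBound..<end]
            .trimmingCharacters(in: .whitespaces)
    }
    if let first = found.first {
        result["name"] = line[..<first.range.lowerBound]
            .trimmingCharacters(in: .whitespaces)
    }
    return result
}
```

Because values are cut at marker positions rather than at whitespace,
padded values and multi-word names both survive.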

Mac:

- Features/Curator/Views/CuratorView.swift — page-header + status card
  + skill counts + pinned chips + 3 leaderboard tables (least recent,
  most active, least active) with inline pin toggles and a
  per-skill counter chip row. "Run Now" button + a kebab menu for
  Pause/Resume + Restore Archived.
- Features/Curator/Views/CuratorRestoreSheet.swift — name-entry sheet
  for `hermes curator restore <skill>`. Free-form text field; Hermes
  doesn't ship a `curator list-archived` yet so we don't synthesize a
  picker.
- Sidebar: AppCoordinator + SidebarView gain a `.curator` case under
  Interact (between Memory and Skills); the row is filtered out by
  SidebarView's capability-aware `sections` computed property when
  `HermesCapabilities.hasCurator` is false. ContentView routes
  `.curator` to CuratorView. Pre-v0.12 hosts see the v0.11 sidebar
  unchanged.

iOS:

- Scarf iOS/Curator/CuratorView.swift — read-mostly List with the same
  status / skill counts / pinned / leaderboards + inline pin toggles.
  Run Now / Pause / Resume actions in the section footer.
- ScarfGoTabRoot's System tab gains a Curator NavigationLink under
  Features, gated on `hasCurator`. Uses a stable
  `systemTabContextID` so the SSH transport pool reuses the cached
  Citadel connection keyed by that id.

Tests: 6 new parser tests (215 total, all green). Locks the empty-state
output captured from a real v0.12.0 install + paused-state + state-file
override + multi-word-name-row parsing. Both Mac and iOS schemes build
clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 12:37:48 +02:00
Alan Wizemann 1354568992 feat(hermes-v12): ACP multimodal image input on Mac + iOS (Phase C)
Hermes v0.12 advertises `prompt_capabilities.image = true` and accepts
image content blocks in `session/prompt`. This wires a producer flow on
both targets so users can attach images alongside text and have them
routed to the vision-capable model automatically.

Pipeline:

- ChatImageAttachment: Sendable value type holding base64 payload +
  thumbnail, MIME type, source filename, and approximate byte count.
- ImageEncoder: detached-only Sendable service that downsamples to
  Anthropic's 1568px long-edge cap, JPEG-encodes at q=0.85, and
  produces a small inline thumbnail for composer chips. Cross-platform
  (NSImage on Mac, UIImage on iOS, JPEG-passthrough on Linux/CI).
- ACPClient.sendPrompt(sessionId:text:images:) overload emits a content
  array `[{type: "text"...}, {type: "image", data, mimeType}]` matching
  the wire shape in hermes-agent/acp_adapter/server.py. The
  zero-arg-images convenience overload preserves the v0.11 wire shape
  for any unmodified callers.

Mac UI:

- RichChatInputBar grew an `attachments: [ChatImageAttachment]` state
  array, a paperclip button (NSOpenPanel multi-pick), drag-drop and
  paste handlers, and a horizontal preview chip strip. The "send"
  callback's signature is `(String, [ChatImageAttachment]) -> Void`
  threaded through RichChatView -> ChatTranscriptPane -> ChatView ->
  ChatViewModel.sendText(text, images:). Image-only prompts are
  permitted ("describe this") once at least one attachment is queued.

iOS UI:

- ChatView's composer adopts a paperclip + PhotosPicker flow with the
  same chip strip and 5-attachment cap. Attachments live on
  ChatController so they survive across PhotosPicker presentations.
  loadTransferable(type: Data.self) feeds raw bytes into the same
  ImageEncoder; encode work runs detached so MainActor stays
  responsive on cellular.

Capability gating:

- Both composers hide the entire attachment surface when
  HermesCapabilities.hasACPImagePrompts is false (pre-v0.12 hosts).
  No paperclip button, no drop target, no paste accept — the input bar
  is byte-for-byte the v0.11 surface against an older Hermes.

Tests: 209 ScarfCore tests pass; both Mac and iOS schemes build clean.
The encoder's pixel work is hard to unit-test at the package level
(no NSImage/UIImage in plain Swift CI) — manual end-to-end testing
is the verification path here.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 12:28:41 +02:00
Alan Wizemann da721fa276 feat(hermes-v12): provider catalog + auxiliary swap (Phase B)
Adds the five v0.12 inference providers to ModelCatalogService.overlayOnlyProviders
so the model picker reaches them. IDs match HERMES_OVERLAYS verbatim:

- gmi → GMI Cloud (api_key)
- azure-foundry → Azure AI Foundry (api_key)
- lmstudio → LM Studio (api_key, promoted from custom-endpoint alias)
- minimax-oauth → MiniMax (OAuth, oauth_external)
- tencent-tokenhub → Tencent TokenHub (api_key)

Auxiliary tasks: drop the `flush_memories` row (Hermes removed it
entirely in v0.12) and add `auxiliary.curator` so users can configure
the model the autonomous curator's review fork uses. The Curator row is
gated on HermesCapabilities.hasCuratorAux, so v0.11 hosts don't see a
control that writes a key Hermes ignores. AuxiliarySettings, the YAML
parser, and HealthViewModel's Tool Gateway breakdown are all updated.

Side fixes:

- CredentialPoolsGatingTests was missing `import ScarfCore` after
  ModelCatalogService moved to the package (broke the test target's
  compile against pure-Mac scarf).
- Promoted `ModelCatalogService.overlayOnlyProviders` to public so the
  new `v012OverlayProvidersCarryCorrectAuthTypes` lock-in test can
  reach it.

Tests: 14 ToolGateway tests pass; 209 ScarfCore tests pass; both Mac
and iOS schemes build clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 12:16:37 +02:00
Alan Wizemann a90a29add8 feat(hermes-v12): version-aware capability detection (Phase A)
Introduces `HermesCapabilities` (parsed from `hermes --version`) and a
per-server `HermesCapabilitiesStore` injected into Mac `ContextBoundRoot`
and iOS `ScarfGoTabRoot` via `.environment(_:)` and `.hermesCapabilities`.
Subsequent v0.12-targeted UI (Curator, Kanban, ACP image input,
auxiliary.curator, prompt cache TTL, etc.) can branch on these flags so
older Hermes installs degrade silently instead of throwing on unknown CLI
subcommands.

Adds `curatorReportJSON` / `curatorReportMD` paths to `HermesPathSet`.

Bumps the Hermes version target in CLAUDE.md from v2026.4.23 (v0.11.0) to
v2026.4.30 (v0.12.0) and lists the v0.12 surfaces Scarf will consume.

Side fixes:

- `M5FeatureVMTests.ScriptedTransport` was missing
  `cachedSnapshotPath` after that property was added in 7b864d7;
  added `URL? { nil }` stub.
- `M0dViewModelsTests` referenced `.degraded(reason:)` after the case
  gained `hint` + `cause`; updated.
- `RemoteBackupService.zipDirectory` and `RemoteRestoreService.unzipArchive`
  used `Foundation.Process` unconditionally, breaking the iOS build
  (Process is unavailable on iOS). Wrapped in `#if !os(iOS)` with iOS
  stubs that throw — the backup/restore flow is Mac-only by design.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 12:10:06 +02:00
Alan Wizemann 421e6030df fix(dashboard): shadow Hermes-home consolidation actually clears the warning
The "Project-local Hermes home shadowing global setup" banner has a
"Copy fix command" button that produced a one-liner the user could
paste on the remote. The old command only `cp`'d the project's
`auth.json` into the global `~/.hermes/`; it never touched the
project-local `.hermes/` directory. Hermes' CLI binds to the
*closest* `.hermes/` as `$HERMES_HOME`, so the directory still being
there meant it still shadowed — the detector's
`fileExists(<project>/.hermes)` correctly kept returning true and
the warning didn't go away after the user "fixed" it. They got
stuck.

Fix: rename the project-local `.hermes/` to
`.hermes.scarf-bak.<UTC-stamp>/` after the auth copy. Hermes scans
for a directory literally named `.hermes`, so the rename is enough
to stop binding without losing user data — `state.db`, sessions,
skills all survive untouched in the renamed folder. The user can
inspect / delete the `.bak` later when confident. `mv` over
`rm -rf` because a project's shadow can hold uncommitted session
history; deletion would be unrecoverable, the rename is reversible.
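
A minimal sketch of the copy-then-rename sequence described above, run against throwaway directories. The function name `consolidate_shadow`, its arguments, and the exact timestamp format are assumptions for illustration; the commit only specifies the `.hermes.scarf-bak.<UTC-stamp>/` naming and the `cp` followed by `mv` order.

```bash
# consolidate_shadow PROJECT_DIR GLOBAL_HERMES_HOME
# Copies auth.json (if present) into the global home, then renames the
# project-local .hermes/ so Hermes no longer binds to it. Prints the new path.
consolidate_shadow() {
  local project="$1" global_home="$2"
  local shadow="$project/.hermes" stamp backup
  [ -d "$shadow" ] || return 0              # no shadow, nothing to do
  mkdir -p "$global_home"
  if [ -f "$shadow/auth.json" ]; then
    cp "$shadow/auth.json" "$global_home/auth.json"
  fi
  stamp="$(date -u +%Y%m%dT%H%M%SZ)"        # assumed stamp format
  backup="$project/.hermes.scarf-bak.$stamp"
  mv "$shadow" "$backup"                    # rename, never rm -rf
  printf '%s\n' "$backup"
}

# Demo in a temp tree so nothing real is touched.
demo_root="$(mktemp -d)"
mkdir -p "$demo_root/project/.hermes" "$demo_root/home/.hermes"
echo '{}' > "$demo_root/project/.hermes/auth.json"
echo 'db' > "$demo_root/project/.hermes/state.db"

renamed="$(consolidate_shadow "$demo_root/project" "$demo_root/home/.hermes")"
echo "shadow moved to: $renamed"
```

Note that this sketch overwrites an existing global `auth.json`; whether the real fix command does the same is not stated in the commit.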

Also removes the `if shadow.hasAuthJSON` gate around the "Copy fix
command" button — a state-only shadow (no creds, just `state.db`)
still binds as `$HERMES_HOME` and needs the same rename to clear
the warning. The button now always shows; the help-tooltip text
branches on `hasAuthJSON` to describe what the command will do.

Help-text now spells out the rename so the user knows where their
data went before they paste anything.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 17:51:33 +02:00
Alan Wizemann 7b864d77d5 feat(servers): backup + restore for any Scarf server
Adds an end-to-end "back up this server's full Hermes state" flow
with a verifiable archive format and a matching restore that pushes
it onto a fresh droplet. Tested against a 570 MB local Hermes home
+ 5 projects, then iterated against a real DigitalOcean droplet.

Architecture
- `.scarfbackup` is a ZIP containing `manifest.json` (schema v1,
  source server + hermes version + per-tarball SHA-256), one
  `hermes.tar.gz` (gzipped tar of `~/.hermes/`), and one
  `projects/<id>.tar.gz` per registered project. Streams via
  `tar -czf - …` over SSH; never buffers a full archive in memory.
- New `streamRawBytes(executable:args:)` on `ServerTransport`
  (Local + SSH impls) yields binary `Data` chunks. `streamLines`
  splits on `\n` and would corrupt tar output — needed a
  binary-safe sibling.
- `RemoteBackupService` runs preflight (resolves $HOME, probes
  hermes version, enumerates projects via the existing
  `ProjectDashboardService`, sizes each via `du -sb`, checks for
  `sqlite3`), optionally runs `PRAGMA wal_checkpoint(TRUNCATE)`
  to quiesce state.db, streams each tarball with incremental
  SHA-256, then ZIP-bundles via `/usr/bin/zip`. Atomic
  temp-then-rename so a partial archive never appears at the
  user-chosen destination.
- `RemoteRestoreService` unzips into a temp dir, validates the
  manifest's `kind` magic + `schemaVersion`, hash-verifies every
  inner tarball BEFORE pushing any bytes to the target, then
  streams each tarball into `tar -xzf - -C …` over SSH stdin.
  Post-restore: rewrites `~/.hermes/scarf/projects.json` with
  source→target path mappings via a small `python3 -c` script,
  and pauses every cron job (`enabled: false`) so restored jobs
  don't surprise-fire on a fresh droplet.
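
The stream-and-hash step can be approximated in shell as below. This is a local analogue under stated assumptions, not the Swift implementation: `hash_cmd` is a hypothetical helper that picks whichever SHA-256 tool is installed, and the temp directories stand in for `~/.hermes/` and the staging area.

```bash
# hash_cmd: read stdin, print "<sha256>  -" using whichever tool exists.
hash_cmd() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum
  else
    shasum -a 256
  fi
}

src="$(mktemp -d)"      # stand-in for the directory being backed up
stage="$(mktemp -d)"    # stand-in for the local staging directory
echo 'hello' > "$src/state.txt"

# tar writes the gzipped archive to stdout; tee persists the bytes while the
# same stream is hashed, so the archive is never held in memory as a whole.
stream_digest="$(tar -czf - -C "$src" . | tee "$stage/hermes.tar.gz" | hash_cmd | cut -d' ' -f1)"

# Restore-side check: re-hash the file on disk and compare before extracting.
disk_digest="$(hash_cmd < "$stage/hermes.tar.gz" | cut -d' ' -f1)"

if [ "$stream_digest" = "$disk_digest" ]; then
  echo "tarball verified: $stream_digest"
else
  echo "hash mismatch, refusing to restore" >&2
fi
```

Hashing the stream as it passes means the manifest digest describes exactly the bytes that were written, which is what lets restore verify every tarball before pushing anything to the target.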

Defaults + safety
- Excluded from the backup unless explicitly opted in:
  `auth.json` (provider creds), `mcp-tokens/` (per-host OAuth),
  `logs/`. Always excluded: `state.db-{wal,shm}`,
  `gateway_state.json`, and standard project junk
  (`node_modules`, `.venv`, `.git/objects`, `__pycache__`,
  `.next`, `dist`).
- Manifest records `options.includeAuth/includeMcpTokens/
  includeLogs/checkpointedWAL` honestly so restore can warn
  the user about what they'll need to re-establish manually.
- All paths are tilde-expanded against the resolved remote
  `$HOME` before being passed to `tar`/`sqlite3`.
  `tar -C '~/projects'` would otherwise fail with
  "No such file or directory" because `shellQuote` wraps the
  path in single quotes and tar doesn't expand tildes itself.
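
A small sketch of the tilde-expansion step, assuming a hypothetical helper named `expand_tilde` that takes the already-resolved remote home as an explicit argument. It shows why the expansion has to happen before quoting: a single-quoted `~` is just a literal character to `tar`.

```bash
# expand_tilde PATH HOME_DIR
# Resolves a leading "~" or "~/" against HOME_DIR; other paths pass through.
# The patterns are quoted so the shell does not expand the tilde itself.
expand_tilde() {
  local path="$1" home="$2"
  case "$path" in
    "~")   printf '%s\n' "$home" ;;
    "~/"*) printf '%s\n' "$home/${path#"~/"}" ;;
    *)     printf '%s\n' "$path" ;;
  esac
}

remote_home='/home/alice'                       # example value only
resolved="$(expand_tilde '~/projects' "$remote_home")"
echo "tar -C '$resolved' ..."                   # safe to single-quote now
```

Passing the home directory in explicitly, rather than reading the local `$HOME`, matters when the command is built on one machine and run on another.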

UI
- Per-row ellipsis menu on `ManageServersView` consolidates
  Back Up… / Restore from Backup… / Diagnostics… / Remove…,
  keeping the row visually clean as actions grow. Local server
  gets Back Up + Restore (no Remove or Diagnostics).
- `BackupServerSheet` walks loading → ready (size + project
  list + auth/logs toggles) → running (byte-counter progress
  per stage) → done (Show in Finder) | failed (Try again).
- `RestoreServerSheet` walks awaitingFile → inspecting →
  ready (source-vs-target preview, projects-root chooser,
  cron-pause toggle, "auth was excluded" notes) → running →
  done | failed.
- Both view models use a `WeakBox` two-step capture pattern so
  the @Sendable progress callback hops back into MainActor
  without the Swift 6 var-self warning on nested closures.

Cleanup folded in
- Drops two no-op `await`s on sync `startReaders()` in
  `ProcessACPChannel` (warning surfaced after the Phase 1 ACP
  changes; cleanest to fix in the same Transport-layer touch).

Verified
- Local round-trip via a Swift CLI harness:
  preflight → backup → unzip listing matches manifest →
  on-disk SHA-256 matches manifest claim for every tarball.
- Real DigitalOcean droplet: backup completes after the
  tilde-expansion fix; restore preserves projects + sessions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 17:51:10 +02:00
Alan Wizemann 11946aad67 feat(remote): legible SSH/ACP failures + servers.json export/import
A vanished or misconfigured remote surfaced as an opaque 30s
"ACP request 'initialize' timed out" because the channel's EOF
fired with no exit code or stderr context, and `sh -c` on the
remote couldn't find pipx-installed `hermes` on PATH. This makes
remote failure modes immediately legible and adds a recovery path
for the server registry itself.

- `ACPClientError.processTerminated` now carries exit code + stderr
  tail; `performDisconnectCleanup` reads them from the channel
  before failing pending requests, and `ACPErrorHint.classify`
  recognises Connection refused, Operation timed out, Permission
  denied (publickey), Host key verification failed, Could not
  resolve hostname, and exit 127 / command not found.
- `ProcessACPChannel.terminationHandler` closes the stdout read
  end the moment the OS reaps the child so disconnect cleanup
  fires within ~1s instead of waiting on `availableData`.
  `lastExitCode` reads `Process.terminationStatus` directly to
  avoid an actor-handshake race.
- `SSHTransport.makeProcess` / `streamLines` switch from `sh -c`
  to `bash -lc` so non-interactive SSH shells source the user's
  profile and pick up pipx (`~/.local/bin`), Linuxbrew, asdf,
  and conda PATH entries.
- New `ServerRegistry.exportFile()` / `importEntries(from:)` with
  a `.scarfservers` JSON envelope (schema v1, dedupe by UUID,
  default-server flag preserved). UI in `ManageServersView`'s
  header menu surfaces Export… / Import… via NSSave/OpenPanel.
  No secrets travel — `identityFile` is a path string and SSH
  keys live in `~/.ssh/`, not in `servers.json`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 16:04:14 +02:00
Alan Wizemann 4140983866 feat(site): marketing landing page for Mac + ScarfGo
Replace the gh-pages root placeholder with a real landing page that
sells both apps. Sources live at site/landing/ and publish through a
new scripts/site.sh that mirrors scripts/catalog.sh and scripts/wiki.sh
(check / build / preview / serve / publish, two-pass secret-scan, only
touches root files + assets/ on gh-pages so appcast.xml and templates/
stay disjoint).

The page uses rust-palette tokens mapped from ScarfDesign, semantic HTML,
SEO + AEO infra (OpenGraph, Twitter cards, JSON-LD SoftwareApplication
+ MobileApplication + FAQPage, llms.txt, sitemap, manifest), a 12-entry
FAQ, and light/dark theming via prefers-color-scheme plus a manual toggle
that swaps both site chrome and screenshot variants. tools/og-image.html
renders the 1200x630 OG / 1200x600 Twitter cards via headless Chromium.

Real captures from the live Mac app (9 surfaces x light + dark) +
existing ScarfGo screenshots round out the imagery.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 14:41:37 +02:00
Alan Wizemann cca99d4e13 chore: Bump version to 2.5.2
2026-04-29 13:36:53 +02:00
Alan Wizemann 2aab9dac07 feat: chat-start preflight, Nous catalog, remote-aware admin sheets
Three feature batches that were in progress on chat-resilience —
all aligned with v2.5.2's remote-context theme.

## Chat-start model preflight

When a chat-start hits a server whose config.yaml has no
model.default / model.provider, the upstream provider returns an
opaque "Model parameter is required" 400 only AFTER the user types
a prompt and hits send. New ModelPreflight in ScarfCore catches the
missing keys before any ACP work; ChatView presents the existing
ModelPickerSheet via a thin ChatModelPreflightSheet wrapper so the
picker / validation / Nous-catalog branch stay single-sourced.
ChatViewModel persists the selection via `hermes config set` and
replays the original startACPSession arguments — the chat the user
originally opened lands without re-clicking the project row.

## Nous Portal live catalog

NousModelCatalogService fetches `GET /v1/models` from
inference-api.nousresearch.com using the bearer token in
`auth.json`, caches to `~/.hermes/scarf/nous_models_cache.json`
(new path on HermesPathSet) with a 24h TTL. Picker's nous-overlay
detail switches from a free-form TextField to a real model list,
with a "Custom…" escape hatch (nousManualEntry) for IDs not yet in
the API response.

## Remote-aware admin sheets (mirror of #54's pattern)

The Add Project sheet got context-aware Verify in v2.5.1 (#54);
this batch extends the same shape to three more sheets:

- Profiles: remote import/export. ProfilesView gains
  showRemoteImportSheet + pendingRemoteExport state; reuses the
  same path-input + verify + run-via-hermes pattern from
  AddProjectSheet. Drives `hermes profile import <zip>` /
  `hermes profile export <name> <zip>` over SSH.
- Backup restore (Settings → Advanced): pickLocalBackupZip + new
  RemoteBackupPathSheet so the Restore action picks a local zip
  on local contexts and verifies a remote path on remote contexts.
- Template install destination: TemplateInstallSheet's parent-
  directory picker now branches on context. ParentDirectoryStep
  with browseLocalDirectory + verifyRemotePath + RemoteVerification
  — same UX vocabulary as AddProjectSheet, applied to where the
  template gets installed.

Plus a `runHermesWithStdin` helper on HermesFileService for the
profile import flow (passing zip bytes through stdin rather than
landing them on the remote disk first), and ProjectTemplateInstaller
gains a remote-path-aware code path for the install destination.

## Localizations

Localizable.xcstrings adds strings for all the new copy across
seven supported locales (en, zh-Hans, de, fr, es, ja, pt-BR).
2026-04-29 13:27:25 +02:00
Alan Wizemann c31dfccb9b fix(ios-chat): move keyboard-dismiss chevron to leading edge (#57)
The keyboard accessory dismiss button added in #51 was placed at
the trailing edge of the keyboard toolbar (Spacer before Button),
which sits directly above the trailing-edge send button in the
composer below. Two near-identical-shape controls visually stack
on the right edge of the screen, confusing users about which is
which.

Move the Spacer() to AFTER the Button so the chevron lives at the
leading edge of the keyboard accessory bar — visually separated
from the send button below, and matches the iOS convention (Notes,
Mail, Reminders all put accessory dismiss on the leading side).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 13:22:51 +02:00
Alan Wizemann 61e61f556a feat(chat): hideable sessions + inspector panes for the Mac chat (#58)
The 3-pane layout (264px sessions list + transcript + 320px inspector)
ate ~584px of horizontal space on every chat window — squeezing the
actual transcript on smaller windows AND keeping the "No tool selected"
empty-state visible even when irrelevant. User reported that as
"reasoning, in/out, hard to read because of the tool selected box
taking so much space".

Add toolbar toggles + Settings parity to hide either side pane:

- Two new @AppStorage keys in ChatDensitySettings:
    scarf.chat.showSessionsList (default true)
    scarf.chat.showInspector    (default true)
- ChatView toolbar gains two buttons next to the View picker:
  sidebar.left toggles the sessions list, sidebar.right toggles the
  inspector. Both highlight in accent color when visible. Hidden when
  in terminal mode (the 3-pane layout doesn't apply there).
- RichChatView body conditionally renders each side pane and its
  divider, with .transition(.move + .opacity) and a 180ms easeInOut
  animation so the transcript reflows smoothly rather than snapping.
- Auto-show inspector when a tool card is focused so a click never
  silently dies — onChange of focusedToolCallId flips
  showInspector back on if it was off. The slide-in animation
  covers the visual transition.
- DisplayTab → Chat density gains parity Toggle rows for "Sessions
  list" and "Tool inspector" — same group as the existing density
  pickers from #47/#48 so the settings home is consistent.

Defaults match today's behavior so existing users see no change
until they opt out.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 13:22:51 +02:00
Alan Wizemann 424711c3d9 fix(ios-snapshot): harden Citadel state.db snapshot path (#56)
Reported on iOS: dashboard shows "Connection issue /
Citadel.SSHClient.CommandFailed error 1", memory files (USER.md, SOUL.md) load
fine but Sessions / Activity / Tool Calls all show 0. The snapshot
operation that pulls ~/.hermes/state.db over SFTP via `sqlite3
.backup` was failing on the remote, but the iOS user got zero
actionable context.

Two latent bugs in CitadelServerTransport.asyncSnapshotSQLite —
both fixed in v2.5.0 for asyncRunProcess but missed on this path:

1. `executeCommand` throws CommandFailed on non-zero exit AND
   discards the captured stderr buffer. So when sqlite3 is missing
   (slim Docker images, statically-linked installs) or state.db
   doesn't exist, the user only saw "error 1" and a generic
   connection-issue banner with no remediation.

2. No `PATH=...` prefix. asyncRunProcess inline-prepends
   `PATH="$HOME/.local/bin:/opt/homebrew/bin:/usr/local/bin:$PATH"`
   so bare command resolution works on Citadel's stripped-PATH
   exec channel; the snapshot path didn't, so any sqlite3 install
   outside /usr/bin failed at exit 127 ("command not found").

Mirror the asyncRunProcess hardening on the snapshot path:

- Prepend the same PATH prefix so sqlite3 resolves on hosts where
  it lives at /usr/local/bin or /opt/homebrew/bin.
- Drive `executeCommandStream` instead of `executeCommand`.
  Capture stdout + stderr regardless of exit code.
- On non-zero exit, throw an NSError carrying the real stderr (or
  stdout if stderr is empty — sqlite3 sometimes errors via stdout
  depending on the remote shell). HermesDataService.humanize
  already keys off "sqlite3: command not found" /
  "permission denied" / "no such file" substrings, so once the
  real message reaches it the dashboard banner becomes actionable
  ("sqlite3 is not installed on <host>. Install with apt install
  sqlite3..." instead of the generic CommandFailed error).
- When the stream itself fails to start (network/auth-level), throw
  with a "Failed to start snapshot stream" message so the connect-
  level error path is distinguishable from the remote-exec failure.
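
The hardening above can be pictured with a shell sketch: run the command with the PATH prefix, keep both output streams regardless of exit code, and surface stderr (or stdout when stderr is empty) as the error message. The function names and the `LAST_STATUS` / `LAST_MESSAGE` variables are hypothetical; only the PATH prefix is quoted from the commit.

```bash
# run_with_path_prefix CMD...: run CMD with the enriched PATH in a subshell
# so the caller's PATH is left untouched.
run_with_path_prefix() {
  (
    PATH="$HOME/.local/bin:/opt/homebrew/bin:/usr/local/bin:$PATH"
    export PATH
    "$@"
  )
}

# run_and_describe CMD...: capture exit code plus the most useful message.
run_and_describe() {
  local err_file out status
  err_file="$(mktemp)"
  if out="$(run_with_path_prefix "$@" 2>"$err_file")"; then
    status=0
  else
    status=$?
  fi
  LAST_STATUS="$status"
  LAST_MESSAGE="$(cat "$err_file")"
  # Some tools report errors on stdout; fall back to it when stderr is empty.
  [ -n "$LAST_MESSAGE" ] || LAST_MESSAGE="$out"
  rm -f "$err_file"
}

# Simulated failures standing in for the remote sqlite3 call.
run_and_describe sh -c 'echo "sqlite3: command not found" >&2; exit 127'
stderr_status="$LAST_STATUS"; stderr_message="$LAST_MESSAGE"

run_and_describe sh -c 'echo "Error: unable to open database"; exit 1'
stdout_status="$LAST_STATUS"; stdout_message="$LAST_MESSAGE"

echo "[$stderr_status] $stderr_message"
echo "[$stdout_status] $stdout_message"
```

With the real text preserved, a downstream classifier that matches substrings such as "command not found" has something to match against instead of a bare exit code.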

iOS-only — Mac path was already correct.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 13:22:51 +02:00
Alan Wizemann 067aeda878 fix(catalog): async catalog reads — unfreezes Model + Credential sheets (#59)
Two views called ModelCatalogService.loadProviders() synchronously
from .onAppear on the MainActor:

- ModelPickerSheet (Settings → Model)
- AddCredentialSheet (Credential Pools → +)

loadProviders() walks loadCatalog() → transport.readFile() of
~/.hermes/models_dev_cache.json — a multi-megabyte JSON with ~1500
models across ~110 providers. On a remote SSH context that's a
synchronous SSH file read on the main thread; the user's reported
1–2 minute UI freeze on first open is exactly that. Even on local
contexts the JSONDecoder pass on the main thread is a noticeable
hiccup. Direct violation of CLAUDE.md's rule against sync I/O on
@MainActor.

Compound case: ModelPickerSheet.loadModelsForSelection() did the
same sync read every time the user clicked a different provider in
the picker — re-froze the UI per click.

Fix:
- Add async wrappers on the service:
    loadProvidersAsync()      -> [HermesProviderInfo]
    loadModelsAsync(for:)     -> [HermesModelInfo]
  Each wraps its sync counterpart via
  `await Task.detached { sync method }.value`. Existing sync methods
  stay for tests and any non-View consumers.
- ModelPickerSheet: replace .onAppear with .task; await both async
  calls. Same conversion for loadModelsForSelection() — renamed to
  loadModelsForSelectionAsync() and called from the provider-list
  selection binding via Task { ... }. Subscription state load also
  routed through Task.detached since it's another auth.json read
  that's tiny on local but SSH-backed on remote.
- AddCredentialSheet (CredentialPoolsView): same .onAppear → .task
  conversion with isLoadingProviders @State driving an overlay
  ProgressView "Loading providers..." while the read is in flight.

No behavior or data-shape change; pure I/O dispatch fix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 13:22:51 +02:00
Alan Wizemann 389620059c fix(credentials): recognize OAuth providers; warn on project-shadowed Hermes
Three related fixes for the "I authed Nous but Scarf doesn't see it" bug:

1. `hasAnyAICredential()` (HermesFileService) only probed the
   `credential_pool.<provider>` shape in auth.json. OAuth-authed providers
   land under `providers.<name>.access_token` instead — Nous, Spotify, GH
   Copilot ACP, Qwen, Gemini all use that path. The chat banner kept
   showing "No AI provider credentials" even after a successful Nous
   sign-in. Now we probe both shapes; refresh-only entries (pre-mint
   OAuth flows) also count.

2. `CredentialPoolsViewModel` decoded only `credential_pool.*` and
   ignored `providers.*` entirely. New `oauthProviders` array surfaces
   them in a parallel "OAuth providers" section above the rotation
   pools — read-only, with token tail, expiry badge, portal URL, and
   "managed by `hermes auth add`" footnote so users know where the
   write path lives.

3. New `ProjectHermesShadowDetector` (ScarfCore) probes each registered
   project for a `<project>/.hermes/` directory. Hermes' CLI binds to
   the closest `.hermes/` as `$HERMES_HOME` when run from inside such a
   project — `hermes auth add nous` lands in the project's auth.json
   instead of `~/.hermes/auth.json` and Scarf's global probes never
   see it. Surfaced as a yellow Dashboard banner listing affected
   projects with badges for `auth.json` / `state.db` presence and a
   "Copy fix command" button that emits a one-liner consolidating
   auth.json into the global home. Read-only — no auto-migration; the
   user decides what to keep.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 22:48:20 +02:00
Alan Wizemann 4ffd353835 fix(diagnostics): treat config.yaml absence as informational, not failure
Same root cause as the connection-pill fix in 511726e: Hermes v0.11+
doesn't materialize config.yaml until the user changes a setting from
defaults, so a healthy fresh install was reporting "12/14 passing"
forever even though everything that mattered worked.

Probe.Status becomes tri-state (.pass / .fail / .skipped). The shell
script emits SKIP for the "config.yaml absent" branch (Hermes creates
it lazily); only "exists but unreadable" still emits FAIL. The view
renders .skipped with a grey info-circle and excludes those probes
from the summary's denominator — "12/12 passing (2 optional skipped)"
instead of the misleading "12/14."
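
A rough sketch of the tri-state logic and the adjusted summary, assuming hypothetical helper names `probe_config` and `summarize` and a `PASS` token alongside the `SKIP` / `FAIL` tokens mentioned above. It is not the actual diagnostics script.

```bash
# probe_config FILE: absent is informational, unreadable is a real failure.
probe_config() {
  local cfg="$1"
  if [ ! -e "$cfg" ]; then
    echo "SKIP"        # Hermes creates config.yaml lazily
  elif [ ! -r "$cfg" ]; then
    echo "FAIL"        # exists but cannot be read
  else
    echo "PASS"
  fi
}

# summarize STATUS...: skipped probes are excluded from the denominator.
summarize() {
  local pass=0 skip=0 total=0 s
  for s in "$@"; do
    total=$((total + 1))
    case "$s" in
      PASS) pass=$((pass + 1)) ;;
      SKIP) skip=$((skip + 1)) ;;
    esac
  done
  echo "$pass/$((total - skip)) passing ($skip optional skipped)"
}

probe_dir="$(mktemp -d)"
missing_status="$(probe_config "$probe_dir/config.yaml")"
echo 'model: {}' > "$probe_dir/config.yaml"
present_status="$(probe_config "$probe_dir/config.yaml")"

# Twelve passing probes plus the two optional config.yaml probes skipped.
summary="$(summarize PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS SKIP SKIP)"
echo "$summary"
```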

Probe titles relabeled to "config.yaml readable (optional)" and
"config.yaml content (optional)" so users see the file is not
load-bearing at a glance. The failure hint for the genuine
permission-denied case explicitly notes that absence is fine.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 22:31:40 +02:00
Alan Wizemann 511726e2c0 feat(chat-resilience): iOS reconnect + snapshot fallback + paging + pill fix
Brings iOS chat to parity with Mac's reconnect behavior so a session
survives phone-sleep, network handoffs, and SSH socket drops without
losing the agent's work — Hermes already persists messages to state.db
in real-time, the iOS app just had no resync path.

Core changes (shared between Mac and iOS via ScarfCore):

- ServerTransport.cachedSnapshotPath: fall back to the cached state.db
  snapshot when a fresh pull fails. HermesDataService surfaces this via
  isUsingStaleSnapshot + lastSnapshotMtime so views can render "Last
  updated X ago." Default opt-in via refresh(forceFresh: false); chat
  history reload passes forceFresh: true to refuse stale data.
- HermesDataService.fetchMessages(sessionId:limit:before:): bounded
  pagination by id desc. Legacy unbounded overload deprecated. New
  HistoryPageSize constants centralize the budget.
- RichChatViewModel.loadEarlier(): pages back through the current
  session via oldestLoadedMessageID + hasMoreHistory.

iOS-only:

- ChatController gains the Mac reconnect machinery: 5-attempt
  exponential backoff (1→16s) via session/resume → session/load,
  reconcileWithDB on success, "Resynced N new messages" toast.
  startACPEventLoop + startHealthMonitor extracted as helpers.
- New NetworkReachabilityService (NWPathMonitor singleton). Suspends
  reconnect attempts while offline; kicks a fresh cycle on link-up.
- ScarfGoCoordinator + ScarfGoTabRoot funnel scenePhase transitions to
  ChatController.handleScenePhase. On .active we verify channel
  health and reconnect if dead.
- Draft persistence: UserDefaults keyed by (serverID, sessionID)
  survives force-quit. 7-day janitor at app launch.
- Connection-state banner: .reconnecting and .offline render slim
  ScarfDesign-tinted strips above the message list. .failed keeps
  using the existing full-screen overlay.
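
The reconnect schedule can be illustrated with a short shell loop. This assumes the delay doubles each attempt and that the wait happens before each try, which is one reading of "5-attempt exponential backoff (1→16s)"; `retry_with_backoff` and `SLEEP_CMD` are hypothetical names, and the real logic lives in Swift on `ChatController`.

```bash
# SLEEP_CMD is injectable so the schedule can be observed without waiting.
SLEEP_CMD="${SLEEP_CMD:-sleep}"

# retry_with_backoff MAX_ATTEMPTS CMD...
# Waits 1, 2, 4, ... seconds before successive attempts; stops on success.
retry_with_backoff() {
  local max_attempts="$1" delay=1 attempt=1
  shift
  while [ "$attempt" -le "$max_attempts" ]; do
    "$SLEEP_CMD" "$delay"
    if "$@"; then
      return 0
    fi
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
  return 1
}

# Demo: record the delays instead of sleeping, against an always-failing command.
recorded=""
record_delay() { recorded="$recorded${recorded:+ }$1"; }
SLEEP_CMD=record_delay

if retry_with_backoff 5 false; then
  echo "reconnected"
else
  echo "gave up after delays: $recorded"
fi
```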

Bonus fix:

- ConnectionStatusViewModel tier-2 probe now checks state.db instead
  of config.yaml. Hermes v0.11+ doesn't materialize config.yaml until
  the user changes a setting, so a freshly-installed working Hermes
  was being marked "degraded — config missing" indefinitely. state.db
  is the file Scarf actually depends on.

Out of scope (deferred): APNs push notifications, BGTaskScheduler-
based extended-background keepalive, offline write queue.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 21:57:49 +02:00
Alan Wizemann 587c6c36c8 fix(diagnostics): sqlite3 probe with login-PATH + candidate fallback (#19)
@cmalpass's April 25 follow-up on #19: diagnostics reported "sqlite3
not installed or on system PATH" while sqlite3 was actually installed
and Hermes was using it fine. Same false-negative class the `hermes`
probe pre-fix had — a bare `command -v sqlite3` in the non-login SSH
shell misses installs at /opt/homebrew/bin or /usr/local/bin when
the user's PATH export lives in .zprofile (the typical Homebrew
setup). The hermes probe was upgraded to source rc files + walk a
candidate list; sqlite3 wasn't.

Mirror the same pattern:

- Move the sqlite3 detection AFTER the rc-source loop so the login
  PATH is in scope.
- Add a standard-location fallback list:
  /usr/bin/sqlite3, /usr/local/bin/sqlite3,
  /opt/homebrew/bin/sqlite3, /opt/local/bin/sqlite3.
- Use the resolved sqlite3 binary explicitly in the
  sqlite3CanOpenStateDB probe so it doesn't re-fail-by-PATH when the
  binary is at e.g. /opt/homebrew/bin. Falls back to bare `sqlite3`
  so the FAIL detail line still carries the real error.
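
The probe order described above (enriched PATH first, then a fixed candidate list, then the bare name) could look roughly like this. `resolve_binary` is a hypothetical, generalized helper; the candidate paths for sqlite3 are the ones listed in the commit, and the demo tool is a stand-in so the sketch does not depend on sqlite3 being installed.

```bash
# resolve_binary NAME CANDIDATE...
# Prints the resolved path and returns 0, or prints the bare NAME and
# returns 1 so a later invocation fails with the real "not found" error.
resolve_binary() {
  local name="$1" found candidate
  shift
  if found="$(command -v "$name" 2>/dev/null)"; then
    printf '%s\n' "$found"
    return 0
  fi
  for candidate in "$@"; do
    if [ -x "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  printf '%s\n' "$name"
  return 1
}

# Real usage shape: run this after the rc-source loop so the login PATH applies.
SQLITE3_BIN="$(resolve_binary sqlite3 \
  /usr/bin/sqlite3 /usr/local/bin/sqlite3 \
  /opt/homebrew/bin/sqlite3 /opt/local/bin/sqlite3)" || true
echo "sqlite3 resolved to: $SQLITE3_BIN"

# Demo with a fake tool that is not on PATH but exists at a candidate path.
fake_dir="$(mktemp -d)"
printf '#!/bin/sh\nexit 0\n' > "$fake_dir/scarf-demo-tool"
chmod +x "$fake_dir/scarf-demo-tool"
resolved_tool="$(resolve_binary scarf-demo-tool /nonexistent/scarf-demo-tool "$fake_dir/scarf-demo-tool")"
missing_tool="$(resolve_binary scarf-missing-tool /nonexistent/scarf-missing-tool)" || true
```

Reusing the resolved path in later probes keeps a second check from failing for the same PATH reason as the first.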

Hermes non-login probe stays as-is — that semantic ("is hermes on
the un-enriched PATH?") is meaningful and we don't want to muddle it.

Failure-hint copy on sqlite3Installed updated to spell out the new
fallback behavior so users who still see FAIL get accurate guidance
(install via package manager, OR symlink an existing binary into a
location the probe checks).

Closes the third and last open layer of #19. Layer 1 (104-byte
ControlMaster path) was fixed in v2.0.2; layer 2 (pill / diagnostics
disagreement) was fixed in v2.5.1 (#44). Ships in v2.5.2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 15:55:18 +02:00
350 changed files with 43573 additions and 1650 deletions
+15
@@ -0,0 +1,15 @@
# These are supported funding model platforms
github: # Replace with up to 4 GitHub Sponsors-enabled usernames e.g., [user1, user2]
patreon: # Replace with a single Patreon username
open_collective: # Replace with a single Open Collective username
ko_fi: # Replace with a single Ko-fi username
tidelift: # Replace with a single Tidelift platform-name/package-name e.g., npm/babel
community_bridge: # Replace with a single Community Bridge project-name e.g., cloud-foundry
liberapay: # Replace with a single Liberapay username
issuehunt: # Replace with a single IssueHunt username
lfx_crowdfunding: # Replace with a single LFX Crowdfunding project-name e.g., cloud-foundry
polar: # Replace with a single Polar username
buy_me_a_coffee: awizemann
thanks_dev: # Replace with a single thanks.dev username
custom: # Replace with up to 4 custom sponsorship URLs e.g., ['link1', 'link2']
+5
@@ -61,3 +61,8 @@ releases/v*/appcast-entry.xml
# Wiki helper: personal patterns (hostnames, IPs) blocked from the wiki push.
scripts/wiki-blocklist.txt
# TestFlight feedback / crash JSONs downloaded for triage. PII (emails,
# carriers, locales) and never meant for the public repo — kept local
# while a fix round is in progress, deleted afterward.
crashes/
+24
@@ -0,0 +1,24 @@
# Building Scarf
Scarf is a native macOS app built with Xcode. For contributor builds, use the local script:
```bash
./scripts/local-build.sh
```
Requirements:
- macOS 14.6 (Sonoma) or newer at runtime — that's the app's `MACOSX_DEPLOYMENT_TARGET`. Sonoma support is intentional and load-bearing; do not raise this without an explicit decision to drop Sonoma users
- Xcode 16.0 or newer, selected by `xcode-select` (needed for Swift 6 strict-concurrency features the project uses)
- Metal toolchain installed
- Hermes installed at `~/.hermes/` (see the project README for setup)
If the Metal toolchain is missing, the script will offer to install it in interactive shells. You can also install it manually:
```bash
xcodebuild -downloadComponent MetalToolchain
```
`scripts/local-build.sh` resolves Swift package dependencies, detects `arm64` vs `x86_64`, and builds the Debug app unsigned. Signing is intentionally disabled for local Debug builds so contributors do not need the maintainer's Apple Developer account.
Release signing is separate from contributor builds. Maintainers should continue using the existing release process for signed distributable builds.
+79 -2
@@ -113,9 +113,48 @@ Public documentation lives in the GitHub wiki at https://github.com/awizemann/sc
## Hermes Version
Targets Hermes v2026.4.23 (v0.11.0). Log lines may carry an optional `[session_id]` tag between the level and logger name — `HermesLogService.parseLine` treats the session tag as an optional capture group, so older untagged lines still parse.
Targets Hermes v2026.5.7 (v0.13.0). Log lines may carry an optional `[session_id]` tag between the level and logger name — `HermesLogService.parseLine` treats the session tag as an optional capture group, so older untagged lines still parse.
**v2026.4.23 (v0.11.0)** added (Scarf-relevant subset):
**Capability gating.** Scarf detects the target's Hermes version once per server connection via [HermesCapabilities](scarf/Packages/ScarfCore/Sources/ScarfCore/Services/HermesCapabilities.swift) (`hermes --version` → semver + `YYYY.M.D` parse). The resulting `HermesCapabilitiesStore` is injected on `ContextBoundRoot` (Mac) and `ScarfGoTabRoot` (iOS) via `.environment(_:)` and `.hermesCapabilities(_:)`; UI that depends on a release-gated surface reads it through the typed environment key. Pre-target hosts gracefully hide the new affordances rather than throwing on unknown CLI subcommands. Add a new flag at the top of `HermesCapabilities` whenever Scarf gains a release-gated UI surface — group flags by the Hermes release that introduced them (`MARK: v0.13 (v2026.5.7) flags`, etc.).
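A minimal sketch of the version-floor idea behind these flags. The type names `HermesVersion` and `CapabilitiesSketch` are hypothetical, the parser accepts only a bare `major.minor.patch` string (the real `hermes --version` output and the `YYYY.M.D` parse are not modeled), and treating an unknown version as "no flags on" is an assumption:

```swift
/// Hypothetical three-part version, used only for this sketch.
struct HermesVersion: Comparable {
    let major: Int
    let minor: Int
    let patch: Int

    /// Accepts a bare "major.minor.patch" string; anything else yields nil.
    init?(_ text: String) {
        let fields = text.split(separator: ".")
        let numbers = fields.compactMap { Int($0) }
        guard fields.count == 3, numbers.count == 3 else { return nil }
        major = numbers[0]
        minor = numbers[1]
        patch = numbers[2]
    }

    static func < (lhs: HermesVersion, rhs: HermesVersion) -> Bool {
        (lhs.major, lhs.minor, lhs.patch) < (rhs.major, rhs.minor, rhs.patch)
    }
}

/// Hypothetical stand-in for the capability flags, grouped by release.
struct CapabilitiesSketch {
    /// nil means the version probe failed (assumption: all flags stay off).
    let version: HermesVersion?

    private func atLeast(_ minimum: String) -> Bool {
        guard let version, let floor = HermesVersion(minimum) else { return false }
        return version >= floor
    }

    // MARK: v0.12 (v2026.4.30) flags
    var hasKanban: Bool { atLeast("0.12.0") }
    /// Inverse semantics: the aux row exists only on pre-v0.12 hosts.
    var hasFlushMemoriesAux: Bool { version != nil && !atLeast("0.12.0") }

    // MARK: v0.13 (v2026.5.7) flags
    var hasGoals: Bool { atLeast("0.13.0") }
}
```

A view would then read one flag and hide the affordance when it is `false`, rather than calling a subcommand the host may not know.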
**v2026.5.7 (v0.13.0)** added (Scarf-relevant subset; full v2.8.0 implementation lands across WS-2 through WS-9):
- **Persistent Goals** — `/goal <text>` slash command locks the agent onto a target across turns. Checkpoints v2 single-store rewrite + auto-resume after gateway restart. Surfaced in Scarf chat as a non-interruptive command + a "🎯 Goal locked: <text>" pill in the chat header. Gated on `HermesCapabilities.hasGoals`.
- **ACP `/queue` slash command** — queues a prompt to run after the current turn completes. Joins `/steer` in `RichChatViewModel.nonInterruptiveCommands` with a transient "Queued" toast. Gated on `hasACPQueue`. `/steer` now also runs as a regular prompt on idle sessions (`hasACPSteerOnIdle`).
- **Kanban v0.13 reliability + recovery UX** — hallucination gate on worker-created cards, generic diagnostics engine (per-task distress signals), per-task `max_retries` override, multiline title/body create, `auto_blocked_reason` rendered in the inspector banner, darwin zombie detection, unify failure counter across spawn/timeout/crash. New fields decode through tolerant `HermesKanbanRun` / `HermesKanbanTaskDetail` extensions; pre-v0.13 hosts ignore unknown keys. Gated on `hasKanbanDiagnostics`.
- **Curator archive + prune** — `hermes curator archive <skill>` + `prune` + `list-archived` subcommands. The synchronous manual `hermes curator run` blocks until done (pre-v0.13 returned immediately). Surfaced as an "Archived" tab in CuratorView with per-row Restore + Prune actions and a destructive prune-confirm sheet. Gated on `hasCuratorArchive`.
- **Messaging Gateway expansion** — Google Chat (20th platform; `hasGoogleChatPlatform`), cross-platform allowlists (`allowed_channels` / `allowed_chats` / `allowed_rooms` per platform; `hasGatewayAllowlists`), per-platform `gateway_restart_notification` (`hasGatewayRestartNotification`), `busy_ack_enabled` toggle (`hasGatewayBusyAckToggle`), slash-command auto-delete TTL, `[[as_document]]` skill media routing directive, `hermes gateway list` cross-profile status verb (`hasGatewayList`).
- **Provider catalog refresh** — new models on Nous Portal + OpenRouter: `deepseek/deepseek-v4-pro`, `x-ai/grok-4.3`, `openrouter/owl-alpha` (free), `tencent/hy3-preview`, `arcee/trinity-large-thinking` (with temperature + compression overrides). `x-ai/grok-4.20-beta` renamed to `x-ai/grok-4.20` — keep alias map. Vercel AI Gateway demoted to bottom of the picker. `image_gen.model` from `config.yaml` now honored by Hermes (was advertised but ignored pre-v0.13); surfaced in `Settings → Auxiliary` (`hasImageGenModel`). OpenRouter response caching toggle (`hasOpenRouterResponseCache`).
- **MCP SSE transport** — MCP servers can be configured with SSE transport + `sse_read_timeout`. Surfaced in MCPServersView add-server flow alongside stdio/pipe. Gated on `hasMCPSSETransport`.
- **Cron `--no-agent` mode** — script-only watchdog jobs that skip the AI call. Surfaced in CronView edit sheet. Gated on `hasCronNoAgent`.
- **Web Tools per-capability backends** — `web_search` and `web_extract` can use distinct backends; SearXNG joined as a search-only backend. Surfaced in the Web Tools settings tab. Gated on `hasWebToolsBackendSplit`.
- **Profiles `--no-skills`** — `hermes profile create --no-skills` for empty-profile creation. Surfaced as a toggle in the create-profile flow. Gated on `hasProfileNoSkills`.
- **CLI / UX additions** — context compression count in the status feed (rendered next to the token count in chat status bar; `hasContextCompressionCount`), `/new <name>` slash-command argument (`hasNewWithSessionName`), `hermes update --yes` non-interactive (`hasUpdateNonInteractive`), `display.language` static-message translation (zh / ja / de / es / fr / uk / tr; `hasDisplayLanguage`), xAI Custom Voices (voice-cloning badge next to xAI TTS provider; `hasXAIVoiceCloning`).
- **Server-side defaults flipped** — secret redaction defaults back to ON in v0.13 (was off by default in v0.12). The Settings redaction toggle remains for opt-out; the default-state hint reflects the v0.13 semantics when the host advertises v0.13+.
- **`video_analyze` tool** — native video understanding on Gemini-class models. Hermes handles transparently inside the agent loop; Scarf has no UI surface yet but `hasVideoAnalyze` is reserved for future widget gating.
- **`transform_llm_output` plugin hook** — plugin-author concern; surfaced indirectly through PluginsView when a plugin advertises the hook. `hasTransformLLMOutputHook` gates the metadata badge.
- **Schema is unchanged from v0.11/v0.12** — same state.db columns. No migration needed.
**v2026.4.30 (v0.12.0)** added (Scarf-relevant subset):
- **Autonomous Curator** — `hermes curator` self-prunes / -consolidates the skill library on a 7-day cycle. Reports land at `~/.hermes/logs/curator/run.json` + `REPORT.md`; paths exposed via `HermesPathSet.curatorLogsDir` (`logs/curator`) + `curatorStateFile` (`skills/.curator_state`), with the per-cycle `run.json` / `REPORT.md` resolved at runtime from the `last_report_path` field on the state file. Surfaced in Scarf as a dedicated "Curator" sidebar item under Interact (between Memory and Skills) on Mac, plus a read-mostly iOS panel with Run Now / Pause / Resume actions and inline pin toggles; both gated on `HermesCapabilities.hasCurator`.
- **5 new inference providers** — GMI Cloud, Azure AI Foundry, LM Studio (upgraded to first-class), MiniMax OAuth, Tencent Tokenhub. Mirrored in `ModelCatalogService.overlayOnlyProviders`; the model picker reaches all of them automatically.
- **`flush_memories` aux task removed (server side)** — `auxiliary.flush_memories` is gone from v0.12 Hermes config but remains alive on pre-v0.12 hosts. Scarf preserves `AuxiliarySettings.flushMemories: AuxiliaryModel`, the YAML reader still emits an `aux("flush_memories")` row, and `AuxiliaryTab` only renders the row when `HermesCapabilities.hasFlushMemoriesAux` is `true` (inverse semantics — pre-v0.12 only). v0.12 users never see the row; v0.11 users keep their edit surface.
- **`auxiliary.curator` aux task added** — Curator's review model is configurable independently of the main model. Surfaced in `Settings → Auxiliary` next to the other aux rows.
- **Multimodal ACP `session/prompt`** — ACP advertises and forwards image content blocks. Scarf chat composers (Mac drag/drop + paste; iOS PhotosPicker) attach images that flow through `ACPClient.sendPrompt(sessionId:text:images:)` as `[{"type":"text","text":...}, {"type":"image","data":"<base64>","mimeType":"image/jpeg"}]` — wire shape matches `acp.schema.ImageContentBlock`. `ImageEncoder` downsamples to 1568px long-edge JPEG q=0.85 detached (never blocks MainActor). Gated on `HermesCapabilities.hasACPImagePrompts`.
- **CLI additions:** `hermes -z <prompt>` (non-interactive one-shot), `hermes update --check` (preflight), `hermes fallback` (manage fallback providers), `hermes curator` (status / run / pause / resume / pin / unpin / restore), `hermes kanban` (full 27-verb task-board CLI). All capability-gated. **v2.7.5 lifts Kanban from a read-only list to a full drag-and-drop board.** See the dedicated [Kanban v3](#kanban-v3-drag-and-drop-board--per-project-tenants-v275) section below for the complete architecture.
- **Skills surface:** `hermes skills install <https-url>` direct-URL install (SkillsView "Install from URL…" toolbar button), reload via `hermes skills audit` (Skills "Reload" button — equivalent to the `/reload-skills` slash command for non-ACP contexts), enabled/disabled state read from `skills.disabled` in config.yaml (rendered as strikethrough + "OFF" pill), Curator pin badge from `~/.hermes/skills/.curator_state` (rendered as a pin glyph). The disable-toggle write path is deferred to v2.7 — Hermes only exposes `hermes skills config` as an interactive verb, and Scarf prefers reading accurately to risking a clobbered list.
- **Two new gateway platforms:** Microsoft Teams (19th, plugin-shipped) + Tencent 元宝 / Yuanbao (18th, native). Surfaced in the Mac Platforms tab.
- **Cron upgrades:** per-job `--workdir <abs-path>` (project-aware cwd that pulls AGENTS.md / CLAUDE.md / .cursorrules) is exposed in the editor sheet, gated on `HermesCapabilities.hasCronWorkdir` so pre-v0.12 hosts don't see the field (and a defensive override in `CronView` strips the value before calling `createJob`/`updateJob` even if it was hydrated from a pre-existing job). Pass an empty string on edit to clear an existing workdir, mirroring the `--script` shape. Hermes also added a `context_from` field for chaining cron outputs but only via YAML so far — Scarf reads it (HermesCronJob.contextFrom) but doesn't write it.
- **Settings deltas:** `prompt_caching.cache_ttl` (5m/1h picker), `redaction.enabled` toggle (off-by-default in v0.12 — toggle restores it), `agent.runtime_metadata_footer` toggle, Piper added to TTS provider list, `vercel` added to terminal backend list.
- **Bundled plugins:** Spotify, Google Meet, Langfuse observability, hermes-achievements (visible in Plugins tab).
- **iOS catch-up (Phase H):** read-only Webhooks / Plugins / Profiles tabs (`Scarf iOS/Webhooks/WebhooksView.swift`, `Plugins/PluginsView.swift`, `Profiles/ProfilesView.swift`) parity-match the Mac surfaces but skip mutating CLI verbs. `Scarf iOS/Components/HermesVersionBanner.swift` nudges pre-v0.12 hosts to upgrade (renders only when the connected target is below v0.12).
- **`hermes memory` providers:** honcho, openviking, mem0, hindsight, holographic, retaindb, byterover. `Settings → Memory` lists all providers in the picker; the existing "Run `hermes memory setup` in Terminal" hint stays — `hermes memory setup` is interactive (asks for tokens) so an in-app shellout would surface a frozen UI.
- **Schema is unchanged from v0.11** — same state.db columns (`messages.reasoning_content`, `sessions.api_call_count` introduced in v0.11 remain). No migration needed.
**v2026.4.23 (v0.11.0)** added (historical context, still consumed by Scarf when running against a pre-v0.12 host):
- `/steer <prompt>` — non-interruptive mid-run guidance slash command. Surfaced in Scarf chat menus via `RichChatViewModel.nonInterruptiveCommands`; `ChatViewModel.sendViaACP` (Mac) and `ChatController.send` (iOS) skip the "Agent working…" status flip and show a transient toast instead.
- New CLI subcommands: `hermes plugins` / `profile` / `webhook` / `insights` / `logs` / `memory reset` / `completion` / `dashboard`. Scarf v2.5 adopts **`hermes memory reset`** (toolbar button on MemoryView with destructive confirmation). The other CLIs are documented here for v2.6 — Scarf still reads `~/.hermes/plugins/`, `~/.hermes/profiles/` etc directly today; switching those paths to the canonical CLI is a forward-compatible change to make when bandwidth permits.
@@ -134,6 +173,44 @@ v0.10.0 introduced the **Tool Gateway** — paid Nous Portal subscribers route w
**Keep `ModelCatalogService.overlayOnlyProviders` in sync** with `HERMES_OVERLAYS` in `~/.hermes/hermes-agent/hermes_cli/providers.py`. When Hermes adds a new overlay-only provider, mirror the entry (display name, base URL, auth type, subscription-gated flag, doc URL) or the picker won't reach it.
**Keep `ModelCatalogService.modelAliases` in sync** with Hermes's deprecated-model-ID map (currently release-notes-only upstream; the canonical successor lives in `hermes_cli/providers.py` if/when upstream tracks it in code). Drift here means a user's old model ID stops resolving in the picker even though Hermes still accepts it at runtime.
**Keep `ModelCatalogService.demotedProviders` in sync** with the deprioritized-provider list in `hermes-agent/hermes_cli/providers.py`. Drift means Vercel AI Gateway (or any future demoted provider) sorts in the wrong position in Scarf's picker.
## Kanban v3: drag-and-drop board + per-project tenants (v2.7.5)
Scarf v2.7.5 promotes Kanban from a read-only list to a full board with drag-and-drop, every Hermes write verb wired up, and per-project boards bound to a Scarf-minted tenant slug. The list view is preserved as a `Board | List` toggle for accessibility / narrow-window fallback.
**Sidebar move.** `.kanban` moved from *Manage* → *Monitor* in `SidebarView` (between `.activity` and the remaining Monitor entries). Kanban is runtime work-in-progress, not configuration. Position kept inside the same enum case; only the section bucket changed.
**Hermes constraints that drive design.**
1. **No `update` verb.** `priority`, `title`, `body`, `tenant` are write-once at `kanban create`. Mutations after create are state transitions (`assign` / `claim` / `complete` / `block` / `unblock` / `archive`) or new comments. Inline-edit on a card title is impossible at the wire level.
2. **No `project_id` column.** Hermes Kanban is one global SQLite DB at `~/.hermes/kanban.db`. Closest namespace is the optional `tenant TEXT` column. Scarf hijacks it: each project gets a `scarf:<slug>` tenant minted on first kanban interaction.
3. **No within-column position field.** Drag-to-reorder inside a column has no Hermes persistence path and is **disabled** in v2.7.5. Sort key is `priority DESC, created_at DESC` — matches dispatcher's actual run order. Cross-column drag is the only persisted gesture.
4. **No file-watch / webhooks.** Polling at 5s while foregrounded; live `watch` streaming deferred to a later release (a `hasKanbanWatch` flag will gate it).
5. **Status enum has 7 values, board collapses to 5 columns:** Triage / **Up Next** (`todo` + `ready`) / Running / Blocked / Done. Triage hides when empty; Archived hides behind a toolbar toggle.
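Constraints 3 and 5 can be sketched as a status-to-column mapping plus a sort. The enum names, the `CardSketch` type, and the exact raw status strings are assumptions for illustration; only the `todo` + `ready` collapse and the `priority DESC, created_at DESC` key come from the list above:

```swift
import Foundation

/// Assumed spelling of the seven Hermes statuses.
enum TaskStatus: String {
    case triage, todo, ready, running, blocked, done, archived
}

/// The five visible board columns.
enum BoardColumn {
    case triage, upNext, running, blocked, done
}

/// Maps a status to its column; archived cards have no column
/// because they only appear behind the toolbar toggle.
func column(for status: TaskStatus) -> BoardColumn? {
    switch status {
    case .triage: return .triage
    case .todo, .ready: return .upNext
    case .running: return .running
    case .blocked: return .blocked
    case .done: return .done
    case .archived: return nil
    }
}

/// Hypothetical card model carrying just the sort inputs.
struct CardSketch {
    let id: String
    let priority: Int
    let createdAt: Date
}

/// Orders cards by priority DESC, then created_at DESC.
func dispatcherOrder(_ cards: [CardSketch]) -> [CardSketch] {
    cards.sorted { lhs, rhs in
        if lhs.priority != rhs.priority { return lhs.priority > rhs.priority }
        return lhs.createdAt > rhs.createdAt
    }
}
```

Because the order is derived entirely from fields Hermes already stores, no client-side ordering state is needed.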
**Service layer.** [KanbanService](scarf/Packages/ScarfCore/Sources/ScarfCore/Services/KanbanService.swift) is a Sendable `actor` in ScarfCore — pure I/O, no UI state. Wraps every v0.12 verb (`list / show / runs / stats / assignees / create / assign / claim / comment / complete / block / unblock / archive / dispatch / link / unlink`). Every method dispatches its CLI invocation through `Task.detached(priority: .utility)`, matching the existing `KanbanViewModel.load` pattern (re: Swift 6 rules in `~/.claude/CLAUDE.md`). Errors land in [KanbanError](scarf/Packages/ScarfCore/Sources/ScarfCore/Models/KanbanError.swift) and surface as inline banners (not modal alerts) since the board is high-frequency. The "no matching tasks" stdout sentinel is normalized to `[]`.
**Drag-drop transition planner.** `KanbanService.plan(for: KanbanTransition)` is a pure function that maps `(from, to)` columns to the right verb sequence — `(.upNext, .running) → [.claim]`, `(.blocked, .running) → [.unblock, .claim]`, etc. Disallowed transitions throw `KanbanError.forbiddenTransition` with a user-facing reason: drop on Done from anywhere triggers "Done is terminal — create a follow-up task to continue work."; drop on Triage from outside triggers "Triage tasks are promoted by a specifier agent." The view's drop handler short-circuits forbidden transitions with red-stroke target feedback.
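A rough sketch of such a pure planner, using hypothetical names (`KanbanColumn`, `KanbanVerb`, `ForbiddenTransition`) rather than the real `KanbanService` types. Only the two verb sequences quoted above are modeled; every other pair is rejected as "not modeled" instead of guessing a mapping. The sketch treats Done as a terminal source column, which is one reading of the rule above, and treats a drop on the same column as a no-op (an assumption):

```swift
enum KanbanColumn { case triage, upNext, running, blocked, done }
enum KanbanVerb: Equatable { case claim, unblock }

enum ForbiddenReason: Equatable {
    case doneIsTerminal
    case triageIsAgentPromoted
    case notModeledInSketch
}

struct ForbiddenTransition: Error, Equatable {
    let reason: ForbiddenReason
}

/// Pure function: no I/O, just (from, to) -> ordered verb list.
func plan(from: KanbanColumn, to: KanbanColumn) throws -> [KanbanVerb] {
    // Dropping a card back on its own column changes nothing (assumption).
    if from == to { return [] }
    if from == .done { throw ForbiddenTransition(reason: .doneIsTerminal) }
    if to == .triage { throw ForbiddenTransition(reason: .triageIsAgentPromoted) }

    switch (from, to) {
    case (.upNext, .running):
        return [.claim]
    case (.blocked, .running):
        return [.unblock, .claim]
    default:
        // Remaining pairs are intentionally left out of this sketch.
        throw ForbiddenTransition(reason: .notModeledInSketch)
    }
}
```

Keeping the planner free of side effects means a drop handler can call it first to decide on target feedback, and only then hand the verb list to the service for execution.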
**Per-project tenant.** [KanbanTenantResolver](scarf/scarf/Core/Services/KanbanTenantResolver.swift) (Mac) mints `scarf:<slug>` on first kanban interaction inside a project, persisting to `<project>/.scarf/manifest.json`'s new optional `kanbanTenant: String?` field. Tenants are **immutable across rename** (existing tasks already carry the old slug). Bare projects (no manifest) get a sentinel manifest written with `id: scarf/<project-id>` + `version: 0.0.0` + just the `kanbanTenant` set; `ProjectAgentContextService` recognizes the sentinel and refuses to surface it as a "Template" line. The cross-platform read-only counterpart is [KanbanTenantReader](scarf/Packages/ScarfCore/Sources/ScarfCore/Services/KanbanTenantReader.swift) in ScarfCore — iOS uses it to filter the per-project board without linking the full manifest model.
**Agent-side tenant injection.** `ProjectAgentContextService.renderBlock` adds a "Kanban tenant" line to the AGENTS.md scarf-managed block whenever a tenant exists. Since `ChatViewModel.startACPSession` calls `refresh(for:)` before opening every project chat, the agent sees the tenant on every session start and is told to pass `--tenant scarf:<slug>` on `hermes kanban create`. Agents are imperfect at flag discipline; misuse just sends the task to the global "Untagged" group on the global board, which is acceptable v2.7.5 behavior. A dedicated retag UX is a follow-up.
**View model.** [KanbanBoardViewModel](scarf/scarf/Features/Kanban/ViewModels/KanbanBoardViewModel.swift) is `@MainActor + @Observable`, holds the column-grouped task array, and applies optimistic-merge logic around drag-drops: an in-flight move records `optimisticOverrides[taskId] = newStatus`, mutates the local array immediately, and clears the override only when the polled response confirms the new status. Without this, a stale poll response can clobber a card the user just dragged. On CLI failure the override is removed and an error message lands in the inline banner.
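The optimistic-merge rule can be sketched with plain dictionaries. `BoardStateSketch` and its method names are hypothetical, statuses are plain strings, and the real view model groups tasks by column rather than storing a flat map:

```swift
/// Hypothetical flat model: taskId -> status string.
struct BoardStateSketch {
    var statuses: [String: String] = [:]
    var optimisticOverrides: [String: String] = [:]

    /// Called when the user drops a card: show the new status immediately.
    mutating func beginMove(taskId: String, to newStatus: String) {
        optimisticOverrides[taskId] = newStatus
        statuses[taskId] = newStatus
    }

    /// Called with each polled snapshot from the server.
    mutating func applyPoll(_ polled: [String: String]) {
        var merged = polled
        for (taskId, expected) in optimisticOverrides {
            if polled[taskId] == expected {
                // Server caught up, so the override is no longer needed.
                optimisticOverrides[taskId] = nil
            } else {
                // Stale snapshot: keep showing the user's move.
                merged[taskId] = expected
            }
        }
        statuses = merged
    }

    /// Called when the CLI write fails: drop the override so the
    /// next poll restores the server's view of the card.
    mutating func moveFailed(taskId: String) {
        optimisticOverrides[taskId] = nil
    }
}
```

For example, a poll that still reports `ready` right after a drag to `running` leaves the card in Running; a later poll reporting `running` clears the override.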
**Mac surface.** [KanbanBoardView](scarf/scarf/Features/Kanban/Views/KanbanBoardView.swift) is the orchestrator (header + columns + side-pane inspector + create/block/complete sheets). [KanbanColumnView](scarf/scarf/Features/Kanban/Views/KanbanColumnView.swift) owns its `dropDestination(for: KanbanTaskRef.self)`. [KanbanCardView](scarf/scarf/Features/Kanban/Views/KanbanCardView.swift) handles the `.draggable` source, status-specific chrome (running edge accent + shimmer; blocked warning glyph; done dim 0.7/0.55), and a custom drag preview. [KanbanInspectorPane](scarf/scarf/Features/Kanban/Views/KanbanInspectorPane.swift) is a 420pt side-pane (not modal) so the user can keep dragging cards after inspecting one. [KanbanCreateSheet](scarf/scarf/Features/Kanban/Views/KanbanCreateSheet.swift) maps form state to a `KanbanCreateRequest`; the Workspace picker locks to "Project Dir" on per-project boards. [KanbanBlockReasonSheet](scarf/scarf/Features/Kanban/Views/KanbanBlockReasonSheet.swift) and [KanbanCompleteResultSheet](scarf/scarf/Features/Kanban/Views/KanbanCompleteResultSheet.swift) prompt for optional `--reason` / `--result` text on those transitions.
**Per-project surface.** New `DashboardTab.kanban` case in `ProjectsView.swift`, dispatched to [ProjectKanbanTab](scarf/scarf/Features/Projects/Views/ProjectKanbanTab.swift) which mints the tenant on appearance and wraps `KanbanBoardView` with `tenantFilter` + `projectPath` pre-applied. Capability-gated on `HermesCapabilities.hasKanban` so pre-v0.12 hosts don't see a broken destination. Plus a new `kanban_summary` widget — top 3 tasks by priority across `running` + `blocked` + `todo` for the project's tenant, with stats glance footer. Mirror in `tools/widget-schema.json`, `tools/build-catalog.py`, and `site/widgets.js`. Templates can reference it as `{ kind: kanban_summary, max_rows: 3 }` in dashboard.json.
**iOS surface.** Read-only board on the project Kanban tab ([ScarfGoKanbanView](Scarf%20iOS/Kanban/ScarfGoKanbanView.swift) + [ScarfGoKanbanDetailSheet](Scarf%20iOS/Kanban/ScarfGoKanbanDetailSheet.swift)). Renders the 5 columns as a horizontally-paged `Picker` of single-column lists — HIG-friendly on iPhone. No mutations, no drag-drop in v2.7.5 (deferred to a later release). Card titles use semantic `.headline` (not `ScarfFont`) so Dynamic Type works; chrome (badges) keeps `ScarfBadge` for fixed visual weight. Gated on `HermesCapabilities.hasKanban`; pre-v0.12 hosts don't see the segment.
**Capability gating.** Kept the single `HermesCapabilities.hasKanban` flag (`>= 0.12.0`). All 27 verbs shipped together; finer-grained gating is YAGNI. A `hasKanbanWatch` flag will land in a later release if `watch` semantics drift between point releases.
**Don't:** introduce within-column reorder via a client-side ordering sidecar — sort order would diverge from dispatcher's actual run order, which is worse than no manual order. Use `priority` on `kanban create` to set initial order; revisit when Hermes ships an `update --priority` verb. Don't try to mutate `priority` / `title` / `body` post-create — there's no verb. Don't drop cards from `done` into anything — Done is terminal. Don't call `transport.runProcess` directly from view bodies; route through `KanbanService` (the actor) so polling and writes share the same concurrency model.
## Project Templates
Scarf ships a `.scarftemplate` format (v1 as of 2.2.0) for sharing pre-packaged projects across users and machines. A bundle is a zip containing:
+4 -2
@@ -5,8 +5,10 @@ Thanks for your interest in contributing to Scarf.
## Getting Started
1. Fork and clone the repo
2. Open `scarf/scarf.xcodeproj` in Xcode 26.3+
3. Build and run (requires macOS 26.2+ and Hermes installed at `~/.hermes/`)
2. Open `scarf/scarf.xcodeproj` in Xcode 16.0+
3. Build and run (Scarf runs on macOS 14.6 Sonoma or newer; Hermes must be installed at `~/.hermes/`)
For an unsigned command-line Debug build without an Apple Developer account, run [`./scripts/local-build.sh`](scripts/local-build.sh). See [BUILDING.md](BUILDING.md) for prerequisites.
## Architecture
+54 -21
@@ -19,11 +19,56 @@
<a href="https://www.buymeacoffee.com/awizemann"><img src="https://cdn.buymeacoffee.com/buttons/v2/default-yellow.png" alt="Buy Me a Coffee" height="28"></a>
</p>
## What's New in 2.5
## What's New in 2.7
### ScarfGo — the iPhone companion ships in public TestFlight
The biggest release since 2.6 — six weeks of work focused on **remote-context performance**, a **new project authoring flow**, **dashboard widgets**, **OAuth resilience**, and a top-to-bottom **performance instrumentation harness** that drove the bulk of the rest. 36 commits, no schema bump, no Hermes capability bump.
Same Hermes server you've been running on your Mac — now reachable from your phone over SSH. Multi-server, project-scoped chat, session resume, memory editor, cron list, skills tree, settings (read), all native iOS. Pure-Swift SSH (Citadel under the hood — no `ssh` binary needed on iOS). Per-project chat writes the same Scarf-managed `AGENTS.md` block the Mac app does, so the agent boots with the same project context regardless of which client opened the session.
### Remote chats and Activity in seconds, not 30s timeouts
Resuming a chat or opening Activity on a slow remote (a 420ms-RTT droplet, an underprovisioned VPS, a tunnel through 4G) used to fetch the full message column set in one shot, which routinely tripped the 30s SSH timeout on chats with multi-page tool result blobs. v2.7 introduces a **skeleton-then-hydrate pattern** that bounds the wire payload by what the user actually needs to see RIGHT NOW, then fills in the heavy stuff in the background.
- **Chat skeleton** — user + assistant rows only (skips `role='tool'`), `tool_calls` / `reasoning` hard-NULLed at SQL level. Wire payload bounded by conversational text. The chat appears in seconds. Background hydration pages tool calls in 5-id batches; tool-result CONTENT is opt-in (Settings → Display → "Load tool results in past chats", default off) with per-card lazy-fetch in the inspector pane.
- **Activity skeleton** — metadata-only fetch (~3 KB for 50 rows). Placeholder rows render immediately; real per-call entries swap in as paged hydration completes.
- **Single-id whale recovery** — when a 5-id batch trips the 30s timeout (one row carries an oversized `tool_calls` blob), an L1 single-id retry isolates the offender so the rest of the batch still hydrates.
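The batch-then-isolate behavior described in these bullets might look roughly like the sketch below. The `hydrate` function, its `fetch` closure, and the `FetchTimeout` error are hypothetical stand-ins for the real remote query path, and this version retries on any error rather than on the 30s timeout specifically:

```swift
/// Hypothetical error representing a timed-out remote fetch.
struct FetchTimeout: Error {}

/// Hydrates rows in fixed-size batches. When a batch fails, each id in it
/// is retried alone so one oversized row cannot block its neighbors.
func hydrate(
    ids: [Int],
    batchSize: Int = 5,
    fetch: ([Int]) async throws -> [Int: String]
) async -> (rows: [Int: String], failed: [Int]) {
    var rows: [Int: String] = [:]
    var failed: [Int] = []

    for start in stride(from: 0, to: ids.count, by: batchSize) {
        let batch = Array(ids[start..<min(start + batchSize, ids.count)])
        do {
            let fetched = try await fetch(batch)
            rows.merge(fetched) { _, new in new }
        } catch {
            // Single-id retry pass to isolate the offending row.
            for id in batch {
                do {
                    let fetched = try await fetch([id])
                    rows.merge(fetched) { _, new in new }
                } catch {
                    failed.append(id)
                }
            }
        }
    }
    return (rows, failed)
}
```

With seven ids and a fetch that fails whenever id 3 is requested, the other six rows still hydrate and only id 3 lands in `failed`.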
### SSH cancellation that actually cancels
`Task.detached` doesn't inherit cancellation from the awaiting parent. Pre-fix, navigating away from a chat left the underlying ssh subprocess running for the full 30s, pinning a remote sqlite query and a ControlMaster session — the "third chat hangs" / "dashboard spins after rapid switching" symptom. v2.7 wires `withTaskCancellationHandler` through `SSHScriptRunner.run` and `RemoteSQLiteBackend.query`; cancellation now reaches the `Process` within ~100ms.
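To illustrate the underlying mechanism, here is a small sketch that uses a sleeping detached task as a stand-in for the ssh subprocess. The function names are hypothetical and this is not the `SSHScriptRunner` code; it only contrasts awaiting a detached task with and without `withTaskCancellationHandler`:

```swift
/// Awaits a detached worker without forwarding cancellation.
/// Returns whether the worker itself observed cancellation.
func runWithoutForwarding() async -> Bool {
    let worker = Task.detached { () -> Bool in
        _ = try? await Task.sleep(nanoseconds: 50_000_000)
        return Task.isCancelled
    }
    // Cancelling the caller does not reach `worker`.
    return await worker.value
}

/// Same worker, but the caller's cancellation is forwarded explicitly.
func runWithForwarding() async -> Bool {
    let worker = Task.detached { () -> Bool in
        _ = try? await Task.sleep(nanoseconds: 50_000_000)
        return Task.isCancelled
    }
    return await withTaskCancellationHandler {
        await worker.value
    } onCancel: {
        // Runs as soon as the awaiting task is cancelled.
        worker.cancel()
    }
}
```

In the real runner the `onCancel` closure would stop the subprocess instead of cancelling a task, but the shape of the fix is the same: the awaiting side has to pass the signal along by hand.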
### New Project from Scratch wizard + Keychain-backed cron secrets
A third project entry point alongside Browse Catalog and Add Existing Project. Scaffolds a Scarf-standard skeleton, registers it, and hands off to a chat session that auto-activates the bundled `scarf-template-author` skill. The skill drives the rest conversationally — widgets, optional config schema, optional cron — and writes the final files itself.
**Cron + Keychain.** Cron prompts that referenced `secret`-typed config fields used to get the literal `keychain://...` URI back, producing 401s. v2.7 mirrors resolved Keychain values into `~/.hermes/.env` under `$SCARF_<UPPER_SLUG>_<UPPER_FIELD>` env vars. Hermes already reloads `.env` per cron tick — credential rotation is automatic.
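As an illustration of the naming scheme only, a helper like the following could derive the variable name. The function name is hypothetical, and mapping every non-alphanumeric character to `_` is an assumption; the text above specifies just the `SCARF_<UPPER_SLUG>_<UPPER_FIELD>` shape:

```swift
/// Builds an env var name of the form SCARF_<UPPER_SLUG>_<UPPER_FIELD>.
/// Assumption: characters outside A-Z and 0-9 become underscores.
func cronSecretEnvName(projectSlug: String, field: String) -> String {
    func sanitize(_ text: String) -> String {
        String(text.uppercased().map { character -> Character in
            let keep = character.isASCII && (character.isLetter || character.isNumber)
            return keep ? character : "_"
        })
    }
    return "SCARF_\(sanitize(projectSlug))_\(sanitize(field))"
}
```

Under these assumptions, a project slug of `weather-bot` with a field named `api_key` would map to `SCARF_WEATHER_BOT_API_KEY`.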
### Project dashboards — file-reading widgets, sparklines, typed status
Five new widget types and project-wide auto-refresh. **Backwards-compatible** — every existing `dashboard.json` renders byte-identically.
- **`markdown_file`** / **`log_tail`** / **`cron_status`** / **`image`** / **`status_grid`** — file-reading widgets that auto-refresh when the underlying file changes. By convention, place files inside `<project>/.scarf/`.
- **`stat` widget gains inline sparklines** via optional `sparkline: [Number]`. SVG-only render; dozens per dashboard cost nothing.
- **Typed status badges** with lenient decode (`ok`/`up` → success, `down`/`error` → danger). Unknown strings render as plain text rather than crashing.
- **Structured widget error card** replaces the legacy "Unknown: \<type\>" placeholder.
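The lenient badge decode mentioned above could be sketched as follows. `StatusBadgeSketch` is a hypothetical type, and the case-insensitive, whitespace-trimming match is an assumption; only the four listed strings and the plain-text fallback come from the bullet:

```swift
import Foundation

/// Hypothetical badge model with a non-failing decode.
enum StatusBadgeSketch: Equatable {
    case success
    case danger
    /// Unknown values are kept verbatim and rendered as plain text.
    case plain(String)

    init(lenient raw: String) {
        switch raw.trimmingCharacters(in: .whitespaces).lowercased() {
        case "ok", "up":
            self = .success
        case "down", "error":
            self = .danger
        default:
            self = .plain(raw)
        }
    }
}
```

Because the initializer never fails, a dashboard file with an unexpected status string still renders; it simply loses the colored badge.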
### OAuth resilience + Credential Pools
- **Daily OAuth keepalive cron** prevents Anthropic OAuth refresh tokens from expiring after weeks of inactivity.
- **Remote re-auth** unblocked — OAuth flow drives a remote `hermes auth add` correctly with stdin forwarded.
- **OAuth remove button** + auto-refresh of Credential Pools on `auth.json` change.
- **`resolve_provider_client` errors** (auxiliary task references an unauthenticated provider) classified into a clear hint with a one-click jump to Settings → Aux Models.
- **Model/provider mismatch banner** detects when `model.default` carries a `<provider>/...` prefix that disagrees with `model.provider`, with one-click fix in either direction.
### ScarfMon — performance instrumentation harness
The diagnostic surface that drove the bulk of the v2.7 perf work. Off by default; signpost-only mode (Instruments-friendly) is free; Full mode keeps a 4096-entry in-memory ring buffer you can copy as JSON for paste-into-issue diagnosis. Wiki: [Performance-Monitoring](https://github.com/awizemann/scarf/wiki/Performance-Monitoring).
See the full [v2.7.0 release notes](https://github.com/awizemann/scarf/releases/tag/v2.7.0) for the complete list (36 commits, including: in-flight coalescing for `loadRecentSessions`, snapshot pipeline rewrite from `sqlite3 .backup` to direct SSH-streamed queries [#74](https://github.com/awizemann/scarf/issues/74), per-message TTS, window-position persistence, sidebar reorder, and many other fixes).
**Previous releases:** see the [Release Notes Index](https://github.com/awizemann/scarf/wiki/Release-Notes-Index) on the wiki for v2.6, v2.5, v2.3, v2.2, v2.0, v1.6, and earlier.
## ScarfGo — the iPhone companion
Same Hermes server you've been running on your Mac — reachable from your phone over SSH. Multi-server, project-scoped chat, session resume, memory editor, cron list, skills tree, settings (read), all native iOS. Pure-Swift SSH (Citadel under the hood — no `ssh` binary needed on iOS). Per-project chat writes the same Scarf-managed `AGENTS.md` block the Mac app does, so the agent boots with the same project context regardless of which client opened the session.
**[Join the public TestFlight](https://testflight.apple.com/join/qCrRpcTz)**: the link is live now but only accepts new beta testers once Apple's Beta Review approves the first build. If you hit a "not accepting testers" splash, bookmark it and try again in 24–48h.
@@ -39,21 +84,6 @@ Same Hermes server you've been running on your Mac — now reachable from your p
See the [ScarfGo wiki page](https://github.com/awizemann/scarf/wiki/ScarfGo) for the full feature tour, [ScarfGo Onboarding](https://github.com/awizemann/scarf/wiki/ScarfGo-Onboarding) for the SSH-key setup walkthrough, and [Platform Differences](https://github.com/awizemann/scarf/wiki/Platform-Differences) for what is and isn't shared between Mac and iOS.
### Everything else in 2.5
- **Portable project-scoped slash commands.** Author reusable prompt templates as Markdown files at `<project>/.scarf/slash-commands/<name>.md` with YAML frontmatter (name, description, argumentHint, optional model override). Invoke as `/<name> [args]` from chat — Scarf substitutes `{{argument}}` (with optional `default:` fallback) in the body and sends the expanded prompt to Hermes. Mac authoring tab + iOS read-only browser. Templates carry them via the new `slash-commands/` block in `.scarftemplate` bundles (schemaVersion 3). See [Slash Commands](https://github.com/awizemann/scarf/wiki/Slash-Commands) for the full schema.
- **Hermes v2026.4.23 chat parity.** `/steer` non-interruptive guidance command, per-turn stopwatch on assistant bubbles, numbered keyboard shortcuts (1–9) on the permission sheet, git branch chip in the chat header. The new `messages.reasoning_content` and `sessions.api_call_count` columns surface as a richer reasoning disclosure + an "API" chip on session rows.
- **Spotify + design-md skills.** Mac ships an in-app Spotify OAuth sheet (mirrors the v2.3 Nous Portal pattern); design-md gets a host-side `npx` prereq check on both platforms. SKILL.md frontmatter (`allowed_tools`, `related_skills`, `dependencies`) renders as chip rows. A "What's New" pill on the Skills tab tells you when remote skills changed since you last looked.
- **Mac global Sessions: project filter + project badges** — parity with ScarfGo's Sessions tab. The list grows a filter Menu (All projects / Unattributed / each registered project) and each row carries a tinted folder chip with the project name when attributed.
- **Human-readable cron schedules everywhere.** New `CronScheduleFormatter` in ScarfCore translates the common cron shapes into English phrases and falls back to the raw expression on anything custom. Mac and iOS render the same.
- **Mac design-system overhaul.** Rust palette, typed token bundle (`ScarfColor`, `ScarfFont`, `ScarfSpace`, `ScarfRadius`), reusable components (`ScarfPageHeader`, `ScarfCard`, `ScarfBadge`, `ScarfTextField`, four button styles), redesigned 3-pane chat. iOS adopts the same tokens with a hybrid Dynamic Type policy so accessibility scaling on body text is preserved. See [Design System](https://github.com/awizemann/scarf/wiki/Design-System) for the full reference.
- **Under the hood** — `SessionAttributionService`, `ProjectContextBlock`, `CronScheduleFormatter`, `GitBranchService`, `SkillPrereqService`, `SkillSnapshotService`, `ProjectSlashCommandService`, and the ACP error triplet (`acpError` / `acpErrorHint` / `acpErrorDetails`) consolidated into ScarfCore so Mac and iOS consume one source of truth. 179 tests across 13 suites, three consecutive green runs. Several `try?` swallows in iOS lifecycle code now surface real failures (Keychain unlock errors no longer drop people into onboarding; partial Forget operations report what failed).
- **iOS push notifications skeleton** — `NotificationRouter` ships with foreground presentation + a lock-screen "Approve / Deny" action category gated by `apnsEnabled = false`. Lights up when Hermes ships a server-side push sender + an APNs cert.
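The cron-schedule bullet above describes translating common cron shapes into English and falling back to the raw expression otherwise. A minimal sketch of that idea follows. `describeCron` and the three shapes it recognises are illustrative assumptions, not the actual `CronScheduleFormatter` API or its real phrase list.

```swift
/// Hypothetical stand-in for the formatter; the real API is not shown in these notes.
func twoDigits(_ n: Int) -> String { n < 10 ? "0\(n)" : "\(n)" }

func describeCron(_ expression: String) -> String {
    let fields = expression.split(separator: " ").map(String.init)
    // Standard five-field cron: minute hour day-of-month month day-of-week.
    guard fields.count == 5 else { return expression }
    let (minute, hour) = (fields[0], fields[1])
    let restIsWildcard = fields[2] == "*" && fields[3] == "*" && fields[4] == "*"

    if minute == "*" && hour == "*" && restIsWildcard {
        return "Every minute"
    }
    if minute.hasPrefix("*/"), let n = Int(minute.dropFirst(2)), hour == "*", restIsWildcard {
        return "Every \(n) minutes"
    }
    if let m = Int(minute), let h = Int(hour),
       (0..<60).contains(m), (0..<24).contains(h), restIsWildcard {
        return "Daily at \(twoDigits(h)):\(twoDigits(m))"
    }
    // Anything custom falls back to the raw expression.
    return expression
}
```

Returning the untouched expression for unrecognised shapes keeps the UI honest: a schedule is never paraphrased incorrectly, only left unparaphrased.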
See the full [v2.5.0 release notes](https://github.com/awizemann/scarf/releases/tag/v2.5.0).
**Previous releases:** see the [Release Notes Index](https://github.com/awizemann/scarf/wiki/Release-Notes-Index) on the wiki for v2.3, v2.2, v2.0, v1.6, and earlier.
## Connect ScarfGo to your Hermes server
ScarfGo speaks SSH directly — no companion service, no developer-controlled server in between. Onboarding takes about a minute:
@@ -145,7 +175,7 @@ Custom, agent-generated dashboards for any project. Define stat boxes, charts, t
- macOS 14.6+ (Sonoma) for Scarf
- iOS 18.0+ for [ScarfGo](https://github.com/awizemann/scarf/wiki/ScarfGo) (the iPhone companion, public TestFlight from v2.5)
- Xcode 16.0+ to build from source
- [Hermes agent](https://github.com/hermes-ai/hermes-agent) v0.6.0+ installed at `~/.hermes/` on each target host (v0.11.0+ recommended for full v2.5 feature support — `/steer`, new state.db columns, design-md/spotify skills, SKILL.md frontmatter chips)
- [Hermes agent](https://github.com/hermes-ai/hermes-agent) v0.6.0+ installed at `~/.hermes/` on each target host (v0.12.0+ recommended for full v2.6 feature support — autonomous Curator, multimodal image input, 5 new providers, Microsoft Teams + Yuanbao gateways, Kanban, Skills v0.12 surface, cron `--workdir`, prompt-cache TTL, Piper TTS, Vercel terminal)
- For remote servers: SSH access (key-based), `sqlite3` on the remote (for atomic DB snapshots), and the `hermes` CLI resolvable from the remote user's `PATH` or at a path you specify per server. ScarfGo requires the same on every Hermes host it connects to.
### Compatibility
@@ -159,9 +189,10 @@ Scarf reads Hermes's SQLite database and parses CLI output from `hermes status`,
| v0.8.0 (2026-04-08) | Verified |
| v0.9.0 (2026-04-13) | Verified |
| v0.10.0 (2026-04-16) | Verified (Tool Gateway introduced) |
| v0.11.0 (2026-04-23) | **Verified — current target (recommended for full v2.5 feature support)** |
| v0.11.0 (2026-04-23) | Verified |
| v0.12.0 (2026-04-30) | **Verified — current target (recommended for full v2.6 feature support)** |
Scarf 2.5 targets Hermes v0.11.0 for `/steer`, the new state.db columns (`messages.reasoning_content`, `sessions.api_call_count`), the new skills (design-md, spotify), the SKILL.md frontmatter chip surfaces, and the `hermes memory reset` toolbar action. Earlier Hermes versions remain supported for monitoring, sessions, file-based features, and ACP chat; v0.11-specific behavior degrades gracefully on older agents (`/steer` is harmless, new columns silently nil out).
Scarf 2.6 targets Hermes v0.12.0 for the autonomous Curator, multimodal ACP image content blocks, the 5 new inference providers, Microsoft Teams + Yuanbao gateways, the read-only Kanban view, the Skills v0.12 surface (URL install / reload / disable badges / curator pin), cron `--workdir`, `auxiliary.curator`, `prompt_caching.cache_ttl`, the redaction toggle, the runtime metadata footer, Piper TTS, and the Vercel terminal backend. Every v0.12 surface is **capability-gated** — Scarf detects the host's Hermes version once per server connection (`hermes --version` → semver + `YYYY.M.D` parse) and hides v0.12-only UI on older hosts. v0.11.0 hosts keep the full v2.5 surface (`/steer`, `messages.reasoning_content`, `sessions.api_call_count`, design-md/spotify skills, SKILL.md frontmatter chips, `hermes memory reset`). Earlier Hermes versions remain supported for monitoring, sessions, file-based features, and ACP chat; new behavior degrades gracefully on older agents.
If a Hermes update changes the database schema or CLI output format, Scarf may need to be updated. Check the [Health](#features) view for compatibility warnings.
@@ -207,6 +238,8 @@ Or from the command line:
xcodebuild -project scarf/scarf.xcodeproj -scheme scarf -configuration Release -arch arm64 -arch x86_64 ONLY_ACTIVE_ARCH=NO build
```
For an unsigned local Debug build without an Apple Developer account (handy for contributors), use [`./scripts/local-build.sh`](scripts/local-build.sh) — see [BUILDING.md](BUILDING.md) for prerequisites.
## Architecture
Scarf follows the **MVVM-Feature** pattern with zero external dependencies beyond SwiftTerm:
@@ -0,0 +1,55 @@
## What's in 2.5.2
A patch with one substantial new feature (**iOS chat resilience** — reconnect, cached snapshot fallback, history paging) plus a stack of fixes for issues reported against 2.5.1 and earlier. Drop-in replacement for 2.5.1 on Mac; drop-in TestFlight build on iOS. No data migrations.
### iOS chat resilience
ScarfGo now survives phone-sleep, network handoffs, and SSH socket drops without losing the agent's work. Hermes was already persisting messages to `state.db` in real-time; iOS just had no resync path.
- **5-attempt exponential reconnect** (1s → 2s → 4s → 8s → 16s) via `session/resume` with `session/load` fallback. Reconciles with `state.db` on success and surfaces a *"Resynced N new messages"* toast when the agent kept working through the disconnect.
- **`NetworkReachabilityService`** (NWPathMonitor singleton): suspends reconnect attempts while offline and kicks a fresh cycle on link-up. Two new banner states above the message list — `.reconnecting` and `.offline` — render as slim ScarfDesign-tinted strips so the user always knows what the chat is doing.
- **Scene-phase awareness**: returning to foreground triggers a channel-health check; if dead, the reconnect cycle starts immediately rather than waiting for the next interaction.
- **Draft persistence**: per-server, per-session draft survives force-quit (UserDefaults-backed, 7-day janitor at app launch).
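The reconnect schedule above (five attempts, doubling from 1s, `session/resume` first with `session/load` as the fallback) can be sketched roughly as follows. The function names, the closure-based shape, and the choice to wait before each attempt are assumptions made for illustration; the real implementation is asynchronous and also pauses while offline.

```swift
/// Delays in seconds for each attempt: 1, 2, 4, 8, 16 by default.
func reconnectDelays(maxAttempts: Int = 5, base: Double = 1.0) -> [Double] {
    var delays: [Double] = []
    var delay = base
    for _ in 0..<maxAttempts {
        delays.append(delay)
        delay *= 2
    }
    return delays
}

/// Returns the 1-based attempt number that succeeded, or nil when all attempts failed.
/// `resume`, `load`, and `wait` are hypothetical hooks standing in for the ACP calls and a sleep.
func runReconnectCycle(resume: () -> Bool,
                       load: () -> Bool,
                       wait: (Double) -> Void) -> Int? {
    for (index, delay) in reconnectDelays().enumerated() {
        wait(delay)                      // assumption: back off before every attempt
        if resume() || load() {          // try session/resume, then session/load
            return index + 1
        }
    }
    return nil
}
```

Injecting `wait` as a closure keeps the schedule testable without real sleeps.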
### Cached snapshot fallback (Mac + iOS)
`ServerTransport.cachedSnapshotPath` lets `HermesDataService` fall back to the previously-pulled `state.db` snapshot when a fresh pull fails. `isUsingStaleSnapshot` + `lastSnapshotMtime` surface to views so they render *"Last updated X ago."* Chat-history reload still passes `forceFresh: true` to refuse stale data; everything else (Dashboard, Sessions list, Activity) gets read-while-disconnected for free.
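The fallback rule described above can be sketched in a few lines: a failed fresh pull falls back to the cached snapshot unless the caller insists on fresh data. `resolveSnapshot`, `SnapshotResult`, and the closure parameter are hypothetical names for this sketch; only `forceFresh` and the stale-snapshot flag come from the notes.

```swift
struct SnapshotResult {
    let path: String
    let isStale: Bool   // mirrors the idea behind `isUsingStaleSnapshot`
}

enum SnapshotError: Error { case pullFailed }

func resolveSnapshot(forceFresh: Bool,
                     cachedPath: String?,
                     pullFresh: () throws -> String) throws -> SnapshotResult {
    do {
        return SnapshotResult(path: try pullFresh(), isStale: false)
    } catch {
        // Chat-history reloads pass forceFresh and refuse stale data;
        // every other caller accepts the previously pulled snapshot.
        guard !forceFresh, let cached = cachedPath else { throw error }
        return SnapshotResult(path: cached, isStale: true)
    }
}
```

Views can then key a "Last updated X ago" label off `isStale` together with the snapshot file's modification time.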
### Bounded message-history paging
`HermesDataService.fetchMessages(sessionId:limit:before:)` paginates by id desc with centralized `HistoryPageSize` constants. `RichChatViewModel.loadEarlier()` walks back through long sessions via `oldestLoadedMessageID` + `hasMoreHistory`. Legacy unbounded overload deprecated.
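As an illustration of the cursor-based paging described above, here is an in-memory sketch. It assumes integer message ids that grow over time and treats a short page as the end of history; the real service queries `state.db`, and its types and page-size constants are not reproduced here.

```swift
struct Message: Equatable {
    let id: Int      // assumption: integer row ids that grow over time
    let text: String
}

/// Newest-first page of messages strictly older than `before` (nil starts from the newest).
func fetchPage(from store: [Message], limit: Int, before: Int?) -> [Message] {
    let candidates = store.filter { message in
        before.map { message.id < $0 } ?? true
    }
    return Array(candidates.sorted { $0.id > $1.id }.prefix(limit))
}

/// Walks back one page at a time, tracking the cursor and whether more history may exist.
func loadWholeHistory(from store: [Message], pageSize: Int) -> (ids: [Int], pages: Int) {
    precondition(pageSize > 0, "pageSize must be positive")
    var loadedIDs: [Int] = []
    var oldestLoadedID: Int? = nil
    var hasMoreHistory = true
    var pages = 0
    while hasMoreHistory {
        let page = fetchPage(from: store, limit: pageSize, before: oldestLoadedID)
        pages += 1
        loadedIDs += page.map { $0.id }
        oldestLoadedID = page.last?.id ?? oldestLoadedID
        // Assumption: a short page means the start of the session was reached.
        hasMoreHistory = page.count == pageSize
    }
    return (loadedIDs, pages)
}
```

Because each page is bounded by `limit`, memory use per load stays constant no matter how long the session is.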
### Bug fixes
#### Mac
- **[#46](https://github.com/awizemann/scarf/issues/46): chat O(n)-per-token bog-down.** The trailing-group patch already shipped in 2.5.1; this release retains the fix and pairs it with the new history paging so chats with thousands of messages stay smooth.
- **[#19](https://github.com/awizemann/scarf/issues/19) layer-3 — sqlite3 false-negative in diagnostics.** Already in v2.5.1; kept here.
- **[#44](https://github.com/awizemann/scarf/issues/44) — pill / diagnostics agreement** via shared `SSHScriptRunner`. From v2.5.1; the tier-2 probe now also checks `state.db` (not just `config.yaml`) so a healthy fresh install reports green.
- **[#59](https://github.com/awizemann/scarf/issues/59): Settings → Model and Credential Pools no longer freeze.** Both views called `ModelCatalogService.loadProviders()` synchronously from `.onAppear` on the MainActor; on a remote SSH context that's a multi-megabyte SSH file read on the main thread, freezing the UI for 1–2 minutes. New `loadProvidersAsync()` / `loadModelsAsync(for:)` wrappers dispatch off the main thread; both views now use `.task` + `await` with a `ProgressView("Loading providers…")` overlay. Per-provider switching in the picker is also async now, so clicking a different provider doesn't re-freeze the UI.
- **Diagnostics tri-state.** Hermes v0.11+ doesn't materialize `config.yaml` until the user changes a setting from defaults — so the diagnostics view was reporting *"12/14 passing"* on healthy fresh installs. The probe now distinguishes `.pass` / `.fail` / `.skipped`; a missing `config.yaml` emits SKIP and is excluded from the summary's denominator. Reads as *"12/12 passing (2 optional skipped)"* instead of the misleading 12/14.
- **Credentials: OAuth providers visible.** `hasAnyAICredential()` only probed `credential_pool.<provider>` in `auth.json`; OAuth-authed providers land under `providers.<name>.access_token` (Nous, Spotify, GH Copilot ACP, Qwen, Gemini all use that path). The chat banner kept showing *"No AI provider credentials"* even after a successful Nous sign-in. Now both shapes count. Credential Pools view gains a parallel "OAuth providers" section listing OAuth-authed providers with token tail, expiry badge, and portal URL.
- **Project-shadowed Hermes detection.** New `ProjectHermesShadowDetector` (ScarfCore) probes each registered project at chat-start; if a `.hermes/` dir or `hermes.yaml` is found inside the project, the user gets a banner explaining that project-local Hermes config will shadow the server-level one (a quiet failure mode for users who didn't realize Hermes prefers project-local config).
- **[#58](https://github.com/awizemann/scarf/issues/58) — Mac chat side panes are hideable.** Two toolbar buttons next to the View picker (`sidebar.left` / `sidebar.right`) toggle the sessions list and tool inspector with a slide animation; both default visible (today's behavior). Clicking a tool card auto-shows the inspector if hidden so the click never silently dies. Settings → Display → Chat density gains parity Toggle rows.
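The tri-state diagnostics summary from the list above comes down to excluding skipped probes from the denominator. A minimal sketch, with `ProbeStatus` and `summarize` as hypothetical names:

```swift
enum ProbeStatus { case pass, fail, skipped }

func summarize(_ results: [ProbeStatus]) -> String {
    let passed = results.filter { $0 == .pass }.count
    let skipped = results.filter { $0 == .skipped }.count
    // Skipped probes are optional, so they do not count against the total.
    let counted = results.count - skipped
    let base = "\(passed)/\(counted) passing"
    return skipped == 0 ? base : "\(base) (\(skipped) optional skipped)"
}
```

With twelve passing probes and two skipped ones this yields the "12/12 passing (2 optional skipped)" wording quoted above, where a pass/fail-only model would have reported 12/14.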
#### ScarfGo (iOS)
- **[#56](https://github.com/awizemann/scarf/issues/56) — *"Citadel.SSHClient.CommandFailed error 1"* on dashboard.** `asyncSnapshotSQLite` was missed during the v2.5.0 Citadel hardening — used raw `executeCommand` (which discards stderr on non-zero exit) and didn't prepend the Citadel-friendly `PATH=$HOME/.local/bin:/opt/homebrew/bin:/usr/local/bin:$PATH`. Now uses `executeCommandStream` and the same PATH prefix. `HermesDataService.humanize` already translates `sqlite3: command not found` / `permission denied` / `no such file` into actionable user copy — the bug was that the snapshot path never fed it real stderr.
- **[#57](https://github.com/awizemann/scarf/issues/57) — keyboard-dismiss chevron over send button.** The keyboard accessory dismiss button added in v2.5.1 (#51) was placed at the trailing edge of the keyboard toolbar, directly above the trailing-edge send button. Moved to the leading edge — matches the iOS convention (Notes, Mail, Reminders).
### New features (Mac)
- **Chat-start model preflight ([commit](https://github.com/awizemann/scarf/commit/2aab9da)).** Catches a missing `model.default` / `model.provider` in `config.yaml` *before* the ACP session starts. Pre-fix the user typed a prompt, hit send, and got an opaque *"Model parameter is required"* HTTP 400 from the upstream provider. Now `ChatModelPreflightSheet` wraps the existing model picker so the same selection / validation / Nous-catalog branch is single-sourced; the chat the user originally opened lands without re-clicking the project row.
- **Nous Portal live model catalog.** `NousModelCatalogService` fetches `GET /v1/models` from `inference-api.nousresearch.com` using the bearer token in `auth.json`. Cached at `~/.hermes/scarf/nous_models_cache.json` with a 24h TTL. The picker's nous-overlay detail view switches from a free-form TextField to a real model list, with a *"Custom…"* escape hatch for IDs not yet in the API response.
- **Remote-aware admin sheets.** Three sheets gained the same context-aware Verify pattern that Add Project got in v2.5.1 (#54):
- **Profiles → Import / Export.** Buttons that drive `hermes profile import <zip>` / `hermes profile export <name> <zip>` over SSH. Local context picks via `NSOpenPanel`; remote context shows a path-input + Verify button.
- **Settings → Advanced → Restore.** Pick a local backup zip OR enter+verify a remote path.
- **Templates → Install destination.** The parent-directory step in the install sheet branches on context — local Browse, or remote text-input + Verify.
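For the 24h model-catalog cache mentioned in the list above, the freshness check might look like the sketch below. The `CachedModels` field names are assumptions (the on-disk JSON schema is not shown in these notes), and `fetch` stands in for the `GET /v1/models` call.

```swift
import Foundation

struct CachedModels: Codable {
    let fetchedAt: Date       // hypothetical field name
    let modelIDs: [String]    // hypothetical field name
}

func isFresh(_ cache: CachedModels,
             now: Date,
             ttl: TimeInterval = 24 * 60 * 60) -> Bool {
    now.timeIntervalSince(cache.fetchedAt) < ttl
}

/// Serves the cache while it is inside the TTL, otherwise refetches and restamps it.
func loadModels(cache: CachedModels?, now: Date, fetch: () -> [String]) -> CachedModels {
    if let cache = cache, isFresh(cache, now: now) {
        return cache
    }
    return CachedModels(fetchedAt: now, modelIDs: fetch())
}
```

Passing `now` in explicitly keeps the TTL logic deterministic under test.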
### Translations
`Localizable.xcstrings` adds strings for all the new copy across the seven supported locales (English, Simplified Chinese, German, French, Spanish, Japanese, Brazilian Portuguese).
### Notes for users running 2.5.1
No data migrations needed. `~/.hermes/scarf/nous_models_cache.json` is created lazily on first use of the Nous picker; everything else is forward-compatible with existing config / Keychain / project registries.
@@ -0,0 +1,134 @@
## What's in 2.6.0
A major release tracking **Hermes v2026.4.30 (v0.12.0)** — the largest single Hermes update Scarf has had to follow since v0.10's Tool Gateway. Headline additions: the autonomous **Curator**, **multimodal image input** in chat, **5 new inference providers**, **Microsoft Teams + Yuanbao** gateway platforms, a **read-only Kanban** view, and ScarfGo gains read-only Webhooks/Plugins/Profiles plus a Hermes-version banner.
Pre-v0.12 Hermes hosts are fully supported. Every new surface is gated on a runtime capability detector (`hermes --version` → semver), so users on older Hermes installs see the v2.5 surface unchanged. UI doesn't appear until the underlying CLI subcommand exists.
### Curator (Mac + iOS)
Hermes v0.12's autonomous skill curator prunes / consolidates / archives agent-created skills on a 7-day schedule. Scarf adds a dedicated **Curator** sidebar item under Interact (Mac) and a Curator nav row under the System tab (iOS).
- **Status panel** — enabled/paused/disabled badge, last-run timestamp, last summary, run count, scheduling cadence (interval / stale-after / archive-after).
- **Run Now** button triggers `hermes curator run`; pause/resume from the kebab menu.
- **Three leaderboards** — least-recently-active, most-active, least-active. Each row carries activity / use / view / patch counters and an inline pin toggle.
- **Pin / unpin** — pinned skills are protected from auto-archive and rewrites. State pulled from `~/.hermes/skills/.curator_state` and surfaced as a pin glyph everywhere skills appear (Curator screen, Skills sidebar/list, SkillDetailView).
- **Restore archived** sheet calls `hermes curator restore <name>` to bring a previously-archived skill back.
- **Last report Markdown** — when present, the previous run's REPORT.md renders inline in mono.
Capability-gated; sidebar item disappears on pre-v0.12 hosts.
### Multimodal image input in chat (Mac + iOS)
Hermes v0.12 advertises `prompt_capabilities.image = true` on ACP and accepts image content blocks in `session/prompt`. Scarf wires the producer side on both targets:
- **Mac**: paperclip toolbar button on the chat composer opens NSOpenPanel multi-pick. Drag-and-drop and paste also work — drop an image (or a Finder file URL) onto the composer and it attaches. Capability-gated; the entire attachment surface is hidden on pre-v0.12 hosts.
- **iOS**: paperclip button opens PhotosPicker (multi-select up to 5 photos). Same byte-for-byte capability gate.
- **ImageEncoder** downsamples to 1568px long-edge (Anthropic's recommended ceiling) at JPEG q=0.85, so a 12 MP screenshot lands under ~300 KB on the wire. Detached only — never blocks MainActor.
- **Image-only sends are valid** — once at least one attachment is queued, the send button enables even with empty text. Vision models accept "describe this" with no caption.
- **Per-attachment chips** above the input field with thumbnail + filename tooltip + X to remove. 5-image-per-message cap; total payload stays under ~2 MB so cellular sends don't time out.
Hermes routes the resulting prompt to a vision-capable model automatically — no extra Scarf-side work to pick the right aux model.
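To make the 1568px long-edge rule concrete, the sketch below computes only the target pixel size; the JPEG encoding step is left out. `downsampledSize` is a hypothetical helper, and leaving smaller images untouched (no upscaling) is an assumption.

```swift
/// Scales so the longer edge is at most `maxLongEdge`, preserving aspect ratio.
func downsampledSize(width: Int, height: Int, maxLongEdge: Int = 1568) -> (width: Int, height: Int) {
    let longEdge = max(width, height)
    // Assumption: images already within the ceiling are left untouched.
    guard longEdge > maxLongEdge else { return (width, height) }
    let scaledWidth = (Double(width) * Double(maxLongEdge) / Double(longEdge)).rounded()
    let scaledHeight = (Double(height) * Double(maxLongEdge) / Double(longEdge)).rounded()
    return (Int(scaledWidth), Int(scaledHeight))
}
```

A 12 MP photo at 4032×3024 comes out at 1568×1176, roughly 1.8 MP, before JPEG compression is applied.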
### 5 new inference providers (Mac + iOS)
Five overlay-only providers added to `ModelCatalogService.overlayOnlyProviders`. The model picker reaches all of them; provider IDs match `HERMES_OVERLAYS` in `hermes_cli/providers.py` exactly so a typo here doesn't strand users with an unreachable provider.
- **GMI Cloud** (api_key) — `https://api.gmi-serving.com/v1`
- **Azure AI Foundry** (api_key) — base URL resolved from `AZURE_FOUNDRY_BASE_URL` per tenant
- **LM Studio** (api_key, first-class) — promoted from custom-endpoint alias to a real provider; defaults to `http://127.0.0.1:1234/v1`
- **MiniMax (OAuth)** (oauth_external) — `https://api.minimax.io/anthropic`
- **Tencent TokenHub** (api_key) — base URL resolved from `TOKENHUB_BASE_URL`
### `auxiliary.curator` aux task (Mac)
Hermes removed `auxiliary.flush_memories` entirely in v0.12 (the underlying memory pipeline was rewritten) and added `auxiliary.curator` so the curator's review fork can run on a separate model from the main agent. Settings → Auxiliary now surfaces a Curator row when the active host is v0.12+ (gated on `HermesCapabilities.hasCuratorAux`); the obsolete Flush Memories panel is gone.
The Tool Gateway health view in HealthView lost the flushMemories-routes-through-Nous row and gained a curator row, matching the new aux task list.
### Skills v0.12 surface (Mac + iOS)
Three new capabilities Scarf can now reach:
- **Direct-URL install** — `hermes skills install <https-url>` lets users pull a one-off skill without going through a registry. Mac SkillsView gains an "Install from URL…" toolbar button (capability-gated) opening a sheet with the URL field plus optional `--category` / `--name` overrides.
- **Reload** — `hermes skills audit` rescans the skills directory and refreshes the agent's view without a session restart. Wired to a "Reload" toolbar button next to the install button on Mac.
- **Enabled / disabled state** — `skills.disabled` in config.yaml is read at scan time. Disabled skills render strikethrough + an "OFF" pill on Mac and iOS rows; iOS detail view explains the state in plain text.
- **Curator pin badge** — pinned-skill names from `~/.hermes/skills/.curator_state` surface as a pin glyph on each row across Mac sidebar and iOS list, plus an explanatory chip on iOS detail view.
The disable-toggle write path is deferred to v2.7 — Hermes only exposes `hermes skills config` as an interactive verb today, and we'd rather read accurately than risk clobbering the user's list with a half-tested write.
### Cron — `--workdir` flag (Mac)
Hermes v0.12 cron jobs accept `--workdir <absolute-path>` to inject AGENTS.md / CLAUDE.md / .cursorrules from that directory and pin cwd for terminal/file/code_exec tools. Scarf's CronJobEditor now has a Workdir field; both create and edit paths forward the flag. Existing v0.11 jobs keep the no-cwd behaviour by leaving the field blank.
The `context_from` chaining field is read-only from Scarf this round (Hermes hasn't exposed a `--context-from` CLI flag yet, only YAML).
### Microsoft Teams + Yuanbao (Mac)
Two new gateway platforms. Microsoft Teams (the 19th platform) ships as a plugin; Yuanbao 元宝 (the 18th) is a native gateway adapter. Both surface in the Platforms tab with read-only setup panels — the OAuth dance for Yuanbao and the plugin install for Teams happen outside Scarf.
### Read-only Kanban (Mac)
Hermes v0.12 ships a SQLite-backed multi-tenant task board with a full CLI (`hermes kanban create / list / claim / dispatch / …`). The multi-profile *collaboration* layer was reverted upstream while the design is reworked, so v2.6 ships a **read-only** Kanban view: paginated table of `hermes kanban list --json` filtered by status, with status badges, meta chips (id / assignee / workspace / skills), and per-row metadata. 5-second polling while the view is foregrounded; suspended on disappear.
Create / claim / dispatch UI is deferred until upstream stabilizes — building the editor now would risk rework on a quarter-out timeline.
### Settings deltas (Mac)
A new **Caching & Redaction** section under Settings → Advanced with three v0.12 knobs (gated on capability):
- **Prompt cache TTL** picker — 5m default / 1h opt-in. Reduces cache writes on long agent loops with stable system prompts.
- **Redact secrets in patches** toggle — Hermes flipped this off by default in v0.12 because the substitution corrupted patches; security-sensitive users can flip it back on here.
- **Runtime metadata footer** toggle — opt-in compact footer on each final reply (provider/model/cost/turn count).
TTS provider list gains **piper** (native local TTS engine new in v0.12). Terminal backend list gains **vercel** (Vercel Sandbox backend for execute_code/terminal). Both ride along unconditionally — Hermes silently falls back when an older host doesn't recognize the value.
### iOS catch-up — Webhooks / Plugins / Profiles (read-only)
Three new System-tab nav rows in ScarfGo, all read-only:
- **Webhooks** — list of `hermes webhook list` output with description / deliver / events / route per row. "Platform not enabled" detection so a freshly-installed Hermes shows setup guidance instead of error noise.
- **Plugins** — filesystem-first scan over `~/.hermes/plugins/` with manifest reads (plugin.json or plugin.yaml). Enabled/disabled badge, version, source, path.
- **Profiles** — `hermes profile list` with active-profile highlighting from `~/.hermes/active_profile`. Tolerant of both Rich box-drawn and plain-text outputs.
None of the three are capability-gated: the underlying list verbs work on both v0.11 and v0.12. Create / edit / delete remain Mac-only, since they touch enough state that we keep them off the phone.
### Hermes-version banner (iOS)
Yellow banner at the top of the Dashboard tab when the active server is pre-v0.12. Lists the v0.12 capabilities the user is missing out on (curator, multimodal image input, new providers); one-tap session-dismiss; reappears on next app open. Hidden entirely on v0.12+ hosts.
### Internal — version-aware capability detection
The foundation of every gated surface above:
- `HermesCapabilities` value type parses `Hermes Agent v0.12.0 (2026.4.30)` from `hermes --version` output. Exposes booleans for each release-gated UI surface (`hasCurator`, `hasACPImagePrompts`, `hasKanban`, `hasOneShot`, `hasSkillURLInstall`, `hasFallbackCommand`, `hasUpdateCheck`, `hasPiperTTS`, `hasVercelTerminal`, `hasCuratorAux`, `hasTeamsPlatform`, `hasYuanbaoPlatform`, `hasCronWorkdir`, `hasPromptCacheTTL`, `hasRedactionToggle`, `hasFlushMemoriesAux`).
- `HermesCapabilitiesStore` (`@Observable @MainActor`) caches per-server capabilities. Injected on `ContextBoundRoot` (Mac) and `ScarfGoTabRoot` (iOS) via `.environment(_:)` and `.hermesCapabilities(_:)`.
- 12 parser tests + 6 curator-output parser tests lock the v0.12 / v0.11 / fallback flag matrices.
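A rough sketch of the version-gating idea above: parse the semver out of the `hermes --version` line and derive a boolean from it. The type names here are hypothetical, only one flag is shown, and treating unparseable output as an older host is an assumption about the fallback policy.

```swift
struct HermesVersion: Comparable {
    let major: Int, minor: Int, patch: Int
    static func < (a: HermesVersion, b: HermesVersion) -> Bool {
        (a.major, a.minor, a.patch) < (b.major, b.minor, b.patch)
    }
}

/// Finds the first whitespace-separated token shaped like "v<major>.<minor>.<patch>".
func parseHermesVersion(_ output: String) -> HermesVersion? {
    for token in output.split(whereSeparator: { $0 == " " || $0 == "\n" }) {
        guard token.hasPrefix("v") else { continue }
        let pieces = token.dropFirst().split(separator: ".")
        let numbers = pieces.compactMap { Int($0) }
        if pieces.count == 3 && numbers.count == 3 {
            return HermesVersion(major: numbers[0], minor: numbers[1], patch: numbers[2])
        }
    }
    return nil
}

/// Hypothetical miniature of the capabilities value type, with a single flag.
struct SketchCapabilities {
    let hasCurator: Bool
    init(version: HermesVersion?) {
        let v012 = HermesVersion(major: 0, minor: 12, patch: 0)
        // Assumption: unparseable output is treated like a pre-v0.12 host.
        hasCurator = version.map { $0 >= v012 } ?? false
    }
}
```

Views then read the flag, never the version, so changing a feature's minimum version touches one place.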
### Bug fixes
#### Chat composer + transcript (post-merge round)
- **Typing lag in the chat composer (#67)** — `RichChatInputBar.updateMenuState()` ran on every keystroke and unconditionally wrote both `showMenu` and `selectedIndex`, tripping SwiftUI's "action tried to update multiple times per frame" warning and stalling input. Composer now coalesces writes to deltas, short-circuits when not in slash mode (the common case), and watches `commands.count` instead of re-allocating `commands.map(\.id)` per keystroke.
- **Chat font-size slider had no visible effect (#68)** — `RichChatView` only set `\.dynamicTypeSize`, but `ScarfFont` tokens are fixed-point (`Font.system(size: 14, …)`) so dynamic type didn't reach bubble text, reasoning, tool chips, code blocks, or markdown headings. New `\.chatFontScale` env value plumbed through `RichMessageBubble`, `MarkdownContentView`, and `CodeBlockView`; `ChatFontScale.{body, caption, captionStrong, caption2, mono, monoSmall, codeBlock, codeInline}(_:)` helpers mirror the ScarfFont base sizes so 100% is byte-for-byte identical to today's UI.
- **Placeholder ghosting on first keystroke (#65)** — `TextEditor`'s NSTextView surfaces a typed glyph one frame before the SwiftUI binding propagates, so the bare `if text.isEmpty` overlay rendered the translucent placeholder text on top of the just-typed character. Pinned an opaque background behind the placeholder rect and switched the conditional to `.opacity(...)` so the view tree stays stable per keystroke.
- **Draft text leaked between conversations (#62)** — composer `@State` survived session switches because the surrounding view tree was structurally identical. Bound `RichChatInputBar`'s identity to `richChat.sessionId` so SwiftUI rebuilds the view (and its `@State`) on session change. Stable fallback string for the "no session selected" window — `UUID()` would have minted a new id per body re-eval and trashed the composer mid-typing.
- **Sent message rendered blank after navigating away (#63)** — when a user sent a prompt and immediately resumed a different session before Hermes flushed the row to state.db, `resumeSession`'s `reset()` cleared `messages` and `loadSessionHistory` then read an as-yet-empty DB. New per-session pending-user-messages cache survives `reset()` and re-injects still-pending entries on load; entries clear themselves as soon as a matching DB row catches up.
- **No completion notification (#64)** — sending a long prompt and switching to other work required polling the chat to know when the response landed. New `ChatNotificationService` fires a local `UNUserNotificationCenter` banner on prompt completion when Scarf isn't the foreground app. Settings → Display → Feedback → "Notify when Hermes finishes" toggle, default on.
- **Per-message TTS playback (#66)** — small speaker glyph in each settled assistant bubble's metadata footer; uses `AVSpeechSynthesizer` with the user's macOS Spoken Content default voice, picks up offline. Markdown control characters stripped before speech. The deeper Settings → Voice provider integration (Edge / ElevenLabs / OpenAI / NeuTTS / Piper) is queued as a v2.7 follow-up.
- **ACP control-message timeout under gateway concurrency (#61)** — bumped 30s → 60s. State.db lock contention on a healthy host clears in seconds, but the previous 30s watchdog tripped under realistic gateway+ACP concurrency (Discord sync / skill registration / cron scheduling holding write locks during ACP `initialize` / `session/new` / `session/load`). 60s gives lock resolution headroom while still surfacing genuinely broken transports.
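The pending-user-messages cache from the #63 fix above can be pictured as a per-session list that outlives a view-model reset and empties itself as the database catches up. In this sketch the type name is hypothetical and matching pending entries to persisted rows by text is a simplifying assumption; the real matching rule is not described in these notes.

```swift
struct PendingMessageCache {
    private var pending: [String: [String]] = [:]   // sessionId -> prompt texts not yet in the DB

    init() {}

    mutating func add(_ text: String, sessionId: String) {
        pending[sessionId, default: []].append(text)
    }

    /// Returns persisted history plus any still-pending prompts, and forgets
    /// the entries that a persisted row now covers.
    mutating func merge(sessionId: String, persisted: [String]) -> [String] {
        let stillPending = (pending[sessionId] ?? []).filter { !persisted.contains($0) }
        pending[sessionId] = stillPending.isEmpty ? nil : stillPending
        return persisted + stillPending
    }

    func hasPending(sessionId: String) -> Bool {
        pending[sessionId] != nil
    }
}
```

Keeping the cache outside the state that `reset()` clears is what lets a just-sent prompt reappear when the user returns to the session.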
#### Pre-merge
- **Test target compile** — `M5FeatureVMTests.ScriptedTransport` had drifted off the `ServerTransport` protocol after `cachedSnapshotPath` landed in v2.5.2; added the missing stub. `M0dViewModelsTests` got the `ConnectionStatusViewModel.Status.degraded` argument-name update. `CredentialPoolsGatingTests` got the missing `import ScarfCore`. The full `swift test` suite now runs (and passes — 215 tests across 17 suites).
- **iOS package compile** — `RemoteBackupService.zipDirectory` and `RemoteRestoreService.unzipArchive` used `Foundation.Process` unconditionally, breaking the iOS build entirely (Process is unavailable on the iOS SDK). Wrapped in `#if !os(iOS)` with iOS stubs that throw — backup/restore is Mac-only by design.
### Hermes version
Targets Hermes **v2026.4.30 (v0.12.0)**. v2026.4.23 (v0.11.0) hosts continue to work — every v0.12 surface is gated on capability detection, so Scarf v2.6 against v0.11 looks identical to Scarf v2.5.2 against v0.11. Update Hermes (`hermes update`) to unlock the new surfaces.
### Compatibility
- macOS 14+ (unchanged)
- iOS 17+ (unchanged)
- Hermes v0.11+ for the v2.5 surface; v0.12+ for the new features above.
- No data migrations.
@@ -0,0 +1,78 @@
## What's in 2.6.5
A patch release that ships **template discoverability**, **cron observability**, and an **end-to-end UI test harness** that locks the new install path against regression. No breaking changes; every Hermes capability target is unchanged from 2.6.0.
### In-app Template Catalog
The catalog is no longer web-only. **Templates → Browse Catalog…** opens a sheet that fetches the live catalog from `awizemann.github.io/scarf/templates/`, renders one row per published template with name + version + tags, and one-click installs through the existing flow. Search filters across name / description / tags; the category picker constrains to whatever categories the loaded catalog actually carries.
- **Install-state badges** — each row shows "Installed v1.2.0" (green) or "Update v1.3.0" (amber) when the catalog version is newer than what's in `~/.hermes/scarf/projects.json`. Update is "uninstall + reinstall" today; in-place upgrade is on the v3 backlog.
- **24h cache** at `~/.hermes/scarf/catalog_cache.json` so opening the sheet repeatedly doesn't re-hit the network. Refresh icon force-fetches.
- **Bundled fallback** — fresh-install / offline users still see the official templates as a hardcoded list. Network failures serve stale cache with a "refresh failed" hint.
- **Catalog-schema decoder fault tolerance** — one malformed entry on the live catalog can't bring down the whole list. The bad row is dropped with a logged warning; the rest survive.
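One common way to get the fault tolerance described in the last bullet is a lossy element wrapper, so a single malformed entry decodes to `nil` instead of failing the whole array. The sketch below assumes the catalog is a top-level JSON array and invents the `CatalogEntry` fields; the real schema and decoder are not shown in these notes.

```swift
import Foundation

struct CatalogEntry: Decodable, Equatable {
    let name: String      // hypothetical field
    let version: String   // hypothetical field
}

/// Wraps one element so a decoding failure becomes nil rather than aborting the array.
struct Lossy<T: Decodable>: Decodable {
    let value: T?
    init(from decoder: Decoder) throws {
        value = try? T(from: decoder)
    }
}

func decodeCatalog(_ data: Data) throws -> [CatalogEntry] {
    let wrapped = try JSONDecoder().decode([Lossy<CatalogEntry>].self, from: data)
    let entries = wrapped.compactMap { $0.value }
    let dropped = wrapped.count - entries.count
    if dropped > 0 {
        // Stand-in for the logged warning mentioned above.
        print("warning: dropped \(dropped) malformed catalog entries")
    }
    return entries
}
```

The outer `decode` still throws when the payload is not an array at all, which is the case where falling back to the cache or bundled list makes sense.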
### HackerNews Daily Digest template
First template added under the new dogfooding-templates loop. Configurable `min_score`, `max_items`, `topics`; one daily-at-08:00 cron job (paused on install) that pulls the HN Firebase API, filters, and prepends a markdown digest to the project's `digest.md`. No API keys required. Live at the catalog URL above.
### Cron observability — auth-error banner + running indicator + log tail
Cron rows now surface the same OAuth-refresh-revoked recovery flow as Chat instead of a generic red dot, plus three previously-missing observability cues:
- **OAuth re-auth.** `ACPErrorHint.classify` runs on `job.lastError`; when it returns `oauthRefreshRevoked(provider)` the detail pane shows the human-readable hint + a **Re-authenticate** button that drops the user into Credential Pools — same wiring ChatView's banner uses. Unrecognized errors fall back to the legacy red `lastError` text.
- **Running indicator.** The row dot turns blue + pulses when `state == "running"` (precedence over disabled / error / success); the detail header gains a "running…" badge next to active/paused. No new polling — `HermesFileWatcher.lastChangeDate` already drives `CronViewModel.load()`.
- **Last run output.** Collapsible panel replacing the inline log: a one-line summary (`<timestamp> — ok|error|running…`) always visible, full monospaced terminal-style scroll on expand, auto-scrolls to bottom when new runs land.
Also fixes a pre-existing bug in `HermesFileService.loadCronOutput` that returned the wrong file under Hermes's per-job-id output nesting.
### Layer B install-drive XCUITest harness
The dogfooding-templates initiative ships its first end-to-end UI test that drives the install pipeline:
```
Launch with --scarf-test-mode → Sidebar → Projects → Install sheet
(via --scarf-test-install-url launch arg) → Configure → Open Project
→ Right-click → Uninstall Template → Confirm Remove → Done
```
Runs ~30 s green on the dev Mac, validates 9 assertion points across the user journey. Covers the new accessibility identifiers wired in this release: `templateConfig.commitButton`, `projects.row.<name>`, `sidebar.section.<rawValue>`, `projects.contextMenu.uninstallTemplate`, `templateUninstall.confirmRemove`, `templateInstall.success.openProject`, `templateUninstall.success.done`. The `--scarf-test-install-url` launch arg + `TestModeFlags.isTestMode` gating lets XCUITest skip SwiftUI Menu / NSToolbarItem accessibility-bridging quirks that otherwise block toolbar-menu driving.
Wiki [Test-Harness](https://github.com/awizemann/scarf/wiki/Test-Harness) documents how to extend the harness for the next template.
### Sentinel-marker test isolation (incident-response hardening)
`SCARF_HERMES_HOME` override now requires the path to contain a `.scarf-test-home-marker` file to activate. Without the marker, production code falls through to the user's real `~/.hermes/`. Lands belt-and-braces protection for cases where a test crashes mid-teardown leaving the env var set, an env var inherits from a parent shell, or a misconfigured launchctl plist exports the variable. The override remains the seam every E2E test relies on; the marker file ensures it can't accidentally pivot a non-test process off the user's data.
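As a rough illustration of the marker check, assuming a hypothetical `resolveHermesHome` helper (the actual resolver's name and signature are not given here):

```swift
import Foundation

/// Returns the Hermes home directory. The `SCARF_HERMES_HOME` override is
/// honored only when the target directory contains the sentinel marker file;
/// otherwise the user's real `~/.hermes` is returned.
func resolveHermesHome(
    environment: [String: String] = ProcessInfo.processInfo.environment,
    fileManager: FileManager = .default
) -> URL {
    let realHome = URL(fileURLWithPath: NSHomeDirectory(), isDirectory: true)
        .appendingPathComponent(".hermes")

    guard let override = environment["SCARF_HERMES_HOME"], !override.isEmpty else {
        return realHome
    }

    let candidate = URL(fileURLWithPath: override, isDirectory: true)
    let marker = candidate.appendingPathComponent(".scarf-test-home-marker")

    // A leaked or inherited env var without the marker falls through to real data.
    return fileManager.fileExists(atPath: marker.path) ? candidate : realHome
}
```

Injecting the environment and file manager keeps the check itself unit-testable without touching the process environment.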
### Chat fixes
- **OAuth refresh-revoked surface.** Chat-side error banner now classifies the message via `ACPErrorHint.classify` and offers an in-app **Re-authenticate** button that routes through Credential Pools (#65). Same primitive the new cron banner reuses.
- **Placeholder ghosting fix.** TextEditor's placeholder now clips to the editor's bounds and clears on focus instead of bleeding past the cursor area when the user types fast (#67).
### Profile chip + structured logs
- **Active-profile chip in the sidebar header.** Click → routes to Profiles. Local contexts only (remote SSH would mislead).
- **Switch & Relaunch** flow now writes `~/.hermes/active_profile` and relaunches Scarf in a single click instead of asking the user to quit+reopen.
- Profile-resolver logs are now structured (key=value form) so `log show … | grep ProfileResolver` can pull "which profile did Scarf resolve to and why" out of support requests.
### Swift 6 cleanup
- `MessageSpeechService` — drop `@preconcurrency` on the AVSpeechSynthesizerDelegate conformance now that the protocol's Sendable annotations are upstreamed.
- `ChatView`: `RichChatViewModel.PendingPermission: @retroactive Identifiable`. Quiets the Swift 6 compiler while keeping breakage loud if ScarfCore ever adds the conformance upstream.
- `CredentialPoolsView`: `.help(Text(verbatim:))` so backticks render literally instead of being treated as markdown inline-code.
### iOS
- Composer redesigned with HIG touch targets + clear disabled state.
- Portrait lock retained.
- Chat-start preflight moved off MainActor.
### Known caveats
- **Cron-job-uninstall by name is ambiguous** when two projects share the same template id. The Layer B test surfaced this — manifests as: the test passes, but if you've manually installed the same template before running the test, your real cron job can disappear. Recovery is `hermes cron create`. Fix is queued: store cron-job IDs in `<project>/.scarf/template.lock.json` at install time and resolve by ID at uninstall time.
- **Full-suite parallel test runs intermittently hang** — pre-existing flaky test infrastructure unrelated to this release. Individual suites all pass; the hang only manifests on `xcodebuild test` with everything concurrent. The sentinel-marker hardening prevents user-data damage from any race.
### Compatibility
- **Hermes target unchanged from 2.6.0**: v2026.4.30 (v0.12.0). Pre-v0.12 Hermes hosts continue to work — no new capability gates added in this release.
- **Min macOS unchanged**: 14.6.
- **No schema changes** to anything in `~/.hermes/`. The two new Scarf-owned files (`scarf/catalog_cache.json` and the template-installer's `.scarf-test-home-marker` for tests) are additive.
+155
@@ -0,0 +1,155 @@
## What's in 2.7.0
The biggest release since 2.6.0 — a six-week stretch covering **remote-context performance**, a **new project authoring flow**, **dashboard widgets**, **OAuth resilience**, and a top-to-bottom **performance instrumentation harness** that drove the bulk of the rest. 36 commits, no schema bump, no Hermes capability bump.
The throughline: Scarf got materially faster and more honest on slow remote SSH links, where 30-second sqlite timeouts and silently-empty UI used to be common. The skeleton-then-hydrate pattern, SSH cancellation propagation, and ScarfMon-driven diagnosis are the shape of how that work gets done now.
---
### Remote-context performance — chats and Activity in seconds, not 30s timeouts
Resuming a chat on a slow remote (a 420ms-RTT droplet, an underprovisioned VPS, a tunnel through 4G) used to fetch the full message column set in one shot, which routinely tripped the 30s SSH timeout on chats with multi-page tool result blobs. The 160-message session was broken; the 30-message session was broken too. Activity didn't load at all.
v2.7 introduces a **skeleton-then-hydrate pattern** that bounds the wire payload by what the user actually needs to see RIGHT NOW, then fills in the heavy stuff in the background:
- **Chat skeleton.** [`fetchSkeletonMessages`](https://github.com/awizemann/scarf/blob/main/scarf/Packages/ScarfCore/Sources/ScarfCore/Services/HermesDataService.swift) selects user + assistant rows only (skips `role='tool'`) with `tool_calls` / `reasoning` / `reasoning_content` hard-NULLed at the SQL level. Wire payload bounded by conversational text alone — typically a few KB. The chat appears in seconds. Background `startToolHydration` pages through `hydrateAssistantToolCalls` in 5-id batches to splice tool calls in. Tool-result CONTENT is **opt-in** via Settings → Display → "Load tool results in past chats" (default off); the inspector pane lazy-fetches per-result content via `fetchToolResult(callId:)` when you open a card.
- **Activity skeleton.** [`fetchRecentToolCallSkeleton`](https://github.com/awizemann/scarf/blob/main/scarf/Packages/ScarfCore/Sources/ScarfCore/Services/HermesDataService.swift) returns metadata-only rows (id + session_id + role + timestamp; everything else NULLed). Activity opens in <1s on remote with placeholder rows; real per-call entries swap in as paged hydration completes. New "Loading tool details…" pill in the page header surfaces hydration progress.
- **Single-id whale recovery.** When a 5-id batch trips the 30s timeout (one row carries an oversized `tool_calls` blob — a long Edit's args, a big diff), an L1 single-id retry isolates the offending row so the rest of the batch still hydrates. Whale row stays bare; assistant message stays readable.
- **Lazy tool result loading in the inspector.** Default-off avoids the bulk fetch. When you focus a tool call card, ChatInspectorPane fires `loadToolResultIfMissing(callId:)` which splices a single result into the message stream without re-fetching anything else.
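The batch-then-isolate behavior from the whale-recovery bullet can be sketched as a generic helper. `hydrate` and its `fetch` closure are hypothetical stand-ins for the paged `hydrateAssistantToolCalls` path; the batch size of 5 comes from the notes, everything else is an assumption.

```swift
/// Hydrates `ids` in batches. When a whole batch fails (for example, on a
/// timeout caused by one oversized row), each id is retried on its own so
/// only the offending row stays bare.
func hydrate<ID, Row>(
    ids: [ID],
    batchSize: Int = 5,
    fetch: ([ID]) async throws -> [Row]
) async -> (rows: [Row], bare: [ID]) {
    var rows: [Row] = []
    var bare: [ID] = []
    var start = 0

    while start < ids.count {
        let end = min(start + batchSize, ids.count)
        let batch = Array(ids[start..<end])
        start = end

        do {
            rows += try await fetch(batch)
        } catch {
            // Single-id retry: isolate the row that sank the batch.
            for id in batch {
                do {
                    rows += try await fetch([id])
                } catch {
                    bare.append(id)
                }
            }
        }
    }
    return (rows, bare)
}
```

The cost of a whale row in this shape is one failed batch plus up to `batchSize` single-id retries, rather than losing the rest of the batch.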
Effect: a 160-message thinking-model session that used to time out at exactly 30s now opens in under 2 seconds with placeholder cards filling in over the next few. Activity loads in 500-800ms.
#### SSH cancellation that actually cancels
`Task.detached { … }` doesn't inherit cancellation from the awaiting parent, and `Task<…> { … }` (unstructured) also drops the signal. Without explicit bridging, cancelling a chat-load Task only unwound Swift state: the underlying ssh subprocess kept running for the full 30s, pinning a remote sqlite query and a ControlMaster session slot. This produced the "third chat hangs" / "dashboard spins after rapid switching" symptom.
v2.7 wires `withTaskCancellationHandler` through [`SSHScriptRunner.run`](https://github.com/awizemann/scarf/blob/main/scarf/Packages/ScarfCore/Sources/ScarfCore/Transport/SSHScriptRunner.swift) and [`RemoteSQLiteBackend.query`](https://github.com/awizemann/scarf/blob/main/scarf/Packages/ScarfCore/Sources/ScarfCore/Services/Backends/RemoteSQLiteBackend.swift) so parent cancellation reaches the `Process` and calls `proc.terminate()` within 100ms. New `ssh.cancelled` ScarfMon event surfaces this.
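A minimal sketch of that bridging, assuming a hypothetical `runCancellable` helper rather than the real `SSHScriptRunner.run` signature. `ProcessBox` exists only because `Process` is not `Sendable`; the `@unchecked` annotation is a simplification for the sketch.

```swift
import Foundation

// Process is not Sendable, so this sketch boxes it for capture in @Sendable closures.
final class ProcessBox: @unchecked Sendable {
    let process = Process()
}

/// Runs a subprocess and terminates it when the awaiting task is cancelled.
func runCancellable(_ executable: URL, arguments: [String]) async throws -> Int32 {
    let box = ProcessBox()
    box.process.executableURL = executable
    box.process.arguments = arguments

    let status: Int32 = try await withTaskCancellationHandler {
        // Skip the launch entirely if cancellation already happened.
        try Task.checkCancellation()
        return try await withCheckedThrowingContinuation { (continuation: CheckedContinuation<Int32, Error>) in
            box.process.terminationHandler = { finished in
                continuation.resume(returning: finished.terminationStatus)
            }
            do {
                try box.process.run()
            } catch {
                box.process.terminationHandler = nil
                continuation.resume(throwing: error)
            }
        }
    } onCancel: {
        // Parent cancellation reaches the child process here.
        if box.process.isRunning { box.process.terminate() }
    }

    // Report cancellation as an error rather than as a signal exit status.
    try Task.checkCancellation()
    return status
}
```

A cancel that lands in the short window between the initial check and `run()` is not handled by this sketch; production code would need to close that gap.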
#### In-flight coalescing for `loadRecentSessions`
File-watcher deltas during an active stream used to stack 2-3 parallel sessions-list reload tasks (the 500ms `scheduleSessionsRefresh` debounce only suppresses a pending tick, not one already executing). Subsequent callers now await the in-flight load instead of spawning a parallel SSH subprocess. New `mac.loadRecentSessions.coalesced` event tracks dedup hits.
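One way to express that coalescing is an actor that remembers the in-flight task. `SessionsLoader` and its `[String]` payload are hypothetical; the counter stands in for the `mac.loadRecentSessions.coalesced` event.

```swift
/// Coalesces concurrent loads: callers that arrive while a load is running
/// await the same task instead of starting another fetch.
actor SessionsLoader {
    private var inFlight: Task<[String], Error>?
    private(set) var coalescedHits = 0
    private let fetch: @Sendable () async throws -> [String]

    init(fetch: @escaping @Sendable () async throws -> [String]) {
        self.fetch = fetch
    }

    func loadRecentSessions() async throws -> [String] {
        if let existing = inFlight {
            coalescedHits += 1
            return try await existing.value
        }

        let fetch = self.fetch
        let task = Task { try await fetch() }
        inFlight = task
        // Clear the slot once this load finishes so the next call fetches fresh data.
        defer { inFlight = nil }
        return try await task.value
    }
}
```

Because the slot is set before the first suspension point, actor reentrancy cannot let a second caller slip past the check while the first load is still running.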
#### Loading-state UX hardening
The Mac chat sidebar greys out and disables row taps the moment a session-switch is initiated (synchronously, before `client.start()` returns), with a floating ProgressView showing the current phase: **"Spawning hermes acp…"** → **"Authenticating…"** → **"Loading session…"** → **"Loading history…"** → **"Ready"**. Pre-fix the sidebar looked engageable while the 5-7 second SSH+ACP boot was still in flight, and the user could queue up a second session-switch behind the first. New `isStartingSession` flag flips on user click for instant feedback.
#### Partial-result + mismatch + pinned-model banners
- **Partial-result banner.** When the skeleton fetch trips an SSH transport failure (rather than a clean empty result), the chat surfaces "Couldn't load full chat history — the connection to *server* timed out" through the existing `acpError` triplet, plus forces `hasMoreHistory = true` so the "Load earlier" affordance shows up. Replaces the pre-fix silent empty transcript.
- **Model/provider mismatch banner.** [`ModelPreflight.detectMismatch`](https://github.com/awizemann/scarf/blob/main/scarf/Packages/ScarfCore/Sources/ScarfCore/Services/ModelPreflight.swift) recognizes when `model.default` carries a `<provider>/...` prefix that disagrees with `model.provider` (e.g. `anthropic/claude-sonnet-4.6` + `provider: nous` after switching OAuth via Credential Pools). Banner offers one-click fix in either direction.
- **Pinned-model failure hint.** ACP error classifier now recognizes `model_not_found` / `404 messages` / `model is not available` and surfaces "This session was created with a model the provider no longer offers — start a new chat to use your current model" so the pinned-model failure mode has a clear recovery path.
- **OAuth-completion provider swap.** After a successful OAuth in Credential Pools, if the just-authed provider differs from `model.provider`, surface "Switch active provider to *name*?" with [Switch] / [Keep current] instead of auto-dismissing.
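The mismatch check from the second bullet can be approximated as follows. This is a sketch under assumptions: `ModelMismatch` and this free-function `detectMismatch` are hypothetical, and the case-insensitive comparison is a guess rather than documented behavior of `ModelPreflight.detectMismatch`.

```swift
import Foundation

// Hypothetical result type for this sketch.
struct ModelMismatch: Equatable {
    let modelPrefix: String
    let provider: String
}

/// Reports a mismatch when `model.default` carries a `<provider>/...` prefix
/// that disagrees with `model.provider`.
func detectMismatch(modelDefault: String, provider: String) -> ModelMismatch? {
    // Trim newlines too, so a stray "\n" from a hand-edited config.yaml
    // does not produce a false positive.
    let model = modelDefault.trimmingCharacters(in: .whitespacesAndNewlines)
    let provider = provider.trimmingCharacters(in: .whitespacesAndNewlines)

    guard let slash = model.firstIndex(of: "/") else { return nil }
    let prefix = String(model[..<slash])

    guard !prefix.isEmpty, !provider.isEmpty,
          prefix.lowercased() != provider.lowercased() else { return nil }
    return ModelMismatch(modelPrefix: prefix, provider: provider)
}
```

Returning both sides of the disagreement is what would let a banner offer a fix in either direction.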
---
### New Project from Scratch wizard + Keychain-backed cron secrets
A **third project entry point** alongside Browse Catalog and Add Existing Project: a wizard that scaffolds a Scarf-standard project skeleton (`<project>/.scarf/dashboard.json` + AGENTS.md marker block), registers it, and hands off to a chat session that auto-activates the bundled `scarf-template-author` skill. The skill drives the rest conversationally — widgets, optional config schema, optional cron — and writes the final files itself. Wizard stays minimal because the agent does configuration better than a multi-step form. The skill ships bundled inside `Scarf.app/Contents/Resources/BuiltinSkills.bundle/` and copies into `~/.hermes/skills/` on launch (idempotent + version-gated).
**Cron + Keychain — `$SCARF_<SLUG>_<FIELD>` env vars.** Cron prompts that referenced `secret`-typed config fields used to get the literal `keychain://...` URI back when reading `config.json`, producing 401s. v2.7 mirrors resolved Keychain values into `~/.hermes/.env` under a marker-bounded block keyed by template slug:
```sh
# scarf-secrets:begin local-news-aggregator
SCARF_LOCAL_NEWS_AGGREGATOR_API_TOKEN=actual-value
SCARF_LOCAL_NEWS_AGGREGATOR_RSS_URL=https://example.com/feed
# scarf-secrets:end local-news-aggregator
```
Hermes already reloads `~/.hermes/.env` per cron tick, so credential rotation is automatic — just edit the value in Configuration → next tick sees it. The mirror runs at every state-change point: install, post-install Configuration save, uninstall, "Remove from List", and on app launch (reconciliation pass over registered projects). Source of truth stays in the Keychain — `config.json` keeps `keychain://` URIs unchanged. Mode 0600 enforced on `~/.hermes/.env`.
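A sketch of how a marker-bounded block might be rewritten without touching surrounding content. `envName` and `upsertSecretsBlock` are hypothetical helpers; the slug-to-variable rule (uppercase, hyphens to underscores) is inferred from the example block above, and file I/O plus the 0600 mode are left out.

```swift
import Foundation

/// Derives `SCARF_<SLUG>_<FIELD>` from a template slug and a config field name.
func envName(slug: String, field: String) -> String {
    "SCARF_\(slug)_\(field)"
        .uppercased()
        .replacingOccurrences(of: "-", with: "_")
}

/// Replaces (or appends) the managed block for `slug` inside `.env` text.
/// Lines outside the markers are kept as-is; an empty `values` removes the block.
func upsertSecretsBlock(
    in env: String,
    slug: String,
    values: [(key: String, value: String)]
) -> String {
    let begin = "# scarf-secrets:begin \(slug)"
    let end = "# scarf-secrets:end \(slug)"

    var kept: [String] = []
    var inside = false
    for line in env.components(separatedBy: "\n") {
        if line == begin { inside = true; continue }
        if line == end, inside { inside = false; continue }
        if !inside { kept.append(line) }
    }

    var result = kept.joined(separator: "\n")
    guard !values.isEmpty else { return result }

    if !result.isEmpty && !result.hasSuffix("\n") { result += "\n" }
    let block = [begin] + values.map { "\($0.key)=\($0.value)" } + [end]
    return result + block.joined(separator: "\n") + "\n"
}
```

Running the function again with the same values yields identical text, which is the property the launch-time reconciliation pass would rely on.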
Cron prompts now reference these env vars directly:
```json
{
"prompt": "Use the terminal: curl -sS -H \"Authorization: Bearer $SCARF_LOCAL_NEWS_AGGREGATOR_API_TOKEN\" \"$SCARF_LOCAL_NEWS_AGGREGATOR_RSS_URL\" -o {{PROJECT_DIR}}/.scarf/feed.xml"
}
```
**Migration.** First launch of v2.7 walks the project registry and writes the managed block per schemaful project — automatic. Existing cron prompts you wrote against the old (broken) `config.json` pattern still need updating: open the cron job in Scarf's Cron sidebar and edit the prompt, or ask the agent in chat ("Update my Local News cron job's prompt to use the new env var convention") — the bundled `scarf-template-author` skill (now v1.1.0) documents the convention with worked examples.
Also fixes [#75](https://github.com/awizemann/scarf/issues/75) — `_NSDetectedLayoutRecursion` on the Configuration form for projects whose form transitioned between stages with different intrinsic heights.
---
### Project dashboards — file-reading widgets, sparklines, typed status
Five new widget types, project-wide auto-refresh, and a structured error card for unknown widgets. Backwards-compatible — every existing `dashboard.json` renders byte-identically.
- **Project-wide auto-refresh.** [`HermesFileWatcher`](https://github.com/awizemann/scarf/blob/main/scarf/scarf/Core/Services/HermesFileWatcher.swift) used to watch each project's `dashboard.json` specifically. v2.7 promotes that to a watch on the entire `<project>/.scarf/` directory. A `markdown_file` or `log_tail` widget pointing at `<project>/.scarf/reports/foo.md` refreshes the moment a cron job rewrites the file. **By convention, place files the dashboard reads inside `.scarf/`** so the watch picks them up.
- **`markdown_file`** — renders a markdown file from disk through the same `MarkdownContentView` pipeline used by inline `text` widgets.
- **`log_tail`** — last `lines` of a file (default 20, max 200), monospaced, ANSI codes stripped.
- **`cron_status`** — last run / next run / state for one Hermes cron job by `jobId`, plus a small inline log tail. Read-only — Run/Pause/Resume controls stay on the Cron tab.
- **`image`** — local file (`path` relative to project root) or remote `url`. Optional `height` cap. Useful for matplotlib/Plotly PNGs the cron job generates.
- **`status_grid`** — compact NxM grid of colored cells, one per service / item, with hover labels.
- **`stat` widget gains inline sparklines.** Optional `sparkline: [Number]` field. Rendering is SVG-only, so dozens per dashboard cost nothing.
- **Typed status badges.** `list` items and `status_grid` cells share a typed enum (`success`, `warning`, `danger`, `info`, `pending`, `done`, `neutral`) with lenient decode for synonyms (`ok`/`up` → success, `down`/`error` → danger). Unknown strings render as plain text.
- **Structured widget error card.** Replaces the legacy "Unknown: \<type\>" placeholder with a card surfacing the title, specific reason, and a hint.
- **Schema mirror.** The widget vocabulary lives once at [`tools/widget-schema.json`](https://github.com/awizemann/scarf/blob/main/tools/widget-schema.json); the catalog validator reads from it and enforces per-type required fields.
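The typed status enum with lenient decode might look roughly like this. `WidgetStatus` and `init(lenient:)` are hypothetical names; the synonym table is limited to the four synonyms listed above, and the case-insensitive matching is an assumption.

```swift
import Foundation

/// Status vocabulary shared by `list` items and `status_grid` cells.
enum WidgetStatus: String, CaseIterable {
    case success, warning, danger, info, pending, done, neutral

    /// Lenient parse: accepts canonical names plus a few synonyms.
    /// Returns nil for unknown strings so the caller can render plain text.
    init?(lenient raw: String) {
        let key = raw.trimmingCharacters(in: .whitespacesAndNewlines).lowercased()
        switch key {
        case "ok", "up":
            self = .success
        case "down", "error":
            self = .danger
        default:
            guard let exact = WidgetStatus(rawValue: key) else { return nil }
            self = exact
        }
    }
}
```

A failable initializer keeps the "unknown strings render as plain text" fallback at the call site instead of inventing a catch-all case.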
---
### OAuth resilience + Credential Pools
- **Daily OAuth keepalive cron.** Prevents Anthropic OAuth refresh tokens from expiring after weeks of inactivity. New cron job `[scarf:oauth-keepalive]` (managed by Scarf) pings Hermes on a daily cadence; the in-app Refresh All Sessions action mirrors the same path on demand.
- **Remote re-auth.** Re-authenticating against a remote droplet's OAuth provider used to be blocked by the lack of a stdin path through SSHTransport. The OAuth flow now drives a remote `hermes auth add` correctly with stdin forwarded.
- **OAuth remove button.** Per-provider remove action in Credential Pools (auth.json edit), with confirmation dialog. Companion auto-refresh of the view when `auth.json` changes externally (file-watcher).
- **`resolve_provider_client` error classification.** When an auxiliary task references a provider whose credentials aren't loaded, Hermes prints `resolve_provider_client: <name> requested but <Display Name> not configured` to stderr — pre-fix this surfaced in chat as the opaque `-32603 Internal error` with no actionable detail. Now classified into a clear hint pointing at Settings → Aux Models.
- **Aux Tab unknown-task surface.** When `config.yaml` has an `auxiliary.<task>` block for a task Scarf doesn't know about (newer Hermes added it; Scarf hasn't caught up), render it as a plain row with the raw provider/model values instead of dropping it silently.
- **Credential Pools refresh after OAuth sheet dismiss.** Closing the OAuth sheet after a successful add now refreshes the list immediately instead of leaving the just-added pool hidden until the next file-watcher tick.
---
### ScarfMon — performance instrumentation harness
The diagnostic surface that drove the bulk of the v2.7 perf work. Off by default; signpost-only mode (Instruments-friendly) is free; Full mode (4096-entry in-memory ring buffer + os.Logger) is a click away in Settings → Diagnostics → Performance. Wiki: https://github.com/awizemann/scarf/wiki/Performance-Monitoring
- **Phases 1-3** built the core: dispatcher + ring buffer + 3 backends, chat / transport / sqlite measure points, diagnostic counters for chat-render bursts, finalize-burst dampening.
- **Tier A + B** added per-feature instrumentation: iOS file watcher, sessions list, model catalog, dashboard widgets, image encoder, message hydration.
- **Nous picker investigation** localized a 60s + 120s beach-ball to a specific path (Nous catalog `readCache`), then killed the 120s one with dedupe + 5s timeout.
- **Tier C catch-up** (this release): instrumented Memory / Skills / Cron / Curator load paths so future captures show how often these tabs cost multiple sequential SFTP RTTs on remote.
- **Per-call bytes recorded** on transport + sqlite events so captures show payload sizes alongside latencies.
- **`mac.emptyAssistantTurn` event** documents the Nous quirk where the model returns a thought stream with no body (the bubble looks like Hermes is "still thinking" but the turn already finished).
Adding a new measure point is two lines. The harness covers Mac and iOS uniformly. The "Copy as JSON" button exports the ring buffer for paste-into-issue diagnosis.
---
### Other fixes + polish
- **Sessions sidebar reload debounce** — file-watcher deltas during streaming used to flicker the sessions list. Coalesced into one trailing fetch ~500ms after the last tick.
- **Session-load pagination + race guard** — switching to a small chat while a larger one is mid-fetch could last-write-wins the small chat away. Three race-checks against `self.sessionId` prevent the stale fetch from overwriting.
- **Sessions + previews batched** — two separate SSH calls folded into one `queryBatch` round trip, halving the round-trips for every sidebar refresh.
- **Remote SQLite query timeout** bumped 15→30s to better tolerate slow links; in-flight query coalescing dedupes concurrent identical queries.
- **`Thread.sleep` spin replaced** with a kernel-wait via `DispatchGroup` for `runLocal` timeout; under concurrent SSH load the old loop accumulated spin-blocked threads and produced 7-second outliers in `loadRecentSessions`.
- **Window position + size** persists across launches.
- **Sidebar reorder** — Projects promoted to first section; profile chip moved under server name.
- **`stop` badge suppressed** on metadata footer for normal turn ends (it was firing for every clean completion, looking like an error).
- **Nous picker search field** + `model-picker` filter for the long Nous overlay model list.
- **`oauth-keepalive` cron create** — drop the `--silent` flag Hermes doesn't accept.
- **Snapshot pipeline rewritten** — replaced the `sqlite3 .backup`-then-download pipeline with direct SSH-streamed query execution (issue [#74](https://github.com/awizemann/scarf/issues/74)). Eliminates the multi-minute snapshot wait on multi-GB state.db files. Companion fix: pre-expand `~/` in Swift via `resolvedUserHome` so sqlite3 finds the DB without depending on the remote shell's tilde expansion.
- **Aux nested-YAML parser** — corrected the parser so the unknown-task surface works on remote (was previously dropping aux blocks whose `provider:` value lived on a separate line).
- **`ModelPreflight` newline trim bug** — `.whitespaces` doesn't strip newlines; switched both trims to `.whitespacesAndNewlines` so a stray `\n` in a hand-edited config.yaml doesn't false-positive the mismatch banner.
---
### What's measured today
321 ScarfCore tests pass (302 prior + 19 new ModelPreflight). New ScarfMon events documented in the [Performance-Monitoring wiki](https://github.com/awizemann/scarf/wiki/Performance-Monitoring).
### Compatibility
- macOS 14+ (unchanged).
- Hermes target: still **v2026.4.30 (v0.12.0)**. No new Hermes capability gates added.
- Existing `dashboard.json` files render unchanged.
- Existing `.scarftemplate` bundles install unchanged. Catalog manifest schemaVersion stays at 1/2/3 — no bump.
- Existing `~/.hermes/.env` content is preserved byte-identically — Scarf only writes inside its `# scarf-secrets:begin <slug>` / `# scarf-secrets:end <slug>` regions.
- The skeleton-then-hydrate chat loader and SSH cancellation propagation are **Mac-only** in this release; ScarfGo (iOS) keeps its existing chat path.
### What's deferred
- **Per-widget data sources + per-widget refresh granularity.** The general "widget points at a typed data source" abstraction is the next-largest win in dashboards but materially expands the model + JS mirror + validator surface. The project-wide watch covers the common cron-driven workflow without it.
- **Cross-project health digest sidebar rollup.** Counting attention-needed projects across the registry — scoped but didn't pull its weight. The typed status enum makes it cheap to add later.
- **Automatic cron-prompt rewriter on upgrade.** Heuristic rewrites of free-form prompts are risky; the docs + agent-assisted path ships in v2.7. Revisit a "scan + fix" UI in v2.8 if real users miss the migration.
- **iOS New Project wizard + iOS Keychain-env mirror.** ScarfGo's project surface is read-only; the wizard's chat-handoff pattern depends on Mac-only ACP plumbing.
- **iOS skeleton-then-hydrate loaders.** Same data-service surfaces are public, but the iOS chat lifecycle is structured differently. Defer until iOS dogfooding shows the same payload-size pain.
- **Tier C redesigns (Memory/Skills/Cron/Curator).** Instrumented in v2.7; redesign waits for capture data showing which path actually needs the skeleton-then-hydrate treatment.
+34
@@ -0,0 +1,34 @@
## What's in 2.7.1
A patch release covering three bug reports filed against 2.7.0, plus follow-up cleanups in the same neighborhood. No data migrations, no UI surface changes — drop-in replacement for 2.7.0 on Mac.
### Bug fixes
#### Mac
- **[#77](https://github.com/awizemann/scarf/issues/77) — Sessions screen renders empty even when Dashboard reports sessions exist.** v2.7.0 folded the Sessions tab's two SQL queries (sessions list + previews) into a single batched SSH round-trip for perf. The combined wire payload for any user with roughly 150 or more sessions crossed macOS's 16-64 KB pipe-buffer threshold; without a concurrent reader draining the pipe, the remote `sqlite3 -json` blocked, the script never finished, our 30-second timeout fired, and the call returned an empty result. `SSHScriptRunner` now drains stdout/stderr concurrently with the running process via `FileHandle.readabilityHandler`, so the kernel pipe never fills. Same fix applied to the local-execution path. New regression test pushes 256 KB of synthetic output through the runner and asserts full delivery, a payload that would have wedged the runner pre-fix.
- **[#78](https://github.com/awizemann/scarf/issues/78) — Skills "What's New" pill contradicts the Updates sub-tab.** The pill at the top of the Skills page was rendering on every sub-tab, including Updates. It counts **local** file deltas since the user last clicked "Mark as seen" (e.g. "18 new" = 18 skills landed on disk that you haven't acknowledged), while the Updates body runs `hermes skills check` to find skills with newer **upstream** versions available — a different concept. Two surfaces using the word "update" for two different things made the screen contradict itself. Two changes: the pill now renders only on the Installed sub-tab (Mac and ScarfGo), and its label says "X **changed** since you last looked" instead of "X updated" so the local-file vocabulary doesn't collide with upstream-update vocabulary anywhere on the page.
- **[#79](https://github.com/awizemann/scarf/issues/79) — Skills hub search returns nothing for terms visible in Browse.** With the source picker on "All Sources", `hermes skills search <query>` (no `--source` flag) routes through Hermes's centralized index and skips external API sources (skills-sh, github, clawhub, lobehub, well-known) — but Browse still aggregates from those sources, so a skill like `honcho` would show up in Browse and disappear in search. Same picker, same query, contradictory results. Rather than chase Hermes's index gaps, "All Sources" search now means "filter what you can already see": Scarf caches the most recent Browse payload and runs a client-side substring filter (case-insensitive against name, description, and identifier) against it, instantly. Source-specific searches still shell out to `hermes skills search --source <s>` for full upstream search semantics. Five new tests cover the filter behavior.
- **`hermesPIDResult()` — narrow the Hermes "is it running?" probe to the gateway.** Previously `pgrep -f hermes`, which matched any process with "hermes" in its argv: chat sessions Scarf itself spawns, `hermes -z` one-shots, log tails, even the README in an editor. The Dashboard "Hermes is running" badge could read true even when the gateway daemon was down. Tightened to a regex that matches only the gateway shape — `python -m hermes_cli.main gateway run …` and `/path/to/hermes gateway run …`. All callers (DashboardViewModel, HealthViewModel, SettingsViewModel, scarfApp, stopHermes) want the gateway PID specifically. Cherry-picked from [#76](https://github.com/awizemann/scarf/pull/76) — thanks to [@unixwzrd](https://github.com/unixwzrd) for the diagnosis and regex.
- **`HealthViewModel.stopDashboard()` — stop the dashboard by port, not `pkill -f`.** External-instance fallback used to be `pkill -f "hermes dashboard"`, broad enough to match shell history, log tails, README readers — anything with the substring in its argv. Now `lsof -tiTCP:<port> -sTCP:LISTEN` resolves the PID actually bound to the dashboard port and only that one process gets `SIGTERM`. Trusting the port is correct here: Scarf owns the configured port and the user-visible intent is "stop the thing on this port." Direction cherry-picked from [#76](https://github.com/awizemann/scarf/pull/76); the `-c hermes` filter from the original was dropped because Hermes installs as a Python shebang script and the kernel COMM is `python`, not `hermes`, so `-c hermes` would silently miss every standard install.
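For the #77 fix in the first bullet, concurrent draining with `FileHandle.readabilityHandler` can be sketched as below. `OutputCollector` and `runDraining` are hypothetical names, only stdout is captured, and the 5-second wait for end-of-file is an arbitrary safety bound for the sketch.

```swift
import Foundation

/// Thread-safe byte accumulator fed from the readability handler.
final class OutputCollector: @unchecked Sendable {
    private let lock = NSLock()
    private var buffer = Data()

    func append(_ chunk: Data) {
        lock.lock(); buffer.append(chunk); lock.unlock()
    }

    var data: Data {
        lock.lock(); defer { lock.unlock() }
        return buffer
    }
}

/// Runs a process while draining stdout concurrently, so the child never
/// blocks on a full kernel pipe buffer.
func runDraining(_ executable: URL, arguments: [String]) throws -> (status: Int32, stdout: Data) {
    let process = Process()
    process.executableURL = executable
    process.arguments = arguments

    let pipe = Pipe()
    process.standardOutput = pipe

    let collector = OutputCollector()
    let reachedEOF = DispatchSemaphore(value: 0)
    pipe.fileHandleForReading.readabilityHandler = { handle in
        let chunk = handle.availableData
        if chunk.isEmpty {
            // Empty read means end-of-file: stop observing and release the waiter.
            handle.readabilityHandler = nil
            reachedEOF.signal()
        } else {
            collector.append(chunk)
        }
    }

    try process.run()
    process.waitUntilExit()
    // Bounded wait so a missed EOF callback cannot hang the caller.
    _ = reachedEOF.wait(timeout: .now() + 5)
    return (process.terminationStatus, collector.data)
}
```

Reading only after `waitUntilExit()` is the pattern that deadlocks on large output, because the child cannot exit until someone empties the pipe.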
### Documentation + tooling
- **`scripts/local-build.sh` + `BUILDING.md` for contributor builds.** New unsigned single-arch Debug build script for contributors without an Apple Developer account. Detects arm64 / x86_64, verifies xcode-select / xcrun / xcodebuild, probes the Metal toolchain (offers an interactive install on TTY, errors cleanly on CI), resolves Swift packages, builds Debug with signing disabled. Optional one-touch `ditto` to `/Applications/scarf.app` on explicit y/N. The canonical Release universal CLI in `README.md` is unchanged — `local-build.sh` is an alternative for contributors, not a replacement for the shipping build. Cherry-picked from [#76](https://github.com/awizemann/scarf/pull/76).
- **`BUILDING.md` + `CONTRIBUTING.md` — restored Sonoma compatibility messaging.** The runtime min is **macOS 14.6 (Sonoma)** — that's the `MACOSX_DEPLOYMENT_TARGET` on the main `scarf` target and is intentional. Build min is **Xcode 16.0** (needed for Swift 6 strict-concurrency features). The legacy CONTRIBUTING.md line had drifted to "Xcode 26.3+ / macOS 26.2+", which would have steered Sonoma contributors and users away from a build that actually runs on their box. Corrected, with a load-bearing-callout in BUILDING.md so future doc edits don't silently raise the floor again.
### Migrating from 2.7.0
Sparkle will offer the update automatically. No config migration, no schema changes. Existing sessions, skills, and projects are untouched.
If you've been working around #77 by collapsing the sidebar or restarting Scarf to repopulate the Sessions list, you can stop — sessions should load reliably now.
### Acknowledgements
- [@bricelb](https://github.com/bricelb) for the three v2.7.0 bug reports ([#77](https://github.com/awizemann/scarf/issues/77), [#78](https://github.com/awizemann/scarf/issues/78), [#79](https://github.com/awizemann/scarf/issues/79)) — well-instrumented reproductions including screenshots and environment details made the diagnosis straightforward.
- [@unixwzrd](https://github.com/unixwzrd) for [#76](https://github.com/awizemann/scarf/pull/76) — the gateway-pgrep tighten, the `pkill -f "hermes dashboard"` direction, and the `local-build.sh` contributor flow are all cherry-picked from that PR.
+83
@@ -0,0 +1,83 @@
## What's in 2.7.5
A feature release that lifts Scarf's Kanban surface from a read-only list (the v2.6 placeholder shipped while upstream Kanban was still mid-rework) to a full drag-and-drop board with the complete Hermes v0.12 mutation surface wired up — plus per-project boards bound to a Scarf-minted tenant slug, and a read-only board on iOS for at-a-glance status from your phone. No data migrations, no schema changes; pre-v0.12 hosts gracefully hide the surface.
### New features
#### Mac
- **Drag-and-drop Kanban board** ([scarf/Features/Kanban/Views/KanbanBoardView.swift](scarf/scarf/Features/Kanban/Views/KanbanBoardView.swift)). Five visible columns, Triage / Up Next (`todo` + `ready`) / Running / Blocked / Done, collapse Hermes's seven status values into a layout that doesn't waste space on `ready`, which the dispatcher only ever holds for a few seconds. Triage hides itself when empty; archived hides behind a header toggle. Drop a card onto a column and Scarf maps the gesture to the right Hermes verbs through a pure transition planner: drop-on-Running fires `kanban dispatch` (the dispatcher then spawns a worker), drop-on-Blocked opens a sheet asking for a reason and calls `kanban block`, drop-on-Done opens a result sheet and calls `kanban complete`, blocked → running chains `unblock` + `dispatch`. Forbidden transitions (anything dragged out of Done; anything dragged out of Triage) reject with a red drop-target stroke and a tooltip explaining why: Done is terminal, Triage is promoted by a specifier worker, and neither has a CLI verb that maps cleanly. Optimistic local updates apply on drop and revert on CLI failure with a toast, so the UI feels instant.
- **Side-pane inspector** ([KanbanInspectorPane.swift](scarf/scarf/Features/Kanban/Views/KanbanInspectorPane.swift)). Click a card and a 420 px pane slides in from the trailing edge. Not a modal sheet — modal would block triaging the next card after closing. Header carries the status, an inline assignee menu (more on that below), workspace kind, and tenant; below that, four tabs render `hermes kanban show <id>` data: **Comments** (with an inline composer that calls `kanban comment`), **Events** (the `task_events` log with per-kind glyphs), **Runs** (one row per attempt with outcome badge + summary + error), and **Log** — the worker's captured stdout/stderr from `hermes kanban log <id>`, polled every 2 s while the task is running with a "● streaming" indicator and auto-scroll to the latest line, snapshot-only with a refresh button when the task is in a terminal state. The action bar at the bottom has all the per-status verbs — Start (which is `claim` rebranded as a user-visible action), Complete, Block, Unblock, Archive — every one with a help tooltip explaining what it does and what Hermes verb it invokes. The "Archive" tooltip explicitly notes Hermes has no hard-delete: archived tasks remain in `~/.hermes/kanban.db` and are recoverable via the "Show archived" toggle until `hermes kanban gc` runs.
- **Inspector auto-refresh.** While the inspector is open, the detail (header, action buttons, comments, events, runs) re-fetches every 5 s on the same cadence as the board itself, so a worker transition (e.g. running → done elsewhere) is reflected without the user having to close + reopen. The Log tab's 2 s poll runs separately and self-cancels the moment the task transitions out of `running`.
- **Inline assignee picker on the inspector header.** The assignee badge is a clickable menu — set means a `.brand` (rust) chip, unassigned means a `.warning` (yellow) chip so the eye catches it instantly. Tapping opens a menu of every known profile (union of `~/.hermes/profiles/`, current task assignees, and the active local profile from `HermesProfileResolver`) plus an "Unassigned" option. Selection routes through `kanban assign` and immediately follows with `kanban dispatch` so the task gets picked up promptly. Solves the "I assigned a profile but nothing happened" gap end-to-end without the user touching a terminal.
- **Health banner in the inspector.** Surfaces two conditions that previously left users staring at a stuck task with no explanation. **Yellow** when the task is unassigned in `ready` / `todo`: *"Won't run automatically — Hermes's dispatcher silently skips tasks with no assignee."* The dispatcher's own `--json` output literally lists these under `skipped_unassigned`; we now surface that to the human. **Red** when the most-recently-completed run ended in a non-success outcome (`stale_lock` / `crashed` / `gave_up` / `timed_out` / `spawn_failed` / `reclaimed` / `failed`): banner displays the outcome label + the raw `error` field from the run record, so you don't have to dig into the Runs tab to discover it. The red banner is suppressed while a fresh attempt is running — once status flips back to `running`, the previous outcome is stale signal and the Log tab's live stream is the right thing to look at.
- **Card-level signals.** Cards in `running` get a 2 px `ScarfColor.info` left edge + a subtle title shimmer so live work is obvious at a glance. Blocked cards get a 2 px `ScarfColor.warning` left edge + a ⚠ glyph next to the title. Done cards dim to 0.7 opacity in light mode, 0.55 in dark, with a green ✓ in the title row. Cards in `ready` / `todo` with no assignee get a yellow ⚠ glyph in the title row with a tooltip explaining the dispatcher won't pick them up — same signal as the inspector banner, just at the board level so triage is one keypress away.
- **`Board | List` toggle at the top of the route.** The v2.6 read-only list view is preserved in `KanbanListView.swift` and surfaced via a segmented picker, so users on narrow windows or anyone who prefers a flat sortable list can opt in. Choice persists across launches via `@AppStorage`.
- **New Task sheet** ([KanbanCreateSheet.swift](scarf/scarf/Features/Kanban/Views/KanbanCreateSheet.swift)). Title, body (markdown supported), assignee (defaults to `HermesProfileResolver.activeProfileName()` so newly-created tasks actually run), workspace kind (segmented `Scratch / Worktree / Project Dir`; locked to Project Dir on per-project boards), priority slider, comma-separated skills with autocomplete from `~/.hermes/skills/`, optional tenant (hidden on per-project boards, where the slug is implicit), and a "Send to triage" toggle. Submit fires `kanban create --json` and immediately follows with `kanban dispatch` so an assigned task transitions `ready` → `running` within seconds rather than waiting for the gateway dispatcher's internal cycle.
- **Kanban moved from Manage → Monitor in the sidebar.** It's runtime work-in-progress, not configuration. Sits between Activity and the rest of Manage so users see "what's happening right now" at a glance.
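The drop-to-verb mapping described for the board can be sketched as a pure function. This is a minimal illustration, not the shipped planner: `Column`, `Verb`, `ForbiddenTransition`, and the free function `plan(from:to:)` are hypothetical stand-ins, and the sketch assumes that any move the notes do not describe is rejected.

```swift
// Hypothetical stand-ins for the real ScarfCore types.
enum Column { case triage, upNext, running, blocked, done }
enum Verb { case dispatch, block, complete, unblock }

struct ForbiddenTransition: Error {
    let reason: String
}

/// Maps a (from, to) column pair to the CLI verbs to run, in order.
func plan(from: Column, to: Column) throws -> [Verb] {
    if from == to { return [] }  // dropped back on its own column: nothing to do
    switch (from, to) {
    case (.triage, _):
        throw ForbiddenTransition(reason: "Triage is promoted by a specifier worker")
    case (.done, _):
        throw ForbiddenTransition(reason: "Done is terminal")
    case (.blocked, .running):
        return [.unblock, .dispatch]
    case (_, .running):
        return [.dispatch]
    case (_, .blocked):
        return [.block]
    case (_, .done):
        return [.complete]
    default:
        // Assumption: moves the release notes do not describe are rejected.
        throw ForbiddenTransition(reason: "No CLI verb maps to this move")
    }
}
```

Because the function has no side effects, the view layer can call it before touching the CLI and show the thrown `reason` in the drop-target tooltip.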
#### Per-project Kanban
- **`DashboardTab.kanban` on every project**, capability-gated on `HermesCapabilities.hasKanban`. Renders a project-scoped `KanbanBoardView` filtered to the project's tenant slug. Workspace defaults in the New Task sheet are pre-pinned to `dir:<project.path>`. Empty state explains the project doesn't have any tasks yet and offers a "New Task" CTA — the empty board IS the discovery surface.
- **Tenant minting via [KanbanTenantResolver](scarf/scarf/Core/Services/KanbanTenantResolver.swift).** Each Scarf project gets a stable `scarf:<slug>` tenant minted on first kanban interaction and persisted to `<project>/.scarf/manifest.json` (new optional `kanbanTenant` field on `ProjectTemplateManifest`). Slug rules: lowercased, hyphenated, ≤ 48 chars, `scarf:` prefix to avoid collision with hand-typed tenants. Once minted, the tenant is **immutable across rename** — tasks already on the board carry the original slug, so renaming the project doesn't orphan them. Bare projects (no manifest) get a sentinel manifest written with `id: scarf/<project-id>` + `version: 0.0.0` + just the `kanbanTenant` set; the `ProjectAgentContextService` reader recognizes the sentinel and refuses to surface it as a "Template" line in the AGENTS.md block, so the project doesn't suddenly start advertising a fake template to the agent.
- **Agent-side tenant injection.** [ProjectAgentContextService.renderBlock](scarf/scarf/Core/Services/ProjectAgentContextService.swift) emits a "Kanban tenant" line inside the `<!-- scarf-project -->` markers in `<project>/AGENTS.md` whenever a tenant exists, instructing the agent to pass `--tenant scarf:<slug>` on `hermes kanban create`. `ChatViewModel.startACPSession` already calls `refresh(for:)` before opening every project chat, so the agent reads a fresh tenant on every session start with no extra wiring. Agents are imperfect at flag discipline; a forgotten `--tenant` lands the task in the global "Untagged" group rather than failing — acceptable v2.7.5 behavior.
- **`kanban_summary` dashboard widget** ([KanbanSummaryWidgetView.swift](scarf/scarf/Features/Projects/Views/Widgets/KanbanSummaryWidgetView.swift)). New widget kind for project dashboards: shows the top three `running` / `blocked` / `todo` tasks for the project's tenant by priority, plus a glance footer (`"12 todo · 3 running · 5 blocked"`) sourced from `kanban stats`. Polls every 10 s while the dashboard is foregrounded. Widget vocabulary registered in [tools/widget-schema.json](tools/widget-schema.json) and rendered on the catalog site via [site/widgets.js](site/widgets.js); template authors can drop a `{ kind: kanban_summary, max_rows: 3 }` block into `dashboard.json`.
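The slug rules above (lowercased, hyphenated, at most 48 characters, `scarf:` prefix, immutable once minted) might look roughly like the following. This is a sketch under assumptions: `mintTenant` and `resolveTenant` are hypothetical names, the 48-character limit is assumed to apply to the slug without the prefix, and the `"project"` fallback for an empty slug is invented for the example.

```swift
/// Hypothetical helper: derives a `scarf:<slug>` tenant from a project name.
func mintTenant(fromProjectName name: String, maxSlugLength: Int = 48) -> String {
    var slug = ""
    var lastWasHyphen = true  // suppresses a leading hyphen
    for ch in name.lowercased() {
        if ch.isASCII && (ch.isLetter || ch.isNumber) {
            slug.append(ch)
            lastWasHyphen = false
        } else if !lastWasHyphen {
            // Collapse every run of other characters into a single hyphen.
            slug.append("-")
            lastWasHyphen = true
        }
    }
    slug = String(slug.prefix(maxSlugLength))
    while slug.hasSuffix("-") { slug.removeLast() }
    // Assumption: fall back to a fixed slug when nothing usable remains.
    return "scarf:" + (slug.isEmpty ? "project" : slug)
}

/// Reuses a previously persisted tenant so a rename never changes the slug.
func resolveTenant(existing: String?, projectName: String) -> String {
    return existing ?? mintTenant(fromProjectName: projectName)
}
```

Passing the persisted value through `resolveTenant` is what keeps the tenant stable across renames: the project name is only consulted when no tenant has been minted yet.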
#### iOS / iPadOS
- **Read-only Kanban tab on `ProjectDetailView`** ([Scarf iOS/Kanban/ScarfGoKanbanView.swift](scarf/Scarf%20iOS/Kanban/ScarfGoKanbanView.swift)). Same five-column collapse rendered as a horizontally-paged segmented `Picker` of single-column lists — HIG-friendly on iPhone where a 5-column grid forces unreadable card widths. Pulls live status, assignee, workspace, skills, priority chips. Tap a card → modal `NavigationStack` detail sheet ([ScarfGoKanbanDetailSheet.swift](scarf/Scarf%20iOS/Kanban/ScarfGoKanbanDetailSheet.swift)) with the same Comments / Events / Runs tabs the Mac inspector has. Read-only in v2.7.5 — mutations + drag-drop on iPad land in v2.8 once the Mac flow is fully shaken out. Card titles use semantic `.headline` (not `ScarfFont`) so Dynamic Type works; chrome (badges) stays on `ScarfBadge` for fixed visual weight per the project's iOS conventions.
#### ScarfCore
- **`KanbanService` actor** ([Packages/ScarfCore/Sources/ScarfCore/Services/KanbanService.swift](scarf/Packages/ScarfCore/Sources/ScarfCore/Services/KanbanService.swift)) — pure-I/O Sendable actor wrapping every Hermes v0.12 verb (`list / show / runs / stats / assignees / create / assign / claim / comment / complete / block / unblock / archive / dispatch / link / unlink / log`). Dispatches each CLI invocation through `Task.detached(priority: .utility)` matching the existing concurrency conventions. Errors land in [KanbanError](scarf/Packages/ScarfCore/Sources/ScarfCore/Models/KanbanError.swift) and surface as inline banners (not modal alerts) since the board is high-frequency. The "no matching tasks" stdout sentinel is normalized to `[]` rather than thrown.
- **Pure transition planner.** `KanbanService.plan(for: KanbanTransition)` is a synchronous function that maps a `(from, to)` column pair to the right verb sequence — `(.upNext, .running) → [.dispatch]`, `(.blocked, .running) → [.unblock, .dispatch]`, etc. Disallowed transitions throw `KanbanError.forbiddenTransition` with a user-actionable reason. The planner is fully tested in `KanbanModelsTests.swift`. Critically: `dispatch` (not `claim`) is the verb used for Up-Next → Running. Hermes's `claim` is documented as "manual alternative to the dispatcher" and assumes the caller spawns the worker themselves — Scarf doesn't, so calling `claim` from drag-drop reserved tasks but never spawned work, and the dispatcher reclaimed them ~15 minutes later (`stale_lock`). `dispatch` is the right primitive for a GUI client.
- **Cross-platform [KanbanTenantReader](scarf/Packages/ScarfCore/Sources/ScarfCore/Services/KanbanTenantReader.swift).** Read-only projection over `<project>/.scarf/manifest.json`'s `kanbanTenant` field. The full `ProjectTemplateManifest` type lives in the Mac target; this lightweight reader gives iOS a way to filter the per-project board by tenant without linking the full manifest model.
- **Timestamp decoding tolerates both shapes.** Hermes emits `created_at` / `started_at` / `completed_at` / `last_heartbeat_at` etc. as Unix integer seconds (its SQLite columns are INTEGER), but earlier wire docs implied ISO-8601 strings. The decoder now accepts either an integer or a string and normalizes to ISO-8601 so downstream code only handles one type. Locked in by `decodeUnixIntegerTimestamps` in `KanbanModelsTests`.
- **`KanbanBoardViewModel` optimistic merge.** Holds `optimisticOverrides: [taskId: status]` for in-flight drags; the polled response merges with optimistic state until the server confirms the new status, so a stale poll arriving milliseconds after a drop can't snap the card back to its old column. On CLI failure the override is removed and the message lands in the inline banner.
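The tolerant timestamp decoding described above can be illustrated with a small `Decodable` wrapper. This is a sketch rather than the actual ScarfCore decoder: `FlexibleTimestamp` and `TaskStamp` are hypothetical types, and string payloads are assumed to already be ISO-8601 and are passed through unchanged.

```swift
import Foundation

/// Hypothetical wrapper: accepts Unix integer seconds or a string and
/// always exposes an ISO-8601 string to downstream code.
struct FlexibleTimestamp: Decodable, Equatable {
    let iso8601: String

    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        if let seconds = try? container.decode(Int.self) {
            // Integer shape: convert Unix seconds to an ISO-8601 string (UTC).
            let date = Date(timeIntervalSince1970: TimeInterval(seconds))
            iso8601 = ISO8601DateFormatter().string(from: date)
        } else {
            // String shape: assumed to be ISO-8601 already.
            iso8601 = try container.decode(String.self)
        }
    }
}

/// Hypothetical task fragment using the wire field name from the notes.
struct TaskStamp: Decodable {
    let created_at: FlexibleTimestamp
}

let fromInteger = try! JSONDecoder().decode(
    TaskStamp.self, from: Data(#"{"created_at": 0}"#.utf8))
let fromString = try! JSONDecoder().decode(
    TaskStamp.self, from: Data(#"{"created_at": "2025-01-02T03:04:05Z"}"#.utf8))
```

Both payload shapes end up as the same Swift type, so views and sort comparators only ever handle one representation.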
### Dispatch + assignee fixes
A diagnostic round driving real tasks end-to-end exposed a connected bug pattern that the polish pass closed:
- **Hermes's dispatcher silently skips unassigned tasks** — its `kanban dispatch --json` output literally lists them under a `skipped_unassigned` key and moves on. Tasks created without an assignee sat in `ready` indefinitely and the user had no signal anything was wrong. The New Task sheet now defaults to the active Hermes profile, the inspector header shows a yellow "Unassigned" chip + warning banner, every `ready` / `todo` card without an assignee gets a ⚠ glyph + tooltip, and the inspector's inline assignee picker fixes it in one click.
- **Drag-to-Running used to call `claim`**, which is a manual alternative to the dispatcher. Status flipped to `running`, but no worker spawned (Scarf doesn't host workers), and 15 minutes later the dispatcher reclaimed the task with a `stale_lock` outcome. Replaced with `dispatch` end-to-end so the gateway-running dispatcher actually does the spawning.
- **`hermes kanban assignees` empty-state was leaking into the picker.** The CLI prints a literal sentinel `(no assignees — create a profile with hermes -p <name> setup)` when the table is empty; the parser was tokenizing it on whitespace and offering `(no` as a profile in the menu. Parser now skips the sentinel, validates each candidate against `^[a-zA-Z0-9_-]+$`, and falls back cleanly to the active local profile when the table is empty.
- **`spawn_failed` from "executable not found on PATH"** — most subtle of the lot. macOS GUI apps inherit a launch-services PATH (`/usr/bin:/bin:/usr/sbin:/sbin`) that doesn't include `~/.local/bin` (where pipx installs `hermes`) or `/opt/homebrew/bin`. Scarf was finding `hermes` for its own invocation via the absolute-path resolver in `HermesPathSet.hermesBinaryCandidates`, but when the dispatcher then spawned a worker process, that worker inherited Scarf's GUI PATH and couldn't find `hermes` by name — recording an `outcome=spawn_failed` run with the exact "executable not found on PATH" message. `LocalTransport` now grows an `environmentEnricher` static (mirroring `SSHTransport.environmentEnricher`) wired by `scarfApp.swift` to the same `HermesFileService.enrichedEnvironment()` login-shell probe the SSH transport uses. Every local subprocess Scarf spawns now sees the user's full PATH and credential env, so a spawned-from-Scarf hermes can spawn its children by name without reaching for absolute paths. Defense-in-depth: `subprocessEnvironment(forExecutable:)` also unconditionally prepends the executable's parent directory to PATH, so the fix works even if the enricher hasn't been wired (early startup, tests).
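The assignee-parser fix can be sketched as follows. The function name `parseAssignees` is hypothetical, and the sketch assumes the CLI prints profile names separated by whitespace; only the sentinel prefix, the validation pattern, and the fallback to the active profile come from the notes above.

```swift
import Foundation

/// Hypothetical parser for `hermes kanban assignees` output.
func parseAssignees(stdout: String, activeProfile: String) -> [String] {
    let trimmed = stdout.trimmingCharacters(in: .whitespacesAndNewlines)
    // The empty-table sentinel starts with "(no assignees"; never tokenize it.
    if trimmed.hasPrefix("(no assignees") {
        return [activeProfile]
    }
    let names = trimmed
        .split(whereSeparator: { $0.isWhitespace })
        .map { String($0) }
        // Keep only tokens that look like profile names.
        .filter { $0.range(of: #"^[a-zA-Z0-9_-]+$"#, options: .regularExpression) != nil }
    return names.isEmpty ? [activeProfile] : names
}
```

The pattern check is the second line of defense: even if the sentinel wording changes, fragments such as `(no` fail validation and never reach the picker.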
### Migrating from 2.7.1
Sparkle will offer the update automatically. No config migration, no schema changes — `~/.hermes/kanban.db` is shared across all Hermes clients and Scarf only reads/writes through the documented CLI surface. Existing Scarf projects pick up the new project Kanban tab on first open; the tenant slug is minted lazily on first kanban interaction inside the project, so projects with no kanban activity stay byte-identical until the user opens the tab.
If you have an existing project with a Scarf-managed `manifest.json`, the new optional `kanbanTenant` field is added on next mint and lives alongside any template-author config schema without touching it. Templates do not ship `kanbanTenant` (it's user-machine-scoped state); the export pipeline strips it.
If you've been running tasks via the v2.6 read-only list and your Hermes host already runs the gateway dispatcher, your existing kanban tasks should appear on the board automatically — there's no migration step. Tasks created without an assignee in v2.6 will now show the yellow "Unassigned" warning until you fix them through the inline picker.
### Known limitations
- **Within-column reorder is not supported.** Hermes has no `update` verb and no `position` column on the tasks table — `priority` is write-once at create time. Sort order inside each column is `priority DESC, created_at DESC`, matching the dispatcher's actual run order. We considered a client-side ordering sidecar; rejected because the on-screen order would diverge from what runs next, which is worse than no manual order. Will revisit if Hermes ships an `update --priority` verb.
- **No live `watch` streaming yet.** The board polls every 5 s; the inspector polls detail on the same cadence and the Log tab on a 2 s cadence while running. `hermes kanban watch --json` event streaming + reconnect-with-backoff lands in v2.8 along with iOS write surfaces.
- **No bulk re-tag for legacy NULL-tenant tasks.** Tasks created before this release (assignee or no assignee) appear in the global "Untagged" group on the global board. Hermes has no `tenant` mutation verb post-create, so retagging would be archive + recreate — too destructive to ship in this release.
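For reference, the in-column order stated above (`priority DESC, created_at DESC`) corresponds to a two-key comparator. The `Card` type here is a hypothetical stand-in, with `createdAt` assumed to be Unix seconds.

```swift
/// Hypothetical card model with only the fields the sort needs.
struct Card {
    let id: String
    let priority: Int
    let createdAt: Int  // assumed Unix seconds
}

/// Orders cards by priority descending, then creation time descending.
func columnOrder(_ cards: [Card]) -> [Card] {
    return cards.sorted { a, b in
        if a.priority != b.priority { return a.priority > b.priority }
        return a.createdAt > b.createdAt
    }
}

let ordered = columnOrder([
    Card(id: "old-low", priority: 1, createdAt: 100),
    Card(id: "new-high", priority: 5, createdAt: 300),
    Card(id: "old-high", priority: 5, createdAt: 200),
])
```

Keeping the client-side sort identical to the dispatcher's order is what makes the top card in Up Next the one that actually runs next.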
### Acknowledgements
- Driven end-to-end against a fresh local Hermes v0.12.0 install with the gateway dispatcher running. Real bug surface mostly came from doing instead of speculating: the `claim` vs `dispatch` distinction, the silent `skipped_unassigned` behavior, the `(no` parse leak, the integer-vs-ISO timestamp shape, and the stale "Last run" banner during a fresh attempt all surfaced from driving real tasks and watching what actually happened.
@@ -47,6 +47,23 @@ public protocol ACPChannel: Sendable {
/// SSH exec channels return the SSH channel id or `nil` when not
/// applicable.
var diagnosticID: String? { get async }
/// Exit status of the underlying transport once it has terminated.
/// `nil` while the channel is still alive, or for transports that
/// don't have a meaningful integer exit code (Citadel SSH-exec).
/// Read by `ACPClient` when populating `processTerminated` so the
/// user-facing error can name the actual exit code (e.g. `exit
/// 255` for SSH connect failures, `exit 127` for missing remote
/// binary).
var lastExitCode: Int32? { get async }
}
public extension ACPChannel {
/// Default: channels that don't track an exit code report `nil`.
/// Concrete `ProcessACPChannel` overrides this.
var lastExitCode: Int32? {
get async { nil }
}
}
/// Errors raised by `ACPChannel` implementations when the underlying
@@ -266,18 +266,59 @@ public actor ACPClient {
// MARK: - Messaging
public func sendPrompt(sessionId: String, text: String) async throws -> ACPPromptResult {
try await sendPrompt(sessionId: sessionId, text: text, images: [])
}
/// v0.12+ overload: forward zero or more image attachments alongside
/// the user's text. Each attachment becomes a separate
/// `ImageContentBlock` in the ACP `prompt` content array, matching
/// the shape Hermes' `acp_adapter/server.py` expects (text first,
/// then image blocks). Hermes routes the resulting payload to a
/// vision-capable model automatically; the producer side only has
/// to deliver the bytes.
///
/// Pre-v0.12 Hermes installs accepted only a single `text` block.
/// Callers gate this overload on
/// `HermesCapabilitiesStore.capabilities.hasACPImagePrompts` so we
/// don't send blocks an older agent would silently drop.
public func sendPrompt(
sessionId: String,
text: String,
images: [ChatImageAttachment]
) async throws -> ACPPromptResult {
statusMessage = "Sending prompt..."
let messageId = UUID().uuidString
// Always include the text block, even when empty; this keeps the
// server-side text-extraction path stable regardless of whether
// the user sent text alongside the image(s).
var promptBlocks: [[String: Any]] = [
["type": "text", "text": text] as [String: Any],
]
for image in images {
promptBlocks.append([
"type": "image",
"data": image.base64Data,
"mimeType": image.mimeType,
] as [String: Any])
}
let params: [String: AnyCodable] = [
"sessionId": AnyCodable(sessionId),
"messageId": AnyCodable(messageId),
- "prompt": AnyCodable([
- ["type": "text", "text": text] as [String: Any],
- ] as [Any]),
+ "prompt": AnyCodable(promptBlocks as [Any]),
]
let result = try await sendRequest(method: "session/prompt", params: params)
let dict = result?.dictValue ?? [:]
let usage = dict["usage"] as? [String: Any] ?? [:]
// TODO(WS-8-Q1): Confirm wire field name once v0.13 Hermes is
// available. We tolerate camelCase + snake_case to match the rest
// of the ACP payload's mixed conventions; if Hermes routes the
// count through a `session/update` notification instead, this
// decode is a no-op and the ACPEvent path takes over.
let compression = (usage["compressionCount"] as? Int)
?? (usage["compression_count"] as? Int)
?? 0
statusMessage = "Ready"
return ACPPromptResult(
@@ -285,7 +326,8 @@ public actor ACPClient {
inputTokens: usage["inputTokens"] as? Int ?? 0,
outputTokens: usage["outputTokens"] as? Int ?? 0,
thoughtTokens: usage["thoughtTokens"] as? Int ?? 0,
- cachedReadTokens: usage["cachedReadTokens"] as? Int ?? 0
+ cachedReadTokens: usage["cachedReadTokens"] as? Int ?? 0,
+ compressionCount: compression
)
}
@@ -329,10 +371,17 @@ public actor ACPClient {
#endif
// session/prompt streams events and can run for minutes, so no hard
- // timeout. Control messages get a 30s watchdog.
+ // timeout. Control messages get a 60s watchdog. Older versions
+ // capped at 30s, which the field reported (#61) was tripping
+ // under realistic gateway+ACP concurrency: the gateway holds
+ // state.db locks for Discord sync / skill registration / cron
+ // scheduling, and ACP's `initialize` / `session/new` /
+ // `session/load` stall waiting for the lock. SQLite contention
+ // on a healthy host clears in seconds; 60s gives that headroom
+ // while still surfacing genuinely broken transports promptly.
let timeoutTask: Task<Void, Error>? = if method != "session/prompt" {
Task { [weak self] in
- try await Task.sleep(nanoseconds: 30 * 1_000_000_000)
+ try await Task.sleep(nanoseconds: 60 * 1_000_000_000)
await self?.timeoutRequest(id: requestId, method: method)
}
} else {
@@ -468,35 +517,48 @@ public actor ACPClient {
// MARK: - Disconnect Cleanup
/// Single idempotent cleanup path for all disconnect scenarios.
- private func performDisconnectCleanup(reason: String) {
+ /// Captures the channel's exit code + recent stderr BEFORE we drop
+ /// the reference, so the `processTerminated` error rides with
+ /// diagnostics: the user banner shows "exit 255 ssh: connect to
+ /// host : Connection refused" instead of a bare opaque timeout.
+ private func performDisconnectCleanup(reason: String) async {
guard isConnected else { return }
#if canImport(os)
logger.warning("ACP disconnecting: \(reason)")
#endif
let exitCode = await channel?.lastExitCode
let tail = recentStderr
isConnected = false
statusMessage = "Connection lost"
for (_, continuation) in pendingRequests {
- continuation.resume(throwing: ACPClientError.processTerminated)
+ continuation.resume(throwing: ACPClientError.processTerminated(
+ exitCode: exitCode,
+ stderrTail: tail
+ ))
}
pendingRequests.removeAll()
eventContinuation?.finish()
eventContinuation = nil
}
- private func handleReadLoopEnded(cleanly: Bool, error: Error? = nil) {
+ private func handleReadLoopEnded(cleanly: Bool, error: Error? = nil) async {
let reason = cleanly ? "read loop ended (EOF)" : "read loop failed: \(error?.localizedDescription ?? "unknown")"
- performDisconnectCleanup(reason: reason)
+ await performDisconnectCleanup(reason: reason)
}
- private func handleWriteFailed() {
- performDisconnectCleanup(reason: "write failed (broken pipe)")
+ private func handleWriteFailed() async {
+ await performDisconnectCleanup(reason: "write failed (broken pipe)")
}
- private func handleWriteFailedForRequest(id: Int) {
+ private func handleWriteFailedForRequest(id: Int) async {
if let continuation = pendingRequests.removeValue(forKey: id) {
- continuation.resume(throwing: ACPClientError.processTerminated)
+ let exitCode = await channel?.lastExitCode
+ continuation.resume(throwing: ACPClientError.processTerminated(
+ exitCode: exitCode,
+ stderrTail: recentStderr
+ ))
}
- performDisconnectCleanup(reason: "write failed (broken pipe)")
+ await performDisconnectCleanup(reason: "write failed (broken pipe)")
}
}
@@ -507,7 +569,7 @@ public enum ACPClientError: Error, LocalizedError {
case encodingFailed
case invalidResponse(String)
case rpcError(code: Int, message: String)
- case processTerminated
+ case processTerminated(exitCode: Int32?, stderrTail: String)
case requestTimeout(method: String)
public var errorDescription: String? {
@@ -516,25 +578,152 @@ public enum ACPClientError: Error, LocalizedError {
case .encodingFailed: return "Failed to encode JSON-RPC request"
case .invalidResponse(let msg): return "Invalid ACP response: \(msg)"
case .rpcError(let code, let msg): return "ACP error \(code): \(msg)"
- case .processTerminated: return "ACP process terminated unexpectedly"
+ case .processTerminated(let exit, let tail):
+ let exitPart = exit.map { "exit \($0)" } ?? "no exit code"
+ let tailPart = Self.firstNonEmptyLine(in: tail).map { ": \($0)" } ?? ""
+ return "ACP process terminated unexpectedly (\(exitPart))\(tailPart)"
case .requestTimeout(let method): return "ACP request '\(method)' timed out"
}
}
/// Pluck the first non-empty stderr line for the user-facing
/// summary. Full tail still rides through on `acpErrorDetails`,
/// but the description itself stays single-line.
private static func firstNonEmptyLine(in s: String) -> String? {
for raw in s.split(separator: "\n") {
let line = raw.trimmingCharacters(in: .whitespaces)
if !line.isEmpty { return line }
}
return nil
}
}
/// Maps a raw error message (RPC message or captured stderr) to a short
/// human-readable hint for the chat UI. Pattern-matches the most common
/// fresh-install failure modes. Returns nil when no known pattern matches.
public enum ACPErrorHint {
- public static func classify(errorMessage: String, stderrTail: String) -> String? {
/// Result of a classifier hit. `hint` is the user-facing copy; when
/// the failure is an OAuth refresh-revocation, `oauthProvider` names
/// the affected provider (lowercase, matching `auth.json` keys) so
/// the UI can offer a one-click re-authenticate affordance. `nil`
/// `oauthProvider` means "we matched a non-OAuth failure mode, or
/// we matched OAuth but couldn't identify which provider."
public struct Classification: Sendable, Equatable {
public let hint: String
public let oauthProvider: String?
public init(hint: String, oauthProvider: String? = nil) {
self.hint = hint
self.oauthProvider = oauthProvider
}
}
/// Known OAuth-authed providers Hermes ships. Listed lowercase to
/// match `auth.json.providers.<key>` and the values
/// `OAuthFlowController.start(provider:)` accepts.
private static let oauthProviders = [
"nous", "claude", "anthropic", "qwen", "gemini", "google", "copilot", "github",
]
+ public static func classify(errorMessage: String, stderrTail: String) -> Classification? {
let haystack = errorMessage + "\n" + stderrTail
// SSH-level failures come first: they apply only to remote
// contexts and the patterns are unambiguous (system ssh prints
// them verbatim to stderr). Without these classifications a
// vanished droplet, a wrong key, or a missing remote `hermes`
// all surface as opaque "ACP process terminated" / "request
// timed out", and the user has no idea where to look.
if haystack.contains("Connection refused") {
return Classification(hint: "Couldn't reach the remote host — the SSH port is closed or the droplet is down. Check the host is running and reachable.")
}
if haystack.localizedCaseInsensitiveContains("Operation timed out")
|| haystack.localizedCaseInsensitiveContains("Connection timed out")
|| haystack.contains("Network is unreachable")
|| haystack.contains("No route to host") {
return Classification(hint: "Couldn't reach the remote host — the network connection timed out. Check the host is running and your network is up.")
}
if haystack.contains("Permission denied (publickey")
|| haystack.contains("Permission denied, please try again") {
return Classification(hint: "SSH rejected the key. Make sure the right identity file is selected and that ssh-agent has the key loaded — open Terminal and run `ssh-add -l`.")
}
if haystack.contains("Host key verification failed")
|| haystack.contains("REMOTE HOST IDENTIFICATION HAS CHANGED") {
return Classification(hint: "The remote host's SSH key changed. If you just rebuilt the droplet, remove the old entry with `ssh-keygen -R <host>`, then try again.")
}
if haystack.contains("Could not resolve hostname")
|| haystack.contains("Name or service not known") {
return Classification(hint: "Couldn't resolve the host name. Check the host in this server's settings.")
}
if haystack.localizedCaseInsensitiveContains("command not found")
|| haystack.contains("hermes: not found")
|| haystack.contains("exit 127") {
return Classification(hint: "The remote shell couldn't find `hermes`. Either install Hermes on the remote (`pipx install hermes-agent`) or set an absolute binary path in this server's settings.")
}
// OAuth refresh-token revocation. Hermes prints
// "Refresh session has been revoked. Run `hermes model` to
// re-authenticate." to stderr/stdout when an OAuth-authed
// provider's refresh token can no longer mint access tokens
// (user revoked, server rotated keys, etc.). We can't drive
// `hermes model` interactively, but `hermes auth add <provider>
// --type oauth` is the same code path Scarf already drives via
// `OAuthFlowController` for first-time setup, so we surface a
// re-authenticate affordance instead. Checked BEFORE the
// generic "no credentials found" path because the message
// contains the word "credentials" via the surrounding context.
if haystack.localizedCaseInsensitiveContains("refresh session has been revoked")
|| haystack.range(of: #"refresh.*revoked"#, options: [.regularExpression, .caseInsensitive]) != nil
|| haystack.localizedCaseInsensitiveContains("re-authenticate")
|| haystack.localizedCaseInsensitiveContains("reauthenticate")
|| (haystack.contains("401") && oauthProvider(in: haystack) != nil)
|| (haystack.localizedCaseInsensitiveContains("unauthorized") && oauthProvider(in: haystack) != nil) {
let provider = oauthProvider(in: haystack)
let suffix = provider.map { " (affected provider: \($0))." } ?? "."
return Classification(
hint: "Your OAuth session has expired or been revoked\(suffix) Click Re-authenticate below to sign in again.",
oauthProvider: provider
)
}
// Auxiliary task references a provider that isn't authenticated.
// Hermes prints `resolve_provider_client: <name> requested but
// <Display Name> not configured` when an aux task (compression,
// summarization, memory_flush, curator, vision, web_extract,
// session_search, skills_hub) has `provider: <name>` set in
// config.yaml but that provider's credentials aren't loaded.
// Common after a user removes one OAuth provider while their
// existing config.yaml still names it for an aux task. The
// chat banner used to surface this as `-32603 Internal error`
// with no actionable detail; surface a clear path now.
if let match = haystack.range(
of: #"resolve_provider_client:\s*([a-zA-Z0-9_-]+)\s+requested\s+but"#,
options: .regularExpression
) {
let line = String(haystack[match])
// Pull the captured provider name out of the matched line.
// First word after "resolve_provider_client:" is the value.
let provider: String = {
let parts = line.split(whereSeparator: { $0.isWhitespace })
if let idx = parts.firstIndex(where: { $0.contains("resolve_provider_client") }),
parts.index(after: idx) < parts.endIndex {
let candidate = parts[parts.index(after: idx)]
return String(candidate)
}
return "an unauthenticated provider"
}()
return Classification(
hint: "An auxiliary task is configured to use `\(provider)` but that provider isn't authenticated. Open Settings → Aux Models, or check `~/.hermes/config.yaml` for `auxiliary.<task>.provider: \(provider)` and switch it to your active provider (or set it to `auto`)."
)
}
if haystack.range(of: #"No\s+(Anthropic|OpenAI|OpenRouter|Gemini|Google|Groq|Mistral|XAI)?\s*credentials\s+found"#,
options: .regularExpression) != nil
|| haystack.contains("ANTHROPIC_API_KEY")
|| haystack.contains("ANTHROPIC_TOKEN")
|| haystack.contains("claude setup-token")
|| haystack.contains("claude /login") {
return "Hermes can't find your AI provider credentials. Set `ANTHROPIC_API_KEY` (or similar) in `~/.hermes/.env` or your shell profile, then restart Scarf."
return Classification(hint: "Hermes can't find your AI provider credentials. Set `ANTHROPIC_API_KEY` (or similar) in `~/.hermes/.env` or your shell profile, then restart Scarf.")
}
if let match = haystack.range(of: #"No such file or directory:\s*'([^']+)'"#,
options: .regularExpression) {
@@ -542,13 +731,47 @@ public enum ACPErrorHint {
if let nameStart = matched.range(of: "'"),
let nameEnd = matched.range(of: "'", range: nameStart.upperBound..<matched.endIndex) {
let name = String(matched[nameStart.upperBound..<nameEnd.lowerBound])
return "Hermes couldn't find `\(name)` on PATH. If you use nvm/asdf/mise, make sure it's exported in `~/.zprofile` (not only `~/.zshrc`), then restart Scarf."
return Classification(hint: "Hermes couldn't find `\(name)` on PATH. If you use nvm/asdf/mise, make sure it's exported in `~/.zprofile` (not only `~/.zshrc`), then restart Scarf.")
}
return "Hermes couldn't find a required binary on PATH. Check that your shell's PATH is exported in `~/.zprofile`, then restart Scarf."
return Classification(hint: "Hermes couldn't find a required binary on PATH. Check that your shell's PATH is exported in `~/.zprofile`, then restart Scarf.")
}
if haystack.localizedCaseInsensitiveContains("rate limit")
|| haystack.localizedCaseInsensitiveContains("429") {
return "Your AI provider returned a rate-limit error. Try again in a moment."
return Classification(hint: "Your AI provider returned a rate-limit error. Try again in a moment.")
}
// Model-availability failure. Hermes pins each session to the
// model that opened it, so resuming an old session whose model
// is no longer available (provider deprecation, OAuth swapped
// to a different provider, model name changed) returns a 404
// / model_not_found from the upstream provider surfaced as
// an opaque "-32603 Internal error" in chat. v2.8 surfaces a
// clear "session is pinned" hint with the recovery path.
if haystack.localizedCaseInsensitiveContains("model_not_found")
|| haystack.localizedCaseInsensitiveContains("model not found")
|| haystack.localizedCaseInsensitiveContains("invalid_model")
|| haystack.localizedCaseInsensitiveContains("model is not available")
|| haystack.localizedCaseInsensitiveContains("unknown model")
|| (haystack.contains("404") && (haystack.localizedCaseInsensitiveContains("model")
|| haystack.localizedCaseInsensitiveContains("messages"))) {
return Classification(hint: "This session was created with a model the provider no longer offers. Hermes pins each session to its original model — start a new chat to use your current model, or run `hermes sessions clone` in Terminal to copy this conversation onto the new model.")
}
return nil
}
/// Best-effort extraction of an OAuth provider name from raw error
/// text. Returns the lowercase provider key (`"nous"`, `"claude"`,
/// etc.) when one of the known OAuth providers appears as a whole
/// word. The first match wins; Hermes typically logs the active
/// provider name once, near the failure.
private static func oauthProvider(in haystack: String) -> String? {
let lowered = haystack.lowercased()
for provider in oauthProviders {
// Whole-word match so substrings like "anthropicapi" don't
// false-trigger on "anthropic".
let pattern = "\\b" + NSRegularExpression.escapedPattern(for: provider) + "\\b"
if lowered.range(of: pattern, options: .regularExpression) != nil {
return provider
}
}
return nil
}
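The whole-word check in `oauthProvider(in:)` can be tried in isolation. The sketch below is a standalone approximation: the `sampleProviders` list and the `firstProvider(in:from:)` name are hypothetical stand-ins, since the real `oauthProviders` collection is not shown in this diff.

```swift
import Foundation

// Hypothetical provider list for illustration only; the real
// `oauthProviders` collection is not part of this diff.
let sampleProviders = ["nous", "claude", "anthropic"]

/// Returns the first provider (in list order) that appears as a whole
/// word in `text`, or nil when none does.
func firstProvider(in text: String, from providers: [String]) -> String? {
    let lowered = text.lowercased()
    for provider in providers {
        // `\b` anchors the match at a word boundary on both sides, so
        // "anthropicapi" does not count as a hit for "anthropic".
        let pattern = "\\b" + NSRegularExpression.escapedPattern(for: provider) + "\\b"
        if lowered.range(of: pattern, options: .regularExpression) != nil {
            return provider
        }
    }
    return nil
}

let hit = firstProvider(in: "401 Unauthorized from Anthropic", from: sampleProviders)
let miss = firstProvider(in: "anthropicapi failed", from: sampleProviders)
```

Note that "first match" here means first in list order, not first by position in the text, which mirrors the loop in the function above.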
@@ -36,6 +36,17 @@ public actor ProcessACPChannel: ACPChannel {
private var readerTask: Task<Void, Never>?
private var stderrTask: Task<Void, Never>?
/// Read by `ACPClient` to fill in `processTerminated(exitCode:)`
/// so the error names the actual exit code rather than reporting a
/// bare timeout. Sourced directly from `Process`; `Process` is
/// thread-safe for this read and reflects the actual reap state,
/// so we sidestep the race between the OS-side `terminationHandler`
/// callback and the EOF-driven disconnect cleanup that would
/// otherwise need an atomic to coordinate.
public var lastExitCode: Int32? {
process.isRunning ? nil : process.terminationStatus
}
/// The subprocess's PID as a human-readable string.
public var diagnosticID: String? {
"pid=\(process.processIdentifier)"
@@ -58,7 +69,7 @@ public actor ProcessACPChannel: ACPChannel {
proc.executableURL = URL(fileURLWithPath: executable)
proc.arguments = args
proc.environment = env
try await Self.launch(process: proc, self_: nil)
try await Self.launch(process: proc)
try Self.ignoreSIGPIPE_once()
self.process = proc
@@ -75,14 +86,15 @@ public actor ProcessACPChannel: ACPChannel {
self.stderr = errStream
self.stderrContinuation = errContinuation
await startReaders()
startReaders()
installTerminationHandler()
}
/// Secondary entry point for callers that have a pre-configured
/// `Process` (typically from `SSHTransport.makeProcess`). The process
/// must NOT already be running; this initializer calls `run()`.
public init(process: Process) async throws {
try await Self.launch(process: process, self_: nil)
try await Self.launch(process: process)
try Self.ignoreSIGPIPE_once()
self.process = process
@@ -99,15 +111,13 @@ public actor ProcessACPChannel: ACPChannel {
self.stderr = errStream
self.stderrContinuation = errContinuation
await startReaders()
startReaders()
installTerminationHandler()
}
/// Wire fresh stdin/stdout/stderr pipes (overwriting any the caller
/// set) and start the subprocess. `self_` is unused today; the
/// placeholder keeps the signature ready for a future hook that
/// captures termination in `proc.terminationHandler` and routes it
/// into the channel's actor state.
private static func launch(process: Process, self_: Any?) async throws {
/// set) and start the subprocess.
private static func launch(process: Process) async throws {
process.standardInput = Pipe()
process.standardOutput = Pipe()
process.standardError = Pipe()
@@ -118,6 +128,22 @@ public actor ProcessACPChannel: ACPChannel {
}
}
/// Install a `terminationHandler` that closes the stdout read end
/// the moment the OS reaps the child. Without this, the reader
/// loop's `availableData` keeps blocking until the kernel tears
/// the pipe down on its own schedule, visible to the user as a
/// 30s ACP `initialize` timeout where a fast SSH-side failure
/// (Connection refused, exit 127) should surface in under a
/// second. The exit code itself is read on demand from
/// `Process.terminationStatus` (see `lastExitCode`), so this
/// callback doesn't need to touch actor state.
private func installTerminationHandler() {
let stdoutFh = stdoutPipe.fileHandleForReading
process.terminationHandler = { _ in
try? stdoutFh.close()
}
}
/// Ignore SIGPIPE once per process so a broken-pipe write returns
/// `EPIPE` (which we surface as `.writeEndClosed`) instead of
/// delivering SIGPIPE and tearing the app down. Idempotent; the
@@ -0,0 +1,277 @@
import Foundation
#if canImport(os)
import os
import os.signpost
#endif
/// Lightweight performance instrumentation for the Scarf app family.
///
/// Three primitives (`measure(...)`, `measureAsync(...)`, `event(...)`) drop
/// timing samples through whatever set of backends is currently active.
/// Backends are pluggable: an always-on `os_signpost` backend (free outside
/// Instruments), an in-memory ring buffer (drives the in-app panel), and an
/// `os.Logger` debug backend (off by default).
///
/// **Cost when off.** When no backends are registered, every entry point is
/// `@inline(__always)` and short-circuits to the body call without taking the
/// `ContinuousClock.now` reading. The open-source build defaults to "signpost
/// only"; that backend pays one signpost emit per call, which Apple's runtime
/// elides when no Instruments session is recording.
///
/// **Privacy.** Names are `StaticString` so we cannot accidentally pass user
/// content through a metric tag. Optional `bytes:` field on `event` tracks
/// payload size, never payload contents. The ring buffer never leaves the
/// device unless the user explicitly hits "Copy as JSON" in the Diagnostics
/// panel.
public enum ScarfMon {
// MARK: - Public API
/// Synchronous timing wrapper. The body's return value flows through
/// untouched; the time it took plus `(category, name)` are recorded.
@inline(__always)
public static func measure<T>(
_ category: Category,
_ name: StaticString,
_ body: () throws -> T
) rethrows -> T {
guard isActive else { return try body() }
let start = ContinuousClock.now
defer { record(category, name, start: start, end: ContinuousClock.now) }
return try body()
}
/// Async variant. Same shape: the `defer` block fires after the body
/// returns whether or not it threw, so cancelled / failed work still
/// records its duration.
@inline(__always)
public static func measureAsync<T>(
_ category: Category,
_ name: StaticString,
_ body: () async throws -> T
) async rethrows -> T {
guard isActive else { return try await body() }
let start = ContinuousClock.now
defer { record(category, name, start: start, end: ContinuousClock.now) }
return try await body()
}
/// Single-shot timestamped event. Use for things that aren't intervals
/// (token arrivals, buffer flushes) where count + optional payload size
/// is the useful signal.
@inline(__always)
public static func event(
_ category: Category,
_ name: StaticString,
count: Int = 1,
bytes: Int? = nil
) {
guard isActive else { return }
recordEvent(category, name, count: count, bytes: bytes)
}
// MARK: - Backend management
/// Install the desired backend set. Replaces the current set atomically.
/// Call once at app boot from the launch sequence; safe to call again
/// when the user toggles a setting on or off.
public static func install(_ backends: [ScarfMonBackend]) {
lock.lock()
defer { lock.unlock() }
installed = backends
cachedActive = !backends.isEmpty
}
/// Currently-installed backends. Intended for tests; production code
/// should not iterate this.
public static var currentBackends: [ScarfMonBackend] {
lock.lock()
defer { lock.unlock() }
return installed
}
/// Cheap "are we recording anything?" check. The flag is updated only
/// when `install(...)` runs, so the hot path doesn't take the lock.
@inline(__always)
public static var isActive: Bool { cachedActive }
// MARK: - Internals
private static let lock = ScarfMonLock()
nonisolated(unsafe) private static var installed: [ScarfMonBackend] = []
nonisolated(unsafe) private static var cachedActive: Bool = false
@inline(__always)
private static func record(
_ category: Category,
_ name: StaticString,
start: ContinuousClock.Instant,
end: ContinuousClock.Instant
) {
let duration = end - start
let nanos = nanoseconds(of: duration)
let backends = snapshotBackends()
let sample = Sample(
category: category,
name: name,
kind: .interval,
timestamp: Date(),
durationNanos: nanos,
count: 1,
bytes: nil
)
for backend in backends {
backend.record(sample)
}
}
@inline(__always)
private static func recordEvent(
_ category: Category,
_ name: StaticString,
count: Int,
bytes: Int?
) {
let backends = snapshotBackends()
let sample = Sample(
category: category,
name: name,
kind: .event,
timestamp: Date(),
durationNanos: 0,
count: count,
bytes: bytes
)
for backend in backends {
backend.record(sample)
}
}
private static func snapshotBackends() -> [ScarfMonBackend] {
lock.lock()
defer { lock.unlock() }
return installed
}
private static func nanoseconds(of duration: Duration) -> UInt64 {
// Duration is (seconds: Int64, attoseconds: Int64). Avoid Double
// for the seconds term to keep precision on long intervals.
let comps = duration.components
let secondsAsNanos = UInt64(max(0, comps.seconds)) &* 1_000_000_000
let attoAsNanos = UInt64(max(0, comps.attoseconds) / 1_000_000_000)
return secondsAsNanos &+ attoAsNanos
}
}
// MARK: - Categories
extension ScarfMon {
/// Stable category vocabulary. Add cases here when new subsystems get
/// instrumented; renames are breaking changes for any saved JSON dumps
/// users have shared, so prefer adding over renaming.
public enum Category: String, CaseIterable, Sendable, Codable {
case chatRender
case chatStream
case sessionLoad
case transport
case sqlite
case diskIO
case render
case other
}
}
// MARK: - Sample
/// One recorded sample. All fields are value types so the struct is trivially
/// `Sendable` across backend queues without locks.
public struct ScarfMonSample: Sendable, Hashable {
public enum Kind: String, Sendable, Codable {
case interval
case event
}
public let category: ScarfMon.Category
/// Static name string captured at the call site. Not a `String`: keeping
/// it `StaticString` proves at compile time that names cannot leak user
/// data through this channel.
public let name: StaticString
public let kind: Kind
public let timestamp: Date
public let durationNanos: UInt64
public let count: Int
public let bytes: Int?
public init(
category: ScarfMon.Category,
name: StaticString,
kind: Kind,
timestamp: Date,
durationNanos: UInt64,
count: Int,
bytes: Int?
) {
self.category = category
self.name = name
self.kind = kind
self.timestamp = timestamp
self.durationNanos = durationNanos
self.count = count
self.bytes = bytes
}
/// `StaticString` does not conform to `Hashable` natively (it doesn't
/// promise a stable hash). We compare and hash via its `description` so two
/// samples with the same source-literal name compare equal.
public static func == (lhs: ScarfMonSample, rhs: ScarfMonSample) -> Bool {
lhs.category == rhs.category
&& lhs.kind == rhs.kind
&& lhs.timestamp == rhs.timestamp
&& lhs.durationNanos == rhs.durationNanos
&& lhs.count == rhs.count
&& lhs.bytes == rhs.bytes
&& lhs.name.description == rhs.name.description
}
public func hash(into hasher: inout Hasher) {
hasher.combine(category)
hasher.combine(kind)
hasher.combine(timestamp)
hasher.combine(durationNanos)
hasher.combine(count)
hasher.combine(bytes)
hasher.combine(name.description)
}
}
extension ScarfMon {
public typealias Sample = ScarfMonSample
}
// MARK: - Backend protocol
/// One sink for samples. Implementations must be cheap on the hot path:
/// callers hold no lock while invoking `record`, but the hot path runs from
/// every instrumented site, so allocations and disk I/O are off-limits here.
public protocol ScarfMonBackend: Sendable {
func record(_ sample: ScarfMon.Sample)
}
// MARK: - Lock
/// Tiny `os_unfair_lock` wrapper. CLAUDE.md says "Use os_unfair_lock (not
/// NSLock) for simple boolean flags accessed from multiple threads."
@usableFromInline
final class ScarfMonLock: @unchecked Sendable {
private let _lock: UnsafeMutablePointer<os_unfair_lock>
init() {
_lock = .allocate(capacity: 1)
_lock.initialize(to: os_unfair_lock())
}
deinit {
_lock.deinitialize(count: 1)
_lock.deallocate()
}
@usableFromInline func lock() { os_unfair_lock_lock(_lock) }
@usableFromInline func unlock() { os_unfair_lock_unlock(_lock) }
}
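The `Duration` to nanoseconds conversion used by `record(...)` is easy to check by hand. The sketch below copies the arithmetic into a free function (the free-function form and the sample values are assumptions for illustration) and works one example: 2.5 s should come out as 2,500,000,000 ns, and negative durations clamp to zero.

```swift
// Requires a toolchain and OS where `Duration` is available
// (Swift 5.7+, macOS 13 / iOS 16 or later).

/// Integer-only conversion: seconds are scaled up by 1e9 and
/// attoseconds are scaled down by 1e9 (1 ns = 1e9 as).
func nanoseconds(of duration: Duration) -> UInt64 {
    let comps = duration.components
    // Negative components clamp to zero so the UInt64 conversion cannot trap.
    let secondsAsNanos = UInt64(max(0, comps.seconds)) &* 1_000_000_000
    let attoAsNanos = UInt64(max(0, comps.attoseconds) / 1_000_000_000)
    return secondsAsNanos &+ attoAsNanos
}

let sampleDuration: Duration = .seconds(2) + .milliseconds(500)
let sampleNanos = nanoseconds(of: sampleDuration)
let clampedNanos = nanoseconds(of: .seconds(-1))
```

Staying in integers avoids the rounding a `Double` seconds value would introduce on long intervals, which is the reason given in the original comment.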
@@ -0,0 +1,76 @@
import Foundation
/// Boot-time wiring for ScarfMon. Both app targets call
/// `ScarfMonBoot.configure(...)` at launch and again whenever the user
/// flips the Diagnostics Performance toggle.
///
/// Three modes:
/// - `.off`: nothing is recorded. Hot path is one branch + return.
/// - `.signpostOnly`: Instruments-only. Default in the open-source build.
///   Free outside an Instruments session.
/// - `.full`: signpost + ring buffer + os.Logger debug stream. Drives the
///   in-app panel and the "Copy as JSON" button. Opt-in.
public enum ScarfMonBoot {
public enum Mode: String, Sendable, CaseIterable {
case off
case signpostOnly
case full
}
/// User-defaults key for the persisted toggle. Same key on iOS + Mac
/// so `defaults read com.scarf.app ScarfMonMode` works on either.
public static let userDefaultsKey = "ScarfMonMode"
/// Read the persisted mode, defaulting to `.signpostOnly` so users
/// always get Instruments-visible signposts unless they explicitly
/// turn them off.
public static func currentMode(_ defaults: UserDefaults = .standard) -> Mode {
if let raw = defaults.string(forKey: userDefaultsKey),
let mode = Mode(rawValue: raw) {
return mode
}
return .signpostOnly
}
/// Persist a new mode and reinstall the backend set.
public static func setMode(_ mode: Mode, _ defaults: UserDefaults = .standard) {
defaults.set(mode.rawValue, forKey: userDefaultsKey)
configure(mode: mode)
}
/// Install the backend set for a given mode. Returns the active ring
/// buffer (if any) so the in-app Diagnostics panel can read from it.
@discardableResult
public static func configure(mode: Mode) -> ScarfMonRingBuffer? {
switch mode {
case .off:
ScarfMon.install([])
sharedRingBuffer = nil
return nil
case .signpostOnly:
ScarfMon.install([ScarfMonSignpostBackend()])
sharedRingBuffer = nil
return nil
case .full:
let ring = ScarfMonRingBuffer()
sharedRingBuffer = ring
ScarfMon.install([
ScarfMonSignpostBackend(),
ring,
ScarfMonLoggerBackend()
])
return ring
}
}
/// Process-wide ring buffer when running in `.full` mode. Nil otherwise.
/// Read by the Diagnostics panel; writes happen through the backend
/// dispatcher so this property is read-only.
///
/// `nonisolated(unsafe)` because the value is only mutated by
/// `configure(...)` (which itself runs on whichever actor invokes
/// the boot helper at app launch; single-writer in practice) and
/// read from the panel UI on the main actor. Adding a lock here
/// would just add overhead with no real safety win.
nonisolated(unsafe) public private(set) static var sharedRingBuffer: ScarfMonRingBuffer?
}
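The fallback behaviour of `currentMode(_:)` (unset or unrecognised values resolve to `.signpostOnly`) can be exercised against a throwaway defaults suite. This is a minimal sketch: the local `Mode` enum mirrors the one above, while the suite name and the free-function form are hypothetical.

```swift
import Foundation

// Local mirror of `ScarfMonBoot.Mode`, redeclared so the sketch is standalone.
enum Mode: String, CaseIterable {
    case off, signpostOnly, full
}

/// Reads the persisted mode; anything missing or unparseable falls
/// back to `.signpostOnly`.
func currentMode(in defaults: UserDefaults, key: String = "ScarfMonMode") -> Mode {
    if let raw = defaults.string(forKey: key), let mode = Mode(rawValue: raw) {
        return mode
    }
    return .signpostOnly
}

// Hypothetical suite name; a private suite keeps the demo away from
// the app's real preferences.
let suiteName = "example.scarfmon.demo"
let demoDefaults = UserDefaults(suiteName: suiteName)!
demoDefaults.removePersistentDomain(forName: suiteName)

let unsetMode = currentMode(in: demoDefaults)      // nothing stored yet
demoDefaults.set("full", forKey: "ScarfMonMode")
let storedMode = currentMode(in: demoDefaults)     // valid raw value
demoDefaults.set("bogus", forKey: "ScarfMonMode")
let invalidMode = currentMode(in: demoDefaults)    // unknown raw value
demoDefaults.removePersistentDomain(forName: suiteName)
```

Treating an unknown raw value the same as a missing one means a renamed or removed mode in a later build degrades to the default rather than failing.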
@@ -0,0 +1,41 @@
import Foundation
#if canImport(os)
import os
#endif
/// `os.Logger`-backed sink. Off by default; opt-in via the Diagnostics
/// settings toggle. Writes one `.debug` line per sample at the
/// `com.scarf.mon` subsystem, so users can stream the output via
/// `log stream --level debug --predicate 'subsystem == "com.scarf.mon"'`
/// without enabling private-data redaction overrides.
///
/// Only meaningful for users running their own debug build or with the
/// "verbose performance logging" toggle on.
public final class ScarfMonLoggerBackend: ScarfMonBackend, @unchecked Sendable {
#if canImport(os)
private let logger: Logger
public init(category: String = "perf") {
self.logger = Logger(subsystem: "com.scarf.mon", category: category)
}
public func record(_ sample: ScarfMon.Sample) {
switch sample.kind {
case .interval:
// `privacy: .public` keeps the name out of the private-data
// redaction path: names are public, durations are public, and
// the user's content never touches this channel.
logger.debug(
"\(sample.category.rawValue, privacy: .public) \(sample.name.description, privacy: .public) ms=\(Double(sample.durationNanos) / 1_000_000.0, privacy: .public)"
)
case .event:
logger.debug(
"\(sample.category.rawValue, privacy: .public) \(sample.name.description, privacy: .public) count=\(sample.count, privacy: .public) bytes=\(sample.bytes ?? -1, privacy: .public)"
)
}
}
#else
public init(category: String = "perf") {}
public func record(_ sample: ScarfMon.Sample) { /* no-op off-Apple */ }
#endif
}
@@ -0,0 +1,176 @@
import Foundation
/// Fixed-size, lock-protected ring of recent samples. Drives the in-app
/// Diagnostics panel and the export-as-JSON button.
///
/// Capacity is fixed at init (default 4096); 4096 entries × ~80 bytes per
/// sample = ~320 KB resident. At 200 samples/s that holds about 20 seconds
/// of streaming-chat activity (4096 / 200 ≈ 20 s); quieter workloads keep
/// proportionally more history before interesting context is overwritten.
///
/// The hot path takes one `os_unfair_lock` per `record`. Aggregation (the
/// `summary(...)` reader) builds a fresh dictionary each call; it is only
/// invoked from the panel UI, which polls at a human cadence.
public final class ScarfMonRingBuffer: ScarfMonBackend, @unchecked Sendable {
public let capacity: Int
private let lock = ScarfMonLock()
private var storage: [ScarfMon.Sample?]
/// Next write index. Wraps around `capacity` so the buffer never grows.
private var head: Int = 0
/// True once we've wrapped at least once; switches the read order from
/// `[0..<head]` to `[head..<capacity] + [0..<head]`.
private var didWrap: Bool = false
public init(capacity: Int = 4096) {
precondition(capacity > 0, "ring buffer needs a positive capacity")
self.capacity = capacity
self.storage = Array(repeating: nil, count: capacity)
}
public func record(_ sample: ScarfMon.Sample) {
lock.lock()
defer { lock.unlock() }
storage[head] = sample
head += 1
if head >= capacity {
head = 0
didWrap = true
}
}
/// Snapshot of all currently-resident samples in chronological order.
public func samples() -> [ScarfMon.Sample] {
lock.lock()
defer { lock.unlock() }
if !didWrap {
return storage[0..<head].compactMap { $0 }
}
let tail = storage[head..<capacity].compactMap { $0 }
let leading = storage[0..<head].compactMap { $0 }
return tail + leading
}
/// Wipe the buffer. Used by the "Reset" button in the Diagnostics
/// panel and at the top of every test case.
public func reset() {
lock.lock()
defer { lock.unlock() }
for i in 0..<capacity { storage[i] = nil }
head = 0
didWrap = false
}
/// Aggregated stats over the current buffer. Buckets by
/// `(category, name)`; computes count, total nanos, p50, p95, max.
public func summary() -> [ScarfMonStat] {
let snapshot = samples()
var buckets: [BucketKey: [UInt64]] = [:]
var counts: [BucketKey: Int] = [:]
var byteTotals: [BucketKey: Int] = [:]
var kinds: [BucketKey: ScarfMon.Sample.Kind] = [:]
for sample in snapshot {
let key = BucketKey(category: sample.category, name: sample.name.description)
kinds[key] = sample.kind
counts[key, default: 0] += sample.count
if let b = sample.bytes { byteTotals[key, default: 0] += b }
if sample.kind == .interval {
buckets[key, default: []].append(sample.durationNanos)
}
}
var stats: [ScarfMonStat] = []
for (key, _) in counts {
let durations = buckets[key] ?? []
let kind = kinds[key] ?? .event
stats.append(ScarfMonStat(
category: key.category,
name: key.name,
kind: kind,
count: counts[key] ?? 0,
totalNanos: durations.reduce(0, &+),
p50Nanos: percentile(durations, 0.50),
p95Nanos: percentile(durations, 0.95),
maxNanos: durations.max() ?? 0,
totalBytes: byteTotals[key] ?? 0
))
}
stats.sort { $0.p95Nanos > $1.p95Nanos }
return stats
}
private struct BucketKey: Hashable {
let category: ScarfMon.Category
let name: String
}
private func percentile(_ values: [UInt64], _ p: Double) -> UInt64 {
guard !values.isEmpty else { return 0 }
let sorted = values.sorted()
// Nearest-rank percentile: good enough for triage and avoids
// interpolation edge cases on tiny samples.
let rank = max(1, min(sorted.count, Int((p * Double(sorted.count)).rounded(.up))))
return sorted[rank - 1]
}
}
/// Per-bucket stats surfaced to the in-app panel.
public struct ScarfMonStat: Sendable, Hashable, Codable {
public let category: ScarfMon.Category
public let name: String
public let kind: ScarfMon.Sample.Kind
public let count: Int
public let totalNanos: UInt64
public let p50Nanos: UInt64
public let p95Nanos: UInt64
public let maxNanos: UInt64
public let totalBytes: Int
public var totalMs: Double { Double(totalNanos) / 1_000_000.0 }
public var p50Ms: Double { Double(p50Nanos) / 1_000_000.0 }
public var p95Ms: Double { Double(p95Nanos) / 1_000_000.0 }
public var maxMs: Double { Double(maxNanos) / 1_000_000.0 }
}
// MARK: - JSON export
extension ScarfMonRingBuffer {
/// Compact JSON dump for the "Copy as JSON" button. One line per sample
/// keeps the output greppable when the user pastes it into a feedback
/// thread.
public func exportJSON() -> String {
struct Wire: Codable {
let category: String
let name: String
let kind: String
let timestampMs: Double
let durationNanos: UInt64
let count: Int
let bytes: Int?
}
let snapshot = samples()
let encoder = JSONEncoder()
encoder.outputFormatting = [.sortedKeys]
var lines: [String] = []
lines.reserveCapacity(snapshot.count + 1)
lines.append("[")
for (i, s) in snapshot.enumerated() {
let wire = Wire(
category: s.category.rawValue,
name: s.name.description,
kind: s.kind.rawValue,
timestampMs: s.timestamp.timeIntervalSince1970 * 1000,
durationNanos: s.durationNanos,
count: s.count,
bytes: s.bytes
)
if let data = try? encoder.encode(wire),
let line = String(data: data, encoding: .utf8) {
let suffix = i == snapshot.count - 1 ? "" : ","
lines.append(" " + line + suffix)
}
}
lines.append("]")
return lines.joined(separator: "\n")
}
}
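The nearest-rank rule behind `p50Nanos` and `p95Nanos` can be checked on a small input. The sketch below reproduces the arithmetic as a free function (the `nearestRank` name and the sample values are illustrative assumptions): the rank is `ceil(p × n)` clamped to `1...n`, and the result is the value at that 1-based rank in sorted order.

```swift
/// Nearest-rank percentile: no interpolation, always returns a value
/// that actually occurs in the input. Empty input yields 0.
func nearestRank(_ values: [UInt64], _ p: Double) -> UInt64 {
    guard !values.isEmpty else { return 0 }
    let sorted = values.sorted()
    // rank = ceil(p * n), clamped so tiny p or p > 1 stays in bounds.
    let rank = max(1, min(sorted.count, Int((p * Double(sorted.count)).rounded(.up))))
    return sorted[rank - 1]
}

let durations: [UInt64] = [40, 10, 30, 20, 50]
let p50 = nearestRank(durations, 0.50)   // rank = ceil(2.5)  = 3
let p95 = nearestRank(durations, 0.95)   // rank = ceil(4.75) = 5
let emptyP95 = nearestRank([], 0.95)
```

On tiny buckets the p95 collapses to the maximum, as it does here, so a single slow sample dominates the panel's sort order until more samples arrive.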
@@ -0,0 +1,54 @@
import Foundation
#if canImport(os)
import os
import os.signpost
#endif
/// Always-on signpost backend. Emits an `os_signpost` event per sample so
/// users can attach Instruments and see Scarf's instrumentation in the
/// Points of Interest track without a debug build.
///
/// `os_signpost` is elided by the runtime when no Instruments session is
/// recording the relevant subsystem; the backend pays the cost of one
/// `OSLog` lookup per emit and nothing else.
public final class ScarfMonSignpostBackend: ScarfMonBackend, @unchecked Sendable {
#if canImport(os)
private let log: OSLog
public init(subsystem: String = "com.scarf.mon") {
self.log = OSLog(subsystem: subsystem, category: .pointsOfInterest)
}
public func record(_ sample: ScarfMon.Sample) {
// Signposts want a `StaticString` name, and we already require
// exactly that on the API. Format string is also static; the
// dynamic values flow as printf-style args, so no allocations
// for the event name itself.
switch sample.kind {
case .interval:
os_signpost(
.event,
log: log,
name: sample.name,
"category=%{public}@ ms=%{public}.3f count=%d",
sample.category.rawValue,
Double(sample.durationNanos) / 1_000_000.0,
sample.count
)
case .event:
os_signpost(
.event,
log: log,
name: sample.name,
"category=%{public}@ count=%d bytes=%d",
sample.category.rawValue,
sample.count,
sample.bytes ?? -1
)
}
}
#else
public init(subsystem: String = "com.scarf.mon") {}
public func record(_ sample: ScarfMon.Sample) { /* no-op off-Apple */ }
#endif
}
@@ -243,19 +243,32 @@ public struct ACPPromptResult: Sendable {
public let outputTokens: Int
public let thoughtTokens: Int
public let cachedReadTokens: Int
/// Number of automatic context compactions Hermes has performed on this
/// session so far. v0.13+; older Hermes hosts always return 0, which
/// the chat status bar treats as "hide chip". Optional in the wire
/// payload; folded into a non-optional `Int` here with a 0 default so
/// the rest of the pipeline doesn't need to nil-check.
// TODO(WS-8-Q1): Verify that v0.13 Hermes emits the count on
// `session/prompt`'s `usage` blob (assumed here). If it lands on a
// separate `session/update` notification instead, this becomes a new
// ACPEvent case + a branch in RichChatViewModel.handleACPEvent; the wire
// shape is documented in the WS-8 plan as the bigger fix path.
public let compressionCount: Int
public init(
stopReason: String,
inputTokens: Int,
outputTokens: Int,
thoughtTokens: Int,
cachedReadTokens: Int
cachedReadTokens: Int,
compressionCount: Int = 0
) {
self.stopReason = stopReason
self.inputTokens = inputTokens
self.outputTokens = outputTokens
self.thoughtTokens = thoughtTokens
self.cachedReadTokens = cachedReadTokens
self.compressionCount = compressionCount
}
}
@@ -0,0 +1,183 @@
import Foundation
/// Top-level manifest for a `.scarfbackup` archive.
///
/// **Archive layout** (`.scarfbackup` is a plain ZIP):
/// ```
/// <name>.scarfbackup
///   manifest.json            # this struct, JSON-encoded
///   hermes.tar.gz            # gzipped tar of `~/.hermes/` (minus exclusions)
///   projects/
///     <project-id>.tar.gz    # one inner tarball per registered project
///     ...
/// ```
///
/// **Why two layers (outer ZIP + inner tarballs).** The inner tarballs are
/// produced by streaming `tar -czf -` over SSH; that's the only way to
/// keep memory bounded for multi-GB hermes homes. The outer ZIP exists so
/// the manifest sits at a fixed, easy-to-inspect location and so users on
/// macOS can double-click in Finder and see the structure. ZIP also has a
/// central directory at the end, which makes "validate without extracting"
/// cheap.
///
/// **What rides along.** Hermes home (state.db + sessions + skills + cron +
/// memories + scarf sidecars + plugins/profiles), each project's full file
/// tree (the user's code), and the manifest itself. **What does NOT ride
/// along by default**: `auth.json` (provider credentials), `mcp-tokens/`
/// (per-host OAuth bearer tokens), `logs/` (size, low restore value),
/// `state.db-wal` / `state.db-shm` (in-flight WAL siblings; we checkpoint
/// before the archive). The `options` block records exactly which
/// exclusions were applied so the restore flow can warn the user.
public struct BackupManifest: Codable, Sendable, Equatable {
/// Bumped when the on-disk shape changes incompatibly. v1 is the only
/// shape today; restores refuse anything they don't recognize.
public var schemaVersion: Int
/// Magic string. Lets a future Scarf reject `.zip` files that aren't
/// our backups before unpacking them as if they were.
public var kind: String
/// ISO-8601 UTC timestamp the archive was produced.
public var createdAt: String
/// Identifies the server the backup came from. The display name is for
/// the restore preview sheet; serverID is for de-dupe and lineage.
public var source: Source
/// Hermes home tree metadata. Always present (even an empty Hermes
/// install ships an empty tarball; the restore replaces nothing
/// rather than refusing).
public var hermes: HermesTree
/// One entry per registered project at backup time. Empty array
/// when the user never registered any projects.
public var projects: [ProjectEntry]
/// What was included / excluded from the Hermes tree. Flagged so the
/// restore preview honestly reports "auth.json was not in this
/// backup, so you'll re-authenticate after restore".
public var options: Options
public init(
schemaVersion: Int = BackupManifest.currentSchemaVersion,
kind: String = BackupManifest.kindMagic,
createdAt: String,
source: Source,
hermes: HermesTree,
projects: [ProjectEntry],
options: Options
) {
self.schemaVersion = schemaVersion
self.kind = kind
self.createdAt = createdAt
self.source = source
self.hermes = hermes
self.projects = projects
self.options = options
}
public static let currentSchemaVersion = 1
public static let kindMagic = "scarf-server-backup"
public struct Source: Codable, Sendable, Equatable {
public var serverID: String
public var displayName: String
public var host: String
public var user: String?
/// Output of `hermes --version` on the source host at backup
/// time. Restore warns if the target has an older version installed
/// (state.db schema differences could break things silently).
public var hermesVersion: String?
public init(serverID: String, displayName: String, host: String, user: String?, hermesVersion: String?) {
self.serverID = serverID
self.displayName = displayName
self.host = host
self.user = user
self.hermesVersion = hermesVersion
}
}
public struct HermesTree: Codable, Sendable, Equatable {
/// Absolute path of `~/.hermes/` on the source host (e.g.
/// `/root/.hermes` or `/home/alan/.hermes`). Used by restore to
/// detect path drift when targeting a different user account.
public var homePath: String
/// Path inside the outer ZIP (always `hermes.tar.gz`).
public var tarballPath: String
/// Compressed bytes for the preview sheet's size summary.
public var tarballSize: Int64
/// Hex SHA-256 of the inner tarball. Restore verifies before
/// extracting; corruption surfaces as a single bad path
/// rather than a half-extracted home.
public var tarballSHA256: String
public init(homePath: String, tarballPath: String, tarballSize: Int64, tarballSHA256: String) {
self.homePath = homePath
self.tarballPath = tarballPath
self.tarballSize = tarballSize
self.tarballSHA256 = tarballSHA256
}
}
public struct ProjectEntry: Codable, Sendable, Equatable {
/// Stable UUID for the project. Used to namespace the inner
/// tarball so a project with `name = "scratch"` in two
/// different directories doesn't collide.
public var id: String
public var name: String
/// Absolute path on the source host. Restore re-anchors this if
/// the target has a different home (e.g. backup from `/root`,
/// restore to `/home/ubuntu`).
public var path: String
/// Path inside the outer ZIP (e.g. `projects/<id>.tar.gz`).
public var tarballPath: String
public var tarballSize: Int64
public var tarballSHA256: String
public init(id: String, name: String, path: String, tarballPath: String, tarballSize: Int64, tarballSHA256: String) {
self.id = id
self.name = name
self.path = path
self.tarballPath = tarballPath
self.tarballSize = tarballSize
self.tarballSHA256 = tarballSHA256
}
}
public struct Options: Codable, Sendable, Equatable {
public var includeAuth: Bool
public var includeMcpTokens: Bool
public var includeLogs: Bool
/// True if `sqlite3 PRAGMA wal_checkpoint(TRUNCATE)` was run on
/// the remote before tarballing the Hermes home. False means the
/// archive may contain a `state.db` mid-write, which is usually fine
/// (SQLite tolerates restarted reads from a quiesced DB) but is
/// flagged for forensics.
public var checkpointedWAL: Bool
public init(includeAuth: Bool, includeMcpTokens: Bool, includeLogs: Bool, checkpointedWAL: Bool) {
self.includeAuth = includeAuth
self.includeMcpTokens = includeMcpTokens
self.includeLogs = includeLogs
self.checkpointedWAL = checkpointedWAL
}
public static let safeDefault = Options(
includeAuth: false,
includeMcpTokens: false,
includeLogs: false,
checkpointedWAL: true
)
}
}
/// Canonical layout strings referenced by both the producer and the
/// consumer so the on-disk paths stay in sync.
public enum BackupArchiveLayout {
public static let manifestPath = "manifest.json"
public static let hermesTarballPath = "hermes.tar.gz"
public static let projectsTarballPrefix = "projects/"
public static let archiveExtension = "scarfbackup"
/// Returns `projects/<id>.tar.gz`. The id is the `ProjectEntry.id`
/// (stable UUID), not the project name; names are renamed all the
/// time and would collide.
public static func projectTarballPath(for id: String) -> String {
projectsTarballPrefix + id + ".tar.gz"
}
}
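The doc comment on `ProjectEntry.path` says restore re-anchors a project path when the target has a different home (backup from `/root`, restore to `/home/ubuntu`). A minimal sketch of that idea follows. `reanchor(path:fromHome:toHome:)` is a hypothetical helper written for illustration and does not appear in this diff; the real restore code may handle more cases.

```swift
import Foundation

/// Hypothetical helper (not part of the diff): rewrites an absolute path
/// recorded under the source home so it lives under the target home.
/// Paths outside the source home are returned unchanged.
func reanchor(path: String, fromHome source: String, toHome target: String) -> String {
    // Drop a trailing slash so "/root/" and "/root" behave the same.
    let src = source.hasSuffix("/") ? String(source.dropLast()) : source
    let dst = target.hasSuffix("/") ? String(target.dropLast()) : target
    if path == src { return dst }
    // Require a separator after the prefix so "/rootfs/x" is not
    // mistaken for a path under "/root".
    guard path.hasPrefix(src + "/") else { return path }
    return dst + String(path.dropFirst(src.count))
}

let moved = reanchor(path: "/root/projects/scratch", fromHome: "/root", toHome: "/home/ubuntu")
let untouched = reanchor(path: "/rootfs/x", fromHome: "/root", toHome: "/home/ubuntu")
print(moved, untouched)
```

Matching on `src + "/"` rather than the bare prefix is the design choice worth noting: it keeps sibling directories that merely share a name prefix from being rewritten.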
@@ -0,0 +1,52 @@
import Foundation
/// One image attached to an outgoing chat prompt.
///
/// Hermes v0.12 ACP advertises `prompt_capabilities.image = true` and
/// accepts content-block arrays in `session/prompt`. Scarf produces these
/// blocks from drag-dropped / pasted / picker-selected images. We
/// downsample + JPEG-encode at the producer side so the wire payload
/// stays under a few hundred kilobytes per image even when the user
/// drops a 12 MP screenshot.
///
/// Constructed via `ImageEncoder.encode(...)`. The store-the-bytes-once
/// shape means `RichChatViewModel` can keep the array between turns
/// (e.g. while the agent is responding) without holding `NSImage` /
/// `UIImage` references that would pin the originals in memory.
public struct ChatImageAttachment: Sendable, Equatable, Identifiable {
public let id: String
/// IANA MIME type; matches the `mimeType` field on ACP `ImageContentBlock`.
/// Normally `image/jpeg` after re-encoding; PNG originals keep their
/// type when small enough to skip the JPEG step.
public let mimeType: String
/// Base64-encoded payload. NOT prefixed with `data:`; Hermes wraps it
/// when forwarding to OpenAI multimodal payloads (see
/// `_image_block_to_openai_part` in `acp_adapter/server.py`).
public let base64Data: String
/// Small inline thumbnail for the composer's preview strip. Same MIME
/// type as `base64Data`. Nil when the source was already small enough
/// to use directly.
public let thumbnailBase64: String?
/// Original filename, when known (drag-drop carries it; paste doesn't).
/// Surfaced as a tooltip on the preview chip.
public let filename: String?
/// Approximate decoded byte count, kept for the composer's
/// "X images, Y KB" status pill.
public let approximateByteCount: Int
public init(
id: String = UUID().uuidString,
mimeType: String,
base64Data: String,
thumbnailBase64: String?,
filename: String?,
approximateByteCount: Int
) {
self.id = id
self.mimeType = mimeType
self.base64Data = base64Data
self.thumbnailBase64 = thumbnailBase64
self.filename = filename
self.approximateByteCount = approximateByteCount
}
}
@@ -0,0 +1,34 @@
import Foundation
/// Errors thrown by `CuratorService`. Each case carries enough detail
/// to render a user-actionable message; the view model surfaces these
/// inline as a banner above the leaderboard rather than blocking with a
/// modal alert.
public enum CuratorError: Error, LocalizedError, Sendable {
/// `hermes` binary couldn't be located.
case cliMissing
/// Subprocess returned non-zero exit. `stderr` may carry a synthetic
/// message when the transport itself failed.
case nonZeroExit(verb: String, code: Int32, stderr: String)
/// JSON decoding failed. Underlying message wrapped for diagnostics.
case decoding(verb: String, message: String)
/// Generic transport error: process couldn't start, IO failed, etc.
case transport(message: String)
public var errorDescription: String? {
switch self {
case .cliMissing:
return "Hermes CLI couldn't be found. Install Hermes v0.13+ and ensure it's on your PATH."
case .nonZeroExit(let verb, let code, let stderr):
let trimmed = stderr.trimmingCharacters(in: .whitespacesAndNewlines)
if trimmed.isEmpty {
return "`hermes curator \(verb)` exited with code \(code)."
}
return trimmed
case .decoding(let verb, let message):
return "Couldn't decode `hermes curator \(verb)` output: \(message)"
case .transport(let message):
return message
}
}
}
@@ -0,0 +1,76 @@
import Foundation
/// Hermes v0.13 added cross-platform recipient allowlists to the Messaging
/// Gateway. Each platform stores the list under a different YAML key
/// depending on the platform's primary noun for "addressable destination":
///
/// - **`allowed_channels`**: Slack, Mattermost, Google Chat
/// - **`allowed_chats`**: Telegram, WhatsApp
/// - **`allowed_rooms`**: Matrix, DingTalk
///
/// `GatewayAllowlistKind` encodes the platform-to-key mapping plus a few
/// presentation hints (placeholder strings, singular noun) so the allowlist
/// editor can render the right copy without the per-platform setup view
/// needing to know the YAML shape.
public enum GatewayAllowlistKind: String, Sendable, Equatable {
case channels // -> allowed_channels
case chats // -> allowed_chats
case rooms // -> allowed_rooms
/// YAML scalar key segment under `gateway.platforms.<platform>.<key>`.
public var yamlKey: String {
switch self {
case .channels: return "allowed_channels"
case .chats: return "allowed_chats"
case .rooms: return "allowed_rooms"
}
}
/// Placeholder copy for the editor's "add row" text field. Picks the
/// most common identifier shape per platform family: Slack channel IDs
/// for `channels`, Telegram username/numeric for `chats`, Matrix room
/// IDs for `rooms`. Users can paste in any platform-specific format the
/// gateway accepts; this is a hint, not validation.
public var inputPlaceholder: String {
switch self {
case .channels: return "C0123ABCD or #channel-name"
case .chats: return "@username or 12345678"
case .rooms: return "!RoomId:matrix.org"
}
}
/// Singular noun for prose surfaces ("Add a channel", "1 chat allowed",
/// "0 rooms"). Capitalization is the caller's responsibility.
public var noun: String {
switch self {
case .channels: return "channel"
case .chats: return "chat"
case .rooms: return "room"
}
}
/// Plural noun for headings + counts.
public var pluralNoun: String {
switch self {
case .channels: return "channels"
case .chats: return "chats"
case .rooms: return "rooms"
}
}
/// Map a Hermes platform identifier to the allowlist kind it supports.
/// Returns `nil` for platforms without v0.13 allowlist support
/// (`cli`, `signal`, `email`, `imessage`, `homeassistant`, `webhook`,
/// `yuanbao`, `microsoft-teams`, `feishu`, `discord`).
///
/// `googlechat` and `google-chat` both map to `.channels` so we round-trip
/// regardless of which spelling Hermes lands on. // TODO(WS-5-Q1)
public static func kind(for platform: String) -> GatewayAllowlistKind? {
switch platform {
case "slack", "mattermost", "google-chat", "googlechat": return .channels
case "telegram", "whatsapp": return .chats
case "matrix", "dingtalk": return .rooms
default: return nil
}
}
}
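To show how the platform lookup and `yamlKey` combine, here is a small self-contained sketch. `AllowlistKind` is a trimmed stand-in for `GatewayAllowlistKind`, and `yamlPath(for:)` is a hypothetical helper added only for illustration; the path shape follows the `gateway.platforms.<platform>.<key>` form quoted in the doc comment.

```swift
/// Trimmed stand-in for `GatewayAllowlistKind` so the example compiles alone.
enum AllowlistKind: String {
    case channels, chats, rooms

    /// Same mapping as the diff: channels -> allowed_channels, and so on.
    var yamlKey: String { "allowed_" + rawValue }

    static func kind(for platform: String) -> AllowlistKind? {
        switch platform {
        case "slack", "mattermost", "google-chat", "googlechat": return .channels
        case "telegram", "whatsapp": return .chats
        case "matrix", "dingtalk": return .rooms
        default: return nil
        }
    }
}

/// Hypothetical helper: full dotted config path for a platform's allowlist,
/// or nil when the platform has no allowlist support.
func yamlPath(for platform: String) -> String? {
    guard let kind = AllowlistKind.kind(for: platform) else { return nil }
    return "gateway.platforms.\(platform).\(kind.yamlKey)"
}

let telegramPath = yamlPath(for: "telegram")
let discordPath = yamlPath(for: "discord")
print(telegramPath ?? "none", discordPath ?? "none")
```

Returning `nil` for unsupported platforms lets a caller hide the allowlist editor entirely instead of rendering an empty one.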
@@ -0,0 +1,71 @@
import Foundation
/// Per-platform Messaging Gateway settings introduced in Hermes v0.13. Bundles
/// the allowlist (the platform-appropriate flavor of `allowed_channels` /
/// `allowed_chats` / `allowed_rooms`) and three behavior toggles
/// (`busy_ack_enabled`, `gateway_restart_notification`,
/// `slash_command_notice_ttl_seconds`).
///
/// The struct carries all three list fields so a single shape fits every
/// platform; only the field matching `GatewayAllowlistKind.kind(for:)` is
/// surfaced in the editor for a given platform. The other two stay empty
/// and round-trip through the YAML parser unchanged.
///
/// **Defaults track Hermes v0.13.** `busyAckEnabled = true`,
/// `gatewayRestartNotification = false`, `slashCommandNoticeTTLSeconds = 0`
/// (disabled). An "all-default" instance therefore produces no `gateway:`
/// block in YAML; see the `HermesConfig+YAML` parsing logic, which only inserts
/// an entry into `gatewayPlatforms` when at least one v0.13 key is present
/// in the file.
public struct GatewayPlatformSettings: Sendable, Equatable {
/// `gateway.platforms.<platform>.allowed_channels`: Slack, Mattermost,
/// Google Chat. Empty when the platform doesn't use channels.
public var allowedChannels: [String]
/// `gateway.platforms.<platform>.allowed_chats`: Telegram, WhatsApp.
/// Empty when the platform doesn't use chats.
public var allowedChats: [String]
/// `gateway.platforms.<platform>.allowed_rooms`: Matrix, DingTalk.
/// Empty when the platform doesn't use rooms.
public var allowedRooms: [String]
/// `gateway.platforms.<platform>.busy_ack_enabled`. Default `true`; set
/// to `false` to suppress per-message "agent is working" acks.
public var busyAckEnabled: Bool
/// `gateway.platforms.<platform>.gateway_restart_notification`. Default
/// `false`; set to `true` to post a "Gateway restarted" notice on boot.
public var gatewayRestartNotification: Bool
/// `gateway.platforms.<platform>.slash_command_notice_ttl_seconds`.
/// Default `0` (disabled). Positive values auto-delete slash-command
/// notices after N seconds.
public var slashCommandNoticeTTLSeconds: Int
public init(
allowedChannels: [String] = [],
allowedChats: [String] = [],
allowedRooms: [String] = [],
busyAckEnabled: Bool = true,
gatewayRestartNotification: Bool = false,
slashCommandNoticeTTLSeconds: Int = 0
) {
self.allowedChannels = allowedChannels
self.allowedChats = allowedChats
self.allowedRooms = allowedRooms
self.busyAckEnabled = busyAckEnabled
self.gatewayRestartNotification = gatewayRestartNotification
self.slashCommandNoticeTTLSeconds = slashCommandNoticeTTLSeconds
}
/// All-default instance. `HermesConfig.empty` initializes
/// `gatewayPlatforms: [:]` so this is rarely used directly; provided
/// for symmetry with the other settings types.
public static let empty = GatewayPlatformSettings()
/// The list field matching this allowlist kind. Always returns an
/// array; it is empty when the platform doesn't use that kind.
public func items(for kind: GatewayAllowlistKind) -> [String] {
switch kind {
case .channels: return allowedChannels
case .chats: return allowedChats
case .rooms: return allowedRooms
}
}
}
@@ -0,0 +1,26 @@
import Foundation
/// Optimistic local mirror of the agent's currently-locked goal (set via
/// the `/goal <text>` slash command, Hermes v0.13+). Scarf records this
/// the moment the user sends `/goal ` so the chat header pill appears
/// synchronously, without waiting for a server round-trip. There is no
/// authoritative read-back path in v2.8.0; see WS-2 plan Q1.
///
/// Plain value type, no mutation API. Drives the goal pill in
/// `SessionInfoBar` and the inspector contextual menu.
public struct HermesActiveGoal: Sendable, Equatable, Identifiable {
/// The user's verbatim goal text (post-trim).
public let text: String
/// When Scarf observed the `/goal` send. Local clock, not the
/// server's authoritative timestamp.
public let setAt: Date
public var id: String {
text + "@" + ISO8601DateFormatter().string(from: setAt)
}
public init(text: String, setAt: Date) {
self.text = text
self.setAt = setAt
}
}
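The optimistic recording described above (capture the goal the moment the user sends the command, with a local timestamp) might look roughly like the sketch below. `ActiveGoal` is a trimmed stand-in for `HermesActiveGoal`, and `optimisticGoal(fromInput:now:)` is a hypothetical function; skipping `--`-prefixed arguments is an assumption made here so that flag-style invocations are not recorded as goal text.

```swift
import Foundation

/// Trimmed stand-in for `HermesActiveGoal`.
struct ActiveGoal: Equatable {
    let text: String
    let setAt: Date
}

/// Hypothetical parser: returns a goal when the input is `/goal <text>`,
/// nil for anything else (including flag-only forms, by assumption).
func optimisticGoal(fromInput input: String, now: Date = Date()) -> ActiveGoal? {
    let trimmed = input.trimmingCharacters(in: .whitespacesAndNewlines)
    guard trimmed.hasPrefix("/goal ") else { return nil }
    // Post-trim text, matching the "verbatim goal text (post-trim)" comment.
    let text = trimmed.dropFirst("/goal ".count)
        .trimmingCharacters(in: .whitespacesAndNewlines)
    guard !text.isEmpty, !text.hasPrefix("--") else { return nil }
    return ActiveGoal(text: text, setAt: now)
}

let recorded = optimisticGoal(fromInput: "  /goal ship v2.8  ")
let flagOnly = optimisticGoal(fromInput: "/goal --clear")
print(recorded?.text ?? "none", flagOnly == nil)
```

Passing `now` as a parameter keeps the function deterministic under test while still defaulting to the local clock in normal use.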
@@ -36,6 +36,13 @@ public struct DisplaySettings: Sendable, Equatable {
public var toolProgressCommand: Bool
public var toolPreviewLength: Int
public var busyInputMode: String // e.g. "interrupt"
/// Static-message translation language. v0.13+. Empty string means
/// "follow Hermes default"; the picker collapses both empty-string
/// and `"en"` to "English" in display, but only writes a value when
/// the user explicitly picks one. Persisted via
/// `hermes config set display.language <code>`. Supported values per
/// v0.13 release notes: `en`, `zh`, `ja`, `de`, `es`, `fr`, `uk`, `tr`.
public var language: String
public init(
@@ -46,7 +53,8 @@ public struct DisplaySettings: Sendable, Equatable {
inlineDiffs: Bool,
toolProgressCommand: Bool,
toolPreviewLength: Int,
busyInputMode: String
busyInputMode: String,
language: String = ""
) {
self.skin = skin
self.compact = compact
@@ -56,6 +64,7 @@ public struct DisplaySettings: Sendable, Equatable {
self.toolProgressCommand = toolProgressCommand
self.toolPreviewLength = toolPreviewLength
self.busyInputMode = busyInputMode
self.language = language
}
public nonisolated static let empty = DisplaySettings(
skin: "default",
@@ -65,7 +74,8 @@ public struct DisplaySettings: Sendable, Equatable {
inlineDiffs: true,
toolProgressCommand: false,
toolPreviewLength: 0,
busyInputMode: "interrupt"
busyInputMode: "interrupt",
language: ""
)
}
@@ -190,6 +200,15 @@ public struct VoiceSettings: Sendable, Equatable {
public var ttsOpenAIVoice: String
public var ttsNeuTTSModel: String
public var ttsNeuTTSDevice: String
/// xAI TTS voice identifier. v0.13+; xAI shipped TTS earlier but the
/// custom-voice / cloning surface is the v0.13 add-on.
// TODO(WS-8-Q2): Confirm key name vs `tts.xai.voice` /
// `tts.xai.voice_id` / a top-level `tts.xai_voice` once a v0.13
// host is on hand. The setter / YAML reader follow whatever this
// field name implies.
public var ttsXAIVoiceID: String
/// xAI TTS model identifier. v0.13+. Mirrors the elevenlabs shape.
public var ttsXAIModel: String
// STT
public var sttEnabled: Bool
@@ -217,7 +236,9 @@ public struct VoiceSettings: Sendable, Equatable {
sttLocalModel: String,
sttLocalLanguage: String,
sttOpenAIModel: String,
sttMistralModel: String
sttMistralModel: String,
ttsXAIVoiceID: String = "",
ttsXAIModel: String = ""
) {
self.recordKey = recordKey
self.maxRecordingSeconds = maxRecordingSeconds
@@ -230,6 +251,8 @@ public struct VoiceSettings: Sendable, Equatable {
self.ttsOpenAIVoice = ttsOpenAIVoice
self.ttsNeuTTSModel = ttsNeuTTSModel
self.ttsNeuTTSDevice = ttsNeuTTSDevice
self.ttsXAIVoiceID = ttsXAIVoiceID
self.ttsXAIModel = ttsXAIModel
self.sttEnabled = sttEnabled
self.sttProvider = sttProvider
self.sttLocalModel = sttLocalModel
@@ -254,11 +277,22 @@ public struct VoiceSettings: Sendable, Equatable {
sttLocalModel: "base",
sttLocalLanguage: "",
sttOpenAIModel: "whisper-1",
sttMistralModel: "voxtral-mini-latest"
sttMistralModel: "voxtral-mini-latest",
ttsXAIVoiceID: "",
ttsXAIModel: ""
)
}
/// Eight sub-models that share the same provider/model/base_url/api_key/timeout shape.
/// Per-task auxiliary model overrides.
///
/// `flush_memories` was removed in Hermes v0.12 but remains alive on
/// pre-v0.12 hosts; the field is preserved here so the YAML parser
/// can round-trip it and `AuxiliaryTab` can render the row when
/// `HermesCapabilities.hasFlushMemoriesAux` is set. On v0.12+ the
/// field stays empty and is never surfaced.
/// `curator` was added in v0.12; Curator's review fork uses its own
/// model so users can keep main-model spend separate from background
/// maintenance.
public struct AuxiliarySettings: Sendable, Equatable {
public var vision: AuxiliaryModel
public var webExtract: AuxiliaryModel
@@ -267,7 +301,10 @@ public struct AuxiliarySettings: Sendable, Equatable {
public var skillsHub: AuxiliaryModel
public var approval: AuxiliaryModel
public var mcp: AuxiliaryModel
/// pre-v0.12 only; on v0.12+ this stays `.empty` and the row is hidden.
public var flushMemories: AuxiliaryModel
/// v0.12+; pre-v0.12 Hermes installs ignore this slot.
public var curator: AuxiliaryModel
public init(
@@ -278,7 +315,8 @@ public struct AuxiliarySettings: Sendable, Equatable {
skillsHub: AuxiliaryModel,
approval: AuxiliaryModel,
mcp: AuxiliaryModel,
flushMemories: AuxiliaryModel
flushMemories: AuxiliaryModel,
curator: AuxiliaryModel
) {
self.vision = vision
self.webExtract = webExtract
@@ -288,6 +326,7 @@ public struct AuxiliarySettings: Sendable, Equatable {
self.approval = approval
self.mcp = mcp
self.flushMemories = flushMemories
self.curator = curator
}
public nonisolated static let empty = AuxiliarySettings(
vision: .empty,
@@ -297,7 +336,8 @@ public struct AuxiliarySettings: Sendable, Equatable {
skillsHub: .empty,
approval: .empty,
mcp: .empty,
flushMemories: .empty
flushMemories: .empty,
curator: .empty
)
}
@@ -634,6 +674,66 @@ public struct HermesConfig: Sendable {
/// platform. Scarf reads for display; edits go through Hermes CLI.
public var platformToolsets: [String: [String]]
// -- Hermes v0.12 additions ----------------------------------------
// Defaults match the Hermes v0.12 defaults so that an absent key in
// config.yaml looks identical to a freshly-installed v0.12 host.
/// `prompt_caching.cache_ttl`: `"5m"` (default) or `"1h"`. Hermes
/// v0.12 added the 1-hour ceiling for users with prompt-cache-heavy
/// workloads (long agent loops with stable system prompts).
public var cacheTTL: String
/// `redaction.enabled`: flipped from `true` to `false` as the
/// upstream default in v0.12 because the substitution corrupted
/// patches and API payloads. Surface a toggle so users with hard
/// redaction requirements can opt back in.
public var redactionEnabled: Bool
/// `agent.runtime_metadata_footer`: opt-in compact footer on each
/// final reply (provider/model/cost/turn count). Off by default;
/// useful for cost auditing and screen-recording demos.
public var runtimeMetadataFooter: Bool
/// Pre-v0.13: single combined Web Tools backend at `web_tools.backend`.
/// v0.13 split this into per-capability keys (see below). Kept readable
/// for round-trip compatibility on hosts that never migrated; v0.13+
/// hosts ignore this scalar and read the split keys instead.
public var webToolsBackend: String
/// v0.13+: `web_tools.search.backend`. SearXNG is search-only and
/// can land here. Pre-v0.13 hosts default to the same value as the
/// combined backend.
public var webToolsSearchBackend: String
/// v0.13+: `web_tools.extract.backend`. Pre-v0.13 hosts default to
/// the same value as the combined backend.
public var webToolsExtractBackend: String
// -- Hermes v0.13 additions ----------------------------------------
// Per-platform Messaging Gateway settings: a dictionary keyed by Hermes
// platform identifier (`slack`, `telegram`, `matrix`, `mattermost`,
// `whatsapp`, `dingtalk`, `google-chat`). Populated only for platforms
// whose `gateway.platforms.<platform>.*` block exists in config.yaml;
// platforms without an explicit block don't appear in the dictionary.
// Editing surfaces (per-platform setup forms) read with a `?? .empty`
// fallback so a missing entry behaves identically to an all-default
// entry.
public var gatewayPlatforms: [String: GatewayPlatformSettings]
/// `image_gen.model` (v0.13+) overrides the per-provider default
/// image-gen model. Empty string means "let Hermes pick the
/// provider default". Hermes v0.12 advertised this key but ignored
/// it; Scarf's `AuxiliaryTab` only renders the picker when
/// `HermesCapabilities.hasImageGenModel` is `true`.
public var imageGenModel: String
/// `openrouter.response_cache.enabled` (v0.13+): when true, Hermes
/// asks OpenRouter to cache responses for repeat prompts within a
/// session. Off by default in Scarf's parser per WS-6 plan
/// recommendation. UI gated on
/// `HermesCapabilities.hasOpenRouterResponseCache`.
// TODO(WS-6-Q1): the exact YAML key shape is provisional. Verify
// against a v0.13 host's `hermes config check` output before
// shipping (see WS-6-plan §Open Questions #1). Candidate alternative
// shapes: `providers.openrouter.response_cache_enabled` or
// `prompt_caching.openrouter.enabled`.
public var openrouterResponseCacheEnabled: Bool
// Grouped blocks
public var display: DisplaySettings
public var terminal: TerminalSettings
@@ -711,8 +811,26 @@ public struct HermesConfig: Sendable {
matrix: MatrixSettings,
mattermost: MattermostSettings,
whatsapp: WhatsAppSettings,
homeAssistant: HomeAssistantSettings
homeAssistant: HomeAssistantSettings,
cacheTTL: String = "5m",
redactionEnabled: Bool = false,
runtimeMetadataFooter: Bool = false,
gatewayPlatforms: [String: GatewayPlatformSettings] = [:],
imageGenModel: String = "",
openrouterResponseCacheEnabled: Bool = false,
webToolsBackend: String = "duckduckgo",
webToolsSearchBackend: String = "duckduckgo",
webToolsExtractBackend: String = "reader"
) {
self.cacheTTL = cacheTTL
self.redactionEnabled = redactionEnabled
self.runtimeMetadataFooter = runtimeMetadataFooter
self.gatewayPlatforms = gatewayPlatforms
self.imageGenModel = imageGenModel
self.openrouterResponseCacheEnabled = openrouterResponseCacheEnabled
self.webToolsBackend = webToolsBackend
self.webToolsSearchBackend = webToolsSearchBackend
self.webToolsExtractBackend = webToolsExtractBackend
self.model = model
self.provider = provider
self.maxTurns = maxTurns
@@ -27,6 +27,39 @@ public enum QueryDefaults: Sendable {
public nonisolated static let defaultSilenceThreshold = 200
}
/// Page sizes for `HermesDataService.fetchMessages(sessionId:limit:before:)`.
/// Centralized so iOS, Mac, and the polling code paths can pick a
/// consistent budget, and so there is one knob to retune if performance
/// concerns shift.
public enum HistoryPageSize: Sendable {
/// Initial chat-history load. **Sized to fit the SSH wire payload
/// inside a 30-second `RemoteSQLiteBackend.queryTimeout`.** A
/// 157-message session at 200-row page size produced enough
/// JSON (with `reasoning_content` for thinking models) to time
/// out at exactly 30 s on a 420 ms-RTT remote. Dropped to 50,
/// then to 25 in v2.7 after a 160-message session still timed
/// out at 50: `reasoning_content` for thinking-model turns can
/// run 20+ KB per row, and at 30 KB per row, 50 rows × 30 KB =
/// 1.5 MB of JSON, which over a slow SSH channel still trips the
/// 30 s budget. Pair with `messageColumnsLight` (excludes
/// `reasoning_content`) so the on-wire payload is small even at
/// this size; the inspector pane lazy-loads via
/// `fetchReasoningContent(for:)` when the user expands a
/// disclosure. The "Load earlier" affordance pages back through
/// older messages on demand.
public nonisolated static let initial = 25
/// Reconnection reconcile against the DB. 200 rows is plenty;
/// disconnects don't generate hundreds of unseen messages.
public nonisolated static let reconcile = 200
/// Mac sessions detail view. Larger to reduce paging UX in the
/// desktop browser-style read; the desktop has the screen real
/// estate and memory headroom for it.
public nonisolated static let macSessionDetail = 500
/// Terminal-mode polling refresh. Same 500-row budget as Mac
/// detail; covers sessions long enough that the user is actively
/// scrolling but bounded to keep each poll tick cheap.
public nonisolated static let polling = 500
}
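The sizing argument in the `initial` doc comment is simple arithmetic, sketched below for reference. `estimatedPayloadBytes(rows:bytesPerRow:)` is a hypothetical helper, kilobytes are treated as decimal, and the 2 KB "light" row size is an assumed figure for illustration, not a measurement from the source.

```swift
/// Illustrative estimate only: decimal kilobytes, and the per-row sizes
/// are assumptions rather than measurements.
func estimatedPayloadBytes(rows: Int, bytesPerRow: Int) -> Int {
    rows * bytesPerRow
}

// 50 rows at 30 KB each, the case the doc comment describes.
let heavy = estimatedPayloadBytes(rows: 50, bytesPerRow: 30_000)

// 25 rows at an assumed 2 KB each once `reasoning_content` is excluded.
let light = estimatedPayloadBytes(rows: 25, bytesPerRow: 2_000)

print(heavy, light)
```

Under these assumptions the light page is a small fraction of the heavy one, which is the motivation for pairing the smaller page size with the lighter column set.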
// MARK: - File Size Formatting
public enum FileSizeUnit: Sendable {
@@ -19,6 +19,21 @@ public struct HermesCronJob: Identifiable, Sendable, Codable {
public nonisolated let timeoutType: String?
public nonisolated let timeoutSeconds: Int?
public nonisolated let silent: Bool?
/// Hermes v0.12+: the directory the job runs from. Hermes injects
/// AGENTS.md / CLAUDE.md / .cursorrules from this dir and uses it
/// as cwd for terminal/file/code_exec tools. `nil` preserves the
/// pre-v0.12 behaviour (no project context files).
public nonisolated let workdir: String?
/// Hermes v0.12+: chain another cron job's last output into this
/// job's prompt. YAML-only field today (no `--context-from` CLI
/// flag yet); Scarf displays it but doesn't write it.
public nonisolated let contextFrom: [String]?
/// Hermes v0.13+: script-only watchdog mode. When `true` the
/// pre-run script runs but the AI turn is skipped. `nil` means the
/// jobs.json file is pre-v0.13 (treat as `false`); `false` is the
/// explicit v0.13+ default. Capability-gated on `hasCronNoAgent`
/// at all write call sites.
public nonisolated let noAgent: Bool?
public enum CodingKeys: String, CodingKey {
case id, name, prompt, skills, model, schedule, enabled, state, deliver, silent
@@ -30,6 +45,9 @@ public struct HermesCronJob: Identifiable, Sendable, Codable {
case lastDeliveryError = "last_delivery_error"
case timeoutType = "timeout_type"
case timeoutSeconds = "timeout_seconds"
case workdir
case contextFrom = "context_from"
case noAgent = "no_agent"
}
/// Memberwise init. Swift doesn't synthesize one for us because
@@ -53,7 +71,10 @@ public struct HermesCronJob: Identifiable, Sendable, Codable {
lastDeliveryError: String? = nil,
timeoutType: String? = nil,
timeoutSeconds: Int? = nil,
silent: Bool? = nil
silent: Bool? = nil,
workdir: String? = nil,
contextFrom: [String]? = nil,
noAgent: Bool? = nil
) {
self.id = id
self.name = name
@@ -73,6 +94,9 @@ public struct HermesCronJob: Identifiable, Sendable, Codable {
self.timeoutType = timeoutType
self.timeoutSeconds = timeoutSeconds
self.silent = silent
self.workdir = workdir
self.contextFrom = contextFrom
self.noAgent = noAgent
}
public nonisolated init(from decoder: any Decoder) throws {
@@ -95,6 +119,9 @@ public struct HermesCronJob: Identifiable, Sendable, Codable {
self.timeoutType = try c.decodeIfPresent(String.self, forKey: .timeoutType)
self.timeoutSeconds = try c.decodeIfPresent(Int.self, forKey: .timeoutSeconds)
self.silent = try c.decodeIfPresent(Bool.self, forKey: .silent)
self.workdir = try c.decodeIfPresent(String.self, forKey: .workdir)
self.contextFrom = try c.decodeIfPresent([String].self, forKey: .contextFrom)
self.noAgent = try c.decodeIfPresent(Bool.self, forKey: .noAgent)
}
public nonisolated func encode(to encoder: any Encoder) throws {
@@ -117,6 +144,9 @@ public struct HermesCronJob: Identifiable, Sendable, Codable {
try c.encodeIfPresent(timeoutType, forKey: .timeoutType)
try c.encodeIfPresent(timeoutSeconds, forKey: .timeoutSeconds)
try c.encodeIfPresent(silent, forKey: .silent)
try c.encodeIfPresent(workdir, forKey: .workdir)
try c.encodeIfPresent(contextFrom, forKey: .contextFrom)
try c.encodeIfPresent(noAgent, forKey: .noAgent)
}
public nonisolated var stateIcon: String {
@@ -0,0 +1,124 @@
import Foundation
/// One entry in the `hermes curator list-archived` output. Decoded
/// tolerantly via `decodeIfPresent` so a stripped-down host (or a future
/// Hermes that drops one of the optional columns) doesn't crash the view.
///
/// Only `name` is required; every other field is optional and the
/// computed `*Label` accessors render `""` for missing values.
public struct HermesCuratorArchivedSkill: Sendable, Equatable, Identifiable, Codable {
public var id: String { name }
public let name: String
public let category: String?
public let archivedAt: String?
public let reason: String?
public let sizeBytes: Int?
public let path: String?
public init(
name: String,
category: String? = nil,
archivedAt: String? = nil,
reason: String? = nil,
sizeBytes: Int? = nil,
path: String? = nil
) {
self.name = name
self.category = category
self.archivedAt = archivedAt
self.reason = reason
self.sizeBytes = sizeBytes
self.path = path
}
private enum CodingKeys: String, CodingKey {
case name
case category
case archivedAt = "archived_at"
case reason
case sizeBytes = "size_bytes"
case path
}
public init(from decoder: Decoder) throws {
let c = try decoder.container(keyedBy: CodingKeys.self)
self.name = try c.decode(String.self, forKey: .name)
self.category = try c.decodeIfPresent(String.self, forKey: .category)
self.archivedAt = try c.decodeIfPresent(String.self, forKey: .archivedAt)
self.reason = try c.decodeIfPresent(String.self, forKey: .reason)
self.sizeBytes = try c.decodeIfPresent(Int.self, forKey: .sizeBytes)
self.path = try c.decodeIfPresent(String.self, forKey: .path)
}
public func encode(to encoder: Encoder) throws {
var c = encoder.container(keyedBy: CodingKeys.self)
try c.encode(name, forKey: .name)
try c.encodeIfPresent(category, forKey: .category)
try c.encodeIfPresent(archivedAt, forKey: .archivedAt)
try c.encodeIfPresent(reason, forKey: .reason)
try c.encodeIfPresent(sizeBytes, forKey: .sizeBytes)
try c.encodeIfPresent(path, forKey: .path)
}
/// "4.4 KB" / "1.2 MB" / "" for nil. Uses the SI byte formatter so
/// the labels match what Finder shows.
public var sizeLabel: String {
guard let bytes = sizeBytes else { return "" }
let formatter = ByteCountFormatter()
formatter.allowedUnits = [.useAll]
formatter.countStyle = .file
return formatter.string(fromByteCount: Int64(bytes))
}
/// `2026-04-22` (ISO date prefix) / "". Hermes returns full ISO
/// timestamps with seconds + Z; the date prefix is what the user
/// actually wants in the archived list.
public var archivedAtLabel: String {
guard let iso = archivedAt, !iso.isEmpty else { return "" }
// Trim to date prefix if it looks like a full ISO timestamp.
if let tIdx = iso.firstIndex(of: "T") {
return String(iso[..<tIdx])
}
return iso
}
}
/// Result of `hermes curator prune --dry-run`: what would be removed
/// if the user confirms. The view derives `totalCount` from
/// `wouldRemove.count` so the wire shape stays flat.
public struct CuratorPruneSummary: Sendable, Equatable, Codable {
public let wouldRemove: [HermesCuratorArchivedSkill]
public let totalBytes: Int
public var totalCount: Int { wouldRemove.count }
public init(wouldRemove: [HermesCuratorArchivedSkill], totalBytes: Int) {
self.wouldRemove = wouldRemove
self.totalBytes = totalBytes
}
private enum CodingKeys: String, CodingKey {
case wouldRemove = "would_remove"
case totalBytes = "total_bytes"
}
public init(from decoder: Decoder) throws {
let c = try decoder.container(keyedBy: CodingKeys.self)
self.wouldRemove = try c.decodeIfPresent([HermesCuratorArchivedSkill].self, forKey: .wouldRemove) ?? []
self.totalBytes = try c.decodeIfPresent(Int.self, forKey: .totalBytes) ?? 0
}
public func encode(to encoder: Encoder) throws {
var c = encoder.container(keyedBy: CodingKeys.self)
try c.encode(wouldRemove, forKey: .wouldRemove)
try c.encode(totalBytes, forKey: .totalBytes)
}
/// "12.3 KB" / "" for empty. Convenience for the confirm sheet header.
public var totalBytesLabel: String {
guard totalBytes > 0 else { return "" }
let formatter = ByteCountFormatter()
formatter.allowedUnits = [.useAll]
formatter.countStyle = .file
return formatter.string(fromByteCount: Int64(totalBytes))
}
}
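The tolerant decoding used by both types (`decodeIfPresent` with a fallback) can be exercised with a short standalone sketch. `ArchivedSkill` and `PruneSummary` are trimmed stand-ins for the types in this file, and the JSON string is a made-up payload for illustration, not captured Hermes output.

```swift
import Foundation

/// Trimmed stand-in for `HermesCuratorArchivedSkill`. The synthesized
/// decoder already uses `decodeIfPresent` for optional properties.
struct ArchivedSkill: Decodable, Equatable {
    let name: String
    let sizeBytes: Int?

    enum CodingKeys: String, CodingKey {
        case name
        case sizeBytes = "size_bytes"
    }
}

/// Trimmed stand-in for `CuratorPruneSummary`: missing keys fall back
/// to an empty list and zero bytes instead of throwing.
struct PruneSummary: Decodable {
    let wouldRemove: [ArchivedSkill]
    let totalBytes: Int
    var totalCount: Int { wouldRemove.count }

    enum CodingKeys: String, CodingKey {
        case wouldRemove = "would_remove"
        case totalBytes = "total_bytes"
    }

    init(from decoder: Decoder) throws {
        let c = try decoder.container(keyedBy: CodingKeys.self)
        wouldRemove = try c.decodeIfPresent([ArchivedSkill].self, forKey: .wouldRemove) ?? []
        totalBytes = try c.decodeIfPresent(Int.self, forKey: .totalBytes) ?? 0
    }
}

// Made-up payload that omits `total_bytes` on purpose.
let json = #"{"would_remove":[{"name":"old-skill","size_bytes":4400}]}"#
let summary = try JSONDecoder().decode(PruneSummary.self, from: Data(json.utf8))
print(summary.totalCount, summary.totalBytes)
```

Only a missing `name` would make decoding throw here, which mirrors the "only `name` is required" rule stated for the archived-skill type.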
@@ -0,0 +1,361 @@
import Foundation
/// Parsed view of `hermes curator status` text + the on-disk
/// `~/.hermes/skills/.curator_state` JSON.
///
/// Hermes v0.12 doesn't ship a `--json` flag for `curator status`; the
/// CLI writes a human-readable report. CuratorViewModel parses the text
/// output for the human-readable bits ("least recently active", "most
/// active") and reads the state file directly for last-run metadata.
public struct HermesCuratorStatus: Sendable, Equatable {
public enum RunState: String, Sendable, Equatable {
case enabled
case paused
case disabled
case unknown
}
public let state: RunState
public let runCount: Int
public let lastRunISO: String? // raw timestamp string, parsed by callers
public let lastSummary: String? // free-text summary line
public let lastReportPath: String? // absolute path to <YYYYMMDD-HHMMSS>/ dir
public let intervalLabel: String // e.g. "every 7d"
public let staleAfterLabel: String // e.g. "30d unused"
public let archiveAfterLabel: String // e.g. "90d unused"
public let totalSkills: Int
public let activeSkills: Int
public let staleSkills: Int
public let archivedSkills: Int
public let pinnedNames: [String]
/// Top-5 lists rendered in the curator output. Each row carries the
/// skill name + the four counters Hermes prints.
public let leastRecentlyActive: [HermesCuratorSkillRow]
public let mostActive: [HermesCuratorSkillRow]
public let leastActive: [HermesCuratorSkillRow]
public init(
state: RunState,
runCount: Int,
lastRunISO: String?,
lastSummary: String?,
lastReportPath: String?,
intervalLabel: String,
staleAfterLabel: String,
archiveAfterLabel: String,
totalSkills: Int,
activeSkills: Int,
staleSkills: Int,
archivedSkills: Int,
pinnedNames: [String],
leastRecentlyActive: [HermesCuratorSkillRow],
mostActive: [HermesCuratorSkillRow],
leastActive: [HermesCuratorSkillRow]
) {
self.state = state
self.runCount = runCount
self.lastRunISO = lastRunISO
self.lastSummary = lastSummary
self.lastReportPath = lastReportPath
self.intervalLabel = intervalLabel
self.staleAfterLabel = staleAfterLabel
self.archiveAfterLabel = archiveAfterLabel
self.totalSkills = totalSkills
self.activeSkills = activeSkills
self.staleSkills = staleSkills
self.archivedSkills = archivedSkills
self.pinnedNames = pinnedNames
self.leastRecentlyActive = leastRecentlyActive
self.mostActive = mostActive
self.leastActive = leastActive
}
public static let empty = HermesCuratorStatus(
state: .unknown,
runCount: 0,
lastRunISO: nil,
lastSummary: nil,
lastReportPath: nil,
intervalLabel: "",
staleAfterLabel: "",
archiveAfterLabel: "",
totalSkills: 0,
activeSkills: 0,
staleSkills: 0,
archivedSkills: 0,
pinnedNames: [],
leastRecentlyActive: [],
mostActive: [],
leastActive: []
)
}
public struct HermesCuratorSkillRow: Sendable, Equatable, Identifiable {
public var id: String { name }
public let name: String
public let activityCount: Int
public let useCount: Int
public let viewCount: Int
public let patchCount: Int
public let lastActivityLabel: String // raw label as printed (e.g. "never", "2d ago")
public init(
name: String,
activityCount: Int,
useCount: Int,
viewCount: Int,
patchCount: Int,
lastActivityLabel: String
) {
self.name = name
self.activityCount = activityCount
self.useCount = useCount
self.viewCount = viewCount
self.patchCount = patchCount
self.lastActivityLabel = lastActivityLabel
}
}
/// Pure parser for `hermes curator status` stdout. Public for tests.
///
/// Format is stable enough to text-parse. We never error on missing
/// sections; we just leave the corresponding field empty so
/// CuratorView can render "" without crashing on a future layout
/// tweak. State file overrides text-parsed values when both are present.
public enum HermesCuratorStatusParser {
public static func parse(text: String, stateFileJSON: Data? = nil) -> HermesCuratorStatus {
let lines = text.components(separatedBy: "\n")
var status = HermesCuratorStatus.empty
// Header section: `curator: ENABLED` / `runs:` / `last run:` /
// `last summary:` / `interval:` / `stale after:` / `archive after:`
var state = HermesCuratorStatus.RunState.unknown
var runCount = 0
var lastRunISO: String?
var lastSummary: String?
var lastReportPath: String?
var interval = ""
var stale = ""
var archive = ""
// Skill counts: `agent-created skills: N total` then
// ` active N` / ` stale N` / ` archived N`
var total = 0
var active = 0
var staleCount = 0
var archived = 0
var pinned: [String] = []
// Lists: `least recently active (top 5):` / `most active (top 5):` /
// `least active (top 5):` followed by indented row lines.
enum Section {
case header
case leastRecent
case mostActive
case leastActive
}
var section = Section.header
var leastRecent: [HermesCuratorSkillRow] = []
var mostActiveRows: [HermesCuratorSkillRow] = []
var leastActiveRows: [HermesCuratorSkillRow] = []
for raw in lines {
let line = raw.trimmingCharacters(in: .whitespaces)
// Section markers
if line.hasPrefix("least recently active") {
section = .leastRecent
continue
}
if line.hasPrefix("most active") {
section = .mostActive
continue
}
if line.hasPrefix("least active") {
section = .leastActive
continue
}
// Header section single-line keys
if line.hasPrefix("curator:") {
let val = String(line.dropFirst("curator:".count)).trimmingCharacters(in: .whitespaces).uppercased()
switch val {
case "ENABLED": state = .enabled
case "PAUSED": state = .paused
case "DISABLED": state = .disabled
default: state = .unknown
}
continue
}
if line.hasPrefix("runs:") {
runCount = Int(line.dropFirst("runs:".count).trimmingCharacters(in: .whitespaces)) ?? 0
continue
}
if line.hasPrefix("last run:") {
let val = String(line.dropFirst("last run:".count)).trimmingCharacters(in: .whitespaces)
lastRunISO = val == "never" ? nil : val
continue
}
if line.hasPrefix("last summary:") {
let val = String(line.dropFirst("last summary:".count)).trimmingCharacters(in: .whitespaces)
lastSummary = (val == "(none)" || val.isEmpty) ? nil : val
continue
}
if line.hasPrefix("last report:") {
let val = String(line.dropFirst("last report:".count)).trimmingCharacters(in: .whitespaces)
lastReportPath = val.isEmpty ? nil : val
continue
}
if line.hasPrefix("interval:") {
interval = String(line.dropFirst("interval:".count)).trimmingCharacters(in: .whitespaces)
continue
}
if line.hasPrefix("stale after:") {
stale = String(line.dropFirst("stale after:".count)).trimmingCharacters(in: .whitespaces)
continue
}
if line.hasPrefix("archive after:") {
archive = String(line.dropFirst("archive after:".count)).trimmingCharacters(in: .whitespaces)
continue
}
// `agent-created skills: 18 total`
if line.hasPrefix("agent-created skills:") {
let after = line.dropFirst("agent-created skills:".count).trimmingCharacters(in: .whitespaces)
if let n = Int(after.split(separator: " ").first ?? "") {
total = n
}
section = .header
continue
}
// Counts: "active 18" / "stale 0" / "archived 0"
if let row = parseStateCountRow(line) {
switch row.state {
case "active": active = row.count
case "stale": staleCount = row.count
case "archived": archived = row.count
default: break
}
continue
}
// pinned (3): foo, bar, baz
if line.hasPrefix("pinned (") {
if let colon = line.firstIndex(of: ":") {
let names = line[line.index(after: colon)...]
.split(separator: ",")
.map { $0.trimmingCharacters(in: .whitespaces) }
.filter { !$0.isEmpty }
pinned = names
}
continue
}
// Skill rows like:
// <name> activity= N use= N view= N patches= N last_activity=<label>
if section != .header, let parsed = parseSkillRow(line) {
switch section {
case .leastRecent: leastRecent.append(parsed)
case .mostActive: mostActiveRows.append(parsed)
case .leastActive: leastActiveRows.append(parsed)
case .header: break
}
}
}
// Apply state-file overrides if present. The .curator_state JSON
// is authoritative for last_run_at / last_run_summary /
// last_report_path because those carry timestamps the text
// output rounds.
if let json = stateFileJSON,
let obj = try? JSONSerialization.jsonObject(with: json) as? [String: Any] {
if obj["paused"] as? Bool == true { state = .paused }
if let count = obj["run_count"] as? Int { runCount = count }
if let lr = obj["last_run_at"] as? String { lastRunISO = lr }
if let summary = obj["last_run_summary"] as? String, !summary.isEmpty { lastSummary = summary }
if let path = obj["last_report_path"] as? String, !path.isEmpty { lastReportPath = path }
}
status = HermesCuratorStatus(
state: state,
runCount: runCount,
lastRunISO: lastRunISO,
lastSummary: lastSummary,
lastReportPath: lastReportPath,
intervalLabel: interval,
staleAfterLabel: stale,
archiveAfterLabel: archive,
totalSkills: total,
activeSkills: active,
staleSkills: staleCount,
archivedSkills: archived,
pinnedNames: pinned,
leastRecentlyActive: leastRecent,
mostActive: mostActiveRows,
leastActive: leastActiveRows
)
return status
}
/// `active 18` style row inside the skill-count block.
private static func parseStateCountRow(_ line: String) -> (state: String, count: Int)? {
let parts = line.split(whereSeparator: { $0 == " " || $0 == "\t" }).map(String.init)
guard parts.count >= 2,
["active", "stale", "archived"].contains(parts[0]),
let count = Int(parts[1])
else { return nil }
return (parts[0], count)
}
/// Skill-list row parser. Tolerates Hermes's whitespace-padded
/// layout: `activity=  0` has two spaces between `=` and the
/// number, so we can't split-on-space-then-split-on-`=`. Instead
/// we slide a key-detection cursor across the row and grab the
/// next non-whitespace token after each known key.
private static func parseSkillRow(_ line: String) -> HermesCuratorSkillRow? {
guard let activityRange = line.range(of: "activity=") else { return nil }
let name = String(line[..<activityRange.lowerBound]).trimmingCharacters(in: .whitespaces)
guard !name.isEmpty else { return nil }
// Map each known key to its value substring. Read positionally
// by slicing between consecutive known keys; this handles arbitrary
// whitespace padding without depending on column positions.
let knownKeys = ["activity=", "use=", "view=", "patches=", "last_activity="]
var positions: [(key: String, range: Range<String.Index>)] = []
for key in knownKeys {
if let r = line.range(of: key) {
positions.append((key, r))
}
}
positions.sort { $0.range.lowerBound < $1.range.lowerBound }
var activity = 0, use = 0, view = 0, patch = 0
var lastActivity = ""
for (idx, entry) in positions.enumerated() {
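let valueStart = entry.range.upperBound
let valueEnd = idx + 1 < positions.count
? positions[idx + 1].range.lowerBound
: line.endIndex
// "activity=" also matches inside "last_activity=". If the plain
// key is missing, the two hits overlap; skip the inverted slice
// instead of trapping on an invalid range.
guard valueStart <= valueEnd else { continue }
let raw = String(line[valueStart..<valueEnd]).trimmingCharacters(in: .whitespaces)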
switch entry.key {
case "activity=": activity = Int(raw) ?? 0
case "use=": use = Int(raw) ?? 0
case "view=": view = Int(raw) ?? 0
case "patches=": patch = Int(raw) ?? 0
case "last_activity=": lastActivity = raw
default: break
}
}
return HermesCuratorSkillRow(
name: name,
activityCount: activity,
useCount: use,
viewCount: view,
patchCount: patch,
lastActivityLabel: lastActivity
)
}
}
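The key-slicing approach used by `parseSkillRow` can be sketched on its own. This is an illustrative helper under stated assumptions, not the parser itself: `sliceValues` and the sample rows are hypothetical, and the row layout is taken from the comment in the parser (`<name> activity= N use= N view= N patches= N last_activity=<label>`).

```swift
import Foundation

/// Finds each known key, sorts the hits by position, and takes the text
/// between one key and the next as that key's value.
func sliceValues(in line: String, keys: [String]) -> [String: String] {
    var hits: [(key: String, range: Range<String.Index>)] = []
    for key in keys {
        if let r = line.range(of: key) {
            hits.append((key: key, range: r))
        }
    }
    hits.sort { $0.range.lowerBound < $1.range.lowerBound }

    var values: [String: String] = [:]
    for (i, hit) in hits.enumerated() {
        let start = hit.range.upperBound
        let end = i + 1 < hits.count ? hits[i + 1].range.lowerBound : line.endIndex
        // Overlapping hits ("activity=" inside "last_activity=") would give
        // an inverted range; skip them.
        guard start <= end else { continue }
        values[hit.key] = String(line[start..<end]).trimmingCharacters(in: .whitespaces)
    }
    return values
}

let rowKeys = ["activity=", "use=", "view=", "patches=", "last_activity="]
// Sample rows are made up for illustration.
let padded = sliceValues(
    in: "my-skill  activity=  3  use=  1  view=  2  patches=  0  last_activity=2d ago",
    keys: rowKeys
)
let overlapping = sliceValues(in: "foo last_activity=never", keys: rowKeys)
```

Because values are delimited by the next key rather than by whitespace, a label such as `2d ago` survives intact even though it contains a space.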
@@ -0,0 +1,32 @@
import Foundation
/// One row from `hermes kanban assignees --json`. The output is the
/// union of profiles configured on the host (`~/.hermes/profiles/`)
/// and any names appearing in the live board's `assignee` column, which
/// covers the case where a profile was renamed but historical tasks
/// still reference the old name.
public struct HermesKanbanAssignee: Sendable, Equatable, Identifiable, Codable {
public var id: String { profile }
public let profile: String
public let activeCount: Int
public let totalCount: Int
public init(profile: String, activeCount: Int = 0, totalCount: Int = 0) {
self.profile = profile
self.activeCount = activeCount
self.totalCount = totalCount
}
enum CodingKeys: String, CodingKey {
case profile
case activeCount = "active"
case totalCount = "total"
}
public init(from decoder: any Decoder) throws {
let c = try decoder.container(keyedBy: CodingKeys.self)
self.profile = try c.decode(String.self, forKey: .profile)
self.activeCount = try c.decodeIfPresent(Int.self, forKey: .activeCount) ?? 0
self.totalCount = try c.decodeIfPresent(Int.self, forKey: .totalCount) ?? 0
}
}
@@ -0,0 +1,51 @@
import Foundation
/// One comment from `hermes kanban show <id> --json` or appended via
/// `hermes kanban comment <id> <text>`. Comments are append-only; there's
/// no edit/delete verb.
public struct HermesKanbanComment: Sendable, Equatable, Identifiable, Codable {
public let id: Int
public let taskId: String
public let author: String
public let body: String
public let createdAt: String
public init(
id: Int,
taskId: String,
author: String,
body: String,
createdAt: String
) {
self.id = id
self.taskId = taskId
self.author = author
self.body = body
self.createdAt = createdAt
}
enum CodingKeys: String, CodingKey {
case id
case taskId = "task_id"
case author
case body
case createdAt = "created_at"
}
public init(from decoder: any Decoder) throws {
let c = try decoder.container(keyedBy: CodingKeys.self)
self.id = try c.decode(Int.self, forKey: .id)
self.taskId = try c.decodeIfPresent(String.self, forKey: .taskId) ?? ""
self.author = try c.decodeIfPresent(String.self, forKey: .author) ?? ""
self.body = try c.decodeIfPresent(String.self, forKey: .body) ?? ""
// Hermes emits Unix integer timestamps from its SQLite columns;
// accept both ints and ISO strings.
if let unix = try? c.decodeIfPresent(Double.self, forKey: .createdAt) {
let f = ISO8601DateFormatter()
f.formatOptions = [.withInternetDateTime]
self.createdAt = f.string(from: Date(timeIntervalSince1970: unix))
} else {
self.createdAt = (try? c.decodeIfPresent(String.self, forKey: .createdAt)) ?? ""
}
}
}
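The int-or-string timestamp tolerance shared by the comment, event, run, and diagnostic decoders can be tried with a stripped-down model. This is a sketch only: `StampedSketch`, `decodeStamp`, and the payloads are hypothetical and carry just the one field.

```swift
import Foundation

// Minimal model with only the flexible timestamp field.
struct StampedSketch: Decodable {
    let createdAt: String

    enum CodingKeys: String, CodingKey {
        case createdAt = "created_at"
    }

    init(from decoder: Decoder) throws {
        let c = try decoder.container(keyedBy: CodingKeys.self)
        // A numeric value is treated as Unix seconds and normalized to ISO-8601.
        if let unix = try? c.decodeIfPresent(Double.self, forKey: .createdAt) {
            let f = ISO8601DateFormatter()
            f.formatOptions = [.withInternetDateTime]
            createdAt = f.string(from: Date(timeIntervalSince1970: unix))
        } else {
            // A string is passed through untouched; a missing key becomes "".
            createdAt = (try? c.decodeIfPresent(String.self, forKey: .createdAt)) ?? ""
        }
    }
}

func decodeStamp(_ json: String) -> String? {
    (try? JSONDecoder().decode(StampedSketch.self, from: Data(json.utf8)))?.createdAt
}

// Assumed payloads, for illustration only.
let fromUnix = decodeStamp(#"{"created_at": 0}"#)
let fromISO = decodeStamp(#"{"created_at": "2026-05-07T12:00:00Z"}"#)
let missing = decodeStamp("{}")
```

Consumers then see a single `String` type regardless of which shape the host emitted.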
@@ -0,0 +1,158 @@
import Foundation
/// A structured signal Hermes emits when it observes worker / task
/// distress. Hermes v0.13 introduced a generic diagnostics engine that
/// attaches these to a task (cross-run signals) and/or a run (per-attempt
/// signals). Pre-v0.13 hosts never emit diagnostics so the array decodes
/// empty and downstream UI no-ops.
///
/// **Wire shape (best inference from release notes; verify against live
/// JSON during integration):** an array of objects with `kind`, optional
/// `message`, optional `detected_at` (ISO-8601 string OR Unix integer,
/// matching the rest of `HermesKanbanTask`'s timestamp tolerance).
///
/// **Forward compat:** `kind` stays a `String` so a future Hermes can
/// add new diagnostic kinds without a Scarf release. `KanbanDiagnosticKind`
/// is the typed mirror; it falls back to `.unknown` for unrecognized
/// kinds and renders the raw string verbatim.
public struct HermesKanbanDiagnostic: Sendable, Equatable, Identifiable, Codable {
/// Synthetic id, not on the wire. Lets SwiftUI `ForEach` over a
/// diagnostic array without forcing a deterministic id from the
/// server (Hermes doesn't currently mint one).
public let id: UUID
/// Wire-side `kind` string. Compared case-insensitively via
/// `KanbanDiagnosticKind.from(_:)`.
public let kind: String
/// Human-friendly elaboration ("no heartbeat for 4m20s", "exit code
/// 0 with no complete call", etc.). May be nil; render the raw
/// `kind` then.
public let message: String?
/// ISO-8601 string. Decoder accepts Unix integer seconds (Hermes's
/// SQLite-backed shape) and converts to ISO-8601 so consumers see
/// one type; same pattern as `HermesKanbanTask.decodeFlexibleTimestamp`.
public let detectedAt: String?
public init(
kind: String,
message: String? = nil,
detectedAt: String? = nil
) {
self.id = UUID()
self.kind = kind
self.message = message
self.detectedAt = detectedAt
}
enum CodingKeys: String, CodingKey {
case kind
case message
case detectedAt = "detected_at"
}
public init(from decoder: any Decoder) throws {
let c = try decoder.container(keyedBy: CodingKeys.self)
self.id = UUID()
self.kind = try c.decodeIfPresent(String.self, forKey: .kind) ?? "unknown"
self.message = try c.decodeIfPresent(String.self, forKey: .message)
// Flexible timestamp decode mirrors HermesKanbanTask's pattern.
if !c.contains(.detectedAt) {
self.detectedAt = nil
} else if let unix = try? c.decodeIfPresent(Double.self, forKey: .detectedAt) {
let date = Date(timeIntervalSince1970: unix)
self.detectedAt = Self.isoFormatter.string(from: date)
} else {
self.detectedAt = try c.decodeIfPresent(String.self, forKey: .detectedAt)
}
}
public func encode(to encoder: any Encoder) throws {
var c = encoder.container(keyedBy: CodingKeys.self)
try c.encode(kind, forKey: .kind)
try c.encodeIfPresent(message, forKey: .message)
try c.encodeIfPresent(detectedAt, forKey: .detectedAt)
}
public static func == (lhs: HermesKanbanDiagnostic, rhs: HermesKanbanDiagnostic) -> Bool {
// Compare on wire fields, not the synthetic id; round-trip decoding
// mints fresh ids.
lhs.kind == rhs.kind
&& lhs.message == rhs.message
&& lhs.detectedAt == rhs.detectedAt
}
private static let isoFormatter: ISO8601DateFormatter = {
let f = ISO8601DateFormatter()
f.formatOptions = [.withInternetDateTime]
return f
}()
}
// MARK: - Typed mirror
/// Typed view of `HermesKanbanDiagnostic.kind`. Models keep the raw
/// string for forward compatibility; UI helpers read this enum to pick
/// the right glyph + tint without string-matching at every callsite.
///
/// `unknown` is the fallback for any kind a future Hermes adds that
/// Scarf doesn't recognize. Views render the raw string verbatim in
/// that case so the user still sees what Hermes flagged.
// TODO(WS-3-Q5): The exact `kind` string for darwin-zombie detection is
// inferred from the v0.13 release notes ("Detect darwin zombie workers");
// confirm against live `hermes kanban show --json` output during
// integration. Same for `worker_exit_no_complete` and the heartbeat-stalled
// kinds: the typed mirror falls through to `.unknown` if the wire string
// drifts, and the raw string is still rendered.
public enum KanbanDiagnosticKind: String, Sendable, CaseIterable {
case heartbeatStalled = "heartbeat_stalled"
case toolErrorLoop = "tool_error_loop"
case retryCapHit = "retry_cap_hit"
case unboundedRetry = "unbounded_retry"
case darwinZombieDetected = "darwin_zombie_detected"
case spawnFailure = "spawn_failure"
case workerExitNoComplete = "worker_exit_no_complete"
case unknown
/// Map a wire string (case-insensitive) to a typed kind. Unknown
/// values fall through to `.unknown` so callers can still surface
/// the raw string.
public static func from(_ raw: String) -> KanbanDiagnosticKind {
KanbanDiagnosticKind(rawValue: raw.lowercased()) ?? .unknown
}
/// SF Symbol name to render alongside the diagnostic. View code
/// reaches through the typed enum so glyph choices live in one
/// place.
public var glyphName: String {
switch self {
case .heartbeatStalled: return "waveform.path.badge.minus"
case .toolErrorLoop: return "arrow.triangle.2.circlepath.exclamationmark"
case .retryCapHit: return "nosign"
case .unboundedRetry: return "arrow.clockwise.circle.fill"
case .darwinZombieDetected: return "apple.logo"
case .spawnFailure: return "bolt.slash"
case .workerExitNoComplete: return "figure.walk.departure"
case .unknown: return "stethoscope"
}
}
/// Severity tier for this kind; drives badge tint. `.danger` for
/// terminal-class signals (retry cap hit, zombie, spawn failure);
/// `.warning` for recoverable signals (heartbeat stalled, tool
/// error loop, unbounded retry, worker exit without complete);
/// `.neutral` only for unknown / forward-compat kinds.
public var severity: DiagnosticSeverity {
switch self {
case .retryCapHit, .darwinZombieDetected, .spawnFailure:
return .danger
case .heartbeatStalled, .toolErrorLoop, .unboundedRetry, .workerExitNoComplete:
return .warning
case .unknown:
return .neutral
}
}
public enum DiagnosticSeverity: Sendable {
case warning
case danger
case neutral
}
}
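The raw-string-plus-typed-mirror arrangement can be illustrated with a cut-down enum. This is a sketch under assumptions: `KindSketch`, `label(for:)`, and the display labels are hypothetical, and only two of the real cases are reproduced.

```swift
// Cut-down mirror of KanbanDiagnosticKind with two known cases.
enum KindSketch: String {
    case heartbeatStalled = "heartbeat_stalled"
    case retryCapHit = "retry_cap_hit"
    case unknown

    // Case-insensitive lookup; anything unrecognized becomes .unknown.
    static func from(_ raw: String) -> KindSketch {
        KindSketch(rawValue: raw.lowercased()) ?? .unknown
    }
}

// Hypothetical display helper: known kinds get a friendly label,
// unknown kinds render the wire string verbatim.
func label(for raw: String) -> String {
    switch KindSketch.from(raw) {
    case .heartbeatStalled: return "Heartbeat stalled"
    case .retryCapHit: return "Retry cap hit"
    case .unknown: return raw
    }
}

let knownLabel = label(for: "HEARTBEAT_STALLED")
let futureLabel = label(for: "disk_full")
```

Keeping the wire string on the model means a kind added by a later Hermes build still reaches the user, just without a dedicated glyph or tint.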
@@ -0,0 +1,175 @@
import Foundation
/// One event from the `task_events` log emitted by `hermes kanban show`
/// (within a `HermesKanbanTaskDetail`) and streamed live by
/// `hermes kanban watch --json`. Event kinds are open-ended on the Hermes
/// side; v0.12 emits a small known set listed in `KanbanEventKind`. Unknown
/// kinds map to `.unknown` so new Hermes builds don't break decoding.
public struct HermesKanbanEvent: Sendable, Equatable, Identifiable, Codable {
public let id: Int
public let taskId: String
public let runId: Int?
/// Wire string for the event kind. Use `kindEnum` to interpret.
public let kind: String
public let createdAt: String
/// Opaque diagnostics payload from the `task_events.payload` column.
/// Stored as a JSON string so callers that don't need it pay no
/// decoding cost; callers that do can re-parse.
public let payloadJSON: String?
public init(
id: Int,
taskId: String,
runId: Int? = nil,
kind: String,
createdAt: String,
payloadJSON: String? = nil
) {
self.id = id
self.taskId = taskId
self.runId = runId
self.kind = kind
self.createdAt = createdAt
self.payloadJSON = payloadJSON
}
public var kindEnum: KanbanEventKind { KanbanEventKind.from(kind) }
enum CodingKeys: String, CodingKey {
case id
case taskId = "task_id"
case runId = "run_id"
case kind
case createdAt = "created_at"
case payload
}
public init(from decoder: any Decoder) throws {
let c = try decoder.container(keyedBy: CodingKeys.self)
self.id = try c.decodeIfPresent(Int.self, forKey: .id) ?? 0
self.taskId = try c.decodeIfPresent(String.self, forKey: .taskId) ?? ""
self.runId = try c.decodeIfPresent(Int.self, forKey: .runId)
self.kind = try c.decodeIfPresent(String.self, forKey: .kind) ?? "unknown"
if let unix = try? c.decodeIfPresent(Double.self, forKey: .createdAt) {
let f = ISO8601DateFormatter()
f.formatOptions = [.withInternetDateTime]
self.createdAt = f.string(from: Date(timeIntervalSince1970: unix))
} else {
self.createdAt = (try? c.decodeIfPresent(String.self, forKey: .createdAt)) ?? ""
}
// payload may be absent, a JSON object, or already a string.
if let raw = try? c.decodeIfPresent(String.self, forKey: .payload) {
self.payloadJSON = raw
} else if c.contains(.payload) {
// Re-encode arbitrary JSON into a string so we can carry it
// around without committing to a typed shape.
let nested = try c.decode(JSONAny.self, forKey: .payload)
let data = try JSONEncoder().encode(nested)
self.payloadJSON = String(data: data, encoding: .utf8)
} else {
self.payloadJSON = nil
}
}
public func encode(to encoder: any Encoder) throws {
var c = encoder.container(keyedBy: CodingKeys.self)
try c.encode(id, forKey: .id)
try c.encode(taskId, forKey: .taskId)
try c.encodeIfPresent(runId, forKey: .runId)
try c.encode(kind, forKey: .kind)
try c.encode(createdAt, forKey: .createdAt)
try c.encodeIfPresent(payloadJSON, forKey: .payload)
}
}
/// Known event kinds emitted by Hermes v0.12+. New kinds are surfaced
/// as `.unknown` until the model catches up; UI defaults to a generic
/// rendering for those.
public enum KanbanEventKind: String, Sendable, CaseIterable {
case created
case claimed
case released
case started
case completed
case blocked
case unblocked
case commented
case archived
case heartbeat
case statusChange = "status_change"
case error
case crashed
case timedOut = "timed_out"
case spawnFailed = "spawn_failed"
case unknown
public static func from(_ raw: String) -> KanbanEventKind {
KanbanEventKind(rawValue: raw.lowercased()) ?? .unknown
}
}
// MARK: - JSON-any helper
/// Minimal type-erased JSON wrapper used for opaque event payloads. We
/// don't commit to a typed shape because Hermes treats payload as
/// diagnostics and may evolve it freely. Used only inside Codable
/// init/encode (a single decode -> re-encode -> string pass), so the `Any`
/// payload never crosses an actor boundary; `@unchecked Sendable`
/// is the appropriate seal here.
struct JSONAny: Codable, @unchecked Sendable {
let raw: Any
init(from decoder: any Decoder) throws {
let container = try decoder.singleValueContainer()
if container.decodeNil() {
self.raw = NSNull()
} else if let b = try? container.decode(Bool.self) {
self.raw = b
} else if let i = try? container.decode(Int64.self) {
self.raw = i
} else if let d = try? container.decode(Double.self) {
self.raw = d
} else if let s = try? container.decode(String.self) {
self.raw = s
} else if let arr = try? container.decode([JSONAny].self) {
self.raw = arr.map(\.raw)
} else if let dict = try? container.decode([String: JSONAny].self) {
self.raw = dict.mapValues(\.raw)
} else {
throw DecodingError.dataCorruptedError(
in: container,
debugDescription: "Unsupported JSON value"
)
}
}
func encode(to encoder: any Encoder) throws {
var c = encoder.singleValueContainer()
switch raw {
case is NSNull:
try c.encodeNil()
case let b as Bool:
try c.encode(b)
case let i as Int64:
try c.encode(i)
case let i as Int:
try c.encode(Int64(i))
case let d as Double:
try c.encode(d)
case let s as String:
try c.encode(s)
case let arr as [Any]:
try c.encode(arr.map { JSONAny(unsafeRaw: $0) })
case let dict as [String: Any]:
try c.encode(dict.mapValues { JSONAny(unsafeRaw: $0) })
default:
throw EncodingError.invalidValue(
raw,
EncodingError.Context(codingPath: encoder.codingPath, debugDescription: "Unsupported")
)
}
}
private init(unsafeRaw: Any) { self.raw = unsafeRaw }
}
@@ -0,0 +1,170 @@
import Foundation
/// One attempt to execute a kanban task. `hermes kanban runs <id> --json`
/// returns an array of these per task. Each run records the worker
/// profile that claimed the task, the outcome, and a structured
/// metadata blob the worker handed back.
public struct HermesKanbanRun: Sendable, Equatable, Identifiable, Codable {
public let id: Int
public let taskId: String
public let profile: String?
public let stepKey: String?
public let status: String // running | done | blocked | crashed | timed_out | failed | released
public let claimLock: String? // "host:pid" at spawn time
public let claimExpires: Int?
public let workerPid: Int?
public let maxRuntimeSeconds: Int?
public let lastHeartbeatAt: String?
public let startedAt: String
public let endedAt: String?
public let outcome: String? // completed | blocked | crashed | timed_out | spawn_failed | gave_up | reclaimed
public let summary: String?
public let error: String?
/// `metadata` is an opaque JSON dict from the worker. Carried as a
/// raw string so we don't lock the typed shape.
public let metadataJSON: String?
// v0.13 (v2026.5.7) fields. Both Optional / empty-default so a v0.12
// host's run row decodes without error.
/// Per-attempt distress signals. Cross-run signals (retry cap hit,
/// etc.) hang off `HermesKanbanTask.diagnostics`; in-flight signals
/// (heartbeat stalled, darwin zombie detected) attach here.
public let diagnostics: [HermesKanbanDiagnostic]
/// Server-side unified failure counter (renamed from three separate
/// spawn / timeout / crash counters in v0.13). Optional; when nil,
/// callers fall back to counting failed runs in the runs array.
// TODO(WS-3-Q4): Verify whether v0.13 exposes this field on the per-run
// shape OR only at the task level. Tolerant decode handles either.
public let failureCount: Int?
public init(
id: Int,
taskId: String,
profile: String? = nil,
stepKey: String? = nil,
status: String,
claimLock: String? = nil,
claimExpires: Int? = nil,
workerPid: Int? = nil,
maxRuntimeSeconds: Int? = nil,
lastHeartbeatAt: String? = nil,
startedAt: String,
endedAt: String? = nil,
outcome: String? = nil,
summary: String? = nil,
error: String? = nil,
metadataJSON: String? = nil,
diagnostics: [HermesKanbanDiagnostic] = [],
failureCount: Int? = nil
) {
self.id = id
self.taskId = taskId
self.profile = profile
self.stepKey = stepKey
self.status = status
self.claimLock = claimLock
self.claimExpires = claimExpires
self.workerPid = workerPid
self.maxRuntimeSeconds = maxRuntimeSeconds
self.lastHeartbeatAt = lastHeartbeatAt
self.startedAt = startedAt
self.endedAt = endedAt
self.outcome = outcome
self.summary = summary
self.error = error
self.metadataJSON = metadataJSON
self.diagnostics = diagnostics
self.failureCount = failureCount
}
enum CodingKeys: String, CodingKey {
case id
case taskId = "task_id"
case profile
case stepKey = "step_key"
case status
case claimLock = "claim_lock"
case claimExpires = "claim_expires"
case workerPid = "worker_pid"
case maxRuntimeSeconds = "max_runtime_seconds"
case lastHeartbeatAt = "last_heartbeat_at"
case startedAt = "started_at"
case endedAt = "ended_at"
case outcome
case summary
case error
case metadata
case diagnostics
case failureCount = "failure_count"
}
public init(from decoder: any Decoder) throws {
let c = try decoder.container(keyedBy: CodingKeys.self)
self.id = try c.decodeIfPresent(Int.self, forKey: .id) ?? 0
self.taskId = try c.decodeIfPresent(String.self, forKey: .taskId) ?? ""
self.profile = try c.decodeIfPresent(String.self, forKey: .profile)
self.stepKey = try c.decodeIfPresent(String.self, forKey: .stepKey)
self.status = try c.decodeIfPresent(String.self, forKey: .status) ?? "unknown"
self.claimLock = try c.decodeIfPresent(String.self, forKey: .claimLock)
self.claimExpires = try c.decodeIfPresent(Int.self, forKey: .claimExpires)
self.workerPid = try c.decodeIfPresent(Int.self, forKey: .workerPid)
self.maxRuntimeSeconds = try c.decodeIfPresent(Int.self, forKey: .maxRuntimeSeconds)
let f = ISO8601DateFormatter()
f.formatOptions = [.withInternetDateTime]
if let unix = try? c.decodeIfPresent(Double.self, forKey: .lastHeartbeatAt) {
self.lastHeartbeatAt = f.string(from: Date(timeIntervalSince1970: unix))
} else {
self.lastHeartbeatAt = try c.decodeIfPresent(String.self, forKey: .lastHeartbeatAt)
}
if let unix = try? c.decodeIfPresent(Double.self, forKey: .startedAt) {
self.startedAt = f.string(from: Date(timeIntervalSince1970: unix))
} else {
self.startedAt = (try? c.decodeIfPresent(String.self, forKey: .startedAt)) ?? ""
}
if let unix = try? c.decodeIfPresent(Double.self, forKey: .endedAt) {
self.endedAt = f.string(from: Date(timeIntervalSince1970: unix))
} else {
self.endedAt = try c.decodeIfPresent(String.self, forKey: .endedAt)
}
self.outcome = try c.decodeIfPresent(String.self, forKey: .outcome)
self.summary = try c.decodeIfPresent(String.self, forKey: .summary)
self.error = try c.decodeIfPresent(String.self, forKey: .error)
if let raw = try? c.decodeIfPresent(String.self, forKey: .metadata) {
self.metadataJSON = raw
} else if c.contains(.metadata) {
let nested = try c.decode(JSONAny.self, forKey: .metadata)
let data = try JSONEncoder().encode(nested)
self.metadataJSON = String(data: data, encoding: .utf8)
} else {
self.metadataJSON = nil
}
// v0.13 diagnostics array. Uses `try?` so a malformed entry doesn't
// poison the whole run row. Empty default for pre-v0.13 hosts.
self.diagnostics = (try? c.decodeIfPresent([HermesKanbanDiagnostic].self, forKey: .diagnostics)) ?? []
self.failureCount = try c.decodeIfPresent(Int.self, forKey: .failureCount)
}
public func encode(to encoder: any Encoder) throws {
var c = encoder.container(keyedBy: CodingKeys.self)
try c.encode(id, forKey: .id)
try c.encode(taskId, forKey: .taskId)
try c.encodeIfPresent(profile, forKey: .profile)
try c.encodeIfPresent(stepKey, forKey: .stepKey)
try c.encode(status, forKey: .status)
try c.encodeIfPresent(claimLock, forKey: .claimLock)
try c.encodeIfPresent(claimExpires, forKey: .claimExpires)
try c.encodeIfPresent(workerPid, forKey: .workerPid)
try c.encodeIfPresent(maxRuntimeSeconds, forKey: .maxRuntimeSeconds)
try c.encodeIfPresent(lastHeartbeatAt, forKey: .lastHeartbeatAt)
try c.encode(startedAt, forKey: .startedAt)
try c.encodeIfPresent(endedAt, forKey: .endedAt)
try c.encodeIfPresent(outcome, forKey: .outcome)
try c.encodeIfPresent(summary, forKey: .summary)
try c.encodeIfPresent(error, forKey: .error)
try c.encodeIfPresent(metadataJSON, forKey: .metadata)
try c.encode(diagnostics, forKey: .diagnostics)
try c.encodeIfPresent(failureCount, forKey: .failureCount)
}
}
@@ -0,0 +1,68 @@
import Foundation
/// Output of `hermes kanban stats --json`. Drives the toolbar glance
/// ("12 todo · 3 running · 5 blocked"), the per-project Kanban summary
/// widget, and the column-count badges on the board header.
public struct HermesKanbanStats: Sendable, Equatable, Codable {
public let byStatus: [String: Int]
public let byAssignee: [String: Int]
public let byTenant: [String: Int]
/// Age in seconds of the oldest task currently in the `ready` status.
/// `nil` when no tasks are ready. Helps surface a stuck dispatcher.
public let oldestReadyAgeSeconds: Double?
public init(
byStatus: [String: Int],
byAssignee: [String: Int] = [:],
byTenant: [String: Int] = [:],
oldestReadyAgeSeconds: Double? = nil
) {
self.byStatus = byStatus
self.byAssignee = byAssignee
self.byTenant = byTenant
self.oldestReadyAgeSeconds = oldestReadyAgeSeconds
}
public static let empty = HermesKanbanStats(byStatus: [:])
enum CodingKeys: String, CodingKey {
case byStatus = "by_status"
case byAssignee = "by_assignee"
case byTenant = "by_tenant"
case oldestReadyAgeSeconds = "oldest_ready_age_seconds"
}
public init(from decoder: any Decoder) throws {
let c = try decoder.container(keyedBy: CodingKeys.self)
self.byStatus = try c.decodeIfPresent([String: Int].self, forKey: .byStatus) ?? [:]
self.byAssignee = try c.decodeIfPresent([String: Int].self, forKey: .byAssignee) ?? [:]
self.byTenant = try c.decodeIfPresent([String: Int].self, forKey: .byTenant) ?? [:]
self.oldestReadyAgeSeconds = try c.decodeIfPresent(Double.self, forKey: .oldestReadyAgeSeconds)
}
/// "12 todo · 3 running · 5 blocked" formatted glance string. Skips
/// empty buckets and never includes archived. Returns an empty
/// string when there's nothing to show so callers can hide chrome.
public var glanceString: String {
let order: [(String, String)] = [
("todo", "todo"),
("ready", "ready"),
("running", "running"),
("blocked", "blocked"),
("done", "done")
]
let parts = order.compactMap { (key, label) -> String? in
guard let n = byStatus[key], n > 0 else { return nil }
return "\(n) \(label)"
}
return parts.joined(separator: " · ")
}
/// Active task count across the board (everything except archived
/// and done). Used as a badge on the sidebar / project tab.
public var activeCount: Int {
["triage", "todo", "ready", "running", "blocked"]
.map { byStatus[$0] ?? 0 }
.reduce(0, +)
}
}
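// Usage sketch (illustrative, not part of the original change). Assumes
// `hermes kanban stats --json` emits the snake_case keys mapped in
// `CodingKeys`; the sample payload and the function name are hypothetical.
func exampleKanbanStatsGlance() throws -> (glance: String, active: Int) {
    let json = #"{"by_status": {"todo": 12, "running": 3, "blocked": 5, "archived": 40}}"#
    let stats = try JSONDecoder().decode(HermesKanbanStats.self, from: Data(json.utf8))
    // `archived` is ignored by both helpers; missing buckets count as zero,
    // so this yields "12 todo · 3 running · 5 blocked" and 20.
    return (stats.glanceString, stats.activeCount)
}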
@@ -0,0 +1,282 @@
import Foundation
/// One task from `hermes kanban list --json` (v0.12+).
///
/// Hermes ships a SQLite-backed task board under `~/.hermes/kanban.db`.
/// v2.6 surfaced this as a read-only list; v2.7.5 lifts it to a full
/// drag-and-drop board with the complete write surface (`create`,
/// `claim`, `complete`, `block`, `unblock`, `archive`, `assign`,
/// `link`/`unlink`, `comment`, `dispatch`).
///
/// Hermes has no `update` verb: `priority` / `title` / `body` /
/// `tenant` / `max_retries` are write-once at create time. Mutations
/// after that are expressed as state transitions (status, assignee) or
/// new comments.
public struct HermesKanbanTask: Sendable, Equatable, Identifiable, Codable {
public let id: String
public let title: String
public let body: String?
public let assignee: String?
public let status: String // archived | blocked | done | ready | running | todo | triage
public let priority: Int?
public let tenant: String?
public let workspaceKind: String? // scratch | worktree | dir
public let workspacePath: String?
public let createdBy: String?
public let createdAt: String? // ISO timestamp
public let startedAt: String?
public let completedAt: String?
public let result: String?
public let skills: [String]
// v2.7.5 fields exposed by `kanban show --json` and `kanban watch`.
public let idempotencyKey: String?
public let lastHeartbeatAt: String?
public let maxRuntimeSeconds: Int?
public let currentRunId: Int?
// v0.13 (v2026.5.7) reliability + recovery fields. All Optional with
// `nil` decoded for pre-v0.13 hosts so the v2.7.5 surface keeps
// rendering unchanged when the connected Hermes hasn't shipped them.
/// Per-task retry budget set at create time via `--max-retries N`.
/// Hermes's pattern is write-once: there is no `set_max_retries` verb. Scarf
/// surfaces this read-only on the inspector header.
public let maxRetries: Int?
/// Server-supplied reason a task was auto-blocked (e.g. "worker
/// exited (code 0) without calling `kanban complete`"). Surfaced
/// verbatim in the inspector banner.
public let autoBlockedReason: String?
/// `pending` / `verified` / `rejected` / nil. Pending means a worker
/// claimed it created this card but Hermes hasn't confirmed the
/// underlying work exists. Read through `KanbanHallucinationGate.from`
/// to map to a typed mirror; kept as a String at the wire level so
/// Hermes can add new gate states (e.g. `quarantined`) without a
/// Scarf release.
public let hallucinationGateStatus: String?
/// Cross-run distress signals (retry cap hit, etc.). Per-run signals
/// hang off `HermesKanbanRun.diagnostics`. Empty array for pre-v0.13
/// hosts AND for tasks the diagnostics engine hasn't flagged.
public let diagnostics: [HermesKanbanDiagnostic]
public init(
id: String,
title: String,
body: String? = nil,
assignee: String? = nil,
status: String,
priority: Int? = nil,
tenant: String? = nil,
workspaceKind: String? = nil,
workspacePath: String? = nil,
createdBy: String? = nil,
createdAt: String? = nil,
startedAt: String? = nil,
completedAt: String? = nil,
result: String? = nil,
skills: [String] = [],
idempotencyKey: String? = nil,
lastHeartbeatAt: String? = nil,
maxRuntimeSeconds: Int? = nil,
currentRunId: Int? = nil,
maxRetries: Int? = nil,
autoBlockedReason: String? = nil,
hallucinationGateStatus: String? = nil,
diagnostics: [HermesKanbanDiagnostic] = []
) {
self.id = id
self.title = title
self.body = body
self.assignee = assignee
self.status = status
self.priority = priority
self.tenant = tenant
self.workspaceKind = workspaceKind
self.workspacePath = workspacePath
self.createdBy = createdBy
self.createdAt = createdAt
self.startedAt = startedAt
self.completedAt = completedAt
self.result = result
self.skills = skills
self.idempotencyKey = idempotencyKey
self.lastHeartbeatAt = lastHeartbeatAt
self.maxRuntimeSeconds = maxRuntimeSeconds
self.currentRunId = currentRunId
self.maxRetries = maxRetries
self.autoBlockedReason = autoBlockedReason
self.hallucinationGateStatus = hallucinationGateStatus
self.diagnostics = diagnostics
}
enum CodingKeys: String, CodingKey {
case id, title, body, assignee, status, priority, tenant
case workspaceKind = "workspace_kind"
case workspacePath = "workspace_path"
case createdBy = "created_by"
case createdAt = "created_at"
case startedAt = "started_at"
case completedAt = "completed_at"
case result, skills
case idempotencyKey = "idempotency_key"
case lastHeartbeatAt = "last_heartbeat_at"
case maxRuntimeSeconds = "max_runtime_seconds"
case currentRunId = "current_run_id"
case maxRetries = "max_retries"
case autoBlockedReason = "auto_blocked_reason"
case hallucinationGateStatus = "hallucination_gate_status"
case diagnostics
}
public init(from decoder: any Decoder) throws {
let c = try decoder.container(keyedBy: CodingKeys.self)
self.id = try c.decode(String.self, forKey: .id)
self.title = try c.decode(String.self, forKey: .title)
self.body = try c.decodeIfPresent(String.self, forKey: .body)
self.assignee = try c.decodeIfPresent(String.self, forKey: .assignee)
self.status = try c.decodeIfPresent(String.self, forKey: .status) ?? "unknown"
self.priority = try c.decodeIfPresent(Int.self, forKey: .priority)
self.tenant = try c.decodeIfPresent(String.self, forKey: .tenant)
self.workspaceKind = try c.decodeIfPresent(String.self, forKey: .workspaceKind)
self.workspacePath = try c.decodeIfPresent(String.self, forKey: .workspacePath)
self.createdBy = try c.decodeIfPresent(String.self, forKey: .createdBy)
// Hermes emits timestamps as Unix integer seconds for tasks
// returned from `create`/`show`/`list` (its SQLite columns are
// INTEGER) but ISO-8601 strings in some other paths. Normalize
// both shapes into ISO-8601 strings so UI code only deals with
// one type.
self.createdAt = try Self.decodeFlexibleTimestamp(c, forKey: .createdAt)
self.startedAt = try Self.decodeFlexibleTimestamp(c, forKey: .startedAt)
self.completedAt = try Self.decodeFlexibleTimestamp(c, forKey: .completedAt)
self.result = try c.decodeIfPresent(String.self, forKey: .result)
self.skills = try c.decodeIfPresent([String].self, forKey: .skills) ?? []
self.idempotencyKey = try c.decodeIfPresent(String.self, forKey: .idempotencyKey)
self.lastHeartbeatAt = try Self.decodeFlexibleTimestamp(c, forKey: .lastHeartbeatAt)
self.maxRuntimeSeconds = try c.decodeIfPresent(Int.self, forKey: .maxRuntimeSeconds)
self.currentRunId = try c.decodeIfPresent(Int.self, forKey: .currentRunId)
// v0.13 fields: every one is `decodeIfPresent` so a v0.12 host's
// task row decodes successfully with these all nil/empty. The
// tolerant-decode contract is pinned by KanbanModelsTests.
self.maxRetries = try c.decodeIfPresent(Int.self, forKey: .maxRetries)
self.autoBlockedReason = try c.decodeIfPresent(String.self, forKey: .autoBlockedReason)
self.hallucinationGateStatus = try c.decodeIfPresent(String.self, forKey: .hallucinationGateStatus)
// Wrap diagnostics decode in `try?` so a single malformed entry
// (or the whole array being the wrong shape) doesn't poison the
// task row; the rest of the decoder still produces a usable
// task. Empty default matches the `skills` pattern.
self.diagnostics = (try? c.decodeIfPresent([HermesKanbanDiagnostic].self, forKey: .diagnostics)) ?? []
}
/// Decode a timestamp that may arrive as a Unix integer or an
/// ISO-8601 string. Returns the ISO-8601 string form so downstream
/// code only deals with one type.
static func decodeFlexibleTimestamp(
_ container: KeyedDecodingContainer<CodingKeys>,
forKey key: CodingKeys
) throws -> String? {
if !container.contains(key) { return nil }
// Try the SQLite-style Unix-seconds number first (most common from
// Hermes). Decoded as Double, so fractional seconds are accepted too.
if let unix = try? container.decodeIfPresent(Double.self, forKey: key) {
let date = Date(timeIntervalSince1970: unix)
return Self.isoFormatter.string(from: date)
}
// Fall back to a plain string.
return try container.decodeIfPresent(String.self, forKey: key)
}
static let isoFormatter: ISO8601DateFormatter = {
let f = ISO8601DateFormatter()
f.formatOptions = [.withInternetDateTime]
return f
}()
}
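// Usage sketch (illustrative, not part of the original change). The payload
// is hypothetical: it mixes a Unix-seconds `created_at` with an ISO-8601
// `started_at` and omits every v0.13 key, as a pre-v0.13 host might.
func exampleDecodeMixedTimestamps() throws -> HermesKanbanTask {
    let json = #"{"id": "t1", "title": "Demo", "status": "ready", "created_at": 0, "started_at": "2026-01-01T00:00:00Z"}"#
    let task = try JSONDecoder().decode(HermesKanbanTask.self, from: Data(json.utf8))
    // `createdAt` is normalized to "1970-01-01T00:00:00Z"; `startedAt` passes
    // through unchanged; `maxRetries` is nil and `diagnostics` is empty.
    return task
}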
// MARK: - Status enum (typed view of the wire string)
/// Typed mirror of Hermes's status enum. Models keep `status: String` for
/// forward compatibility with new statuses Hermes might add; UI code uses
/// `KanbanStatus.from(_:)` to map known values into typed categories and
/// fall back to `.unknown` for anything new.
public enum KanbanStatus: String, Sendable, CaseIterable, Identifiable {
case triage
case todo
case ready
case running
case blocked
case done
case archived
case unknown
public var id: String { rawValue }
public static func from(_ raw: String) -> KanbanStatus {
KanbanStatus(rawValue: raw.lowercased()) ?? .unknown
}
/// Coarse 5-column board grouping. `triage` is a column; `todo` and
/// `ready` collapse to one ("Up Next"), which also absorbs `unknown`
/// statuses; everything else maps 1:1. `archived` lives outside the
/// board (toggle).
public var boardColumn: KanbanBoardColumn {
switch self {
case .triage: return .triage
case .todo, .ready, .unknown: return .upNext
case .running: return .running
case .blocked: return .blocked
case .done: return .done
case .archived: return .archived
}
}
}
public enum KanbanBoardColumn: String, Sendable, CaseIterable, Identifiable {
case triage
case upNext
case running
case blocked
case done
case archived
public var id: String { rawValue }
public var displayName: String {
switch self {
case .triage: return "Triage"
case .upNext: return "Up Next"
case .running: return "Running"
case .blocked: return "Blocked"
case .done: return "Done"
case .archived: return "Archived"
}
}
/// Visible columns in the default board layout. `archived` appears
/// only when the "Show archived" toggle is on. `triage` is shown
/// only when the board has at least one triage task (collapsed
/// otherwise to keep the default layout focused).
public static let defaultVisible: [KanbanBoardColumn] = [
.triage, .upNext, .running, .blocked, .done
]
}
// MARK: - Hallucination gate (v0.13)
/// Typed mirror of Hermes v0.13's hallucination-gate state. Worker-created
/// cards land in `pending` until something verifies the underlying work
/// exists; Scarf surfaces a Verify / Reject UX above the task body so the
/// user can act as the verification gate.
///
/// Kept separate from `KanbanStatus` because hallucination state is
/// orthogonal to the lifecycle: a card can be `ready` *and* `pending`,
/// for example.
public enum KanbanHallucinationGate: String, Sendable, CaseIterable {
case pending
case verified
case rejected
/// Map a raw `hallucination_gate_status` string (case-insensitive) to
/// a typed gate. Returns nil for empty/nil/unknown values so callers
/// can short-circuit "no gate" branches with `if let gate = `.
public static func from(_ raw: String?) -> KanbanHallucinationGate? {
guard let raw, !raw.isEmpty else { return nil }
return KanbanHallucinationGate(rawValue: raw.lowercased())
}
}
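// Usage sketch (illustrative, not part of the original change): mapping wire
// strings to the typed mirrors. `paused` is a hypothetical unrecognized
// status; `quarantined` is the possible future gate state mentioned above.
func exampleTypedMirrors() -> Bool {
    KanbanStatus.from("READY").boardColumn == .upNext
        && KanbanStatus.from("paused") == .unknown
        && KanbanHallucinationGate.from("Pending") == .pending
        && KanbanHallucinationGate.from("quarantined") == nil
        && KanbanHallucinationGate.from("") == nil
}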
@@ -0,0 +1,89 @@
import Foundation
/// Output of `hermes kanban show <id> --json`. Wraps a task with its full
/// audit trail: comments + events + parent results. Loaded on-demand
/// when the user opens the inspector pane; the board itself only carries
/// the lightweight `HermesKanbanTask` rows.
public struct HermesKanbanTaskDetail: Sendable, Equatable, Codable {
public let task: HermesKanbanTask
public let comments: [HermesKanbanComment]
public let events: [HermesKanbanEvent]
/// Parent-task results keyed by parent task id. Hermes hands these
/// to the worker as upstream context; surfacing them in the
/// inspector is useful for understanding why a task started.
public let parentResults: [String: String]
/// Envelope-level diagnostics array (sibling to `task`, not nested
/// inside it). Defensive: Hermes v0.13's wire shape may attach
/// diagnostics to the task itself OR to the envelope.
/// `allDiagnostics` dedupes both sources by `(kind, detected_at)`.
// TODO(WS-3-Q2): Confirm against live `hermes kanban show --json`
// whether diagnostics live on the task envelope, the inner task, or
// both. Current decode is tolerant of either.
public let envelopeDiagnostics: [HermesKanbanDiagnostic]?
public init(
task: HermesKanbanTask,
comments: [HermesKanbanComment] = [],
events: [HermesKanbanEvent] = [],
parentResults: [String: String] = [:],
envelopeDiagnostics: [HermesKanbanDiagnostic]? = nil
) {
self.task = task
self.comments = comments
self.events = events
self.parentResults = parentResults
self.envelopeDiagnostics = envelopeDiagnostics
}
enum CodingKeys: String, CodingKey {
case task
case comments
case events
case parentResults = "parent_results"
case envelopeDiagnostics = "diagnostics"
}
public init(from decoder: any Decoder) throws {
// Hermes emits `kanban show --json` either as a nested
// {task: {...}, comments: [...], events: [...]} object or
// as a flat task object with extra `comments`/`events`
// keys at top level. Try the nested form first; fall
// back to top-level decode.
let container = try decoder.container(keyedBy: CodingKeys.self)
if let nested = try? container.decode(HermesKanbanTask.self, forKey: .task) {
self.task = nested
} else {
let single = try decoder.singleValueContainer()
self.task = try single.decode(HermesKanbanTask.self)
}
self.comments = (try? container.decodeIfPresent([HermesKanbanComment].self, forKey: .comments)) ?? []
self.events = (try? container.decodeIfPresent([HermesKanbanEvent].self, forKey: .events)) ?? []
self.parentResults = (try? container.decodeIfPresent([String: String].self, forKey: .parentResults)) ?? [:]
// Same `try?` shield as the rest: a malformed envelope
// diagnostics array shouldn't reject the whole show response.
self.envelopeDiagnostics = try? container.decodeIfPresent([HermesKanbanDiagnostic].self, forKey: .envelopeDiagnostics)
}
public func encode(to encoder: any Encoder) throws {
var c = encoder.container(keyedBy: CodingKeys.self)
try c.encode(task, forKey: .task)
try c.encode(comments, forKey: .comments)
try c.encode(events, forKey: .events)
try c.encode(parentResults, forKey: .parentResults)
try c.encodeIfPresent(envelopeDiagnostics, forKey: .envelopeDiagnostics)
}
/// Unified diagnostics view for the inspector. Combines `task.diagnostics`
/// with envelope-level diagnostics (when present) and dedupes on the
/// `(kind, detectedAt)` tuple. Wire-side dupes are unlikely but cheap to
/// filter. Empty for pre-v0.13 hosts.
public var allDiagnostics: [HermesKanbanDiagnostic] {
let onTask = task.diagnostics
let onEnvelope = envelopeDiagnostics ?? []
var seen = Set<String>()
return (onTask + onEnvelope).filter { diag in
let key = "\(diag.kind)|\(diag.detectedAt ?? "")"
return seen.insert(key).inserted
}
}
}
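// Usage sketch (illustrative, not part of the original change). Both payloads
// are hypothetical stand-ins for the two `kanban show --json` shapes the
// decoder comment describes: nested under `task`, and flat at top level.
func exampleDecodeShowShapes() throws -> Bool {
    let nested = #"{"task": {"id": "t1", "title": "Demo", "status": "todo"}, "comments": []}"#
    let flat = #"{"id": "t1", "title": "Demo", "status": "todo"}"#
    let decoder = JSONDecoder()
    let a = try decoder.decode(HermesKanbanTaskDetail.self, from: Data(nested.utf8))
    let b = try decoder.decode(HermesKanbanTaskDetail.self, from: Data(flat.utf8))
    // Both shapes should yield the same task; absent arrays default to empty.
    return a.task == b.task && b.comments.isEmpty && b.allDiagnostics.isEmpty
}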
@@ -3,6 +3,10 @@ import Foundation
public enum MCPTransport: String, Sendable, Equatable, CaseIterable, Identifiable {
case stdio
case http
/// Server-Sent Events transport. Hermes v0.13+ only.
// TODO(WS-7-Q1): Verify Hermes uses the literal `sse` transport name
// (vs. `streamable-http`/`http-sse`/etc.) once a v0.13 host is on hand.
case sse
public var id: String { rawValue }
@@ -11,6 +15,7 @@ public enum MCPTransport: String, Sendable, Equatable, CaseIterable, Identifiabl
switch self {
case .stdio: return "Local (stdio)"
case .http: return "Remote (HTTP)"
case .sse: return "Remote (SSE)"
}
}
#endif
@@ -33,6 +38,12 @@ public struct HermesMCPServer: Identifiable, Sendable, Equatable {
public let resourcesEnabled: Bool
public let promptsEnabled: Bool
public let hasOAuthToken: Bool
/// Hermes-side keepalive interval (seconds) for SSE transport. `nil`
/// when the YAML doesn't specify `sse_read_timeout` (Hermes default
/// applies). Pre-v0.13 hosts always have this as `nil`.
// TODO(WS-7-Q2): Default is assumed to be 300s per WS-7 plan; placeholder
// copy uses that. Verify against `~/.hermes/hermes-agent/hermes_cli/mcp.py`.
public let sseReadTimeout: Int?
public init(
@@ -51,7 +62,8 @@ public struct HermesMCPServer: Identifiable, Sendable, Equatable {
toolsExclude: [String],
resourcesEnabled: Bool,
promptsEnabled: Bool,
hasOAuthToken: Bool
hasOAuthToken: Bool,
sseReadTimeout: Int? = nil
) {
self.name = name
self.transport = transport
@@ -69,6 +81,7 @@ public struct HermesMCPServer: Identifiable, Sendable, Equatable {
self.resourcesEnabled = resourcesEnabled
self.promptsEnabled = promptsEnabled
self.hasOAuthToken = hasOAuthToken
self.sseReadTimeout = sseReadTimeout
}
public var id: String { name }
@@ -79,6 +92,8 @@ public struct HermesMCPServer: Identifiable, Sendable, Equatable {
return (command ?? "") + argString
case .http:
return url ?? ""
case .sse:
return url ?? ""
}
}
}
@@ -64,6 +64,28 @@ public struct HermesMessage: Identifiable, Sendable {
if let rc = reasoningContent, !rc.isEmpty { return rc }
return reasoning
}
/// Return a copy of this message with `toolCalls` replaced. Used
/// by the v2.8 two-phase chat loader: skeleton fetch returns
/// messages with empty `toolCalls`; the background hydrate splices
/// the parsed values in without re-fetching the conversational
/// columns.
public func withToolCalls(_ newCalls: [HermesToolCall]) -> HermesMessage {
HermesMessage(
id: id,
sessionId: sessionId,
role: role,
content: content,
toolCallId: toolCallId,
toolCalls: newCalls,
toolName: toolName,
timestamp: timestamp,
tokenCount: tokenCount,
finishReason: finishReason,
reasoning: reasoning,
reasoningContent: reasoningContent
)
}
}
public struct HermesToolCall: Identifiable, Sendable, Codable {
@@ -210,3 +232,23 @@ public enum ToolKind: String, Sendable, CaseIterable {
}
}
}
/// Outcome of a `fetchMessagesOutcome` call. `transportError` is non-nil
/// only when the underlying SSH/SQLite call hit a transport-layer
/// failure (timeout, ControlMaster drop); it distinguishes a genuine
/// empty session from a silent partial-load. The chat resume path uses
/// it to surface a "couldn't load full history" banner.
public struct MessageFetchOutcome: Sendable {
public let messages: [HermesMessage]
public let transportError: String?
public init(messages: [HermesMessage], transportError: String?) {
self.messages = messages
self.transportError = transportError
}
/// True when the fetch tripped a transport failure. Distinct from
/// `messages.isEmpty`: an empty session is a successful zero-row
/// result, while a transport error is "we don't know what's there."
public var didTimeOut: Bool { transportError != nil }
}
@@ -75,12 +75,35 @@ public struct HermesPathSet: Sendable, Hashable {
public nonisolated var errorsLog: String { home + "/logs/errors.log" }
public nonisolated var agentLog: String { home + "/logs/agent.log" }
public nonisolated var gatewayLog: String { home + "/logs/gateway.log" }
/// Curator run-reports root (v0.12+). Hermes writes per-cycle dirs
/// under here named `<YYYYMMDD-HHMMSS>/` containing `run.json` and
/// `REPORT.md`. The `last_report_path` field on `curator_state`
/// points at the most recent dir; `CuratorViewModel` resolves the
/// JSON/Markdown files relative to it.
public nonisolated var curatorLogsDir: String { home + "/logs/curator" }
/// JSON-encoded curator state (v0.12+). Filename has no extension
/// despite holding JSON; Hermes writes it to
/// `~/.hermes/skills/.curator_state`. Carries last-run metadata,
/// run count, pause flag, and the path to the most recent report.
public nonisolated var curatorStateFile: String { home + "/skills/.curator_state" }
public nonisolated var scarfDir: String { home + "/scarf" }
public nonisolated var projectsRegistry: String { scarfDir + "/projects.json" }
/// Maps Hermes session IDs to the Scarf project path a chat was
/// started for. Scarf-owned; Hermes never touches this file.
public nonisolated var sessionProjectMap: String { scarfDir + "/session_project_map.json" }
/// Cached list of available Nous Portal models. Populated by
/// `NousModelCatalogService` from `GET https://inference-api.nousresearch.com/v1/models`
/// using the bearer token in `auth.json`. Refreshed on a 24h TTL or
/// on user request from the model picker. Survives offline runs so
/// the picker still has something to render.
public nonisolated var nousModelsCache: String { scarfDir + "/nous_models_cache.json" }
/// Cached `templates/catalog.json` from awizemann.github.io. Populated
/// by `CatalogService` on first sheet-open and refreshed on a 24h TTL
/// or on explicit user click. Mirrors `nousModelsCache` exactly:
/// JSON, scarf-owned, survives offline runs so the catalog browser
/// still has something to render. Wiped by a Hermes home reset.
public nonisolated var catalogCache: String { scarfDir + "/catalog_cache.json" }
public nonisolated var mcpTokensDir: String { home + "/mcp-tokens" }
// MARK: - Binary resolution
@@ -0,0 +1,23 @@
import Foundation
/// One queued prompt the user has staged via `/queue <text>` (Hermes
/// v0.13+ ACP `/queue` slash command). Hermes is the authoritative owner
/// of the actual queue server-side; Scarf maintains this mirror so the
/// chat header chip + popover can show "what's pending" without an
/// extra round-trip. The mirror drains best-effort when a turn
/// completes (`RichChatViewModel.popQueuedPrompt`).
///
/// `id` is a Scarf-side UUID minted at queue-time; Hermes' wire
/// protocol does not expose a per-queue-entry id, so we never round-trip
/// an entry-level identifier. See WS-2 plan Q5.
public struct HermesQueuedPrompt: Sendable, Equatable, Identifiable {
public let id: UUID
public let text: String
public let queuedAt: Date
public init(id: UUID = UUID(), text: String, queuedAt: Date = Date()) {
self.id = id
self.text = text
self.queuedAt = queuedAt
}
}
@@ -37,6 +37,16 @@ public struct HermesSkill: Identifiable, Sendable {
/// Python packages). Used by `SkillPrereqService` to know what to
/// probe; nil when the field is absent.
public let dependencies: [String]?
/// `false` when the skill name appears in `skills.disabled` in
/// `~/.hermes/config.yaml`. Hermes v0.12 stores disable state in
/// the config rather than per-skill markers; this is read-only
/// from Scarf's side until the toggle UI lands. Defaults to `true`.
public let enabled: Bool
/// `true` when the skill is pinned via `hermes curator pin <name>`.
/// Pinned skills are protected from auto-archive / consolidation.
/// Read from `CuratorViewModel.status.pinnedNames`; defaults to
/// `false` when curator state is unavailable.
public let pinned: Bool
public init(
id: String,
@@ -47,7 +57,9 @@ public struct HermesSkill: Identifiable, Sendable {
requiredConfig: [String],
allowedTools: [String]? = nil,
relatedSkills: [String]? = nil,
dependencies: [String]? = nil
dependencies: [String]? = nil,
enabled: Bool = true,
pinned: Bool = false
) {
self.id = id
self.name = name
@@ -58,5 +70,7 @@ public struct HermesSkill: Identifiable, Sendable {
self.allowedTools = allowedTools
self.relatedSkills = relatedSkills
self.dependencies = dependencies
self.enabled = enabled
self.pinned = pinned
}
}
@@ -53,6 +53,24 @@ public enum KnownPlatforms {
HermesToolPlatform(name: "feishu", displayName: "Feishu", icon: "message.badge.circle"),
HermesToolPlatform(name: "mattermost", displayName: "Mattermost", icon: "bubble.left.and.exclamationmark.bubble.right"),
HermesToolPlatform(name: "imessage", displayName: "iMessage", icon: "message.fill"),
// -- v0.12 additions ---------------------------------------------
// Yuanbao is a native gateway adapter (18th platform); Microsoft
// Teams ships as a plugin (19th). PlatformDetail surfaces the
// distinction in the setup copy. Names match Hermes's gateway
// platform identifiers.
HermesToolPlatform(name: "yuanbao", displayName: "Yuanbao 元宝", icon: "bubble.left.and.bubble.right.fill"),
HermesToolPlatform(name: "microsoft-teams", displayName: "Microsoft Teams", icon: "person.2.fill"),
// -- v0.13 additions ---------------------------------------------
// Google Chat is the 20th gateway platform. It's a generic
// `env_enablement_fn` / `cron_deliver_env_var`-driven adapter; setup
// runs through `hermes setup` rather than per-field forms because
// the auth dance is OAuth-style and lives outside Scarf. Identifier
// is `google-chat` (kebab-case, mirroring `microsoft-teams`).
// TODO(WS-5-Q1): verify identifier against Hermes v0.13 GA; if it
// ships as `googlechat` instead, update both this entry and
// `KnownPlatforms.icon(for:)` below. `GatewayAllowlistKind.kind(for:)`
// already accepts both spellings defensively.
HermesToolPlatform(name: "google-chat", displayName: "Google Chat", icon: "bubble.left.fill"),
]
public static func icon(for platform: String) -> String {
@@ -70,6 +88,9 @@ public enum KnownPlatforms {
case "feishu": return "message.badge.circle"
case "mattermost": return "bubble.left.and.exclamationmark.bubble.right"
case "imessage": return "message.fill"
case "yuanbao": return "bubble.left.and.bubble.right.fill"
case "microsoft-teams": return "person.2.fill"
case "google-chat", "googlechat": return "bubble.left.fill"
default: return "bubble.left"
}
}
@@ -0,0 +1,134 @@
import Foundation
/// Swift-side parameter struct that maps 1:1 onto `hermes kanban create`
/// flags. Constructing one then handing it to `KanbanService.create`
/// keeps the CLI argv assembly in one place: VMs build a `KanbanCreateRequest`
/// from form state and never assemble argv directly.
public struct KanbanCreateRequest: Sendable, Equatable {
public var title: String
public var body: String?
public var assignee: String?
public var parentIds: [String]
public var workspace: KanbanWorkspaceSpec?
public var tenant: String?
public var priority: Int?
public var triage: Bool
public var idempotencyKey: String?
public var maxRuntimeSeconds: Int?
public var createdBy: String?
public var skills: [String]
/// v0.13: per-task retry budget. `--max-retries N` is write-once at
/// create time; there is no `set_max_retries` verb. Pass `nil` to let
/// Hermes pick its built-in default (assumed to be 3 as of v0.13.0;
/// see the TODO below). Capability-gated in the create sheet on
/// `hasKanbanDiagnostics`.
// TODO(WS-3-Q6): Confirm Hermes's global default for `max_retries`
// (v0.13 release notes don't enumerate it). The create sheet defaults
// the field to 3; if Hermes config exposes a different default, mirror
// it.
public var maxRetries: Int?
public init(
title: String,
body: String? = nil,
assignee: String? = nil,
parentIds: [String] = [],
workspace: KanbanWorkspaceSpec? = nil,
tenant: String? = nil,
priority: Int? = nil,
triage: Bool = false,
idempotencyKey: String? = nil,
maxRuntimeSeconds: Int? = nil,
createdBy: String? = nil,
skills: [String] = [],
maxRetries: Int? = nil
) {
self.title = title
self.body = body
self.assignee = assignee
self.parentIds = parentIds
self.workspace = workspace
self.tenant = tenant
self.priority = priority
self.triage = triage
self.idempotencyKey = idempotencyKey
self.maxRuntimeSeconds = maxRuntimeSeconds
self.createdBy = createdBy
self.skills = skills
self.maxRetries = maxRetries
}
/// Build the argv suffix this request maps to (everything after
/// `["kanban", "create"]`). Public for tests; consumers should
/// call `KanbanService.create` instead of building argv directly.
public func argv() -> [String] {
var args: [String] = []
if let body, !body.isEmpty {
args.append(contentsOf: ["--body", body])
}
if let assignee, !assignee.isEmpty {
args.append(contentsOf: ["--assignee", assignee])
}
for parent in parentIds {
args.append(contentsOf: ["--parent", parent])
}
if let workspace {
args.append(contentsOf: ["--workspace", workspace.cliValue])
}
if let tenant, !tenant.isEmpty {
args.append(contentsOf: ["--tenant", tenant])
}
if let priority {
args.append(contentsOf: ["--priority", String(priority)])
}
if triage {
args.append("--triage")
}
if let idempotencyKey, !idempotencyKey.isEmpty {
args.append(contentsOf: ["--idempotency-key", idempotencyKey])
}
if let maxRuntimeSeconds {
args.append(contentsOf: ["--max-runtime", "\(maxRuntimeSeconds)s"])
}
if let maxRetries {
args.append(contentsOf: ["--max-retries", String(maxRetries)])
}
if let createdBy, !createdBy.isEmpty {
args.append(contentsOf: ["--created-by", createdBy])
}
for skill in skills {
args.append(contentsOf: ["--skill", skill])
}
args.append("--json")
// Title is the positional argument; it is appended last so flags
// can't be confused for it.
args.append(title)
return args
}
}
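// Usage sketch (illustrative, not part of the original change). The field
// values and the function name are hypothetical; the point is the flag order
// `argv()` produces, with `--json` and the positional title always last.
func exampleCreateArgv() -> [String] {
    let request = KanbanCreateRequest(
        title: "Fix login bug",
        assignee: "alice",
        workspace: .directory("/tmp/repo"),
        priority: 2,
        maxRuntimeSeconds: 600,
        maxRetries: 5
    )
    // ["--assignee", "alice", "--workspace", "dir:/tmp/repo", "--priority", "2",
    //  "--max-runtime", "600s", "--max-retries", "5", "--json", "Fix login bug"]
    return request.argv()
}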
/// Typed mirror of Hermes's `--workspace` flag. `scratch` and `worktree`
/// are bare strings on the wire; `dir:<absolute path>` is a colon-prefixed
/// path. We keep them typed in Swift so callers can't typo "scrach".
public enum KanbanWorkspaceSpec: Sendable, Equatable {
case scratch
case worktree
case directory(String)
public var cliValue: String {
switch self {
case .scratch: return "scratch"
case .worktree: return "worktree"
case .directory(let p): return "dir:\(p)"
}
}
/// "scratch" / "worktree" / "dir": the kind segment, suitable
/// for badge labels.
public var displayKind: String {
switch self {
case .scratch: return "scratch"
case .worktree: return "worktree"
case .directory: return "dir"
}
}
}
@@ -0,0 +1,52 @@
import Foundation
/// Errors thrown by `KanbanService`. Each case carries enough detail
/// to render a user-actionable message; VMs surface these inline in
/// the board's error banner rather than blocking with alerts, since
/// kanban interactions are high-frequency.
public enum KanbanError: Error, LocalizedError, Sendable {
/// `hermes` binary couldn't be located (local) or the remote
/// `hermesBinaryHint` is unset (SSH).
case cliMissing
/// Subprocess returned non-zero exit. `stderr` may be empty if the
/// transport itself failed; carries a synthetic message in that case.
case nonZeroExit(code: Int32, stderr: String)
/// JSON decoding failed. The underlying error is flattened into
/// `message` for diagnostics and appended to the user-facing text.
case decoding(message: String)
/// `hermes kanban list --json` printed the literal string
/// "no matching tasks" instead of `[]`. Treated as a successful
/// empty result by callers but exposed here so VMs can distinguish
/// it from "transport error" if they want to.
case noMatchingTasks
/// Verb is not supported by this Hermes version (gated upstream
/// by `HermesCapabilities.hasKanban` + reasoned-about feature
/// drift). Carries the verb name + a hint.
case notSupported(verb: String, reason: String)
/// Disallowed transition the UI tried to perform (e.g. dragging a
/// `done` card back to `todo`). Caller surfaces a tooltip; this is
/// thrown only when a programmatic transition is requested instead
/// of being filtered out at the drag-target gate.
case forbiddenTransition(from: String, to: String, reason: String)
public var errorDescription: String? {
switch self {
case .cliMissing:
return "Hermes CLI couldn't be found. Install Hermes v0.12+ and ensure it's on your PATH."
case .nonZeroExit(let code, let stderr):
let trimmed = stderr.trimmingCharacters(in: .whitespacesAndNewlines)
if trimmed.isEmpty {
return "Hermes exited with code \(code)."
}
return trimmed
case .decoding(let message):
return "Couldn't decode Hermes output: \(message)"
case .noMatchingTasks:
return "No matching tasks."
case .notSupported(let verb, let reason):
return "`hermes kanban \(verb)` isn't available: \(reason)"
case .forbiddenTransition(let from, let to, let reason):
return "Can't move a \(from) task to \(to): \(reason)"
}
}
}
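The `noMatchingTasks` case above describes the CLI printing a literal string where a JSON array was expected. A minimal sketch of how a caller might normalize that output follows. `TaskStub`, `ListOutcome`, and `parseListOutput` are hypothetical names that do not appear in this diff, and the case-insensitive match on the literal is an assumption.

```swift
import Foundation

// Hypothetical task shape, for illustration only; the real model is not shown here.
struct TaskStub: Decodable, Equatable {
    let id: String
}

// Hypothetical outcome type that keeps "CLI said no matches" distinct from "[]".
enum ListOutcome: Equatable {
    case tasks([TaskStub])
    case noMatchingTasks
}

/// Parses stdout from a list call. Assumes the literal is matched
/// case-insensitively after trimming; the real service may differ.
func parseListOutput(_ stdout: String) throws -> ListOutcome {
    let trimmed = stdout.trimmingCharacters(in: .whitespacesAndNewlines)
    if trimmed.lowercased() == "no matching tasks" {
        return .noMatchingTasks
    }
    // Anything else must be a JSON array; decoding errors propagate to the caller.
    let decoded = try JSONDecoder().decode([TaskStub].self, from: Data(trimmed.utf8))
    return .tasks(decoded)
}
```

Keeping the literal as its own outcome lets a view model treat it as an empty board while still telling it apart from a transport failure.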
@@ -0,0 +1,146 @@
import Foundation
/// Filter options for `hermes kanban list --json`. Empty filter (default)
/// returns all non-archived tasks across all tenants.
public struct KanbanListFilter: Sendable, Equatable {
public var status: KanbanStatus?
public var assignee: String?
/// `nil` = all tenants. Empty string = "untagged" (NULL tenant);
/// Hermes treats `--tenant ""` as "no tenant".
public var tenant: String?
public var includeArchived: Bool
/// Show only my profile's tasks (`--mine`).
public var mineOnly: Bool
public init(
status: KanbanStatus? = nil,
assignee: String? = nil,
tenant: String? = nil,
includeArchived: Bool = false,
mineOnly: Bool = false
) {
self.status = status
self.assignee = assignee
self.tenant = tenant
self.includeArchived = includeArchived
self.mineOnly = mineOnly
}
public static let all = KanbanListFilter()
/// Build the argv suffix after `["kanban", "list"]`.
public func argv() -> [String] {
var args: [String] = ["--json"]
if mineOnly {
args.append("--mine")
}
if let status, status != .unknown {
args.append(contentsOf: ["--status", status.rawValue])
}
if let assignee, !assignee.isEmpty {
args.append(contentsOf: ["--assignee", assignee])
}
if let tenant {
args.append(contentsOf: ["--tenant", tenant])
}
if includeArchived {
args.append("--archived")
}
return args
}
}
/// Filter options for `hermes kanban watch --json` (live event stream).
public struct KanbanWatchFilter: Sendable, Equatable {
public var assignee: String?
public var tenant: String?
public var kinds: [KanbanEventKind]
public var intervalSeconds: Double
public init(
assignee: String? = nil,
tenant: String? = nil,
kinds: [KanbanEventKind] = [],
intervalSeconds: Double = 0.5
) {
self.assignee = assignee
self.tenant = tenant
self.kinds = kinds
self.intervalSeconds = intervalSeconds
}
public static let all = KanbanWatchFilter()
public func argv() -> [String] {
var args: [String] = []
if let assignee, !assignee.isEmpty {
args.append(contentsOf: ["--assignee", assignee])
}
if let tenant, !tenant.isEmpty {
args.append(contentsOf: ["--tenant", tenant])
}
if !kinds.isEmpty {
let joined = kinds.map(\.rawValue).joined(separator: ",")
args.append(contentsOf: ["--kinds", joined])
}
if intervalSeconds > 0 && intervalSeconds != 0.5 {
args.append(contentsOf: ["--interval", String(format: "%.2f", intervalSeconds)])
}
return args
}
}
/// Summary of one `hermes kanban dispatch` pass. Used by the optional
/// "Dispatch now" button to show what happened.
public struct KanbanDispatchSummary: Sendable, Equatable, Codable {
public let promoted: Int
public let failed: Int
public let dryRun: Bool
public let perTask: [DispatchedTask]
public init(
promoted: Int = 0,
failed: Int = 0,
dryRun: Bool = false,
perTask: [DispatchedTask] = []
) {
self.promoted = promoted
self.failed = failed
self.dryRun = dryRun
self.perTask = perTask
}
public struct DispatchedTask: Sendable, Equatable, Codable, Identifiable {
public var id: String { taskId }
public let taskId: String
public let decision: String // "promoted" | "skipped" | "failed"
public let reason: String?
public init(taskId: String, decision: String, reason: String? = nil) {
self.taskId = taskId
self.decision = decision
self.reason = reason
}
enum CodingKeys: String, CodingKey {
case taskId = "task_id"
case decision
case reason
}
}
enum CodingKeys: String, CodingKey {
case promoted
case failed
case dryRun = "dry_run"
case perTask = "per_task"
}
public init(from decoder: any Decoder) throws {
let c = try decoder.container(keyedBy: CodingKeys.self)
self.promoted = try c.decodeIfPresent(Int.self, forKey: .promoted) ?? 0
self.failed = try c.decodeIfPresent(Int.self, forKey: .failed) ?? 0
self.dryRun = try c.decodeIfPresent(Bool.self, forKey: .dryRun) ?? false
self.perTask = try c.decodeIfPresent([DispatchedTask].self, forKey: .perTask) ?? []
}
}
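The custom `init(from:)` above falls back to defaults when keys are missing. A trimmed sketch of that pattern is shown here, under the assumption that a partial payload such as `{"promoted": 2}` can occur. `DispatchSummarySketch` and `decodeSummary` are hypothetical names, and `perTask` is omitted for brevity.

```swift
import Foundation

// Trimmed stand-in for the summary type; not the struct from the diff.
struct DispatchSummarySketch: Decodable {
    let promoted: Int
    let failed: Int
    let dryRun: Bool

    enum CodingKeys: String, CodingKey {
        case promoted
        case failed
        case dryRun = "dry_run"
    }

    init(from decoder: any Decoder) throws {
        let c = try decoder.container(keyedBy: CodingKeys.self)
        // decodeIfPresent returns nil for an absent key, so each field
        // falls back to its default instead of throwing keyNotFound.
        promoted = try c.decodeIfPresent(Int.self, forKey: .promoted) ?? 0
        failed = try c.decodeIfPresent(Int.self, forKey: .failed) ?? 0
        dryRun = try c.decodeIfPresent(Bool.self, forKey: .dryRun) ?? false
    }
}

// Hypothetical convenience wrapper for the example.
func decodeSummary(_ json: String) throws -> DispatchSummarySketch {
    try JSONDecoder().decode(DispatchSummarySketch.self, from: Data(json.utf8))
}
```

A key that is present with the wrong type still throws, which is usually what you want: only absence is treated as "use the default".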
@@ -39,6 +39,13 @@ public struct ProjectEntry: Codable, Sendable, Identifiable, Hashable {
public var dashboardPath: String { path + "/.scarf/dashboard.json" }
/// Directory holding the project's Scarf-managed sidecar files
/// (dashboard.json, manifest.json, template.lock.json, config.json,
/// plus any cron-job-written reports the dashboard widgets reference).
/// Watched as a unit by `HermesFileWatcher` so any file added /
/// removed / renamed inside refreshes the dashboard automatically.
public var scarfDir: String { path + "/.scarf" }
// MARK: - Codable (custom for backward compat)
private enum CodingKeys: String, CodingKey {
@@ -152,29 +159,54 @@ public struct DashboardWidget: Codable, Sendable, Identifiable {
// List
public let items: [ListItem]?
// Webview
// Webview / Image (image reuses `url` for remote, `path` for local)
public let url: String?
public let height: Double?
// v2.7 file-reading widgets (markdown_file, log_tail, image-local).
// `path` is resolved relative to the project root (the directory that
// contains `.scarf/`). Renderers must reject `..` segments after
// normalization to prevent escape from the project boundary.
public let path: String?
public let lines: Int?
// v2.7 cron_status widget; `jobId` matches HermesCronJob.id.
public let jobId: String?
// v2.7 status_grid widget; `cells` carries label + status per square,
// `gridColumns` overrides the auto-fit column count (keep distinct
// from `columns` which is the table-widget header list).
public let cells: [StatusGridCell]?
public let gridColumns: Int?
// v2.7 optional sparkline trend on `stat` widgets.
public let sparkline: [Double]?
public init(
type: String,
title: String,
value: WidgetValue?,
icon: String?,
color: String?,
subtitle: String?,
label: String?,
content: String?,
format: String?,
columns: [String]?,
rows: [[String]]?,
chartType: String?,
xLabel: String?,
yLabel: String?,
series: [ChartSeries]?,
items: [ListItem]?,
url: String?,
height: Double?
value: WidgetValue? = nil,
icon: String? = nil,
color: String? = nil,
subtitle: String? = nil,
label: String? = nil,
content: String? = nil,
format: String? = nil,
columns: [String]? = nil,
rows: [[String]]? = nil,
chartType: String? = nil,
xLabel: String? = nil,
yLabel: String? = nil,
series: [ChartSeries]? = nil,
items: [ListItem]? = nil,
url: String? = nil,
height: Double? = nil,
path: String? = nil,
lines: Int? = nil,
jobId: String? = nil,
cells: [StatusGridCell]? = nil,
gridColumns: Int? = nil,
sparkline: [Double]? = nil
) {
self.type = type
self.title = title
@@ -194,6 +226,29 @@ public struct DashboardWidget: Codable, Sendable, Identifiable {
self.items = items
self.url = url
self.height = height
self.path = path
self.lines = lines
self.jobId = jobId
self.cells = cells
self.gridColumns = gridColumns
self.sparkline = sparkline
}
}
// MARK: - Status Grid Data (v2.7)
/// One cell of a `status_grid` widget. Status semantics match `ListItem.status`:
/// parsed via `ListItemStatus(raw:)` so the same vocabulary + synonyms apply.
public struct StatusGridCell: Codable, Sendable, Identifiable, Hashable {
public var id: String { label }
public let label: String
public let status: String?
public let tooltip: String?
public init(label: String, status: String? = nil, tooltip: String? = nil) {
self.label = label
self.status = status
self.tooltip = tooltip
}
}
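The comment on `path` in the hunk above asks renderers to reject `..` segments after normalization. One way that check could look is sketched below as a pure string operation. `normalizedWidgetPath` is a hypothetical helper and not part of this diff. It does not resolve symlinks, so a real renderer would likely need an additional check against the resolved path on disk.

```swift
/// Hypothetical helper: normalizes a project-relative widget path and
/// returns its components, or nil when the path is absolute, empty,
/// or would climb above the project root.
func normalizedWidgetPath(_ relative: String) -> [String]? {
    // Absolute paths never qualify as project-relative.
    guard !relative.hasPrefix("/") else { return nil }
    var stack: [String] = []
    for part in relative.split(separator: "/") {
        switch part {
        case ".":
            continue                       // no-op segment
        case "..":
            // Popping past the root means the path escapes the project.
            guard !stack.isEmpty else { return nil }
            stack.removeLast()
        default:
            stack.append(String(part))
        }
    }
    return stack.isEmpty ? nil : stack
}
```

With this shape, `reports/../logs/app.log` normalizes to `logs/app.log` and is accepted, while `../secret` and `a/../../b` are rejected.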
@@ -284,3 +339,47 @@ public struct ListItem: Codable, Sendable, Identifiable {
self.status = status
}
}
/// Typed semantic status for `ListItem` (and `status_grid` cells in v2.7+).
///
/// Wire format stays a free `String?` on `ListItem` for backwards compatibility,
/// so pre-existing dashboards never break. Renderers call `ListItemStatus(raw:)`
/// to map known values + synonyms to a canonical case; unknown values return
/// `nil` and render as plain neutral text.
public enum ListItemStatus: String, Sendable, Hashable, CaseIterable {
case success
case warning
case danger
case info
case pending
case done
case neutral
/// Lenient parse: accepts canonical names plus common synonyms seen in
/// real-world dashboards (`ok`/`up` → success, `down`/`error`/`failed` →
/// danger, `active` → info). Returns `nil` for unrecognized strings so
/// the renderer can fall back to plain text.
public init?(raw: String?) {
guard let raw = raw?.trimmingCharacters(in: .whitespaces).lowercased(), !raw.isEmpty else {
return nil
}
switch raw {
case "success", "ok", "up", "green", "passing":
self = .success
case "warning", "warn", "yellow", "degraded":
self = .warning
case "danger", "down", "error", "failed", "failure", "red", "critical":
self = .danger
case "info", "active", "blue":
self = .info
case "pending", "queued", "waiting", "scheduled":
self = .pending
case "done", "complete", "completed", "finished":
self = .done
case "neutral", "muted", "gray":
self = .neutral
default:
return nil
}
}
}
@@ -25,6 +25,10 @@ public struct SSHConfig: Sendable, Hashable, Codable {
/// `HermesPathSet.defaultRemoteHome` (`~/.hermes`, shell-expanded on the
/// remote side).
public var remoteHome: String?
/// Override for where Scarf installs new project templates on this host.
/// `nil` uses `~/projects` (unexpanded; the remote shell resolves it).
/// Created on first install if missing.
public var projectsRoot: String?
/// Resolved remote path to the `hermes` binary. Populated by
/// `SSHTransport` after the first `command -v hermes` probe; cached here
/// so subsequent calls skip the round trip.
@@ -36,6 +40,7 @@ public struct SSHConfig: Sendable, Hashable, Codable {
port: Int? = nil,
identityFile: String? = nil,
remoteHome: String? = nil,
projectsRoot: String? = nil,
hermesBinaryHint: String? = nil
) {
self.host = host
@@ -43,6 +48,7 @@ public struct SSHConfig: Sendable, Hashable, Codable {
self.port = port
self.identityFile = identityFile
self.remoteHome = remoteHome
self.projectsRoot = projectsRoot
self.hermesBinaryHint = hermesBinaryHint
}
}
@@ -106,6 +112,27 @@ public struct ServerContext: Sendable, Hashable, Identifiable {
return false
}
/// Default parent directory under which `ProjectTemplateInstaller` lays
/// out new projects. Per-host configurable on `.ssh` via
/// `SSHConfig.projectsRoot`; local always resolves to `~/Projects` on the
/// user's Mac. The remote default is left as an unexpanded `~/projects`;
/// the remote shell resolves the tilde, same convention as
/// `HermesPathSet.defaultRemoteHome`. The installer calls
/// `transport.createDirectory(_:)` at install time so a missing dir on a
/// fresh host is bootstrapped on first use rather than treated as an error.
public nonisolated var defaultProjectsRoot: String {
switch kind {
case .local:
return NSHomeDirectory() + "/Projects"
case .ssh(let config):
if let configured = config.projectsRoot,
!configured.trimmingCharacters(in: .whitespaces).isEmpty {
return configured
}
return "~/projects"
}
}
/// Construct the `ServerTransport` for this context. Local contexts get
/// a `LocalTransport`; SSH contexts get an `SSHTransport` configured
/// from `SSHConfig` by default, OR whatever `sshTransportFactory`
@@ -122,7 +122,8 @@ public extension HermesConfig {
skillsHub: aux("skills_hub"),
approval: aux("approval"),
mcp: aux("mcp"),
flushMemories: aux("flush_memories")
flushMemories: aux("flush_memories"),
curator: aux("curator")
)
let security = SecuritySettings(
@@ -224,6 +225,58 @@ public extension HermesConfig {
cooldownSeconds: int("platforms.homeassistant.extra.cooldown_seconds", default: 30)
)
// -- v0.13: per-platform Messaging Gateway settings --------------
// Read `gateway.platforms.<platform>.{allowed_channels|allowed_chats|
// allowed_rooms|busy_ack_enabled|gateway_restart_notification|
// slash_command_notice_ttl_seconds}` and bundle each platform that
// has at least one v0.13 key present in the file. Platforms without
// an explicit block don't appear in the dictionary, so the
// editor's `?? .empty` fallback hands the user the v0.13 defaults
// without leaving stale keys littered across the YAML.
//
// TODO(WS-5-Q2): the `gateway.platforms.*` path is unverified;
// Hermes v0.13 may emit allowlists under `platforms.<platform>.*`
// (sibling to existing `platforms.slack.reply_to_mode`) instead.
// If so, swap the `prefix` line below to `"platforms.\(platform)."`
// and update `GatewayConfigWriter` in lockstep.
let gatewayAllowlistPlatforms = [
"slack", "mattermost", "google-chat",
"telegram", "whatsapp",
"matrix", "dingtalk",
]
var gatewayPlatforms: [String: GatewayPlatformSettings] = [:]
for platform in gatewayAllowlistPlatforms {
let prefix = "gateway.platforms.\(platform)."
let allowedChannels = lists[prefix + "allowed_channels"] ?? []
let allowedChats = lists[prefix + "allowed_chats"] ?? []
let allowedRooms = lists[prefix + "allowed_rooms"] ?? []
let busy = bool(prefix + "busy_ack_enabled", default: true)
let restartNotice = bool(prefix + "gateway_restart_notification",
default: false)
let ttl = int(prefix + "slash_command_notice_ttl_seconds",
default: 0)
// Skip platforms with no v0.13 fields present anywhere in the
// file. Without this guard, every supported platform would
// round-trip an all-default block back through writes even
// when the user never touched the new surface.
let isEmpty = allowedChannels.isEmpty
&& allowedChats.isEmpty
&& allowedRooms.isEmpty
&& values[prefix + "busy_ack_enabled"] == nil
&& values[prefix + "gateway_restart_notification"] == nil
&& values[prefix + "slash_command_notice_ttl_seconds"] == nil
if !isEmpty {
gatewayPlatforms[platform] = GatewayPlatformSettings(
allowedChannels: allowedChannels,
allowedChats: allowedChats,
allowedRooms: allowedRooms,
busyAckEnabled: busy,
gatewayRestartNotification: restartNotice,
slashCommandNoticeTTLSeconds: ttl
)
}
}
self.init(
model: str("model.default", default: "unknown"),
provider: str("model.provider", default: "unknown"),
@@ -280,7 +333,29 @@ public extension HermesConfig {
matrix: matrix,
mattermost: mattermost,
whatsapp: whatsapp,
homeAssistant: homeAssistant
homeAssistant: homeAssistant,
cacheTTL: str("prompt_caching.cache_ttl", default: "5m"),
redactionEnabled: bool("redaction.enabled", default: false),
runtimeMetadataFooter: bool("agent.runtime_metadata_footer", default: false),
gatewayPlatforms: gatewayPlatforms,
// -- v0.13 additions -------------------------------------
// TODO(WS-6-Q1): the `openrouter.response_cache.enabled`
// key shape is provisional pending verification against a
// v0.13 `hermes config check`. If upstream uses a different
// path (e.g. `providers.openrouter.response_cache_enabled`
// or nested under `prompt_caching`), update this single
// line + the matching `setSetting` key in
// `SettingsViewModel.setOpenRouterResponseCache`. Default
// is `false` per WS-6-plan §Open Questions #2.
imageGenModel: str("image_gen.model", default: ""),
openrouterResponseCacheEnabled: bool("openrouter.response_cache.enabled", default: false),
// Pre-v0.13 hosts wrote a single `web_tools.backend`. v0.13 split
// it into per-capability keys. Read all three so the round-trip
// never loses a value the user already set; the WebTools tab
// chooses which to render based on `hasWebToolsBackendSplit`.
webToolsBackend: str("web_tools.backend", default: "duckduckgo"),
webToolsSearchBackend: str("web_tools.search.backend", default: "duckduckgo"),
webToolsExtractBackend: str("web_tools.extract.backend", default: "reader")
)
}
}
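The gateway loop in the hunk above only records a platform when at least one v0.13 key is actually present, so untouched platforms never round-trip an all-default block. A reduced sketch of that guard follows. It assumes the parsed YAML is available as flat `[String: String]` and `[String: [String]]` dictionaries keyed by dotted path, which is inferred from the `values[...]` and `lists[...]` lookups and not confirmed by the diff. `PlatformSettingsSketch` and `presentGatewayPlatforms` are hypothetical names, and only two of the six fields are modelled.

```swift
// Reduced stand-in for the per-platform settings; not the real type.
struct PlatformSettingsSketch: Equatable {
    var allowedChannels: [String]
    var busyAckEnabled: Bool
}

/// Returns settings only for platforms that have at least one key present.
func presentGatewayPlatforms(
    values: [String: String],
    lists: [String: [String]],
    platforms: [String]
) -> [String: PlatformSettingsSketch] {
    var out: [String: PlatformSettingsSketch] = [:]
    for platform in platforms {
        let prefix = "gateway.platforms.\(platform)."
        let channels = lists[prefix + "allowed_channels"] ?? []
        let busyRaw = values[prefix + "busy_ack_enabled"]
        // Presence is judged on the raw key, not the defaulted value:
        // an explicit `busy_ack_enabled: true` still counts as "touched".
        if channels.isEmpty && busyRaw == nil { continue }
        out[platform] = PlatformSettingsSketch(
            allowedChannels: channels,
            busyAckEnabled: busyRaw.map { $0 == "true" } ?? true
        )
    }
    return out
}
```

Checking the raw key is what separates "user explicitly set the default" from "user never touched this platform", which the defaulted boolean alone cannot express.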
@@ -0,0 +1,113 @@
import Foundation
/// Pluggable query engine for `HermesDataService`. Two implementations
/// today:
///
/// * `LocalSQLiteBackend`: opens the local `~/.hermes/state.db` via
/// libsqlite3 and runs queries in-process. Microseconds per query.
/// * `RemoteSQLiteBackend`: invokes `sqlite3 -readonly -json` over an
/// SSH session (ControlMaster keeps the channel warm), parses the
/// JSON response into `Row`s. ~50-100 ms per query.
///
/// The data service picks one based on `ServerContext.isRemote`. View
/// models are oblivious: they keep calling `await dataService.fetch`
/// like before.
///
/// **Why a protocol, not a class hierarchy.** Backends have very
/// different internals (libsqlite3 handles vs. SSH script piping) but
/// the call-site shape is identical. A protocol lets us hand the data
/// service either backend through one stored property without
/// abstract-class ceremony, and keeps the test mock (see
/// `MockHermesQueryBackend` in tests) free of inheritance baggage.
///
/// **Sendable.** Concrete impls are actors, so they're trivially
/// `Sendable`. The protocol conforms to `Sendable` to satisfy Swift 6
/// strict-concurrency for the data-service stored property.
public protocol HermesQueryBackend: Sendable {
/// True iff the connected DB has the v0.7 columns (`reasoning_tokens`,
/// `actual_cost_usd`, `cost_status`, `billing_provider` on
/// `sessions` plus `reasoning` on `messages`). Detected once at
/// `open()` time.
var hasV07Schema: Bool { get async }
/// True iff the connected DB has the v0.11 columns
/// (`api_call_count` on `sessions`, `reasoning_content` on
/// `messages`). Belt-and-braces: BOTH must be present (a
/// partially-migrated DB stays on the v0.7 path to avoid "no such
/// column" failures).
var hasV011Schema: Bool { get async }
/// User-presentable error from the most recent `open()` (or the
/// most recent failed query for the remote backend's
/// connectivity-loss codepath). `nil` means everything is healthy.
var lastOpenError: String? { get async }
/// One-time setup. Local: `sqlite3_open_v2` + `PRAGMA table_info`
/// schema detection. Remote: one SSH round-trip running
/// `sqlite3 --version` plus the two PRAGMA queries.
///
/// Returns `false` on any failure; detail is in `lastOpenError`.
/// Calling `open()` on an already-open backend is a no-op that
/// returns `true`.
func open() async -> Bool
/// Local backend: `close()` then `open(forceFresh:)` re-pulls
/// the SQLite handle so a Hermes-side migration becomes visible.
/// Remote backend: a no-op when `forceFresh: false` (every query
/// is already fresh, so there's nothing to refresh). `forceFresh:
/// true` re-runs the schema preflight, covering the rare "user
/// upgraded Hermes on the remote, my schema flags are stale" case.
@discardableResult
func refresh(forceFresh: Bool) async -> Bool
/// Drop any persistent resources. Idempotent.
func close() async
/// Run a single SQL statement and collect every row before
/// returning. SQL uses `?` placeholders; `params` is bound
/// positionally (one entry per `?`).
///
/// Local backend: `sqlite3_prepare_v2` + `sqlite3_bind_*` +
/// `sqlite3_step` loop, materialising each row into a `Row`.
/// Remote backend: inlines params via `SQLValueInliner` to produce
/// a final SQL string, runs `sqlite3 -readonly -json` over SSH,
/// parses the resulting JSON array.
///
/// Throws `BackendError` on any failure. The data-service façade
/// generally catches and returns empty results to preserve the
/// existing "show empty UI on error" behaviour.
func query(_ sql: String, params: [SQLValue]) async throws -> [Row]
/// Run several statements in one round-trip, returning each
/// statement's row set in order. Lets multi-query view loads
/// (Dashboard's 4-query pattern, Insights' 5-query pattern)
/// amortise the SSH/sqlite3 cold-start cost.
///
/// Each `(sql, params)` pair has the same shape as `query`:
/// `?` placeholders bound positionally per pair.
func queryBatch(_ statements: [(sql: String, params: [SQLValue])]) async throws -> [[Row]]
}
/// Errors that backends raise. Mapped into user-facing messages by the
/// `humanize` helper that lives alongside `HermesDataService`.
public enum BackendError: Error, Sendable, Equatable {
/// Backend is not open; caller should `open()` first.
case notOpen
/// Connectivity failure (SSH down, ControlMaster dead, transport
/// can't reach the host). Carries a short human-readable reason.
/// Triggers the data-service's `lastOpenError` populate path.
case transport(String)
/// sqlite3 itself reported an error: non-zero exit, parse failure,
/// schema mismatch. `exitCode` is the sqlite3 process exit (or
/// libsqlite3 result code on the local backend); `stderr` is the
/// sqlite3-emitted message (already user-readable in most cases).
case sqlite(exitCode: Int32, stderr: String)
/// JSON-parsing failed on remote-backend output. Indicates either a
/// sqlite3 binary that didn't honour `-json`, or output corruption
/// (rare). Carries the first 200 bytes of stdout for diagnostics.
case parseFailure(stdoutHead: String)
}
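The `query` documentation above says the remote backend inlines positional params into a final SQL string before handing it to `sqlite3`. The diff does not show `SQLValueInliner`, so the following is only a simplified sketch of the general idea. `SQLValueSketch`, `sqlLiteral`, and `inlineParams` are hypothetical names, and the scan deliberately ignores `?` characters that appear inside string literals in the SQL template, which a production inliner would have to handle.

```swift
// Reduced stand-in for the value enum; blob is omitted for brevity.
enum SQLValueSketch {
    case null
    case integer(Int64)
    case real(Double)
    case text(String)
}

/// Renders one value as a SQL literal. Text is single-quoted with
/// embedded quotes doubled, the standard SQL escaping rule.
func sqlLiteral(_ value: SQLValueSketch) -> String {
    switch value {
    case .null:
        return "NULL"
    case .integer(let n):
        return String(n)
    case .real(let d):
        return String(d)
    case .text(let s):
        return "'" + s.replacingOccurrences(of: "'", with: "''") + "'"
    }
}

/// Replaces each `?` with the next param. Returns nil when the number
/// of placeholders and params disagree.
func inlineParams(_ sql: String, params: [SQLValueSketch]) -> String? {
    var remaining = params[...]
    var out = ""
    for ch in sql {
        if ch == "?" {
            guard let next = remaining.popFirst() else { return nil }
            out += sqlLiteral(next)
        } else {
            out.append(ch)
        }
    }
    return remaining.isEmpty ? out : nil
}
```

Returning `nil` on a count mismatch mirrors the "one entry per `?`" contract stated in the protocol comment; a caller could map that to an error before any SSH round-trip is spent.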
@@ -0,0 +1,254 @@
// MARK: - Platform gate
//
// libsqlite3 is a system module on macOS/iOS but not on swift-corelibs
// foundation. Gate the entire backend so ScarfCore still compiles for
// any future Linux target. Apple platforms (the runtime targets) get
// the full implementation.
#if canImport(SQLite3)
import Foundation
import SQLite3
#if canImport(os)
import os
#endif
/// `HermesQueryBackend` that opens a local SQLite file via libsqlite3
/// and runs queries in-process. Microseconds per query.
///
/// Used for `ServerContext.local` (the user's own `~/.hermes/state.db`);
/// the previous behaviour of `HermesDataService` lifted out unchanged.
/// For `.ssh` contexts the data service constructs `RemoteSQLiteBackend`
/// instead.
///
/// Actor isolation matches the parent `HermesDataService` actor: queries
/// serialise on this backend's executor, and the data service hops once
/// (`await backend.query`) per public method call.
public actor LocalSQLiteBackend: HermesQueryBackend {
#if canImport(os)
private static let logger = Logger(subsystem: "com.scarf", category: "LocalSQLiteBackend")
#endif
private var db: OpaquePointer?
private var openedAtPath: String?
private(set) public var hasV07Schema = false
private(set) public var hasV011Schema = false
private(set) public var lastOpenError: String?
private let context: ServerContext
public init(context: ServerContext) {
self.context = context
}
// MARK: - Lifecycle
public func open() async -> Bool {
if db != nil { return true }
let path = context.paths.stateDB
guard FileManager.default.fileExists(atPath: path) else {
lastOpenError = "Hermes state database not found at \(path)."
return false
}
let flags: Int32 = SQLITE_OPEN_READONLY | SQLITE_OPEN_NOMUTEX
let rc = sqlite3_open_v2(path, &db, flags, nil)
guard rc == SQLITE_OK else {
let msg: String
if let db {
msg = String(cString: sqlite3_errmsg(db))
} else {
msg = "sqlite3_open_v2 returned \(rc)"
}
lastOpenError = "Couldn't open state.db: \(msg)"
#if canImport(os)
Self.logger.warning("sqlite3_open_v2 failed (\(rc)) at \(path, privacy: .public): \(msg, privacy: .public)")
#endif
db = nil
return false
}
openedAtPath = path
lastOpenError = nil
detectSchema()
return true
}
@discardableResult
public func refresh(forceFresh: Bool) async -> Bool {
// Local: always close-and-reopen. The file may have been swapped
// by Hermes (rare) or we want to pick up a schema migration.
// `forceFresh` is irrelevant locally; included for protocol
// parity with the remote backend.
await close()
return await open()
}
public func close() async {
if let db {
sqlite3_close(db)
}
db = nil
openedAtPath = nil
}
// MARK: - Schema detection
private func detectSchema() {
guard let db else { return }
// sessions schema
var stmt: OpaquePointer?
if sqlite3_prepare_v2(db, "PRAGMA table_info(sessions)", -1, &stmt, nil) == SQLITE_OK {
defer { sqlite3_finalize(stmt) }
while sqlite3_step(stmt) == SQLITE_ROW {
if let name = sqlite3_column_text(stmt, 1) {
let column = String(cString: name)
if column == "reasoning_tokens" {
hasV07Schema = true
}
if column == "api_call_count" {
hasV011Schema = true
}
}
}
}
// messages schema: confirm `reasoning_content` is present too.
// Belt-and-braces: a partially-migrated DB (sessions migrated,
// messages not) shouldn't blow up reads with "no such column".
if hasV011Schema {
var msgStmt: OpaquePointer?
var sawReasoningContent = false
if sqlite3_prepare_v2(db, "PRAGMA table_info(messages)", -1, &msgStmt, nil) == SQLITE_OK {
defer { sqlite3_finalize(msgStmt) }
while sqlite3_step(msgStmt) == SQLITE_ROW {
if let name = sqlite3_column_text(msgStmt, 1),
String(cString: name) == "reasoning_content" {
sawReasoningContent = true
break
}
}
}
if !sawReasoningContent {
hasV011Schema = false
}
}
}
// MARK: - Queries
public func query(_ sql: String, params: [SQLValue]) async throws -> [Row] {
guard let db else { throw BackendError.notOpen }
return try executeOne(db: db, sql: sql, params: params)
}
public func queryBatch(_ statements: [(sql: String, params: [SQLValue])]) async throws -> [[Row]] {
guard let db else { throw BackendError.notOpen }
// Local backend has no SSH/process round-trip cost, so running
// sequentially against the open handle is exactly equivalent
// to running each via `query`. The protocol method exists for
// remote-backend amortisation; locally we just satisfy the
// signature.
var out: [[Row]] = []
out.reserveCapacity(statements.count)
for (sql, params) in statements {
out.append(try executeOne(db: db, sql: sql, params: params))
}
return out
}
// MARK: - Internals
private func executeOne(db: OpaquePointer, sql: String, params: [SQLValue]) throws -> [Row] {
var stmt: OpaquePointer?
let prepRC = sqlite3_prepare_v2(db, sql, -1, &stmt, nil)
guard prepRC == SQLITE_OK, let stmt else {
let msg = String(cString: sqlite3_errmsg(db))
throw BackendError.sqlite(exitCode: prepRC, stderr: msg)
}
defer { sqlite3_finalize(stmt) }
for (i, value) in params.enumerated() {
let col = Int32(i + 1)
let rc: Int32
switch value {
case .null:
rc = sqlite3_bind_null(stmt, col)
case .integer(let n):
rc = sqlite3_bind_int64(stmt, col, n)
case .real(let d):
rc = sqlite3_bind_double(stmt, col, d)
case .text(let s):
rc = sqlite3_bind_text(stmt, col, s, -1, sqliteTransient)
case .blob(let d):
rc = d.withUnsafeBytes { buf -> Int32 in
guard let base = buf.baseAddress else {
return sqlite3_bind_zeroblob(stmt, col, 0)
}
return sqlite3_bind_blob(stmt, col, base, Int32(buf.count), sqliteTransient)
}
}
if rc != SQLITE_OK {
let msg = String(cString: sqlite3_errmsg(db))
throw BackendError.sqlite(exitCode: rc, stderr: msg)
}
}
// Build the column-name index map once per result set, up front
// (sqlite3_column_name only needs the prepared stmt, so it is cheap
// either way). For a 0-row result set we still build it so
// callers that read column names from the first hypothetical
// row don't error, though `Row.columnIndex` on an empty
// `[Row]` is moot.
let columnCount = Int(sqlite3_column_count(stmt))
var columnIndex: [String: Int] = [:]
columnIndex.reserveCapacity(columnCount)
for i in 0..<columnCount {
if let cstr = sqlite3_column_name(stmt, Int32(i)) {
columnIndex[String(cString: cstr)] = i
}
}
var rows: [Row] = []
while true {
let stepRC = sqlite3_step(stmt)
if stepRC == SQLITE_DONE { break }
if stepRC != SQLITE_ROW {
let msg = String(cString: sqlite3_errmsg(db))
throw BackendError.sqlite(exitCode: stepRC, stderr: msg)
}
var values: [SQLValue] = []
values.reserveCapacity(columnCount)
for i in 0..<columnCount {
let col = Int32(i)
let type = sqlite3_column_type(stmt, col)
switch type {
case SQLITE_NULL:
values.append(.null)
case SQLITE_INTEGER:
values.append(.integer(sqlite3_column_int64(stmt, col)))
case SQLITE_FLOAT:
values.append(.real(sqlite3_column_double(stmt, col)))
case SQLITE_TEXT:
if let cstr = sqlite3_column_text(stmt, col) {
values.append(.text(String(cString: cstr)))
} else {
values.append(.text(""))
}
case SQLITE_BLOB:
let n = Int(sqlite3_column_bytes(stmt, col))
if n > 0, let p = sqlite3_column_blob(stmt, col) {
values.append(.blob(Data(bytes: p, count: n)))
} else {
values.append(.blob(Data()))
}
default:
values.append(.null)
}
}
rows.append(Row(values: values, columnIndex: columnIndex))
}
return rows
}
}
#endif // canImport(SQLite3)
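The `detectSchema()` logic above reduces to two set-membership rules: v0.7 is keyed on one `sessions` column, and v0.11 requires a column on both tables. A pure-function sketch of those rules, which could be exercised in a unit test without a database handle, might look like this. `detectSchemaFlags` is a hypothetical name, and the column sets stand in for the `PRAGMA table_info` results.

```swift
/// Hypothetical pure version of the schema gate. Inputs are the column
/// names reported by `PRAGMA table_info` for each table.
func detectSchemaFlags(
    sessionColumns: Set<String>,
    messageColumns: Set<String>
) -> (hasV07: Bool, hasV011: Bool) {
    let hasV07 = sessionColumns.contains("reasoning_tokens")
    // Both tables must be migrated; a half-migrated DB stays on the
    // older path to avoid "no such column" failures on read.
    let hasV011 = sessionColumns.contains("api_call_count")
        && messageColumns.contains("reasoning_content")
    return (hasV07, hasV011)
}
```

Because the function recomputes both flags from scratch on every call, a reopen after a schema change cannot inherit a stale `true` from an earlier detection pass.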
@@ -0,0 +1,651 @@
#if canImport(SQLite3)
import Foundation
#if canImport(os)
import os
#endif
/// `HermesQueryBackend` that runs `sqlite3 -readonly -json` over an
/// SSH session per query. Replaces the old snapshot-then-open pipeline
/// (issue #74): no full-DB transfers, no local cache, every query
/// against the live remote DB.
///
/// **Why one round-trip per query is OK.** ControlMaster keeps the SSH
/// session warm: first connect spins up the master socket; subsequent
/// queries reuse it at ~5 ms overhead. sqlite3 cold-start is ~30-50 ms,
/// query execution is sub-millisecond for indexed queries, JSON
/// serialisation is small. End-to-end ~50-100 ms per query, dominated
/// by sqlite3 process spawn. Multi-query view loads (Dashboard,
/// Insights) batch via `queryBatch`: one cold-start, all statements
/// in a single sqlite3 invocation, ~80-100 ms total.
///
/// **Result format**. `sqlite3 -json` emits one JSON array per
/// statement that returns rows: `[{"col":val,...}, ...]`. Multi-statement
/// scripts emit each array on its own. We separate batched queries
/// with a `SELECT '__SCARF_RS_BEGIN__N' AS marker;` synthesised line so
/// the parser can split on the markers; sqlite3's marker rows
/// preserve order and let us pair each result-set with the originating
/// statement index.
public actor RemoteSQLiteBackend: HermesQueryBackend {
#if canImport(os)
private static let logger = Logger(subsystem: "com.scarf", category: "RemoteSQLiteBackend")
#endif
private let context: ServerContext
private let transport: any ServerTransport
private(set) public var hasV07Schema = false
private(set) public var hasV011Schema = false
private(set) public var lastOpenError: String?
private var isOpen = false
/// Captured `sqlite3 --version` line from the most recent preflight.
/// Stashed for diagnostic logs and a future "remote sqlite3 too old"
/// error path.
private var sqliteVersion: String?
/// Resolved absolute remote `$HOME`, populated on `open()` via
/// `context.resolvedUserHome()` so that `~/` paths can be expanded
/// in Swift up front rather than relying on shell expansion across
/// the streamScript pipeline. The base64 + pipe path through
/// Citadel does not reliably propagate `$HOME` into the inner
/// `/bin/sh` on every host; keeping this client-side avoids the
/// issue (and matches how `RemoteBackupService.expandTilde` already
/// handles the same problem). `nil` only when the probe failed,
/// in which case `quoteForRemoteShell` falls back to `"$HOME/..."`
/// shell expansion.
private var resolvedHome: String?
/// In-flight query coalescing: keyed on the inlined SQL text,
/// value is the Task currently fetching that exact result set.
/// When two concurrent callers ask for the same query (common
/// pattern: file watcher tick + chat-finalize debounce both
/// firing `loadRecentSessions` within ~100 ms), the second
/// caller awaits the first call's task instead of spawning a
/// fresh SSH subprocess. Cleared on task completion. Drops
/// duplicate `mac.loadRecentSessions` traces observed at
/// t=960450 / t=960584 in the perf capture (two parallel 3-s
/// loads for the same data, finishing 134 ms apart).
///
/// Coalescing is *only* applied to single `query` calls, not
/// `queryBatch`; batches are larger payloads with
/// caller-specific timeout scaling, and concurrent callers wanting
/// "the same batch" is rare in practice. Keep coalescing
/// surgical so we don't accidentally serialize independent
/// work that just happens to match.
private var inFlightQueries: [String: Task<[Row], Error>] = [:]
/// Per-query timeout for `query`. Healthy local queries are
/// <100 ms; remote ones over 420 ms-RTT SSH amortize one round
/// trip per call PLUS the wire payload time. A `fetchMessages`
/// over a 157-message session (~50KB JSON encoded) exceeded
/// the previous 15 s ceiling, silently returned 0 rows, and the
/// chat appeared empty, a worse failure than the wait it was
/// guarding against. Bumped to 30 s; the `streamScript`
/// transport-level timeout still fires on truly wedged hosts.
private let queryTimeout: TimeInterval = 30
/// Preflight timeout. First SSH round-trip may include cold
/// ControlMaster establishment (~13 s) plus the schema PRAGMA
/// queries; 30 s is generous.
private let preflightTimeout: TimeInterval = 30
/// Marker prefix used to split `queryBatch` result sets. Picked to
/// be very unlikely to collide with a real session_id, role string,
/// or content fragment.
private static let batchMarkerPrefix = "__SCARF_RS_BEGIN__"
public init(context: ServerContext, transport: any ServerTransport) {
self.context = context
self.transport = transport
}
// MARK: - Lifecycle
public func open() async -> Bool {
if isOpen { return true }
// Resolve remote $HOME once (cached process-wide via
// ServerContext.UserHomeCache so concurrent backends share
// the probe result). Lets us hand sqlite3 absolute paths and
// skip the unreliable nested-shell expansion altogether. A
// probe failure leaves `resolvedHome == nil` and falls back
// to "$HOME/..."-quoted args; the data-service open() will
// surface whatever sqlite3 errors out with.
let probedHome = await context.resolvedUserHome()
if probedHome != "~" && !probedHome.isEmpty {
resolvedHome = probedHome
}
let dbPath = context.paths.stateDB
// One SSH round-trip running:
// 1. sqlite3 --version (sanity + capture for diagnostics)
// 2. PRAGMA table_info(sessions) | sessions schema
// 3. PRAGMA table_info(messages) | messages schema
// sqlite3 -json emits two arrays back-to-back for the two PRAGMA
// statements; we parse them as separate result sets.
let preflight = """
set -e
sqlite3 --version
sqlite3 -readonly -json \(quoteForRemoteShell(dbPath)) "PRAGMA table_info(sessions); PRAGMA table_info(messages);"
"""
do {
let result = try await transport.streamScript(preflight, timeout: preflightTimeout)
if result.exitCode != 0 {
lastOpenError = errorMessage(stderr: result.stderrString, stdout: result.stdoutString, exitCode: result.exitCode)
#if canImport(os)
Self.logger.warning("Remote preflight failed (exit \(result.exitCode)): \(self.lastOpenError ?? "", privacy: .public)")
#endif
return false
}
try parsePreflightOutput(result.stdoutString)
lastOpenError = nil
isOpen = true
#if canImport(os)
Self.logger.info("Remote SQLite backend ready: sqlite3=\(self.sqliteVersion ?? "?", privacy: .public), v0.7=\(self.hasV07Schema), v0.11=\(self.hasV011Schema)")
#endif
return true
} catch {
lastOpenError = error.localizedDescription
#if canImport(os)
Self.logger.warning("Remote preflight transport error: \(error.localizedDescription, privacy: .public)")
#endif
return false
}
}
@discardableResult
public func refresh(forceFresh: Bool) async -> Bool {
// Streaming queries are always fresh. The watcher tick still
// fires `dataService.refresh()` on every observed file change;
// locally that re-opens the SQLite handle, but here it's a
// no-op. `forceFresh: true` is the escape hatch for when the
// user explicitly wants a re-preflight (e.g. they upgraded
// Hermes on the remote). Drop the open state and re-run.
if forceFresh {
isOpen = false
return await open()
}
return isOpen ? true : await open()
}
public func close() async {
isOpen = false
}
// MARK: - Queries
public func query(_ sql: String, params: [SQLValue]) async throws -> [Row] {
guard isOpen else { throw BackendError.notOpen }
let inlined = SQLValueInliner.inline(sql, params: params)
// In-flight coalescing: if a query with the exact same
// inlined SQL is already pending, await its task instead
// of spawning a new SSH subprocess. Surfaces in ScarfMon as
// a `sqlite.query.coalesced` event so we can see how often
// the dedup actually fires in the wild.
if let existing = inFlightQueries[inlined] {
ScarfMon.event(.sqlite, "query.coalesced", count: 1)
return try await withTaskCancellationHandler(
operation: { try await existing.value },
onCancel: { existing.cancel() }
)
}
let task = Task<[Row], Error> { [self] in
try await ScarfMon.measureAsync(.sqlite, "query") {
let dbPath = context.paths.stateDB
let script = """
sqlite3 -readonly -json \(quoteForRemoteShell(dbPath)) <<'__SCARF_SQL__'
\(inlined)
__SCARF_SQL__
"""
let result: ProcessResult
do {
result = try await transport.streamScript(script, timeout: queryTimeout)
} catch {
throw BackendError.transport(error.localizedDescription)
}
if result.exitCode != 0 {
throw BackendError.sqlite(exitCode: result.exitCode, stderr: result.stderrString)
}
let rows = try parseSingleResultSet(result.stdoutString)
ScarfMon.event(.sqlite, "query.rows", count: rows.count, bytes: result.stdout.count)
return rows
}
}
inFlightQueries[inlined] = task
defer { inFlightQueries[inlined] = nil }
// v2.8: propagate parent task cancellation INTO the
// unstructured `task`. `Task<...>{ ... }` doesn't inherit
// cancellation from the awaiting context, so without this a
// cancelled chat-hydration / dashboard-refresh would keep
// the ssh subprocess alive for the full 30s queryTimeout,
// pinning a remote sqlite query and a ControlMaster
// session slot. With the bridge, the inner task's awaits
// see a cancelled parent and `SSHScriptRunner.run`'s own
// cancellation handler (v2.8) kills the ssh process inside
// the next 100ms poll.
return try await withTaskCancellationHandler(
operation: { try await task.value },
onCancel: { task.cancel() }
)
}
public func queryBatch(_ statements: [(sql: String, params: [SQLValue])]) async throws -> [[Row]] {
try await ScarfMon.measureAsync(.sqlite, "queryBatch") {
try await _queryBatchImpl(statements)
}
}
private func _queryBatchImpl(_ statements: [(sql: String, params: [SQLValue])]) async throws -> [[Row]] {
guard isOpen else { throw BackendError.notOpen }
if statements.isEmpty { return [] }
// Build one sqlite3 invocation with marker SELECTs separating
// each statement's result set. `SELECT '__SCARF_RS_BEGIN__N'`
// emits a one-row JSON array we use as a sentinel.
var sqlBlocks: [String] = []
for (i, stmt) in statements.enumerated() {
let inlined = SQLValueInliner.inline(stmt.sql, params: stmt.params)
// Marker first (so we know which result-set follows even
// if a query returns zero rows; sqlite3 -json prints
// nothing for empty result sets, which would otherwise
// make the parser drift).
sqlBlocks.append("SELECT '\(Self.batchMarkerPrefix)\(i)' AS marker;")
sqlBlocks.append(ensureTrailingSemicolon(inlined))
}
let combined = sqlBlocks.joined(separator: "\n")
let dbPath = context.paths.stateDB
let script = """
sqlite3 -readonly -json \(quoteForRemoteShell(dbPath)) <<'__SCARF_SQL__'
\(combined)
__SCARF_SQL__
"""
let result: ProcessResult
do {
// Batched timeout: scales with statement count, capped at
// 30 s. With `queryTimeout` now at 30 s the cap always wins,
// so every batch effectively gets a flat 30 s. Most batches
// are 45 statements.
let timeout = min(30, queryTimeout + Double(statements.count) * 2)
result = try await transport.streamScript(script, timeout: timeout)
} catch {
throw BackendError.transport(error.localizedDescription)
}
if result.exitCode != 0 {
throw BackendError.sqlite(exitCode: result.exitCode, stderr: result.stderrString)
}
return try parseBatchResultSets(result.stdoutString, expectedCount: statements.count)
}
// MARK: - Preflight parsing
private func parsePreflightOutput(_ stdout: String) throws {
// Expected output:
// <sqlite3 version line>
// [<sessions PRAGMA result>]
// [<messages PRAGMA result>]
let lines = stdout.split(separator: "\n", omittingEmptySubsequences: false)
guard let firstLine = lines.first, !firstLine.isEmpty else {
throw BackendError.parseFailure(stdoutHead: String(stdout.prefix(200)))
}
sqliteVersion = String(firstLine).trimmingCharacters(in: .whitespacesAndNewlines)
// The remaining lines should contain two JSON arrays. sqlite3
// -json emits each on its own line, though it can wrap long
// arrays across multiple lines. To be robust we don't rely on
// line breaks: walk the stream looking for two top-level arrays.
let rest = lines.dropFirst().joined(separator: "\n")
let arrays = splitTopLevelJSONArrays(rest)
guard arrays.count >= 2 else {
throw BackendError.parseFailure(stdoutHead: String(stdout.prefix(200)))
}
let sessionsTable = try parseTableInfo(arrays[0])
let messagesTable = try parseTableInfo(arrays[1])
// v0.7: sessions has `reasoning_tokens`.
hasV07Schema = sessionsTable.contains("reasoning_tokens")
// v0.11: BOTH sessions has `api_call_count` AND messages has
// `reasoning_content`. Belt-and-braces against partial migrations.
let sessionsHasV011 = sessionsTable.contains("api_call_count")
let messagesHasV011 = messagesTable.contains("reasoning_content")
hasV011Schema = sessionsHasV011 && messagesHasV011
}
/// Extract column names from a `PRAGMA table_info(...)` result set.
private func parseTableInfo(_ json: String) throws -> Set<String> {
guard let data = json.data(using: .utf8),
let arr = try? JSONSerialization.jsonObject(with: data) as? [[String: Any]] else {
throw BackendError.parseFailure(stdoutHead: String(json.prefix(200)))
}
var names: Set<String> = []
for row in arr {
if let name = row["name"] as? String {
names.insert(name)
}
}
return names
}
// MARK: - Result-set parsing
private func parseSingleResultSet(_ stdout: String) throws -> [Row] {
// sqlite3 -json prints nothing for empty result sets, so an
// empty stdout is valid and means "0 rows".
let trimmed = stdout.trimmingCharacters(in: .whitespacesAndNewlines)
if trimmed.isEmpty { return [] }
return try rowsFromJSONArray(trimmed)
}
private func parseBatchResultSets(_ stdout: String, expectedCount: Int) throws -> [[Row]] {
// Scan the output as a sequence of JSON arrays. Each marker
// SELECT emits a one-row array `[{"marker":"__SCARF_RS_BEGIN__N"}]`;
// the following array (if present) is statement N's result set.
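//
// Illustrative stdout for a two-statement batch in which
// statement 0 returns zero rows (the `id` / `n` columns are
// hypothetical, not taken from the real schema):
//
//   [{"marker":"__SCARF_RS_BEGIN__0"}]
//   [{"marker":"__SCARF_RS_BEGIN__1"}]
//   [{"id":"abc","n":3}]
//
// Marker 0 is followed directly by another marker, so result[0]
// stays []. Marker 1 is followed by a data array, so result[1]
// holds one row with columns `id`, `n`.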
let arrays = splitTopLevelJSONArrays(stdout)
var result: [[Row]] = Array(repeating: [], count: expectedCount)
var i = 0
while i < arrays.count {
let chunk = arrays[i]
// Try to read this chunk as a marker. A marker row is one
// object with exactly the `marker` field. Anything else
// is a real result set (which we attribute to the most
// recent marker we saw).
if let idx = markerIndex(in: chunk) {
// Next array (if any) is this statement's result set.
// If the next array is ALSO a marker, the current
// statement returned zero rows.
let next = i + 1
if next < arrays.count, markerIndex(in: arrays[next]) == nil {
result[idx] = try rowsFromJSONArray(arrays[next])
i = next + 1
} else {
// Empty result set for this statement.
i = next
}
} else {
// Stray array (no preceding marker). Skip it; this shouldn't
// happen in practice given how we build the script.
i += 1
}
}
return result
}
/// If the array's single row is a marker `{"marker":"__SCARF_RS_BEGIN__N"}`,
/// return N. Otherwise nil.
private func markerIndex(in json: String) -> Int? {
guard let data = json.data(using: .utf8),
let arr = try? JSONSerialization.jsonObject(with: data) as? [[String: Any]],
arr.count == 1,
let marker = arr[0]["marker"] as? String,
marker.hasPrefix(Self.batchMarkerPrefix) else { return nil }
let suffix = marker.dropFirst(Self.batchMarkerPrefix.count)
return Int(suffix)
}
private func rowsFromJSONArray(_ json: String) throws -> [Row] {
guard let data = json.data(using: .utf8),
let arr = try? JSONSerialization.jsonObject(with: data) as? [[String: Any]] else {
throw BackendError.parseFailure(stdoutHead: String(json.prefix(200)))
}
if arr.isEmpty { return [] }
// `[String: Any]` does NOT preserve insertion order on macOS
// (NSDictionary backing). To keep the SELECT column order
// intact, which the data-service row parsers depend on
// (`row.string(at: 0)` for `id`, etc.), we extract the key
// order from the FIRST object's raw JSON bytes. Subsequent
// rows reuse that key list to look up values by name from
// their parsed dictionaries.
let firstObjectRaw = extractFirstJSONObject(from: json)
let orderedKeys = firstObjectRaw.flatMap(extractKeysInOrder) ?? Array(arr[0].keys)
var columnIndex: [String: Int] = [:]
columnIndex.reserveCapacity(orderedKeys.count)
for (i, k) in orderedKeys.enumerated() { columnIndex[k] = i }
var rows: [Row] = []
rows.reserveCapacity(arr.count)
for obj in arr {
var values: [SQLValue] = []
values.reserveCapacity(orderedKeys.count)
for key in orderedKeys {
values.append(decode(obj[key]))
}
rows.append(Row(values: values, columnIndex: columnIndex))
}
return rows
}
/// Extract the substring of the first `{...}` object in a JSON
/// array string. Used so we can scan its keys in original order
/// before NSJSONSerialization's hash-table conversion strips the
/// ordering. Tolerates nested objects/arrays via depth tracking.
private func extractFirstJSONObject(from json: String) -> String? {
guard let openIdx = json.firstIndex(of: "{") else { return nil }
var depth = 0
var inString = false
var escape = false
var i = openIdx
while i < json.endIndex {
let c = json[i]
if inString {
if escape { escape = false }
else if c == "\\" { escape = true }
else if c == "\"" { inString = false }
i = json.index(after: i)
continue
}
switch c {
case "\"":
inString = true
case "{":
depth += 1
case "}":
depth -= 1
if depth == 0 {
let end = json.index(after: i)
return String(json[openIdx..<end])
}
default:
break
}
i = json.index(after: i)
}
return nil
}
/// Walk an object literal `{"k1": v1, "k2": v2, ...}` and return
/// the keys in their literal order. Doesn't decode the values;
/// that's what NSJSONSerialization handles. Just extracts
/// `["k1", "k2", ...]` so we know the column ordering.
private func extractKeysInOrder(_ objectJSON: String) -> [String] {
var keys: [String] = []
var i = objectJSON.startIndex
// Skip past the leading `{`.
while i < objectJSON.endIndex, objectJSON[i] != "{" {
i = objectJSON.index(after: i)
}
if i < objectJSON.endIndex { i = objectJSON.index(after: i) }
var depth = 0
var inString = false
var escape = false
var keyStart: String.Index?
// We're at the start of object body. Looking for `"key":` patterns
// at depth 0. Toggle `expectingKey` after each `:`/`,`.
var expectingKey = true
while i < objectJSON.endIndex {
let c = objectJSON[i]
if inString {
if escape {
escape = false
} else if c == "\\" {
escape = true
} else if c == "\"" {
inString = false
if expectingKey && depth == 0, let start = keyStart {
keys.append(String(objectJSON[start..<i]))
expectingKey = false
keyStart = nil
}
}
i = objectJSON.index(after: i)
continue
}
switch c {
case "\"":
inString = true
if expectingKey && depth == 0 {
keyStart = objectJSON.index(after: i)
}
case "{", "[":
depth += 1
case "}", "]":
if depth == 0 { return keys } // end of outer object
depth -= 1
case ",":
if depth == 0 { expectingKey = true }
case ":":
if depth == 0 { expectingKey = false }
default:
break
}
i = objectJSON.index(after: i)
}
return keys
}
private func decode(_ v: Any?) -> SQLValue {
guard let v else { return .null }
if v is NSNull { return .null }
if let n = v as? NSNumber {
// NSJSONSerialization decodes both ints and doubles into
// NSNumber. Distinguish: if it round-trips through Int64
// unchanged, treat as integer; else real.
// An integral Double like 1.0 still has .doubleValue == 1.0
// and Int64(1.0) == 1, so the round-trip check bins integral
// doubles as integer (which sqlite3 -json does too: `1` in
// JSON, not `1.0`).
let asInt64 = n.int64Value
if Double(asInt64) == n.doubleValue {
return .integer(asInt64)
}
return .real(n.doubleValue)
}
if let s = v as? String {
return .text(s)
}
// Fall-through: stringify whatever it is so we don't lose data
// silently. SQLite -json doesn't emit booleans or nested
// objects from PRAGMA / SELECT outputs in our usage.
return .text(String(describing: v))
}
// MARK: - JSON helpers
/// Walk a string of one or more concatenated JSON arrays at the top
/// level (sqlite3 -json's batched output) and return each array as
/// a separate substring. Tolerates whitespace/newlines between
/// arrays.
private func splitTopLevelJSONArrays(_ s: String) -> [String] {
var out: [String] = []
var depth = 0
var inString = false
var escape = false
var start: String.Index?
var i = s.startIndex
while i < s.endIndex {
let c = s[i]
if inString {
if escape {
escape = false
} else if c == "\\" {
escape = true
} else if c == "\"" {
inString = false
}
i = s.index(after: i)
continue
}
switch c {
case "\"":
inString = true
case "[":
if depth == 0 { start = i }
depth += 1
case "]":
depth -= 1
if depth == 0, let begin = start {
let end = s.index(after: i)
out.append(String(s[begin..<end]))
start = nil
}
default:
break
}
i = s.index(after: i)
}
return out
}
private func ensureTrailingSemicolon(_ sql: String) -> String {
let trimmed = sql.trimmingCharacters(in: .whitespacesAndNewlines)
if trimmed.hasSuffix(";") { return trimmed }
return trimmed + ";"
}
// MARK: - Quoting + error mapping
/// Build the shell argument that the remote `sh -c` will see for
/// the SQLite path. Three cases, in priority order:
///
/// 1. **`~`-prefixed AND we have a `resolvedHome`**: the common
/// case. Pre-expand to an absolute path in Swift, then single-
/// quote. sqlite3 receives a literal absolute path; no shell
/// expansion needed.
/// 2. **`~`-prefixed AND no `resolvedHome`** (probe failed):
/// fall back to `"$HOME/..."` and hope the remote shell expands
/// it. Works on Mac SSHTransport (login shell with $HOME set);
/// less reliable through Citadel's exec-channel + base64 +
/// inner-`/bin/sh` pipeline on iOS, which is precisely why
/// we prefer the resolved-home path above.
/// 3. **Absolute** (`/home/agent/.hermes/state.db`): single-quote
/// with the standard sh escape for any embedded single-quote.
///
/// sqlite3 doesn't expand `~` itself (that's a shell affordance),
/// so a default-config remote with `paths.stateDB ==
/// "~/.hermes/state.db"` would produce `unable to open database
/// "~/.hermes/state.db"` without one of these rewrites (issue
/// reported on iOS Citadel against `127.0.0.1`).
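///
/// Worked examples of the three cases, as a sketch. The home
/// directory `/home/agent` and the path `/srv/it's.db` are
/// illustrative assumptions, not values from a real host.
///
/// With `resolvedHome == "/home/agent"`:
///     quoteForRemoteShell("~/.hermes/state.db")
///     // '/home/agent/.hermes/state.db'
///     quoteForRemoteShell("/srv/it's.db")
///     // '/srv/it'\''s.db'
///
/// With `resolvedHome == nil` (probe failed):
///     quoteForRemoteShell("~/.hermes/state.db")
///     // "$HOME/.hermes/state.db"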
private func quoteForRemoteShell(_ path: String) -> String {
if let home = resolvedHome {
let expanded: String
if path == "~" {
expanded = home
} else if path.hasPrefix("~/") {
expanded = home + "/" + String(path.dropFirst(2))
} else {
expanded = path
}
return "'" + expanded.replacingOccurrences(of: "'", with: "'\\''") + "'"
}
// Probe-failed fallback: rely on remote-shell `$HOME` expansion.
if path == "~" {
return "\"$HOME\""
}
if path.hasPrefix("~/") {
let rest = String(path.dropFirst(2))
let escaped = rest
.replacingOccurrences(of: "\\", with: "\\\\")
.replacingOccurrences(of: "\"", with: "\\\"")
.replacingOccurrences(of: "$", with: "\\$")
.replacingOccurrences(of: "`", with: "\\`")
return "\"$HOME/\(escaped)\""
}
return "'" + path.replacingOccurrences(of: "'", with: "'\\''") + "'"
}
/// Translate a non-zero sqlite3 exit into a user-presentable
/// message. Mirrors substrings that `HermesDataService.humanize`
/// keys off so the existing dashboard banner renders correctly.
private func errorMessage(stderr: String, stdout: String, exitCode: Int32) -> String {
let combined = (stderr.isEmpty ? stdout : stderr).trimmingCharacters(in: .whitespacesAndNewlines)
if combined.isEmpty {
return "sqlite3 exited \(exitCode) with no output"
}
return combined
}
}
#endif // canImport(SQLite3)
@@ -0,0 +1,136 @@
import Foundation
/// Typed SQLite column value. Mirrors SQLite's storage classes
/// (`SQLITE_NULL`, `SQLITE_INTEGER`, `SQLITE_FLOAT`, `SQLITE_TEXT`,
/// `SQLITE_BLOB`) so both backends, libsqlite3 (`LocalSQLiteBackend`)
/// and remote `sqlite3 -json` parsing (`RemoteSQLiteBackend`), can
/// produce and consume the same `Row` shape.
///
/// Used in two places:
///
/// 1. **Bound parameters**: callers hand `[SQLValue]` to
/// `HermesQueryBackend.query(_:params:)`. The local backend feeds
/// them into `sqlite3_bind_*`; the remote backend inlines them as
/// SQLite literals via `SQLValueInliner.inline(_:params:)`.
/// 2. **Result columns**: each `Row.values` entry is one of these.
/// Parsers (`sessionFromRow`, `messageFromRow` in HermesDataService)
/// read positional accessors like `row.string(at: 3)` to get the
/// typed value.
public enum SQLValue: Sendable, Equatable {
case null
case integer(Int64)
case real(Double)
case text(String)
case blob(Data)
}
/// One result row from a query. Indexable both by position (matching the
/// libsqlite3 `sqlite3_column_*` ergonomics that `HermesDataService`'s
/// existing parsers expect) and by name (more readable for new code).
///
/// `columnIndex` is built once per result-set, not per row, so the
/// per-row overhead is just the `[SQLValue]` allocation.
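///
/// A minimal usage sketch. The column names `id` and `n` are
/// hypothetical and only illustrate the two lookup styles:
///
///     let row = Row(values: [.text("abc"), .integer(3)],
///                   columnIndex: ["id": 0, "n": 1])
///     row.string(at: 0)   // "abc"
///     row.int(at: 1)      // 3
///     row["n"]            // .integer(3)
///     row["missing"]      // .null (unknown names never trap)
///     row[7]              // .null (out-of-range positions never trap)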
public struct Row: Sendable {
/// Ordered column values, indexable by their position in the
/// underlying SELECT.
public let values: [SQLValue]
/// Column-name-to-position map. Built once per result-set by the
/// backend, then shared (by reference) across every row in the
/// set. Lookups are case-sensitive, matching SQLite's default.
public let columnIndex: [String: Int]
public init(values: [SQLValue], columnIndex: [String: Int]) {
self.values = values
self.columnIndex = columnIndex
}
public subscript(_ position: Int) -> SQLValue {
guard position >= 0, position < values.count else { return .null }
return values[position]
}
public subscript(_ name: String) -> SQLValue {
guard let i = columnIndex[name] else { return .null }
return values[i]
}
// MARK: - Typed positional accessors
//
// These mirror the `columnText(stmt, i)` / `columnDate(stmt, i)`
// helpers that lived in HermesDataService so the row-parser
// migrations from `OpaquePointer` to `Row` are line-for-line.
public func string(at i: Int) -> String {
if case .text(let s) = self[i] { return s }
return ""
}
public func optionalString(at i: Int) -> String? {
switch self[i] {
case .text(let s): return s
case .null: return nil
default: return nil
}
}
public func int(at i: Int) -> Int {
switch self[i] {
case .integer(let n): return Int(n)
case .real(let d): return Int(d)
case .text(let s): return Int(s) ?? 0
default: return 0
}
}
public func optionalInt(at i: Int) -> Int? {
switch self[i] {
case .integer(let n): return Int(n)
case .real(let d): return Int(d)
case .text(let s): return Int(s)
case .null: return nil
default: return nil
}
}
public func int64(at i: Int) -> Int64 {
switch self[i] {
case .integer(let n): return n
case .real(let d): return Int64(d)
case .text(let s): return Int64(s) ?? 0
default: return 0
}
}
public func double(at i: Int) -> Double {
switch self[i] {
case .real(let d): return d
case .integer(let n): return Double(n)
case .text(let s): return Double(s) ?? 0
default: return 0
}
}
public func optionalDouble(at i: Int) -> Double? {
switch self[i] {
case .real(let d): return d
case .integer(let n): return Double(n)
case .text(let s): return Double(s)
case .null: return nil
default: return nil
}
}
/// Interpret the column as a Unix-epoch timestamp (seconds, fractional
/// allowed). Returns `nil` when the column is NULL or unparseable.
/// Mirrors the existing `columnDate` helper exactly.
public func date(at i: Int) -> Date? {
guard let secs = optionalDouble(at: i) else { return nil }
return Date(timeIntervalSince1970: secs)
}
public func isNull(at i: Int) -> Bool {
if case .null = self[i] { return true }
return false
}
}
@@ -0,0 +1,107 @@
import Foundation
/// Replaces `?` placeholders in a SQL string with SQLite-escaped
/// literal values, in order. Used by `RemoteSQLiteBackend` because
/// the `sqlite3` CLI doesn't accept `?`-bound parameters on the
/// command line; it would need stdin `.parameter set @name` dot-
/// commands, which require a multi-line script for every query and
/// add round-trip overhead with no upside for our use case.
///
/// **Trust model.** This is a literal-encoder for in-tree, trusted
/// callers: every current param source is either an integer (`limit`,
/// `before`, `since.timeIntervalSince1970`), a Hermes-internal ID
/// (UUID-shaped session/tool IDs that come back from the same DB), or
/// a search query that already passes through `sanitizeFTSQuery` in
/// HermesDataService. It is **NOT** a general SQL-injection defense.
/// Don't extend the data-service surface with methods that accept raw
/// untrusted user input as a `.text` param without first validating
/// upstream. The local backend skips inlining entirely (uses
/// `sqlite3_bind_*`) so this only affects the remote path.
///
/// Escape rules mirror SQLite's literal syntax:
/// * `.null` → `NULL`
/// * `.integer(n)` → `<n>` (no quoting)
/// * `.real(d)` → `%.17g`-formatted (round-trips Double via decimal)
/// * `.text(s)` → `'<s with single-quotes doubled>'`
/// * `.blob(d)` → `X'<hex>'`
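///
/// A short sketch of the rules above in action. The table and
/// column names are illustrative only:
///
///     SQLValueInliner.inline(
///         "SELECT * FROM sessions WHERE id = ? LIMIT ?",
///         params: [.text("it's"), .integer(5)])
///     // SELECT * FROM sessions WHERE id = 'it''s' LIMIT 5
///
///     SQLValueInliner.inline(
///         "SELECT '?' AS q, ? AS v",
///         params: [.null])
///     // SELECT '?' AS q, NULL AS v   (the quoted `?` is left alone)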
public enum SQLValueInliner {
/// Walk `sql`, replacing each `?` (outside SQL string literals) with
/// the corresponding `params` entry's encoded form. Traps via
/// `fatalError` if the placeholder count doesn't match `params.count`;
/// that is a programmer error, not a runtime condition.
///
/// `?` inside string literals (e.g. `WHERE name = '?'`) is preserved
/// unchanged. We track quote state with a tiny scanner so existing
/// SQL with literal `?` chars in strings doesn't get mis-bound.
public static func inline(_ sql: String, params: [SQLValue]) -> String {
var out = ""
out.reserveCapacity(sql.count + params.count * 16)
var paramIndex = 0
var inSingleQuote = false
var inDoubleQuote = false
var i = sql.startIndex
while i < sql.endIndex {
let c = sql[i]
if c == "'" && !inDoubleQuote {
// Check for SQL's `''` escape (a doubled single-quote
// INSIDE a string literal stays inside; we don't toggle
// out). The next char being another `'` keeps us in.
let next = sql.index(after: i)
if inSingleQuote && next < sql.endIndex && sql[next] == "'" {
out.append("'")
out.append("'")
i = sql.index(after: next)
continue
}
inSingleQuote.toggle()
out.append(c)
i = sql.index(after: i)
continue
}
if c == "\"" && !inSingleQuote {
inDoubleQuote.toggle()
out.append(c)
i = sql.index(after: i)
continue
}
if c == "?" && !inSingleQuote && !inDoubleQuote {
// Bind placeholder.
if paramIndex >= params.count {
fatalError("SQLValueInliner: more `?` placeholders in SQL than provided params (\(params.count)). SQL: \(sql)")
}
out.append(encode(params[paramIndex]))
paramIndex += 1
i = sql.index(after: i)
continue
}
out.append(c)
i = sql.index(after: i)
}
if paramIndex != params.count {
fatalError("SQLValueInliner: \(params.count) params provided but only \(paramIndex) `?` placeholders consumed. SQL: \(sql)")
}
return out
}
/// Encode a single value as a SQLite literal. Public so callers
/// that build SQL strings by hand (rare; prefer `inline`) can
/// reuse the same escape rules.
public static func encode(_ value: SQLValue) -> String {
switch value {
case .null:
return "NULL"
case .integer(let n):
return String(n)
case .real(let d):
// %.17g round-trips a Double precisely as a decimal.
return String(format: "%.17g", d)
case .text(let s):
return "'" + s.replacingOccurrences(of: "'", with: "''") + "'"
case .blob(let d):
// SQLite blob literal: X'<hex>' (case-insensitive prefix).
let hex = d.map { String(format: "%02x", $0) }.joined()
return "X'\(hex)'"
}
}
}
@@ -0,0 +1,358 @@
import Foundation
#if canImport(os)
import os
#endif
/// Async, transport-aware client for `hermes curator`. Wraps the v0.12
/// verbs (`status / run / pause / resume / pin / unpin / restore`) plus
/// the v0.13 archive surface (`archive / prune / list-archived` and a
/// synchronous-blocking `run`).
///
/// **Concurrency.** Pure-I/O `actor`; no UI state. View models hold a
/// service reference and `await` methods. Each public method dispatches
/// the underlying CLI invocation through `Task.detached(priority:
/// .utility)` so two concurrent reads from the VM don't queue end-to-end
/// on a single thread. Mirrors `KanbanService` shape exactly.
///
/// **Capability gating happens at the call site, not in the service.**
/// `runNow(synchronous:timeout:)` takes a flag from the VM (the VM reads
/// `HermesCapabilities.hasCuratorArchive` to decide). The service stays
/// version-agnostic; only the timeout differs in practice.
public actor CuratorService {
#if canImport(os)
private static let logger = Logger(subsystem: "com.scarf", category: "CuratorService")
#endif
private let context: ServerContext
public init(context: ServerContext) {
self.context = context
}
// MARK: - Reads
/// Run `hermes curator status` and parse stdout via
/// `HermesCuratorStatusParser`. Combines the text output with the
/// on-disk `.curator_state` JSON for richer last-run metadata.
/// Never throws: a transport failure resolves to `.empty` so the
/// view always has something to render.
public func status() async -> HermesCuratorStatus {
let context = self.context
return await Task.detached(priority: .utility) { () -> HermesCuratorStatus in
let textResult = Self.runHermesSync(context: context, args: ["curator", "status"], timeout: 30)
let stateData = context.readData(context.paths.curatorStateFile)
return HermesCuratorStatusParser.parse(text: textResult.output, stateFileJSON: stateData)
}.value
}
/// `hermes curator list-archived [--json]`. Prefers JSON; falls back
/// to a defensive text parser. Empty / "no archived skills" sentinel
/// folds to `[]`.
public func listArchived() async throws -> [HermesCuratorArchivedSkill] {
// TODO(WS-4-Q2): confirm `--json` is supported on v0.13
// `list-archived`. If not, drop the flag and rely on the text
// parser path. Until then we pass `--json` and parse the output
// tolerantly.
let args = ["curator", "list-archived", "--json"]
let (code, stdout, stderr) = await runHermes(args: args, timeout: 30)
// If --json isn't recognized, the CLI typically emits
// "unrecognized arguments: --json" or similar to stderr and
// exits non-zero. Retry without the flag and parse text.
if code != 0 {
let lower = (stderr + stdout).lowercased()
if lower.contains("unrecognized") || lower.contains("unknown") || lower.contains("no such option") {
let (c2, out2, err2) = await runHermes(args: ["curator", "list-archived"], timeout: 30)
try ensureSuccess(code: c2, stdout: out2, stderr: err2, verb: "list-archived")
return Self.parseListArchivedText(out2)
}
try ensureSuccess(code: code, stdout: stdout, stderr: stderr, verb: "list-archived")
}
let trimmed = stdout.trimmingCharacters(in: .whitespacesAndNewlines)
if trimmed.isEmpty || trimmed.lowercased().contains("no archived skills") {
return []
}
// Try JSON first; the output may also be a text dump if Hermes ignored `--json`.
if let data = trimmed.data(using: .utf8),
let arr = try? JSONDecoder().decode([HermesCuratorArchivedSkill].self, from: data) {
return arr
}
// Some builds wrap in `{"archived": [...]}` envelope.
struct Wrapper: Decodable { let archived: [HermesCuratorArchivedSkill] }
if let data = trimmed.data(using: .utf8),
let wrapped = try? JSONDecoder().decode(Wrapper.self, from: data) {
return wrapped.archived
}
// Text fallback: defensive parse.
return Self.parseListArchivedText(stdout)
}
// MARK: - Writes (legacy v0.12 verbs; service form)
public func runNow(synchronous: Bool, timeout: TimeInterval) async throws {
// TODO(WS-4-Q4): default 600s for v0.13 sync runs. No Cancel
// button in v2.8 (transport.cancel parity not guaranteed across
// LocalTransport / SSHTransport).
let resolvedTimeout = synchronous ? timeout : 30
let (code, stdout, stderr) = await runHermes(args: ["curator", "run"], timeout: resolvedTimeout)
try ensureSuccess(code: code, stdout: stdout, stderr: stderr, verb: "run")
}
public func pause() async throws {
let (code, stdout, stderr) = await runHermes(args: ["curator", "pause"], timeout: 15)
try ensureSuccess(code: code, stdout: stdout, stderr: stderr, verb: "pause")
}
public func resume() async throws {
let (code, stdout, stderr) = await runHermes(args: ["curator", "resume"], timeout: 15)
try ensureSuccess(code: code, stdout: stdout, stderr: stderr, verb: "resume")
}
public func pin(_ name: String) async throws {
let (code, stdout, stderr) = await runHermes(args: ["curator", "pin", name], timeout: 15)
try ensureSuccess(code: code, stdout: stdout, stderr: stderr, verb: "pin")
}
public func unpin(_ name: String) async throws {
let (code, stdout, stderr) = await runHermes(args: ["curator", "unpin", name], timeout: 15)
try ensureSuccess(code: code, stdout: stdout, stderr: stderr, verb: "unpin")
}
public func restore(_ name: String) async throws {
let (code, stdout, stderr) = await runHermes(args: ["curator", "restore", name], timeout: 30)
try ensureSuccess(code: code, stdout: stdout, stderr: stderr, verb: "restore")
}
// MARK: - Writes (new in v0.13)
/// `hermes curator archive <name>`: non-destructive; moves the
/// skill from the active set to the archived set. No `--json` is
/// expected; the verb's success channel is the exit code.
public func archive(_ name: String) async throws {
let (code, stdout, stderr) = await runHermes(args: ["curator", "archive", name], timeout: 30)
try ensureSuccess(code: code, stdout: stdout, stderr: stderr, verb: "archive")
}
/// `hermes curator prune [--dry-run]`. Destructive when `dryRun`
/// is `false`: removes everything currently archived from disk.
/// Returns a `CuratorPruneSummary` describing what was (or would be)
/// removed. On `dryRun=false`, the wire shape may not include the
/// `would_remove` list, so the caller should not depend on it; the
/// archived list is empty after a successful destructive prune.
@discardableResult
public func prune(dryRun: Bool) async throws -> CuratorPruneSummary {
// TODO(WS-4-Q1): confirm v0.13 ships `--dry-run`. If not, fall
// back to enumerating via `list-archived` and treat any prune
// call as destructive. The retry-without-flag path below covers
// the "unrecognized argument" case automatically.
var args = ["curator", "prune"]
if dryRun { args.append("--dry-run") }
// `--json` requested for the dry-run path so we can parse the
// would-remove list. Destructive mode runs without --json since
// we only need the exit code.
if dryRun { args.append("--json") }
let (code, stdout, stderr) = await runHermes(args: args, timeout: 60)
// Detect "unrecognized --dry-run" / "unknown --json" gracefully.
if code != 0 {
let lower = (stderr + stdout).lowercased()
let unrecognized = lower.contains("unrecognized") || lower.contains("unknown") || lower.contains("no such option")
if dryRun && unrecognized {
// Q1 fallback: enumerate via list-archived. Caller still
// uses this summary for confirm-sheet display.
let archived = try await listArchived()
let total = archived.compactMap { $0.sizeBytes }.reduce(0, +)
return CuratorPruneSummary(wouldRemove: archived, totalBytes: total)
}
try ensureSuccess(code: code, stdout: stdout, stderr: stderr, verb: "prune")
}
if dryRun {
return Self.parsePruneDryRun(stdout)
}
return CuratorPruneSummary(wouldRemove: [], totalBytes: 0)
}
// MARK: - Pure parsers (nonisolated; safe to call from VMs without awaits)
/// Parse a `list-archived --json` payload. Tolerates the bare-array
/// shape, the `{"archived": [...]}` envelope, and "no archived
/// skills" / empty-string sentinels. Returns `[]` for any of the
/// empty cases. Throws `CuratorError.decoding` only when the input
/// is non-empty and clearly not JSON.
public nonisolated static func parseListArchived(stdout: String) throws -> [HermesCuratorArchivedSkill] {
let trimmed = stdout.trimmingCharacters(in: .whitespacesAndNewlines)
if trimmed.isEmpty || trimmed.lowercased().contains("no archived skills") {
return []
}
guard let data = trimmed.data(using: .utf8) else {
throw CuratorError.decoding(verb: "list-archived", message: "non-UTF8 stdout")
}
if let arr = try? JSONDecoder().decode([HermesCuratorArchivedSkill].self, from: data) {
return arr
}
struct Wrapper: Decodable { let archived: [HermesCuratorArchivedSkill] }
if let wrapped = try? JSONDecoder().decode(Wrapper.self, from: data) {
return wrapped.archived
}
// Last resort: text fallback.
let parsed = parseListArchivedText(stdout)
if !parsed.isEmpty {
return parsed
}
throw CuratorError.decoding(verb: "list-archived", message: "stdout was neither JSON nor a recognised text list")
}
/// Defensive text parser for `list-archived` output when `--json`
/// isn't supported. Format inferred from `curator status`: one row
/// per non-blank line, leading whitespace, name in column 1, then
/// optional `archived=YYYY-MM-DD`, `size=NNNN`, `reason=...` k/v
/// pairs. Blank lines, header lines, and the empty-state sentinel
/// are skipped.
public nonisolated static func parseListArchivedText(_ text: String) -> [HermesCuratorArchivedSkill] {
var rows: [HermesCuratorArchivedSkill] = []
for raw in text.split(separator: "\n") {
let line = raw.trimmingCharacters(in: .whitespaces)
if line.isEmpty { continue }
let lower = line.lowercased()
// Skip header / sentinel lines.
if lower.hasPrefix("name") && lower.contains("archived") { continue }
if lower.contains("no archived skills") { continue }
if line.unicodeScalars.allSatisfy({ $0.value == 0x2500 || $0.properties.isWhitespace }) {
continue
}
// Skip lines that look like JSON / non-row chrome: `{`,
// `}`, `[`, `]` at the start, or quotes / colons, mean we're
// parsing a malformed JSON dump, not a row table.
if let first = line.first, "{[}]\":,".contains(first) {
continue
}
// Find the first whitespace-separated token as the name; if
// the name carries an `=` it's a header chip we should skip.
let parts = line.split(whereSeparator: { $0 == "\t" || $0 == " " }).map(String.init)
guard let name = parts.first, !name.contains("=") else { continue }
// Reject names that look like punctuation / JSON fragments.
if name.contains("\"") || name.contains(":") || name.contains("{") || name.contains("}") || name.contains("[") || name.contains("]") {
continue
}
// Pull k=v pairs from the remainder.
var archivedAt: String?
var sizeBytes: Int?
var reason: String?
var category: String?
var path: String?
for token in parts.dropFirst() {
guard let eq = token.firstIndex(of: "=") else { continue }
let key = String(token[..<eq])
let value = String(token[token.index(after: eq)...])
switch key {
case "archived", "archived_at":
archivedAt = value
case "size", "size_bytes":
sizeBytes = Int(value)
case "reason":
reason = value
case "category":
category = value
case "path":
path = value
default:
continue
}
}
rows.append(
HermesCuratorArchivedSkill(
name: name,
category: category,
archivedAt: archivedAt,
reason: reason,
sizeBytes: sizeBytes,
path: path
)
)
}
return rows
}
/// Parse a `prune --dry-run --json` payload. Tolerates an empty
/// payload (returns a zero summary) and the `{would_remove: [],
/// total_bytes: N}` shape.
public nonisolated static func parsePruneDryRun(_ stdout: String) -> CuratorPruneSummary {
let trimmed = stdout.trimmingCharacters(in: .whitespacesAndNewlines)
guard !trimmed.isEmpty else {
return CuratorPruneSummary(wouldRemove: [], totalBytes: 0)
}
if let data = trimmed.data(using: .utf8),
let summary = try? JSONDecoder().decode(CuratorPruneSummary.self, from: data) {
return summary
}
// Tolerate a bare-array fallback (some Hermes builds may print
// just the would-remove list when --json is missing the wrapper).
if let data = trimmed.data(using: .utf8),
let arr = try? JSONDecoder().decode([HermesCuratorArchivedSkill].self, from: data) {
let total = arr.compactMap { $0.sizeBytes }.reduce(0, +)
return CuratorPruneSummary(wouldRemove: arr, totalBytes: total)
}
// No text parse is attempted for "would remove N skills (X bytes)"
// style output; fall back to a zero summary.
return CuratorPruneSummary(wouldRemove: [], totalBytes: 0)
}
// MARK: - CLI invocation
private nonisolated func runHermes(
args: [String],
timeout: TimeInterval
) async -> (exitCode: Int32, stdout: String, stderr: String) {
let context = self.context
return await Task.detached(priority: .utility) { () -> (Int32, String, String) in
let result = Self.runHermesSync(context: context, args: args, timeout: timeout)
return (result.exitCode, result.output, result.stderr)
}.value
}
/// Synchronous, transport-level invocation. `output` is stdout; the
/// caller usually only reads `output` for parser input but sometimes
/// needs `stderr` (e.g. to detect "unrecognized argument" patterns).
private nonisolated static func runHermesSync(
context: ServerContext,
args: [String],
timeout: TimeInterval
) -> (exitCode: Int32, output: String, stderr: String) {
let transport = context.makeTransport()
do {
let result = try transport.runProcess(
executable: context.paths.hermesBinary,
args: args,
stdin: nil,
timeout: timeout
)
return (result.exitCode, result.stdoutString, result.stderrString)
} catch let error as TransportError {
let message = error.diagnosticStderr.isEmpty
? (error.errorDescription ?? "transport error")
: error.diagnosticStderr
return (-1, "", message)
} catch {
return (-1, "", error.localizedDescription)
}
}
private nonisolated func ensureSuccess(
code: Int32,
stdout: String,
stderr: String,
verb: String
) throws {
guard code != 0 else { return }
if code == -1 && stderr.lowercased().contains("hermes binary not found") {
throw CuratorError.cliMissing
}
let combined = stderr.isEmpty ? stdout : stderr
#if canImport(os)
Self.logger.warning("curator \(verb) exit=\(code, privacy: .public) stderr=\(combined, privacy: .public)")
#endif
throw CuratorError.nonZeroExit(verb: verb, code: code, stderr: combined)
}
}
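The tolerant decode used by `parseListArchived` (bare array first, then the `{"archived": [...]}` envelope) can be sketched in isolation. This is a minimal illustration, not the service's code: `ArchivedSkill` is a hypothetical stand-in for `HermesCuratorArchivedSkill`, and the `size_bytes` wire key is an assumption, since the real model's coding keys are not shown in this diff.

```swift
import Foundation

// Hypothetical stand-in for HermesCuratorArchivedSkill. The field set and
// the `size_bytes` coding key are assumptions made for this sketch.
struct ArchivedSkill: Decodable, Equatable {
    let name: String
    let sizeBytes: Int?

    enum CodingKeys: String, CodingKey {
        case name
        case sizeBytes = "size_bytes"
    }
}

/// Decodes either a bare JSON array or an `{"archived": [...]}` envelope.
/// Returns `[]` for blank input and `nil` when neither shape matches, so
/// the caller can decide whether to try a text fallback or throw.
func decodeArchived(_ stdout: String) -> [ArchivedSkill]? {
    let trimmed = stdout.trimmingCharacters(in: .whitespacesAndNewlines)
    guard !trimmed.isEmpty else { return [] }

    let data = Data(trimmed.utf8)
    let decoder = JSONDecoder()

    // Shape 1: bare array.
    if let bare = try? decoder.decode([ArchivedSkill].self, from: data) {
        return bare
    }

    // Shape 2: envelope with an `archived` key.
    struct Envelope: Decodable { let archived: [ArchivedSkill] }
    if let wrapped = try? decoder.decode(Envelope.self, from: data) {
        return wrapped.archived
    }

    return nil
}
```

Returning `nil` instead of throwing keeps the sketch free of error types; the service above makes the same distinction by throwing `CuratorError.decoding` only after the text fallback also yields nothing.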
@@ -0,0 +1,396 @@
import Foundation
/// Direct YAML editor for `gateway.platforms.<platform>.allowed_<kind>:` list
/// blocks. Hermes v0.13 added these list-valued keys, but `hermes config set`
/// stringifies arrays (the same gotcha that forced Home Assistant's watch
/// lists to stay read-only). The Messaging Gateway editor sidesteps the CLI
/// for these keys by editing `~/.hermes/config.yaml` directly.
///
/// **Pure-function `setList`** is the heart of the editor: it splits the
/// YAML into lines, finds (or creates) the targeted block, and splices the
/// new items in while preserving every byte outside the block. The async
/// `saveList` wrapper wires it through `ServerContext.readText` /
/// `writeText`, so the same code path works on `.local` and `.ssh` servers:
/// local goes through `LocalTransport`, remote round-trips via SCP.
///
/// **Scalar fields don't go through here.** `busy_ack_enabled`,
/// `gateway_restart_notification`, and `slash_command_notice_ttl_seconds`
/// are scalars that `hermes config set` handles cleanly; `GatewayBehaviorViewModel`
/// routes those through `PlatformSetupHelpers.saveForm` like every other
/// platform toggle.
///
/// **Why not use a real YAML library?** Same answer as everywhere else in
/// Scarf: zero external dependencies. The Hermes config flavor is a tightly
/// scoped subset (indent-based blocks, scalar-or-list values, no anchors /
/// aliases / flow style), and the targeted edit doesn't need to understand
/// the full grammar, only "find this block, replace it, preserve the rest".
public enum GatewayConfigWriter {
/// Insert or replace `gateway.platforms.<platform>.<key>:` block in the
/// YAML, preserving everything else byte-for-byte.
///
/// - When `items` is empty, the block (and only the block; siblings
/// stay) is removed from the YAML if present, and the function is a
/// no-op if the block was already absent.
/// - When the block is absent and `items` is non-empty, the function
/// appends a `gateway:` / `platforms:` / `<platform>:` scaffold at
/// the end of the file, creating any missing ancestors. This keeps
/// the function idempotent on round-trip but means the new block is
/// appended rather than spliced into an existing top-level
/// `gateway:` section. (See WS-5 plan §Notes for the trade-off; the
/// alternative would mean reflowing existing siblings, which is the
/// exact opposite of "preserve the surrounding YAML byte-for-byte".)
/// - When the block is present, its bullet rows are replaced with the
/// new items at the same indent. Items containing YAML-special
/// characters (such as `:` `#` `&` `*` `>` `|`, a leading `@`, `-`, or
/// quote, or leading / trailing spaces) are single-quoted defensively.
public static func setList(
in yaml: String,
platform: String,
key: String,
items: [String]
) -> String {
let blockIndent = 6 // `gateway:\n platforms:\n <platform>:\n <key>:`
let itemIndent = 8
let lines = yaml.components(separatedBy: "\n")
let blockHeaderText = " \(key):" // indented match for find()
let trimmedItems = items.filter { !$0.trimmingCharacters(in: .whitespaces).isEmpty }
// Locate `<key>:` whose lineage is gateway -> platforms -> <platform>.
// We find the start of the gateway block, walk down the indent tree, and
// bail out if any ancestor is missing.
let location = locateBlock(
in: lines,
platform: platform,
key: key
)
switch location {
case .found(let blockRange):
return replaceBlock(
in: lines,
blockRange: blockRange,
key: key,
items: trimmedItems,
blockIndent: blockIndent,
itemIndent: itemIndent
)
case .platformPresentKeyMissing(let insertAfter):
if trimmedItems.isEmpty {
// No-op: empty target, no existing block.
return yaml
}
return spliceNewKey(
lines: lines,
insertAfterLineIndex: insertAfter,
key: key,
items: trimmedItems,
itemIndent: itemIndent
)
case .ancestorMissing:
if trimmedItems.isEmpty {
// Nothing to write, no existing block.
return yaml
}
return appendScaffold(
yaml: yaml,
platform: platform,
key: key,
items: trimmedItems
)
}
// (unreachable: the switch is exhaustive)
_ = blockHeaderText
}
/// Async wrapper that reads, mutates, writes via the given context.
/// Returns `false` on read or write failure.
///
/// The actual I/O happens via `ServerContext.readText` / `writeText`,
/// which are `nonisolated` and safe to call from `MainActor` for the
/// short config.yaml writes the platform setup forms run. For remote
/// hosts the call rounds through SCP under `Task.detached` upstream
/// (per Swift 6 concurrency rules in `~/.claude/CLAUDE.md`).
public static func saveList(
context: ServerContext,
platform: String,
key: String,
items: [String]
) -> Bool {
let path = context.paths.configYAML
let existing = context.readText(path) ?? ""
let updated = setList(in: existing, platform: platform, key: key, items: items)
if updated == existing { return true } // no-op: already correct
return context.writeText(path, content: updated)
}
// MARK: - Internals
/// Result of locating the targeted block in the YAML line array.
private enum BlockLocation {
/// Block found; the closed range covers the header line + all bullet
/// rows attributed to it. Replacing this slice with the new block
/// completes the edit.
case found(ClosedRange<Int>)
/// `gateway -> platforms -> <platform>` exists, but the leaf `<key>:`
/// is absent under it. The associated value is the line index after
/// which the new key should be inserted (last line in the platform's
/// block, or the platform header itself if the platform's body is
/// empty).
case platformPresentKeyMissing(insertAfter: Int)
/// One of the ancestor section headers is missing. The whole
/// scaffold needs to be appended.
case ancestorMissing
}
private static func locateBlock(
in lines: [String],
platform: String,
key: String
) -> BlockLocation {
// Walk top-to-bottom looking for `gateway:` at indent 0.
guard let gatewayIdx = firstIndex(of: lines, headerLineEqualTo: "gateway:", indent: 0) else {
return .ancestorMissing
}
// Inside `gateway:`, find ` platforms:` at indent 2.
guard let platformsIdx = firstIndex(
of: lines,
after: gatewayIdx,
headerLineEqualTo: "platforms:",
indent: 2,
stopWhenIndentLessThan: 2
) else {
return .ancestorMissing
}
// Inside `platforms:`, find ` <platform>:` at indent 4.
guard let platformIdx = firstIndex(
of: lines,
after: platformsIdx,
headerLineEqualTo: "\(platform):",
indent: 4,
stopWhenIndentLessThan: 4
) else {
return .ancestorMissing
}
// Inside the platform block, find `<key>:` at indent 6, OR the end
// of the platform's body if the key is missing.
var keyIdx: Int?
var lastBodyIdx = platformIdx
var i = platformIdx + 1
while i < lines.count {
let line = lines[i]
let indent = leadingSpaces(line)
let trimmed = line.trimmingCharacters(in: .whitespaces)
if trimmed.isEmpty || trimmed.hasPrefix("#") {
i += 1
continue
}
if indent < 6 {
// Out of the platform's block.
break
}
if indent == 6 && trimmed == "\(key):" {
keyIdx = i
break
}
lastBodyIdx = i
i += 1
}
guard let keyIdx else {
return .platformPresentKeyMissing(insertAfter: lastBodyIdx)
}
// Walk down the bullet rows until we leave the block (indent shrinks
// below the bullet indent OR we hit a sibling key at indent 6).
var endIdx = keyIdx
var j = keyIdx + 1
while j < lines.count {
let line = lines[j]
let trimmed = line.trimmingCharacters(in: .whitespaces)
if trimmed.isEmpty || trimmed.hasPrefix("#") {
j += 1
continue
}
let indent = leadingSpaces(line)
// Block-style YAML allows bullets at the same indent as their
// parent key; tolerate 6-space `- item` rows alongside the
// canonical 8-space ones.
let isBullet = trimmed.hasPrefix("- ")
if isBullet && (indent == 8 || indent == 6) {
endIdx = j
j += 1
continue
}
// Any other line at indent 6 or less (a sibling key or a shallower
// section) ends the block.
if indent <= 6 {
break
}
// Deeper-indented line that is not a recognised bullet: unusual, but
// tolerate it (e.g. inline continuation). Treat as still in the block
// and advance.
endIdx = j
j += 1
}
return .found(keyIdx...endIdx)
}
private static func replaceBlock(
in lines: [String],
blockRange: ClosedRange<Int>,
key: String,
items: [String],
blockIndent: Int,
itemIndent: Int
) -> String {
var newLines = Array(lines.prefix(blockRange.lowerBound))
if !items.isEmpty {
newLines.append("\(spaces(blockIndent))\(key):")
for item in items {
newLines.append("\(spaces(itemIndent))- \(yamlQuoteIfNeeded(item))")
}
}
// Drop the old block but keep everything after it.
let tailStart = blockRange.upperBound + 1
if tailStart < lines.count {
newLines.append(contentsOf: lines.suffix(from: tailStart))
}
return newLines.joined(separator: "\n")
}
private static func spliceNewKey(
lines: [String],
insertAfterLineIndex: Int,
key: String,
items: [String],
itemIndent: Int
) -> String {
var newLines = Array(lines.prefix(insertAfterLineIndex + 1))
newLines.append(" \(key):")
for item in items {
newLines.append("\(spaces(itemIndent))- \(yamlQuoteIfNeeded(item))")
}
if insertAfterLineIndex + 1 < lines.count {
newLines.append(contentsOf: lines.suffix(from: insertAfterLineIndex + 1))
}
return newLines.joined(separator: "\n")
}
private static func appendScaffold(
yaml: String,
platform: String,
key: String,
items: [String]
) -> String {
var trimmed = yaml
// Ensure exactly one trailing newline before the appended block,
// so the scaffold sits on its own line cleanly.
while trimmed.hasSuffix("\n\n") {
trimmed.removeLast()
}
if !trimmed.isEmpty && !trimmed.hasSuffix("\n") {
trimmed.append("\n")
}
var lines: [String] = []
if !trimmed.isEmpty {
lines.append("") // blank separator
}
lines.append("gateway:")
lines.append(" platforms:")
lines.append(" \(platform):")
lines.append(" \(key):")
for item in items {
lines.append(" - \(yamlQuoteIfNeeded(item))")
}
lines.append("") // trailing newline so subsequent edits append cleanly
return trimmed + lines.joined(separator: "\n")
}
// MARK: - YAML scanning helpers
private static func leadingSpaces(_ line: String) -> Int {
var n = 0
for c in line {
if c == " " { n += 1 } else { break }
}
return n
}
/// Find the first line whose trimmed content equals `header` AND whose
/// leading-space count equals `indent`. Comment-only and blank lines
/// are skipped. Returns the line's index or `nil`.
private static func firstIndex(
of lines: [String],
headerLineEqualTo header: String,
indent: Int
) -> Int? {
for (i, line) in lines.enumerated() {
let trimmed = line.trimmingCharacters(in: .whitespaces)
if trimmed.isEmpty || trimmed.hasPrefix("#") { continue }
if leadingSpaces(line) == indent && trimmed == header {
return i
}
}
return nil
}
/// Scoped variant: search starts at `after + 1`, stops if a line at indent
/// `< stopWhenIndentLessThan` is encountered (we've left the parent block).
private static func firstIndex(
of lines: [String],
after: Int,
headerLineEqualTo header: String,
indent: Int,
stopWhenIndentLessThan: Int
) -> Int? {
var i = after + 1
while i < lines.count {
let line = lines[i]
let trimmed = line.trimmingCharacters(in: .whitespaces)
if trimmed.isEmpty || trimmed.hasPrefix("#") {
i += 1
continue
}
let lineIndent = leadingSpaces(line)
if lineIndent < stopWhenIndentLessThan {
return nil
}
if lineIndent == indent && trimmed == header {
return i
}
i += 1
}
return nil
}
private static func spaces(_ n: Int) -> String {
String(repeating: " ", count: n)
}
/// Quote a YAML scalar if it contains characters that the parser would
/// otherwise interpret as structure (colon, hash, leading at-sign, etc.).
/// Plain alphanumeric IDs (the common case for Slack channel IDs and
/// Telegram numeric chat IDs) are emitted unquoted.
private static func yamlQuoteIfNeeded(_ raw: String) -> String {
if raw.isEmpty { return "''" }
let needsQuoting = raw.contains(":")
|| raw.contains("#")
|| raw.contains("&")
|| raw.contains("*")
|| raw.contains(">")
|| raw.contains("|")
|| raw.first == "@"
|| raw.first == "-"
|| raw.first == " "
|| raw.last == " "
|| raw.first == "\""
|| raw.first == "'"
if !needsQuoting { return raw }
// Single-quote, escaping any embedded single quotes by doubling.
let escaped = raw.replacingOccurrences(of: "'", with: "''")
return "'\(escaped)'"
}
}
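The quoting rule and the scaffold layout described above are easier to check against a concrete rendering. The following is a simplified sketch under stated assumptions, not the writer itself: `quoteIfNeeded` and `renderScaffold` are illustrative names, and the platform key `slack` is assumed for the example (the diff only says `allowed_channels` applies to Slack, not how the platform is keyed in `config.yaml`).

```swift
import Foundation

/// Simplified restatement of the quoting rule: single-quote any item that
/// contains a structural character or starts / ends with a risky one, and
/// escape embedded single quotes by doubling them.
func quoteIfNeeded(_ raw: String) -> String {
    if raw.isEmpty { return "''" }
    let special: Set<Character> = [":", "#", "&", "*", ">", "|"]
    let needsQuoting = raw.contains(where: { special.contains($0) })
        || raw.hasPrefix("@") || raw.hasPrefix("-")
        || raw.hasPrefix(" ") || raw.hasSuffix(" ")
        || raw.hasPrefix("\"") || raw.hasPrefix("'")
    guard needsQuoting else { return raw }
    return "'" + raw.replacingOccurrences(of: "'", with: "''") + "'"
}

/// Renders the four-level scaffold (indents 0 / 2 / 4 / 6, bullets at 8)
/// that gets appended when no `gateway:` ancestor exists yet.
func renderScaffold(platform: String, key: String, items: [String]) -> String {
    var lines = [
        "gateway:",
        "  platforms:",
        "    \(platform):",
        "      \(key):",
    ]
    lines += items.map { "        - " + quoteIfNeeded($0) }
    return lines.joined(separator: "\n")
}

// Assumed platform key and sample IDs, for illustration only.
print(renderScaffold(
    platform: "slack",
    key: "allowed_channels",
    items: ["C0123ABC", "#general", "it's:here"]
))
// gateway:
//   platforms:
//     slack:
//       allowed_channels:
//         - C0123ABC
//         - '#general'
//         - 'it''s:here'
```

Single quotes are the conservative choice here because YAML single-quoted scalars have exactly one escape (the doubled quote), so no backslash handling is needed.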
@@ -0,0 +1,435 @@
import Foundation
import Observation
#if canImport(os)
import os
#endif
/// What this Hermes installation can do, derived from `hermes --version`.
///
/// Scarf tracks Hermes feature releases by date-version + semver. v0.12 added
/// a dozen surfaces (Curator, Kanban, multimodal ACP, ...) and removed a few
/// (`flush_memories` aux task); v0.13 added Persistent Goals, ACP `/queue`,
/// Kanban diagnostics + recovery UX, Curator archive/prune, Google Chat (20th
/// platform), cross-platform allowlists, MCP SSE transport, Cron `no_agent`
/// mode, Web Tools per-capability backends, Profiles `--no-skills`, and a
/// handful of UX additions. UI that branches on these surfaces calls the
/// boolean accessors here so older Hermes installs degrade silently instead
/// of throwing on an unknown CLI subcommand.
///
/// Pure value type with no side effects. The async detection lives in
/// `HermesCapabilitiesStore`.
public struct HermesCapabilities: Sendable, Equatable {
/// Raw version line as printed by `hermes --version`. Preserved verbatim
/// so diagnostics views can show the exact string Scarf saw.
public let versionLine: String
/// Parsed `0.X.Y`. `nil` when the output didn't match the expected format
/// (e.g. Hermes returned an error, or a future format change).
public let semver: SemVer?
/// Parsed `YYYY.M.D` from the parenthesized date suffix. `nil` when
/// absent; older Hermes builds didn't always emit it.
public let dateVersion: DateVersion?
public init(versionLine: String, semver: SemVer?, dateVersion: DateVersion?) {
self.versionLine = versionLine
self.semver = semver
self.dateVersion = dateVersion
}
/// Sentinel for "not yet detected" / "detection failed". All capability
/// flags resolve to `false` so unguarded UI stays hidden until the real
/// version lands.
public static let empty = HermesCapabilities(
versionLine: "",
semver: nil,
dateVersion: nil
)
public var detected: Bool { semver != nil }
// MARK: - Capability flags
//
// Add a new flag here when Scarf gains UI that conditionally branches on
// a Hermes capability. Keep the comparison conservative: a flag introduced
// in v0.13.0 should gate on `>= 0.13.0`, not `>= 0.13.5`, so users on
// an early 0.13 patch still see the surface.
// MARK: v0.12 (v2026.4.30) flags
/// `hermes curator` autonomous skill maintenance (v0.12+).
public var hasCurator: Bool { atLeastSemver(0, 12, 0) }
/// `hermes fallback` provider management (v0.12+).
public var hasFallbackCommand: Bool { atLeastSemver(0, 12, 0) }
/// `hermes kanban` task board CLI (v0.12+).
public var hasKanban: Bool { atLeastSemver(0, 12, 0) }
/// `hermes -z <prompt>` non-interactive one-shot mode (v0.12+).
public var hasOneShot: Bool { atLeastSemver(0, 12, 0) }
/// `hermes skills install <https-url>` direct-URL install (v0.12+).
public var hasSkillURLInstall: Bool { atLeastSemver(0, 12, 0) }
/// ACP `session/prompt` accepts image content blocks (v0.12+).
public var hasACPImagePrompts: Bool { atLeastSemver(0, 12, 0) }
/// `hermes update --check` preflight (v0.12+).
public var hasUpdateCheck: Bool { atLeastSemver(0, 12, 0) }
/// Pluggable TTS providers including native Piper (v0.12+).
public var hasPiperTTS: Bool { atLeastSemver(0, 12, 0) }
/// `terminal.backend = vercel` Vercel Sandbox option (v0.12+).
public var hasVercelTerminal: Bool { atLeastSemver(0, 12, 0) }
/// `auxiliary.flush_memories` config row was removed in v0.12.
/// Inverse semantics: `true` means the row should still be shown.
public var hasFlushMemoriesAux: Bool {
guard let s = semver else { return false } // unknown -> hide
return s < SemVer(major: 0, minor: 12, patch: 0) // pre-v0.12 only
}
/// `auxiliary.curator` aux task is configurable (v0.12+).
public var hasCuratorAux: Bool { atLeastSemver(0, 12, 0) }
/// Microsoft Teams (19th platform) and Yuanbao (18th) added in v0.12.
public var hasTeamsPlatform: Bool { atLeastSemver(0, 12, 0) }
public var hasYuanbaoPlatform: Bool { atLeastSemver(0, 12, 0) }
/// Cron jobs accept `--workdir` and `--context-from` flags (v0.12+).
public var hasCronWorkdir: Bool { atLeastSemver(0, 12, 0) }
/// `prompt_caching.cache_ttl` config knob (v0.12+).
public var hasPromptCacheTTL: Bool { atLeastSemver(0, 12, 0) }
/// `redaction.enabled` is off by default in v0.12, so Scarf surfaces
/// the toggle so users can flip it back on. v0.13 flips the server-side
/// default back to ON; the toggle remains so users on v0.13 can opt out.
public var hasRedactionToggle: Bool { atLeastSemver(0, 12, 0) }
// MARK: v0.13 (v2026.5.7) flags
/// `/goal` slash command + Persistent Goals + Checkpoints v2 single-store
/// (v0.13+). Used by RichChatViewModel to add `/goal` to the
/// non-interruptive command list and to render the "Goal locked" pill in
/// the chat header.
public var hasGoals: Bool { atLeastSemver(0, 13, 0) }
/// `/queue` slash command in the ACP adapter (v0.13+). Queues a prompt
/// to run after the current turn completes without interrupting.
public var hasACPQueue: Bool { atLeastSemver(0, 13, 0) }
/// `/steer` runs as a regular prompt on idle ACP sessions (v0.13+). Pre-
/// v0.13 hosts silently no-op `/steer` when no turn is in flight; with
/// this flag on, Scarf can surface `/steer` even when the agent isn't
/// mid-turn without confusing UX.
public var hasACPSteerOnIdle: Bool { atLeastSemver(0, 13, 0) }
/// Kanban v0.13 reliability surface: hallucination gate on worker-created
/// cards, generic diagnostics engine, per-task `max_retries`, multiline
/// title/body create, `auto_blocked_reason` on blocked tasks, darwin
/// zombie detection. All read through the `kanban show` JSON surface.
public var hasKanbanDiagnostics: Bool { atLeastSemver(0, 13, 0) }
/// `hermes curator archive`, `prune`, and `list-archived` subcommands
/// (v0.13+). The synchronous manual `hermes curator run` lives behind
/// this flag too: pre-v0.13 `run` returns immediately and the work
/// happens in the background.
public var hasCuratorArchive: Bool { atLeastSemver(0, 13, 0) }
/// Google Chat, the 20th messaging-gateway platform (v0.13+).
public var hasGoogleChatPlatform: Bool { atLeastSemver(0, 13, 0) }
/// Cross-platform allowlist keys: `allowed_channels` (Slack / Mattermost
/// / Google Chat), `allowed_chats` (Telegram / WhatsApp), `allowed_rooms`
/// (Matrix / DingTalk). Settable per platform in `config.yaml` (v0.13+).
public var hasGatewayAllowlists: Bool { atLeastSemver(0, 13, 0) }
/// `busy_ack_enabled` config to suppress per-message "agent is working"
/// acks across platforms (v0.13+).
public var hasGatewayBusyAckToggle: Bool { atLeastSemver(0, 13, 0) }
/// Per-platform `gateway_restart_notification` flag controls whether the
/// platform posts a "Gateway restarted" notice on boot (v0.13+).
public var hasGatewayRestartNotification: Bool { atLeastSemver(0, 13, 0) }
/// `hermes gateway list` cross-profile status verb (v0.13+). Lets Scarf
/// show which profile is currently running which platform.
public var hasGatewayList: Bool { atLeastSemver(0, 13, 0) }
/// MCP servers can use SSE transport (v0.13+). Adds an `sse_read_timeout`
/// knob alongside the existing stdio/pipe transports.
public var hasMCPSSETransport: Bool { atLeastSemver(0, 13, 0) }
/// Cron `--no-agent` mode for script-only watchdog jobs (v0.13+). Skips
/// the AI call entirely; useful for keep-alive / periodic-check jobs.
public var hasCronNoAgent: Bool { atLeastSemver(0, 13, 0) }
/// Web Tools split into per-capability backend selection: `web_search`
/// and `web_extract` can now use distinct backends (v0.13+). SearXNG
/// joined as a search-only backend.
public var hasWebToolsBackendSplit: Bool { atLeastSemver(0, 13, 0) }
/// `hermes profile create --no-skills` flag for empty profiles (v0.13+).
public var hasProfileNoSkills: Bool { atLeastSemver(0, 13, 0) }
/// Context compression count surfaced in the status feed (v0.13+). Scarf
/// renders it next to the token count in the chat status bar.
public var hasContextCompressionCount: Bool { atLeastSemver(0, 13, 0) }
/// `/new` slash command accepts an optional session-name argument (v0.13+).
public var hasNewWithSessionName: Bool { atLeastSemver(0, 13, 0) }
/// `hermes update --yes` / `-y` skips interactive prompts (v0.13+). Used
/// by Scarf's "Update Hermes" affordance to run unattended.
public var hasUpdateNonInteractive: Bool { atLeastSemver(0, 13, 0) }
/// OpenRouter response caching toggle in `config.yaml` (v0.13+).
public var hasOpenRouterResponseCache: Bool { atLeastSemver(0, 13, 0) }
/// `image_gen.model` honored from `config.yaml` (v0.13+). Pre-v0.13 the
/// value was advertised but ignored at runtime.
public var hasImageGenModel: Bool { atLeastSemver(0, 13, 0) }
/// `display.language` config key for static-message translation: zh / ja /
/// de / es / fr / uk / tr (v0.13+).
public var hasDisplayLanguage: Bool { atLeastSemver(0, 13, 0) }
/// xAI Custom Voices voice cloning support (v0.13+). Exposed in Scarf
/// as a "Cloning supported" badge next to the xAI TTS provider entry.
public var hasXAIVoiceCloning: Bool { atLeastSemver(0, 13, 0) }
/// `video_analyze` tool: native video understanding on Gemini and
/// compatible models (v0.13+). Hermes handles this transparently inside
/// the agent loop; Scarf has no UI surface yet, but the flag lets future
/// dashboards / activity views light up video-tool annotations.
public var hasVideoAnalyze: Bool { atLeastSemver(0, 13, 0) }
/// `transform_llm_output` plugin hook for shaping LLM output before the
/// conversation receives it (v0.13+). Plugin-author concern; Scarf's
/// PluginsView surfaces it as a documented hook in plugin metadata.
public var hasTransformLLMOutputHook: Bool { atLeastSemver(0, 13, 0) }
// MARK: Convenience predicates
/// Whether the connected host is on the v0.13 line or newer. Convenience
/// for UI copy that needs to switch on the v0.12 -> v0.13 boundary without
/// proxying through a feature-specific flag (e.g. "v0.13 features active"
/// badges, redaction default-state hints). Equivalent to any individual
/// v0.13 flag; prefer this when the call site isn't actually about a
/// specific feature.
public var isV013OrLater: Bool { atLeastSemver(0, 13, 0) }
private func atLeastSemver(_ major: Int, _ minor: Int, _ patch: Int) -> Bool {
guard let s = semver else { return false }
return s >= SemVer(major: major, minor: minor, patch: patch)
}
public struct SemVer: Sendable, Equatable, Comparable, CustomStringConvertible {
public let major: Int
public let minor: Int
public let patch: Int
public init(major: Int, minor: Int, patch: Int) {
self.major = major
self.minor = minor
self.patch = patch
}
public var description: String { "\(major).\(minor).\(patch)" }
public static func < (a: SemVer, b: SemVer) -> Bool {
if a.major != b.major { return a.major < b.major }
if a.minor != b.minor { return a.minor < b.minor }
return a.patch < b.patch
}
}
public struct DateVersion: Sendable, Equatable, Comparable, CustomStringConvertible {
public let year: Int
public let month: Int
public let day: Int
public init(year: Int, month: Int, day: Int) {
self.year = year
self.month = month
self.day = day
}
public var description: String { "\(year).\(month).\(day)" }
public static func < (a: DateVersion, b: DateVersion) -> Bool {
if a.year != b.year { return a.year < b.year }
if a.month != b.month { return a.month < b.month }
return a.day < b.day
}
}
/// Parse a `Hermes Agent v0.12.0 (2026.4.30)` line out of `hermes --version`
/// output. Tolerates leading/trailing whitespace, extra header lines
/// (e.g. `Project:`, `Python:`), and the absence of the parenthesized
/// date suffix.
///
/// Returns `.empty` when no recognizable version line is present so
/// callers don't have to special-case nil.
public static func parse(_ output: String) -> HermesCapabilities {
for raw in output.components(separatedBy: "\n") {
let line = raw.trimmingCharacters(in: .whitespaces)
guard line.contains("Hermes Agent v") else { continue }
return parseLine(line)
}
return .empty
}
/// `Hermes Agent v0.12.0 (2026.4.30)` → semver + date. Returns `.empty`
/// when the line doesn't match. Public for unit tests; production callers
/// should use `parse(_:)`.
public static func parseLine(_ line: String) -> HermesCapabilities {
// Locate the "v" right after "Hermes Agent ". Don't anchor at line
// start: older builds prefix the line with ANSI color codes that Scarf
// would otherwise need to strip.
guard let vRange = line.range(of: "Hermes Agent v") else { return .empty }
let tail = String(line[vRange.upperBound...])
// Read digits separated by dots until we hit non-version content.
// First three components are semver. A trailing `(Y.M.D)` is the
// date version.
let semverEnd = tail.firstIndex(where: { c in
!(c.isNumber || c == ".")
}) ?? tail.endIndex
let semverStr = String(tail[..<semverEnd])
let semverParts = semverStr.split(separator: ".").compactMap { Int($0) }
guard semverParts.count >= 3 else { return .empty }
let semver = SemVer(
major: semverParts[0],
minor: semverParts[1],
patch: semverParts[2]
)
// Optional date suffix.
var dateVersion: DateVersion?
if let openParen = tail.firstIndex(of: "("),
let closeParen = tail.firstIndex(of: ")"),
openParen < closeParen {
let dateStr = tail[tail.index(after: openParen)..<closeParen]
let dateParts = dateStr.split(separator: ".").compactMap { Int($0) }
if dateParts.count == 3 {
dateVersion = DateVersion(
year: dateParts[0],
month: dateParts[1],
day: dateParts[2]
)
}
}
return HermesCapabilities(
versionLine: line,
semver: semver,
dateVersion: dateVersion
)
}
}
/// Per-server capability cache. One per `ContextBoundRoot` (Mac) / iOS scene
/// root, injected via `.environment(_:)`. Refreshes once on init; callers
/// invoke `refresh()` after a Hermes update or when the server changes.
///
/// Not thread-safe across instances; each server gets its own store, and
/// the underlying `runHermesCLI` call is detached so we never block
/// MainActor.
@Observable
@MainActor
public final class HermesCapabilitiesStore {
#if canImport(os)
private let logger = Logger(subsystem: "com.scarf", category: "HermesCapabilities")
#endif
public private(set) var capabilities: HermesCapabilities = .empty
public private(set) var isLoading = true
public let context: ServerContext
private var refreshTask: Task<Void, Never>?
public init(context: ServerContext) {
self.context = context
// Kick off a one-shot detection. Subsequent refreshes are explicit.
// Task captures `[weak self]`, so if the store is freed before
// detection completes the closure simply no-ops.
refreshTask = Task { [weak self] in
await self?.refresh()
}
}
public func refresh() async {
isLoading = true
let context = self.context
let parsed = await Task.detached(priority: .utility) { () -> HermesCapabilities in
return Self.detectSync(context: context)
}.value
self.capabilities = parsed
self.isLoading = false
#if canImport(os)
if parsed.detected {
logger.info("Hermes \(parsed.versionLine, privacy: .public) detected on \(self.context.displayName, privacy: .public)")
} else {
logger.warning("Hermes version not detected on \(self.context.displayName, privacy: .public)")
}
#endif
}
/// Synchronous detection helper. Lives here (not on `HermesCapabilities`)
/// because `ServerContext.makeTransport()` is a side-effecting call that
/// pulls in the platform-appropriate transport (LocalTransport on Mac,
/// CitadelServerTransport on iOS). The pure parser remains side-effect-free.
nonisolated private static func detectSync(context: ServerContext) -> HermesCapabilities {
let transport = context.makeTransport()
let executable = context.paths.hermesBinary
do {
let result = try transport.runProcess(
executable: executable,
args: ["--version"],
stdin: nil,
timeout: 10
)
// `hermes --version` writes to stdout, but Scarf's transport
// helpers occasionally split error output across stderr. Fold
// both so the parser sees whichever stream the line lands on.
let combined = result.stdoutString + result.stderrString
guard result.exitCode == 0 else { return .empty }
return HermesCapabilities.parse(combined)
} catch {
return .empty
}
}
}
// MARK: - SwiftUI environment wiring
#if canImport(SwiftUI)
import SwiftUI
private struct HermesCapabilitiesStoreKey: EnvironmentKey {
static let defaultValue: HermesCapabilitiesStore? = nil
}
extension EnvironmentValues {
/// The active server's capability store. `nil` outside the per-server
/// `ContextBoundRoot`. Callers should treat `nil` and `.empty` capabilities
/// the same: defensive code for harness scenarios (Previews, smoke tests).
public var hermesCapabilities: HermesCapabilitiesStore? {
get { self[HermesCapabilitiesStoreKey.self] }
set { self[HermesCapabilitiesStoreKey.self] = newValue }
}
}
extension View {
/// Inject a `HermesCapabilitiesStore` into the environment. Mirrors the
/// usual `.environment(_:)` shape but routes through the typed key
/// above so callers don't need to import the key.
public func hermesCapabilities(_ store: HermesCapabilitiesStore) -> some View {
environment(\.hermesCapabilities, store)
}
}
#endif
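The version gate above reduces to "find the semver after `Hermes Agent v`, then compare it against a floor". A minimal standalone sketch of that idea follows; `SketchSemVer` and `sketchParseSemVer` are hypothetical names for illustration, not the shipped `HermesCapabilities` API.

```swift
import Foundation

// Hypothetical stand-in for the SemVer type above; not the shipped code.
struct SketchSemVer: Comparable, CustomStringConvertible {
    let major: Int
    let minor: Int
    let patch: Int

    var description: String { "\(major).\(minor).\(patch)" }

    static func < (a: SketchSemVer, b: SketchSemVer) -> Bool {
        // Tuple comparison is lexicographic: major, then minor, then patch.
        (a.major, a.minor, a.patch) < (b.major, b.minor, b.patch)
    }
}

// Extract the semver that follows "Hermes Agent v", ignoring anything before it
// (leading whitespace, ANSI prefixes) and anything after it (the date suffix).
func sketchParseSemVer(_ line: String) -> SketchSemVer? {
    guard let marker = line.range(of: "Hermes Agent v") else { return nil }
    let tail = line[marker.upperBound...]
    let end = tail.firstIndex(where: { !($0.isNumber || $0 == ".") }) ?? tail.endIndex
    let parts = tail[..<end].split(separator: ".").compactMap { Int($0) }
    guard parts.count >= 3 else { return nil }
    return SketchSemVer(major: parts[0], minor: parts[1], patch: parts[2])
}

let olderHost = sketchParseSemVer("Hermes Agent v0.12.0 (2026.4.30)")
let newerHost = sketchParseSemVer("  Hermes Agent v0.13.1")
let minimum = SketchSemVer(major: 0, minor: 13, patch: 0)

// A v0.13 feature flag is then just a comparison against the floor.
let olderHasFeature = olderHost.map { $0 >= minimum } ?? false
let newerHasFeature = newerHost.map { $0 >= minimum } ?? false
```

An undetected version maps to `false`, which matches the source's choice of treating `.empty` as "no capabilities" rather than as an error.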
@@ -0,0 +1,151 @@
import Foundation
/// Cross-profile snapshot returned by `hermes gateway list --json` (Hermes
/// v0.13+). Each profile is one configured Messaging Gateway instance. Most
/// users have a single `default` profile, but power users keep separate
/// profiles for work / personal / project-specific accounts.
public struct GatewayListSnapshot: Sendable, Equatable {
public struct ProfileEntry: Sendable, Equatable {
public let profile: String
public let isRunning: Bool
public let pid: Int?
public let platforms: [String] // platform names connected/configured
public init(
profile: String,
isRunning: Bool,
pid: Int?,
platforms: [String]
) {
self.profile = profile
self.isRunning = isRunning
self.pid = pid
self.platforms = platforms
}
}
public let profiles: [ProfileEntry]
public let detectedAt: Date
public init(profiles: [ProfileEntry], detectedAt: Date = Date()) {
self.profiles = profiles
self.detectedAt = detectedAt
}
/// One-line digest for the Messaging Gateway page header. Format depends
/// on shape:
/// - 0 profiles: `"no profiles configured"`
/// - 1 profile, running: `"default profile · running · slack, telegram"`
/// - 1 profile, stopped: `"default profile · stopped"`
/// - >1 profile: `"3 profiles (2 running) · default: slack, telegram"`
public var headerDigest: String {
if profiles.isEmpty { return "no profiles configured" }
if profiles.count == 1 {
let p = profiles[0]
let state = p.isRunning ? "running" : "stopped"
if p.isRunning && !p.platforms.isEmpty {
let plats = p.platforms.joined(separator: ", ")
return "\(p.profile) profile · \(state) · \(plats)"
}
return "\(p.profile) profile · \(state)"
}
let runningCount = profiles.filter(\.isRunning).count
// Surface the platforms of the first running profile (or first profile
// if none are running) so the digest carries one specimen of context
// beyond just counts.
let highlight = profiles.first(where: \.isRunning) ?? profiles[0]
let platsClause: String
if highlight.platforms.isEmpty {
platsClause = ""
} else {
platsClause = " · \(highlight.profile): \(highlight.platforms.joined(separator: ", "))"
}
return "\(profiles.count) profiles (\(runningCount) running)\(platsClause)"
}
}
/// Pure parser + sync fetcher for `hermes gateway list --json`. Pre-v0.13
/// hosts exit non-zero on the unknown subcommand; the fetcher returns `nil`
/// in that case so the digest row hides itself.
///
/// The detection is **synchronous**; run it from a `Task.detached` to avoid
/// blocking MainActor on remote SSH round-trips. The pure `parse(_:)`
/// helper has no I/O and can be used in tests against canned JSON.
public enum HermesGatewayListService {
/// Parse a JSON blob from `hermes gateway list --json` into a snapshot.
/// Tolerant of unknown keys; returns `nil` for unparseable / empty input.
///
/// // TODO(WS-5-Q3): the JSON shape below is the plan's best-guess.
/// Confirm against actual Hermes v0.13 output once available. Possible
/// alternative shapes:
/// - root array of profile objects (no `profiles` wrapper)
/// - `state` enum string instead of `running` bool
/// - `connected_platforms` instead of `platforms`
/// The parser is intentionally tolerant so a small shape change can be
/// absorbed by tweaking field names without breaking older fixtures.
public static func parse(_ json: Data) -> GatewayListSnapshot? {
guard !json.isEmpty,
let raw = try? JSONSerialization.jsonObject(with: json) else {
return nil
}
// Accept both `{"profiles": [...]}` and a bare `[...]` of profiles.
let profilesArray: [Any]
if let dict = raw as? [String: Any], let arr = dict["profiles"] as? [Any] {
profilesArray = arr
} else if let arr = raw as? [Any] {
profilesArray = arr
} else {
return nil
}
var entries: [GatewayListSnapshot.ProfileEntry] = []
for raw in profilesArray {
guard let obj = raw as? [String: Any] else { continue }
let profile = (obj["name"] as? String)
?? (obj["profile"] as? String)
?? "default"
let isRunning: Bool
if let v = obj["running"] as? Bool {
isRunning = v
} else if let s = obj["state"] as? String {
isRunning = s.lowercased() == "running"
} else {
isRunning = false
}
let pid = obj["pid"] as? Int
let platforms = (obj["platforms"] as? [String])
?? (obj["connected_platforms"] as? [String])
?? []
entries.append(GatewayListSnapshot.ProfileEntry(
profile: profile,
isRunning: isRunning,
pid: pid,
platforms: platforms
))
}
return GatewayListSnapshot(profiles: entries)
}
/// Synchronous fetch helper; call from a `Task.detached`. Returns
/// `nil` when the subcommand fails (pre-v0.13 host) or when the
/// output isn't parseable.
public static func fetch(context: ServerContext) -> GatewayListSnapshot? {
let transport = context.makeTransport()
let executable = context.paths.hermesBinary
do {
let result = try transport.runProcess(
executable: executable,
args: ["gateway", "list", "--json"],
stdin: nil,
timeout: 10
)
guard result.exitCode == 0 else { return nil }
return parse(result.stdout)
} catch {
return nil
}
}
}
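Because the JSON shape is still a best guess (see the TODO above), it may help to see the two tolerated shapes side by side. The payloads below are invented for illustration and are not captured Hermes output; `sketchProfiles` is a hypothetical reduction of the tolerant parse, not the shipped `parse(_:)`.

```swift
import Foundation

// Hypothetical payloads: the real `hermes gateway list --json` shape is unconfirmed.
let wrappedPayload = Data(#"{"profiles":[{"name":"default","running":true,"platforms":["slack"]}]}"#.utf8)
let barePayload = Data(#"[{"profile":"work","state":"stopped","connected_platforms":["telegram"]}]"#.utf8)

// Returns (profile, isRunning, platforms) tuples, or nil when the root is neither
// a `{"profiles": [...]}` object nor a bare array.
func sketchProfiles(_ json: Data) -> [(String, Bool, [String])]? {
    guard let root = try? JSONSerialization.jsonObject(with: json) else { return nil }
    let items: [Any]
    if let dict = root as? [String: Any], let arr = dict["profiles"] as? [Any] {
        items = arr
    } else if let arr = root as? [Any] {
        items = arr
    } else {
        return nil
    }
    return items.compactMap { item -> (String, Bool, [String])? in
        guard let obj = item as? [String: Any] else { return nil }
        let name = (obj["name"] as? String) ?? (obj["profile"] as? String) ?? "default"
        // Prefer the boolean; fall back to the `state` string; default to not running.
        let running = (obj["running"] as? Bool)
            ?? ((obj["state"] as? String)?.lowercased() == "running")
        let platforms = (obj["platforms"] as? [String])
            ?? (obj["connected_platforms"] as? [String]) ?? []
        return (name, running, platforms)
    }
}

let fromWrapped = sketchProfiles(wrappedPayload)
let fromBare = sketchProfiles(barePayload)
```

Each field is read through a chain of fallbacks, so a renamed key costs one extra `??` clause rather than a new decoder.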
@@ -51,7 +51,19 @@ public enum HermesProfileResolver {
/// Returns the default `~/.hermes` when no profile is active OR when
/// the configured profile is invalid (logged) so the worst-case
/// failure mode is "Scarf shows what it always showed before."
///
/// **Test override.** Setting `SCARF_HERMES_HOME` in the environment
/// pins this resolver to the supplied absolute path and bypasses both
/// the cache and the `active_profile` lookup. Used by the E2E test
/// harness (`TemplateE2ETests`, `TemplateInstallUITests`) to drive
/// Scarf against an isolated tmpdir Hermes home so the user's real
/// `~/.hermes` is never touched. Read on every call (cheap; a single
/// `ProcessInfo` lookup) so tests can flip it across test methods
/// without stale-cache surprises.
public static func resolveLocalHome() -> String {
if let override = scarfHermesHomeOverride() {
return override
}
return refreshIfNeeded().home
}
@@ -60,9 +72,55 @@ public enum HermesProfileResolver {
/// reading from (issue #50 follow-up: prevents the next variant
/// of "where's my data → wrong profile" by making it visible).
public static func activeProfileName() -> String {
if scarfHermesHomeOverride() != nil {
return "test-override"
}
return refreshIfNeeded().name
}
/// Sentinel filename that the override path MUST contain for the
/// override to be honored. Without it, production code refuses to
/// pivot off the user's real `~/.hermes` even if the env var is
/// set. This is the "even if a test leaks the env var, even if
/// some non-test process inherits it, the user's data is safe"
/// belt-and-braces guard. Tests create this marker before
/// `setenv("SCARF_HERMES_HOME", ...)`.
public static let testHomeMarkerFilename = ".scarf-test-home-marker"
/// Read `SCARF_HERMES_HOME` from the environment. Returns `nil` when
/// unset or empty so production callers fall through to the profile
/// resolver. The override must:
/// 1. Be an absolute path; relative paths are rejected (they'd
/// land relative to the cwd of whatever process happened to
/// invoke the resolver, which is not what tests want).
/// 2. Contain the sentinel marker file
/// `<path>/<testHomeMarkerFilename>`. Without the marker we
/// treat the env var as untrusted and ignore it. This protects
/// the user's real `~/.hermes/` from any code path that
/// accidentally exports `SCARF_HERMES_HOME` to the wrong value
/// (e.g. a test crashed mid-teardown, an env var inherited
/// from a parent shell, a misconfigured launchctl plist).
/// Both checks are cheap: `FileManager.fileExists` against a
/// known path is microseconds. The override is hot but not
/// hot-hot, so an extra stat per call is negligible.
private static func scarfHermesHomeOverride() -> String? {
guard let raw = ProcessInfo.processInfo.environment["SCARF_HERMES_HOME"] else {
return nil
}
let trimmed = raw.trimmingCharacters(in: .whitespacesAndNewlines)
guard !trimmed.isEmpty else { return nil }
guard trimmed.hasPrefix("/") else {
logger.warning("SCARF_HERMES_HOME=\(trimmed, privacy: .public) is not absolute; ignoring.")
return nil
}
let markerPath = trimmed + "/" + testHomeMarkerFilename
guard FileManager.default.fileExists(atPath: markerPath) else {
logger.warning("SCARF_HERMES_HOME=\(trimmed, privacy: .public) lacks sentinel marker (\(testHomeMarkerFilename, privacy: .public)); ignoring to protect real ~/.hermes.")
return nil
}
return trimmed
}
/// Force a re-read on the next call, regardless of TTL. Test helper.
public static func invalidateCache() {
lock.withLock { $0.resolvedAt = .distantPast }
@@ -95,15 +153,20 @@ public enum HermesProfileResolver {
let defaultHome = defaultRootHome()
let activeFile = defaultHome + "/active_profile"
// Absent file → default profile. This is the common case for users
// who haven't run `hermes profile use ...` and shouldn't generate
// any log noise.
// Absent file → default profile. Common case for users who
// haven't run `hermes profile use ...`. We still log at
// `.info` (key=value, not warning) so support requests can
// pull `log show | grep ProfileResolver` and confirm the
// resolver IS running and IS resolving to the default,
// distinguishing "feature didn't fire" from "feature fired
// and chose default" (issue #70).
guard FileManager.default.fileExists(atPath: activeFile) else {
logger.info("Resolved active Hermes profile: name=default, home=\(defaultHome, privacy: .public), source=default-no-file")
return ("default", defaultHome)
}
guard let raw = try? String(contentsOfFile: activeFile, encoding: .utf8) else {
logger.warning("Found active_profile but could not read it; falling back to default profile.")
logger.warning("Found active_profile but could not read it; falling back to default. home=\(defaultHome, privacy: .public)")
return ("default", defaultHome)
}
@@ -111,6 +174,7 @@ public enum HermesProfileResolver {
// Empty file or explicit "default" → default profile.
if trimmed.isEmpty || trimmed == "default" {
logger.info("Resolved active Hermes profile: name=default, home=\(defaultHome, privacy: .public), source=file-default")
return ("default", defaultHome)
}
@@ -129,7 +193,7 @@ public enum HermesProfileResolver {
return ("default", defaultHome)
}
logger.info("Resolved active Hermes profile to \(trimmed, privacy: .public) at \(profileHome, privacy: .public).")
logger.info("Resolved active Hermes profile: name=\(trimmed, privacy: .public), home=\(profileHome, privacy: .public), source=file")
return (trimmed, profileHome)
}
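The doc comments above say tests create the sentinel marker before exporting `SCARF_HERMES_HOME`. A minimal sketch of that ordering follows; the tmpdir naming is an assumption, and only the env var name and marker filename are taken from the resolver. The final two lines mirror the resolver's checks for illustration and do not call it.

```swift
import Foundation

// Assumed test-side setup; only the env var and marker filename come from the resolver above.
let markerName = ".scarf-test-home-marker"
let testHome = FileManager.default.temporaryDirectory
    .appendingPathComponent("scarf-e2e-\(UUID().uuidString)")

do {
    try FileManager.default.createDirectory(at: testHome, withIntermediateDirectories: true)
    // The marker must exist before the env var is set, or the override is ignored.
    try Data().write(to: testHome.appendingPathComponent(markerName))
    setenv("SCARF_HERMES_HOME", testHome.path, 1)
} catch {
    print("test home setup failed: \(error)")
}

// Mirror of the resolver's two checks: absolute path, marker present.
let exported = getenv("SCARF_HERMES_HOME").map { String(cString: $0) } ?? ""
let wouldBeHonored = exported.hasPrefix("/")
    && FileManager.default.fileExists(atPath: exported + "/" + markerName)
```

Writing the marker first means a crash between the two steps leaves a harmless directory, while the reverse order would leave an env var that the resolver rejects and logs.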
@@ -0,0 +1,34 @@
import Foundation
/// Pure helpers that build argv arrays for `hermes update` invocations.
///
/// Lives in ScarfCore so the eventual UI surface (Mac / iOS / remote)
/// shares flag selection. There is no in-app "Update Hermes" affordance
/// in v2.7.5 (Sparkle handles Scarf-self-update and `hermes update` is
/// invoked by users in their terminal), but capability-gated flag logic
/// is forward-compat plumbing that the future affordance will call. Each
/// helper is a `nonisolated static` pure function: no transport, no
/// MainActor, no mocking surface required.
public enum HermesUpdaterCommandBuilder {
/// Argv for an `hermes update` invocation, capability-gated.
///
/// Pre-v0.12 hosts only had `update` (no flags). v0.12+ accepts
/// `--check` for preflight. v0.13+ accepts `--yes` / `-y` for
/// unattended runs (skips the interactive confirmation prompt).
/// Flags are silently dropped when the connected host can't honor
/// them so callers don't need to branch on capabilities themselves.
public static func updateArgv(
capabilities: HermesCapabilities,
unattended: Bool,
checkOnly: Bool
) -> [String] {
var args: [String] = ["update"]
if checkOnly && capabilities.hasUpdateCheck {
args.append("--check")
}
if unattended && capabilities.hasUpdateNonInteractive {
args.append("--yes")
}
return args
}
}
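The flag-dropping behavior is easiest to see across three host generations. The sketch below restates the builder against a hypothetical `SketchCapabilities` struct that carries only the two flags the builder reads; it is an illustration, not the real `HermesCapabilities` type.

```swift
// Hypothetical stand-in for HermesCapabilities, reduced to the two flags the builder reads.
struct SketchCapabilities {
    let hasUpdateCheck: Bool          // v0.12+ per the doc comment above
    let hasUpdateNonInteractive: Bool // v0.13+ per the doc comment above
}

// Same gating logic as `updateArgv`: a requested flag is appended only if the host supports it.
func sketchUpdateArgv(_ caps: SketchCapabilities, unattended: Bool, checkOnly: Bool) -> [String] {
    var args = ["update"]
    if checkOnly && caps.hasUpdateCheck { args.append("--check") }
    if unattended && caps.hasUpdateNonInteractive { args.append("--yes") }
    return args
}

let capsPreV012 = SketchCapabilities(hasUpdateCheck: false, hasUpdateNonInteractive: false)
let capsV012 = SketchCapabilities(hasUpdateCheck: true, hasUpdateNonInteractive: false)
let capsV013 = SketchCapabilities(hasUpdateCheck: true, hasUpdateNonInteractive: true)

// The caller asks for both flags every time; the host generation decides what survives.
let argvPreV012 = sketchUpdateArgv(capsPreV012, unattended: true, checkOnly: true)
let argvV012 = sketchUpdateArgv(capsV012, unattended: true, checkOnly: true)
let argvV013 = sketchUpdateArgv(capsV013, unattended: true, checkOnly: true)
```

One consequence worth noting: on a pre-v0.12 host a `checkOnly` request degrades to a plain `update`, so a future caller that needs a true dry run would still have to consult the capability flag itself.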
@@ -0,0 +1,167 @@
import Foundation
#if canImport(AppKit)
import AppKit
#endif
#if canImport(UIKit)
import UIKit
#endif
#if canImport(CoreImage)
import CoreImage
#endif
/// Downsamples + base64-encodes user-supplied images for ACP transport.
///
/// **Why downsample on the producer side.** Hermes happily forwards the
/// bytes to a vision model, but a 12 MP screenshot at 4 MB is wasteful:
/// it eats 56× more tokens than a 1024×1024 thumbnail and gives the
/// model no extra signal. Cap the long edge at 1568 px (Anthropic's
/// recommended max for Claude vision) and drop quality to JPEG 0.85,
/// which keeps screenshot text crisp while landing under ~300 KB per
/// image. The 5-image-per-message limit (chosen on the producer side)
/// keeps the total prompt payload below ~2 MB.
///
/// **Why detached.** Image loading + downsampling is CPU-bound. Run only
/// from a `Task.detached` context (the encoder type is `Sendable` and
/// every method is `nonisolated`). The companion `ChatImageAttachment`
/// is a Sendable value type so the result hops back to MainActor cleanly.
public struct ImageEncoder: Sendable {
/// Long-edge pixel cap. 1568 is Anthropic's recommended ceiling for
/// Claude vision input; past it, the provider downsamples server-side
/// and we just paid for the extra bytes. Tweak only with vision-model
/// guidance from Hermes side.
public static let maxLongEdge: CGFloat = 1568
/// JPEG quality factor. 0.85 is the inflection point above which
/// file size jumps quickly without obvious visual gain on screenshots
/// or photographs.
public static let jpegQuality: CGFloat = 0.85
/// Long-edge cap for the inline thumbnail rendered in the composer
/// chip. Kept under the system thumbnail size so `Image(data:)`
/// renders without extra resampling.
public static let thumbnailLongEdge: CGFloat = 256
public init() {}
public enum EncoderError: Error, LocalizedError {
case unsupportedFormat
case decodeFailed
case encodeFailed
case empty
public var errorDescription: String? {
switch self {
case .unsupportedFormat: return "Image format not recognized"
case .decodeFailed: return "Couldn't decode image data"
case .encodeFailed: return "Couldn't encode image as JPEG"
case .empty: return "Image data was empty"
}
}
}
/// Encode raw bytes (from a paste/drop/picker) into a wire-ready
/// attachment. Detached-only; never call from MainActor. The
/// originating bytes are not retained beyond this call.
public nonisolated func encode(
rawBytes: Data,
sourceFilename: String? = nil
) throws -> ChatImageAttachment {
guard !rawBytes.isEmpty else { throw EncoderError.empty }
ScarfMon.event(.render, "imageEncoder.input.bytes", count: 1, bytes: rawBytes.count)
return try ScarfMon.measure(.render, "imageEncoder.downsample") {
#if canImport(AppKit)
guard let nsImage = NSImage(data: rawBytes) else { throw EncoderError.decodeFailed }
let targetSize = Self.fittedSize(for: nsImage.size, maxLongEdge: Self.maxLongEdge)
let mainData = try Self.jpegBytes(from: nsImage, size: targetSize)
let thumbSize = Self.fittedSize(for: nsImage.size, maxLongEdge: Self.thumbnailLongEdge)
let thumbData = try? Self.jpegBytes(from: nsImage, size: thumbSize)
ScarfMon.event(.render, "imageEncoder.bytes", count: 1, bytes: mainData.count)
return ChatImageAttachment(
mimeType: "image/jpeg",
base64Data: mainData.base64EncodedString(),
thumbnailBase64: thumbData?.base64EncodedString(),
filename: sourceFilename,
approximateByteCount: mainData.count
)
#elseif canImport(UIKit)
guard let uiImage = UIImage(data: rawBytes) else { throw EncoderError.decodeFailed }
let targetSize = Self.fittedSize(for: uiImage.size, maxLongEdge: Self.maxLongEdge)
let mainData = try Self.jpegBytes(from: uiImage, size: targetSize)
let thumbSize = Self.fittedSize(for: uiImage.size, maxLongEdge: Self.thumbnailLongEdge)
let thumbData = try? Self.jpegBytes(from: uiImage, size: thumbSize)
ScarfMon.event(.render, "imageEncoder.bytes", count: 1, bytes: mainData.count)
return ChatImageAttachment(
mimeType: "image/jpeg",
base64Data: mainData.base64EncodedString(),
thumbnailBase64: thumbData?.base64EncodedString(),
filename: sourceFilename,
approximateByteCount: mainData.count
)
#else
// Linux CI / unknown platforms: pass through raw bytes if the
// input already looks like a JPEG, else refuse. Keeps the
// package compiling without a hard AppKit/UIKit dep.
if rawBytes.starts(with: [0xFF, 0xD8]) {
ScarfMon.event(.render, "imageEncoder.bytes", count: 1, bytes: rawBytes.count)
return ChatImageAttachment(
mimeType: "image/jpeg",
base64Data: rawBytes.base64EncodedString(),
thumbnailBase64: nil,
filename: sourceFilename,
approximateByteCount: rawBytes.count
)
}
throw EncoderError.unsupportedFormat
#endif
}
}
nonisolated private static func fittedSize(for source: CGSize, maxLongEdge: CGFloat) -> CGSize {
let longest = max(source.width, source.height)
if longest <= maxLongEdge { return source }
let scale = maxLongEdge / longest
return CGSize(
width: floor(source.width * scale),
height: floor(source.height * scale)
)
}
#if canImport(AppKit)
nonisolated private static func jpegBytes(from image: NSImage, size: CGSize) throws -> Data {
let resized = NSImage(size: size)
resized.lockFocus()
NSGraphicsContext.current?.imageInterpolation = .high
image.draw(
in: CGRect(origin: .zero, size: size),
from: .zero,
operation: .copy,
fraction: 1.0
)
resized.unlockFocus()
guard let tiff = resized.tiffRepresentation,
let rep = NSBitmapImageRep(data: tiff),
let data = rep.representation(
using: .jpeg,
properties: [.compressionFactor: jpegQuality]
)
else {
throw EncoderError.encodeFailed
}
return data
}
#elseif canImport(UIKit)
nonisolated private static func jpegBytes(from image: UIImage, size: CGSize) throws -> Data {
let format = UIGraphicsImageRendererFormat()
format.scale = 1
format.opaque = true
let renderer = UIGraphicsImageRenderer(size: size, format: format)
let resized = renderer.image { _ in
image.draw(in: CGRect(origin: .zero, size: size))
}
guard let data = resized.jpegData(compressionQuality: jpegQuality) else {
throw EncoderError.encodeFailed
}
return data
}
#endif
}
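A worked instance of the long-edge cap may make the downsampling arithmetic concrete. The sketch restates `fittedSize` as a free function over `Double` (an assumption made so it runs without CoreGraphics); the 4032×3024 input is an illustrative 12 MP frame, not a value from the source.

```swift
import Foundation

// Same arithmetic as `fittedSize` above, restated as a free function for illustration.
func sketchFittedSize(width: Double, height: Double, maxLongEdge: Double) -> (width: Double, height: Double) {
    let longest = max(width, height)
    // Images already within the cap are returned unchanged (never upscaled).
    if longest <= maxLongEdge { return (width, height) }
    let scale = maxLongEdge / longest
    // Round down so the long edge never exceeds the cap.
    return ((width * scale).rounded(.down), (height * scale).rounded(.down))
}

// 12 MP landscape frame: scale = 1568 / 4032, so both edges shrink by the same factor.
let largeFrame = sketchFittedSize(width: 4032, height: 3024, maxLongEdge: 1568)
// Already under the cap: passes through untouched.
let smallFrame = sketchFittedSize(width: 800, height: 600, maxLongEdge: 1568)
// The composer thumbnail reuses the same function with the 256 px cap.
let thumbFrame = sketchFittedSize(width: 800, height: 600, maxLongEdge: 256)
```

The aspect ratio is preserved because a single scale factor is applied to both edges; only the flooring can shift it by at most one pixel.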
@@ -0,0 +1,558 @@
import Foundation
#if canImport(os)
import os
#endif
/// Async, transport-aware client for `hermes kanban ...`. Wraps every CLI
/// verb the v0.12 board exposes in a typed Swift surface.
///
/// **Concurrency.** This is a pure-I/O `actor`; no UI state. View models
/// (`@MainActor` `@Observable`) hold a service reference and `await`
/// methods. Each public method serializes through the actor, but the
/// underlying CLI invocation runs on a `Task.detached(priority: .utility)`
/// so two concurrent reads from different VMs don't queue end-to-end on
/// a single thread.
///
/// **Hermes constraints surfaced as Swift constraints:**
/// - There is no `update` verb, so there's no `update(taskId:title:body:)`.
/// Mutations after create are state transitions (assign / claim /
/// complete / block / unblock / archive / comment) or new comments.
/// - The board is global with optional `tenant` namespacing; pass a
/// tenant via `KanbanListFilter.tenant` for project-scoped views.
/// - The CLI prints `"no matching tasks"` instead of `[]` when nothing
/// matches a filter. We fold that into `[]` rather than throwing.
public actor KanbanService {
#if canImport(os)
private static let logger = Logger(subsystem: "com.scarf", category: "KanbanService")
#endif
private let context: ServerContext
public init(context: ServerContext) {
self.context = context
}
// MARK: - Reads
public func list(_ filter: KanbanListFilter = .all) async throws -> [HermesKanbanTask] {
var args = ["kanban", "list"]
args.append(contentsOf: filter.argv())
let (code, stdout, stderr) = await runHermes(args: args, timeout: 20)
try ensureSuccess(code: code, stdout: stdout, stderr: stderr, verb: "list")
// Empty filter on an empty board prints "no matching tasks" instead
// of `[]`. Treat as empty rather than letting the JSON decode fail.
if stdout.contains("no matching tasks") {
return []
}
guard let data = stdout.data(using: .utf8) else {
throw KanbanError.decoding(message: "non-UTF8 stdout")
}
do {
return try JSONDecoder().decode([HermesKanbanTask].self, from: data)
} catch {
throw KanbanError.decoding(message: error.localizedDescription)
}
}
public func show(taskId: String) async throws -> HermesKanbanTaskDetail {
let args = ["kanban", "show", taskId, "--json"]
let (code, stdout, stderr) = await runHermes(args: args, timeout: 15)
try ensureSuccess(code: code, stdout: stdout, stderr: stderr, verb: "show")
guard let data = stdout.data(using: .utf8) else {
throw KanbanError.decoding(message: "non-UTF8 stdout")
}
do {
return try JSONDecoder().decode(HermesKanbanTaskDetail.self, from: data)
} catch {
throw KanbanError.decoding(message: error.localizedDescription)
}
}
public func runs(taskId: String) async throws -> [HermesKanbanRun] {
let args = ["kanban", "runs", taskId, "--json"]
let (code, stdout, stderr) = await runHermes(args: args, timeout: 15)
try ensureSuccess(code: code, stdout: stdout, stderr: stderr, verb: "runs")
guard let data = stdout.data(using: .utf8) else {
throw KanbanError.decoding(message: "non-UTF8 stdout")
}
do {
return try JSONDecoder().decode([HermesKanbanRun].self, from: data)
} catch {
// Some Hermes builds emit a `{"runs": [...]}` envelope.
struct Wrapper: Decodable { let runs: [HermesKanbanRun] }
if let wrapped = try? JSONDecoder().decode(Wrapper.self, from: data) {
return wrapped.runs
}
throw KanbanError.decoding(message: error.localizedDescription)
}
}
public func stats() async throws -> HermesKanbanStats {
let args = ["kanban", "stats", "--json"]
let (code, stdout, stderr) = await runHermes(args: args, timeout: 15)
try ensureSuccess(code: code, stdout: stdout, stderr: stderr, verb: "stats")
guard let data = stdout.data(using: .utf8) else {
throw KanbanError.decoding(message: "non-UTF8 stdout")
}
do {
return try JSONDecoder().decode(HermesKanbanStats.self, from: data)
} catch {
throw KanbanError.decoding(message: error.localizedDescription)
}
}
/// Print the captured worker log for a task (`hermes kanban log
/// <id>`). Returns whatever `$HERMES_HOME/kanban/logs/<id>` contains.
/// Empty string when the worker hasn't written anything yet (or
/// the task has never been claimed). Pass `tailBytes` to cap the
/// returned size (useful when polling at high cadence).
public func log(taskId: String, tailBytes: Int? = nil) async throws -> String {
var args = ["kanban", "log"]
if let tailBytes {
args.append(contentsOf: ["--tail", String(tailBytes)])
}
args.append(taskId)
let (code, stdout, stderr) = await runHermes(args: args, timeout: 15)
// `kanban log` exits with code 0 even when no log file exists;
// it just prints "No log file." or similar to stdout. Tolerate
// non-zero codes too: some Hermes versions emit a warning to
// stderr and exit 1 when the log dir is missing.
if code != 0 {
let combined = stderr.isEmpty ? stdout : stderr
// Treat "no log" sentinels as empty rather than as errors.
let lower = combined.lowercased()
if lower.contains("no log") || lower.contains("not found") {
return ""
}
throw KanbanError.nonZeroExit(code: code, stderr: combined)
}
return stdout
}
public func assignees() async throws -> [HermesKanbanAssignee] {
// The `assignees` verb doesn't take `--json` consistently across
// 0.12.x, so call it without the flag, try a JSON decode first, and
// fall back to a tab-delimited parse if Hermes printed a human table.
let args = ["kanban", "assignees"]
let (code, stdout, stderr) = await runHermes(args: args, timeout: 15)
try ensureSuccess(code: code, stdout: stdout, stderr: stderr, verb: "assignees")
if let data = stdout.data(using: .utf8),
let arr = try? JSONDecoder().decode([HermesKanbanAssignee].self, from: data) {
return arr
}
// Fallback: each non-blank line of the form
// "<profile>\t<active>\t<total>"
// OR "<profile> <active> <total>" (whitespace separated).
return parseAssigneeTable(stdout)
}
private nonisolated func parseAssigneeTable(_ text: String) -> [HermesKanbanAssignee] {
var result: [HermesKanbanAssignee] = []
// Profile names follow the same convention as `hermes -p <name>`:
// letters, digits, hyphen, underscore. Anything else is
// chrome (header rows, Rich box-drawing, fallback messages
// like "(no assignees; create a profile with `hermes -p
// <name> setup`)") and gets skipped.
for raw in text.split(separator: "\n") {
let line = raw.trimmingCharacters(in: .whitespaces)
if line.isEmpty { continue }
// Skip the column header row.
if line.lowercased().hasPrefix("profile") { continue }
// Skip the empty-state sentinel without trying to tokenize
// it (used to leak "(no" into the picker).
if line.lowercased().contains("no assignees") { continue }
// Skip Rich box-drawing separators (only `─` (U+2500) + whitespace).
if line.unicodeScalars.allSatisfy({ $0.value == 0x2500 || $0.properties.isWhitespace }) {
continue
}
// Strip the active marker `◆` (U+25C6) some `hermes`
// commands prefix to the active profile.
var working = line
if working.hasPrefix("\u{25C6}") {
working = String(working.dropFirst()).trimmingCharacters(in: .whitespaces)
}
let parts = working
.split(whereSeparator: { $0 == "\t" || $0 == " " })
.map { String($0) }
.filter { !$0.isEmpty }
guard let profile = parts.first else { continue }
// Validate: must look like a real profile slug, not a word
// out of an English sentence.
guard profile.range(of: "^[a-zA-Z0-9_-]+$", options: .regularExpression) != nil else {
continue
}
let active = (parts.count > 1) ? Int(parts[1]) ?? 0 : 0
let total = (parts.count > 2) ? Int(parts[2]) ?? 0 : active
result.append(HermesKanbanAssignee(profile: profile, activeCount: active, totalCount: total))
}
return result
}
// MARK: - Writes
public func create(_ request: KanbanCreateRequest) async throws -> HermesKanbanTask {
var args = ["kanban", "create"]
args.append(contentsOf: request.argv())
let (code, stdout, stderr) = await runHermes(args: args, timeout: 30)
try ensureSuccess(code: code, stdout: stdout, stderr: stderr, verb: "create")
guard let data = stdout.data(using: .utf8) else {
throw KanbanError.decoding(message: "non-UTF8 stdout")
}
// Hermes returns the full task object when --json is set.
do {
return try JSONDecoder().decode(HermesKanbanTask.self, from: data)
} catch {
// Some builds emit just the new id on stdout. Fall back to a
// follow-up `show` so the caller always gets a typed task.
let trimmed = stdout.trimmingCharacters(in: .whitespacesAndNewlines)
if !trimmed.isEmpty, !trimmed.contains("\n"), !trimmed.contains("{") {
let detail = try await show(taskId: trimmed)
return detail.task
}
throw KanbanError.decoding(message: error.localizedDescription)
}
}
public func assign(taskId: String, profile: String?) async throws {
let target = (profile?.isEmpty ?? true) ? "none" : profile!
let args = ["kanban", "assign", taskId, target]
let (code, _, stderr) = await runHermes(args: args, timeout: 15)
try ensureSuccess(code: code, stdout: "", stderr: stderr, verb: "assign")
}
@discardableResult
public func claim(taskId: String, ttlSeconds: Int = 900) async throws -> String {
let args = ["kanban", "claim", taskId, "--ttl", String(ttlSeconds)]
let (code, stdout, stderr) = await runHermes(args: args, timeout: 20)
try ensureSuccess(code: code, stdout: stdout, stderr: stderr, verb: "claim")
// claim prints the resolved workspace path on stdout.
return stdout.trimmingCharacters(in: .whitespacesAndNewlines)
}
public func comment(taskId: String, text: String, author: String? = nil) async throws {
var args = ["kanban", "comment"]
if let author, !author.isEmpty {
args.append(contentsOf: ["--author", author])
}
args.append(taskId)
args.append(text)
let (code, _, stderr) = await runHermes(args: args, timeout: 15)
try ensureSuccess(code: code, stdout: "", stderr: stderr, verb: "comment")
}
public func complete(
taskIds: [String],
result: String? = nil,
summary: String? = nil,
metadataJSON: String? = nil
) async throws {
guard !taskIds.isEmpty else { return }
var args = ["kanban", "complete"]
if let result, !result.isEmpty {
args.append(contentsOf: ["--result", result])
}
if let summary, !summary.isEmpty {
args.append(contentsOf: ["--summary", summary])
}
if let metadataJSON, !metadataJSON.isEmpty {
args.append(contentsOf: ["--metadata", metadataJSON])
}
args.append(contentsOf: taskIds)
let (code, _, stderr) = await runHermes(args: args, timeout: 30)
try ensureSuccess(code: code, stdout: "", stderr: stderr, verb: "complete")
}
public func block(taskId: String, reason: String? = nil) async throws {
var args = ["kanban", "block", taskId]
if let reason, !reason.trimmingCharacters(in: .whitespaces).isEmpty {
// Hermes accepts free-form trailing words as the reason.
args.append(contentsOf: reason.split(separator: " ").map(String.init))
}
let (code, _, stderr) = await runHermes(args: args, timeout: 15)
try ensureSuccess(code: code, stdout: "", stderr: stderr, verb: "block")
}
public func unblock(taskIds: [String]) async throws {
guard !taskIds.isEmpty else { return }
var args = ["kanban", "unblock"]
args.append(contentsOf: taskIds)
let (code, _, stderr) = await runHermes(args: args, timeout: 15)
try ensureSuccess(code: code, stdout: "", stderr: stderr, verb: "unblock")
}
public func archive(taskIds: [String]) async throws {
guard !taskIds.isEmpty else { return }
var args = ["kanban", "archive"]
args.append(contentsOf: taskIds)
let (code, _, stderr) = await runHermes(args: args, timeout: 15)
try ensureSuccess(code: code, stdout: "", stderr: stderr, verb: "archive")
}
@discardableResult
public func dispatch(maxTasks: Int? = nil, dryRun: Bool = false) async throws -> KanbanDispatchSummary {
var args = ["kanban", "dispatch", "--json"]
if dryRun { args.append("--dry-run") }
if let maxTasks { args.append(contentsOf: ["--max", String(maxTasks)]) }
let (code, stdout, stderr) = await runHermes(args: args, timeout: 60)
try ensureSuccess(code: code, stdout: stdout, stderr: stderr, verb: "dispatch")
guard let data = stdout.data(using: .utf8) else {
throw KanbanError.decoding(message: "non-UTF8 stdout")
}
do {
return try JSONDecoder().decode(KanbanDispatchSummary.self, from: data)
} catch {
// Older builds may print human output. Return a stub summary.
return KanbanDispatchSummary(promoted: 0, failed: 0, dryRun: dryRun, perTask: [])
}
}
public func link(parent: String, child: String) async throws {
let args = ["kanban", "link", parent, child]
let (code, _, stderr) = await runHermes(args: args, timeout: 15)
try ensureSuccess(code: code, stdout: "", stderr: stderr, verb: "link")
}
public func unlink(parent: String, child: String) async throws {
let args = ["kanban", "unlink", parent, child]
let (code, _, stderr) = await runHermes(args: args, timeout: 15)
try ensureSuccess(code: code, stdout: "", stderr: stderr, verb: "unlink")
}
// MARK: - Hallucination gate (v0.13)
/// Mark a worker-created card as user-verified. Flips
/// `hallucination_gate_status` from `pending` to `verified` so the
/// dispatcher can pick it up. The polling loop picks up the new
/// state on the next tick (and the VM optimistically clears the
/// pending banner immediately on the click).
///
/// **Pre-v0.13 hosts:** the verb doesn't exist; callers MUST gate
/// on `HermesCapabilities.hasKanbanDiagnostics` before invoking this.
/// A pre-v0.13 binary will surface the failure as
/// `KanbanError.nonZeroExit` with stderr containing "unknown command".
// TODO(WS-3-Q1): Confirm the exact CLI verb name for the
// hallucination-gate verify path against a v0.13 binary (`hermes
// kanban --help`). The v0.13 release notes describe "hallucination
// gate + recovery UX" but don't enumerate the verb name. This
// implementation assumes `hermes kanban verify <id>`. If Hermes ships
// it as `hermes kanban gate verify <id>`, `hermes kanban hallucination
// verify <id>`, or another name, update the args here. The Reject
// path does NOT depend on this verb (it routes through
// `archive` + a comment), so the recovery UX stays functional even
// if Verify is a stub for an early v0.13.x.
public func verify(taskId: String) async throws {
let args = ["kanban", "verify", taskId]
let (code, _, stderr) = await runHermes(args: args, timeout: 15)
try ensureSuccess(code: code, stdout: "", stderr: stderr, verb: "verify")
}
/// Reject a worker-created card as a hallucinated reference. There
/// is no dedicated `kanban reject` verb in v0.13; the right action
/// per the v0.13 release notes is to archive the card (the work
/// doesn't exist) with a comment recording the rejection reason for
/// the audit trail. Routing this through the existing `comment` +
/// `archive` verbs keeps the wire shape stable across versions.
///
/// If a future Hermes adds a dedicated `kanban reject` verb, swap
/// the body here; the public surface stays "reject" returning Void.
public func rejectHallucinated(taskId: String) async throws {
// Best-effort comment first so the audit trail records the
// rejection. A failure here shouldn't block the archive; log
// and continue.
do {
try await comment(
taskId: taskId,
text: "Rejected as hallucinated (no underlying work).",
author: nil
)
} catch {
#if canImport(os)
Self.logger.warning("kanban reject: comment failed, proceeding to archive (\(error.localizedDescription, privacy: .public))")
#endif
}
try await archive(taskIds: [taskId])
}
// MARK: - Drag-drop transition mapper
/// Map a board-level column transition to the right Hermes verb call.
/// Returns the list of CLI invocations the caller should run in order.
/// Pure: no I/O. Called from VMs to build an action plan; the VM
/// then either prompts the user (e.g. for a block reason) or calls
/// the matching `KanbanService` methods.
///
/// Forbidden transitions throw `KanbanError.forbiddenTransition`
/// rather than returning an empty plan, so callers can surface the
/// reason to the user.
public nonisolated static func plan(
for transition: KanbanTransition
) throws -> KanbanTransitionPlan {
let from = transition.from
let to = transition.to
if from == to {
return KanbanTransitionPlan(steps: [])
}
// "Done" is terminal: Hermes has no `reopen` verb.
if from == .done {
throw KanbanError.forbiddenTransition(
from: from.displayName,
to: to.displayName,
reason: "Done is terminal — create a follow-up task to continue work."
)
}
// Triage promotion isn't a CLI verb in v0.12; it happens via
// a specifier worker. UI should disallow drag from triage.
if from == .triage {
throw KanbanError.forbiddenTransition(
from: from.displayName,
to: to.displayName,
reason: "Triage tasks are promoted by a specifier agent. Use the specifier worker pipeline."
)
}
// Archive lives outside the board (reachable only via context menu).
if to == .archived {
return KanbanTransitionPlan(steps: [.archive])
}
switch (from, to) {
case (.upNext, .running):
return KanbanTransitionPlan(steps: [.dispatch])
case (.upNext, .blocked):
return KanbanTransitionPlan(steps: [.block(reasonRequired: true)])
case (.upNext, .done):
// Direct todo→done is unusual but allowed (manual checkoff).
return KanbanTransitionPlan(steps: [.complete(resultRequired: false)])
case (.running, .blocked):
return KanbanTransitionPlan(steps: [.block(reasonRequired: true)])
case (.running, .done):
return KanbanTransitionPlan(steps: [.complete(resultRequired: false)])
case (.running, .upNext):
// Release back to ready: no direct verb. Closest is unblock,
// which only works for blocked tasks. Forbid for now.
throw KanbanError.forbiddenTransition(
from: from.displayName,
to: to.displayName,
reason: "Use the inspector's Comment + Unassign actions to hand a running task back."
)
case (.blocked, .upNext):
return KanbanTransitionPlan(steps: [.unblock])
case (.blocked, .running):
return KanbanTransitionPlan(steps: [.unblock, .dispatch])
case (.blocked, .done):
return KanbanTransitionPlan(steps: [.unblock, .complete(resultRequired: false)])
default:
throw KanbanError.forbiddenTransition(
from: from.displayName,
to: to.displayName,
reason: "No CLI path exists for this transition."
)
}
}
// MARK: - CLI invocation
private nonisolated func runHermes(
args: [String],
timeout: TimeInterval
) async -> (exitCode: Int32, stdout: String, stderr: String) {
let context = self.context
return await Task.detached(priority: .utility) { () -> (Int32, String, String) in
let transport = context.makeTransport()
let executable = context.paths.hermesBinary
do {
let result = try transport.runProcess(
executable: executable,
args: args,
stdin: nil,
timeout: timeout
)
return (result.exitCode, result.stdoutString, result.stderrString)
} catch let error as TransportError {
let message = error.diagnosticStderr.isEmpty
? (error.errorDescription ?? "transport error")
: error.diagnosticStderr
return (-1, "", message)
} catch {
return (-1, "", error.localizedDescription)
}
}.value
}
private nonisolated func ensureSuccess(
code: Int32,
stdout: String,
stderr: String,
verb: String
) throws {
guard code != 0 else { return }
if code == -1 && stderr.lowercased().contains("hermes binary not found") {
throw KanbanError.cliMissing
}
let combined = stderr.isEmpty ? stdout : stderr
#if canImport(os)
Self.logger.warning("kanban \(verb) exit=\(code, privacy: .public) stderr=\(combined, privacy: .public)")
#endif
throw KanbanError.nonZeroExit(code: code, stderr: combined)
}
}
// MARK: - Transition planning
/// Source/destination columns for a single drag-drop. Comparable to
/// SwiftUI's `.dropDestination` payload but kept Sendable + Hashable
/// so it can also drive iOS context-menu "Move to" actions.
public struct KanbanTransition: Sendable, Hashable {
public let from: KanbanBoardColumn
public let to: KanbanBoardColumn
public init(from: KanbanBoardColumn, to: KanbanBoardColumn) {
self.from = from
self.to = to
}
}
/// One Hermes verb call produced by `KanbanService.plan(for:)`. The VM
/// resolves any user-input requirements (block reason, completion
/// result) before invoking the corresponding actor method.
///
/// **Why `.dispatch` and not `.claim`.** `hermes kanban claim` reserves
/// a task atomically and prints the workspace path, but it's a
/// "manual alternative to the dispatcher" that assumes the caller will
/// spawn the worker themselves. Scarf is not a worker host; the
/// gateway-running dispatcher is. Calling `claim` from drag-drop
/// flipped status to `running` without spawning any work, and the
/// task got reclaimed (stale_lock) ~15 minutes later. The right
/// verb is `dispatch`, which causes the dispatcher to spawn workers
/// for every assigned `ready` task in one pass.
public enum KanbanTransitionStep: Sendable, Equatable {
/// Force a dispatcher pass so the gateway spawns workers for
/// assigned `ready` tasks. Requires that the task have an assignee;
/// the dispatcher silently skips unassigned tasks.
case dispatch
case unblock
case block(reasonRequired: Bool)
case complete(resultRequired: Bool)
case archive
}
public struct KanbanTransitionPlan: Sendable, Equatable {
public let steps: [KanbanTransitionStep]
public init(steps: [KanbanTransitionStep]) {
self.steps = steps
}
public var requiresBlockReason: Bool {
steps.contains { if case .block(true) = $0 { return true } else { return false } }
}
public var requiresCompleteResult: Bool {
steps.contains { if case .complete(true) = $0 { return true } else { return false } }
}
}
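The transition table in `plan(for:)` is pure, so it can be exercised without a Hermes binary. Below is a minimal sketch of the same mapping, assuming simplified stand-in types: `Column`, `Step`, `ForbiddenTransition`, and `planSketch` are hypothetical names for illustration, not the types in this diff.

```swift
// Hypothetical stand-ins for KanbanBoardColumn / KanbanTransitionStep;
// only the table logic mirrors the diff.
enum Column: Hashable { case triage, upNext, running, blocked, done, archived }

enum Step: Equatable {
    case dispatch, unblock, archive
    case block(reasonRequired: Bool)
    case complete(resultRequired: Bool)
}

struct ForbiddenTransition: Error, Equatable {
    let reason: String
}

/// Pure mapping from a column move to an ordered list of steps.
func planSketch(from: Column, to: Column) throws -> [Step] {
    if from == to { return [] }
    // Terminal and specifier-owned columns are rejected before anything else.
    if from == .done { throw ForbiddenTransition(reason: "Done is terminal.") }
    if from == .triage { throw ForbiddenTransition(reason: "Triage is promoted by a specifier.") }
    if to == .archived { return [.archive] }
    switch (from, to) {
    case (.upNext, .running): return [.dispatch]
    case (.upNext, .blocked), (.running, .blocked): return [.block(reasonRequired: true)]
    case (.upNext, .done), (.running, .done): return [.complete(resultRequired: false)]
    case (.blocked, .upNext): return [.unblock]
    case (.blocked, .running): return [.unblock, .dispatch]
    case (.blocked, .done): return [.unblock, .complete(resultRequired: false)]
    default: throw ForbiddenTransition(reason: "No CLI path exists for this transition.")
    }
}
```

Because the `.done` check runs before the `.archived` check, a move from Done to Archived is rejected in this sketch, which matches the ordering of the guards in the diff.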
@@ -0,0 +1,39 @@
import Foundation
/// Cross-platform read-only helper for `<project>/.scarf/manifest.json`'s
/// `kanbanTenant` field. The full `ProjectTemplateManifest` Codable
/// type lives in the Mac app target (with all the install/export
/// machinery); iOS doesn't link it, so this lightweight projection
/// gives both targets a way to read just the tenant slug without
/// duplicating the entire manifest model.
public struct KanbanTenantReader: Sendable {
public let context: ServerContext
public nonisolated init(context: ServerContext) {
self.context = context
}
/// Read the project's Kanban tenant slug, or `nil` if the manifest
/// doesn't exist or doesn't carry one. Cheap: a single JSON parse
/// of a tiny projection.
public nonisolated func tenant(forProjectPath projectPath: String) -> String? {
let manifestPath = projectPath + "/.scarf/manifest.json"
let transport = context.makeTransport()
guard transport.fileExists(manifestPath),
let data = try? transport.readFile(manifestPath)
else {
return nil
}
return Self.tenant(fromManifestData: data)
}
/// Pure-input variant for tests + tooling that already have the
/// JSON bytes in hand. Returns `nil` when the bytes don't decode
/// or the field isn't present.
public nonisolated static func tenant(fromManifestData data: Data) -> String? {
struct Projection: Decodable {
let kanbanTenant: String?
}
return (try? JSONDecoder().decode(Projection.self, from: data))?.kanbanTenant
}
}
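The projection trick used by `tenant(fromManifestData:)` can be tried on its own. A minimal sketch, assuming made-up manifest payloads (`TenantProjection`, `tenantSlug`, and the sample JSON are illustrative names and data, not part of the diff):

```swift
import Foundation

// Same shape as the private `Projection` in the diff: decode only the
// one key we need and ignore everything else in the manifest.
struct TenantProjection: Decodable {
    let kanbanTenant: String?
}

func tenantSlug(fromManifestData data: Data) -> String? {
    // A decode failure and a missing key both collapse to nil.
    (try? JSONDecoder().decode(TenantProjection.self, from: data))?.kanbanTenant
}

// Hypothetical manifest payloads, for illustration only.
let withTenant = Data(#"{"name":"demo","kanbanTenant":"acme","version":3}"#.utf8)
let withoutTenant = Data(#"{"name":"demo"}"#.utf8)
let malformed = Data("not json".utf8)

print(tenantSlug(fromManifestData: withTenant) ?? "nil")
print(tenantSlug(fromManifestData: withoutTenant) ?? "nil")
print(tenantSlug(fromManifestData: malformed) ?? "nil")
```

Since the synthesized `Decodable` conformance ignores unknown keys, the projection keeps working as the full manifest grows new fields.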
@@ -155,9 +155,20 @@ public struct ModelCatalogService: Sendable {
)
}
return byID.values.sorted { lhs, rhs in
// Subscription-gated first (Nous Portal).
if lhs.subscriptionGated != rhs.subscriptionGated {
return lhs.subscriptionGated
}
// Demoted last (Vercel AI Gateway, per Hermes v0.13). The
// axis is unconditional; we don't gate on the Hermes
// version because "Vercel mid-alphabet on v0.12, bottom on
// v0.13" would be more confusing than the consistent
// "Vercel last" treatment for everyone.
let lDemoted = Self.demotedProviders.contains(lhs.providerID)
let rDemoted = Self.demotedProviders.contains(rhs.providerID)
if lDemoted != rDemoted {
return !lDemoted
}
return lhs.providerName.localizedCaseInsensitiveCompare(rhs.providerName) == .orderedAscending
}
}
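The three-axis ordering in this hunk (subscription-gated first, demoted last, name in between) can be checked in isolation. A minimal sketch, assuming a stand-in row type and made-up sample rows (`ProviderRow`, `demoted`, and `sortedForPicker` are hypothetical names):

```swift
import Foundation

// Hypothetical stand-in for HermesProviderInfo; only the three fields
// the comparator reads.
struct ProviderRow {
    let providerID: String
    let providerName: String
    let subscriptionGated: Bool
}

let demoted: Set<String> = ["vercel"]

func sortedForPicker(_ rows: [ProviderRow]) -> [ProviderRow] {
    rows.sorted { lhs, rhs in
        // Axis 1: subscription-gated providers come first.
        if lhs.subscriptionGated != rhs.subscriptionGated { return lhs.subscriptionGated }
        // Axis 2: demoted providers sink to the tail.
        let lDemoted = demoted.contains(lhs.providerID)
        let rDemoted = demoted.contains(rhs.providerID)
        if lDemoted != rDemoted { return !lDemoted }
        // Axis 3: case-insensitive name order.
        return lhs.providerName.localizedCaseInsensitiveCompare(rhs.providerName) == .orderedAscending
    }
}

// Sample rows are illustrative, not taken from the real catalog.
let rows = [
    ProviderRow(providerID: "vercel", providerName: "Vercel AI Gateway", subscriptionGated: false),
    ProviderRow(providerID: "xai", providerName: "xAI", subscriptionGated: false),
    ProviderRow(providerID: "nous", providerName: "Nous Portal", subscriptionGated: true),
    ProviderRow(providerID: "anthropic", providerName: "Anthropic", subscriptionGated: false),
]
print(sortedForPicker(rows).map(\.providerID))
```

Each axis returns early only when the two sides differ on it, which is what keeps the comparator a consistent ordering for `sorted`.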
@@ -169,6 +180,23 @@ public struct ModelCatalogService: Sendable {
Self.overlayOnlyProviders[providerID]
}
/// Async wrapper around `loadProviders()` for use from MainActor view
/// code. The sync method does a transport-backed file read that on a
/// remote SSH context can take 1-2 minutes (ControlMaster setup +
/// pulling the multi-megabyte models.dev JSON), and on local contexts
/// still parses ~1500 models; both are unsuitable for the main thread.
/// Issue #59. Existing call sites (tests, any non-View consumers)
/// can keep using the sync method.
public nonisolated func loadProvidersAsync() async -> [HermesProviderInfo] {
await Task.detached { [self] in
let providers = ScarfMon.measure(.diskIO, "modelCatalog.loadProviders") {
self.loadProviders()
}
ScarfMon.event(.diskIO, "modelCatalog.providers.count", count: providers.count)
return providers
}.value
}
/// Models for one provider, sorted by release date (newest first), then name.
public func loadModels(for providerID: String) -> [HermesModelInfo] {
guard let catalog = loadCatalog(), let provider = catalog[providerID] else { return [] }
@@ -198,12 +226,30 @@ public struct ModelCatalogService: Sendable {
}
}
/// Async wrapper around `loadModels(for:)`. Same rationale as
/// `loadProvidersAsync()`: the View call site that fires on every
/// provider-switch click in the picker sheet was reading the catalog
/// synchronously on the MainActor, freezing the UI on remote contexts.
/// Issue #59.
public nonisolated func loadModelsAsync(for providerID: String) async -> [HermesModelInfo] {
await Task.detached { [self] in
let models = ScarfMon.measure(.diskIO, "modelCatalog.loadModels") {
self.loadModels(for: providerID)
}
ScarfMon.event(.diskIO, "modelCatalog.models.count", count: models.count)
return models
}.value
}
/// Find the provider that ships a given model ID. Useful for auto-syncing
/// provider when the user picks a model from a flat list or types one in.
public func provider(for modelID: String) -> HermesProviderInfo? {
guard let catalog = loadCatalog() else { return nil }
for (providerID, p) in catalog {
if p.models?[modelID] != nil {
// Resolve any model-rename alias for this provider before
// checking the catalog; see `modelAliases` for rationale.
let resolved = resolveModelAlias(providerID: providerID, modelID: modelID)
if p.models?[resolved] != nil {
return HermesProviderInfo(
providerID: providerID,
providerName: p.name ?? providerID,
@@ -267,14 +313,17 @@ public struct ModelCatalogService: Sendable {
/// Look up a specific model by provider + ID. Returns nil if not in the
/// catalog (e.g., free-typed custom model).
public func model(providerID: String, modelID: String) -> HermesModelInfo? {
// Resolve any model-rename alias for this provider before
// checking the catalog; see `modelAliases` for rationale.
let resolved = resolveModelAlias(providerID: providerID, modelID: modelID)
guard let catalog = loadCatalog(),
let provider = catalog[providerID],
let raw = provider.models?[modelID] else { return nil }
let raw = provider.models?[resolved] else { return nil }
return HermesModelInfo(
providerID: providerID,
providerName: provider.name ?? providerID,
modelID: modelID,
modelName: raw.name ?? modelID,
modelID: resolved,
modelName: raw.name ?? resolved,
contextWindow: raw.limit?.context,
maxOutput: raw.limit?.output,
costInput: raw.cost?.input,
@@ -311,47 +360,53 @@ public struct ModelCatalogService: Sendable {
/// Nous's catalog has no such model and Hermes later failed with
/// HTTP 404 at runtime. Catch that at save time, not 6 hours later.
public func validateModel(_ modelID: String, for providerID: String) -> ModelValidation {
let trimmed = modelID.trimmingCharacters(in: .whitespacesAndNewlines)
guard !trimmed.isEmpty else {
return .invalid(providerName: providerID, suggestions: [])
}
ScarfMon.measure(.diskIO, "modelCatalog.validateModel") {
let raw = modelID.trimmingCharacters(in: .whitespacesAndNewlines)
guard !raw.isEmpty else {
return .invalid(providerName: providerID, suggestions: [])
}
// Resolve any model-rename alias before lookup so configs
// referencing a deprecated ID (e.g. `x-ai/grok-4.20-beta`)
// validate against the canonical successor.
let trimmed = resolveModelAlias(providerID: providerID, modelID: raw)
// Overlay-only providers (Nous Portal, OpenAI Codex, Qwen
// OAuth, ...) serve their own catalogs that aren't mirrored to
// models.dev, so we don't have a reliable way to check model
// IDs locally. Treat any non-empty value as provisionally
// valid; the worst case is the runtime 404 we hit in pass-1,
// but the UI has the error banner now (M7 #2) to surface that
// cleanly.
//
// Exception: if an overlay-only provider DOES appear in the
// models.dev cache (unlikely but possible as catalogs evolve),
// we fall through to the real check below.
let models = loadModels(for: providerID)
if models.isEmpty {
if Self.overlayOnlyProviders[providerID] != nil {
// Overlay-only providers (Nous Portal, OpenAI Codex, Qwen
// OAuth, ...) serve their own catalogs that aren't mirrored to
// models.dev, so we don't have a reliable way to check model
// IDs locally. Treat any non-empty value as provisionally
// valid; the worst case is the runtime 404 we hit in pass-1,
// but the UI has the error banner now (M7 #2) to surface that
// cleanly.
//
// Exception: if an overlay-only provider DOES appear in the
// models.dev cache (unlikely but possible as catalogs evolve),
// we fall through to the real check below.
let models = loadModels(for: providerID)
if models.isEmpty {
if Self.overlayOnlyProviders[providerID] != nil {
return .valid
}
return .unknownProvider(providerID: providerID)
}
if models.contains(where: { $0.modelID == trimmed }) {
return .valid
}
return .unknownProvider(providerID: providerID)
}
if models.contains(where: { $0.modelID == trimmed }) {
return .valid
// No exact match: offer the closest names (by prefix) as
// suggestions. Up to 5, ordered by release date (newest
// first, already the sort order of loadModels).
let lowerTrimmed = trimmed.lowercased()
let byPrefix = models
.filter { $0.modelID.lowercased().hasPrefix(String(lowerTrimmed.prefix(3))) }
.prefix(5)
.map(\.modelID)
let suggestions = byPrefix.isEmpty
? Array(models.prefix(5).map(\.modelID))
: Array(byPrefix)
let providerName = providerByID(providerID)?.providerName ?? providerID
return .invalid(providerName: providerName, suggestions: suggestions)
}
// No exact match: offer the closest names (by prefix) as
// suggestions. Up to 5, ordered by release date (newest
// first, already the sort order of loadModels).
let lowerTrimmed = trimmed.lowercased()
let byPrefix = models
.filter { $0.modelID.lowercased().hasPrefix(String(lowerTrimmed.prefix(3))) }
.prefix(5)
.map(\.modelID)
let suggestions = byPrefix.isEmpty
? Array(models.prefix(5).map(\.modelID))
: Array(byPrefix)
let providerName = providerByID(providerID)?.providerName ?? providerID
return .invalid(providerName: providerName, suggestions: suggestions)
}
// MARK: - Decoding
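The suggestion fallback in `validateModel` matches on the first three characters of the typed ID and falls back to the head of the list when nothing matches. A minimal sketch of that rule, assuming a plain array of model IDs already in display order (`suggestions(for:in:limit:)` and the sample IDs are hypothetical):

```swift
/// Up to `limit` IDs sharing the typed value's first three characters
/// (case-insensitive); otherwise the first `limit` IDs as a fallback.
func suggestions(for typed: String, in modelIDs: [String], limit: Int = 5) -> [String] {
    let key = String(typed.lowercased().prefix(3))
    let byPrefix = modelIDs
        .filter { $0.lowercased().hasPrefix(key) }
        .prefix(limit)
    return byPrefix.isEmpty ? Array(modelIDs.prefix(limit)) : Array(byPrefix)
}

// Illustrative IDs only, not real catalog entries.
let sampleIDs = ["gpt-new", "gpt-old", "claude-a"]
print(suggestions(for: "GPT-typo", in: sampleIDs))
print(suggestions(for: "zzz", in: sampleIDs))
```

A three-character prefix is a coarse match: it keeps the check cheap, but it will not catch a typo in the first three characters, which is where the fallback to the newest models takes over.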
@@ -399,17 +454,91 @@ public struct ModelCatalogService: Sendable {
let output: Int?
}
// MARK: - Model aliases (model rename resolution)
/// Hermes deprecates model IDs across releases. When a stored config
/// `model.default` references a deprecated ID, resolve to its
/// canonical successor. Lossless: we never rewrite the user's
/// `config.yaml`; the alias just lets `validateModel` /
/// `model(providerID:modelID:)` / `provider(for:)` succeed against
/// the new ID.
///
/// Keys are slash-joined `providerID/modelID` to disambiguate
/// across providers: even if `vercel` later adds a `grok-4.20-beta`
/// alias on its own, the openrouter resolution shouldn't fire.
/// Values are the bare resolved model ID (no provider prefix).
///
/// **Schema is Swift-primary.** Mirror new entries into Hermes's
/// upstream deprecation map in `hermes_cli/providers.py` if/when
/// upstream tracks renames in code (today they're release-notes
/// only).
public static let modelAliases: [String: String] = [
// v0.13: x-ai dropped the `-beta` suffix once Grok 4.20 GA'd.
// The model is the same one served at the same OpenRouter slot;
// only the marketing identifier changed.
// TODO(WS-6-Q4): verify whether OpenRouter retired the
// `x-ai/grok-4.20-beta` slot entirely. Either way the alias is
// correct (cosmetic if old slot stays live, load-bearing if it
// 404s).
"openrouter/x-ai/grok-4.20-beta": "x-ai/grok-4.20",
"xai/grok-4.20-beta": "grok-4.20",
"vercel/xai/grok-4.20-beta": "xai/grok-4.20",
]
/// Resolve a stored model identifier through the alias map. Returns
/// the input unchanged when no alias exists. Pure function, used at
/// read time everywhere a config'd model ID is rendered, validated,
/// or sent to Hermes.
public func resolveModelAlias(providerID: String, modelID: String) -> String {
let composite = "\(providerID)/\(modelID)"
return Self.modelAliases[composite] ?? modelID
}
// MARK: - Demoted providers (sort tail)
/// Provider IDs that Hermes v0.13 explicitly deprioritizes in the
/// picker. `loadProviders()` sorts these to the tail of the list,
/// after the alphabetical group, so users who haven't manually
/// chosen Vercel as their gateway don't end up there by default.
/// Mirrors Hermes's deprioritized-provider list in
/// `hermes-agent/hermes_cli/providers.py`.
public static let demotedProviders: Set<String> = [
"vercel",
]
// MARK: - Image-generation model allowlist (curated)
/// Known image-generation models, used to pre-populate the
/// `image_gen.model` picker on the Auxiliary tab. The list is
/// curated: `models_dev_cache.json` doesn't tag image-capable
/// models, so we maintain this by hand on Hermes version bumps.
/// Always free-form-typeable on the picker too, so missing entries
/// don't block users with non-listed image providers.
///
/// Order: most-likely-to-be-chosen first.
public static let imageGenModels: [HermesImageGenModel] = [
.init(modelID: "openai/gpt-image-1", display: "OpenAI · gpt-image-1", providerHint: "openai"),
.init(modelID: "google/imagen-4", display: "Google · Imagen 4", providerHint: "google-vertex"),
.init(modelID: "google/imagen-3", display: "Google · Imagen 3", providerHint: "google-vertex"),
.init(modelID: "stability/stable-image-ultra", display: "Stability · Stable Image Ultra", providerHint: "stability"),
.init(modelID: "fal-ai/flux-pro-1.1", display: "fal · FLUX 1.1 Pro", providerHint: "fal"),
.init(modelID: "black-forest-labs/flux-1.1-pro", display: "Black Forest Labs · FLUX 1.1 Pro", providerHint: "openrouter"),
.init(modelID: "openai/dall-e-3", display: "OpenAI · DALL·E 3", providerHint: "openai"),
]
// MARK: - Hermes overlay providers
/// The six providers Hermes surfaces via `hermes model` that have no
/// The 11 providers Hermes surfaces via `hermes model` that have no
/// entry in `models_dev_cache.json` (models.dev doesn't mirror them).
/// Mirrors the overlay-only subset of `HERMES_OVERLAYS` in
/// `hermes-agent/hermes_cli/providers.py`. The other ~19 overlay entries
/// `hermes-agent/hermes_cli/providers.py`. The other overlay entries
/// already ship in the cache and only add augmentation (base-URL
/// override, extra env vars) that Scarf doesn't currently display.
///
/// Keep this in sync with the Python side on Hermes version bumps.
static let overlayOnlyProviders: [String: HermesProviderOverlay] = [
/// Keep this in sync with the Python side on Hermes version bumps;
/// see `ToolGatewayTests.v012OverlayProvidersCarryCorrectAuthTypes`
/// for the auth-type lock-in.
public static let overlayOnlyProviders: [String: HermesProviderOverlay] = [
"nous": HermesProviderOverlay(
displayName: "Nous Portal",
baseURL: "https://inference-api.nousresearch.com/v1",
@@ -452,9 +581,77 @@ public struct ModelCatalogService: Sendable {
subscriptionGated: false,
docURL: nil
),
// -- v0.12 additions ---------------------------------------------
// Hermes v2026.4.30 added five overlay-only providers that
// models.dev doesn't mirror. Provider IDs match HERMES_OVERLAYS
// verbatim; drift here means the picker can't reach them.
"gmi": HermesProviderOverlay(
displayName: "GMI Cloud",
baseURL: "https://api.gmi-serving.com/v1",
authType: .apiKey,
subscriptionGated: false,
docURL: nil
),
"azure-foundry": HermesProviderOverlay(
displayName: "Azure AI Foundry",
// Base URL is per-tenant; Hermes resolves it from the
// AZURE_FOUNDRY_BASE_URL env var at runtime. Leave nil so the
// settings UI shows "Tenant URL set via env" instead of a
// misleading default.
baseURL: nil,
authType: .apiKey,
subscriptionGated: false,
docURL: nil
),
"lmstudio": HermesProviderOverlay(
displayName: "LM Studio",
// v0.12 promotes LM Studio from custom-endpoint alias to a
// first-class provider. 1234 is the LM Studio default port;
// users with a non-default port set LM_BASE_URL.
baseURL: "http://127.0.0.1:1234/v1",
authType: .apiKey,
subscriptionGated: false,
docURL: nil
),
"minimax-oauth": HermesProviderOverlay(
displayName: "MiniMax (OAuth)",
baseURL: "https://api.minimax.io/anthropic",
authType: .oauthExternal,
subscriptionGated: false,
docURL: nil
),
"tencent-tokenhub": HermesProviderOverlay(
displayName: "Tencent TokenHub",
// Resolved from TOKENHUB_BASE_URL at runtime.
baseURL: nil,
authType: .apiKey,
subscriptionGated: false,
docURL: nil
),
]
}
/// Curated entry for the `image_gen.model` picker on the Auxiliary
/// tab. Hermes v0.13 honors a top-level `image_gen.model` key but the
/// models.dev catalog has no `image: true` tag, so we maintain a
/// short hand-curated allowlist keyed by display order. The picker
/// always allows free-form-typing too, so any provider's model ID
/// works regardless of whether it appears here.
public struct HermesImageGenModel: Sendable, Identifiable, Hashable {
public let modelID: String
public let display: String
/// Hint at which provider serves this model; surfaced as a
/// "Configure provider X first" advisory but never enforced.
public let providerHint: String?
public var id: String { modelID }
public init(modelID: String, display: String, providerHint: String?) {
self.modelID = modelID
self.display = display
self.providerHint = providerHint
}
}
/// Scarf-side mirror of `HermesOverlay` from hermes-agent's
/// `hermes_cli/providers.py`. Describes a provider that isn't in the
/// models.dev catalog.
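The composite-key lookup behind `resolveModelAlias` in this file is small enough to try standalone. A minimal sketch using a subset of the alias table from the diff (`aliases` and `resolveAlias` are illustrative names; the real function is an instance method on `ModelCatalogService`):

```swift
// Subset of the alias table from the diff, keyed by "providerID/modelID".
let aliases: [String: String] = [
    "openrouter/x-ai/grok-4.20-beta": "x-ai/grok-4.20",
    "xai/grok-4.20-beta": "grok-4.20",
]

/// Returns the canonical model ID for this provider, or the input
/// unchanged when the provider/model pair has no alias.
func resolveAlias(providerID: String, modelID: String) -> String {
    aliases["\(providerID)/\(modelID)"] ?? modelID
}

print(resolveAlias(providerID: "openrouter", modelID: "x-ai/grok-4.20-beta"))
print(resolveAlias(providerID: "xai", modelID: "grok-4.20-beta"))
print(resolveAlias(providerID: "vercel", modelID: "grok-4.20-beta"))
```

Keying on the provider as well as the model is what keeps the third call unchanged: the same deprecated suffix under a provider without an entry is passed through as-is.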
@@ -0,0 +1,97 @@
import Foundation
/// Pre-flight check used before opening an ACP session. Hermes resolves the
/// model+provider from `config.yaml` at session boot; on a fresh install that
/// file is missing or has neither key set, and the chat fails with an opaque
/// "Model parameter is required" 400 from the upstream provider only after the
/// user has typed a prompt and hit send. Catching the missing config here lets
/// the UI surface a real "pick a model" sheet before any ACP work starts.
///
/// `HermesConfig.empty` (returned on read failure) and the YAML parser's
/// missing-key fallback both use the literal string `"unknown"`, so the check
/// has to treat `""` and `"unknown"` as equivalent. Anything else is
/// considered configured; we don't try to validate the model against the
/// provider's catalog here; that happens later in `ModelPickerSheet`.
public enum ModelPreflight: Sendable {
public enum Result: Equatable, Sendable {
case configured
case missingModel
case missingProvider
case missingBoth
public var isConfigured: Bool {
self == .configured
}
/// Short user-facing reason. Long enough to be honest, short enough
/// for a sheet header; full messaging belongs to the picker UI.
public var reason: String {
switch self {
case .configured: return ""
case .missingModel: return "No primary model is set in this server's config."
case .missingProvider:return "No primary provider is set in this server's config."
case .missingBoth: return "No model is configured on this server yet."
}
}
}
/// Treat `""` and the YAML parser's `"unknown"` fallback as missing.
/// Trim whitespace so a stray newline in a hand-edited config.yaml
/// doesn't read as "configured."
public static func check(_ config: HermesConfig) -> Result {
let modelMissing = isUnset(config.model)
let providerMissing = isUnset(config.provider)
switch (modelMissing, providerMissing) {
case (true, true): return .missingBoth
case (true, false): return .missingModel
case (false, true): return .missingProvider
case (false, false): return .configured
}
}
private static func isUnset(_ value: String) -> Bool {
let trimmed = value.trimmingCharacters(in: .whitespacesAndNewlines).lowercased()
return trimmed.isEmpty || trimmed == "unknown"
}
/// Result of a `model.default` / `model.provider` mismatch check.
/// Captures the case where `model.default` carries a `<provider>/...`
/// prefix that doesn't match the standalone `model.provider` key,
/// observed in 2026-05-05 dogfooding when switching OAuth providers
/// via Credential Pools left the prior provider's model name
/// stranded in `model.default`. Hermes can't reconcile the two and
/// chats die with an opaque `-32603 Internal error` at first prompt.
public struct Mismatch: Sendable, Equatable {
/// The provider prefix found in `model.default` (e.g. `"anthropic"`).
public let prefixProvider: String
/// The standalone `model.provider` value (e.g. `"nous"`).
public let activeProvider: String
/// The full `model.default` string as configured.
public let modelDefault: String
/// The bare model id (with the prefix stripped): what the user
/// would see if Scarf rewrites `model.default` for them.
public let bareModel: String
}
/// Detect a `model.default` / `model.provider` mismatch. Returns
/// `nil` when there's no provider prefix on `model.default`, when
/// either field is unset, or when the prefix matches the provider.
/// Uses case-insensitive comparison, because Hermes accepts both
/// `Anthropic/...` and `anthropic/...` casings in the wild.
public static func detectMismatch(_ config: HermesConfig) -> Mismatch? {
let modelDefault = config.model.trimmingCharacters(in: .whitespacesAndNewlines)
let activeProvider = config.provider.trimmingCharacters(in: .whitespacesAndNewlines)
guard !isUnset(modelDefault), !isUnset(activeProvider) else { return nil }
guard let slash = modelDefault.firstIndex(of: "/") else { return nil }
let prefix = String(modelDefault[..<slash])
let bare = String(modelDefault[modelDefault.index(after: slash)...])
guard !prefix.isEmpty, !bare.isEmpty else { return nil }
guard prefix.caseInsensitiveCompare(activeProvider) != .orderedSame else { return nil }
return Mismatch(
prefixProvider: prefix,
activeProvider: activeProvider,
modelDefault: modelDefault,
bareModel: bare
)
}
}
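The prefix check in `detectMismatch` can be tried in isolation with a minimal sketch like the one below. The helper names `splitProviderPrefix` and `isMismatch` and the model ids in the usage note are hypothetical and not part of Scarf; the sketch also skips the `isUnset` guard, so it assumes both values are already known to be set.

```swift
import Foundation

/// Hypothetical helper (not part of Scarf): split "provider/model" at the
/// first slash. Returns nil when there is no slash or either side is empty.
func splitProviderPrefix(_ modelDefault: String) -> (prefix: String, bare: String)? {
    let trimmed = modelDefault.trimmingCharacters(in: .whitespacesAndNewlines)
    guard let slash = trimmed.firstIndex(of: "/") else { return nil }
    let prefix = String(trimmed[..<slash])
    let bare = String(trimmed[trimmed.index(after: slash)...])
    guard !prefix.isEmpty, !bare.isEmpty else { return nil }
    return (prefix, bare)
}

/// Hypothetical helper: true only when a prefix exists and differs from the
/// active provider, compared case-insensitively.
func isMismatch(modelDefault: String, provider: String) -> Bool {
    guard let parts = splitProviderPrefix(modelDefault) else { return false }
    return parts.prefix.caseInsensitiveCompare(provider) != .orderedSame
}
```

With made-up ids, `isMismatch(modelDefault: "anthropic/some-model", provider: "nous")` reports a mismatch, while `"Anthropic/some-model"` against `"anthropic"` does not, and a bare id without a slash never does.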
@@ -0,0 +1,313 @@
import Foundation
import os
/// One Nous Portal model as exposed by `GET /v1/models`. The shape
/// mirrors the OpenAI-compatible response schema; Nous's inference
/// API uses the same envelope. Optional fields stay optional because
/// not every entry includes them; `id` is the only field we strictly
/// need (it's what Hermes passes through to the provider).
public struct NousModel: Codable, Equatable, Sendable, Identifiable {
public let id: String
public let owned_by: String?
public let created: Int?
/// Free-text description if the API ships one. Nous's current
/// catalog doesn't include this, but the field is here so future
/// shape changes don't drop user-visible context on the floor.
public let description: String?
public init(id: String, owned_by: String? = nil, created: Int? = nil, description: String? = nil) {
self.id = id
self.owned_by = owned_by
self.created = created
self.description = description
}
}
/// On-disk cache shape. Versioned so a future schema change can lift
/// stale caches gracefully: bump `version` and the loader rejects
/// anything older without trying to migrate. Stored as JSON next to
/// the projects registry so a Hermes wipe takes it with the rest of
/// the Scarf-owned state.
public struct NousModelsCache: Codable, Sendable {
public static let currentVersion = 1
public let version: Int
public let fetchedAt: Date
public let models: [NousModel]
public init(version: Int = NousModelsCache.currentVersion, fetchedAt: Date, models: [NousModel]) {
self.version = version
self.fetchedAt = fetchedAt
self.models = models
}
}
/// Result of a `loadModels` call. Distinguishes "fetched fresh from
/// the API" from "cache served, network failed" so the picker UI can
/// surface a "could not refresh" hint without hiding the cached list.
public enum NousModelsLoadResult: Sendable {
case fresh(models: [NousModel], fetchedAt: Date)
case cache(models: [NousModel], fetchedAt: Date, refreshError: String?)
case fallback(models: [NousModel], reason: String)
}
/// Fetches + caches the list of available Nous Portal models. Runs in
/// the Scarf process (not on the remote), authenticated with the
/// bearer token from `~/.hermes/auth.json` on the active server.
/// `NousSubscriptionService` reads that file via the active transport,
/// so a remote droplet's token comes back over SSH and the network
/// call to Nous still happens from the user's Mac. That's correct:
/// we want the model list visible whenever the user has subscription
/// credentials, regardless of where Hermes will eventually run the
/// chat from.
public struct NousModelCatalogService: Sendable {
public static let baseURL = URL(string: "https://inference-api.nousresearch.com/v1/models")!
public static let cacheTTL: TimeInterval = 24 * 60 * 60 // 24h
public static let requestTimeout: TimeInterval = 10 // seconds
/// Hard-coded fallback for offline-with-no-cache. Short on purpose:
/// only the canonical Hermes models (the family the user is most
/// likely to want) plus a reminder that fresh data is one
/// successful refresh away. Update when Nous releases a new
/// flagship; deliberately not exhaustive. The API is the source
/// of truth, this just keeps the picker non-empty.
public static let fallbackModels: [NousModel] = [
NousModel(id: "Hermes-3-Llama-3.1-405B"),
NousModel(id: "Hermes-3-Llama-3.1-70B"),
NousModel(id: "Hermes-3-Llama-3.1-8B"),
NousModel(id: "DeepHermes-3-Llama-3-8B-Preview")
]
private static let logger = Logger(subsystem: "com.scarf", category: "NousModelCatalogService")
public let context: ServerContext
private let session: URLSession
private let cachePath: String
public init(context: ServerContext, session: URLSession = .shared) {
self.context = context
self.session = session
self.cachePath = context.paths.nousModelsCache
}
// MARK: - Cache I/O
/// Read the cache via the active transport (so a remote droplet's
/// cache lands on the droplet, not the user's Mac). A missing or
/// malformed cache yields nil; the loader treats that as "no cache"
/// and kicks off a fresh fetch.
/// Race readCache against a sleep so a hung remote `cat` doesn't
/// stall the picker for the full transport-level timeout (60 s).
/// On timeout returns nil; the caller treats that as "no usable
/// cache" and falls through to the network fetch.
public func readCacheWithTimeout(seconds: TimeInterval) async -> NousModelsCache? {
await withTaskGroup(of: NousModelsCache?.self) { group in
group.addTask { [self] in
// Detached because readCache is sync + does blocking
// SSH I/O; running on the cooperative pool is fine
// for one task but we don't want to fight executor
// scheduling with the timer task below.
await Task.detached { [self] in
readCache()
}.value
}
group.addTask {
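do {
try await Task.sleep(nanoseconds: UInt64(seconds * 1_000_000_000))
} catch {
// Sleep was cancelled because the read finished first;
// that is not a timeout, so don't record one.
return nil
}
ScarfMon.event(.diskIO, "nous.readCache.timeoutFired", count: 1)
return nil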
}
// First completion wins; cancel the other.
let first = await group.next() ?? nil
group.cancelAll()
return first
}
}
public func readCache() -> NousModelsCache? {
ScarfMon.measure(.diskIO, "nous.readCache") {
let transport = context.makeTransport()
// Split into separate measure points so the next perf
// capture localizes the observed 60-second beach ball:
// was it the fileExists probe, the read itself, or
// the JSON decode? Each on its own ScarfMon row.
let exists = ScarfMon.measure(.diskIO, "nous.readCache.fileExists") {
transport.fileExists(cachePath)
}
guard exists else { return nil }
do {
let data = try ScarfMon.measure(.diskIO, "nous.readCache.readFile") {
try transport.readFile(cachePath)
}
ScarfMon.event(.diskIO, "nous.readCache.bytes", count: 1, bytes: data.count)
return ScarfMon.measure(.diskIO, "nous.readCache.decode") {
let decoder = JSONDecoder()
decoder.dateDecodingStrategy = .iso8601
do {
let cache = try decoder.decode(NousModelsCache.self, from: data)
guard cache.version == NousModelsCache.currentVersion else {
Self.logger.info("nous models cache schema mismatch (got v\(cache.version), expected v\(NousModelsCache.currentVersion)); ignoring")
return Optional<NousModelsCache>.none
}
return cache
} catch {
Self.logger.warning("couldn't decode nous models cache: \(error.localizedDescription, privacy: .public)")
return Optional<NousModelsCache>.none
}
}
} catch {
Self.logger.warning("couldn't read nous models cache: \(error.localizedDescription, privacy: .public)")
return nil
}
}
}
private func writeCache(_ cache: NousModelsCache) {
let transport = context.makeTransport()
do {
let encoder = JSONEncoder()
encoder.dateEncodingStrategy = .iso8601
encoder.outputFormatting = [.prettyPrinted, .sortedKeys]
let data = try encoder.encode(cache)
// Make sure the parent dir exists; fresh remote installs
// may not yet have `~/.hermes/scarf/`. mkdir -p is cheap
// and idempotent on both transports.
let parent = (cachePath as NSString).deletingLastPathComponent
if !parent.isEmpty {
try? transport.createDirectory(parent)
}
try transport.writeFile(cachePath, data: data)
} catch {
Self.logger.warning("couldn't write nous models cache: \(error.localizedDescription, privacy: .public)")
}
}
public func isCacheStale(_ cache: NousModelsCache) -> Bool {
Date().timeIntervalSince(cache.fetchedAt) > Self.cacheTTL
}
// MARK: - Network fetch
/// Read the bearer token from `auth.json` on the active server.
/// Returns nil when the user isn't signed in to Nous, in which
/// case `loadModels` skips the network call and falls through to
/// cache or fallback.
private func bearerToken() -> String? {
// The subscription service already checks for `present`; we
// re-read the raw token here because we need the actual string,
// not just a Bool. Mirrors the SubscriptionService parse path.
// ScarfMon: separate `nous.bearerToken` measure point because
// this is the second auth.json read of the picker's open
// sequence (subscriptionService.loadState() did the first).
// Together with `nous.subscription.loadState`, that totals two
// SSH round-trips for the same file, a candidate for caching.
ScarfMon.measure(.diskIO, "nous.bearerToken") {
let transport = context.makeTransport()
guard transport.fileExists(context.paths.authJSON) else { return nil }
guard let data = try? transport.readFile(context.paths.authJSON) else { return nil }
guard let root = try? JSONSerialization.jsonObject(with: data) as? [String: Any] else { return nil }
let providers = root["providers"] as? [String: Any] ?? [:]
let nous = providers["nous"] as? [String: Any]
let token = nous?["access_token"] as? String
guard let token, !token.isEmpty else { return nil }
return token
}
}
/// Make the API call. Times out after `requestTimeout` so a hung
/// network doesn't block the picker indefinitely. Returns the raw
/// `[NousModel]` on success, throws on any HTTP / decode error so
/// the caller can log + fall back.
public func fetchModels() async throws -> [NousModel] {
try await ScarfMon.measureAsync(.transport, "nous.fetchModels") {
guard let token = bearerToken() else {
throw NousModelCatalogError.notAuthenticated
}
var request = URLRequest(url: Self.baseURL)
request.httpMethod = "GET"
request.timeoutInterval = Self.requestTimeout
request.setValue("Bearer \(token)", forHTTPHeaderField: "Authorization")
request.setValue("application/json", forHTTPHeaderField: "Accept")
let (data, response) = try await session.data(for: request)
guard let http = response as? HTTPURLResponse else {
throw NousModelCatalogError.transport("non-HTTP response")
}
guard (200..<300).contains(http.statusCode) else {
throw NousModelCatalogError.http(status: http.statusCode)
}
struct Envelope: Decodable { let data: [NousModel] }
let envelope = try JSONDecoder().decode(Envelope.self, from: data)
ScarfMon.event(.transport, "nous.fetchModels.bytes", count: envelope.data.count, bytes: data.count)
return envelope.data
}
}
// MARK: - Public entry
/// Top-level "give me models" entry point. Cache-first: serve from
/// cache if fresh, fetch + write through if stale or empty, fall
/// back to the hard-coded list when both fail. The caller renders
/// based on the case so it can show a "could not refresh" hint
/// next to a stale-but-still-useful list.
public func loadModels(forceRefresh: Bool = false) async -> NousModelsLoadResult {
// Cache-read with a short timeout. The underlying SSH `cat`
// can hang on a corrupted or oversized cache file (a
// 120-second picker stall observed in the wild: two 60 s
// timeouts stacked from a duplicated read; perf capture
// localized to `nous.readCache.readFile`). Cache is a
// performance hint, not a correctness requirement; if it
// doesn't return in 5 s, fall through to the network fetch
// and let writeCache rebuild it. The runaway `cat` keeps
// running on its own 60 s transport timeout but no longer
// blocks the picker.
let cached = await readCacheWithTimeout(seconds: 5)
if let cached, !forceRefresh, !isCacheStale(cached) {
return .cache(models: cached.models, fetchedAt: cached.fetchedAt, refreshError: nil)
}
do {
let models = try await fetchModels()
let now = Date()
writeCache(NousModelsCache(fetchedAt: now, models: models))
return .fresh(models: models, fetchedAt: now)
} catch let error as NousModelCatalogError {
// Fetch failed but we may still have *something* useful.
if let cached {
return .cache(
models: cached.models,
fetchedAt: cached.fetchedAt,
refreshError: error.userMessage
)
}
return .fallback(models: Self.fallbackModels, reason: error.userMessage)
} catch {
if let cached {
return .cache(
models: cached.models,
fetchedAt: cached.fetchedAt,
refreshError: error.localizedDescription
)
}
return .fallback(models: Self.fallbackModels, reason: error.localizedDescription)
}
}
}
public enum NousModelCatalogError: Error, Sendable {
case notAuthenticated
case http(status: Int)
case transport(String)
public var userMessage: String {
switch self {
case .notAuthenticated:
return "Sign in to Nous Portal to fetch the latest model list."
case .http(let status) where status == 401:
return "Nous rejected the saved token (401). Sign in again."
case .http(let status):
return "Nous returned HTTP \(status)."
case .transport(let detail):
return "Couldn't reach Nous: \(detail)."
}
}
}
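The cache-first ordering in `loadModels` (fresh cache, then network, then stale cache with a refresh error, then the hard-coded fallback) can be sketched as a pure function. This is an illustration under assumptions, not the service's code: `LoadOutcome`, `CachedList`, and `decide` are hypothetical names, models are reduced to plain strings, and the fetch is a synchronous closure so the decision logic can be exercised without a network.

```swift
import Foundation

/// Hypothetical stand-in for NousModelsLoadResult, with models as strings.
enum LoadOutcome: Equatable {
    case fresh([String])
    case cache([String], refreshError: String?)
    case fallback([String], reason: String)
}

/// Hypothetical stand-in for NousModelsCache.
struct CachedList {
    let fetchedAt: Date
    let models: [String]
}

/// Cache-first decision: serve a fresh cache, otherwise fetch; if the fetch
/// fails, serve the stale cache with an error hint, else the fallback list.
func decide(
    cached: CachedList?,
    now: Date,
    ttl: TimeInterval,
    forceRefresh: Bool,
    fetch: () throws -> [String],
    fallback: [String]
) -> LoadOutcome {
    if let cached, !forceRefresh, now.timeIntervalSince(cached.fetchedAt) <= ttl {
        return .cache(cached.models, refreshError: nil)
    }
    do {
        return .fresh(try fetch())
    } catch {
        if let cached {
            // Stale but still useful: keep the list, surface the failure.
            return .cache(cached.models, refreshError: "\(error)")
        }
        return .fallback(fallback, reason: "\(error)")
    }
}
```

Keeping the decision separate from I/O is a design choice of this sketch only; it makes the three outcomes easy to check against a fixed clock.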
@@ -15,14 +15,18 @@ public struct ProjectDashboardService: Sendable {
// MARK: - Registry
public func loadRegistry() -> ProjectRegistry {
guard let data = try? transport.readFile(context.paths.projectsRegistry) else {
return ProjectRegistry(projects: [])
}
do {
return try JSONDecoder().decode(ProjectRegistry.self, from: data)
} catch {
Self.logger.error("Failed to decode project registry: \(error.localizedDescription, privacy: .public)")
return ProjectRegistry(projects: [])
// Tracks time spent reading + decoding projects.json from the transport
// (local file or SSH). Helps spot slow remote round-trips.
ScarfMon.measure(.diskIO, "dashboard.loadRegistry") {
guard let data = try? transport.readFile(context.paths.projectsRegistry) else {
return ProjectRegistry(projects: [])
}
do {
return try JSONDecoder().decode(ProjectRegistry.self, from: data)
} catch {
Self.logger.error("Failed to decode project registry: \(error.localizedDescription, privacy: .public)")
return ProjectRegistry(projects: [])
}
}
}
@@ -0,0 +1,155 @@
import Foundation
#if canImport(os)
import os
#endif
/// Detects when a registered project directory contains its own `.hermes/`
/// subdirectory. Hermes' CLI uses the closest `.hermes/` as `$HERMES_HOME`
/// when invoked from inside such a directory, which **shadows** the user's
/// global Hermes home: credentials, config, sessions, skills, memories
/// all bind to the project-local copy without warning.
///
/// This causes confusing failure modes: the user runs `hermes auth add nous`
/// during setup expecting a global registration, but if their cwd happens to
/// be inside a project that already has a `.hermes/` (e.g. seeded by a
/// previous workflow, copied from another machine, or checked into git),
/// Hermes writes the credentials to the project-local `.hermes/auth.json`.
/// Scarf then reads the global path on every dashboard tick and shows
/// "missing provider" warnings even though the user did sign in successfully.
///
/// The detector enumerates the registered projects on a given server and
/// reports which ones carry a shadowing `.hermes/`. Views surface a yellow
/// banner so the user can consolidate.
public struct ProjectHermesShadowDetector: Sendable {
public struct Shadow: Sendable, Hashable, Identifiable {
public var id: String { projectPath }
/// Project name from the registry (`ProjectEntry.name`).
public let projectName: String
/// Absolute path to the project on the target server.
public let projectPath: String
/// Absolute path to the shadowing `.hermes/` directory.
public let shadowPath: String
/// `true` when the shadow `.hermes/auth.json` exists. Strong signal
/// that user credentials are landing in the wrong place.
public let hasAuthJSON: Bool
/// `true` when the shadow `.hermes/state.db` exists. Hermes wrote
/// session state to the project-local home, so the user's chat
/// history is invisible to Scarf's global Dashboard for this slice.
public let hasStateDB: Bool
public init(
projectName: String,
projectPath: String,
shadowPath: String,
hasAuthJSON: Bool,
hasStateDB: Bool
) {
self.projectName = projectName
self.projectPath = projectPath
self.shadowPath = shadowPath
self.hasAuthJSON = hasAuthJSON
self.hasStateDB = hasStateDB
}
}
#if canImport(os)
private static let logger = Logger(subsystem: "com.scarf", category: "ProjectHermesShadowDetector")
#endif
private let context: ServerContext
private let transport: any ServerTransport
public init(context: ServerContext) {
self.context = context
self.transport = context.makeTransport()
}
/// Probe every project in `projects` for a shadowing `.hermes/`. Skips
/// archived projects and projects whose absolute path is the resolved
/// Hermes home or sits below it (rare but possible: a project
/// literally rooted at `~/.hermes` shouldn't trigger a self-warning).
public func detect(in projects: [ProjectEntry]) async -> [Shadow] {
let hermesHome = await context.resolvedUserHome() + "/.hermes"
var found: [Shadow] = []
for project in projects where !project.archived {
// A project nested inside the Hermes home itself is a weird
// edge case (someone made `~/.hermes/notes` a Scarf project).
// The project is BELOW the Hermes home, so a `.hermes` inside
// it is almost certainly not present and definitely not a
// shadow. Match on a path boundary so a sibling such as
// `~/.hermes-projects/foo` is not skipped by accident.
if project.path == hermesHome || project.path.hasPrefix(hermesHome + "/") { continue }
let shadowPath = project.path + "/.hermes"
guard transport.fileExists(shadowPath) else { continue }
// It's only a shadow if the path is a directory; a stray
// `.hermes` file would be filtered out here.
guard transport.stat(shadowPath)?.isDirectory == true else { continue }
let hasAuth = transport.fileExists(shadowPath + "/auth.json")
let hasDB = transport.fileExists(shadowPath + "/state.db")
#if canImport(os)
Self.logger.warning(
"Detected shadow Hermes home at \(shadowPath, privacy: .public) (auth: \(hasAuth), state.db: \(hasDB))"
)
#endif
found.append(Shadow(
projectName: project.name,
projectPath: project.path,
shadowPath: shadowPath,
hasAuthJSON: hasAuth,
hasStateDB: hasDB
))
}
return found
}
/// Suggested shell one-liner that consolidates a project shadow into
/// the global Hermes home AND clears the warning on the next
/// refresh. Two ordered steps:
///
/// 1. Copy `auth.json` into the global home (only when present).
/// Hermes credentials live in this single file; preserving them
/// is the load-bearing part of "consolidate". Every other
/// project-local file is either replaceable or scoped to the
/// project anyway.
/// 2. Rename the project-local `.hermes/` to
/// `.hermes.scarf-bak.<UTC-stamp>/`. Hermes' CLI stops seeing it
/// as `$HERMES_HOME` (it scans for a dir literally named
/// `.hermes`), so the global home wins from now on. The
/// user's project-local data (`state.db`, `sessions/`,
/// `skills/`) survives untouched in the renamed folder, so
/// they can inspect/recover/delete it later without us making
/// that decision for them.
///
/// **Why not delete instead of rename.** A project's shadow can
/// hold uncommitted session history the user hasn't audited yet.
/// `rm -rf` would be unrecoverable; the rename keeps everything
/// addressable while still removing the shadow effect. The user
/// can delete the `.bak` once they're confident.
///
/// Returns a single shell line, suitable for the user to paste
/// into a remote terminal. The rename uses `date -u +%Y%m%d-%H%M%S`
/// for a UTC suffix with one-second resolution, so consecutive
/// consolidations don't collide unless they run within the same second.
public static func consolidationCommand(for shadow: Shadow, hermesHome: String) -> String? {
var parts: [String] = []
if shadow.hasAuthJSON {
parts.append("mkdir -p \(shellQuote(hermesHome))")
parts.append("cp \(shellQuote(shadow.shadowPath + "/auth.json")) \(shellQuote(hermesHome + "/auth.json"))")
parts.append("chmod 600 \(shellQuote(hermesHome + "/auth.json"))")
}
// The rename is unconditional: even shadows without auth.json
// still bind as $HERMES_HOME and need to move out of the way.
// `$(date -u +%Y%m%d-%H%M%S)` runs on the remote shell when
// the user pastes the command, producing the timestamp at
// exec time rather than at command-construction time.
parts.append("mv \(shellQuote(shadow.shadowPath)) \(shellQuote(shadow.shadowPath))\".scarf-bak.$(date -u +%Y%m%d-%H%M%S)\"")
return parts.joined(separator: " && ")
}
/// Single-quote a path for embedding in a `bash -c ''` string.
/// POSIX-safe single quotes with escape for embedded quotes
/// (`'` becomes `'\\''`). Matches the convention in
/// `RemoteBackupService.shellQuote`.
private static func shellQuote(_ s: String) -> String {
"'" + s.replacingOccurrences(of: "'", with: "'\\''") + "'"
}
}
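The quoting and rename steps of `consolidationCommand` can be exercised on their own. The sketch below copies the `shellQuote` body shown above into a free function and adds a hypothetical `renameCommand` helper that builds only the `mv` part; the paths in the usage note are made up.

```swift
import Foundation

/// Same technique as the private shellQuote above: wrap in single quotes
/// and replace each embedded ' with '\'' (close, escaped quote, reopen).
func shellQuote(_ s: String) -> String {
    "'" + s.replacingOccurrences(of: "'", with: "'\\''") + "'"
}

/// Hypothetical helper (not part of Scarf): build just the rename step.
/// The $(date ...) substitution sits inside double quotes, so the shell
/// expands it when the user runs the line, not when Swift builds it.
func renameCommand(shadowPath: String) -> String {
    "mv \(shellQuote(shadowPath)) \(shellQuote(shadowPath))\".scarf-bak.$(date -u +%Y%m%d-%H%M%S)\""
}
```

For a path such as `/srv/app/.hermes`, the result is `mv '/srv/app/.hermes' '/srv/app/.hermes'".scarf-bak.$(date -u +%Y%m%d-%H%M%S)"`. The shell concatenates the single-quoted path and the double-quoted suffix into one argument, which is why the path can stay fully quoted while the timestamp still expands.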
@@ -0,0 +1,539 @@
import Foundation
import CryptoKit
#if canImport(os)
import os
#endif
/// Streams a Hermes home + project trees off a (local or remote) server
/// into a single `.scarfbackup` archive on disk.
///
/// **Why not just run `hermes backup`.** Hermes's CLI captures `~/.hermes/`
/// only; project file trees (the user's actual code) live outside that
/// home and aren't included. A "rebuild this droplet from scratch" flow
/// needs both. This service does both (Hermes home as one inner tarball,
/// each registered project as its own) and writes a manifest pinning the
/// source server, hermes version, and per-tarball SHA-256s so restore can
/// detect corruption before it half-extracts.
///
/// **Memory profile.** Tarballs stream over SSH (`tar -czf -`) and into
/// disk-backed temp files chunk-by-chunk via `streamRawBytes`. We never
/// hold a multi-GB buffer in RAM. The final ZIP step shells out to
/// `/usr/bin/zip`, which also streams from disk.
///
/// **Cleanup.** The temp dir lives under
/// `FileManager.default.temporaryDirectory` and is removed on every exit
/// path (success, failure, cancellation) via `defer`.
public final class RemoteBackupService: @unchecked Sendable {
#if canImport(os)
private static let logger = Logger(subsystem: "com.scarf", category: "RemoteBackupService")
#endif
public let context: ServerContext
public init(context: ServerContext) {
self.context = context
}
/// Coarse stages the UI binds to. The service publishes one of these
/// per meaningful state change so a progress sheet can render
/// "Archiving Hermes home, 412 MB so far" without polling.
public enum Progress: Sendable, Equatable {
case preflight
case checkpointingDB
case archivingHermes(bytesWritten: Int64)
case archivingProject(name: String, bytesWritten: Int64)
case bundling
case finalizing
}
public enum BackupError: Error, LocalizedError {
case preflightFailed(String)
case remoteCommandFailed(String)
case localIO(String)
case zipFailed(String)
case cancelled
public var errorDescription: String? {
switch self {
case .preflightFailed(let m): return "Backup preflight failed: \(m)"
case .remoteCommandFailed(let m): return "Remote command failed during backup: \(m)"
case .localIO(let m): return "Local file I/O failed during backup: \(m)"
case .zipFailed(let m): return "Couldn't assemble the backup archive: \(m)"
case .cancelled: return "Backup cancelled."
}
}
}
/// What the UI displays before any archiving starts. Populated by
/// `preflight()` so the user can see (and confirm) total size +
/// project count + hermes version before committing 4 minutes of
/// SSH traffic.
public struct PreflightSummary: Sendable, Equatable {
public var hermesVersion: String?
public var hermesHomePath: String
public var hermesHomeBytes: Int64?
public var projects: [ProjectSummary]
public var sqliteAvailable: Bool
public struct ProjectSummary: Sendable, Equatable {
public var id: String
public var name: String
public var path: String
public var sizeBytes: Int64?
public var reachable: Bool
}
public var totalSizeBytes: Int64? {
let parts: [Int64] = [hermesHomeBytes ?? 0] + projects.compactMap { $0.sizeBytes }
let sum = parts.reduce(0, +)
return sum > 0 ? sum : nil
}
}
public struct BackupResult: Sendable {
public var manifest: BackupManifest
public var archiveURL: URL
public var archiveSize: Int64
}
/// Probe the remote (or local) before committing to the full
/// archive. Cheap: three short SSH calls and one file read. Safe
/// to call repeatedly; nothing is mutated on the source side.
public func preflight() async throws -> PreflightSummary {
let transport = context.makeTransport()
// 1. Resolve $HOME so the absolute paths in the manifest are
// canonical (e.g. `/home/alan/.hermes`, not the
// `~`-prefixed `HermesPathSet.home`).
let homeResult = try transport.runProcess(
executable: "/bin/bash",
args: ["-lc", "echo \"$HOME\""],
stdin: nil,
timeout: 30
)
guard homeResult.exitCode == 0 else {
throw BackupError.preflightFailed("Couldn't resolve remote $HOME (exit \(homeResult.exitCode)): \(homeResult.stderrString)")
}
let resolvedHome = homeResult.stdoutString.trimmingCharacters(in: .whitespacesAndNewlines)
// 2. Hermes version. Optional: older builds may not implement
// `--version`. Empty/missing isn't fatal; the manifest just
// won't carry a version stamp.
let versionResult = try? transport.runProcess(
executable: "/bin/bash",
args: ["-lc", "hermes --version 2>/dev/null || true"],
stdin: nil,
timeout: 30
)
let hermesVersion: String? = {
guard let r = versionResult, r.exitCode == 0 else { return nil }
let trimmed = r.stdoutString.trimmingCharacters(in: .whitespacesAndNewlines)
return trimmed.isEmpty ? nil : trimmed
}()
// 3. Hermes home size + canonical path. `context.paths.home`
// can be `~/.hermes` for remotes that didn't pin
// `SSHConfig.remoteHome`; tar doesn't expand `~`, so we
// resolve every path against the just-fetched $HOME
// BEFORE storing it in the summary. `tar -C '~'` would
// fail with "No such file or directory" otherwise (and
// `du -sb '~/.hermes' 2>/dev/null` swallows the same
// error silently, which is why preflight looked green).
let hermesHome = Self.expandTilde(context.paths.home, home: resolvedHome)
let hermesSize = Self.estimateBytes(transport: transport, path: hermesHome)
// 4. Enumerate projects via the existing transport-aware
// service. An empty registry yields an empty list, not an error.
// Same tilde expansion as above so project paths stored
// in `~/.hermes/scarf/projects.json` with `~/projects/foo`
// don't blow up later in `tar -C`.
let registry = ProjectDashboardService(context: context).loadRegistry()
var projectSummaries: [PreflightSummary.ProjectSummary] = []
for project in registry.projects where !project.archived {
let expanded = Self.expandTilde(project.path, home: resolvedHome)
let reachable = transport.fileExists(expanded)
let bytes = reachable ? Self.estimateBytes(transport: transport, path: expanded) : nil
projectSummaries.append(PreflightSummary.ProjectSummary(
id: project.path, // path is the registry's stable handle
name: project.name,
path: expanded,
sizeBytes: bytes,
reachable: reachable
))
}
// 5. Is `sqlite3` on PATH? Drives the WAL-checkpoint toggle.
// If it's missing, we still archive, just without quiescing.
let sqliteCheck = try? transport.runProcess(
executable: "/bin/bash",
args: ["-lc", "command -v sqlite3 >/dev/null 2>&1 && echo yes || echo no"],
stdin: nil,
timeout: 30
)
let sqliteAvailable = sqliteCheck?.stdoutString.trimmingCharacters(in: .whitespacesAndNewlines) == "yes"
return PreflightSummary(
hermesVersion: hermesVersion,
hermesHomePath: hermesHome,
hermesHomeBytes: hermesSize,
projects: projectSummaries,
sqliteAvailable: sqliteAvailable
)
}
/// Replace a leading `~` or `~/` with the resolved remote home.
/// Tar (and most non-shell tools) don't expand tildes; only the
/// shell does, and we deliberately single-quote paths in the
/// command string for whitespace-safety, which then suppresses
/// shell expansion. So we expand here, in Swift, with a
/// known-good `$HOME` value.
static func expandTilde(_ path: String, home: String) -> String {
guard !home.isEmpty else { return path }
if path == "~" { return home }
if path.hasPrefix("~/") { return home + String(path.dropFirst(1)) }
return path
}
/// Run the full backup: stream Hermes home + each project tarball,
/// build the manifest, ZIP everything into `archiveURL`. Caller
/// holds the `Task` and can cancel; cooperative checks fire between
/// stages.
public func run(
preflight: PreflightSummary,
options: BackupManifest.Options,
archiveURL: URL,
progress: @Sendable @escaping (Progress) -> Void
) async throws -> BackupResult {
let transport = context.makeTransport()
let workDir = FileManager.default.temporaryDirectory
.appendingPathComponent("scarf-backup-\(UUID().uuidString)", isDirectory: true)
try FileManager.default.createDirectory(at: workDir, withIntermediateDirectories: true)
defer { try? FileManager.default.removeItem(at: workDir) }
try Task.checkCancellation()
progress(.preflight)
// Stage 1: WAL checkpoint (best effort). Build the state.db
// path from the already-expanded hermesHomePath rather than
// `context.paths.stateDB`, which can still carry a literal
// `~` for remotes that didn't pin `remoteHome`; sqlite3
// would fail to open the file and leave the WAL un-flushed.
var checkpointed = false
if options.checkpointedWAL && preflight.sqliteAvailable {
progress(.checkpointingDB)
let stateDB = preflight.hermesHomePath + "/state.db"
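// No trailing `|| true`: the exit code must reflect whether sqlite3
// succeeded, because `checkpointed` is recorded in the manifest.
// `try?` below already keeps a failure non-fatal.
let cmd = "sqlite3 \(Self.shellQuote(stateDB)) 'PRAGMA wal_checkpoint(TRUNCATE);'"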
let result = try? transport.runProcess(
executable: "/bin/bash",
args: ["-lc", cmd],
stdin: nil,
timeout: 60
)
checkpointed = (result?.exitCode == 0)
}
// Stage 2: Hermes home tarball.
try Task.checkCancellation()
let hermesTarball = workDir.appendingPathComponent("hermes.tar.gz")
let hermesExcludes = Self.hermesExcludes(options: options)
let hermesTarCmd = Self.tarCommand(
workDir: preflight.hermesHomePath.deletingLastPathComponent_String(),
target: ".hermes",
excludes: hermesExcludes
)
let hermesHash = try await streamToFile(
transport: transport,
command: hermesTarCmd,
destination: hermesTarball
) { written in
progress(.archivingHermes(bytesWritten: written))
}
let hermesSize = (try? FileManager.default.attributesOfItem(atPath: hermesTarball.path)[.size] as? Int64) ?? 0
// Stage 3: per-project tarballs.
let projectsDir = workDir.appendingPathComponent("projects", isDirectory: true)
try FileManager.default.createDirectory(at: projectsDir, withIntermediateDirectories: true)
var projectEntries: [BackupManifest.ProjectEntry] = []
for summary in preflight.projects where summary.reachable {
try Task.checkCancellation()
let projID = Self.stableID(forPath: summary.path)
let outerName = "\(projID).tar.gz"
let dest = projectsDir.appendingPathComponent(outerName)
let parent = (summary.path as NSString).deletingLastPathComponent
let leaf = (summary.path as NSString).lastPathComponent
let cmd = Self.tarCommand(
workDir: parent,
target: leaf,
excludes: Self.projectExcludes()
)
let hash = try await streamToFile(
transport: transport,
command: cmd,
destination: dest
) { written in
progress(.archivingProject(name: summary.name, bytesWritten: written))
}
let size = (try? FileManager.default.attributesOfItem(atPath: dest.path)[.size] as? Int64) ?? 0
projectEntries.append(BackupManifest.ProjectEntry(
id: projID,
name: summary.name,
path: summary.path,
tarballPath: BackupArchiveLayout.projectTarballPath(for: projID),
tarballSize: size,
tarballSHA256: hash
))
}
// Stage 4: build manifest, write to workDir.
try Task.checkCancellation()
let manifest = BackupManifest(
createdAt: ISO8601DateFormatter().string(from: Date()),
source: BackupManifest.Source(
serverID: context.id.uuidString,
displayName: context.displayName,
host: Self.host(for: context),
user: Self.user(for: context),
hermesVersion: preflight.hermesVersion
),
hermes: BackupManifest.HermesTree(
homePath: preflight.hermesHomePath,
tarballPath: BackupArchiveLayout.hermesTarballPath,
tarballSize: hermesSize,
tarballSHA256: hermesHash
),
projects: projectEntries,
options: BackupManifest.Options(
includeAuth: options.includeAuth,
includeMcpTokens: options.includeMcpTokens,
includeLogs: options.includeLogs,
checkpointedWAL: checkpointed
)
)
let manifestData: Data
do {
let encoder = JSONEncoder()
encoder.outputFormatting = [.prettyPrinted, .sortedKeys]
manifestData = try encoder.encode(manifest)
} catch {
throw BackupError.localIO("Couldn't encode manifest: \(error.localizedDescription)")
}
let manifestURL = workDir.appendingPathComponent(BackupArchiveLayout.manifestPath)
do {
try manifestData.write(to: manifestURL, options: .atomic)
} catch {
throw BackupError.localIO("Couldn't write manifest: \(error.localizedDescription)")
}
// Stage 5: ZIP everything in workDir into the user-chosen
// destination. Atomic via temp file + rename so a half-written
// archive isn't visible.
try Task.checkCancellation()
progress(.bundling)
let tempArchive = archiveURL.deletingLastPathComponent()
.appendingPathComponent(".\(archiveURL.lastPathComponent).inflight-\(UUID().uuidString).zip")
try Self.zipDirectory(workDir: workDir, into: tempArchive)
progress(.finalizing)
do {
if FileManager.default.fileExists(atPath: archiveURL.path) {
try FileManager.default.removeItem(at: archiveURL)
}
try FileManager.default.moveItem(at: tempArchive, to: archiveURL)
} catch {
try? FileManager.default.removeItem(at: tempArchive)
throw BackupError.localIO("Couldn't move archive into place: \(error.localizedDescription)")
}
let archiveSize = (try? FileManager.default.attributesOfItem(atPath: archiveURL.path)[.size] as? Int64) ?? 0
return BackupResult(
manifest: manifest,
archiveURL: archiveURL,
archiveSize: archiveSize
)
}
// MARK: - Streaming
/// Spawn a remote (or local) `bash -lc <cmd>` and pump its stdout
/// into `destination`, computing SHA-256 incrementally as bytes
/// arrive. Returns the hex digest. The process gets a fresh
/// `bash -lc` shell on each invocation (the same login-shell story
/// as `streamRawBytes`), so PATH picks up pipx installs etc.
private func streamToFile(
transport: any ServerTransport,
command: String,
destination: URL,
onProgress: @Sendable @escaping (Int64) -> Void
) async throws -> String {
FileManager.default.createFile(atPath: destination.path, contents: nil)
guard let fh = try? FileHandle(forWritingTo: destination) else {
throw BackupError.localIO("Couldn't open \(destination.lastPathComponent) for writing")
}
defer { try? fh.close() }
var hasher = SHA256()
var written: Int64 = 0
let stream = transport.streamRawBytes(
executable: "/bin/bash",
args: ["-lc", command]
)
do {
for try await chunk in stream {
try Task.checkCancellation()
try fh.write(contentsOf: chunk)
hasher.update(data: chunk)
written += Int64(chunk.count)
onProgress(written)
}
} catch is CancellationError {
throw BackupError.cancelled
} catch let err as TransportError {
throw BackupError.remoteCommandFailed(err.localizedDescription)
} catch {
throw BackupError.remoteCommandFailed(error.localizedDescription)
}
let digest = hasher.finalize()
return digest.map { String(format: "%02x", $0) }.joined()
}
// MARK: - Tar / shell helpers
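/// Builds the `tar -czf -` command line that streams a gzip'd
/// archive of `target` (resolved relative to `workDir`) to stdout.
///
/// An illustrative call; the paths below are made up for the
/// example and are not taken from any real host:
///
/// ```swift
/// let cmd = tarCommand(
///     workDir: "/home/alice",
///     target: ".hermes",
///     excludes: [".hermes/logs"]
/// )
/// // cmd == "tar -czf - --exclude='.hermes/logs' -C '/home/alice' '.hermes'"
/// ```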
private static func tarCommand(workDir: String, target: String, excludes: [String]) -> String {
var parts: [String] = ["tar -czf -"]
for ex in excludes {
parts.append("--exclude=\(shellQuote(ex))")
}
parts.append("-C \(shellQuote(workDir))")
parts.append(shellQuote(target))
return parts.joined(separator: " ")
}
/// Hermes-tree exclusions. Always on, regardless of options:
/// SQLite WAL siblings (would carry mid-flight writes) and runtime
/// state files (`gateway_state.json`). Auth, MCP tokens, and logs
/// are also excluded unless the matching option opts them in.
private static func hermesExcludes(options: BackupManifest.Options) -> [String] {
var excludes: [String] = [
".hermes/state.db-wal",
".hermes/state.db-shm",
".hermes/gateway_state.json",
]
if !options.includeAuth { excludes.append(".hermes/auth.json") }
if !options.includeMcpTokens { excludes.append(".hermes/mcp-tokens") }
if !options.includeLogs { excludes.append(".hermes/logs") }
return excludes
}
/// Default project-tree exclusions: things that don't restore well
/// (compiled object stores, virtualenvs that hard-code absolute
/// paths, system-specific build outputs). A future "include build
/// artefacts" toggle in the Backup sheet may let users opt in;
/// for now we always exclude these.
private static func projectExcludes() -> [String] {
[
"*/node_modules",
"*/.venv",
"*/venv",
"*/__pycache__",
"*/.git/objects",
"*/.next",
"*/dist",
"*/.DS_Store",
]
}
/// Single-quote a path / argument for embedding in a `bash -lc`
/// string. Uses POSIX-safe single quotes, escaping each embedded
/// quote (`'` becomes `'\''`).
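///
/// A quick illustration; the input string is made up for the
/// example:
///
/// ```swift
/// let quoted = shellQuote("it's")
/// // quoted == #"'it'\''s'"#
/// ```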
private static func shellQuote(_ s: String) -> String {
"'" + s.replacingOccurrences(of: "'", with: "'\\''") + "'"
}
/// Convenience: same idea as ServerContext.host, but tolerates the
/// local case (no host) by returning `"localhost"`.
private static func host(for context: ServerContext) -> String {
if case .ssh(let cfg) = context.kind {
return cfg.host
}
return "localhost"
}
private static func user(for context: ServerContext) -> String? {
if case .ssh(let cfg) = context.kind {
return cfg.user
}
return nil
}
/// `du -sb` (GNU) reports raw bytes. macOS's BSD `du` has no `-b`
/// (`du -sk` there returns kilobytes), and this helper does not
/// fall back to it. Returns nil if the command fails or its
/// output doesn't parse as an integer.
private static func estimateBytes(transport: any ServerTransport, path: String) -> Int64? {
let cmd = "du -sb \(shellQuote(path)) 2>/dev/null | awk '{print $1}'"
guard let r = try? transport.runProcess(
executable: "/bin/bash",
args: ["-lc", cmd],
stdin: nil,
timeout: 60
), r.exitCode == 0 else { return nil }
let s = r.stdoutString.trimmingCharacters(in: .whitespacesAndNewlines)
return Int64(s)
}
/// Stable ID for a project. The project registry tracks projects
/// by absolute path, but paths can differ between source and
/// target (different `$HOME`). We hash the path to get a stable
/// 16-hex-char identifier that's safe to use as a tarball
/// filename. Collisions are vanishingly unlikely: a Mac's path
/// space is small and SHA-256 truncated to 64 bits has good
/// properties for non-adversarial input.
private static func stableID(forPath path: String) -> String {
let digest = SHA256.hash(data: Data(path.utf8))
let bytes = digest.map { String(format: "%02x", $0) }.joined()
return String(bytes.prefix(16))
}
/// Shell out to `/usr/bin/zip` to assemble the outer archive.
/// macOS ships `zip` at this fixed path so we don't need a PATH
/// search. `-r` recurse, `-q` quiet, `-X` strip extended attrs
/// for reproducibility.
///
/// Mac-only: iOS doesn't ship `/usr/bin/zip` and Foundation's `Process`
/// is unavailable in the iOS SDK. The whole backup flow is a Mac-side
/// operation; the iOS stub throws so any accidental call surfaces a
/// clear message instead of an opaque link error.
private static func zipDirectory(workDir: URL, into archive: URL) throws {
#if os(iOS)
throw BackupError.zipFailed("Backup zip is not supported on iOS — run the backup from the Mac app.")
#else
let proc = Process()
proc.executableURL = URL(fileURLWithPath: "/usr/bin/zip")
proc.currentDirectoryURL = workDir
proc.arguments = ["-rqX", archive.path, "."]
let errPipe = Pipe()
proc.standardError = errPipe
proc.standardOutput = Pipe()
do {
try proc.run()
} catch {
throw BackupError.zipFailed("Couldn't launch zip: \(error.localizedDescription)")
}
proc.waitUntilExit()
if proc.terminationStatus != 0 {
let errData = (try? errPipe.fileHandleForReading.readToEnd()) ?? Data()
let tail = String(data: errData, encoding: .utf8) ?? ""
throw BackupError.zipFailed("zip exited \(proc.terminationStatus): \(tail)")
}
#endif
}
}
// MARK: - Path helpers
private extension String {
/// `(somePath as NSString).deletingLastPathComponent` lifted to a
/// String extension. Used during preflight to derive the
/// remote `$HOME` from `$HOME/.hermes`.
func deletingLastPathComponent_String() -> String {
(self as NSString).deletingLastPathComponent
}
}
@@ -0,0 +1,501 @@
import Foundation
import CryptoKit
#if canImport(os)
import os
#endif
/// Reverses a `.scarfbackup` archive into a target server: validates,
/// streams tarballs into place over SSH, and re-anchors path-bearing
/// JSON sidecars so the restored Hermes home references the new layout.
///
/// **Validation gates.** No bytes are written to the target until the
/// manifest's `kind` magic + `schemaVersion` match, and every inner
/// tarball's SHA-256 matches what the manifest claims. A corrupt
/// archive surfaces a single named-path error instead of a half-extracted
/// home.
///
/// **Path re-anchoring.** Project absolute paths in
/// `~/.hermes/scarf/projects.json` reference the source server's home
/// (e.g. `/root/projects/foo`). After extraction the project lives at
/// `<targetProjectsRoot>/foo`, so the restore rewrites `path` for each
/// entry. Same logic for `<project>/.scarf/manifest.json` if it carries
/// self-references.
///
/// **Cron paused on restore.** By default, every job in
/// `cron/jobs.json` is flipped to `enabled = false` after restore.
/// Restored cron jobs may carry stale credentials (Slack tokens,
/// webhooks) or run on schedules the user no longer wants;
/// auto-running them on a fresh droplet is
/// surprising. The user re-enables what they want from the Cron view.
public final class RemoteRestoreService: @unchecked Sendable {
#if canImport(os)
private static let logger = Logger(subsystem: "com.scarf", category: "RemoteRestoreService")
#endif
public let context: ServerContext
public init(context: ServerContext) {
self.context = context
}
public enum Progress: Sendable, Equatable {
case validating
case verifyingHashes
case planning
case restoringHermes(bytesPushed: Int64)
case restoringProject(name: String, bytesPushed: Int64)
case reanchoringPaths
case pausingCron
case finalizing
}
public enum RestoreError: Error, LocalizedError {
case archiveUnreadable(String)
case unsupportedSchema(Int)
case wrongKind(String)
case integrityCheckFailed(path: String, expected: String, actual: String)
case remoteCommandFailed(String)
case localIO(String)
case cancelled
public var errorDescription: String? {
switch self {
case .archiveUnreadable(let m): return "Couldn't read the backup archive: \(m)"
case .unsupportedSchema(let v): return "Backup uses schema v\(v), which this version of Scarf doesn't recognize."
case .wrongKind(let k): return "This file isn't a Scarf server backup (kind: \(k))."
case .integrityCheckFailed(let p, let exp, let act): return "Backup is corrupt — \(p) hash mismatch (expected \(exp.prefix(12))…, got \(act.prefix(12))…)."
case .remoteCommandFailed(let m): return "Remote command failed during restore: \(m)"
case .localIO(let m): return "Local file I/O failed during restore: \(m)"
case .cancelled: return "Restore cancelled."
}
}
}
/// What `inspect()` returns to drive the restore-plan sheet. The
/// caller picks `targetProjectsRoot`, optionally tweaks the cron
/// pause toggle, then passes this result to `run()`.
public struct InspectionResult: Sendable {
public var manifest: BackupManifest
public var workDir: URL // unzipped temp dir; reused by run()
public var targetHomeResolved: String?
public var targetHermesVersion: String?
}
public struct RestoreOptions: Sendable {
/// Where to drop project tarballs. Each project lands at
/// `<targetProjectsRoot>/<basename>`. Defaults to
/// `<targetHome>/projects` when not specified.
public var targetProjectsRoot: String?
/// Override the resolved target home (rarely needed; the
/// default is whatever `bash -lc 'echo $HOME'` returned).
public var targetHomeOverride: String?
/// Pause every cron job after restore. Strongly recommended
/// (the user re-enables intentionally).
public var pauseCronJobs: Bool
public init(
targetProjectsRoot: String? = nil,
targetHomeOverride: String? = nil,
pauseCronJobs: Bool = true
) {
self.targetProjectsRoot = targetProjectsRoot
self.targetHomeOverride = targetHomeOverride
self.pauseCronJobs = pauseCronJobs
}
}
public struct RestoreResult: Sendable {
public var manifest: BackupManifest
public var hermesHome: String
public var projectsRestored: [RestoredProject]
public var cronJobsPaused: Int
public struct RestoredProject: Sendable {
public var name: String
public var sourcePath: String
public var targetPath: String
}
}
/// Unzip + manifest-validate + hash-verify in a temp dir. Cheap
/// enough to call from a sheet's appearance handler so the user
/// sees a populated preview before committing.
public func inspect(archiveURL: URL) async throws -> InspectionResult {
let workDir = FileManager.default.temporaryDirectory
.appendingPathComponent("scarf-restore-\(UUID().uuidString)", isDirectory: true)
try FileManager.default.createDirectory(at: workDir, withIntermediateDirectories: true)
// Unzip outer archive.
try Self.unzipArchive(at: archiveURL, into: workDir)
// Decode + validate manifest.
let manifestURL = workDir.appendingPathComponent(BackupArchiveLayout.manifestPath)
guard let data = try? Data(contentsOf: manifestURL) else {
throw RestoreError.archiveUnreadable("missing manifest.json")
}
let manifest: BackupManifest
do {
manifest = try JSONDecoder().decode(BackupManifest.self, from: data)
} catch {
throw RestoreError.archiveUnreadable("manifest.json malformed: \(error.localizedDescription)")
}
guard manifest.kind == BackupManifest.kindMagic else {
throw RestoreError.wrongKind(manifest.kind)
}
guard manifest.schemaVersion == BackupManifest.currentSchemaVersion else {
throw RestoreError.unsupportedSchema(manifest.schemaVersion)
}
// Hash-verify every inner tarball before any remote bytes are
// pushed.
try await Self.verifyHash(file: workDir.appendingPathComponent(manifest.hermes.tarballPath), expected: manifest.hermes.tarballSHA256)
for project in manifest.projects {
try await Self.verifyHash(file: workDir.appendingPathComponent(project.tarballPath), expected: project.tarballSHA256)
}
// Probe the target for $HOME + hermes version. Doesn't fail
// restore if the probe times out; the user can still pick
// an override.
let transport = context.makeTransport()
let homeProbe = try? transport.runProcess(
executable: "/bin/bash",
args: ["-lc", "echo \"$HOME\""],
stdin: nil,
timeout: 30
)
let resolvedHome = homeProbe?.stdoutString.trimmingCharacters(in: .whitespacesAndNewlines)
let versionProbe = try? transport.runProcess(
executable: "/bin/bash",
args: ["-lc", "hermes --version 2>/dev/null || true"],
stdin: nil,
timeout: 30
)
let resolvedVersion = versionProbe?.stdoutString.trimmingCharacters(in: .whitespacesAndNewlines)
return InspectionResult(
manifest: manifest,
workDir: workDir,
targetHomeResolved: (resolvedHome?.isEmpty == false) ? resolvedHome : nil,
targetHermesVersion: (resolvedVersion?.isEmpty == false) ? resolvedVersion : nil
)
}
/// Run the restore. Pushes tarballs, re-anchors paths, optionally
/// pauses cron. Consumes the `workDir` from `inspect()`: the temp
/// dir is removed when this method exits, whether it returns or
/// throws. The caller cleans up only if it never calls `run`.
public func run(
inspection: InspectionResult,
options: RestoreOptions,
progress: @Sendable @escaping (Progress) -> Void
) async throws -> RestoreResult {
defer { try? FileManager.default.removeItem(at: inspection.workDir) }
let transport = context.makeTransport()
let manifest = inspection.manifest
try Task.checkCancellation()
progress(.planning)
let targetHome = options.targetHomeOverride
?? inspection.targetHomeResolved
?? (manifest.hermes.homePath as NSString).deletingLastPathComponent
let projectsRoot = options.targetProjectsRoot ?? (targetHome + "/projects")
// Make sure the projects root exists so `tar -xzf` doesn't
// fail on a missing -C target.
let mkdirCmd = "mkdir -p \(Self.shellQuote(projectsRoot))"
let mkdirResult = try? transport.runProcess(
executable: "/bin/bash",
args: ["-lc", mkdirCmd],
stdin: nil,
timeout: 30
)
if let r = mkdirResult, r.exitCode != 0 {
throw RestoreError.remoteCommandFailed("mkdir \(projectsRoot) failed: \(r.stderrString)")
}
// Stage 1: hermes home. Pushes into $HOME so the inner
// `.hermes/...` paths land at `<targetHome>/.hermes/...`.
try Task.checkCancellation()
let hermesTar = inspection.workDir.appendingPathComponent(manifest.hermes.tarballPath)
try await pushTarball(
transport: transport,
tarball: hermesTar,
extractInto: targetHome
) { written in
progress(.restoringHermes(bytesPushed: written))
}
// Stage 2: per-project tarballs.
var restoredProjects: [RestoreResult.RestoredProject] = []
for project in manifest.projects {
try Task.checkCancellation()
let tar = inspection.workDir.appendingPathComponent(project.tarballPath)
try await pushTarball(
transport: transport,
tarball: tar,
extractInto: projectsRoot
) { written in
progress(.restoringProject(name: project.name, bytesPushed: written))
}
let basename = (project.path as NSString).lastPathComponent
restoredProjects.append(RestoreResult.RestoredProject(
name: project.name,
sourcePath: project.path,
targetPath: projectsRoot + "/" + basename
))
}
// Stage 3: re-anchor `~/.hermes/scarf/projects.json` so the
// restored Hermes references the new project paths instead
// of the source droplet's paths.
try Task.checkCancellation()
progress(.reanchoringPaths)
try await reanchorProjectsRegistry(
transport: transport,
targetHome: targetHome,
mapping: Dictionary(
uniqueKeysWithValues: restoredProjects.map { ($0.sourcePath, $0.targetPath) }
)
)
// Stage 4: pause cron jobs.
var paused = 0
if options.pauseCronJobs {
try Task.checkCancellation()
progress(.pausingCron)
paused = try await pauseAllCronJobs(transport: transport, targetHome: targetHome)
}
progress(.finalizing)
return RestoreResult(
manifest: manifest,
hermesHome: targetHome + "/.hermes",
projectsRestored: restoredProjects,
cronJobsPaused: paused
)
}
// MARK: - Push (tarball -> remote stdin)
/// Stream a local `.tar.gz` into `tar -xzf - -C <target>` on the
/// destination. We use `transport.makeProcess` so the command is
/// shell-wrapped the same way the rest of the app talks to remotes
/// (`bash -lc` for SSH, direct invocation for local).
private func pushTarball(
transport: any ServerTransport,
tarball: URL,
extractInto target: String,
onProgress: @Sendable @escaping (Int64) -> Void
) async throws {
#if os(iOS)
throw RestoreError.remoteCommandFailed("Remote restore is not supported on iOS in this build.")
#else
let cmd = "tar -xzf - -C \(Self.shellQuote(target))"
let proc = transport.makeProcess(executable: "/bin/bash", args: ["-lc", cmd])
// standardInput: read end of an OS pipe whose write end we
// pump from the local tarball file. Going through a pipe (vs
// setting standardInput to a FileHandle directly) gives us
// cooperative chunk-by-chunk control + cancellation.
let inPipe = Pipe()
let outPipe = Pipe()
let errPipe = Pipe()
proc.standardInput = inPipe
proc.standardOutput = outPipe
proc.standardError = errPipe
do {
try proc.run()
} catch {
throw RestoreError.remoteCommandFailed("Couldn't start remote tar: \(error.localizedDescription)")
}
let writer = inPipe.fileHandleForWriting
let reader: FileHandle
do {
reader = try FileHandle(forReadingFrom: tarball)
} catch {
try? writer.close()
proc.terminate()
throw RestoreError.localIO("Couldn't open tarball: \(error.localizedDescription)")
}
defer { try? reader.close() }
var written: Int64 = 0
let chunkSize = 64 * 1024
do {
while true {
try Task.checkCancellation()
let chunk = reader.readData(ofLength: chunkSize)
if chunk.isEmpty { break }
try writer.write(contentsOf: chunk)
written += Int64(chunk.count)
onProgress(written)
}
} catch is CancellationError {
try? writer.close()
proc.terminate()
throw RestoreError.cancelled
} catch {
try? writer.close()
proc.terminate()
throw RestoreError.localIO("Couldn't pump tarball into remote: \(error.localizedDescription)")
}
try? writer.close() // signals EOF to the remote tar
proc.waitUntilExit()
if proc.terminationStatus != 0 {
let errData = (try? errPipe.fileHandleForReading.readToEnd()) ?? Data()
let tail = String(data: errData, encoding: .utf8) ?? ""
throw RestoreError.remoteCommandFailed("tar -x exited \(proc.terminationStatus): \(tail)")
}
#endif
}
// MARK: - Path re-anchor
/// Rewrite each entry's `path` in `~/.hermes/scarf/projects.json`
/// from source-host paths to target-host paths. We do this on the
/// remote rather than mutating the tarball locally: the Hermes
/// home tarball can be GBs and re-packing would double the
/// transfer cost. Python is universally present on droplets and
/// keeps the JSON shape intact (preserves keys we don't know
/// about).
private func reanchorProjectsRegistry(
transport: any ServerTransport,
targetHome: String,
mapping: [String: String]
) async throws {
guard !mapping.isEmpty else { return }
let registryPath = targetHome + "/.hermes/scarf/projects.json"
let mappingJSON: String
do {
let data = try JSONSerialization.data(withJSONObject: mapping)
mappingJSON = String(data: data, encoding: .utf8) ?? "{}"
} catch {
throw RestoreError.localIO("Couldn't encode path mapping: \(error.localizedDescription)")
}
let script = """
import json, os, sys
path = os.path.expanduser(\(Self.pythonQuote(registryPath)))
if not os.path.exists(path):
    sys.exit(0)
try:
    with open(path) as f: data = json.load(f)
except Exception as e:
    print(f"projects.json parse failed: {e}", file=sys.stderr); sys.exit(1)
mapping = json.loads(\(Self.pythonQuote(mappingJSON)))
for entry in data.get('projects', []):
    old = entry.get('path')
    if old in mapping: entry['path'] = mapping[old]
with open(path, 'w') as f: json.dump(data, f, indent=2)
"""
let cmd = "python3 -c \(Self.shellQuote(script))"
let result = try? transport.runProcess(
executable: "/bin/bash",
args: ["-lc", cmd],
stdin: nil,
timeout: 60
)
if let r = result, r.exitCode != 0 {
throw RestoreError.remoteCommandFailed("Path re-anchor failed: \(r.stderrString)")
}
}
/// Set `enabled: false` on every cron job. Returns the count
/// flipped (0 if jobs.json is absent or the remote script fails).
private func pauseAllCronJobs(transport: any ServerTransport, targetHome: String) async throws -> Int {
let path = targetHome + "/.hermes/cron/jobs.json"
let script = """
import json, os, sys
path = os.path.expanduser(\(Self.pythonQuote(path)))
if not os.path.exists(path):
    print(0); sys.exit(0)
with open(path) as f: data = json.load(f)
count = 0
for job in data.get('jobs', []):
    if job.get('enabled', False):
        job['enabled'] = False
        count += 1
with open(path, 'w') as f: json.dump(data, f, indent=2)
print(count)
"""
let cmd = "python3 -c \(Self.shellQuote(script))"
let result = try? transport.runProcess(
executable: "/bin/bash",
args: ["-lc", cmd],
stdin: nil,
timeout: 60
)
if let r = result, r.exitCode == 0 {
let count = Int(r.stdoutString.trimmingCharacters(in: .whitespacesAndNewlines)) ?? 0
return count
}
return 0
}
// MARK: - Helpers
/// Mac-only: iOS doesn't ship `/usr/bin/unzip` and Foundation's
/// `Process` is unavailable in the iOS SDK. Restore is initiated from
/// the Mac app; the iOS stub throws so any accidental call surfaces a
/// clear message instead of a link-time failure.
private static func unzipArchive(at archive: URL, into dest: URL) throws {
#if os(iOS)
throw RestoreError.archiveUnreadable("Restore unzip is not supported on iOS — run the restore from the Mac app.")
#else
let proc = Process()
proc.executableURL = URL(fileURLWithPath: "/usr/bin/unzip")
proc.arguments = ["-q", archive.path, "-d", dest.path]
let errPipe = Pipe()
proc.standardError = errPipe
proc.standardOutput = Pipe()
do {
try proc.run()
} catch {
throw RestoreError.archiveUnreadable("Couldn't launch unzip: \(error.localizedDescription)")
}
proc.waitUntilExit()
if proc.terminationStatus != 0 {
let errData = (try? errPipe.fileHandleForReading.readToEnd()) ?? Data()
let tail = String(data: errData, encoding: .utf8) ?? ""
throw RestoreError.archiveUnreadable("unzip exited \(proc.terminationStatus): \(tail)")
}
#endif
}
/// Hash a local file in 1 MB chunks. We avoid loading the whole
/// file into memory because tarballs can be multi-GB.
private static func verifyHash(file: URL, expected: String) async throws {
guard let fh = try? FileHandle(forReadingFrom: file) else {
throw RestoreError.archiveUnreadable("missing inner file: \(file.lastPathComponent)")
}
defer { try? fh.close() }
var hasher = SHA256()
let chunkSize = 1024 * 1024
while true {
let chunk = fh.readData(ofLength: chunkSize)
if chunk.isEmpty { break }
hasher.update(data: chunk)
}
let actual = hasher.finalize().map { String(format: "%02x", $0) }.joined()
if actual != expected {
throw RestoreError.integrityCheckFailed(path: file.lastPathComponent, expected: expected, actual: actual)
}
}
private static func shellQuote(_ s: String) -> String {
"'" + s.replacingOccurrences(of: "'", with: "'\\''") + "'"
}
/// Python source-literal quoting. Triple-quoted with backslash
/// escapes for embedded triple-quotes, backslashes, and the
/// language's own escape sequences. Used to safely embed JSON +
/// path strings into a `python3 -c '...'` invocation.
private static func pythonQuote(_ s: String) -> String {
let escaped = s
.replacingOccurrences(of: "\\", with: "\\\\")
.replacingOccurrences(of: "\"\"\"", with: "\\\"\\\"\\\"")
return "\"\"\"" + escaped + "\"\"\""
}
}
@@ -0,0 +1,251 @@
import Foundation
/// Pure block-splice logic for Scarf's managed regions inside
/// `~/.hermes/.env`. Each registered project that has at least one
/// resolved secret carries one block, bounded by:
///
/// ```
/// # scarf-secrets:begin <slug>
/// SCARF_<UPPER_SLUG>_<UPPER_FIELDKEY>=<value>
/// ...
/// # scarf-secrets:end <slug>
/// ```
///
/// The Mac wraps this in `KeychainEnvMirror` (Keychain-aware, atomic
/// write, mode-0600 enforcement). This file handles only the marker
/// contract + key naming + splice logic that's testable in isolation
/// against an in-memory string and shared across hosts.
///
/// **Why `~/.hermes/.env`.** Hermes's cron scheduler reloads that file
/// fresh on every tick (cron/scheduler.py:897-903), so values become
/// available to the agent's tool-invoked subprocesses (terminal,
/// code_exec) without any Hermes-side change. Per-project `.env` is
/// not loaded at cron time today, hence we mirror into the global
/// file with namespaced keys.
///
/// **Marker contract is load-bearing.** Both markers carry the slug on
/// the same line so a multi-project file is parsed deterministically
/// and one project's edits can't disturb another's block.
public enum SecretsEnvBlock {
/// Stable across releases: entries on disk reference these
/// strings and a marker change would orphan every existing block.
public static let beginMarkerPrefix = "# scarf-secrets:begin "
public static let endMarkerPrefix = "# scarf-secrets:end "
// MARK: - Key naming
/// Build the env-var name for a (slug, fieldKey) pair. Uppercases,
/// replaces every non-alphanumeric character with `_`, prefixes
/// `SCARF_`. Stable: rotating a value writes to the same key.
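///
/// A usage sketch; the slug and field key are illustrative only:
///
/// ```swift
/// let key = SecretsEnvBlock.envKeyName(slug: "my-app", fieldKey: "api.key")
/// // key == "SCARF_MY_APP_API_KEY"
/// ```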
public static func envKeyName(slug: String, fieldKey: String) -> String {
"SCARF_" + sanitize(slug) + "_" + sanitize(fieldKey)
}
private static func sanitize(_ s: String) -> String {
var out = ""
for scalar in s.unicodeScalars {
let c = Character(scalar)
let isAlpha = ("A"..."Z").contains(c) || ("a"..."z").contains(c)
let isDigit = ("0"..."9").contains(c)
if isAlpha || isDigit {
out.append(Character(scalar.properties.uppercaseMapping))
} else {
out.append("_")
}
}
// Collapse runs of underscores so `foo--bar` doesn't become
// `FOO__BAR` (two underscores trips dotenv parsers more often
// than one). Trim leading/trailing underscores too.
while out.contains("__") {
out = out.replacingOccurrences(of: "__", with: "_")
}
while out.hasPrefix("_") { out.removeFirst() }
while out.hasSuffix("_") { out.removeLast() }
return out.isEmpty ? "UNNAMED" : out
}
// MARK: - Block render
/// Render the bounded block for a single project. Empty `entries`
/// produces an empty string; callers should treat that as
/// "remove the project's block" rather than "write an empty
/// block." `entries` are emitted in stable sort order so two
/// runs with the same input produce byte-identical output.
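///
/// A sketch of the output shape, using made-up keys and values
/// (a value containing whitespace gets single-quoted):
///
/// ```swift
/// let block = SecretsEnvBlock.renderBlock(
///     slug: "my-app",
///     entries: [
///         (key: "SCARF_MY_APP_TOKEN", value: "abc123"),
///         (key: "SCARF_MY_APP_NOTE", value: "two words"),
///     ]
/// )
/// // block ==
/// // # scarf-secrets:begin my-app
/// // SCARF_MY_APP_NOTE='two words'
/// // SCARF_MY_APP_TOKEN=abc123
/// // # scarf-secrets:end my-app
/// ```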
public static func renderBlock(
slug: String,
entries: [(key: String, value: String)]
) -> String {
guard !entries.isEmpty else { return "" }
let sorted = entries.sorted { $0.key < $1.key }
var lines: [String] = []
lines.append(beginMarkerPrefix + slug)
for entry in sorted {
lines.append("\(entry.key)=\(escape(entry.value))")
}
lines.append(endMarkerPrefix + slug)
return lines.joined(separator: "\n")
}
/// Quote values that would confuse python-dotenv: anything with
/// whitespace, `#`, `$`, backslashes, or quote characters. Single
/// quotes around the value are dotenv-canonical and preserve
/// `$`-style references literally (no shell expansion). Embedded
/// single quotes are escaped by closing and reopening: `'foo'\''bar'`.
private static func escape(_ value: String) -> String {
let needsQuoting = value.contains(where: { c in
c.isWhitespace || c == "#" || c == "$" || c == "\"" || c == "'" || c == "\\"
})
if !needsQuoting { return value }
let escaped = value.replacingOccurrences(of: "'", with: "'\\''")
return "'" + escaped + "'"
}
// MARK: - Splice
/// Splice `block` (already-rendered, with markers) into `existing`
/// for the named `slug`. Three cases:
/// 1. `existing` already has a `# scarf-secrets:begin <slug>` /
/// `# scarf-secrets:end <slug>` pair: replace the inclusive
/// region. Other slugs' blocks are preserved byte-identically.
/// 2. `existing` has no block for this slug: append after a
/// blank line at the end of file.
/// 3. `block` is empty: behave like `removeBlock`.
///
/// Idempotent: feeding the output of one call back through
/// `applyBlock` with the same inputs produces the same string.
public static func applyBlock(
_ block: String,
forSlug slug: String,
to existing: String
) -> String {
if block.isEmpty {
return removeBlock(forSlug: slug, from: existing)
}
if let region = blockRange(forSlug: slug, in: existing) {
// Replace the inclusive region. `blockRange` covers the
// begin marker line through the end marker line plus any
// trailing newline, so `removeBlock` doesn't leave a
// dangling blank line. For `applyBlock`, though, we need to
// re-emit that trailing newline so a round-trip
// (mirror -> read -> mirror with identical entries) produces
// byte-identical output. Without this, the second mirror
// would write a file shorter by one newline byte and
// bump the file's mtime, breaking the
// no-op-when-unchanged contract that the launch
// reconciler relies on.
let before = String(existing[existing.startIndex..<region.lowerBound])
let after = String(existing[region.upperBound..<existing.endIndex])
// Restore a trailing newline only when the consumed region
// had one (i.e., the block wasn't at end-of-string with
// no terminating newline).
let consumedTrailingNewline = region.upperBound > existing.startIndex
&& existing[existing.index(before: region.upperBound)] == "\n"
let separator = consumedTrailingNewline ? "\n" : ""
return before + block + separator + after
}
// Append at end of file, separated from preceding content by
// a blank line. Empty-or-whitespace files just become the
// block plus a trailing newline.
let trimmed = existing.trimmingCharacters(in: .whitespacesAndNewlines)
if trimmed.isEmpty {
return block + "\n"
}
let normalized = trimmingRightNewlines(existing)
return normalized + "\n\n" + block + "\n"
}
/// Strip the bounded block for `slug` from `existing`. No-op when
/// absent. Preserves all other slugs' blocks and user-authored
/// content byte-identically.
public static func removeBlock(forSlug slug: String, from existing: String) -> String {
guard let region = blockRange(forSlug: slug, in: existing) else {
return existing
}
let before = String(existing[existing.startIndex..<region.lowerBound])
let after = String(existing[region.upperBound..<existing.endIndex])
// Collapse the blank line we may have inserted at append time
// so repeated install/uninstall cycles don't accumulate
// blank lines. Specifically: if `before` ends in `\n\n` and
// `after` starts with `\n`, drop one of the newlines.
var trimmedBefore = before
var trimmedAfter = after
if trimmedBefore.hasSuffix("\n\n") && trimmedAfter.hasPrefix("\n") {
trimmedAfter.removeFirst()
} else if trimmedBefore.hasSuffix("\n\n") {
trimmedBefore.removeLast()
}
return trimmedBefore + trimmedAfter
}
// MARK: - Range scan
/// Locate the inclusive character range covering one project's
/// block, including a trailing newline if present so removal
/// doesn't leave a dangling empty line. Returns nil when the
/// block isn't present.
private static func blockRange(
forSlug slug: String,
in existing: String
) -> Range<String.Index>? {
let beginLine = beginMarkerPrefix + slug
let endLine = endMarkerPrefix + slug
// Match begin marker as a full line to guard against false
// positives where a slug is a prefix of another slug
// (e.g. "foo" vs "foo-bar"). Require the marker to be
// followed immediately by `\n` or end-of-string.
guard let beginRange = lineRange(of: beginLine, in: existing) else {
return nil
}
// Search for the matching end marker AFTER the begin range. We
// can't use a leading-anchor scan because there may be other
// slugs' end markers between begin and the matching end.
let searchStart = beginRange.upperBound
guard let endRange = lineRange(of: endLine, in: existing, startingAt: searchStart) else {
return nil
}
// Include a trailing newline if the file has one immediately
// after the end marker; this keeps the file shape clean across
// remove operations.
var upper = endRange.upperBound
if upper < existing.endIndex, existing[upper] == "\n" {
upper = existing.index(after: upper)
}
return beginRange.lowerBound..<upper
}
/// Find a substring that appears as a complete line bounded by
/// start-of-string or `\n` on the left and `\n` or end-of-string
/// on the right. Returns the range of the substring itself, not
/// including any surrounding newlines.
private static func lineRange(
of needle: String,
in haystack: String,
startingAt start: String.Index? = nil
) -> Range<String.Index>? {
var searchStart = start ?? haystack.startIndex
while searchStart <= haystack.endIndex {
guard let range = haystack.range(of: needle, range: searchStart..<haystack.endIndex) else {
return nil
}
let leftOK = range.lowerBound == haystack.startIndex
|| haystack[haystack.index(before: range.lowerBound)] == "\n"
let rightOK = range.upperBound == haystack.endIndex
|| haystack[range.upperBound] == "\n"
if leftOK && rightOK {
return range
}
// Advance past this false positive and keep searching.
searchStart = range.upperBound
}
return nil
}
private static func trimmingRightNewlines(_ s: String) -> String {
var result = s
while let last = result.last, last.isNewline {
result.removeLast()
}
return result
}
}
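The full-line matching described in the `lineRange` docstring can be sketched in isolation. This is a minimal standalone copy for illustration, not the shipped helper: the `BEGIN ` marker text is a made-up stand-in for `beginMarkerPrefix`, whose real value does not appear in this diff, and the `startingAt` parameter is dropped for brevity.

```swift
import Foundation

/// Illustrative copy of the line-anchored search: a match only counts when
/// it is bounded by start-of-string or "\n" on the left and by "\n" or
/// end-of-string on the right.
func lineRange(of needle: String, in haystack: String) -> Range<String.Index>? {
    var searchStart = haystack.startIndex
    while searchStart < haystack.endIndex {
        guard let range = haystack.range(of: needle, range: searchStart..<haystack.endIndex) else {
            return nil
        }
        let leftOK = range.lowerBound == haystack.startIndex
            || haystack[haystack.index(before: range.lowerBound)] == "\n"
        let rightOK = range.upperBound == haystack.endIndex
            || haystack[range.upperBound] == "\n"
        if leftOK && rightOK {
            return range
        }
        // Substring hit inside a longer line (e.g. "foo" inside "foo-bar"):
        // skip past it and keep scanning.
        searchStart = range.upperBound
    }
    return nil
}

// Hypothetical marker text; the slug "foo" is a prefix of the slug "foo-bar".
let file = "BEGIN foo-bar\nx\nBEGIN foo\ny\n"
let hit = lineRange(of: "BEGIN foo", in: file)
let hitOffset = hit.map { file.distance(from: file.startIndex, to: $0.lowerBound) }
```

The first occurrence of `BEGIN foo` sits inside `BEGIN foo-bar` and is rejected because the character to its right is `-`, so the search lands on the second line that matches exactly.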
@@ -133,12 +133,20 @@ public struct SkillSnapshotDiff: Sendable, Equatable {
}
/// Compact label for the "What's New" pill, e.g.
/// "2 new, 4 updated since you last looked" or "1 new skill".
/// "2 new, 4 changed since you last looked" or "1 new skill".
///
/// Wording note (issue #78): we used to say "X updated since you
/// last looked" but the same screen also surfaces an "Updates"
/// sub-tab driven by `hermes skills check` (skills with newer
/// **upstream** versions available). Two surfaces with the word
/// "update" meaning two different things read as a contradiction
/// to the user. "Changed" describes the local file delta without
/// colliding with upstream-update vocabulary.
public var label: String {
switch (newCount, updatedCount) {
case (let n, 0): return n == 1 ? "1 new skill since you last looked" : "\(n) new skills since you last looked"
case (0, let u): return u == 1 ? "1 updated skill since you last looked" : "\(u) updated skills since you last looked"
default: return "\(newCount) new, \(updatedCount) updated since you last looked"
case (0, let u): return u == 1 ? "1 changed skill since you last looked" : "\(u) changed skills since you last looked"
default: return "\(newCount) new, \(updatedCount) changed since you last looked"
}
}
}
@@ -13,7 +13,12 @@ import os
public enum SkillsScanner: Sendable {
private static let logger = Logger(subsystem: "com.scarf", category: "SkillsScanner")
public static func scan(context: ServerContext, transport: any ServerTransport) -> [HermesSkillCategory] {
public static func scan(
context: ServerContext,
transport: any ServerTransport,
disabledNames: Set<String> = [],
pinnedNames: Set<String> = []
) -> [HermesSkillCategory] {
let dir = context.paths.skillsDir
// Fresh install: skills/ may not exist yet, so return [] without
// logging an error.
@@ -59,7 +64,9 @@ public enum SkillsScanner: Sendable {
requiredConfig: requiredConfig,
allowedTools: v011.allowedTools,
relatedSkills: v011.relatedSkills,
dependencies: v011.dependencies
dependencies: v011.dependencies,
enabled: !disabledNames.contains(skillName),
pinned: pinnedNames.contains(skillName)
)
}
@@ -0,0 +1,34 @@
import Foundation
/// Process-wide toggles for test-mode launches.
///
/// Read `CommandLine.arguments` once at first access and cache the result so
/// any code path can ask `TestModeFlags.shared.isTestMode` without paying for
/// a re-scan. The harness sets `--scarf-test-mode` from XCUITest's
/// `XCUIApplication.launchArguments` and pairs it with `SCARF_HERMES_HOME`
/// (read by `HermesProfileResolver`) to drive Scarf against an isolated
/// Hermes home.
///
/// The flags themselves don't do anything on their own; they're hook points
/// for production code paths to gate behavior. v1 lands the wiring; the
/// gating sites (Sparkle update prompt, capability live-probe, first-run
/// walkthrough) are added incrementally as the harness exercises them and
/// surfaces flakes.
public struct TestModeFlags: Sendable {
/// True when the process was launched with `--scarf-test-mode`. Read
/// once from `CommandLine.arguments`; never mutated.
public let isTestMode: Bool
/// Default singleton cached on first access. Production code reads
/// this; tests that need a different shape construct their own value.
public static let shared: TestModeFlags = TestModeFlags(
arguments: CommandLine.arguments
)
/// Constructor exposed for tests so a synthetic argv can be passed
/// without involving the real `CommandLine`. Production callers use
/// `.shared`.
public init(arguments: [String]) {
self.isTestMode = arguments.contains("--scarf-test-mode")
}
}
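Because the initializer takes an explicit argv, a gating site can be exercised without touching the real `CommandLine`. The sketch below re-declares a trimmed copy of the struct so it compiles on its own; `shouldShowUpdatePrompt` is a hypothetical gating function invented for this example, not one of the gating sites in the codebase.

```swift
/// Trimmed copy of the struct from the diff, repeated so the sketch is self-contained.
struct TestModeFlags {
    let isTestMode: Bool
    init(arguments: [String]) {
        self.isTestMode = arguments.contains("--scarf-test-mode")
    }
}

/// Hypothetical gating site: suppress an update prompt under the test harness.
func shouldShowUpdatePrompt(flags: TestModeFlags) -> Bool {
    return !flags.isTestMode
}

// Synthetic argv values stand in for what XCUITest would pass at launch.
let normalLaunch = TestModeFlags(arguments: ["Scarf"])
let harnessLaunch = TestModeFlags(arguments: ["Scarf", "--scarf-test-mode"])
```

Passing the flags value in as a parameter, rather than reading `.shared` inside the function, is one way to keep such gates testable.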
@@ -25,6 +25,63 @@ public struct LocalTransport: ServerTransport {
self.contextID = contextID
}
// MARK: - Environment enrichment
/// Injection point for local-subprocess environment enrichment.
/// Mirrors `SSHTransport.environmentEnricher`: the Mac app wires
/// this at launch to `HermesFileService.enrichedEnvironment()`,
/// which probes the user's login shell for PATH + credential env
/// vars. Without it, GUI-launched Scarf hands subprocesses a
/// stripped `/usr/bin:/bin:/usr/sbin:/sbin` PATH and child
/// `hermes` invocations from inside spawned workers fail with
/// `executable not found on PATH`.
///
/// Set once at app launch (startup is single-threaded). Tests may
/// inject a stub. iOS leaves this `nil` because LocalTransport
/// doesn't run subprocesses there.
nonisolated(unsafe) public static var environmentEnricher: (@Sendable () -> [String: String])?
/// Build the environment dict for a single subprocess. Process
/// env wins for keys it has; the enricher fills gaps + always
/// owns PATH (which is the whole point of running it). The
/// executable's parent directory is prepended as a final fallback
/// so `runProcess` works even before the enricher has been wired
/// (during very early startup, in tests, etc.).
nonisolated static func subprocessEnvironment(forExecutable executable: String) -> [String: String] {
var env = ProcessInfo.processInfo.environment
if let enricher = Self.environmentEnricher {
let extra = enricher()
for (key, value) in extra where !value.isEmpty {
if key == "PATH" {
// Enricher always wins for PATH; that's the
// whole reason the enricher exists. The GUI
// process PATH is the broken thing we're
// replacing.
env[key] = value
} else if (env[key] ?? "").isEmpty {
// For other keys (credential env, locale, etc.)
// an explicit non-empty value in the GUI
// environment wins; an empty or absent value
// gets filled by the shell-harvested copy.
env[key] = value
}
}
}
// Always make sure the executable's own directory is on PATH.
// This covers the case where the enricher hasn't been wired (tests,
// pre-launch helpers) but a child process still tries to spawn
// its sibling tools by bare name.
let dir = (executable as NSString).deletingLastPathComponent
if !dir.isEmpty {
let currentPATH = env["PATH"] ?? "/usr/bin:/bin:/usr/sbin:/sbin"
let parts = currentPATH.split(separator: ":").map(String.init)
if !parts.contains(dir) {
env["PATH"] = "\(dir):\(currentPATH)"
}
}
return env
}
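The precedence rules in `subprocessEnvironment` can be restated as a pure function over two dictionaries. This is a sketch under assumptions: `mergedEnvironment` is a hypothetical name, the inputs are passed in rather than read from `ProcessInfo` and the enricher, and the parent directory is derived with a plain last-slash split instead of `NSString.deletingLastPathComponent`, so edge cases such as a root-level executable behave differently.

```swift
/// Hypothetical pure restatement of the merge rules:
/// 1. the enricher always wins for PATH,
/// 2. for other keys, a non-empty process value wins and gaps get filled,
/// 3. the executable's directory is prepended to PATH if missing.
func mergedEnvironment(
    process: [String: String],
    enriched: [String: String],
    executable: String
) -> [String: String] {
    var env = process
    for (key, value) in enriched where !value.isEmpty {
        if key == "PATH" || (env[key] ?? "").isEmpty {
            env[key] = value
        }
    }
    // Simplified parent-directory lookup (assumption: absolute, slash-separated path).
    let dir = executable.lastIndex(of: "/").map { String(executable[..<$0]) } ?? ""
    if !dir.isEmpty {
        let currentPATH = env["PATH"] ?? "/usr/bin:/bin:/usr/sbin:/sbin"
        let parts = currentPATH.split(separator: ":").map(String.init)
        if !parts.contains(dir) {
            env["PATH"] = "\(dir):\(currentPATH)"
        }
    }
    return env
}

// Made-up values: a GUI-style stripped PATH versus a shell-harvested one.
let merged = mergedEnvironment(
    process: ["PATH": "/usr/bin:/bin", "LANG": "en_US.UTF-8", "TOKEN": ""],
    enriched: ["PATH": "/opt/homebrew/bin:/usr/bin", "LANG": "C", "TOKEN": "abc"],
    executable: "/Users/me/.local/bin/hermes"
)
```

Here the GUI `LANG` survives because it is non-empty, the empty `TOKEN` is filled from the enriched copy, and `PATH` is replaced wholesale before the executable's directory is put in front.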
// MARK: - Files
public func readFile(_ path: String) throws -> Data {
@@ -116,6 +173,17 @@ public struct LocalTransport: ServerTransport {
let proc = Process()
proc.executableURL = URL(fileURLWithPath: executable)
proc.arguments = args
// Hand subprocesses an environment that includes the user's
// login-shell PATH. Without this, `hermes` (pipx-installed at
// `~/.local/bin/hermes`) ends up running with macOS's GUI
// launch-services PATH (`/usr/bin:/bin:/usr/sbin:/sbin`), and
// when Hermes itself shells out to spawn a worker (e.g. the
// kanban dispatcher invoking `hermes` by name from a Python
// subprocess), it returns "executable not found on PATH" and
// the run records `outcome=spawn_failed`. Mirrors the SSH
// transport's environmentEnricher hook and is wired by
// `scarfApp.swift` at launch.
proc.environment = Self.subprocessEnvironment(forExecutable: executable)
let stdoutPipe = Pipe()
let stderrPipe = Pipe()
let stdinPipe = Pipe()
@@ -176,6 +244,55 @@ public struct LocalTransport: ServerTransport {
}
#endif
public func streamRawBytes(executable: String, args: [String]) -> AsyncThrowingStream<Data, Error> {
#if os(iOS)
return AsyncThrowingStream { $0.finish() }
#else
return AsyncThrowingStream { continuation in
Task.detached {
let proc = Process()
proc.executableURL = URL(fileURLWithPath: executable)
proc.arguments = args
let outPipe = Pipe()
let errPipe = Pipe()
proc.standardOutput = outPipe
proc.standardError = errPipe
do {
try proc.run()
} catch {
continuation.finish(throwing: error)
return
}
try? outPipe.fileHandleForWriting.close()
try? errPipe.fileHandleForWriting.close()
let handle = outPipe.fileHandleForReading
while true {
let chunk = handle.availableData
if chunk.isEmpty { break }
continuation.yield(chunk)
}
proc.waitUntilExit()
let stderrTail: String
if proc.terminationStatus != 0 {
stderrTail = (try? errPipe.fileHandleForReading.readToEnd())
.flatMap { String(data: $0 ?? Data(), encoding: .utf8) } ?? ""
} else {
stderrTail = ""
}
try? outPipe.fileHandleForReading.close()
try? errPipe.fileHandleForReading.close()
if proc.terminationStatus != 0 {
continuation.finish(throwing: TransportError.commandFailed(
exitCode: proc.terminationStatus, stderr: stderrTail
))
} else {
continuation.finish()
}
}
}
#endif
}
public func streamLines(executable: String, args: [String]) -> AsyncThrowingStream<String, Error> {
#if os(iOS)
// LocalTransport doesn't run on iOS at runtime; the iOS app
@@ -240,11 +357,33 @@ public struct LocalTransport: ServerTransport {
#endif
}
// MARK: - SQLite
// MARK: - Script streaming
public func snapshotSQLite(remotePath: String) throws -> URL {
// Local case: no copy needed. Services open the path directly.
URL(fileURLWithPath: remotePath)
/// Run `script` through `/bin/sh -c` locally. Local data path
/// doesn't actually call this in production (the data service
/// hands `LocalSQLiteBackend` the libsqlite3-direct path); kept
/// for protocol parity and for tooling that wants a uniform
/// "run a script" entry on either context kind.
public func streamScript(_ script: String, timeout: TimeInterval) async throws -> ProcessResult {
#if os(iOS)
throw TransportError.other(message: "LocalTransport.streamScript is unavailable on iOS")
#else
let outcome = await SSHScriptRunner.run(
script: script,
context: ServerContext(id: contextID, displayName: "Local", kind: .local),
timeout: timeout
)
switch outcome {
case .connectFailure(let reason):
throw TransportError.other(message: reason)
case .completed(let stdout, let stderr, let exitCode):
return ProcessResult(
exitCode: exitCode,
stdout: Data(stdout.utf8),
stderr: Data(stderr.utf8)
)
}
#endif
}
// MARK: - Watching
@@ -25,6 +25,58 @@ import Foundation
/// callers can treat both uniformly.
public enum SSHScriptRunner {
/// Thread-safe boolean flag used to bridge parent-task cancellation
/// into the detached `Task` body that owns the ssh subprocess.
/// `Task.detached { ... }` does NOT inherit cancellation from the
/// awaiting parent; without this flag, cancelling a chat-load /
/// hydration / activity-fetch Task only throws `CancellationError`
/// at the chat layer while the ssh subprocess keeps running until
/// its 30s timeout fires, pinning a remote sqlite query (and a
/// ControlMaster session slot) for the full deadline. v2.8 fix,
/// observed in 2026-05-05 dogfooding: rapid chat-switching left a
/// chain of stale 30s ssh subprocesses behind, blocking the
/// dashboard's queryBatch and producing a "spinning" load.
private final class CancelFlag: @unchecked Sendable {
private let lock = NSLock()
private var _cancelled = false
var isCancelled: Bool {
lock.lock(); defer { lock.unlock() }
return _cancelled
}
func cancel() {
lock.lock(); defer { lock.unlock() }
_cancelled = true
}
}
/// Lock-protected `Data` accumulator used by the stdout/stderr
/// readability handlers below. Two of these per script run, one per
/// stream. `@unchecked Sendable` because mutation goes through the
/// `NSLock`, which Swift can't see.
///
/// Why this exists (issue #77): the previous implementation read
/// stdout/stderr via `readToEnd()` *after* the subprocess exited.
/// On macOS pipes default to a 16-64 KB kernel buffer; once
/// `sqlite3 -json` writes more than that, the SSH client back-
/// pressures over the wire, the remote sqlite3 blocks, the script
/// never finishes, the 30 s timeout fires, and the caller sees
/// "Script timed out" + an empty result set. v2.7's
/// `sessionListSnapshot(limit: 500)` crossed that threshold for
/// any user with ~150+ sessions. Draining concurrently with
/// `readabilityHandler` removes the back-pressure.
private final class LockedData: @unchecked Sendable {
private let lock = NSLock()
private var buf = Data()
func append(_ chunk: Data) {
lock.lock(); defer { lock.unlock() }
buf.append(chunk)
}
func snapshot() -> Data {
lock.lock(); defer { lock.unlock() }
return buf
}
}
public enum Outcome: Sendable {
/// Couldn't even reach the remote (process spawn failed,
/// timeout before any output, network refused). Carries the
@@ -46,22 +98,38 @@ public enum SSHScriptRunner {
/// cross-platform we return a connect failure on non-macOS so
/// the file compiles everywhere.
public static func run(script: String, context: ServerContext, timeout: TimeInterval = 30) async -> Outcome {
#if os(macOS)
switch context.kind {
case .local:
return await runLocally(script: script, timeout: timeout)
case .ssh(let config):
return await runOverSSH(script: script, config: config, timeout: timeout)
await ScarfMon.measureAsync(.transport, "ssh.run") {
// Bridge parent cancellation into the detached subprocess
// task. Without this, killing a chat-hydration Task on a
// session switch only unwinds Swift state; the ssh
// subprocess keeps holding a remote sqlite query + a
// ControlMaster session for the full 30s timeout. v2.8.
let cancelFlag = CancelFlag()
return await withTaskCancellationHandler(
operation: {
#if os(macOS)
switch context.kind {
case .local:
return await runLocally(script: script, timeout: timeout, cancelFlag: cancelFlag)
case .ssh(let config):
return await runOverSSH(script: script, config: config, timeout: timeout, cancelFlag: cancelFlag)
}
#else
return .connectFailure("SSHScriptRunner is only available on macOS")
#endif
},
onCancel: {
cancelFlag.cancel()
ScarfMon.event(.transport, "ssh.cancelled", count: 1)
}
)
}
#else
return .connectFailure("SSHScriptRunner is only available on macOS")
#endif
}
// MARK: - SSH path
#if os(macOS)
private static func runOverSSH(script: String, config: SSHConfig, timeout: TimeInterval) async -> Outcome {
private static func runOverSSH(script: String, config: SSHConfig, timeout: TimeInterval, cancelFlag: CancelFlag) async -> Outcome {
var sshArgv: [String] = [
"-o", "ControlMaster=auto",
"-o", "ControlPath=\(SSHTransport.controlDirPath())/%C",
@@ -111,9 +179,35 @@ public enum SSHScriptRunner {
proc.standardOutput = stdoutPipe
proc.standardError = stderrPipe
// Drain stdout/stderr concurrently with the running process;
// see the LockedData docstring above for the issue-#77
// back-story. Without these handlers a >64 KB script output
// wedges the pipe + ssh + remote sqlite3 chain and the only
// visible symptom is a timeout.
let outBuf = LockedData()
let errBuf = LockedData()
stdoutPipe.fileHandleForReading.readabilityHandler = { handle in
let chunk = handle.availableData
if chunk.isEmpty {
handle.readabilityHandler = nil
} else {
outBuf.append(chunk)
}
}
stderrPipe.fileHandleForReading.readabilityHandler = { handle in
let chunk = handle.availableData
if chunk.isEmpty {
handle.readabilityHandler = nil
} else {
errBuf.append(chunk)
}
}
do {
try proc.run()
} catch {
stdoutPipe.fileHandleForReading.readabilityHandler = nil
stderrPipe.fileHandleForReading.readabilityHandler = nil
return .connectFailure("Failed to launch ssh: \(error.localizedDescription)")
}
@@ -124,14 +218,42 @@ public enum SSHScriptRunner {
let deadline = Date().addingTimeInterval(timeout)
while proc.isRunning && Date() < deadline {
// Honor BOTH the detached-task's own cancellation flag
// (set by the parent's `withTaskCancellationHandler`)
// and the legacy `Task.isCancelled` check in case the
// detached body gets cancelled directly. The flag is
// the load-bearing path; Task.isCancelled is harmless
// belt-and-suspenders.
if cancelFlag.isCancelled || Task.isCancelled {
proc.terminate()
stdoutPipe.fileHandleForReading.readabilityHandler = nil
stderrPipe.fileHandleForReading.readabilityHandler = nil
try? stdoutPipe.fileHandleForReading.close()
try? stderrPipe.fileHandleForReading.close()
return .connectFailure("Script cancelled")
}
try? await Task.sleep(nanoseconds: 100_000_000)
}
if proc.isRunning {
proc.terminate()
stdoutPipe.fileHandleForReading.readabilityHandler = nil
stderrPipe.fileHandleForReading.readabilityHandler = nil
// Pipe fds leak otherwise; closing on the timeout branch
// matches the success-path discipline (see CLAUDE.md
// "Always close both fileHandleForReading and
// fileHandleForWriting on Pipe objects").
try? stdoutPipe.fileHandleForReading.close()
try? stderrPipe.fileHandleForReading.close()
return .connectFailure("Script timed out after \(Int(timeout))s")
}
let out = (try? stdoutPipe.fileHandleForReading.readToEnd()) ?? Data()
let err = (try? stderrPipe.fileHandleForReading.readToEnd()) ?? Data()
// Detach the readabilityHandlers and capture whatever the
// accumulator has. The handler may have already seen EOF
// (`chunk.isEmpty`) and self-cleared, but assigning nil is
// idempotent and guards against a late tick from the queue.
stdoutPipe.fileHandleForReading.readabilityHandler = nil
stderrPipe.fileHandleForReading.readabilityHandler = nil
let out = outBuf.snapshot()
let err = errBuf.snapshot()
// Best-effort fd close; Pipe leaks fds otherwise.
try? stdoutPipe.fileHandleForReading.close()
try? stderrPipe.fileHandleForReading.close()
@@ -145,7 +267,7 @@ public enum SSHScriptRunner {
// MARK: - Local path
private static func runLocally(script: String, timeout: TimeInterval) async -> Outcome {
private static func runLocally(script: String, timeout: TimeInterval, cancelFlag: CancelFlag) async -> Outcome {
return await Task.detached { () -> Outcome in
let proc = Process()
proc.executableURL = URL(fileURLWithPath: "/bin/sh")
@@ -155,21 +277,61 @@ public enum SSHScriptRunner {
let stderrPipe = Pipe()
proc.standardOutput = stdoutPipe
proc.standardError = stderrPipe
// Drain concurrently; same pipe-buffer fix as runOverSSH.
// Local scripts can also blow past the 16-64 KB pipe buffer
// (e.g. local `sqlite3 -json` over a fat result set) and
// would wedge in exactly the same way.
let outBuf = LockedData()
let errBuf = LockedData()
stdoutPipe.fileHandleForReading.readabilityHandler = { handle in
let chunk = handle.availableData
if chunk.isEmpty {
handle.readabilityHandler = nil
} else {
outBuf.append(chunk)
}
}
stderrPipe.fileHandleForReading.readabilityHandler = { handle in
let chunk = handle.availableData
if chunk.isEmpty {
handle.readabilityHandler = nil
} else {
errBuf.append(chunk)
}
}
do {
try proc.run()
} catch {
stdoutPipe.fileHandleForReading.readabilityHandler = nil
stderrPipe.fileHandleForReading.readabilityHandler = nil
return .connectFailure("Failed to launch /bin/sh: \(error.localizedDescription)")
}
let deadline = Date().addingTimeInterval(timeout)
while proc.isRunning && Date() < deadline {
if cancelFlag.isCancelled || Task.isCancelled {
proc.terminate()
stdoutPipe.fileHandleForReading.readabilityHandler = nil
stderrPipe.fileHandleForReading.readabilityHandler = nil
try? stdoutPipe.fileHandleForReading.close()
try? stderrPipe.fileHandleForReading.close()
return .connectFailure("Script cancelled")
}
try? await Task.sleep(nanoseconds: 100_000_000)
}
if proc.isRunning {
proc.terminate()
stdoutPipe.fileHandleForReading.readabilityHandler = nil
stderrPipe.fileHandleForReading.readabilityHandler = nil
try? stdoutPipe.fileHandleForReading.close()
try? stderrPipe.fileHandleForReading.close()
return .connectFailure("Script timed out after \(Int(timeout))s")
}
let out = (try? stdoutPipe.fileHandleForReading.readToEnd()) ?? Data()
let err = (try? stderrPipe.fileHandleForReading.readToEnd()) ?? Data()
stdoutPipe.fileHandleForReading.readabilityHandler = nil
stderrPipe.fileHandleForReading.readabilityHandler = nil
let out = outBuf.snapshot()
let err = errBuf.snapshot()
try? stdoutPipe.fileHandleForReading.close()
try? stderrPipe.fileHandleForReading.close()
return .completed(
@@ -425,14 +425,18 @@ public struct SSHTransport: ServerTransport {
public func makeProcess(executable: String, args: [String]) -> Process {
ensureControlDir()
// `-T` disables pty allocation, which is critical for binary-clean stdin/stdout
// (ACP JSON-RPC, log tail bytes). Same sh -c wrapping as runProcess
// so home-relative paths in `executable`/`args` actually expand.
// (ACP JSON-RPC, log tail bytes). `bash -lc` (login shell) sources the
// user's profile so PATH picks up pipx's `~/.local/bin`, Homebrew on
// Linux, asdf shims, and conda envs. Plain `sh -c` is non-login, so
// pipx-installed `hermes` isn't on PATH unless `hermesBinaryHint` was
// set explicitly, which is exactly the failure that surfaces as a
// "command not found" / opaque init timeout against fresh droplets.
let cmd = ([executable] + args).map { Self.remotePathArg($0) }.joined(separator: " ")
var sshArgv = sshArgs()
sshArgv.insert("-T", at: 0)
sshArgv.append(hostSpec)
sshArgv.append("sh")
sshArgv.append("-c")
sshArgv.append("bash")
sshArgv.append("-lc")
sshArgv.append(Self.shellQuote(cmd))
let proc = Process()
proc.executableURL = URL(fileURLWithPath: sshBinary)
@@ -453,12 +457,17 @@ public struct SSHTransport: ServerTransport {
return AsyncThrowingStream { continuation in
Task.detached { [self] in
ensureControlDir()
// `bash -lc` (login shell) so PATH picks up profile-only
// entries like pipx's `~/.local/bin`; same rationale as
// `makeProcess` above. Streaming consumers (log tails)
// don't tolerate a missing-binary failure any better than
// ACP does.
let cmd = ([executable] + args).map { Self.remotePathArg($0) }.joined(separator: " ")
var sshArgv = sshArgs()
sshArgv.insert("-T", at: 0)
sshArgv.append(hostSpec)
sshArgv.append("sh")
sshArgv.append("-c")
sshArgv.append("bash")
sshArgv.append("-lc")
sshArgv.append(Self.shellQuote(cmd))
let proc = Process()
proc.executableURL = URL(fileURLWithPath: sshBinary)
@@ -514,6 +523,69 @@ public struct SSHTransport: ServerTransport {
#endif
}
public func streamRawBytes(executable: String, args: [String]) -> AsyncThrowingStream<Data, Error> {
#if os(iOS)
return AsyncThrowingStream { $0.finish() }
#else
return AsyncThrowingStream { continuation in
Task.detached { [self] in
ensureControlDir()
// Same `bash -lc` wrapping as `streamLines` so PATH picks
// up profile-only entries (pipx, asdf, conda). The
// difference here is we yield raw `Data` chunks: no
// newline framing, no UTF-8 decoding. Required for
// backup tarballs.
let cmd = ([executable] + args).map { Self.remotePathArg($0) }.joined(separator: " ")
var sshArgv = sshArgs()
sshArgv.insert("-T", at: 0)
sshArgv.append(hostSpec)
sshArgv.append("bash")
sshArgv.append("-lc")
sshArgv.append(Self.shellQuote(cmd))
let proc = Process()
proc.executableURL = URL(fileURLWithPath: sshBinary)
proc.arguments = sshArgv
proc.environment = Self.sshSubprocessEnvironment()
let outPipe = Pipe()
let errPipe = Pipe()
proc.standardOutput = outPipe
proc.standardError = errPipe
do {
try proc.run()
} catch {
continuation.finish(throwing: error)
return
}
try? outPipe.fileHandleForWriting.close()
try? errPipe.fileHandleForWriting.close()
let handle = outPipe.fileHandleForReading
while true {
let chunk = handle.availableData
if chunk.isEmpty { break }
continuation.yield(chunk)
}
proc.waitUntilExit()
let stderrTail: String
if proc.terminationStatus != 0 {
stderrTail = (try? errPipe.fileHandleForReading.readToEnd())
.flatMap { String(data: $0 ?? Data(), encoding: .utf8) } ?? ""
} else {
stderrTail = ""
}
try? outPipe.fileHandleForReading.close()
try? errPipe.fileHandleForReading.close()
if proc.terminationStatus != 0 {
continuation.finish(throwing: TransportError.classifySSHFailure(
host: config.host, exitCode: proc.terminationStatus, stderr: stderrTail
))
} else {
continuation.finish()
}
}
}
#endif
}
/// Injection point for ssh/scp subprocess environment enrichment.
///
/// On the Mac app, this is wired at startup to
@@ -548,59 +620,26 @@ public struct SSHTransport: ServerTransport {
return env
}
// MARK: - SQLite snapshot
// MARK: - Script streaming
public func snapshotSQLite(remotePath: String) throws -> URL {
try? FileManager.default.createDirectory(atPath: snapshotDir, withIntermediateDirectories: true)
let localPath = snapshotDir + "/state.db"
// `.backup` is WAL-safe: sqlite takes a consistent snapshot without
// blocking writers. A plain `cp` of a WAL-mode DB could corrupt.
let remoteTmp = "/tmp/scarf-snapshot-\(UUID().uuidString).db"
// sqlite3's `.backup` is a dot-command, not a CLI arg. The whole
// dot-command must be one shell argument (double-quoted) so sqlite3
// receives it as a single command; the backup path inside it is
// single-quoted so sqlite3 parses it correctly. The DB path is a
// separate shell argument and goes through `remotePathArg`
// (double-quoted, $HOME-aware) so `~/.hermes/state.db` actually
// resolves on the remote.
//
// The second sqlite3 invocation flips the snapshot out of WAL mode
// so the scp'd file is self-contained: `.backup` preserves the
// source's journal_mode in the destination header, so without this
// step the client would need the `-wal`/`-shm` sidecars too, and
// every read would fail with "unable to open database file".
//
// Final shell command on the remote:
// sqlite3 "$HOME/.hermes/state.db" ".backup '/tmp/scarf-snapshot-XYZ.db'" \
// && sqlite3 '/tmp/scarf-snapshot-XYZ.db' "PRAGMA journal_mode=DELETE;"
let backupScript = #"sqlite3 \#(Self.remotePathArg(remotePath)) ".backup '\#(remoteTmp)'" && sqlite3 '\#(remoteTmp)' "PRAGMA journal_mode=DELETE;" > /dev/null"#
let backup = try runRemoteShell(backupScript)
if backup.exitCode != 0 {
throw TransportError.classifySSHFailure(host: config.host, exitCode: backup.exitCode, stderr: backup.stderrString)
/// Pipe `script` to `/bin/sh -s` over the ControlMaster-shared SSH
/// channel. Used by `RemoteSQLiteBackend` to invoke `sqlite3 -json`
/// per query without the per-arg quoting that `runProcess` would
/// apply. Delegates to `SSHScriptRunner` which already implements
/// the ssh-stdin-pipe pattern correctly.
public func streamScript(_ script: String, timeout: TimeInterval) async throws -> ProcessResult {
let context = ServerContext(id: contextID, displayName: displayName, kind: .ssh(config))
let outcome = await SSHScriptRunner.run(script: script, context: context, timeout: timeout)
switch outcome {
case .connectFailure(let reason):
throw TransportError.other(message: reason)
case .completed(let stdout, let stderr, let exitCode):
return ProcessResult(
exitCode: exitCode,
stdout: Data(stdout.utf8),
stderr: Data(stderr.utf8)
)
}
// scp the backup down. scp/sftp expands `~` natively (it goes
// through the SSH file-transfer protocol, not a remote shell), so
// remoteTmp's `/tmp/...` absolute path round-trips as-is.
ensureControlDir()
var scpArgs: [String] = [
"-o", "ControlMaster=auto",
"-o", "ControlPath=\(controlDir)/%C",
"-o", "ControlPersist=600",
"-o", "StrictHostKeyChecking=accept-new",
"-o", "LogLevel=QUIET",
"-o", "BatchMode=yes"
]
if let port = config.port { scpArgs += ["-P", String(port)] }
if let id = config.identityFile, !id.isEmpty { scpArgs += ["-i", id] }
scpArgs.append("\(hostSpec):\(remoteTmp)")
scpArgs.append(localPath)
let pull = try runLocal(executable: scpBinary, args: scpArgs, stdin: nil, timeout: 120)
// Regardless of pull outcome, try to clean up the remote tmp.
_ = try? runRemoteShell("rm -f \(Self.remotePathArg(remoteTmp))")
if pull.exitCode != 0 {
throw TransportError.classifySSHFailure(host: config.host, exitCode: pull.exitCode, stderr: pull.stderrString)
}
return URL(fileURLWithPath: localPath)
}
// MARK: - Watching
@@ -685,12 +724,28 @@ public struct SSHTransport: ServerTransport {
try? stdinPipe.fileHandleForWriting.close()
}
if let timeout {
let deadline = Date().addingTimeInterval(timeout)
while proc.isRunning && Date() < deadline {
Thread.sleep(forTimeInterval: 0.1)
}
if proc.isRunning {
// Kernel-wait via DispatchGroup + terminationHandler instead
// of a 100ms Thread.sleep spin loop. The old loop burned a
// cooperative-pool thread for the full timeout duration AND
// had 100ms granularity on the deadline; this version blocks
// once on a semaphore that the OS wakes when the process
// terminates (or when the timeout fires). Net effect: under
// concurrent SSH load (sidebar reload + chat finalize +
// watcher poll all firing together) we don't accumulate
// multiple spin-blocked threads, which was the mechanism
// behind the 7-second `loadRecentSessions` outliers
// observed in remote-context perf captures.
let waitGroup = DispatchGroup()
waitGroup.enter()
proc.terminationHandler = { _ in waitGroup.leave() }
let outcome = waitGroup.wait(timeout: .now() + timeout)
proc.terminationHandler = nil
if outcome == .timedOut {
proc.terminate()
// Brief block until the kill actually lands so we can
// collect partial stdout. terminate() is async; without
// this wait the readToEnd below could race the close.
proc.waitUntilExit()
let partial = (try? stdoutPipe.fileHandleForReading.readToEnd()) ?? Data()
try? stdoutPipe.fileHandleForReading.close()
try? stderrPipe.fileHandleForReading.close()
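The "block once until either the event or the deadline" pattern from the comment above can be sketched without a subprocess. This is an assumption-laden illustration rather than the transport's code: `waitForSignal` is a hypothetical helper, a timer on a global queue stands in for `Process.terminationHandler`, and a `DispatchSemaphore` is swapped in for the `DispatchGroup` so that a timed-out wait leaves nothing unbalanced behind.

```swift
import Dispatch

/// Hypothetical helper: block the calling thread once until either a signal
/// arrives or `timeout` seconds pass. `signalAfter` simulates the moment a
/// termination handler would fire; nil means it never fires.
func waitForSignal(timeout: Double, signalAfter delay: Double?) -> DispatchTimeoutResult {
    let done = DispatchSemaphore(value: 0)
    if let delay = delay {
        DispatchQueue.global().asyncAfter(deadline: .now() + delay) {
            done.signal()
        }
    }
    // A single kernel-level wait replaces a sleep-and-poll loop; the thread
    // wakes as soon as the signal lands instead of on the next poll tick.
    return done.wait(timeout: .now() + timeout)
}

let neverFires = waitForSignal(timeout: 0.05, signalAfter: nil)
let firesEarly = waitForSignal(timeout: 2.0, signalAfter: 0.01)
```

With a real `Process`, the handler would need to be installed before the process can exit, otherwise the signal may never arrive and every wait would run to the deadline.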
@@ -81,14 +81,40 @@ public protocol ServerTransport: Sendable {
args: [String]
) -> AsyncThrowingStream<String, Error>
// MARK: - SQLite
/// Binary-safe streaming exec. Same shape as `streamLines` but yields
/// arbitrary `Data` chunks of stdout instead of newline-delimited
/// strings. Required by the backup feature: `tar -czf -` produces
/// gzipped tar bytes that must NOT be decoded as UTF-8 / split on
/// `\n`; `streamLines` would silently corrupt the archive.
///
/// Stream finishes on EOF / clean exit; errors with
/// `TransportError.commandFailed` on non-zero exit (carrying the
/// captured stderr tail). Chunk sizes are whatever the underlying
/// pipe returns from `availableData`, typically 4 to 64 KB on macOS.
nonisolated func streamRawBytes(
executable: String,
args: [String]
) -> AsyncThrowingStream<Data, Error>
/// Return a local filesystem URL pointing at a fresh, consistent copy of
/// the SQLite database at `remotePath`. For local transports this is
/// just the remote path unchanged. For SSH transports this performs
/// `sqlite3 .backup` on the remote side and scp's the backup into
/// `~/Library/Caches/scarf/<serverID>/state.db`, returning that URL.
nonisolated func snapshotSQLite(remotePath: String) throws -> URL
/// Pipe a multi-line shell script through `/bin/sh -s` on the
/// target and return its captured output. The script travels as a
/// single opaque byte stream (no per-line shell interpolation,
/// no per-arg quoting), so `"$VAR"` references, here-docs, and
/// nested quotes survive untouched.
///
/// Replaces the old `snapshotSQLite` + scp pipeline. Used by
/// `RemoteSQLiteBackend` to invoke `sqlite3 -readonly -json` over
/// SSH per query (or per batch). Local transport runs the script
/// in-process via `/bin/sh -c`. SSH transport delegates to
/// `SSHScriptRunner` (ControlMaster-shared channel). Citadel
/// transport (iOS) base64-encodes the script + decodes remotely
/// to skirt Citadel's missing-stdin support.
///
/// Throws on transport failures (host unreachable, ssh exit 255,
/// timeout). Returns `ProcessResult` with the script's exit code
/// + stdout + stderr on completion; non-zero exit is NOT a
/// throw; callers inspect `exitCode` and decide.
nonisolated func streamScript(_ script: String, timeout: TimeInterval) async throws -> ProcessResult
// MARK: - Watching
@@ -97,6 +123,25 @@ public protocol ServerTransport: Sendable {
nonisolated func watchPaths(_ paths: [String]) -> AsyncStream<WatchEvent>
}
public extension ServerTransport {
/// Default: backup-class binary streaming isn't implemented for
/// every transport (notably the iOS `CitadelServerTransport`,
/// which doesn't expose a raw stdout pipe). Concrete Mac
/// transports override this. The fallback yields a stream that
/// throws on first iteration so callers fail fast rather than
/// hanging silently.
nonisolated func streamRawBytes(
executable: String,
args: [String]
) -> AsyncThrowingStream<Data, Error> {
AsyncThrowingStream { continuation in
continuation.finish(throwing: TransportError.other(
message: "streamRawBytes is not supported on this transport"
))
}
}
}
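How a caller might observe the fail-fast fallback can be sketched as follows. This is an illustrative sketch only: `SketchTransportError`, `unsupportedByteStream`, and `collect` are hypothetical stand-ins, not names from the codebase. The point is that a stream finished with an error throws on the first iteration of `for try await`, so the caller gets a failure rather than an empty success.

```swift
import Foundation

// Hypothetical stand-in for the transport's error type.
enum SketchTransportError: Error {
    case unsupported(String)
}

// Mirrors the shape of the default implementation: a stream that
// finishes with an error before yielding any chunk.
func unsupportedByteStream() -> AsyncThrowingStream<Data, Error> {
    AsyncThrowingStream { continuation in
        continuation.finish(throwing: SketchTransportError.unsupported(
            "streamRawBytes is not supported on this transport"
        ))
    }
}

// Drains a byte stream into one buffer, surfacing the first error.
func collect(_ stream: AsyncThrowingStream<Data, Error>) async -> Result<Data, Error> {
    var buffer = Data()
    do {
        for try await chunk in stream {
            buffer.append(chunk)
        }
        return .success(buffer)
    } catch {
        return .failure(error)
    }
}
```

With this shape, a backup caller can distinguish "transport cannot stream bytes" from "archive was empty" without a timeout.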
/// Stat-style file metadata. `nil` (return value) means the file does not
/// exist or couldn't be queried.
public struct FileStat: Sendable, Hashable {
@@ -23,6 +23,13 @@ public final class ActivityViewModel {
public var toolResult: String?
public var sessionPreviews: [String: String] = [:]
public var isLoading = true
/// True while the Phase 2 background fill is paging through
/// `hydrateAssistantToolCalls`. Drives a "Loading tool details"
/// pill in the page header so the user knows the placeholder
/// rows on screen will fill in. v2.8.
public var isHydratingToolCalls = false
@ObservationIgnored
private var hydrationTask: Task<Void, Never>?
public var availableSessions: [(id: String, label: String)] {
var seen = Set<String>()
@@ -34,8 +41,29 @@ public final class ActivityViewModel {
}
public var filteredActivity: [ActivityEntry] {
let entries = toolMessages.flatMap { message in
message.toolCalls.map { call in
let entries = toolMessages.flatMap { message -> [ActivityEntry] in
// v2.8: emit a single "Loading tool calls" placeholder
// entry per skeleton message (one whose tool_calls JSON
// hasn't been hydrated yet). The user sees the timeline
// shape immediately; real entries replace the placeholder
// in-place when `hydrateAssistantToolCalls` returns.
// Session filtering still applies below. The kind filter
// lets placeholders through (see `kindOk`), even though
// `.other` is the placeholder's default kind.
guard !message.toolCalls.isEmpty else {
return [ActivityEntry(
id: "skeleton-\(message.id)",
sessionId: message.sessionId,
toolName: "Loading tool details…",
kind: .other,
summary: "",
arguments: "",
messageContent: "",
timestamp: message.timestamp,
isPlaceholder: true
)]
}
return message.toolCalls.map { call in
ActivityEntry(
id: call.callId,
sessionId: message.sessionId,
@@ -49,14 +77,34 @@ public final class ActivityViewModel {
}
}
return entries.filter { entry in
let kindOk = filterKind == nil || entry.kind == filterKind
// Placeholders bypass the kind filter so they don't all
// disappear when the user picks a non-`.other` filter
// chip; they still represent rows that may resolve to
// the matching kind once hydrated.
let kindOk = filterKind == nil || entry.isPlaceholder || entry.kind == filterKind
let sessionOk = filterSessionId == nil || entry.sessionId == filterSessionId
return kindOk && sessionOk
}
}
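The placeholder bypass in the filter above can be shown with a reduced model. This is a minimal sketch: `SketchKind`, `SketchEntry`, and `applyKindFilter` are hypothetical names that keep only the fields the kind filter looks at.

```swift
// Hypothetical reduced model of an activity entry.
enum SketchKind {
    case read, edit, other
}

struct SketchEntry {
    let id: String
    let kind: SketchKind
    let isPlaceholder: Bool
}

// Keeps an entry when no filter is set, when it is a placeholder,
// or when its kind matches the selected filter.
func applyKindFilter(_ entries: [SketchEntry], filterKind: SketchKind?) -> [SketchEntry] {
    entries.filter { entry in
        filterKind == nil || entry.isPlaceholder || entry.kind == filterKind
    }
}
```

Selecting `.read` keeps real `.read` rows plus every placeholder, since a placeholder may still resolve to a `.read` call once hydrated.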
/// Last load's transport-failure reason, if any. Activity surfaces
/// this to the user instead of leaving the empty-state visible
/// (which the user reads as "no activity" rather than "couldn't
/// reach the host"). v2.8.
public var loadError: String?
public func load() async {
// Cancel any in-flight hydration from a prior load (e.g. a
// file-watcher delta firing while the prior pass was still
// paging). The new skeleton replaces the message set, so
// hydrating against the old ids would just splice into rows
// that no longer exist.
hydrationTask?.cancel()
hydrationTask = nil
isHydratingToolCalls = false
isLoading = true
loadError = nil
// refresh() = close + reopen, which forces a fresh snapshot pull on
// remote contexts. Using open() here would short-circuit after the
// first load and show stale data for the view's lifetime. The DB
@@ -64,12 +112,68 @@ public final class ActivityViewModel {
// results without re-opening; cleanup() closes on disappear.
let opened = await dataService.refresh()
guard opened else {
loadError = "Couldn't reach \(context.displayName) — check the SSH connection and pull-to-refresh to retry."
isLoading = false
return
}
toolMessages = await dataService.fetchRecentToolCalls(limit: 200)
sessionPreviews = await dataService.fetchSessionPreviews(limit: 200)
// v2.8 Phase L: skeleton-then-hydrate. Phase 1 metadata
// fetch is bounded by 50 rows × ~50 bytes (id + session_id +
// role + timestamp; tool_calls JSON is NULLed at the SQL
// level), roughly 3 KB on the wire regardless of how big the
// underlying tool_calls blobs are. Comes back in under a
// second on healthy remotes; placeholder rows render
// immediately. Phase 2 (paged hydrate) fills the real
// tool details in via 5-id batches in the background.
let outcome = await dataService.fetchRecentToolCallSkeleton(limit: 50)
toolMessages = outcome.messages
if let reason = outcome.transportError {
loadError = "Couldn't load activity from \(context.displayName) — the connection timed out (\(reason)). Pull to refresh to retry."
isLoading = false
return
}
sessionPreviews = await dataService.fetchSessionPreviews(limit: 50)
isLoading = false
// Phase 2: background hydrate. Mirrors the chat path's
// `startToolHydration`. Newest-first (the splice happens in
// batch order), cancellable via `cleanup()` / next `load()`.
startToolCallHydration()
}
/// Phase 2 of the v2.8 Activity loader. Pages through
/// `hydrateAssistantToolCalls` in batches of 5 ids and splices
/// the parsed `[HermesToolCall]` arrays into the existing
/// `toolMessages` skeleton. Once a message has its tool calls,
/// `filteredActivity` swaps the placeholder entry for the real
/// per-call entries on the next observation tick.
private func startToolCallHydration() {
let messageIds = toolMessages
.filter { $0.toolCalls.isEmpty && $0.id > 0 }
.map(\.id)
guard !messageIds.isEmpty else {
isHydratingToolCalls = false
return
}
isHydratingToolCalls = true
let dataService = self.dataService
hydrationTask = Task { @MainActor [weak self] in
defer { self?.isHydratingToolCalls = false }
// Page in 5-id batches, matching the chat path.
// `hydrateAssistantToolCalls` already does the paging
// internally; here we just hand it all the ids and
// let it return whatever it could pull. Parent task
// cancellation propagates down via the v2.8 SSH
// cancellation handler we wired through SSHScriptRunner.
let map = await dataService.hydrateAssistantToolCalls(messageIds: messageIds)
guard let self else { return }
if Task.isCancelled { return }
if !map.isEmpty {
self.toolMessages = self.toolMessages.map { msg in
guard msg.toolCalls.isEmpty, let calls = map[msg.id] else { return msg }
return msg.withToolCalls(calls)
}
}
}
}
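The splice step at the end of the hydration task can be sketched on its own. This is a minimal sketch: `SketchMessage` and `splice` are hypothetical names, and tool calls are reduced to strings. Only messages that are still empty and have an entry in the hydrated map are replaced, so a partial result (for example after a cancelled batch) leaves the remaining skeleton rows intact.

```swift
// Hypothetical reduced model of a skeleton message.
struct SketchMessage: Equatable {
    let id: Int
    var toolCalls: [String]
}

// Fills in tool calls for skeleton messages that have a hydrated entry;
// everything else passes through unchanged.
func splice(_ hydrated: [Int: [String]], into skeleton: [SketchMessage]) -> [SketchMessage] {
    skeleton.map { msg in
        guard msg.toolCalls.isEmpty, let calls = hydrated[msg.id] else { return msg }
        var filled = msg
        filled.toolCalls = calls
        return filled
    }
}
```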
public func selectEntry(_ entry: ActivityEntry?) async {
@@ -82,6 +186,9 @@ public final class ActivityViewModel {
}
public func cleanup() async {
hydrationTask?.cancel()
hydrationTask = nil
isHydratingToolCalls = false
await dataService.close()
}
}
@@ -95,6 +202,13 @@ public struct ActivityEntry: Identifiable, Sendable {
public let arguments: String
public let messageContent: String
public let timestamp: Date?
/// True for skeleton entries emitted while the v2.8 two-phase
/// loader is still hydrating tool_calls JSON for the underlying
/// message. ActivityRow renders these as greyed "Loading" rows
/// so the user sees the timeline shape without the per-call
/// detail. Splice happens in-place when hydration completes:
/// the placeholder vanishes and the real entries take its slot.
public let isPlaceholder: Bool
public init(
id: String,
@@ -104,7 +218,8 @@ public struct ActivityEntry: Identifiable, Sendable {
summary: String,
arguments: String,
messageContent: String,
timestamp: Date?
timestamp: Date?,
isPlaceholder: Bool = false
) {
self.id = id
self.sessionId = sessionId
@@ -114,6 +229,7 @@ public struct ActivityEntry: Identifiable, Sendable {
self.arguments = arguments
self.messageContent = messageContent
self.timestamp = timestamp
self.isPlaceholder = isPlaceholder
}
public var prettyArguments: String {
@@ -16,7 +16,7 @@ public final class ConnectionStatusViewModel {
#endif
public enum Status: Equatable {
/// Healthy: SSH connected AND we can read `~/.hermes/config.yaml`.
/// Healthy: SSH connected AND we can read `~/.hermes/state.db`.
case connected
/// SSH connects but the follow-up read-access probe failed. Data
/// views will be empty until this is resolved.
@@ -38,14 +38,17 @@ public final class ConnectionStatusViewModel {
/// Specific tier-2 failure mode emitted by the probe script. Used to
/// drive both the pill copy and the popover hint (issue #53).
public enum DegradedCause: Equatable {
/// `config.yaml` is missing entirely. Most common cause: Hermes
/// hasn't run `setup` yet on this remote.
/// `state.db` is missing entirely. Most common cause: Hermes
/// is installed but no session has run on this remote yet.
/// Case name kept as `configMissing` for back-compat with
/// callers that pattern-match on it; "config" here is loose
/// for "Scarf's required state file."
case configMissing
/// `~/.hermes` itself doesn't exist. Hermes isn't installed for
/// the SSH user on this host.
case homeMissing
/// File exists but the SSH user can't read it. Permission /
/// ownership mismatch.
/// ownership mismatch. Same back-compat note as above.
case configUnreadable
/// `~/.hermes/active_profile` points at a non-default Hermes
/// profile and the configured Hermes home doesn't carry the
@@ -110,10 +113,18 @@ public final class ConnectionStatusViewModel {
let hermesHome = context.paths.home
// Two-tier probe in one SSH round-trip:
// tier 1: `true` (raw connectivity / auth / ControlMaster path)
// tier 2: `test -r $HERMESHOME/config.yaml`, i.e. can we actually
// read the file Dashboard reads on every tick? Green pill
// only if both pass; yellow "degraded" if tier 1 passes
// but tier 2 fails (the exact symptom in issue #19).
// tier 2: `test -r $HERMESHOME/state.db`, i.e. can we actually read
// the file Dashboard / Sessions / Activity all hit on
// every tick? Green pill only if both pass.
//
// Probe historically targeted `config.yaml`, but Hermes v0.11+
// doesn't materialize that file eagerly; it ships with sane
// defaults and only writes config.yaml when the user actually
// changes something. Result: a freshly-installed Hermes that's
// running, persisting sessions, and serving Scarf was being
// marked "degraded: config missing" indefinitely. `state.db`
// is created on first agent run and is the actual surface
// Scarf depends on, so we probe that instead.
// Script emits two lines: TIER1:<exitcode> and TIER2:<exitcode>.
let homeArg: String
if hermesHome.hasPrefix("~/") {
@@ -124,22 +135,21 @@ public final class ConnectionStatusViewModel {
homeArg = "\"\(hermesHome.replacingOccurrences(of: "\"", with: "\\\""))\""
}
// Probe emits a granular `TIER2:1:<cause>` code so the pill can
// surface a specific hint (issue #53) instead of the prior
// collapsed-to-binary "can't read config.yaml". Causes:
// surface a specific hint (issue #53). Causes:
// `no-home`: $H itself doesn't exist
// `missing`: config.yaml absent
// `missing`: state.db absent (Hermes hasn't been run yet)
// `perm`: exists but unreadable by SSH user
// `profile:<name>`: config missing AND ~/.hermes/active_profile
// `profile:<name>`: state.db missing AND ~/.hermes/active_profile
// points at a Hermes profile, suggesting Scarf
// is reading the wrong dir
let script = """
echo TIER1:0
H=\(homeArg)
if [ -r "$H/config.yaml" ]; then
if [ -r "$H/state.db" ]; then
echo TIER2:0
elif [ ! -d "$H" ]; then
echo TIER2:1:no-home
elif [ ! -e "$H/config.yaml" ]; then
elif [ ! -e "$H/state.db" ]; then
ACTIVE=""
if [ -r "$HOME/.hermes/active_profile" ]; then
ACTIVE=$(head -n1 "$HOME/.hermes/active_profile" 2>/dev/null | tr -d ' \\t\\r\\n')
@@ -263,23 +273,23 @@ public final class ConnectionStatusViewModel {
)
case .configMissing:
return (
"Hermes hasn't been set up yet",
"`\(hermesHome)/config.yaml` is missing. Run `hermes setup` (or your first `hermes chat`) on the remote to create it. Scarf will go green automatically once it appears."
"Hermes hasn't been run yet",
"`\(hermesHome)/state.db` is missing — Hermes creates it on first agent run. Start any session on the remote (e.g. `hermes chat`) and Scarf will go green automatically."
)
case .configUnreadable:
return (
"Permission denied on config.yaml",
"`\(hermesHome)/config.yaml` exists but the SSH user can't read it. Check ownership: `ls -l \(hermesHome)/config.yaml`. Either run Hermes as the SSH user, `chmod a+r` the file, or SSH as the Hermes user."
"Permission denied on state.db",
"`\(hermesHome)/state.db` exists but the SSH user can't read it. Check ownership: `ls -l \(hermesHome)/state.db`. Either run Hermes as the SSH user, `chmod a+r` the file, or SSH as the Hermes user."
)
case .profileActive(let name):
return (
"Hermes profile \"\(name)\" is active",
"The remote is using Hermes profile `\(name)` — its config lives at `~/.hermes/profiles/\(name)/config.yaml`, not `\(hermesHome)/config.yaml`. Either set this server's Hermes home to `~/.hermes/profiles/\(name)` in Manage Servers → Edit, or run `hermes profile use default` on the remote to revert."
"The remote is using Hermes profile `\(name)` — its state lives at `~/.hermes/profiles/\(name)/state.db`, not `\(hermesHome)/state.db`. Either set this server's Hermes home to `~/.hermes/profiles/\(name)` in Manage Servers → Edit, or run `hermes profile use default` on the remote to revert."
)
case .unknown:
return (
"Can't read Hermes state",
"SSH is fine but Scarf can't reach `\(hermesHome)/config.yaml`. Run diagnostics for a full breakdown."
"SSH is fine but Scarf can't reach `\(hermesHome)/state.db`. Run diagnostics for a full breakdown."
)
}
}
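The probe's `TIER2:<code>[:<cause>]` line format lends itself to a small parser. The parsing code is not part of this diff, so the following is a sketch under assumptions: `SketchDegradedCause` and `parseTier2` are hypothetical names, and the mapping from cause strings (`no-home`, `missing`, `perm`, `profile:<name>`) to cases is inferred from the comments and the enum shown above.

```swift
import Foundation

// Hypothetical mirror of the degraded-cause enum.
enum SketchDegradedCause: Equatable {
    case homeMissing
    case configMissing
    case configUnreadable
    case profileActive(String)
    case unknown
}

/// Returns nil when tier 2 passed (`TIER2:0`) or no TIER2 line is present.
func parseTier2(_ output: String) -> SketchDegradedCause? {
    for rawLine in output.split(separator: "\n") {
        let line = rawLine.trimmingCharacters(in: .whitespaces)
        guard line.hasPrefix("TIER2:") else { continue }
        // Split "1:profile:work" into ["1", "profile:work"].
        let parts = line.dropFirst("TIER2:".count)
            .split(separator: ":", maxSplits: 1)
            .map(String.init)
        guard let code = parts.first else { return .unknown }
        if code == "0" { return nil }
        let cause = parts.count > 1 ? parts[1] : ""
        switch cause {
        case "no-home": return .homeMissing
        case "missing": return .configMissing
        case "perm": return .configUnreadable
        default:
            if cause.hasPrefix("profile:") {
                return .profileActive(String(cause.dropFirst("profile:".count)))
            }
            return .unknown
        }
    }
    return nil
}
```

Splitting with `maxSplits: 1` keeps the profile name attached to its `profile:` prefix, which matters because the cause itself contains a colon.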
@@ -0,0 +1,279 @@
import Foundation
import Observation
#if canImport(os)
import os
#endif
/// Mac + iOS view model for the Curator surface (v0.12 base + v0.13
/// archive/prune additions).
///
/// Drives `hermes curator status / run / pause / resume / pin / unpin /
/// restore` plus (v0.13+) `archive`, `prune`, `list-archived`. All CLI
/// invocations route through `CuratorService` (the actor) so polling
/// and writes share the same concurrency model and a single error path.
///
/// Capability-gated: callers should construct this only when
/// `HermesCapabilities.hasCurator` is true. Archive-aware UI surfaces
/// (Archive button, Archived section, Prune) gate independently on
/// `hasCuratorArchive`. The view model itself doesn't gate; it exposes
/// every method and the View decides what to render.
@Observable
@MainActor
public final class CuratorViewModel {
#if canImport(os)
private let logger = Logger(subsystem: "com.scarf", category: "CuratorViewModel")
#endif
public let context: ServerContext
public private(set) var status: HermesCuratorStatus = .empty
public private(set) var isLoading = false
public private(set) var lastReportMarkdown: String?
// Archive state (v0.13+; only populated by `loadArchive()` on hosts
// where `hasCuratorArchive` is true).
public private(set) var archivedSkills: [HermesCuratorArchivedSkill] = []
public private(set) var isLoadingArchive = false
// Prune state: `pruneSummary` is non-nil while the confirm sheet is
// mid-flight; `isPruning` flips during the destructive step.
public private(set) var pruneSummary: CuratorPruneSummary?
public private(set) var isPruning = false
// Track which active-skill row is currently being archived so the
// row chrome can show an inline spinner without blocking the rest.
public private(set) var pendingArchiveName: String?
/// Happy-path success toast ("Pinned X", "Resumed", "Archived
/// legacy-helper"). Auto-clears 3s after assignment.
public var transientMessage: String?
/// Failure path: populated by every CLI verb when it throws. Shown
/// as an inline yellow banner above the status summary so users
/// don't have to dismiss a modal alert during a high-frequency
/// surface like the leaderboard. Manually dismissed via the View's
/// "x" button (sets to nil).
public var errorMessage: String?
@ObservationIgnored
private let service: CuratorService
public init(context: ServerContext) {
self.context = context
self.service = CuratorService(context: context)
}
// MARK: - Loads
public func load() async {
isLoading = true
defer { isLoading = false }
let context = self.context
// v2.8: instrumented. Curator load fires `hermes curator
// status` (CLI subprocess) plus 1-2 file reads; on remote each
// is a separate SSH RTT. Visibility lets future captures show
// how often the report file is missing or oversized.
let parsed = await ScarfMon.measureAsync(.diskIO, "curator.load") {
await Task.detached(priority: .userInitiated) { () -> (HermesCuratorStatus, String?) in
let textResult = Self.runCuratorStatus(context: context)
let stateData = context.readData(context.paths.curatorStateFile)
let parsed = HermesCuratorStatusParser.parse(text: textResult, stateFileJSON: stateData)
// Best-effort markdown report: the state file points at the
// most recent <YYYYMMDD-HHMMSS>/ dir; load REPORT.md from
// there. Missing on first run, which is fine.
var report: String?
if let reportDir = parsed.lastReportPath {
let reportPath = reportDir.hasSuffix("/")
? "\(reportDir)REPORT.md"
: "\(reportDir)/REPORT.md"
report = context.readText(reportPath)
}
return (parsed, report)
}.value
}
ScarfMon.event(
.diskIO,
"curator.load.bytes",
count: 0,
bytes: parsed.1?.utf8.count ?? 0
)
self.status = parsed.0
self.lastReportMarkdown = parsed.1
}
/// Refresh the archived-skills list. No-op on hosts without
/// `hasCuratorArchive`; the caller gates the call.
public func loadArchive() async {
isLoadingArchive = true
defer { isLoadingArchive = false }
do {
archivedSkills = try await service.listArchived()
} catch {
archivedSkills = []
errorMessage = (error as? LocalizedError)?.errorDescription
?? error.localizedDescription
}
}
// MARK: - Writes (v0.12)
/// Run the curator manually. On v0.13+ hosts this blocks for the
/// duration of the run (default 600s timeout); pre-v0.13 returns
/// immediately. Caller passes the capability-decided flag.
public func runNow(synchronous: Bool, timeout: TimeInterval = 600) async {
await runWithReload(
verb: "run",
successMessage: synchronous ? "Curator run complete" : "Curator run started"
) {
try await self.service.runNow(synchronous: synchronous, timeout: timeout)
}
}
public func pause() async {
await runWithReload(verb: "pause", successMessage: "Curator paused") {
try await self.service.pause()
}
}
public func resume() async {
await runWithReload(verb: "resume", successMessage: "Curator resumed") {
try await self.service.resume()
}
}
public func pin(_ skill: String) async {
await runWithReload(verb: "pin", successMessage: "Pinned \(skill)") {
try await self.service.pin(skill)
}
}
public func unpin(_ skill: String) async {
await runWithReload(verb: "unpin", successMessage: "Unpinned \(skill)") {
try await self.service.unpin(skill)
}
}
public func restore(_ skill: String) async {
await runWithReload(verb: "restore", successMessage: "Restored \(skill)") {
try await self.service.restore(skill)
}
// Restore drops the entry from the archived list; refresh it
// so the row disappears immediately.
await loadArchive()
}
// MARK: - Writes (v0.13)
public func archive(_ skill: String) async {
pendingArchiveName = skill
await runWithReload(verb: "archive", successMessage: "Archived \(skill)") {
try await self.service.archive(skill)
}
pendingArchiveName = nil
await loadArchive()
}
/// Stage 1 of the bulk-prune flow. Calls `prune --dry-run` and
/// populates `pruneSummary`; the View binds its confirm sheet to
/// the non-nil presence of this property.
public func planPrune() async {
do {
pruneSummary = try await service.prune(dryRun: true)
} catch {
errorMessage = (error as? LocalizedError)?.errorDescription
?? error.localizedDescription
pruneSummary = nil
}
}
/// Stage 2 of the bulk-prune flow. Destructive: removes everything
/// currently archived. Clears `pruneSummary` regardless of outcome
/// so the confirm sheet dismisses.
public func confirmPrune() async {
isPruning = true
do {
_ = try await service.prune(dryRun: false)
transientMessage = "Pruned archived skills"
errorMessage = nil
await loadArchive()
await load()
scheduleTransientClear()
} catch {
errorMessage = (error as? LocalizedError)?.errorDescription
?? error.localizedDescription
}
isPruning = false
pruneSummary = nil
}
/// Cancel the in-flight prune-confirm flow without running.
public func cancelPrune() {
pruneSummary = nil
}
/// User-driven dismissal of the inline error banner.
public func dismissError() {
errorMessage = nil
}
// MARK: - Helpers
/// Run a service call, route success to `transientMessage`, failure to
/// `errorMessage`, and reload `status` either way. Mirrors the
/// previous `runAndReload` helper but goes through the typed
/// service surface.
private func runWithReload(
verb: String,
successMessage: String,
body: @escaping @Sendable () async throws -> Void
) async {
do {
try await body()
transientMessage = successMessage
errorMessage = nil
await load()
scheduleTransientClear()
} catch {
let message = (error as? LocalizedError)?.errorDescription
?? error.localizedDescription
errorMessage = message
transientMessage = nil
await load()
}
}
private func scheduleTransientClear() {
Task { @MainActor [weak self] in
try? await Task.sleep(nanoseconds: 3_000_000_000)
self?.transientMessage = nil
}
}
// MARK: - Legacy sync helpers (kept for `load`'s detached path)
nonisolated private static func runHermes(
context: ServerContext,
args: [String]
) -> (exitCode: Int32, output: String) {
let transport = context.makeTransport()
do {
let result = try transport.runProcess(
executable: context.paths.hermesBinary,
args: args,
stdin: nil,
timeout: 30
)
return (result.exitCode, result.stdoutString + result.stderrString)
} catch let error as TransportError {
return (-1, error.diagnosticStderr.isEmpty
? (error.errorDescription ?? "transport error")
: error.diagnosticStderr)
} catch {
return (-1, error.localizedDescription)
}
}
nonisolated private static func runCuratorStatus(context: ServerContext) -> String {
runHermes(context: context, args: ["curator", "status"]).output
}
}
@@ -29,17 +29,24 @@ public final class IOSCronViewModel {
let ctx = context
let path = ctx.paths.cronJobsJSON
let result: Result<CronJobsFile, Error> = await Task.detached {
do {
guard let data = ctx.readData(path) else {
throw LoadError.missingFile(path: path)
// v2.7: instrumented for parity with Mac `cron.load`. iOS
// Cron load is a single SFTP read of jobs.json so should be
// snappy on most remotes; this measure point makes the cost
// visible in ScarfMon traces alongside the rest of the iOS
// load paths.
let result: Result<CronJobsFile, Error> = await ScarfMon.measureAsync(.diskIO, "ios.cron.load") {
await Task.detached {
do {
guard let data = ctx.readData(path) else {
throw LoadError.missingFile(path: path)
}
let decoded = try JSONDecoder().decode(CronJobsFile.self, from: data)
return .success(decoded)
} catch {
return Result<CronJobsFile, Error>.failure(error)
}
let decoded = try JSONDecoder().decode(CronJobsFile.self, from: data)
return .success(decoded)
} catch {
return .failure(error)
}
}.value
}.value
}
switch result {
case .success(let file):
@@ -96,15 +96,24 @@ public final class IOSMemoryViewModel {
// Run the file read on a detached task; `readTextThrowing`
// blocks on transport I/O, and we don't want the MainActor
// hanging during a remote SFTP fetch.
// v2.7: instrumented for parity with Mac `memory.load`.
// iOS path is one SFTP read per Memory tab open (per kind:
// memory / user / soul); the bytes counter shows payload
// size alongside latency.
let ctx = context
let path = kind.path(on: context)
let result: Result<String?, Error> = await Task.detached {
do {
return .success(try ctx.readTextThrowing(path))
} catch {
return .failure(error)
}
}.value
let result: Result<String?, Error> = await ScarfMon.measureAsync(.diskIO, "ios.memory.load") {
await Task.detached {
do {
return Result<String?, Error>.success(try ctx.readTextThrowing(path))
} catch {
return Result<String?, Error>.failure(error)
}
}.value
}
if case .success(.some(let loaded)) = result {
ScarfMon.event(.diskIO, "ios.memory.load.bytes", count: 0, bytes: loaded.utf8.count)
}
switch result {
case .success(.some(let loaded)):
@@ -117,12 +117,19 @@ public final class InsightsViewModel {
}
let since = period.sinceDate
// The four insights queries (user-message count, tool usage,
// hourly + daily activity histograms) batch through one
// `insightsSnapshot` round-trip. Sessions and session-previews
// stay separate; they're large result sets and stay on their
// own calls. For remote contexts this turns ~5 SSH round-trips
// into 3.
sessions = await dataService.fetchSessionsInPeriod(since: since)
sessionPreviews = await dataService.fetchSessionPreviews(limit: 500)
userMessageCount = await dataService.fetchUserMessageCount(since: since)
let tools = await dataService.fetchToolUsage(since: since)
hourlyActivity = await dataService.fetchSessionStartHours(since: since)
dailyActivity = await dataService.fetchSessionDaysOfWeek(since: since)
let snapshot = await dataService.insightsSnapshot(since: since)
userMessageCount = snapshot.userMessageCount
let tools = snapshot.toolUsage
hourlyActivity = snapshot.startHours
dailyActivity = snapshot.daysOfWeek
await dataService.close()
@@ -164,6 +164,16 @@ public final class ProjectsViewModel {
projects.map(\.dashboardPath)
}
/// Per-project `.scarf/` directories watched alongside `dashboardPaths`
/// so that file-reading widgets (markdown_file, log_tail, image) refresh
/// when their underlying files are added / removed / renamed inside the
/// directory by a cron job. In-place file appends within an existing
/// file are NOT detected here; the cron job should write atomically
/// (write-then-rename) or `touch` dashboard.json after each run.
public var projectScarfDirs: [String] {
projects.map(\.scarfDir)
}
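Because only adds, removes, and renames inside `.scarf/` are detected, a job that updates a widget file benefits from write-then-rename. A minimal sketch, assuming the writer is a Swift tool (`writeAtomically` is a hypothetical helper; a shell cron job would do the equivalent with a temp file and `mv`):

```swift
import Foundation

/// Hypothetical helper: replaces the file at `path` in one step.
/// `.atomic` makes Foundation write to an auxiliary file first and then
/// rename it over the destination, which shows up as a directory-level
/// change rather than an in-place append.
func writeAtomically(_ text: String, to path: String) throws {
    let url = URL(fileURLWithPath: path)
    try Data(text.utf8).write(to: url, options: .atomic)
}
```

Readers of the file never see a half-written state with this approach, which also suits widgets such as `markdown_file` that render the whole file.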
private func loadDashboard(for project: ProjectEntry) {
dashboardError = nil
if !service.dashboardExists(for: project) {
@@ -5,6 +5,7 @@
import Foundation
import Observation
import SwiftUI
public enum ChatDisplayMode: String, CaseIterable {
case terminal
@@ -63,6 +64,23 @@ public final class RichChatViewModel {
public var messages: [HermesMessage] = []
public var currentSession: HermesSession?
public var messageGroups: [MessageGroup] = []
/// True while the v2.8 two-phase loader's background hydration
/// (tool_calls JSON + tool result rows) is in flight. Chat header
/// shows "Loading tool details" so the user knows the bare
/// transcript they're looking at will fill in. Cleared once both
/// hydration passes finish or the session-id changes underneath.
public var isHydratingTools: Bool = false
@ObservationIgnored
private var hydrationTask: Task<Void, Never>?
/// UserDefaults key controlling whether the chat resume path
/// auto-fetches the CONTENT of tool result rows (`role='tool'`) for
/// past messages. Defaults to false: a single tool result blob
/// (file dump, stack trace) can be hundreds of KB; bulk-fetching
/// all of them during chat resume on a slow remote can blow past
/// the 30s SSH timeout. The Mac Settings Display tab exposes
/// the toggle (mirror string in `ChatDensityKeys`).
public static let loadHistoricalToolResultsKey = "scarf.chat.loadHistoricalToolResults"
/// True from the moment the user sends a prompt until the ACP
/// `promptComplete` event arrives. Covers the whole round-trip
/// including auxiliary post-processing (title generation, usage
@@ -120,6 +138,12 @@ public final class RichChatViewModel {
/// users can copy-paste the raw output into a bug report.
public var acpErrorDetails: String?
/// Lowercase OAuth provider name (`"nous"`, `"claude"`, etc.) when the
/// most recent failure was an OAuth refresh-revocation Hermes asked
/// the user to fix via re-authentication. Drives the chat banner's
/// "Re-authenticate" button. Nil for any other failure mode.
public var acpErrorOAuthProvider: String?
/// Optional stderr-tail provider the controller can hook up when it
/// creates the ACPClient. Used by `handlePromptComplete` to enrich
/// the error banner on non-retryable stopReasons. The closure is
@@ -134,6 +158,7 @@ public final class RichChatViewModel {
acpError = nil
acpErrorHint = nil
acpErrorDetails = nil
acpErrorOAuthProvider = nil
}
/// Populate the error triplet from a thrown Error + the ACPClient
@@ -154,10 +179,11 @@ public final class RichChatViewModel {
}
let msg = error.localizedDescription
let stderrTail = await client?.recentStderr ?? ""
let hint = ACPErrorHint.classify(errorMessage: msg, stderrTail: stderrTail)
let cls = ACPErrorHint.classify(errorMessage: msg, stderrTail: stderrTail)
acpError = msg
acpErrorHint = hint
acpErrorHint = cls?.hint
acpErrorDetails = stderrTail.isEmpty ? nil : stderrTail
acpErrorOAuthProvider = cls?.oauthProvider
}
/// Populate the error triplet when `handlePromptComplete` sees a
@@ -168,11 +194,11 @@ public final class RichChatViewModel {
public func recordPromptStopFailure(stopReason: String, client: ACPClient?) async {
let msg = "Prompt ended without a response (stopReason: \(stopReason))."
let stderrTail = await client?.recentStderr ?? ""
let hint = ACPErrorHint.classify(errorMessage: msg, stderrTail: stderrTail)
?? Self.fallbackHint(for: stopReason)
let cls = ACPErrorHint.classify(errorMessage: msg, stderrTail: stderrTail)
acpError = msg
acpErrorHint = hint
acpErrorHint = cls?.hint ?? Self.fallbackHint(for: stopReason)
acpErrorDetails = stderrTail.isEmpty ? nil : stderrTail
acpErrorOAuthProvider = cls?.oauthProvider
}
/// Same as `recordPromptStopFailure` but pulls stderr from the
@@ -182,11 +208,11 @@ public final class RichChatViewModel {
private func recordPromptStopFailureUsingProvider(stopReason: String) async {
let msg = "Prompt ended without a response (stopReason: \(stopReason))."
let stderrTail = await acpStderrProvider?() ?? ""
let hint = ACPErrorHint.classify(errorMessage: msg, stderrTail: stderrTail)
?? Self.fallbackHint(for: stopReason)
let cls = ACPErrorHint.classify(errorMessage: msg, stderrTail: stderrTail)
acpError = msg
acpErrorHint = hint
acpErrorHint = cls?.hint ?? Self.fallbackHint(for: stopReason)
acpErrorDetails = stderrTail.isEmpty ? nil : stderrTail
acpErrorOAuthProvider = cls?.oauthProvider
}
private static func fallbackHint(for stopReason: String) -> String? {
@@ -203,6 +229,12 @@ public final class RichChatViewModel {
public private(set) var acpOutputTokens = 0
public private(set) var acpThoughtTokens = 0
public private(set) var acpCachedReadTokens = 0
/// Running count of context compactions Hermes has performed on this
/// session. Surfaced as the `🗜 ×N` chip in `SessionInfoBar` when > 0
/// and `HermesCapabilities.hasContextCompressionCount` is true. Each
/// `session/prompt` response carries the latest server-side total, so
/// we replace (with a `max` guard) rather than accumulate.
public private(set) var acpCompressionCount = 0
/// Slash commands advertised by the ACP server via `available_commands_update`.
public private(set) var acpCommands: [HermesSlashCommand] = []
@@ -222,15 +254,73 @@ public final class RichChatViewModel {
/// Hermes v2026.4.23+ but listed here unconditionally so older
/// hosts that don't advertise it still surface the trigger; the
/// agent will respond appropriately or no-op gracefully.
///
/// v2.8 / Hermes v0.13 adds `/goal` (lock the agent on a target
/// across turns) and `/queue` (queue a prompt for after the current
/// turn). Both ride the same `.acpNonInterruptive` source: Hermes
/// parses them server-side, the wire shape is plain
/// `session/prompt`, and the chat UI keeps the "Agent working"
/// indicator off when they're sent. They're listed unconditionally
/// here; capability filtering happens in `availableCommands` so
/// pre-v0.13 hosts don't see `/goal` or `/queue` in the slash menu.
// TODO(WS-2-Q7): verify against a real v0.13 ACP host that `/goal`
// is in fact non-interruptive on the wire. If Hermes treats it as a
// regular prompt that flips "Agent working", drop it from this
// list and route it through the standard send path (the pill
// bookkeeping in `recordActiveGoal` is independent of the
// interruptive classification).
public static let nonInterruptiveCommands: [HermesSlashCommand] = [
HermesSlashCommand(
name: "steer",
description: "Nudge the agent mid-run (applies after the next tool call)",
argumentHint: "<guidance>",
source: .acpNonInterruptive
),
HermesSlashCommand(
name: "goal",
description: "Lock the agent on a goal that persists across turns",
argumentHint: "<text>",
source: .acpNonInterruptive
),
HermesSlashCommand(
name: "queue",
description: "Queue a prompt to run after the current turn",
argumentHint: "<text>",
source: .acpNonInterruptive
)
]
/// Capability snapshot the chat surface uses to filter
/// `availableCommands`. Set by the chat controller (Mac
/// `ChatViewModel`, iOS `ChatController`) at session-start time and
/// kept fresh via the `HermesCapabilitiesStore` env binding. Default
/// `.empty` means "no v0.13 surfaces": pre-v0.13 hosts and harness
/// scenarios (Previews, smoke tests) never expose `/goal` or
/// `/queue` until the controller publishes a real capabilities
/// value. `@ObservationIgnored` so capability refreshes don't trash
/// the streaming-message render budget; controllers call
/// `publishCapabilities(_:)` once per refresh tick.
@ObservationIgnored
public var capabilitiesGate: HermesCapabilities = .empty
/// Optimistic local mirror of the agent's currently-locked goal.
/// Set by `recordActiveGoal(text:)` the moment the user sends
/// `/goal <text>`; cleared on `/goal --clear` or `reset()`. Pre-v0.13
/// hosts can't reach this code path (the slash menu hides `/goal`),
/// but a typed-out `/goal foo` against an older host would still
/// set this briefly until Hermes' "unknown command" reply arrives;
/// see WS-2 plan "Inconsistency caveat".
public private(set) var activeGoal: HermesActiveGoal?
/// Optimistic mirror of prompts the user has queued via `/queue <text>`
/// while a turn is in flight. Hermes is the authoritative owner
/// server-side; this list drives the chat-header chip + popover and
/// drains FIFO via `popQueuedPrompt()` when a turn completes.
/// Best-effort: if Hermes' server-side queue gets out of sync
/// (deferred prompt aborted, dropped on disconnect) the user sees a
/// stale chip until their next interaction.
public private(set) var queuedPrompts: [HermesQueuedPrompt] = []
/// Transient hint shown above the composer, e.g. "Guidance queued
/// applies after the next tool call." for `/steer`. The chat view
/// auto-clears it after a short delay (handled in the view); the
@@ -292,12 +382,94 @@ public final class RichChatViewModel {
!acpNames.contains($0.name) && !projectNames.contains($0.name)
}
let occupied = acpNames.union(projectNames).union(Set(quicks.map(\.name)))
let nonInterruptive = Self.nonInterruptiveCommands.filter {
!occupied.contains($0.name)
// Capability gate: `/goal` and `/queue` are v0.13+ surfaces;
// hide them when the connected host is older. `/steer` is
// surfaced unconditionally: it works on v0.11+ during an
// active turn; idle-session greying for pre-v0.13 hosts is
// the input bar's concern (it reads `hasACPSteerOnIdle`).
let supported: [HermesSlashCommand] = Self.nonInterruptiveCommands.filter { cmd in
switch cmd.name {
case "goal": return capabilitiesGate.hasGoals
case "queue": return capabilitiesGate.hasACPQueue
case "steer": return true
default: return true
}
}
let nonInterruptive = supported.filter { !occupied.contains($0.name) }
return acpCommands + projectAsHermes + quicks + nonInterruptive
}
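The capability gate above can be read as a pure filter. The following is a minimal sketch, assuming simplified stand-in types: `Capabilities`, `SlashCommand`, and `gatedCommands` are hypothetical names for illustration, not the types in this diff (the real `HermesCapabilities` and `HermesSlashCommand` carry more fields).

```swift
// Hypothetical stand-ins for HermesCapabilities / HermesSlashCommand.
struct Capabilities {
    var hasGoals = false
    var hasACPQueue = false
}

struct SlashCommand: Equatable {
    let name: String
}

/// Keep only commands the connected host supports and whose names are
/// not already taken by ACP, project-scoped, or quick commands.
func gatedCommands(
    _ candidates: [SlashCommand],
    capabilities: Capabilities,
    occupied: Set<String>
) -> [SlashCommand] {
    candidates.filter { cmd in
        let supported: Bool
        switch cmd.name {
        case "goal": supported = capabilities.hasGoals
        case "queue": supported = capabilities.hasACPQueue
        default: supported = true  // e.g. "steer" is never gated
        }
        return supported && !occupied.contains(cmd.name)
    }
}

let all = ["steer", "goal", "queue"].map(SlashCommand.init(name:))

// A host with no v0.13 capabilities only sees "steer".
let legacy = gatedCommands(all, capabilities: Capabilities(), occupied: [])

// A v0.13 host sees "goal" too; "queue" is dropped here because the
// name is already occupied by another command source.
let modern = gatedCommands(
    all,
    capabilities: Capabilities(hasGoals: true, hasACPQueue: true),
    occupied: ["queue"]
)
```

Defaulting every capability to `false` mirrors the `.empty` default described above: previews and smoke tests stay on the pre-v0.13 menu until a controller publishes a real snapshot.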
/// Publish a fresh capabilities snapshot from the controller.
/// Called whenever `HermesCapabilitiesStore.capabilities` changes
/// (initial detection, post-refresh, server switch). The chat input
/// bar's slash menu re-reads `availableCommands` lazily, so this is
/// just a stored-value swap, with no observable churn.
public func publishCapabilities(_ caps: HermesCapabilities) {
capabilitiesGate = caps
}
/// Optimistic write triggered when the user sends `/goal <text>`.
/// Pass `nil` (or empty) to clear (the `/goal --clear` path). The
/// pill renders synchronously off this state; there is no
/// authoritative server read-back in v2.8.0; see WS-2 plan Q1.
// TODO(WS-2-Q1): hook a Hermes-supplied goal-state read-back path
// here once we know whether v0.13 exposes goal state via an ACP
// session-startup notification, a session-sidecar JSON field, or a
// `/goal --status` reply. Until then `activeGoal` is purely
// user-set and does not survive a session resume.
public func recordActiveGoal(text: String?) {
if let text, !text.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty {
activeGoal = HermesActiveGoal(
text: text.trimmingCharacters(in: .whitespacesAndNewlines),
setAt: Date()
)
} else {
activeGoal = nil
}
}
/// Append an optimistically-queued prompt to the local mirror
/// (driven by `/queue <text>`). No-op for empty / whitespace input.
public func recordQueuedPrompt(text: String) {
let trimmed = text.trimmingCharacters(in: .whitespacesAndNewlines)
guard !trimmed.isEmpty else { return }
queuedPrompts.append(HermesQueuedPrompt(text: trimmed))
}
/// Drain the next queued prompt off the local mirror, FIFO. Called
/// from `handlePromptComplete` once a turn settles. Hermes runs
/// the actual queued prompt server-side; popping here keeps the
/// header chip count honest. Returns the popped prompt for any
/// caller that wants to log it; the chat UI ignores the return.
@discardableResult
public func popQueuedPrompt() -> HermesQueuedPrompt? {
queuedPrompts.isEmpty ? nil : queuedPrompts.removeFirst()
}
/// Parse the argument slug from a `/goal` invocation. Pure
/// function exposed for unit tests. The chat dispatch reads this
/// to decide whether to set, clear, or no-op the optimistic pill.
public enum GoalCommandArgument: Equatable {
case set(String)
case clear
/// User typed `/goal` with no argument. Hermes will reply
/// with usage; Scarf shows a neutral hint and doesn't touch
/// the pill state.
case empty
}
public static func parseGoalArgument(_ raw: String) -> GoalCommandArgument {
let trimmed = raw.trimmingCharacters(in: .whitespacesAndNewlines)
if trimmed.isEmpty { return .empty }
// Accept `--clear`, `clear`, and case-insensitive variants so
// typos don't accidentally lock the goal text to literal
// "Clear". `--clear` is the canonical form (matches Hermes
// CLI flag style).
let lowered = trimmed.lowercased()
if lowered == "--clear" || lowered == "clear" { return .clear }
return .set(trimmed)
}
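The parsing rules above are easiest to check with a few inputs. This is a standalone sketch that restates the same logic outside the view model; `GoalArgument` is a hypothetical local copy of `GoalCommandArgument`, used only so the example compiles on its own.

```swift
import Foundation

// Hypothetical local copy of the enum, for a self-contained example.
enum GoalArgument: Equatable {
    case set(String)
    case clear
    case empty
}

func parseGoalArgument(_ raw: String) -> GoalArgument {
    let trimmed = raw.trimmingCharacters(in: .whitespacesAndNewlines)
    if trimmed.isEmpty { return .empty }
    // `--clear` and bare `clear` both clear, case-insensitively.
    let lowered = trimmed.lowercased()
    if lowered == "--clear" || lowered == "clear" { return .clear }
    return .set(trimmed)
}

let blank = parseGoalArgument("   ")
let flag = parseGoalArgument("  --CLEAR ")
let bare = parseGoalArgument("Clear")
let goal = parseGoalArgument("  ship v2.8  ")
// Only an exact match clears; anything longer is treated as goal text.
let notAFlag = parseGoalArgument("--clear now")
```

Note the last case: because the match is on the whole trimmed argument, a goal that merely starts with `--clear` is still stored as text.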
/// True when `text` is a non-interruptive command that should NOT
/// flip `isAgentWorking` to true on send. Used by the Mac/iOS chat
/// view models to skip the "agent working" overlay change for
@@ -339,11 +511,51 @@ public final class RichChatViewModel {
/// The original CLI session ID when resuming a CLI session via ACP.
/// Used to combine old CLI messages with new ACP messages.
public private(set) var originSessionId: String?
/// Smallest DB id currently loaded for the *current session* (i.e.
/// `sessionId`). Drives `loadEarlier()`: page back with
/// `before: oldestLoadedMessageID`. `nil` when nothing has been
/// loaded yet or the session has no DB-persisted messages.
public private(set) var oldestLoadedMessageID: Int?
/// Whether the most recent fetch suggests there are more older
/// messages on disk that haven't been loaded into `messages` yet.
/// Set to `true` when the initial fetch returned exactly `limit`
/// rows (a strong hint the table has more). Drives the "Load
/// earlier" button visibility in chat views.
public private(set) var hasMoreHistory: Bool = false
/// Set while a `loadEarlier()` fetch is in flight so the UI can show a
/// spinner and we don't fan out duplicate page requests.
public private(set) var isLoadingEarlier: Bool = false
private var nextLocalId = -1
/// Issue #63: locally-created user messages awaiting state.db
/// persistence, keyed by session id. ACP roundtrips Hermes' DB
/// write asynchronously, so a user who sends a prompt and
/// immediately switches to another session triggers `reset()`
/// before Hermes flushes the row. `loadSessionHistory` then reads
/// from a DB that doesn't have the message yet, and the bubble
/// renders blank or vanishes on return. We hold a per-session
/// copy here that survives `reset()` so `loadSessionHistory` can
/// re-inject anything still in flight, and clean entries out as
/// soon as a matching DB row appears.
private var pendingLocalUserMessages: [String: [HermesMessage]] = [:]
private var streamingAssistantText = ""
private var streamingThinkingText = ""
private var streamingToolCalls: [HermesToolCall] = []
/// True while a turn is in flight, has emitted thought-stream
/// bytes, but has NOT yet produced any visible assistant text.
/// Surfaces the user-facing "Thinking" status promotion (the
/// model is reasoning before answering; Hermes reasoning models
/// commonly take 38 s here, which the ScarfMon `firstThoughtByte`
/// vs `firstByte` split makes visible). Becomes false the moment
/// the first message chunk arrives or the turn ends.
public var isStreamingThoughtsOnly: Bool {
currentTurnStart != nil
&& !streamingThinkingText.isEmpty
&& streamingAssistantText.isEmpty
}
// DB polling state (used in terminal mode fallback)
private var lastKnownFingerprint: HermesDataService.MessageFingerprint?
private var debounceTask: Task<Void, Never>?
@@ -374,6 +586,9 @@ public final class RichChatViewModel {
public func reset() {
debounceTask?.cancel()
hydrationTask?.cancel()
hydrationTask = nil
isHydratingTools = false
stopActivePolling()
Task { await dataService.close() }
messages = []
@@ -382,6 +597,9 @@ public final class RichChatViewModel {
lastKnownFingerprint = nil
sessionId = nil
originSessionId = nil
oldestLoadedMessageID = nil
hasMoreHistory = false
isLoadingEarlier = false
isAgentWorking = false
userSendPending = false
resetTimestamp = Date()
@@ -396,12 +614,21 @@ public final class RichChatViewModel {
acpErrorHint = nil
acpErrorDetails = nil
acpCachedReadTokens = 0
acpCompressionCount = 0
acpCommands = []
projectScopedCommands = []
currentTurnStart = nil
turnDurations = [:]
transientHint = nil
pendingPermission = nil
// v2.8 / Hermes v0.13: drop optimistic v0.13 surfaces on
// session reset so a fresh chat (or a resume into a different
// session) doesn't paint stale goal / queue state from the
// previous one. The capabilities gate stays on whatever the
// controller most recently published; it's a host-level value
// that doesn't change with session boundaries.
activeGoal = nil
queuedPrompts = []
loadQuickCommands()
}
@@ -418,13 +645,15 @@ public final class RichChatViewModel {
/// Re-fetch session metadata from DB to pick up cost/token updates.
public func refreshSessionFromDB() async {
guard let sessionId else { return }
let opened = await dataService.open()
guard opened else { return }
if let session = await dataService.fetchSession(id: sessionId) {
currentSession = session
await ScarfMon.measureAsync(.sessionLoad, "mac.refreshSessionFromDB") {
guard let sessionId else { return }
let opened = await dataService.open()
guard opened else { return }
if let session = await dataService.fetchSession(id: sessionId) {
currentSession = session
}
await dataService.close()
}
await dataService.close()
}
// MARK: - ACP Event Handling
@@ -451,6 +680,12 @@ public final class RichChatViewModel {
reasoning: nil
)
messages.append(message)
// Track the local message in the pending-user-messages cache
// so a reset/resume cycle on this session before Hermes
// persists the row can still re-inject it on return (#63).
if let sid = sessionId {
pendingLocalUserMessages[sid, default: []].append(message)
}
// Per-turn stopwatch (v2.5): record the start time only when
// we're entering a fresh agent turn. /steer-style mid-run sends
// arrive while isAgentWorking is already true; preserve the
@@ -597,11 +832,23 @@ public final class RichChatViewModel {
}
private func appendMessageChunk(text: String) {
// ScarfMon "first byte" fires once per turn, on the first
// visible message chunk. Splits "user tap to first byte"
// (network + Hermes thinking) from "first byte to turn end"
// (streaming + Scarf rendering) so we can attribute slow-feel
// bugs to the right side. `bytes` carries the first chunk's
// size, not the full turn.
if streamingAssistantText.isEmpty && currentTurnStart != nil {
ScarfMon.event(.chatStream, "firstByte", count: 1, bytes: text.utf8.count)
}
streamingAssistantText += text
upsertStreamingMessage()
}
private func appendThoughtChunk(text: String) {
if streamingThinkingText.isEmpty && currentTurnStart != nil {
ScarfMon.event(.chatStream, "firstThoughtByte", count: 1, bytes: text.utf8.count)
}
streamingThinkingText += text
upsertStreamingMessage()
}
@@ -719,7 +966,30 @@ public final class RichChatViewModel {
acpOutputTokens += response.outputTokens
acpThoughtTokens += response.thoughtTokens
acpCachedReadTokens += response.cachedReadTokens
// Compression count is a session-wide running total emitted by
// Hermes; each prompt response carries the latest value, so we
// replace rather than accumulate. The `max` guard tolerates
// pre-v0.13 hosts (which emit 0) being upgraded server-side
// mid-session: once a real number lands, the count resumes from
// there rather than snapping back to 0.
acpCompressionCount = max(acpCompressionCount, response.compressionCount)
isAgentWorking = false
// v2.8 / Hermes v0.13: Hermes runs the next `/queue`-deferred
// prompt server-side now that this turn has settled. Drain the
// local mirror FIFO so the header chip count matches what the
// user staged. Best-effort: if Hermes' authoritative queue
// diverged (deferred prompt aborted, dropped on disconnect),
// the chip is one tick stale until the user's next interaction.
if !queuedPrompts.isEmpty {
popQueuedPrompt()
}
// TODO(v2.8.1): when this completes after an auto-resumed
// checkpoint (Hermes v0.13's "Auto-resume interrupted sessions
// after gateway restart"), surface a one-shot "Auto-resumed
// from checkpoint" indicator. Wire-shape unknown until a v0.13
// dogfooding pass confirms whether the resume lands as a
// visible ACP event or is purely server-side. Deferred from
// v2.8.0 per WS-2 plan Q3.
buildMessageGroups()
// Final position after the prompt settles. Catches fast responses
// (slash commands, short replies) where `.defaultScrollAnchor(.bottom)`
@@ -814,6 +1084,12 @@ public final class RichChatViewModel {
/// Convert the streaming message (id=0) into a permanent message and reset streaming state.
private func finalizeStreamingMessage() {
ScarfMon.measure(.chatStream, "finalizeStreamingMessage") {
_finalizeStreamingMessageImpl()
}
}
private func _finalizeStreamingMessageImpl() {
guard let idx = messages.firstIndex(where: { $0.id == Self.streamingId }) else { return }
// Only finalize if there's actual content
@@ -821,22 +1097,52 @@ public final class RichChatViewModel {
|| !streamingThinkingText.isEmpty
|| !streamingToolCalls.isEmpty
// ScarfMon: surface turns that finalize with NO visible
// assistant text. Common Nous-model failure mode: model
// emits a few thought-stream bytes then falls silent;
// Hermes finalizes with empty content; the user sees a
// stuck "(°°) deliberating..." placeholder bubble. The
// event fires for both the all-empty case (which gets
// removed below) and the thoughts-only case (which is
// kept as a permanent message with empty body); both
// are user-visible failures worth tracking.
if streamingAssistantText.isEmpty && streamingToolCalls.isEmpty {
ScarfMon.event(
.chatStream,
"emptyAssistantTurn",
count: 1,
bytes: streamingThinkingText.utf8.count
)
}
if hasContent {
let id = nextLocalId
nextLocalId -= 1
messages[idx] = HermesMessage(
id: id,
sessionId: sessionId ?? "",
role: "assistant",
content: streamingAssistantText,
toolCallId: nil,
toolCalls: streamingToolCalls,
toolName: nil,
timestamp: Date(),
tokenCount: nil,
finishReason: streamingToolCalls.isEmpty ? "stop" : nil,
reasoning: streamingThinkingText.isEmpty ? nil : streamingThinkingText
)
// Wrap the streaming-id rewrite in a no-animation
// transaction. Without this SwiftUI sees an identity
// change for the streaming ForEach element (id 0 to a new
// permanent id) and runs an animated diff against
// adjacent elements, which costs ~58 RichMessageBubble
// body re-evaluations per turn-end (visible in the
// ScarfMon ring as a 12 ms burst right after every
// `finalizeStreamingMessage` interval). The new message
// is content-equal to the streaming one, so there is no
// animation worth running.
withTransaction(Transaction(animation: nil)) {
messages[idx] = HermesMessage(
id: id,
sessionId: sessionId ?? "",
role: "assistant",
content: streamingAssistantText,
toolCallId: nil,
toolCalls: streamingToolCalls,
toolName: nil,
timestamp: Date(),
tokenCount: nil,
finishReason: streamingToolCalls.isEmpty ? "stop" : nil,
reasoning: streamingThinkingText.isEmpty ? nil : streamingThinkingText
)
}
// Capture per-turn duration so the chat UI can render the
// stopwatch pill (v2.5). Skips assistants we don't have a
// start time for, e.g., the .promptComplete fired but the
@@ -847,8 +1153,12 @@ public final class RichChatViewModel {
currentTurnStart = nil
}
} else {
// Remove empty streaming placeholder
messages.remove(at: idx)
// Remove empty streaming placeholder. Same no-animation
// transaction pattern: empty-finalize used to ripple the
// ForEach diff to every following bubble.
withTransaction(Transaction(animation: nil)) {
messages.remove(at: idx)
}
}
// Reset streaming state for next chunk
@@ -875,12 +1185,15 @@ public final class RichChatViewModel {
let opened = await dataService.open()
guard opened else { return }
var dbMessages = await dataService.fetchMessages(sessionId: sessionId)
// Reconnects don't generate hundreds of unseen messages, so a
// 200-row tail is plenty for the merge and it keeps us from
// re-materializing 1000+ message sessions on every reconnect.
var dbMessages = await dataService.fetchMessages(sessionId: sessionId, limit: HistoryPageSize.reconcile)
// If we have an origin session (CLI session continued via ACP),
// include those messages too
if let origin = originSessionId, origin != sessionId {
let originMessages = await dataService.fetchMessages(sessionId: origin)
let originMessages = await dataService.fetchMessages(sessionId: origin, limit: HistoryPageSize.reconcile)
if !originMessages.isEmpty {
dbMessages = originMessages + dbMessages
dbMessages.sort { ($0.timestamp ?? .distantPast) < ($1.timestamp ?? .distantPast) }
@@ -920,15 +1233,57 @@ public final class RichChatViewModel {
/// Load message history from the DB, optionally combining an origin session
/// (e.g., CLI session) with the current ACP session.
public func loadSessionHistory(sessionId: String, acpSessionId: String? = nil) async {
await ScarfMon.measureAsync(.sessionLoad, "mac.hydrateMessages") {
self.sessionId = sessionId
// Capture the session-id we're loading FOR so we can verify
// it's still the active one before assigning to `messages`.
// Without this guard, switching to a small chat while a
// larger one is mid-fetch can result in last-write-wins:
// the slow fetch finishes after the small chat's, drops
// the user back into the big chat's transcript, and the
// user has to reselect the small one. Observed in remote
// perf captures (parallel fetchMessages calls, one timing
// out at 30s for a 157-message session, the other 2-message
// chat completing in 425ms; the 30s one's assignment
// overwrote the small chat).
let loadingForSession = sessionId
// Force a fresh snapshot pull on remote contexts. An earlier open()
// would have cached a stale copy on resume; we need whatever
// Hermes has actually persisted since then, or the resumed session
// will show only history up to the moment the snapshot was taken.
let opened = await dataService.refresh()
// `forceFresh: true` refuses the stale-snapshot fallback the data
// service grew in M11; falling back here would silently hide
// messages the agent streamed during the user's offline window.
let opened = await dataService.refresh(forceFresh: true)
guard opened else { return }
// Race-check #1: session id may have changed during refresh.
guard self.sessionId == loadingForSession else {
ScarfMon.event(.sessionLoad, "mac.hydrateMessages.dropped", count: 1)
return
}
var allMessages = await dataService.fetchMessages(sessionId: sessionId)
// v2.8: two-phase loader. Phase 1, skeleton: user + assistant
// rows only, no tool_calls JSON, no reasoning, no
// reasoning_content. Wire payload bounded by conversational
// text alone so chats with multi-page tool result blobs (the
// 30s-timeout case) come up in seconds. Phase 2 (kicked off
// below in a Task.detached) fills tool calls + tool results in
// the background; the chat is usable while it runs.
let pageSize = HistoryPageSize.initial
let originOutcome = await dataService.fetchSkeletonMessages(sessionId: sessionId, limit: pageSize)
var allMessages = originOutcome.messages
var transportFailure: String? = originOutcome.transportError
// Race-check #2: session id may have changed during the
// long fetch (the most common race: a 30s timeout on a
// big session lets the user switch to a small one and back).
guard self.sessionId == loadingForSession else {
ScarfMon.event(.sessionLoad, "mac.hydrateMessages.dropped", count: 1)
return
}
// The DB has more on-disk history when the initial fetch
// saturated the limit. The "Load earlier" affordance reads
// this flag.
var moreHistory = allMessages.count >= pageSize
let session = await dataService.fetchSession(id: sessionId)
// If the ACP session is different from the origin, load its messages too
@@ -936,17 +1291,284 @@ public final class RichChatViewModel {
if let acpId = acpSessionId, acpId != sessionId {
originSessionId = sessionId
self.sessionId = acpId
let acpMessages = await dataService.fetchMessages(sessionId: acpId)
if !acpMessages.isEmpty {
allMessages.append(contentsOf: acpMessages)
let acpOutcome = await dataService.fetchSkeletonMessages(sessionId: acpId, limit: pageSize)
// Race-check #3: same guard, after the second fetch.
guard self.sessionId == acpId else {
ScarfMon.event(.sessionLoad, "mac.hydrateMessages.dropped", count: 1)
return
}
if let acpErr = acpOutcome.transportError, transportFailure == nil {
transportFailure = acpErr
}
if !acpOutcome.messages.isEmpty {
allMessages.append(contentsOf: acpOutcome.messages)
allMessages.sort { ($0.timestamp ?? .distantPast) < ($1.timestamp ?? .distantPast) }
moreHistory = moreHistory || acpOutcome.messages.count >= pageSize
}
}
messages = allMessages
// Issue #63: re-inject any locally-created user messages
// we still have on file for this session that haven't yet
// shown up in state.db. Covers two paths:
// 1. The user just sent a prompt then resumed a different
// session before Hermes persisted the row. `reset()` had
// cleared `messages` but the per-session pending cache
// survived; restore the row here so the bubble doesn't
// come back blank.
// 2. The DB-resume path on first load: a previously-pending
// message Hermes is still mid-write may not appear in
// this fetch. We merge it in, and drop it from the cache
// as soon as a matching DB row (same content, persisted
// id >= 0) shows up.
let pendingForSession = pendingLocalUserMessages[sessionId] ?? []
if pendingForSession.isEmpty {
messages = allMessages
} else {
var merged = allMessages
var stillPending: [HermesMessage] = []
for local in pendingForSession {
let persisted = merged.contains { msg in
msg.isUser && msg.id >= 0 && msg.content == local.content
}
if persisted {
continue // DB caught up; drop the local copy
}
if !merged.contains(where: { $0.id == local.id }) {
merged.append(local)
}
stillPending.append(local)
}
merged.sort { ($0.timestamp ?? .distantPast) < ($1.timestamp ?? .distantPast) }
messages = merged
if stillPending.isEmpty {
pendingLocalUserMessages.removeValue(forKey: sessionId)
} else {
pendingLocalUserMessages[sessionId] = stillPending
}
}
currentSession = session
let minId = allMessages.map(\.id).min() ?? 0
let minId = messages.map(\.id).min() ?? 0
nextLocalId = min(minId - 1, -1)
// Track the oldest loaded id from THIS session (not the merged
// origin) so `loadEarlier()` pages back through the live ACP
// session's history. Cross-session backfill (paging into the
// CLI origin) isn't supported in v1; the merged 2× pageSize
// is enough headroom for the dashboard-resume case.
let currentSessionId = self.sessionId ?? sessionId
oldestLoadedMessageID = allMessages
.filter { $0.sessionId == currentSessionId }
.map(\.id)
.min()
hasMoreHistory = moreHistory
ScarfMon.event(.sessionLoad, "mac.hydrateMessages.rows", count: messages.count)
buildMessageGroups()
// Partial-result detection: if a fetch tripped a transport
// failure (SSH timeout / ControlMaster drop) the user is now
// looking at zero or near-zero messages with no idea why. The
// pre-v2.8 behavior was a silent empty transcript. Surface a
// banner via the existing acpError triplet so the user sees
// "couldn't load full history; connection slow." We assume
// more history exists (so the "Load earlier" affordance is
// honest about the gap); the caller can retry by reopening the
// session.
if let reason = transportFailure {
acpError = "Couldn't load full chat history — the connection to \(dataService.context.displayName) timed out."
acpErrorHint = "Reopen the session to retry, or check the SSH link if this keeps happening."
acpErrorDetails = reason
acpErrorOAuthProvider = nil
hasMoreHistory = true
} else {
// v2.8: kick off background hydration of tool_calls JSON
// and tool result rows for the just-loaded skeleton.
// Non-blocking on the main load path (chat is usable).
startToolHydration(loadingForSession: self.sessionId ?? sessionId)
}
} // end measureAsync(.sessionLoad, "mac.hydrateMessages")
}
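The Issue #63 merge inside `loadSessionHistory` can be isolated as a pure function, which makes the "DB caught up" rule easier to test. This is a minimal sketch under simplifying assumptions: `Message` and `reconcile` are hypothetical names, timestamps are non-optional, and only the fields the merge reads are modeled.

```swift
import Foundation

// Hypothetical, trimmed-down message type for illustration.
struct Message: Equatable {
    let id: Int          // >= 0 for persisted rows, negative for local-only rows
    let isUser: Bool
    let content: String
    let timestamp: Date
}

/// Merge locally-created user messages into a DB fetch.
/// A local message is dropped once a persisted user row with the same
/// content exists; otherwise it is re-injected and stays pending.
func reconcile(
    db: [Message],
    pending: [Message]
) -> (merged: [Message], stillPending: [Message]) {
    var merged = db
    var stillPending: [Message] = []
    for local in pending {
        let persisted = merged.contains { msg in
            msg.isUser && msg.id >= 0 && msg.content == local.content
        }
        if persisted { continue }  // DB caught up; drop the local copy
        if !merged.contains(where: { $0.id == local.id }) {
            merged.append(local)
        }
        stillPending.append(local)
    }
    merged.sort { $0.timestamp < $1.timestamp }
    return (merged, stillPending)
}

let t0 = Date(timeIntervalSince1970: 1_000)
let db = [
    Message(id: 1, isUser: true, content: "hello", timestamp: t0),
    Message(id: 2, isUser: false, content: "hi", timestamp: t0.addingTimeInterval(1)),
]
let pending = [
    Message(id: -1, isUser: true, content: "hello", timestamp: t0),
    Message(id: -2, isUser: true, content: "follow-up", timestamp: t0.addingTimeInterval(2)),
]
let outcome = reconcile(db: db, pending: pending)
```

Matching on content alone means two identical prompts sent back to back are treated as one persisted row; the diff accepts the same trade-off.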
/// Phase 2 of the two-phase chat loader. Pulls `tool_calls` JSON
/// for the loaded assistant rows, then fetches `role='tool'` rows
/// in the loaded id range and splices both into `messages` /
/// `messageGroups` without disturbing what the user is already
/// reading. Cancellable: restarting (a session switch, a
/// `reset()`) drops any in-flight pass.
///
/// Tool calls go in first because they live ON the existing
/// assistant message and surface the most-visible UI affordance
/// (the tool card chips). Tool result content rows go in second
/// because they're the heaviest payload and the UI degrades
/// gracefully without them (the cards still show "running" /
/// "complete" state; only the result body is missing).
private func startToolHydration(loadingForSession: String) {
hydrationTask?.cancel()
let sessionForLoad = loadingForSession
let dataService = self.dataService
hydrationTask = Task { @MainActor [weak self] in
guard let self else { return }
self.isHydratingTools = true
defer { self.isHydratingTools = false }
// Snapshot the assistant ids + id range from the messages
// we just loaded. Doing this on MainActor keeps us in step
// with the observable view of `messages`; the actual
// SQL calls happen in `await` slots that release the actor.
let assistantIds = self.messages
.filter { $0.isAssistant && $0.id > 0 }
.map(\.id)
guard let minId = self.messages.map(\.id).min(),
let maxId = self.messages.map(\.id).max(),
!assistantIds.isEmpty || minId < maxId else {
return
}
// Phase 2a: tool_calls JSON. Splice parsed values into
// each assistant message that has them.
let toolCallMap = await dataService.hydrateAssistantToolCalls(messageIds: assistantIds)
if Task.isCancelled || self.sessionId != sessionForLoad {
ScarfMon.event(.sessionLoad, "mac.hydrateTools.dropped", count: 1)
return
}
if !toolCallMap.isEmpty {
self.messages = self.messages.map { msg in
guard msg.isAssistant, let calls = toolCallMap[msg.id] else { return msg }
return msg.withToolCalls(calls)
}
self.buildMessageGroups()
}
// Phase 2b: tool result rows. Default OFF (v2.8). A
// single tool result blob (file dump, stack trace) can run
// hundreds of KB; bulk-fetching all of them during chat
// resume on a slow remote was the cause of the 30s timeout
// observed in 2026-05-05 dogfooding. Users can opt in via
// Settings > Display > "Load tool results in past chats"
// when bandwidth is plentiful. Tool call CARDS still
// render either way (`tool_calls` JSON loads in Phase 2a);
// only the inspector pane's "Output" section is empty
// until the user opens a card, at which point a per-call
// lazy fetch fills it in.
let loadResults = UserDefaults.standard.bool(
forKey: Self.loadHistoricalToolResultsKey
)
guard loadResults else {
ScarfMon.event(.sessionLoad, "mac.hydrateTools.skippedToolResults", count: 1)
return
}
let toolResults = await dataService.fetchToolResultsInRange(
sessionId: sessionForLoad,
minId: minId,
maxId: maxId
)
if Task.isCancelled || self.sessionId != sessionForLoad {
ScarfMon.event(.sessionLoad, "mac.hydrateTools.dropped", count: 1)
return
}
if !toolResults.isEmpty {
var merged = self.messages
let existingIds = Set(merged.map(\.id))
for tr in toolResults where !existingIds.contains(tr.id) {
merged.append(tr)
}
merged.sort { lhs, rhs in
let lt = lhs.timestamp ?? .distantPast
let rt = rhs.timestamp ?? .distantPast
if lt != rt { return lt < rt }
return lhs.id < rhs.id
}
self.messages = merged
self.buildMessageGroups()
}
ScarfMon.event(.sessionLoad, "mac.hydrateTools.complete", count: 1)
}
}
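Both the hydration merge and the lazy tool-result path sort with the same rule: timestamp first, row id as a tiebreaker, with missing timestamps sorting earliest. A small sketch of that comparator, assuming a hypothetical `Row` type that carries only the two fields the sort reads:

```swift
import Foundation

// Hypothetical row type with just the fields the comparator needs.
struct Row {
    let id: Int
    let timestamp: Date?
}

/// Order by timestamp, falling back to id when timestamps are equal.
/// Rows without a timestamp are treated as `.distantPast`.
func chronological(_ lhs: Row, _ rhs: Row) -> Bool {
    let lt = lhs.timestamp ?? .distantPast
    let rt = rhs.timestamp ?? .distantPast
    if lt != rt { return lt < rt }
    return lhs.id < rhs.id
}

let t = Date(timeIntervalSince1970: 100)
let rows = [
    Row(id: 5, timestamp: t),
    Row(id: 3, timestamp: t),                         // same instant as id 5
    Row(id: 9, timestamp: nil),                       // no timestamp
    Row(id: 1, timestamp: t.addingTimeInterval(1)),
]
let ordered = rows.sorted(by: chronological).map(\.id)
```

The id tiebreaker keeps the order deterministic when several rows share a timestamp, which a timestamp-only comparison does not guarantee.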
/// Lazy-load the content of a single tool result by call id and
/// splice it into `messages` / `messageGroups` as a synthetic
/// `role='tool'` row. Used by `ChatInspectorPane` when the user
/// opens a tool call card whose result hasn't been hydrated yet
/// (auto-hydrate is opt-in via `loadHistoricalToolResultsKey`).
/// No-op when the result is already present in the transcript or
/// the session id has changed underneath us.
@MainActor
public func loadToolResultIfMissing(callId: String) async {
guard let sessionForLoad = sessionId else { return }
// Already in the transcript? Done.
if messages.contains(where: { $0.toolCallId == callId && $0.isToolResult }) {
return
}
guard let content = await dataService.fetchToolResult(callId: callId) else {
return
}
guard self.sessionId == sessionForLoad else { return }
// Build a synthetic tool result row. We don't have the original
// row id (would need a second SELECT) so we use a negative
// local id that won't collide with persisted rows. The bubble
// and inspector both key on `toolCallId`, not `id`, for tool
// results so this is enough to render correctly.
let placeholderId = nextLocalId
nextLocalId -= 1
let synthetic = HermesMessage(
id: placeholderId,
sessionId: sessionForLoad,
role: "tool",
content: content,
toolCallId: callId,
toolCalls: [],
toolName: nil,
timestamp: Date(),
tokenCount: nil,
finishReason: nil,
reasoning: nil,
reasoningContent: nil
)
messages.append(synthetic)
// Re-sort with the same comparator as the bulk hydrate path:
// timestamp first, id as the tie-breaker. Persisted rows keep
// their chronological order; the synthetic placeholder is
// stamped with `Date()`, so it normally slots in last. That is
// fine for display since the inspector keys on toolCallId.
messages.sort { lhs, rhs in
let lt = lhs.timestamp ?? .distantPast
let rt = rhs.timestamp ?? .distantPast
if lt != rt { return lt < rt }
return lhs.id < rhs.id
}
buildMessageGroups()
ScarfMon.event(.sessionLoad, "mac.lazyToolResult.fetched", count: 1)
}
// MARK: - Load Earlier (pagination)
/// Page back through the current session's DB-persisted history
/// before `oldestLoadedMessageID` and prepend the page to
/// `messages`. Cheap on the SQLite side (`id` is the primary
/// key); the cost is the data-service `open()` round-trip on
/// remote contexts. `pageSize` defaults to the same 200-row
/// budget as the initial load.
public func loadEarlier(pageSize: Int = HistoryPageSize.initial) async {
guard !isLoadingEarlier, hasMoreHistory else { return }
guard let sessionId, let oldest = oldestLoadedMessageID else { return }
isLoadingEarlier = true
defer { isLoadingEarlier = false }
let opened = await dataService.open()
guard opened else { return }
let older = await dataService.fetchMessages(
sessionId: sessionId,
limit: pageSize,
before: oldest
)
guard !older.isEmpty else {
hasMoreHistory = false
return
}
messages.insert(contentsOf: older, at: 0)
oldestLoadedMessageID = older.first?.id
// If this fetch returned fewer than the page size, we've hit
// the bottom of the table: no further pages worth fetching.
hasMoreHistory = older.count >= pageSize
buildMessageGroups()
}
@@ -990,7 +1612,7 @@ public final class RichChatViewModel {
let fingerprint = await dataService.fetchMessageFingerprint(sessionId: sessionId)
if fingerprint != lastKnownFingerprint {
let fetched = await dataService.fetchMessages(sessionId: sessionId)
let fetched = await dataService.fetchMessages(sessionId: sessionId, limit: HistoryPageSize.polling)
let session = await dataService.fetchSession(id: sessionId)
lastKnownFingerprint = fingerprint
@@ -49,6 +49,18 @@ public final class SkillsViewModel {
public var hubMessage: String?
public var hubSource: String = "all"
/// Last successful `browseHub` payload, kept around so that the
/// "All Sources" search path can filter client-side (issue #79).
/// `hermes skills search` with no `--source` flag routes through
/// the centralized `hermes-index` source which can miss skills
/// that are visible in browse; we'd rather give the user the
/// canonical "type-to-filter" UX than chase Hermes's index gaps.
/// Source-specific searches still shell out to the CLI for full
/// upstream semantics. Setter is `internal` so the in-tree test
/// suite can seed the cache without invoking the live CLI;
/// out-of-module callers can still only read.
public internal(set) var lastBrowseResults: [HermesHubSkill] = []
public let hubSources = ["all", "official", "skills-sh", "well-known", "github", "clawhub", "lobehub"]
public var filteredCategories: [HermesSkillCategory] {
@@ -70,19 +82,116 @@ public final class SkillsViewModel {
/// Awaitable scan. iOS's `.task { await vm.load() }` and the
/// ScarfCore unit tests use this directly; Mac call sites wrap in
/// `Task { await ... }` from `onAppear`.
///
/// Pinned-name set is auto-fetched from the curator state file on
/// v0.12+ hosts; callers can override by passing an explicit set
/// (the Curator screen does this when it has a fresher snapshot in
/// hand).
@MainActor
public func load() async {
public func load(pinnedNames: Set<String>? = nil) async {
isLoading = true
lastError = nil
let ctx = context
let xport = transport
let cats: [HermesSkillCategory] = await Task.detached {
SkillsScanner.scan(context: ctx, transport: xport)
}.value
let pins = pinnedNames
// v2.8: instrumented so future captures show how many SSH
// RTTs the SkillsScanner walk costs on remote (it stats
// every ~/.hermes/skills/* directory + reads SKILL.md per).
let cats: [HermesSkillCategory] = await ScarfMon.measureAsync(.diskIO, "skills.load") {
await Task.detached {
let disabled = Self.readDisabledSkillNames(context: ctx)
let pinned = pins ?? Self.readPinnedSkillNames(context: ctx)
return SkillsScanner.scan(
context: ctx,
transport: xport,
disabledNames: disabled,
pinnedNames: pinned
)
}.value
}
let totalSkills = cats.reduce(0) { $0 + $1.skills.count }
ScarfMon.event(.diskIO, "skills.load.count", count: totalSkills)
categories = cats
isLoading = false
}
/// Read the curator's pinned-skills list from
/// `~/.hermes/skills/.curator_state` (JSON despite the lack of an
/// extension). Pre-v0.12 hosts won't have this file yet; return
/// an empty set so the pin badge stays hidden.
nonisolated static func readPinnedSkillNames(context: ServerContext) -> Set<String> {
guard let data = context.readData(context.paths.curatorStateFile),
let obj = try? JSONSerialization.jsonObject(with: data) as? [String: Any]
else { return [] }
// Curator stores pins in either `pinned: [name, ...]` or
// `pinned_skills: [name, ...]` depending on Hermes version;
// accept both shapes so we don't break on a future rename.
let raw = (obj["pinned"] as? [String]) ?? (obj["pinned_skills"] as? [String]) ?? []
return Set(raw)
}
/// Read the `skills.disabled:` array from `~/.hermes/config.yaml`.
/// Hermes v0.12 stores skill disable state there (one global list
/// + optional `skills.platform_disabled` overrides). Returns the
/// global list only; Scarf doesn't surface platform overrides
/// today. Empty set on missing file / parse failure.
nonisolated static func readDisabledSkillNames(context: ServerContext) -> Set<String> {
guard let yaml = context.readText(context.paths.configYAML) else { return [] }
// Lightweight match: find `skills:` block, then `disabled:` array
// inside it. The full YAML parser is overkill for one nested array.
var inSkillsBlock = false
var disabledIndent: Int?
var collected: [String] = []
for raw in yaml.components(separatedBy: "\n") {
// Top-level `skills:` declaration.
if raw.hasPrefix("skills:") {
inSkillsBlock = true
continue
}
if inSkillsBlock {
// A new top-level block ends the `skills:` scope.
if !raw.hasPrefix(" ") && !raw.hasPrefix("\t") && raw.contains(":") {
break
}
let trimmed = raw.trimmingCharacters(in: .whitespaces)
if trimmed.hasPrefix("disabled:") {
// Inline form `disabled: [a, b, c]`
let after = trimmed.dropFirst("disabled:".count).trimmingCharacters(in: .whitespaces)
if after.hasPrefix("[") && after.hasSuffix("]") {
let body = after.dropFirst().dropLast()
let parts = body.split(separator: ",").map { String($0).trimmingCharacters(in: .whitespaces) }
for p in parts where !p.isEmpty {
collected.append(p.trimmingCharacters(in: CharacterSet(charactersIn: "\"' ")))
}
return Set(collected)
}
// Block form: `disabled:` followed by ` - name`
disabledIndent = raw.prefix { $0 == " " || $0 == "\t" }.count
continue
}
if let baseIndent = disabledIndent {
let leading = raw.prefix { $0 == " " || $0 == "\t" }.count
if !trimmed.isEmpty {
// PyYAML's default `yaml.dump` emits list items at the
// same indent as the parent key, so `- foo` lines for
// `disabled:` arrive at `leading == baseIndent`. Only
// a strictly shallower indent or a same-indent line
// that isn't a list item (sibling key) ends the block.
if leading < baseIndent { break }
if leading == baseIndent && !trimmed.hasPrefix("- ") { break }
}
if trimmed.hasPrefix("- ") {
let name = trimmed.dropFirst(2).trimmingCharacters(in: CharacterSet(charactersIn: "\"' "))
if !name.isEmpty {
collected.append(String(name))
}
}
}
}
}
return Set(collected)
}
public func selectSkill(_ skill: HermesSkill) {
selectedSkill = skill
let mainFile = skill.files.first(where: { $0.hasSuffix(".md") }) ?? skill.files.first
@@ -163,14 +272,34 @@ public final class SkillsViewModel {
browseHub()
return
}
let source = hubSource
let query = hubQuery
// Issue #79: for "All Sources", filter the cached browse list
// client-side instead of shelling out. Hermes's all-source
// search routes through its centralized index which can miss
// skills (e.g. honcho) that browse surfaces from non-indexed
// registries. Specific-source searches keep the CLI path so
// power users still get full upstream search semantics.
if source == "all" {
if lastBrowseResults.isEmpty {
// No cache yet: kick off a browse, then filter on
// completion. The chained call lets the user type a
// query before ever clicking Browse.
browseHubThenFilter(query: query)
} else {
// Pure in-memory filter: runs synchronously on the
// calling actor (UI invocations are already on
// MainActor) so the user sees the narrowed list
// without a render-tick gap.
applyClientSideFilter(query: query, against: lastBrowseResults)
}
return
}
isHubLoading = true
let bin = context.paths.hermesBinary
let xport = transport
let source = hubSource
let query = hubQuery
Task.detached { [weak self] in
var args = ["skills", "search", query, "--limit", "40"]
if source != "all" { args += ["--source", source] }
let args = ["skills", "search", query, "--limit", "40", "--source", source]
let result = Self.runHermes(executable: bin, args: args, transport: xport, timeout: 30)
let parsed = HermesSkillsHubParser.parseHubList(result.output)
await self?.finishBrowse(
@@ -182,6 +311,66 @@ public final class SkillsViewModel {
}
}
/// Run a browse fetch and then immediately apply a client-side
/// filter. Used by `searchHub` when the user types into search
/// before any browse has cached results.
private func browseHubThenFilter(query: String) {
isHubLoading = true
let bin = context.paths.hermesBinary
let xport = transport
Task.detached { [weak self] in
let args = ["skills", "browse", "--size", "40"]
let result = Self.runHermes(executable: bin, args: args, transport: xport, timeout: 30)
let parsed = HermesSkillsHubParser.parseHubList(result.output)
await self?.finishBrowseThenFilter(
browseResults: parsed,
query: query,
exitCode: result.exitCode,
rawOutput: result.output
)
}
}
@MainActor
private func finishBrowseThenFilter(
browseResults: [HermesHubSkill],
query: String,
exitCode: Int32,
rawOutput: String
) async {
if exitCode == 0 {
lastBrowseResults = browseResults
applyClientSideFilter(query: query, against: browseResults)
} else {
// Surface the underlying browse failure rather than a
// blank "no matches" state: the user typed a query, not
// a browse request, but the cache was empty so we tried.
isHubLoading = false
hubResults = []
let detail = Self.firstSignificantLine(rawOutput)
hubMessage = detail.isEmpty
? "Search failed (exit \(exitCode))"
: "Search failed: \(detail)"
}
}
private func applyClientSideFilter(query: String, against pool: [HermesHubSkill]) {
let needle = query.trimmingCharacters(in: .whitespaces)
let matches: [HermesHubSkill]
if needle.isEmpty {
matches = pool
} else {
matches = pool.filter { skill in
skill.name.localizedCaseInsensitiveContains(needle)
|| skill.description.localizedCaseInsensitiveContains(needle)
|| skill.identifier.localizedCaseInsensitiveContains(needle)
}
}
isHubLoading = false
hubResults = matches
hubMessage = matches.isEmpty ? "No matches" : nil
}
public func installHubSkill(_ skill: HermesHubSkill) {
isHubLoading = true
hubMessage = "Installing \(skill.identifier)"
@@ -200,6 +389,68 @@ public final class SkillsViewModel {
}
}
/// v0.12: install a skill from a direct HTTPS URL pointing at a
/// SKILL.md (or a tarball). Hermes pulls + installs without going
/// through the registry indirection. The Mac UI gates this on
/// `HermesCapabilities.hasSkillURLInstall` so a v0.11 host doesn't
/// see a button that errors out with "unrecognized argument".
///
/// `categoryOverride` and `nameOverride` map to `--category` /
/// `--name` flags Hermes ships for direct-URL installs (the URL's
/// SKILL.md may not declare those, especially for one-off scripts).
public func installFromURL(
_ url: String,
categoryOverride: String? = nil,
nameOverride: String? = nil
) {
isHubLoading = true
hubMessage = "Installing from URL…"
let bin = context.paths.hermesBinary
let xport = transport
Task.detached { [weak self] in
var args = ["skills", "install", url, "--yes"]
if let category = categoryOverride, !category.isEmpty {
args += ["--category", category]
}
if let name = nameOverride, !name.isEmpty {
args += ["--name", name]
}
let result = Self.runHermes(
executable: bin,
args: args,
transport: xport,
timeout: 180
)
await self?.finishInstall(identifier: url, exitCode: result.exitCode)
}
}
/// v0.12: trigger a hot reload of `~/.hermes/skills/` so the agent
/// picks up file edits without a session restart. Hermes ships
/// `/reload-skills` as a slash command in chat AND `hermes skills
/// audit` as a CLI form. We use `audit` here so the reload works
/// even when no chat session is active.
public func reloadSkills() async {
isHubLoading = true
let bin = context.paths.hermesBinary
let xport = transport
let result = await Task.detached {
Self.runHermes(
executable: bin,
args: ["skills", "audit"],
transport: xport,
timeout: 30
)
}.value
hubMessage = result.exitCode == 0 ? "Skills reloaded" : "Reload failed"
isHubLoading = false
await load()
Task { @MainActor [weak self] in
try? await Task.sleep(nanoseconds: 3_000_000_000)
self?.hubMessage = nil
}
}
public func uninstallHubSkill(_ identifier: String) {
let bin = context.paths.hermesBinary
let xport = transport
@@ -262,6 +513,13 @@ public final class SkillsViewModel {
) async {
isHubLoading = false
hubResults = results
// Cache the fresh browse payload so the "All Sources" search
// path can filter client-side (issue #79). Search results are
// not cached: they're already filtered by the user's query
// and would poison the filter pool.
if !isSearch && exitCode == 0 {
lastBrowseResults = results
}
if results.isEmpty {
if exitCode == 0 {
hubMessage = isSearch ? "No matches" : "No results"
@@ -0,0 +1,70 @@
import Testing
import Foundation
@testable import ScarfCore
/// Pure mapping tests for `GatewayAllowlistKind`. Locks down the (platform ->
/// kind) table so a refactor doesn't accidentally drop a platform.
@Suite struct GatewayAllowlistKindTests {
@Test func mapsKnownPlatformsToCorrectKind() {
#expect(GatewayAllowlistKind.kind(for: "slack") == .channels)
#expect(GatewayAllowlistKind.kind(for: "mattermost") == .channels)
#expect(GatewayAllowlistKind.kind(for: "google-chat") == .channels)
#expect(GatewayAllowlistKind.kind(for: "telegram") == .chats)
#expect(GatewayAllowlistKind.kind(for: "whatsapp") == .chats)
#expect(GatewayAllowlistKind.kind(for: "matrix") == .rooms)
#expect(GatewayAllowlistKind.kind(for: "dingtalk") == .rooms)
}
@Test func acceptsBothGoogleChatSpellings() {
// TODO(WS-5-Q1): both spellings round-trip until Hermes confirms
// the wire identifier.
#expect(GatewayAllowlistKind.kind(for: "google-chat") == .channels)
#expect(GatewayAllowlistKind.kind(for: "googlechat") == .channels)
}
@Test func returnsNilForPlatformsWithoutAllowlist() {
#expect(GatewayAllowlistKind.kind(for: "cli") == nil)
#expect(GatewayAllowlistKind.kind(for: "yuanbao") == nil)
#expect(GatewayAllowlistKind.kind(for: "microsoft-teams") == nil)
#expect(GatewayAllowlistKind.kind(for: "discord") == nil)
#expect(GatewayAllowlistKind.kind(for: "signal") == nil)
#expect(GatewayAllowlistKind.kind(for: "homeassistant") == nil)
#expect(GatewayAllowlistKind.kind(for: "") == nil)
#expect(GatewayAllowlistKind.kind(for: "unknown") == nil)
}
@Test func yamlKeyMatchesHermesContract() {
#expect(GatewayAllowlistKind.channels.yamlKey == "allowed_channels")
#expect(GatewayAllowlistKind.chats.yamlKey == "allowed_chats")
#expect(GatewayAllowlistKind.rooms.yamlKey == "allowed_rooms")
}
@Test func nounsAreUserFacingSafe() {
#expect(GatewayAllowlistKind.channels.noun == "channel")
#expect(GatewayAllowlistKind.chats.noun == "chat")
#expect(GatewayAllowlistKind.rooms.noun == "room")
#expect(GatewayAllowlistKind.channels.pluralNoun == "channels")
#expect(GatewayAllowlistKind.chats.pluralNoun == "chats")
#expect(GatewayAllowlistKind.rooms.pluralNoun == "rooms")
}
@Test func placeholdersAreNonEmpty() {
// Smoke test: placeholder strings are advisory; we just don't want
// them silently emptied during a refactor.
#expect(!GatewayAllowlistKind.channels.inputPlaceholder.isEmpty)
#expect(!GatewayAllowlistKind.chats.inputPlaceholder.isEmpty)
#expect(!GatewayAllowlistKind.rooms.inputPlaceholder.isEmpty)
}
@Test func gatewayPlatformSettingsItemsForKind() {
let s = GatewayPlatformSettings(
allowedChannels: ["C01"],
allowedChats: ["@user"],
allowedRooms: ["!room:matrix.org"]
)
#expect(s.items(for: .channels) == ["C01"])
#expect(s.items(for: .chats) == ["@user"])
#expect(s.items(for: .rooms) == ["!room:matrix.org"])
}
}
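The platform-to-kind table that the suite above locks down can be sketched as a small enum. This is only an illustration consistent with the assertions; the real `GatewayAllowlistKind` is not part of this diff, and the name `AllowlistKindSketch` is hypothetical.

```swift
import Foundation

/// Hypothetical stand-in for `GatewayAllowlistKind`, reconstructed from the
/// test expectations only. The shipped type may be structured differently.
enum AllowlistKindSketch: Equatable {
    case channels, chats, rooms

    /// YAML key under the platform block, as asserted by `yamlKeyMatchesHermesContract`.
    var yamlKey: String {
        switch self {
        case .channels: return "allowed_channels"
        case .chats: return "allowed_chats"
        case .rooms: return "allowed_rooms"
        }
    }

    /// Platform identifier -> allowlist kind. `nil` means the platform has no
    /// allowlist. Both Google Chat spellings are accepted, mirroring the TODO.
    static func kind(for platform: String) -> AllowlistKindSketch? {
        switch platform {
        case "slack", "mattermost", "google-chat", "googlechat": return .channels
        case "telegram", "whatsapp": return .chats
        case "matrix", "dingtalk": return .rooms
        default: return nil
        }
    }
}
```

A `switch` with a `default: nil` arm keeps unknown and empty identifiers on the "no allowlist" path without listing each one.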
@@ -0,0 +1,276 @@
import Testing
import Foundation
@testable import ScarfCore
/// Round-trip + idempotence tests for `GatewayConfigWriter.setList`. Pure
/// `String` operations only, so it runs cleanly on Linux SwiftPM.
@Suite struct GatewayConfigWriterTests {
// MARK: - Insert
@Test func setListInsertsBlockOnEmpty() {
let yaml = ""
let updated = GatewayConfigWriter.setList(
in: yaml,
platform: "slack",
key: "allowed_channels",
items: ["C0123ABCD", "C0456EFGH"]
)
#expect(updated.contains("gateway:"))
#expect(updated.contains(" platforms:"))
#expect(updated.contains(" slack:"))
#expect(updated.contains(" allowed_channels:"))
#expect(updated.contains("- C0123ABCD"))
#expect(updated.contains("- C0456EFGH"))
}
@Test func setListAppendsScaffoldPreservingPriorContent() {
let yaml = """
model:
default: gpt-4o
provider: openai
"""
let updated = GatewayConfigWriter.setList(
in: yaml,
platform: "slack",
key: "allowed_channels",
items: ["C01"]
)
// Original content preserved verbatim at the top.
#expect(updated.contains("model:"))
#expect(updated.contains(" default: gpt-4o"))
#expect(updated.contains(" provider: openai"))
// New scaffold appended.
#expect(updated.contains("gateway:"))
#expect(updated.contains(" slack:"))
#expect(updated.contains("- C01"))
}
// MARK: - Replace
@Test func setListReplacesExistingBlock() {
let yaml = """
gateway:
platforms:
slack:
allowed_channels:
- C_OLD_1
- C_OLD_2
"""
let updated = GatewayConfigWriter.setList(
in: yaml,
platform: "slack",
key: "allowed_channels",
items: ["C_NEW_1"]
)
#expect(updated.contains("- C_NEW_1"))
#expect(!updated.contains("- C_OLD_1"))
#expect(!updated.contains("- C_OLD_2"))
}
@Test func setListPreservesScalarSiblings() {
// The `busy_ack_enabled` scalar sibling of `allowed_channels` must
// stay byte-for-byte after a list-write to the same platform.
let yaml = """
gateway:
platforms:
slack:
allowed_channels:
- C_OLD
busy_ack_enabled: false
gateway_restart_notification: true
"""
let updated = GatewayConfigWriter.setList(
in: yaml,
platform: "slack",
key: "allowed_channels",
items: ["C_NEW"]
)
#expect(updated.contains("- C_NEW"))
#expect(!updated.contains("- C_OLD"))
// Scalars at the same indent must survive.
#expect(updated.contains("busy_ack_enabled: false"))
#expect(updated.contains("gateway_restart_notification: true"))
}
@Test func setListPreservesOtherPlatformsBlocks() {
// Editing slack must not touch matrix.
let yaml = """
gateway:
platforms:
slack:
allowed_channels:
- C_SLACK
matrix:
allowed_rooms:
- '!room1:matrix.org'
- '!room2:matrix.org'
"""
let updated = GatewayConfigWriter.setList(
in: yaml,
platform: "slack",
key: "allowed_channels",
items: ["C_SLACK_NEW"]
)
#expect(updated.contains("- C_SLACK_NEW"))
// Matrix block intact.
#expect(updated.contains(" matrix:"))
#expect(updated.contains("'!room1:matrix.org'"))
#expect(updated.contains("'!room2:matrix.org'"))
}
// MARK: - Remove
@Test func setListWithEmptyItemsRemovesBlock() {
let yaml = """
gateway:
platforms:
slack:
allowed_channels:
- C01
- C02
busy_ack_enabled: true
"""
let updated = GatewayConfigWriter.setList(
in: yaml,
platform: "slack",
key: "allowed_channels",
items: []
)
// Block removed; sibling scalar preserved.
#expect(!updated.contains("allowed_channels:"))
#expect(!updated.contains("- C01"))
#expect(!updated.contains("- C02"))
#expect(updated.contains("busy_ack_enabled: true"))
}
@Test func setListWithEmptyItemsOnAbsentBlockIsNoOp() {
let yaml = """
model:
default: gpt-4o
"""
let updated = GatewayConfigWriter.setList(
in: yaml,
platform: "slack",
key: "allowed_channels",
items: []
)
#expect(updated == yaml)
}
// MARK: - Idempotence
@Test func setListIsIdempotent() {
let yaml = """
model:
default: gpt-4o
"""
let once = GatewayConfigWriter.setList(
in: yaml,
platform: "telegram",
key: "allowed_chats",
items: ["@alice", "@bob"]
)
let twice = GatewayConfigWriter.setList(
in: once,
platform: "telegram",
key: "allowed_chats",
items: ["@alice", "@bob"]
)
#expect(once == twice)
}
@Test func setListReplaceThenReplaceIsStable() {
let yaml = ""
let a = GatewayConfigWriter.setList(
in: yaml, platform: "matrix", key: "allowed_rooms",
items: ["!a:m", "!b:m"]
)
let b = GatewayConfigWriter.setList(
in: a, platform: "matrix", key: "allowed_rooms",
items: ["!c:m"]
)
#expect(b.contains("- '!c:m'"))
#expect(!b.contains("'!a:m'"))
#expect(!b.contains("'!b:m'"))
}
// MARK: - Quoting
@Test func setListQuotesItemsContainingColons() {
// Matrix room IDs contain `:`, so they must be single-quoted.
let yaml = ""
let updated = GatewayConfigWriter.setList(
in: yaml, platform: "matrix", key: "allowed_rooms",
items: ["!RoomId:matrix.org"]
)
#expect(updated.contains("'!RoomId:matrix.org'"))
}
@Test func setListQuotesItemsStartingWithAt() {
// Telegram usernames, e.g. `@alice`.
let yaml = ""
let updated = GatewayConfigWriter.setList(
in: yaml, platform: "telegram", key: "allowed_chats",
items: ["@alice"]
)
#expect(updated.contains("'@alice'"))
}
@Test func setListLeavesPlainAlphanumericUnquoted() {
// Slack channel IDs are A-Z0-9; emit unquoted for readability.
let yaml = ""
let updated = GatewayConfigWriter.setList(
in: yaml, platform: "slack", key: "allowed_channels",
items: ["C0123ABCD"]
)
#expect(updated.contains("- C0123ABCD"))
#expect(!updated.contains("'C0123ABCD'"))
}
@Test func setListEscapesEmbeddedSingleQuotes() {
let yaml = ""
let updated = GatewayConfigWriter.setList(
in: yaml, platform: "slack", key: "allowed_channels",
items: ["weird:'name"]
)
// Embedded single quote doubled per YAML spec.
#expect(updated.contains("'weird:''name'"))
}
// MARK: - Insertion when ancestors exist but key is absent
@Test func setListInsertsKeyUnderExistingPlatformBlock() {
// `gateway -> platforms -> slack` exists with a busy_ack_enabled
// scalar; `allowed_channels` is missing. Add it without disturbing
// the scalar sibling.
let yaml = """
gateway:
platforms:
slack:
busy_ack_enabled: false
"""
let updated = GatewayConfigWriter.setList(
in: yaml, platform: "slack", key: "allowed_channels",
items: ["C42"]
)
#expect(updated.contains("busy_ack_enabled: false"))
#expect(updated.contains("allowed_channels:"))
#expect(updated.contains("- C42"))
}
// MARK: - Round-trip with the YAML loader
@Test func roundTripsThroughHermesConfigYAMLLoader() {
// Write a list, then parse the result through HermesConfig+YAML and
// confirm we read back what we wrote.
var yaml = ""
yaml = GatewayConfigWriter.setList(
in: yaml, platform: "slack", key: "allowed_channels",
items: ["C01", "C02"]
)
let cfg = HermesConfig(yaml: yaml)
let block = cfg.gatewayPlatforms["slack"]
#expect(block?.allowedChannels == ["C01", "C02"])
}
}
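The quoting rules exercised by the tests above (plain IDs left bare, anything else single-quoted with embedded quotes doubled) can be sketched as follows. This is a minimal illustration under stated assumptions, not the `GatewayConfigWriter` implementation, which this diff does not show; `yamlListItemSketch` and its "safe character" set are hypothetical.

```swift
import Foundation

/// Hypothetical helper: render one allowlist entry as a YAML block-list item.
func yamlListItemSketch(_ item: String) -> String {
    // Assumption: only ASCII letters, digits, `_` and `-` are safe to emit bare.
    let plain = CharacterSet(charactersIn:
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_-")
    let isPlain = !item.isEmpty
        && item.unicodeScalars.allSatisfy { plain.contains($0) }
    if isPlain { return "- \(item)" }
    // YAML single-quoted scalars escape `'` by doubling it.
    let escaped = item.replacingOccurrences(of: "'", with: "''")
    return "- '\(escaped)'"
}

print(yamlListItemSketch("C0123ABCD"))          // - C0123ABCD
print(yamlListItemSketch("!RoomId:matrix.org")) // - '!RoomId:matrix.org'
print(yamlListItemSketch("weird:'name"))        // - 'weird:''name'
```

One limitation of this sketch: bare values such as `true`, `null`, or `123` would be read back by a YAML parser as non-strings, so a production writer would likely quote those as well.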
@@ -0,0 +1,150 @@
#if canImport(SQLite3)
import Foundation
@testable import ScarfCore
/// Test double for `HermesQueryBackend`. Lets the data-service-façade
/// tests assert which SQL gets emitted, with which params, and feed
/// scripted result rows back.
///
/// Implemented as an `actor` to satisfy the protocol's `Sendable`
/// requirement and to mirror how the real backends serialize state.
/// Marked `final` to prevent accidental subclassing; Swift Testing
/// instances are short-lived per-`@Test`, but a stray subclass could
/// hide override quirks.
final actor MockHermesQueryBackend: HermesQueryBackend {
// MARK: - Knobs
var openShouldSucceed: Bool = true
var hasV07Schema: Bool = false
var hasV011Schema: Bool = false
var lastOpenError: String? = nil
/// Map of SQL prefix -> rows. Lookup picks the longest matching
/// prefix, so callers can register both broad ("SELECT") and
/// narrow ("SELECT id, source FROM sessions") matchers without
/// the broad one swallowing the narrow one.
private var scriptedResults: [String: [Row]] = [:]
/// Map of SQL prefix -> backend error to throw instead of returning
/// rows. Used to test the data-service's error-swallowing paths.
private var scriptedFailures: [String: BackendError] = [:]
/// Every `query(_:params:)` call lands here in order: assertion
/// material for "did the façade emit the SQL we expected".
private(set) var queryLog: [(sql: String, params: [SQLValue])] = []
/// Every `queryBatch` call lands here in order, one outer entry
/// per call, inner entries for each statement in that batch.
private(set) var batchLog: [[(sql: String, params: [SQLValue])]] = []
/// Track open/refresh/close lifecycle for a couple of tests that
/// want to assert "façade really did call open()".
private(set) var openCallCount = 0
private(set) var refreshCallCount = 0
private(set) var closeCallCount = 0
// MARK: - Knob mutators (called from tests)
func setOpenShouldSucceed(_ value: Bool) { openShouldSucceed = value }
func setHasV07Schema(_ value: Bool) { hasV07Schema = value }
func setHasV011Schema(_ value: Bool) { hasV011Schema = value }
func setLastOpenError(_ value: String?) { lastOpenError = value }
/// Build a one-row result keyed on `prefix`. `columns` is the
/// column-name -> position map; `values` must be the same length.
func _seedRow(forSQLPrefix prefix: String, columns: [String: Int], values: [SQLValue]) {
let row = Row(values: values, columnIndex: columns)
scriptedResults[prefix] = [row]
}
/// Seed an arbitrary row sequence for queries that share `prefix`.
func _seedRows(forSQLPrefix prefix: String, _ rows: [Row]) {
scriptedResults[prefix] = rows
}
/// Make `query` throw the specified `error` whenever it sees a SQL
/// that begins with `prefix`.
func _seedFailure(forSQLPrefix prefix: String, error: BackendError) {
scriptedFailures[prefix] = error
}
// MARK: - HermesQueryBackend conformance
func open() async -> Bool {
openCallCount += 1
return openShouldSucceed
}
@discardableResult
func refresh(forceFresh: Bool) async -> Bool {
refreshCallCount += 1
return openShouldSucceed
}
func close() async {
closeCallCount += 1
}
func query(_ sql: String, params: [SQLValue]) async throws -> [Row] {
queryLog.append((sql: sql, params: params))
if let failure = longestMatchingFailure(for: sql) {
throw failure
}
return longestMatchingRows(for: sql) ?? []
}
func queryBatch(_ statements: [(sql: String, params: [SQLValue])]) async throws -> [[Row]] {
batchLog.append(statements)
var out: [[Row]] = []
out.reserveCapacity(statements.count)
for stmt in statements {
if let failure = longestMatchingFailure(for: stmt.sql) {
throw failure
}
out.append(longestMatchingRows(for: stmt.sql) ?? [])
}
return out
}
// MARK: - Internals
/// Pick the longest registered prefix that `sql` starts with.
/// Ties go to whichever ordering Dictionary iteration produced;
/// callers should not register two equal-length matchers for the
/// same SQL because the resolution order is undefined.
private func longestMatchingRows(for sql: String) -> [Row]? {
var bestMatch: (key: String, rows: [Row])?
for (prefix, rows) in scriptedResults {
if sql.hasPrefix(prefix) {
if let current = bestMatch {
if prefix.count > current.key.count {
bestMatch = (prefix, rows)
}
} else {
bestMatch = (prefix, rows)
}
}
}
return bestMatch?.rows
}
private func longestMatchingFailure(for sql: String) -> BackendError? {
var bestMatch: (key: String, error: BackendError)?
for (prefix, error) in scriptedFailures {
if sql.hasPrefix(prefix) {
if let current = bestMatch {
if prefix.count > current.key.count {
bestMatch = (prefix, error)
}
} else {
bestMatch = (prefix, error)
}
}
}
return bestMatch?.error
}
}
#endif // canImport(SQLite3)
@@ -0,0 +1,227 @@
import Testing
import Foundation
@testable import ScarfCore
/// Pure parser tests for `HermesCapabilities`. The detection store
/// (`HermesCapabilitiesStore`) is exercised separately under integration
/// tests since it spawns `hermes --version`.
@Suite struct HermesCapabilitiesTests {
// MARK: - Version line parsing
@Test func parseV013ReleaseLine() {
let caps = HermesCapabilities.parseLine("Hermes Agent v0.13.0 (2026.5.7)")
#expect(caps.semver == HermesCapabilities.SemVer(major: 0, minor: 13, patch: 0))
#expect(caps.dateVersion == HermesCapabilities.DateVersion(year: 2026, month: 5, day: 7))
#expect(caps.detected)
}
@Test func parseV012ReleaseLine() {
let caps = HermesCapabilities.parseLine("Hermes Agent v0.12.0 (2026.4.30)")
#expect(caps.semver == HermesCapabilities.SemVer(major: 0, minor: 12, patch: 0))
#expect(caps.dateVersion == HermesCapabilities.DateVersion(year: 2026, month: 4, day: 30))
#expect(caps.detected)
}
@Test func parseV011ReleaseLine() {
let caps = HermesCapabilities.parseLine("Hermes Agent v0.11.0 (2026.4.23)")
#expect(caps.semver == HermesCapabilities.SemVer(major: 0, minor: 11, patch: 0))
#expect(caps.dateVersion == HermesCapabilities.DateVersion(year: 2026, month: 4, day: 23))
}
@Test func parseSemverWithoutDate() {
// Some older Hermes builds emit only the semver suffix.
let caps = HermesCapabilities.parseLine("Hermes Agent v0.10.5")
#expect(caps.semver == HermesCapabilities.SemVer(major: 0, minor: 10, patch: 5))
#expect(caps.dateVersion == nil)
}
@Test func parseFullStdoutBlock() {
// Real `hermes --version` output is multi-line; the version sits on
// the first line and the rest is metadata.
let stdout = """
Hermes Agent v0.12.0 (2026.4.30)
Project: /Users/alan/.hermes/hermes-agent
Python: 3.11.15
OpenAI SDK: 2.31.0
Up to date
"""
let caps = HermesCapabilities.parse(stdout)
#expect(caps.semver?.minor == 12)
#expect(caps.dateVersion?.year == 2026)
}
@Test func parseRejectsUnrelatedOutput() {
let caps = HermesCapabilities.parse("hermes: command not found")
#expect(caps.semver == nil)
#expect(!caps.detected)
}
@Test func parseHandlesEmptyString() {
let caps = HermesCapabilities.parse("")
#expect(caps == .empty)
}
@Test func parseHandlesPartialSemver() {
// "v0.11" without the patch component shouldn't accidentally match.
let caps = HermesCapabilities.parseLine("Hermes Agent v0.11")
#expect(caps.semver == nil)
}
// MARK: - SemVer ordering
@Test func semverOrdering() {
let v0_11_0 = HermesCapabilities.SemVer(major: 0, minor: 11, patch: 0)
let v0_12_0 = HermesCapabilities.SemVer(major: 0, minor: 12, patch: 0)
let v0_12_5 = HermesCapabilities.SemVer(major: 0, minor: 12, patch: 5)
let v1_0_0 = HermesCapabilities.SemVer(major: 1, minor: 0, patch: 0)
#expect(v0_11_0 < v0_12_0)
#expect(v0_12_0 < v0_12_5)
#expect(v0_12_5 < v1_0_0)
}
// MARK: - Capability flags
@Test func v013FlagsAllOn() {
let caps = HermesCapabilities.parseLine("Hermes Agent v0.13.0 (2026.5.7)")
// v0.12 surfaces remain on.
#expect(caps.hasCurator)
#expect(caps.hasKanban)
#expect(caps.hasACPImagePrompts)
#expect(!caps.hasFlushMemoriesAux)
// v0.13 surfaces light up.
#expect(caps.hasGoals)
#expect(caps.hasACPQueue)
#expect(caps.hasACPSteerOnIdle)
#expect(caps.hasKanbanDiagnostics)
#expect(caps.hasCuratorArchive)
#expect(caps.hasGoogleChatPlatform)
#expect(caps.hasGatewayAllowlists)
#expect(caps.hasGatewayBusyAckToggle)
#expect(caps.hasGatewayRestartNotification)
#expect(caps.hasGatewayList)
#expect(caps.hasMCPSSETransport)
#expect(caps.hasCronNoAgent)
#expect(caps.hasWebToolsBackendSplit)
#expect(caps.hasProfileNoSkills)
#expect(caps.hasContextCompressionCount)
#expect(caps.hasNewWithSessionName)
#expect(caps.hasUpdateNonInteractive)
#expect(caps.hasOpenRouterResponseCache)
#expect(caps.hasImageGenModel)
#expect(caps.hasDisplayLanguage)
#expect(caps.hasXAIVoiceCloning)
#expect(caps.hasVideoAnalyze)
#expect(caps.hasTransformLLMOutputHook)
}
@Test func v012FlagsAllOn() {
let caps = HermesCapabilities.parseLine("Hermes Agent v0.12.0 (2026.4.30)")
// v0.12 surfaces on.
#expect(caps.hasCurator)
#expect(caps.hasFallbackCommand)
#expect(caps.hasKanban)
#expect(caps.hasOneShot)
#expect(caps.hasSkillURLInstall)
#expect(caps.hasACPImagePrompts)
#expect(caps.hasUpdateCheck)
#expect(caps.hasPiperTTS)
#expect(caps.hasVercelTerminal)
#expect(caps.hasCuratorAux)
#expect(caps.hasTeamsPlatform)
#expect(caps.hasYuanbaoPlatform)
#expect(caps.hasCronWorkdir)
#expect(caps.hasPromptCacheTTL)
#expect(caps.hasRedactionToggle)
// flush_memories was REMOVED in v0.12, so the flag inverts.
#expect(!caps.hasFlushMemoriesAux)
// v0.13 surfaces stay off on a v0.12 host.
#expect(!caps.hasGoals)
#expect(!caps.hasACPQueue)
#expect(!caps.hasKanbanDiagnostics)
#expect(!caps.hasCuratorArchive)
#expect(!caps.hasGoogleChatPlatform)
#expect(!caps.hasGatewayAllowlists)
#expect(!caps.hasMCPSSETransport)
#expect(!caps.hasCronNoAgent)
#expect(!caps.hasWebToolsBackendSplit)
#expect(!caps.hasProfileNoSkills)
#expect(!caps.hasContextCompressionCount)
#expect(!caps.hasOpenRouterResponseCache)
#expect(!caps.hasImageGenModel)
#expect(!caps.hasDisplayLanguage)
#expect(!caps.hasXAIVoiceCloning)
}
@Test func v011FlagsAllOff() {
let caps = HermesCapabilities.parseLine("Hermes Agent v0.11.0 (2026.4.23)")
#expect(!caps.hasCurator)
#expect(!caps.hasFallbackCommand)
#expect(!caps.hasKanban)
#expect(!caps.hasOneShot)
#expect(!caps.hasSkillURLInstall)
#expect(!caps.hasACPImagePrompts)
#expect(!caps.hasUpdateCheck)
#expect(!caps.hasPiperTTS)
#expect(!caps.hasVercelTerminal)
#expect(!caps.hasCuratorAux)
#expect(!caps.hasTeamsPlatform)
#expect(!caps.hasYuanbaoPlatform)
#expect(!caps.hasCronWorkdir)
#expect(!caps.hasPromptCacheTTL)
#expect(!caps.hasRedactionToggle)
// flush_memories aux row was still alive on v0.11.
#expect(caps.hasFlushMemoriesAux)
}
@Test func emptyCapabilitiesAllOff() {
// Undetected installs should hide every gated UI surface.
let caps = HermesCapabilities.empty
#expect(!caps.hasCurator)
#expect(!caps.hasFlushMemoriesAux) // unknown: hide either way
#expect(!caps.detected)
}
@Test func futureVersionRetainsCapabilities() {
// A v0.14 (hypothetical) should still see all v0.12 + v0.13 capabilities on.
let caps = HermesCapabilities.parseLine("Hermes Agent v0.14.0 (2026.7.1)")
#expect(caps.hasCurator)
#expect(caps.hasACPImagePrompts)
#expect(caps.hasGoals)
#expect(caps.hasKanbanDiagnostics)
#expect(caps.hasCuratorArchive)
// And flush_memories stays gone.
#expect(!caps.hasFlushMemoriesAux)
}
@Test func v0_13_patchReleaseStillEnablesAllFlags() {
// A v0.13.4 patch release should still enable every v0.13 flag.
let caps = HermesCapabilities.parseLine("Hermes Agent v0.13.4 (2026.5.20)")
#expect(caps.hasGoals)
#expect(caps.hasACPQueue)
#expect(caps.hasKanbanDiagnostics)
#expect(caps.hasGoogleChatPlatform)
}
// MARK: - isV013OrLater convenience predicate
@Test func isV013OrLater_v013HostTrue() {
let caps = HermesCapabilities.parseLine("Hermes Agent v0.13.0 (2026.5.7)")
#expect(caps.isV013OrLater)
}
@Test func isV013OrLater_v012HostFalse() {
let caps = HermesCapabilities.parseLine("Hermes Agent v0.12.0 (2026.4.30)")
#expect(!caps.isV013OrLater)
}
@Test func isV013OrLater_emptyFalse() {
let caps = HermesCapabilities.empty
#expect(!caps.isV013OrLater)
}
@Test func isV013OrLater_v014HostTrue() {
let caps = HermesCapabilities.parseLine("Hermes Agent v0.14.0 (2026.7.1)")
#expect(caps.isV013OrLater)
}
}
@@ -0,0 +1,319 @@
import Testing
import Foundation
@testable import ScarfCore
@Suite struct HermesCuratorParserTests {
/// Real `hermes curator status` output captured from a v0.12.0
/// install with no curator runs yet. Locks in the empty-state
/// happy path so a Hermes layout tweak surfaces here before
/// CuratorView starts rendering "" placeholders silently.
private static let realFreshOutput = """
curator: ENABLED
runs: 0
last run: never
last summary: (none)
interval: every 7d
stale after: 30d unused
archive after: 90d unused
agent-created skills: 18 total
active 18
stale 0
archived 0
least recently active (top 5):
Scarf Dashboard Chart Widget Parse Error Fix activity= 0 use= 0 view= 0 patches= 0 last_activity=never
Scarf Project Registry Format Fix activity= 0 use= 0 view= 0 patches= 0 last_activity=never
clip activity= 0 use= 0 view= 0 patches= 0 last_activity=never
find-nearby activity= 0 use= 0 view= 0 patches= 0 last_activity=never
gguf-quantization activity= 0 use= 0 view= 0 patches= 0 last_activity=never
least active (top 5):
Scarf Dashboard Chart Widget Parse Error Fix activity= 0 use= 0 view= 0 patches= 0 last_activity=never
Scarf Project Registry Format Fix activity= 0 use= 0 view= 0 patches= 0 last_activity=never
clip activity= 0 use= 0 view= 0 patches= 0 last_activity=never
find-nearby activity= 0 use= 0 view= 0 patches= 0 last_activity=never
gguf-quantization activity= 0 use= 0 view= 0 patches= 0 last_activity=never
"""
@Test func parseRealFreshOutput() {
let s = HermesCuratorStatusParser.parse(text: Self.realFreshOutput)
#expect(s.state == .enabled)
#expect(s.runCount == 0)
#expect(s.lastRunISO == nil)
#expect(s.lastSummary == nil)
#expect(s.intervalLabel == "every 7d")
#expect(s.staleAfterLabel == "30d unused")
#expect(s.archiveAfterLabel == "90d unused")
#expect(s.totalSkills == 18)
#expect(s.activeSkills == 18)
#expect(s.staleSkills == 0)
#expect(s.archivedSkills == 0)
#expect(s.pinnedNames.isEmpty)
#expect(s.leastRecentlyActive.count == 5)
#expect(s.leastActive.count == 5)
#expect(s.mostActive.isEmpty)
let firstRow = s.leastRecentlyActive.first
#expect(firstRow?.name == "Scarf Dashboard Chart Widget Parse Error Fix")
#expect(firstRow?.activityCount == 0)
#expect(firstRow?.lastActivityLabel == "never")
}
@Test func parsedPausedState() {
let text = """
curator: PAUSED
runs: 5
last run: 2026-04-29T03:10:00Z
last summary: pruned 2 skills, consolidated 1
interval: every 7d
stale after: 30d unused
archive after: 90d unused
agent-created skills: 12 total
active 8
stale 3
archived 1
pinned (2): kanban-orchestrator, scarf-template-author
"""
let s = HermesCuratorStatusParser.parse(text: text)
#expect(s.state == .paused)
#expect(s.runCount == 5)
#expect(s.lastRunISO == "2026-04-29T03:10:00Z")
#expect(s.lastSummary == "pruned 2 skills, consolidated 1")
#expect(s.totalSkills == 12)
#expect(s.activeSkills == 8)
#expect(s.staleSkills == 3)
#expect(s.archivedSkills == 1)
#expect(s.pinnedNames == ["kanban-orchestrator", "scarf-template-author"])
}
@Test func stateFileOverridesTextSummary() {
// The state file is authoritative for last_run_at /
// last_run_summary / last_report_path because it carries full
// ISO timestamps the text output may have rounded. Verify that
// a state file with richer values overrides parsed text.
let text = """
curator: ENABLED
runs: 1
last run: 2026-04-30T11:00:00Z
last summary: short
interval: every 7d
stale after: 30d unused
archive after: 90d unused
agent-created skills: 3 total
active 3
stale 0
archived 0
"""
let stateJSON: [String: Any] = [
"run_count": 4,
"last_run_at": "2026-04-30T18:42:13.001Z",
"last_run_summary": "richer summary from state file",
"last_report_path": "/Users/u/.hermes/logs/curator/20260430-184213"
]
let data = try! JSONSerialization.data(withJSONObject: stateJSON)
let s = HermesCuratorStatusParser.parse(text: text, stateFileJSON: data)
#expect(s.runCount == 4)
#expect(s.lastRunISO == "2026-04-30T18:42:13.001Z")
#expect(s.lastSummary == "richer summary from state file")
#expect(s.lastReportPath == "/Users/u/.hermes/logs/curator/20260430-184213")
}
@Test func parsedDisabledStatus() {
let s = HermesCuratorStatusParser.parse(text: "curator: DISABLED\n runs: 0\n")
#expect(s.state == .disabled)
}
@Test func parsedEmptyOutputStaysSafe() {
let s = HermesCuratorStatusParser.parse(text: "")
#expect(s.state == .unknown)
#expect(s.totalSkills == 0)
#expect(s.leastRecentlyActive.isEmpty)
}
@Test func skillRowParserHandlesMultiWordNames() {
// Names with spaces are common (e.g. Scarf Dashboard Chart Widget).
// The parser slices at the first `activity=` so names can be
// arbitrary length without breaking the counter columns.
let row = " Some Long Skill Name v2 activity= 12 use= 4 view= 6 patches= 2 last_activity=2026-04-25"
let s = HermesCuratorStatusParser.parse(text: """
least recently active (top 5):
\(row)
""")
let parsed = s.leastRecentlyActive.first
#expect(parsed?.name == "Some Long Skill Name v2")
#expect(parsed?.activityCount == 12)
#expect(parsed?.useCount == 4)
#expect(parsed?.viewCount == 6)
#expect(parsed?.patchCount == 2)
#expect(parsed?.lastActivityLabel == "2026-04-25")
}
// MARK: - v0.13 list-archived / prune fixtures (WS-4)
/// Empty JSON array `[]`. Locks in the happy-path no-archives shape.
@Test func listArchivedEmpty() throws {
let result = try CuratorService.parseListArchived(stdout: "[]")
#expect(result.isEmpty)
}
/// Three archives with full optional fields. Asserts each
/// optional value decodes through `decodeIfPresent` and that
/// the computed labels resolve.
@Test func listArchivedThreeSkills() throws {
let json = """
[
{
"name": "legacy-helper",
"category": "templates",
"archived_at": "2026-04-22T03:14:09Z",
"reason": "stale: 91d unused",
"size_bytes": 4521,
"path": "/Users/u/.hermes/skills/.archived/legacy-helper"
},
{
"name": "old-translator",
"category": "user",
"archived_at": "2026-04-23T10:00:00Z",
"reason": "consolidated with translator",
"size_bytes": 8192
},
{
"name": "minimal"
}
]
"""
let result = try CuratorService.parseListArchived(stdout: json)
#expect(result.count == 3)
#expect(result[0].name == "legacy-helper")
#expect(result[0].category == "templates")
#expect(result[0].reason == "stale: 91d unused")
#expect(result[0].sizeBytes == 4521)
#expect(result[0].archivedAtLabel == "2026-04-22")
#expect(result[0].path == "/Users/u/.hermes/skills/.archived/legacy-helper")
// Tolerant: only `name` set on the third row.
#expect(result[2].name == "minimal")
#expect(result[2].category == nil)
#expect(result[2].reason == nil)
#expect(result[2].archivedAtLabel == "")
#expect(result[2].sizeLabel == "")
}
/// `{"archived": [...]}` envelope is also accepted.
@Test func listArchivedEnvelope() throws {
let json = """
{"archived": [
{"name": "envelope-skill", "size_bytes": 1024}
]}
"""
let result = try CuratorService.parseListArchived(stdout: json)
#expect(result.count == 1)
#expect(result[0].name == "envelope-skill")
}
/// Text fallback when `--json` isn't supported. Each row carries
/// the name in column 1 plus k=v chips for the optional fields.
@Test func listArchivedTextFallback() {
let text = """
legacy-helper archived=2026-04-22 size=4521 reason=stale
old-translator archived=2026-04-23 size=8192
minimal-row
"""
let result = CuratorService.parseListArchivedText(text)
#expect(result.count == 3)
#expect(result[0].name == "legacy-helper")
#expect(result[0].archivedAt == "2026-04-22")
#expect(result[0].sizeBytes == 4521)
#expect(result[0].reason == "stale")
#expect(result[2].name == "minimal-row")
#expect(result[2].sizeBytes == nil)
}
/// Empty-state sentinel folds to `[]` (parallel to KanbanService's
/// `"no matching tasks"` handling).
@Test func listArchivedNoArchivedSentinel() throws {
let result = try CuratorService.parseListArchived(stdout: "no archived skills\n")
#expect(result.isEmpty)
}
/// Whitespace-only stdout also folds to empty.
@Test func listArchivedWhitespaceFoldsToEmpty() throws {
let result = try CuratorService.parseListArchived(stdout: " \n\n")
#expect(result.isEmpty)
}
/// Decode failure (clearly non-JSON, non-text) throws. We accept
/// JSON, the envelope, the empty sentinel, or text rows; anything
/// else surfaces as a `CuratorError.decoding`.
@Test func listArchivedNonsenseThrows() throws {
do {
_ = try CuratorService.parseListArchived(stdout: "{garbage")
Issue.record("expected decoding throw")
} catch let error as CuratorError {
if case .decoding = error {
// expected
} else {
Issue.record("unexpected error \(error)")
}
}
}
/// Prune-dry-run JSON with `would_remove` + `total_bytes`.
@Test func pruneDryRunHappyPath() {
let json = """
{
"would_remove": [
{"name": "stale-a", "size_bytes": 1000},
{"name": "stale-b", "size_bytes": 2000}
],
"total_bytes": 3000
}
"""
let summary = CuratorService.parsePruneDryRun(json)
#expect(summary.totalCount == 2)
#expect(summary.totalBytes == 3000)
#expect(summary.wouldRemove.first?.name == "stale-a")
}
/// Zero-skill prune is a valid dry-run (no archives).
@Test func pruneDryRunZeroSkills() {
let json = """
{"would_remove": [], "total_bytes": 0}
"""
let summary = CuratorService.parsePruneDryRun(json)
#expect(summary.totalCount == 0)
#expect(summary.totalBytes == 0)
#expect(summary.totalBytesLabel == "")
}
/// Bare-array fallback: some Hermes builds may print just the
/// would-remove list when the wrapper is missing.
@Test func pruneDryRunBareArrayFallback() {
let json = """
[{"name": "lonely", "size_bytes": 500}]
"""
let summary = CuratorService.parsePruneDryRun(json)
#expect(summary.totalCount == 1)
#expect(summary.totalBytes == 500)
}
/// Empty / whitespace stdout yields a zero summary (no decoding throw).
@Test func pruneDryRunEmptyStaysSafe() {
let summary = CuratorService.parsePruneDryRun(" \n")
#expect(summary.totalCount == 0)
#expect(summary.totalBytes == 0)
}
/// Verify the size label uses the byte formatter (not raw bytes).
@Test func archivedSkillSizeLabelFormats() {
let big = HermesCuratorArchivedSkill(name: "x", sizeBytes: 1_500_000)
// ByteCountFormatter produces a localized label; just verify
// it's non-empty and not raw "1500000".
#expect(!big.sizeLabel.isEmpty)
#expect(big.sizeLabel != "1500000")
}
}
@@ -0,0 +1,338 @@
#if canImport(SQLite3)
import Testing
import Foundation
@testable import ScarfCore
/// Exercises the `HermesDataService` façade against a `MockHermesQueryBackend`
/// via the `internal init(context:backend:)` test seam. Focus is the SQL
/// the façade emits + how it consumes the rows that come back.
@Suite struct HermesDataServiceBackendTests {
// MARK: - Helpers
/// Build a `Row` from `(name, value)` pairs in column order.
/// Mirrors the shape `LocalSQLiteBackend.executeOne` produces.
private func makeRow(_ pairs: [(String, SQLValue)]) -> Row {
var values: [SQLValue] = []
var columnIndex: [String: Int] = [:]
values.reserveCapacity(pairs.count)
for (i, pair) in pairs.enumerated() {
values.append(pair.1)
columnIndex[pair.0] = i
}
return Row(values: values, columnIndex: columnIndex)
}
/// Default 16-column session row matching `sessionColumns` for
/// the bare base schema. Uses `.text("s1")` for id by default.
private func makeBaseSessionRow(id: String = "s1") -> Row {
makeRow([
("id", .text(id)),
("source", .text("acp")),
("user_id", .null),
("model", .text("gpt-5")),
("title", .text("hello")),
("parent_session_id", .null),
("started_at", .real(1_700_000_000.0)),
("ended_at", .null),
("end_reason", .null),
("message_count", .integer(5)),
("tool_call_count", .integer(2)),
("input_tokens", .integer(100)),
("output_tokens", .integer(200)),
("cache_read_tokens", .integer(0)),
("cache_write_tokens", .integer(0)),
("estimated_cost_usd", .real(0.05))
])
}
/// 10-column message row matching `messageColumns` for the bare base schema.
private func makeBaseMessageRow(id: Int, sessionId: String = "s1", timestamp: Double = 1_700_000_001.0) -> Row {
makeRow([
("id", .integer(Int64(id))),
("session_id", .text(sessionId)),
("role", .text("user")),
("content", .text("hi #\(id)")),
("tool_call_id", .null),
("tool_calls", .null),
("tool_name", .null),
("timestamp", .real(timestamp)),
("token_count", .integer(10)),
("finish_reason", .null)
])
}
/// Use a real `ServerContext.local` so the data service has a
/// transport to construct (it's never used by these tests; every
/// I/O path goes through the injected backend).
private let context: ServerContext = .local
// MARK: - fetchSessions
@Test func fetchSessionsEmitsExpectedSQLPrefixAndDefaultLimit() async {
let mock = MockHermesQueryBackend()
let service = HermesDataService(context: context, backend: mock)
_ = await service.open()
_ = await service.fetchSessions()
let log = await mock.queryLog
#expect(log.count == 1)
let first = log[0]
#expect(first.sql.hasPrefix("SELECT id, source"))
#expect(first.sql.contains("FROM sessions WHERE parent_session_id IS NULL ORDER BY started_at DESC LIMIT ?"))
// QueryDefaults.sessionLimit == 100.
#expect(first.params == [.integer(100)])
}
@Test func fetchSessionsBareSchemaUsesBaseColumnList() async {
let mock = MockHermesQueryBackend()
// Both schema flags off: neither v0.7 nor v0.11 columns are selected.
await mock.setHasV07Schema(false)
await mock.setHasV011Schema(false)
let service = HermesDataService(context: context, backend: mock)
_ = await service.open()
_ = await service.fetchSessions()
let sql = await mock.queryLog[0].sql
#expect(!sql.contains("reasoning_tokens"))
#expect(!sql.contains("api_call_count"))
// Sanity: base columns are still all there.
#expect(sql.contains("estimated_cost_usd"))
}
@Test func fetchSessionsWithV07SchemaIncludesReasoningTokens() async {
let mock = MockHermesQueryBackend()
await mock.setHasV07Schema(true)
await mock.setHasV011Schema(false)
let service = HermesDataService(context: context, backend: mock)
_ = await service.open()
_ = await service.fetchSessions()
let sql = await mock.queryLog[0].sql
#expect(sql.contains("reasoning_tokens"))
#expect(sql.contains("actual_cost_usd"))
#expect(sql.contains("cost_status"))
#expect(sql.contains("billing_provider"))
#expect(!sql.contains("api_call_count"))
}
@Test func fetchSessionsWithV011SchemaIncludesApiCallCount() async {
let mock = MockHermesQueryBackend()
await mock.setHasV07Schema(true)
await mock.setHasV011Schema(true)
let service = HermesDataService(context: context, backend: mock)
_ = await service.open()
_ = await service.fetchSessions()
let sql = await mock.queryLog[0].sql
#expect(sql.contains("reasoning_tokens"))
#expect(sql.contains("api_call_count"))
}
// MARK: - fetchSession(id:)
@Test func fetchSessionByIdBindsTextParam() async {
let mock = MockHermesQueryBackend()
let service = HermesDataService(context: context, backend: mock)
_ = await service.open()
await mock._seedRow(
forSQLPrefix: "SELECT id, source",
columns: makeBaseSessionRow().columnIndex,
values: makeBaseSessionRow().values
)
let session = await service.fetchSession(id: "abc-123")
#expect(session?.id == "s1") // From the seeded row.
let log = await mock.queryLog
#expect(log.count == 1)
#expect(log[0].sql.contains("FROM sessions WHERE id = ? LIMIT 1"))
#expect(log[0].params == [.text("abc-123")])
}
// MARK: - fetchMessages
@Test func fetchMessagesWithoutBeforeBindsSessionAndLimit() async {
let mock = MockHermesQueryBackend()
let service = HermesDataService(context: context, backend: mock)
_ = await service.open()
_ = await service.fetchMessages(sessionId: "s1", limit: 25, before: nil)
let log = await mock.queryLog
#expect(log.count == 1)
#expect(!log[0].sql.contains("id < ?"))
#expect(log[0].sql.contains("WHERE session_id = ? ORDER BY id DESC LIMIT ?"))
#expect(log[0].params == [.text("s1"), .integer(25)])
}
@Test func fetchMessagesWithBeforeIncludesIdLessThanClause() async {
let mock = MockHermesQueryBackend()
let service = HermesDataService(context: context, backend: mock)
_ = await service.open()
_ = await service.fetchMessages(sessionId: "s1", limit: 25, before: 999)
let log = await mock.queryLog
#expect(log.count == 1)
#expect(log[0].sql.contains("WHERE session_id = ? AND id < ? ORDER BY id DESC LIMIT ?"))
#expect(log[0].params == [.text("s1"), .integer(999), .integer(25)])
}
@Test func fetchMessagesReversesDescResultsToChronological() async {
let mock = MockHermesQueryBackend()
let service = HermesDataService(context: context, backend: mock)
_ = await service.open()
// Backend returns DESC (newest first); service should reverse to
// chronological (oldest first) for display.
let row3 = makeBaseMessageRow(id: 3, timestamp: 1_700_000_003.0)
let row2 = makeBaseMessageRow(id: 2, timestamp: 1_700_000_002.0)
let row1 = makeBaseMessageRow(id: 1, timestamp: 1_700_000_001.0)
await mock._seedRows(forSQLPrefix: "SELECT id, session_id", [row3, row2, row1])
let result = await service.fetchMessages(sessionId: "s1", limit: 10, before: nil)
#expect(result.count == 3)
#expect(result.map { $0.id } == [1, 2, 3])
}
// MARK: - dashboardSnapshot
@Test func dashboardSnapshotUsesQueryBatchNotIndividualQueries() async {
let mock = MockHermesQueryBackend()
let service = HermesDataService(context: context, backend: mock)
_ = await service.open()
_ = await service.dashboardSnapshot()
let queries = await mock.queryLog
let batches = await mock.batchLog
#expect(queries.isEmpty)
#expect(batches.count == 1)
#expect(batches[0].count == 4)
}
@Test func dashboardSnapshotBatchOrderIsStatsRecentSessionsPreviewsToolCalls() async {
let mock = MockHermesQueryBackend()
let service = HermesDataService(context: context, backend: mock)
_ = await service.open()
_ = await service.dashboardSnapshot()
let batches = await mock.batchLog
#expect(batches.count == 1)
let stmts = batches[0]
// 0: stats (selects COUNT(*), SUM(...) from sessions).
#expect(stmts[0].sql.contains("COUNT(*)"))
#expect(stmts[0].sql.contains("FROM sessions"))
// 1: recent sessions (selects session columns with a LIMIT param).
#expect(stmts[1].sql.hasPrefix("SELECT id, source"))
#expect(stmts[1].sql.contains("ORDER BY started_at DESC LIMIT ?"))
// 2: session previews (joins messages with the first user message).
#expect(stmts[2].sql.contains("INNER JOIN"))
#expect(stmts[2].sql.contains("MIN(id)"))
// 3: recent tool calls (selects messages WHERE tool_calls IS NOT NULL).
#expect(stmts[3].sql.contains("WHERE tool_calls IS NOT NULL"))
}
@Test func dashboardSnapshotAssemblesDataFromFourResultSets() async {
let mock = MockHermesQueryBackend()
let service = HermesDataService(context: context, backend: mock)
_ = await service.open()
// Stats row (6 cols on bare schema).
let statsRow = makeRow([
("c0", .integer(7)), // totalSessions
("c1", .integer(50)), // totalMessages
("c2", .integer(12)), // totalToolCalls
("c3", .integer(1000)), // totalInputTokens
("c4", .integer(2000)), // totalOutputTokens
("c5", .real(1.25)) // totalCostUSD
])
await mock._seedRow(forSQLPrefix: "SELECT COUNT(*),", columns: statsRow.columnIndex, values: statsRow.values)
// Recent sessions: one base session row.
await mock._seedRows(forSQLPrefix: "SELECT id, source", [makeBaseSessionRow(id: "sess-A")])
// Previews: two-column rows (session_id, content slice).
let p1 = makeRow([("session_id", .text("sess-A")), ("preview", .text("first user msg"))])
await mock._seedRows(forSQLPrefix: "SELECT m.session_id", [p1])
// Recent tool calls: one message row with non-empty tool_calls.
var toolRow = makeBaseMessageRow(id: 99, sessionId: "sess-A")
// Manually rewrite tool_calls column (idx 5) to non-null/non-empty.
let toolRowValues: [SQLValue] = [
.integer(99), .text("sess-A"), .text("assistant"), .text("Calling tool"),
.null, .text("[{\"id\":\"t1\",\"name\":\"bash\"}]"), .text("bash"),
.real(1_700_000_010.0), .integer(15), .text("stop")
]
toolRow = Row(values: toolRowValues, columnIndex: toolRow.columnIndex)
// Both `fetchRecentToolCalls` and the dashboard batch slot start
// with the same `messageColumns` prefix; match on a shorter
// common substring that's whitespace-stable across the two
// SQL builders.
await mock._seedRows(forSQLPrefix: "SELECT id, session_id, role, content, tool_call_id, tool_calls,\ntool_name", [toolRow])
let snapshot = await service.dashboardSnapshot()
#expect(snapshot.stats.totalSessions == 7)
#expect(snapshot.stats.totalMessages == 50)
#expect(snapshot.recentSessions.map { $0.id } == ["sess-A"])
#expect(snapshot.sessionPreviews["sess-A"] == "first user msg")
#expect(snapshot.recentToolCalls.count == 1)
#expect(snapshot.recentToolCalls[0].id == 99)
}
// MARK: - searchMessages
@Test func searchMessagesEmptyInputReturnsEmptyAndSkipsBackend() async {
let mock = MockHermesQueryBackend()
let service = HermesDataService(context: context, backend: mock)
_ = await service.open()
let result = await service.searchMessages(query: " ")
#expect(result.isEmpty)
let log = await mock.queryLog
#expect(log.isEmpty)
}
@Test func searchMessagesWrapsTokensInDoubleQuotes() async {
let mock = MockHermesQueryBackend()
let service = HermesDataService(context: context, backend: mock)
_ = await service.open()
_ = await service.searchMessages(query: "config.yaml v0.7.0")
let log = await mock.queryLog
#expect(log.count == 1)
// FTS query is the first param.
guard case .text(let fts) = log[0].params[0] else {
Issue.record("Expected first FTS search param to be .text")
return
}
// Each whitespace-delimited token gets wrapped in double-quotes
// and joined with spaces.
#expect(fts == "\"config.yaml\" \"v0.7.0\"")
}
// MARK: - Error swallowing
@Test func fetchSessionsReturnsEmptyOnBackendTransportError() async {
let mock = MockHermesQueryBackend()
let service = HermesDataService(context: context, backend: mock)
_ = await service.open()
await mock._seedFailure(forSQLPrefix: "SELECT id, source", error: .transport("ssh dropped"))
let result = await service.fetchSessions()
#expect(result.isEmpty)
// Sanity: the error reached the backend (the call was made).
let log = await mock.queryLog
#expect(log.count == 1)
}
}
#endif // canImport(SQLite3)
@@ -0,0 +1,131 @@
import Testing
import Foundation
@testable import ScarfCore
/// Parser tests for `hermes gateway list --json`. Pure: no transport, no
/// process calls.
@Suite struct HermesGatewayListServiceTests {
private func data(_ s: String) -> Data { s.data(using: .utf8)! }
@Test func parsesSingleProfileSinglePlatform() {
let json = data(#"""
{"profiles":[{"name":"default","running":true,"pid":1234,
"platforms":["slack","telegram"]}]}
"""#)
let snap = HermesGatewayListService.parse(json)
#expect(snap?.profiles.count == 1)
#expect(snap?.profiles[0].profile == "default")
#expect(snap?.profiles[0].pid == 1234)
#expect(snap?.profiles[0].isRunning == true)
#expect(snap?.profiles[0].platforms == ["slack", "telegram"])
}
@Test func parsesMultipleProfiles() {
let json = data(#"""
{"profiles":[
{"name":"work","running":true,"pid":2001,"platforms":["slack"]},
{"name":"personal","running":false,"platforms":["telegram"]}
]}
"""#)
let snap = HermesGatewayListService.parse(json)
#expect(snap?.profiles.count == 2)
#expect(snap?.profiles[0].profile == "work")
#expect(snap?.profiles[0].isRunning == true)
#expect(snap?.profiles[1].profile == "personal")
#expect(snap?.profiles[1].isRunning == false)
#expect(snap?.profiles[1].pid == nil)
}
@Test func parsesBareArrayShape() {
// Tolerance for a top-level array (no `profiles` wrapper).
let json = data(#"""
[{"name":"default","running":true,"pid":42,"platforms":["discord"]}]
"""#)
let snap = HermesGatewayListService.parse(json)
#expect(snap?.profiles.count == 1)
#expect(snap?.profiles[0].profile == "default")
}
@Test func toleratesAlternateFieldNames() {
// `profile` instead of `name`, `state` instead of `running`,
// `connected_platforms` instead of `platforms`; defensive defaults
// keep the parser happy if Hermes ships any of these.
let json = data(#"""
{"profiles":[{"profile":"alt","state":"running","pid":7,
"connected_platforms":["matrix"]}]}
"""#)
let snap = HermesGatewayListService.parse(json)
#expect(snap?.profiles[0].profile == "alt")
#expect(snap?.profiles[0].isRunning == true)
#expect(snap?.profiles[0].platforms == ["matrix"])
}
@Test func returnsNilOnEmptyData() {
#expect(HermesGatewayListService.parse(Data()) == nil)
}
@Test func returnsNilOnUnparseableJSON() {
let json = data("not-json")
#expect(HermesGatewayListService.parse(json) == nil)
}
@Test func returnsEmptySnapshotOnEmptyProfilesArray() {
let json = data(#"{"profiles":[]}"#)
let snap = HermesGatewayListService.parse(json)
#expect(snap?.profiles.isEmpty == true)
}
@Test func toleratesUnknownKeys() {
// Forward-compat: a future v0.13.x Hermes adds extra fields, parser
// still works.
let json = data(#"""
{"profiles":[{"name":"default","running":true,"platforms":["slack"],
"future_field":"value","another":42}]}
"""#)
let snap = HermesGatewayListService.parse(json)
#expect(snap?.profiles[0].profile == "default")
}
// MARK: - headerDigest
@Test func headerDigestEmptyProfiles() {
let snap = GatewayListSnapshot(profiles: [])
#expect(snap.headerDigest == "no profiles configured")
}
@Test func headerDigestSingleProfileRunning() {
let snap = GatewayListSnapshot(profiles: [
.init(profile: "default", isRunning: true, pid: 100,
platforms: ["slack", "telegram"])
])
#expect(snap.headerDigest == "default profile · running · slack, telegram")
}
@Test func headerDigestSingleProfileStopped() {
let snap = GatewayListSnapshot(profiles: [
.init(profile: "default", isRunning: false, pid: nil, platforms: [])
])
#expect(snap.headerDigest == "default profile · stopped")
}
@Test func headerDigestMultipleProfilesSomeRunning() {
let snap = GatewayListSnapshot(profiles: [
.init(profile: "work", isRunning: true, pid: 1, platforms: ["slack"]),
.init(profile: "home", isRunning: false, pid: nil, platforms: ["matrix"]),
.init(profile: "extra", isRunning: true, pid: 2, platforms: [])
])
// 3 profiles total, 2 running; surface the first running profile's
// platform list as the highlight.
#expect(snap.headerDigest == "3 profiles (2 running) · work: slack")
}
@Test func headerDigestMultipleProfilesNoneRunning() {
let snap = GatewayListSnapshot(profiles: [
.init(profile: "a", isRunning: false, pid: nil, platforms: ["slack"]),
.init(profile: "b", isRunning: false, pid: nil, platforms: ["matrix"])
])
// No running profile: fall back to the first profile's platforms.
#expect(snap.headerDigest == "2 profiles (0 running) · a: slack")
}
}
@@ -0,0 +1,119 @@
import Testing
import Foundation
@testable import ScarfCore
/// Exercises the `SCARF_HERMES_HOME` test-mode override on `HermesProfileResolver`.
/// The override is the seam every E2E test relies on; without it, tests would
/// touch the user's real `~/.hermes`. Serialized because we mutate process-wide
/// environment.
///
/// **Marker file requirement.** As of v2.8 the override only activates when the
/// path contains the sentinel `HermesProfileResolver.testHomeMarkerFilename`.
/// Tests that want the override active drop the marker before `setenv`. Tests
/// that want to verify the override is rejected (relative path, missing
/// marker, empty value) skip the marker. The hardening prevents a leaked env
/// var from ever pivoting Scarf off the user's real `~/.hermes`.
@Suite(.serialized)
struct HermesProfileResolverOverrideTests {
private static let envKey = "SCARF_HERMES_HOME"
@Test func absoluteOverrideTakesPrecedenceWhenMarkerPresent() throws {
let saved = ProcessInfo.processInfo.environment[Self.envKey]
defer { restore(saved) }
let tmp = NSTemporaryDirectory().appending("scarf-test-home-\(UUID().uuidString)")
try FileManager.default.createDirectory(atPath: tmp, withIntermediateDirectories: true)
try Data().write(to: URL(fileURLWithPath: tmp + "/" + HermesProfileResolver.testHomeMarkerFilename))
defer { try? FileManager.default.removeItem(atPath: tmp) }
setenv(Self.envKey, tmp, 1)
#expect(HermesProfileResolver.resolveLocalHome() == tmp)
#expect(HermesProfileResolver.activeProfileName() == "test-override")
}
@Test func overrideIsIgnoredWhenMarkerMissing() throws {
let saved = ProcessInfo.processInfo.environment[Self.envKey]
defer { restore(saved) }
// Real-looking dir, no marker: exactly the shape a leaked env
// var or misconfigured launchctl plist would produce. Must NOT
// override; must fall through to the real resolver.
let tmp = NSTemporaryDirectory().appending("scarf-no-marker-\(UUID().uuidString)")
try FileManager.default.createDirectory(atPath: tmp, withIntermediateDirectories: true)
defer { try? FileManager.default.removeItem(atPath: tmp) }
setenv(Self.envKey, tmp, 1)
HermesProfileResolver.invalidateCache()
let resolved = HermesProfileResolver.resolveLocalHome()
#expect(resolved != tmp)
#expect(resolved.hasSuffix("/.hermes") || resolved.contains("/.hermes/profiles/"))
}
@Test func emptyOverrideFallsThrough() {
let saved = ProcessInfo.processInfo.environment[Self.envKey]
defer { restore(saved) }
setenv(Self.envKey, "", 1)
HermesProfileResolver.invalidateCache()
let resolved = HermesProfileResolver.resolveLocalHome()
#expect(!resolved.isEmpty)
#expect(resolved.hasSuffix("/.hermes") || resolved.contains("/.hermes/profiles/"))
}
@Test func relativeOverrideIsRejected() {
let saved = ProcessInfo.processInfo.environment[Self.envKey]
defer { restore(saved) }
setenv(Self.envKey, "relative/path", 1)
HermesProfileResolver.invalidateCache()
let resolved = HermesProfileResolver.resolveLocalHome()
#expect(!resolved.hasSuffix("relative/path"))
}
@Test func unsetOverrideUsesProfileResolver() {
let saved = ProcessInfo.processInfo.environment[Self.envKey]
defer { restore(saved) }
unsetenv(Self.envKey)
HermesProfileResolver.invalidateCache()
let resolved = HermesProfileResolver.resolveLocalHome()
#expect(!resolved.isEmpty)
}
@Test func overrideBypassesCacheWhenMarkerPresent() throws {
let saved = ProcessInfo.processInfo.environment[Self.envKey]
defer { restore(saved) }
let first = NSTemporaryDirectory().appending("scarf-cache-bypass-1-\(UUID().uuidString)")
let second = NSTemporaryDirectory().appending("scarf-cache-bypass-2-\(UUID().uuidString)")
try FileManager.default.createDirectory(atPath: first, withIntermediateDirectories: true)
try FileManager.default.createDirectory(atPath: second, withIntermediateDirectories: true)
try Data().write(to: URL(fileURLWithPath: first + "/" + HermesProfileResolver.testHomeMarkerFilename))
try Data().write(to: URL(fileURLWithPath: second + "/" + HermesProfileResolver.testHomeMarkerFilename))
defer {
try? FileManager.default.removeItem(atPath: first)
try? FileManager.default.removeItem(atPath: second)
}
setenv(Self.envKey, first, 1)
#expect(HermesProfileResolver.resolveLocalHome() == first)
// Flip env var without invalidating the cache. Override is read
// fresh on every call, so the new value takes effect immediately.
setenv(Self.envKey, second, 1)
#expect(HermesProfileResolver.resolveLocalHome() == second)
}
private func restore(_ saved: String?) {
if let saved {
setenv(Self.envKey, saved, 1)
} else {
unsetenv(Self.envKey)
}
HermesProfileResolver.invalidateCache()
}
}
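// A minimal sketch of the marker-gated override described in the suite's doc
// comment, assuming the resolver checks the env var, then the path shape, then
// the marker file. `MarkerGatedOverrideSketch` and its marker filename are
// hypothetical; the real logic lives in `HermesProfileResolver` and may differ.
enum MarkerGatedOverrideSketch {
    /// Hypothetical stand-in for `HermesProfileResolver.testHomeMarkerFilename`.
    static let markerFilename = ".scarf-test-home"

    /// Returns the override home, or nil when the caller should fall through
    /// to normal profile resolution.
    static func overrideHome(
        environment: [String: String],
        fileExists: (String) -> Bool = { FileManager.default.fileExists(atPath: $0) }
    ) -> String? {
        // Unset or empty value: no override.
        guard let raw = environment["SCARF_HERMES_HOME"], !raw.isEmpty else { return nil }
        // Relative paths are rejected outright.
        guard raw.hasPrefix("/") else { return nil }
        // A leaked env var pointing at an ordinary directory has no marker.
        let marker = (raw as NSString).appendingPathComponent(markerFilename)
        return fileExists(marker) ? raw : nil
    }
}
// Because nothing here is cached, a sketch like this would also pick up a
// changed env var on the next call, which is the behavior the cache-bypass
// test above pins.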
@@ -0,0 +1,522 @@
import Testing
import Foundation
@testable import ScarfCore
/// Pure-logic tests for the v2.7.5 Kanban model layer. The actor-based
/// `KanbanService` is exercised separately under integration tests
/// since it spawns `hermes kanban ` subprocesses; this suite covers
/// the wire-shape contracts and the synchronous transition planner.
@Suite struct KanbanModelsTests {
// MARK: - HermesKanbanTask decoding
@Test func decodeListRow() throws {
let json = """
{
"id": "t_9f2a",
"title": "Investigate flaky test",
"body": "Repro on CI but not local.",
"assignee": "researcher",
"status": "running",
"priority": 50,
"tenant": "scarf:demo",
"workspace_kind": "scratch",
"workspace_path": "/Users/alan/.hermes/kanban/workspaces/t_9f2a",
"created_by": "user",
"created_at": "2026-05-06T12:00:00Z",
"started_at": "2026-05-06T12:01:00Z",
"skills": ["debugging"],
"idempotency_key": "abc",
"last_heartbeat_at": "2026-05-06T12:05:00Z",
"max_runtime_seconds": 1800,
"current_run_id": 1
}
"""
let task = try JSONDecoder().decode(HermesKanbanTask.self, from: Data(json.utf8))
#expect(task.id == "t_9f2a")
#expect(task.assignee == "researcher")
#expect(task.status == "running")
#expect(task.tenant == "scarf:demo")
#expect(task.workspaceKind == "scratch")
#expect(task.skills == ["debugging"])
#expect(task.idempotencyKey == "abc")
#expect(task.maxRuntimeSeconds == 1800)
#expect(task.currentRunId == 1)
}
// MARK: - Assignee table parsing
//
// `hermes kanban assignees` prints either a JSON array (when
// `--json` is honored) OR a Rich-style human table OR an
// empty-state sentinel that starts with "(no assignees" and suggests
// `hermes -p <name> setup`. The first iteration of the parser
// tokenized the sentinel and emitted `(no` as a profile name,
// which surfaced in the Mac inspector's assignee dropdown.
// MARK: - LocalTransport subprocess environment
@Test func localTransportSubprocessEnvIncludesExecutableDir() {
// GUI-launched Scarf would otherwise hand subprocesses
// `/usr/bin:/bin:/usr/sbin:/sbin`, which doesn't include
// `~/.local/bin`, so when Hermes's kanban dispatcher
// spawns a worker by bare name, it fails with
// `executable not found on PATH` and the run records
// `outcome=spawn_failed`. Unblock by always making sure
// the directory of the executable we're launching is on
// PATH for the child.
let previous = LocalTransport.environmentEnricher
defer { LocalTransport.environmentEnricher = previous }
LocalTransport.environmentEnricher = nil
let env = LocalTransport.subprocessEnvironment(
forExecutable: "/Users/alanwizemann/.local/bin/hermes"
)
let path = env["PATH"] ?? ""
#expect(path.contains("/Users/alanwizemann/.local/bin"))
}
@Test func localTransportSubprocessEnvLetsEnricherWinPATH() {
let previous = LocalTransport.environmentEnricher
defer { LocalTransport.environmentEnricher = previous }
LocalTransport.environmentEnricher = {
// Simulate a login-shell probe returning a fuller PATH +
// some credential env. The enricher's PATH must override
// the GUI-process PATH.
return [
"PATH": "/opt/homebrew/bin:/usr/local/bin:/Users/me/.local/bin",
"ANTHROPIC_API_KEY": "sk-test-fake"
]
}
let env = LocalTransport.subprocessEnvironment(
forExecutable: "/usr/local/bin/hermes"
)
// Enricher's PATH wins (PATH is the whole point of running it).
#expect(env["PATH"]?.contains("/opt/homebrew/bin") == true)
// Credential env is forwarded (process env didn't have it).
#expect(env["ANTHROPIC_API_KEY"] == "sk-test-fake")
}
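// A minimal sketch of the PATH assembly the two tests above pin, assuming the
// enricher's values override the process environment and the executable's
// directory is prepended when missing. `SubprocessPathSketch` is hypothetical;
// whether `LocalTransport` prepends or appends is not shown in this diff.
enum SubprocessPathSketch {
    static func environment(
        base: [String: String],
        enriched: [String: String]?,
        executable: String
    ) -> [String: String] {
        var env = base
        // Enricher keys (PATH, credentials) win over the GUI-process values.
        if let enriched { env.merge(enriched) { _, new in new } }
        // Make sure the launched binary's own directory is reachable by bare name.
        let dir = (executable as NSString).deletingLastPathComponent
        var entries = (env["PATH"] ?? "").split(separator: ":").map(String.init)
        if !dir.isEmpty, !entries.contains(dir) { entries.insert(dir, at: 0) }
        env["PATH"] = entries.joined(separator: ":")
        return env
    }
}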
@Test func parseAssigneeTableSkipsNoAssigneesSentinel() {
// The table parser is private to KanbanService, so it can't be
// called directly, and a JSON fixture would bypass it entirely.
// This test pins the visible contract (no garbage profile names
// appear in the picker) by checking the parser's token validation
// against the real CLI empty-state sentinel.
let fixture = "(no assignees — create a profile with `hermes -p <name> setup`)"
// Through the public surface: we know `KanbanService.assignees`
// would consume this stdout when --json fails. The validator
// we care about is the regex check; reproduce inline:
let pattern = "^[a-zA-Z0-9_-]+$"
let firstToken = fixture
.split(whereSeparator: { $0 == "\t" || $0 == " " })
.first.map(String.init) ?? ""
// Confirms the parser's regex would reject "(no".
#expect(firstToken.range(of: pattern, options: .regularExpression) == nil)
}
@Test func decodeUnixIntegerTimestamps() throws {
// Real `hermes kanban create --json` output uses Unix integer
// seconds for created_at / started_at, because its SQLite columns are
// INTEGER. The decoder must normalize them into ISO-8601 strings
// so downstream code works with one type.
let json = """
{
"id": "t_2a0be199",
"title": "smoke",
"status": "ready",
"priority": 50,
"created_at": 1778160614,
"started_at": null,
"skills": []
}
"""
let task = try JSONDecoder().decode(HermesKanbanTask.self, from: Data(json.utf8))
#expect(task.id == "t_2a0be199")
// Should have been converted from Unix int to an ISO-8601 string;
// the exact format is platform-stable.
#expect(task.createdAt?.contains("2026") == true)
#expect(task.startedAt == nil)
}
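// A minimal sketch of the int-or-string timestamp normalization this test
// describes, assuming a single-value wrapper type. `FlexibleTimestampSketch`
// is hypothetical; how `HermesKanbanTask` does this internally is not shown
// in this diff.
struct FlexibleTimestampSketch: Decodable {
    /// ISO-8601 text regardless of the wire representation, or nil for JSON null.
    let iso8601: String?

    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        if container.decodeNil() {
            iso8601 = nil
        } else if let seconds = try? container.decode(Double.self) {
            // Unix seconds (SQLite INTEGER columns) become an ISO-8601 string.
            iso8601 = ISO8601DateFormatter().string(from: Date(timeIntervalSince1970: seconds))
        } else {
            // Already a string: pass it through unchanged.
            iso8601 = try container.decode(String.self)
        }
    }
}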
@Test func decodeMissingOptionalsBecomesNil() throws {
// Hermes emits a minimal task object when many fields are
// absent; the decoder must tolerate it.
let json = """
{ "id": "t_x", "title": "ok", "status": "todo" }
"""
let task = try JSONDecoder().decode(HermesKanbanTask.self, from: Data(json.utf8))
#expect(task.id == "t_x")
#expect(task.assignee == nil)
#expect(task.priority == nil)
#expect(task.tenant == nil)
#expect(task.skills.isEmpty)
}
// MARK: - Status / column projection
@Test func statusToColumnMapping() {
#expect(KanbanStatus.from("triage").boardColumn == .triage)
#expect(KanbanStatus.from("todo").boardColumn == .upNext)
#expect(KanbanStatus.from("ready").boardColumn == .upNext)
#expect(KanbanStatus.from("running").boardColumn == .running)
#expect(KanbanStatus.from("blocked").boardColumn == .blocked)
#expect(KanbanStatus.from("done").boardColumn == .done)
#expect(KanbanStatus.from("archived").boardColumn == .archived)
#expect(KanbanStatus.from("WHATEVER").boardColumn == .upNext) // unknown falls back to upNext
}
// MARK: - KanbanCreateRequest argv assembly
@Test func createRequestArgvIncludesAllFields() {
let req = KanbanCreateRequest(
title: "Translate doc",
body: "Spanish, please",
assignee: "researcher",
parentIds: ["t_parent"],
workspace: .directory("/tmp/proj"),
tenant: "scarf:demo",
priority: 75,
triage: true,
idempotencyKey: "key-1",
maxRuntimeSeconds: 1800,
createdBy: "alan",
skills: ["translation", "github-code-review"]
)
let argv = req.argv()
#expect(argv.contains("--body"))
#expect(argv.contains("--assignee"))
#expect(argv.contains("--parent"))
#expect(argv.contains("--workspace"))
#expect(argv.contains("dir:/tmp/proj"))
#expect(argv.contains("--tenant"))
#expect(argv.contains("scarf:demo"))
#expect(argv.contains("--priority"))
#expect(argv.contains("75"))
#expect(argv.contains("--triage"))
#expect(argv.contains("--idempotency-key"))
#expect(argv.contains("--max-runtime"))
#expect(argv.contains("--created-by"))
#expect(argv.contains("--skill"))
#expect(argv.last == "Translate doc") // positional title is last
#expect(argv.contains("--json"))
}
@Test func createRequestArgvOmitsAbsent() {
let req = KanbanCreateRequest(title: "minimal")
let argv = req.argv()
#expect(argv.contains("--json"))
#expect(argv.last == "minimal")
#expect(!argv.contains("--body"))
#expect(!argv.contains("--assignee"))
#expect(!argv.contains("--triage"))
}
// MARK: - KanbanListFilter argv
@Test func listFilterEmptyOnlyJSON() {
let argv = KanbanListFilter.all.argv()
#expect(argv == ["--json"])
}
@Test func listFilterStatusFlag() {
let argv = KanbanListFilter(status: .running).argv()
#expect(argv.contains("--status"))
#expect(argv.contains("running"))
}
@Test func listFilterTenantPasses() {
let argv = KanbanListFilter(tenant: "scarf:demo").argv()
#expect(argv.contains("--tenant"))
#expect(argv.contains("scarf:demo"))
}
@Test func listFilterArchivedAndMine() {
let argv = KanbanListFilter(includeArchived: true, mineOnly: true).argv()
#expect(argv.contains("--mine"))
#expect(argv.contains("--archived"))
}
// MARK: - Transition planning
@Test func planUpNextToRunningDispatches() throws {
// `dispatch`, not `claim`. See KanbanTransitionStep doc for the
// rationale: claim doesn't spawn a worker; the dispatcher does.
let plan = try KanbanService.plan(
for: KanbanTransition(from: .upNext, to: .running)
)
#expect(plan.steps == [.dispatch])
}
@Test func planRunningToBlockedRequiresReason() throws {
let plan = try KanbanService.plan(
for: KanbanTransition(from: .running, to: .blocked)
)
#expect(plan.requiresBlockReason)
}
@Test func planBlockedToRunningChainsTwoVerbs() throws {
let plan = try KanbanService.plan(
for: KanbanTransition(from: .blocked, to: .running)
)
// unblock then dispatch
#expect(plan.steps.count == 2)
if case .unblock = plan.steps.first {} else {
Issue.record("expected first step .unblock, got \(plan.steps)")
}
if case .dispatch = plan.steps.last {} else {
Issue.record("expected last step .dispatch, got \(plan.steps)")
}
}
@Test func planDoneToAnythingForbidden() {
do {
_ = try KanbanService.plan(
for: KanbanTransition(from: .done, to: .upNext)
)
Issue.record("expected error")
} catch let err as KanbanError {
if case .forbiddenTransition = err {
// ok
} else {
Issue.record("wrong error: \(err)")
}
} catch {
Issue.record("unexpected error: \(error)")
}
}
@Test func planTriageToUpNextForbidden() {
do {
_ = try KanbanService.plan(
for: KanbanTransition(from: .triage, to: .upNext)
)
Issue.record("expected error")
} catch let err as KanbanError {
if case .forbiddenTransition = err {
// ok
} else {
Issue.record("wrong error: \(err)")
}
} catch {
Issue.record("unexpected error: \(error)")
}
}
@Test func planNoOpProducesEmptyPlan() throws {
let plan = try KanbanService.plan(
for: KanbanTransition(from: .running, to: .running)
)
#expect(plan.steps.isEmpty)
}
// MARK: - Stats glance
@Test func glanceStringJoinsNonEmptyBuckets() {
let stats = HermesKanbanStats(
byStatus: ["todo": 12, "running": 3, "blocked": 5, "done": 0]
)
#expect(stats.glanceString == "12 todo · 3 running · 5 blocked")
#expect(stats.activeCount == 12 + 3 + 5)
}
@Test func glanceStringEmptyWhenZero() {
let stats = HermesKanbanStats(byStatus: [:])
#expect(stats.glanceString.isEmpty)
#expect(stats.activeCount == 0)
}
// MARK: - v0.13 (Hermes 2026.5.7) tolerant decode
//
// The contract these tests pin: a v0.13 host's task / run / detail
// JSON decodes successfully WITH the new fields populated, AND a
// pre-v0.13 (v0.12) host's task / run / detail JSON decodes
// successfully WITHOUT the new fields (everything resolves to nil
// or empty). Drift from this pair = a regression that bites every
// user not yet on Hermes v0.13.
@Test func decodeV013TaskFields() throws {
let json = """
{
"id": "t_v013",
"title": "v0.13 task",
"status": "blocked",
"max_retries": 5,
"auto_blocked_reason": "worker exited without `kanban complete`",
"hallucination_gate_status": "pending",
"diagnostics": [
{"kind": "worker_exit_no_complete", "message": "exit code 0 with no complete call", "detected_at": 1778160614},
{"kind": "darwin_zombie_detected", "detected_at": "2026-05-09T12:00:00Z"}
]
}
"""
let task = try JSONDecoder().decode(HermesKanbanTask.self, from: Data(json.utf8))
#expect(task.maxRetries == 5)
#expect(task.autoBlockedReason?.contains("kanban complete") == true)
#expect(task.hallucinationGateStatus == "pending")
#expect(task.diagnostics.count == 2)
#expect(task.diagnostics.first?.kind == "worker_exit_no_complete")
#expect(task.diagnostics.last?.detectedAt?.contains("2026") == true)
}
@Test func decodeV012TaskHasNoNewFields() throws {
// The most damaging failure mode is a v0.12 user upgrading Scarf
// and having the board stop loading because a v0.13-only field
// is required. Pin the contract.
let json = """
{"id": "t_legacy", "title": "v0.12 task", "status": "ready"}
"""
let task = try JSONDecoder().decode(HermesKanbanTask.self, from: Data(json.utf8))
#expect(task.maxRetries == nil)
#expect(task.autoBlockedReason == nil)
#expect(task.hallucinationGateStatus == nil)
#expect(task.diagnostics.isEmpty)
}
@Test func decodeMalformedDiagnosticTolerated() throws {
// If Hermes emits a malformed diagnostics value, the rest of the
// task should still decode. We use try? on the diagnostics decode
// so a single bad entry doesn't reject the whole row.
let json = """
{
"id": "t_x",
"title": "x",
"status": "ready",
"diagnostics": "not-an-array"
}
"""
let task = try JSONDecoder().decode(HermesKanbanTask.self, from: Data(json.utf8))
#expect(task.id == "t_x")
// Diagnostics field couldn't decode; treat as empty.
#expect(task.diagnostics.isEmpty)
}
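// A minimal sketch of the tolerant decode this test and the v0.12 test pin,
// assuming a hand-written `init(from:)` that wraps the optional field in
// `try?`. `TolerantTaskSketch` is hypothetical and far smaller than
// `HermesKanbanTask`.
struct TolerantTaskSketch: Decodable {
    struct Diagnostic: Decodable { let kind: String }

    let id: String
    let diagnostics: [Diagnostic]

    private enum CodingKeys: String, CodingKey { case id, diagnostics }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        // Required field: a missing id should still fail the row.
        id = try container.decode(String.self, forKey: .id)
        // Missing key (older host) or wrong shape both collapse to an empty list.
        diagnostics = (try? container.decode([Diagnostic].self, forKey: .diagnostics)) ?? []
    }
}
// Note that in this sketch one undecodable entry empties the whole list;
// keeping the valid entries would need per-element decoding.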
@Test func hallucinationGateMirrorMapsKnownValues() {
#expect(KanbanHallucinationGate.from("pending") == .pending)
#expect(KanbanHallucinationGate.from("verified") == .verified)
#expect(KanbanHallucinationGate.from("REJECTED") == .rejected) // case-insensitive
#expect(KanbanHallucinationGate.from(nil) == nil)
#expect(KanbanHallucinationGate.from("") == nil)
// Unknown wire values fall through to nil so the banner stays
// hidden; future Hermes versions can add `quarantined` etc.
// without a Scarf release.
#expect(KanbanHallucinationGate.from("quarantined") == nil)
}
@Test func diagnosticKindMirrorMapsKnownValues() {
#expect(KanbanDiagnosticKind.from("heartbeat_stalled") == .heartbeatStalled)
#expect(KanbanDiagnosticKind.from("DARWIN_ZOMBIE_DETECTED") == .darwinZombieDetected)
// Unknown kinds fall through to .unknown so views can render
// the raw string verbatim.
#expect(KanbanDiagnosticKind.from("future_kind_v014") == .unknown)
}
@Test func diagnosticSeverityMapping() {
#expect(KanbanDiagnosticKind.retryCapHit.severity == .danger)
#expect(KanbanDiagnosticKind.darwinZombieDetected.severity == .danger)
#expect(KanbanDiagnosticKind.heartbeatStalled.severity == .warning)
#expect(KanbanDiagnosticKind.workerExitNoComplete.severity == .warning)
#expect(KanbanDiagnosticKind.unknown.severity == .neutral)
}
@Test func createRequestArgvIncludesMaxRetries() {
let req = KanbanCreateRequest(title: "t", maxRetries: 5)
let argv = req.argv()
#expect(argv.contains("--max-retries"))
#expect(argv.contains("5"))
}
@Test func createRequestArgvOmitsMaxRetriesWhenAbsent() {
let req = KanbanCreateRequest(title: "t")
let argv = req.argv()
#expect(!argv.contains("--max-retries"))
}
@Test func decodeRunWithDiagnostics() throws {
let json = """
{
"id": 1,
"task_id": "t_x",
"status": "failed",
"started_at": 1778160000,
"ended_at": 1778160300,
"outcome": "crashed",
"error": "OOM",
"diagnostics": [
{"kind": "retry_cap_hit", "message": "3/3 retries exhausted"}
],
"failure_count": 3
}
"""
let run = try JSONDecoder().decode(HermesKanbanRun.self, from: Data(json.utf8))
#expect(run.diagnostics.count == 1)
#expect(run.diagnostics.first?.kind == "retry_cap_hit")
#expect(run.failureCount == 3)
}
@Test func decodeRunWithoutDiagnostics() throws {
// v0.12 run row: no diagnostics, no failure_count, must still
// decode cleanly.
let json = """
{"id": 1, "task_id": "t_x", "status": "running", "started_at": 1778160000}
"""
let run = try JSONDecoder().decode(HermesKanbanRun.self, from: Data(json.utf8))
#expect(run.diagnostics.isEmpty)
#expect(run.failureCount == nil)
}
@Test func taskDetailMergesEnvelopeAndTaskDiagnostics() throws {
// Hermes's wire shape may put diagnostics on the task envelope OR
// on the inner task. `allDiagnostics` dedupes by (kind, detected_at)
// so a server emitting both sides doesn't surface dupes.
let json = """
{
"task": {
"id": "t_y",
"title": "y",
"status": "blocked",
"diagnostics": [
{"kind": "heartbeat_stalled", "detected_at": "2026-05-09T12:00:00Z"}
]
},
"comments": [],
"events": [],
"diagnostics": [
{"kind": "heartbeat_stalled", "detected_at": "2026-05-09T12:00:00Z"},
{"kind": "retry_cap_hit"}
]
}
"""
let detail = try JSONDecoder().decode(HermesKanbanTaskDetail.self, from: Data(json.utf8))
let merged = detail.allDiagnostics
#expect(merged.count == 2)
#expect(merged.contains(where: { $0.kind == "heartbeat_stalled" }))
#expect(merged.contains(where: { $0.kind == "retry_cap_hit" }))
}
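// A minimal sketch of the dedupe-by-(kind, detected_at) merge described in the
// test above, assuming task-level entries are kept ahead of envelope entries.
// `DiagnosticMergeSketch` and its types are hypothetical; the ordering used by
// `allDiagnostics` is not shown in this diff.
enum DiagnosticMergeSketch {
    struct Entry: Equatable {
        let kind: String
        let message: String?
        let detectedAt: String?
    }

    /// Identity used for deduplication; `message` is deliberately excluded.
    struct Key: Hashable {
        let kind: String
        let detectedAt: String?
    }

    static func merge(task: [Entry], envelope: [Entry]) -> [Entry] {
        var seen = Set<Key>()
        // Keep the first occurrence of each key, preserving input order.
        return (task + envelope).filter { entry in
            seen.insert(Key(kind: entry.kind, detectedAt: entry.detectedAt)).inserted
        }
    }
}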
@Test func taskDetailWithoutEnvelopeDiagnosticsDecodes() throws {
// Pre-v0.13 task detail: no envelope diagnostics. Must decode.
let json = """
{
"task": {"id": "t_z", "title": "z", "status": "ready"},
"comments": [],
"events": []
}
"""
let detail = try JSONDecoder().decode(HermesKanbanTaskDetail.self, from: Data(json.utf8))
#expect(detail.envelopeDiagnostics == nil)
#expect(detail.allDiagnostics.isEmpty)
}
@Test func diagnosticDecodesUnixTimestamp() throws {
let json = """
{"kind": "spawn_failure", "detected_at": 1778160614}
"""
let diag = try JSONDecoder().decode(HermesKanbanDiagnostic.self, from: Data(json.utf8))
#expect(diag.kind == "spawn_failure")
// Decoder normalizes Unix int to ISO-8601 string.
#expect(diag.detectedAt?.contains("2026") == true)
}
}

Some files were not shown because too many files have changed in this diff