From 37266f3efc07b9831f51b9a381bf6f0ef38322ae Mon Sep 17 00:00:00 2001
From: Alan Wizemann
Date: Tue, 5 May 2026 19:42:56 +0200
Subject: [PATCH] docs(perf): document v2.8 skeleton-then-hydrate + SSH cancellation patterns

---
 Architecture-Overview.md  |  2 +-
 Chat.md                   | 21 +++++++++++++++++++++
 Dashboard.md              |  6 +++++-
 Insights-and-Activity.md  |  9 +++++++++
 Performance-Monitoring.md | 18 ++++++++++++++++++
 Project-Templates.md      | 12 +++++++++++-
 Projects-and-Profiles.md  | 31 ++++++++++++++++++++-----------
 7 files changed, 85 insertions(+), 14 deletions(-)

diff --git a/Architecture-Overview.md b/Architecture-Overview.md
index 127268d..c2164c7 100644
--- a/Architecture-Overview.md
+++ b/Architecture-Overview.md
@@ -64,7 +64,7 @@ The Rich Chat surface speaks the Hermes Agent Client Protocol (ACP) — a JSON-R

 ## File watching

-`HermesFileWatcher` reacts to changes under `~/.hermes/` so the Dashboard, Sessions browser, Activity feed, and Memory viewer refresh without manual reload. Local windows use FSEvents (`DispatchSourceFileSystemObject`); remote windows use mtime polling tunneled over the SSH ControlMaster.
+`HermesFileWatcher` reacts to changes under `~/.hermes/` so the Dashboard, Sessions browser, Activity feed, and Memory viewer refresh without manual reload. Local windows use FSEvents (`DispatchSourceFileSystemObject`); remote windows use mtime polling tunneled over the SSH ControlMaster. **v2.7+** also watches each registered project's `/.scarf/` directory (both local FSEvents and remote polling), so file-reading dashboard widgets (`markdown_file`, `log_tail`, `image`) refresh automatically when the cron job updates their underlying file.

 ## Updates

diff --git a/Chat.md b/Chat.md
index 3426957..4d15c7e 100644
--- a/Chat.md
+++ b/Chat.md
@@ -89,6 +89,27 @@ ScarfGo now survives phone-sleep, network handoffs, and SSH socket drops without

 `HermesDataService.fetchMessages(sessionId:limit:before:)` paginates by id desc with centralized `HistoryPageSize` constants.
`RichChatViewModel.loadEarlier()` walks back through long sessions via `oldestLoadedMessageID` + `hasMoreHistory`. Pre-fix the message fetch was unbounded — sessions with thousands of messages were doing a full-history load on every reconnect.

+## Skeleton-then-hydrate chat loader _(v2.8+, Mac)_
+
+Resuming a chat on a slow remote used to fetch every column the bubble might need (`content` + `tool_calls` JSON + `reasoning_content`) in one shot, which routinely tripped the 30s SSH timeout on chats with multi-page tool result blobs. v2.8 splits the load into two phases:
+
+1. **Skeleton.** `fetchSkeletonMessages` selects only user + assistant rows (skips `role='tool'`) with `tool_calls`/`reasoning`/`reasoning_content` hard-NULLed at the SQL level. Wire payload bounded by conversational text alone — typically a few KB. The chat appears in seconds.
+2. **Background hydrate.** `RichChatViewModel.startToolHydration()` pages through `hydrateAssistantToolCalls` in 5-id batches to splice tool-call JSON into the existing assistant messages. Tool-result CONTENT is opt-in (Settings → Display → "Load tool results in past chats", default off) — without it, tool call cards still render, and the inspector pane lazy-fetches per-result content via `fetchToolResult(callId:)` when you open it.
+
+The chat header surfaces "Loading tool details…" while hydration is in flight. If a 5-id batch trips the 30s timeout (an oversized `tool_calls` blob — long Edit args, big diffs), an L1 single-id retry isolates the whale so the rest of the batch still hydrates. The whale row stays bare; the assistant message is still readable.
+
+## Partial-result + chat error banners _(v2.8+, Mac)_
+
+When the skeleton fetch itself trips an SSH transport failure (rather than a clean empty result), the chat surfaces "Couldn't load full chat history — the connection to *server* timed out" through the existing `acpError` triplet so the user sees what happened instead of a silent empty transcript.
A separate banner detects `model.default` / `model.provider` mismatches in `config.yaml` (e.g. `model.default: anthropic/...` with `model.provider: nous` after switching OAuth providers via Credential Pools) and offers a one-click fix in either direction. The ACP error classifier also recognizes `model_not_found` / `404 messages` / `model is not available` and surfaces "This session was created with a model the provider no longer offers — start a new chat" so the pinned-model failure mode has a clear recovery path.
+
+## Loading-state UX during session boot _(v2.8+, Mac)_
+
+The Mac chat sidebar greys out and disables row taps the moment a session-switch is initiated (synchronously, before `client.start()` returns), with a floating ProgressView showing the current phase: "Spawning hermes acp…" → "Authenticating…" → "Loading session…" → "Loading history…" → "Ready". Pre-fix the sidebar looked engageable while the 5-7 second SSH+ACP boot was still in flight, and the user could queue up a second session-switch behind the first. The new gating prevents that contention.
+
+## SSH cancellation propagation _(v2.8+, Mac)_
+
+Cancelling a Swift Task used to leave the underlying ssh subprocess running for the full 30s SSH timeout — `Task.detached` doesn't inherit cancellation from the awaiting parent, so `proc.terminate()` was never called. This pinned remote sqlite queries and ControlMaster sessions when the user navigated away mid-load. v2.8 wires `withTaskCancellationHandler` through `SSHScriptRunner.run` and `RemoteSQLiteBackend.query`; cancellation now reaches the `Process` within ~100ms (the poll-loop interval). Fires `ssh.cancelled` in ScarfMon traces. Fixes the "third chat hangs" / "dashboard spins after rapid switching" symptom.
+
 ## iOS keyboard dismissal _(v2.5.1+)_

 Pre-fix the chat composer's `TextField` had no keyboard dismissal at all — the keyboard would rise and stick, hiding the system tab bar (which iOS auto-hides while a keyboard is up) and trapping users in the Chat tab. v2.5.1 adds two redundant dismissal paths:
diff --git a/Dashboard.md b/Dashboard.md
index 6f36fad..d726880 100644
--- a/Dashboard.md
+++ b/Dashboard.md
@@ -19,6 +19,10 @@ A scrolling stack of cards, refreshed automatically when `~/.hermes/state.db`, `

 Local windows watch `~/.hermes/` with FSEvents (`DispatchSourceFileSystemObject`). Remote windows poll mtimes every 3 seconds over the SSH ControlMaster. Either way, the Dashboard updates without needing a manual refresh — open it on a second monitor and watch sessions tick by as Hermes works.

+### Project dashboards (v2.7+)
+
+Per-project dashboards (the **Projects** sidebar item, separate from this system Dashboard) refresh on a project-wide watch: any change anywhere under `/.scarf/` triggers a reload, not just `dashboard.json` itself. So a `markdown_file` widget pointing at `reports/weekly.md` (placed under `.scarf/reports/`) refreshes automatically when the cron job rewrites it. The same coverage applies on remote SSH-attached projects via 3-second mtime polling on each project's `.scarf/` directory. _Limitation:_ in-place appends to an existing file (`>> file.log`) don't tick the watcher — write atomically (write-temp + rename) or `touch dashboard.json` after each cron run.
+
 ## Status pill in the toolbar

 Every Scarf window has a connection pill in the toolbar showing the bound server and its state:
@@ -49,4 +53,4 @@ A **Switch server** button in the iOS Dashboard's top-right corner (added v2.5)
 - [ScarfGo](ScarfGo) for the iOS Dashboard tour.

 ---
-_Last updated: 2026-04-25 — Scarf v2.5.0 (added iOS Dashboard cross-reference + Switch server button)_
+_Last updated: 2026-05-04 — Scarf v2.7 (project-wide auto-refresh on `.scarf/` directory)_
diff --git a/Insights-and-Activity.md b/Insights-and-Activity.md
index 0af54dd..595802c 100644
--- a/Insights-and-Activity.md
+++ b/Insights-and-Activity.md
@@ -41,6 +41,15 @@ The per-tool execution feed — what Hermes did, when, and with what arguments:
 - **Detail inspector** — pretty-printed arguments JSON, tool output (when available), `tool_call_id` for cross-referencing back to the Sessions message stream.
 - **Live refresh** — same `HermesFileWatcher`; the feed scrolls as Hermes works.

+### Skeleton-then-hydrate loader _(v2.8+)_
+
+Activity historically pulled the full message column set for the 200 most recent tool-call rows in one shot, which routinely tripped the 30s SSH timeout on remote contexts (the `tool_calls` JSON column for 200 rows = ~600KB-1MB on the wire). v2.8 splits the load:
+
+1. **Phase 1 — skeleton.** `fetchRecentToolCallSkeleton(limit: 50)` projects only `id` + `session_id` + `role` + `timestamp` (everything fat NULLed at the SQL level). Wire payload ≈ 3 KB. The day-grouped feed renders placeholder rows immediately.
+2. **Phase 2 — paged hydrate.** `hydrateAssistantToolCalls` runs in 5-id batches in the background via `startToolCallHydration()`. Each batch splices parsed `[HermesToolCall]` arrays into the existing skeleton; `filteredActivity` swaps the placeholder entry for the real per-call entries on the next observation tick. A "Loading tool details…" pill in the header surfaces hydration progress.
+
+When a 5-id batch trips the 30s timeout (an oversized `tool_calls` blob), an L1 single-id retry isolates the offending row so the rest of the batch still hydrates.
Transport-layer failures during the skeleton fetch surface an orange "Couldn't load activity" banner with a Retry button instead of the silent empty state pre-v2.8 left users staring at.
+
 ## Live data freshness

 All three views observe the file watcher and re-query when `state.db` changes. Remote windows pull a fresh atomic snapshot via `sqlite3 .backup` (deduped by `SnapshotCoordinator` so Dashboard + Insights + Sessions don't each spawn parallel backups). See [Transport Layer](Transport-Layer) for the snapshot mechanics.
diff --git a/Performance-Monitoring.md b/Performance-Monitoring.md
index 8d22e48..fe25746 100644
--- a/Performance-Monitoring.md
+++ b/Performance-Monitoring.md
@@ -64,12 +64,30 @@ For a full export, hit **Copy as JSON**. Each line is one sample with `category`
 | `chatStream` | `finalizeStreamingMessage` | Same | End-of-turn finalize cost (target: < 1 ms) |
 | `chatStream` | `ios.send` / `ios.startResuming` / `ios.acpEvent` / `ios.handleACPEvent` | iOS `ChatView.swift` | Same shape on iOS |
 | `sessionLoad` | `mac.startACPSession` / `ios.startResuming` | Both targets | Session boot cost |
+| `sessionLoad` | `mac.fetchSkeletonMessages` / `.rows` / `.transportError` | [HermesDataService](https://github.com/awizemann/scarf/tree/main/scarf/Packages/ScarfCore/Sources/ScarfCore/Services/HermesDataService.swift) | Phase 1 of v2.8 two-phase chat loader — user+assistant rows only, ~few KB on the wire regardless of tool_calls blob size |
+| `sessionLoad` | `mac.fetchToolCallSkeleton` / `.rows` / `.transportError` | Same | Phase L Activity skeleton fetch — metadata-only, ~3 KB for 50 rows |
+| `sessionLoad` | `mac.hydrateToolCalls` / `.rows` / `.cancelled` / `.pageTimeout` / `.singleTimeout` | Same | Phase 2a paged hydration.
`pageTimeout` → batch fell back to single-id retry; `singleTimeout` → individual whale row skipped |
+| `sessionLoad` | `mac.hydrateToolResults` / `mac.hydrateTools.skippedToolResults` / `.dropped` / `.complete` | Same | Phase 2b tool-result content hydrate. `skippedToolResults` fires when the opt-in setting is off (default); `dropped` fires when the parent task cancelled mid-page |
+| `sessionLoad` | `mac.lazyToolResult.fetched` | Same | Inspector pane lazy-fetched a single tool result on user expand |
+| `sessionLoad` | `mac.fetchMessages.transportError` | Same | Skeleton fetch tripped the SSH timeout — chat surfaces the partial-result banner |
+| `sessionLoad` | `mac.loadRecentSessions.coalesced` | [Mac ChatViewModel](https://github.com/awizemann/scarf/tree/main/scarf/scarf/Features/Chat/ViewModels/ChatViewModel.swift) | A second caller awaited the in-flight load instead of spawning a parallel SSH round-trip |
 | `sqlite` | `sqlite.query` / `sqlite.queryBatch` | [RemoteSQLiteBackend](https://github.com/awizemann/scarf/tree/main/scarf/Packages/ScarfCore/Sources/ScarfCore/Services/Backends/RemoteSQLiteBackend.swift) | Per-call latency over SSH (carries row count + stdout bytes) |
 | `transport` | `ssh.streamScript` (iOS) / `ssh.run` (Mac) | [CitadelServerTransport](https://github.com/awizemann/scarf/tree/main/scarf/Packages/ScarfIOS/Sources/ScarfIOS/CitadelServerTransport.swift), [SSHScriptRunner](https://github.com/awizemann/scarf/tree/main/scarf/Packages/ScarfCore/Sources/ScarfCore/Transport/SSHScriptRunner.swift) | SSH round-trip time |
+| `transport` | `ssh.cancelled` | [SSHScriptRunner](https://github.com/awizemann/scarf/tree/main/scarf/Packages/ScarfCore/Sources/ScarfCore/Transport/SSHScriptRunner.swift) | Parent task cancellation reached the ssh subprocess (v2.8) — terminated within 100ms instead of running to its 30s deadline |
 | `diskIO` | `loadConfig` / `loadCronJobs` |
[HermesFileService](https://github.com/awizemann/scarf/tree/main/scarf/scarf/Core/Services/HermesFileService.swift) | Hot disk reads. `loadConfig` also logs caller stack frames in Full mode |

 Adding a new measure point is two lines (see Developer Guide below).

+### v2.8 perf architecture: skeleton-then-hydrate + cancellation propagation
+
+Two patterns landed in v2.8 that anyone touching remote-context code should know about:
+
+**Skeleton-then-hydrate.** Heavy SSH fetches (`fetchMessages`, `fetchRecentToolCalls`) used to pull the FAT column set in one shot, which routinely tripped the 30s SSH timeout on chats with multi-page tool result blobs. The new pattern: a fast skeleton fetch projects only the columns needed to render placeholder rows (NULLs the heavy ones at the SQL level), then a paged background hydration fills the rest in. Used by chat-resume (`fetchSkeletonMessages` + `hydrateAssistantToolCalls`) and Activity (`fetchRecentToolCallSkeleton` + same hydrate). Pages run in 5-id batches; if a page times out, an L1 single-id retry isolates the whale so the rest of the batch still hydrates.
+
+**Cancellation propagation through SSH.** `Task.detached { … }` doesn't inherit cancellation from the awaiting parent, and `Task<…> { … }` (unstructured) also drops the signal. Without explicit bridging, cancelling a chat-load Task only unwinds Swift state — the underlying ssh subprocess kept running for the full 30s, pinning a remote sqlite query and a ControlMaster session slot. v2.8 wires `withTaskCancellationHandler` through `SSHScriptRunner` and `RemoteSQLiteBackend.query` so parent cancellation reaches the `Process` and calls `proc.terminate()` within 100ms. New `ssh.cancelled` event surfaces this.
+
+**In-flight coalescing.** `loadRecentSessions` (Mac chat sidebar) coalesces against an in-flight task. File-watcher deltas during streaming used to stack 2-3 parallel `loadRecentSessions` tasks; now subsequent callers await the active one.
New `mac.loadRecentSessions.coalesced` event tracks how often the dedup fires.
+
 ## Capture recipe for a useful baseline

 1. Build + run the latest version.
diff --git a/Project-Templates.md b/Project-Templates.md
index 8499b4f..e940974 100644
--- a/Project-Templates.md
+++ b/Project-Templates.md
@@ -184,6 +184,16 @@ Templates can ship project-scoped slash commands by listing each name in `conten

 `schemaVersion` bumps to **3** only when a bundle ships slash commands. v1 and v2 bundles continue to install identically (the installer accepts schemaVersion 1, 2, and 3). See [Slash Commands](Slash-Commands) for the file format and substitution rules.

+### v2.7 widget vocabulary expansion (no schema bump)
+
+Scarf v2.7 added five new widget types — `markdown_file`, `log_tail`, `cron_status`, `image`, `status_grid` — plus a `sparkline` field on `stat` and a typed status enum on `list` items. **None of these require a `schemaVersion` bump.** They're additive within `dashboard.json` itself, so:
+
+- v1, v2, v3 bundles that use only the original 7 widget types keep working byte-identically on v2.7+.
+- Bundles that adopt new widget types still validate against the existing manifest schema — only the catalog validator's vocabulary list ([`tools/widget-schema.json`](https://github.com/awizemann/scarf/blob/main/tools/widget-schema.json)) was extended.
+- A v2.7-authored dashboard installed into a pre-v2.7 Scarf renders unknown widgets as a clearly-labeled error card (not a crash), so forward-incompatibility degrades gracefully.
+
+See [Projects](Projects-and-Profiles) for the full widget catalog and the typed status badge synonyms.
+
 ### `AGENTS.md` contract

 `AGENTS.md` is the single source of truth for what the project does and how to operate it. It must:
@@ -285,4 +295,4 @@ Use `{{PROJECT_DIR}}` in the cron prompt. Hermes doesn't set a CWD for cron runs
 - [Release Notes Index](Release-Notes-Index) — v2.2.0 for the full launch notes.

 ---
-_Last updated: 2026-04-29 — Scarf v2.5.2 (remote-aware parent-directory pick on remote server contexts)_
+_Last updated: 2026-05-04 — Scarf v2.7 (5 new widget types; no manifest schema bump required)_
diff --git a/Projects-and-Profiles.md b/Projects-and-Profiles.md
index 04136c9..1ccc9c5 100644
--- a/Projects-and-Profiles.md
+++ b/Projects-and-Profiles.md
@@ -6,17 +6,26 @@ Two distinct concepts in adjacent sidebar items. **Projects** are agent-generate

 A project is any directory you tell Scarf about — typically a code repo, but anything works. Each project gets a custom dashboard composed of widgets defined in `/.scarf/dashboard.json`.

-**Widget types** (from [`ProjectDashboard.swift`](https://github.com/awizemann/scarf/blob/main/scarf/scarf/Core/Models/ProjectDashboard.swift)):
+**Widget types** (canonical vocabulary lives at [`tools/widget-schema.json`](https://github.com/awizemann/scarf/blob/main/tools/widget-schema.json); each type maps to a Swift view under [`scarf/scarf/Features/Projects/Views/Widgets/`](https://github.com/awizemann/scarf/tree/main/scarf/scarf/Features/Projects/Views/Widgets)):

-| Type | Purpose |
-|---|---|
-| `stat` | Single metric: value + label + optional icon and color. |
-| `progress` | Progress bar with label. |
-| `text` | Markdown / plain text block. |
-| `table` | Columns + rows. |
-| `chart` | Line / bar / area / pie with `ChartSeries[]` of `ChartDataPoint{x, y}`. |
-| `list` | Bulleted list with optional status badges. |
-| `webview` | Embedded web view (URL + height). |
+| Type | Since | Purpose |
+|---|---|---|
+| `stat` | v2.2 | Single metric: value + label + optional icon, color, and inline `sparkline: [Number]` trend (v2.7+). |
+| `progress` | v2.2 | Progress bar with label (0.0..1.0). |
+| `text` | v2.2 | Inline markdown / plain text block. |
+| `table` | v2.2 | Columns + rows of strings. |
+| `chart` | v2.2 | Line / bar / area / pie with `ChartSeries[]` of `ChartDataPoint{x, y}`.
|
+| `list` | v2.2 | Bulleted list with optional **typed** status badges per item — see Status badges below. |
+| `webview` | v2.2 | Embedded web view (URL + height). Including any webview also exposes a Site tab. |
+| `markdown_file` | v2.7 | Renders a markdown file from `/`. Refreshes when any file under `.scarf/` changes. |
+| `log_tail` | v2.7 | Tails the last `lines` of a file (default 20), monospaced; ANSI codes stripped. |
+| `cron_status` | v2.7 | Last run / next run / state for a Hermes cron job by `jobId`, plus a small log tail. |
+| `image` | v2.7 | Local image (`path` relative to project root) or remote `url`. |
+| `status_grid` | v2.7 | Compact NxM grid of colored cells, one per service / item. Reuses the typed status enum. |
+
+**Status badges (typed in v2.7):** `list` items and `status_grid` cells accept a `status` field that maps to a colored badge. Canonical values are `success`, `warning`, `danger`, `info`, `pending`, `done`, `neutral`. Common synonyms also work (`ok` / `up` → success, `down` / `error` / `failed` → danger, `active` → info, `complete` → done). Unknown strings render as plain text — old dashboards that used ad-hoc statuses keep working byte-identically.
+
+**Auto-refresh (v2.7):** Scarf watches each project's entire `/.scarf/` directory, not just `dashboard.json`. So when a cron job atomically writes `/.scarf/reports/uptime.md` (write-temp + rename), the `markdown_file` widget pointing at it refreshes automatically. _Limitation:_ in-place appends to an existing file (`>> file.log`) don't tick the watcher — the cron job should write atomically, or `touch dashboard.json` after each run to force a refresh. Remote (SSH-attached) projects share the same coverage via 3-second mtime polling.

 **The Hermes pattern:** ask your agent to build and maintain the dashboard for you. "Update `.scarf/dashboard.json` to show test pass rate, lines of code, and the open PR list." Scarf renders the result; the agent maintains it.
+
@@ -58,4 +67,4 @@ Remote SSH contexts don't yet auto-resolve `active_profile` — `HermesPathSet.d
 - [Settings](Gateway-Cron-Health-Logs) — exposes "Backup & Restore" buttons (`hermes backup` / `hermes import`) at the profile level.

 ---
-_Last updated: 2026-04-29 — Scarf v2.5.2 (remote-aware profile import/export sheets)_
+_Last updated: 2026-05-04 — Scarf v2.7 (project-wide auto-refresh, 5 new widget types, typed status enum)_
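
Reviewer note: the skeleton-then-hydrate loader this patch documents (a cheap first query with the heavy column NULLed, then 5-id hydration pages with a single-id retry on timeout) can be sketched in a few lines. This is a minimal illustration in Python with an in-memory SQLite database, not Scarf's Swift implementation. The table schema, the function names, and the `MAX_BYTES` check (a stand-in for the 30s SSH timeout) are all assumptions made for the example.

```python
import json
import sqlite3

BATCH_SIZE = 5       # page size named in the docs
MAX_BYTES = 1_000    # hypothetical stand-in for the 30s SSH timeout


def make_db():
    """Build an in-memory stand-in for state.db (schema is assumed)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE messages (id INTEGER PRIMARY KEY, role TEXT, "
        "content TEXT, tool_calls TEXT)"
    )
    rows = []
    for i in range(1, 13):
        if i % 2:
            rows.append((i, "user", f"question {i}", None))
        else:
            # id 8 is the "whale": an oversized tool_calls blob.
            arg = "x" * (5_000 if i == 8 else 10)
            blob = json.dumps([{"name": "edit", "args": arg}])
            rows.append((i, "assistant", f"answer {i}", blob))
    rows.append((13, "tool", "large tool result", None))
    db.executemany("INSERT INTO messages VALUES (?, ?, ?, ?)", rows)
    return db


def fetch_skeleton(db):
    """Phase 1: user + assistant text only; heavy column NULLed in SQL."""
    cur = db.execute(
        "SELECT id, role, content, NULL AS tool_calls FROM messages "
        "WHERE role IN ('user', 'assistant') ORDER BY id"
    )
    return {r[0]: {"role": r[1], "content": r[2], "tool_calls": r[3]}
            for r in cur}


def fetch_tool_calls(db, ids):
    """Fetch tool_calls for ids; raise TimeoutError if the page is too big."""
    marks = ",".join("?" * len(ids))
    rows = db.execute(
        f"SELECT id, tool_calls FROM messages WHERE id IN ({marks})", ids
    ).fetchall()
    if sum(len(blob or "") for _, blob in rows) > MAX_BYTES:
        raise TimeoutError("simulated transport timeout")
    return rows


def hydrate(db, messages):
    """Phase 2: page in batches; on timeout retry the ids one at a time."""
    ids = [i for i, m in messages.items() if m["role"] == "assistant"]
    skipped = []
    for start in range(0, len(ids), BATCH_SIZE):
        page = ids[start:start + BATCH_SIZE]
        try:
            rows = fetch_tool_calls(db, page)
        except TimeoutError:
            rows = []
            for one in page:  # single-id retry isolates the whale
                try:
                    rows += fetch_tool_calls(db, [one])
                except TimeoutError:
                    skipped.append(one)  # row stays bare but readable
        for msg_id, blob in rows:
            messages[msg_id]["tool_calls"] = json.loads(blob) if blob else []
    return skipped


db = make_db()
messages = fetch_skeleton(db)
skipped = hydrate(db, messages)
print(skipped)
```

In this sketch the first page contains the oversized row, so the batch fails once and is retried id by id. Only id 8 ends up in `skipped`; every other assistant message receives its parsed tool calls, and the `role='tool'` row never enters the skeleton.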