scarf

mirror of https://github.com/awizemann/scarf.git synced 2026-05-08 02:14:37 +00:00

Author	SHA1	Message	Date
Alan Wizemann	97ec4d2882	chore: Bump version to 2.7.0 v2.7.0	2026-05-05 20:41:39 +02:00
Alan Wizemann	cd5bb32a21	release: prep v2.7.0 — consolidated notes + in-app Sparkle release notes Rolls up everything since v2.6.5 (36 commits across remote-perf, project wizard, dashboard widgets, OAuth resilience, ScarfMon instrumentation, and the v2.7 skeleton-then-hydrate redesign) into a single 2.7.0 release. * releases/v2.7.0/RELEASE_NOTES.md — full consolidated notes, reorganized around the throughline (slow-remote performance) with five thematic sections: skeleton-then-hydrate loaders, SSH cancellation, project wizard + Keychain cron secrets, dashboard widgets, OAuth resilience, and ScarfMon. Replaces the previously- drafted dashboard-only v2.7.0 stub and the separate v2.8 wizard stub (both unreleased). * releases/v2.8/ — deleted; folded into v2.7. * README.md — "What's New in 2.6" → "What's New in 2.7" with the five-section summary linking out to the full notes. * tools/render-release-notes.py — stdlib-only Markdown → HTML renderer covering the subset of GitHub-flavored markdown that release notes use (## / ### headings, paragraphs, ul lists, fenced code, inline code/bold/italic/links, hr). Output includes a small <style> block tuned for Sparkle's update alert WebKit view (light + dark variants via prefers-color-scheme). * scripts/release.sh — render the active RELEASE_NOTES.md and inject the result as <description><![CDATA[...]]></description> on the appcast item. Sparkle's standard updater renders this in the in-app update sheet so users see release-specific "what's new" alongside the version number, not just the bare version. Falls back to a "see GitHub release page" placeholder when the notes file is missing. User runs ./scripts/release.sh 2.7.0 to ship. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 20:31:27 +02:00
Alan Wizemann	5e23b59697	test(model-preflight): cover detect-mismatch + fix newline-trim bug * New ModelPreflightTests suite (19 tests) covering both `check(_:)` and the v2.8 `detectMismatch(_:)` paths. Pins the dogfooding scenario (anthropic-prefixed model + nous active provider after Credential Pools OAuth swap), the case-insensitive prefix match, empty-prefix / empty-bare-model edge cases, and multi-slash model ids (OpenRouter style). * Bug fix surfaced by the tests: `ModelPreflight` was using `trimmingCharacters(in: .whitespaces)` which doesn't strip newlines. A stray `\n` in a hand-edited config.yaml would either miss the missing-fields classifier OR false-positive the mismatch banner (showing "anthropic" vs "anthropic\n"). Switched both trims to `.whitespacesAndNewlines`. perf(observability): instrument Tier C load paths + fetchSessionPreviews No behavior change — adds ScarfMon coverage so future captures show how often Memory/Skills/Cron/Curator/SessionPreviews load paths fire and what they cost on remote (each is multiple sequential SFTP RTTs that pre-fix were invisible). New events: * `mac.fetchSessionPreviews` / `.rows` / `.transportError` * `memory.load` / `.bytes` * `cron.load` / `.jobs` * `skills.load` / `.count` * `curator.load` / `.bytes` All 321 ScarfCore tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 20:03:35 +02:00
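The newline-trim bug in the commit above can be reproduced in a few lines. This is a Python sketch of the idea, not the project's Swift code: `trim_spaces_only` approximates a trim that removes horizontal whitespace only, `trim_all_whitespace` approximates a trim that also removes newlines, and `has_mismatch` is a hypothetical stand-in for the prefix comparison the message describes.

```python
def trim_spaces_only(value: str) -> str:
    # Approximation of a horizontal-whitespace-only trim: newlines survive.
    return value.strip(" \t")


def trim_all_whitespace(value: str) -> str:
    # Approximation of a trim that also removes newlines.
    return value.strip()


def has_mismatch(model_default: str, provider: str, trim) -> bool:
    """True when the model's `<provider>/...` prefix disagrees with the active provider."""
    model = trim(model_default)
    active = trim(provider)
    if "/" not in model or not active:
        return False  # nothing to compare
    prefix = model.split("/", 1)[0]
    if not prefix:
        return False  # empty prefix, e.g. "/model"
    return prefix.lower() != active.lower()
```

With the spaces-only trim, a provider value read as `"anthropic\n"` compares unequal to the `anthropic` prefix and triggers a false mismatch; with the full trim the two agree.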
Alan Wizemann	09e33b2999	perf(chat,activity,transport): skeleton-then-hydrate loaders + SSH cancellation propagation Major perf overhaul for slow-remote contexts. Chats and Activity now render in <2s instead of timing out at 30s; abandoned SSH work is killed within 100ms instead of pinning a ControlMaster session. * Skeleton-then-hydrate chat loader. New `fetchSkeletonMessages` selects user+assistant rows only (skips role='tool', NULLs tool_calls + reasoning at the SQL level). Wire payload bounded by conversational text alone — sub-second on remote regardless of underlying tool result blob sizes. Background `startToolHydration` pages through `hydrateAssistantToolCalls` (5-id batches) to splice tool calls in. Tool-result CONTENT is opt-in via Settings → Display → "Load tool results in past chats" (default off); inspector pane lazy-fetches per-result via `fetchToolResult(callId:)` on expand. * Skeleton-then-hydrate Activity loader. New `fetchRecentToolCallSkeleton` returns metadata-only rows in ~3 KB for 50 entries; placeholder ActivityRows render immediately, real per-call entries swap in as paged hydration completes. Loading pill in the page header, orange transport-error banner replaces the pre-fix silent empty state. * SSH cancellation propagation. `Task.detached` and unstructured `Task<...> { ... }` don't inherit cancellation from awaiting parents — without bridging, killing a Swift Task left the ssh subprocess running for the full 30s deadline, pinning a remote sqlite query and a ControlMaster session. Wired `withTaskCancellationHandler` through `SSHScriptRunner.run` and `RemoteSQLiteBackend.query`; cancellation now reaches `Process` within ~100ms. New `ssh.cancelled` ScarfMon event. * L1 single-id retry. When a 5-id `hydrateAssistantToolCalls` page trips the 30s timeout (one row carries an oversized tool_calls blob — long Edit args, big diffs), fall back to single-id queries to isolate the whale. 
Non-whale rows in the same batch hydrate normally; whale row stays bare. New `mac.hydrateToolCalls.singleTimeout` event tracks how often the recovery fires. * L2 in-flight coalescing for `loadRecentSessions`. File-watcher deltas during streaming used to stack 2-3 parallel sessions-list reload tasks; subsequent callers now await the active one. New `mac.loadRecentSessions.coalesced` event tracks dedup hits. * Loading-state UX hardening. New `isStartingSession` flag flips synchronously on user click so the chat sidebar greys + disables immediately instead of waiting for `client.start()` to return (5-7s on remote). Phase-typed status: "Spawning hermes acp…" → "Authenticating…" → "Loading session…" → "Loading history…" → "Ready". `ChatSessionListPane` overlays a ProgressView showing the current phase. * Partial-result detection. `fetchMessagesOutcome` distinguishes a transport failure from a genuine empty result; `loadSessionHistory` surfaces "Couldn't load full chat history — connection timed out" through the existing acpError triplet so the user sees what happened instead of a silent empty transcript. * Model/provider mismatch banner. `ModelPreflight.detectMismatch` recognizes when `model.default` carries a `<provider>/...` prefix that disagrees with `model.provider` (e.g. anthropic prefix + nous active provider after switching OAuth via Credential Pools). Banner offers one-click fix in either direction. Companion: ACP error classifier recognizes `model_not_found` / `404 messages` and surfaces "Hermes pins each session to its original model — start a new chat" so the pinned-model failure mode has a clear recovery path. * OAuth-completion provider swap prompt. After successful OAuth in Credential Pools, if the just-authed provider differs from `model.provider` in config.yaml, surface "Switch active provider to <name>?" with [Switch] / [Keep current] instead of auto-dismissing. All 302 ScarfCore tests pass. 
New ScarfMon events documented in the Performance-Monitoring wiki page. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 19:43:53 +02:00
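The "L1 single-id retry" described in the commit above amounts to a batch-then-fallback loop. The following is a minimal Python sketch under stated assumptions: `QueryTimeout`, `hydrate_with_fallback`, and the `fetch` callable are hypothetical names, and only the batch size of 5 comes from the commit message.

```python
from typing import Callable, Dict, List, Sequence, Tuple


class QueryTimeout(Exception):
    """Hypothetical error raised when a batched query exceeds its deadline."""


def hydrate_with_fallback(
    ids: Sequence[int],
    fetch: Callable[[List[int]], Dict[int, str]],
    batch_size: int = 5,
) -> Tuple[Dict[int, str], List[int]]:
    """Fetch ids in pages; on a page timeout, retry one id at a time."""
    hydrated: Dict[int, str] = {}
    skipped: List[int] = []  # oversized rows that also time out alone
    for start in range(0, len(ids), batch_size):
        page = list(ids[start:start + batch_size])
        try:
            hydrated.update(fetch(page))
        except QueryTimeout:
            # Isolate the oversized row: every other id in the page still loads.
            for single in page:
                try:
                    hydrated.update(fetch([single]))
                except QueryTimeout:
                    skipped.append(single)
    return hydrated, skipped
```

The trade-off is extra round-trips only on the failing page: a healthy page still costs one query, while a page containing one oversized row costs one failed query plus one query per id.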
Alan Wizemann	9f2e2ecfcd	perf(chat): exclude reasoning_content from initial fetch + drop page size to 25 The 160-message thinking-model session still timed out at the 30s ceiling even after dropping page size 200→50 in commit `a193003`. ScarfMon trace: mac.fetchMessages 30,105,329,125 ns ← 30s timeout fired; mac.hydrateMessages.rows count=1 ← 1 partial row only. Root cause: `reasoning_content` is huge on thinking models (20+ KB per row). Even 50 rows × 30 KB = 1.5 MB JSON shipping over a 420ms-RTT remote SSH channel exceeds the budget. The chat appeared empty AGAIN. Three cuts: 1. `messageColumnsLight`: same as messageColumns but omits `reasoning_content`. Used by `fetchMessages` so the bulk wire payload is small. `messageFromRow` reads reasoning_content via `row.optionalString(at: 11)` which gracefully returns nil when the column isn't present, so the shape change is transparent. 2. `fetchReasoningContent(for:)`: single-row lazy fetch the inspector pane calls when the user expands a thinking disclosure. One small SSH round-trip per inspection vs. paying for ALL reasoning content on every session boot. 3. `HistoryPageSize.initial` 50 → 25, sized for the lite column shape with margin for sessions that include some heavy tool-call payloads. The "Load earlier" affordance still pages back through older messages. Net effect on the user-reported case: 160-message session loads the most-recent 25 messages in ~5-10s (one SSH round-trip ~420ms plus ~3 KB × 25 = 75 KB wire). The remaining 135 are reachable via Load earlier. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 13:28:40 +02:00
Alan Wizemann	1eb5c92f6a	fix(aux-tab): correct nested-YAML parser so unknown-task surface works on remote Bug 1 — the previous parser collected every indented child under `auxiliary:` as if it were a task name, including leaf fields (provider, model, base_url, api_key, timeout). Result: bogus rows on local where the parser happened to fire, plus pollution of the unknown-tasks set with field names that subtractFrom-known left orphaned. Bug 2 — the flat-dot-path branch (`auxiliary.X.Y:`) was dead code. config.yaml is always nested YAML; the dot-path form only appears in interactive `hermes config get` output, never on disk. Removing it. User reported the unknown-tasks section showed on local but not on remote. Most likely root cause: the buggy parser surfaced junk on local (where their config has nested-form aux settings) while the dead flat-path branch never fired on remote either, so remote silently rendered nothing. With the parser fixed both contexts now surface real unknown task names if any are present. Rewrite as a clean two-pass walker: - First nested line inside the block locks taskIndent. - Only collect at exactly taskIndent (skip leaf fields deeper). - Tolerate CRLF line endings, blank lines, and YAML comments without resetting block state. - Handles 2-space and 4-space indent equally. Verified manually with four fixture shapes: 2-space, 4-space, with-comments-and-blanks, no-aux-block. All correct. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 13:12:55 +02:00
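The indent-locking rule from the commit above (the first nested line fixes the task indent, deeper lines are leaf fields) can be sketched in Python. This is a single-pass simplification for illustration rather than the two-pass Swift walker the commit describes, and `aux_task_names` is a hypothetical name.

```python
from typing import List, Optional


def aux_task_names(config_text: str) -> List[str]:
    """Collect the direct children of a top-level `auxiliary:` block."""
    names: List[str] = []
    in_block = False
    task_indent: Optional[int] = None
    for raw in config_text.splitlines():  # splitlines() also handles CRLF
        line = raw.rstrip("\r")
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank lines and comments never reset block state
        indent = len(line) - len(line.lstrip(" "))
        if indent == 0:
            # Any top-level key either opens or closes the block.
            in_block = stripped.split("#", 1)[0].strip() == "auxiliary:"
            task_indent = None
            continue
        if not in_block:
            continue
        if task_indent is None:
            task_indent = indent  # first nested line locks the task indent
        if indent == task_indent and ":" in stripped:
            names.append(stripped.split(":", 1)[0].strip())
    return names
```

Because the indent is learned from the first nested line, 2-space and 4-space files are handled by the same code path, and leaf fields such as `provider` or `model` are skipped because they sit deeper than the locked indent.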
Alan Wizemann	bccaba0742	feat(acp,aux): classify resolve_provider_client errors + surface unknown aux tasks Two fixes for the user-reported "ACP -32603 Internal error" after removing a Nous OAuth provider while config.yaml still referenced nous for an auxiliary task. The actual stderr was clear: agent.auxiliary_client: resolve_provider_client: nous requested but Nous Portal not configured But Scarf's chat banner showed only the bare JSON-RPC code and the user had no actionable path through the UI. ACPErrorHint.classify now pattern-matches the `resolve_provider_client: <name> requested but` stderr line and extracts the provider name. Surfaces: An auxiliary task is configured to use `<name>` but that provider isn't authenticated. Open Settings → Aux Models, or check ~/.hermes/config.yaml for auxiliary.<task>.provider: <name> and switch it to your active provider (or set it to `auto`). Routed through the existing chat-banner pipeline that already catches OAuth revocation and missing-credentials errors. AuxiliaryTab gains an "Other tasks in config.yaml" section that surfaces aux task keys present in YAML but not in Scarf's typed list (vision, web_extract, compression, session_search, skills_hub, approval, mcp, flush_memories, curator). Common case: `auxiliary.summarization.provider: nous` left over from older Hermes versions or hand-edited configs. Each unknown task gets a one-click "Reset provider" button that writes `auxiliary.<key>.provider: auto` — the most-actionable fix for the OAuth-removal failure mode. Detection scans both flat-dot-path and nested YAML shapes so it works regardless of how Hermes dumped the file. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 13:00:48 +02:00
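The stderr classification in the commit above is essentially a pattern match that captures the provider name. A minimal Python sketch follows; the function name and the exact regular expression are assumptions, and only the quoted stderr line comes from the commit message.

```python
import re
from typing import Optional

# Matches e.g. "resolve_provider_client: nous requested but ..."
_RESOLVE_PROVIDER = re.compile(r"resolve_provider_client:\s*(\S+)\s+requested but")


def unauthenticated_aux_provider(stderr: str) -> Optional[str]:
    """Return the provider named in the error line, or None if it is absent."""
    match = _RESOLVE_PROVIDER.search(stderr)
    return match.group(1) if match else None
```

A caller would turn a non-None result into the actionable hint and fall back to the bare JSON-RPC error otherwise.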
Alan Wizemann	4684b9deed	feat(credential-pools): OAuth remove button + auto-refresh on auth.json change User reports the Nous OAuth provider still showed in the credential pool after they 'removed' it, and Reload didn't help. Two underlying bugs: Bug 1 — no UI path to remove OAuth providers. The pool view had a Re-authenticate button on each OAuth row but no remove. Users who switched active provider thought that removed Nous; the OAuth tokens stayed in auth.json and the row kept rendering. Add a trash icon next to Re-authenticate that calls `hermes auth logout <provider>` after a confirmation dialog. ViewModel route is `removeOAuthProvider` mirroring `removeCredential`. Bug 2 — view didn't refresh on external auth.json changes. Pool view subscribed only to .onAppear and sheet-dismiss. A terminal `hermes auth logout` or another window's OAuth flow left the view stale until manually re-entered. Wire up `fileWatcher.lastChangeDate` so any auth.json mtime tick triggers a reload (the file watcher already polls auth.json on the remote SSH path). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 12:46:41 +02:00
Alan Wizemann	f6dc45b397	feat(scarfmon): track empty-assistant turns + document Nous quirk User reports chats "dying" on Nous models — screenshot shows the assistant bubble stuck with `(°□°) deliberating...` and a 1.7s turn-duration pill (turn DID complete; the content is the problem). The literal placeholder string isn't in Scarf's source; it's coming from Hermes or Nous itself when the model emits a brief thought stream and then fails to produce any visible output. ScarfMon trace confirms the failure mode: mac.sendViaACP → firstThoughtByte (25 bytes) mac.handleACPEvent ✓ mac.sendPrompt ✓ (1.7s, normal) finalizeStreamingMessage ✓ (turn cleanly closed) So Scarf sees no transport error — the turn finalized normally with empty assistant text plus a small thought stream. The visible "deliberating" text is content Hermes/Nous chose to substitute for the missing response. Adds `mac.emptyAssistantTurn` event (category .chatStream) that fires whenever a turn finalizes with empty `streamingAssistantText` and empty `streamingToolCalls`. Bytes carry the thinking-text length so we can distinguish: - bytes=0: total empty turn (model produced nothing) - bytes>0: thoughts-only turn (model thought but didn't answer) Both are user-visible failures. The fix is upstream — Hermes should refuse to finalize a turn with no response and surface an error, OR Nous should not return empty responses with the placeholder string. Document this finding so a future capture that shows multiple `mac.emptyAssistantTurn` events confirms the rate / model-correlation. For now Scarf surfaces the same UX as before (no UI change in this commit). A follow-on commit could intercept this case and replace the bubble with a clearer "Model returned no response" banner, but that requires a confident heuristic for which empty-finalize cases are real failures vs. legitimate no-response turns. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 12:40:21 +02:00
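The two cases that the commit above distinguishes (`bytes=0` versus `bytes>0`) can be expressed as a small classifier. This Python sketch uses hypothetical names and measures the thought stream in UTF-8 bytes, which is an assumption; the commit only says the event carries the thinking-text length.

```python
from typing import List, Optional, Tuple


def classify_finalized_turn(
    assistant_text: str, tool_calls: List[str], thought_text: str
) -> Optional[Tuple[str, int]]:
    """Return (event name, thought bytes) for an empty turn, else None."""
    if assistant_text or tool_calls:
        return None  # the turn produced visible output or tool activity
    thought_bytes = len(thought_text.encode("utf-8"))
    # 0 bytes: nothing at all; more than 0: thoughts without an answer.
    return ("mac.emptyAssistantTurn", thought_bytes)
```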
Alan Wizemann	f2ddcbbd60	feat(model-picker): add search filter to Nous overlay model list Nous returned 402 models in the recent perf capture (~496 KB of JSON). The picker's existing top-bar search field already filters the catalog list (`filteredModels`) but the Nous overlay path showed all 402 unfiltered, making it nearly unusable. Add `filteredNousModels` mirroring the `filteredModels` shape: filters `nousModels` by case-insensitive substring match against both `id` and `owned_by`. Updates the empty-state overlay so "no matches" surfaces a different message from "no models loaded" — the user knows the catalog is fine, the search just didn't match. User feedback: "we need a search in the model picker, some of these lists are large and unorganized." Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 12:38:30 +02:00
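The filter described in the commit above is a case-insensitive substring match over two fields. A minimal Python sketch, assuming each model is a dict with `id` and `owned_by` keys as in the message; `filter_models` is a hypothetical name.

```python
from typing import Dict, List


def filter_models(models: List[Dict[str, str]], query: str) -> List[Dict[str, str]]:
    """Keep models whose `id` or `owned_by` contains the query, ignoring case."""
    needle = query.strip().lower()
    if not needle:
        return list(models)  # empty search shows the whole list
    return [
        m for m in models
        if needle in m.get("id", "").lower()
        or needle in m.get("owned_by", "").lower()
    ]
```

An empty result with a non-empty query is the "no matches" state, which is distinct from an empty input list ("no models loaded").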
Alan Wizemann	a193003842	fix(chat): paginate session-load + race-guard against session switch Two related bugs from remote-context perf captures. Bug 1 — 30s timeout fetching the 157-message session. The initial page size was 200 messages. For a session including `reasoning_content` from a thinking model, that produces enough JSON over `sqlite3 -json \| ssh` to time out at exactly 30s on a 420ms-RTT remote, returning 0 rows. Bumping queryTimeout further just trades latency for stalls. Drop `HistoryPageSize.initial` from 200 → 50. Sized to fit comfortably inside the 30s queryTimeout; the existing "Load earlier" affordance pages back through older messages on demand. Bug 2 — session-switch race silently swaps transcripts. When the user picks a small chat while a slow fetch for a different chat is still in flight, the slow fetch finishes second and its `messages = …` assignment overwrites the small chat's transcript. User sees the small chat "jump back" to the big one. ScarfMon trace: parallel `mac.fetchMessages` events at t=641870 (small, 425ms, 2 rows) and t=643316 (big, 30,028ms timeout) — last write won. Add a `loadingForSession` capture and three guards: after the DB refresh, after the primary fetch, after the ACP-fork fetch. Each compares `self.sessionId` against the captured id; on mismatch fire `mac.hydrateMessages.dropped` and return without assigning. Race is silent in normal usage but visible in traces. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 12:38:19 +02:00
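The session-switch guard from Bug 2 above follows a capture-then-compare pattern. The sketch below uses Python's `asyncio` to illustrate it; the class, its fields, and the `dropped` counter are hypothetical stand-ins for the Swift view model and the `mac.hydrateMessages.dropped` event.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional


class ChatViewModel:
    def __init__(self, fetch: Callable[[str], Awaitable[List[str]]]) -> None:
        self._fetch = fetch
        self.session_id: Optional[str] = None
        self.messages: List[str] = []
        self.dropped = 0  # stand-in for the "dropped" trace event

    async def load(self, session_id: str) -> None:
        self.session_id = session_id
        loading_for = session_id  # capture before the first suspension point
        rows = await self._fetch(session_id)
        if self.session_id != loading_for:
            # The user switched sessions while this fetch was in flight.
            self.dropped += 1
            return
        self.messages = rows
```

Without the comparison, whichever fetch finishes last assigns `messages`, regardless of which session is currently selected. The commit applies the same check after each of its three await points.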
Alan Wizemann	93a64e3e82	fix(nous-picker): kill 120s beach-ball — dedupe readCache + 5s timeout Two stacking bugs in the Nous-overlay branch of the model picker caused a 120-second beach-ball on remote contexts. Bug 1 — duplicated readCache. ModelPickerSheet.refreshNousModels called `service.readCache()` directly (for instant first-paint), then called `service.loadModels(forceRefresh: false)` which calls `readCache()` AGAIN as its first step. Two SSH round-trips per picker open. Drop the inline call; loadModels is already cache-first on its happy path (returns `.cache(...)` when fresh). One read per open. Bug 2 — 60s readFile timeout for a hint. `readCache()` goes through SSHTransport.readFile which has a 60s default timeout. On a remote with a corrupted or oversized cache file, `cat` never returns and we wait the full 60s — twice, due to bug 1, for a total 120s picker stall. ScarfMon perf capture (commit 00a1bbd's diagnostic split) localized this precisely: nous.readCache.fileExists = 251 ms ✓ nous.readCache.readFile = 60,011 ms ❌ (60s timeout) Cache is an optimization, not a requirement. Added `readCacheWithTimeout(seconds: 5)` that races readCache against a 5-second sleep via withTaskGroup. On timeout returns nil; caller treats that as no-cache and falls through to the network fetch (which succeeded in 2s in the offending capture, returning 402 models). The runaway `cat` keeps running on its own 60s transport timeout but no longer blocks the picker. New ScarfMon event: `nous.readCache.timeoutFired` surfaces hits in traces so we can tell whether the timeout is being exercised in the wild. The underlying `cat` hang on the cache file is still unexplained; the file size (~500KB) shouldn't take 60s on a 420ms-RTT SSH link. For now: deleting the cache file (`rm ~/.hermes/scarf/nous_models_cache.json` on the remote) is the workaround. The next picker open will rebuild it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 12:17:45 +02:00
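The "cache is an optimization" timeout in the commit above races the read against a deadline and treats a miss as no cache. The project does this in Swift with `withTaskGroup`; the following is a Python `asyncio` sketch of the same idea, with `read_cache_with_timeout` and its parameters as hypothetical names.

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional


async def read_cache_with_timeout(
    read_cache: Callable[[], Awaitable[Any]], seconds: float = 5.0
) -> Optional[Any]:
    """Return the cache contents, or None if the read misses the deadline."""
    task = asyncio.ensure_future(read_cache())
    done, _pending = await asyncio.wait({task}, timeout=seconds)
    if task in done:
        return task.result()
    # Deadline passed: the slow read is left running rather than cancelled,
    # and the caller falls through to the network fetch.
    return None
```

`asyncio.wait` is used instead of `asyncio.wait_for` because the latter cancels the awaited task on timeout, whereas the commit describes the slow read continuing on its own transport timeout.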
Alan Wizemann	00a1bbd109	feat(scarfmon): split nous.readCache into fileExists/readFile/decode/bytes Last perf capture showed nous.readCache as a single 60-second interval — but the function does three things (transport.fileExists, transport.readFile, JSONDecoder). Splitting the measure points so the next capture localizes which step actually owns the wall-clock. Adds: - nous.readCache.fileExists (interval) — SSH `test -e` round-trip - nous.readCache.readFile (interval) — SSH `cat` round-trip - nous.readCache.bytes (event) — payload size of the cache file - nous.readCache.decode (interval) — JSON parsing cost If the next 60-second beach ball localizes to readFile, we know the cache file is somehow huge or the SSH read is hung; if it's fileExists, the path resolution is the issue; if decode, we have malformed JSON. All three wear the same outer wrapper so the existing nous.readCache total stays for trend comparison. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 12:07:43 +02:00
Alan Wizemann	20cc3a2985	perf(sessions): fold sessions+previews into one batched SSH round-trip Audit Finding 1 — ChatViewModel.loadRecentSessions and SessionsViewModel.load each fired two sequential `await dataService.fetch*` calls (sessions + previews), paying the 420 ms SSH RTT twice on every reload. Visible in ScarfMon traces as back-to-back `ssh.run` intervals, totaling ~840 ms minimum overhead per sidebar refresh. Adds HermesDataService.sessionListSnapshot(limit:) — same shape as the existing dashboardSnapshot, folds both queries into a single backend.queryBatch() call. Both call sites switched. Halves the SSH round-trips for every sidebar load. With Finding 5's coalescing, redundant parallel reloads also become free. Together, the 9× redundant queries-per-minute observed in baseline captures should drop substantially. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 12:07:31 +02:00
Alan Wizemann	432d5b0b52	fix(remote-sqlite): bump query timeout 15s→30s + add in-flight coalescing Two issues from the perf capture: 1. fetchMessages on a 157-message session timed out at exactly 15.06 s (`mac.fetchMessages` interval = 15,062,646,042 ns), then silently returned 0 rows. The chat appeared empty but the session had data; the timeout was firing before sqlite3 -json could ship the ~50KB payload over a 420 ms-RTT SSH link. Bumped queryTimeout to 30 s. The streamScript transport-level timeout still fires on truly wedged hosts. 2. mac.loadRecentSessions fired twice in parallel at t=960450 + t=960584, finishing 134 ms apart — two independent watcher ticks each spawning a full 3-query SSH load for the same data. Added in-flight request coalescing keyed on the inlined SQL text: when a query with the exact same SQL is already pending, second caller awaits the first task instead of spawning a new subprocess. New ScarfMon event `sqlite.query.coalesced` surfaces hits in traces. Coalescing is surgical — applies to single `query` calls only, not `queryBatch` (different timeout scaling, concurrent-same-batch is rare). Avoids serializing independent work. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 12:07:19 +02:00
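In-flight coalescing keyed on the SQL text, as described in item 2 above, can be sketched with a dictionary of pending tasks. This is a Python `asyncio` illustration under stated assumptions: `CoalescingQueryRunner`, `run_query`, and the `coalesced` counter are hypothetical, and cancellation of the first caller is not handled.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict


class CoalescingQueryRunner:
    def __init__(self, run_query: Callable[[str], Awaitable[Any]]) -> None:
        self._run_query = run_query
        self._in_flight: Dict[str, "asyncio.Future[Any]"] = {}
        self.coalesced = 0  # stand-in for the `sqlite.query.coalesced` event

    async def query(self, sql: str) -> Any:
        existing = self._in_flight.get(sql)
        if existing is not None:
            # Identical SQL is already pending: await it instead of spawning.
            self.coalesced += 1
            return await existing
        task = asyncio.ensure_future(self._run_query(sql))
        self._in_flight[sql] = task
        try:
            return await task
        finally:
            self._in_flight.pop(sql, None)
```

Keying on the exact SQL text keeps the scheme conservative: queries that differ in any inlined parameter never share a result, so independent work is not serialized.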
Alan Wizemann	12e152bfea	perf(ssh): replace Thread.sleep spin with kernel-wait for runLocal timeout Audit Finding 3 — every SSH operation funnels through SSHTransport.runLocal, which used a 100ms Thread.sleep loop while waiting for the timeout. Each call held one cooperative-pool thread for the full timeout duration with spin-poll overhead, AND had 100ms granularity on the deadline. Replace with proc.terminationHandler + DispatchGroup wait — kernel-wakeup when the process exits (or the deadline fires), no spin. Same one-thread blocking footprint, but eliminates the per-operation spin work that inflated query latency 60-70% under concurrent SSH load (visible in ScarfMon as 7-second mac.loadRecentSessions outliers when sidebar reload + chat finalize + watcher poll all fired together). Minimum-touch fix; full async migration of runLocal documented for follow-up. The bigger refactor would let cooperative-pool threads park on a true async suspension during the wait, but requires propagating async through every ServerTransport caller. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 12:06:58 +02:00
Alan Wizemann	099d73dde8	feat(scarfmon): instrument Nous model catalog + subscription path (beach-ball investigation) User reported a remote-context beach-ball when opening the model picker with Nous as the active provider. Existing measure points showed loadProviders + loadModels at ~315ms each (fast). The beach-ball must be in the uninstrumented Nous-overlay branch the picker fires when nous is selected. Adds four measure points covering every blocking call in that path: - nous.subscription.loadState (interval, .diskIO) — auth.json read via NousSubscriptionService.loadState. Already known to do an SSH read; now precisely measurable. - nous.readCache (interval, .diskIO) — nous_models cache read, TWO sequential SSH ops (fileExists + readFile). - nous.bearerToken (interval, .diskIO) — auth.json read AGAIN inside fetchModels. This is a duplicate read — loadState already parsed the same file moments earlier. Comment-flagged as a caching candidate. - nous.fetchModels (interval, .transport) + .bytes (event) — HTTP GET against the Nous /v1/models endpoint with the body byte count attached. The most likely beach-ball culprit if the endpoint is slow or hung. After the next capture we'll know which of the four owns the user's wall-clock; if `nous.bearerToken` shows up alongside `nous.subscription.loadState` with similar duration, the duplicate read is also a real cost worth fixing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 11:50:51 +02:00
Alan Wizemann	4efd84c119	feat(projects,cron): new project wizard + keychain env mirror + #75 fix Three coordinated additions to the project surface: 1. New Project from Scratch wizard. Toolbar entry that scaffolds a Scarf-standard project skeleton (`<project>/.scarf/dashboard.json` placeholder + `AGENTS.md` marker block), registers it, opens an ACP chat session in the project's cwd, and auto-sends a kickoff prompt that activates the bundled `scarf-template-author` skill. The skill drives the substantive setup conversationally — widgets, optional config schema, optional cron, AGENTS.md content. 2. Keychain secrets mirror into ~/.hermes/.env. Cron jobs can now reference Keychain-backed config values via env vars named `SCARF_<UPPER_SLUG>_<UPPER_FIELDKEY>`. Hermes reloads .env per cron tick (cron/scheduler.py:897-903), so credential rotation is free. Source of truth stays in the Keychain — config.json keeps `keychain://` URIs unchanged. Mirror runs at install, post-install Configuration save, uninstall, "Remove from List", and on app launch (reconcileAll). Mode 0600 on `.env` enforced by LocalTransport's existing `.env` heuristic. 3. Configuration form layout recursion fix (issue #75). Per-stage frame sizes on `ConfigEditorSheet` triggered `_NSDetectedLayoutRecursion` for projects with manifest.json. Stabilized the outer frame at the editing stage's intrinsic size so transitions only swap content, never resize the container. New services: - `ProjectScaffolder` (Mac) — bare-shell project + AGENTS.md marker - `SkillBootstrapService` (Mac) — copies bundled skills into ~/.hermes/skills/ - `KeychainEnvMirror` (Mac) — splice/unmirror/reconcileAll over ~/.hermes/.env - `SecretsEnvBlock` (ScarfCore) — pure marker-block helpers Bundled skill `scarf-template-author` v1.1.0 ships in `Resources/BuiltinSkills.bundle/`; SkillBootstrapService copies it into `~/.hermes/skills/scarf-template-author/` on launch (idempotent + version-gated). 
The skill grew a "Using secrets in cron prompts" section documenting the env-var convention. Migration: launch reconciler auto-populates .env on first v2.8 launch. Users with cron prompts authored against the old (broken) pattern need to update them to use $SCARF_… references — see release notes. Tests: - SecretsEnvBlockTests: 24/24 (`swift test --filter SecretsEnvBlock`) - KeychainEnvMirrorTests: 11/11 (`xcodebuild ... -only-testing:scarfTests/KeychainEnvMirror`) The idempotent-mirror test caught a real bug: applyBlock's replace path consumed the trailing newline from blockRange but didn't restore it, breaking the no-op-when-unchanged contract that the launch reconciler relies on. Fixed. v2.8 RELEASE_NOTES.md committed but no release cut yet. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 11:44:23 +02:00
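The no-op-when-unchanged contract mentioned above depends on the replace path consuming and re-emitting the block's trailing newline consistently. The Python sketch below illustrates that property only. The marker strings, the function names, and the slug normalization in `env_var_name` are all assumptions; the commit states only the `SCARF_<UPPER_SLUG>_<UPPER_FIELDKEY>` naming shape.

```python
from typing import Dict

# Hypothetical markers; the real ones are not given in the commit message.
BEGIN = "# >>> scarf secrets (hypothetical marker) >>>"
END = "# <<< scarf secrets (hypothetical marker) <<<"


def env_var_name(slug: str, field_key: str) -> str:
    # Assumed normalization: non-alphanumerics become "_", then uppercase.
    def norm(part: str) -> str:
        return "".join(ch if ch.isalnum() else "_" for ch in part).upper()
    return "SCARF_%s_%s" % (norm(slug), norm(field_key))


def apply_block(env_text: str, entries: Dict[str, str]) -> str:
    """Insert or replace the managed block; reapplying the same entries is a no-op."""
    body = "".join("%s=%s\n" % (k, v) for k, v in sorted(entries.items()))
    block = "%s\n%s%s\n" % (BEGIN, body, END)
    start = env_text.find(BEGIN)
    end = env_text.find(END)
    if start != -1 and end > start:
        after = end + len(END)
        if env_text[after:after + 1] == "\n":
            after += 1  # consume the old trailing newline; `block` restores it
        return env_text[:start] + block + env_text[after:]
    sep = "" if env_text == "" or env_text.endswith("\n") else "\n"
    return env_text + sep + block
```

Lines outside the markers are left untouched, so user-managed entries in the same file survive every rewrite.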
Alan Wizemann	bd9bacb8b3	feat(scarfmon): B2 + B3 + iOS dashboard — file watcher, message hydration, dashboard load Three areas instrumented in this batch. Both targets build clean. B2 — Mac HermesFileWatcher (FSEvents + remote SSH poll) - mac.fileWatcher.localFire (event) — every FSEvents change on a watched core or project path. High counts during streaming chats are normal (state.db-wal ticks per persisted message); high counts during idle suggest a runaway watcher install. - mac.fileWatcher.remoteRestart (event, bytes=path-count) — fires once per SSH poller restart, with the union path count attached. Frequent restarts mean the project-list update path is churning. - mac.fileWatcher.remoteDelta (event) — fires per non-empty change detected on the SSH poll. Pair with `ssh.streamScript` cadence to see actual poll latency. B3 — Chat session boot + message hydration - mac.fetchMessages (interval) + .rows (event) — bounded SQL fetch from HermesDataService. Catches slow paginated scrolls back through long sessions. - mac.refreshSessionFromDB (interval) — RichChatViewModel's post-promptComplete refresh that picks up cost/token data. - mac.hydrateMessages (interval) + .rows (event) — full session-boot hydration in RichChatViewModel.loadSessionHistory. Was the suspected trigger of the 22-bubble session-start storms in the Phase 3a baseline; now precisely measurable. iOS Dashboard (resolves the original "out of sync" mystery) - ios.loadDashboard (interval) — wraps the four dataService.fetch* Citadel SFTP round-trips in IOSDashboardViewModel.load(). - ios.allSessions.count (event) — sidebar list size after each load, correlates load latency with list growth. - ios.dashboardRefresh.trigger (event) — fires only on pull-to-refresh, separates that entry path from initial appear. Architectural finding: the original v2.6.0 user feedback ("chat out of sync iOS↔Mac on fast LAN") is now firmly attributable to this — iOS does NOT subscribe to a file watcher. 
The dashboard refresh path is appear-time + pull-to-refresh only. `CitadelServerTransport.watchPaths()` is effectively dead code on iOS today; nobody calls it. Earlier A1 instrumentation (commit `9df7142`) put measure points on it, which is why captures showed zero `ios.fileWatcher.tick` events. Future work: either add a foregrounded poll loop to iOS, or thread the file watcher into the dashboard subscription. Documented in the ScarfMon roadmap memory. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 23:52:11 +02:00
Alan Wizemann	96af545e66	feat(scarfmon): Tier A2/A3/B1/B4 — sessions, model catalog, dashboard widgets, image encoder Four parallel instrumentation drops orchestrated by the perf roadmap. All adds; no logic changes; both targets build clean. A2 — Mac sessions list reload - mac.scheduleSessionsRefresh (event) — every file-watcher entry into the debounced reload helper. Pair with mac.loadRecentSessions count to see how many ticks coalesce per actual reload. - mac.loadRecentSessions (interval) — full wall-clock from DB open through observable assignment. - mac.recentSessions.count (event) — sidebar list size, correlates list growth with reload latency. A3 — ModelCatalogService loads - modelCatalog.loadProviders (interval) + .providers.count (event). - modelCatalog.loadModels (interval) + .models.count (event). - modelCatalog.validateModel (interval) — covers loadCatalog -> transport.readFile, hits disk on every call. Sync wrap (not measureAsync): the inner Task.detached body is synchronous; the detached hop is the async boundary. B1 — Dashboard render - mac.dashboard.body (event) — ProjectsView body re-eval count. - dashboard.loadRegistry (interval) — projects.json read + decode. - widget.markdown_file.load / widget.log_tail.load / widget.image.load / widget.cron_status.load (intervals) — one per v2.7 file-reading widget. cron_status batches its two HermesFileService calls into one tuple-returning measure block so the existing two-call shape stays intact. B4 — Image encoder - imageEncoder.input.bytes (event) — raw input size. - imageEncoder.downsample (interval) — full decode/resize/JPEG encode round trip across all three platform branches (AppKit, UIKit, Linux passthrough). - imageEncoder.bytes (event) — final encoded JPEG size, lets us spot blowup cases. Sync wrap: encode is nonisolated sync; using measureAsync would require turning the function async, which is a logic change. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 23:38:50 +02:00
Alan Wizemann	9df7142f49	feat(scarfmon): A1 — instrument iOS file-watcher polling cadence Adds three measure points to CitadelServerTransport.watchPaths: - ios.fileWatcher.tick (interval) — full poll cycle latency including the SSH stat round-trips. > 1500ms here is what 'out of sync' feels like — the channel is congested or the host is slow. - ios.fileWatcher.delta (event) — fires only when the signature actually changed. Low delta/tick ratio means we can safely drop the 3-second cadence; high ratio means we'd just burn bandwidth. - ios.fileWatcher.paths (event, bytes=count) — number of paths watched per cycle. Explains slow ticks as the project list grows. Surgical addition; existing 3-second cadence + signature-diff logic unchanged. With Full mode on, a few minutes of usage on LAN will tell us empirically whether the cadence can drop to 1s — the original v2.6.0 user feedback complained 'chat is out of sync' between iOS and Mac on a fast LAN. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 23:33:30 +02:00
Alan Wizemann	9ff9a018e7	feat(scarfmon,chat): Phase 3b — dampen finalize bursts + Thinking… status + wider loadConfig stack Three targeted fixes from the Phase 3a baseline. Bubble-burst dampening (Phase 3b-1): - RichChatViewModel.finalizeStreamingMessage wraps both the streaming-id rewrite and the empty-finalize remove() in a no-animation Transaction. The id flip from 0 → permanent value was the load-bearing trigger of the 5–8 RichMessageBubble.body fires we were seeing 1–2 ms after every `finalizeStreamingMessage` interval; SwiftUI ran an animated diff against neighbors and re-evaluated their bodies. The new message is content-equal to the streaming one — there is no animation worth running. Thinking… status promotion (Phase 3b-2): - RichChatViewModel exposes `isStreamingThoughtsOnly` — true while a turn is in flight, has emitted thought-stream bytes, and has not yet produced any visible assistant text. The Phase 3a baseline showed this is where most of the user-perceived "feels slow" lives: reasoning models commonly take 3–8 s before producing visible output, and Scarf surfaced no specific signal during that window. - Mac ChatView.displayedStatus promotes the toolbar pill to "Thinking…" when the flag is true. - iOS connectionBanner gains a transient "Thinking…" strip with spinner, same trigger condition. Phase 3a fix-up: - HermesFileService.loadConfig stack-trace logging widened from one frame to a 10-frame window prefixed with "#N", so the actual caller is visible past inlined ScarfMon wrappers (the prior log surfaced ScarfMon.measure itself, not the loadConfig caller). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 23:14:03 +02:00
Alan Wizemann	0a4f8de492	feat(scarfmon): Phase 3a — diagnostic measure points for chat-render bursts Adds four targeted measure points so the next baseline capture can attribute the bubble-re-render storm and the slow sendPrompt to a specific cause: - mac.RichChatMessageList.body — distinguishes "the parent is re-issuing the ForEach" from "the bubbles are re-rendering on their own". If list.body fires once and bubble.body fires N times, churn is in the bubbles; if list.body fires N times, the ForEach itself is being rebuilt. - finalizeStreamingMessage (interval) — pinpoints the end-of-stream burst trigger. The 20-bubble re-eval burst we saw at the close of each turn lines up with this call; measuring it surfaces whether it's the streaming-id rewrite, the turn-duration assignment, or something downstream. - firstByte / firstThoughtByte (event) — fires once per turn on the first chunk after currentTurnStart is set. Splits user-tap → first byte (network + Hermes thinking, the dominant component of the 7-11s sendPrompt) from first byte → turn end (Scarf streaming render). - loadConfig caller hint via os.Logger — when ScarfMon is in Full mode, logs the first stack frame above each loadConfig call to the com.scarf.mon subsystem so mystery callers (the read at t=264282 with no apparent trigger in the prior baseline) become traceable via `log stream`. Symbol-only, no PII, free outside Full mode. All four are pure additions — no behavior change, same zero-cost default-off semantics as Phase 2. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 22:47:29 +02:00
Alan Wizemann	3126c34561	feat(scarfmon): chat + transport + sqlite measure points (Phase 2) Wires ScarfMon measure points into the chat hot path on both targets, plus the underlying SSH transport and remote-SQLite backend. All callsites are surgical adds — no behavior change. Cost when ScarfMon is in `.signpostOnly` (default) is one os_signpost emit per call, elided by the runtime outside an Instruments session. In `.full` mode the same callsites also push samples into the in-memory ring buffer. Render counters (event): - mac.ChatView.body / ios.ChatView.body — full transcript pane re-evals - mac.RichMessageBubble.body / ios.MessageBubble.body — per-bubble re-evals Stream + session (event + interval): - mac.sendViaACP, mac.sendPrompt — user tap → first-byte - mac.acpEvent, mac.handleACPEvent — per-event delivery + handle cost - mac.startACPSession — session boot - ios.send, ios.startResuming — same shape on iOS - ios.acpEvent, ios.handleACPEvent — same per-event split on iOS Transport + SQLite (interval, with byte counts on rows): - ssh.streamScript (Citadel iOS) — SSH round-trip - ssh.run (SSHScriptRunner Mac) — SSH round-trip - sqlite.query, sqlite.queryBatch — Remote SQLite per-call - sqlite.query.rows — row count + stdout bytes per query Disk I/O (interval): - diskIO.loadConfig — config.yaml read + parse - diskIO.loadCronJobs — cron jobs.json decode Body counters use the `let _: Void = ScarfMon.event(...)` pattern at the top of `body` — works inside `@ViewBuilder` and fires on every re-eval, which is exactly the signal we want. To use: Mac: Settings → Advanced → Performance Diagnostics → Full iOS: Settings → Diagnostics → Performance → Full Both panels auto-aggregate by (category, name), surface top 20 by p95, and offer Copy as JSON for sharing in feedback threads. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 22:18:06 +02:00
Alan Wizemann	6cf59c8a44	feat(scarfmon): perf instrumentation plumbing for iOS + Mac (Phase 1) ScarfMon lands the always-on perf instrumentation harness. Phase 1 ships the plumbing only; Phase 2 wires the chat measure points. Core (ScarfCore/Diagnostics/): - ScarfMon — public API: measure / measureAsync / event with @inline(__always) short-circuit when the backend set is empty so the off path is one branch + return. Categories are an enum, names are StaticString so user content cannot leak through metric tags. - ScarfMonRingBuffer — fixed-capacity (4096) lock-protected ring; one os_unfair_lock per record; summary() aggregates by (category, name) with nearest-rank p50/p95; exportJSON() emits a one-line-per-sample dump for the Copy as JSON button. - ScarfMonSignpostBackend — emits os_signpost into a dedicated com.scarf.mon subsystem so Instruments → Points of Interest shows Scarf's own measure points without a debug build. - ScarfMonLoggerBackend — Logger(.debug) sink for users running `log stream --predicate 'subsystem == \"com.scarf.mon\"'`. - ScarfMonBoot — three modes (off / signpostOnly / full); persists the user's choice in UserDefaults under ScarfMonMode; configure() is idempotent and replaces the active backend set atomically. Tests: 11 cases covering ring ordering / wrap / reset, summary aggregation, p95 percentiles, event vs interval semantics, install / isActive, measure + measureAsync (including the throw path), boot mode transitions, and JSON export round-trip. @Suite(.serialized) because the suite mutates process-wide backend state. App wiring: - ScarfIOSApp.init + ScarfApp.init call ScarfMonBoot.configure(mode:) with the persisted mode (default .signpostOnly). - iOS Settings → Diagnostics → Performance row leads to a list-style panel with the segmented mode picker, top-20 stat rows by p95, Copy as JSON, and Reset. - Mac Settings → Advanced gains a ScarfMonDiagnosticsSection with the same shape (NSPasteboard for copy). 
Open-source by design — no remote upload, no analytics. The ring buffer never leaves the device unless the user explicitly taps Copy as JSON. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 22:08:21 +02:00
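The ring-buffer summary described in this commit (fixed capacity, aggregation by (category, name), nearest-rank p50/p95) can be sketched roughly as below. This is an illustrative Python sketch, not the Swift `ScarfMonRingBuffer`: the class and function names are hypothetical, and the lock the commit mentions is omitted for brevity.

```python
from collections import defaultdict, deque


def nearest_rank(values, percentile):
    """Nearest-rank percentile: the value at 1-based rank ceil(p/100 * n)."""
    ordered = sorted(values)
    # Integer ceiling division avoids floating-point rounding surprises.
    rank = max(1, (percentile * len(ordered) + 99) // 100)
    return ordered[rank - 1]


class RingBuffer:
    """Fixed-capacity sample store; the oldest sample is dropped on overflow."""

    def __init__(self, capacity=4096):
        self._samples = deque(maxlen=capacity)

    def record(self, category, name, duration_ms):
        self._samples.append((category, name, duration_ms))

    def summary(self):
        """Aggregate samples by (category, name) with nearest-rank p50/p95."""
        groups = defaultdict(list)
        for category, name, duration_ms in self._samples:
            groups[(category, name)].append(duration_ms)
        return {
            key: {
                "count": len(values),
                "p50": nearest_rank(values, 50),
                "p95": nearest_rank(values, 95),
            }
            for key, values in groups.items()
        }
```

Nearest-rank always returns a value that was actually recorded, which keeps a p95 row traceable to a real sample.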
Alan Wizemann	272da6a915	fix(transport,widgets): code-review fixes for v2.7 + iOS Citadel transport - CronStatusWidgetView: include jobId + lineCount in `.task(id:)` so widget reload fires when dashboard.json changes either field, not only when the file watcher ticks - CitadelServerTransport.runScript: enforce the timeout via withThrowingTaskGroup race; propagate transport-level Citadel errors as TransportError.other (so RemoteSQLiteBackend.query maps them to BackendError.transport instead of misclassifying as BackendError.sqlite via a fake -1 exit code); throw TransportError.timeout on the deadline branch with partial stdout preserved - SSHScriptRunner: close fileHandleForReading on stdout/stderr Pipes in the timeout branch (success path already did); check Task.isCancelled inside the busy-wait so a cancelled parent task terminates the subprocess early instead of waiting out the full timeout. Both runOverSSH and runLocally fixed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 21:40:07 +02:00
Alan Wizemann	c7bcfd8655	feat(dashboards): v2.7 widget catalog — file-reading widgets, sparkline, typed status, project-wide watch Major project-dashboard release. Five new widget types (markdown_file, log_tail, cron_status, image, status_grid), inline sparkline on stat, typed status enum shared by list + status_grid, structured WidgetErrorCard, and a project-wide .scarf/ directory watch that picks up files cron jobs write next to dashboard.json. - ProjectDashboard: extend DashboardWidget with path/lines/jobId/cells/gridColumns/sparkline; add StatusGridCell + ListItemStatus (lenient parse with synonyms) - HermesFileWatcher: watch each project's .scarf/ dir alongside dashboard.json (local FSEvents + remote SSH mtime poll); updateProjectWatches signature now takes dashboardPaths + scarfDirs - New widget views: CronStatus, Image, LogTail, MarkdownFile, StatusGrid, plus WidgetErrorCard for structured failure messaging; legacy "Unknown" placeholder replaced everywhere - WidgetPathResolver: project-root-anchored path resolution that rejects absolute paths + ".." escapes pre and post canonicalization - Stat widget gains optional inline sparkline (pure SwiftUI Path, no Charts dep); list widget rows route through typed status with semantic icons + ScarfColor tints - iOS list widget + unsupported card adopt typed status + warning-toned error card (parity with Mac error styling); new widget types remain Mac-only - Site mirror: widgets.js renders all five new types (file-reading widgets show annotated catalog placeholders), sparkline SVG, status-grid grid; styles.css adds typed-status palette + error-card + sparkline + grid styles - Catalog validator: tools/widget-schema.json is the single source of truth; build-catalog.py loads it and enforces per-type required fields. 
8 new test cases in test_build_catalog.py covering schema load, v2.7 additions, and missing-required rejection - Template-author skill (SKILL.md) gains v2.7 Widget Catalog section + canonical status guidance; CONTRIBUTING.md points authors at widget-schema.json; template-author bundle rebuilt - Localizable.xcstrings picks up auto-extracted strings for the previously-shipped OAuth keepalive feature - Release notes drafted at releases/v2.7.0/RELEASE_NOTES.md Backwards compatible — existing dashboard.json renders byte-identically, status synonyms (ok/up/down/active/etc.) keep working. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 21:16:29 +02:00
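The path rule described for `WidgetPathResolver` (anchor at the project root, reject absolute paths and ".." escapes both before and after canonicalization) might look like the following. This is a hedged Python sketch of the idea only; `resolve_widget_path` is a hypothetical name and the real Swift resolver may differ in detail.

```python
from pathlib import Path


def resolve_widget_path(project_root, relative):
    """Return the resolved path inside project_root, or None if rejected."""
    candidate = Path(relative)
    # Pre-canonicalization checks: no absolute paths, no ".." components.
    if candidate.is_absolute() or ".." in candidate.parts:
        return None
    root = Path(project_root).resolve()
    resolved = (root / candidate).resolve()
    # Post-canonicalization check: a symlink must not lead outside the root.
    if resolved != root and root not in resolved.parents:
        return None
    return resolved
```

The second check is what catches a symlink inside the project that points elsewhere; the first check alone would let it through.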
Alan Wizemann	9d945150e0	fix(chat): suppress 'stop' badge in metadata footer for normal turn ends Every text-bearing assistant turn finalizes with `finishReason="stop"` (set by `RichChatViewModel.finalizeStreamingMessage` line 881 — the standard end-of-turn signal Hermes/ACP/OpenAI all emit). The `metadataFooter` in `RichMessageBubble` was rendering it unconditionally, so every assistant bubble carried a `· stop · TIME` footer. Combined with terse model output (e.g. deepseek-v4-flash emitting only a brief status line before ending the turn), the badge created a misleading "the agent gave up" impression — there was no warning, error, or actual failure. Match the convention used by ChatGPT, Claude.ai, Cursor, etc.: suppress the badge for normal end-of-turn (`stop` / `end_turn`), reserve it for abnormal terminations the user actually wants to see (`max_tokens`, `length`, `error`, `refusal`, `content_filter`, …). When it does render, color it with severity tone — warning yellow for "response cut short" cases, danger red for failures and refusals, muted otherwise. The existing `handlePromptComplete` system-message-injection path (line 725-751) for non-`end_turn` stops still surfaces those cases explicitly at the top of the chat — this change only trims the always-on badge from the per-message footer. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 15:40:31 +02:00
Alan Wizemann	fa15634381	fix(oauth-keepalive): drop unsupported --silent flag from cron create `hermes cron create` only accepts --name, --deliver, --repeat, --skill, --script, --workdir. The `silent: Bool?` field on HermesCronJob exists in the JSON model but isn't exposed through the CLI's create verb today — argparse rejected the unknown flag, non-zero exit, toggle failed with the generic CLI hint. Drops the flag; the keepalive runs with Hermes's default delivery. Token-refresh side effect during session boot is unaffected. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 15:33:25 +02:00
Alan Wizemann	3271391506	fix(chat): debounce sidebar reloads so sessions list doesn't flicker mid-stream ChatView's `.onChange(of: fileWatcher.lastChangeDate)` fired an unconditional `Task { await viewModel.loadRecentSessions() }` on every file-watcher tick. During an ACP message stream the watcher fires 5–10 times per second (every message Hermes persists bumps `state.db-wal`'s mtime), and each spawned task re-fetched sessions + previews + project attribution and reassigned `recentSessions` even though the data was identical. Each reassignment triggered an @Observable re-render of the chat sidebar; the user saw the chats list visibly disappear and reappear several times while typing the first message in a new chat. Two changes: * Add `scheduleSessionsRefresh()` to ChatViewModel — coalesces rapid ticks into one trailing `loadRecentSessions()` ~500 ms after the last tick. ChatView's onChange now calls this instead. The 500 ms window is short enough that idle external changes (a session created from another `hermes` invocation, a rename from a different window) still appear "soon", and long enough to absorb a streaming-response burst. * Add an explicit `await loadRecentSessions()` to `autoStartACPAndSend` after the new session id resolves — the debounce would otherwise delay the just-created chat from appearing in the sidebar by 500 ms after first send. Mirrors what `startACPSession` already does at line 619 for the explicit New / Resume paths. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 14:56:59 +02:00
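The trailing debounce this commit describes (many rapid ticks coalesced into one reload after a quiet window) can be sketched as below. This is an illustrative asyncio version, not the Swift `scheduleSessionsRefresh()`; the `Debouncer` class is hypothetical and the default window simply mirrors the ~500 ms figure from the message.

```python
import asyncio


class Debouncer:
    """Coalesce rapid calls into one trailing call after a quiet window."""

    def __init__(self, action, delay=0.5):
        self._action = action  # async callable run once the ticks go quiet
        self._delay = delay
        self._pending = None

    def schedule(self):
        # Each new tick cancels the pending run and restarts the window.
        if self._pending is not None:
            self._pending.cancel()
        self._pending = asyncio.get_running_loop().create_task(self._run())

    async def _run(self):
        await asyncio.sleep(self._delay)
        self._pending = None  # a reload that has started is not cancelled
        await self._action()
```

A caller that needs the result immediately, like the post-create path in the message, would await the reload directly instead of going through `schedule()`.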
Alan Wizemann	5afd391838	feat(sidebar): promote Projects to first section + move profile chip under server name Two small UX tweaks to the macOS sidebar: * Reorder sections so Projects is the top section above Monitor. Reflects how users actually start sessions in Scarf — they pick a project first, then drill into chat / sessions / etc. The previous order put the read-mostly Dashboard at the top, which made Projects feel like a secondary surface. * Move the active-profile chip out of the top header HStack (where it competed for horizontal space with the server-name pill) and drop it into a second row right-aligned under the server name. Top row stays clean: `[icon] Scarf <server>`. Second row: ` profile: <name>` only on local contexts. Same click target, same .help, just better-anchored. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 14:37:29 +02:00
Alan Wizemann	2a368a04f7	feat(window): persist window size + position across app launches SwiftUI's WindowGroup exposes `.defaultSize` and `.windowResizability` but no built-in autosave for window frame across launches. The documented escape hatch is AppKit's `NSWindow.setFrameAutosaveName(_:)`, which writes the frame to UserDefaults on resize/move and restores it on next open. Add a small `WindowFrameAutosave` NSViewRepresentable that finds its hosting NSWindow on first appear and stamps the autosave name. Apply it to `ContextBoundRoot` keyed off `context.id` so each open server window remembers its own geometry. New servers fall back to the WindowGroup's `.defaultSize(1100, 700)` until the user resizes once. A previous WIP attempt (dd4a61f) tried to use a fictional `.windowFrameAutosaveName(...)` SwiftUI modifier that doesn't exist — which is why it was never merged. This works because we go through AppKit directly. Also picks up Xcode's auto-extracted cron-related Localizable.xcstrings entries that had been pending. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 14:34:08 +02:00
Alan Wizemann	9aa901a286	fix(credential-pools): refresh view after OAuth sheet dismiss The sheet auto-closes 0.8s after `oauthFlow.succeeded` flips, but the parent view didn't reload — so the expiry badge stayed red and the `tokenTail` stayed stale until the user hit Reload. Hook `viewModel.load()` + `probeKeepalive()` into the sheet's `onDismiss` so the freshly-written `auth.json` lands on screen immediately. Runs on every dismiss (success or cancel) — `load()` is cheap and idempotent. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 14:33:22 +02:00
Alan Wizemann	111fe9bb67	feat(oauth): unblock remote re-auth + daily keepalive to prevent expiry Two related fixes for OAuth subscriptions (Nous Portal, Anthropic Claude OAuth, etc.): - Remote re-auth stall: Both `NousAuthFlow` and `OAuthFlowController` set `PYTHONUNBUFFERED=1` only on local contexts. On remote, setting `proc.environment` only affects the local-side ssh process — not the remote python interpreter. ssh doesn't forward arbitrary env vars without `SendEnv` configured on both sides, so remote hermes ran with default block-buffered stdout and the device-code prompt never reached Scarf — the sheet hung at "Contacting Nous Portal" forever. Fix: when remote, wrap the command in `env PYTHONUNBUFFERED=1 …` to inject the var on the remote side regardless of ssh config. - Daily keepalive: Hermes refreshes OAuth access tokens on agent startup but never proactively. If the user goes longer than the refresh-token lifetime (~30 days for Nous) without starting a session, the refresh token itself expires and full re-auth is required. New `OAuthKeepaliveCronService` registers a Scarf-owned daily cron job (`[scarf:oauth-keepalive] OAuth token refresh`) at 4am that runs a minimal one-token prompt — booting the session is what triggers `resolve_nous_runtime_credentials()`. Wired as an opt-in toggle in the OAuth providers section of CredentialPoolsView. When `hermes auth refresh <provider>` lands upstream we'll swap the prompt for that verb; the surrounding wiring stays unchanged. - Stale-refresh nudge: `NousSubscriptionState` gains `daysSinceLastRefresh()` + `hasStaleRefresh` (>= 14 days, half of Nous's 30-day refresh-token window). The keepalive section surfaces an inline orange warning when stale and the toggle is off — points the user at the toggle that would have prevented the problem. Verification: scarfCore 263/263; Mac app builds clean. Manual repro of remote stall against Digital Ocean droplet pending user test. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 14:32:06 +02:00
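The remote re-auth fix above amounts to putting the variable into the command string that the remote shell runs, rather than into the local ssh process environment. A minimal Python sketch of that wrapping follows; `remote_command` is a hypothetical helper and `python3 script.py` is a placeholder, not a real Hermes invocation.

```python
import shlex


def remote_command(argv, env):
    """Prefix argv with `env KEY=value ...` so the variables are set on the
    remote side, independent of ssh SendEnv/AcceptEnv configuration."""
    assignments = [f"{key}={value}" for key, value in env.items()]
    return shlex.join(["env", *assignments, *argv])


# The resulting string is what would be handed to ssh as the remote command.
print(remote_command(["python3", "script.py"], {"PYTHONUNBUFFERED": "1"}))
```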
Alan Wizemann	6191c9f19f	fix(remote-backend): pre-expand ~/ in Swift via resolvedUserHome The previous fix (`b8b426e`) rewrote `~/.hermes/state.db` to `"$HOME/.hermes/state.db"` and relied on the remote shell to expand $HOME. That works on Mac SSHTransport (login shell with $HOME set in the environment) but not reliably through Citadel's exec channel + base64-decode + inner-/bin/sh pipeline on iOS — the user reports "unable to open database \"~/.hermes/state.db\"" connecting from ScarfGo (iOS Simulator) to 127.0.0.1, meaning the literal `~` character reached sqlite3 untouched. Switch to client-side expansion: probe remote $HOME once at RemoteSQLiteBackend.open() via the existing ServerContext.resolvedUserHome() helper (which uses transport.runProcess to `echo $HOME` — same code path Hermes CLI calls already exercise successfully on iOS). Cache the result. quoteForRemoteShell then substitutes `~/` with the absolute path in Swift before single- quoting, so sqlite3 receives `/Users/alan/.hermes/state.db` directly — no nested-shell expansion required. Falls back to the previous "$HOME/..."-quoted form when the home probe fails (rare; covers the case where runProcess can't reach the remote but the user happens to have a working streamScript path). Mirrors how RemoteBackupService.expandTilde already handles the same problem upstream. Refs #74 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 13:40:33 +02:00
Alan Wizemann	b8b426ed75	fix(remote-backend): expand ~/ to $HOME so sqlite3 finds the DB Default-config remotes (Hetzner, Digital Ocean, anything where the user hasn't overridden remoteHome on the SSHConfig) have `paths.stateDB == "~/.hermes/state.db"`. The streaming backend was single-quoting that path, which suppresses tilde expansion, and sqlite3 itself doesn't expand `~` (that's a shell affordance). Result: "Error: unable to open database \"~/.hermes/state.db\": unable to open database file" — the path was reaching sqlite3 with a literal `~` that it tried to interpret as a directory name. Replace the single-quote-only `escape(_:)` with `quoteForRemoteShell(_:)` that mirrors `SSHTransport.remotePathArg`'s pattern: rewrite leading `~/` to `"$HOME/..."` (double-quoted so the shell expands `$HOME`, backslash-escaping any embedded `\\`, `"`, `$`, and backtick characters to keep the literal intact), bare `~` to `"$HOME"`, and absolute paths get the standard single-quote-with-`'\''`-escape treatment. Adds a regression test (`openWithDefaultTildeHomeExpands`) that exercises the tilde-rewrite end-to-end against a real /bin/sh: places a fixture state.db at `~/.hermes/state.db` (backing up the user's real DB if present) and verifies open() + a query both succeed through the streaming path. Refs #74 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 13:34:20 +02:00
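The quoting rule described in this commit can be sketched as follows. This is a Python illustration of the three cases (bare `~`, leading `~/`, everything else), not the Swift `quoteForRemoteShell(_:)`; the function name here is a hypothetical stand-in.

```python
def quote_for_remote_shell(path):
    """Quote a path for /bin/sh while keeping a leading ~ expandable."""
    if path == "~":
        return '"$HOME"'
    if path.startswith("~/"):
        rest = path[2:]
        # Inside double quotes, backslash, double quote, dollar, and backtick
        # stay special, so each is backslash-escaped (backslash first).
        for ch in ("\\", '"', "$", "`"):
            rest = rest.replace(ch, "\\" + ch)
        return '"$HOME/' + rest + '"'
    # Everything else: single quotes, with an embedded ' written as '\''.
    return "'" + path.replace("'", "'\\''") + "'"
```

Only `$HOME` is left for the shell to expand; every other special character in the path reaches sqlite3 as a literal.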
Alan Wizemann	593b4e62cb	feat(remote): replace SQLite snapshot pipeline with SSH query streaming The remote-DB pipeline pulled the entire state.db down via scp on every refresh tick. For the issue #74 user (4.87 GB DB) that meant ~7-min per-snapshot wall time even with the size-aware-timeout fix, ~30 GB/hour upload, and data permanently 5–10 minutes stale. This isn't a bug to patch — it's the wrong architecture for any non-trivial remote DB. Replace it with per-query streaming over SSH. Each SQL statement becomes one ssh round-trip running `sqlite3 -readonly -json` against the live remote DB. ControlMaster keeps the channel warm at ~5 ms overhead; sqlite3 cold-start adds ~30–50 ms; total ~50–100 ms per query vs. the old multi-minute snapshot. Bandwidth scales with query result size, not DB size. What changed: * New `HermesQueryBackend` protocol and two implementations: `LocalSQLiteBackend` (libsqlite3 in-process — local performance unchanged) and `RemoteSQLiteBackend` (sqlite3 over SSH per query with batched-statement support for multi-query view loads). * `SQLValue` and `Row` types as the typed boundary between backends and the row parsers. `SQLValueInliner` substitutes `?` placeholders with SQLite-escaped literals for the remote-CLI codepath (local backend keeps real `sqlite3_bind_*`). * `ServerTransport` swaps `snapshotSQLite` + `cachedSnapshotPath` for `streamScript(_:timeout:)`. SSHTransport delegates to the existing `SSHScriptRunner`; CitadelServerTransport (iOS) base64-encodes the script + decodes remotely via Citadel's exec channel since stdin pipes aren't supported there yet. * `HermesDataService` becomes a thin facade — every fetch* method routes through `backend.query(...)`. Public API is unchanged for view-model callers; `lastSnapshotMtime`/`isUsingStaleSnapshot`/ `staleAge` removed (had zero UI consumers). 
* New `dashboardSnapshot()` and `insightsSnapshot(since:)` batched calls turn Dashboard's 4-query and Insights' 5-query view loads into one SSH round-trip each (~80–100 ms total instead of ~280 ms naive). DashboardViewModel and InsightsViewModel updated to use them. * One-time launch migration in `scarfApp` wipes the orphaned `~/Library/Caches/scarf/snapshots/` directory (could be 5 GB+ for the issue #74 user). JSON parsing detail: sqlite3 -json preserves SELECT column order in the raw bytes, but `[String: Any]` from NSJSONSerialization doesn't. The remote backend extracts column ordering by walking the first object's literal bytes — without this, every positional row read (`row.string(at: 0)`) would silently return wrong columns. Tests: 41 new across `SQLValueInlinerTests`, `HermesDataServiceBackendTests` (mock backend) and `RemoteSQLiteBackendTests` (integration via local sqlite3 binary). Full suite 262/262 passing. Builds clean on Mac and iOS. Ships as part of v2.7. Refs #74 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 13:09:06 +02:00
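The placeholder inlining mentioned for the remote-CLI codepath can be illustrated like this. It is a rough Python sketch assuming standard SQLite literal syntax (doubled single quotes in text, `X'..'` for blobs); the function names are hypothetical and the real `SQLValueInliner` may handle more cases.

```python
def sqlite_literal(value):
    """Render a Python value as a SQLite literal."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):
        return "1" if value else "0"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, bytes):
        return "X'" + value.hex() + "'"
    # Text: wrap in single quotes and double any embedded single quote.
    return "'" + str(value).replace("'", "''") + "'"


def inline_placeholders(sql, params):
    """Replace each `?` outside a quoted string with the next literal."""
    out, values, in_string = [], iter(params), False
    for ch in sql:
        if ch == "'":
            in_string = not in_string
        if ch == "?" and not in_string:
            out.append(sqlite_literal(next(values)))
        else:
            out.append(ch)
    return "".join(out)
```

Inlining is only needed where statements travel as text to a `sqlite3` process; an in-process backend can keep using real parameter binding.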
Alan Wizemann	de36411a8d	fix(remote): size-aware snapshot timeouts and partial-file cleanup (#74) The remote-DB snapshot pipeline was hardcoded to a 120s scp timeout and a 60s remote-backup timeout. For users with a multi-GB state.db (the report cites 4.87 GB), 120s is wildly insufficient — at typical home upload speeds (5-50 Mbps) a 5GB transfer takes 13 minutes to several hours. scp gets killed mid-transfer, leaves a partially-written .db at the cache path, and every subsequent attempt opens that corrupt file with sqlite_open returning garbage. Symptom: SSH connects, all diagnostics pass, but Dashboard / Sessions / Memory show no data. Changes to SSHTransport.snapshotSQLite: * Probe `stat` on the remote DB before starting. Drives both the timeout budget and a local-disk-space pre-flight (refuses to start if local Caches volume can't hold size + 500MB margin). * Adaptive timeouts based on remote size: - backup: 60s base + 1s per 100MB, capped at 600s. - scp: 300s base + 0.5s per MB (≈2 MB/s minimum throughput), capped at 3600s. Defaults of 60s/300s when stat fails (still up from 120s on scp). * Add `-C` to scp args. SQLite DBs have lots of zero-padded empty pages and typically compress 30-50% in transit. * On any failure path, remove the partial local snapshot file so the next attempt starts fresh instead of opening a corrupt DB. * Rewrite the generic "Command timed out after Ns" error into a specific "Snapshot transfer timed out after Ns pulling X.X GB state.db from <host>" so users on slow links know what hit the wall instead of seeing a meaningless number. Cannot reproduce locally (no 5GB state.db on hand), but the failure mode is unambiguous from code reading: hardcoded 120s vs. real-world multi-GB transfer durations. Closes #74 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 11:25:38 +02:00
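The adaptive timeout budget listed in this commit works out as in the sketch below. It is a Python illustration under one stated assumption: "MB" is read as decimal megabytes (10^6 bytes), which the message does not specify. `snapshot_timeouts` is a hypothetical name.

```python
def snapshot_timeouts(remote_size_bytes):
    """Return (backup_seconds, scp_seconds) for a remote DB of the given size.

    Pass None when the remote `stat` probe failed.
    """
    if remote_size_bytes is None:
        return 60.0, 300.0  # defaults when the size is unknown
    size_mb = remote_size_bytes / 1_000_000  # assumption: decimal megabytes
    backup = min(60.0 + size_mb / 100, 600.0)  # 60s base + 1s per 100MB
    scp = min(300.0 + 0.5 * size_mb, 3600.0)  # 300s base + 0.5s per MB
    return backup, scp
```

Under that assumption, the 4.87 GB database from the report gets roughly 109 s for the backup step and 2735 s for scp, both below their caps.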
Alan Wizemann	6a7ac21ebe	chore: Bump version to 2.6.5 v2.6.5	2026-05-03 22:15:05 +02:00
Alan Wizemann	5be67282d8	test(layer-b): full Install → Configure → Open → Uninstall journey XCUITest (#73 ) Closes the deferred Layer B install-drive that v2.7's smoke test left as future work. The new test (`testFullCatalogToInstallToDashboardJourney`) drives the full install/uninstall pipeline end-to-end and validates 9 assertion points along the way: - Window surfaces under `--scarf-test-mode` - Sidebar navigation to Projects - Install sheet appears (URL handoff via launch arg) - Parent-dir field accepts custom path + Continue - Configure sheet renders + commit clicks - Confirm Install runs the install pipeline - Open Project advances to success view - Project row appears in sidebar with uniquified name - Right-click Uninstall + confirm Remove + Done removes the row Runs in ~30s green on the dev Mac. ## What needed wiring up SwiftUI Menu / NSToolbarItem accessibility-bridging. macOS toolbar Menus don't propagate `.accessibilityIdentifier` through to XCUITest — neither the menu trigger NOR the popup contents are queryable by ID. Verified by tree-dump diagnostics. The test sidesteps this entirely by routing the install URL through a new `--scarf-test-install-url <https-url>` launch arg that calls `TemplateURLRouter.shared.handle(scarf://install?url=...)` at App init, gated on `TestModeFlags.shared.isTestMode`. Production launches (no flag) untouched. Accessibility IDs added on the new install/uninstall path: - `templateConfig.commitButton`, `templateConfig.cancelButton` - `projects.row.<name>`, `sidebar.section.<rawValue>` - `projects.contextMenu.uninstallTemplate` - `templateUninstall.confirmRemove` - `templateInstall.success.openProject` - `templateUninstall.success.done` Sandboxed-runner caveat. 
The XCUITest runner's `/tmp` is sandbox-protected (createDirectory throws EPERM); we use `NSTemporaryDirectory()` which resolves to the runner's container tmp (`~/Library/Containers/com.scarfUITests.xctrunner/Data/tmp/`), which the unsandboxed Scarf app can read since it has full disk access. ## Known cohabitation hazard (pre-existing uninstaller bug) If the dev Mac already has a project from the same template installed, the install pipeline uniquifies the new project's name ("HackerNews Daily Digest 2") but BOTH projects' cron jobs get registered under the same `[tmpl:awizemann/hackernews-digest] Daily HN digest` name. `ProjectTemplateUninstaller.loadUninstallPlan` resolves cron jobs to remove by NAME and can target the wrong project's job. The Layer B test surfaces this — manifests as: test passes, the dev's real project's cron job disappears. Fix (separate work): store cron-job IDs in `<project>/.scarf/template.lock.json` at install time and resolve by ID at uninstall time. Until then, the test docstring warns about cohabitation; recovery is `hermes cron create` to recreate the lost job. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 22:09:50 +02:00
Alan Wizemann	c661945a1f	feat(cron): auth-error banner + running indicator + per-job log tail (#72) Cron rows now surface the same OAuth-refresh-revoked recovery flow as chat instead of a generic red dot, plus three previously-missing observability cues: - ACPErrorHint.classify is reused on `job.lastError`. When it returns `oauthRefreshRevoked(provider)` the detail pane shows the human hint + a "Re-authenticate" button that drops the user into Credential Pools via `coordinator.pendingOAuthReauth = provider` — same wiring ChatView's banner uses. Unrecognized errors fall back to the legacy red `lastError` text (no regression). - Row dot turns blue + pulses when `state == "running"` (taking precedence over disabled / error / success); the detail header gains a `ScarfBadge("running…", kind: .info)` next to active/paused. No new polling — `HermesFileWatcher.lastChangeDate` (already wired into ActivityView/Logs) drives `CronViewModel.load()` so state flips surface within a watcher tick. - "LAST RUN OUTPUT" replaces the inline `LAST OUTPUT` block with a collapsible panel: a one-line summary (`<timestamp> — ok\|error\|running…`) always visible, full monospaced terminal-style scroll view on expand, auto-scrolls to bottom when new runs land. Also fixes a pre-existing bug in `HermesFileService.loadCronOutput`: Hermes nests per-run output under `~/.hermes/cron/output/<jobId>/<ts>.md` but the loader treated the dir as flat, so the cron output panel never rendered any content. The fix walks the per-job subdir + keeps the legacy flat-file fallback for older Hermes layouts. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 22:09:21 +02:00
Alan Wizemann	f5f8dc30b6	Dogfooding templates: HN Digest + in-app catalog browser + test harness (#71) * feat(templates): hackernews-digest template + dogfooding test harness First pass of the dogfooding-templates initiative. Each pre-release cycle ships one new official `.scarftemplate` and uses installing/exercising that template as the regression test. v1 lands the harness scaffolding plus the first template under it. - HackerNews Daily Digest template (`templates/awizemann/hackernews-digest/`): config-driven (min_score / max_items / topics) cron-only template. No secrets, which keeps the harness minimal until the fake-Keychain shim lands. Bundle validates against `tools/build-catalog.py`; entry added to `templates/catalog.json`. - `SCARF_HERMES_HOME` env-var override at `HermesProfileResolver`: the seam every Layer-B test relies on to drive Scarf against an isolated Hermes home. Bypasses cache + active_profile lookup; rejects relative paths. 5 unit tests + 3 ServerContext integration tests. - `TestModeFlags.shared.isTestMode`: reads `--scarf-test-mode` once from `CommandLine.arguments`. Wiring only; gating sites (Sparkle, capability probe, first-run walkthrough) land as Layer-B exercises them. - Layer A (`scarf/scarfTests/TemplateE2ETests.swift`): parses + plans the shipped HN bundle the way the app does at install time; asserts manifest, config schema, dashboard widgets, and cron prompt contract. Mirrors the existing site-status-checker coverage. - Layer B scaffold (`scarf/scarfUITests/TemplateInstallUITests.swift`): proves the launch-arg + env-var plumbing reaches Scarf. Full install click-through deferred until fixture-Hermes-home and accessibility IDs land. Wiki pages added separately on the `.wiki-worktree` branch: - `Template-Ideas.md`: backlog of 9 v1-feasible templates + full-spec v3 epic for Project-Site-as-Living-Surface (eBay listings use case). - `Test-Harness.md`: contributor guide for extending the harness. 
Verification: scarfTests 124/124, ScarfCore 220/220, new Layer A 3/3, Layer B scaffold 1/1, build-catalog.py + its 28 unit tests all green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(test-harness): Layer B pivot to real ~/.hermes + a11y IDs + Sparkle gating Discovered during Layer B work that XCUITest runners are sandboxed: they can read ~/.hermes/ but writes throw NSFileWriteNoPermissionError. That kills the SCARF_HERMES_HOME-based isolation pattern for UI tests — snapshot/restore from inside the runner can't work. Pivot: - Layer B drives the real ~/.hermes the dev Mac is already running against. The harness assumes a working Hermes install (XCTSkip if the binary isn't there). Cleanup is via the app's own UI flows (which have full disk access), not direct file I/O. Layer A keeps its env-var seam — those tests run inside the host app's address space and write freely. - SwiftUI's WindowGroup(for: ServerID.self) doesn't auto-surface a window on a fresh XCUIApplication.launch(). The harness sends ⌘1 (the "Open Server → Local" menu shortcut wired in scarfApp.swift's OpenServerCommands) to take the same code path real users hit via Dock click. - Real user home resolved via getpwuid(getuid()) rather than NSHomeDirectory(), which inside the sandboxed runner returns ~/Library/Containers/com.scarfUITests.xctrunner/Data. - 8 accessibility IDs added on the install path so the next iteration can drive the full Templates → Install from URL → Parent dir → Confirm Install flow without depending on view-tree label scraping: templates.toolbar.menu, templates.installFromFile, templates.installFromURL, templates.installURL.field, templates.installURL.confirm, templateInstall.parentDir.field, templateInstall.parentDir.continue, templateInstall.confirmInstall. - TestModeFlags.shared.isTestMode now gates UpdaterService — --scarf-test-mode launches Sparkle inert so update prompts don't pop on top of an XCUITest-driven window. 
Production launches unchanged. FixtureHermesHome.swift removed — the fixture-tmpdir approach is abandoned in favour of using the real installation. Layer A's SCARF_HERMES_HOME tests still pass; they just don't need a populated home to exercise path derivation. Verification: scarfTests 124/124, ScarfCore 220/220, Layer B smoke 1/1 (after fresh build — XCUITest is sensitive to stale binaries). catalog.py --check still green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(chat): clip placeholder to TextEditor bounds and clear it on focus Two related bugs in the Mac chat composer's placeholder overlay: * The "Message Hermes… / for commands · drag images to attach" hint had no width constraint, so on narrower window geometries it visibly overflowed past the rounded TextEditor boundary. Add `lineLimit(1)`, `truncationMode(.tail)`, and `frame(maxWidth: .infinity, alignment: .leading)` so it ellipsizes inside the field instead. * The opacity formula `text.isEmpty ? 1 : 0` only hid the placeholder once content was typed, not when the field gained focus. Standard NSTextField / UITextField semantics clear the placeholder on focus. Switch to `(text.isEmpty && !isFocused) ? 1 : 0` so the hint disappears the moment the user clicks into the field. The opaque-background ghosting mitigation from #65 is preserved unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(chat): surface OAuth refresh-revoked errors with in-app re-auth When an OAuth provider's refresh token was revoked, Hermes printed "Refresh session has been revoked. Run `hermes model` to re-authenticate." to stderr but Scarf swallowed it — the user saw a typing indicator that silently disappeared with no banner, no system message, no actionable hint. The error classifier had no pattern for OAuth revocation. - `ACPErrorHint.classify` now returns a `Classification` struct carrying the hint plus an optional `oauthProvider` name. 
New patterns match "Refresh session has been revoked", "re-authenticate", and 401-with-OAuth-provider-name (whole-word so `anthropicapi` doesn't false-match `anthropic`). Provider extraction lets the UI dispatch the right re-auth flow. - Chat error banner ([ChatView.swift]) gains a "Re-authenticate" button when an OAuth provider was identified — sets `AppCoordinator.pendingOAuthReauth` and routes to Credential Pools. - Credential Pools view consumes the hand-off slot to auto-present AddCredentialSheet seeded with the affected provider, AND adds a per-row "Re-authenticate" button on every OAuth provider so users who go straight there don't have to retype the provider name. - `AddCredentialSheet` accepts an optional `initialProvider` that pre-fills providerID + authType=.oauth; the existing Nous-vs-PKCE- vs-CLI gate dispatches re-auth identically to first-time setup — reuses the same `OAuthFlowController` / `NousSignInSheet` plumbing, no new flow code. Verification: ScarfCore 221/221 (incl. new errorHintsClassifyOAuthRefreshRevoked covering the four patterns + word-boundary guard); Mac app builds clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(catalog): in-app template catalog browser + sentinel-marker test isolation The v2.8 catalog browser surfaces every shipped .scarftemplate from awizemann.github.io/scarf/templates/catalog.json directly in Scarf. Users now discover and install templates without leaving the app. Closes the gap that publishing the catalog updated the website but nothing inside Scarf. Architecture mirrors NousModelCatalogService 1:1: cache-first fetch, 24h TTL at ~/.hermes/scarf/catalog_cache.json, result enum (fresh / cache / fallback) with bundled fallback so a fresh-install / offline user still sees something. Search + category filter + sort (awizemann official first). Detail page renders entry.config schema preview without separate README fetch — what's in catalog.json is what we render. 
Install hands the HTTPS URL to the existing TemplateInstallerViewModel.openRemoteURL flow; nothing about the installer itself changes. Files: - Core/Models/CatalogEntry.swift — Decodable mirror of catalog.json per-template shape. Identity-based Equatable/Hashable on `id`. - Core/Services/CatalogService.swift — fetch + cache + fallback - Core/Services/InstalledTemplatesIndex.swift — walks projects.json + template.lock.json to build [templateId: version] map; classify() helper for Installed / Update available / Not installed badges - Features/Templates/ViewModels/CatalogViewModel.swift — @Observable - Features/Templates/Views/{CatalogView,CatalogRowView,CatalogDetailView,CatalogCategoryFilter}.swift - Packages/ScarfCore/.../HermesPathSet.swift — adds catalogCache path - Features/Projects/Views/ProjectsView.swift — Templates toolbar menu now opens with "Browse Catalog…"; sheet binding. Tests (20 new, all passing in isolation): - CatalogServiceTests (6) — live catalog.json snapshot, cache lifecycle, staleness boundary, schema-version mismatch rejection, bundled fallback - InstalledTemplatesIndexTests (5) — empty registry, templated project, ad-hoc project skip, corrupt lock skip, classify() branches - CatalogViewModelTests (6) — search filter, category filter, official-first sort, deduped categories, install state, install URL pass-through Accessibility IDs (6, on the catalog path): templates.browseCatalog, catalog.searchField, catalog.refreshButton, catalog.row.<detailSlug>, catalog.categoryFilter, catalogDetail.installButton. ## Sentinel-marker hardening on SCARF_HERMES_HOME (incident response) While iterating on v2.8 tests, the env-var override pattern racing under Swift Testing's parallel-suite scheduler caused ~/.hermes/scarf/projects.json to be overwritten with fixture data from ProjectsViewModelTests. Recovered the user's projects from the on-disk dirs they referenced + cron-job prompt paths (6 projects restored). 
To make this class of incident impossible going forward: HermesProfileResolver.scarfHermesHomeOverride() now requires the override path to contain a sentinel marker file (`.scarf-test-home-marker`). Without the marker, the override is ignored and Scarf falls through to the real ~/.hermes/. Even if a test crashes mid-teardown leaving the env var set, even if the var leaks to a non-test process, even if a misconfigured launchctl plist exports it — the override only activates against directories that explicitly opt in by carrying the marker. Tests drop the marker in their tmpdir setUp; production never carries it. HermesProfileResolverTests gains overrideIsIgnoredWhenMarkerMissing which verifies the guard is load-bearing. All test files using SCARF_HERMES_HOME (CatalogServiceTests, CatalogViewModelTests, InstalledTemplatesIndexTests, TemplateE2ETests) now drop the marker before setenv. Verification: 20/20 v2.8 + v2.7 hardened tests pass; 45/45 adjacent existing tests pass; ScarfCore package tests pass (221/221); catalog validator clean (3 templates); wiki secret-scan clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(swift6): retroactive conformance + verbatim help text + xcstrings refresh Three small Swift 6 compile-cleanups that landed during the dogfooding-templates iteration: - MessageSpeechService — drop `@preconcurrency` on the AVSpeechSynthesizerDelegate conformance now that the protocol's Sendable annotations are upstreamed. - ChatView — mark `RichChatViewModel.PendingPermission: Identifiable` as `@retroactive`. We don't own either the type or the protocol; the Swift 6 compiler flags this so downstream breakage is loud if ScarfCore ever adds the conformance upstream. - CredentialPoolsView — wrap the `.help(...)` string in `Text(verbatim:)` so the backticks render literally instead of being interpreted as markdown inline-code by the LocalizedStringKey overload (which `.help(_:)` rejects styled). 
Localizable.xcstrings: auto-generated catalog refresh picking up the new active-profile + chat error-hint strings landed in earlier commits on this branch (`acd3692`, `301806d`). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(catalog): error logging + MainActor I/O + semver pre-release + decoder fault tolerance - InstalledTemplatesIndex: replace bare `try?` reads/decodes with logged do/catch so corrupt registry/lock files leave a breadcrumb instead of a silent nil. - InstalledTemplatesIndex.isVersionNewer: handle pre-release suffixes per semver §11 — `1.0.0-beta` no longer reports as newer than `1.0.0`, preventing a ghost "Update available" that would downgrade users. - CatalogViewModel.refresh: dispatch the synchronous index walk through `Task.detached` so registry + N lock-file reads don't run on @MainActor. - Catalog decoder: per-element fault tolerance via custom `init(from:)` — one malformed catalog entry is dropped with a logged warning instead of failing the whole catalog decode (honors the per-entry doc-comment contract). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 20:04:13 +02:00
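The pre-release handling mentioned for `InstalledTemplatesIndex.isVersionNewer` follows the precedence rules in semver §11. A minimal Python sketch of those rules is shown here. It is not the Swift implementation; the function name only mirrors the commit's wording, and the sketch assumes well-formed `MAJOR.MINOR.PATCH[-prerelease][+build]` strings.

```python
from typing import List, Tuple


def _parse(version: str) -> Tuple[List[int], List[str]]:
    """Split a version into numeric core parts and pre-release identifiers."""
    version = version.split("+", 1)[0]  # build metadata never affects precedence
    core, _, pre = version.partition("-")
    return [int(part) for part in core.split(".")], (pre.split(".") if pre else [])


def _compare_pre(a: List[str], b: List[str]) -> int:
    """Return 1, 0, or -1 for pre-release lists, per semver section 11."""
    if not a and not b:
        return 0
    if not a:
        return 1   # a plain release outranks any pre-release of the same core
    if not b:
        return -1
    for x, y in zip(a, b):
        if x == y:
            continue
        x_num, y_num = x.isdigit(), y.isdigit()
        if x_num and y_num:
            if int(x) != int(y):
                return 1 if int(x) > int(y) else -1
            continue
        if x_num != y_num:
            return -1 if x_num else 1  # numeric identifiers rank below alphanumeric
        return 1 if x > y else -1      # both alphanumeric: plain string order
    # All shared identifiers equal: the longer list has higher precedence.
    return (len(a) > len(b)) - (len(a) < len(b))


def is_version_newer(candidate: str, installed: str) -> bool:
    """True only when `candidate` has strictly higher precedence than `installed`."""
    c_core, c_pre = _parse(candidate)
    i_core, i_pre = _parse(installed)
    if c_core != i_core:
        return c_core > i_core
    return _compare_pre(c_pre, i_pre) > 0
```

Under these rules `is_version_newer("1.0.0-beta", "1.0.0")` is false, which is the case the commit calls out: a catalog entry at `1.0.0-beta` should not show "Update available" to a user already on `1.0.0`.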
Alan Wizemann	34d315793b	fix(chat): clip placeholder to TextEditor bounds and clear it on focus Two related bugs in the Mac chat composer's placeholder overlay: * The "Message Hermes… / for commands · drag images to attach" hint had no width constraint, so on narrower window geometries it visibly overflowed past the rounded TextEditor boundary. Add `lineLimit(1)`, `truncationMode(.tail)`, and `frame(maxWidth: .infinity, alignment: .leading)` so it ellipsizes inside the field instead. * The opacity formula `text.isEmpty ? 1 : 0` only hid the placeholder once content was typed, not when the field gained focus. Standard NSTextField / UITextField semantics clear the placeholder on focus. Switch to `(text.isEmpty && !isFocused) ? 1 : 0` so the hint disappears the moment the user clicks into the field. The opaque-background ghosting mitigation from #65 is preserved unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 16:48:32 +02:00
Alan Wizemann	acd3692faf	fix(profiles): switch-and-relaunch flow + active-profile chip + structured logs Profile selection had no apparent effect on Webhooks/Sessions/SOUL.md/Memory even after restart in some user setups. The path-resolution code reads ~/.hermes/active_profile correctly on paper, so the failure mode is likely environment-specific (HERMES_HOME exported in the shell, in-process state that didn't reset on what the user perceived as a restart, etc). Layer a defense that's correct regardless of root cause: * New AppRelauncher helper spawns a fresh `open -n <bundleURL>` and asks the current process to terminate after a 250ms delay. Refuses to fire from Xcode/DerivedData (the .debugBuild guard) so debug sessions don't lose their attached debugger. * ProfilesViewModel.switchAndRelaunch runs `hermes profile use`, calls HermesProfileResolver.invalidateCache(), then relaunches via the helper. Existing switchTo() also gains the cache-invalidation step so the context-menu "Set Active (no relaunch)" path stays self-consistent. * ProfilesView replaces the passive "Restart Scarf after switching" text with a confirmation-gated `Switch & Relaunch` primary button on the detail pane plus the same item in each row's context menu. Confirmation dialog flags that all Scarf windows will close. * SidebarView header gains a brand-tinted ScarfBadge showing the currently-active profile on local contexts. Click to jump to the Profiles tab. The chip refreshes on `selectedSection` change so a terminal-side `hermes profile use` is visible after the next nav. * HermesProfileResolver success logs gain `name=…, home=…, source=…` key=value structure across all three resolution paths (file / file-default / default-no-file). `log show … | grep ProfileResolver` now answers "what did the resolver decide?" unambiguously for support requests. Closes #70 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 13:18:10 +02:00
Alan Wizemann	ab615f0c28	feat(ios-chat): redesign composer with HIG touch targets and clear disabled state Send button is now a 44pt circular target with an explicit color swap (rust accent → background-tertiary) on disable, instead of relying on SwiftUI's default opacity dim — addresses the "first tap doesn't register" complaint by making the inactive state visibly different in both light and dark mode. Paperclip and text field both gain a 44pt minimum height so the row feels modern and roomy. The text field swaps `.roundedBorder` for a plain field with a ScarfRadius.xl rounded fill (ScarfColor.backgroundSecondary) and a borderStrong stroke. Outer paddings and HStack spacing migrate from magic numbers to ScarfSpace tokens. Preserves verbatim: the `.toolbar { ToolbarItemGroup(placement: .keyboard) }` keyboard-dismiss chevron (issue #51), draft persistence, .submitLabel, @FocusState, photo-picker wiring, attachment-strip rendering, and every .disabled() predicate. Closes #69 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 13:14:09 +02:00
Alan Wizemann	982ed7da92	chore: bump iOS build to 30 for TestFlight iOS-only patch carrying the rotation lock + chat-start preflight off-MainActor fixes from `cb164f0`. Mac side stays on the v2.6.0 binary already shipped (build 29 archive); this build number bump only affects future Mac archives, not the one already notarized. Uploaded to App Store Connect via altool — Apple processing now, will land in TestFlight once the binary clears the post-upload scan (typically 5–15 min).	2026-05-01 16:20:13 +02:00
Alan Wizemann	cb164f07f9	fix(ios): lock iPhone to portrait + move chat-start preflight off MainActor Two iOS-specific crash classes from the v2.5.1 TestFlight feedback round: Rotation crash — locked the iPhone target to `UIInterfaceOrientationPortrait` only (was Portrait + LandscapeLeft + LandscapeRight). The phone can't rotate the app at all anymore, so any layout path that wasn't audited for size-class transitions is no longer reachable. iPad orientation list left alone (target device family is iPhone-only anyway). "Crash while typing" / "trying to continue an existing conversation" — `ChatController.passModelPreflight()` was doing a synchronous SSH read (`context.readText(configYAML)`) on `@MainActor` during chat-start. On a remote ScarfGo context that blocks the main thread for seconds; iOS's non-responsive-app watchdog kills the process around 10s. To the user this surfaces as a "crash" while they're typing — they kept tapping the keyboard while the connect was hung. Move the read to `Task.detached` and await it; the UI stays responsive while the SSH I/O drains. Three callers (`start`, `start(projectPath:)`, `startResuming`) updated to `await passModelPreflight(...)` — they were already async. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 16:03:28 +02:00
Alan Wizemann	1dbdf9d079	chore: ignore local crashes/ triage directory TestFlight feedback / crash JSONs land here while we're working through an iOS fix round. They carry tester PII (emails, carriers, locales) and aren't meant for the public repo. Kept local-only; deleted after the round closes.	2026-05-01 15:57:41 +02:00
Alan Wizemann	101488cd0d	docs(readme): bump What's New to v2.6.0 + Hermes v0.12 catch-up Replaces the 2.5 "What's New" block with a 2.6 summary that covers the Hermes v0.12 surfaces (Curator, multimodal images, 5 new providers, Teams + Yuanbao, Kanban, Skills v0.12, cron --workdir, settings deltas, ScarfGo Webhooks/Plugins/Profiles) and the post-merge chat fix round (#67/#68/#65/#62/#63/#64/#66/ #61). Verified-versions table gains v0.12.0 as the current target; recommended-Hermes line points at v0.12.0+ for full feature support. ScarfGo block kept but de-emphasised since it shipped in 2.5.	2026-05-01 15:55:16 +02:00
Alan Wizemann	03c996ee80	chore: Bump version to 2.6.0 v2.6.0	2026-05-01 15:42:48 +02:00


373 Commits