Mirror of https://github.com/awizemann/scarf.git (synced 2026-05-10 18:44:45 +00:00)
Merge pull request #81 from awizemann/coordination/v2.8.0-plans
docs(v2.8): coordinator review + 8 work-stream plans for Hermes v0.13.0 catch-up
@@ -0,0 +1,112 @@
# v2.8.0 Coordinator Review — Hermes v0.13.0 catch-up

**Status:** all 8 work-stream plans drafted; WS-1 (capability flags) committed on branch `ws-1-capabilities-v0.13` (PR #80). This document is the coordinator's cross-stream review compiled from each per-stream plan's _Open Questions_ section, file inventory, and confidence rating. It exists so the user can review the v2.8.0 implementation surface in one read instead of eight.

## Plan inventory

| Stream | File | Lines | Confidence | Open Q's | Files touched | Branch |
| --- | --- | --: | --- | --: | --: | --- |
| WS-2 | [WS-2-goals-and-queue-plan.md](WS-2-goals-and-queue-plan.md) | ~600 | medium-high | 7 | ~6 | `ws-2-goals-and-queue` |
| WS-3 | [WS-3-kanban-v0.13-plan.md](WS-3-kanban-v0.13-plan.md) | 947 | medium-high | 7 | 12 (1 new) | `ws-3-kanban-v0.13` |
| WS-4 | [WS-4-curator-archive-plan.md](WS-4-curator-archive-plan.md) | 561 | medium-high | 6 | ~10 | `ws-4-curator-archive` |
| WS-5 | [WS-5-gateway-v0.13-plan.md](WS-5-gateway-v0.13-plan.md) | 520 | medium-high | 8 | ~17 (5 new) | `ws-5-gateway-v0.13` |
| WS-6 | [WS-6-providers-v0.13-plan.md](WS-6-providers-v0.13-plan.md) | 625 | high (arch) / medium (key) | 5 | 8 | `ws-6-providers-v0.13` |
| WS-7 | [WS-7-settings-v0.13-plan.md](WS-7-settings-v0.13-plan.md) | 628 | medium-high | 8 | 17 | `ws-7-settings-v0.13` |
| WS-8 | [WS-8-ux-v0.13-plan.md](WS-8-ux-v0.13-plan.md) | 580 | high (5 of 6) / medium (1) | 5 | 12 | `ws-8-ux-v0.13` |
| WS-9 | [WS-9-ios-v0.13-plan.md](WS-9-ios-v0.13-plan.md) | 926 | medium-high | 8 | 7 | `ws-9-ios-v0.13` |

**Total v2.8.0 surface:** ~89 files touched (with overlap; net unique ~75), ~5400 lines of plan, 54 distinct open questions across 8 streams.

## Cross-stream collisions (coordinator-tracked)

These files appear in more than one work-stream and need explicit sequencing:

| File | Streams | Resolution |
| --- | --- | --- |
| `RichChatViewModel.swift` | WS-2 (`/goal`/`/queue`), WS-8 (`/new <name>` help text) | WS-8 lands AFTER WS-2; the `/new <name>` change is one-line and rebases trivially. |
| `SessionInfoBar` (chat status bar) | WS-2 (queue chip), WS-8 (compression count) | Both add SwiftUI children to the same HStack — order-independent. WS-8 lands after WS-2 to avoid file-level conflicts. |
| `HermesCapabilities.swift` | WS-1 (all flags), WS-8 + WS-9 (request `isV013OrLater` helper) | Decided: add `isV013OrLater` helper to WS-1 PR (one-line, lands cleanly). See _Decision A_ below. |
| `HermesConfig` model | WS-5 (gateway allowlists), WS-6 (`image_gen.model`, `openrouter.response_cache`), WS-7 (mcp/cron/web-tools/profiles) | Each work-stream extends a different namespace. Touch the same file; merge resolution mechanical. |
| iOS surfaces | WS-9 consumes WS-2/WS-3/WS-4/WS-5 model fields | WS-9 lands LAST in the v2.8.0 cycle. Hard sequencing constraint. |

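The `isV013OrLater` helper requested by WS-8 and WS-9 could be as small as the sketch below. The `HermesVersion` type and its `parse` function are hypothetical stand-ins used only for illustration; the real `HermesCapabilities` from WS-1 may store and parse the version differently.

```swift
import Foundation

/// Hypothetical semantic-version value; not the actual WS-1 type.
struct HermesVersion: Comparable {
    let major: Int
    let minor: Int
    let patch: Int

    static func < (lhs: HermesVersion, rhs: HermesVersion) -> Bool {
        (lhs.major, lhs.minor, lhs.patch) < (rhs.major, rhs.minor, rhs.patch)
    }

    /// Extracts the first `vMAJOR.MINOR.PATCH` token from a version line such as
    /// "Hermes Agent v0.13.0 (2026.5.7)". Returns nil when no such token exists.
    static func parse(_ line: String) -> HermesVersion? {
        for token in line.split(separator: " ") where token.hasPrefix("v") {
            let parts = token.dropFirst().split(separator: ".").compactMap { Int($0) }
            if parts.count == 3 {
                return HermesVersion(major: parts[0], minor: parts[1], patch: parts[2])
            }
        }
        return nil
    }

    /// One-line helper in the spirit of Decision A.
    var isV013OrLater: Bool { self >= HermesVersion(major: 0, minor: 13, patch: 0) }
}
```

Keeping the comparison in one computed property means WS-8 and WS-9 gate on a single name instead of repeating version arithmetic at each callsite.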
## Open-questions matrix (cluster-organized)

Of 54 questions across the 8 plans, **45 are wire-shape unknowns** that can only be resolved by inspecting a real Hermes v0.13.0 install (i.e. they need a v0.13 host to dogfood against, since the release notes don't pin every CLI flag, JSON field, or YAML key). The remaining 9 are Scarf-side architectural choices that the agents already recommended; they need user adjudication.

### Cluster A — wire-shape unknowns (resolve at integration time, not before implementation starts)

These are the questions where each plan agent gave a best-inference default, marked the spot with a `// TODO` comment, and recommended verification when a v0.13 host is reachable. The implementation can proceed safely with these defaults; if any are wrong, the fix is a one-line edit + a new test fixture.

- **WS-2:** goal-state read-back channel (Q1), `/queue --clear` syntax (Q2), `/queue` argument shape (Q5), `/goal` non-interruptive on the wire (Q7)
- **WS-3:** hallucination verb name (Q1), diagnostics location (task vs run, Q2), `set_max_retries` post-create (Q3), failure-counter unification field (Q4), darwin-zombie kind (Q5), default `max_retries` value (Q6), `kanban diagnose <id>` verb (Q7)
- **WS-4:** `prune --dry-run` flag (Q1), `--json` on read verbs (Q2), single-skill prune (Q3), sync-run timeout (Q4)
- **WS-5:** Google Chat platform identifier (Q1), allowlist YAML key path (Q2), `gateway list --json` shape (Q3), `[[as_document]]` discoverability (Q6)
- **WS-6:** `openrouter.response_cache.enabled` exact key (Q1), default value (Q2), grok rename old-slot redirect (Q4), `models_dev_cache.json` refresh on clean install (Q5)
- **WS-7:** MCP transport names (Q1), `sse_read_timeout` default (Q2), `--transport sse` flag spelling (Q3), `--no-agent` toggle-off shape (Q4), argparse + `--no-agent` (Q5), web-tools backend lists (Q6), `web_tools.backend` legacy fallback (Q7), `--no-skills` × `--clone-all` interaction (Q8)
- **WS-8:** compression-count wire field name (Q1), xAI TTS config keys (Q2), `display.language` empty-string vs `"en"` default (Q3)

**Recommended resolution:** proceed with implementation against the agents' inferred defaults. Each implementation agent should be briefed to mark its TODO callsites. A coordinator pass before merging WS-2…WS-9 (after the user has dogfooded a v0.13 host) confirms or fixes each in <30 minutes total.

### Cluster B — Scarf-side architectural choices (need user adjudication)

These are the 9 questions where the user's input directly shapes the implementation:

| ID | Question | Agent's recommendation |
| --- | --- | --- |
| **A** | Add `isV013OrLater` helper to WS-1? | **Yes** — both WS-8 and WS-9 want it. One-line addition. Land in the existing WS-1 PR before merging. |
| **B** | "Auto-resumed from checkpoint" indicator | **Defer to v2.8.1** (WS-2 Q3). Hermes v0.13's auto-resume signal isn't documented; surfacing it requires a wire-format we don't have yet. |
| **C** | `/queue --clear` button when syntax unconfirmed | **Remove the "Clear all" button from the queue popover until syntax is confirmed.** Local-only clear that lies about server state is worse than no button. |
| **D** | Curator prune confirm UX | **Custom sheet matching template-uninstall** (WS-4 Q5). Enumerated list + asymmetric keyboard shortcut, no typed-name confirmation. |
| **E** | Filter Yuanbao + Teams platforms on pre-v0.12? | **Keep current behavior** (WS-5 Q4). Don't change v0.12 host UX in a v0.13 work-stream. Document the asymmetry. |
| **F** | Capability flag for slash-command notice TTL | **Proxy through `hasGatewayBusyAckToggle ‖ hasGatewayRestartNotification`** (WS-5 Q5). A dedicated flag is YAGNI. |
| **G** | Rename `MessagingGatewayViewModel`? | **Apply rename if <5 callsites change.** Otherwise keep the type name and rely on user-facing label. |
| **H** | Profile `--no-skills` + `--clone-all` interaction | **Conservative: disable `--no-skills` toggle when `--clone-all` is on.** Argparse may reject anyway. |
| **I** | Implementation parallelism — 8 PRs in parallel worktrees, or sequential review? | Recommend **parallel worktree implementation** with **sequential coordinator review** (one PR at a time merging into main). Parallel impl = ~3-4 days of agent-time; sequential review = the natural throttle for production safety. |

### Cluster C — out-of-scope deferrals (no decision needed)

These were identified during planning but the agents already deferred them with sound rationale:

- WS-2: optimistic-vs-authoritative goal reconciliation
- WS-3: failure-counter unification field rendering
- WS-6: Arcee Trinity Large Thinking temperature/compression overrides surface
- WS-7: `web_tools.backend` legacy migration prompt
- WS-9: deep-links from v0.13-features sheet, hallucination-badge tap-target alert
- All streams: iOS write surfaces (always deferred)

## Recommended next steps (post-review)

Once the user resolves Cluster B questions A–I:

1. **Patch WS-1 PR #80** with the `isV013OrLater` helper (Decision A). One commit, one push.
2. **Spawn 8 implementation agents in parallel** (Decision I), each in an isolated worktree:
   - Each agent gets its plan file + the answers to relevant Cluster B questions + the WS-1 commit ref.
   - Each agent produces a single PR from its branch.
   - Branch names match the plan inventory table.
3. **Coordinator-review each PR sequentially** in dependency order:
   - Wave 1 (WS-2, WS-3, WS-4, WS-5) — review one at a time, merge in any order
   - Wave 2 (WS-6, WS-7, WS-8) — same
   - Wave 3 (WS-9) — last; consumes Wave 1+2 model fields
4. **WS-10 release** after WS-9 merges:
   - Update CLAUDE.md (already partially done in WS-1)
   - Update wiki via `scripts/wiki.sh`
   - Write `releases/v2.8.0/RELEASE_NOTES.md`
   - Run `scripts/release.sh v2.8.0 --draft` to validate
   - Run `scripts/release.sh v2.8.0` for the full promotion

## Risk register

- **Production app, thousands of users.** Each PR must build clean, pass all tests, and get a manual smoke test against a v0.13 host before merge.
- **Cluster A wire-shape risk.** Mitigated by tolerant decoders + capability gates; if any guess is wrong, pre-v0.13 hosts still work and v0.13 hosts surface a benign decode-failure (UI hides instead of crashes).
- **Sparkle update path.** v2.8.0 is delivered via the existing Sparkle appcast; there's no migration path for users on pre-v0.12 Hermes hosts (their v0.13-only surfaces stay hidden).
- **No data migrations.** Per CLAUDE.md, schema is unchanged from v0.11/v0.12 across this release. Per-project `manifest.json` and Scarf-owned sidecars at `~/.hermes/scarf/` are untouched.

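The "tolerant decoders" mitigation can be sketched as a `Decodable` that treats every unconfirmed field as optional and accepts more than one candidate key. The payload type and both key names (`compression_count`, `compressions`) are hypothetical placeholders, since the real v0.13 wire names are exactly what Cluster A leaves open.

```swift
import Foundation

/// Hypothetical status payload; the field names are guesses, not confirmed Hermes wire keys.
struct SessionStatusPayload: Decodable {
    let compressionCount: Int?

    private enum Keys: String, CodingKey {
        case compressionCount = "compression_count"  // inferred default (TODO: verify on a v0.13 host)
        case compressions                             // alternate spelling accepted as a fallback
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: Keys.self)
        // `try?` absorbs both "key missing" and "wrong type", so a wrong guess
        // produces nil (the UI hides the element) instead of a thrown error.
        let primary = try? container.decodeIfPresent(Int.self, forKey: .compressionCount)
        let fallback = try? container.decodeIfPresent(Int.self, forKey: .compressions)
        compressionCount = primary ?? fallback
    }
}

/// Decodes a JSON string; returns nil only when the top level is not a JSON object.
func decodeStatus(_ json: String) -> SessionStatusPayload? {
    try? JSONDecoder().decode(SessionStatusPayload.self, from: Data(json.utf8))
}
```

With this shape, correcting a wrong guess is the one-line edit the review describes: change the raw value of one key and add a fixture.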
## Estimate

- WS-1: shipped (PR #80 awaiting merge after Decision A)
- Wave 1 implementation: ~3 days agent-time × 4 streams in parallel = ~3 calendar days
- Wave 2 implementation: ~2 days agent-time × 3 streams in parallel = ~2 calendar days
- WS-9 implementation: ~2 days agent-time
- WS-10 release coordination: ~½ day

**Calendar-time estimate: ~8 days** with parallel implementation + sequential review. The bottleneck is coordinator review at PR-merge boundaries, not agent throughput.
@@ -0,0 +1,497 @@
# WS-2 Plan: Persistent Goals + ACP `/queue`

Branch suggestion: `ws-2-goals-and-queue-v0.13`. Depends on WS-1 (`ws-1-capabilities-v0.13`, PR #80) for the three v0.13 capability flags consumed below.

## Goals (what this PR ships)

User-visible features (all capability-gated, all degrade silently on pre-v0.13 hosts):

- `/goal <text>` slash command, surfaced in the slash menu, sent as a non-interruptive prompt (no "Agent working…" flip).
- `/goal --clear` slash command (and a quick-clear affordance on the goal pill itself) to drop the active goal.
- A "Goal locked" pill in the chat header (mounted alongside the project / branch chips in [SessionInfoBar](../../scarf/Features/Chat/Views/SessionInfoBar.swift)). Hidden when no active goal.
- `/queue <text>` slash command, surfaced in the slash menu, non-interruptive, with a transient toast (`Queued — runs after current turn`) reusing the existing `transientHint` machinery.
- `/queue` listing affordance: a small chip in the chat header showing queued-prompt count, expanding to a popover with the queued-prompt previews when there are any pending entries (Mac only; iOS gets a read-only listing affordance in WS-9).
- `/steer` on idle: on pre-v0.13 hosts the slash menu greys out `/steer` when the session is idle (it does nothing useful there), while `/goal` and `/queue` are hidden entirely because their capability flags are off; v0.13+ hosts allow `/steer` to fire on idle sessions and treat it as a regular prompt.
- iOS read-only "Goal locked" pill (added in WS-9, plumbed here so the VM is iOS-ready).

Out-of-scope items captured in [Out of scope](#out-of-scope-deferred).

## Files to change

### [scarf/Packages/ScarfCore/Sources/ScarfCore/Models/HermesSlashCommand.swift](../../Packages/ScarfCore/Sources/ScarfCore/Models/HermesSlashCommand.swift)

- Re-use the existing `Source.acpNonInterruptive` enum case — `/goal` and `/queue` slot in there alongside `/steer`. No new source case is needed (a "non-interruptive" command, regardless of whether it sets a goal or queues a turn, has the same wire shape: send through `ACPClient.sendPrompt`, do not flip "Agent working…").
- No struct changes needed.

### [scarf/Packages/ScarfCore/Sources/ScarfCore/Models/HermesActiveGoal.swift](../../Packages/ScarfCore/Sources/ScarfCore/Models/HermesActiveGoal.swift) (NEW)

Plain value type:

```swift
public struct HermesActiveGoal: Sendable, Equatable, Identifiable {
    public let text: String
    public let setAt: Date
    public var id: String { text + "@" + ISO8601DateFormatter().string(from: setAt) }
}
```

Lives next to `HermesSession.swift` and `HermesSlashCommand.swift`. Used by the goal pill and the goal viewmodel state (read-only — no mutation API on the struct).

### [scarf/Packages/ScarfCore/Sources/ScarfCore/Models/HermesQueuedPrompt.swift](../../Packages/ScarfCore/Sources/ScarfCore/Models/HermesQueuedPrompt.swift) (NEW)

Plain value type for one queued prompt:

```swift
public struct HermesQueuedPrompt: Sendable, Equatable, Identifiable {
    public let id: UUID
    public let text: String
    public let queuedAt: Date
}
```

Used by `RichChatViewModel.queuedPrompts` and the queue-popover view. The `id` is a Scarf-side UUID minted at queue-time — Hermes' wire protocol doesn't expose a per-queue-entry id (see [Open questions](#open-questions)).

### [scarf/Packages/ScarfCore/Sources/ScarfCore/ViewModels/RichChatViewModel.swift](../../Packages/ScarfCore/Sources/ScarfCore/ViewModels/RichChatViewModel.swift)

This is the load-bearing change. All changes are MainActor-isolated; no sync I/O is added.

**1. Extend `nonInterruptiveCommands` (currently around [RichChatViewModel:251-258](../../Packages/ScarfCore/Sources/ScarfCore/ViewModels/RichChatViewModel.swift)):**

Today the list contains only `/steer`. Add `/goal` and `/queue`. Per the existing contract these are appended unconditionally — capability gating is applied in `availableCommands` (next change). Each entry uses `source: .acpNonInterruptive` so the existing `isNonInterruptiveSlash(_:)` helper at [RichChatViewModel:331-342](../../Packages/ScarfCore/Sources/ScarfCore/ViewModels/RichChatViewModel.swift) auto-recognizes them.

```swift
public static let nonInterruptiveCommands: [HermesSlashCommand] = [
    HermesSlashCommand(name: "steer", description: "...", argumentHint: "<guidance>", source: .acpNonInterruptive),
    HermesSlashCommand(name: "goal", description: "Lock the agent on a goal that persists across turns",
                       argumentHint: "<text>", source: .acpNonInterruptive),
    HermesSlashCommand(name: "queue", description: "Queue a prompt to run after the current turn",
                       argumentHint: "<text>", source: .acpNonInterruptive),
]
```

**2. Capability-gated filtering of the static list.**

`availableCommands` (currently [RichChatViewModel:304-325](../../Packages/ScarfCore/Sources/ScarfCore/ViewModels/RichChatViewModel.swift)) merges the static `nonInterruptiveCommands` unconditionally. Replace that with a filter against a new public `capabilitiesGate` value the controller sets at session-start time:

```swift
@ObservationIgnored public var capabilitiesGate: HermesCapabilities = .empty
```

Inside `availableCommands`, after building `acpNames` / `projectNames` / `quicks`:

```swift
let supported: [HermesSlashCommand] = Self.nonInterruptiveCommands.filter { cmd in
    switch cmd.name {
    case "goal": return capabilitiesGate.hasGoals
    case "queue": return capabilitiesGate.hasACPQueue
    case "steer": return true // present pre-v0.13 too; idle gating handled separately
    default: return true
    }
}
let nonInterruptive = supported.filter { !occupied.contains($0.name) }
return acpCommands + projectAsHermes + quicks + nonInterruptive
```

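To make the capability filter easier to reason about in isolation, here is a self-contained sketch with stand-in types. `Caps` and `Cmd` are simplified, hypothetical substitutes for `HermesCapabilities` and `HermesSlashCommand`; only the `hasGoals` / `hasACPQueue` flag names come from the plan.

```swift
/// Hypothetical stand-in for `HermesCapabilities`; defaults model a pre-v0.13 host.
struct Caps {
    var hasGoals = false
    var hasACPQueue = false
}

/// Hypothetical stand-in for `HermesSlashCommand`.
struct Cmd: Equatable {
    let name: String
}

let allNonInterruptive = [Cmd(name: "steer"), Cmd(name: "goal"), Cmd(name: "queue")]

/// Returns the non-interruptive commands the slash menu should offer:
/// capability-gated first, then minus any names already claimed by other sources.
func visibleNonInterruptive(caps: Caps, occupied: Set<String>) -> [Cmd] {
    allNonInterruptive
        .filter { cmd in
            switch cmd.name {
            case "goal": return caps.hasGoals
            case "queue": return caps.hasACPQueue
            default: return true  // "steer" exists on pre-v0.13 hosts too
            }
        }
        .filter { !occupied.contains($0.name) }
}
```

On a host with neither flag set, only `steer` survives the filter, which matches the "degrade silently" goal stated earlier.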
**3. Active goal state.**

Add observable storage:

```swift
public private(set) var activeGoal: HermesActiveGoal?
```

Reset to nil in `reset()` (around [RichChatViewModel:441-478](../../Packages/ScarfCore/Sources/ScarfCore/ViewModels/RichChatViewModel.swift)).

Add a slim mutator `recordActiveGoal(text: String?)`:

```swift
@MainActor public func recordActiveGoal(text: String?) {
    if let text, !text.isEmpty {
        activeGoal = HermesActiveGoal(text: text, setAt: Date())
    } else {
        activeGoal = nil
    }
}
```

Two callers will populate this: (a) the slash-command handler in `ChatViewModel.sendViaACP` / `ChatController._sendImpl` does an optimistic write the moment the user presses send (`/goal foo` → record `foo`; `/goal --clear` → record nil), so the pill appears synchronously without waiting for a server round-trip; (b) a future ACP-side signal could correct it (see [Open questions](#open-questions)).

**4. Queued-prompt state.**

Add observable storage:

```swift
public private(set) var queuedPrompts: [HermesQueuedPrompt] = []
```

Reset to empty in `reset()`.

Add mutators:

```swift
@MainActor public func recordQueuedPrompt(text: String) {
    queuedPrompts.append(HermesQueuedPrompt(id: UUID(), text: text, queuedAt: Date()))
}

@MainActor public func clearAllQueuedPrompts() { queuedPrompts.removeAll() }

@MainActor public func popQueuedPrompt() -> HermesQueuedPrompt? {
    queuedPrompts.isEmpty ? nil : queuedPrompts.removeFirst()
}
```

`recordQueuedPrompt` is called optimistically when the user sends `/queue ...`. `popQueuedPrompt` runs inside `handlePromptComplete` (currently [RichChatViewModel:763-820](../../Packages/ScarfCore/Sources/ScarfCore/ViewModels/RichChatViewModel.swift)) when the agent finishes a turn — Hermes is responsible for actually running the queued prompt (it lives server-side); the Scarf-side list is purely a UI mirror. Popping is best-effort: if Hermes' server-side queue gets out of sync (deferred prompt aborted, dropped on disconnect), the user sees a stale chip until their next interaction. We accept that v1 trade-off; see [Open questions](#open-questions).

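The optimistic mirror behaves like a plain FIFO that is appended on send and popped once per completed turn. The sketch below uses a hypothetical `QueueMirror` type rather than the real `RichChatViewModel`, and it assumes the server starts the oldest queued entry first, which the plan does not confirm.

```swift
import Foundation

/// Simplified stand-in for `HermesQueuedPrompt`.
struct QueuedPrompt: Equatable {
    let id: UUID
    let text: String
}

/// Hypothetical stand-in for the queue-related slice of the view model.
struct QueueMirror {
    private(set) var prompts: [QueuedPrompt] = []

    /// Called optimistically when the user sends `/queue <text>`.
    mutating func record(_ text: String) {
        prompts.append(QueuedPrompt(id: UUID(), text: text))
    }

    /// Called when a turn completes; removes and returns the oldest entry, if any.
    @discardableResult
    mutating func popOnTurnComplete() -> QueuedPrompt? {
        prompts.isEmpty ? nil : prompts.removeFirst()
    }

    /// Called on session reset; this is the only point where drift is guaranteed to clear.
    mutating func reset() {
        prompts.removeAll()
    }
}

var mirror = QueueMirror()
mirror.record("run the tests")
mirror.record("open a PR")
mirror.popOnTurnComplete()   // the first turn finished
// One entry remains. If the server dropped it on disconnect, the chip stays
// stale until the next pop or reset, which is the accepted v1 trade-off.
```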
**5. `/goal` argument parsing helper (test-friendly).**

```swift
public enum GoalCommandArgument: Equatable {
    case set(String)
    case clear
    case empty // user typed `/goal` with no argument
}

public static func parseGoalArgument(_ raw: String) -> GoalCommandArgument {
    let trimmed = raw.trimmingCharacters(in: .whitespacesAndNewlines)
    if trimmed.isEmpty { return .empty }
    // Case-insensitive so that `Clear` is treated like `clear` (see the tests below).
    let lowered = trimmed.lowercased()
    if lowered == "--clear" || lowered == "clear" { return .clear }
    return .set(trimmed)
}
```

Pure function, no MainActor. Lets `M9SlashCommandTests` exercise the parser directly. The clear keyword is matched case-insensitively, so `Clear` also clears; this is what `parseGoalArgumentRecognizesClearVariants` expects.

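As a quick illustration of why a pure parser helps testing, the sketch below restates the helper as a free function and exercises it with plain checks. It assumes `clear` matching is case-insensitive, as the planned `parseGoalArgumentRecognizesClearVariants` test expects; the free-function form and the sample inputs are for illustration only.

```swift
import Foundation

enum GoalCommandArgument: Equatable {
    case set(String)
    case clear
    case empty
}

/// Free-function restatement of the parser, for illustration only.
func parseGoalArgument(_ raw: String) -> GoalCommandArgument {
    let trimmed = raw.trimmingCharacters(in: .whitespacesAndNewlines)
    if trimmed.isEmpty { return .empty }
    let lowered = trimmed.lowercased()
    if lowered == "--clear" || lowered == "clear" { return .clear }
    return .set(trimmed)  // the goal text keeps its original casing
}

// No view model, actor hop, or ACP client is needed to check the behaviour.
let samples = ["  --clear ", "Clear", "", "   ", "finish v2.8 on time"]
let parsed = samples.map(parseGoalArgument)
```

One consequence worth noting: a goal whose entire text is the word "clear" cannot be set through this parser, which seems an acceptable edge case but is a design choice rather than something the plan states.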
### [scarf/scarf/Features/Chat/ViewModels/ChatViewModel.swift](../../scarf/Features/Chat/ViewModels/ChatViewModel.swift) (Mac)

**1. Plumb capabilities into the VM.**

Today the VM doesn't carry a reference to `HermesCapabilitiesStore`. Add a stored property + initializer overload:

```swift
@ObservationIgnored var capabilitiesStore: HermesCapabilitiesStore?
```

`ChatView` passes the env-resolved store in via `.task` (or `.onAppear`), and the VM forwards `capabilitiesStore.capabilities` into `richChatViewModel.capabilitiesGate` whenever the store's `capabilities` changes. A `.task(id: capabilities)` modifier on the chat view re-runs each time the value changes, which re-publishes on refresh (this requires the capabilities value to be `Equatable`).

Rationale: the slash menu's `availableCommands` filter (above) needs the live capabilities. `ChatViewModel` is `@Observable`; storing the snapshot as an observed property here would trigger view invalidation on every capability refresh. Using `@ObservationIgnored` plus an explicit "publish" call into `RichChatViewModel` keeps the re-render scope tight.

**2. Detect non-interruptive commands by name in `sendViaACP` (currently [ChatViewModel:556-635](../../scarf/Features/Chat/ViewModels/ChatViewModel.swift)).**

The current `isSteer` branch only special-cases the toast. Extend it to dispatch:

```swift
let trimmedSlash = parseSlashName(text) // small helper, returns (name: String?, args: String)
let isNonInterruptive = richChatViewModel.isNonInterruptiveSlash(text)

switch trimmedSlash.name {
case "goal":
    let arg = RichChatViewModel.parseGoalArgument(trimmedSlash.args)
    switch arg {
    case .set(let goalText):
        richChatViewModel.recordActiveGoal(text: goalText)
        richChatViewModel.transientHint = "Goal locked: \(goalText)"
    case .clear:
        richChatViewModel.recordActiveGoal(text: nil)
        richChatViewModel.transientHint = "Goal cleared."
    case .empty:
        // Agent will respond with usage; show neutral hint.
        richChatViewModel.transientHint = "Sent /goal — see the agent reply for current goal."
    }
    scheduleHintClear()
case "queue":
    let queuedText = trimmedSlash.args.trimmingCharacters(in: .whitespacesAndNewlines)
    if !queuedText.isEmpty {
        richChatViewModel.recordQueuedPrompt(text: queuedText)
    }
    richChatViewModel.transientHint = "Queued — runs after current turn."
    scheduleHintClear()
case "steer" where isNonInterruptive:
    richChatViewModel.transientHint = "Guidance queued — applies after the next tool call."
    scheduleHintClear()
default:
    if !isNonInterruptive { acpStatus = ACPPhase.agentWorking }
}
```

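The `parseSlashName` helper referenced in the dispatch is not spelled out in the plan. One possible shape, offered as an assumption rather than the intended implementation, splits at the first whitespace character and returns a nil name for anything that is not a slash command:

```swift
import Foundation

/// Hypothetical helper: splits "/goal finish v2.8" into (name: "goal", args: "finish v2.8").
/// Returns a nil name (and the trimmed input as args) when the text is not a slash command.
func parseSlashName(_ text: String) -> (name: String?, args: String) {
    let trimmed = text.trimmingCharacters(in: .whitespacesAndNewlines)
    // Require "/" followed immediately by a non-whitespace character.
    guard trimmed.hasPrefix("/"),
          let firstAfterSlash = trimmed.dropFirst().first,
          !firstAfterSlash.isWhitespace
    else { return (nil, trimmed) }

    // Split once: everything after the first whitespace character is the argument string.
    let parts = trimmed.dropFirst().split(maxSplits: 1,
                                          omittingEmptySubsequences: true,
                                          whereSeparator: { $0.isWhitespace })
    let name = String(parts[0])
    let args = parts.count > 1
        ? String(parts[1]).trimmingCharacters(in: .whitespacesAndNewlines)
        : ""
    return (name, args)
}
```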
`scheduleHintClear()` extracts the existing 4-second auto-clear pattern (currently inlined for `/steer` at [ChatViewModel:585-591](../../scarf/Features/Chat/ViewModels/ChatViewModel.swift)) into a private helper, so all three commands use the same clear behaviour. The wire send (the existing `client.sendPrompt(...)` call at [ChatViewModel:597](../../scarf/Features/Chat/ViewModels/ChatViewModel.swift)) is unchanged — Hermes parses the slash on the server side.

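One way `scheduleHintClear()` might be written is with a cancellable `Task`, so that a newer hint restarts the timer instead of being wiped early by an older one. This is a sketch under that assumption; the `HintHost` class, the `clearAfter` parameter, and the cancellation behaviour are illustrative and are not taken from the existing inlined code.

```swift
import Foundation

/// Hypothetical host for the hint state; the real owner is the chat view model.
@MainActor
final class HintHost {
    var transientHint: String?
    private var clearTask: Task<Void, Never>?

    /// Shows a hint and schedules its removal. A later call cancels the earlier timer,
    /// so the newest hint always gets its full display time.
    func showHint(_ text: String, clearAfter seconds: Double = 4) {
        transientHint = text
        clearTask?.cancel()
        clearTask = Task { [weak self] in
            try? await Task.sleep(nanoseconds: UInt64(seconds * 1_000_000_000))
            // A cancelled sleep returns early; bail out so a stale timer does not clear a newer hint.
            guard !Task.isCancelled else { return }
            self?.transientHint = nil
        }
    }
}
```

Because the task is created inside a `@MainActor` method it inherits that isolation, so the final assignment to `transientHint` needs no extra actor hop.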
**3. Clear active goal state on session reset.**

`startNewSession` (and `resumeSession`, `continueLastSession`) call `richChatViewModel.reset()`, which already resets `activeGoal` and `queuedPrompts` (changes #3 and #4 in the VM above). `stopACP()` doesn't need an additional clear, because `reset()` is the explicit teardown.

**4. `/steer` on idle pre-v0.13.**

In the slash menu (rendered by `SlashCommandRow` — see Slash menu changes below), grey-out `/steer` when:

```swift
!richChatViewModel.isAgentWorking && !capabilitiesGate.hasACPSteerOnIdle
```

Tooltip / disabled state: "Use `/steer` while the agent is working — your Hermes version doesn't support steering on idle sessions."

### [scarf/scarf/Features/Chat/Views/SlashCommandMenu.swift](../../scarf/Features/Chat/Views/SlashCommandMenu.swift)

Add a new `disabled: Bool` parameter to `SlashCommandRow`. When disabled, render the row at 0.55 opacity, prevent `onTapGesture` from firing, and append a one-line subtitle "(use during a turn)". Also accept a `disabledReason: String?` for the tooltip.

Plumb the disabled state through from the parent (`RichChatInputBar`). Logic stays in the parent: a row is disabled iff `(name == "steer") && isIdle && !hasACPSteerOnIdle`. Goal/queue rows are never grey when present (they're already filtered out when their cap is off).

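The disabled rule is small enough to live in a pure function that the parent can call per row. The function names and parameter labels below are illustrative assumptions; only the boolean rule and the tooltip wording come from the plan.

```swift
/// Pure predicate for the row-disabled rule; the name and labels are hypothetical.
func isSlashRowDisabled(name: String, isIdle: Bool, hasACPSteerOnIdle: Bool) -> Bool {
    name == "steer" && isIdle && !hasACPSteerOnIdle
}

/// Tooltip text for a disabled row, or nil when the row is enabled.
func slashRowDisabledReason(name: String, isIdle: Bool, hasACPSteerOnIdle: Bool) -> String? {
    guard isSlashRowDisabled(name: name, isIdle: isIdle, hasACPSteerOnIdle: hasACPSteerOnIdle) else {
        return nil
    }
    return "Use /steer while the agent is working. Your Hermes version doesn't support steering on idle sessions."
}
```

Keeping the predicate free of view state makes it straightforward to cover in `M9SlashCommandTests` without instantiating any SwiftUI views.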
### [scarf/scarf/Features/Chat/Views/SessionInfoBar.swift](../../scarf/Features/Chat/Views/SessionInfoBar.swift)

Add the goal pill alongside the existing project / branch chips. Two new optional inputs:

```swift
var activeGoal: HermesActiveGoal? = nil
var onClearGoal: (() -> Void)? = nil
```

Render block (positioned right after the existing `gitBranch` Label, before the working dot at [SessionInfoBar:65](../../scarf/Features/Chat/Views/SessionInfoBar.swift)):

```swift
if let activeGoal {
    HStack(spacing: 4) {
        Image(systemName: "scope")
        Text(truncatedGoal(activeGoal.text))
    }
    .scarfStyle(.caption)
    .padding(.horizontal, ScarfSpace.s2)
    .padding(.vertical, 2)
    .background(Capsule().fill(ScarfColor.info.opacity(0.16)))
    .foregroundStyle(ScarfColor.info)
    .help("Goal locked: \(activeGoal.text)")
    .contextMenu {
        if let onClearGoal {
            Button("Clear goal", role: .destructive, action: onClearGoal)
        }
    }
}

private func truncatedGoal(_ text: String) -> String {
    text.count <= 36 ? text : String(text.prefix(33)) + "…"
}
```

Color choice: `ScarfColor.info` matches the badge intent — informational state, not a warning, not an error. Per CLAUDE.md, accent (rust) is reserved for primary brand surfaces; project / branch already use accent so reusing it would mean three accent chips in a row. `info` differentiates the goal pill visually.

The `onClearGoal` closure flows from `ChatViewModel`: when invoked, it dispatches `sendText("/goal --clear")` so Hermes' authoritative state stays in sync (the optimistic local clear happens via the send-path in `sendViaACP`).

### [scarf/scarf/Features/Chat/Views/ChatTranscriptPane.swift](../../scarf/Features/Chat/Views/ChatTranscriptPane.swift)

Forward the new `SessionInfoBar` parameters at [ChatTranscriptPane:17-25](../../scarf/Features/Chat/Views/ChatTranscriptPane.swift):

```swift
SessionInfoBar(
    session: richChat.currentSession,
    isWorking: richChat.isGenerating,
    acpInputTokens: richChat.acpInputTokens,
    acpOutputTokens: richChat.acpOutputTokens,
    acpThoughtTokens: richChat.acpThoughtTokens,
    projectName: chatViewModel.currentProjectName,
    gitBranch: chatViewModel.currentGitBranch,
    activeGoal: richChat.activeGoal,
    onClearGoal: { chatViewModel.sendText("/goal --clear") }
)
```

### [scarf/scarf/Features/Chat/Views/ChatQueueIndicator.swift](../../scarf/Features/Chat/Views/ChatQueueIndicator.swift) (NEW)

Small chip + popover for the queued-prompt list. Mounted in `SessionInfoBar` next to the goal pill, but extracted to its own file because it owns popover state.

```swift
struct ChatQueueIndicator: View {
    let queuedPrompts: [HermesQueuedPrompt]
    var onClearAll: () -> Void
    @State private var isPopoverShown = false

    var body: some View {
        if queuedPrompts.isEmpty { EmptyView() } else {
            Button {
                isPopoverShown = true
            } label: {
                HStack(spacing: 4) {
                    Image(systemName: "tray.full")
                    Text("\(queuedPrompts.count) queued")
                }
                .scarfStyle(.caption)
                .padding(.horizontal, ScarfSpace.s2)
                .padding(.vertical, 2)
                .background(Capsule().fill(ScarfColor.warning.opacity(0.16)))
                .foregroundStyle(ScarfColor.warning)
            }
            .buttonStyle(.plain)
            .help("Prompts waiting to run after the current turn finishes")
            .popover(isPresented: $isPopoverShown, arrowEdge: .bottom) {
                queuePopover
            }
        }
    }

    @ViewBuilder private var queuePopover: some View { /* list + clear-all action */ }
}
```

Color: `.warning` (amber) — these are pending side-effects the user should notice. Distinct from goal (`.info`) and project (`.accent`) so all three chips are visually decodable.

Caveat: this chip is OPTIMISTIC. The popover header includes a one-line note: "Local view — Hermes manages the actual queue." The popover offers "Clear all" but NOT individual deletion (Hermes has no per-entry remove verb; clearing locally would diverge from server state). "Clear all" sends `/queue --clear` if Hermes accepts that syntax (see [Open questions](#open-questions)) and otherwise just resets the local mirror with a tooltip explaining the discrepancy.

### [scarf/Scarf iOS/Chat/ChatView.swift](../../Scarf%20iOS/Chat/ChatView.swift) — DEFERRED to WS-9

The iOS chat already wires non-interruptive commands at [ChatView:1310-1322](../../Scarf%20iOS/Chat/ChatView.swift) and uses the same `RichChatViewModel`, so the model-side changes are picked up automatically. Surface changes (read-only goal pill, queue chip) belong in WS-9 per the work-stream split. **Do not** add iOS UI changes in this PR — keep the diff scoped.

**Exception:** the iOS controller's `_sendImpl` at [ChatView:1291-1342](../../Scarf%20iOS/Chat/ChatView.swift) needs the same dispatch changes as Mac (record the optimistic goal/queue mutation when the user types `/goal` or `/queue`), otherwise the iOS VM state will diverge from Mac. Mirror change #2 from `ChatViewModel.swift` above into the `_sendImpl` body. iOS just doesn't *render* the goal pill / queue chip yet — that's WS-9.

### [scarf/Packages/ScarfCore/Tests/ScarfCoreTests/M9SlashCommandTests.swift](../../Packages/ScarfCore/Tests/ScarfCoreTests/M9SlashCommandTests.swift)
|
||||||
|
|
||||||
|
Extend with v0.13 cases. The current file tests project-scoped commands and the context block; add a new section "v0.13 non-interruptive commands":
|
||||||
|
|
||||||
|
- `nonInterruptiveListIncludesGoalAndQueue` — `RichChatViewModel.nonInterruptiveCommands.map(\.name)` contains both names.
|
||||||
|
- `availableCommandsHidesGoalWhenCapabilityOff` — set `capabilitiesGate = .empty`, assert `goal` not in `availableCommands`.
|
||||||
|
- `availableCommandsHidesQueueWhenCapabilityOff` — same for `queue`.
|
||||||
|
- `availableCommandsExposesAllThreeOnV013` — set `capabilitiesGate = HermesCapabilities.parseLine("Hermes Agent v0.13.0 (2026.5.7)")`, assert all three are present.
|
||||||
|
- `parseGoalArgumentRecognizesClearVariants` — `--clear`, `clear`, `Clear`, ` --clear ` all return `.clear`.
|
||||||
|
- `parseGoalArgumentReturnsSetForArbitraryText` — `"finish v2.8 on time"` → `.set("finish v2.8 on time")`.
|
||||||
|
- `parseGoalArgumentReturnsEmptyForBlank` — `""` and `" "` return `.empty`.
|
||||||
|
- `recordActiveGoalSetsAndClears` — call `recordActiveGoal(text: "x")` then `recordActiveGoal(text: nil)` on a fresh VM, assert observable transitions.
|
||||||
|
- `recordQueuedPromptAppendsAndPopsFIFO` — append three, pop two, verify order + remaining count.
|
||||||
|
- `clearAllQueuedPromptsEmpties` — straightforward.
|
||||||
|
- `isNonInterruptiveSlashRecognizesGoalAndQueue` — verify `/goal foo`, `/queue bar`, `/queue` (no args) all return `true`.
|
||||||
|
- `resetClearsGoalAndQueue` — set both, call `reset()`, assert both empty.
|
||||||
|
|
||||||
|
All MainActor-bound; use `@MainActor @Test` annotations. The current suite uses `@Suite` with default isolation, which is fine.
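The parser and classifier cases above could be backed by something like the following. This is a minimal sketch, not the shipped implementation: the free-standing `GoalCommandArgument` stands in for the nested `RichChatViewModel.GoalCommandArgument`, and both `parse(_:)` and the `isNonInterruptiveSlash(_:)` signature are assumptions inferred from the test names.

```swift
import Foundation

/// Stand-in for `RichChatViewModel.GoalCommandArgument` (shape assumed from the test names).
enum GoalCommandArgument: Equatable {
    case empty          // `/goal` with no argument
    case clear          // `/goal --clear` and its spelling variants
    case set(String)    // any other text becomes the goal

    /// Classifies the text that follows `/goal`. Pure, so it needs no MainActor.
    static func parse(_ raw: String) -> GoalCommandArgument {
        let trimmed = raw.trimmingCharacters(in: .whitespacesAndNewlines)
        if trimmed.isEmpty { return .empty }
        switch trimmed.lowercased() {
        case "--clear", "clear":
            return .clear
        default:
            return .set(trimmed)
        }
    }
}

/// Hypothetical classifier: true when `text` is a command that must not flip the
/// "Agent working…" state. The name list mirrors this plan, not Hermes itself.
func isNonInterruptiveSlash(_ text: String) -> Bool {
    let trimmed = text.trimmingCharacters(in: .whitespacesAndNewlines)
    guard trimmed.hasPrefix("/") else { return false }
    // The command name is everything between "/" and the first space.
    let name = trimmed.dropFirst()
        .split(separator: " ", maxSplits: 1)
        .first
        .map { String($0) } ?? ""
    return ["steer", "goal", "queue"].contains(name.lowercased())
}
```

One consequence of matching the bare word `clear`: a goal whose entire text is "clear" cannot be set under this sketch. Whether that trade-off is acceptable depends on how Hermes itself parses the argument.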
|
||||||
|
|
||||||
|
### [scarf/Packages/ScarfCore/Tests/ScarfCoreTests/HermesCapabilitiesTests.swift](../../Packages/ScarfCore/Tests/ScarfCoreTests/HermesCapabilitiesTests.swift)
|
||||||
|
|
||||||
|
WS-1 already added cases for `hasGoals` / `hasACPQueue` / `hasACPSteerOnIdle`. Before merging WS-2, verify that the existing tests assert all three are true on `v0.13.0` and false on `v0.12.0`, and add the missing assertions if they don't. Otherwise no changes are needed here.
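For orientation, the version threshold those tests exercise can be pictured as below. This is a sketch under assumptions: `HermesVersion`, `init?(bannerLine:)`, and `supportsGoalsAndQueue` are illustrative names rather than the WS-1 API, and it assumes all three flags resolve to `>= 0.13.0`.

```swift
import Foundation

/// Hypothetical stand-in for the version that `HermesCapabilities.parseLine` extracts.
struct HermesVersion: Comparable {
    let major: Int
    let minor: Int
    let patch: Int

    init(major: Int, minor: Int, patch: Int) {
        self.major = major
        self.minor = minor
        self.patch = patch
    }

    /// Parses the first `vX.Y.Z` token in a banner such as "Hermes Agent v0.13.0 (2026.5.7)".
    init?(bannerLine: String) {
        guard let n = HermesVersion.firstVersionNumbers(in: bannerLine) else { return nil }
        self.init(major: n[0], minor: n[1], patch: n[2])
    }

    private static func firstVersionNumbers(in line: String) -> [Int]? {
        for token in line.split(separator: " ") where token.hasPrefix("v") {
            let fields = token.dropFirst().split(separator: ".", omittingEmptySubsequences: false)
            let numbers = fields.compactMap { Int($0) }
            // Require exactly three numeric fields so "v1.x.2.3" is rejected.
            if fields.count == 3, numbers.count == 3 { return numbers }
        }
        return nil
    }

    static func < (lhs: HermesVersion, rhs: HermesVersion) -> Bool {
        (lhs.major, lhs.minor, lhs.patch) < (rhs.major, rhs.minor, rhs.patch)
    }
}

/// Assumed gate: the WS-2 flags turn on at 0.13.0; an unparseable banner means "off".
func supportsGoalsAndQueue(bannerLine: String) -> Bool {
    guard let version = HermesVersion(bannerLine: bannerLine) else { return false }
    return version >= HermesVersion(major: 0, minor: 13, patch: 0)
}
```

Treating an unparseable banner as "no capabilities" keeps the slash menu conservative on unknown hosts.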
|
||||||
|
|
||||||
|
## New types / fields
|
||||||
|
|
||||||
|
| Type | Where | Purpose |
|
||||||
|
| --- | --- | --- |
|
||||||
|
| `HermesActiveGoal` | new ScarfCore model | observable goal-pill state |
|
||||||
|
| `HermesQueuedPrompt` | new ScarfCore model | one queued-prompt mirror entry |
|
||||||
|
| `RichChatViewModel.GoalCommandArgument` | nested enum on the VM | pure parser for `/goal` arg |
|
||||||
|
| `RichChatViewModel.activeGoal` | observable | drives the pill |
|
||||||
|
| `RichChatViewModel.queuedPrompts` | observable | drives the chip + popover |
|
||||||
|
| `RichChatViewModel.capabilitiesGate` | non-observable | filters non-interruptive commands |
|
||||||
|
| `ChatViewModel.capabilitiesStore` | non-observable | bridge from env → VM |
|
||||||
|
| `ChatQueueIndicator` (Mac view) | new chat view | header chip |
|
||||||
|
|
||||||
|
No new ACP RPC types; we ride the existing `session/prompt` shape. No DB schema changes.
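As a rough picture of the queue-side types in the table, the local mirror can be a plain FIFO value type. The sketch below is an assumption about shape only: the field names on `HermesQueuedPrompt` and the `QueuedPromptMirror` wrapper are illustrative, and in the plan these operations live on `RichChatViewModel` (`recordQueuedPrompt`, `popQueuedPrompt`, `clearAllQueuedPrompts`).

```swift
import Foundation

/// Assumed shape for the new `HermesQueuedPrompt` model; field names are illustrative.
struct HermesQueuedPrompt: Equatable, Identifiable {
    let id: UUID
    let text: String
    let queuedAt: Date
}

/// Local, optimistic mirror of the server-side queue. Hermes stays authoritative.
struct QueuedPromptMirror {
    private(set) var prompts: [HermesQueuedPrompt] = []

    /// Called when the user sends `/queue <text>`.
    mutating func record(_ text: String, at date: Date = Date()) {
        prompts.append(HermesQueuedPrompt(id: UUID(), text: text, queuedAt: date))
    }

    /// Called when a turn finishes and Hermes starts the oldest queued prompt.
    @discardableResult
    mutating func popFirst() -> HermesQueuedPrompt? {
        prompts.isEmpty ? nil : prompts.removeFirst()
    }

    /// Resets the local mirror only; it does not talk to Hermes.
    mutating func clearAll() {
        prompts.removeAll()
    }
}
```

Because the mirror never reads server state, it can drift from the real queue; that is the same optimistic caveat the popover header spells out.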
|
||||||
|
|
||||||
|
## Capability gating
|
||||||
|
|
||||||
|
| Affordance | Gate | Pre-v0.13 behaviour |
|
||||||
|
| --- | --- | --- |
|
||||||
|
| `/goal` in slash menu | `hasGoals` | hidden |
|
||||||
|
| `/goal --clear` (also clear-from-pill) | `hasGoals` | n/a (no pill to clear; menu item also hidden) |
|
||||||
|
| Goal pill in `SessionInfoBar` | `activeGoal != nil` (which only becomes non-nil when user sends `/goal`, which requires the menu, which requires `hasGoals`) | hidden by transitive impossibility |
|
||||||
|
| `/queue` in slash menu | `hasACPQueue` | hidden |
|
||||||
|
| Queue chip in `SessionInfoBar` | `queuedPrompts.isEmpty == false` (transitive on `hasACPQueue`) | hidden |
|
||||||
|
| `/steer` greyed-out on idle | `hasACPSteerOnIdle == false && !isAgentWorking` | greyed; tooltip explains |
|
||||||
|
| `/steer` on idle (sent normally) | `hasACPSteerOnIdle == true` | works as regular prompt (server handles) |
|
||||||
|
|
||||||
|
Belt-and-suspenders defence: `availableCommands` filters BEFORE menu rendering; the dispatch in `sendViaACP` does NOT pre-validate (Hermes' server-side error message is more accurate than any client guard we'd write). If a user types `/goal` directly via a quick-command alias on a pre-v0.13 host, the slash gets sent to Hermes, which will respond with its own "unknown command" reply — acceptable v1 behaviour.
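The gating table reduces to a small pure function, sketched here under assumptions: `CapabilityFlags`, `SlashMenuEntry`, and `slashMenuEntries` are hypothetical names, and the real filter lives in `availableCommands` on the view model.

```swift
/// Hypothetical mirror of the three WS-2 capability flags.
struct CapabilityFlags {
    var hasGoals = false
    var hasACPQueue = false
    var hasACPSteerOnIdle = false
}

struct SlashMenuEntry: Equatable {
    let name: String
    let isEnabled: Bool
}

/// Builds the non-interruptive part of the slash menu for the current host and state.
func slashMenuEntries(flags: CapabilityFlags, isAgentWorking: Bool) -> [SlashMenuEntry] {
    // `/steer` is always listed. It is greyed out only when the host lacks
    // steer-on-idle AND the agent is idle, matching the table row above.
    let steerEnabled = flags.hasACPSteerOnIdle || isAgentWorking
    var entries = [SlashMenuEntry(name: "steer", isEnabled: steerEnabled)]
    if flags.hasGoals { entries.append(SlashMenuEntry(name: "goal", isEnabled: true)) }
    if flags.hasACPQueue { entries.append(SlashMenuEntry(name: "queue", isEnabled: true)) }
    return entries
}
```

Note that this only shapes the menu. As described above, dispatch deliberately performs no matching check, so a typed-out command still reaches Hermes.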
|
||||||
|
|
||||||
|
## How to test
|
||||||
|
|
||||||
|
### Unit tests
|
||||||
|
|
||||||
|
Run `swift test --package-path scarf/Packages/ScarfCore --filter M9SlashCommandTests`. Should be ~12 new tests; existing 23 stay green.
|
||||||
|
|
||||||
|
### Manual: v0.13 host
|
||||||
|
|
||||||
|
Prereq: Hermes v0.13.0 installed locally OR the dogfooding box (`192.168.0.82`) with `remote-servers` branch.
|
||||||
|
|
||||||
|
1. **Goal happy path:**
|
||||||
|
- Open chat (any project / quick chat).
|
||||||
|
- Type `/`, verify `/goal` appears in slash menu.
|
||||||
|
- Send `/goal finish WS-2 by Friday` — confirm:
|
||||||
|
- "Agent working…" does NOT flip on (non-interruptive).
|
||||||
|
- Transient toast appears: "Goal locked: finish WS-2 by Friday".
|
||||||
|
- "Goal locked" chip appears in `SessionInfoBar` next to project / branch.
|
||||||
|
- Toast auto-dismisses after ~4s.
|
||||||
|
- Send a normal prompt; verify the chip stays put across turns.
|
||||||
|
2. **Goal clear path:**
|
||||||
|
- With a goal active, right-click the chip → "Clear goal".
|
||||||
|
- Verify chip disappears, transient toast says "Goal cleared.", and the underlying `sendText("/goal --clear")` actually fires (check Hermes log).
|
||||||
|
- Alternative path: type `/goal --clear` directly — same outcome.
|
||||||
|
3. **Queue happy path:**
|
||||||
|
- Send a long-running prompt to occupy the agent.
|
||||||
|
- While it's working, send `/queue summarize what you just did`.
|
||||||
|
- Confirm: toast "Queued — runs after current turn.", chip appears showing "1 queued".
|
||||||
|
- Click chip → popover lists the queued prompt with timestamp.
|
||||||
|
- When the current turn finishes, verify Hermes runs the queued prompt automatically (server-side) AND the chip count decrements (via `popQueuedPrompt`).
|
||||||
|
4. **Steer-on-idle:**
|
||||||
|
- On v0.13, send `/steer` on an idle session — confirm it sends as a regular prompt (no error, no "Agent working" indicator misbehaviour).
|
||||||
|
5. **Capability refresh:**
|
||||||
|
- Connect to a remote running Hermes v0.12. Verify `/goal` and `/queue` are absent from the slash menu.
|
||||||
|
- Verify `/steer` is present but greyed-out on idle, with the tooltip.
|
||||||
|
6. **Session reset:**
|
||||||
|
- Set a goal + queue 2 prompts. Click "New chat" — confirm chip and pill clear.
|
||||||
|
- Resume an old session — confirm pill stays empty (we don't persist active-goal across sessions in v1; see [Open questions](#open-questions)).
|
||||||
|
|
||||||
|
### Manual: pre-v0.13 host
|
||||||
|
|
||||||
|
1. Connect to a remote running Hermes v0.11.x or v0.12.x.
|
||||||
|
2. Slash menu should show `/steer` only (no `/goal`, no `/queue`).
|
||||||
|
3. With idle session, hover `/steer` — verify greyed + tooltip.
|
||||||
|
4. Manually type `/goal foo` and send. Hermes returns its own "unknown command" reply and Scarf does not crash. The goal pill does appear briefly: the `case "goal":` dispatch branch calls `recordActiveGoal` unconditionally and the chip renders whenever `activeGoal != nil`, so the user sees an inconsistent local pill until the agent's "unknown command" response arrives.
|
||||||
|
- **Inconsistency caveat:** the optimistic write means a typed-out `/goal` against a pre-v0.13 host paints the pill briefly. Acceptable: pre-v0.13 users have to type the command literally (no menu surface), so this is power-user territory. Document in release notes.
|
||||||
|
|
||||||
|
### Visual
|
||||||
|
|
||||||
|
- Goal chip should be `info`-tinted and visually distinct from accent (project) and warning (queue).
|
||||||
|
- Pill text truncates to ~33 chars + ellipsis for long goals; full text in tooltip.
|
||||||
|
- Three-chip overflow at narrow window widths: SessionInfoBar already wraps via the `HStack(spacing: 16)` parent — the pills should naturally elide. If they don't, we constrain `lineLimit(1)` per chip (already the pattern for project name).
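
The truncation rule for the pill text is simple enough to pin down in a helper. A minimal sketch, assuming a hypothetical `goalPillLabel` function and taking the 33-character limit from the note above:

```swift
import Foundation

/// Shortens a goal for the pill; the full text is meant for the tooltip.
/// The helper name is made up, and the default limit follows the "~33 chars" note.
func goalPillLabel(_ goal: String, limit: Int = 33) -> String {
    // Collapse newlines so the pill stays a one-line plain-text chip.
    let oneLine = goal.replacingOccurrences(of: "\n", with: " ")
    guard oneLine.count > limit else { return oneLine }
    return String(oneLine.prefix(limit)) + "…"
}
```

Counting `Character`s rather than UTF-16 units keeps emoji and combined glyphs from being split mid-cluster.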
|
||||||
|
|
||||||
|
## Open questions
|
||||||
|
|
||||||
|
These need coordinator resolution before implementation closes.
|
||||||
|
|
||||||
|
1. **Goal persistence across session restarts.** Hermes v0.13's "Persistent Goals" implies the active goal survives restarts on the server side. Does Hermes expose:
|
||||||
|
- (a) a session-startup ACP notification with the current goal, or
|
||||||
|
- (b) a sidecar JSON file (e.g. `~/.hermes/sessions/session_<id>.json` with a `goal: ...` field), or
|
||||||
|
- (c) a `/goal --status` command that returns the current goal?
|
||||||
|
|
||||||
|
The release notes mention "Preserve pending update prompts across restarts" and "Preserve thread routing from cached live session sources" — neither of those is the persistent-goal channel.
|
||||||
|
|
||||||
|
**Recommendation:** ship v2.8 with optimistic-only state (no read-back). Open a follow-up to read goal state from whichever channel Hermes exposes once the v0.13 server is dogfooded. Mark the chip as "user-set this session" in the tooltip until then. This means resuming an old session won't paint the goal pill even if the agent still has the goal — the chip will appear the next time the user runs `/goal`. This is the safest v2.8 behaviour and aligns with the "minimal-surface, maximal-ship" approach for the v2.8 catch-up release.
|
||||||
|
|
||||||
|
2. **`/queue --clear` syntax.** Does Hermes accept `/queue --clear` (or `/queue clear`) to drain the server-side queue? If not, the "Clear all" button in the popover can only clear the local mirror — which means a queued prompt would still run server-side after the user thought they'd cancelled it.
|
||||||
|
|
||||||
|
**Recommendation:** if the syntax is unsupported, **remove the "Clear all" button from v2.8** and document the limitation in the popover header. Don't ship a button that lies about what it does.
|
||||||
|
|
||||||
|
3. **Auto-resume after gateway restart — ACP signal.** The release notes say "Auto-resume interrupted sessions after gateway restart" but it's unclear whether that signal:
|
||||||
|
- lands as a Scarf-visible ACP event (so we can show an "Auto-resumed" toast),
|
||||||
|
- or is purely server-side (Hermes resumes the session transparently and Scarf sees nothing different).
|
||||||
|
|
||||||
|
**Recommendation:** defer the "Auto-resumed from checkpoint" indicator to v2.8.1. Add a `// TODO(WS-2 followup)` comment in the ACP event-loop hooks pointing at this question. Ship v2.8 without the indicator. If auto-resume turns out to be purely server-side and silent, the missing UI costs nothing (correct behaviour by accident); if it's announced via an event, we surface it in the next point release.
|
||||||
|
|
||||||
|
4. **Optimistic-vs-authoritative goal state.** If the user types `/goal foo` then immediately disconnects before Hermes acks, our optimistic chip will say `foo` while the server has nothing. Reconciliation isn't implemented in v1.
|
||||||
|
|
||||||
|
**Recommendation:** accept the trade-off. Reconciling would require Open Question #1's resolution (a way to read server-side goal state), so it's blocked on the same answer.
|
||||||
|
|
||||||
|
5. **`/queue` argument shape.** Release notes call it "queue a prompt" — but is the syntax `/queue <text>` (verbatim text becomes the queued prompt) or does it accept named priorities / IDs? If the latter, our optimistic-mirror logic over-simplifies.
|
||||||
|
|
||||||
|
**Recommendation:** assume verbatim. Verify against `hermes acp` in dogfooding before merging.
|
||||||
|
|
||||||
|
6. **Active goal injection into the system prompt.** If Hermes injects the active goal into every turn's system prompt (likely — that's how a "locked" goal would survive across turns server-side), Scarf doesn't need to re-send it on resume. If Hermes uses some other mechanism (e.g. a sidecar tool), that's also Hermes' problem. **No Scarf-side action needed regardless.**
|
||||||
|
|
||||||
|
7. **`/goal` non-interruptive on the wire — does Hermes actually accept it during an active turn?** `/steer` is documented as non-interruptive; `/goal` is documented as "lock onto a target." The server may treat `/goal` as a prompt that DOES need a turn to take effect. If so, our `nonInterruptiveCommands` classification for `/goal` is wrong — it should flip "Agent working…" like a regular prompt.
|
||||||
|
|
||||||
|
**Recommendation:** verify against the v0.13 ACP adapter behaviour on a real host. If `/goal` is in fact interruptive, drop it from `nonInterruptiveCommands` and treat it as a normal prompt that just happens to also mutate `activeGoal`. The pill behaviour is unchanged either way.
|
||||||
|
|
||||||
|
## Out of scope (deferred)
|
||||||
|
|
||||||
|
- iOS surface for goal pill + queue chip — WS-9.
|
||||||
|
- Persistent-goal cross-session memory (paint the pill from server state on session resume) — blocked on Open Question #1, deferred to v2.8.1.
|
||||||
|
- "Auto-resumed from checkpoint" indicator — blocked on Open Question #3, deferred to v2.8.1.
|
||||||
|
- "Resumed from checkpoint" sessions-list badge — same as above.
|
||||||
|
- A dedicated Goals feature surface (sidebar entry showing all locked goals across sessions) — out of scope; the chip is enough for v2.8.
|
||||||
|
- Per-queued-prompt deletion in the popover — Hermes has no remove-by-id verb.
|
||||||
|
- Goal mutation via UI affordance other than the slash command (e.g. a "Set goal…" toolbar button) — defer to v2.8.1; the slash menu is the canonical entry.
|
||||||
|
- Goal text Markdown rendering in the pill — pill is a one-line plain-text chip.
|
||||||
|
- Telemetry: ScarfMon counters for `/goal` / `/queue` invocations — nice-to-have, ship without.
|
||||||
|
|
||||||
|
## Estimate
|
||||||
|
|
||||||
|
**Medium.** ~6 files changed (3 in ScarfCore, 3 Mac chat views, one of them new), 2 new model files, ~12 new tests. The capability-flag plumbing is non-trivial because `RichChatViewModel.capabilitiesGate` needs a clean injection seam without forcing the whole VM to re-render on every refresh. Two days of focused work end-to-end including manual verification on both a v0.13 and a v0.12 host. The biggest uncertainty is server-side `/goal` and `/queue` behaviour, captured in Open Questions 1, 2, and 7; the coordinator should answer these before the implementation PR opens.
|
||||||
@@ -0,0 +1,947 @@
|
|||||||
|
# WS-3 Plan: Kanban v0.13 diagnostics + recovery UX
|
||||||
|
|
||||||
|
**Workstream:** WS-3 of Scarf v2.8.0
|
||||||
|
**Hermes target:** v0.13.0 (v2026.5.7)
|
||||||
|
**Capability gate:** `HermesCapabilities.hasKanbanDiagnostics` (already shipped in WS-1, PR #80; resolves to `>= 0.13.0`)
|
||||||
|
**Builds on:** v2.7.5 Kanban v3 (drag-and-drop board, per-project tenants, optimistic-merge VM, inspector pane). The existing surface stays intact; this WS layers v0.13 reliability + recovery affordances on top.
|
||||||
|
**Owner:** TBD
|
||||||
|
**Reviewers:** Alan (always); whoever is on Kanban duty during v2.8 cycle.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Goals
|
||||||
|
|
||||||
|
The Hermes v0.13.0 release notes list eight Kanban-shaped items in scope for Scarf:
|
||||||
|
|
||||||
|
1. **Hallucination gate + recovery UX** for worker-created cards — workers now emit a "I created a follow-up card" claim that Hermes flags as `hallucination_gate_status=pending` until something verifies the underlying card exists. Scarf needs to render the flag and offer Verify / Reject so the user is the verification gate.
|
||||||
|
2. **Generic diagnostics engine** for task distress signals — Hermes now emits a structured diagnostics array on a task / run when it observes distress (heartbeat-stalled, repeated tool errors, unbounded retry loop, OOM proxy, etc.). Scarf needs to render those diagnostics in the inspector so the user can act before the auto-block fires.
|
||||||
|
3. **Per-task `max_retries` override** — `hermes kanban create --max-retries N` (write-once at create) and the field shows up on `kanban show --json`. Surface on the create sheet + inspector header.
|
||||||
|
4. **Multiline textarea for inline-create title** — v0.13 server tolerates multi-line titles. The Scarf create sheet's title is currently a single-line `ScarfTextField`; convert to a multi-line input so a long title doesn't get clipped on hover-truncate.
|
||||||
|
5. **Heartbeat / reclaim / zombie / retry-cap reliability fixes** — mostly server-side, but Scarf's run-row + log-tab phrasing ("stale_lock") becomes user-hostile when v0.13 emits a richer outcome ("zombied — reclaimed by reaper"). Render the new outcome string verbatim and add a glossary tooltip.
|
||||||
|
6. **Auto-block workers that exit without completing** + `auto_blocked_reason` — currently Scarf renders a generic "Last run: blocked" banner; v0.13 attaches a structured reason ("worker exited without `kanban complete`"). Replace the generic banner with the reason when present.
|
||||||
|
7. **Detect darwin zombie workers** — when a card is reclaimed because the worker zombied (process exited but didn't release the lock), the diagnostics engine emits a `darwin_zombie_detected` kind. Render with a specific glyph + tooltip rather than the generic stale-lock banner.
|
||||||
|
8. **Unify failure counter across spawn / timeout / crash outcomes** — server-side counter rename; Scarf's run-row outcome label rendering may need to absorb a new normalized counter (`failure_count` rather than three separate counters). Verify the run row still renders all outcomes.
|
||||||
|
|
||||||
|
The four release-notes items NOT in WS-3 scope:
|
||||||
|
|
||||||
|
- **Multi-project boards** — already shipped in v2.7.5 via per-project tenants. Hermes v0.13's "one install, many kanbans" framing is the server's catch-up to what Scarf already solved client-side via the `scarf:<slug>` tenant convention. No change here.
|
||||||
|
- **Shared board, workspaces, and worker logs across profiles** — entirely server-side; Scarf already shows whichever assignee owns a row.
|
||||||
|
- **Dashboard: workspace kind + path inputs, per-platform home-channel notification toggles** — workspace kind/path already shipped in v2.7.5 (`KanbanCreateSheet.workspaceField`); home-channel toggles are in WS-5 (gateway / messaging) not Kanban.
|
||||||
|
- **Worker task-ownership enforcement on destructive tool calls** — server-side; Scarf observes the failure mode (a run ends with `permission_denied`) but doesn't need new UI.
|
||||||
|
|
||||||
|
### Non-goals (explicitly deferred)
|
||||||
|
|
||||||
|
- **Within-column reorder.** Hermes still has no `update --priority` verb. CLAUDE.md "Kanban v3" section explicitly forbids client-side ordering sidecars.
|
||||||
|
- **Drag from Done.** Done is terminal; the v2.7.5 transition planner already throws `forbiddenTransition`. No change.
|
||||||
|
- **Mutating `priority` / `title` / `body` post-create.** No CLI verb exists. We surface `max_retries` on the inspector header in read-only form.
|
||||||
|
- **iOS read-only counterpart.** WS-9 picks up iOS catch-up. Scope here is Mac.
|
||||||
|
- **Live `watch` streaming.** v2.7.5 polls every 5s. v0.13 hasn't added a stable `watch --json` shape Scarf can rely on; deferred until a future flag (`hasKanbanWatch`).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Files to change
|
||||||
|
|
||||||
|
The plan is intentionally minimal-touch. Most of the lift is in the Mac inspector + card view + create sheet; the model layer adds a handful of `Codable` fields with `nil` defaults so pre-v0.13 hosts decode without error.
|
||||||
|
|
||||||
|
### 1. `scarf/Packages/ScarfCore/Sources/ScarfCore/Models/HermesKanbanTask.swift`
|
||||||
|
|
||||||
|
**Why:** v0.13 adds four task-level fields the inspector / card need (`max_retries`, `auto_blocked_reason`, `hallucination_gate_status`, optional `diagnostics`). The three scalar fields must be `Optional` and decode to `nil` on pre-v0.13 hosts; `diagnostics` decodes to an empty array when absent.
|
||||||
|
|
||||||
|
**Edits:**
|
||||||
|
|
||||||
|
- Add four new stored properties between `currentRunId` and the end of the property list. Preserve the existing initializer ordering: append the new parameters at the tail of the parameter list with `nil` defaults (`[]` for `diagnostics`) so call sites in `KanbanModelsTests`, etc. don't break:
|
||||||
|
- `public let maxRetries: Int?`
|
||||||
|
- `public let autoBlockedReason: String?`
|
||||||
|
- `public let hallucinationGateStatus: String?` — wire enum: `pending` / `verified` / `rejected` / nil. Stays a `String` for the same forward-compat reason `status: String` does (Hermes might add `quarantined`).
|
||||||
|
- `public let diagnostics: [HermesKanbanDiagnostic]` — defaults to `[]` when absent, matching the existing `skills` pattern (line 115).
|
||||||
|
- Extend `enum CodingKeys` with:
|
||||||
|
- `case maxRetries = "max_retries"`
|
||||||
|
- `case autoBlockedReason = "auto_blocked_reason"`
|
||||||
|
- `case hallucinationGateStatus = "hallucination_gate_status"`
|
||||||
|
- `case diagnostics`
|
||||||
|
- Extend the custom `init(from:)` with four `decodeIfPresent` calls. The `[HermesKanbanDiagnostic]` decode mirrors the `skills` decode: `(try? c.decodeIfPresent([HermesKanbanDiagnostic].self, forKey: .diagnostics)) ?? []`. Wrapping in `try?` matters: a single malformed diagnostic then costs only the diagnostics array (it falls back to `[]`) instead of failing the decode of the whole row.
|
||||||
|
- Extend the public memberwise initializer (the explicit one starting line 37): add the four parameters at the tail with `nil` defaults (`[]` for `diagnostics`) so v2.7.5 callers compile unchanged.
|
||||||
|
- Add a typed-mirror enum `KanbanHallucinationGate` next to `KanbanStatus` so views don't string-compare:
|
||||||
|
```swift
|
||||||
|
public enum KanbanHallucinationGate: String, Sendable, CaseIterable {
|
||||||
|
case pending, verified, rejected
|
||||||
|
public static func from(_ raw: String?) -> KanbanHallucinationGate? {
|
||||||
|
guard let raw, !raw.isEmpty else { return nil }
|
||||||
|
return KanbanHallucinationGate(rawValue: raw.lowercased())
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
**Tolerance contract:** A v0.12 row missing all four fields decodes successfully and renders with no v0.13 chrome. A v0.13 row with all four fields decodes and lights up the new chrome.
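
The tolerance contract can be checked with a stripped-down model. The sketch below is illustrative only: `TaskRowSketch` and `DiagnosticStub` are stand-ins that carry just an `id` plus the four new fields, not the real `HermesKanbanTask`, and the JSON key for `id` is an assumption.

```swift
import Foundation

struct DiagnosticStub: Decodable, Equatable {
    let kind: String
}

/// Stand-in for `HermesKanbanTask`, reduced to the v0.13 additions.
struct TaskRowSketch: Decodable {
    let id: String
    let maxRetries: Int?
    let autoBlockedReason: String?
    let hallucinationGateStatus: String?
    let diagnostics: [DiagnosticStub]

    enum CodingKeys: String, CodingKey {
        case id
        case maxRetries = "max_retries"
        case autoBlockedReason = "auto_blocked_reason"
        case hallucinationGateStatus = "hallucination_gate_status"
        case diagnostics
    }

    init(from decoder: Decoder) throws {
        let c = try decoder.container(keyedBy: CodingKeys.self)
        id = try c.decode(String.self, forKey: .id)
        // Absent keys decode to nil, so v0.12 rows pass through untouched.
        maxRetries = try c.decodeIfPresent(Int.self, forKey: .maxRetries)
        autoBlockedReason = try c.decodeIfPresent(String.self, forKey: .autoBlockedReason)
        hallucinationGateStatus = try c.decodeIfPresent(String.self, forKey: .hallucinationGateStatus)
        // `try?` turns a malformed array into [] instead of failing the row.
        diagnostics = (try? c.decodeIfPresent([DiagnosticStub].self, forKey: .diagnostics)) ?? []
    }
}

func decodeRow(_ json: String) throws -> TaskRowSketch {
    try JSONDecoder().decode(TaskRowSketch.self, from: Data(json.utf8))
}
```

Feeding this a row without any of the new keys, a row with all of them, and a row whose `diagnostics` is the wrong type covers the three cases the contract cares about.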
|
||||||
|
|
||||||
|
### 2. `scarf/Packages/ScarfCore/Sources/ScarfCore/Models/HermesKanbanDiagnostic.swift` (NEW)
|
||||||
|
|
||||||
|
**Why:** Diagnostics are a fresh wire shape. They're attached in two places (per-task `diagnostics: [...]` and per-run `diagnostics: [...]`), but the Swift type is shared between the two sites.
|
||||||
|
|
||||||
|
**Shape (best inference from release notes — verify against live JSON during integration):**
|
||||||
|
|
||||||
|
```swift
|
||||||
|
public struct HermesKanbanDiagnostic: Sendable, Equatable, Identifiable, Codable {
|
||||||
|
public let id: UUID = UUID()          // synthetic; not on wire. Synthesized == would compare it, so implement == over the wire fields only
|
||||||
|
public let kind: String // heartbeat_stalled | tool_error_loop | retry_cap_hit |
|
||||||
|
// unbounded_retry | darwin_zombie_detected | spawn_failure |
|
||||||
|
// worker_exit_no_complete | …
|
||||||
|
public let message: String? // human-friendly elaboration
|
||||||
|
public let detectedAt: String? // ISO-8601 (decode flexible — Unix int or string)
|
||||||
|
|
||||||
|
enum CodingKeys: String, CodingKey {
|
||||||
|
case kind
|
||||||
|
case message
|
||||||
|
case detectedAt = "detected_at"
|
||||||
|
}
|
||||||
|
// custom init(from:) for flexible timestamp decode, mirroring HermesKanbanTask.decodeFlexibleTimestamp
|
||||||
|
}
|
||||||
|
```
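
The custom `init(from:)` mentioned in the comment could look roughly like this. It is a sketch under assumptions: it does not reproduce `HermesKanbanTask.decodeFlexibleTimestamp`, it normalizes a numeric `detected_at` to ISO-8601 text on the assumption that the number is Unix seconds, and `DiagnosticSketch` is an illustrative name. It also compares only the wire fields so that the synthetic `id` does not make two identical diagnostics unequal.

```swift
import Foundation

struct DiagnosticSketch: Decodable, Equatable, Identifiable {
    let id = UUID()            // synthetic, never on the wire
    let kind: String
    let message: String?
    let detectedAt: String?    // normalized to ISO-8601 text

    enum CodingKeys: String, CodingKey {
        case kind, message
        case detectedAt = "detected_at"
    }

    init(from decoder: Decoder) throws {
        let c = try decoder.container(keyedBy: CodingKeys.self)
        kind = try c.decode(String.self, forKey: .kind)
        message = try c.decodeIfPresent(String.self, forKey: .message)
        if let text = try? c.decode(String.self, forKey: .detectedAt) {
            // Already a string: keep it verbatim.
            detectedAt = text
        } else if let seconds = try? c.decode(Double.self, forKey: .detectedAt) {
            // Assumed to be Unix seconds; convert to ISO-8601.
            detectedAt = ISO8601DateFormatter().string(from: Date(timeIntervalSince1970: seconds))
        } else {
            detectedAt = nil
        }
    }

    /// Equality over wire fields only; `id` differs on every decode.
    static func == (lhs: DiagnosticSketch, rhs: DiagnosticSketch) -> Bool {
        lhs.kind == rhs.kind && lhs.message == rhs.message && lhs.detectedAt == rhs.detectedAt
    }
}
```

With equality defined this way, round-trip and fixture-comparison tests stay stable while SwiftUI lists can still use `id` for identity.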
|
||||||
|
|
||||||
|
Plus a typed-mirror enum `KanbanDiagnosticKind` for known kinds (default `.unknown` for forward compat — matches the `KanbanStatus` / `KanbanEventKind` pattern). Glyph + color helpers live alongside it so views don't switch on raw strings.
|
||||||
|
|
||||||
|
**Cases for the typed-mirror enum (initial set; add as Hermes ships more):**
|
||||||
|
|
||||||
|
- `.heartbeatStalled` — heartbeat older than `max_runtime_seconds / 4`, glyph `waveform.path.badge.minus`, tint `.warning`
|
||||||
|
- `.toolErrorLoop` — same tool errored ≥ 3 times in a row, glyph `arrow.triangle.2.circlepath.exclamationmark`, tint `.warning`
|
||||||
|
- `.retryCapHit` — `failure_count >= max_retries`, glyph `nosign`, tint `.danger`
|
||||||
|
- `.unboundedRetry` — worker is retrying without backoff (was a v0.12 bug class), glyph `arrow.clockwise.circle.fill`, tint `.warning`
|
||||||
|
- `.darwinZombieDetected` — process zombied without releasing lock, glyph `apple.logo`, tint `.danger`
|
||||||
|
- `.spawnFailure` — `os/exec` returned non-zero spawning the worker, glyph `bolt.slash`, tint `.danger`
|
||||||
|
- `.workerExitNoComplete` — worker exited 0 without calling `kanban complete`, glyph `figure.walk.departure`, tint `.warning` (pairs with `auto_blocked_reason`)
|
||||||
|
- `.unknown` — fallback for any kind Hermes adds we don't recognize; render kind raw
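
A possible shape for the typed mirror, with the unknown fallback, is sketched below. The raw values assume the wire kinds are the snake_case strings listed in the struct comment, and `isDanger` is a hypothetical helper that stands in for the real glyph and color helpers.

```swift
/// Sketch of the typed mirror; raw values are the assumed wire kinds.
enum KanbanDiagnosticKind: String, CaseIterable {
    case heartbeatStalled = "heartbeat_stalled"
    case toolErrorLoop = "tool_error_loop"
    case retryCapHit = "retry_cap_hit"
    case unboundedRetry = "unbounded_retry"
    case darwinZombieDetected = "darwin_zombie_detected"
    case spawnFailure = "spawn_failure"
    case workerExitNoComplete = "worker_exit_no_complete"
    case unknown

    /// Never fails: any kind Hermes adds later maps to `.unknown`.
    static func from(_ raw: String) -> KanbanDiagnosticKind {
        KanbanDiagnosticKind(rawValue: raw.lowercased()) ?? .unknown
    }

    /// True for the kinds the list above tints `.danger`; the others use `.warning`.
    /// How `.unknown` should be tinted is not specified, so it is treated as non-danger here.
    var isDanger: Bool {
        switch self {
        case .retryCapHit, .darwinZombieDetected, .spawnFailure:
            return true
        default:
            return false
        }
    }
}
```

Views would keep the raw `kind` string alongside the enum so that an `.unknown` diagnostic can still show what Hermes actually sent.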
|
||||||
|
|
||||||
|
### 3. `scarf/Packages/ScarfCore/Sources/ScarfCore/Models/HermesKanbanRun.swift`
|
||||||
|
|
||||||
|
**Why:** Per-run diagnostics share the same type. The run row in the inspector renders them under the run.
|
||||||
|
|
||||||
|
**Edits:**
|
||||||
|
|
||||||
|
- Add `public let diagnostics: [HermesKanbanDiagnostic]` (defaults to `[]`).
|
||||||
|
- Extend `enum CodingKeys` with `case diagnostics`.
|
||||||
|
- Extend `init(from:)` with the same `decodeIfPresent` + `?? []` pattern.
|
||||||
|
- Extend the public memberwise initializer with the parameter (default `[]`).
|
||||||
|
- Extend `encode(to:)` with `try c.encode(diagnostics, forKey: .diagnostics)` (encoding round-trip matters for tests).
|
||||||
|
- Optional v0.13 housekeeping: `failure_count: Int?` if v0.13's unified counter is exposed on the run shape (unify failure counter across spawn / timeout / crash). If it appears as a top-level key on the run, decode it; if not, this stays a server-internal field and Scarf doesn't need it.
|
||||||
|
|
||||||
|
### 4. `scarf/Packages/ScarfCore/Sources/ScarfCore/Models/HermesKanbanTaskDetail.swift`
|
||||||
|
|
||||||
|
**Why:** No structural change required if `diagnostics` is on the inner `HermesKanbanTask`. But verify the JSON shape: in some Hermes verbs the diagnostics array hangs off the *envelope* (`{task: {…}, comments: […], events: […], diagnostics: […]}`) rather than the task. If it's on the envelope, add an optional sibling field here and have the inspector surface the merged set of task-level and envelope-level diagnostics (`task.diagnostics` is non-optional, so a `??` chain on it would not compile).
|
||||||
|
|
||||||
|
**Edits (defensive):** add `public let envelopeDiagnostics: [HermesKanbanDiagnostic]?` decoded from `case envelopeDiagnostics = "diagnostics"`. UI source of truth becomes a computed helper on the detail:
|
||||||
|
|
||||||
|
```swift
|
||||||
|
public var allDiagnostics: [HermesKanbanDiagnostic] {
|
||||||
|
let onTask = task.diagnostics
|
||||||
|
let onEnvelope = envelopeDiagnostics ?? []
|
||||||
|
// Dedup by (kind, detectedAt). Wire-side dupes are unlikely but cheap to filter.
|
||||||
|
var seen = Set<String>()
|
||||||
|
return (onTask + onEnvelope).filter {
|
||||||
|
let key = "\($0.kind)|\($0.detectedAt ?? "")"
|
||||||
|
return seen.insert(key).inserted
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### 5. `scarf/Packages/ScarfCore/Sources/ScarfCore/Models/KanbanCreateRequest.swift`
|
||||||
|
|
||||||
|
**Why:** The create sheet needs a `--max-retries N` flag.
|
||||||
|
|
||||||
|
**Edits:**
|
||||||
|
|
||||||
|
- Add `public var maxRetries: Int?` to the struct.
|
||||||
|
- Add the parameter to the public initializer (tail position, default nil).
|
||||||
|
- Extend `argv()` between `maxRuntimeSeconds` and `createdBy` (line 80-ish):
|
||||||
|
```swift
|
||||||
|
if let maxRetries {
|
||||||
|
args.append(contentsOf: ["--max-retries", String(maxRetries)])
|
||||||
|
}
|
||||||
|
```
|
||||||
|
- Argv ordering is purely cosmetic from Hermes's perspective (it re-parses), but keep deterministic order so test fixtures stay stable.
|
||||||
|
|
||||||
|
### 6. `scarf/Packages/ScarfCore/Sources/ScarfCore/Services/KanbanService.swift`
|
||||||
|
|
||||||
|
**Why:** Hallucination-gate verify / reject. Best inference from the release notes is that Hermes added a verb like `kanban verify <id>` or expanded `kanban show` with a sibling write-verb. **This needs verification** — see Open Questions #1.
|
||||||
|
|
||||||
|
**Edits (proposed; mark TODO until verified against Hermes v0.13 source):**
|
||||||
|
|
||||||
|
- Add a `verify(taskId:)` method that runs `hermes kanban verify <id>`. Returns Void; the polling loop picks up the new `hallucination_gate_status=verified`. If the verb is named differently (`hallucination verify`, `confirm`, `accept`), rename the Swift method to track. **Do not invent a CLI verb name without a real CLI to call against** — gate this behind a guarded TODO and pull from the live binary first.
|
||||||
|
- Add a `rejectHallucinated(taskId:)` method. Most likely path: the user "rejects" by archiving (since the worker's claim was a hallucination, the right resolution is to archive the bogus card). If Hermes ships a dedicated reject verb, wire it; otherwise route through `archive(taskIds:)` with a comment ("Rejected as hallucinated by Scarf user").
|
||||||
|
- **Do NOT** add a `setMaxRetries(taskId:)` post-create mutation method. Hermes pattern is write-once. Setting `max_retries` after create has no CLI verb in v0.13. Document this as a Limitation in inspector tooltips.
|
||||||
|
|
||||||
|
### 7. `scarf/scarf/Features/Kanban/Views/KanbanCreateSheet.swift`
|
||||||
|
|
||||||
|
**Why:** Multi-line title + new `Max retries` numeric field, both gated on `hasKanbanDiagnostics`.
|
||||||
|
|
||||||
|
**Edits:**
|
||||||
|
|
||||||
|
- Replace the single-line `titleField` (lines 116-122):
|
||||||
|
```swift
|
||||||
|
ScarfTextField("What needs doing?", text: $title)
|
||||||
|
```
|
||||||
|
with a multi-line variant. Two acceptable approaches:
|
||||||
|
- **Preferred:** SwiftUI `TextField` with `axis: .vertical` and `lineLimit(1...4)`. Wraps cleanly inside the existing `ScarfTextField` chrome on macOS 14.6+. Pre-existing `ScarfTextField` is a wrapper — extend the wrapper to take an optional `axis` parameter or add a new `ScarfTextEditor` sibling component to `ScarfDesign`. Touch the design package only if the multi-line variant doesn't already exist there. (Audit `Packages/ScarfDesign/` first; if `ScarfTextEditor` exists, use it.)
|
||||||
|
- **Fallback:** A bare `TextEditor` mirroring the `descriptionField` chrome, with a smaller `minHeight: 36, maxHeight: 96` so single-line titles still feel right.
|
||||||
|
- Gating: Since macOS 14.6 has no plumbing problem with multi-line text, keep the multi-line input on for **all** versions of Hermes — pre-v0.13 will simply receive a single-line title at the wire (`\n` stripped client-side before submit if Hermes < 0.13 truncates on newlines). Use the `hasKanbanDiagnostics` flag to **decide whether to strip newlines** at submit time, not whether to render the multi-line input. Read the capability via the existing `@Environment` injection pattern (look up how other create sheets read it; if not yet wired here, accept it as a `let capabilities: HermesCapabilitiesStore` init parameter).
|
||||||
|
- Add a new section between `priorityField` and `skillsField`:
|
||||||
|
```
┌─────────────────────────────┐
│ Max retries                 │
│ subtitle: "0 = no retries.  │
│           Defaults to 3."   │
│ ┌───────────────────────┐   │
│ │ Stepper: [3]   [- +]  │   │
│ └───────────────────────┘   │
└─────────────────────────────┘
```
|
||||||
|
- New `@State` storage: `@State private var maxRetries: Int = 3` and `@State private var maxRetriesEnabled: Bool = false`. Toggle gates whether `maxRetries` is sent at all (so we can preserve "let server pick the default" by leaving the flag absent).
|
||||||
|
- Show this section only when `capabilities.hasKanbanDiagnostics` is true. Pre-v0.13 hosts get the v2.7.5 sheet unchanged (no new field).
|
||||||
|
- Wire into `makeRequest()` (lines 309-347): pass `maxRetries: maxRetriesEnabled ? maxRetries : nil`.
|
||||||
|
- Strip newlines in title pre-submit when `!capabilities.hasKanbanDiagnostics` to defend against pre-v0.13 hosts: `let titleForSubmit = trimmedTitle.replacingOccurrences(of: "\n", with: " ")` only on the pre-v0.13 path.
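The newline handling in the last bullet can be sketched as a small pure function. This is a minimal illustration, not code from the Scarf repo: `titleForSubmit(_:supportsMultilineTitle:)` is a hypothetical helper, and its Boolean parameter stands in for `capabilities.hasKanbanDiagnostics`. It also collapses `\r\n` pairs and blank lines, which goes slightly beyond the one-line `replacingOccurrences` call in the bullet.

```swift
import Foundation

/// Hypothetical helper (assumption, not an existing Scarf API).
/// `supportsMultilineTitle` stands in for `capabilities.hasKanbanDiagnostics`.
func titleForSubmit(_ raw: String, supportsMultilineTitle: Bool) -> String {
    let trimmed = raw.trimmingCharacters(in: .whitespacesAndNewlines)
    // v0.13+ path: keep interior newlines as typed.
    guard !supportsMultilineTitle else { return trimmed }
    // Pre-v0.13 path: collapse every run of line breaks into one space.
    return trimmed
        .components(separatedBy: .newlines)
        .map { $0.trimmingCharacters(in: .whitespaces) }
        .filter { !$0.isEmpty }
        .joined(separator: " ")
}
```

Keeping this outside the view makes the pre-v0.13 path unit-testable without standing up the create sheet.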
|
||||||
|
|
||||||
|
### 8. `scarf/scarf/Features/Kanban/Views/KanbanInspectorPane.swift`
|
||||||
|
|
||||||
|
**Why:** This is the biggest delta — diagnostics rendering, hallucination Verify/Reject banner, max_retries header chip, expanded auto_blocked_reason banner.
|
||||||
|
|
||||||
|
**Edits:**
|
||||||
|
|
||||||
|
#### 8a. Header chip row (lines 152-167)
|
||||||
|
|
||||||
|
Add a chip for `max_retries` when present (gated on `hasKanbanDiagnostics`):
|
||||||
|
|
||||||
|
```swift
if let maxRetries = task.maxRetries {
    ScarfBadge("retries: \(maxRetries)", kind: .neutral)
        .fixedSize()
        .help("Max retries set at create time. Hermes has no update verb — re-create the task to change this.")
}
```
|
||||||
|
|
||||||
|
Inserted in the chip-row HStack between `workspaceKind` and `tenant`.
|
||||||
|
|
||||||
|
#### 8b. Hallucination-gate banner (NEW, in `healthBanner(for:)`)
|
||||||
|
|
||||||
|
Insert above the existing `needsAssignee` / `hadFailedEndedRun` checks. Render only when `KanbanHallucinationGate.from(task.hallucinationGateStatus) == .pending`:
|
||||||
|
|
||||||
|
```swift
@ViewBuilder
private func hallucinationBanner(for task: HermesKanbanTask) -> some View {
    HStack(alignment: .top, spacing: ScarfSpace.s2) {
        Image(systemName: "questionmark.diamond.fill")
            .foregroundStyle(ScarfColor.warning)
            .font(.system(size: 13, weight: .semibold))
        VStack(alignment: .leading, spacing: 4) {
            Text("Created by a worker — verify before running")
                .scarfStyle(.captionStrong)
                .foregroundStyle(ScarfColor.foregroundPrimary)
            Text("A worker claimed it created this card; Hermes hasn't confirmed the underlying work exists. Verify the card matches a real follow-up, or reject if it's a hallucinated reference.")
                .scarfStyle(.caption)
                .foregroundStyle(ScarfColor.foregroundMuted)
            HStack(spacing: ScarfSpace.s2) {
                Button("Verify") { onVerifyHallucination() }
                    .buttonStyle(ScarfPrimaryButton())
                Button("Reject") { onRejectHallucination() }
                    .buttonStyle(ScarfDestructiveButton())
            }
            .padding(.top, 2)
        }
        Spacer(minLength: 0)
    }
    .padding(ScarfSpace.s2)
    .background(
        RoundedRectangle(cornerRadius: ScarfRadius.md, style: .continuous)
            .fill(ScarfColor.warning.opacity(0.10))
    )
    .overlay(
        RoundedRectangle(cornerRadius: ScarfRadius.md, style: .continuous)
            .strokeBorder(ScarfColor.warning.opacity(0.4), lineWidth: 1)
    )
}
```
|
||||||
|
|
||||||
|
Two new closure parameters on the inspector init: `onVerifyHallucination: () -> Void`, `onRejectHallucination: () -> Void`. They're called from the buttons; `KanbanBoardView` wires them to `viewModel.verifyHallucination(taskId:)` / `viewModel.rejectHallucination(taskId:)` (see §10 and §11).
|
||||||
|
|
||||||
|
Render order in `healthBanner`: hallucination first (the user must resolve this before anything else makes sense), then unassigned, then last-failed-run. Stack vertically inside a `VStack(alignment: .leading, spacing: ScarfSpace.s2)` rather than the current `if/else if`.
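The move from `if/else if` to a stacked layout means several banners can show at once, in a fixed order. A minimal sketch of that ordering as plain data, assuming three Boolean inputs; `HealthBannerKind`, `BannerInputs`, and `bannerStack(for:)` are hypothetical names used only for illustration, not existing Scarf types:

```swift
/// Hypothetical model of the banner stack (illustrative names only).
enum HealthBannerKind: Equatable {
    case hallucinationPending, unassigned, lastRunFailed
}

struct BannerInputs {
    var gatePending: Bool        // gate status == .pending
    var needsAssignee: Bool      // existing `needsAssignee` check
    var hadFailedEndedRun: Bool  // existing `hadFailedEndedRun` check
}

/// Independent `if`s (not `else if`) so every applicable banner is kept,
/// in the order the plan specifies.
func bannerStack(for inputs: BannerInputs) -> [HealthBannerKind] {
    var stack: [HealthBannerKind] = []
    if inputs.gatePending { stack.append(.hallucinationPending) }
    if inputs.needsAssignee { stack.append(.unassigned) }
    if inputs.hadFailedEndedRun { stack.append(.lastRunFailed) }
    return stack
}
```

In the view, each element would map to one banner row inside the `VStack`; an empty array means no banner is rendered.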
|
||||||
|
|
||||||
|
#### 8c. Auto-blocked reason banner (extension of existing red banner)
|
||||||
|
|
||||||
|
Currently `healthBanner` renders a generic "Last run: blocked" message. v0.13 ships `auto_blocked_reason` on the task itself. Update logic:
|
||||||
|
|
||||||
|
```swift
if KanbanStatus.from(task.status) == .blocked,
   let reason = task.autoBlockedReason, !reason.isEmpty {
    bannerRow(
        icon: "exclamationmark.octagon.fill",
        tint: ScarfColor.danger,
        title: "Auto-blocked",
        message: reason // verbatim — Hermes-side message is the source of truth
    )
}
```
|
||||||
|
|
||||||
|
This banner takes precedence over the existing `lastEndedRun.outcome == "blocked"` rendering (server-side reason is more specific than client-side derived).
|
||||||
|
|
||||||
|
#### 8d. Diagnostics rendering on Runs tab
|
||||||
|
|
||||||
|
Below each `runRow(_:)` (lines 562-594), insert a `diagnosticsBlock(_:)` when the run has any diagnostics:
|
||||||
|
|
||||||
|
```swift
if !run.diagnostics.isEmpty {
    diagnosticsBlock(run.diagnostics)
}
```
|
||||||
|
|
||||||
|
```swift
@ViewBuilder
private func diagnosticsBlock(_ diags: [HermesKanbanDiagnostic]) -> some View {
    VStack(alignment: .leading, spacing: 4) {
        Text("Diagnostics")
            .scarfStyle(.captionUppercase)
            .foregroundStyle(ScarfColor.foregroundFaint)
        FlowLayout(spacing: 4) { // reuse the existing layout primitive if present; otherwise use a horizontal ScrollView
            ForEach(diags) { diag in
                let kind = KanbanDiagnosticKind.from(diag.kind)
                ScarfBadge(diag.kind, kind: kind.badgeKind)
                    .help(diag.message ?? diag.kind)
            }
        }
    }
    .padding(.top, 4)
}
```
|
||||||
|
|
||||||
|
If a `FlowLayout` primitive doesn't exist in the codebase, fall back to a single-line `ScrollView(.horizontal, showsIndicators: false)` so a long diag list doesn't blow out card width.
|
||||||
|
|
||||||
|
#### 8e. Diagnostics on the task header
|
||||||
|
|
||||||
|
Top-level diagnostics (`task.diagnostics`, NOT the per-run ones) are about the task, not a specific attempt. Render under the chip row in the header:
|
||||||
|
|
||||||
|
```swift
if !task.diagnostics.isEmpty {
    diagnosticsBlock(task.diagnostics)
        .padding(.top, 4)
}
```
|
||||||
|
|
||||||
|
#### 8f. Action bar update
|
||||||
|
|
||||||
|
When `KanbanHallucinationGate.from(task.hallucinationGateStatus) == .pending`, suppress the "Start" button (Verify-or-Reject is the gate). The existing `primaryAction` switch already keys on `KanbanStatus.from(task.status)`; add a check at the top of the `@ViewBuilder` `primaryAction`:
|
||||||
|
|
||||||
|
```swift
if KanbanHallucinationGate.from(task.hallucinationGateStatus) == .pending {
    EmptyView() // banner provides the actions
} else {
    // existing switch
}
```
|
||||||
|
|
||||||
|
### 9. `scarf/scarf/Features/Kanban/Views/KanbanCardView.swift`
|
||||||
|
|
||||||
|
**Why:** Card-level signals — hallucination dim + glyph, auto-block sub-line, diagnostics indicator.
|
||||||
|
|
||||||
|
**Edits:**
|
||||||
|
|
||||||
|
- New computed `private var hallucinationGate: KanbanHallucinationGate?` reading off the task.
|
||||||
|
- In `body`, apply 0.6 opacity when `hallucinationGate == .pending`:
|
||||||
|
```swift
.opacity(task.isDone ? doneOpacity : (hallucinationGate == .pending ? 0.6 : 1.0))
```
|
||||||
|
- In `titleRow`, add a warning-tinted `questionmark.diamond.fill` glyph when `hallucinationGate == .pending`. It overlaps semantically with the existing `needsAssignmentWarning` glyph, so:
|
||||||
|
- If both are true, prefer the hallucination glyph (more specific).
|
||||||
|
- Render at the same right-side slot.
|
||||||
|
```swift
if hallucinationGate == .pending {
    Image(systemName: "questionmark.diamond.fill")
        .foregroundStyle(ScarfColor.warning)
        .font(.system(size: 11, weight: .semibold))
        .help("Worker-created — verify before running")
} else if needsAssignmentWarning {
    Image(systemName: "exclamationmark.triangle.fill")
        .foregroundStyle(ScarfColor.warning)
        .font(.system(size: 11, weight: .semibold))
        .help("Unassigned — Hermes's dispatcher silently skips tasks with no assignee, …")
}
```
|
||||||
|
- Auto-block sub-line: one option is to extend the blocked branch of `relativeTimeLabel` (lines 254-260) so that, if `task.autoBlockedReason` is present, it appends the first 30 characters of the reason.
  - Easier path: don't shoehorn it into `relativeTimeLabel`. Add a separate sub-line in the footer above the existing `relativeTimeLabel` when `KanbanStatus.from(task.status) == .blocked` and `task.autoBlockedReason` is non-empty:
|
||||||
|
```swift
if KanbanStatus.from(task.status) == .blocked,
   let reason = task.autoBlockedReason, !reason.isEmpty {
    Text(reason.prefix(60))
        .scarfStyle(.caption)
        .foregroundStyle(ScarfColor.danger)
        .lineLimit(1)
        .truncationMode(.tail)
        .help(reason)
}
```
|
||||||
|
- Diagnostics indicator (subtle): if `!task.diagnostics.isEmpty`, render a small `stethoscope` glyph in the footer right side next to the priority indicator:
|
||||||
|
```swift
if !task.diagnostics.isEmpty {
    Image(systemName: "stethoscope")
        .font(.system(size: 9))
        .foregroundStyle(ScarfColor.warning)
        .help("\(task.diagnostics.count) diagnostic signal\(task.diagnostics.count == 1 ? "" : "s")")
}
```
|
||||||
|
- Done dim: leave alone; v0.13 darwin-zombie fix doesn't change Done semantics.
|
||||||
|
|
||||||
|
### 10. `scarf/scarf/Features/Kanban/Views/KanbanBoardView.swift`
|
||||||
|
|
||||||
|
**Why:** Wire the new inspector callbacks (`onVerifyHallucination`, `onRejectHallucination`) into the VM.
|
||||||
|
|
||||||
|
**Edits:**
|
||||||
|
|
||||||
|
- In the inspector instantiation, pass two new closures:
|
||||||
|
```swift
KanbanInspectorPane(
    service: viewModel.service,
    taskId: id,
    ...,
    onVerifyHallucination: { viewModel.verifyHallucination(taskId: id) },
    onRejectHallucination: { viewModel.rejectHallucination(taskId: id) }
)
```
|
||||||
|
- The capability gate is ambient: `HermesCapabilitiesStore` is injected via `.environment(_:)` from `ContextBoundRoot` (already in place per CLAUDE.md). Read it with `@Environment(HermesCapabilitiesStore.self)` and pass the relevant flag down to `KanbanCreateSheet` for the max-retries field.
|
||||||
|
|
||||||
|
### 11. `scarf/scarf/Features/Kanban/ViewModels/KanbanBoardViewModel.swift`
|
||||||
|
|
||||||
|
**Why:** Add `verifyHallucination(taskId:)` and `rejectHallucination(taskId:)` methods. Also extend the optimistic-override mechanism to cover hallucination-gate transitions so the banner disappears immediately on Verify (and the card un-dims).
|
||||||
|
|
||||||
|
**Edits:**
|
||||||
|
|
||||||
|
- Add a sibling override map for hallucination state:
|
||||||
|
```swift
/// Mirrors `optimisticOverrides` but for hallucination-gate transitions.
/// Cleared when the polled response confirms the new gate status.
private var optimisticHallucinationOverrides: [String: KanbanHallucinationGate] = [:]
```
|
||||||
|
- Alternatively, extend `optimisticOverrides` to a richer struct:
|
||||||
|
```swift
private struct OptimisticOverride {
    var status: String?
    var hallucinationGate: KanbanHallucinationGate?
}
private var optimisticOverrides: [String: OptimisticOverride] = [:]
```
|
||||||
|
This is cleaner long-term; touches more existing code (~10 lines). Recommend the struct approach.
|
||||||
|
- Add `verifyHallucination(taskId:)`:
|
||||||
|
```swift
func verifyHallucination(taskId: String) {
    // Optimistic — flip to verified locally so banner disappears.
    optimisticOverrides[taskId, default: .init()].hallucinationGate = .verified
    Task {
        do {
            try await service.verify(taskId: taskId) // pending CLI verb confirmation; see Open Questions
            await refresh()
        } catch let err as KanbanError {
            optimisticOverrides[taskId]?.hallucinationGate = nil
            lastError = err.errorDescription
        } catch {
            optimisticOverrides[taskId]?.hallucinationGate = nil
            lastError = error.localizedDescription
        }
    }
}
```
|
||||||
|
- Add `rejectHallucination(taskId:)`:
|
||||||
|
```swift
func rejectHallucination(taskId: String) {
    // Treat as archive + comment for clarity in the audit trail.
    Task {
        do {
            try await service.comment(taskId: taskId, text: "Rejected as hallucinated (no underlying work).", author: nil)
            try await service.archive(taskIds: [taskId])
            await refresh()
        } catch let err as KanbanError {
            lastError = err.errorDescription
        } catch {
            lastError = error.localizedDescription
        }
    }
}
```
|
||||||
|
**Note:** if Hermes v0.13 adds a dedicated `kanban reject` or `kanban hallucination reject` verb, swap the body to call it. Either way, the VM API stays stable — the surface for views is "reject" returning Void.
|
||||||
|
- Update `mergePolledTasks` to clear the `hallucinationGate` side of an `optimisticOverrides` entry when the polled task's `hallucination_gate_status` matches (shown with the recommended struct approach):
|
||||||
|
```swift
for (id, override) in optimisticOverrides {
    guard let row = filtered.first(where: { $0.id == id }) else {
        if !presentIds.contains(id) {
            optimisticOverrides.removeValue(forKey: id)
        }
        continue
    }
    // Status side (existing).
    if let optStatus = override.status,
       columnFromStatus(optStatus) == columnFromStatus(row.status) {
        optimisticOverrides[id]?.status = nil
    }
    // Hallucination gate side (new).
    if let optGate = override.hallucinationGate,
       KanbanHallucinationGate.from(row.hallucinationGateStatus) == optGate {
        optimisticOverrides[id]?.hallucinationGate = nil
    }
    // Empty override — drop entirely.
    if optimisticOverrides[id]?.status == nil,
       optimisticOverrides[id]?.hallucinationGate == nil {
        optimisticOverrides.removeValue(forKey: id)
    }
}
```
|
||||||
|
- Update `effectiveColumn`, and add a new `effectiveHallucinationGate(_:)`, so that both consult the override before falling back to the polled value.
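The plan names `effectiveHallucinationGate(_:)` without showing it. One possible shape, as a self-contained sketch: `SketchGate`, `SketchOverride`, and `SketchBoardState` are hypothetical stand-ins for `KanbanHallucinationGate`, `OptimisticOverride`, and the view model, and the signature here (task id plus polled value) is an assumption rather than the final API.

```swift
/// Stand-ins for the plan's types so the sketch compiles on its own.
enum SketchGate: Equatable { case pending, verified, rejected }

struct SketchOverride {
    var status: String?
    var hallucinationGate: SketchGate?
}

struct SketchBoardState {
    var overrides: [String: SketchOverride] = [:]

    /// The optimistic value wins while it exists; once the merge step
    /// clears it, the polled value is returned again.
    func effectiveGate(taskId: String, polled: SketchGate?) -> SketchGate? {
        overrides[taskId]?.hallucinationGate ?? polled
    }
}
```

Both the card dim and the inspector banner would read this effective value rather than the raw polled field, so a Verify click takes effect before the next poll.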
|
||||||
|
|
||||||
|
### 12. `scarf/Packages/ScarfCore/Tests/ScarfCoreTests/KanbanModelsTests.swift`
|
||||||
|
|
||||||
|
**Why:** The tolerant-decode contract is the single most important invariant. Tests must cover both shapes.
|
||||||
|
|
||||||
|
**Edits:**
|
||||||
|
|
||||||
|
#### 12a. New test — v0.13 task shape decodes with all new fields populated:
|
||||||
|
|
||||||
|
```swift
@Test func decodeV013TaskFields() throws {
    let json = """
    {
      "id": "t_v013",
      "title": "v0.13 task",
      "status": "blocked",
      "max_retries": 5,
      "auto_blocked_reason": "worker exited without `kanban complete`",
      "hallucination_gate_status": "pending",
      "diagnostics": [
        {"kind": "worker_exit_no_complete", "message": "exit code 0 with no complete call", "detected_at": 1778160614},
        {"kind": "darwin_zombie_detected", "detected_at": "2026-05-09T12:00:00Z"}
      ]
    }
    """
    let task = try JSONDecoder().decode(HermesKanbanTask.self, from: Data(json.utf8))
    #expect(task.maxRetries == 5)
    #expect(task.autoBlockedReason?.contains("kanban complete") == true)
    #expect(task.hallucinationGateStatus == "pending")
    #expect(task.diagnostics.count == 2)
    #expect(task.diagnostics.first?.kind == "worker_exit_no_complete")
    #expect(task.diagnostics.last?.detectedAt?.contains("2026") == true)
}
```
|
||||||
|
|
||||||
|
#### 12b. New test — v0.12 (legacy) task shape decodes with new fields = nil/empty:
|
||||||
|
|
||||||
|
```swift
@Test func decodeV012TaskHasNoNewFields() throws {
    let json = """
    {"id": "t_legacy", "title": "v0.12 task", "status": "ready"}
    """
    let task = try JSONDecoder().decode(HermesKanbanTask.self, from: Data(json.utf8))
    #expect(task.maxRetries == nil)
    #expect(task.autoBlockedReason == nil)
    #expect(task.hallucinationGateStatus == nil)
    #expect(task.diagnostics.isEmpty)
}
```
|
||||||
|
|
||||||
|
#### 12c. New test — a malformed `diagnostics` field doesn't poison the task:
|
||||||
|
|
||||||
|
```swift
@Test func decodeMalformedDiagnosticTolerated() throws {
    // If Hermes emits a malformed diagnostic, the rest of the task should
    // still decode. We use try? on the diagnostics decode so a single
    // bad entry doesn't reject the whole row.
    let json = """
    {
      "id": "t_x",
      "title": "x",
      "status": "ready",
      "diagnostics": "not-an-array"
    }
    """
    let task = try JSONDecoder().decode(HermesKanbanTask.self, from: Data(json.utf8))
    #expect(task.id == "t_x")
    // Diagnostics field couldn't decode — treat as empty.
    #expect(task.diagnostics.isEmpty)
}
```
|
||||||
|
|
||||||
|
#### 12d. New test — `KanbanHallucinationGate.from(_:)` mirror:
|
||||||
|
|
||||||
|
```swift
@Test func hallucinationGateMirrorMapsKnownValues() {
    #expect(KanbanHallucinationGate.from("pending") == .pending)
    #expect(KanbanHallucinationGate.from("verified") == .verified)
    #expect(KanbanHallucinationGate.from("REJECTED") == .rejected) // case-insensitive
    #expect(KanbanHallucinationGate.from(nil) == nil)
    #expect(KanbanHallucinationGate.from("") == nil)
    #expect(KanbanHallucinationGate.from("quarantined") == nil) // unknown returns nil
}
```
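One shape that would satisfy the assertions in 12d is a string-backed enum with a lowercasing factory. This is a hedged sketch only: `SketchHallucinationGate` is a hypothetical stand-in, and the real `KanbanHallucinationGate` defined elsewhere in this plan may differ.

```swift
/// Hypothetical stand-in for `KanbanHallucinationGate` (assumption).
enum SketchHallucinationGate: String, Equatable {
    case pending, verified, rejected

    /// Case-insensitive mapping from the wire string.
    /// nil, empty, and unknown values all map to nil.
    static func from(_ raw: String?) -> SketchHallucinationGate? {
        guard let raw = raw, !raw.isEmpty else { return nil }
        return SketchHallucinationGate(rawValue: raw.lowercased())
    }
}
```

Returning nil for unknown strings means a future gate state added server-side degrades to "no gate UI" rather than being misread as pending.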
|
||||||
|
|
||||||
|
#### 12e. New test — KanbanCreateRequest argv carries `--max-retries`:
|
||||||
|
|
||||||
|
```swift
@Test func createRequestArgvIncludesMaxRetries() {
    let req = KanbanCreateRequest(title: "t", maxRetries: 5)
    let argv = req.argv()
    #expect(argv.contains("--max-retries"))
    #expect(argv.contains("5"))
}

@Test func createRequestArgvOmitsMaxRetriesWhenAbsent() {
    let req = KanbanCreateRequest(title: "t")
    let argv = req.argv()
    #expect(!argv.contains("--max-retries"))
}
```
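The two argv tests above pin an "emit the flag only when set" rule. A minimal sketch of that rule, assuming a simplified request type: `SketchCreateRequest` is hypothetical, and the leading `"kanban", "create"` tokens are an assumed argv prefix for illustration. Only the `--max-retries` flag name comes from this plan.

```swift
/// Hypothetical stand-in for `KanbanCreateRequest` (assumption).
struct SketchCreateRequest {
    var title: String
    var maxRetries: Int? = nil

    func argv() -> [String] {
        // Assumed prefix; the real builder's verb and argument order may differ.
        var args = ["kanban", "create", title]
        // Omit the flag entirely when nil so the server picks its own default.
        if let maxRetries = maxRetries {
            args += ["--max-retries", String(maxRetries)]
        }
        return args
    }
}
```

Passing flag and value as separate argv elements avoids any shell quoting concerns when the process is spawned directly.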
|
||||||
|
|
||||||
|
#### 12f. New test — Run with diagnostics decodes:
|
||||||
|
|
||||||
|
```swift
@Test func decodeRunWithDiagnostics() throws {
    let json = """
    {
      "id": 1,
      "task_id": "t_x",
      "status": "failed",
      "started_at": 1778160000,
      "ended_at": 1778160300,
      "outcome": "crashed",
      "error": "OOM",
      "diagnostics": [
        {"kind": "retry_cap_hit", "message": "3/3 retries exhausted"}
      ]
    }
    """
    let run = try JSONDecoder().decode(HermesKanbanRun.self, from: Data(json.utf8))
    #expect(run.diagnostics.count == 1)
    #expect(run.diagnostics.first?.kind == "retry_cap_hit")
}

@Test func decodeRunWithoutDiagnostics() throws {
    let json = """
    {"id": 1, "task_id": "t_x", "status": "running", "started_at": 1778160000}
    """
    let run = try JSONDecoder().decode(HermesKanbanRun.self, from: Data(json.utf8))
    #expect(run.diagnostics.isEmpty)
}
```
|
||||||
|
|
||||||
|
These tests pin the tolerant-decode contract on both sides (with new fields, without new fields). Pre-v0.13 hosts running v2.8 Scarf must keep decoding cleanly — without these tests we'd ship a regression that bites every customer not yet on Hermes v0.13.
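For reference, the tolerant-decode contract these tests pin can be met with `try?` around each optional field. The sketch below is an illustration under assumptions, not the `HermesKanbanTask` implementation: `SketchTask` and `SketchDiagnostic` are hypothetical, and normalizing an integer `detected_at` to its decimal string is a guess at how the int-or-string field might be stored.

```swift
import Foundation

/// Hypothetical stand-in for `HermesKanbanDiagnostic` (assumption).
struct SketchDiagnostic: Decodable {
    let kind: String
    let message: String?
    let detectedAt: String?  // wire value may be an Int or a String

    enum CodingKeys: String, CodingKey {
        case kind, message
        case detectedAt = "detected_at"
    }

    init(from decoder: Decoder) throws {
        let c = try decoder.container(keyedBy: CodingKeys.self)
        kind = try c.decode(String.self, forKey: .kind)
        message = try? c.decode(String.self, forKey: .message)
        if let text = try? c.decode(String.self, forKey: .detectedAt) {
            detectedAt = text
        } else if let epoch = try? c.decode(Int.self, forKey: .detectedAt) {
            detectedAt = String(epoch)  // assumed normalization
        } else {
            detectedAt = nil
        }
    }
}

/// Hypothetical stand-in for `HermesKanbanTask`, trimmed to three fields.
struct SketchTask: Decodable {
    let id: String
    let maxRetries: Int?
    let diagnostics: [SketchDiagnostic]

    enum CodingKeys: String, CodingKey {
        case id, diagnostics
        case maxRetries = "max_retries"
    }

    init(from decoder: Decoder) throws {
        let c = try decoder.container(keyedBy: CodingKeys.self)
        id = try c.decode(String.self, forKey: .id)
        // Missing key or wrong type both collapse to nil / empty.
        maxRetries = try? c.decode(Int.self, forKey: .maxRetries)
        diagnostics = (try? c.decode([SketchDiagnostic].self, forKey: .diagnostics)) ?? []
    }
}
```

Note that wrapping the whole array in `try?` means one undecodable element empties the entire list; per-element tolerance would need an unkeyed container loop, which this sketch leaves out.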
|
||||||
|
|
||||||
|
### 13. `scarf/Packages/ScarfDesign/` — IF a multi-line text component is missing
|
||||||
|
|
||||||
|
**Why:** If `ScarfTextField` doesn't already accept an `axis: .vertical` parameter (likely the case in v2.7.5), add one OR add a `ScarfTextEditor` component to the design package so the create sheet can use the design-system token.
|
||||||
|
|
||||||
|
**Conservative approach:** Use `TextField` with `axis: .vertical` directly inside `KanbanCreateSheet`, styled to match `ScarfTextField` chrome (background, border, padding from `ScarfColor`/`ScarfRadius`/`ScarfSpace`). Defer adding a new design-system component to a follow-up — design-system additions deserve their own review pass and aren't on this WS's critical path.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Capability gating
|
||||||
|
|
||||||
|
All of the new Mac surface gates on `HermesCapabilities.hasKanbanDiagnostics` (already shipped in WS-1, semver `>= 0.13.0`).
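To make the `>= 0.13.0` floor concrete, here is a minimal sketch of a numeric version comparison. It is an illustration only, not the WS-1 implementation: `versionComponents(_:)` and `isVersion(_:atLeast:)` are hypothetical helpers, and treating unparseable strings (including pre-release suffixes) as "not supported" is an assumption.

```swift
/// Hypothetical helper: "0.13.0" or "v0.13" -> [0, 13, 0] / [0, 13].
/// Returns nil for anything that is not dot-separated non-negative integers.
func versionComponents(_ string: String) -> [Int]? {
    let body = string.hasPrefix("v") ? String(string.dropFirst()) : string
    let fields = body.split(separator: ".", omittingEmptySubsequences: false)
    var parts: [Int] = []
    for field in fields {
        guard let number = Int(field), number >= 0 else { return nil }
        parts.append(number)
    }
    return parts.isEmpty ? nil : parts
}

/// Compares field by field as integers, so 0.9.0 < 0.13.0
/// (a plain string comparison would get this wrong).
func isVersion(_ version: String, atLeast floor: String) -> Bool {
    guard let v = versionComponents(version),
          let f = versionComponents(floor) else { return false }
    for index in 0..<max(v.count, f.count) {
        let a = index < v.count ? v[index] : 0
        let b = index < f.count ? f[index] : 0
        if a != b { return a > b }
    }
    return true
}
```

Failing closed on unparseable versions matches the gating stance in this section: when in doubt, the host gets the v2.7.5 surface.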
|
||||||
|
|
||||||
|
### Gating decisions per surface
|
||||||
|
|
||||||
|
| Surface | Gated? | Rationale |
| --- | --- | --- |
| `max_retries` field on create sheet | Yes | Pre-v0.13 Hermes rejects `--max-retries` flag with non-zero exit. Hide the field; don't pass the flag. |
| Multi-line title input rendering | No | Multi-line input is harmless on v0.12 (the ScarfTextField is just visually taller). |
| Multi-line title submitted with `\n` | Yes | Pre-v0.13 may truncate at the first `\n`. Strip newlines client-side when `!hasKanbanDiagnostics`. |
| `max_retries` chip on inspector header | Yes | Pre-v0.13 task rows never carry `max_retries`, so `task.maxRetries` is nil — `if let` already hides it. Belt-and-suspenders: also gate on the flag. |
| Hallucination-gate banner | Yes | Pre-v0.13 task rows never carry `hallucination_gate_status`. Same `if let` belt-and-suspenders. |
| Diagnostics rendering on inspector | Yes | Pre-v0.13 tasks carry empty `diagnostics`, so the rendering would no-op. Gate explicitly anyway so a future server-side change doesn't accidentally surface partial UX on a pre-v0.13 host. |
| Auto-blocked banner with reason | Yes | Pre-v0.13 may write a similar string in a different place. Gate so we don't double-render. |
| Card hallucination dim/glyph | Yes | Same. |
| Card diagnostics dot | Yes | Same. |
| Verify / Reject buttons | Yes (hard gate) | The `kanban verify` verb (or whatever Hermes ships) doesn't exist pre-v0.13. The buttons MUST be hidden, not just disabled — a disabled button conveys "this might work later in this session" which is wrong for a capability-gated feature. |
|
||||||
|
|
||||||
|
### Plumbing
|
||||||
|
|
||||||
|
`HermesCapabilitiesStore` is already injected via `.environment(_:)` on `ContextBoundRoot` (Mac) — see CLAUDE.md "Capability gating" section. Read in views with `@Environment(HermesCapabilitiesStore.self) private var capabilities` (or whatever key is currently used; verify with the existing `Curator` feature gating).
|
||||||
|
|
||||||
|
**No new HermesCapabilities flag.** WS-1 already shipped `hasKanbanDiagnostics` covering all eight v0.13 Kanban additions in a single boolean. Resist the urge to split into `hasHallucinationGate` / `hasDiagnostics` / `hasMaxRetries` — Hermes shipped them together, and finer gating is YAGNI per the CLAUDE.md "Kanban v3" pattern.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## How to test
|
||||||
|
|
||||||
|
### Unit tests (KanbanModelsTests)
|
||||||
|
|
||||||
|
The test additions are listed above (§12a–§12f). Run:
|
||||||
|
|
||||||
|
```bash
xcodebuild -project scarf/scarf.xcodeproj \
  -scheme ScarfCore \
  -destination 'platform=macOS' \
  test
```
|
||||||
|
|
||||||
|
All v0.13 fixtures should decode AND all v0.12 fixtures should continue to decode. The two-shape pair is the critical contract.
|
||||||
|
|
||||||
|
### Manual smoke (against a real Hermes v0.13 host)
|
||||||
|
|
||||||
|
Per CLAUDE.md "remote-servers dogfooding" memory: dogfood against the Mardon Mac Mini at 192.168.0.82 — set context to that server (or run against local v0.13 install).
|
||||||
|
|
||||||
|
1. **Hallucination gate end-to-end**
|
||||||
|
- Trigger a worker that creates a follow-up card via the agent's tooling. Server flips it to `pending`.
|
||||||
|
- Card on board: 0.6 opacity, warning-tinted `questionmark.diamond.fill` glyph in title row.
|
||||||
|
- Inspector: yellow banner above body with Verify / Reject buttons.
|
||||||
|
- Click Verify: optimistic flip — banner disappears immediately, card un-dims. Within 5s, polled state confirms `verified`. No regressions in optimistic-override clearing.
|
||||||
|
- Click Reject (on a different pending task): comment + archive sequence runs; card disappears from active board (visible only with "Show archived").
|
||||||
|
|
||||||
|
2. **Diagnostics**
|
||||||
|
- Trigger a worker that hits a heartbeat stall (e.g. have it sleep longer than the heartbeat interval). Verify `heartbeat_stalled` diagnostic appears under the run row in the inspector Runs tab.
|
||||||
|
- Trigger a tool-error loop (force a tool to error 3+ times). Verify `tool_error_loop` diagnostic shows up.
|
||||||
|
- Verify the diagnostics indicator (the `stethoscope` glyph) on the card lights up.
|
||||||
|
|
||||||
|
3. **`max_retries`**
|
||||||
|
- Create a task via the create sheet with Max retries = 1.
|
||||||
|
- Verify the inspector header shows `retries: 1`.
|
||||||
|
- Force a failure; verify the task is auto-blocked after 1 retry (server-side behavior).
|
||||||
|
- The chip is read-only — verify there's no edit affordance.
|
||||||
|
|
||||||
|
4. **Auto-blocked reason**
|
||||||
|
- Trigger a worker that exits 0 without calling `kanban complete`.
|
||||||
|
- Verify the inspector banner says "Auto-blocked" with the server's `auto_blocked_reason` verbatim.
|
||||||
|
- Verify the card footer shows the truncated reason in red.
|
||||||
|
|
||||||
|
5. **Multi-line title**
|
||||||
|
- In the create sheet, type a 3-line title.
|
||||||
|
- Verify the field grows.
|
||||||
|
- Submit. Verify on the Hermes v0.13 host the title is preserved with newlines (`hermes kanban show` JSON should round-trip them).
|
||||||
|
|
||||||
|
6. **Pre-v0.13 host (regression smoke)**
|
||||||
|
- Switch context to a Hermes v0.12 host.
|
||||||
|
- Verify: max-retries field hidden in create sheet; max-retries chip absent in inspector; no hallucination banner; no diagnostics rendering; create still works; existing v2.7.5 chrome unchanged.
|
||||||
|
- Title field: type a multi-line title — submit. Verify newlines were stripped client-side (no exception on the server).
|
||||||
|
|
||||||
|
### Integration smoke
|
||||||
|
|
||||||
|
Build the app and run the existing Kanban smoke flow from `docs/PRD.md` to verify drag-drop, optimistic merge, and the per-project tenant flow are unaffected. The new code paths should not change v2.7.5 behavior on a v0.13 host that happens to have no diagnostics / hallucination signals (the dominant case in normal use).
|
||||||
|
|
||||||
|
### Example v0.13 JSON fixtures (use as test inputs and as documentation)
|
||||||
|
|
||||||
|
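Drop these into `KanbanModelsTests` as inline fixtures. Use a raw multi-line string literal (`#"""` ... `"""#`) for any fixture that contains a JSON escape such as `\n`; in a plain `"""` literal Swift turns the escape into a real line break inside the JSON string, which `JSONDecoder` rejects. They're our wire-shape claim until we can validate against real CLI output during integration.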
|
||||||
|
|
||||||
|
#### Task with all v0.13 fields
|
||||||
|
|
||||||
|
```json
{
  "id": "t_v013_full",
  "title": "Investigate flaky test\nReproduces only on CI",
  "body": "Repro: run the integration suite 10x.",
  "assignee": "researcher",
  "status": "blocked",
  "priority": 75,
  "tenant": "scarf:demo",
  "workspace_kind": "scratch",
  "workspace_path": "/Users/alan/.hermes/kanban/workspaces/t_v013_full",
  "created_by": "agent:claude-sonnet-4-7",
  "created_at": 1778160614,
  "skills": ["debugging"],
  "max_runtime_seconds": 1800,
  "max_retries": 3,
  "auto_blocked_reason": "worker exited (code 0) without calling `kanban complete`",
  "hallucination_gate_status": "pending",
  "diagnostics": [
    {
      "kind": "worker_exit_no_complete",
      "message": "exit code 0 with no complete call",
      "detected_at": 1778161000
    },
    {
      "kind": "heartbeat_stalled",
      "message": "no heartbeat for 4m20s (max_runtime/4 = 7m30s, slack budget exceeded)",
      "detected_at": 1778161200
    }
  ]
}
```
|
||||||
|
|
||||||
|
#### Task with no v0.13 fields (legacy v0.12 host)
|
||||||
|
|
||||||
|
```json
{
  "id": "t_v012_legacy",
  "title": "Translate doc",
  "status": "ready",
  "priority": 50,
  "skills": []
}
```
|
||||||
|
|
||||||
|
#### Run with diagnostics
|
||||||
|
|
||||||
|
```json
{
  "id": 7,
  "task_id": "t_v013_full",
  "profile": "researcher",
  "status": "failed",
  "started_at": 1778160614,
  "ended_at": 1778160914,
  "outcome": "crashed",
  "error": "subprocess died with SIGKILL",
  "summary": null,
  "diagnostics": [
    {"kind": "darwin_zombie_detected", "message": "PID 9842 left as zombie", "detected_at": 1778160916},
    {"kind": "retry_cap_hit", "message": "3/3 retries exhausted"}
  ]
}
```
|
||||||
|
|
||||||
|
---

## Open questions

1. **What's the exact CLI verb name for hallucination-gate verify / reject?** Release notes say "hallucination gate + recovery UX" but don't enumerate the verb. Best inference is `hermes kanban verify <id>` or `hermes kanban gate verify <id>`. **Action:** before implementation, run `hermes kanban --help` against a v0.13 binary and confirm. If absent (and the gate is server-flipped automatically once a worker tries to dispatch a hallucinated card), the Reject path still works (archive + comment), but Verify becomes "do nothing" and the card waits for server-side detection. Document in code comment.

2. **Where do diagnostics live on the wire — task envelope, run envelope, or both?** Release notes: "Generic diagnostics engine for task distress signals." This implies task-level. But heartbeat-stalled is a per-run signal. Best inference: per-run for in-flight signals, per-task for cross-run signals (retry cap hit). **Action:** plan handles both via `HermesKanbanTaskDetail.allDiagnostics` and per-run `run.diagnostics`. Verify against real JSON during integration.

3. **Does Hermes v0.13 expose a `set_max_retries` verb post-create?** Release notes say "Per-task `max_retries` override configuration" — ambiguous. If it's create-only (write-once like `priority`), we surface the chip read-only and document the limitation. If it's a settable field, we add an inspector edit affordance. **Action:** confirm at integration time. Plan assumes write-once (matches Hermes pattern).

4. **Failure-counter unification — does the run row need a new field?** Release notes: "Unify failure counter across spawn / timeout / crash outcomes." Best inference: server-side, the `failure_count` is a single column rather than three columns. From Scarf's view, this changes nothing — we render `outcome` (already present), and the count is implicit (count of failed runs in `runs` array). **Action:** verify at integration. If a `failure_count: Int` field shows up, decode it on `HermesKanbanRun` (already in §3) and surface in the run row label as "x/N retries" when `max_retries` is set.

5. **How does v0.13 distinguish darwin zombie from generic stale_lock?** Release notes: "Detect darwin zombie workers." Best inference: the diagnostics array includes a `darwin_zombie_detected` kind on the run. **Action:** plan renders it via the typed-mirror enum. Verify the kind string at integration.

6. **What's the default `max_retries` value?** Plan defaults the create-sheet field to 3 with a "0 = no retries. Defaults to 3." subtitle. Confirm against `hermes kanban stats --json` defaults block (or `hermes kanban --help` text) at integration. If Hermes config exposes a global default, read it and use that as the field's pre-fill.

7. **Are there sub-commands like `hermes kanban diagnose <id>`?** Release notes don't mention, but generic-diagnostics-engine framing leaves room. If such a verb exists, the inspector's diagnostics block could grow a "Run diagnostics" button to manually trigger a fresh check. **Action:** ship without; revisit when verb existence is confirmed.

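For Open Question 4, the "x/N retries" label could be derived roughly as follows. This is a sketch under stated assumptions: `SketchRunRow` and `retriesLabel` are hypothetical names, the status string `"failed"` is taken from the run fixture, and the optional `serverFailureCount` parameter stands in for a `failure_count` field that may or may not exist on the wire. When the server supplies a count it wins; otherwise the count is the number of failed runs; with no `max_retries` there is no label at all.

```swift
// Hypothetical minimal view of a run row; only the status matters here.
struct SketchRunRow {
    let status: String
}

/// Builds the "x/N retries" label, or nil when max_retries is unset.
/// `serverFailureCount` models an unconfirmed `failure_count` wire field.
func retriesLabel(runs: [SketchRunRow],
                  maxRetries: Int?,
                  serverFailureCount: Int? = nil) -> String? {
    guard let cap = maxRetries else { return nil }
    // Fall back to the implicit count: failed runs in the runs array.
    let failed = serverFailureCount ?? runs.filter { $0.status == "failed" }.count
    return "\(failed)/\(cap) retries"
}
```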
---

## Out of scope (deferred — likely v2.8.x or v2.9)

- **iOS read-only counterpart** — covered by WS-9 (iOS catch-up). Render hallucination dim, max_retries chip, and auto_blocked_reason banner on the iOS detail sheet read-only. No buttons.
- **`watch` streaming** — when Hermes ships a stable `kanban watch --json` shape, replace the 5s polling loop. New flag `hasKanbanWatch` will gate the surface.
- **Within-column reorder** — still no `update --priority` verb. If Hermes ships one in a future minor, revisit.
- **In-place title / body edit** — same constraint. CLAUDE.md "Don't" list applies unchanged.
- **Cross-column drag from Done** — terminal state.
- **Diagnostics filter on the board** — could imagine "show only tasks with active diagnostics" toggle in the toolbar. Defer until we see how often the dot indicator fires in real use.
- **Bulk verify / reject** — multi-select card → verify all. Defer; the hallucination gate is rare enough that one-at-a-time UX is fine in v2.8.0.
- **Diagnostics history graph** — over time, "this task had heartbeat-stalled 3 times in 6 attempts" is a valuable signal. Defer to a v2.9 dashboard widget on top of the v0.13 stats endpoint.
- **Worker log → diagnostics correlation** — when a diagnostic fires at time T, scroll the log tab to that timestamp. Nice-to-have; defer.

---

## Estimate

**Engineering hours (one engineer, focused):**

| Block | Hours |
| --- | --- |
| Model additions (§1, §2, §3, §4, §5) — fields + tolerant decode | 3 |
| KanbanService verb additions (§6) — verify + reject (with TODO until CLI confirmed) | 2 |
| KanbanCreateSheet edits (§7) — multi-line title + max_retries field | 3 |
| KanbanInspectorPane edits (§8) — banners + diagnostics + header chip + action-bar gate | 5 |
| KanbanCardView edits (§9) — hallucination dim/glyph + auto-block sub-line + diagnostics dot | 2 |
| KanbanBoardView wiring (§10) | 1 |
| KanbanBoardViewModel edits (§11) — extended optimistic override + verify/reject methods | 3 |
| KanbanModelsTests additions (§12) | 2 |
| Capability gating audit / plumbing | 1 |
| Manual smoke (§How to test) — both v0.13 host and v0.12 host | 2 |
| Code review + revisions | 3 |
| **Total** | **~27 hours (≈3.5 working days)** |

**Confidence:** medium-high. The model additions and view edits are mechanical given v2.7.5's existing scaffolding (the optimistic-override pattern, the inspector pane structure, the tolerant-decode tests). The single biggest risk is the hallucination-gate CLI verb name (Open Question #1) — if Hermes shipped a verb name we can't infer, the Verify path is a stub until we see the binary's `--help`. The Reject path always works (archive + comment) so the recovery UX is functional even with #1 unresolved.

**Critical-path dependency:** none. WS-1 already shipped `hasKanbanDiagnostics`. WS-3 has no other workstream dependency.

**Risk register:**

- **Wire-shape mismatch.** If our inferred JSON shape is wrong (e.g. `diagnostics` is keyed `signals` on the wire), the model code is wrong. Mitigation: tolerant decode + integration smoke against a real v0.13 host before merging. Add a fixture-from-real-output test once we have stdout from `hermes kanban show --json` on a v0.13 host.
- **Verb-name uncertainty.** See Open Question #1. Mitigation: stub method with TODO + comment-only archive flow for Reject; ship Verify behind a feature gate in the inspector if needed.
- **Optimistic-override regressions.** Extending the override mechanism to cover hallucination state could destabilize the existing drag-drop optimistic flow. Mitigation: write the struct refactor as a single commit, run the existing transition-planner tests, then write the new tests.
- **Pre-v0.13 silent regression.** The most damaging failure mode is a v0.12 user upgrading Scarf and seeing the board stop loading. Mitigation: §12 tests pin the v0.12 contract; the gating audit table covers each surface; manual smoke against a v0.12 host is a P0 step.

---

## Appendix A — File-touch summary

| File | Purpose | Lines changed (estimate) |
| --- | --- | --- |
| `Models/HermesKanbanTask.swift` | +4 fields, init/decoder updates, +1 enum | ~50 |
| `Models/HermesKanbanDiagnostic.swift` | NEW model + enum mirror | ~80 (new file) |
| `Models/HermesKanbanRun.swift` | +1 field, init/decoder/encoder updates | ~15 |
| `Models/HermesKanbanTaskDetail.swift` | +1 envelope-level diagnostics field, +1 helper | ~20 |
| `Models/KanbanCreateRequest.swift` | +1 field, +1 argv branch | ~10 |
| `Services/KanbanService.swift` | +2 verb methods (verify, reject) | ~30 |
| `Tests/KanbanModelsTests.swift` | +6 tests | ~120 |
| `Features/Kanban/Views/KanbanCreateSheet.swift` | multi-line title + max-retries field + capability plumbing | ~80 |
| `Features/Kanban/Views/KanbanInspectorPane.swift` | hallucination banner + diagnostics + header chip + auto-block reason + action-bar gate | ~150 |
| `Features/Kanban/Views/KanbanCardView.swift` | hallucination dim/glyph + auto-block sub-line + diagnostics dot | ~50 |
| `Features/Kanban/Views/KanbanBoardView.swift` | wire new closures | ~10 |
| `Features/Kanban/ViewModels/KanbanBoardViewModel.swift` | struct override refactor + verify/reject methods + merge update | ~80 |

**Total: 12 files (1 new), roughly 695 lines changed.**

---

## Appendix B — Wiring diagram

```
Hermes v0.13 binary
        │
        │  hermes kanban show --json
        ▼
KanbanService.show ─┐
                    │
hermes kanban runs  │
        │           │
        ▼           ▼
HermesKanbanRun   HermesKanbanTaskDetail
  + diagnostics     + task.diagnostics
                    + envelope.diagnostics
                    + task.maxRetries
                    + task.autoBlockedReason
                    + task.hallucinationGateStatus
                    │
                    │  KanbanBoardViewModel polls every 5s
                    ▼
          optimisticOverrides (struct, not String)
          { taskId: { status?, hallucinationGate? } }
                    │
                    ▼
KanbanBoardView ─── KanbanCardView (dim/glyph/dot/sub-line)
        │
        └── KanbanInspectorPane
              ├── headerChips (+ retries chip)
              ├── hallucinationBanner (Verify / Reject)
              ├── autoBlockedBanner
              ├── failureBanner (existing)
              ├── unassignedBanner (existing)
              ├── runsTab (+ per-run diagnostics)
              └── actionBar (suppressed when hallucination=pending)
```

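The `optimisticOverrides (struct, not String)` step in the diagram can be pictured as a per-task struct whose set fields win over the polled snapshot. The sketch below is illustrative only: `SketchOverride`, `SketchTask`, and `applyOverrides` are hypothetical names, and the string-typed `status` / `hallucinationGate` values are placeholders for whatever enums the real view model uses. A `nil` field in the override means "no optimistic opinion", so a pending verify does not clobber a pending drag and vice versa.

```swift
// Hypothetical override record: one optional per optimistically-edited field.
struct SketchOverride: Equatable {
    var status: String?
    var hallucinationGate: String?
}

// Hypothetical polled task snapshot (what the 5s poll would return).
struct SketchTask: Equatable {
    let id: String
    var status: String
    var hallucinationGate: String?
}

/// Merges optimistic overrides onto the freshly polled tasks.
/// Only fields the override actually sets replace the server value.
func applyOverrides(_ tasks: [SketchTask],
                    overrides: [String: SketchOverride]) -> [SketchTask] {
    tasks.map { task -> SketchTask in
        guard let override = overrides[task.id] else { return task }
        var merged = task
        if let status = override.status { merged.status = status }
        if let gate = override.hallucinationGate { merged.hallucinationGate = gate }
        return merged
    }
}
```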
---

## Appendix C — UX copy register

Centralizing the user-facing strings here so a copy review pass can run before implementation.

| Surface | Copy |
| --- | --- |
| Hallucination banner title | "Created by a worker — verify before running" |
| Hallucination banner body | "A worker claimed it created this card; Hermes hasn't confirmed the underlying work exists. Verify the card matches a real follow-up, or reject if it's a hallucinated reference." |
| Hallucination banner Verify button | "Verify" |
| Hallucination banner Reject button | "Reject" |
| Card hallucination glyph tooltip | "Worker-created — verify before running" |
| Auto-blocked banner title | "Auto-blocked" |
| Auto-blocked banner body | (server-supplied verbatim from `auto_blocked_reason`) |
| Max retries chip | `retries: N` |
| Max retries chip tooltip | "Max retries set at create time. Hermes has no update verb — re-create the task to change this." |
| Diagnostics block label | "Diagnostics" (uppercase caption style) |
| Card diagnostics dot tooltip | "N diagnostic signal(s)" |
| Create sheet max-retries section header | "Max retries" |
| Create sheet max-retries subtitle | "0 = no retries. Defaults to 3." |
| Reject confirm-comment text | "Rejected as hallucinated (no underlying work)." |

---

## Appendix D — Why no dedicated `set_max_retries` verb is right

Hermes's design pattern is consistent: anything that affects how a worker is dispatched is set at `create` time and immutable afterward. `priority`, `title`, `body`, `tenant`, `max_runtime_seconds`, and now `max_retries` all follow this pattern.

The reasoning is dispatcher-correctness: a worker spawning at moment T captures the configuration at moment T. Mutating `max_retries` post-spawn would either:
- Apply only to *future* retry attempts (confusing because the user thinks they raised the cap), OR
- Apply retroactively (confusing because the dispatcher's internal counter mid-stream needs a flush).

Hermes resolves this by making the question moot — the field is write-once. Scarf's posture should be: surface the value clearly, explain the limitation, and make re-create-with-new-value cheap. We already meet the third bar (the create sheet pre-fills sensible defaults). For v2.8.0 we surface the value (max_retries chip in inspector header) and document the limitation in tooltip copy. If there's user demand for "raise the cap on this stuck task," the right move is a "Re-create with bumped retries" inspector action that reads the existing task body / assignee / etc., archives the original, and creates a sibling — a pattern v0.12 already supports without any new verbs. Defer until v2.8.x.
@@ -0,0 +1,561 @@
# WS-4 Plan: Curator archive + prune + list-archived (v2.8.0 / Hermes v0.13)

> **Scope.** Catch Scarf's Curator surface up to Hermes v0.13's new write-side
> verbs: `archive <skill>`, `prune`, `list-archived`, and the synchronous flavor
> of `run`. WS-4 owns Mac UX end-to-end + the ScarfCore parser/service work that
> backs it. iOS catches up read-only in WS-9 (deferred — note at the end).

---

## Goals

1. **Wire all four new v0.13 curator verbs** (`archive`, `prune`, `list-archived`,
   synchronous `run`) into ScarfCore through a typed actor surface so the view
   model stops shelling out via `runHermes` ad-hoc.
2. **Replace the v0.12 placeholder restore sheet** (free-form text field that
   prompted the user to remember archived skill names) with an actual list
   of archived skills returned by `hermes curator list-archived`, each row with
   per-row Restore + Prune-this-one actions.
3. **Add an "Archive" affordance** to every active-skill row in the leaderboard
   so users can manually archive a skill the curator didn't auto-archive.
4. **Add a destructive "Prune all archived" toolbar button** that opens a
   confirm sheet enumerating exactly which archived skills are about to be
   deleted forever.
5. **Make the "Run Now" button block-with-progress on v0.13+** since the verb is
   now synchronous; preserve fire-and-forget on pre-v0.13 hosts.
6. **Pre-v0.13 hosts must see the v2.7.x curator surface unchanged** — no
   "Archive" buttons, no Archived section, no Prune button. The legacy
   `CuratorRestoreSheet` stays accessible (it's all the v0.12 host has).
7. **Keep parsing pure & testable**: list-archived / prune-summary parse paths
   live in `HermesCuratorStatusParser` (or a sibling) with synthetic-fixture
   coverage in `HermesCuratorParserTests`.

Non-goals: iOS surface (WS-9), curator config knobs (out of scope — config tab
already covers `auxiliary.curator`), exporting reports.

---

## CLI integration — wire shape per verb

> **Investigation note.** Hermes v0.13 ships these verbs but neither the release
> notes nor the CLI man-page in our repo capture the exact stdout format. Plan
> assumes both human-text and `--json` are available since that's the v0.12
> Kanban convention; first task at implementation time is to run each verb
> against a real v0.13 install and capture stdout into `Tests/Fixtures/`. If
> `--json` doesn't exist for one of these verbs, fall back to a defensive
> text parser and add a `// TODO upstream` flag. **All assumed CLI flags below
> must be confirmed before wiring the parser.**

### `hermes curator list-archived [--json]`

- **Wire shape:** prefer `--json` and decode to `[HermesCuratorArchivedSkill]`.
  Fall back to text parse if the flag isn't present (mirrors `kanban runs` JSON
  envelope handling).
- **Assumed JSON shape (verify on first run):**

  ```json
  [
    {
      "name": "legacy-helper",
      "category": "templates",
      "archived_at": "2026-04-22T03:14:09Z",
      "reason": "stale: 91d unused",
      "size_bytes": 4521,
      "path": "/Users/u/.hermes/skills/.archived/legacy-helper"
    }
  ]
  ```

- **New model:** `HermesCuratorArchivedSkill` in
  `scarf/Packages/ScarfCore/Sources/ScarfCore/Models/HermesCuratorArchive.swift`
  with `name: String`, `category: String?`, `archivedAt: String?`,
  `reason: String?`, `sizeBytes: Int?`, `path: String?`. All optional except
  `name` so a stripped-down host doesn't crash the view. Identifiable on `name`.
- **Empty-state sentinel:** Hermes may print `"no archived skills"` instead of
  `[]` (parallel to `"no matching tasks"` in Kanban). Treat as empty — same
  defensive fold KanbanService does at line ~45 today.

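The JSON path plus the empty-state fold described above could look roughly like this. It is a sketch under the plan's own caveat that the wire shape is unverified: `ArchivedSkillSketch` and `parseListArchivedSketch` are hypothetical names (not the planned `HermesCuratorArchivedSkill` / `parseListArchived`), the `"no archived skills"` sentinel is the plan's guess, and the text-fallback parser is left out. The snake_case keys are mapped with `JSONDecoder`'s `.convertFromSnakeCase` strategy.

```swift
import Foundation

// Hypothetical stand-in for the planned archived-skill model.
// Everything except `name` is optional, per the plan.
struct ArchivedSkillSketch: Decodable, Equatable {
    let name: String
    let category: String?
    let archivedAt: String?
    let reason: String?
    let sizeBytes: Int?
    let path: String?
}

/// Parses assumed `list-archived --json` stdout.
/// Empty output and the assumed sentinel both fold to an empty list.
func parseListArchivedSketch(stdout: String) throws -> [ArchivedSkillSketch] {
    let trimmed = stdout.trimmingCharacters(in: .whitespacesAndNewlines)
    if trimmed.isEmpty || trimmed.lowercased().hasPrefix("no archived skills") {
        return []
    }
    let decoder = JSONDecoder()
    // archived_at -> archivedAt, size_bytes -> sizeBytes
    decoder.keyDecodingStrategy = .convertFromSnakeCase
    return try decoder.decode([ArchivedSkillSketch].self, from: Data(trimmed.utf8))
}
```

Anything that is neither the sentinel nor valid JSON throws, which is where a text-fallback parser would take over if `--json` turns out not to exist.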
### `hermes curator archive <skill-name>`

- **Wire shape:** non-destructive (skill is moved, not deleted). No `--json`
  needed — exit code is the success channel; stdout is human-readable.
- **Argv:** `["curator", "archive", name]`. No flags in v0.13.
- **Side effects we surface to the user:** the active count drops by 1, the
  archived count rises by 1 — both visible after the next `status` reload.

### `hermes curator prune [--dry-run]`

- **Wire shape:** destructive. Removes everything currently archived. Open
  question 1 (below): does Hermes v0.13 ship `--dry-run`? Plan **two code paths**:
  1. **If `--dry-run` exists:** Scarf's prune confirm sheet calls
     `hermes curator prune --dry-run` first, parses the "would remove N skills"
     output, and renders the list. Final confirmation calls
     `hermes curator prune` (no flag). This is the preferred path.
  2. **If no `--dry-run`:** Scarf calls `hermes curator list-archived` to
     enumerate what's about to be deleted, shows that list in the confirm
     sheet, then calls `hermes curator prune` once the user confirms.
- **Assumed `--dry-run` JSON output (verify):**

  ```json
  { "would_remove": [{ "name": "...", "size_bytes": 4521 }, ...], "total_bytes": 12345 }
  ```

- **Optional per-skill prune:** if Hermes accepts
  `hermes curator prune <name>` (single-skill prune), wire it as a per-row
  action in the Archived list. **Verify before implementing** — release notes
  describe `prune` only in the bulk sense. If single-skill is unavailable, the
  per-row "Prune" button on the Archived list is dropped from the v2.8
  scope and only the bulk "Prune all archived" toolbar button ships.

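The second code path (no `--dry-run`) amounts to building the confirm-sheet summary locally from the `list-archived` result. A minimal sketch, assuming hypothetical names throughout (`ArchivedEntrySketch`, `PrunePreviewSketch`, `prunePreview`, `pruneArgv`); the `--dry-run` flag in `pruneArgv` is the plan's unconfirmed assumption, not a verified Hermes flag.

```swift
// Hypothetical minimal archived entry: only what the confirm sheet needs.
struct ArchivedEntrySketch {
    let name: String
    let sizeBytes: Int?
}

// Hypothetical local equivalent of the dry-run summary.
struct PrunePreviewSketch: Equatable {
    let names: [String]
    let totalBytes: Int
}

/// Fallback path: derive the "about to be deleted" summary from list-archived.
/// Entries with an unknown size contribute 0 bytes to the total.
func prunePreview(from archived: [ArchivedEntrySketch]) -> PrunePreviewSketch {
    PrunePreviewSketch(
        names: archived.map { $0.name },
        totalBytes: archived.reduce(0) { $0 + ($1.sizeBytes ?? 0) }
    )
}

/// Argv for the prune verb. `--dry-run` is ASSUMED; confirm against a real
/// v0.13 binary before relying on it.
func pruneArgv(dryRun: Bool) -> [String] {
    ["curator", "prune"] + (dryRun ? ["--dry-run"] : [])
}
```

Because the fallback total silently treats missing sizes as zero, the confirm sheet may want to label the byte count as approximate whenever any entry lacks `sizeBytes`.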
### `hermes curator run` (now synchronous)

- **Wire shape:** unchanged argv. Behavior changes from fire-and-forget to
  blocking on v0.13+. Plan: bump the `runProcess(timeout:)` value from the
  current 30 s default to 600 s on v0.13+ hosts. Surface a `ProgressView` next
  to the "Run Now" button while the call is in flight, and disable the button
  until completion.
- **Capability branch:** `if caps.hasCuratorArchive { /* blocking with
  progress */ } else { /* fire-and-forget, immediate toast */ }`.
- **Cancel UX:** for v0.13+ blocking runs, plan a "Cancel" button that calls
  `transport.cancel()` on the running process (existing TransportError path).
  If transport-level cancel isn't reliable (Local vs Citadel parity), the
  cancel button is dropped and we just show indeterminate progress.

---

## Files to change (with specific edits)

### New files

- **`scarf/Packages/ScarfCore/Sources/ScarfCore/Services/CuratorService.swift`**
  — new `public actor CuratorService`. Mirrors `KanbanService` shape exactly:
  pure I/O, no UI state, every public method dispatches the CLI invocation
  through `Task.detached(priority: .utility)` inside the actor. Exposes:

  ```swift
  public actor CuratorService {
      public init(context: ServerContext)

      // Reads
      public func status() async -> HermesCuratorStatus           // moves logic out of VM
      public func listArchived() async throws -> [HermesCuratorArchivedSkill]

      // Writes — already-wired verbs (refactored from VM helpers)
      public func runNow(synchronous: Bool, timeout: TimeInterval) async throws
      public func pause() async throws
      public func resume() async throws
      public func pin(_ name: String) async throws
      public func unpin(_ name: String) async throws
      public func restore(_ name: String) async throws

      // Writes — new in v0.13 (WS-4)
      public func archive(_ name: String) async throws
      public func prune(dryRun: Bool) async throws -> CuratorPruneSummary

      // Pure helpers
      public nonisolated static func parseListArchived(stdout: String) throws -> [HermesCuratorArchivedSkill]
      public nonisolated static func parsePruneDryRun(stdout: String) throws -> CuratorPruneSummary
  }
  ```

  - Errors land in a new `CuratorError` enum (Sendable, LocalizedError) —
    `transport(message:)`, `nonZeroExit(verb:code:stderr:)`,
    `decoding(verb:message:)`. Identical shape to `KanbanError`.
  - `runNow(synchronous:timeout:)` takes the capability-decided sync flag from
    the call site; the service itself stays version-agnostic (only the timeout
    differs in practice).

- **`scarf/Packages/ScarfCore/Sources/ScarfCore/Models/HermesCuratorArchive.swift`**
  — new file holding `HermesCuratorArchivedSkill` and `CuratorPruneSummary`
  structs. Both are `Sendable, Equatable, Codable`; `HermesCuratorArchivedSkill`
  is also `Identifiable`.

  ```swift
  public struct HermesCuratorArchivedSkill: Sendable, Equatable, Identifiable, Codable {
      public var id: String { name }
      public let name: String
      public let category: String?
      public let archivedAt: String?
      public let reason: String?
      public let sizeBytes: Int?
      public let path: String?

      // Computed for UI — never persisted.
      public var sizeLabel: String { /* "4.4 KB" / "—" */ }
      public var archivedAtLabel: String { /* "2026-04-22" / "—" */ }
  }

  public struct CuratorPruneSummary: Sendable, Equatable, Codable {
      public let wouldRemove: [HermesCuratorArchivedSkill]
      public let totalBytes: Int
      public var totalCount: Int { wouldRemove.count }
  }
  ```

- **`scarf/scarf/Features/Curator/Views/CuratorArchivedSection.swift`** — new
  Mac sub-view used by `CuratorView`. Renders a `ScarfCard` containing the
  Archived list. Inputs: `[HermesCuratorArchivedSkill]`,
  `onRestore(name:)`, `onPruneOne(name:)?`, `onPruneAll()`. Empty-state path
  renders a "No archived skills" `ScarfCard` with copy explaining what archive
  does (helpful since Curator hasn't run yet on a fresh install).

- **`scarf/scarf/Features/Curator/Views/CuratorPruneConfirmSheet.swift`** —
  new destructive-confirm sheet. Presents the about-to-be-removed list, total
  count, total bytes, and a final "Prune permanently" red button.

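The two computed labels on `HermesCuratorArchivedSkill` are elided in the struct above. One plausible way to fill them, written as free functions so the sketch stands alone: the helper names `sizeLabelSketch` and `archivedAtLabelSketch` are hypothetical, the 1024-byte unit is an assumption chosen because it reproduces the plan's own "4.4 KB" example for 4521 bytes, and the date label simply takes the date part of an ISO-8601 string.

```swift
import Foundation

/// Human-readable size, or the plan's "—" placeholder when unknown.
/// Assumes 1024-byte units (4521 bytes -> "4.4 KB", matching the plan's example).
func sizeLabelSketch(_ bytes: Int?) -> String {
    guard let bytes = bytes, bytes >= 0 else { return "—" }
    if bytes < 1024 { return "\(bytes) B" }
    let kb = Double(bytes) / 1024
    if kb < 1024 { return String(format: "%.1f KB", kb) }
    return String(format: "%.1f MB", kb / 1024)
}

/// Date-only label from an ISO-8601 timestamp, or "—" when absent or too short.
func archivedAtLabelSketch(_ iso: String?) -> String {
    guard let iso = iso, iso.count >= 10 else { return "—" }
    return String(iso.prefix(10))
}
```

Slicing the string avoids a date-parsing dependency and keeps the label stable across time zones, at the cost of trusting the host to emit ISO-8601; a stricter version could validate with `ISO8601DateFormatter` first.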
### Edited files

- **`scarf/Packages/ScarfCore/Sources/ScarfCore/ViewModels/CuratorViewModel.swift`**
  - Replace inline `runAndReload(args:successMessage:)` helpers with
    `service.<verb>()` calls. Keep the toast + reload pattern inside the VM.
  - Add new `@Observable` state:
    - `archivedSkills: [HermesCuratorArchivedSkill] = []`
    - `isLoadingArchive = false`
    - `isPruning = false`
    - `pruneSummary: CuratorPruneSummary?`
    - `pendingArchiveName: String?` (track which skill is currently being
      archived so the row can show a small spinner without blocking the rest)
    - `errorMessage: String?` (replace transient-toast-only failure path with
      an inline-banner state, mirroring KanbanBoardViewModel)
  - Add new methods:
    - `func loadArchive() async`
    - `func archive(_ name: String) async`
    - `func planPrune() async` — calls `service.prune(dryRun: true)`, populates
      `pruneSummary`, opens the confirm sheet (sheet binding sits in the View)
    - `func confirmPrune() async` — calls `service.prune(dryRun: false)`
    - `func pruneOne(_ name: String) async` — only wired if upstream supports
      single-skill prune; otherwise method elided
  - Update `runNow()` to accept a `caps: HermesCapabilities` argument (passed
    from the View) and switch between sync/async invocations:
    - On v0.13+: `try await service.runNow(synchronous: true, timeout: 600)`,
      keeping `isLoading` set for the duration so the View can show a progress spinner.
    - On pre-v0.13: existing fire-and-forget; toast says "Curator run started".
  - Construct service lazily: `private lazy var service = CuratorService(context: context)`.

- **`scarf/Packages/ScarfCore/Sources/ScarfCore/Models/HermesCuratorReport.swift`**
  - No edits to existing models. Add archive-related types in the new
    `HermesCuratorArchive.swift` to keep the diff scoped. (Decision: keep one
    file per logical surface.)

- **`scarf/scarf/Features/Curator/Views/CuratorView.swift`**
  - Inject `@Environment(\.hermesCapabilities)` and read
    `caps?.hasCuratorArchive ?? false` once into a local `let archiveAvailable`.
  - Header toolbar additions (only when `archiveAvailable`):
    - "Prune Archived…" `ScarfDestructiveButton` in the overflow `Menu`,
      disabled when `archivedSkills.isEmpty && !isLoadingArchive`.
    - Replace "Restore Archived…" menu item with a deep-link to scroll to the
      new Archived section (when `archiveAvailable`); leave the existing
      `CuratorRestoreSheet` reachable from the same menu **only on pre-v0.13** as
      the legacy fallback. On v0.13+ the menu shows just "Prune Archived…" and
      the section becomes the restore entry point.
  - Add `archiveAvailable` to `activityTables` rendering: each row in the three
    leaderboards gains an "Archive" pin-style button (small `Image(systemName:
    "archivebox")`) next to the existing pin button. Tooltip "Archive (move
    out of active set)". Hidden on pre-v0.13.
  - Append `CuratorArchivedSection` between `activityTables` and
    `lastReportSection` whenever `archiveAvailable`. Loaded by an additional
    `viewModel.loadArchive()` call inside `.task { … }`.
  - Wire confirm sheets:
    - `.sheet(isPresented: $showPruneSheet) { CuratorPruneConfirmSheet(...) }`
    - Existing `$showRestoreSheet` stays — only shown on pre-v0.13.
  - Run Now button: while `viewModel.isLoading && archiveAvailable`, show a
    `ProgressView()` next to the button label and disable the button. Tooltip:
    "Curator running — usually 10-90s. Hermes v0.13 runs synchronously."
  - Inline error banner: render `viewModel.errorMessage` as a yellow
    `ScarfCard` above `statusSummary` with an "x" dismiss. (Use existing
    `ScarfColor.warning` background; inspect the Kanban inline banner for
    pattern.)

- **`scarf/scarf/Features/Curator/Views/CuratorRestoreSheet.swift`**
  - **No code changes.** Sheet stays as v0.12 fallback. Add a doc-comment
    update at the top noting it's legacy-only on v0.13+ — the new
    `CuratorArchivedSection` is the preferred path. Don't delete this file
    even after WS-4 ships; pre-v0.13 hosts still need it.

- **`scarf/Scarf iOS/Curator/CuratorView.swift`**
  - **No code changes in WS-4.** WS-9 will add a read-only "Archived" section
    that mirrors the Mac one without per-row write actions. Leave a
    `// TODO(WS-9):` marker.

- **`scarf/Packages/ScarfCore/Tests/ScarfCoreTests/HermesCuratorParserTests.swift`**
  - Add tests (see "How to test" below).

---

## New types / fields

### `HermesCuratorArchivedSkill` (new)

In `HermesCuratorArchive.swift`. Codable directly from the assumed
`list-archived --json` shape. All fields except `name` optional so a
stripped-down host doesn't crash decoding. Computed `sizeLabel` /
`archivedAtLabel` for the view layer; never persisted.

### `CuratorPruneSummary` (new)

Lists what `prune --dry-run` would remove, plus aggregated `totalBytes`. The
view derives `totalCount` from `wouldRemove.count` so the wire shape stays
flat.

### `CuratorError` (new)

```swift
public enum CuratorError: Error, Sendable, LocalizedError {
    case transport(message: String)
    case nonZeroExit(verb: String, code: Int32, stderr: String)
    case decoding(verb: String, message: String)
}
```

Identical shape to `KanbanError`. View model maps these to inline-banner copy.

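Mapping these cases to inline-banner copy could be done through `LocalizedError`'s `errorDescription`. The sketch below redeclares the enum under a hypothetical name, `CuratorErrorSketch`, so it compiles on its own; the banner strings are illustrative placeholders, not reviewed copy.

```swift
import Foundation

// Hypothetical copy of the planned enum, redeclared so the sketch is standalone.
enum CuratorErrorSketch: Error, LocalizedError {
    case transport(message: String)
    case nonZeroExit(verb: String, code: Int32, stderr: String)
    case decoding(verb: String, message: String)

    // Placeholder banner copy; wording is an assumption, not final UX text.
    var errorDescription: String? {
        switch self {
        case .transport(let message):
            return "Couldn't reach Hermes: \(message)"
        case .nonZeroExit(let verb, let code, let stderr):
            let detail = stderr.trimmingCharacters(in: .whitespacesAndNewlines)
            return detail.isEmpty
                ? "curator \(verb) exited with code \(code)"
                : "curator \(verb) failed (\(code)): \(detail)"
        case .decoding(let verb, let message):
            return "Couldn't read curator \(verb) output: \(message)"
        }
    }
}
```

With this in place the view model can assign `error.localizedDescription` straight to `errorMessage` without a second switch at the call site.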
### `CuratorViewModel` additions

Already enumerated above. Note: the existing `transientMessage: String?` stays
for happy-path success ("Pinned X", "Resumed", "Archived legacy-helper");
failures route through the new `errorMessage: String?` so dismissals don't
cross-contaminate.

---

## Capability gating

All branches keyed on `caps?.hasCuratorArchive ?? false` (already defined in
`HermesCapabilities.swift:138` per the WS-1 inventory).

| Surface | Pre-v0.13 (`hasCurator && !hasCuratorArchive`) | v0.13+ (`hasCuratorArchive`) |
|---|---|---|
| Sidebar item | Visible (gated on `hasCurator`) | Visible |
| Status summary, leaderboards, pinned section | Identical | Identical |
| Per-row "Archive" button | **Hidden** | Visible |
| "Archived" section in CuratorView | **Hidden** | Visible (renders empty-state if no archives) |
| "Prune Archived…" menu item | **Hidden** | Visible |
| Existing "Restore Archived…" menu item | Visible (legacy text-prompt sheet) | **Hidden** (replaced by per-row Restore in Archived section) |
| `Run Now` blocking + progress | **No** (fire-and-forget) | **Yes** (synchronous w/ progress + 600s timeout) |
| `CuratorRestoreSheet.swift` | Used | Dead code path but file kept |

The View reads `caps` once at the top of `body` and threads
`archiveAvailable: Bool` down. Don't sprinkle `caps?.hasCuratorArchive` checks
across every sub-view — single source of truth at the entry point.

**Defensive default.** If `caps` is `nil` (preview / smoke test) or detection
hasn't completed yet, `archiveAvailable` resolves to `false` and the surface
behaves like a pre-v0.13 host. Same defensive shape as the Goals / Kanban-watch
gates.

---
|
||||||
|
|
||||||
|
## How to test
|
||||||
|
|
||||||
|
### CLI fixtures (capture once, commit to repo)
|
||||||
|
|
||||||
|
Create `scarf/Packages/ScarfCore/Tests/ScarfCoreTests/Fixtures/Curator/`:
|
||||||
|
|
||||||
|
- `list-archived-empty.json` — `[]`
|
||||||
|
- `list-archived-three.json` — three skills with varied optional fields
|
||||||
|
- `list-archived-no-json-flag.txt` — text fallback (one row per line)
|
||||||
|
- `prune-dry-run.json` — `{ wouldRemove: [...], totalBytes: 12345 }`
|
||||||
|
- `status-with-archived.txt` — pre-existing fixture but with the
|
||||||
|
`archived 4` count populated (drives the badge-count test)
|
||||||
|
|
||||||
|
These are captured by running the verbs against a real Hermes v0.13 install
|
||||||
|
on the dogfooding Mardon Mac Mini (per the "remote-servers dogfooding" memory)
|
||||||
|
during implementation. **Do not commit fabricated fixtures** — every fixture
|
||||||
|
must come from a real CLI invocation; otherwise the tests lock in a parser
|
||||||
|
that doesn't match production.
|
||||||
|
|
||||||
|
### Parser tests (`HermesCuratorParserTests.swift`)
|
||||||
|
|
||||||
|
Add to the existing `@Suite struct HermesCuratorParserTests`:
|
||||||
|
|
||||||
|
- `listArchivedEmpty()` — empty array decodes to `[]`.
|
||||||
|
- `listArchivedThreeSkills()` — happy path, asserts each field including
|
||||||
|
optional `category` / `reason`.
|
||||||
|
- `listArchivedNoJSONFallback()` — text parser on the .txt fixture.
|
||||||
|
- `listArchivedNoArchivedSkillsSentinel()` — `"no archived skills"` literal in
|
||||||
|
stdout folds to `[]` (parallel to KanbanService's `"no matching tasks"`).
|
||||||
|
- `listArchivedMissingOptionalsStaysSafe()` — JSON with only `name` populated
|
||||||
|
decodes; size/date labels render `"—"`.
|
||||||
|
- `pruneDryRunHappyPath()` — `CuratorPruneSummary` decodes `wouldRemove` list
|
||||||
|
and `totalBytes`.
|
||||||
|
- `pruneDryRunZeroSkills()` — `wouldRemove: [], totalBytes: 0` is valid.
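
The decode behavior these tests pin down might look like the sketch below. It is not the real parser: `ArchivedSkillSketch`, `PruneSummarySketch`, and `parseArchivedList` are hypothetical names, every JSON field other than `name`, `wouldRemove`, and `totalBytes` is an assumption, and `wouldRemove` is assumed to be a list of skill names until real fixtures are captured.

```swift
import Foundation

// Hypothetical model for one row of `list-archived` output. Field names
// other than `name` are assumptions until real fixtures are captured.
struct ArchivedSkillSketch: Decodable, Equatable {
    let name: String
    let category: String?
    let reason: String?
    let sizeBytes: Int?
}

// Shape taken from the `prune-dry-run.json` fixture description;
// the element type of `wouldRemove` is assumed to be a skill name.
struct PruneSummarySketch: Decodable, Equatable {
    let wouldRemove: [String]
    let totalBytes: Int
}

/// Parse `list-archived` stdout. The "no archived skills" sentinel folds
/// to an empty list instead of surfacing a decode error.
func parseArchivedList(stdout: String) throws -> [ArchivedSkillSketch] {
    let trimmed = stdout.trimmingCharacters(in: .whitespacesAndNewlines)
    if trimmed.isEmpty || trimmed.lowercased().contains("no archived skills") {
        return []
    }
    return try JSONDecoder().decode([ArchivedSkillSketch].self, from: Data(trimmed.utf8))
}

let emptyList = try parseArchivedList(stdout: "[]")
let sentinelList = try parseArchivedList(stdout: "No archived skills\n")
let sparseList = try parseArchivedList(stdout: #"[{"name": "legacy-helper"}]"#)
let zeroPrune = try JSONDecoder().decode(
    PruneSummarySketch.self,
    from: Data(#"{"wouldRemove": [], "totalBytes": 0}"#.utf8)
)
```

Optional properties decode to `nil` when the key is absent, which is what lets a `name`-only row stay safe.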
|
||||||
|
|
||||||
|
### View-model tests (new file `CuratorViewModelTests.swift` — optional)
|
||||||
|
|
||||||
|
If a protocol-backed `MockCuratorService` is plausible (the actor pattern allows
|
||||||
|
swapping via a protocol), add:
|
||||||
|
|
||||||
|
- `archiveCallSucceedsAndReloads()` — verifies `viewModel.transientMessage`
|
||||||
|
flips to "Archived X" and `loadArchive()` is re-invoked.
|
||||||
|
- `archiveCallFailsRoutesToErrorBanner()` — failure path populates
|
||||||
|
`errorMessage` (not `transientMessage`).
|
||||||
|
- `pruneTwoStepFlow()` — `planPrune()` populates `pruneSummary` then
|
||||||
|
`confirmPrune()` clears it.
|
||||||
|
- `runNowIsSynchronousOnV013()` — VM passes `synchronous: true` to the service.
|
||||||
|
|
||||||
|
If extracting a protocol is too much yak-shave, plan only the parser tests.
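
For a sense of how small the protocol seam could be, here is a rough sketch. All names (`CuratorServicing`, `MockFailure`, `CuratorViewModelSketch`) are hypothetical, and the methods are synchronous only to keep the example short; the real `CuratorService` is an actor, so its calls would be `async`.

```swift
// Hypothetical protocol seam; the real service is an actor with async methods.
protocol CuratorServicing {
    func archive(skill: String) throws
    func planPrune() throws -> [String]
    func prune() throws
}

struct MockFailure: Error { let stderr: String }

final class MockCuratorService: CuratorServicing {
    var archiveError: MockFailure?
    var archived: [String] = []

    func archive(skill: String) throws {
        if let failure = archiveError { throw failure }
        archived.append(skill)
    }
    func planPrune() throws -> [String] { archived }
    func prune() throws { archived.removeAll() }
}

final class CuratorViewModelSketch {
    private let service: CuratorServicing
    var transientMessage: String?   // happy-path toast
    var errorMessage: String?       // failure banner
    var pruneSummary: [String]?     // non-nil while the confirm sheet is open

    init(service: CuratorServicing) { self.service = service }

    func archive(_ skill: String) {
        do {
            try service.archive(skill: skill)
            transientMessage = "Archived \(skill)"
        } catch let failure as MockFailure {
            errorMessage = failure.stderr
        } catch {
            errorMessage = "\(error)"
        }
    }

    func planPrune() { pruneSummary = try? service.planPrune() }

    func confirmPrune() {
        do {
            try service.prune()
            pruneSummary = nil
        } catch {
            errorMessage = "\(error)"
        }
    }
}

let mock = MockCuratorService()
let vm = CuratorViewModelSketch(service: mock)
vm.archive("legacy-helper")
let toastAfterArchive = vm.transientMessage
vm.planPrune()
let planned = vm.pruneSummary
vm.confirmPrune()

// Failure path: the error banner is populated, the toast is not.
mock.archiveError = MockFailure(stderr: "simulated failure")
vm.transientMessage = nil
vm.archive("other-skill")
```

If even this much indirection feels heavy, the parser tests alone still cover the riskiest surface.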
|
||||||
|
|
||||||
|
### UI scenarios (manual verification on Mardon)
|
||||||
|
|
||||||
|
1. **Pre-v0.13 host (Mac Mini paused at v0.12):** sidebar shows Curator;
|
||||||
|
page renders unchanged from v2.7.5; "Restore Archived…" menu item present;
|
||||||
|
no Archive section, no Prune button; `Run Now` returns immediately.
|
||||||
|
2. **v0.13 host with no archives:** Archived section shows empty-state copy
|
||||||
|
("No archived skills — Curator will move stale skills here after the next
|
||||||
|
review cycle."); "Prune Archived…" menu item disabled.
|
||||||
|
3. **v0.13 host with 3 archives:** Archived rows render with size + date;
|
||||||
|
per-row Restore moves the skill back to active (verified by status reload);
|
||||||
|
"Prune Archived…" opens confirm sheet listing all 3 with sizes; confirming
|
||||||
|
removes them.
|
||||||
|
4. **v0.13 host: archive an active skill:** click Archive on a leaderboard
|
||||||
|
row → row disappears from active list, appears in Archived section, active
|
||||||
|
count drops by 1, archived count rises by 1.
|
||||||
|
5. **v0.13 host: blocking `Run Now`:** spinner appears, button stays disabled
|
||||||
|
for the full duration; on completion the toast fires and the leaderboard
|
||||||
|
reflects the new pass.
|
||||||
|
6. **v0.13 host: prune failure mid-flight:** simulate by SIGKILL'ing the
|
||||||
|
curator process; verify error banner appears with stderr excerpt and the
|
||||||
|
archived list isn't optimistically wiped.
|
||||||
|
7. **Restore sheet legacy fallback (pre-v0.13):** unchanged — verify the
|
||||||
|
existing free-form text sheet still works.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Open questions (must resolve at implementation start)
|
||||||
|
|
||||||
|
1. **Does `hermes curator prune` ship a `--dry-run` flag in v0.13?** If yes,
|
||||||
|
the prune confirm sheet uses it for accurate "will remove these" copy. If
|
||||||
|
no, the sheet falls back to displaying the current `list-archived` output
|
||||||
|
and assumes prune removes exactly that set. This is the **biggest unknown**
|
||||||
|
in the plan — the entire prune confirm UX shape pivots on this answer.
|
||||||
|
_Resolution path: run `hermes curator prune --help` against v0.13 install
|
||||||
|
on Mardon as the very first WS-4 implementation step._
|
||||||
|
|
||||||
|
2. **Does any curator verb support `--json`?** Plan assumes yes for
|
||||||
|
`list-archived` and `prune --dry-run` since v0.12 Kanban set the precedent.
|
||||||
|
If neither does, parser fixtures shift to text-only and decode logic moves
|
||||||
|
into `HermesCuratorStatusParser`. Resolution: same as Q1.
|
||||||
|
|
||||||
|
3. **Is `hermes curator prune <name>` (single-skill prune) supported?** If so,
|
||||||
|
per-row "Prune permanently" buttons in the Archived section are easy to
|
||||||
|
add. If not, the only prune affordance is the bulk one. Plan accommodates
|
||||||
|
both; per-row prune is dropped if upstream doesn't support it. Resolution:
|
||||||
|
`hermes curator prune --help`.
|
||||||
|
|
||||||
|
4. **What's the exact synchronous-`run` timeout?** The release notes say
|
||||||
|
"synchronous" but don't specify duration. 600 s (10 min) is a defensible
|
||||||
|
default since curator runs are O(skill-count × LLM RTT). Long-running
|
||||||
|
timeouts are acceptable here since the spinner is honest. Open: should
|
||||||
|
Scarf surface a Cancel button? Probably not in v0.13 — transport-level
|
||||||
|
process cancel isn't reliable across LocalTransport / CitadelServerTransport
|
||||||
|
parity. Defer cancel to a later release if users complain.
|
||||||
|
|
||||||
|
5. **Confirm UX: typed-name confirmation, multi-tap, or destructive-button
|
||||||
|
confirm sheet?** Scarf precedent (see "Constraints"):
|
||||||
|
- **Memory reset** (`MemoryView.swift:56-65`) uses a single-step
|
||||||
|
`.confirmationDialog` with `Button("Reset", role: .destructive)`. One
|
||||||
|
click after the dialog opens.
|
||||||
|
- **Template uninstall** (`TemplateUninstallSheet.swift:79-96`) uses a
|
||||||
|
custom modal sheet listing every file/skill/cron/memory entry that will
|
||||||
|
be removed, then a `ScarfPrimaryButton` tinted red labeled "Remove".
|
||||||
|
One click after the sheet opens.
|
||||||
|
- **Recommendation for prune:** match template-uninstall's shape. Prune is
|
||||||
|
bulkier than memory-reset (multiple skills enumerated) and the user
|
||||||
|
benefits from seeing the list. Custom sheet > confirmation dialog. The
|
||||||
|
confirm button is `ScarfDestructiveButton` labeled "Prune permanently"
|
||||||
|
with `keyboardShortcut(.defaultAction)` reserved for Cancel (not the
|
||||||
|
destructive action — flipping it reduces accidental Enter-key prunes).
|
||||||
|
Cancel is `ScarfGhostButton`, "Cancel". No typed-name confirmation; the
|
||||||
|
enumerated list + the asymmetric keyboard shortcut is enough friction
|
||||||
|
for a v0.13 surface that's already gated on a destructive intent ("I
|
||||||
|
opened the prune sheet on purpose"). Single-tap on the destructive
|
||||||
|
button is fine.
|
||||||
|
|
||||||
|
6. **Should the `lastReportPath` JSON field on `HermesCuratorStatus` get
|
||||||
|
populated from a v0.13 path under `logs/curator/`?** v0.12 already populates
|
||||||
|
it via the state file. v0.13 might point at a different directory after
|
||||||
|
archive/prune runs (a separate `archive_report_path`?). Out of scope unless
|
||||||
|
v0.13 introduces a new field — plan only handles existing
|
||||||
|
`lastReportPath`. Defer to dogfooding.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Out of scope (deferred)
|
||||||
|
|
||||||
|
- **iOS archive surface (WS-9).** Read-only Archived list mirroring the Mac
|
||||||
|
one — no Archive / Prune actions. iOS users still get value (visibility
|
||||||
|
into what the curator archived). Scoped to a separate work-stream.
|
||||||
|
- **Curator scheduling knobs.** Already lives in Settings → Auxiliary; no
|
||||||
|
changes for v2.8.
|
||||||
|
- **Per-skill curator-config flags** (e.g. "exclude this skill from auto-archive
|
||||||
|
forever" — distinct from pin which already prevents auto-archive). Hermes
|
||||||
|
doesn't ship this verb in v0.13. If the user wants permanent exclusion, pin.
|
||||||
|
- **Bulk-archive multi-select on active skills.** A future v0.14 verb might
|
||||||
|
enable this; for v2.8 each archive is one CLI call.
|
||||||
|
- **Archive history / undo.** Hermes doesn't track archive history beyond the
|
||||||
|
archived state itself. Restore is the undo for archive; once pruned, there's
|
||||||
|
no recovery.
|
||||||
|
- **Curator report rendering for archive/prune events.** v0.12's
|
||||||
|
`lastReportMarkdown` covers run reports; whether v0.13's archive/prune
|
||||||
|
events land in a separate report is an open question. Stick with
|
||||||
|
current rendering; revisit if dogfooding shows a gap.
|
||||||
|
- **`hermes curator pause/resume` on the synchronous run.** The new sync `run`
|
||||||
|
doesn't interact with the autonomous schedule; pause/resume still work as
|
||||||
|
before. No UX change.
|
||||||
|
- **Telemetry on prune.** No ScarfMon event for prune — measure if a user
|
||||||
|
reports a slow prune. Easy follow-up.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Risk + rollback
|
||||||
|
|
||||||
|
- **Highest risk:** parser drift between assumed JSON shape and Hermes v0.13's
|
||||||
|
actual output. Mitigation: capture real fixtures at implementation start
|
||||||
|
(see Open Q1 + Q2). Don't commit synthetic fixtures.
|
||||||
|
- **Second risk:** synchronous `run` timing out on `runProcess(timeout: 600)`.
|
||||||
|
Mitigation: 10 min is generous; if a real run exceeds 10 min, that's a
|
||||||
|
Hermes regression worth surfacing. Falls back to inline error banner.
|
||||||
|
- **Rollback path:** every WS-4 surface is gated on `hasCuratorArchive`. If a
|
||||||
|
late-cycle bug shows up, a single-line change in `HermesCapabilities.swift`
|
||||||
|
(`atLeastSemver(0, 13, 0)` → `atLeastSemver(99, 0, 0)`) hides every WS-4
|
||||||
|
surface from production hosts without ripping the code out. Same rollback
|
||||||
|
shape as Kanban v3 used during v2.7.5 dogfooding.
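
The rollback lever can be illustrated with a toy version gate. `SemverSketch` and its `atLeast` method are hypothetical; the real helper is `atLeastSemver` in `HermesCapabilities.swift` and may differ in shape.

```swift
// Hypothetical re-creation of the version gate, for illustration only.
struct SemverSketch: Comparable {
    let major: Int
    let minor: Int
    let patch: Int

    static func < (lhs: SemverSketch, rhs: SemverSketch) -> Bool {
        // Tuple comparison is lexicographic: major, then minor, then patch.
        (lhs.major, lhs.minor, lhs.patch) < (rhs.major, rhs.minor, rhs.patch)
    }

    func atLeast(_ major: Int, _ minor: Int, _ patch: Int) -> Bool {
        self >= SemverSketch(major: major, minor: minor, patch: patch)
    }
}

let host = SemverSketch(major: 0, minor: 13, patch: 0)
let gateAsShipped = host.atLeast(0, 13, 0)     // WS-4 surfaces visible
let gateRolledBack = host.atLeast(99, 0, 0)    // WS-4 surfaces hidden
```

Because every WS-4 branch reads the same flag, raising the threshold hides all of them at once without touching view code.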
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Estimate
|
||||||
|
|
||||||
|
| Bucket | Effort |
|
||||||
|
|---|---|
|
||||||
|
| `CuratorService` actor + models + errors | 0.5 day |
|
||||||
|
| Parser tests (with real fixtures captured from Mardon) | 0.5 day |
|
||||||
|
| `CuratorViewModel` refactor + new state + new methods | 0.5 day |
|
||||||
|
| `CuratorView` edits (header, per-row archive, archived section, prune sheet, error banner) | 1 day |
|
||||||
|
| `CuratorPruneConfirmSheet` + `CuratorArchivedSection` views | 0.5 day |
|
||||||
|
| Capability-gating audit + manual UI scenarios on pre-v0.13 + v0.13 hosts | 0.5 day |
|
||||||
|
| Unknown-buffer (CLI shape surprises, single-skill prune verification) | 0.5 day |
|
||||||
|
|
||||||
|
**Total: ~4 days of focused work** for one engineer, assuming a v0.13 install
|
||||||
|
is already running on Mardon and accessible for fixture capture. If `--json`
|
||||||
|
turns out to be missing on either of the two read verbs, add a 0.5-day
|
||||||
|
buffer for text-parser hardening.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Sequencing inside WS-4
|
||||||
|
|
||||||
|
1. Capture real-world stdout fixtures by running every new v0.13 curator verb
|
||||||
|
against the dogfooding Mardon install. Commit to
|
||||||
|
`Tests/ScarfCoreTests/Fixtures/Curator/`. _(Resolves Open Q1 + Q2 + Q3.)_
|
||||||
|
2. Land `HermesCuratorArchive.swift` (models) + `CuratorService` actor with
|
||||||
|
parser tests. No UI yet.
|
||||||
|
3. Refactor `CuratorViewModel` to use the service. Existing v0.12 surface
|
||||||
|
should still work after this step — verify by rebuilding and clicking
|
||||||
|
through every existing button.
|
||||||
|
4. Add the Mac Archived section + per-row Archive button + Prune confirm sheet
|
||||||
|
behind the `archiveAvailable` flag.
|
||||||
|
5. Bump `Run Now` to synchronous-with-progress on v0.13+.
|
||||||
|
6. Pre-v0.13 regression pass on a v0.12 install.
|
||||||
|
7. v0.13 dogfood pass on Mardon — full UI tour + error injection.
|
||||||
|
8. Update relevant wiki pages (`Core-Services.md` adds `CuratorService`;
|
||||||
|
sidebar / Curator user-guide page documents the new actions). Per
|
||||||
|
CLAUDE.md the wiki update is part of the WS, not a follow-up.
|
||||||
File diff suppressed because it is too large
@@ -0,0 +1,625 @@
|
|||||||
|
# WS-6 Plan: Provider catalog refresh + Auxiliary `image_gen.model` + OpenRouter response caching
|
||||||
|
|
||||||
|
**Workstream:** WS-6 of Scarf v2.8.0
|
||||||
|
**Hermes target:** v0.13.0 (v2026.5.7)
|
||||||
|
**Capability gates (already shipped in WS-1):**
|
||||||
|
- `HermesCapabilities.hasImageGenModel` (`>= 0.13.0`) — `image_gen.model` honored from `config.yaml`.
|
||||||
|
- `HermesCapabilities.hasOpenRouterResponseCache` (`>= 0.13.0`) — OpenRouter response caching toggle.
|
||||||
|
**Builds on:** v2.7.5 ModelCatalogService overlay table (11 entries: nous, openai-codex, qwen-oauth, google-gemini-cli, copilot-acp, arcee, gmi, azure-foundry, lmstudio, minimax-oauth, tencent-tokenhub) + the existing AuxiliaryTab pattern (Hermes v0.12 catch-up: `curator` aux row, `flush_memories` row inverse-gated).
|
||||||
|
**Owner:** TBD
|
||||||
|
**Reviewers:** Alan; whoever has provider-config bandwidth in the v2.8 cycle.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Goals
|
||||||
|
|
||||||
|
The Hermes v0.13.0 release notes list four item-clusters in WS-6 scope:
|
||||||
|
|
||||||
|
1. **Provider catalog refresh** — five new model IDs (`deepseek/deepseek-v4-pro`, `x-ai/grok-4.3`, `openrouter/owl-alpha`, `tencent/hy3-preview`, Arcee Trinity Large Thinking) plus a rename (`x-ai/grok-4.20-beta` → `x-ai/grok-4.20`). All five new IDs already appear in `models_dev_cache.json` on the local v0.13 dev host (verified: see Appendix A), so the catalog file does the heavy lifting on next `models.dev` cache refresh — Scarf just needs alias-resolution + (sparingly) curated metadata.
|
||||||
|
2. **Vercel AI Gateway demotion** — Hermes deprioritizes the `vercel` provider (display name `Vercel AI Gateway`) in the picker. Currently Scarf sorts providers `subscriptionGated → alphabetical`; Vercel sits mid-alphabet. We add a `demoted` axis so Vercel sinks to the bottom while keeping all other providers in their alphabetic positions.
|
||||||
|
3. **`image_gen.model` from `config.yaml`** — Hermes v0.13 honors a top-level `image_gen.model` key. Scarf surfaces a model picker for it on the Auxiliary tab, capability-gated on `hasImageGenModel`.
|
||||||
|
4. **OpenRouter response caching toggle** — Hermes v0.13 added an OpenRouter response-caching switch in `config.yaml`. Scarf surfaces a `Toggle` next to OpenRouter's other knobs, capability-gated on `hasOpenRouterResponseCache`. **Open Question** on the exact key shape (`openrouter.response_cache.enabled` vs `providers.openrouter.response_cache_enabled` vs nested under `prompt_caching`) — flagged below.
|
||||||
|
|
||||||
|
The five release-notes items NOT in WS-6 scope:
|
||||||
|
|
||||||
|
- **"Honor runtime default model during delegate provider resolution"** — server-side resolution behavior. Scarf's existing `delegation.model` / `delegation.provider` fields in `DelegationSettings` are unchanged; the picker continues to fill those values straight to `config.yaml`. No Scarf surface change needed. Document in the `Out of scope` section as verified-no-change.
|
||||||
|
- **"Avoid Bedrock credential probe in provider picker"** — server-side: the `hermes model` CLI no longer probes AWS_ACCESS_KEY_ID at picker open time. Scarf's `ModelPickerSheet` was already not invoking that probe (we read the cached catalog, not `hermes model`). No change needed.
|
||||||
|
- **`ProviderProfile` ABC + `plugins/model-providers/` + `list_picker_providers`** — these are Hermes-internal pluggability scaffolding. They expand which providers can ship via plugin, but none alter the on-disk shape of `models_dev_cache.json` or the `HERMES_OVERLAYS` table. Scarf's existing read path (cache file + overlay table) reaches them transparently. **Caveat:** the `list_picker_providers` change adds a credential-filter so providers without the right env vars are hidden. Scarf's picker today shows everything regardless of credentials. We choose to **not adopt** the credential filter in the picker (users frequently configure providers in-app and need to see the row before they can fill the secret). Documented in the `Out of scope` section.
|
||||||
|
- **Shared Hermes dotenv loader / Nous OAuth persistence across profiles** — entirely server-side. Scarf's `NousSubscriptionService` reads `~/.hermes/auth.json`; the new shared dotenv loader doesn't change that file's path or shape. No Scarf surface change.
|
||||||
|
- **`/provider` alias removal** — server-side CLI cleanup. Scarf already invokes `/model` directly via ACP slash command routing; no Scarf surface used `/provider`. No change.
|
||||||
|
|
||||||
|
### Non-goals (explicitly deferred)
|
||||||
|
|
||||||
|
- **In-app credential entry sheet** for providers requiring an API key. v2.7.5 surfaces "Set in Terminal: `hermes auth <provider>`" as the path for OAuth providers; for new BYO-key providers (none in this WS — the five new models all flow through OpenRouter / Nous Portal / Arcee already-credentialed) we keep the same convention.
|
||||||
|
- **Per-model image-gen capability tag** in the catalog. The `models_dev_cache.json` schema doesn't include an `image: true` field today. Filtering the `image_gen.model` picker to "image-capable models only" is therefore not feasible at the catalog level. We pre-populate a small allowlist of well-known image models in Scarf instead (see §New types / fields).
|
||||||
|
- **iOS surface for new image_gen / openrouter toggles.** ScarfGo's settings is read-mostly; a dedicated iOS tab is deferred to WS-9 (iOS catch-up). The capability flags will work on iOS too once the surface lands.
|
||||||
|
- **Migration ceremony for the Grok rename.** We resolve the alias at read time (option 1) — no ceremony, no race, lossless. See §Migration.
|
||||||
|
- **A standalone "Image Gen" Settings tab.** v0.13 has exactly two image-gen-related fields (the model + the existing `image_gen.provider` from v0.12). That's not enough surface to warrant a tab — they belong next to the `vision` row in Auxiliary. If v0.14 adds size/quality/style fields, we revisit and split into its own tab then.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Files to change
|
||||||
|
|
||||||
|
The plan is intentionally minimal-touch. The `models_dev_cache.json` refresh handles the five new model IDs without any Swift change; the rename + the one new aux field + the toggle are surgical.
|
||||||
|
|
||||||
|
### 1. `scarf/Packages/ScarfCore/Sources/ScarfCore/Services/ModelCatalogService.swift`
|
||||||
|
|
||||||
|
**Why:** Two changes:
|
||||||
|
- Adds an alias-resolution path so `x-ai/grok-4.20-beta` keeps working when a user's `config.yaml` references the old name. Lossless, opt-in, zero migration risk.
|
||||||
|
- Adds a `demoted` axis to provider sort so Vercel AI Gateway sinks to the bottom of the picker.
|
||||||
|
|
||||||
|
**Edits:**
|
||||||
|
|
||||||
|
- **Alias map.** Add a static table near `overlayOnlyProviders`:
|
||||||
|
```swift
|
||||||
|
/// Hermes deprecates model IDs across releases. When a stored config
|
||||||
|
/// `model.default` references a deprecated ID, resolve to its
|
||||||
|
/// canonical successor. Lossless — we never rewrite the user's
|
||||||
|
/// config.yaml; the alias just lets `validateModel` /
|
||||||
|
/// `model(providerID:modelID:)` succeed against the new ID.
|
||||||
|
///
|
||||||
|
/// Keys are slash-separated `providerID/modelID` to disambiguate
|
||||||
|
/// across providers — even if `vercel` later adds a `grok-4.20-beta`
|
||||||
|
/// alias on its own, the openrouter resolution shouldn't fire.
|
||||||
|
///
|
||||||
|
/// **Schema is Swift-primary.** Mirror new entries into
|
||||||
|
/// `tools/build-catalog.py` only if the catalog tool grows a model-ID
|
||||||
|
/// validation pass (it doesn't today — see §`tools/build-catalog.py`
|
||||||
|
/// mirror below).
|
||||||
|
public static let modelAliases: [String: String] = [
|
||||||
|
// v0.13: x-ai dropped the `-beta` suffix once Grok 4.20 GA'd.
|
||||||
|
// The model is the same one served at the same OpenRouter slot;
|
||||||
|
// only the marketing identifier changed.
|
||||||
|
"openrouter/x-ai/grok-4.20-beta": "openrouter/x-ai/grok-4.20",
|
||||||
|
"xai/grok-4.20-beta": "xai/grok-4.20",
|
||||||
|
"vercel/xai/grok-4.20-beta": "vercel/xai/grok-4.20",
|
||||||
|
]
|
||||||
|
|
||||||
|
/// Resolve a stored model identifier through the alias map. Returns
|
||||||
|
/// the input unchanged when no alias exists. Pure function — used at
|
||||||
|
/// read time everywhere a config'd model ID is rendered, validated,
|
||||||
|
/// or sent to Hermes.
|
||||||
|
public func resolveModelAlias(providerID: String, modelID: String) -> String {
|
||||||
|
let composite = "\(providerID)/\(modelID)"
|
||||||
|
return Self.modelAliases[composite].map { resolved -> String in
|
||||||
|
// Strip the providerID prefix from the resolved value before
|
||||||
|
// returning — callers want the bare model ID.
|
||||||
|
let prefix = providerID + "/"
|
||||||
|
return resolved.hasPrefix(prefix)
|
||||||
|
? String(resolved.dropFirst(prefix.count))
|
||||||
|
: resolved
|
||||||
|
} ?? modelID
|
||||||
|
}
|
||||||
|
```
|
||||||
|
Call sites that need to resolve: `validateModel(_:for:)` resolves the input before lookup; `model(providerID:modelID:)` resolves before `provider.models?[modelID]` indexing; `provider(for:)` resolves the input model ID before scanning. Each is a one-line addition at the top of the function.
|
||||||
|
|
||||||
|
- **Demoted-provider axis.** Add a static set:
|
||||||
|
```swift
|
||||||
|
/// Provider IDs that Hermes v0.13 explicitly deprioritizes in the
|
||||||
|
/// picker. `loadProviders()` sorts these to the tail of the list,
|
||||||
|
/// after the alphabetical group, so users who haven't manually
|
||||||
|
/// chosen Vercel as their gateway don't end up there by default.
|
||||||
|
/// Mirrors Hermes's `DEMOTED_PROVIDERS` list in
|
||||||
|
/// `hermes_cli/providers.py`.
|
||||||
|
public static let demotedProviders: Set<String> = [
|
||||||
|
"vercel",
|
||||||
|
]
|
||||||
|
```
|
||||||
|
Update the sort comparator in `loadProviders()`:
|
||||||
|
```swift
|
||||||
|
return byID.values.sorted { lhs, rhs in
|
||||||
|
// Subscription-gated first (Nous Portal).
|
||||||
|
if lhs.subscriptionGated != rhs.subscriptionGated {
|
||||||
|
return lhs.subscriptionGated
|
||||||
|
}
|
||||||
|
// Demoted last (Vercel AI Gateway).
|
||||||
|
let lDemoted = Self.demotedProviders.contains(lhs.providerID)
|
||||||
|
let rDemoted = Self.demotedProviders.contains(rhs.providerID)
|
||||||
|
if lDemoted != rDemoted {
|
||||||
|
return !lDemoted
|
||||||
|
}
|
||||||
|
return lhs.providerName.localizedCaseInsensitiveCompare(rhs.providerName) == .orderedAscending
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
- **Image-gen model allowlist.** Add a static curated list of well-known image-gen-capable model IDs (kept short and updated by hand; the catalog file has no `image_capable` flag today):
|
||||||
|
```swift
|
||||||
|
/// Known image-generation models, used to pre-populate the
|
||||||
|
/// `image_gen.model` picker on the Auxiliary tab. The list is
|
||||||
|
/// curated — `models_dev_cache.json` doesn't tag image-capable
|
||||||
|
/// models, so we maintain this by hand on Hermes version bumps.
|
||||||
|
/// Always free-form-typeable on the picker too, so missing entries
|
||||||
|
/// don't block users with non-listed image providers.
|
||||||
|
///
|
||||||
|
/// Order: most-likely-to-be-chosen first.
|
||||||
|
public static let imageGenModels: [HermesImageGenModel] = [
|
||||||
|
.init(modelID: "openai/gpt-image-1", display: "OpenAI · gpt-image-1", providerHint: "openai"),
|
||||||
|
.init(modelID: "google/imagen-4", display: "Google · Imagen 4", providerHint: "google-vertex"),
|
||||||
|
.init(modelID: "google/imagen-3", display: "Google · Imagen 3", providerHint: "google-vertex"),
|
||||||
|
.init(modelID: "stability/stable-image-ultra", display: "Stability · Stable Image Ultra", providerHint: "stability"),
|
||||||
|
.init(modelID: "fal-ai/flux-pro-1.1", display: "fal · FLUX 1.1 Pro", providerHint: "fal"),
|
||||||
|
.init(modelID: "black-forest-labs/flux-1.1-pro", display: "Black Forest Labs · FLUX 1.1 Pro", providerHint: "openrouter"),
|
||||||
|
.init(modelID: "openai/dall-e-3", display: "OpenAI · DALL·E 3", providerHint: "openai"),
|
||||||
|
]
|
||||||
|
|
||||||
|
public struct HermesImageGenModel: Sendable, Identifiable, Hashable {
|
||||||
|
public let modelID: String
|
||||||
|
public let display: String
|
||||||
|
/// Hint at which provider serves this model — surfaced as a
|
||||||
|
/// "Configure provider X first" advisory but never enforced.
|
||||||
|
public let providerHint: String?
|
||||||
|
public var id: String { modelID }
|
||||||
|
}
|
||||||
|
```
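
The three-tier ordering from the demoted-provider edit above can be checked in isolation. This is a standalone sketch: `ProviderRowSketch`, `demotedSketch`, and `pickerOrder` are hypothetical names, and the row type is reduced to the three fields the comparator reads.

```swift
import Foundation

// Hypothetical, trimmed-down provider row; the real type carries more fields.
struct ProviderRowSketch {
    let providerID: String
    let providerName: String
    let subscriptionGated: Bool
}

let demotedSketch: Set<String> = ["vercel"]

/// Three-tier order: subscription-gated first, demoted last,
/// alphabetical by display name in between.
func pickerOrder(_ rows: [ProviderRowSketch]) -> [ProviderRowSketch] {
    rows.sorted { lhs, rhs in
        if lhs.subscriptionGated != rhs.subscriptionGated {
            return lhs.subscriptionGated
        }
        let lDemoted = demotedSketch.contains(lhs.providerID)
        let rDemoted = demotedSketch.contains(rhs.providerID)
        if lDemoted != rDemoted {
            return !lDemoted
        }
        return lhs.providerName.localizedCaseInsensitiveCompare(rhs.providerName) == .orderedAscending
    }
}

let orderedIDs = pickerOrder([
    ProviderRowSketch(providerID: "xai", providerName: "xAI", subscriptionGated: false),
    ProviderRowSketch(providerID: "vercel", providerName: "Vercel AI Gateway", subscriptionGated: false),
    ProviderRowSketch(providerID: "nous", providerName: "Nous Portal", subscriptionGated: true),
    ProviderRowSketch(providerID: "openrouter", providerName: "OpenRouter", subscriptionGated: false),
]).map(\.providerID)
```

Each tier check returns early only when the two rows differ on that axis, so the comparator stays a valid strict ordering for `sorted`.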
|
||||||
|
|
||||||
|
**Tolerance contract:** When a user has a config with `model.default: x-ai/grok-4.20-beta` and provider `openrouter`, `validateModel("x-ai/grok-4.20-beta", for: "openrouter")` resolves to `"x-ai/grok-4.20"` and validates against the catalog. If the alias isn't present in the map, the function behaves identically to today.
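
A standalone check of this contract might look as follows. `aliasSketch` and `resolveAliasSketch` are illustrative names, written as a free function with a one-entry copy of the table so the example runs on its own.

```swift
// One-entry copy of the alias table, for illustration only.
let aliasSketch: [String: String] = [
    "openrouter/x-ai/grok-4.20-beta": "openrouter/x-ai/grok-4.20",
]

/// Free-function variant of the resolver: look up `providerID/modelID`,
/// then strip the provider prefix so callers get a bare model ID back.
func resolveAliasSketch(providerID: String, modelID: String) -> String {
    guard let resolved = aliasSketch["\(providerID)/\(modelID)"] else {
        return modelID
    }
    let prefix = providerID + "/"
    return resolved.hasPrefix(prefix)
        ? String(resolved.dropFirst(prefix.count))
        : resolved
}

let renamed = resolveAliasSketch(providerID: "openrouter", modelID: "x-ai/grok-4.20-beta")
let unaliased = resolveAliasSketch(providerID: "openrouter", modelID: "deepseek/deepseek-v4-pro")
// Same model ID under a provider with no matching key: left unchanged.
let otherProvider = resolveAliasSketch(providerID: "vercel", modelID: "x-ai/grok-4.20-beta")
```

The third case shows why the keys carry the provider prefix: an alias registered for one provider never fires for another.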
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### 2. `scarf/Packages/ScarfCore/Sources/ScarfCore/Models/HermesConfig.swift`
|
||||||
|
|
||||||
|
**Why:** Add two new top-level config fields:
|
||||||
|
- `imageGenModel: String` — `image_gen.model` value, default `""` (empty means "use provider default").
|
||||||
|
- `openrouterResponseCacheEnabled: Bool` — `openrouter.response_cache.enabled` (working name pending Open Question §1), default `false`.
|
||||||
|
|
||||||
|
**Edits:**
|
||||||
|
|
||||||
|
- Add stored properties next to `cacheTTL` / `redactionEnabled` / `runtimeMetadataFooter`:
|
||||||
|
```swift
|
||||||
|
/// `image_gen.model` (v0.13+) — overrides the per-provider default
|
||||||
|
/// image-gen model. Empty string means "let Hermes pick the
|
||||||
|
/// provider default". Hermes v0.12 advertised this key but ignored
|
||||||
|
/// it; Scarf's `AuxiliaryTab` only renders the picker when
|
||||||
|
/// `HermesCapabilities.hasImageGenModel` is `true`.
|
||||||
|
public var imageGenModel: String
|
||||||
|
|
||||||
|
/// `openrouter.response_cache.enabled` (v0.13+) — when true, Hermes
|
||||||
|
/// asks OpenRouter to cache responses for repeat prompts within a
|
||||||
|
/// session. **Open Question:** the exact YAML key shape is
|
||||||
|
/// unconfirmed. See WS-6 plan §Open Questions #1.
|
||||||
|
public var openrouterResponseCacheEnabled: Bool
|
||||||
|
```
|
||||||
|
- Append `imageGenModel: String = ""` and `openrouterResponseCacheEnabled: Bool = false` to the trailing parameter list in the explicit memberwise `init` (after `runtimeMetadataFooter`). Default values mean every existing call site (`HermesConfig.empty`, `init(yaml:)`) compiles unchanged until updated.
|
||||||
|
- Update the static `HermesConfig.empty` factory to pass both new defaults explicitly so the empty-config sentinel matches the post-load shape.
|
||||||
|
|
||||||
|
**Tolerance contract:** Pre-v0.13 hosts have neither key in `config.yaml`; the parser defaults both to empty / false. UI is gated separately on the capability flag, so the values never reach the screen on pre-v0.13 hosts even if they were somehow non-default.
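
The source-compatibility argument for trailing defaulted parameters can be seen in miniature below. `HermesConfigSketch` is a hypothetical three-field slice of the real struct, and the `Bool` type given to `runtimeMetadataFooter` is an assumption made only for this example.

```swift
// Hypothetical slice of HermesConfig; field types are assumptions.
struct HermesConfigSketch {
    var runtimeMetadataFooter: Bool
    var imageGenModel: String
    var openrouterResponseCacheEnabled: Bool

    init(
        runtimeMetadataFooter: Bool,
        imageGenModel: String = "",
        openrouterResponseCacheEnabled: Bool = false
    ) {
        self.runtimeMetadataFooter = runtimeMetadataFooter
        self.imageGenModel = imageGenModel
        self.openrouterResponseCacheEnabled = openrouterResponseCacheEnabled
    }
}

// A call site written before the new fields existed still compiles
// and picks up the pre-v0.13 defaults.
let legacyCallSite = HermesConfigSketch(runtimeMetadataFooter: true)
```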
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### 3. `scarf/Packages/ScarfCore/Sources/ScarfCore/Parsing/HermesConfig+YAML.swift`
|
||||||
|
|
||||||
|
**Why:** Wire the two new keys into the YAML parser.
|
||||||
|
|
||||||
|
**Edits:**
|
||||||
|
|
||||||
|
- In the trailing `self.init(...)` call, add (next to `cacheTTL` / `redactionEnabled` / `runtimeMetadataFooter`):
|
||||||
|
```swift
|
||||||
|
imageGenModel: str("image_gen.model", default: ""),
|
||||||
|
openrouterResponseCacheEnabled: bool("openrouter.response_cache.enabled", default: false),
|
||||||
|
```
|
||||||
|
- The exact key for `openrouter.response_cache.enabled` is **provisional** — see §Open Questions #1. Lock the key only after manual verification on a v0.13 host (`hermes config check` against a sample YAML with the candidate key + a printout of the `Settings`-level key). We may need a fallback: read the legacy key first and fall through to the canonical one, exactly like the `slack.reply_to_mode` ↔ `platforms.slack.reply_to_mode` pattern at line 187.
|
||||||
|
|
||||||
|
**Tolerance contract:** A v0.12 host with neither key produces `imageGenModel == ""` and `openrouterResponseCacheEnabled == false`, matching the runtime defaults. A v0.13 host with both keys present round-trips through `init(yaml:)` cleanly.
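
If the key shape stays ambiguous, a multi-key fallback could be sketched like this. Everything here is an assumption: `FlatYAML` and `firstBool` are hypothetical, the parser's real `bool(_:default:)` helper may work differently, and the candidate keys are simply the shapes listed in the open question.

```swift
// Hypothetical flattened view of config.yaml: dotted key path -> scalar text.
typealias FlatYAML = [String: String]

/// Return the value of the first candidate key that parses as a boolean,
/// or the fallback when none of the candidates is present.
func firstBool(in yaml: FlatYAML, keys: [String], default fallback: Bool) -> Bool {
    for key in keys {
        guard let raw = yaml[key]?.lowercased() else { continue }
        if raw == "true" { return true }
        if raw == "false" { return false }
    }
    return fallback
}

// Candidate key shapes are provisional until verified on a v0.13 host.
let candidateKeys = [
    "openrouter.response_cache.enabled",
    "providers.openrouter.response_cache_enabled",
]

let preV013 = firstBool(in: [:], keys: candidateKeys, default: false)
let nestedShape = firstBool(
    in: ["providers.openrouter.response_cache_enabled": "true"],
    keys: candidateKeys,
    default: false
)
let explicitOff = firstBool(
    in: ["openrouter.response_cache.enabled": "false"],
    keys: candidateKeys,
    default: true
)
```

Ordering the candidate list decides which key wins when both are present, so the list should be locked once the canonical key is confirmed.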
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### 4. `scarf/scarf/Features/Settings/ViewModels/SettingsViewModel.swift`
|
||||||
|
|
||||||
|
**Why:** Two new setters, one for each new field.
|
||||||
|
|
||||||
|
**Edits:**
|
||||||
|
|
||||||
|
- Add to the "Auxiliary model sub-tasks" section (since `image_gen` lives logically next to other aux tasks even though the YAML key is at the top level):
|
||||||
|
```swift
|
||||||
|
// MARK: - Image generation (v0.13+)
|
||||||
|
|
||||||
|
func setImageGenModel(_ value: String) { setSetting("image_gen.model", value: value) }
|
||||||
|
func setOpenRouterResponseCache(_ value: Bool) {
|
||||||
|
setSetting("openrouter.response_cache.enabled", value: value ? "true" : "false")
|
||||||
|
}
|
||||||
|
```
|
||||||
|
- Both setters route through `setSetting` → `runHermes(["config", "set", key, value])`, matching the existing pattern. `hermes config set` is forward-compatible — pre-v0.13 hosts accept any key without complaint and write it to YAML; the gate keeps the UI hidden so users on pre-v0.13 never invoke these.
|
||||||
|
|
||||||
|
**Tolerance contract:** No new error paths. Existing `setSetting`'s `saveMessage` plumbing handles success/failure surfacing.

---

### 5. `scarf/scarf/Features/Settings/Views/Tabs/AuxiliaryTab.swift`

**Why:** Surface the two new fields. Both belong on Auxiliary because they're per-task / per-provider knobs, not main-model-pickers.

**Edits:**

- **Image-gen model row.** Add a new `SettingsSection(title: "Image Generation", icon: "photo")` between the static base tasks and `unknownTasks`, gated on `capabilitiesStore?.capabilities.hasImageGenModel == true`:
  ```swift
  if capabilitiesStore?.capabilities.hasImageGenModel ?? false {
      SettingsSection(title: "Image Generation", icon: "photo") {
          imageGenRow
      }
  }
  ```
  `imageGenRow` is a small `@ViewBuilder`:
  ```swift
  @ViewBuilder
  private var imageGenRow: some View {
      let value = viewModel.config.imageGenModel
      Picker("Model", selection: Binding(
          get: { value },
          set: { viewModel.setImageGenModel($0) }
      )) {
          Text("Provider default").tag("")
          Divider()
          ForEach(ModelCatalogService.imageGenModels) { model in
              Text(model.display).tag(model.modelID)
          }
          if !value.isEmpty
              && !ModelCatalogService.imageGenModels.contains(where: { $0.modelID == value }) {
              // User has set a custom value; preserve it as a tagged option
              // so the picker renders the actual selection, not "Provider default".
              Divider()
              Text(value + " (custom)").tag(value)
          }
      }
      .pickerStyle(.menu)
      EditableTextField(label: "Custom model ID", value: value) { newValue in
          viewModel.setImageGenModel(newValue.trimmingCharacters(in: .whitespaces))
      }
      Text("Used for image generation calls. Leave as Provider default unless your provider documents a specific model ID for image-gen.")
          .font(.caption2)
          .foregroundStyle(.tertiary)
  }
  ```
  The `EditableTextField` lets users free-form-type a model ID we haven't curated. Together, the picker and the text field cover both the curated allowlist and the long tail.

- **OpenRouter response cache row.** Add a new section (or fold into a future "Providers" section):
  ```swift
  if capabilitiesStore?.capabilities.hasOpenRouterResponseCache ?? false {
      SettingsSection(title: "OpenRouter", icon: "shippingbox") {
          ToggleRow(label: "Response caching",
                    isOn: viewModel.config.openrouterResponseCacheEnabled) { newValue in
              viewModel.setOpenRouterResponseCache(newValue)
          }
          Text("OpenRouter caches identical prompts within a session to reduce token costs. Off by default — enable when your workload has highly repeated prompts.")
              .font(.caption2)
              .foregroundStyle(.tertiary)
              .padding(.horizontal, 12)
              .padding(.bottom, 4)
      }
  }
  ```

**Tolerance contract:** Pre-v0.13 host hides both sections entirely. Capability flag false → guard fails → section never enters the view tree. The Dynamic Type clamp on captions is an iOS concern and does not apply here (this tab is Mac-only).
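The "(custom)" row logic in `imageGenRow` is easier to reason about outside SwiftUI. The sketch below is an illustration under stated assumptions: `ImageGenOption` and `pickerOptions` are hypothetical names, the curated entry's display text is made up, and `acme/pixel-9` is an invented model ID used only to exercise the custom branch.

```swift
import Foundation

/// Hypothetical value type: one row in the image-gen picker.
struct ImageGenOption: Equatable {
    let tag: String    // value written to config; "" means provider default
    let label: String  // text shown in the picker
}

/// Builds the picker rows: the default row, the curated rows, and a trailing
/// "(custom)" row only when the stored value is non-empty and not curated.
func pickerOptions(current: String,
                   curated: [(modelID: String, display: String)]) -> [ImageGenOption] {
    var options = [ImageGenOption(tag: "", label: "Provider default")]
    options += curated.map { ImageGenOption(tag: $0.modelID, label: $0.display) }
    let isCurated = curated.contains { $0.modelID == current }
    if !current.isEmpty && !isCurated {
        options.append(ImageGenOption(tag: current, label: current + " (custom)"))
    }
    return options
}

// Illustrative data only; the display string is an assumption.
let curatedModels = [(modelID: "openai/gpt-image-1", display: "GPT Image 1")]
let customOptions = pickerOptions(current: "acme/pixel-9", curated: curatedModels)
let defaultOptions = pickerOptions(current: "", curated: curatedModels)
print(customOptions.map(\.label))
```

Without the trailing row, a picker whose selection has no matching tag would have nothing to render for the stored value, which is the case the comment in `imageGenRow` guards against.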

---

### 6. `scarf/Packages/ScarfCore/Tests/ScarfCoreTests/M0cServicesTests.swift`

**Why:** The existing model-catalog tests freeze the `loadProviders()` sort order + decoding shape. Add three new tests.

**New tests (Swift Testing macros):**

```swift
@Test func vercelAIGatewayDemotedToBottom() throws {
    // Build a minimal catalog with vercel + alphabetically-later providers,
    // then assert vercel sorts after them.
    let json = """
    {
      "anthropic": { "name": "Anthropic", "models": {} },
      "vercel": { "name": "Vercel AI Gateway", "models": {} },
      "zonk": { "name": "Zonk Provider", "models": {} }
    }
    """
    let tmp = FileManager.default.temporaryDirectory
        .appendingPathComponent("scarf-models-\(UUID().uuidString).json")
    try json.write(to: tmp, atomically: true, encoding: .utf8)
    defer { try? FileManager.default.removeItem(at: tmp) }
    let svc = ModelCatalogService(path: tmp.path)
    let providers = svc.loadProviders().filter { !$0.isOverlay }
    let names = providers.map(\.providerName)
    // anthropic first (alpha), zonk next (alpha), vercel last (demoted).
    #expect(names.last == "Vercel AI Gateway")
    // #require fails the test cleanly instead of crashing on a missing entry.
    let vercelIndex = try #require(names.firstIndex(of: "Vercel AI Gateway"))
    let zonkIndex = try #require(names.firstIndex(of: "Zonk Provider"))
    #expect(vercelIndex > zonkIndex)
}

@Test func grok420BetaAliasResolvesToGrok420() {
    let svc = ModelCatalogService(path: "/tmp/scarf-nonexistent-\(UUID().uuidString).json")
    #expect(svc.resolveModelAlias(providerID: "openrouter", modelID: "x-ai/grok-4.20-beta")
            == "x-ai/grok-4.20")
    #expect(svc.resolveModelAlias(providerID: "xai", modelID: "grok-4.20-beta")
            == "grok-4.20")
    // Non-aliased ID passes through unchanged.
    #expect(svc.resolveModelAlias(providerID: "anthropic", modelID: "claude-4.7-opus")
            == "claude-4.7-opus")
    // Cross-provider isolation: same modelID on a different provider isn't aliased.
    #expect(svc.resolveModelAlias(providerID: "fictional", modelID: "x-ai/grok-4.20-beta")
            == "x-ai/grok-4.20-beta")
}

@Test func imageGenModelAllowlistShape() {
    // Lock a minimum curated list size + a few sentinel entries so unintentional
    // edits get caught in review.
    let models = ModelCatalogService.imageGenModels
    #expect(models.count >= 5)
    #expect(models.contains(where: { $0.modelID == "openai/gpt-image-1" }))
    #expect(models.contains(where: { $0.modelID == "google/imagen-4" }))
    // Every entry has a non-empty display + a non-empty modelID.
    for m in models {
        #expect(!m.modelID.isEmpty)
        #expect(!m.display.isEmpty)
    }
}
```

**Tolerance contract:** All three tests run without a Hermes binary or a real models cache file. The Vercel test writes its own temporary catalog; the alias and allowlist tests survive a `ModelCatalogService(path: nonexistent)` because those paths don't read the catalog.

---

### 7. `scarf/Packages/ScarfCore/Tests/ScarfCoreTests/M6ConfigCronTests.swift` (or new `WS6ProvidersConfigTests.swift`)

**Why:** Lock the YAML round-trip for the two new keys.

**New tests:**

```swift
@Test func imageGenAndOpenRouterCacheRoundTrip() {
    let yaml = """
    image_gen:
      model: openai/gpt-image-1
    openrouter:
      response_cache:
        enabled: true
    """
    let cfg = HermesConfig(yaml: yaml)
    #expect(cfg.imageGenModel == "openai/gpt-image-1")
    #expect(cfg.openrouterResponseCacheEnabled == true)
}

@Test func imageGenDefaultsToEmptyString() {
    let cfg = HermesConfig(yaml: "")
    #expect(cfg.imageGenModel == "")
    #expect(cfg.openrouterResponseCacheEnabled == false)
}
```

**Tolerance contract:** Tracks the exact YAML keys the parser expects. If the Open Question resolves to a different key shape, this test pins the change to one place.

---

### 8. `tools/build-catalog.py` mirror

**Why:** Per CLAUDE.md, every new schema-shaped change must mirror into the Python validator. Audit:

| New surface | Mirror needed? | Rationale |
| -- | -- | -- |
| `modelAliases` | **No** | The catalog tool validates `template.json` manifests, not model IDs. Aliases live entirely in Scarf-side ModelCatalogService. |
| `demotedProviders` | **No** | Same — the catalog tool doesn't render the picker. |
| `imageGenModels` (curated) | **No** | Curated list is Scarf UI-only. |
| `HermesConfig.imageGenModel` | **No** | The catalog tool never reads `config.yaml`; it reads `template.json`. |
| `HermesConfig.openrouterResponseCacheEnabled` | **No** | Same. |

**Verdict:** No `tools/build-catalog.py` changes for WS-6. Document the audit explicitly in the WS-6 PR description so future plans know we checked.

If WS-6 ever adds a new `ProjectDashboardWidget.type` (it doesn't — image_gen is in Settings, not a dashboard widget), the mirror would be required. The widget vocabulary is the only Swift-primary schema the catalog tool tracks.

---

### 9. `scarf/CLAUDE.md` — schema-drift line

**Why:** CLAUDE.md says "Keep `ModelCatalogService.overlayOnlyProviders` in sync with `HERMES_OVERLAYS` in … `providers.py`." After this WS, Scarf also needs to keep `modelAliases` in sync with Hermes's deprecation map (currently a small list inside `hermes_cli/providers.py`). Add two bullets in the "Hermes Version" section. The first covers aliases:

> Keep `ModelCatalogService.modelAliases` in sync with `HERMES_DEPRECATED_MODEL_IDS` (or whatever the upstream module renames to) in `hermes-agent/hermes_cli/providers.py`. Drift here means a user's old model ID stops resolving in the picker even though Hermes still accepts it at runtime.

The second covers demoted providers:

> Keep `ModelCatalogService.demotedProviders` in sync with the deprioritized-provider list in `hermes-agent/hermes_cli/providers.py`. Drift means Vercel AI Gateway sorts in the wrong position in Scarf's picker.

**Touchpoint:** the single block at line ~205 of `scarf/CLAUDE.md` (the "Keep `ModelCatalogService.overlayOnlyProviders` in sync" paragraph). Append the two bullets next to it.

---

## New models / overlay entries

| Model ID | Provider | Cache hit (verified) | Overlay change? | Action |
| -- | -- | -- | -- | -- |
| `deepseek/deepseek-v4-pro` | OpenRouter + Nous Portal | **Yes** (openrouter) | No | Auto-shows on next `models_dev_cache.json` refresh; Nous Portal serves it via the Nous overlay's free-form model list. No code change. |
| `x-ai/grok-4.3` | OpenRouter + Nous Portal + xAI direct + Vercel | **Yes** (openrouter, xai, vercel) | No | Auto-shows. No code change. |
| `openrouter/owl-alpha` | OpenRouter only (free tier) | **Yes** | No | Auto-shows. No code change. |
| `tencent/hy3-preview` | OpenRouter only (paid route) | **Yes** | No | Auto-shows. No code change. |
| `arcee-ai/trinity-large-thinking` | Arcee (overlay) + OpenRouter + DigitalOcean + Venice + Kilo | **Yes** (openrouter, etc.) | No | Auto-shows on non-overlay providers. The Arcee overlay's free-form picker remains the path for direct Arcee API users. **No catalog field captures the v0.13 "temperature + compression overrides" — that's a per-call hint Hermes passes through, not a per-model metadata field.** Scarf doesn't need to surface it. |
| `x-ai/grok-4.20-beta` → `x-ai/grok-4.20` | OpenRouter + xAI + Vercel | **Both present** | No | Add to `modelAliases` (see file 1). Resolution at read time means a user's stored config keeps working without a rewrite. |

**Why no overlay changes:** All 11 existing overlay entries (`nous`, `openai-codex`, `qwen-oauth`, `google-gemini-cli`, `copilot-acp`, `arcee`, `gmi`, `azure-foundry`, `lmstudio`, `minimax-oauth`, `tencent-tokenhub`) remain. v0.13's `ProviderProfile` ABC + `plugins/model-providers/` framework adds **internal** Hermes pluggability but does not introduce new overlay-only providers in this release. Verify on Hermes upstream by diffing `hermes_cli/providers.py` against the v0.12 baseline; if the `HERMES_OVERLAYS` dict gained entries, mirror them. Lock in `ToolGatewayTests.v013OverlayProvidersCarryCorrectAuthTypes` (mirror of the existing v0.12 lock-in test).

---

## New types / fields

### `HermesProviderOverlay` — no shape change

The release notes mention `ProviderProfile` ABC, but it's an internal Python abstraction. Nothing in the on-disk overlay contract changes. `HermesProviderOverlay` keeps its current five-field shape (`displayName`, `baseURL`, `authType`, `subscriptionGated`, `docURL`).

### `ModelCatalogService.HermesImageGenModel` — new

Curated image-gen model entry, pre-populated for the picker on Auxiliary tab. Three fields: `modelID`, `display`, `providerHint`. Scope is intentionally tiny — we don't enumerate every provider's image model; users with niche providers free-form-type the model ID instead.

### `ModelCatalogService.modelAliases` — new

`[String: String]` map keyed by composite `providerID/modelID`. Used at read time by `validateModel`, `model(_:_:)`, and `provider(for:)`. **Does not** rewrite stored config.
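A read-time alias lookup over a composite-keyed map could look like the sketch below. It is an assumption-laden illustration: the free function stands in for the `resolveModelAlias(providerID:modelID:)` method exercised by the file 6 tests, the map literal is reconstructed from those tests and the Migration section, and the real implementation may differ.

```swift
import Foundation

// Assumed shape: both keys and values are composite "providerID/modelID" strings.
let modelAliases: [String: String] = [
    "openrouter/x-ai/grok-4.20-beta": "openrouter/x-ai/grok-4.20",
    "xai/grok-4.20-beta": "xai/grok-4.20",
]

/// Returns the canonical model ID for a (provider, model) pair.
/// IDs without an alias entry pass through unchanged; stored config is never touched.
func resolveModelAlias(providerID: String, modelID: String) -> String {
    let key = providerID + "/" + modelID
    guard let target = modelAliases[key] else { return modelID }
    // Strip the leading "providerID/" so callers get a bare model ID back.
    let prefix = providerID + "/"
    return target.hasPrefix(prefix) ? String(target.dropFirst(prefix.count)) : target
}

print(resolveModelAlias(providerID: "openrouter", modelID: "x-ai/grok-4.20-beta"))
print(resolveModelAlias(providerID: "fictional", modelID: "x-ai/grok-4.20-beta"))
```

Keying on the composite string is what gives the cross-provider isolation the tests check: the same model ID under a different provider never matches an entry.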

### `ModelCatalogService.demotedProviders` — new

`Set<String>` of provider IDs to sink to the bottom of the picker. Sort comparator update in `loadProviders()` is the only consumer.
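One way to express the "sink to the bottom" sort is a rank followed by an alphabetical tie-break. The sketch below is hypothetical throughout: `ProviderRow`, `sortRank`, and `sortedForPicker` are invented names, and the three-tier order (subscription-gated first, alphabetical middle, demoted last) is inferred from the manual test plan rather than read from `loadProviders()`.

```swift
import Foundation

/// Hypothetical stand-in for a catalog provider entry.
struct ProviderRow {
    let id: String
    let name: String
    let subscriptionGated: Bool
}

// Assumed contents; the plan names Vercel AI Gateway as the demotion target.
let demotedProviders: Set<String> = ["vercel"]

/// Lower rank sorts earlier: gated providers, then the regular group, then demoted ones.
func sortRank(_ provider: ProviderRow) -> Int {
    if provider.subscriptionGated { return 0 }
    if demotedProviders.contains(provider.id) { return 2 }
    return 1
}

/// Sorts by rank first, then case-insensitively by display name within a rank.
func sortedForPicker(_ rows: [ProviderRow]) -> [ProviderRow] {
    rows.sorted { a, b in
        let (rankA, rankB) = (sortRank(a), sortRank(b))
        if rankA != rankB { return rankA < rankB }
        return a.name.localizedCaseInsensitiveCompare(b.name) == .orderedAscending
    }
}

let pickerRows = sortedForPicker([
    ProviderRow(id: "vercel", name: "Vercel AI Gateway", subscriptionGated: false),
    ProviderRow(id: "zonk", name: "Zonk Provider", subscriptionGated: false),
    ProviderRow(id: "nous", name: "Nous Portal", subscriptionGated: true),
    ProviderRow(id: "anthropic", name: "Anthropic", subscriptionGated: false),
])
print(pickerRows.map(\.id))
```

Because the rank is computed from static data only, the order is the same on every host version, which matches the plan's choice not to gate the demotion.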

### `HermesConfig.imageGenModel` / `HermesConfig.openrouterResponseCacheEnabled` — new

Top-level config fields, defaults `""` and `false`. Read by `init(yaml:)`, written via `setSetting` → `hermes config set`.
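The default-on-missing behavior can be illustrated with a dotted-key lookup over an already-parsed config tree. This is a sketch under assumptions: `lookup`, `string`, and `bool` are hypothetical helpers operating on a nested dictionary, whereas the real `init(yaml:)` parser is not shown in this plan and may work differently.

```swift
import Foundation

/// Walks a dotted key such as "image_gen.model" through nested dictionaries.
/// Returns nil as soon as a path segment is missing or is not a dictionary.
func lookup(_ dottedKey: String, in root: [String: Any]) -> Any? {
    var current: Any? = root
    for part in dottedKey.split(separator: ".") {
        guard let dict = current as? [String: Any] else { return nil }
        current = dict[String(part)]
    }
    return current
}

/// Hypothetical accessor: string value at the key, or the fallback when absent.
func string(_ key: String, default fallback: String, in root: [String: Any]) -> String {
    (lookup(key, in: root) as? String) ?? fallback
}

/// Hypothetical accessor: Bool value at the key, or the fallback when absent.
func bool(_ key: String, default fallback: Bool, in root: [String: Any]) -> Bool {
    (lookup(key, in: root) as? Bool) ?? fallback
}

// A v0.13-style config with both keys present, and an empty v0.12-style config.
let v013Config: [String: Any] = [
    "image_gen": ["model": "openai/gpt-image-1"] as [String: Any],
    "openrouter": ["response_cache": ["enabled": true] as [String: Any]] as [String: Any],
]
let emptyConfig: [String: Any] = [:]

print(string("image_gen.model", default: "", in: v013Config))
print(bool("openrouter.response_cache.enabled", default: false, in: emptyConfig))
```

The same two calls cover both hosts: a missing key falls through to the default, so no version check is needed on the read side.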

---

## Capability gating

| Capability | Flag | UI surface | Pre-v0.13 host behavior |
| -- | -- | -- | -- |
| `image_gen.model` honored at runtime | `hasImageGenModel` | `AuxiliaryTab` "Image Generation" section | Section never enters the view tree. The model picker would otherwise no-op silently on pre-v0.13 (the value goes to YAML but Hermes ignores it). Hiding spares users a "I set this and nothing happened" trap. |
| OpenRouter response caching | `hasOpenRouterResponseCache` | `AuxiliaryTab` "OpenRouter" section | Section never enters the view tree. Same reasoning — silent no-op on pre-v0.13. |
| `modelAliases` resolution | (none) | `validateModel`, `model(_:_:)`, `provider(for:)` | Always on. The alias is a Scarf-side concept that doesn't depend on Hermes version — even on pre-v0.13 hosts, OpenRouter still serves the model via either the old or new ID. (Verify upstream: if OpenRouter has dropped the `-beta` slot entirely, the alias lets a stored old ID resolve to the catalog entry for the new ID. If OpenRouter kept the `-beta` slot live, the alias is cosmetic and harmless. Either way it is safe to ship.) |
| Vercel demotion | (none) | `loadProviders()` sort | Always on. Vercel's display position is a Scarf-UI choice, not a Hermes-version-gated behavior. |

**Why no flag for the demotion / aliases:** Both are Scarf-UX choices that improve every Hermes version's experience equally. Adding a flag would mean dragging the sort order with the version, which is worse — users on a v0.12 host would see Vercel mid-alphabet, then mysteriously at the bottom after upgrading. Consistency wins.
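For the two gated rows, the flag reduces to a version comparison. The sketch below is illustrative only: `SemVer` and `Capabilities` are hypothetical stand-ins for whatever WS-1 shipped in `HermesCapabilities`, the `>= 0.13.0` threshold is taken from the WS-7 header, and treating an unknown version as "not supported" is an assumption.

```swift
import Foundation

/// Minimal three-part version; anything that is not "major.minor.patch" fails to parse.
struct SemVer: Comparable {
    let major: Int
    let minor: Int
    let patch: Int

    init?(_ text: String) {
        let parts = text.split(separator: ".").map { Int($0) }
        guard parts.count == 3,
              let major = parts[0], let minor = parts[1], let patch = parts[2] else {
            return nil
        }
        self.major = major
        self.minor = minor
        self.patch = patch
    }

    static func < (lhs: SemVer, rhs: SemVer) -> Bool {
        (lhs.major, lhs.minor, lhs.patch) < (rhs.major, rhs.minor, rhs.patch)
    }
}

/// Hypothetical capability holder: a nil version (host not probed yet) hides gated UI.
struct Capabilities {
    let hostVersion: SemVer?
    static let imageGenMinimum = SemVer("0.13.0")!

    var hasImageGenModel: Bool {
        guard let version = hostVersion else { return false }
        return version >= Self.imageGenMinimum
    }
}

print(Capabilities(hostVersion: SemVer("0.12.4")).hasImageGenModel)
print(Capabilities(hostVersion: SemVer("0.13.0")).hasImageGenModel)
```

Defaulting to `false` on an unknown version keeps the failure mode aligned with the table above: a hidden section rather than a control that silently does nothing.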

---

## How to test

### Unit tests (Swift Testing — see files 6 and 7)

- `vercelAIGatewayDemotedToBottom` — locks the new sort axis.
- `grok420BetaAliasResolvesToGrok420` — locks the alias map shape.
- `imageGenModelAllowlistShape` — locks the curated list size + sentinel entries.
- `imageGenAndOpenRouterCacheRoundTrip` — locks the YAML key shape (`image_gen.model` + `openrouter.response_cache.enabled`).
- `imageGenDefaultsToEmptyString` — locks the empty-config default.

### Manual test plan (Mac, against a v0.13 Hermes host)

1. **Picker order.** Open `Settings → General → Model picker`. Confirm Nous Portal (subscription-gated) is at the top, alphabetical group fills the middle, Vercel AI Gateway is the last non-subscription entry. Resize the sheet; the order is stable across re-renders.
2. **Grok rename.** Edit `~/.hermes/config.yaml` directly: set `model.default: x-ai/grok-4.20-beta`, provider `openrouter`. Reload Scarf. The picker should show `x-ai/grok-4.20` selected (the alias resolved). The stored YAML is untouched. Save a new model — confirm Hermes still accepts `x-ai/grok-4.20-beta` at the wire level (it should — OpenRouter keeps the slot live).
3. **Image-gen model picker.** Open `Settings → Auxiliary → Image Generation`. Confirm:
   - Section is visible (you're on v0.13).
   - The picker has "Provider default" + the 7 curated entries.
   - Selecting `openai/gpt-image-1` writes `image_gen.model: openai/gpt-image-1` to `config.yaml` (verify with `grep image_gen ~/.hermes/config.yaml`).
   - Free-form-typing a custom value sets it.
   - Setting it back to "Provider default" (`""`) clears the key from YAML on next save.
4. **OpenRouter response cache toggle.** Same tab, "OpenRouter" section. Confirm:
   - Section is visible.
   - Toggle off → on writes `openrouter.response_cache.enabled: true`.
   - Toggle on → off writes `openrouter.response_cache.enabled: false`.
5. **Pre-v0.13 fallback.** Switch the active server to a v0.12 host (or stub one with a `HERMES_VERSION_OVERRIDE=0.12.0` env shim). Confirm:
   - Image Generation section is hidden.
   - OpenRouter section is hidden.
   - The picker still shows Vercel AI Gateway at the bottom (sort axis is unconditional).
   - Grok alias resolution still works.
6. **`hermes config set` round-trip.** Set `image_gen.model` from Scarf, then `hermes config check` from Terminal — confirm the new key validates against Hermes's schema.

### Integration / smoke

- `scripts/smoke.sh` (if present) — run the full smoke sweep, verify no provider catalog regressions on the existing 11 overlay entries.
- Build clean: `xcodebuild -project scarf/scarf.xcodeproj -scheme scarf -configuration Debug build`. New Swift Testing tests run via `swift test --package-path scarf/Packages/ScarfCore`.

---

## Open questions

1. **`openrouter.response_cache.enabled` — exact YAML key shape.** The release notes say "OpenRouter response caching support" but don't specify the key. Three plausible shapes:
   - `openrouter.response_cache.enabled: true` (top-level provider block)
   - `providers.openrouter.response_cache_enabled: true` (under the new `providers:` map v0.13 introduces)
   - `prompt_caching.openrouter.enabled: true` (nested under the existing `prompt_caching` block from v0.12)

   **Recommendation:** Verify by inspecting the v0.13 Hermes config schema (`hermes config check` against a sample YAML for each shape, or `grep -r response_cache hermes-agent/hermes_cli/`) before merging WS-6. The first shape is consistent with how Hermes handles other per-provider knobs (`xai.voice_cloning.enabled` from v0.13's xAI Voice Cloning); it's our default until verified. If the shape changes, file 3's parser line + file 4's setter key + file 7's test fixture all update in lockstep.

2. **Default value for OpenRouter response caching.** The release notes don't specify whether v0.13 defaults the toggle on or off. **Recommendation:** Default off in Scarf's parser (`bool("openrouter.response_cache.enabled", default: false)`). Worst case, the user explicitly opts in. If Hermes defaults on server-side, our `false` only matches reality when Hermes also writes the key into the YAML; with the key absent, the toggle would read off while caching is on, so confirm the upstream default alongside the key shape.

3. **Arcee Trinity Large Thinking "temperature + compression overrides".** The release notes mention "temperature + compression overrides" for this model. Hermes treats these as per-model invocation hints (not catalog metadata). Scarf has no surface for per-model temperature today — it's set by the user via `hermes ask --temperature` or the per-aux-task config. **Recommendation:** Defer to a future cycle if user feedback asks for per-model temperature picker. v2.8 ships without.

4. **Grok rename — does OpenRouter delete the old slot?** If OpenRouter keeps `x-ai/grok-4.20-beta` live (with a redirect to `x-ai/grok-4.20`), our alias is purely cosmetic — Hermes still accepts the old ID. If OpenRouter deletes the old slot, the alias becomes load-bearing — without it, users on the old config get a 404 at runtime. **Either way, the alias is correct.** Verify before merging by sending a request to OpenRouter for both IDs.

5. **`models_dev_cache.json` refresh timing.** Hermes ships with a snapshot; the user's local cache refreshes via Hermes's own cache-refresh logic (background task or on-demand). Confirm that a v0.13 install ships with all five new models pre-populated (not deferred to a first-run network fetch), so the picker doesn't render an empty list on a fresh `~/.hermes/`. **Verified locally:** the dev host's cache has all five new IDs. Re-verify on a clean `~/.hermes/` after `hermes update` to v0.13.

---

## Out of scope (deferred)

- **In-app Hermes restart** after toggling response caching. Some toggles need a Hermes restart to take effect; the response_cache toggle is unclear. Defer the auto-restart prompt to a future cycle once we know which toggles need it. Scarf already has a "Restart Hermes" button at `Settings → General` for users who hit a stale-toggle case.
- **iOS surface for image_gen.model + OpenRouter cache.** ScarfGo's settings is read-mostly. WS-9 picks up iOS catch-up; the capability flags work cross-platform once the surface lands.
- **Per-image-gen-model metadata** (cost, max resolution, prompt-token-cost). Not in `models_dev_cache.json`; out of scope until the catalog adds a tag.
- **Provider profile MCP plugins (`plugins/model-providers/`).** Server-side framework. Scarf reaches whatever providers Hermes exposes via the cache + overlay — the indirection is transparent.
- **Bedrock credential probe avoidance.** Server-side; Scarf was already not invoking that probe.
- **Honor runtime default model during delegate provider resolution.** Server-side; Scarf's `delegation.model` field is already a free-form string we hand to `hermes config set`.
- **`/provider` alias removal.** Server-side; Scarf already used `/model` directly.
- **Credential filter on picker provider list.** v0.13's `list_picker_providers` filters the CLI picker by available credentials. We deliberately don't adopt this in Scarf — users frequently configure providers in-app and need to see the row before they can fill the secret. If user feedback strongly favors hiding unconfigured providers, revisit in a future WS.
- **Migration to one-shot rewrite for the Grok alias.** Option 2 (rewrite YAML) was rejected; option 1 (read-time alias) wins on safety + simplicity. See §Migration.

---

## Migration

### Grok 4.20-beta → 4.20

**Option 1 — alias-resolve at read time. ✅ Recommended.**

- `ModelCatalogService.modelAliases` maps `openrouter/x-ai/grok-4.20-beta` → `openrouter/x-ai/grok-4.20`.
- `validateModel` resolves the alias before lookup; `model(_:_:)` resolves before indexing; `provider(for:)` resolves before scanning.
- The user's `config.yaml` stays as-is. Scarf treats the alias as an internal display + lookup detail; Hermes (which still accepts both IDs at runtime) handles the wire.

**Pros:**
- Lossless. The user's hand-edits to `config.yaml` are sacred — we never touch them.
- No race. There's no point at which Scarf's "rewrite YAML" path could conflict with the user's editor.
- Trivial to reverse. If a future Hermes brings the old ID back, drop the entry from `modelAliases`.
- Free of edge cases. A user with a custom `model.default` value Hermes never recognized still works.

**Cons:**
- Two IDs in flight on the user's system (one in `config.yaml`, one in the picker's selected state). Cosmetic — the picker shows the resolved name, the YAML keeps the old name.

**Option 2 — one-shot YAML rewrite on next launch.**

Rejected. TOCTOU race (user edits YAML in `vim`, Scarf rewrites mid-edit), no path to undo, and the only "win" (a clean YAML) is invisible to most users.

**Precedent:** No prior model-rename has shipped through Scarf's overlay table. The new alias map is the precedent for this and future renames.

---

## Estimate

- File 1 (`ModelCatalogService.swift`): ~80 lines net add (alias map + helper + curated list + sort axis update).
- File 2 (`HermesConfig.swift`): ~25 lines net add (two stored props + memberwise init params + empty-config update).
- File 3 (`HermesConfig+YAML.swift`): ~5 lines net add (two parser lines).
- File 4 (`SettingsViewModel.swift`): ~5 lines net add (two setters).
- File 5 (`AuxiliaryTab.swift`): ~70 lines net add (two new sections + the image-gen view).
- File 6 (`M0cServicesTests.swift`): ~60 lines net add (three tests).
- File 7 (`M6ConfigCronTests.swift` or new file): ~30 lines net add (two tests).
- File 9 (`scarf/CLAUDE.md`): ~6 lines net add (two new bullets in the schema-drift block).

**Total:** ~280 lines net add across 8 files (Swift + Markdown). No deletes. No file moves. No new package targets.

**Build risk:** Low. All edits are additive; existing call sites use default values. No behavior change for pre-v0.13 hosts (capability flag + alias resolution are both safe).

**Review risk:** Medium-low. The Open Question on the OpenRouter cache key shape is the single highest-risk item; everything else is mechanical. Block the PR until that key is verified.

**Effort:** ~1 day implementation + 0.5 day verification (manual test plan + Open Question verification on a real v0.13 host).

---

## Appendix A — `models_dev_cache.json` verification

Local `~/.hermes/models_dev_cache.json` (v0.13 dev host) confirms:

| Query | Provider | Match |
| -- | -- | -- |
| `deepseek-v4-pro` | openrouter | `deepseek/deepseek-v4-pro` ✅ |
| `grok-4.3` | openrouter, xai, vercel | `x-ai/grok-4.3`, `grok-4.3`, `xai/grok-4.3` ✅ |
| `owl-alpha` | openrouter | `openrouter/owl-alpha` ✅ |
| `hy3-preview` | openrouter | `tencent/hy3-preview` ✅ |
| `trinity-large-thinking` | openrouter, kilo, venice, digitalocean | `arcee-ai/trinity-large-thinking`, etc. ✅ |
| `grok-4.20-beta` | openrouter | `x-ai/grok-4.20-beta` ✅ (live, not yet renamed in cache) |
| `grok-4.20` | openrouter | `x-ai/grok-4.20-multi-agent-beta` (similar but distinct) — the bare `x-ai/grok-4.20` ID is **not yet** in this cache snapshot |

**Implication:** The Grok rename hasn't fully landed in `models_dev_cache.json` on this dev host yet. The alias resolution is therefore **load-bearing** for users who manually update their `model.default` to the new ID before the cache refresh — they'd otherwise get an "unknown model" warning from Scarf's validator. Once the cache catches up, the alias falls back to cosmetic.

`vercel` provider: present, named `Vercel AI Gateway`, 248 models. Demotion target confirmed.

`arcee` overlay: present in Scarf's `overlayOnlyProviders`, NOT in `models_dev_cache.json`. Trinity Large Thinking still reaches users via the Arcee overlay's free-form picker + via OpenRouter / DigitalOcean / Venice / Kilo where the cache surfaces it. No code change needed.

---

## Appendix B — schema-drift checklist

Before merging WS-6, verify the following are aligned across Swift and the upstream Hermes Python:

- [ ] `ModelCatalogService.overlayOnlyProviders` matches `HERMES_OVERLAYS` in `hermes_cli/providers.py` (no change in WS-6, but verify nothing drifted since WS-1).
- [ ] `ModelCatalogService.modelAliases` matches Hermes's deprecation map (verify the key location in `hermes_cli/providers.py` or wherever upstream tracks renames).
- [ ] `ModelCatalogService.demotedProviders` matches Hermes's deprioritized-provider list.
- [ ] `HermesConfig.openrouterResponseCacheEnabled` YAML key matches Hermes's config schema (resolve the Open Question).
- [ ] `HermesConfig.imageGenModel` YAML key (`image_gen.model`) matches Hermes's config schema. Currently confident — the release notes name the key explicitly.

---

**End of WS-6 plan.**
@@ -0,0 +1,628 @@
# WS-7 Plan: Settings tab additions

**Workstream:** WS-7 of Scarf v2.8.0
**Hermes target:** v0.13.0 (v2026.5.7)
**Capability gates (already shipped in WS-1):**
- `HermesCapabilities.hasMCPSSETransport` (`>= 0.13.0`)
- `HermesCapabilities.hasCronNoAgent` (`>= 0.13.0`)
- `HermesCapabilities.hasWebToolsBackendSplit` (`>= 0.13.0`)
- `HermesCapabilities.hasProfileNoSkills` (`>= 0.13.0`)

**Builds on:**
- v2.7.5 MCP Servers feature (`Features/MCPServers/`) — list + detail + add (preset / custom) + edit + per-server delete + OAuth token surface.
- v2.7.5 Cron feature (`Features/Cron/`) — `--workdir` already plumbed through `CronJobEditor` + `CronViewModel.createJob` / `updateJob`. Provides the precedent for v0.13 capability-gated form fields.
- v2.7.5 Settings feature (`Features/Settings/`) — 10 tabs, single `SettingsViewModel` write surface routing through `setSetting(key, value)` → `hermes config set <key> <value>`.
- v2.7.5 Profiles feature (`Features/Profiles/`) — Mac (read/write) + iOS (read-only); Mac create-sheet has `--clone` / `--clone-all` toggles today.

**Owner:** TBD
**Reviewers:** Alan; whoever rides Settings/Profiles during v2.8.

---

## Goals

Four small, independent additions, each gated on its own v0.13 capability flag. Each lands as its own commit inside the WS-7 PR so reviewers can scan them as four self-contained changes.

1. **MCP SSE transport** — third transport option alongside `stdio` (which Hermes calls "pipe", meaning stdin/stdout JSON-RPC) and `http` (the HTTP transport in our code; see Open Questions). Adds `URL` + `sse_read_timeout` fields to the add-server flow and the editor; surfaces the "SSE" segment only on v0.13+ hosts.
2. **Cron `--no-agent`** — script-only watchdog jobs. New toggle in `CronJobEditor`; when ON, the prompt + skills fields collapse with a hint. Maps to `--no-agent` on `hermes cron create / edit`. Read-side adds `noAgent: Bool?` to `HermesCronJob` for round-trip tolerance.
3. **Web Tools backend split** — `web_search` and `web_extract` config keys gain distinct backends. Net-new tab "Web Tools" in `SettingsView` with two backend pickers. Pre-v0.13 hosts see a legacy combined picker (single `web_tools.backend` key) rendered inside the same tab so the chrome stays consistent.
4. **Profiles `--no-skills`** — Mac create-profile sheet gains an "Empty profile (no skills)" toggle that appends `--no-skills` to `hermes profile create`. iOS is read-only and out of scope.
|
||||||
|
|
||||||
|
### Non-goals
|
||||||
|
|
||||||
|
- **Live MCP SSE wire-format probing.** WS-7 only writes the YAML + surfaces the field. Hermes owns the runtime connect; Scarf trusts `hermes mcp test <name>` to verify.
|
||||||
|
- **MCP `pipe` transport surface.** v0.13 release notes mention "Retry stale pipe transport failures as session-expired" — pipe is Hermes-internal jargon for the existing stdio transport (per parser logic at `HermesFileService.parseMCPServersBlock` and `MCPTransport` enum cases). No new user-facing transport option for "pipe".
|
||||||
|
- **`web_tools.search.<backend>.<api_key>` deep settings.** Backend-specific tuning (e.g. SearXNG host URL, Tavily API key) stays in raw YAML editor for v2.8. Per-backend config sheets are a follow-up — the "split" is the v0.13 wire change WS-7 must ship.
|
||||||
|
- **iOS `--no-skills`.** iOS Profiles is read-only (per CLAUDE.md "v0.12 iOS catch-up (Phase H)" and `Scarf iOS/Profiles/ProfilesView.swift`). No new toggles on iOS.
|
||||||
|
- **Cron `--no-agent` retroactive flagging.** A v0.13 host whose `~/.hermes/cron/jobs.json` already has `no_agent: true` jobs gets the badge for free via the new `noAgent` field; no migration UX.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 1. MCP SSE transport
|
||||||
|
|
||||||
|
### Files / changes
|
||||||
|
|
||||||
|
#### 1a. `scarf/Packages/ScarfCore/Sources/ScarfCore/Models/HermesMCPServer.swift`
|
||||||
|
|
||||||
|
**Why:** `MCPTransport` is currently a 2-case enum (`stdio`, `http`). Adding `sse` as a third case keeps SwiftUI Picker code paths simple: any `ForEach(MCPTransport.allCases)` picks it up automatically. The add-server form in `MCPServerAddCustomView` is the exception, because it must hide SSE on pre-v0.13 hosts and therefore iterates a capability-filtered list instead (see 1c).
|
||||||
|
|
||||||
|
**Edits:**
|
||||||
|
|
||||||
|
- Extend `MCPTransport`:
|
||||||
|
```swift
|
||||||
|
public enum MCPTransport: String, Sendable, Equatable, CaseIterable, Identifiable {
|
||||||
|
case stdio
|
||||||
|
case http
|
||||||
|
case sse // v0.13+
|
||||||
|
...
|
||||||
|
}
|
||||||
|
```
|
||||||
|
- Add `displayName` case for `.sse`: `"Remote (SSE)"`.
|
||||||
|
- Add a single new stored property to `HermesMCPServer`:
|
||||||
|
- `public let sseReadTimeout: Int?` — seconds. `nil` when the YAML doesn't specify `sse_read_timeout`.
|
||||||
|
- Append `sseReadTimeout: Int? = nil` to the memberwise initializer's tail (defaulted) so existing call sites compile unchanged. Mirrors how `connectTimeout` lives next to `timeout`.
|
||||||
|
- Update `summary` so `.sse` returns `url ?? ""` (same shape as `.http`).
|
||||||
|
|
||||||
|
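**Tolerance contract:** A pre-v0.13 server entry has no `transport` key and parses exactly as before: `.http` when `url` is present, `.stdio` otherwise. A v0.13 entry with `transport: sse` + `url` parses as `.sse`; `sse_read_timeout` is read when present but does not itself select the transport. See the parser change in 1b.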
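
A minimal sketch of this contract as a pure function, assuming the parser exposes each entry as a flat `[String: String]` of scalar fields (as the `fields[...]` lookups in `parseMCPServersBlock` suggest). `SketchTransport`, `resolveTransport(from:)`, and `sseReadTimeout(from:)` are illustrative names, not existing ScarfCore API.

```swift
/// Illustrative stand-in for `MCPTransport`; not the real ScarfCore type.
enum SketchTransport: String, CaseIterable {
    case stdio, http, sse
}

/// Resolves a transport from one entry's scalar fields.
/// Assumption: Hermes v0.13 writes an explicit `transport: sse` scalar.
func resolveTransport(from fields: [String: String]) -> SketchTransport {
    if fields["transport"]?.lowercased() == "sse" { return .sse }
    if fields["url"] != nil { return .http }
    return .stdio
}

/// Parses `sse_read_timeout` (seconds); nil when absent or non-numeric.
func sseReadTimeout(from fields: [String: String]) -> Int? {
    fields["sse_read_timeout"].flatMap { Int($0) }
}
```

In this sketch only the `transport` scalar selects `.sse`; an entry that carries `sse_read_timeout` without it still resolves to `.http` or `.stdio`.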
|
||||||
|
|
||||||
|
#### 1b. `scarf/scarf/Core/Services/HermesFileService.swift`
|
||||||
|
|
||||||
|
**Why:** YAML parser at `parseMCPServersBlock` (line 796) currently distinguishes stdio vs http with `let transport: MCPTransport = fields["url"] != nil ? .http : .stdio`. SSE also has a `url`, so we need a second discriminator.
|
||||||
|
|
||||||
|
**Edits:**
|
||||||
|
|
||||||
|
- Inside the `flush()` closure (around line 815), replace the binary discriminator with a 3-way one:
|
||||||
|
```swift
|
||||||
|
let transport: MCPTransport = {
|
||||||
|
if fields["transport"]?.lowercased() == "sse" { return .sse }
|
||||||
|
if fields["url"] != nil { return .http }
|
||||||
|
return .stdio
|
||||||
|
}()
|
||||||
|
```
|
||||||
|
Hermes v0.13's `mcp add --url <https://...> --transport sse` writes a `transport: sse` scalar into the YAML entry; older hosts emit no `transport` key, defaulting to `.http` for url-based entries and `.stdio` otherwise. This preserves byte-for-byte round-trip on existing files.
|
||||||
|
- Read `sse_read_timeout` from `fields["sse_read_timeout"]`, parse as `Int?`, pass into `HermesMCPServer` initializer.
|
||||||
|
- New writer method:
|
||||||
|
```swift
|
||||||
|
@discardableResult
|
||||||
|
nonisolated func addMCPServerSSE(name: String, url: String, sseReadTimeout: Int?) -> (exitCode: Int32, output: String) {
|
||||||
|
var args = ["mcp", "add", name, "--url", url, "--transport", "sse"]
|
||||||
|
if let t = sseReadTimeout { args += ["--sse-read-timeout", String(t)] }
|
||||||
|
return runHermesCLI(args: args, timeout: 45, stdinInput: "y\ny\ny\n")
|
||||||
|
}
|
||||||
|
```
|
||||||
|
Verify the exact CLI flag name during integration — `--sse-read-timeout` is the natural form but Hermes may have shipped it as `--sse-read-timeout-seconds` or merged it under `--timeout`. See Open Questions.
|
||||||
|
- New writer for changing `sse_read_timeout` post-create:
|
||||||
|
```swift
|
||||||
|
@discardableResult
|
||||||
|
nonisolated func setMCPServerSSETimeout(name: String, sseReadTimeout: Int?) -> Bool {
|
||||||
|
patchMCPServerField(name: name) { entryLines in
|
||||||
|
if let t = sseReadTimeout {
|
||||||
|
Self.replaceOrInsertScalar(key: "sse_read_timeout", value: String(t), in: &entryLines)
|
||||||
|
} else {
|
||||||
|
Self.removeScalar(key: "sse_read_timeout", in: &entryLines)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
Mirrors `setMCPServerTimeouts` line-for-line.
|
||||||
|
|
||||||
|
**Round-trip invariant:** Adding an SSE server through `addMCPServerSSE`, then editing its `sse_read_timeout` through `setMCPServerSSETimeout`, then re-loading, must produce the same `HermesMCPServer.sseReadTimeout` value. Test fixture below.
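
The invariant can be exercised on an in-memory entry before touching a real file. A rough sketch, assuming each server entry is held as an array of YAML lines with a fixed four-space field indent; `patchScalar`, `dropScalar`, and `readIntScalar` are hypothetical stand-ins for the real scalar helpers, whose implementations may differ.

```swift
import Foundation

/// Replaces `key: value` in place, or appends it when the key is absent.
func patchScalar(key: String, value: String, in lines: inout [String], indent: String = "    ") {
    let prefix = indent + key + ":"
    if let index = lines.firstIndex(where: { $0.hasPrefix(prefix) }) {
        lines[index] = "\(prefix) \(value)"
    } else {
        lines.append("\(prefix) \(value)")
    }
}

/// Removes every line that sets `key` at the given indent.
func dropScalar(key: String, in lines: inout [String], indent: String = "    ") {
    let prefix = indent + key + ":"
    lines.removeAll { $0.hasPrefix(prefix) }
}

/// Reads an integer scalar back out of the entry; nil when absent or non-numeric.
func readIntScalar(key: String, in lines: [String]) -> Int? {
    for line in lines {
        let trimmed = line.trimmingCharacters(in: .whitespaces)
        guard trimmed.hasPrefix(key + ":") else { continue }
        let raw = trimmed.dropFirst(key.count + 1).trimmingCharacters(in: .whitespaces)
        return Int(raw)
    }
    return nil
}
```

Writing a value, overwriting it, and then clearing it should leave the entry with exactly the lines it started with, which is the property the fixture in 1g checks against a temp file.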
|
||||||
|
|
||||||
|
#### 1c. `scarf/scarf/Features/MCPServers/Views/MCPServerAddCustomView.swift`
|
||||||
|
|
||||||
|
**Why:** This is the add-server form. It currently has a 2-segment transport picker.
|
||||||
|
|
||||||
|
**Edits:**
|
||||||
|
|
||||||
|
- Add `@Environment(\.hermesCapabilities) private var capabilitiesStore`.
|
||||||
|
- Add `@State private var sseReadTimeout: String = ""`.
|
||||||
|
- Replace the static `Picker { ForEach(MCPTransport.allCases) }` segmented control with a filtered list that drops `.sse` when capability is off:
|
||||||
|
```swift
|
||||||
|
private var availableTransports: [MCPTransport] {
|
||||||
|
var t: [MCPTransport] = [.stdio, .http]
|
||||||
|
if capabilitiesStore?.capabilities.hasMCPSSETransport ?? false { t.append(.sse) }
|
||||||
|
return t
|
||||||
|
}
|
||||||
|
```
|
||||||
|
Render with `ForEach(availableTransports) { ... }`. Iterating `MCPTransport.allCases` would render the SSE option even on pre-v0.13 hosts, which Hermes argparse would reject.
|
||||||
|
- Branch the body: when `transport == .sse`, render an `sseSection` next to (not replacing) the existing `httpSection`. Shape:
|
||||||
|
```swift
|
||||||
|
private var sseSection: some View {
|
||||||
|
sectionBox(title: "Endpoint (SSE)") {
|
||||||
|
VStack(alignment: .leading, spacing: 8) {
|
||||||
|
VStack(alignment: .leading, spacing: 4) {
|
||||||
|
Text("URL").font(.caption.bold())
|
||||||
|
TextField("https://.../sse", text: $url)
|
||||||
|
.textFieldStyle(.roundedBorder)
|
||||||
|
.font(.system(.body, design: .monospaced))
|
||||||
|
}
|
||||||
|
VStack(alignment: .leading, spacing: 4) {
|
||||||
|
Text("SSE Read Timeout (seconds)").font(.caption.bold())
|
||||||
|
TextField("default 300", text: $sseReadTimeout)
|
||||||
|
.textFieldStyle(.roundedBorder)
|
||||||
|
.frame(maxWidth: 140)
|
||||||
|
Text("How long Hermes waits for the next SSE event before timing out. Leave blank to use the default.")
|
||||||
|
.font(.caption2)
|
||||||
|
.foregroundStyle(.secondary)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
Default placeholder reads `default 300` since Hermes v0.13's `sse_read_timeout` defaults to 300s (verify against `~/.hermes/hermes-agent/hermes_cli/mcp.py` during integration; if it's 60s or 600s adjust the placeholder copy).
|
||||||
|
- Adjust `canSubmit` + `submit()`:
|
||||||
|
- `case .sse: return !url.trimmingCharacters(in: .whitespaces).isEmpty`
|
||||||
|
- In `submit()`, dispatch based on `transport`:
|
||||||
|
```swift
|
||||||
|
switch transport {
|
||||||
|
case .stdio: viewModel.addCustom(...) // existing
|
||||||
|
case .http: viewModel.addCustom(...) // existing
|
||||||
|
case .sse: viewModel.addCustomSSE(name: trimmedName, url: ..., sseReadTimeout: Int(sseReadTimeout))
|
||||||
|
}
|
||||||
|
```
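
One detail worth handling in `submit()`: `Int(sseReadTimeout)` returns `nil` for input with stray whitespace such as `" 300"`, which would silently drop a value the user typed. A small sketch of a stricter conversion, under the assumption that blank or non-positive input should mean "use the Hermes default"; `parseTimeoutField` is a hypothetical helper, not existing Scarf code.

```swift
import Foundation

/// Converts the raw text-field value into seconds.
/// Returns nil for blank, non-numeric, or non-positive input so the caller
/// can omit the flag and let Hermes apply its own default.
func parseTimeoutField(_ raw: String) -> Int? {
    let trimmed = raw.trimmingCharacters(in: .whitespacesAndNewlines)
    guard let seconds = Int(trimmed), seconds > 0 else { return nil }
    return seconds
}
```

The same helper could back `sseReadTimeoutDraft` in the editor so both entry points treat input identically.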
|
||||||
|
|
||||||
|
#### 1d. `scarf/scarf/Features/MCPServers/ViewModels/MCPServersViewModel.swift`
|
||||||
|
|
||||||
|
**Edits:**
|
||||||
|
|
||||||
|
- New method:
|
||||||
|
```swift
|
||||||
|
func addCustomSSE(name: String, url: String, sseReadTimeout: Int?) {
|
||||||
|
let fileService = self.fileService
|
||||||
|
Task.detached {
|
||||||
|
let result = fileService.addMCPServerSSE(name: name, url: url, sseReadTimeout: sseReadTimeout)
|
||||||
|
await MainActor.run {
|
||||||
|
if result.exitCode == 0 {
|
||||||
|
self.flashStatus("Added \(name)")
|
||||||
|
self.load()
|
||||||
|
self.selectedServerName = name
|
||||||
|
self.showRestartBanner = true
|
||||||
|
self.showAddCustom = false
|
||||||
|
} else {
|
||||||
|
self.activeError = "Add failed: \(result.output)"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
- Add a third filtered list `sseServers: [HermesMCPServer]` matching the `stdioServers` / `httpServers` pattern; the `Section("Remote (SSE)")` that 1e adds to `MCPServersView.serversList` reads from it. Keeping the two existing sections and adding a third mirrors the existing UX better than collapsing all remote servers into one section.
|
||||||
|
|
||||||
|
#### 1e. `scarf/scarf/Features/MCPServers/Views/MCPServersView.swift`
|
||||||
|
|
||||||
|
**Edits:**
|
||||||
|
|
||||||
|
- Add a third `if !viewModel.sseServers.isEmpty { Section("Remote (SSE)") { ... } }` block in `serversList`. The icon for the row stays `network` (same as http) — the "(SSE)" label in the section header is the differentiator.
|
||||||
|
- No capability gate inside `MCPServersView` — pre-v0.13 hosts simply have no `.sse` entries to render.
|
||||||
|
|
||||||
|
#### 1f. `scarf/scarf/Features/MCPServers/Views/MCPServerEditorView.swift`
|
||||||
|
|
||||||
|
**Why:** Edit existing server's `sse_read_timeout`. The editor today exposes `timeout` + `connect_timeout` in `timeoutsSection`; SSE servers want a third numeric.
|
||||||
|
|
||||||
|
**Edits:**
|
||||||
|
|
||||||
|
- Add `@Environment(\.hermesCapabilities)` so the editor can know whether the field is editable.
|
||||||
|
- Branch `timeoutsSection` on `viewModel.server.transport`:
|
||||||
|
- For `.stdio` and `.http`: render the existing connect/call timeouts.
|
||||||
|
- For `.sse`: render the existing connect/call timeouts AND add a third "SSE Read Timeout" field bound to `viewModel.sseReadTimeoutDraft`.
|
||||||
|
- Update `MCPServerEditorViewModel`:
|
||||||
|
- Add `var sseReadTimeoutDraft: String` initialized from `server.sseReadTimeout.map(String.init) ?? ""`.
|
||||||
|
- Inside `save()`, when `transport == .sse`, call `service.setMCPServerSSETimeout(name: name, sseReadTimeout: Int(sseReadTimeoutDraft))` alongside the existing `setMCPServerTimeouts` call. A failure flips `ok = false` like the others.
|
||||||
|
|
||||||
|
#### 1g. `scarf/Packages/ScarfCore/Tests/ScarfCoreTests/HermesMCPServerYAMLTests.swift` (NEW or extension to existing)
|
||||||
|
|
||||||
|
**Tests:**
|
||||||
|
|
||||||
|
1. `parseMCPServersBlock_v013_sseEntry_decodesAsSSE` — fixture YAML with `transport: sse` + `url: https://...` + `sse_read_timeout: 300` parses to `.sse` transport with the right `sseReadTimeout` value.
|
||||||
|
2. `parseMCPServersBlock_v012_httpEntry_stillDecodesAsHTTP` — pre-v0.13 entry without `transport:` still resolves to `.http` when `url` is present.
|
||||||
|
3. `parseMCPServersBlock_v012_stdioEntry_stillDecodesAsStdio` — entry with no `url` and no `transport:` resolves to `.stdio`.
|
||||||
|
4. `setMCPServerSSETimeout_writesAndClears` — round-trip integration test using a temp YAML: write `300`, re-read, assert; write `nil`, re-read, assert key removed.
|
||||||
|
|
||||||
|
### Capability gating
|
||||||
|
|
||||||
|
- **Add-server form:** `availableTransports` filter drops `.sse` when `hasMCPSSETransport` is false. Pre-v0.13 hosts see only "stdio | http" segments. The toolbar add button stays unconditional — the gate lives inside the form.
|
||||||
|
- **Editor:** `sse_read_timeout` field renders only for servers whose `transport == .sse`. Since pre-v0.13 hosts can't write SSE servers, the field never appears for those users. (Defensive: if a v0.13 server is somehow viewed on a pre-v0.13 host — e.g. user downgraded Hermes — the editor still reads + writes the field. Hermes will ignore it. Acceptable.)
|
||||||
|
- **List rendering:** `Section("Remote (SSE)")` only renders when `sseServers` is non-empty, so pre-v0.13 hosts don't see an empty section.
|
||||||
|
|
||||||
|
### Tests
|
||||||
|
|
||||||
|
- ScarfCore: the 4 YAML tests above (3 parser, 1 writer round-trip) + 2 model tests (`MCPTransport.allCases.count == 3`, `sseReadTimeout` round-trips through memberwise init).
|
||||||
|
- ScarfTests (Mac app): `MCPServersViewModelTests.testAddCustomSSE` mock-fileservice test verifying the `--transport sse --sse-read-timeout` flag shape.
|
||||||
|
|
||||||
|
### Rollout
|
||||||
|
|
||||||
|
- Feature-gate behind `hasMCPSSETransport` so a pre-v0.13 host never sees the SSE option.
|
||||||
|
- No migration: existing stdio/http servers are unaffected.
|
||||||
|
- One commit. Should land at ~250-350 LOC additions across 6 files.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 2. Cron `--no-agent` toggle
|
||||||
|
|
||||||
|
### Files / changes
|
||||||
|
|
||||||
|
#### 2a. `scarf/Packages/ScarfCore/Sources/ScarfCore/Models/HermesCronJob.swift`
|
||||||
|
|
||||||
|
**Why:** Read-side support so `loadCronJobs()` can round-trip `no_agent: true` from `~/.hermes/cron/jobs.json`. Pre-v0.13 jobs.json files don't have the field — the existing `decodeIfPresent` pattern (line 113 for `workdir`) handles that.
|
||||||
|
|
||||||
|
**Edits:**
|
||||||
|
|
||||||
|
- Add `public nonisolated let noAgent: Bool?` between `workdir` and `contextFrom`.
|
||||||
|
- Extend `enum CodingKeys` with `case noAgent = "no_agent"`.
|
||||||
|
- Extend the public memberwise initializer's tail with `noAgent: Bool? = nil`.
|
||||||
|
- Extend `init(from decoder:)`: `self.noAgent = try c.decodeIfPresent(Bool.self, forKey: .noAgent)`.
|
||||||
|
- Extend `encode(to encoder:)`: `try c.encodeIfPresent(noAgent, forKey: .noAgent)`.
|
||||||
|
|
||||||
|
**Tolerance contract:** A pre-v0.13 jobs.json with no `no_agent` field decodes with `noAgent == nil`. A v0.13 jobs.json with explicit `no_agent: false` decodes with `noAgent == false`. The "render the badge?" check is `job.noAgent == true` (treats `nil` and `false` identically — a script-only job must opt in).
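
The three cases in the contract can be sketched with a cut-down model. This assumes only an `id` plus the new field; `SketchCronJob` and `decodeJob` are illustrative names, and the real `HermesCronJob` carries many more fields.

```swift
import Foundation

/// Illustrative subset of a cron job; not the real `HermesCronJob`.
struct SketchCronJob: Decodable {
    let id: String
    let noAgent: Bool?

    enum CodingKeys: String, CodingKey {
        case id
        case noAgent = "no_agent"
    }

    init(from decoder: Decoder) throws {
        let c = try decoder.container(keyedBy: CodingKeys.self)
        id = try c.decode(String.self, forKey: .id)
        // An absent key decodes to nil instead of throwing.
        noAgent = try c.decodeIfPresent(Bool.self, forKey: .noAgent)
    }

    /// Badge check: only an explicit `true` counts as script-only.
    var isScriptOnly: Bool { noAgent == true }
}

/// Decodes a single job from a JSON string.
func decodeJob(_ json: String) throws -> SketchCronJob {
    try JSONDecoder().decode(SketchCronJob.self, from: Data(json.utf8))
}
```

Keeping `nil` and `false` distinct in the model while collapsing them in `isScriptOnly` lets the read side stay tolerant without losing what the file actually said.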
|
||||||
|
|
||||||
|
#### 2b. `scarf/scarf/Features/Cron/Views/CronView.swift`
|
||||||
|
|
||||||
|
**Edits:**
|
||||||
|
|
||||||
|
- Extend `CronJobEditor.FormState` with `var noAgent: Bool = false`.
|
||||||
|
- Add `let supportsNoAgent: Bool` next to the existing `let supportsWorkdir: Bool`.
|
||||||
|
- Inside `body`, add a Toggle row near the bottom of the form (after `Workdir`, before `availableSkills`):
|
||||||
|
```swift
|
||||||
|
if supportsNoAgent {
|
||||||
|
Toggle("Run script only (no agent call)", isOn: $form.noAgent)
|
||||||
|
.scarfStyle(.body)
|
||||||
|
.tint(ScarfColor.accent)
|
||||||
|
if form.noAgent {
|
||||||
|
Text("Watchdog mode — Hermes runs the pre-run script and skips the AI turn. Prompt + skills are ignored.")
|
||||||
|
.scarfStyle(.caption)
|
||||||
|
.foregroundStyle(ScarfColor.foregroundMuted)
|
||||||
|
.padding(.leading, ScarfSpace.s3)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
- Dim and disable the prompt + skills sections when `form.noAgent` is true. Don't *remove* them from the view tree: keep them rendered but visually muted and non-interactive. This avoids the surprising layout shift of fields disappearing mid-edit:
|
||||||
|
```swift
|
||||||
|
// around the existing Prompt TextEditor
|
||||||
|
.opacity(form.noAgent ? 0.4 : 1.0)
|
||||||
|
.disabled(form.noAgent)
|
||||||
|
.accessibilityHint(form.noAgent ? Text("Disabled — Run script only is on") : Text(""))
|
||||||
|
```
|
||||||
|
Apply the same to the Skills picker. Script field stays fully active — it's the load-bearing thing in `--no-agent` mode.
|
||||||
|
- On entering edit mode (the existing `.onAppear` handler), hydrate `form.noAgent = job.noAgent ?? false`.
|
||||||
|
- Wire through to the parent: pass `form.noAgent` in the `onSave(form)` callback. The parent's `viewModel.createJob` / `updateJob` then knows the flag.
|
||||||
|
|
||||||
|
#### 2c. `scarf/scarf/Features/Cron/Views/CronView.swift` — owner site
|
||||||
|
|
||||||
|
**Edits:**
|
||||||
|
|
||||||
|
- Add a private capability accessor next to `hasCronWorkdir`:
|
||||||
|
```swift
|
||||||
|
private var hasCronNoAgent: Bool {
|
||||||
|
capabilitiesStore?.capabilities.hasCronNoAgent ?? false
|
||||||
|
}
|
||||||
|
```
|
||||||
|
- Plumb `supportsNoAgent: hasCronNoAgent` into `CronJobEditor` instantiations (both the create and edit sheet paths, mirroring how `supportsWorkdir` is wired).
|
||||||
|
- Update the create + edit `.sheet` closures to pass `noAgent: form.noAgent` into `viewModel.createJob` / `updateJob`. Mirror the `workdir` strip-on-pre-v0.12 pattern: pass `hasCronNoAgent ? form.noAgent : false`. (For the update path, pass `hasCronNoAgent ? form.noAgent : nil` if the underlying VM signature distinguishes "don't touch" from "set false" — see VM section below.)
|
||||||
|
|
||||||
|
#### 2d. `scarf/scarf/Features/Cron/ViewModels/CronViewModel.swift`
|
||||||
|
|
||||||
|
**Edits:**
|
||||||
|
|
||||||
|
- Extend `createJob` signature with `noAgent: Bool = false` at the tail:
|
||||||
|
```swift
|
||||||
|
func createJob(schedule: String, prompt: String, name: String, deliver: String, skills: [String], script: String, repeatCount: String, workdir: String = "", noAgent: Bool = false) {
|
||||||
|
var args = ["cron", "create"]
|
||||||
|
...
|
||||||
|
if noAgent { args.append("--no-agent") }
|
||||||
|
args.append(schedule)
|
||||||
|
// When --no-agent is set Hermes ignores the prompt arg, but argparse still
|
||||||
|
// wants positional args to line up with the schedule. Pass an empty string
|
||||||
|
// explicitly so the positional parser doesn't treat the prompt as missing.
|
||||||
|
if noAgent {
|
||||||
|
args.append("")
|
||||||
|
} else if !prompt.isEmpty {
|
||||||
|
args.append(prompt)
|
||||||
|
}
|
||||||
|
runAndReload(args, success: "Job created")
|
||||||
|
}
|
||||||
|
```
|
||||||
|
Verify Hermes's argparse behavior during integration — if `cron create --no-agent <schedule>` rejects the trailing empty positional, drop the empty-string append.
|
||||||
|
- Extend `updateJob` signature with `noAgent: Bool? = nil`:
|
||||||
|
```swift
|
||||||
|
func updateJob(id: String, ..., workdir: String? = nil, noAgent: Bool? = nil) {
|
||||||
|
var args = ["cron", "edit", id]
|
||||||
|
...
|
||||||
|
if let noAgent {
|
||||||
|
// Hermes documents `--no-agent` as a flag on `cron edit` for v0.13+.
|
||||||
|
// Verify exact toggle-off shape (likely `--no-agent=false` or
|
||||||
|
// `--agent` to flip back). See Open Questions.
|
||||||
|
if noAgent { args.append("--no-agent") }
|
||||||
|
else { args.append("--agent") }
|
||||||
|
}
|
||||||
|
runAndReload(args, success: "Updated")
|
||||||
|
}
|
||||||
|
```
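
The flag handling in both methods can be pulled into pure functions so the argv shape is testable without a mock file service. A sketch under the plan's own unverified assumptions: the empty trailing positional on create and `--agent` as the toggle-off spelling on edit both still need checking against Hermes. All three function names are hypothetical, and the other options (name, deliver, skills, script, and so on) are left out.

```swift
/// Builds the `hermes cron create` argv for the parts shown in this section.
func cronCreateArgs(schedule: String, prompt: String, noAgent: Bool) -> [String] {
    var args = ["cron", "create"]
    if noAgent { args.append("--no-agent") }
    args.append(schedule)
    if noAgent {
        // Assumption from the plan: an empty positional keeps argparse aligned.
        args.append("")
    } else if !prompt.isEmpty {
        args.append(prompt)
    }
    return args
}

/// Tri-state for `cron edit`: nil means "leave the job's mode untouched".
/// `--agent` as the toggle-off flag is unverified.
func cronEditNoAgentArgs(_ noAgent: Bool?) -> [String] {
    guard let noAgent else { return [] }
    return [noAgent ? "--no-agent" : "--agent"]
}

/// Write-strip for the edit path: never send the flag to a host without the capability.
func gatedEditValue(hasCronNoAgent: Bool, formValue: Bool) -> Bool? {
    hasCronNoAgent ? formValue : nil
}
```

If integration shows that Hermes rejects the empty positional, only `cronCreateArgs` changes; the tri-state and the write-strip stay as they are.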
|
||||||
|
|
||||||
|
#### 2e. `scarf/scarf/Features/Cron/Views/CronView.swift` — detail rendering
|
||||||
|
|
||||||
|
**Edits (cosmetic, optional but high-value):** When the selected job has `noAgent == true`, render a small `ScarfBadge("script-only", kind: .info)` in `detailHeader` next to the existing `paused` / `running…` badges so the user can tell at a glance which jobs are watchdogs. Same in the `cronRow` list — append a `ScarfBadge("no-agent", kind: .neutral)` when the flag is on, similar to the existing `paused` badge.
|
||||||
|
|
||||||
|
### Capability gating
|
||||||
|
|
||||||
|
- **Editor toggle:** rendered only when `supportsNoAgent` is true. Pre-v0.13 hosts never see the field.
|
||||||
|
- **Defensive write-strip:** `CronView` passes `hasCronNoAgent ? form.noAgent : false` on create and `hasCronNoAgent ? form.noAgent : nil` on edit. Mirrors the `workdir` strip from v0.12 (`workdir: hasCronWorkdir ? form.workdir : ""` on create, `nil` on edit).
|
||||||
|
- **Read-side rendering:** badges + collapsed-fields visual cue render unconditionally when `job.noAgent == true`. A user who downgraded Hermes after creating a `no_agent` job still sees it labeled correctly, even though they can no longer create new ones.
|
||||||
|
|
||||||
|
### Tests
|
||||||
|
|
||||||
|
- `M6ConfigCronTests` extension: add `decodes_v013_jobs_json_with_no_agent` — fixture jobs.json with one job carrying `no_agent: true`. Assert `job.noAgent == true`.
|
||||||
|
- `M6ConfigCronTests`: `decodes_v012_jobs_json_no_no_agent_field` — pre-v0.13 fixture, assert `job.noAgent == nil`.
|
||||||
|
- `CronViewModelNoAgentTests` (new): mock-fileservice test asserting `createJob(..., noAgent: true)` produces `["cron", "create", "--no-agent", schedule, ""]` (or whatever argparse shape we converge on after integration).
|
||||||
|
- Manual: pre-v0.13 host — toggle absent in editor. v0.13 host — toggle present, creating a script-only job with no AGENTS.md context completes without an LLM call (verify in `~/.hermes/logs/`).
|
||||||
|
|
||||||
|
### Rollout
|
||||||
|
|
||||||
|
- One commit. ~150-200 LOC across 4 files (model + view + editor + VM).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 3. Web Tools backend split
|
||||||
|
|
||||||
|
### Files / changes
|
||||||
|
|
||||||
|
A net-new Settings tab. Scarf has no Web Tools tab today: `web_extract`'s **provider** lives in Aux Models, but the backend (not provider) keys `web_tools.search.backend` / `web_tools.extract.backend` are not surfaced anywhere (verified: `grep web_tools = ` returns no Scarf hits). v0.13 makes "split per capability" the wire model, so introducing the tab here gives us a clean substrate for adding backend-specific rows later.
|
||||||
|
|
||||||
|
Layout shape:
|
||||||
|
|
||||||
|
- Pre-v0.13: a single row "Combined backend" → `web_tools.backend` key (legacy v0.12 shape).
|
||||||
|
- v0.13+: two rows — "Search backend" → `web_tools.search.backend`, "Extract backend" → `web_tools.extract.backend`. SearXNG appears in the Search picker only.
|
||||||
|
|
||||||
|
Both shapes coexist in the same tab; the gate decides which renders.
|
||||||
|
|
||||||
|
#### 3a. `scarf/scarf/Features/Settings/Views/SettingsView.swift`
|
||||||
|
|
||||||
|
**Edits:**
|
||||||
|
|
||||||
|
- Add a new case to `SettingsTab`:
|
||||||
|
```swift
|
||||||
|
case webTools = "Web Tools"
|
||||||
|
```
|
||||||
|
Position: between `.browser` and `.voice` (browser-adjacent in the user's mental model). Update `displayName`, `icon` (`"globe.americas"`), and `tabContent` switch.
|
||||||
|
- `tabContent` adds: `case .webTools: WebToolsTab(viewModel: viewModel)`.
|
||||||
|
|
||||||
|
#### 3b. `scarf/scarf/Features/Settings/Views/Tabs/WebToolsTab.swift` (NEW)
|
||||||
|
|
||||||
|
**Why:** Self-contained tab file matching the existing pattern (`BrowserTab.swift`, `TerminalTab.swift`, etc.). Pre-v0.13 + v0.13+ shapes both live here behind a capability check.
|
||||||
|
|
||||||
|
**Shape:**
|
||||||
|
|
||||||
|
```swift
|
||||||
|
import SwiftUI
|
||||||
|
import ScarfCore
|
||||||
|
import ScarfDesign
|
||||||
|
|
||||||
|
struct WebToolsTab: View {
|
||||||
|
@Bindable var viewModel: SettingsViewModel
|
||||||
|
@Environment(\.hermesCapabilities) private var capabilitiesStore
|
||||||
|
|
||||||
|
private var split: Bool {
|
||||||
|
capabilitiesStore?.capabilities.hasWebToolsBackendSplit ?? false
|
||||||
|
}
|
||||||
|
|
||||||
|
private static let searchBackends: [String] = [
|
||||||
|
"duckduckgo", "tavily", "brave", "exa", "you", "searxng"
|
||||||
|
]
|
||||||
|
private static let extractBackends: [String] = [
|
||||||
|
"reader", "browserless", "trafilatura", "firecrawl"
|
||||||
|
]
|
||||||
|
private static let combinedBackends: [String] = [
|
||||||
|
"duckduckgo", "tavily", "brave", "exa", "you", "reader", "browserless", "trafilatura", "firecrawl"
|
||||||
|
]
|
||||||
|
|
||||||
|
var body: some View {
|
||||||
|
VStack(alignment: .leading, spacing: ScarfSpace.s5) {
|
||||||
|
SettingsSection(title: "Web Tools", icon: "globe.americas") {
|
||||||
|
if split {
|
||||||
|
Picker("Search backend", selection: Binding(
|
||||||
|
get: { viewModel.config.webToolsSearchBackend },
|
||||||
|
set: { viewModel.setWebToolsSearchBackend($0) }
|
||||||
|
)) {
|
||||||
|
ForEach(Self.searchBackends, id: \.self) { Text($0).tag($0) }
|
||||||
|
}
|
||||||
|
Text("SearXNG was added in v0.13 as a search-only backend.")
|
||||||
|
.scarfStyle(.caption)
|
||||||
|
.foregroundStyle(ScarfColor.foregroundMuted)
|
||||||
|
Picker("Extract backend", selection: Binding(
|
||||||
|
get: { viewModel.config.webToolsExtractBackend },
|
||||||
|
set: { viewModel.setWebToolsExtractBackend($0) }
|
||||||
|
)) {
|
||||||
|
ForEach(Self.extractBackends, id: \.self) { Text($0).tag($0) }
|
||||||
|
}
|
||||||
|
} else {
|
||||||
|
Picker("Backend", selection: Binding(
|
||||||
|
get: { viewModel.config.webToolsBackend },
|
||||||
|
set: { viewModel.setWebToolsBackend($0) }
|
||||||
|
)) {
|
||||||
|
ForEach(Self.combinedBackends, id: \.self) { Text($0).tag($0) }
|
||||||
|
}
|
||||||
|
Text("Hermes v0.13 splits search and extract into separate backends. Update Hermes to access the per-capability picker.")
|
||||||
|
.scarfStyle(.caption)
|
||||||
|
.foregroundStyle(ScarfColor.foregroundFaint)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
The backend lists are intentionally small + curated. **The exact set must be reconciled against `~/.hermes/hermes-agent/hermes_cli/web_tools.py` (or wherever Hermes registers the dispatch table)** during integration. See Open Questions.
|
||||||
|
|
||||||
|
#### 3c. `scarf/Packages/ScarfCore/Sources/ScarfCore/Models/HermesConfig.swift`
|
||||||
|
|
||||||
|
**Edits:**
|
||||||
|
|
||||||
|
- Add three new top-level fields to `HermesConfig` (next to `redactionEnabled` near line 663, since they share the v0.12+ migration tail comment):
|
||||||
|
```swift
|
||||||
|
/// Pre-v0.13: single combined backend at `web_tools.backend`. v0.13
|
||||||
|
/// flipped to per-capability split (see below). Kept for round-trip
|
||||||
|
/// on hosts that never migrated.
|
||||||
|
public var webToolsBackend: String // default "duckduckgo"
|
||||||
|
/// v0.13+: `web_tools.search.backend`. SearXNG can land here.
|
||||||
|
public var webToolsSearchBackend: String // default "duckduckgo"
|
||||||
|
/// v0.13+: `web_tools.extract.backend`.
|
||||||
|
public var webToolsExtractBackend: String // default "reader"
|
||||||
|
```
|
||||||
|
- Add to the memberwise initializer at the tail with defaults so v2.7.5 call sites still compile.
|
||||||
|
- Extend `.empty` with `"duckduckgo"` / `"duckduckgo"` / `"reader"` defaults.
|
||||||
|
|
||||||
|
#### 3d. `scarf/Packages/ScarfCore/Sources/ScarfCore/Parsing/HermesConfig+YAML.swift`
|
||||||
|
|
||||||
|
**Edits:** Read three new keys via the existing `str(...)` helper:
|
||||||
|
- `webToolsBackend: str("web_tools.backend", default: "duckduckgo")`
|
||||||
|
- `webToolsSearchBackend: str("web_tools.search.backend", default: "duckduckgo")`
|
||||||
|
- `webToolsExtractBackend: str("web_tools.extract.backend", default: "reader")`
|
||||||
|
|
||||||
|
Pre-v0.13 YAML has only `web_tools.backend`; the two split keys are absent, so each falls back to its own default (`"duckduckgo"` for search, `"reader"` for extract) rather than inheriting the combined value. v0.13 YAML may have `web_tools.search.backend` set and `web_tools.backend` absent; the legacy field then falls back to its default but is unused on v0.13 hosts (the tab gates on `hasWebToolsBackendSplit`).
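
A minimal sketch of that fallback behavior, assuming config keys have already been flattened to dotted paths. `FlatConfig` is an illustrative stand-in for whatever backs the parser's `str(...)` helper, and `WebToolsBackends` / `webToolsBackends(from:)` are hypothetical names.

```swift
/// Illustrative stand-in for the parser's flattened key/value view.
struct FlatConfig {
    let values: [String: String]

    /// Returns the stored value, or the fallback when the key is absent.
    func str(_ key: String, default fallback: String) -> String {
        values[key] ?? fallback
    }
}

/// The three backend fields this section adds to the config model.
struct WebToolsBackends: Equatable {
    let combined: String
    let search: String
    let extract: String
}

/// Each key is resolved independently; no key inherits from another.
func webToolsBackends(from config: FlatConfig) -> WebToolsBackends {
    WebToolsBackends(
        combined: config.str("web_tools.backend", default: "duckduckgo"),
        search: config.str("web_tools.search.backend", default: "duckduckgo"),
        extract: config.str("web_tools.extract.backend", default: "reader")
    )
}
```

Because the keys resolve independently, a v0.12 file with `web_tools.backend: tavily` does not make the split fields read `tavily`. That is harmless while the split pickers stay hidden on such hosts.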
|
||||||
|
|
||||||
|
#### 3e. `scarf/scarf/Features/Settings/ViewModels/SettingsViewModel.swift`
|
||||||
|
|
||||||
|
**Edits:** Three new setters:
|
||||||
|
```swift
|
||||||
|
func setWebToolsBackend(_ value: String) { setSetting("web_tools.backend", value: value) }
|
||||||
|
func setWebToolsSearchBackend(_ value: String) { setSetting("web_tools.search.backend", value: value) }
|
||||||
|
func setWebToolsExtractBackend(_ value: String) { setSetting("web_tools.extract.backend", value: value) }
|
||||||
|
```
|
||||||
|
All three route through `hermes config set <key> <value>` — the v0.13 CLI accepts the dotted path keys as written. Hermes config-set rejects unknown keys, so on a pre-v0.13 host `setWebToolsSearchBackend` would fail; we don't expose the call site there (the picker isn't rendered).
|
||||||
|
|
||||||
|
### Capability gating
|
||||||
|
|
||||||
|
- **Tab itself:** the tab is always shown. Pre-v0.13 hosts see the legacy combined picker, so they are not blocked from configuring Web Tools. Hiding the tab entirely on pre-v0.13 hosts would leave v0.12 users with no Scarf surface for `web_tools.backend`.
|
||||||
|
- **Picker shape:** `split` flag inside `WebToolsTab` chooses between the two shapes.
|
||||||
|
- **SearXNG visibility:** appears only in `searchBackends` (the v0.13 split case). Never in `combinedBackends`. This matches Hermes — pre-v0.13 doesn't dispatch SearXNG at all.
|
||||||
|
|
||||||
|
### Tests
|
||||||
|
|
||||||
|
- `HermesConfigYAMLTests`:
|
||||||
|
1. `parses_v012_combined_backend` — fixture with `web_tools.backend: tavily`, no split keys → `webToolsBackend == "tavily"`, split keys == defaults.
|
||||||
|
2. `parses_v013_split_backend` — fixture with both `web_tools.search.backend: searxng` + `web_tools.extract.backend: reader` → both split keys populated.
|
||||||
|
3. `parses_v013_partial` — fixture with only `web_tools.search.backend` set (extract uses default) → search populated, extract == default.
|
||||||
|
- Manual: load v0.12 host → see combined picker. Load v0.13 host → see split. Confirm SearXNG only in Search.
|
||||||
|
|
||||||
|
### Rollout
|
||||||
|
|
||||||
|
- One commit. ~200-260 LOC: 1 new file (~80 LOC), edits to 4 existing files. The new tab makes this one of the two largest additions, alongside MCP SSE (~250-350 LOC).
|
||||||
|
- Add an entry to the Settings tab strip and verify that horizontal scroll still fits 11 tabs comfortably (it should; the strip already sits in a `ScrollView(.horizontal)`).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 4. Profiles `--no-skills` toggle
|
||||||
|
|
||||||
|
### Files / changes
|
||||||
|
|
||||||
|
#### 4a. `scarf/scarf/Features/Profiles/Views/ProfilesView.swift`
|
||||||
|
|
||||||
|
**Edits:**
|
||||||
|
|
||||||
|
- Add `@Environment(\.hermesCapabilities) private var capabilitiesStore` next to the existing state.
|
||||||
|
- Add `@State private var createNoSkills: Bool = false` next to `createCloneConfig` / `createCloneAll`.
|
||||||
|
- Inside `createSheet`, add a new toggle row between the existing toggles:
|
||||||
|
```swift
|
||||||
|
if capabilitiesStore?.capabilities.hasProfileNoSkills ?? false {
|
||||||
|
Toggle("Empty profile (no skills)", isOn: $createNoSkills)
|
||||||
|
.disabled(createCloneAll) // mutually exclusive with full clone
|
||||||
|
}
|
||||||
|
```
|
||||||
|
Why disabled when `createCloneAll`: a full clone copies skills wholesale — `--no-skills` would be a contradiction. Hermes likely rejects the combination but the UX is cleaner if we don't let the user reach it.
|
||||||
|
- Reset on sheet open: in the existing reset (line 126: `createName = ""; createCloneConfig = true; createCloneAll = false`), add `createNoSkills = false`.
|
||||||
|
- Wire to the VM:
|
||||||
|
```swift
|
||||||
|
Button("Create") {
|
||||||
|
viewModel.create(name: createName, cloneConfig: createCloneConfig, cloneAll: createCloneAll, noSkills: createNoSkills)
|
||||||
|
showCreate = false
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
#### 4b. `scarf/scarf/Features/Profiles/ViewModels/ProfilesViewModel.swift`
|
||||||
|
|
||||||
|
**Edits:** Extend `create` signature with `noSkills: Bool = false`:
|
||||||
|
|
||||||
|
```swift
|
||||||
|
func create(name: String, cloneConfig: Bool, cloneAll: Bool, noSkills: Bool = false) {
|
||||||
|
var args = ["profile", "create", name]
|
||||||
|
if cloneAll { args.append("--clone-all") }
|
||||||
|
else if cloneConfig { args.append("--clone") }
|
||||||
|
if noSkills { args.append("--no-skills") }
|
||||||
|
runAndReload(args, success: "Profile '\(name)' created")
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
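The `--no-skills` flag is independent of `--clone` / `--clone-all` per the v0.13 release notes ("`--no-skills` flag for empty profile creation"). The UX disables the toggle under `--clone-all` for clarity, but the wire is unconditional: the user can stack `--clone --no-skills` to clone config but skip skills, which is a plausible workflow. One caveat: `.disabled` does not reset the bound value, so if the user turns the toggle on and then enables the full clone, `createNoSkills` is still `true` when Create is tapped and `--clone-all --no-skills` reaches the wire (see Open question 8).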
|
||||||
|
|
||||||
|
### Capability gating
|
||||||
|
|
||||||
|
- **Toggle visibility:** wrapped in `capabilitiesStore?.capabilities.hasProfileNoSkills ?? false`. Pre-v0.13 hosts never see it.
|
||||||
|
- **Defensive write-strip:** `createNoSkills` stays at its default `false` when the form doesn't surface the toggle, so the view never passes `noSkills: true` on pre-v0.13 hosts. No need for a `?? false` strip at the call site; the parameter also has a default in the VM signature.
|
||||||
|
|
||||||
|
### Tests
|
||||||
|
|
||||||
|
- `ProfilesViewModelTests` (new or extension): `create_emitsNoSkillsFlagWhenSet` — mock-fileservice asserting `["profile", "create", "name", "--no-skills"]` for `noSkills: true`.
|
||||||
|
- `create_combinesCloneAndNoSkills` — `["profile", "create", "name", "--clone", "--no-skills"]`.
|
||||||
|
- `create_omitsNoSkillsByDefault` — verifies the v2.7.5 signature still produces the v2.7.5 args.
|
||||||
|
- Manual: pre-v0.13 host — toggle absent. v0.13 host — toggle creates an empty `~/.hermes/profiles/<name>/skills/` (verify on disk).
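The three unit tests above reduce to assertions on the argv that `create` assembles. A minimal sketch, assuming the argv logic is lifted into a free function (`profileCreateArgs` is a hypothetical name; the real method also calls `runAndReload`, which is left out here):

```swift
// Hypothetical pure mirror of the argv assembly in `ProfilesViewModel.create`.
func profileCreateArgs(
    name: String,
    cloneConfig: Bool,
    cloneAll: Bool,
    noSkills: Bool = false
) -> [String] {
    var args = ["profile", "create", name]
    if cloneAll {
        args.append("--clone-all")
    } else if cloneConfig {
        args.append("--clone")
    }
    if noSkills { args.append("--no-skills") }
    return args
}

// create_emitsNoSkillsFlagWhenSet
assert(profileCreateArgs(name: "name", cloneConfig: false, cloneAll: false, noSkills: true)
    == ["profile", "create", "name", "--no-skills"])

// create_combinesCloneAndNoSkills
assert(profileCreateArgs(name: "name", cloneConfig: true, cloneAll: false, noSkills: true)
    == ["profile", "create", "name", "--clone", "--no-skills"])

// create_omitsNoSkillsByDefault: the three-argument call still yields the old argv.
assert(profileCreateArgs(name: "name", cloneConfig: true, cloneAll: false)
    == ["profile", "create", "name", "--clone"])
```

Extracting the argv into a pure function is only one way to make these tests cheap; asserting against a mock file service, as the list describes, checks the same arrays.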
|
||||||
|
|
||||||
|
### Rollout
|
||||||
|
|
||||||
|
- One commit. ~30-50 LOC across 2 files. Smallest of the four additions.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Open questions
|
||||||
|
|
||||||
|
1. **MCP transport names.** The release notes say "SSE transport" and reference "stale pipe transport failures." Scarf's `MCPTransport` enum has `stdio` and `http`; Hermes internally calls those `stdio` and `streamable-http` (or just `http`), and the "pipe" callsite likely refers to internal stdio process pipes, not an additional user-facing transport. We're proceeding on that assumption. **Verify:** read `~/.hermes/hermes-agent/hermes_cli/mcp.py` (or equivalent) during integration to confirm `pipe` is internal-only and not another user-selectable transport alongside `stdio`, `http`, and `sse`.
|
||||||
|
|
||||||
|
2. **`sse_read_timeout` default value.** The plan uses 300s as the placeholder ("default 300"). Hermes v0.13's `_wait_for_lifecycle_event` keepalive cadence may have a different default — could be 60s, could be 600s. Verify in code; the placeholder copy is the only impact.
|
||||||
|
|
||||||
|
3. **`hermes mcp add --transport sse` flag spelling.** The plan assumes `--transport sse` and `--sse-read-timeout <int>`. If Hermes shipped them as `--sse` (boolean) + `--read-timeout`, or merged into `--timeout`, adjust `addMCPServerSSE` accordingly. Test by running `hermes mcp add --help` against a v0.13 install.
|
||||||
|
|
||||||
|
4. **Cron `--no-agent` toggle-off shape on edit.** The plan assumes `hermes cron edit <id> --agent` flips the flag back. Possible Hermes ships only `--no-agent` (one-way) and you must `hermes cron remove` + `cron create` without the flag to undo. If so, the edit-mode toggle should be disabled or render a tooltip "Toggling off requires recreating the job." Verify against `hermes cron edit --help`.
|
||||||
|
|
||||||
|
5. **Cron `--no-agent` + positional prompt argparse.** The plan passes an empty-string positional after `--no-agent <schedule>` to satisfy argparse. Verify whether Hermes's `cron create` parser tolerates a missing prompt positional when `--no-agent` is set.
|
||||||
|
|
||||||
|
6. **Web Tools backend lists.** The plan curates a backend list inline based on the v0.13 release notes mentioning "SearXNG joined search-only." The exact dispatch table (which backends Hermes registers for search vs extract) lives in Hermes source. **Verify** during integration; the Picker contents are the only source of drift, and a wrong entry just produces a `hermes config set` failure on save (recoverable, but ugly).
|
||||||
|
|
||||||
|
7. **`web_tools.backend` legacy key on v0.13 hosts.** Hermes v0.13 may *also* honor the legacy `web_tools.backend` key as a fallback when neither split key is set, or may *only* honor it on the rare combined-capability backends. The plan keeps the field readable but only writes the split keys when `hasWebToolsBackendSplit` is true. Verify Hermes' fallback semantics — if `web_tools.backend` is silently ignored on v0.13, a user upgrading from v0.12 with `web_tools.backend: tavily` would suddenly see DuckDuckGo on both capabilities. We may want to add a one-time migration ("we noticed your config has the legacy `web_tools.backend` — promote to split keys?") in a follow-up.
|
||||||
|
|
||||||
|
8. **Profile `--no-skills` interaction with `--clone-all`.** Plan disables the `noSkills` toggle when `cloneAll` is on. Verify Hermes's behavior when both flags are passed: argparse may reject as mutually exclusive (good — argparse is the source of truth); may take last-flag-wins; or may produce a profile with everything-but-skills cloned (most useful). The disabled-toggle UX is conservative until we know.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Out of scope
|
||||||
|
|
||||||
|
- **MCP per-server SSE auth selection** (Bearer vs OAuth vs none for SSE endpoints). The existing `auth` field on `HermesMCPServer` may or may not carry through to SSE; left untouched. Users can edit the YAML directly via "Open in Editor."
|
||||||
|
- **Cron `--no-agent` health surface.** A watchdog cron that fails silently (script returns non-zero, no LLM to recover) is a meaningful failure mode but the existing `lastError` rendering covers it. No new health check.
|
||||||
|
- **Web Tools per-backend config sheets.** SearXNG host URL, Tavily API key, Brave key — all stay in raw YAML for v2.8. The two backend pickers are the v0.13 wire-format change WS-7 ships; the deeper config UI is a follow-up (plausible v2.9).
|
||||||
|
- **Profiles `--no-skills` post-create surface.** No UI to list a profile's skill count, no "convert to skill-less" verb. Profiles stay create-time-only for skill scoping.
|
||||||
|
- **iOS surfaces.** All four additions are Mac-only:
|
||||||
|
- MCP SSE: Scarf has no iOS MCP servers UI today.
|
||||||
|
- Cron `--no-agent`: iOS Cron is read-only (`Scarf iOS/Cron/CronListView.swift`); no editor.
|
||||||
|
- Web Tools: iOS Settings doesn't currently surface Web Tools.
|
||||||
|
- Profiles `--no-skills`: iOS Profiles is read-only (`Scarf iOS/Profiles/ProfilesView.swift`).
|
||||||
|
iOS catch-up is WS-9 territory.
|
||||||
|
- **Wiki updates.** Per CLAUDE.md, wiki updates land alongside the release once the feature is shipped, not pre-merge. The WS-7 PR description lists the wiki pages that will need updating: `Scarf-Settings.md`, `Scarf-Cron.md`, `Scarf-MCP-Servers.md`, `Scarf-Profiles.md`, and `Hermes-Version-Compatibility.md`. The wiki PR is its own commit on `gh-pages` after v2.8.0 ships.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Estimate
|
||||||
|
|
||||||
|
| Section | LOC est. | Files | Risk |
|
||||||
|
|---------|----------|-------|------|
|
||||||
|
| 1. MCP SSE | 250-350 | 6 (model + parser + view × 2 + VM + editor) | Medium — YAML parser change is the riskiest |
|
||||||
|
| 2. Cron `--no-agent` | 150-200 | 4 (model + view + editor + VM) | Low — mirrors v0.12 `--workdir` pattern |
|
||||||
|
| 3. Web Tools split | 200-260 | 5 (1 new tab + config model + parser + VM + tabs enum) | Medium — backend lists need verification against Hermes source |
|
||||||
|
| 4. Profiles `--no-skills` | 30-50 | 2 (view + VM) | Trivial |
|
||||||
|
| **Total** | **~700-900** | **~17 unique files** | |
|
||||||
|
|
||||||
|
**Time estimate (single dev, focused):** 2-3 days of implementation + 1 day of integration verification (the Open Questions section is mostly small empirical checks against a v0.13 Hermes install). The four additions touch disjoint files, so two devs could parallelize after the model-layer work in §1 + §2 + §3 lands.
|
||||||
|
|
||||||
|
**Commit shape inside the WS-7 PR (one PR, four commits):**
|
||||||
|
|
||||||
|
1. `feat(mcp): add SSE transport support gated on hasMCPSSETransport`
|
||||||
|
2. `feat(cron): add --no-agent watchdog toggle gated on hasCronNoAgent`
|
||||||
|
3. `feat(settings): add Web Tools tab with v0.13 search/extract split`
|
||||||
|
4. `feat(profiles): add --no-skills toggle to create-profile sheet`
|
||||||
|
|
||||||
|
Reviewer can scan one commit at a time, and each can be reverted independently if a v0.13 wire-format surprise lands during integration.
|
||||||
@@ -0,0 +1,607 @@
|
|||||||
|
# WS-8 Plan: UX polish (v0.13 small-surface additions)
|
||||||
|
|
||||||
|
Branch suggestion: `ws-8-ux-v0.13`. Depends on WS-1 (`ws-1-capabilities-v0.13`, PR #80) for the v0.13 capability flags consumed below — every change here is a leaf surface that reads from `HermesCapabilities` and degrades silently on pre-v0.13 hosts.
|
||||||
|
|
||||||
|
## Goals (what this PR ships)
|
||||||
|
|
||||||
|
Six small, mostly-independent UX additions tracking the v0.13 release notes' "everything else" bucket:
|
||||||
|
|
||||||
|
1. **Context compression count chip** in the chat status bar — `🗜 ×N` rendered alongside the existing token counter when Hermes' status feed surfaces a non-zero compression count.
|
||||||
|
2. **`/new <name>` argument hint** on the slash menu — extends `argumentHint` for the `/new` entry on v0.13+ hosts so users discover the optional name.
|
||||||
|
3. **`hermes update --yes` plumbing** — purely forward-compatible. v2.7.5 has no in-app "Update Hermes" affordance (Sparkle handles Scarf-self-update, and `hermes update` is invoked by users in their terminal). This WS adds a stub helper on `UpdaterService` (or a new `HermesUpdaterCommandBuilder` static) that the future affordance will call; the helper takes a `HermesCapabilities` and decides whether to append `--yes`. No user-visible change ships in v2.8 from this item alone — see [Out of scope](#out-of-scope).
|
||||||
|
4. **Redaction default-flip awareness** — the existing "Redact secrets in patches" toggle in `Settings → Advanced → Caching & Redaction` gets a hint footnote whose copy depends on the connected host's version (server default flipped from OFF in v0.12 → ON in v0.13).
|
||||||
|
5. **`display.language` picker** in Settings → General → Locale: eight language codes (`en` / `zh` / `ja` / `de` / `es` / `fr` / `uk` / `tr`) plus an empty "follow Hermes default" entry, persisted via `hermes config set display.language <code>`.
|
||||||
|
6. **xAI Custom Voices badge** next to the xAI TTS provider entry in Settings → Voice → Text-to-Speech (and `xai` added to the provider list — it's not currently there).
|
||||||
|
|
||||||
|
Out-of-scope items captured in [Out of scope](#out-of-scope).
|
||||||
|
|
||||||
|
## 1. Context compression count
|
||||||
|
|
||||||
|
### What v0.13 emits
|
||||||
|
|
||||||
|
Hermes v0.13 adds a context compression count to the status feed shown in the CLI / TUI. The release notes phrase it as "Show context compression count in status bar" — they don't pin the wire field name. See [Open question Q1](#open-questions) — the plan below assumes it lands on the existing `usage` blob in `session/prompt`'s response and that it's a monotonically-incrementing integer counting how many auto-compactions have run on the active session. This matches the structure of the existing token counters (also on `usage`) and means a single small extension to `ACPPromptResult` covers it.
|
||||||
|
|
||||||
|
### Files to change
|
||||||
|
|
||||||
|
#### [scarf/Packages/ScarfCore/Sources/ScarfCore/Models/ACPMessages.swift](../../Packages/ScarfCore/Sources/ScarfCore/Models/ACPMessages.swift)
|
||||||
|
|
||||||
|
`ACPPromptResult` (around line 240) gains one optional field:
|
||||||
|
|
||||||
|
```swift
|
||||||
|
public struct ACPPromptResult: Sendable {
|
||||||
|
public let stopReason: String
|
||||||
|
public let inputTokens: Int
|
||||||
|
public let outputTokens: Int
|
||||||
|
public let thoughtTokens: Int
|
||||||
|
public let cachedReadTokens: Int
|
||||||
|
/// Number of automatic context compactions Hermes has performed on
|
||||||
|
/// this session so far. v0.13+ — older Hermes hosts always return 0,
|
||||||
|
/// which the chat status bar treats as "hide chip". Optional in the
|
||||||
|
/// wire payload; folded into a non-optional `Int` here with a 0
|
||||||
|
/// default so the rest of the pipeline doesn't need to nil-check.
|
||||||
|
public let compressionCount: Int
|
||||||
|
|
||||||
|
public init(
|
||||||
|
stopReason: String,
|
||||||
|
inputTokens: Int,
|
||||||
|
outputTokens: Int,
|
||||||
|
thoughtTokens: Int,
|
||||||
|
cachedReadTokens: Int,
|
||||||
|
compressionCount: Int = 0
|
||||||
|
) { … }
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Default-zero on the initializer keeps existing call sites compiling; the only call site that passes a real value is `ACPClient.sendPrompt`.
|
||||||
|
|
||||||
|
#### [scarf/Packages/ScarfCore/Sources/ScarfCore/ACP/ACPClient.swift](../../Packages/ScarfCore/Sources/ScarfCore/ACP/ACPClient.swift)
|
||||||
|
|
||||||
|
`sendPrompt` (around line 311–322) gains one decode line. The exact key is the open question, so decode tolerantly:
|
||||||
|
|
||||||
|
```swift
|
||||||
|
let usage = dict["usage"] as? [String: Any] ?? [:]
|
||||||
|
// Tolerate either snake_case or camelCase per the rest of the ACP
|
||||||
|
// payload's mixed conventions; whichever Hermes ships, we read.
|
||||||
|
let compression = (usage["compressionCount"] as? Int)
|
||||||
|
?? (usage["compression_count"] as? Int)
|
||||||
|
?? 0
|
||||||
|
```
|
||||||
|
|
||||||
|
Pass `compressionCount: compression` into the `ACPPromptResult` initializer.
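The tolerant decode can be checked in isolation. The sketch below wraps the same lookup in a free function (`decodeCompressionCount` is a hypothetical name, and both key spellings remain guesses until Open question Q1 is resolved):

```swift
// Hypothetical helper: reads the count from a decoded `session/prompt`
// response, accepting either key spelling and defaulting to 0.
func decodeCompressionCount(from response: [String: Any]) -> Int {
    let usage = response["usage"] as? [String: Any] ?? [:]
    return (usage["compressionCount"] as? Int)
        ?? (usage["compression_count"] as? Int)
        ?? 0
}

let camel: [String: Any] = ["compressionCount": 3]
let snake: [String: Any] = ["compression_count": 2]
let wrongType: [String: Any] = ["compressionCount": "3"]

assert(decodeCompressionCount(from: ["usage": camel]) == 3)
assert(decodeCompressionCount(from: ["usage": snake]) == 2)
// A non-integer value or a missing `usage` blob both fall through to 0.
assert(decodeCompressionCount(from: ["usage": wrongType]) == 0)
assert(decodeCompressionCount(from: [:]) == 0)
```

Falling through to 0 on a type mismatch keeps a malformed payload from surfacing as a chip, at the cost of hiding the mismatch; logging it would be a reasonable addition.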
|
||||||
|
|
||||||
|
#### [scarf/Packages/ScarfCore/Sources/ScarfCore/ViewModels/RichChatViewModel.swift](../../Packages/ScarfCore/Sources/ScarfCore/ViewModels/RichChatViewModel.swift)
|
||||||
|
|
||||||
|
Add an observable counter alongside the existing token counters (around line 228–231):
|
||||||
|
|
||||||
|
```swift
|
||||||
|
public private(set) var acpCompressionCount = 0
|
||||||
|
```
|
||||||
|
|
||||||
|
Reset to 0 in `reset()` (around line 464–470) alongside the token counters.
|
||||||
|
|
||||||
|
In `handlePromptComplete` (around line 810–813), the same place that aggregates ACP token counts, take the latest server value instead of adding it to a running sum:
|
||||||
|
|
||||||
|
```swift
|
||||||
|
acpInputTokens += response.inputTokens
|
||||||
|
acpOutputTokens += response.outputTokens
|
||||||
|
acpThoughtTokens += response.thoughtTokens
|
||||||
|
acpCachedReadTokens += response.cachedReadTokens
|
||||||
|
// Compression count is a session-wide running total emitted by Hermes;
|
||||||
|
// each prompt response carries the latest value, so we keep the max rather
|
||||||
|
// than accumulate. Treat 0 as "no compactions yet" — the view hides
|
||||||
|
// the chip in that case.
|
||||||
|
acpCompressionCount = max(acpCompressionCount, response.compressionCount)
|
||||||
|
```
|
||||||
|
|
||||||
|
The `max(...)` guard keeps the chip from regressing when a response omits the field (decoded as `0`) mid-session, for example from a pre-v0.13 host. If the agent is upgraded server-side without restarting Scarf, the count picks up the real value the next time `usage` carries one.
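A small model of that update rule, assuming the counter is tracked on its own (`CompressionTracker` is a hypothetical stand-in for the relevant slice of `RichChatViewModel`):

```swift
// Hypothetical stand-in for the compression-count state on the view model.
struct CompressionTracker {
    private(set) var count = 0

    // Keep the highest value seen; a 0 from a response that omits the
    // field never lowers the displayed count.
    mutating func record(_ reported: Int) {
        count = max(count, reported)
    }

    // Mirrors `reset()`: a new session starts from zero.
    mutating func reset() {
        count = 0
    }
}

var tracker = CompressionTracker()
for reported in [0, 2, 0, 3] {
    tracker.record(reported)
}
assert(tracker.count == 3)

tracker.reset()
assert(tracker.count == 0)
```

The trade-off is that the count can only fall through `reset()`, so if Hermes ever restarts its counter within a session the chip would keep showing the older, higher value.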
|
||||||
|
|
||||||
|
#### [scarf/scarf/Features/Chat/Views/SessionInfoBar.swift](../../scarf/Features/Chat/Views/SessionInfoBar.swift)
|
||||||
|
|
||||||
|
Add one more pass-through prop alongside the existing `acpInputTokens` / `acpOutputTokens` / `acpThoughtTokens` (lines 9–11):
|
||||||
|
|
||||||
|
```swift
|
||||||
|
var acpCompressionCount: Int = 0
|
||||||
|
/// Capability snapshot for v0.13 surfaces. Defaulted so previews and
|
||||||
|
/// pre-v0.13 hosts render the v2.7.5 layout unchanged.
|
||||||
|
var capabilities: HermesCapabilities = .empty
|
||||||
|
```
|
||||||
|
|
||||||
|
Inside the `body` `HStack`, after the reasoning-tokens label and before the cost label, render the compression chip:
|
||||||
|
|
||||||
|
```swift
|
||||||
|
if capabilities.hasContextCompressionCount && acpCompressionCount > 0 {
|
||||||
|
Label("×\(acpCompressionCount)", systemImage: "arrow.down.right.and.arrow.up.left")
|
||||||
|
.scarfStyle(.caption)
|
||||||
|
.foregroundStyle(ScarfColor.foregroundMuted)
|
||||||
|
.help("Hermes auto-compacted this session's context \(acpCompressionCount) time\(acpCompressionCount == 1 ? "" : "s")")
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Notes on the visual: stick to existing `Label` + `scarfStyle(.caption)` + `ScarfColor.foregroundMuted` so the chip blends with the other counters. **Don't** invent a new `ScarfBadge` style: the row is already badge-like via the surrounding `.padding(.horizontal, ScarfSpace.s4)` background, and ScarfBadge would visually overpower a passive count. Icon: `arrow.down.right.and.arrow.up.left`, an SF Symbol that reads as "compress". If the symbol doesn't render on macOS 14.6, which we deploy to, fall back to a Unicode box-drawing glyph or `archivebox.fill`; flag as a follow-up rather than picking now.
|
||||||
|
|
||||||
|
#### [scarf/scarf/Features/Chat/Views/ChatTranscriptPane.swift](../../scarf/Features/Chat/Views/ChatTranscriptPane.swift)
|
||||||
|
|
||||||
|
Plumb the new field plus the env-resolved capabilities through to `SessionInfoBar`:
|
||||||
|
|
||||||
|
```swift
|
||||||
|
SessionInfoBar(
|
||||||
|
session: richChat.currentSession,
|
||||||
|
isWorking: richChat.isGenerating,
|
||||||
|
acpInputTokens: richChat.acpInputTokens,
|
||||||
|
acpOutputTokens: richChat.acpOutputTokens,
|
||||||
|
acpThoughtTokens: richChat.acpThoughtTokens,
|
||||||
|
acpCompressionCount: richChat.acpCompressionCount,
|
||||||
|
projectName: chatViewModel.currentProjectName,
|
||||||
|
gitBranch: chatViewModel.currentGitBranch,
|
||||||
|
capabilities: capabilities?.capabilities ?? .empty
|
||||||
|
)
|
||||||
|
```
|
||||||
|
|
||||||
|
Pull the capabilities from the existing `@Environment(\.hermesCapabilities)` (declared on the parent view tree per [HermesCapabilities.swift:411](../../Packages/ScarfCore/Sources/ScarfCore/Services/HermesCapabilities.swift)). If the pane doesn't currently observe it, add `@Environment(\.hermesCapabilities) private var capabilities` at the top.
|
||||||
|
|
||||||
|
#### iOS
|
||||||
|
|
||||||
|
`Scarf iOS` doesn't have a `SessionInfoBar` mirror today; the iOS chat tab uses a different header. Skip iOS in this WS — capture under [Out of scope](#out-of-scope) for follow-up. Reasoning: iOS users are read-only consumers of compression count, the data model already flows through `RichChatViewModel`, and an iOS surface isn't gated on this WS.
|
||||||
|
|
||||||
|
### Coordination with WS-2
|
||||||
|
|
||||||
|
WS-2 mounts a "Goal locked" pill into `SessionInfoBar` between the project / branch chips and the working dot. The compression chip lives on the **right** half of the bar (next to tokens / cost), not the left, so the two changes don't collide spatially. They both add `var capabilities: HermesCapabilities = .empty` to `SessionInfoBar`, however — pick the same parameter name and order so whichever WS lands first establishes the prop and the second WS just reads it. WS-2 is presumed to land first (WS-2 is a flagship feature, this is polish); if not, both WSs need to add the prop and the merger should keep one declaration.
|
||||||
|
|
||||||
|
## 2. `/new <name>` slash command argument
|
||||||
|
|
||||||
|
### Current state
|
||||||
|
|
||||||
|
`/new` already appears in the slash menu; it's advertised by the ACP server via `available_commands_update` (handled in [RichChatViewModel:234](../../Packages/ScarfCore/Sources/ScarfCore/ViewModels/RichChatViewModel.swift) into `acpCommands`). The argumentHint comes from whatever the server sends. That means a v0.13 server should surface the updated hint *automatically*, assuming Hermes sends `"argument_hint": "[name]"` (or similar) once the new argument lands. We don't need to hardcode a Scarf-side override.
|
||||||
|
|
||||||
|
### What we change
|
||||||
|
|
||||||
|
The user-visible work here is mostly verification / smoke-testing. The mechanical changes are minor, mostly defensive:
|
||||||
|
|
||||||
|
#### [scarf/scarf/Features/Chat/Views/SlashCommandMenu.swift](../../scarf/Features/Chat/Views/SlashCommandMenu.swift)
|
||||||
|
|
||||||
|
The argument hint renderer at line 89–93 wraps the hint in `<…>` literally. Hermes v0.13 likely emits the optional argument as `[name]` (square-bracket convention for "optional"). If we leave the wrapper in place we'd render `<[name]>`. Replace the wrapper with a smarter join:
|
||||||
|
|
||||||
|
```swift
|
||||||
|
if let hint = command.argumentHint {
|
||||||
|
let display = hint.hasPrefix("<") || hint.hasPrefix("[")
|
||||||
|
? hint
|
||||||
|
: "<\(hint)>"
|
||||||
|
Text(display)
|
||||||
|
.font(ScarfFont.monoSmall)
|
||||||
|
.foregroundStyle(ScarfColor.foregroundFaint)
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
This way the server's chosen brackets pass through, and existing entries that send `guidance` (without brackets) still render `<guidance>`.
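The bracket rule is easy to pin down as a pure function. A sketch, with `displayHint` as a hypothetical name for logic the plan keeps inline in the view:

```swift
// Pass through hints that already carry brackets; wrap bare hints in <…>.
func displayHint(_ hint: String) -> String {
    hint.hasPrefix("<") || hint.hasPrefix("[") ? hint : "<\(hint)>"
}

assert(displayHint("[name]") == "[name]")        // optional-argument style
assert(displayHint("<file>") == "<file>")        // already wrapped
assert(displayHint("guidance") == "<guidance>")  // bare hint gets wrapped
```

Only the leading character is inspected, so a hint such as `name [--force]` would still be wrapped as a whole; whether that matters depends on what Hermes actually sends.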
|
||||||
|
|
||||||
|
#### Capability gate (none required, but a help-text override is allowed)
|
||||||
|
|
||||||
|
We *could* gate the rendering behind `hasNewWithSessionName` and override the hint only on v0.13+ — but the agent is the source of truth for the hint, and pre-v0.13 will send no hint at all (or the old hint). Leaving the renderer un-gated and trusting the agent's value is simpler and forward-compatible. **No flag check at this site.**
|
||||||
|
|
||||||
|
The flag exists for one place: a small banner in the slash menu that says "Tip: `/new <name>` lets you label the next session" on v0.13+ if the user hovers `/new` for >1s. **Defer the tip — over-engineering for one slash command.** Capture under [Out of scope](#out-of-scope).
|
||||||
|
|
||||||
|
### Coordination with WS-2
|
||||||
|
|
||||||
|
WS-2 also touches the slash menu (adds `/goal` and `/queue` to `nonInterruptiveCommands`), but only at the `RichChatViewModel.nonInterruptiveCommands` array site. This WS doesn't touch that array — only the renderer. Independent.
|
||||||
|
|
||||||
|
## 3. `hermes update --yes` plumbing
|
||||||
|
|
||||||
|
### Current state
|
||||||
|
|
||||||
|
There is **no in-app `hermes update` affordance** in v2.7.5. `UpdaterService` ([scarf/Core/Services/UpdaterService.swift](../../scarf/Core/Services/UpdaterService.swift)) wraps Sparkle for Scarf-self-update — that's a separate concern from updating the Hermes binary. The `hermes update` subcommand (added in v0.12 with `--check`, extended in v0.13 with `--yes`) is currently invoked by users in their terminal. The comment at [scarfApp.swift:281](../../scarf/scarfApp.swift) ("explicit refresh after `hermes update`") is aspirational — there's no UI that invokes `hermes update`.
|
||||||
|
|
||||||
|
### What this WS adds
|
||||||
|
|
||||||
|
A small forward-compatible utility so the future "Update Hermes" affordance (queued for a later release) doesn't have to re-derive flag selection. Add a single static helper on either `HermesUpdaterCommandBuilder` (new, in ScarfCore) or as a static on `UpdaterService` (Mac-only). Picking ScarfCore so iOS gets it for free, even though iOS won't ship the affordance soon either:
|
||||||
|
|
||||||
|
#### [scarf/Packages/ScarfCore/Sources/ScarfCore/Services/HermesUpdaterCommandBuilder.swift](../../Packages/ScarfCore/Sources/ScarfCore/Services/HermesUpdaterCommandBuilder.swift) (NEW)
|
||||||
|
|
||||||
|
```swift
|
||||||
|
import Foundation
|
||||||
|
|
||||||
|
/// Pure helpers that build argv arrays for `hermes update` invocations.
|
||||||
|
/// Lives here so the eventual UI surface (Mac / iOS / remote) shares
|
||||||
|
/// flag selection. Each helper is a `nonisolated static` pure function
|
||||||
|
/// — no transport, no MainActor, no mocking surface required.
|
||||||
|
public enum HermesUpdaterCommandBuilder {
|
||||||
|
/// Argv for a `hermes update` run. Pre-v0.12 hosts only had `update`;
|
||||||
|
/// v0.12+ accepts `--check` for preflight; v0.13+ accepts `--yes` /
|
||||||
|
/// `-y` for unattended runs.
|
||||||
|
public static func updateArgv(
|
||||||
|
capabilities: HermesCapabilities,
|
||||||
|
unattended: Bool,
|
||||||
|
checkOnly: Bool
|
||||||
|
) -> [String] {
|
||||||
|
var args: [String] = ["update"]
|
||||||
|
if checkOnly && capabilities.hasUpdateCheck {
|
||||||
|
args.append("--check")
|
||||||
|
}
|
||||||
|
if unattended && capabilities.hasUpdateNonInteractive {
|
||||||
|
args.append("--yes")
|
||||||
|
}
|
||||||
|
return args
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Test target: a small `M0eUpdaterTests` suite (new file under `ScarfCoreTests`) covering the matrix:
|
||||||
|
|
||||||
|
- pre-v0.12 → `["update"]` regardless of flags
|
||||||
|
- v0.12 + checkOnly → `["update", "--check"]`
|
||||||
|
- v0.12 + unattended → `["update"]` (flag absent — host can't honor it)
|
||||||
|
- v0.13 + unattended → `["update", "--yes"]`
|
||||||
|
- v0.13 + checkOnly + unattended → `["update", "--check", "--yes"]`
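The matrix above can be written out as plain assertions. The sketch below is self-contained, so it substitutes a hypothetical `StubCapabilities` struct for `HermesCapabilities` and repeats the builder logic as a free function; only the two flags the builder reads are modeled.

```swift
// Stand-in for the two `HermesCapabilities` flags the builder consults.
struct StubCapabilities {
    var hasUpdateCheck: Bool
    var hasUpdateNonInteractive: Bool
}

// Same flag selection as `HermesUpdaterCommandBuilder.updateArgv`.
func updateArgv(capabilities: StubCapabilities, unattended: Bool, checkOnly: Bool) -> [String] {
    var args = ["update"]
    if checkOnly && capabilities.hasUpdateCheck { args.append("--check") }
    if unattended && capabilities.hasUpdateNonInteractive { args.append("--yes") }
    return args
}

let preV012 = StubCapabilities(hasUpdateCheck: false, hasUpdateNonInteractive: false)
let v012 = StubCapabilities(hasUpdateCheck: true, hasUpdateNonInteractive: false)
let v013 = StubCapabilities(hasUpdateCheck: true, hasUpdateNonInteractive: true)

assert(updateArgv(capabilities: preV012, unattended: true, checkOnly: true) == ["update"])
assert(updateArgv(capabilities: v012, unattended: false, checkOnly: true) == ["update", "--check"])
assert(updateArgv(capabilities: v012, unattended: true, checkOnly: false) == ["update"])
assert(updateArgv(capabilities: v013, unattended: true, checkOnly: false) == ["update", "--yes"])
assert(updateArgv(capabilities: v013, unattended: true, checkOnly: true) == ["update", "--check", "--yes"])
```

Silently dropping a flag the host cannot honor means a caller asking for `unattended` on v0.12 gets an interactive run; a future UI surface may prefer to check the capability first and disable the option instead.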
|
||||||
|
|
||||||
|
### What this WS does NOT add
|
||||||
|
|
||||||
|
No UI surface. No menu item, no Settings row, no command-palette entry. The plumbing exists so when v2.9 / v3.0 adds the affordance it doesn't need to derive flag logic from scratch. Per the WS-8 prompt: "If no such surface exists in v2.7.5, the v0.13 flag is forward-compat plumbing only — note that and don't over-build."
|
||||||
|
|
||||||
|
### Coordination with WS-2
|
||||||
|
|
||||||
|
None. Different files.
|
||||||
|
|
||||||
|
## 4. Redaction default-flip awareness
|
||||||
|
|
||||||
|
### Current state
|
||||||
|
|
||||||
|
The toggle lives in [scarf/Features/Settings/Views/Tabs/AdvancedTab.swift:129–133](../../scarf/Features/Settings/Views/Tabs/AdvancedTab.swift), inside the `Caching & Redaction` section. It's wired through `viewModel.config.redactionEnabled` ↔ `redaction.enabled`. The default for the *Scarf-side* `bool("redaction.enabled", default: false)` at [HermesFileService.swift:315](../../scarf/Core/Services/HermesFileService.swift) is `false` — meaning when the YAML key is absent, Scarf reads the toggle as off. That matches v0.12 server behavior.
|
||||||
|
|
||||||
|
In v0.13 the *server-side* default flips to ON (Hermes treats absence-of-key as redaction-enabled). Scarf's read default at the line above stays `false` because that's what we display when the user hasn't explicitly set the key — but the *meaning* of "off-with-no-key" diverges:
|
||||||
|
|
||||||
|
- pre-v0.13 host + no key → Scarf shows OFF, server treats as OFF. Honest.
|
||||||
|
- v0.13 host + no key → Scarf shows OFF, server treats as ON. **Confusing.**
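The divergence can be stated as two small functions over an optional config value, where `nil` stands for "key absent". Both function names are hypothetical, and the server-side rule is the one described above rather than something verified against Hermes source:

```swift
// What Scarf's toggle displays: absent key reads as OFF.
func scarfShowsRedactionOn(configValue: Bool?) -> Bool {
    configValue ?? false
}

// What the server does, per the description above: absent key follows
// the host default, which is OFF before v0.13 and ON from v0.13.
func serverRedacts(configValue: Bool?, hostIsV013: Bool) -> Bool {
    configValue ?? hostIsV013
}

// Pre-v0.13 host, no key: display and behavior agree.
assert(scarfShowsRedactionOn(configValue: nil) == serverRedacts(configValue: nil, hostIsV013: false))

// v0.13 host, no key: the toggle shows OFF while the server redacts.
assert(scarfShowsRedactionOn(configValue: nil) == false)
assert(serverRedacts(configValue: nil, hostIsV013: true) == true)

// An explicit key always agrees, on either host version.
assert(scarfShowsRedactionOn(configValue: false) == serverRedacts(configValue: false, hostIsV013: true))
assert(scarfShowsRedactionOn(configValue: true) == serverRedacts(configValue: true, hostIsV013: false))
```

The mismatch exists only in the "v0.13 host, key absent" cell, which is the case the hint copy has to cover.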
|
||||||
|
|
||||||
|
### What we change — option A (recommended): hint copy only
|
||||||
|
|
||||||
|
Smallest possible surface. Don't change the parsing default; the file ground-truth is "key absent". Add a one-line hint below the toggle whose copy depends on whether the connected host is v0.13+. Any v0.13 capability flag works as the discriminant, so reuse one rather than adding `hasV013` to `HermesCapabilities`. The plan picks `hasGoals` since it's the most central v0.13 flag, but that's an aesthetic choice; every v0.13 flag (`hasContextCompressionCount` included) discriminates the same set of hosts.
|
||||||
|
|
||||||
|
#### [scarf/scarf/Features/Settings/Views/Tabs/AdvancedTab.swift](../../scarf/Features/Settings/Views/Tabs/AdvancedTab.swift)
|
||||||
|
|
||||||
|
Inside `v012CachingSection`'s `SettingsSection` (around line 122–139), after the `ToggleRow` for `redaction.enabled`, append a `HintRow` (or whatever the existing inline-hint pattern in that file is — likely just a `Text` wrapped in a styled `HStack` matching the `credentialsHint` shape from `GeneralTab`):
|
||||||
|
|
||||||
|
```swift
|
||||||
|
ToggleRow(
|
||||||
|
label: "Redact secrets in patches",
|
||||||
|
isOn: viewModel.config.redactionEnabled
|
||||||
|
) { viewModel.setSetting("redaction.enabled", value: $0 ? "true" : "false") }
|
||||||
|
|
||||||
|
redactionDefaultsHint
|
||||||
|
```
|
||||||
|
|
||||||
|
…and add the computed view:
|
||||||
|
|
||||||
|
```swift
|
||||||
|
@Environment(\.hermesCapabilities) private var capabilitiesStore
|
||||||
|
|
||||||
|
@ViewBuilder
|
||||||
|
private var redactionDefaultsHint: some View {
|
||||||
|
let v013 = capabilitiesStore?.capabilities.hasGoals == true
|
||||||
|
HStack {
|
||||||
|
Text("")
|
||||||
|
.frame(width: 160, alignment: .trailing)
|
||||||
|
Text(v013
|
||||||
|
? "Recommended: ON. Hermes v0.13+ defaults to redacting secrets unless you opt out."
|
||||||
|
: "Default OFF in Hermes v0.12. Toggle ON to redact secrets in logs and shares.")
|
||||||
|
.scarfStyle(.caption)
|
||||||
|
.foregroundStyle(ScarfColor.foregroundFaint)
|
||||||
|
Spacer()
|
||||||
|
}
|
||||||
|
.padding(.horizontal, 12)
|
||||||
|
.padding(.vertical, 4)
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
The aligned-right empty `Text` mimics the label-column gutter so the hint tucks under the toggle's value column rather than aligning with the section's left edge — matches the existing visual rhythm in this tab.
|
||||||
|
|
||||||
|
### Why option A and not option B (changing the parsing default)
|
||||||
|
|
||||||
|
Option B would be: read `bool("redaction.enabled", default: capabilities.hasGoals)`. That sounds nicer but wires capabilities into `HermesFileService.parseConfig`, which is currently `nonisolated` and pure. Threading the store through would touch a dozen call sites. Not worth it for a hint that's already accurate via option A.
|
||||||
|
|
||||||
|
### Coordination with WS-2
|
||||||
|
|
||||||
|
None. Different file, different section.
|
||||||
|
|
||||||
|
## 5. `display.language` picker
|
||||||
|
|
||||||
|
### What v0.13 adds
|
||||||
|
|
||||||
|
Hermes v0.13 honors `display.language` in `config.yaml` for static-message translations. Supported values: `en` (default), `zh`, `ja`, `de`, `es`, `fr`, `uk`, `tr`. Users can already write the YAML by hand; this WS adds an in-app picker so it's discoverable.
|
||||||
|
|
||||||
|
### Files to change
|
||||||
|
|
||||||
|
#### [scarf/Packages/ScarfCore/Sources/ScarfCore/Models/HermesConfig.swift](../../Packages/ScarfCore/Sources/ScarfCore/Models/HermesConfig.swift)
|
||||||
|
|
||||||
|
`DisplaySettings` (around line 30) gains one field:
|
||||||
|
|
||||||
|
```swift
|
||||||
|
public struct DisplaySettings: Sendable, Equatable {
|
||||||
|
public var skin: String
|
||||||
|
public var compact: Bool
|
||||||
|
public var resumeDisplay: String
|
||||||
|
public var bellOnComplete: Bool
|
||||||
|
public var inlineDiffs: Bool
|
||||||
|
public var toolProgressCommand: Bool
|
||||||
|
public var toolPreviewLength: Int
|
||||||
|
public var busyInputMode: String
|
||||||
|
/// Static-message translation language. v0.13+. Empty string means
|
||||||
|
/// "follow Hermes default" — we display this as `en` in the picker.
|
||||||
|
/// Persisted via `hermes config set display.language <code>`.
|
||||||
|
public var language: String
|
||||||
|
…
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Add to the initializer (with a default empty-string value and a fall-through assignment) and to the `.empty` static. **Don't** default to `"en"` here — empty string means "config key absent", which is semantically distinct from "user explicitly chose en". The picker collapses both to "English" in display, but the writer only writes a value when the user picks something.
|
||||||
|
|
||||||
|
#### [scarf/scarf/Core/Services/HermesFileService.swift](../../scarf/Core/Services/HermesFileService.swift)
|
||||||
|
|
||||||
|
Inside the `display` block construction (around lines 79–84), add:
|
||||||
|
|
||||||
|
```swift
let display = DisplaySettings(
    skin: str("display.skin", default: "default"),
    compact: bool("display.compact", default: false),
    resumeDisplay: str("display.resume_display", default: "full"),
    bellOnComplete: bool("display.bell_on_complete", default: false),
    inlineDiffs: bool("display.inline_diffs", default: true),
    toolProgressCommand: bool("display.tool_progress_command", default: false),
    toolPreviewLength: int("display.tool_preview_length", default: 0),
    busyInputMode: str("display.busy_input_mode", default: "interrupt"),
    language: str("display.language", default: "")
)
```
|
||||||
|
|
||||||
|
#### [scarf/scarf/Features/Settings/ViewModels/SettingsViewModel.swift](../../scarf/Features/Settings/ViewModels/SettingsViewModel.swift)
|
||||||
|
|
||||||
|
Add a setter alongside the existing `setSkin` (line 99):
|
||||||
|
|
||||||
|
```swift
func setDisplayLanguage(_ value: String) {
    setSetting("display.language", value: value)
}
```
|
||||||
|
|
||||||
|
And expose the option list (nine entries: the eight v0.13 language codes plus an empty-string "default" entry; mirror the v0.13 release notes):
|
||||||
|
|
||||||
|
```swift
var displayLanguages: [(code: String, label: String)] = [
    ("", "English (default)"),
    ("en", "English"),
    ("zh", "中文 (Chinese)"),
    ("ja", "日本語 (Japanese)"),
    ("de", "Deutsch (German)"),
    ("es", "Español (Spanish)"),
    ("fr", "Français (French)"),
    ("uk", "Українська (Ukrainian)"),
    ("tr", "Türkçe (Turkish)"),
]
```
|
||||||
|
|
||||||
|
Having two "English" entries (empty string + `en`) is intentional: the empty string means "no key", while picking `en` writes the key explicitly. UX-wise that's fine: the picker shows "English (default)" while the stored value is still empty, and switching to `en` writes a key. Most users will move between languages, not toggle the key's presence.
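
The label lookup the picker relies on can be sketched in isolation. This is a minimal sketch, not the real view model: `displayLanguagesSketch` and `pickerLabel(forStoredCode:)` are hypothetical names, and the list is shortened to three entries for illustration.

```swift
// Hypothetical, shortened stand-in for `SettingsViewModel.displayLanguages`.
let displayLanguagesSketch: [(code: String, label: String)] = [
    ("", "English (default)"),   // key absent in config.yaml
    ("en", "English"),           // key explicitly set to "en"
    ("zh", "中文 (Chinese)"),
]

/// Maps a stored `display.language` value to the label shown in the picker.
/// Unknown codes fall back to the raw code so a hand-edited value stays visible.
func pickerLabel(forStoredCode code: String) -> String {
    displayLanguagesSketch.first { $0.code == code }?.label ?? code
}
```

The empty string and `en` resolve to different labels, which is what keeps "key absent" distinguishable from "user explicitly chose en" in the UI.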
|
||||||
|
|
||||||
|
#### [scarf/scarf/Features/Settings/Views/Tabs/GeneralTab.swift](../../scarf/Features/Settings/Views/Tabs/GeneralTab.swift)
|
||||||
|
|
||||||
|
Inside the existing `Locale` section (lines 40–42), add a picker row gated on `hasDisplayLanguage`:
|
||||||
|
|
||||||
|
```swift
SettingsSection(title: "Locale", icon: "globe.americas") {
    EditableTextField(label: "Timezone (IANA)", value: viewModel.config.timezone) {
        viewModel.setTimezone($0)
    }
    if capabilitiesStore?.capabilities.hasDisplayLanguage == true {
        PickerRow(
            label: "Display language",
            // Empty string already means "key absent", so pass the value through as-is.
            selection: viewModel.config.display.language,
            options: viewModel.displayLanguages.map(\.code),
            optionLabel: { code in
                viewModel.displayLanguages.first { $0.code == code }?.label ?? code
            }
        ) { viewModel.setDisplayLanguage($0) }
    }
}
```
|
||||||
|
|
||||||
|
Add `@Environment(\.hermesCapabilities) private var capabilitiesStore` at the top of `GeneralTab`.
|
||||||
|
|
||||||
|
The `PickerRow` overload that takes an `optionLabel:` mapper may not exist today. Check at implementation time, and if it doesn't, either (a) add the overload to `PickerRow.swift` (a simple closure parameter), or (b) inline a SwiftUI `Picker` directly rather than `PickerRow` for this one row. Option (a) is preferred so the rest of Settings can use it.
|
||||||
|
|
||||||
|
#### iOS
|
||||||
|
|
||||||
|
`Scarf iOS` settings are read-mostly (config writes are deferred to the Mac per the existing pattern). Skip iOS for the picker; iOS just shows the value as-is wherever Settings displays it. No iOS work in this WS.
|
||||||
|
|
||||||
|
### Capability gate
|
||||||
|
|
||||||
|
`hasDisplayLanguage` is checked at the picker site. Pre-v0.13 hosts hide the row entirely — the field would be silently ignored by the agent if written. **Don't** half-render with a "requires v0.13" label; the row should be invisible on older hosts so the user doesn't think the surface is broken.
|
||||||
|
|
||||||
|
### Coordination with WS-2
|
||||||
|
|
||||||
|
None. Different file.
|
||||||
|
|
||||||
|
## 6. xAI Custom Voices badge
|
||||||
|
|
||||||
|
### Current state
|
||||||
|
|
||||||
|
The xAI provider is **not in `ttsProviders` today** (verify at [SettingsViewModel.swift:32](../../scarf/Features/Settings/ViewModels/SettingsViewModel.swift): the array reads `["edge", "elevenlabs", "openai", "minimax", "mistral", "neutts", "piper"]`, no `xai`). Hermes has shipped xAI as a TTS provider since v0.12; the v0.13 surface is just the *Custom Voices* / cloning support on top. This WS does both at once: add `xai` to the picker and surface the cloning-supported badge.
|
||||||
|
|
||||||
|
### Files to change
|
||||||
|
|
||||||
|
#### [scarf/scarf/Features/Settings/ViewModels/SettingsViewModel.swift](../../scarf/Features/Settings/ViewModels/SettingsViewModel.swift)
|
||||||
|
|
||||||
|
Extend the provider list:
|
||||||
|
|
||||||
|
```swift
var ttsProviders = ["edge", "elevenlabs", "openai", "minimax", "mistral", "neutts", "piper", "xai"]
```
|
||||||
|
|
||||||
|
Add setter(s) for whichever xAI-specific config keys Hermes uses. Per [Open question Q2](#open-questions) the exact keys — likely `tts.xai.voice_id` (or similar) and possibly `tts.xai.model` — need confirmation. Conservative shape mirroring elevenlabs:
|
||||||
|
|
||||||
|
```swift
func setTTSXAIVoiceID(_ value: String) { setSetting("tts.xai.voice_id", value: value) }
func setTTSXAIModel(_ value: String) { setSetting("tts.xai.model", value: value) }
```
|
||||||
|
|
||||||
|
#### [scarf/Packages/ScarfCore/Sources/ScarfCore/Models/HermesConfig.swift](../../Packages/ScarfCore/Sources/ScarfCore/Models/HermesConfig.swift)
|
||||||
|
|
||||||
|
`VoiceSettings` (around line 178) gains two fields next to the existing TTS provider blobs:
|
||||||
|
|
||||||
|
```swift
public var ttsXAIVoiceID: String
public var ttsXAIModel: String
```
|
||||||
|
|
||||||
|
Initializer + `.empty` updates. Defaults to empty string.
|
||||||
|
|
||||||
|
#### [scarf/scarf/Core/Services/HermesFileService.swift](../../scarf/Core/Services/HermesFileService.swift)
|
||||||
|
|
||||||
|
Add the YAML reads inside the voice block construction (mirror the elevenlabs / openai shape).
|
||||||
|
|
||||||
|
#### [scarf/scarf/Features/Settings/Views/Tabs/VoiceTab.swift](../../scarf/Features/Settings/Views/Tabs/VoiceTab.swift)
|
||||||
|
|
||||||
|
Inside the `switch viewModel.config.voice.ttsProvider` (line 19), add a `case "xai":` arm:
|
||||||
|
|
||||||
|
```swift
case "xai":
    EditableTextField(label: "Voice ID", value: viewModel.config.voice.ttsXAIVoiceID) {
        viewModel.setTTSXAIVoiceID($0)
    }
    EditableTextField(label: "Model", value: viewModel.config.voice.ttsXAIModel) {
        viewModel.setTTSXAIModel($0)
    }
    if capabilitiesStore?.capabilities.hasXAIVoiceCloning == true {
        xaiCloningBadge
    }
```
|
||||||
|
|
||||||
|
Add `@Environment(\.hermesCapabilities) private var capabilitiesStore` at the top.
|
||||||
|
|
||||||
|
The badge view, using `ScarfBadge` (kind `.info`):
|
||||||
|
|
||||||
|
```swift
@ViewBuilder
private var xaiCloningBadge: some View {
    HStack {
        Text("")
            .frame(width: 160, alignment: .trailing)
        ScarfBadge("Cloning supported", kind: .info)
        Text("Manage cloned voices in your terminal: `hermes voice` (xAI subcommands).")
            .scarfStyle(.caption)
            .foregroundStyle(ScarfColor.foregroundMuted)
        Spacer()
    }
    .padding(.horizontal, 12)
    .padding(.vertical, 4)
}
```
|
||||||
|
|
||||||
|
The hint text references `hermes voice` because Scarf doesn't manage cloned voices — Hermes does, and v2.7.5 has no in-app voice-cloning UI. Capture under [Out of scope](#out-of-scope) for follow-up.
|
||||||
|
|
||||||
|
### Capability gate
|
||||||
|
|
||||||
|
- `xai` in the provider picker: **not gated**. The provider exists pre-v0.13 (TTS only); cloning is the v0.13 add-on. Listing it always means pre-v0.13 users with xAI keys can still pick it.
|
||||||
|
- Cloning badge: gated on `hasXAIVoiceCloning`. Pre-v0.13: badge hidden, EditableTextField rows still visible.
|
||||||
|
|
||||||
|
### Coordination with WS-2
|
||||||
|
|
||||||
|
None.
|
||||||
|
|
||||||
|
## Files to change (combined)
|
||||||
|
|
||||||
|
New:
|
||||||
|
|
||||||
|
- `scarf/Packages/ScarfCore/Sources/ScarfCore/Services/HermesUpdaterCommandBuilder.swift` (item 3)
|
||||||
|
- `scarf/Packages/ScarfCore/Tests/ScarfCoreTests/M0eUpdaterTests.swift` (item 3 tests)
|
||||||
|
|
||||||
|
Modified:
|
||||||
|
|
||||||
|
- `scarf/Packages/ScarfCore/Sources/ScarfCore/Models/ACPMessages.swift` (item 1: `compressionCount` field)
|
||||||
|
- `scarf/Packages/ScarfCore/Sources/ScarfCore/ACP/ACPClient.swift` (item 1: decode)
|
||||||
|
- `scarf/Packages/ScarfCore/Sources/ScarfCore/ViewModels/RichChatViewModel.swift` (item 1: counter + reset + `handlePromptComplete`)
|
||||||
|
- `scarf/Packages/ScarfCore/Sources/ScarfCore/Models/HermesConfig.swift` (items 5 + 6: `display.language`, xAI voice/model fields)
|
||||||
|
- `scarf/scarf/Features/Chat/Views/SessionInfoBar.swift` (item 1: chip + props)
|
||||||
|
- `scarf/scarf/Features/Chat/Views/ChatTranscriptPane.swift` (item 1: pass-through)
|
||||||
|
- `scarf/scarf/Features/Chat/Views/SlashCommandMenu.swift` (item 2: bracket-aware hint)
|
||||||
|
- `scarf/scarf/Features/Settings/Views/Tabs/AdvancedTab.swift` (item 4: redaction hint)
|
||||||
|
- `scarf/scarf/Features/Settings/Views/Tabs/GeneralTab.swift` (item 5: language picker)
|
||||||
|
- `scarf/scarf/Features/Settings/Views/Tabs/VoiceTab.swift` (item 6: xai case + badge)
|
||||||
|
- `scarf/scarf/Features/Settings/ViewModels/SettingsViewModel.swift` (items 5 + 6: setters + lists)
|
||||||
|
- `scarf/scarf/Core/Services/HermesFileService.swift` (items 5 + 6: YAML reads)
|
||||||
|
- (possibly) `scarf/scarf/Features/Settings/Views/Components/PickerRow.swift` (item 5: add an `optionLabel:` overload, if the existing API doesn't carry one)
|
||||||
|
|
||||||
|
That's roughly **3 ScarfCore files + 7 Mac app files + 1 new file + 1 test file = ~12 files**, most edits being a few lines each.
|
||||||
|
|
||||||
|
## Capability gating (combined)
|
||||||
|
|
||||||
|
| Item | Flag | Behavior on pre-v0.13 |
|------|------|------------------------|
| 1. Compression chip | `hasContextCompressionCount` + `acpCompressionCount > 0` | Chip hidden (counter stays 0) |
| 2. `/new <name>` hint | none (driven by ACP server payload) | Hint is whatever pre-v0.13 server sends (probably empty) |
| 3. `--yes` plumbing | `hasUpdateNonInteractive` (used inside the helper) | Helper omits the flag |
| 4. Redaction hint copy | discriminator on any v0.13 flag (use `hasGoals`) | Shows the v0.12 copy |
| 5. Language picker | `hasDisplayLanguage` | Picker row hidden |
| 6. xAI cloning badge | `hasXAIVoiceCloning` | Badge hidden, xai picker option still visible |
|
||||||
|
|
||||||
|
Six surfaces, six independent fall-back paths. None of them break the existing layout if every flag returns false.
|
||||||
|
|
||||||
|
## How to test
|
||||||
|
|
||||||
|
### Unit (ScarfCoreTests)
|
||||||
|
|
||||||
|
- `M0eUpdaterTests` — five-case matrix for `HermesUpdaterCommandBuilder.updateArgv` covering every combination listed in item 3.
|
||||||
|
- Extend `M0dViewModelsTests` with one test that sets `acpCompressionCount = 5` via a mocked `handlePromptComplete` and asserts the value via the public getter; assert `reset()` clears it.
|
||||||
|
- Extend the existing `ACPMessages` tests (or add one if there isn't one) with: a `usage` blob carrying `"compressionCount": 3` parses into `ACPPromptResult.compressionCount == 3`; same with `"compression_count": 3`; missing key parses as 0.
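
The dual-key decode that the last bullet wants covered could look roughly like this. It is a minimal sketch under stated assumptions: `UsageSketch` and `decodeUsage(_:)` are hypothetical names (not the real `ACPPromptResult`), and the wire field name itself is still unresolved (see Q1).

```swift
import Foundation

/// Hypothetical stand-in for the `usage` blob on a `session/prompt` response.
struct UsageSketch: Decodable {
    let compressionCount: Int

    private enum CodingKeys: String, CodingKey {
        case camel = "compressionCount"
        case snake = "compression_count"
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        // Accept either spelling; treat a missing key as "no compactions yet".
        if let value = try container.decodeIfPresent(Int.self, forKey: .camel) {
            compressionCount = value
        } else if let value = try container.decodeIfPresent(Int.self, forKey: .snake) {
            compressionCount = value
        } else {
            compressionCount = 0
        }
    }
}

/// Convenience wrapper so tests can feed raw JSON strings.
func decodeUsage(_ json: String) throws -> UsageSketch {
    try JSONDecoder().decode(UsageSketch.self, from: Data(json.utf8))
}
```

Defaulting to 0 on a missing key keeps the chip hidden on pre-v0.13 hosts without a separate code path.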
|
||||||
|
|
||||||
|
### UI smoke (manual against real Hermes)
|
||||||
|
|
||||||
|
1. **Pre-v0.13 host**: launch Scarf with a Hermes v0.12 binary on PATH. Verify:
|
||||||
|
- No compression chip in `SessionInfoBar` even after long sessions.
|
||||||
|
- Settings → General → Locale shows only the Timezone field; no language picker.
|
||||||
|
- Settings → Advanced → Caching & Redaction shows the v0.12 hint copy.
|
||||||
|
- Settings → Voice → Text-to-Speech with provider `xai` shows Voice ID + Model fields, **no** "Cloning supported" badge.
|
||||||
|
|
||||||
|
2. **v0.13 host**: launch Scarf against the v0.13 dev branch. Verify:
|
||||||
|
- Long enough chat to trigger compaction → chip appears in `SessionInfoBar` with the count.
|
||||||
|
- Settings → General → Locale → "Display language" picker visible, switching writes `display.language` in `config.yaml`.
|
||||||
|
- Settings → Advanced shows the v0.13 hint copy.
|
||||||
|
- Settings → Voice → xai provider shows the "Cloning supported" badge.
|
||||||
|
- `/new Foo Bar Baz` from the slash menu starts a session named "Foo Bar Baz" (no Scarf-side validation; Hermes handles it).
|
||||||
|
- Slash menu shows `/new` with whatever hint v0.13 server sends — bracket-aware renderer doesn't double-wrap if hint is `[name]`.
|
||||||
|
|
||||||
|
3. **`HermesUpdaterCommandBuilder` smoke** (no UI): once integrated, write a one-shot script (or a `#Preview`-only call) that prints `updateArgv` for each capability snapshot and pastes the matrix into the PR description.
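
The "doesn't double-wrap" check in step 2 can be illustrated with a small pure function. This is only a sketch: `displayHint(_:)` is a hypothetical name, and the assumption that un-bracketed hints get wrapped in `<…>` is not confirmed by this plan.

```swift
import Foundation

/// Hypothetical bracket-aware hint formatter for the slash menu.
/// Assumption: bare hints are shown as `<hint>`; hints that already arrive
/// wrapped in `[...]` or `<...>` are passed through unchanged.
func displayHint(_ raw: String) -> String {
    let hint = raw.trimmingCharacters(in: .whitespaces)
    guard !hint.isEmpty else { return "" }   // pre-v0.13 servers may send nothing
    let alreadyBracketed =
        (hint.hasPrefix("[") && hint.hasSuffix("]")) ||
        (hint.hasPrefix("<") && hint.hasSuffix(">"))
    return alreadyBracketed ? hint : "<\(hint)>"
}
```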
|
||||||
|
|
||||||
|
### Visual / accessibility
|
||||||
|
|
||||||
|
- Compression chip uses `ScarfColor.foregroundMuted` — verify in light + dark; ensure contrast ratio ≥ 4.5:1 against `backgroundSecondary`.
|
||||||
|
- Picker on Locale section honors keyboard navigation (Tab in / Space to open / Arrows / Return / Esc).
|
||||||
|
- "Cloning supported" badge uses `ScarfBadge(... kind: .info)` — verify color resolves correctly in both modes; not green (that's `.success`), not yellow (that's `.warning`).
|
||||||
|
|
||||||
|
## Open questions
|
||||||
|
|
||||||
|
**Q1. Wire field name for compression count.** v0.13 release notes say "Show context compression count in status bar" without naming the field. The plan assumes `usage.compressionCount` (or `usage.compression_count`) on the `session/prompt` response. If Hermes instead emits it as a `session/update` notification on a status feed (separate path from `usage`), the plumbing is bigger: `RichChatViewModel.handleStatusUpdate` (or equivalent) needs a new branch, and `ACPClient.startReadLoop` needs a new event type. **Resolution path**: read `~/.hermes/hermes-agent/hermes_cli/acp/server.py` (or wherever the v0.13 status emission lives) before merging. If the field is on a notification, swap item 1's `ACPPromptResult` extension for a new `ACPEvent.compressionCountChanged(sessionId:count:)` case in `ACPMessages.swift` and a corresponding branch in `RichChatViewModel.handleEvent`.
|
||||||
|
|
||||||
|
**Q2. xAI TTS config keys.** The plan assumes `tts.xai.voice_id` / `tts.xai.model` mirroring elevenlabs. v0.13 source might use different names (`tts.xai.voice`, `tts.xai.model_id`, or a top-level `tts.xai_voice`). **Resolution path**: grep `~/.hermes/hermes-agent/hermes_cli/voice/tts.py` for the xAI config block before merging. If keys differ, just rename the setter functions and `VoiceSettings` fields — no architectural change.
|
||||||
|
|
||||||
|
**Q3. Empty-string vs `"en"` for `display.language` default.** The plan uses an empty string in `DisplaySettings.language` to represent "key absent" and surfaces the picker entry as "English (default)". Whether the picker should *also* offer `en` as a separate explicit value is a UX call. The plan keeps both for now; v2.8.1 can collapse if it's confusing.
|
||||||
|
|
||||||
|
**Q4. iOS coverage.** The plan defers iOS for items 1 (compression chip) and 5 (language picker) — iOS doesn't have a `SessionInfoBar` mirror, and iOS Settings is read-mostly. For v2.8 this is acceptable; for v2.9 we should mirror both surfaces in `Scarf iOS/`. Tracking under [Out of scope](#out-of-scope) below.
|
||||||
|
|
||||||
|
**Q5. Redaction hint discriminator.** Using `hasGoals` as a stand-in for "is this a v0.13 host" feels indirect. Consider adding a small convenience `var isV013OrLater: Bool { atLeastSemver(0, 13, 0) }` on `HermesCapabilities` so the call site reads more honestly. Trivial change; either lands in WS-1 (preferred — that's the capabilities home) or here. Flag for WS-1 owner.
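
A rough shape for the convenience proposed in Q5, assuming the capabilities type carries a parsed version. `SemVerSketch` and `CapabilitiesSketch` are hypothetical stand-ins; only the `isV013OrLater` / `atLeastSemver(0, 13, 0)` names come from the plan, and the real `HermesCapabilities` may store its version differently.

```swift
/// Hypothetical parsed version; the real type in ScarfCore may differ.
struct SemVerSketch: Comparable {
    let major: Int
    let minor: Int
    let patch: Int

    static func < (lhs: SemVerSketch, rhs: SemVerSketch) -> Bool {
        // Tuple comparison is lexicographic: major, then minor, then patch.
        (lhs.major, lhs.minor, lhs.patch) < (rhs.major, rhs.minor, rhs.patch)
    }
}

/// Hypothetical stand-in for `HermesCapabilities`.
struct CapabilitiesSketch {
    /// nil when the host version could not be detected.
    let version: SemVerSketch?

    func atLeastSemver(_ major: Int, _ minor: Int, _ patch: Int) -> Bool {
        guard let version = version else { return false }
        return version >= SemVerSketch(major: major, minor: minor, patch: patch)
    }

    /// Reads more honestly at call sites than borrowing `hasGoals`.
    var isV013OrLater: Bool { atLeastSemver(0, 13, 0) }
}
```

An undetected version resolves to `false`, which matches the plan's rule that every surface falls back to pre-v0.13 behavior when flags are unavailable.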
|
||||||
|
|
||||||
|
## Out of scope (deferred)
|
||||||
|
|
||||||
|
- **iOS compression chip** — iOS chat header doesn't currently render any token counter; adding the chip there means designing a header bar, not just inserting one element. Track for v2.9.
|
||||||
|
- **iOS `display.language` picker** — iOS Settings is read-mostly; full pickers wait until iOS Settings becomes a write surface.
|
||||||
|
- **In-app "Update Hermes" affordance** — a Sparkle-style auto-updater for the Hermes binary, with the `--yes` flag plumbed through. Long-term feature, probably v3.0. The helper added in item 3 paves the runway.
|
||||||
|
- **`/new <name>` hover tooltip** — extra discoverability for the optional argument. v0.13 server sends the hint via `available_commands_update`; that's enough for v2.8.
|
||||||
|
- **xAI Custom Voices management UI** — the badge points users at `hermes voice`. Building cloned-voice management in-app is a feature on its own. Track separately.
|
||||||
|
- **Schema sync to `tools/build-catalog.py`** — none of this WS adds new widget types or template manifest fields, so the catalog validator doesn't need an update. Verify at PR time.
|
||||||
|
|
||||||
|
## Estimate
|
||||||
|
|
||||||
|
- ScarfCore changes: ~30 LOC across 3 files + 1 new file + 1 test file ≈ **~120 LOC**.
|
||||||
|
- Mac app changes: ~15-20 LOC each for items 1, 4, 5, 6 + ~5 LOC for item 2 = **~80 LOC** spread over 7 files.
|
||||||
|
- Tests: ~80 LOC for `M0eUpdaterTests` + ~40 LOC for compression decode tests = **~120 LOC**.
|
||||||
|
|
||||||
|
Total ≈ **300-350 LOC**, ~12 files. Each item is independently revertable and capability-gated. Implementation: 1 dev-day; review + smoke against v0.13 host: 0.5 day. **1.5 dev-days end-to-end.**
|
||||||
|
|
||||||
|
Confidence: **high** that items 2 / 3 / 4 / 5 / 6 land cleanly. **Medium** for item 1 (compression chip) — pinned to Q1's wire-field resolution. If Q1 surfaces an event-stream shape rather than a `usage` blob, item 1's plumbing roughly doubles in size but the *view* is unchanged.
|
||||||
@@ -0,0 +1,926 @@
|
|||||||
|
# WS-9 Plan: ScarfGo iOS catch-up (read-only mirrors of WS-2 / WS-3 / WS-4 / WS-5)
|
||||||
|
|
||||||
|
**Workstream:** WS-9 of Scarf v2.8.0
|
||||||
|
**Hermes target:** v0.13.0 (v2026.5.7)
|
||||||
|
**Capability gates consumed (already shipped in WS-1, PR #80):**
|
||||||
|
- `HermesCapabilities.hasGoals` (`>= 0.13.0`) — drives the Goal pill
|
||||||
|
- `HermesCapabilities.hasACPQueue` (`>= 0.13.0`) — read-only queue indicator stub
|
||||||
|
- `HermesCapabilities.hasKanbanDiagnostics` (`>= 0.13.0`) — diagnostics on the iOS Kanban detail sheet
|
||||||
|
- `HermesCapabilities.hasCuratorArchive` (`>= 0.13.0`) — Archived list section in the iOS Curator surface
|
||||||
|
- `HermesCapabilities.hasGoogleChatPlatform` / `hasGatewayAllowlists` / `hasGatewayBusyAckToggle` / `hasGatewayRestartNotification` (`>= 0.13.0`) — Settings → Platforms additions
|
||||||
|
|
||||||
|
**Builds on:**
|
||||||
|
- v2.7.5 iOS Kanban (`Scarf iOS/Kanban/ScarfGoKanbanView.swift`, `ScarfGoKanbanDetailSheet.swift`).
|
||||||
|
- v2.7.5 iOS Curator (`Scarf iOS/Curator/CuratorView.swift`).
|
||||||
|
- v2.7.5 iOS Settings (`Scarf iOS/Settings/SettingsView.swift`) including `platformsSection`.
|
||||||
|
- v2.5+ iOS Chat (`Scarf iOS/Chat/ChatView.swift`) including `projectContextBar` and `transientHint`.
|
||||||
|
- WS-1 capability flags + the `.hermesCapabilities(_:)` env injection at `ScarfGoTabRoot.swift:153`.
|
||||||
|
- Phase H precedent: iOS catch-up "parity-match the Mac surfaces but skip mutating CLI verbs."
|
||||||
|
|
||||||
|
**Owner:** TBD
|
||||||
|
**Reviewers:** Alan (always); whoever owns iOS during v2.8 cycle.
|
||||||
|
**Sequencing:** WS-9 lands AFTER WS-2 / WS-3 / WS-4 / WS-5 merge to main, since it consumes their model fields, view-model state, and capability flags.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Goals (read-only mirrors of WS-2 / WS-3 / WS-4 / WS-5)
|
||||||
|
|
||||||
|
WS-9 is iOS-only and **strictly read-only**. It mirrors selected Mac surfaces from earlier work-streams without introducing any iOS-side write verb. Per the v2.8.0 release plan, iOS write surfaces (Verify / Reject buttons, iOS create-task, iOS curator-archive button, iOS allowlist editor, etc.) are deferred to v2.8.x.
|
||||||
|
|
||||||
|
User-visible additions (all capability-gated, all degrade silently on pre-v0.13 hosts):
|
||||||
|
|
||||||
|
1. **Goal pill in iOS chat.** When `caps.hasGoals == true` AND `controller.vm.activeGoal != nil`, surface a "Goal: <text>" pill at the top of the chat view (mounted next to the existing folder/branch chips in `projectContextBar`). Read-only — no `/goal` slash command on iOS in v2.8.0; no clear affordance.
|
||||||
|
2. **Read-only `/queue` count chip.** When `caps.hasACPQueue == true` AND `controller.vm.queuedPrompts.count > 0`, surface a small "N queued" chip in the same `projectContextBar`. No popover, no mutation. Tap is a no-op (or shows a sheet listing the previews — see Open Question #2).
|
||||||
|
3. **Kanban v0.13 diagnostics on iOS detail sheet.** Extend `ScarfGoKanbanDetailSheet` to render `max_retries`, `auto_blocked_reason`, `hallucination_gate_status`, and the diagnostics array. NO Verify / Reject buttons; the hallucination state is rendered as a badge with the copy "Worker-created — verify on Mac" (since iOS can't verify in v2.8.0).
|
||||||
|
4. **iOS Curator Archived section.** Append a read-only "Archived" section to the existing `Scarf iOS/Curator/CuratorView.swift`. Per-row: name, kind, archived-date, optional reason (sized small for thumb scrolling). NO Restore / Prune-this / Prune-all buttons. Empty-state copy points the user to the Mac app for restore.
|
||||||
|
5. **iOS Settings v0.13 features-active badge.** When `caps.semver >= 0.13.0`, surface a small read-only "v0.13 features active" `ScarfBadge` at the top of `SettingsView` with a "Learn more" tap action that opens an action sheet listing the new features.
|
||||||
|
6. **iOS Platforms read-only mirror (extension to existing `platformsSection`).** Add a Google Chat read-only row, a "Restart notifications" yes/no row, a "Busy ack" yes/no row, and a per-platform allowlist chip-row ("3 allowed channels: …, 4 allowed chats: …"). No editing — that's a Mac-only surface in v2.8.0.
|
||||||
|
|
||||||
|
### Non-goals (explicitly deferred)
|
||||||
|
|
||||||
|
- **iOS write surfaces** (Verify / Reject, Create Task, Archive Skill, Prune, Allowlist editor, `/goal`, `/queue` send) — deferred to v2.8.x. Per Phase H precedent.
|
||||||
|
- **iOS Curator surface from scratch** — out of scope. iOS already has `CuratorView.swift`; WS-9 only adds the Archived list. (See Open Question #1 for what the user prompt anticipated.)
|
||||||
|
- **iOS Gateway/Platforms surface from scratch** — out of scope. iOS Settings already has `platformsSection` (lines 280-288 of `SettingsView.swift`); WS-9 extends it. There is **no separate iOS Gateway feature module** today and WS-9 does not add one.
|
||||||
|
- **iOS goal/queue clear affordance** — `/goal --clear` and "Clear all queued" are write verbs; deferred.
|
||||||
|
- **iOS Kanban verify on tap** — iOS Kanban is read-only and stays read-only in v2.8.0.
|
||||||
|
- **iOS Curator Run Now blocking + progress (synchronous run)** — that's a write change in scope of WS-4, not WS-9. iOS keeps fire-and-forget `runNow` regardless of v0.13.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Existing iOS surface inventory
|
||||||
|
|
||||||
|
(Verified by walking `Scarf iOS/` at plan time.)
|
||||||
|
|
||||||
|
| iOS dir | Files | Mac counterpart |
|---|---|---|
| `App/` | `ScarfIOSApp.swift`, `ScarfGoCoordinator.swift`, `ScarfGoTabRoot.swift`, `Theme/` | `scarfApp.swift`, `AppCoordinator.swift`, `SidebarView.swift` |
| `Chat/` | `ChatView.swift`, `ChatContentFormatter.swift`, `ProjectPickerSheet.swift`, `ProjectSlashCommandsBrowser.swift` | `Features/Chat/` |
| `Components/` | `FlowLayout.swift`, `HermesVersionBanner.swift` | (cross-feature shared) |
| `Cron/` | (read-only views) | `Features/Cron/` |
| **`Curator/`** | **`CuratorView.swift` (read-mostly, runNow/pause/resume/pin/unpin/restore wired)** | `Features/Curator/` |
| `Dashboard/` | iOS dashboard views | `Features/Dashboard/` |
| **`Kanban/`** | **`ScarfGoKanbanView.swift`, `ScarfGoKanbanDetailSheet.swift` (5-column horizontal-paged Picker, read-only)** | `Features/Kanban/` |
| `Memory/` | (read-only views) | `Features/Memory/` |
| `Notifications/` | `APNSTokenStore.swift`, `NotificationRouter.swift` | `Core/Services/Notifications*` |
| `Onboarding/` | (first-run wizard) | `Features/Onboarding/` |
| `Plugins/` | `PluginsView.swift` (Phase H read-only) | `Features/Plugins/` |
| `Profiles/` | `ProfilesView.swift` (Phase H read-only) | `Features/Profiles/` |
| `Projects/` | iOS project surfaces (incl. `ProjectDetailView.swift`) | `Features/Projects/` |
| `Servers/` | server-list + connect surfaces | `Features/Servers/` |
| **`Settings/`** | **`SettingsView.swift`, `SettingEditorSheet.swift`, `ScarfMonDiagnosticsView.swift`** | `Features/Settings/` |
| `Skills/` | iOS Skills surface | `Features/Skills/` |
| `Webhooks/` | `WebhooksView.swift` (Phase H read-only) | `Features/Webhooks/` |
|
||||||
|
|
||||||
|
**Surfaces that DO NOT exist on iOS today:**
|
||||||
|
- No standalone `Scarf iOS/Gateway/` or `Scarf iOS/Platforms/` directory. iOS surfaces gateway / platform configuration through `SettingsView.platformsSection`. WS-9 mirror item 6 extends that section; it does NOT spin up a new feature module.
|
||||||
|
- No iOS goal / queue surface. WS-2 lays the VM-side scaffolding (`activeGoal`, `queuedPrompts` on the shared `RichChatViewModel` in ScarfCore); WS-9 is what surfaces it on iOS.
|
||||||
|
- No iOS dedicated "What's new in v0.13" feature surface. The "v0.13 features active" badge in mirror item 5 is the only entry point WS-9 adds.
|
||||||
|
|
||||||
|
**Capability injection (verified):**
|
||||||
|
- `ScarfGoTabRoot.swift:52` constructs a `HermesCapabilitiesStore` per server connection.
|
||||||
|
- `ScarfGoTabRoot.swift:153` calls `.hermesCapabilities(capabilities)` on the tab view.
|
||||||
|
- All iOS feature views read with `@Environment(\.hermesCapabilities) private var capabilitiesStore` (see `ChatView.swift:30`, `ProjectDetailView.swift:22`, `Components/HermesVersionBanner.swift:14`).
|
||||||
|
- WS-9 reuses the same env injection — no new plumbing required.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 1. iOS Goal pill (mirror WS-2)
|
||||||
|
|
||||||
|
**Source path read.** The goal text lives on `RichChatViewModel.activeGoal: HermesActiveGoal?` (added in WS-2 — see WS-2 plan §3 "Active goal state"). iOS reads the same VM through `ChatController.vm` (the shared ScarfCore VM). No new ScarfCore field is needed; the WS-2 plumbing flows automatically into iOS.
|
||||||
|
|
||||||
|
### File: `Scarf iOS/Chat/ChatView.swift`
|
||||||
|
|
||||||
|
#### 1a. Read the capability + goal state in `body`
|
||||||
|
|
||||||
|
iOS already injects `@Environment(\.hermesCapabilities) private var capabilitiesStore` at line 30. Add a derived flag near the existing `supportsImagePrompts` computed property (lines 44-46):
|
||||||
|
|
||||||
|
```swift
private var supportsActiveGoal: Bool {
    capabilitiesStore?.capabilities.hasGoals ?? false
}

private var supportsACPQueue: Bool {
    capabilitiesStore?.capabilities.hasACPQueue ?? false
}
```
|
||||||
|
|
||||||
|
#### 1b. Mount the goal pill alongside the project chip
|
||||||
|
|
||||||
|
The `projectContextBar` (lines 832-892) currently renders only when there's an active project. Mounting the goal pill INSIDE that bar as-is would mean the pill can't render in non-project chats. Solution: widen the conditional. Render `projectContextBar` when `projectName != nil`, OR `supportsActiveGoal && controller.vm.activeGoal != nil`, OR `supportsACPQueue && !controller.vm.queuedPrompts.isEmpty`. The bar's tinted-strip background works for any of these states.
|
||||||
|
|
||||||
|
```swift
@ViewBuilder
private var projectContextBar: some View {
    let hasProject = (controller.currentProjectName?.isEmpty == false)
    let hasGoal = supportsActiveGoal && controller.vm.activeGoal != nil
    let hasQueue = supportsACPQueue && !controller.vm.queuedPrompts.isEmpty
    if hasProject || hasGoal || hasQueue {
        HStack(spacing: 8) {
            if hasProject { /* existing project chip */ }
            if hasGoal { goalChip }
            if hasQueue { queueChip }
            Spacer()
            if hasProject && !controller.vm.projectScopedCommands.isEmpty {
                /* existing slash-commands chip */
            }
        }
        .padding(.horizontal, 12)
        .padding(.vertical, 6)
        .frame(maxWidth: .infinity, alignment: .leading)
        .background(.tint.opacity(0.1))
    }
}

@ViewBuilder
private var goalChip: some View {
    if let goal = controller.vm.activeGoal {
        Label(truncatedGoalText(goal.text), systemImage: "scope")
            .labelStyle(.titleAndIcon)
            .font(.subheadline) // semantic font, so Dynamic Type works
            .foregroundStyle(ScarfColor.info)
            .padding(.horizontal, 8)
            .padding(.vertical, 3)
            .background(ScarfColor.info.opacity(0.16), in: Capsule())
            .lineLimit(1)
            .accessibilityLabel("Goal locked: \(goal.text)")
    }
}

private func truncatedGoalText(_ text: String) -> String {
    text.count <= 28 ? text : String(text.prefix(25)) + "…"
}
```
|
||||||
|
|
||||||
|
**Font choice (per CLAUDE.md iOS rules).** Use semantic `.subheadline` because the goal text IS content (the user reads it to recall what they locked the agent on). Per CLAUDE.md "Decision tree per text element: 'is this read for content?' → semantic token. 'Is this chrome / a label / a badge?' → ScarfFont." If the design review pushes back and prefers a fixed-display chip look, switch the inner `Text` to `ScarfFont.captionStrong`; the surrounding pill chrome stays the same.
|
||||||
|
|
||||||
|
**Color choice.** `ScarfColor.info` matches Mac's WS-2 plan (informational state, not warning, not error). Keeps the pill visually distinct from the green "success" branch chip and the orange tinted-strip background of `projectContextBar`.
|
||||||
|
|
||||||
|
**Truncation.** 25-char prefix matches the iPhone 14 portrait width budget for a chip beside a project name. The full goal text is in the accessibility label (VoiceOver users get the full string).
|
||||||
|
|
||||||
|
#### 1c. NO clear affordance
|
||||||
|
|
||||||
|
iOS does not get a "Clear goal" gesture in v2.8.0. The pill is purely informational. Tapping is a no-op. Users running `/goal --clear` from the Mac will see the iOS pill drop on the next polled state refresh (or whenever `controller.vm.activeGoal` updates — most likely on the next ACP event).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 2. iOS Kanban v0.13 diagnostics (mirror WS-3)
|
||||||
|
|
||||||
|
**Source paths read.** All four new fields land on `HermesKanbanTask` (WS-3 plan §1):
|
||||||
|
- `task.maxRetries: Int?`
- `task.autoBlockedReason: String?`
- `task.hallucinationGateStatus: String?` → wrap in `KanbanHallucinationGate.from(_:)`
- `task.diagnostics: [HermesKanbanDiagnostic]`
|
||||||
|
|
||||||
|
The per-run shape adds `run.diagnostics: [HermesKanbanDiagnostic]` (WS-3 plan §3). The typed-mirror enums `KanbanHallucinationGate` and `KanbanDiagnosticKind` are added in ScarfCore and consumable from iOS by `import ScarfCore`.
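
For orientation, a typed mirror of this kind usually maps the raw server string onto an enum with a catch-all case. The sketch below is an assumption about the shape, not the WS-3 definition: only `.pending` is confirmed by this plan, and the `verified`, `rejected`, and `unknown` cases (with their raw strings) are hypothetical.

```swift
// Hypothetical shape; the real enum is defined by WS-3 in ScarfCore.
enum HallucinationGateSketch: Equatable {
    case pending
    case verified   // assumed case name
    case rejected   // assumed case name
    case unknown    // catch-all for nil or unrecognised strings

    static func from(_ raw: String?) -> HallucinationGateSketch {
        // A nil status (pre-v0.13 host, or field absent) never matches `.pending`.
        guard let raw = raw?.lowercased() else { return .unknown }
        switch raw {
        case "pending": return .pending
        case "verified": return .verified
        case "rejected": return .rejected
        default: return .unknown
        }
    }
}

print(HallucinationGateSketch.from("pending") == .pending)  // true
print(HallucinationGateSketch.from(nil) == .unknown)        // true
```

A tolerant mapping like this keeps the view code to a single `== .pending` comparison and means an unexpected server value hides the badge rather than crashing.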
|
||||||
|
|
||||||
|
### File: `Scarf iOS/Kanban/ScarfGoKanbanDetailSheet.swift`
|
||||||
|
|
||||||
|
#### 2a. Capability gate
|
||||||
|
|
||||||
|
Add `@Environment(\.hermesCapabilities) private var capabilitiesStore` at the top of the struct alongside the existing state (line ~17). Expose the gate as a computed property that `body` reads:
|
||||||
|
|
||||||
|
```swift
private var diagnosticsAvailable: Bool {
    capabilitiesStore?.capabilities.hasKanbanDiagnostics ?? false
}
```
|
||||||
|
|
||||||
|
Defensive default to `false` so a missing capability store (preview, smoke test) renders the v2.7.5 sheet unchanged.
|
||||||
|
|
||||||
|
#### 2b. Header chip row — add `max_retries` chip
|
||||||
|
|
||||||
|
Update `headerCard(_:)` (lines 91-111). Insert between the workspace-kind badge and the tenant badge, gated on `diagnosticsAvailable`:
|
||||||
|
|
||||||
|
```swift
if diagnosticsAvailable, let maxRetries = task.maxRetries {
    ScarfBadge("retries: \(maxRetries)", kind: .neutral)
        .accessibilityLabel("Max retries \(maxRetries)")
}
```
|
||||||
|
|
||||||
|
iOS has no hover, so the accessibility label stands in for the Mac tooltip. No tap action; this is purely informational.
|
||||||
|
|
||||||
|
#### 2c. Header chip row — add hallucination-gate badge
|
||||||
|
|
||||||
|
Below the existing badge row, insert a NEW row when `KanbanHallucinationGate.from(task.hallucinationGateStatus) == .pending`:
|
||||||
|
|
||||||
|
```swift
if diagnosticsAvailable,
   KanbanHallucinationGate.from(task.hallucinationGateStatus) == .pending {
    HStack(spacing: 6) {
        Image(systemName: "questionmark.diamond.fill")
            .foregroundStyle(ScarfColor.warning)
        Text("Worker-created — verify on Mac")
            .font(.subheadline) // semantic content text
            .foregroundStyle(ScarfColor.warning)
    }
    .padding(.horizontal, 10)
    .padding(.vertical, 6)
    .background(ScarfColor.warning.opacity(0.10), in: RoundedRectangle(cornerRadius: ScarfRadius.md, style: .continuous))
    .overlay(
        RoundedRectangle(cornerRadius: ScarfRadius.md, style: .continuous)
            .strokeBorder(ScarfColor.warning.opacity(0.4), lineWidth: 1)
    )
    .accessibilityHint("Open this task on the Mac app to verify or reject the worker's claim.")
}
```
|
||||||
|
|
||||||
|
**Copy choice.** "Worker-created — verify on Mac" is intentional: it surfaces the gate status AND tells the user where the action lives. This is the read-only iOS substitute for Mac's Verify / Reject buttons (which require write CLI verbs deferred to v2.8.x).
|
||||||
|
|
||||||
|
**Render order.** Hallucination badge sits BELOW the chip row but ABOVE the markdown body, so users see the worker-created flag before reading the (potentially hallucinated) body content.
|
||||||
|
|
||||||
|
#### 2d. Auto-blocked banner
|
||||||
|
|
||||||
|
In `headerCard` after the priority line, when status is `blocked` AND `task.autoBlockedReason` is non-empty:
|
||||||
|
|
||||||
|
```swift
if diagnosticsAvailable,
   KanbanStatus.from(task.status) == .blocked,
   let reason = task.autoBlockedReason, !reason.isEmpty {
    HStack(alignment: .top, spacing: 8) {
        Image(systemName: "exclamationmark.octagon.fill")
            .foregroundStyle(ScarfColor.danger)
        VStack(alignment: .leading, spacing: 2) {
            Text("Auto-blocked")
                .font(.subheadline.weight(.semibold))
                .foregroundStyle(ScarfColor.danger)
            Text(reason)
                .font(.subheadline) // semantic: server-supplied verbatim
                .foregroundStyle(.secondary)
        }
    }
    .padding(10)
    .background(ScarfColor.danger.opacity(0.08), in: RoundedRectangle(cornerRadius: ScarfRadius.md, style: .continuous))
}
```
|
||||||
|
|
||||||
|
#### 2e. Task-level diagnostics block
|
||||||
|
|
||||||
|
After the markdown body block (before the Picker tab selector), render the task-level diagnostics list when non-empty:
|
||||||
|
|
||||||
|
```swift
if diagnosticsAvailable, !detail.task.diagnostics.isEmpty {
    diagnosticsBlock(detail.task.diagnostics, label: "Diagnostics")
}
```
|
||||||
|
|
||||||
|
Helper:
|
||||||
|
|
||||||
|
```swift
@ViewBuilder
private func diagnosticsBlock(_ diags: [HermesKanbanDiagnostic], label: String) -> some View {
    VStack(alignment: .leading, spacing: 6) {
        Text(label)
            .font(.caption.weight(.semibold))
            .foregroundStyle(.secondary)
        FlowLayout(spacing: 6) { // existing primitive at Scarf iOS/Components/FlowLayout.swift
            ForEach(diags) { diag in
                let kind = KanbanDiagnosticKind.from(diag.kind)
                ScarfBadge(diag.kind, kind: kind.badgeKind)
                    .accessibilityLabel(diag.message ?? diag.kind)
            }
        }
    }
    .frame(maxWidth: .infinity, alignment: .leading)
}
```
|
||||||
|
|
||||||
|
Tapping a badge opens a detail sheet that shows kind + message + timestamp. This is the iPhone-friendly substitute for the Mac `.help()` tooltip:
|
||||||
|
|
||||||
|
```swift
ScarfBadge(diag.kind, kind: kind.badgeKind)
    .onTapGesture { selectedDiagnostic = diag }
```
|
||||||
|
|
||||||
|
Sheet binding: `.sheet(item: $selectedDiagnostic) { DiagnosticDetailSheet(diagnostic: $0) }`, which implies an optional `selectedDiagnostic` state property on the sheet's parent view. The detail sheet is a simple `NavigationStack` with kind + message + ISO timestamp + a "Done" toolbar button. Lightweight (~30 lines).
|
||||||
|
|
||||||
|
`HermesKanbanDiagnostic` is `Identifiable` (per WS-3 plan §2 — synthetic UUID).
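
To make the "synthetic UUID" point concrete, here is a minimal sketch of a decodable model whose `id` is generated locally rather than read from JSON. `DiagnosticSketch` and its field set are hypothetical stand-ins, not the WS-3 model; the `kind` value is borrowed from the test scenarios in this plan and the message text is made up.

```swift
import Foundation

// Hypothetical field set; the real model is defined by WS-3.
struct DiagnosticSketch: Decodable, Identifiable {
    // Synthetic identity: generated locally, never read from JSON.
    let id = UUID()
    let kind: String
    let message: String?

    // Leaving `id` out of CodingKeys keeps it out of decoding.
    private enum CodingKeys: String, CodingKey {
        case kind, message
    }
}

let json = Data(#"[{"kind":"darwin_zombie_detected","message":"example"},{"kind":"darwin_zombie_detected"}]"#.utf8)
let diags = (try? JSONDecoder().decode([DiagnosticSketch].self, from: json)) ?? []

print(diags.count)                 // 2
print(diags[0].id != diags[1].id)  // true: identical kinds still get distinct ids
```

One consequence worth keeping in mind under this assumption: a synthetic id changes on every decode, so a diagnostic held in `selectedDiagnostic` will not compare equal to its re-fetched counterpart after a refresh.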
|
||||||
|
|
||||||
|
#### 2f. Per-run diagnostics in the Runs tab
|
||||||
|
|
||||||
|
Update `runsSection` (lines 167-204). Inside each run row, after the optional error text, append a diagnostics block when present:
|
||||||
|
|
||||||
|
```swift
if diagnosticsAvailable, !run.diagnostics.isEmpty {
    diagnosticsBlock(run.diagnostics, label: "Run diagnostics")
        .padding(.top, 4)
}
```
|
||||||
|
|
||||||
|
Same `diagnosticsBlock` helper.
|
||||||
|
|
||||||
|
#### 2g. NO write actions
|
||||||
|
|
||||||
|
Per WS-9 contract, iOS does not expose Verify / Reject. The hallucination badge in §2c is informational. Mac's `KanbanInspectorPane.healthBanner.hallucinationBanner` (WS-3 plan §8b) wires Verify/Reject buttons; iOS does not.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 3. iOS Curator Archived list (mirror WS-4) — IF iOS Curator exists
|
||||||
|
|
||||||
|
**Confirmed:** iOS Curator surface exists at `Scarf iOS/Curator/CuratorView.swift` (read-mostly, with runNow / pause / resume / pin / unpin actions). **In scope.**
|
||||||
|
|
||||||
|
**Source paths read.** WS-4 introduces:
|
||||||
|
- `HermesCuratorArchivedSkill` model (WS-4 plan "New types / fields")
- `CuratorService.listArchived() async throws -> [HermesCuratorArchivedSkill]` (WS-4 plan §"New files")
- `CuratorViewModel.archivedSkills: [HermesCuratorArchivedSkill]` and `loadArchive() async` (WS-4 plan §"Edited files / CuratorViewModel")
|
||||||
|
|
||||||
|
The shared `CuratorViewModel` lives in ScarfCore — iOS reuses it directly. The iOS `CuratorView` already constructs it at line 18. No iOS-side ScarfCore changes required.
|
||||||
|
|
||||||
|
### File: `Scarf iOS/Curator/CuratorView.swift`
|
||||||
|
|
||||||
|
#### 3a. Capability gate
|
||||||
|
|
||||||
|
Add `@Environment(\.hermesCapabilities) private var capabilitiesStore` at the top of the struct. Expose the gate as a computed property that `body` reads:
|
||||||
|
|
||||||
|
```swift
private var archiveAvailable: Bool {
    capabilitiesStore?.capabilities.hasCuratorArchive ?? false
}
```
|
||||||
|
|
||||||
|
#### 3b. Wire `loadArchive()` into the existing `.task`
|
||||||
|
|
||||||
|
Update the existing `.task { await viewModel.load() }` (line 92) to also load the archive when capability allows:
|
||||||
|
|
||||||
|
```swift
.task {
    await viewModel.load()
    if archiveAvailable {
        await viewModel.loadArchive()
    }
}
.refreshable {
    await viewModel.load()
    if archiveAvailable {
        await viewModel.loadArchive()
    }
}
```
|
||||||
|
|
||||||
|
#### 3c. Add the Archived section
|
||||||
|
|
||||||
|
After the "Last report" section (lines 74-80) and before the trailing modifiers, render the new section gated on `archiveAvailable`:
|
||||||
|
|
||||||
|
```swift
if archiveAvailable {
    archivedSection
}
```
|
||||||
|
|
||||||
|
Helper:
|
||||||
|
|
||||||
|
```swift
@ViewBuilder
private var archivedSection: some View {
    Section {
        if viewModel.archivedSkills.isEmpty {
            Text("No archived skills — Curator will move stale skills here after the next review cycle.")
                .font(.callout)
                .foregroundStyle(.secondary)
        } else {
            ForEach(viewModel.archivedSkills) { skill in
                VStack(alignment: .leading, spacing: 4) {
                    HStack {
                        Text(skill.name)
                            .font(.body) // semantic: content
                            .lineLimit(1)
                        Spacer()
                        if let category = skill.category, !category.isEmpty {
                            ScarfBadge(category, kind: .neutral)
                        }
                    }
                    HStack(spacing: 6) {
                        if let reason = skill.reason, !reason.isEmpty {
                            Text(reason)
                                .font(.caption) // semantic: content
                                .foregroundStyle(.secondary)
                                .lineLimit(2)
                        }
                        Spacer()
                        Text(skill.archivedAtLabel)
                            .font(.caption2)
                            .foregroundStyle(.tertiary)
                    }
                    if let size = skill.sizeBytes, size > 0 {
                        Text(skill.sizeLabel)
                            .font(.caption2)
                            .foregroundStyle(.tertiary)
                    }
                }
            }
        }
    } header: {
        Text("Archived")
    } footer: {
        if !viewModel.archivedSkills.isEmpty {
            Text("Restore or prune archived skills from the Mac app.")
                .font(.caption)
        }
    }
}
```
|
||||||
|
|
||||||
|
**Copy.** Empty-state mirrors Mac's empty-state copy so the wiki / docs only need one phrasing. The "Restore or prune from the Mac app" footer is the read-only signpost.
|
||||||
|
|
||||||
|
**Font choice.** Skill name + reason → semantic `.body` / `.caption` (read for content). Category badge stays `ScarfBadge` (chrome). Date and size → `.caption2` (chrome metadata).
|
||||||
|
|
||||||
|
#### 3d. NO write actions
|
||||||
|
|
||||||
|
No per-row Restore button (WS-4 Mac surface adds this — iOS does not). No Prune All. The `CuratorRestoreSheet` Mac fallback for v0.12 hosts does NOT have an iOS counterpart and WS-9 does not introduce one. iOS users wanting to restore an archived skill use the Mac app — that's documented in the section footer.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 4. iOS Gateway / Platforms read-only mirror (mirror WS-5) — extending existing iOS Settings → Platforms
|
||||||
|
|
||||||
|
**Investigation result:** iOS does NOT have a separate `Gateway/` or `Platforms/` directory. Gateway / platform configuration is surfaced through `SettingsView.platformsSection` (lines 280-288). WS-9 extends this section rather than spinning up a new feature module.
|
||||||
|
|
||||||
|
**Caveat.** WS-5's plan markdown does not yet exist at `scarf/docs/v2.8/WS-5-gateway-v0.13-plan.md` (verified — the dir contains WS-2/3/4/6/7/8 only). The Mac-side WS-5 plan is forthcoming. WS-9 is forced to make best-inference assumptions about the Mac-side model field names. The capability flags themselves DO exist (`hasGoogleChatPlatform`, `hasGatewayAllowlists`, `hasGatewayBusyAckToggle`, `hasGatewayRestartNotification`, `hasGatewayList`) and the surface contract per the user prompt is:
|
||||||
|
- Show Google Chat as a new platform entry (read-only)
- Show allowlists as read-only chip-rows ("3 allowed channels: ..., 4 allowed chats: ...")
- Show platform-specific toggles as read-only state badges ("Restart notifications: ON", "Busy ack: OFF")
|
||||||
|
|
||||||
|
WS-9 mirrors that contract. Concrete model fields are flagged in Open Questions §3 below — the implementer should sync with the WS-5 author before merging.
|
||||||
|
|
||||||
|
### File: `Scarf iOS/Settings/SettingsView.swift`
|
||||||
|
|
||||||
|
#### 4a. Capability gate
|
||||||
|
|
||||||
|
Add the env-injected capability store (it's not currently read in `SettingsView`):
|
||||||
|
|
||||||
|
```swift
@Environment(\.hermesCapabilities) private var capabilitiesStore

private var caps: HermesCapabilities {
    capabilitiesStore?.capabilities ?? .empty
}
```
|
||||||
|
|
||||||
|
#### 4b. Extend `platformsSection`
|
||||||
|
|
||||||
|
The current section (lines 280-288) renders five rows: Discord require-mention, Discord auto-thread, Telegram require-mention, Slack reply-to-mode, Matrix require-mention. WS-9 appends:
|
||||||
|
|
||||||
|
```swift
@ViewBuilder
private var platformsSection: some View {
    Section("Platforms") {
        // Existing rows (lines 282-286): UNCHANGED.
        yesNoRow("Discord: require mention", vm.config.discord.requireMention)
        yesNoRow("Discord: auto-thread", vm.config.discord.autoThread)
        yesNoRow("Telegram: require mention", vm.config.telegram.requireMention)
        LabeledContent("Slack: reply mode", value: vm.config.slack.replyToMode)
        yesNoRow("Matrix: require mention", vm.config.matrix.requireMention)

        // v0.13 additions (gated).
        if caps.hasGoogleChatPlatform {
            googleChatSubsection
        }
        if caps.hasGatewayBusyAckToggle {
            yesNoRow("Gateway: busy ack", vm.config.gateway.busyAckEnabled)
        }
        if caps.hasGatewayRestartNotification {
            yesNoRow("Gateway: restart notification", vm.config.gateway.restartNotificationEnabled)
        }
        if caps.hasGatewayAllowlists {
            allowlistsSubsection
        }
    }
}
```
|
||||||
|
|
||||||
|
**Field-name caveat.** The exact field names on `HermesConfig.gateway.*` and `HermesConfig.googleChat.*` are TBD by WS-5. Provisional field names used above (`busyAckEnabled`, `restartNotificationEnabled`, `googleChat.requireMention`, etc.) MUST be aligned with the WS-5 model definitions before this code lands. See Open Questions §3.
|
||||||
|
|
||||||
|
#### 4c. Google Chat subsection
|
||||||
|
|
||||||
|
```swift
@ViewBuilder
private var googleChatSubsection: some View {
    yesNoRow("Google Chat: require mention", vm.config.googleChat.requireMention)
    if let space = vm.config.googleChat.defaultSpace, !space.isEmpty {
        LabeledContent("Google Chat: default space", value: space)
    }
}
```
|
||||||
|
|
||||||
|
#### 4d. Allowlists subsection — chip-row summaries
|
||||||
|
|
||||||
|
Read-only, summarized counts. Per the user prompt: "3 allowed channels: …, 4 allowed chats: …". On iOS the summary is collapsed (full lists are wide and a SwiftUI `List` row is narrow). Shape:
|
||||||
|
|
||||||
|
```swift
@ViewBuilder
private var allowlistsSubsection: some View {
    if let channels = vm.config.gateway.allowedChannels, !channels.isEmpty {
        DisclosureGroup {
            ForEach(channels, id: \.self) { ch in
                Text(ch)
                    .font(.callout.monospaced())
                    .foregroundStyle(.secondary)
                    .lineLimit(1)
            }
        } label: {
            LabeledContent("Allowed channels") {
                Text("\(channels.count)")
                    .font(.callout)
                    .foregroundStyle(.secondary)
            }
        }
    }
    if let chats = vm.config.gateway.allowedChats, !chats.isEmpty {
        DisclosureGroup {
            ForEach(chats, id: \.self) { chat in
                Text(chat)
                    .font(.callout.monospaced())
                    .foregroundStyle(.secondary)
                    .lineLimit(1)
            }
        } label: {
            LabeledContent("Allowed chats") {
                Text("\(chats.count)")
                    .font(.callout)
                    .foregroundStyle(.secondary)
            }
        }
    }
}
```
|
||||||
|
|
||||||
|
**UI choice.** `DisclosureGroup` with the count in the label collapses well on iPhone (default-collapsed; the user can tap to expand). Avoids a wall-of-text in a small-screen list. No tap-to-edit (read-only).
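
If a one-line summary is ever wanted (for example as an accessibility label on the collapsed row), the contract's "3 allowed channels: …" phrasing can be produced by a small pure function. This is only a sketch: `allowlistSummary` and its parameters are hypothetical names, not part of the WS-5 or WS-9 surface, and the channel names are invented.

```swift
// Hypothetical helper: "<count> allowed <noun>: a, b, …" summary for a collapsed row.
func allowlistSummary(noun: String, entries: [String], maxShown: Int = 2) -> String {
    // Show at most `maxShown` entries and mark the rest with an ellipsis.
    let shown = entries.prefix(maxShown).joined(separator: ", ")
    let suffix = entries.count > maxShown ? ", …" : ""
    return "\(entries.count) allowed \(noun): \(shown)\(suffix)"
}

print(allowlistSummary(noun: "channels", entries: ["#ops", "#dev", "#alerts"]))
// 3 allowed channels: #ops, #dev, …
```

Keeping the summary separate from the `DisclosureGroup` means the collapsed label can stay a bare count while VoiceOver still hears the first few entries.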
|
||||||
|
|
||||||
|
#### 4e. NO write actions on iOS Platforms
|
||||||
|
|
||||||
|
No editor sheet for Google Chat. No allowlist editor. No toggle switches that send `hermes config set`. The existing `quickEditsSection` (lines 84-117) does drive `setSetting(key, value)` for "v1Editable" specs — WS-9 does NOT add the v0.13 platform fields to `SettingSpec.v1Editable`. That's a Mac-only concern in v2.8.0.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 5. iOS v0.13 features-active badge (Settings)
|
||||||
|
|
||||||
|
### File: `Scarf iOS/Settings/SettingsView.swift`
|
||||||
|
|
||||||
|
#### 5a. Capability check — semver, not a single flag
|
||||||
|
|
||||||
|
Per the prompt: "Capability-gate on `caps.semver >= 0.13.0`." The `HermesCapabilities` struct (verified at `Packages/ScarfCore/Sources/ScarfCore/Services/HermesCapabilities.swift`) exposes `atLeastSemver(_:_:_:)` — a private helper. The simplest public hook is to use any one of the v0.13-gated flags as the proxy (e.g. `caps.hasGoals`) since they all resolve to the same `>= 0.13.0` threshold; or expose a new `public var isV013OrLater: Bool` on `HermesCapabilities`. Recommend the latter for clarity:
|
||||||
|
|
||||||
|
> **Coordination requirement.** WS-9 needs `HermesCapabilities.isV013OrLater: Bool { atLeastSemver(0, 13, 0) }`. If WS-1 didn't ship this, WS-9 adds it as a one-line addition to `HermesCapabilities.swift`. Cheap and keeps the badge gating honest. Alternative: piggy-back on `caps.hasGoals` and accept the semantic drift (the badge says "v0.13 features active" but is gated on the goals flag specifically). Recommend the new helper.
|
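
A minimal sketch of what such a helper could look like, assuming the capability struct stores the host version as three integers. `CapabilitiesSketch` and its stored properties are hypothetical; only the `atLeastSemver(_:_:_:)` and `isV013OrLater` names come from this plan.

```swift
// Hypothetical miniature of the capability struct; field names are assumptions.
struct CapabilitiesSketch {
    var major: Int
    var minor: Int
    var patch: Int

    // Tuples compare lexicographically, which matches semver ordering
    // for plain major.minor.patch triples (no pre-release tags).
    private func atLeastSemver(_ major: Int, _ minor: Int, _ patch: Int) -> Bool {
        (self.major, self.minor, self.patch) >= (major, minor, patch)
    }

    // Public hook the Settings badge would gate on.
    var isV013OrLater: Bool { atLeastSemver(0, 13, 0) }
}

print(CapabilitiesSketch(major: 0, minor: 12, patch: 9).isV013OrLater)  // false
print(CapabilitiesSketch(major: 0, minor: 13, patch: 0).isV013OrLater)  // true
print(CapabilitiesSketch(major: 1, minor: 0, patch: 0).isV013OrLater)   // true
```

Exposing one named property keeps the private comparison helper private and gives the badge a gate whose name says what it checks.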
||||||
|
|
||||||
|
#### 5b. Mount the badge above `quickEditsSection`
|
||||||
|
|
||||||
|
```swift
var body: some View {
    List {
        if let err = vm.lastError { /* unchanged */ }

        if caps.isV013OrLater {
            v013ActiveBadgeSection
        }

        if !vm.isLoading || vm.config.model != "unknown" {
            quickEditsSection
            // ... rest unchanged
        }
    }
    // ... unchanged modifiers
}

@ViewBuilder
private var v013ActiveBadgeSection: some View {
    Section {
        Button {
            showV013FeaturesSheet = true
        } label: {
            HStack(spacing: 8) {
                ScarfBadge("v0.13 features active", kind: .success)
                Spacer()
                Text("Learn more")
                    .font(.caption)
                    .foregroundStyle(.tint)
                Image(systemName: "chevron.right")
                    .font(.caption)
                    .foregroundStyle(.tertiary)
            }
        }
        .buttonStyle(.plain)
    }
    .listRowBackground(ScarfColor.success.opacity(0.06))
}
```
|
||||||
|
|
||||||
|
**State.** Add `@State private var showV013FeaturesSheet = false` near the top.
|
||||||
|
|
||||||
|
**Color.** `.success` (green): the host has gained new capabilities, so the badge frames that as a positive. This keeps it distinct from the warning-tinted error banner above it.
|
||||||
|
|
||||||
|
#### 5c. "Learn more" sheet
|
||||||
|
|
||||||
|
```swift
.sheet(isPresented: $showV013FeaturesSheet) {
    V013FeaturesSheet()
}
```
|
||||||
|
|
||||||
|
New file `Scarf iOS/Settings/V013FeaturesSheet.swift` (~80 lines):
|
||||||
|
|
||||||
|
```swift
import SwiftUI
import ScarfDesign

struct V013FeaturesSheet: View {
    @Environment(\.dismiss) private var dismiss

    var body: some View {
        NavigationStack {
            List {
                Section {
                    featureRow(
                        icon: "scope",
                        title: "Persistent goals",
                        description: "Type /goal <text> in chat to lock the agent on a target across turns. Mac only in v2.8."
                    )
                    featureRow(
                        icon: "tray.full",
                        title: "ACP /queue",
                        description: "Queue prompts to run after the current turn finishes. Mac only in v2.8."
                    )
                    featureRow(
                        icon: "stethoscope",
                        title: "Kanban diagnostics",
                        description: "Worker distress signals (heartbeat stalls, retry caps, zombies) surface on the task detail."
                    )
                    featureRow(
                        icon: "questionmark.diamond.fill",
                        title: "Hallucination gate",
                        description: "Worker-created cards are flagged for verify/reject. Verify on the Mac app."
                    )
                    featureRow(
                        icon: "archivebox",
                        title: "Curator archive",
                        description: "Stale skills move to an Archived list. Restore or prune from the Mac app."
                    )
                    featureRow(
                        icon: "bubble.left.and.bubble.right",
                        title: "Google Chat platform",
                        description: "New gateway target — configure on the Mac app."
                    )
                } header: {
                    Text("What's new in v0.13")
                } footer: {
                    Text("This iOS release surfaces v0.13 features read-only. Editing lives in the Mac app for v2.8.")
                        .font(.caption)
                }
            }
            .navigationTitle("v0.13 features")
            .navigationBarTitleDisplayMode(.inline)
            .toolbar {
                ToolbarItem(placement: .topBarTrailing) {
                    Button("Done") { dismiss() }
                }
            }
        }
    }

    private func featureRow(icon: String, title: String, description: String) -> some View {
        HStack(alignment: .top, spacing: 12) {
            Image(systemName: icon)
                .foregroundStyle(.tint)
                .font(.title3)
                .frame(width: 28)
            VStack(alignment: .leading, spacing: 4) {
                Text(title).font(.body.weight(.semibold))
                Text(description)
                    .font(.callout)
                    .foregroundStyle(.secondary)
            }
        }
        .padding(.vertical, 4)
    }
}
```
|
||||||
|
|
||||||
|
**Copy is the load-bearing piece.** Each row is one sentence; the read-only-on-iOS framing is in the section footer. No deep links to the relevant tab — that's a v2.8.x polish, not WS-9.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Coordination with WS-2 / WS-3 / WS-4 / WS-5
|
||||||
|
|
||||||
|
WS-9 consumes models / fields / capability flags from earlier work-streams. **WS-9 must land AFTER all of them merge to main.**
|
||||||
|
|
||||||
|
| Consumed surface | Source WS | Consumed at |
|---|---|---|
| `HermesActiveGoal` model | WS-2 | iOS goal pill (§1) |
| `HermesQueuedPrompt` model | WS-2 | iOS queue chip (§1, no popover) |
| `RichChatViewModel.activeGoal` observable | WS-2 | iOS goal pill (§1) |
| `RichChatViewModel.queuedPrompts` observable | WS-2 | iOS queue chip (§1) |
| `HermesCapabilities.hasGoals` | WS-1 | iOS chat (§1) |
| `HermesCapabilities.hasACPQueue` | WS-1 | iOS chat (§1) |
| `HermesKanbanTask.maxRetries` | WS-3 | iOS Kanban detail (§2b) |
| `HermesKanbanTask.autoBlockedReason` | WS-3 | iOS Kanban detail (§2d) |
| `HermesKanbanTask.hallucinationGateStatus` + `KanbanHallucinationGate` | WS-3 | iOS Kanban detail (§2c) |
| `HermesKanbanTask.diagnostics` + `HermesKanbanDiagnostic` + `KanbanDiagnosticKind` | WS-3 | iOS Kanban detail (§2e–§2f) |
| `HermesKanbanRun.diagnostics` | WS-3 | iOS Kanban detail (§2f) |
| `HermesCapabilities.hasKanbanDiagnostics` | WS-1 | iOS Kanban detail (§2a) |
| `HermesCuratorArchivedSkill` model | WS-4 | iOS Curator (§3) |
| `CuratorViewModel.archivedSkills` + `loadArchive()` | WS-4 | iOS Curator (§3) |
| `CuratorService.listArchived()` | WS-4 | (transitively via VM in §3) |
| `HermesCapabilities.hasCuratorArchive` | WS-1 | iOS Curator (§3) |
| `HermesConfig.gateway.allowedChannels` / `.allowedChats` (TBD field names) | WS-5 | iOS Settings (§4d) |
| `HermesConfig.gateway.busyAckEnabled` / `.restartNotificationEnabled` (TBD) | WS-5 | iOS Settings (§4b) |
| `HermesConfig.googleChat.*` (TBD shape) | WS-5 | iOS Settings (§4c) |
| `HermesCapabilities.hasGoogleChatPlatform` / `.hasGatewayAllowlists` / `.hasGatewayBusyAckToggle` / `.hasGatewayRestartNotification` | WS-1 | iOS Settings (§4) |
| `HermesCapabilities.isV013OrLater` (NEW, see §5a) | WS-1 (small follow-up) | iOS Settings badge (§5) |
|
||||||
|
|
||||||
|
### Sequencing (recommended)
|
||||||
|
|
||||||
|
1. WS-2 (Goals + queue VM scaffolding) merges → iOS chat goal pill becomes wireable.
2. WS-3 (Kanban diagnostics models) merges → iOS Kanban detail extension becomes wireable.
3. WS-4 (Curator archive service + VM state) merges → iOS Curator section becomes wireable.
4. WS-5 (Gateway / Platforms config models + capability flags consumed) merges → iOS Settings extension becomes wireable.
5. WS-9 PR opens, builds against the merged baseline, ships all five additions in one PR.
|
||||||
|
|
||||||
|
Splitting WS-9 into per-mirror PRs is overkill — each diff is small, all gated, all read-only.
|
||||||
|
|
||||||
|
### Acceptable to land WS-9 in stages
|
||||||
|
|
||||||
|
If WS-5 slips, WS-9 can ship §1, §2, §3, and §5 first (the WS-2/3/4 mirrors plus the badge) and follow up with §4 (the Gateway/Platforms mirror) once WS-5 lands. The badge is independent of every mirror item: it can ship as soon as the WS-1 capability flags are in (already done).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Files to change / create
|
||||||
|
|
||||||
|
| File | Status | Purpose |
|---|---|---|
| `Scarf iOS/Chat/ChatView.swift` | EDIT | Goal pill + queue chip in `projectContextBar` (§1) |
| `Scarf iOS/Kanban/ScarfGoKanbanDetailSheet.swift` | EDIT | Diagnostics + max_retries + hallucination badge + auto-blocked banner (§2) |
| `Scarf iOS/Kanban/DiagnosticDetailSheet.swift` | NEW | Tap-target sheet showing one diagnostic's full message + timestamp (§2e) |
| `Scarf iOS/Curator/CuratorView.swift` | EDIT | Archived section + capability gate + extra `.task` load (§3) |
| `Scarf iOS/Settings/SettingsView.swift` | EDIT | v0.13 badge section + Platforms section extension (§4, §5) |
| `Scarf iOS/Settings/V013FeaturesSheet.swift` | NEW | "Learn more" sheet for the v0.13-features badge (§5c) |
| `Packages/ScarfCore/Sources/ScarfCore/Services/HermesCapabilities.swift` | EDIT (1 line) | `public var isV013OrLater: Bool` helper if not already present (§5a) |
|
||||||
|
|
||||||
|
**Total:** 7 files (2 new), ~350-450 lines. ~80% of the diff is the new sheets and the iOS Kanban detail extension.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Capability gating
|
||||||
|
|
||||||
|
Every WS-9 surface is hard-gated. Pre-v0.13 hosts see the v2.7.5 iOS surface unchanged.
|
||||||
|
|
||||||
|
| Surface | Gate | Pre-v0.13 behaviour |
|---|---|---|
| iOS goal pill | `caps.hasGoals && vm.activeGoal != nil` | hidden (transitively impossible: `activeGoal` stays nil because the Mac never writes it) |
| iOS queue chip | `caps.hasACPQueue && !vm.queuedPrompts.isEmpty` | hidden |
| iOS Kanban max_retries chip | `caps.hasKanbanDiagnostics && task.maxRetries != nil` | hidden (`if let` is belt-and-suspenders even if the capability flag leaks) |
| iOS Kanban hallucination badge | `caps.hasKanbanDiagnostics && KanbanHallucinationGate.from(...) == .pending` | hidden |
| iOS Kanban auto-blocked banner | `caps.hasKanbanDiagnostics && status == .blocked` and `autoBlockedReason` is non-empty | hidden |
| iOS Kanban diagnostics blocks (task + run) | `caps.hasKanbanDiagnostics && !diagnostics.isEmpty` | hidden |
| iOS Curator Archived section | `caps.hasCuratorArchive` | section absent; `loadArchive()` not invoked |
| iOS Settings v0.13 badge | `caps.isV013OrLater` | section absent |
| iOS Settings Google Chat row | `caps.hasGoogleChatPlatform` | row absent |
| iOS Settings Busy ack row | `caps.hasGatewayBusyAckToggle` | row absent |
| iOS Settings Restart notification row | `caps.hasGatewayRestartNotification` | row absent |
| iOS Settings Allowlists rows | `caps.hasGatewayAllowlists` | rows absent |
|
||||||
|
|
||||||
|
**Defensive default.** Every `capabilitiesStore?.capabilities ?? .empty` resolves the absent-store case to `false` for every flag. WS-1's `.empty` static is the explicit pre-v0.13 sentinel (verified — used elsewhere in iOS already at `HermesVersionBanner.swift:14`).
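
The compound gates in the table are plain boolean conjunctions, so they can be pulled out as pure functions and unit-tested without SwiftUI. The sketch below does this for the auto-blocked banner; `CapsSketch`, `StatusSketch`, and `showsAutoBlockedBanner` are hypothetical stand-ins for the ScarfCore types, and the reason string is invented.

```swift
// Hypothetical stand-ins; the real types live in ScarfCore.
struct CapsSketch {
    var hasKanbanDiagnostics = false
    static let empty = CapsSketch()   // every flag off, like the pre-v0.13 sentinel
}

enum StatusSketch { case todo, blocked, done }

// Auto-blocked banner gate from the table, expressed as a pure function.
func showsAutoBlockedBanner(caps: CapsSketch?, status: StatusSketch, reason: String?) -> Bool {
    let caps = caps ?? .empty                      // absent store behaves like pre-v0.13
    guard caps.hasKanbanDiagnostics, status == .blocked else { return false }
    return !(reason ?? "").isEmpty                 // nil and "" both hide the banner
}

let v013 = CapsSketch(hasKanbanDiagnostics: true)

print(showsAutoBlockedBanner(caps: nil, status: .blocked, reason: "retry cap hit"))   // false
print(showsAutoBlockedBanner(caps: v013, status: .blocked, reason: "retry cap hit"))  // true
print(showsAutoBlockedBanner(caps: v013, status: .blocked, reason: ""))               // false
```

The first call covers the absent-store case from the paragraph above: falling back to the all-off sentinel hides the banner even when the task data itself would qualify.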
|
||||||
|
|
||||||
|
**No new capability flags.** WS-9 adds at most one helper (`isV013OrLater`) to `HermesCapabilities`. All other flags are already shipped by WS-1.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## How to test
|
||||||
|
|
||||||
|
Per CLAUDE.md "remote-servers dogfooding" memory: dogfood against the Mardon Mac Mini at 192.168.0.82 (running the v0.13 binary on the `remote-servers` branch).

### iOS simulator scenarios — v0.13 host

1. **Goal pill**
   - Open the iOS chat against a v0.13 host. Switch to the Mac, run `/goal finish v2.8 by Friday` in the same session. Switch back to iOS — within 2-3 polled state refreshes the pill should appear in `projectContextBar` with truncated text "finish v2.8 by Friday".
   - VoiceOver: focus the pill, confirm full text reads as "Goal locked: finish v2.8 by Friday".
   - Run `/goal --clear` from Mac. Confirm pill drops on iOS.
   - Without an active project (chat without `projectContextBar` triggered today), confirm the bar STILL shows when the goal pill is the only chip — i.e. the bar is no longer project-only. Without a goal AND without a project, confirm the bar stays hidden.

2. **Queue chip**
   - Trigger a long-running prompt on Mac, send `/queue summarize` while it's working. Confirm iOS shows "1 queued" chip in the bar.
   - When the Mac turn finishes and the queued prompt fires, confirm the iOS chip count decrements.

3. **Kanban diagnostics**
   - Open the iOS Kanban detail sheet for a task with `max_retries: 3`. Confirm the "retries: 3" chip shows in the header.
   - Open a task in `pending` hallucination state. Confirm the yellow "Worker-created — verify on Mac" badge appears below the chip row.
   - Open a blocked task with `auto_blocked_reason`. Confirm the red "Auto-blocked" banner shows the reason verbatim.
   - Open a task with task-level diagnostics. Confirm the chip-list renders. Tap one — confirm the detail sheet opens with kind + message + timestamp.
   - Open a task whose latest run has `darwin_zombie_detected`. Confirm the per-run diagnostics chip-list renders inside the Runs tab row.

4. **Curator Archived list**
   - On v0.13 host with no archives: confirm Archived section renders with empty-state copy.
   - On v0.13 host with 3 archives: confirm rows show name, category badge, reason, archived-at label, size. No Restore button. Footer hint visible.
   - Pull-to-refresh: confirm `loadArchive()` re-fires.

5. **iOS Settings v0.13 badge**
   - On v0.13 host: confirm the green "v0.13 features active" badge sits above the Quick edits section. Tap "Learn more" — confirm the sheet opens with 6 feature rows.
   - Tap Done — confirm dismissal.

6. **iOS Settings Platforms additions**
   - On v0.13 host with Google Chat configured: confirm the Google Chat rows show. Tap is read-only (no nav).
   - With at least 3 allowed channels and 4 allowed chats configured: confirm both DisclosureGroup rows show with the correct counts. Expand each — confirm the entries render in monospaced font.
   - With Busy ack OFF and Restart notifications ON: confirm both rows show the right yes/no labels.
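
The visibility rules exercised by scenarios 1 and 2 can be written down as plain functions, which makes the expected outcomes easier to check by hand. This is a sketch under assumptions: `ContextBarState` and its field names are hypothetical, and a single `hostSupportsGoalsAndQueue` flag stands in for whichever WS-1 capability flags actually gate the goal pill and queue chip.

```swift
// Hypothetical inputs mirroring the state the bar reads; names are illustrative.
struct ContextBarState {
    var projectName: String?
    var activeGoal: String?
    var queuedPromptCount = 0
    // Stand-in for the capability gate; false models a pre-v0.13 host.
    var hostSupportsGoalsAndQueue = false
}

// The bar shows for a project, or (on a capable host) for a goal or a queued prompt.
func shouldShowContextBar(_ state: ContextBarState) -> Bool {
    let hasGoal = state.hostSupportsGoalsAndQueue && state.activeGoal != nil
    let hasQueue = state.hostSupportsGoalsAndQueue && state.queuedPromptCount > 0
    return state.projectName != nil || hasGoal || hasQueue
}

// VoiceOver label from scenario 1; nil when there is no goal to read out.
func goalAccessibilityLabel(_ state: ContextBarState) -> String? {
    guard state.hostSupportsGoalsAndQueue, let goal = state.activeGoal else { return nil }
    return "Goal locked: \(goal)"
}

// Queue chip text from scenario 2; nil when nothing is queued.
func queueChipText(_ state: ContextBarState) -> String? {
    guard state.hostSupportsGoalsAndQueue, state.queuedPromptCount > 0 else { return nil }
    return "\(state.queuedPromptCount) queued"
}

let goalOnly = ContextBarState(activeGoal: "finish v2.8 by Friday",
                               hostSupportsGoalsAndQueue: true)
let oneQueued = ContextBarState(queuedPromptCount: 1,
                                hostSupportsGoalsAndQueue: true)
let preV013 = ContextBarState(activeGoal: "finish v2.8 by Friday")
print(shouldShowContextBar(goalOnly), shouldShowContextBar(preV013))
```

Keeping the rules in pure functions like these, separate from the SwiftUI view body, is one way to make them unit-testable in ScarfCore even though WS-9 plans no iOS UI tests.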

### iOS simulator scenarios — pre-v0.13 host (regression smoke)

1. Connect to a Hermes v0.12 host (Mardon downgrade or local dev install).
2. Verify:
   - `projectContextBar` looks unchanged from v2.7.5 (no goal pill, no queue chip).
   - Kanban detail sheet: no max_retries chip, no hallucination badge, no auto-blocked banner, no diagnostics blocks. v2.7.5 layout intact.
   - Curator: no Archived section. Existing `runNow` / `pause` / `resume` / `pin` actions work.
   - Settings: no v0.13 badge. Platforms section shows the 5 v2.7.5 rows only.
3. Tap through every existing iOS surface to confirm no regressions.

### Dynamic Type accessibility smoke

Per CLAUDE.md: iOS clamps Dynamic Type at the scene root (`ScarfIOSApp.swift`: `.dynamicTypeSize(.xSmall ... .accessibility2)`). Verify at both extremes:

1. Settings → Accessibility → Display & Text Size → set to AX2.
2. Open chat: confirm goal pill text scales (semantic `.subheadline` should). Confirm pill chrome doesn't blow out — the truncation kicks in.
3. Open Kanban detail: confirm body text + diagnostics chip text scale. Badges (`ScarfBadge`) should NOT scale (they're chrome).
4. Open Curator Archived list: confirm skill name + reason scale. Archived-at label stays small.
5. Open Settings v0.13 sheet: confirm description text scales.
6. Switch to xSmall: confirm nothing collapses in a way that's unreadable.

### Build + test gates

- `xcodebuild -project scarf/scarf.xcodeproj -scheme "scarf mobile" -destination 'platform=iOS Simulator,name=iPhone 15' build` must succeed.
- All existing iOS UI smoke tests (if present in the target) stay green.
- New iOS-side snapshot or UI tests are NOT planned for WS-9 — the surfaces are read-only and visual; manual verification is the right pass for v2.8.0.

---

## Open questions

1. **Does iOS Curator surface exist today?** ✅ Confirmed yes. `Scarf iOS/Curator/CuratorView.swift` exists and is read-mostly with runNow / pause / resume / pin / unpin actions. WS-9 mirror item 4 (Curator Archived list) is in scope. (The user prompt anticipated this might be unknown.)

2. **iOS goal/queue chip — is the queue chip tap a no-op or does it open a previews sheet?** Recommend tap = no-op for v2.8.0 (read-only badge, mirroring the goal pill's no-op tap). A previews sheet is nice-to-have but doesn't cross the bar for v2.8 — the user can see queued prompts from the Mac app. If review pushes back, a 30-line sheet listing previews + queued-at timestamps is cheap to add.

3. **WS-5 plan does not yet exist (`scarf/docs/v2.8/WS-5-gateway-v0.13-plan.md` is missing).** The exact `HermesConfig.gateway.*` and `HermesConfig.googleChat.*` field names are TBD. **Action:** before WS-9 implementation starts, sync with the WS-5 author to align on:
   - Where do the allowlists live? `HermesConfig.gateway.allowedChannels: [String]?` or `HermesConfig.platforms.<each>.allowedChannels`?
   - Are restart-notifications and busy-ack global (one toggle) or per-platform (one per Discord/Slack/Telegram/Matrix/Google-Chat)?
   - Is "busy ack" the right wire name? Hermes might call it `busy_acknowledge` or `busy_indicator`.
   - Does Google Chat use the same `requireMention` shape as Discord/Telegram/Matrix?

   WS-9's Settings extensions (§4) are correct in shape but need the field-name patches once WS-5 confirms. The capability flags are stable.

4. **`HermesCapabilities.isV013OrLater` helper.** WS-1 may or may not have shipped this. If not, WS-9 ships a one-line addition. If `caps.hasGoals` is acceptable as a proxy (since all v0.13 flags resolve to the same threshold), the helper isn't strictly needed — but the badge copy says "v0.13 features active" so semantic alignment matters. Coordinator should pick one.

5. **`projectContextBar` re-render frequency.** Today it renders only when there's a project. After WS-9, it renders when there's a project OR a goal OR a queued prompt. The added re-render churn during streaming (every diff to `vm.activeGoal` / `vm.queuedPrompts`) may matter for ScarfMon's `chatRender` budget. **Action:** add a ScarfMon counter to the bar's body to measure during dogfooding. If churn becomes a hot-path issue, extract `goalChip` and `queueChip` into separately-scoped subviews so they re-render in isolation.

6. **Animation on pill / chip appearance.** Should the goal pill fade in when `vm.activeGoal` becomes non-nil? Recommend yes — `.transition(.opacity.combined(with: .scale(scale: 0.9)))` with a `.spring(response: 0.3, dampingFraction: 0.7)` parent animation. Keeps the bar from feeling like it pops. Apply same to the queue chip and the Kanban hallucination badge.

7. **Tap target for the Kanban hallucination badge.** Currently planned as informational-only. Should tapping it open an alert with explanation copy + a "Open in Mac app" placeholder action? Recommend NO for v2.8.0 — the on-screen "verify on Mac" copy is enough; an alert is unnecessary friction for a read-only surface.

8. **iOS deep links from the v0.13 features sheet.** Tapping a feature row could deep-link to the relevant tab (e.g. tap "Hallucination gate" → switch to Kanban tab). Recommend defer — the v2.8.0 sheet is text-only. v2.8.x can add the routing.

---

## Out of scope (deferred to v2.8.x or later)

- **iOS write surfaces** for everything WS-9 mirrors:
  - `/goal` and `/queue` send from iOS chat composer.
  - Verify / Reject buttons on the iOS Kanban detail sheet.
  - Archive / Restore / Prune on the iOS Curator surface.
  - Allowlist editor / platform toggle editor in iOS Settings.
- **Gateway/Platforms iOS feature module from scratch** (separate `Scarf iOS/Gateway/` or `Scarf iOS/Platforms/` dir). v2.8.0 keeps gateway/platform config as an extension to `SettingsView.platformsSection`.
- **iOS Curator Archive `live` updates** beyond pull-to-refresh + the existing `.task` invocation. Hermes hasn't shipped a curator-watch surface; iOS won't either.
- **iOS Kanban hallucination badge tap-to-explain alert** — recommend not adding (see Open Question #7).
- **iOS Kanban diagnostics history graph** — Mac WS-3 also defers this. iOS follows.
- **iOS deep links from v0.13 features sheet** — see Open Question #8.
- **Snapshot tests for the new iOS sheets** — manual verification is the v2.8.0 pass.
- **Localization** — every new copy string is English-only. Existing iOS surfaces aren't localized either; WS-9 stays consistent.
- **iOS Goal pill custom font / pill chrome migration to a `ScarfDesign` component** — keep inline. If Mac WS-2 lands a reusable `ScarfGoalPill` component in the design package, swap iOS to use it as a follow-up.
- **iOS goal-state persistence across app suspends** — relies on the Mac VM state being authoritative. iOS just renders what it polls. If this matters in dogfooding (user perceives a stale pill after a long suspend), revisit.
- **Telemetry counters** for new iOS surfaces (e.g. ScarfMon counter on goal-pill appearance). Add if dogfooding surfaces a perf signal; otherwise ship without.
- **Per-platform notification re-routing toggles on iOS** (e.g. "send Google Chat alerts to APNS"). Out of scope — APNS routing already lives in `Notifications/NotificationRouter.swift` and is platform-agnostic.

---

## Estimate

**Engineering hours (one engineer, focused), assuming WS-2 / WS-3 / WS-4 / WS-5 are merged to main:**

| Block | Hours |
|---|---|
| iOS chat goal pill + queue chip in `projectContextBar` (§1) | 2 |
| iOS Kanban detail sheet — chips + banners + diagnostics blocks + tap sheet (§2) | 5 |
| iOS Kanban `DiagnosticDetailSheet.swift` (NEW, ~30 LOC) | 1 |
| iOS Curator Archived section (§3) | 2 |
| iOS Settings Platforms extension + capability env injection (§4) | 3 |
| iOS Settings v0.13 badge + sheet (§5, including new sheet file) | 2 |
| `HermesCapabilities.isV013OrLater` helper (if not present) | 0.5 |
| Manual smoke on iPhone simulator (v0.13 + v0.12 hosts) + Dynamic Type pass | 3 |
| Code review + revisions | 2 |
| Buffer for WS-5 field-name alignment (Open Q #3) | 1.5 |
| **Total** | **~22 hours (≈3 working days)** |

**Confidence: medium-high.** All five items (§1–§5) are mechanical given the existing iOS surface scaffolding (`projectContextBar`, `ScarfGoKanbanDetailSheet`, `CuratorView`, `SettingsView.platformsSection`). The only real risk is WS-5 field-name drift, captured in Open Question #3, and it is contained to the Settings → Platforms extensions (§4). If WS-5 slips, §1–§3 and §5 ship first; §4 follows once WS-5 lands.

**Critical-path dependency:** WS-2, WS-3, WS-4, WS-5 must all be on `main` before the WS-9 PR opens. WS-9 is the final "iOS catch-up" PR of the v2.8.0 release cycle.

**Risk register:**

- **WS-5 field-name drift.** Mitigated by the Open Question #3 sync with the WS-5 author before implementation; the Settings extensions stub clearly named provisional field names that fail fast at compile time if WS-5 ships different names.
- **Dynamic Type churn.** Goal pill and Kanban diagnostics blocks are content text, so they scale. Verify nothing collapses at AX2; the truncation strategies in §1b and the FlowLayout primitive in §2e are the v2.7.5 patterns and known-good.
- **`projectContextBar` re-render churn.** Open Question #5 captures this. Add a ScarfMon counter; revisit if dogfooding shows a hot-path issue.
- **iOS Kanban polling cadence.** The existing 5s poll picks up the new fields automatically. No new polling logic is required.
- **No iOS test coverage regression.** WS-9 doesn't add tests but doesn't remove any either. The shared `RichChatViewModel` / `CuratorViewModel` / `KanbanService` tests in ScarfCore (extended by WS-2/3/4) cover the model + state-machine layer; iOS-specific UI is verified manually in v2.8.0.
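
The first mitigation in the risk register (provisional field names that fail at compile time) could look roughly like the sketch below. Every type and property name here is hypothetical, since the real WS-5 `HermesConfig` shape is still open per Open Question #3. The idea is only that the Settings rows read gateway fields through one typed adapter, so a rename in WS-5 breaks the build in a single place instead of silently rendering empty rows.

```swift
// Hypothetical WS-5 config shape; all property names are provisional guesses.
struct GatewayConfig {
    var allowedChannels: [String]? = nil
    var allowedChats: [String]? = nil
    var busyAck: Bool? = nil
    var restartNotification: Bool? = nil
}

// Read-only values the iOS Settings Platforms rows would render.
struct GatewayRows {
    let allowedChannelCount: Int
    let allowedChatCount: Int
    let busyAckLabel: String
    let restartNotificationLabel: String
}

// The only function that touches the provisional names. If WS-5 ships
// different property names, this is the one place that stops compiling.
func makeGatewayRows(from config: GatewayConfig) -> GatewayRows {
    // Missing values are treated as "no", matching the defensive default.
    func yesNo(_ value: Bool?) -> String { (value ?? false) ? "yes" : "no" }
    return GatewayRows(
        allowedChannelCount: config.allowedChannels?.count ?? 0,
        allowedChatCount: config.allowedChats?.count ?? 0,
        busyAckLabel: yesNo(config.busyAck),
        restartNotificationLabel: yesNo(config.restartNotification)
    )
}

// Mirrors Platforms scenario 6: 3 channels, 4 chats, busy ack off, restart on.
let sampleRows = makeGatewayRows(from: GatewayConfig(
    allowedChannels: ["c1", "c2", "c3"],
    allowedChats: ["a", "b", "c", "d"],
    busyAck: false,
    restartNotification: true
))
print(sampleRows.allowedChannelCount, sampleRows.allowedChatCount,
      sampleRows.busyAckLabel, sampleRows.restartNotificationLabel)
```

An empty `GatewayConfig()` yields zero counts and "no" labels, which matches how the rows would look before WS-5 populates the fields.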
|
||||||