mirror of
https://github.com/awizemann/scarf.git
synced 2026-05-10 10:36:35 +00:00
f5f8dc30b6
* feat(templates): hackernews-digest template + dogfooding test harness

First pass of the dogfooding-templates initiative. Each pre-release cycle ships one new official `.scarftemplate` and uses installing/exercising that template as the regression test. v1 lands the harness scaffolding plus the first template under it.

- HackerNews Daily Digest template (`templates/awizemann/hackernews-digest/`): config-driven (min_score / max_items / topics) cron-only template. No secrets — keeps the harness minimal until the fake-Keychain shim lands. Bundle validates against `tools/build-catalog.py`; entry added to `templates/catalog.json`.
- `SCARF_HERMES_HOME` env-var override at `HermesProfileResolver` — the seam every Layer-B test relies on to drive Scarf against an isolated Hermes home. Bypasses cache + active_profile lookup; rejects relative paths. 5 unit tests + 3 ServerContext integration tests.
- `TestModeFlags.shared.isTestMode` — reads `--scarf-test-mode` once from `CommandLine.arguments`. Wiring only; gating sites (Sparkle, capability probe, first-run walkthrough) land as Layer-B exercises them.
- Layer A (`scarf/scarfTests/TemplateE2ETests.swift`): parses + plans the shipped HN bundle the way the app does at install time; asserts manifest, config schema, dashboard widgets, and cron prompt contract. Mirrors the existing site-status-checker coverage.
- Layer B scaffold (`scarf/scarfUITests/TemplateInstallUITests.swift`): proves the launch-arg + env-var plumbing reaches Scarf. Full install click-through deferred until fixture-Hermes-home and accessibility IDs land.

Wiki pages added separately on the `.wiki-worktree` branch:

- `Template-Ideas.md` — backlog of 9 v1-feasible templates + full-spec v3 epic for Project-Site-as-Living-Surface (eBay listings use case).
- `Test-Harness.md` — contributor guide for extending the harness.

Verification: scarfTests 124/124, ScarfCore 220/220, new Layer A 3/3, Layer B scaffold 1/1, build-catalog.py + its 28 unit tests all green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(test-harness): Layer B pivot to real ~/.hermes + a11y IDs + Sparkle gating

Discovered during Layer B work that XCUITest runners are sandboxed: they can read ~/.hermes/ but writes throw NSFileWriteNoPermissionError. That kills the SCARF_HERMES_HOME-based isolation pattern for UI tests — snapshot/restore from inside the runner can't work.

Pivot:

- Layer B drives the real ~/.hermes the dev Mac is already running against. The harness assumes a working Hermes install (XCTSkip if the binary isn't there). Cleanup is via the app's own UI flows (which have full disk access), not direct file I/O. Layer A keeps its env-var seam — those tests run inside the host app's address space and write freely.
- SwiftUI's WindowGroup(for: ServerID.self) doesn't auto-surface a window on a fresh XCUIApplication.launch(). The harness sends ⌘1 (the "Open Server → Local" menu shortcut wired in scarfApp.swift's OpenServerCommands) to take the same code path real users hit via Dock click.
- Real user home resolved via getpwuid(getuid()) rather than NSHomeDirectory(), which inside the sandboxed runner returns ~/Library/Containers/com.scarfUITests.xctrunner/Data.
- 8 accessibility IDs added on the install path so the next iteration can drive the full Templates → Install from URL → Parent dir → Confirm Install flow without depending on view-tree label scraping: templates.toolbar.menu, templates.installFromFile, templates.installFromURL, templates.installURL.field, templates.installURL.confirm, templateInstall.parentDir.field, templateInstall.parentDir.continue, templateInstall.confirmInstall.
- TestModeFlags.shared.isTestMode now gates UpdaterService — --scarf-test-mode launches Sparkle inert so update prompts don't pop on top of an XCUITest-driven window. Production launches unchanged.

FixtureHermesHome.swift removed — the fixture-tmpdir approach is abandoned in favour of using the real installation.

Layer A's SCARF_HERMES_HOME tests still pass; they just don't need a populated home to exercise path derivation.

Verification: scarfTests 124/124, ScarfCore 220/220, Layer B smoke 1/1 (after fresh build — XCUITest is sensitive to stale binaries). catalog.py --check still green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(chat): clip placeholder to TextEditor bounds and clear it on focus

Two related bugs in the Mac chat composer's placeholder overlay:

- The "Message Hermes… / for commands · drag images to attach" hint had no width constraint, so on narrower window geometries it visibly overflowed past the rounded TextEditor boundary. Add `lineLimit(1)`, `truncationMode(.tail)`, and `frame(maxWidth: .infinity, alignment: .leading)` so it ellipsizes inside the field instead.
- The opacity formula `text.isEmpty ? 1 : 0` only hid the placeholder once content was typed, not when the field gained focus. Standard NSTextField / UITextField semantics clear the placeholder on focus. Switch to `(text.isEmpty && !isFocused) ? 1 : 0` so the hint disappears the moment the user clicks into the field.

The opaque-background ghosting mitigation from #65 is preserved unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(chat): surface OAuth refresh-revoked errors with in-app re-auth

When an OAuth provider's refresh token was revoked, Hermes printed "Refresh session has been revoked. Run `hermes model` to re-authenticate." to stderr but Scarf swallowed it — the user saw a typing indicator that silently disappeared with no banner, no system message, no actionable hint. The error classifier had no pattern for OAuth revocation.

- `ACPErrorHint.classify` now returns a `Classification` struct carrying the hint plus an optional `oauthProvider` name. New patterns match "Refresh session has been revoked", "re-authenticate", and 401-with-OAuth-provider-name (whole-word so `anthropicapi` doesn't false-match `anthropic`). Provider extraction lets the UI dispatch the right re-auth flow.
- Chat error banner ([ChatView.swift]) gains a "Re-authenticate" button when an OAuth provider was identified — sets `AppCoordinator.pendingOAuthReauth` and routes to Credential Pools.
- Credential Pools view consumes the hand-off slot to auto-present AddCredentialSheet seeded with the affected provider, AND adds a per-row "Re-authenticate" button on every OAuth provider so users who go straight there don't have to retype the provider name.
- `AddCredentialSheet` accepts an optional `initialProvider` that pre-fills providerID + authType=.oauth; the existing Nous-vs-PKCE-vs-CLI gate dispatches re-auth identically to first-time setup — reuses the same `OAuthFlowController` / `NousSignInSheet` plumbing, no new flow code.

Verification: ScarfCore 221/221 (incl. new errorHintsClassifyOAuthRefreshRevoked covering the four patterns + word-boundary guard); Mac app builds clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(catalog): in-app template catalog browser + sentinel-marker test isolation

The v2.8 catalog browser surfaces every shipped .scarftemplate from awizemann.github.io/scarf/templates/catalog.json directly in Scarf. Users now discover and install templates without leaving the app. Closes the gap that publishing the catalog updated the website but nothing inside Scarf.

Architecture mirrors NousModelCatalogService 1:1: cache-first fetch, 24h TTL at ~/.hermes/scarf/catalog_cache.json, result enum (fresh / cache / fallback) with bundled fallback so a fresh-install / offline user still sees something. Search + category filter + sort (awizemann official first). Detail page renders entry.config schema preview without separate README fetch — what's in catalog.json is what we render. Install hands the HTTPS URL to the existing TemplateInstallerViewModel.openRemoteURL flow; nothing about the installer itself changes.

Files:

- Core/Models/CatalogEntry.swift — Decodable mirror of catalog.json per-template shape. Identity-based Equatable/Hashable on `id`.
- Core/Services/CatalogService.swift — fetch + cache + fallback
- Core/Services/InstalledTemplatesIndex.swift — walks projects.json + template.lock.json to build [templateId: version] map; classify() helper for Installed / Update available / Not installed badges
- Features/Templates/ViewModels/CatalogViewModel.swift — @Observable
- Features/Templates/Views/{CatalogView,CatalogRowView,CatalogDetailView,CatalogCategoryFilter}.swift
- Packages/ScarfCore/.../HermesPathSet.swift — adds catalogCache path
- Features/Projects/Views/ProjectsView.swift — Templates toolbar menu now opens with "Browse Catalog…"; sheet binding.

Tests (20 new, all passing in isolation):

- CatalogServiceTests (6) — live catalog.json snapshot, cache lifecycle, staleness boundary, schema-version mismatch rejection, bundled fallback
- InstalledTemplatesIndexTests (5) — empty registry, templated project, ad-hoc project skip, corrupt lock skip, classify() branches
- CatalogViewModelTests (6) — search filter, category filter, official-first sort, deduped categories, install state, install URL pass-through

Accessibility IDs (6, on the catalog path): templates.browseCatalog, catalog.searchField, catalog.refreshButton, catalog.row.<detailSlug>, catalog.categoryFilter, catalogDetail.installButton.

## Sentinel-marker hardening on SCARF_HERMES_HOME (incident response)

While iterating on v2.8 tests, the env-var override pattern racing under Swift Testing's parallel-suite scheduler caused ~/.hermes/scarf/projects.json to be overwritten with fixture data from ProjectsViewModelTests. Recovered the user's projects from the on-disk dirs they referenced + cron-job prompt paths (6 projects restored).

To make this class of incident impossible going forward: HermesProfileResolver.scarfHermesHomeOverride() now requires the override path to contain a sentinel marker file (`.scarf-test-home-marker`). Without the marker, the override is ignored and Scarf falls through to the real ~/.hermes/. Even if a test crashes mid-teardown leaving the env var set, even if the var leaks to a non-test process, even if a misconfigured launchctl plist exports it — the override only activates against directories that explicitly opt in by carrying the marker. Tests drop the marker in their tmpdir setUp; production never carries it.

HermesProfileResolverTests gains overrideIsIgnoredWhenMarkerMissing which verifies the guard is load-bearing. All test files using SCARF_HERMES_HOME (CatalogServiceTests, CatalogViewModelTests, InstalledTemplatesIndexTests, TemplateE2ETests) now drop the marker before setenv.

Verification: 20/20 v2.8 + v2.7 hardened tests pass; 45/45 adjacent existing tests pass; ScarfCore package tests pass (221/221); catalog validator clean (3 templates); wiki secret-scan clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(swift6): retroactive conformance + verbatim help text + xcstrings refresh

Three small Swift 6 compile-cleanups that landed during the dogfooding-templates iteration:

- MessageSpeechService — drop `@preconcurrency` on the AVSpeechSynthesizerDelegate conformance now that the protocol's Sendable annotations are upstreamed.
- ChatView — mark `RichChatViewModel.PendingPermission: Identifiable` as `@retroactive`. We don't own either the type or the protocol; the Swift 6 compiler flags this so downstream breakage is loud if ScarfCore ever adds the conformance upstream.
- CredentialPoolsView — wrap the `.help(...)` string in `Text(verbatim:)` so the backticks render literally instead of being interpreted as markdown inline-code by the LocalizedStringKey overload (which `.help(_:)` rejects styled).

Localizable.xcstrings: auto-generated catalog refresh picking up the new active-profile + chat error-hint strings landed in earlier commits on this branch (acd3692, 301806d).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(catalog): error logging + MainActor I/O + semver pre-release + decoder fault tolerance

- InstalledTemplatesIndex: replace bare `try?` reads/decodes with logged do/catch so corrupt registry/lock files leave a breadcrumb instead of a silent nil.
- InstalledTemplatesIndex.isVersionNewer: handle pre-release suffixes per semver §11 — `1.0.0-beta` no longer reports as newer than `1.0.0`, preventing a ghost "Update available" that would downgrade users.
- CatalogViewModel.refresh: dispatch the synchronous index walk through `Task.detached` so registry + N lock-file reads don't run on @MainActor.
- Catalog decoder: per-element fault tolerance via custom `init(from:)` — one malformed catalog entry is dropped with a logged warning instead of failing the whole catalog decode (honors the per-entry doc-comment contract).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
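The pre-release rule cited for `InstalledTemplatesIndex.isVersionNewer` (semver §11: a release outranks any pre-release of the same core version) can be sketched roughly as below. This is an illustration under assumptions, not the code in Scarf: the `SemVer` type and the free function `isNewer(_:than:)` are hypothetical names introduced here.

```swift
import Foundation

/// Hypothetical parsed version; not the type Scarf uses.
struct SemVer {
    let core: [Int]           // major, minor, patch
    let preRelease: [String]  // empty for a normal release

    init?(_ string: String) {
        // Build metadata ("+...") never affects precedence, so drop it.
        let noBuild = string.split(separator: "+", maxSplits: 1,
                                   omittingEmptySubsequences: false)[0]
        let parts = noBuild.split(separator: "-", maxSplits: 1,
                                  omittingEmptySubsequences: false)
        let numbers = parts[0].split(separator: ".", omittingEmptySubsequences: false)
            .map { Int(String($0)) }
        guard numbers.count == 3, !numbers.contains(where: { $0 == nil }) else { return nil }
        core = numbers.compactMap { $0 }
        preRelease = parts.count > 1 ? parts[1].split(separator: ".").map(String.init) : []
    }
}

/// Returns true when `candidate` has strictly higher precedence than `installed`.
func isNewer(_ candidate: SemVer, than installed: SemVer) -> Bool {
    // 1. Major, minor, patch compare numerically, left to right.
    for (c, i) in zip(candidate.core, installed.core) where c != i {
        return c > i
    }
    // 2. Same core version: a release outranks any pre-release of it.
    switch (candidate.preRelease.isEmpty, installed.preRelease.isEmpty) {
    case (true, true): return false
    case (true, false): return true
    case (false, true): return false
    case (false, false): break
    }
    // 3. Both are pre-releases: compare dot-separated identifiers.
    for (c, i) in zip(candidate.preRelease, installed.preRelease) where c != i {
        switch (Int(c), Int(i)) {
        case let (cn?, inn?): return cn > inn  // numeric identifiers compare as numbers
        case (nil, nil): return c > i          // text identifiers compare lexically (ASCII input assumed)
        case (nil, _?): return true            // text outranks numeric
        case (_?, nil): return false
        }
    }
    // 4. All shared identifiers equal: the longer identifier list wins.
    return candidate.preRelease.count > installed.preRelease.count
}
```

With this ordering, a catalog entry at `1.0.0-beta` compared against an installed `1.0.0` is not newer, which is the "ghost update" case the commit describes.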
216 lines
11 KiB
Swift
import Testing
import Foundation
import ScarfCore
@testable import scarf

/// End-to-end coverage for the dogfooding-templates harness.
///
/// Two suites live here:
///
/// 1. `HackerNewsDigestTemplateE2ETests` — exercises the shipped
///    `awizemann/hackernews-digest` bundle the way Scarf will at install
///    time: unpack, parse, validate the manifest + dashboard + cron
///    against the same `ProjectTemplateService` the app uses, then build
///    a `TemplateInstallPlan` and assert the resulting plan would write
///    the right files in the right places. Mirrors
///    `ProjectTemplateExampleTemplateTests.siteStatusCheckerParsesAndPlans`
///    so each shipped template gets the same regression net.
///
/// 2. `ScarfHermesHomeOverrideE2ETests` — proves the `SCARF_HERMES_HOME`
///    env-var override (added in `HermesProfileResolver`) actually steers
///    `ServerContext.local.paths`. This is the seam the Layer-B XCUITest
///    relies on to drive Scarf against an isolated Hermes home; if it
///    silently regresses, UI tests would suddenly start writing into the
///    user's real `~/.hermes`. Running it here keeps that invariant
///    visible from the unit-test target.
@Suite struct HackerNewsDigestTemplateE2ETests {

    /// Parse + plan the shipped HN Digest bundle, assert its shape, and
    /// confirm the cron prompt + dashboard contract are intact.
    @Test func hackernewsDigestParsesAndPlans() throws {
        let bundle = try Self.locateExample(author: "awizemann", name: "hackernews-digest")

        let service = ProjectTemplateService(context: .local)
        let inspection = try service.inspect(zipPath: bundle)
        defer { service.cleanupTempDir(inspection.unpackedDir) }

        // Manifest shape — mirror the install-time invariants the catalog
        // validator enforces, so this test fails locally before a bad
        // bundle escapes to PR.
        #expect(inspection.manifest.id == "awizemann/hackernews-digest")
        #expect(inspection.manifest.name == "HackerNews Daily Digest")
        #expect(inspection.manifest.schemaVersion == 2)
        #expect(inspection.manifest.contents.dashboard)
        #expect(inspection.manifest.contents.agentsMd)
        #expect(inspection.manifest.contents.cron == 1)
        #expect(inspection.manifest.contents.config == 3)
        #expect(inspection.manifest.contents.skills == nil)
        #expect(inspection.manifest.contents.memory == nil)
        #expect(inspection.cronJobs.count == 1)
        #expect(inspection.cronJobs.first?.name == "Daily HN digest")
        #expect(inspection.cronJobs.first?.schedule == "0 8 * * *")

        // Config schema — three fields with the constraints the README
        // promises. The validator catches missing fields; this catches
        // wrong constraints (e.g. a default that drifts away from the
        // text in README.md, or a maxItems someone bumped without
        // updating the surrounding docs).
        let schema = try #require(inspection.manifest.config)
        #expect(schema.fields.count == 3)
        let topicsField = try #require(schema.field(for: "topics"))
        #expect(topicsField.type == .list)
        #expect(topicsField.itemType == "string")
        #expect(topicsField.required == false)
        #expect(topicsField.maxItems == 20)
        let minScoreField = try #require(schema.field(for: "min_score"))
        #expect(minScoreField.type == .number)
        #expect(minScoreField.minNumber == 1)
        #expect(minScoreField.maxNumber == 1000)
        let maxItemsField = try #require(schema.field(for: "max_items"))
        #expect(maxItemsField.type == .number)
        #expect(maxItemsField.minNumber == 5)
        #expect(maxItemsField.maxNumber == 50)
        #expect(schema.modelRecommendation?.preferred == "claude-haiku-4")

        let scratch = try ProjectTemplateServiceTests.makeTempDir()
        defer { try? FileManager.default.removeItem(atPath: scratch) }
        let plan = try service.buildPlan(inspection: inspection, parentDir: scratch)

        #expect(plan.projectDir.hasSuffix("awizemann-hackernews-digest"))
        #expect(plan.skillsFiles.isEmpty)
        #expect(plan.memoryAppendix == nil)
        #expect(plan.cronJobs.count == 1)
        #expect(plan.configSchema?.fields.count == 3)
        #expect(plan.manifestCachePath?.hasSuffix("/.scarf/manifest.json") == true)

        let destinations = plan.projectFiles.map(\.destinationPath)
        #expect(destinations.contains { $0.hasSuffix("/.scarf/config.json") })
        #expect(destinations.contains { $0.hasSuffix("/.scarf/manifest.json") })
        #expect(destinations.contains { $0.hasSuffix("/.scarf/dashboard.json") })

        // Cron-job name gets the template tag prefix so users can
        // identify + remove it from the Cron sidebar later.
        #expect(plan.cronJobs.first?.name == "[tmpl:awizemann/hackernews-digest] Daily HN digest")

        // The bundled dashboard.json must decode cleanly against the
        // same struct the app renders with — catches drift between
        // template-author conventions and the runtime renderer.
        let dashboardPath = inspection.unpackedDir + "/dashboard.json"
        let dashboardData = try Data(contentsOf: URL(fileURLWithPath: dashboardPath))
        let dashboard = try JSONDecoder().decode(ProjectDashboard.self, from: dashboardData)
        #expect(dashboard.title == "HackerNews Digest")
        #expect(dashboard.theme?.accent == "orange")
        // Three sections: Today's Digest (3 stat widgets), Top Stories
        // (1 list widget), How to Use (1 text widget). No webview —
        // this template intentionally doesn't expose a Site tab.
        #expect(dashboard.sections.count == 3)

        let statsSection = dashboard.sections[0]
        #expect(statsSection.title == "Today's Digest")
        let statTitles = statsSection.widgets.filter { $0.type == "stat" }.map(\.title)
        #expect(statTitles.contains("Top Story Score"))
        #expect(statTitles.contains("Items Tracked"))
        #expect(statTitles.contains("Last Run"))

        // The agent's contract: cron prompt references the four nouns
        // the dashboard + log files depend on. If any reference goes
        // missing, AGENTS.md and the prompt have desynced and the
        // agent will run against stale assumptions.
        let cronPrompt = inspection.cronJobs.first?.prompt ?? ""
        #expect(cronPrompt.contains("config.json"))
        #expect(cronPrompt.contains("min_score"))
        #expect(cronPrompt.contains("max_items"))
        #expect(cronPrompt.contains("topics"))
        #expect(cronPrompt.contains("dashboard.json"))
        #expect(cronPrompt.contains("digest.md"))
        #expect(cronPrompt.contains("hacker-news.firebaseio.com"))
        // {{PROJECT_DIR}} stays unresolved in the bundle — the installer
        // substitutes it at install time. A baked absolute path here
        // would follow every install to every user's machine.
        #expect(cronPrompt.contains("{{PROJECT_DIR}}"))
    }

    nonisolated private static func locateExample(author: String, name: String) throws -> String {
        var dir = URL(fileURLWithPath: #filePath).deletingLastPathComponent()
        for _ in 0..<6 {
            let candidate = dir.appendingPathComponent("templates/\(author)/\(name)/\(name).scarftemplate")
            if FileManager.default.fileExists(atPath: candidate.path) {
                return candidate.path
            }
            dir = dir.deletingLastPathComponent()
        }
        throw ProjectTemplateError.requiredFileMissing("templates/\(author)/\(name)/\(name).scarftemplate")
    }
}

/// Smoke-tests the SCARF_HERMES_HOME override at the `ServerContext.local`
/// integration point. The unit-level resolver tests live in
/// `HermesProfileResolverOverrideTests`; this exercises the same seam from
/// the surface every Scarf service actually reads — `ServerContext.paths`.
@Suite(.serialized)
struct ScarfHermesHomeOverrideE2ETests {

    private static let envKey = "SCARF_HERMES_HOME"

    @Test func overrideSteersServerContextPaths() throws {
        let snapshot = TestRegistryLock.acquireAndSnapshot()
        let saved = ProcessInfo.processInfo.environment[Self.envKey]
        defer {
            restore(saved)
            TestRegistryLock.restore(snapshot)
        }

        let tmp = NSTemporaryDirectory().appending("scarf-e2e-home-\(UUID().uuidString)")
        try FileManager.default.createDirectory(atPath: tmp, withIntermediateDirectories: true)
        // Sentinel marker so the override is honored. Without this,
        // `HermesProfileResolver.scarfHermesHomeOverride()` ignores the
        // env var to protect the user's real `~/.hermes`.
        try Data().write(to: URL(fileURLWithPath: tmp + "/" + HermesProfileResolver.testHomeMarkerFilename))
        defer { try? FileManager.default.removeItem(atPath: tmp) }
        setenv(Self.envKey, tmp, 1)

        // Every derived path in HermesPathSet is computed off `home`, so
        // proving `home` flips is enough to guarantee state.db, config.yaml,
        // sessions/, cron/, scarf/projects.json, et al. all redirect.
        // We assert the registry path explicitly because that's the one
        // most likely to clobber the user's real ~/.hermes if the
        // override regresses.
        let paths = ServerContext.local.paths
        #expect(paths.home == tmp)
        #expect(paths.projectsRegistry == tmp + "/scarf/projects.json")
        #expect(paths.cronJobsJSON == tmp + "/cron/jobs.json")
        #expect(paths.configYAML == tmp + "/config.yaml")
    }

    @Test func overrideUnsetReturnsToProductionHome() {
        let snapshot = TestRegistryLock.acquireAndSnapshot()
        let saved = ProcessInfo.processInfo.environment[Self.envKey]
        defer {
            restore(saved)
            TestRegistryLock.restore(snapshot)
        }

        unsetenv(Self.envKey)
        HermesProfileResolver.invalidateCache()

        // Without the override, `paths.home` resolves to the user's real
        // Hermes home (or the active profile under it). We don't assert
        // an exact path — we'd be encoding the test machine's username —
        // but we do assert the shape: an absolute path ending in
        // `/.hermes` (default profile) or containing `/profiles/`
        // (named profile).
        let paths = ServerContext.local.paths
        #expect(paths.home.hasPrefix("/"))
        #expect(paths.home.hasSuffix("/.hermes") || paths.home.contains("/.hermes/profiles/"))
    }

    private func restore(_ saved: String?) {
        if let saved {
            setenv(Self.envKey, saved, 1)
        } else {
            unsetenv(Self.envKey)
        }
        HermesProfileResolver.invalidateCache()
    }
}
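
The sentinel-marker guard that `overrideSteersServerContextPaths` has to satisfy before calling `setenv` can be sketched roughly as follows. This is a minimal illustration, assuming the behavior described in the commit message (absolute paths only, marker file required). The function name `testHomeOverride` and the injected `environment` dictionary are hypothetical, and the real `HermesProfileResolver.scarfHermesHomeOverride()` may be structured differently.

```swift
import Foundation

// Marker filename as given in the commit message.
let markerFilename = ".scarf-test-home-marker"

/// Hypothetical guard: returns the override path only when it is absolute
/// AND the directory has opted in by containing the marker file.
func testHomeOverride(environment: [String: String],
                      fileManager: FileManager = .default) -> String? {
    guard let path = environment["SCARF_HERMES_HOME"], path.hasPrefix("/") else {
        return nil  // unset or relative: fall through to the real home
    }
    let marker = URL(fileURLWithPath: path).appendingPathComponent(markerFilename).path
    // No marker means the directory never opted in, so a leaked env var
    // cannot redirect writes onto it.
    return fileManager.fileExists(atPath: marker) ? path : nil
}
```

Passing the environment in as a dictionary is a choice made for this sketch so that the guard can be exercised without mutating process-wide state, which is the failure mode the incident notes describe.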