diff --git a/Core-Services.md b/Core-Services.md
index 625e62a..19ed416 100644
--- a/Core-Services.md
+++ b/Core-Services.md
@@ -76,6 +76,10 @@ In v2.5 most service code moved out of the Mac target into the shared **ScarfCor
 
 See [ScarfCore Package](ScarfCore-Package) for the package architecture and how to add a new shared service.
 
+## Performance instrumentation (ScarfMon, v2.7+)
+
+A separate harness lives at [`ScarfCore/Diagnostics/`](https://github.com/awizemann/scarf/tree/main/scarf/Packages/ScarfCore/Sources/ScarfCore/Diagnostics) — `ScarfMon.measure` / `measureAsync` / `event` wrap hot call sites in the chat path, transport, SQLite backend, and disk I/O. Three modes (`off`, `signpostOnly` (default), `full`) controlled from the in-app Diagnostics → Performance panel; the default is effectively free outside an Instruments session. See [Performance Monitoring](Performance-Monitoring) for the full reference, including the user capture recipe and the developer guide for adding new measure points.
+
 ## Patterns shared across the layer
 
 - **`ServerContext` parameterizes all I/O.** Services receive the context at init; routing local vs. SSH happens through `context.transport`. See [Transport Layer](Transport-Layer).
diff --git a/Home.md b/Home.md
index 2d32d31..375b2b1 100644
--- a/Home.md
+++ b/Home.md
@@ -18,6 +18,7 @@ A native macOS companion app for the [Hermes AI agent](https://github.com/hermes
 - **[Slash Commands](Slash-Commands)** — author project-scoped slash commands (v2.5+)
 - **[Design System](Design-System)** — ScarfColor / ScarfFont / components reference
 - [Architecture Overview](Architecture-Overview) — MVVM-F, services, transport, ScarfCore
+- [Performance Monitoring](Performance-Monitoring) — ScarfMon: opt-in perf instrumentation, how to capture a baseline
 - [Servers & Remote](Servers-and-Remote) — adding remote Hermes hosts over SSH
 - [Localization](Localization) — supported languages + how to contribute a new one
 - [Release Notes Index](Release-Notes-Index) — every version's notes
diff --git a/Performance-Monitoring.md b/Performance-Monitoring.md
new file mode 100644
index 0000000..8d22e48
--- /dev/null
+++ b/Performance-Monitoring.md
@@ -0,0 +1,221 @@
+# Performance Monitoring (ScarfMon)
+
+Scarf ships a built-in performance instrumentation harness called **ScarfMon**, with detailed capture opt-in. It records timing samples and event counts at known hot spots so users hitting "feels slow" can capture a baseline, share it with maintainers, and have a concrete signal to act on instead of a vague report.
+
+This page is for both groups:
+
+- **Users** who want to help diagnose perceived slowness.
+- **Developers** who want to add measure points to their own code or extend the harness.
+
+## TL;DR
+
+- **It's effectively free by default.** The default mode (`signpostOnly`) emits Apple `os_signpost` events, which the runtime elides outside an Instruments session.
+- **It's privacy-respecting.** Sample names are `StaticString` (compile-time literals), so user content can't leak through metric tags. Nothing leaves the device unless the user explicitly hits **Copy as JSON**.
+- **It's open source.** All the plumbing lives in [`ScarfCore/Diagnostics/`](https://github.com/awizemann/scarf/tree/main/scarf/Packages/ScarfCore/Sources/ScarfCore/Diagnostics).
+
+## How to turn it on
+
+### Mac
+
+`Settings → Advanced → Performance Diagnostics`. The picker has three modes:
+
+| Mode | What it records | Cost |
+|---|---|---|
+| **Off** | Nothing | One branch + return per call |
+| **Signpost only** (default) | `os_signpost` events to the `com.scarf.mon` subsystem | Zero outside Instruments |
+| **Full** | Signposts + a 4096-entry in-memory ring + `os.Logger` debug stream | One ring write per call |
+
+Switching to **Full** unlocks the in-app summary table (top 20 buckets by p95) and the **Copy as JSON** button.
+
+### iOS
+
+`Settings → Diagnostics → Performance`. Same three-mode picker, same panel layout, same `Copy as JSON` action. The mode is persisted across launches in `UserDefaults` under key `ScarfMonMode`.
+
+### From a Terminal (any mode)
+
+```bash
+log stream --predicate 'subsystem == "com.scarf.mon"' --info --debug
+```
+
+Streams every signpost / log line live. Useful for catching events that happen before you can flip the panel toggle.
+
+## How to read the data
+
+The in-app panel groups samples by `(category, name)` and shows for each:
+
+- **count** — total samples of that name in the buffer
+- **p50 / p95 / max** — for `interval` samples, percentile durations
+- **bytes** — running total when the call site reported a payload size
+
+For a full export, hit **Copy as JSON**. Each line is one sample with `category`, `name`, `kind` (`event` or `interval`), `timestampMs`, `durationNanos`, `count`, and optional `bytes`. Compact JSON, valid JSON array — pipe through `jq` or paste into a feedback thread.
+
+## What's measured today (v2.7+)
+
+| Category | Name | Where | What it tells you |
+|---|---|---|---|
+| `chatRender` | `mac.ChatView.body` | [Mac ChatView](https://github.com/awizemann/scarf/tree/main/scarf/scarf/Features/Chat/Views/ChatView.swift) | Full chat tab body re-eval count |
+| `chatRender` | `mac.RichChatMessageList.body` | [RichChatMessageList](https://github.com/awizemann/scarf/tree/main/scarf/scarf/Features/Chat/Views/RichChatMessageList.swift) | Whether the message-list `ForEach` is re-issuing |
+| `chatRender` | `mac.RichMessageBubble.body` | [RichMessageBubble](https://github.com/awizemann/scarf/tree/main/scarf/scarf/Features/Chat/Views/RichMessageBubble.swift) | Per-bubble re-evals — divide by `acpEvent` count to spot wasted re-renders |
+| `chatRender` | `ios.ChatView.body` / `ios.MessageBubble.body` | iOS `ChatView.swift` | Same signal on iOS |
+| `chatStream` | `mac.sendViaACP` | [Mac ChatViewModel](https://github.com/awizemann/scarf/tree/main/scarf/scarf/Features/Chat/ViewModels/ChatViewModel.swift) | User tap → first prompt write (carries prompt byte count) |
+| `chatStream` | `mac.sendPrompt` | Same | User tap → response complete (interval) |
+| `chatStream` | `mac.acpEvent` / `mac.handleACPEvent` | Same | Per-event arrival + handle cost |
+| `chatStream` | `firstByte` / `firstThoughtByte` | [RichChatViewModel](https://github.com/awizemann/scarf/tree/main/scarf/Packages/ScarfCore/Sources/ScarfCore/ViewModels/RichChatViewModel.swift) | Time-to-first-token. Splits Hermes "thinking" from streaming render |
+| `chatStream` | `finalizeStreamingMessage` | Same | End-of-turn finalize cost (target: < 1 ms) |
+| `chatStream` | `ios.send` / `ios.startResuming` / `ios.acpEvent` / `ios.handleACPEvent` | iOS `ChatView.swift` | Same shape on iOS |
+| `sessionLoad` | `mac.startACPSession` / `ios.startResuming` | Both targets | Session boot cost |
+| `sqlite` | `sqlite.query` / `sqlite.queryBatch` | [RemoteSQLiteBackend](https://github.com/awizemann/scarf/tree/main/scarf/Packages/ScarfCore/Sources/ScarfCore/Services/Backends/RemoteSQLiteBackend.swift) | Per-call latency over SSH (carries row count + stdout bytes) |
+| `transport` | `ssh.streamScript` (iOS) / `ssh.run` (Mac) | [CitadelServerTransport](https://github.com/awizemann/scarf/tree/main/scarf/Packages/ScarfIOS/Sources/ScarfIOS/CitadelServerTransport.swift), [SSHScriptRunner](https://github.com/awizemann/scarf/tree/main/scarf/Packages/ScarfCore/Sources/ScarfCore/Transport/SSHScriptRunner.swift) | SSH round-trip time |
+| `diskIO` | `loadConfig` / `loadCronJobs` | [HermesFileService](https://github.com/awizemann/scarf/tree/main/scarf/scarf/Core/Services/HermesFileService.swift) | Hot disk reads. `loadConfig` also logs caller stack frames in Full mode |
+
+Adding a new measure point is two lines (see Developer Guide below).
+
+## Capture recipe for a useful baseline
+
+1. Build + run the latest version.
+2. Flip the panel to **Full**.
+3. Optionally, in another terminal: `log stream --predicate 'subsystem == "com.scarf.mon"'`.
+4. Hit **Reset** in the panel.
+5. Run the specific scenario you want to measure (one chat turn, one session boot, one specific click).
+6. Hit **Copy as JSON** *before* doing anything else — the ring is FIFO with a 4096-entry capacity, so a long idle session will eventually overwrite earlier samples.
+7. Paste the JSON in a [GitHub issue](https://github.com/awizemann/scarf/issues) or feedback thread, with one sentence describing what you were doing.
+
+The maintainers will use the same JSON shape to grep for outliers.
+
+## Privacy
+
+ScarfMon is privacy-conscious by construction:
+
+- **No remote upload.** The ring buffer never leaves the device. The `Copy as JSON` button puts the dump on the system clipboard; the user decides whether to paste it anywhere.
+- **No content.** Sample names are `StaticString` (compile-time literals), so prompt text, response text, file paths, etc. cannot accidentally end up in a metric.
+- **No PII.** Optional `bytes` field tracks payload *size*, never payload *contents*.
+- **Symbol-only stack traces.** When `Full` mode logs caller stack frames (e.g. for the `loadConfig` mystery-caller hint), they are mangled Swift symbols + offsets. No memory addresses, no file paths.
+- **Subsystem isolation.** All output uses subsystem `com.scarf.mon`, so users can grep / filter / disable independently of Scarf's general logs.
+
+## Developer guide — adding a measure point
+
+The public API has three primitives:
+
+```swift
+// Synchronous interval — duration is recorded
+ScarfMon.measure(.diskIO, "loadX") {
+    // your work
+}
+
+// Async interval — same shape
+try await ScarfMon.measureAsync(.sqlite, "query") {
+    try await actualQuery()
+}
+
+// One-shot event — count + optional payload size
+ScarfMon.event(.chatStream, "firstByte", count: 1, bytes: chunk.utf8.count)
+```
+
+### Picking a category
+
+Categories are a fixed enum in [`ScarfMon.swift`](https://github.com/awizemann/scarf/tree/main/scarf/Packages/ScarfCore/Sources/ScarfCore/Diagnostics/ScarfMon.swift):
+
+| Category | When to use |
+|---|---|
+| `chatRender` | View-body re-evals, scroll, layout work |
+| `chatStream` | ACP events, prompt sends, finalize |
+| `sessionLoad` | Session boot / resume / load |
+| `transport` | SSH round-trips, network |
+| `sqlite` | DB queries, snapshot pipeline |
+| `diskIO` | File reads / writes |
+| `render` | Other rendering (dashboards, sidebars) |
+| `other` | Catch-all — promote to a real category if it grows |
+
+Adding a new category is a one-line case in `ScarfMon.Category` plus a row in this page.
+
+### Picking a name
+
+- Names must be `StaticString` (compile-time literal) — the type system enforces this.
+- Conventionally prefixed with the platform (`mac.`, `ios.`) when the same logical operation has both shapes; bare names are fine for cross-platform code.
+- Use `dot.notation` to group related events (`sqlite.query`, `sqlite.queryBatch`, `sqlite.query.rows`).
+- Stable: rename = breaking change for any saved JSON dumps users have shared.
+
+### Body counters in SwiftUI views
+
+Inside `var body: some View`, the idiom is:
+
+```swift
+var body: some View {
+    let _: Void = ScarfMon.event(.chatRender, "mac.ChatView.body")
+    return VStack { … }
+}
+```
+
+The `let _: Void = …` works inside `@ViewBuilder` (it's a local declaration, not a view-producing expression) and fires every time SwiftUI re-evaluates the body. Use sparingly — these are sentinel events for diagnosing render storms, not for routine operation.
+
+### Verifying the cost
+
+Default `signpostOnly` mode is effectively free. To prove it locally:
+
+```swift
+let n = 1_000_000
+let t = ContinuousClock().measure {
+    for _ in 0..