From 110170d6e9e6232892bfd7bce1690f8dbc643ae8 Mon Sep 17 00:00:00 2001
From: Alan Wizemann <alan@wizemann.com>
Date: Mon, 20 Apr 2026 13:40:35 -0700
Subject: [PATCH 1/8] =?UTF-8?q?fix:=20v2.0.1=20=E2=80=94=20surface=20remot?=
 =?UTF-8?q?e=20SSH=20file-access=20errors=20(closes=20#19)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Three users reported on day-one of v2.0 that SSH connections showed a
green "Connected" pill but every data view read as empty / "not running"
/ "not configured". The common thread across Docker, homelab VM, and
Ubuntu VPS setups: file-access failures on the remote that Scarf
silently swallowed into nil/empty defaults.

Stop swallowing errors
- HermesFileService gains Result-returning variants for the four
  dashboard-critical readers: loadConfigResult, loadGatewayStateResult,
  hermesPIDResult, plus readFileResult / readFileDataResult as
  primitives. Each logs os.Logger warnings on failure. Legacy nil-
  returning signatures remain as thin forwarders.
- HermesDataService.open records lastOpenError with humanized hints
  for the top three failure modes — sqlite3 not installed, permission
  denied, file not found. Each maps to concrete remediation (`apt
  install sqlite3`, "check file perms", "set Hermes data directory").

Dashboard surfaces the error
- DashboardViewModel collects errors from every loader into
  lastReadError, only on remote contexts (local skips the banner).
- DashboardView renders an orange banner above the stats with the
  specific error text, a copy-selectable detail, and a "Run
  Diagnostics…" button.

New Remote Diagnostics sheet (stethoscope icon)
- RemoteDiagnosticsViewModel runs 14 checks in one SSH round-trip via
  a pipe-delimited "KEY|STATUS|DETAIL" protocol. Covers: SSH
  connectivity, remote user/$HOME, Hermes dir existence + readability,
  config.yaml readability + actual read (distinct from just `test -e`
  which can't detect permission issues), state.db readability, sqlite3
  binary presence, sqlite3 open test, hermes binary on non-login AND
  login PATH, pgrep availability.
- Each probe row shows a targeted hint on fail (e.g. "check perms on
  ~/.hermes", "apt install sqlite3", "move PATH export from .bashrc
  to .zshenv"). A Copy Full Report button dumps plain-text output
  for GitHub issues.
- Accessible from Manage Servers (stethoscope button per row) and
  directly from the yellow pill.

Yellow "degraded" connection state
- ConnectionStatusViewModel.Status gains .degraded(reason:) between
  .connected and .error. After tier-1 `true` passes, the probe runs
  tier-2 `test -r $HOME/.hermes/config.yaml` in the same SSH round-
  trip. On tier-2 fail, pill is orange with "Connected — can't read
  Hermes state" tooltip.
- Clicking a degraded pill opens Remote Diagnostics directly. Exactly
  the symptom in #19 is now one click from a specific answer.

Auto-suggest remoteHome for non-default installs
- TestConnectionProbe.TestResult.success gains suggestedRemoteHome:
  String?. When state.db isn't found at the configured path, the
  probe also checks /var/lib/hermes/.hermes, /opt/hermes/.hermes,
  /home/hermes/.hermes, /root/.hermes — the common alternates for
  systemd services, Docker containers, and single-user VPSes — and
  surfaces the first hit as a "Use this" suggestion in Add Server.
- AddServerSheet relabels "Remote ~/.hermes override" to "Hermes data
  directory" with an explanation of when you'd use it.

README
- New "Remote setup requirements" subsection lists the four concrete
  prereqs (SSH, sqlite3, pgrep, read access to ~/.hermes).
- New "Troubleshooting remote connections" paragraph describes the
  diagnostics sheet and remoteHome auto-suggest for the two most
  common failure modes.

Releases
- releases/v2.0.1/RELEASE_NOTES.md for the GitHub release body.
- Ship via `./scripts/release.sh 2.0.1`.

Closes #19.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 README.md                                     |  15 +
 releases/v2.0.1/RELEASE_NOTES.md              |  46 +++
 .../Core/Services/HermesDataService.swift     |  60 +++-
 .../Core/Services/HermesFileService.swift     | 141 ++++++--
 .../ViewModels/DashboardViewModel.swift       |  55 ++-
 .../Dashboard/Views/DashboardView.swift       |  42 +++
 .../ViewModels/AddServerViewModel.swift       |   9 +-
 .../ConnectionStatusViewModel.swift           |  67 +++-
 .../RemoteDiagnosticsViewModel.swift          | 330 ++++++++++++++++++
 .../ViewModels/TestConnectionProbe.swift      |  40 ++-
 .../Servers/Views/AddServerSheet.swift        |  47 ++-
 .../Servers/Views/ConnectionStatusPill.swift  |  12 +
 .../Servers/Views/ManageServersView.swift     |  24 +-
 .../Servers/Views/RemoteDiagnostics.swift     | 166 +++++++++
 14 files changed, 997 insertions(+), 57 deletions(-)
 create mode 100644 releases/v2.0.1/RELEASE_NOTES.md
 create mode 100644 scarf/scarf/Features/Servers/ViewModels/RemoteDiagnosticsViewModel.swift
 create mode 100644 scarf/scarf/Features/Servers/Views/RemoteDiagnostics.swift

diff --git a/README.md b/README.md
index e27e830..f4fc394 100644
--- a/README.md
+++ b/README.md
@@ -42,6 +42,21 @@ Scarf 2.0 is a multi-window app. Each window is bound to exactly one Hermes serv
 
 Remote Hermes is reached over system SSH — the same `~/.ssh/config`, ssh-agent, ProxyJump, and ControlMaster pooling your terminal uses. File I/O flows through `scp`/`sftp`; SQLite is served from atomic `sqlite3 .backup` snapshots cached under `~/Library/Caches/scarf/snapshots/<server-id>/`; chat (ACP) tunnels as `ssh -T host -- hermes acp` with JSON-RPC over stdio end-to-end. Everything in the feature list below works against remote identically to local.
 
+### Remote setup requirements
+
+The remote host must have:
+
+1. **SSH access** — key-based auth via your local ssh-agent. Scarf never prompts for passphrases; run `ssh-add` once in Terminal before connecting.
+2. **`sqlite3`** on the remote `$PATH` — needed for the atomic DB snapshots. Install on the remote with `apt install sqlite3` (Ubuntu/Debian), `yum install sqlite` (RHEL/Fedora), or `apk add sqlite` (Alpine).
+3. **`pgrep`** on the remote `$PATH` — used by the Dashboard "is Hermes running" check. Standard on every distro; install `procps` if missing.
+4. **`~/.hermes/` readable by the SSH user**. When Hermes runs as a separate user (systemd service, Docker container), the SSH user needs read access to `config.yaml` and `state.db`. Either (a) SSH as the Hermes user, (b) `chmod` Hermes's home to be group-readable and add your SSH user to that group, or (c) set the **Hermes data directory** field when adding the server to point at the right location (e.g. `/var/lib/hermes/.hermes`).
+
+### Troubleshooting remote connections
+
+If the connection pill is green but the Dashboard shows "Stopped", "unknown", or empty values, the SSH user can't read the Hermes state files. Open **Manage Servers → 🩺 Run Diagnostics** (or click the yellow "Can't read Hermes state" pill in the toolbar). The diagnostics sheet runs fourteen checks in one SSH session — connectivity, `sqlite3` presence, read access to `config.yaml` and `state.db`, the effective non-login `$PATH` — and tells you exactly which one fails and why, with remediation hints for each. Use the **Copy Full Report** button to paste the full output into a bug report.
+
+For the common "Hermes isn't at the default path" case (systemd services, Docker), **Test Connection** in the Add Server sheet now probes `/var/lib/hermes/.hermes`, `/opt/hermes/.hermes`, `/home/hermes/.hermes`, and `/root/.hermes` when it can't find `state.db` at `~/.hermes/`, and offers a one-click fill if it finds any of them.
+
 ## Features
 
 Scarf mirrors Hermes's surface area through a sidebar-based UI. Sections below map 1:1 to the app's sidebar.
diff --git a/releases/v2.0.1/RELEASE_NOTES.md b/releases/v2.0.1/RELEASE_NOTES.md
new file mode 100644
index 0000000..6de355d
--- /dev/null
+++ b/releases/v2.0.1/RELEASE_NOTES.md
@@ -0,0 +1,46 @@
+## What's New in 2.0.1
+
+Hotfix for [#19](https://github.com/awizemann/scarf/issues/19) and the related reports from the first day of v2.0: users' remote SSH connections would show a green "Connected" pill but every view (Dashboard, Sessions, Activity, Chat) read as empty / "not running" / "not configured". Three distinct environments reported it — Docker Hermes on a LAN, homelab VM over Tailscale, Ubuntu VPS — and every one was a silent file-access failure on the remote that Scarf wasn't surfacing.
+
+### What changed
+
+**Errors no longer disappear.** Every remote read (`config.yaml`, `gateway_state.json`, `state.db`, `pgrep`) used to silently substitute an empty value on *any* failure — permission denied, missing file, `sqlite3` not installed, connection drop — they all looked identical to the UI. Now:
+
+- Each failure logs a specific warning via `os.Logger` (visible in Console.app under subsystem `com.scarf`).
+- The Dashboard shows an orange banner above the stats with the exact error (e.g. "Permission denied reading `~/.hermes/state.db`") and a **Run Diagnostics…** button.
+- `HermesDataService` exposes a `lastOpenError` so views can explain *why* state.db couldn't be opened, rather than just rendering zeros.
+
+**New Remote Diagnostics sheet.** Accessible from **Manage Servers → 🩺** or directly by clicking the toolbar pill when it's yellow. Runs fourteen checks in a single SSH session and shows pass/fail for each, plus a targeted hint per failure:
+
+- SSH connectivity and auth
+- Remote user identity and `$HOME` resolution
+- `~/.hermes` directory existence and readability
+- `config.yaml` readable (existence *and* actual read access — the old probe only checked existence)
+- `state.db` readable
+- `sqlite3` installed on the remote (required for the atomic snapshot Scarf pulls)
+- `sqlite3` can actually open `state.db`
+- `hermes` binary on the non-login `$PATH` (what runtime uses)
+- `hermes` binary on the login `$PATH` (what the Test Connection probe uses)
+- `pgrep` available (for the "is Hermes running" check)
+
+One **Copy Full Report** button dumps every check as plain text for bug reports.
+
+**Connection pill gains a yellow "degraded" state.** The pill used to be green as long as SSH connected; now after connectivity passes it runs a second-tier check (`test -r ~/.hermes/config.yaml`). If that fails, the pill turns **yellow** with "Connected — can't read Hermes state" and clicking it opens Remote Diagnostics directly. This is the exact symptom mode in issue #19, and it's now one click away from a specific answer.
+
+**Auto-suggest the correct `remoteHome` during Add Server.** When Test Connection can't find `state.db` at the configured (or default) path, it now also probes the common alternate locations — `/var/lib/hermes/.hermes`, `/opt/hermes/.hermes`, `/home/hermes/.hermes`, `/root/.hermes` — and offers a one-click "Use this" fill if it finds one. Removes the need to know that systemd-installed Hermes lives at `/var/lib/hermes/.hermes` by convention.
+
+**Clearer copy for the `remoteHome` field.** The Add Server sheet field is now labeled "Hermes data directory" with a description explaining when you'd override it (systemd service installs, Docker sidecars) and noting that Test Connection auto-suggests.
+
+**README has a new "Remote setup requirements" section** with the four concrete prerequisites (SSH, `sqlite3`, `pgrep`, read access to `~/.hermes`) and a troubleshooting paragraph pointing at Remote Diagnostics.
+
+### Migrating from 2.0.0
+
+Sparkle will offer the update automatically. Settings and server list are preserved verbatim — this is purely additive (new diagnostics surface, new error banners, auto-suggest in Test Connection). If you were affected by #19, run Remote Diagnostics after updating; the sheet should pinpoint the specific file access issue and suggest a fix.
+
+### Under the hood
+
+- New types: `RemoteDiagnosticsViewModel`, `RemoteDiagnosticsView`. Both are local to Scarf; no new transport protocol.
+- `HermesFileService` gains `loadConfigResult()`, `loadGatewayStateResult()`, `hermesPIDResult()`, `readFileResult()`, `readFileDataResult()` — Result-returning variants that preserve the error. Legacy `loadConfig()` etc. still exist as thin forwarders for callers that don't need diagnostics.
+- `HermesDataService.open()` records `lastOpenError` with humanized hints for "sqlite3 not installed", "permission denied", and "file not found" — the three failure modes that produce 90% of issue #19 symptoms.
+- `ConnectionStatusViewModel` status enum gains `.degraded(reason:)` between `.connected` and `.error`.
+- `TestConnectionProbe` result enum gains `suggestedRemoteHome: String?` carrying any alternate-location hit.
diff --git a/scarf/scarf/Core/Services/HermesDataService.swift b/scarf/scarf/Core/Services/HermesDataService.swift
index 6011c0c..c75fbaf 100644
--- a/scarf/scarf/Core/Services/HermesDataService.swift
+++ b/scarf/scarf/Core/Services/HermesDataService.swift
@@ -1,5 +1,6 @@
 import Foundation
 import SQLite3
+import os
 
 /// Dedupes concurrent `snapshotSQLite` calls for the same server. When the
 /// file watcher ticks, Dashboard + Sessions + Activity (+ Chat's loadHistory)
@@ -29,11 +30,18 @@ actor SnapshotCoordinator {
 }
 
 actor HermesDataService {
+    private static let logger = Logger(subsystem: "com.scarf", category: "HermesDataService")
+
     private var db: OpaquePointer?
     private var hasV07Schema = false
     /// Local filesystem path we last opened. For remote contexts this is
     /// the cached snapshot under `~/Library/Caches/scarf/snapshots/<id>/`.
     private var openedAtPath: String?
+    /// Last error from `open()` / `refresh()`, user-presentable. `nil` means
+    /// the last attempt succeeded. Views surface this when their own load
+    /// path fails, so the user sees "Permission denied reading state.db"
+    /// instead of an empty Dashboard with no explanation.
+    private(set) var lastOpenError: String?
 
     let context: ServerContext
     private let transport: any ServerTransport
@@ -52,16 +60,25 @@ actor HermesDataService {
             // corrupt. Routed through SnapshotCoordinator so concurrent
             // view models don't each spawn a parallel SSH backup for the
             // same server.
-            let url = try? await SnapshotCoordinator.shared.snapshot(
-                remotePath: context.paths.stateDB,
-                contextID: context.id,
-                transport: transport
-            )
-            guard let url else { return false }
-            localPath = url.path
+            do {
+                let url = try await SnapshotCoordinator.shared.snapshot(
+                    remotePath: context.paths.stateDB,
+                    contextID: context.id,
+                    transport: transport
+                )
+                localPath = url.path
+                lastOpenError = nil
+            } catch {
+                lastOpenError = humanize(error)
+                Self.logger.warning("snapshotSQLite failed: \(error.localizedDescription, privacy: .public)")
+                return false
+            }
         } else {
             localPath = context.paths.stateDB
-            guard FileManager.default.fileExists(atPath: localPath) else { return false }
+            guard FileManager.default.fileExists(atPath: localPath) else {
+                lastOpenError = "Hermes state database not found at \(localPath)."
+                return false
+            }
         }
         // Remote snapshots are point-in-time copies that no one writes to;
         // opening them with `immutable=1` tells SQLite to skip WAL/SHM and
@@ -81,14 +98,41 @@ actor HermesDataService {
         }
         let result = sqlite3_open_v2(openPath, &db, flags, nil)
         guard result == SQLITE_OK else {
+            let msg: String
+            if let db {
+                msg = String(cString: sqlite3_errmsg(db))
+            } else {
+                msg = "sqlite3_open_v2 returned \(result)"
+            }
+            lastOpenError = "Couldn't open state.db: \(msg)"
+            Self.logger.warning("sqlite3_open_v2 failed (\(result)) at \(localPath, privacy: .public): \(msg, privacy: .public)")
             db = nil
             return false
         }
         openedAtPath = localPath
+        lastOpenError = nil
         detectSchema()
         return true
     }
 
+    /// Turn a transport error into the one-line string Dashboard shows. Adds
+    /// hints for the common "sqlite3 not installed" and "permission denied"
+    /// cases so users know what to do.
+    private nonisolated func humanize(_ error: Error) -> String {
+        let desc = (error as? LocalizedError)?.errorDescription ?? error.localizedDescription
+        let lower = desc.lowercased()
+        if lower.contains("sqlite3: command not found") || lower.contains("sqlite3: not found") {
+            return "sqlite3 is not installed on \(context.displayName). Install it with `apt install sqlite3` (Ubuntu/Debian) or `yum install sqlite` (RHEL/Fedora)."
+        }
+        if lower.contains("permission denied") {
+            return "Permission denied reading Hermes state on \(context.displayName). The SSH user may not have read access to ~/.hermes/state.db — try Run Diagnostics."
+        }
+        if lower.contains("no such file") {
+            return "Hermes state not found at ~/.hermes on \(context.displayName). If Hermes is installed elsewhere, set its data directory in Manage Servers."
+        }
+        return desc
+    }
+
     /// Force a fresh snapshot pull + reopen. Used on session-load and in
     /// any path that needs the UI to reflect writes Hermes just made.
     /// Without this, remote snapshots would be frozen at the first `open()`
diff --git a/scarf/scarf/Core/Services/HermesFileService.swift b/scarf/scarf/Core/Services/HermesFileService.swift
index 1c756d0..2e3df54 100644
--- a/scarf/scarf/Core/Services/HermesFileService.swift
+++ b/scarf/scarf/Core/Services/HermesFileService.swift
@@ -1,7 +1,10 @@
 import Foundation
+import os
 
 struct HermesFileService: Sendable {
 
+    nonisolated static let logger = Logger(subsystem: "com.scarf", category: "HermesFileService")
+
     let context: ServerContext
     let transport: any ServerTransport
 
@@ -17,6 +20,14 @@ struct HermesFileService: Sendable {
         return parseConfig(content)
     }
 
+    /// Error-surfacing config load. Used by Dashboard to show the user a
+    /// specific reason when config.yaml can't be read on a remote host
+    /// (permission denied, missing file, sqlite3 not installed, etc.)
+    /// instead of silently falling back to `.empty`.
+    nonisolated func loadConfigResult() -> Result<HermesConfig, Error> {
+        readFileResult(context.paths.configYAML).map { parseConfig($0) }
+    }
+
     nonisolated private func parseConfig(_ yaml: String) -> HermesConfig {
         let parsed = Self.parseNestedYAML(yaml)
         let values = parsed.values
@@ -385,11 +396,34 @@ struct HermesFileService: Sendable {
         do {
             return try JSONDecoder().decode(GatewayState.self, from: data)
         } catch {
-            print("[Scarf] Failed to decode gateway state: \(error.localizedDescription)")
+            Self.logger.warning("Failed to decode gateway state: \(error.localizedDescription, privacy: .public)")
             return nil
         }
     }
 
+    /// Error-surfacing gateway-state load. `.success(nil)` means the file
+    /// doesn't exist yet (gateway hasn't written state — normal when Hermes
+    /// is stopped). `.failure` means the file exists but couldn't be read
+    /// (permission denied, connection down, JSON corruption).
+    nonisolated func loadGatewayStateResult() -> Result<GatewayState?, Error> {
+        // Distinguish "file doesn't exist yet" (normal, returns .success(nil))
+        // from "file exists but we can't read or parse it" (error).
+        if !transport.fileExists(context.paths.gatewayStateJSON) {
+            return .success(nil)
+        }
+        switch readFileDataResult(context.paths.gatewayStateJSON) {
+        case .success(let data):
+            do {
+                return .success(try JSONDecoder().decode(GatewayState.self, from: data))
+            } catch {
+                Self.logger.warning("Failed to decode gateway state: \(error.localizedDescription, privacy: .public)")
+                return .failure(error)
+            }
+        case .failure(let err):
+            return .failure(err)
+        }
+    }
+
     // MARK: - Memory
 
     nonisolated func loadMemoryProfiles() -> [String] {
@@ -1173,22 +1207,45 @@ struct HermesFileService: Sendable {
     }
 
     nonisolated func hermesPID() -> pid_t? {
-        // Run `pgrep -f hermes` either locally or via the transport. On
-        // remote hosts we trust `pgrep` to be present — it's standard on
-        // Linux and macOS. On failure we conservatively return nil rather
-        // than pretending Hermes is down: the caller will see
-        // isHermesRunning==false, which is already the "unknown" UX.
-        let result = try? transport.runProcess(
-            executable: "/usr/bin/pgrep",
-            args: ["-f", "hermes"],
-            stdin: nil,
-            timeout: 5
-        )
-        guard let result, let firstLine = result.stdoutString
-            .components(separatedBy: "\n")
-            .first(where: { !$0.isEmpty }),
-              let pid = pid_t(firstLine.trimmingCharacters(in: .whitespaces)) else { return nil }
-        return pid
+        switch hermesPIDResult() {
+        case .success(let pid): return pid
+        case .failure: return nil
+        }
+    }
+
+    /// Error-surfacing variant. `.success(nil)` means `pgrep` ran successfully
+    /// and found no hermes process (Hermes is genuinely not running).
+    /// `.failure` means we couldn't probe at all (pgrep missing, connection
+    /// down, permission issue) — a *different* UX from "not running".
+    nonisolated func hermesPIDResult() -> Result<pid_t?, Error> {
+        do {
+            let result = try transport.runProcess(
+                executable: "/usr/bin/pgrep",
+                args: ["-f", "hermes"],
+                stdin: nil,
+                timeout: 5
+            )
+            // pgrep exits 1 when nothing matches — that's "not running", NOT an
+            // error. Anything else (127=command not found, 255=ssh failure) is.
+            if result.exitCode == 0 {
+                if let firstLine = result.stdoutString
+                    .components(separatedBy: "\n")
+                    .first(where: { !$0.isEmpty }),
+                   let pid = pid_t(firstLine.trimmingCharacters(in: .whitespaces)) {
+                    return .success(pid)
+                }
+                return .success(nil)
+            } else if result.exitCode == 1 {
+                return .success(nil)   // genuinely not running
+            } else {
+                let err = TransportError.commandFailed(exitCode: result.exitCode, stderr: result.stderrString)
+                Self.logger.warning("pgrep failed (exit \(result.exitCode)): \(result.stderrString, privacy: .public)")
+                return .failure(err)
+            }
+        } catch {
+            Self.logger.warning("pgrep transport error: \(error.localizedDescription, privacy: .public)")
+            return .failure(error)
+        }
     }
 
     @discardableResult
@@ -1488,15 +1545,55 @@ struct HermesFileService: Sendable {
     // MARK: - File I/O
 
     /// Read a UTF-8 text file through the transport. Missing files and any
-    /// transport error surface as `nil` — callers treat missing/unreadable
-    /// the same way they always have.
+    /// transport error surface as `nil` — callers that don't need the
+    /// specific error reason keep using this. New call sites that want to
+    /// show a user-actionable message should use `readFileResult`.
     nonisolated private func readFile(_ path: String) -> String? {
-        guard let data = try? transport.readFile(path) else { return nil }
-        return String(data: data, encoding: .utf8)
+        switch readFileResult(path) {
+        case .success(let s):
+            return s
+        case .failure:
+            return nil
+        }
     }
 
     nonisolated private func readFileData(_ path: String) -> Data? {
-        try? transport.readFile(path)
+        switch readFileDataResult(path) {
+        case .success(let d):
+            return d
+        case .failure:
+            return nil
+        }
+    }
+
+    /// Error-surfacing read. Returns the decoded text on success, or the
+    /// underlying `TransportError` (or raw error for local failures) on
+    /// failure. Every failure is also logged via `os.Logger` — the warning
+    /// trail in Console.app is how we diagnose "connection green, data
+    /// empty" bug reports without needing to wire the error through every
+    /// existing call site.
+    nonisolated func readFileResult(_ path: String) -> Result<String, Error> {
+        switch readFileDataResult(path) {
+        case .success(let data):
+            guard let s = String(data: data, encoding: .utf8) else {
+                let err = TransportError.fileIO(path: path, underlying: "file is not valid UTF-8")
+                Self.logger.warning("readFile(\(path, privacy: .public)): not UTF-8")
+                return .failure(err)
+            }
+            return .success(s)
+        case .failure(let err):
+            return .failure(err)
+        }
+    }
+
+    nonisolated func readFileDataResult(_ path: String) -> Result<Data, Error> {
+        do {
+            let data = try transport.readFile(path)
+            return .success(data)
+        } catch {
+            Self.logger.warning("readFile(\(path, privacy: .public)) failed: \(error.localizedDescription, privacy: .public)")
+            return .failure(error)
+        }
     }
 
     /// Write a UTF-8 text file atomically through the transport. Matches the
diff --git a/scarf/scarf/Features/Dashboard/ViewModels/DashboardViewModel.swift b/scarf/scarf/Features/Dashboard/ViewModels/DashboardViewModel.swift
index d2b970e..a1956da 100644
--- a/scarf/scarf/Features/Dashboard/ViewModels/DashboardViewModel.swift
+++ b/scarf/scarf/Features/Dashboard/ViewModels/DashboardViewModel.swift
@@ -21,28 +21,73 @@ final class DashboardViewModel {
     var hermesRunning = false
     var isLoading = true
 
+    /// User-presentable error banner. Set when any of the remote reads
+    /// (state.db snapshot, config.yaml, gateway_state.json, pgrep) failed
+    /// in a way that's not just "file doesn't exist yet". Dashboard renders
+    /// this above the stats with a "Run Diagnostics…" button. `nil` = no
+    /// surfaceable error.
+    var lastReadError: String?
+
     func load() async {
         isLoading = true
         // refresh() = close + reopen, forces a fresh remote snapshot. Cheap
         // on local (live DB reopen).
         let opened = await dataService.refresh()
+        var collectedErrors: [String] = []
         if opened {
             stats = await dataService.fetchStats()
             recentSessions = await dataService.fetchSessions(limit: 5)
             sessionPreviews = await dataService.fetchSessionPreviews(limit: 5)
             await dataService.close()
+        } else if let msg = await dataService.lastOpenError {
+            collectedErrors.append(msg)
         }
         // The fileService methods are synchronous and route through the
         // transport. For remote contexts each call is a blocking ssh
         // round-trip — do them off the main thread to avoid spinning the
         // beach ball during the load.
         let svc = fileService
-        let (cfg, gw, running) = await Task.detached {
-            (svc.loadConfig(), svc.loadGatewayState(), svc.isHermesRunning())
+        struct LoadResults: Sendable {
+            let cfg: Result<HermesConfig, Error>
+            let gw: Result<GatewayState?, Error>
+            let running: Result<pid_t?, Error>
+        }
+        let results = await Task.detached { () -> LoadResults in
+            LoadResults(
+                cfg: svc.loadConfigResult(),
+                gw: svc.loadGatewayStateResult(),
+                running: svc.hermesPIDResult()
+            )
         }.value
-        config = cfg
-        gatewayState = gw
-        hermesRunning = running
+
+        switch results.cfg {
+        case .success(let c): config = c
+        case .failure(let e):
+            config = .empty
+            collectedErrors.append("config.yaml — \(e.localizedDescription)")
+        }
+        switch results.gw {
+        case .success(let g): gatewayState = g
+        case .failure(let e):
+            gatewayState = nil
+            collectedErrors.append("gateway_state.json — \(e.localizedDescription)")
+        }
+        switch results.running {
+        case .success(let pid): hermesRunning = (pid != nil)
+        case .failure(let e):
+            hermesRunning = false
+            collectedErrors.append("pgrep — \(e.localizedDescription)")
+        }
+
+        // Only surface when there's a real error AND we're on a remote
+        // context. Local contexts rarely hit these paths (live DB, local
+        // filesystem), and a transient "file doesn't exist yet" on fresh
+        // installs shouldn't scare users.
+        if context.isRemote, !collectedErrors.isEmpty {
+            lastReadError = collectedErrors.joined(separator: "\n")
+        } else {
+            lastReadError = nil
+        }
         isLoading = false
     }
 }
diff --git a/scarf/scarf/Features/Dashboard/Views/DashboardView.swift b/scarf/scarf/Features/Dashboard/Views/DashboardView.swift
index 2f041ba..dd2da22 100644
--- a/scarf/scarf/Features/Dashboard/Views/DashboardView.swift
+++ b/scarf/scarf/Features/Dashboard/Views/DashboardView.swift
@@ -2,6 +2,7 @@ import SwiftUI
 
 struct DashboardView: View {
     @State private var viewModel: DashboardViewModel
+    @State private var showDiagnostics = false
     @Environment(AppCoordinator.self) private var coordinator
     @Environment(HermesFileWatcher.self) private var fileWatcher
 
@@ -13,6 +14,9 @@ struct DashboardView: View {
     var body: some View {
         ScrollView {
             VStack(alignment: .leading, spacing: 20) {
+                if let err = viewModel.lastReadError {
+                    readErrorBanner(err)
+                }
                 statusSection
                 statsSection
                 recentSessionsSection
@@ -30,6 +34,44 @@ struct DashboardView: View {
         .onChange(of: fileWatcher.lastChangeDate) {
             Task { await viewModel.load() }
         }
+        .sheet(isPresented: $showDiagnostics) {
+            RemoteDiagnosticsView(context: viewModel.context)
+        }
+    }
+
+    /// Banner shown above the Dashboard when one or more remote reads
+    /// failed (permission denied, missing sqlite3, wrong home dir, etc.).
+    /// Replaces the old silent-failure mode where empty values just
+    /// appeared as "Stopped / unknown / 0" with no explanation.
+    private func readErrorBanner(_ err: String) -> some View {
+        VStack(alignment: .leading, spacing: 8) {
+            HStack(alignment: .top, spacing: 8) {
+                Image(systemName: "exclamationmark.triangle.fill")
+                    .foregroundStyle(.orange)
+                VStack(alignment: .leading, spacing: 4) {
+                    Text("Can't read Hermes state on \(viewModel.context.displayName)")
+                        .font(.headline)
+                    Text(err)
+                        .font(.caption.monospaced())
+                        .foregroundStyle(.secondary)
+                        .textSelection(.enabled)
+                        .fixedSize(horizontal: false, vertical: true)
+                }
+                Spacer()
+                Button {
+                    showDiagnostics = true
+                } label: {
+                    Label("Run Diagnostics…", systemImage: "stethoscope")
+                }
+                .controlSize(.regular)
+            }
+        }
+        .padding(12)
+        .background(Color.orange.opacity(0.1), in: RoundedRectangle(cornerRadius: 8))
+        .overlay(
+            RoundedRectangle(cornerRadius: 8)
+                .strokeBorder(Color.orange.opacity(0.3), lineWidth: 1)
+        )
     }
 
     private var statusSection: some View {
diff --git a/scarf/scarf/Features/Servers/ViewModels/AddServerViewModel.swift b/scarf/scarf/Features/Servers/ViewModels/AddServerViewModel.swift
index 8be7769..526def4 100644
--- a/scarf/scarf/Features/Servers/ViewModels/AddServerViewModel.swift
+++ b/scarf/scarf/Features/Servers/ViewModels/AddServerViewModel.swift
@@ -22,7 +22,12 @@ final class AddServerViewModel {
     var testResult: TestResult?
 
     enum TestResult: Equatable {
-        case success(hermesPath: String, dbFound: Bool)
+        /// `suggestedRemoteHome` is non-nil when the probe didn't find
+        /// state.db at the configured (or default) path but did find a
+        /// `state.db` at one of the well-known alternates (e.g. a systemd
+        /// install in `/var/lib/hermes/.hermes`). UI offers a one-click
+        /// fill so the user doesn't have to know the convention.
+        case success(hermesPath: String, dbFound: Bool, suggestedRemoteHome: String?)
         /// `command` is the full ssh invocation we attempted (so the user can
         /// paste it into Terminal to see what their shell does with it).
         /// `stderr` is whatever ssh / the remote shell wrote to stderr.
@@ -95,7 +100,7 @@ final class AddServerViewModel {
     /// `hermesBinaryHint` so subsequent calls don't need to re-resolve it.
     func configForSave() -> SSHConfig {
         var cfg = draftConfig
-        if case .success(let path, _) = testResult {
+        if case .success(let path, _, _) = testResult {
             cfg.hermesBinaryHint = path
         }
         return cfg
diff --git a/scarf/scarf/Features/Servers/ViewModels/ConnectionStatusViewModel.swift b/scarf/scarf/Features/Servers/ViewModels/ConnectionStatusViewModel.swift
index 35ce655..5410a2c 100644
--- a/scarf/scarf/Features/Servers/ViewModels/ConnectionStatusViewModel.swift
+++ b/scarf/scarf/Features/Servers/ViewModels/ConnectionStatusViewModel.swift
@@ -11,8 +11,12 @@ final class ConnectionStatusViewModel {
     private let logger = Logger(subsystem: "com.scarf", category: "ConnectionStatus")
 
     enum Status: Equatable {
-        /// Healthy: most recent probe succeeded.
+        /// Healthy: SSH connected AND we can read `~/.hermes/config.yaml`.
         case connected
+        /// SSH connects but the follow-up read-access probe failed. Data
+        /// views will be empty until this is resolved. `reason` is shown
+        /// in the pill tooltip; users click the pill to open diagnostics.
+        case degraded(reason: String)
         /// No probe yet or the previous probe timed out but we haven't
         /// confirmed failure. Shown as yellow to tell the user "checking…".
         case idle
@@ -72,20 +76,59 @@ final class ConnectionStatusViewModel {
 
     private func probeOnce() async {
         let snapshot = transport
-        let result: Result<Void, TransportError>
-        // Transport IO on a detached task so we don't block MainActor.
-        result = await Task.detached {
+        let hermesHome = context.paths.home
+        // Two-tier probe in one SSH round-trip:
+        //   tier 1: `true` — raw connectivity / auth / ControlMaster path
+        //   tier 2: `test -r $HERMESHOME/config.yaml` — can we actually
+        //           read the file Dashboard reads on every tick? Green pill
+        //           only if both pass; yellow "degraded" if tier 1 passes
+        //           but tier 2 fails (the exact symptom in issue #19).
+        // Script emits two lines: TIER1:<exitcode> and TIER2:<exitcode>.
+        let homeArg: String
+        if hermesHome.hasPrefix("~/") {
+            homeArg = "\"$HOME/\(hermesHome.dropFirst(2))\""
+        } else if hermesHome == "~" {
+            homeArg = "\"$HOME\""
+        } else {
+            homeArg = "\"\(hermesHome.replacingOccurrences(of: "\"", with: "\\\""))\""
+        }
+        let script = """
+        echo TIER1:0
+        H=\(homeArg)
+        if [ -r "$H/config.yaml" ]; then echo TIER2:0; else echo TIER2:1; fi
+        """
+
+        enum ProbeOutcome {
+            case connected
+            case degraded(reason: String)
+            case failure(TransportError)
+        }
+
+        let outcome: ProbeOutcome = await Task.detached {
             do {
                 let probe = try snapshot.runProcess(
                     executable: "/bin/sh",
-                    args: ["-c", "true"],
+                    args: ["-c", script],
                     stdin: nil,
                     timeout: 10
                 )
-                if probe.exitCode == 0 {
-                    return .success(())
+                guard probe.exitCode == 0 else {
+                    return .failure(.commandFailed(exitCode: probe.exitCode, stderr: probe.stderrString))
                 }
-                return .failure(.commandFailed(exitCode: probe.exitCode, stderr: probe.stderrString))
+                let out = probe.stdoutString
+                let tier1 = out.contains("TIER1:0")
+                let tier2 = out.contains("TIER2:0")
+                if !tier1 {
+                    // The script itself didn't reach tier 1 — treat as connection failure.
+                    return .failure(.commandFailed(exitCode: 1, stderr: out))
+                }
+                if tier2 {
+                    return .connected
+                }
+                // Connected but can't read config.yaml — the core issue #19
+                // symptom. Give the pill a short reason; the full story goes
+                // into Remote Diagnostics.
+                return .degraded(reason: "can't read ~/.hermes/config.yaml")
             } catch let e as TransportError {
                 return .failure(e)
             } catch {
@@ -93,11 +136,15 @@ final class ConnectionStatusViewModel {
             }
         }.value
 
-        switch result {
-        case .success:
+        switch outcome {
+        case .connected:
             status = .connected
             lastSuccess = Date()
             consecutiveFailures = 0
+        case .degraded(let reason):
+            status = .degraded(reason: reason)
+            lastSuccess = Date()   // SSH itself is fine, reset failure count
+            consecutiveFailures = 0
         case .failure(let err):
             consecutiveFailures += 1
             // First failure → silent yellow "Reconnecting…" while we try
diff --git a/scarf/scarf/Features/Servers/ViewModels/RemoteDiagnosticsViewModel.swift b/scarf/scarf/Features/Servers/ViewModels/RemoteDiagnosticsViewModel.swift
new file mode 100644
index 0000000..434ff90
--- /dev/null
+++ b/scarf/scarf/Features/Servers/ViewModels/RemoteDiagnosticsViewModel.swift
@@ -0,0 +1,330 @@
+import Foundation
+import os
+
+/// Runs a fixed check-list against a remote server and reports per-probe
+/// pass/fail. Exists because `TestConnectionProbe` only verifies ssh
+/// connectivity + hermes binary presence, and `ConnectionStatusViewModel`
+/// only pings `/bin/sh -c true`. When users file "connection green but
+/// everything empty" bug reports (issue #19), this is the diagnostic surface
+/// that tells them (and us) exactly which read fails and why.
+///
+/// One shell invocation runs every check on the remote and emits a
+/// line-delimited `KEY|STATUS|DETAIL` protocol that the view model parses.
+/// Cheaper than one SSH round-trip per probe and gives a consistent shell
+/// environment across all probes.
+@Observable
+@MainActor
+final class RemoteDiagnosticsViewModel {
+    private static let logger = Logger(subsystem: "com.scarf", category: "RemoteDiagnostics")
+
+    let context: ServerContext
+
+    /// Probes in display order. The order matters: connectivity first, then
+    /// environment checks, then Hermes data-path checks. A failure early in
+    /// the list usually explains every subsequent failure.
+    enum ProbeID: String, CaseIterable, Identifiable {
+        case connectivity
+        case remoteUser
+        case remoteHome
+        case hermesHomeConfigured
+        case hermesDirExists
+        case hermesDirReadable
+        case configYAMLReadable
+        case configYAMLContents
+        case stateDBReadable
+        case sqlite3Installed
+        case sqlite3CanOpenStateDB
+        case hermesBinaryNonLogin
+        case hermesBinaryLogin
+        case pgrepAvailable
+
+        var id: String { rawValue }
+
+        /// Human-readable title rendered in the diagnostics sheet.
+        var title: String {
+            switch self {
+            case .connectivity:           return "SSH connectivity"
+            case .remoteUser:             return "Remote user identity"
+            case .remoteHome:             return "Remote $HOME"
+            case .hermesHomeConfigured:   return "Hermes home directory"
+            case .hermesDirExists:        return "Hermes directory exists"
+            case .hermesDirReadable:      return "Hermes directory readable"
+            case .configYAMLReadable:     return "config.yaml readable"
+            case .configYAMLContents:     return "config.yaml actually readable (content)"
+            case .stateDBReadable:        return "state.db readable"
+            case .sqlite3Installed:       return "sqlite3 binary installed on remote"
+            case .sqlite3CanOpenStateDB:  return "sqlite3 can open state.db"
+            case .hermesBinaryNonLogin:   return "hermes binary on non-login PATH"
+            case .hermesBinaryLogin:      return "hermes binary on login PATH (via rc files)"
+            case .pgrepAvailable:         return "pgrep available (for 'is Hermes running')"
+            }
+        }
+
+        /// When the check fails, show this hint alongside the stderr.
+        var failureHint: String? {
+            switch self {
+            case .connectivity:
+                return "SSH itself can't complete. Before re-testing in Scarf, confirm `ssh <host>` works in Terminal."
+            case .remoteUser, .remoteHome:
+                return nil
+            case .hermesHomeConfigured:
+                return nil
+            case .hermesDirExists:
+                return "Scarf is looking at the default `~/.hermes`. If Hermes is installed elsewhere (e.g. `/var/lib/hermes/.hermes` for systemd installs), set the Hermes home directory in Manage Servers → this server → Edit."
+            case .hermesDirReadable:
+                return "The SSH user can see `~/.hermes` but can't list it. Check permissions: `ls -ld ~/.hermes` on the remote — the SSH user needs at least `r-x`."
+            case .configYAMLReadable, .configYAMLContents:
+                return "Scarf can't read `config.yaml`. This usually means the SSH user is different from the user Hermes runs as. Either (a) run Hermes as the SSH user, (b) `chmod a+r ~/.hermes/config.yaml`, or (c) configure Scarf to SSH as the Hermes user."
+            case .stateDBReadable:
+                return "Scarf can't read `state.db` — Sessions, Activity, Dashboard stats all depend on this. Same fix pattern as config.yaml."
+            case .sqlite3Installed:
+                return "Scarf pulls a snapshot of state.db via `sqlite3 .backup`, so sqlite3 must be installed on the remote. Install: `sudo apt install sqlite3` (Ubuntu/Debian), `sudo yum install sqlite` (RHEL/Fedora), `apk add sqlite` (Alpine)."
+            case .sqlite3CanOpenStateDB:
+                return "sqlite3 exists but can't open state.db. Could be a permission issue, a corrupt DB, or a version skew."
+            case .hermesBinaryNonLogin:
+                return "Scarf's runtime calls use non-login SSH shells (no .bashrc). If `hermes` only appears here via the login path, runtime CLI calls will fail. Move your PATH export from `.bashrc` to `.zshenv` or `.profile`."
+            case .hermesBinaryLogin:
+                return "hermes couldn't be located even after sourcing login rc files. Install path is non-standard — set the hermes binary path manually in Manage Servers."
+            case .pgrepAvailable:
+                return "pgrep not found on remote. Dashboard can't determine whether Hermes is running. Install procps: `apt install procps` (most distros have it by default)."
+            }
+        }
+    }
+
+    struct Probe: Identifiable, Sendable {
+        let id: ProbeID
+        let passed: Bool
+        let detail: String
+    }
+
+    private(set) var probes: [Probe] = []
+    private(set) var isRunning: Bool = false
+    private(set) var startedAt: Date?
+    private(set) var finishedAt: Date?
+
+    init(context: ServerContext) {
+        self.context = context
+    }
+
+    /// Kick off the full check list. Safe to call again to re-run.
+    func run() async {
+        if isRunning { return }
+        isRunning = true
+        probes = []
+        startedAt = Date()
+        finishedAt = nil
+
+        let script = Self.buildScript(hermesHome: context.paths.home)
+        let captured = await Self.execute(script: script, context: context)
+
+        switch captured {
+        case .connectFailure(let msg):
+            probes = [
+                Probe(id: .connectivity, passed: false, detail: msg)
+            ] + ProbeID.allCases
+                .filter { $0 != .connectivity }
+                .map { Probe(id: $0, passed: false, detail: "(skipped — SSH didn't connect)") }
+        case .completed(let stdout, let stderr, let exitCode):
+            probes = Self.parse(stdout: stdout, stderr: stderr, exitCode: exitCode)
+        }
+
+        finishedAt = Date()
+        isRunning = false
+        Self.logger.info("Diagnostics for \(self.context.displayName, privacy: .public) finished — \(self.passingCount)/\(self.probes.count) passing")
+    }
+
+    /// Quick summary string, e.g. "9/14 passing". Used in the header.
+    var summary: String {
+        guard !probes.isEmpty else { return "Not yet run." }
+        return "\(passingCount)/\(probes.count) checks passing"
+    }
+
+    var passingCount: Int {
+        probes.filter { $0.passed }.count
+    }
+
+    var allPassed: Bool {
+        !probes.isEmpty && passingCount == probes.count
+    }
+
+    // MARK: - Script + parsing
+
+    /// Build the remote shell script. Uses a pipe-delimited protocol so the
+    /// Swift side can parse without regex surprises. Status is `PASS` or
+    /// `FAIL`; detail is a single line (can be blank). `__END__` at the
+    /// bottom lets us detect truncation.
+    private static func buildScript(hermesHome: String) -> String {
+        // Shell-quote the home path — user may have typed `~/.hermes` which
+        // we want the remote shell to expand, so we substitute `~/` with
+        // `$HOME/` like `SSHTransport.remotePathArg` does.
+        let expanded: String
+        if hermesHome.hasPrefix("~/") {
+            expanded = "\"$HOME/\(hermesHome.dropFirst(2))\""
+        } else if hermesHome == "~" {
+            expanded = "\"$HOME\""
+        } else {
+            // Absolute path — still quote in case of spaces.
+            expanded = "\"\(hermesHome.replacingOccurrences(of: "\"", with: "\\\""))\""
+        }
+
+        return #"""
+        H=\#(expanded)
+        emit() { printf '%s|%s|%s\n' "$1" "$2" "$3"; }
+
+        emit connectivity PASS "(running in this shell)"
+
+        user=$(id -un 2>/dev/null || echo unknown)
+        emit remoteUser PASS "$user"
+
+        emit remoteHome PASS "$HOME"
+
+        emit hermesHomeConfigured PASS "$H"
+
+        if [ -d "$H" ]; then
+            emit hermesDirExists PASS "$H"
+        else
+            emit hermesDirExists FAIL "not a directory: $H"
+        fi
+
+        if [ -r "$H" ] && [ -x "$H" ]; then
+            emit hermesDirReadable PASS ""
+        else
+            emit hermesDirReadable FAIL "cannot read/enter $H (check perms on the dir)"
+        fi
+
+        if [ -r "$H/config.yaml" ]; then
+            emit configYAMLReadable PASS ""
+        else
+            if [ -e "$H/config.yaml" ]; then
+                emit configYAMLReadable FAIL "exists but not readable by $user"
+            else
+                emit configYAMLReadable FAIL "file does not exist"
+            fi
+        fi
+
+        if head -c 1 "$H/config.yaml" > /dev/null 2>&1; then
+            size=$(wc -c < "$H/config.yaml" 2>/dev/null | tr -d ' ')
+            emit configYAMLContents PASS "${size} bytes"
+        else
+            emit configYAMLContents FAIL "cannot read file contents"
+        fi
+
+        if [ -r "$H/state.db" ]; then
+            size=$(wc -c < "$H/state.db" 2>/dev/null | tr -d ' ')
+            emit stateDBReadable PASS "${size} bytes"
+        else
+            if [ -e "$H/state.db" ]; then
+                emit stateDBReadable FAIL "exists but not readable by $user"
+            else
+                emit stateDBReadable FAIL "file does not exist"
+            fi
+        fi
+
+        if command -v sqlite3 > /dev/null 2>&1; then
+            sq=$(command -v sqlite3)
+            emit sqlite3Installed PASS "$sq"
+        else
+            emit sqlite3Installed FAIL "sqlite3 not on PATH"
+        fi
+
+        if sqlite3 "$H/state.db" 'SELECT 1' > /dev/null 2>&1; then
+            emit sqlite3CanOpenStateDB PASS ""
+        else
+            err=$(sqlite3 "$H/state.db" 'SELECT 1' 2>&1 | head -1)
+            emit sqlite3CanOpenStateDB FAIL "$err"
+        fi
+
+        # Non-login PATH: just ask the current shell.
+        hpath=$(command -v hermes 2>/dev/null)
+        if [ -n "$hpath" ]; then
+            emit hermesBinaryNonLogin PASS "$hpath"
+        else
+            emit hermesBinaryNonLogin FAIL "not on non-login PATH ($PATH)"
+        fi
+
+        # Login PATH: source rc files (mirroring TestConnectionProbe) and re-probe.
+        for rc in "$HOME/.zshenv" "$HOME/.zprofile" "$HOME/.bash_profile" "$HOME/.profile"; do
+            [ -f "$rc" ] && . "$rc" 2>/dev/null
+        done
+        hpath2=$(command -v hermes 2>/dev/null)
+        if [ -z "$hpath2" ]; then
+            for cand in "$HOME/.local/bin/hermes" "/opt/homebrew/bin/hermes" "/usr/local/bin/hermes" "$HOME/.hermes/bin/hermes"; do
+                if [ -x "$cand" ]; then hpath2="$cand"; break; fi
+            done
+        fi
+        if [ -n "$hpath2" ]; then
+            emit hermesBinaryLogin PASS "$hpath2"
+        else
+            emit hermesBinaryLogin FAIL "not found after sourcing rc files"
+        fi
+
+        if command -v pgrep > /dev/null 2>&1; then
+            emit pgrepAvailable PASS "$(command -v pgrep)"
+        else
+            emit pgrepAvailable FAIL "pgrep not on PATH"
+        fi
+
+        printf '__END__\n'
+        """#
+    }
+
+    enum Captured {
+        case connectFailure(String)
+        case completed(stdout: String, stderr: String, exitCode: Int32)
+    }
+
+    private static func execute(script: String, context: ServerContext) async -> Captured {
+        let transport = context.makeTransport()
+        return await Task.detached { () -> Captured in
+            do {
+                let result = try transport.runProcess(
+                    executable: "/bin/sh",
+                    args: ["-c", script],
+                    stdin: nil,
+                    timeout: 30
+                )
+                return .completed(
+                    stdout: result.stdoutString,
+                    stderr: result.stderrString,
+                    exitCode: result.exitCode
+                )
+            } catch {
+                let desc = (error as? LocalizedError)?.errorDescription ?? error.localizedDescription
+                return .connectFailure(desc)
+            }
+        }.value
+    }
+
+    private static func parse(stdout: String, stderr: String, exitCode: Int32) -> [Probe] {
+        var results: [ProbeID: Probe] = [:]
+        for line in stdout.split(whereSeparator: { $0 == "\n" || $0 == "\r" }) {
+            let parts = line.split(separator: "|", maxSplits: 2, omittingEmptySubsequences: false)
+            guard parts.count == 3 else { continue }
+            let key = String(parts[0]).trimmingCharacters(in: .whitespaces)
+            let status = String(parts[1]).trimmingCharacters(in: .whitespaces)
+            let detail = String(parts[2]).trimmingCharacters(in: .whitespaces)
+            guard let probe = ProbeID(rawValue: key) else { continue }
+            results[probe] = Probe(
+                id: probe,
+                passed: status == "PASS",
+                detail: detail
+            )
+        }
+
+        // If the script didn't complete, fill in the missing probes so the UI
+        // still shows every expected row (rather than silently skipping).
+        let terminated = stdout.contains("__END__")
+        let fallbackDetail: String
+        if terminated {
+            fallbackDetail = "(no output)"
+        } else if exitCode != 0 {
+            fallbackDetail = "(script exited \(exitCode) before this check — stderr: \(stderr.prefix(200)))"
+        } else {
+            fallbackDetail = "(no output from script)"
+        }
+
+        return ProbeID.allCases.map { id in
+            results[id] ?? Probe(id: id, passed: false, detail: fallbackDetail)
+        }
+    }
+}
diff --git a/scarf/scarf/Features/Servers/ViewModels/TestConnectionProbe.swift b/scarf/scarf/Features/Servers/ViewModels/TestConnectionProbe.swift
index b2aeb34..0e3ba44 100644
--- a/scarf/scarf/Features/Servers/ViewModels/TestConnectionProbe.swift
+++ b/scarf/scarf/Features/Servers/ViewModels/TestConnectionProbe.swift
@@ -47,6 +47,26 @@ struct TestConnectionProbe {
         //      Scarf's local resolution.
         // The matched absolute path is stored as `hermesBinaryHint` on the
         // SSHConfig so subsequent CLI/ACP invocations don't have to re-probe.
+        // If the user already typed a remoteHome override, use it; otherwise
+        // default to $HOME/.hermes. Either way, the script also probes a
+        // short list of well-known alternates when the primary path doesn't
+        // have state.db — systemd/docker/VPS installs tend to live at
+        // /var/lib/hermes/.hermes or /home/hermes/.hermes, and SSHing in as
+        // a different user than the Hermes daemon is the leading cause of
+        // "connection green, data empty" bug reports (issue #19).
+        let primary: String
+        if let override = config.remoteHome, !override.isEmpty {
+            if override.hasPrefix("~/") {
+                primary = "$HOME/\(override.dropFirst(2))"
+            } else if override == "~" {
+                primary = "$HOME"
+            } else {
+                primary = override
+            }
+        } else {
+            primary = "$HOME/.hermes"
+        }
+
         let script = #"""
         hpath=$(command -v hermes 2>/dev/null)
         if [ -z "$hpath" ]; then
@@ -61,7 +81,21 @@ struct TestConnectionProbe {
             done
         fi
         echo "HERMES:$hpath"
-        if [ -e "$HOME/.hermes/state.db" ]; then echo DB:ok; else echo DB:missing; fi
+        PRIMARY="\#(primary)"
+        if [ -r "$PRIMARY/state.db" ]; then
+            echo "DB:ok"
+            echo "HOME_USED:$PRIMARY"
+        else
+            echo "DB:missing"
+            # Probe well-known alternates. Emit the first one that has a
+            # readable state.db so the UI can offer a one-click fill.
+            for alt in "/var/lib/hermes/.hermes" "/opt/hermes/.hermes" "/home/hermes/.hermes" "/root/.hermes"; do
+                if [ -r "$alt/state.db" ]; then
+                    echo "SUGGEST:$alt"
+                    break
+                fi
+            done
+        fi
         """#
         sshArgs.append("/bin/sh")
         sshArgs.append("-c")
@@ -133,6 +167,8 @@ struct TestConnectionProbe {
             let hermesPath = lines.first(where: { $0.hasPrefix("HERMES:") })?
                 .dropFirst("HERMES:".count).trimmingCharacters(in: .whitespaces) ?? ""
             let dbFound = lines.contains(where: { $0 == "DB:ok" })
+            let suggestedHome = lines.first(where: { $0.hasPrefix("SUGGEST:") })
+                .map { String($0.dropFirst("SUGGEST:".count)).trimmingCharacters(in: .whitespaces) }
             if hermesPath.isEmpty {
                 return .failure(
                     message: "hermes binary not found in remote $PATH",
@@ -140,7 +176,7 @@ struct TestConnectionProbe {
                     command: displayCommand
                 )
             }
-            return .success(hermesPath: String(hermesPath), dbFound: dbFound)
+            return .success(hermesPath: String(hermesPath), dbFound: dbFound, suggestedRemoteHome: suggestedHome)
         }
 
         // Classify common failures by scanning the stderr trace.
diff --git a/scarf/scarf/Features/Servers/Views/AddServerSheet.swift b/scarf/scarf/Features/Servers/Views/AddServerSheet.swift
index 2cfbdc0..2ee63e5 100644
--- a/scarf/scarf/Features/Servers/Views/AddServerSheet.swift
+++ b/scarf/scarf/Features/Servers/Views/AddServerSheet.swift
@@ -81,11 +81,15 @@ struct AddServerSheet: View {
                 }
             }
 
-            LabeledField("Remote ~/.hermes override") {
-                TextField("Leave blank for default", text: $viewModel.remoteHome)
+            LabeledField("Hermes data directory") {
+                TextField("Default: ~/.hermes", text: $viewModel.remoteHome)
                     .textFieldStyle(.roundedBorder)
                     .autocorrectionDisabled()
             }
+            Text("Leave blank unless Hermes is installed at a non-default path (systemd services often live at /var/lib/hermes/.hermes; Docker sidecars vary). Test Connection auto-suggests a value when it detects one of the known alternates.")
+                .font(.caption)
+                .foregroundStyle(.secondary)
+                .fixedSize(horizontal: false, vertical: true)
 
             Text("Scarf uses ssh-agent for authentication. If your key has a passphrase, run `ssh-add` before connecting — Scarf never prompts for or stores passphrases.")
                 .font(.caption)
@@ -113,14 +117,43 @@ struct AddServerSheet: View {
 
             if let result = viewModel.testResult {
                 switch result {
-                case .success(let path, let dbFound):
-                    VStack(alignment: .leading, spacing: 4) {
+                case .success(let path, let dbFound, let suggestedHome):
+                    VStack(alignment: .leading, spacing: 6) {
                         Label("Connected", systemImage: "checkmark.circle.fill")
                             .foregroundStyle(.green)
                         Text("hermes at \(path)").font(.caption).monospaced()
-                        Text(dbFound ? "state.db found" : "state.db not found — Hermes may not have run yet on the remote")
-                            .font(.caption)
-                            .foregroundStyle(dbFound ? Color.secondary : Color.orange)
+                        if dbFound {
+                            Text("state.db readable")
+                                .font(.caption)
+                                .foregroundStyle(.secondary)
+                        } else if let suggestion = suggestedHome {
+                            // Scarf found Hermes data at one of the common
+                            // alternate paths. One-click fill the
+                            // remoteHome field so the user doesn't have to
+                            // know this is a convention thing.
+                            VStack(alignment: .leading, spacing: 4) {
+                                Text("state.db not found at the default location, but Scarf found one at:")
+                                    .font(.caption)
+                                    .foregroundStyle(.orange)
+                                HStack {
+                                    Text(suggestion)
+                                        .font(.caption.monospaced())
+                                        .textSelection(.enabled)
+                                    Spacer()
+                                    Button("Use this") {
+                                        viewModel.remoteHome = suggestion
+                                    }
+                                    .controlSize(.small)
+                                }
+                                .padding(8)
+                                .background(Color.yellow.opacity(0.12), in: RoundedRectangle(cornerRadius: 6))
+                            }
+                        } else {
+                            Text("state.db not found at the configured path. Either Hermes hasn't run yet on this server, or it's installed at a non-default location — set the Hermes data directory field above.")
+                                .font(.caption)
+                                .foregroundStyle(.orange)
+                                .fixedSize(horizontal: false, vertical: true)
+                        }
                     }
                 case .failure(let message, let stderr, let command):
                     VStack(alignment: .leading, spacing: 6) {
diff --git a/scarf/scarf/Features/Servers/Views/ConnectionStatusPill.swift b/scarf/scarf/Features/Servers/Views/ConnectionStatusPill.swift
index 3533585..1e3e468 100644
--- a/scarf/scarf/Features/Servers/Views/ConnectionStatusPill.swift
+++ b/scarf/scarf/Features/Servers/Views/ConnectionStatusPill.swift
@@ -8,12 +8,17 @@ import SwiftUI
 struct ConnectionStatusPill: View {
     let status: ConnectionStatusViewModel
     @State private var showDetails = false
+    @State private var showDiagnostics = false
 
     var body: some View {
         Button {
             switch status.status {
             case .error:
                 showDetails = true
+            case .degraded:
+                // Yellow "can't read" state — open the diagnostics sheet
+                // so the user can see exactly which files fail and why.
+                showDiagnostics = true
             case .connected, .idle:
                 status.retry()
             }
@@ -36,11 +41,15 @@ struct ConnectionStatusPill: View {
         .popover(isPresented: $showDetails, arrowEdge: .bottom) {
             errorDetails.frame(width: 400)
         }
+        .sheet(isPresented: $showDiagnostics) {
+            RemoteDiagnosticsView(context: status.context)
+        }
     }
 
     private var color: Color {
         switch status.status {
         case .connected: return .green
+        case .degraded: return .orange
         case .idle: return .yellow
         case .error: return .red
         }
@@ -49,6 +58,7 @@ struct ConnectionStatusPill: View {
     private var label: String {
         switch status.status {
         case .connected: return "Connected"
+        case .degraded: return "Connected — can't read Hermes state"
         case .idle: return "Checking…"
         case .error(let message, _): return message
         }
@@ -62,6 +72,8 @@ struct ConnectionStatusPill: View {
                 return "Last probe: \(fmt.localizedString(for: ts, relativeTo: Date()))"
             }
             return "Connected"
+        case .degraded(let reason):
+            return "SSH works but \(reason). Click for diagnostics."
         case .idle: return "Waiting for first probe"
         case .error(_, _): return "Click for details"
         }
diff --git a/scarf/scarf/Features/Servers/Views/ManageServersView.swift b/scarf/scarf/Features/Servers/Views/ManageServersView.swift
index 994c789..ffaa929 100644
--- a/scarf/scarf/Features/Servers/Views/ManageServersView.swift
+++ b/scarf/scarf/Features/Servers/Views/ManageServersView.swift
@@ -6,6 +6,7 @@ struct ManageServersView: View {
     @Environment(ServerRegistry.self) private var registry
     @State private var showAddSheet = false
     @State private var pendingRemoveID: ServerID?
+    @State private var diagnosticsContext: ServerContext?
 
     var body: some View {
         VStack(alignment: .leading, spacing: 0) {
@@ -17,12 +18,18 @@ struct ManageServersView: View {
                 list
             }
         }
-        .frame(width: 380, height: 360)
+        .frame(width: 440, height: 380)
         .sheet(isPresented: $showAddSheet) {
             AddServerSheet { name, config in
                 _ = registry.addServer(displayName: name, config: config)
             }
         }
+        .sheet(item: Binding(
+            get: { diagnosticsContext.map { IdentifiableContext(context: $0) } },
+            set: { diagnosticsContext = $0?.context }
+        )) { wrapper in
+            RemoteDiagnosticsView(context: wrapper.context)
+        }
         .confirmationDialog(
             "Remove this server?",
             isPresented: Binding(
@@ -42,6 +49,13 @@ struct ManageServersView: View {
         )
     }
 
+    /// Wrapper because `ServerContext` isn't `Identifiable` against the sheet
+    /// item API in a way that preserves display-ordering stability.
+    private struct IdentifiableContext: Identifiable {
+        var id: ServerID { context.id }
+        let context: ServerContext
+    }
+
     private var header: some View {
         HStack {
             Text("Servers").font(.headline)
@@ -87,6 +101,13 @@ struct ManageServersView: View {
                         }
                     }
                     Spacer()
+                    Button {
+                        diagnosticsContext = entry.context
+                    } label: {
+                        Image(systemName: "stethoscope")
+                    }
+                    .buttonStyle(.borderless)
+                    .help("Run remote diagnostics — check exactly which files are readable on this server.")
                     Button {
                         pendingRemoveID = entry.id
                     } label: {
@@ -94,6 +115,7 @@ struct ManageServersView: View {
                     }
                     .buttonStyle(.borderless)
                     .foregroundStyle(.red)
+                    .help("Remove this server from Scarf.")
                 }
                 .padding(.vertical, 4)
             }
diff --git a/scarf/scarf/Features/Servers/Views/RemoteDiagnostics.swift b/scarf/scarf/Features/Servers/Views/RemoteDiagnostics.swift
new file mode 100644
index 0000000..26cc440
--- /dev/null
+++ b/scarf/scarf/Features/Servers/Views/RemoteDiagnostics.swift
@@ -0,0 +1,166 @@
+import SwiftUI
+import AppKit
+
+/// Per-server diagnostics sheet. Shown from Manage Servers and from the
+/// Dashboard "Run Diagnostics…" button when `lastReadError` is set. Gives
+/// the user a specific list of what does/doesn't work over SSH, with
+/// targeted remediation hints for each failure.
+///
+/// Design principle: a failing check always shows both the raw detail the
+/// remote shell produced AND a human-written hint. The raw detail lets us
+/// triage bug reports; the hint unblocks the user without a round trip.
+struct RemoteDiagnosticsView: View {
+    let context: ServerContext
+    @State private var viewModel: RemoteDiagnosticsViewModel
+    @Environment(\.dismiss) private var dismiss
+
+    init(context: ServerContext) {
+        self.context = context
+        _viewModel = State(initialValue: RemoteDiagnosticsViewModel(context: context))
+    }
+
+    var body: some View {
+        VStack(spacing: 0) {
+            header
+            Divider()
+            probeList
+            Divider()
+            footer
+        }
+        .frame(minWidth: 640, minHeight: 520)
+        .task { await viewModel.run() }
+    }
+
+    private var header: some View {
+        VStack(alignment: .leading, spacing: 6) {
+            HStack {
+                Text("Remote Diagnostics — \(context.displayName)")
+                    .font(.title3)
+                    .fontWeight(.semibold)
+                Spacer()
+                if viewModel.isRunning {
+                    ProgressView()
+                        .controlSize(.small)
+                } else {
+                    Button("Re-run") { Task { await viewModel.run() } }
+                        .controlSize(.small)
+                }
+            }
+            HStack {
+                if viewModel.isRunning {
+                    Text("Running checks…")
+                        .font(.callout)
+                        .foregroundStyle(.secondary)
+                } else {
+                    Label(viewModel.summary, systemImage: viewModel.allPassed ? "checkmark.seal" : "info.circle")
+                        .font(.callout)
+                        .foregroundStyle(viewModel.allPassed ? .green : .orange)
+                }
+                Spacer()
+                if !viewModel.probes.isEmpty {
+                    Button {
+                        copyReportToClipboard()
+                    } label: {
+                        Label("Copy Full Report", systemImage: "doc.on.doc")
+                    }
+                    .controlSize(.small)
+                    .help("Copy a plain-text summary of every check (passes and fails) — paste into GitHub issues so we can see everything at once.")
+                }
+            }
+        }
+        .padding(16)
+    }
+
+    private var probeList: some View {
+        ScrollView {
+            VStack(alignment: .leading, spacing: 0) {
+                if viewModel.probes.isEmpty && viewModel.isRunning {
+                    Text("Running a single shell session on \(context.displayName) that exercises every path Scarf reads…")
+                        .font(.callout)
+                        .foregroundStyle(.secondary)
+                        .padding()
+                }
+                ForEach(viewModel.probes) { probe in
+                    probeRow(probe)
+                    if probe.id != viewModel.probes.last?.id {
+                        Divider()
+                    }
+                }
+            }
+        }
+    }
+
+    private func probeRow(_ probe: RemoteDiagnosticsViewModel.Probe) -> some View {
+        HStack(alignment: .top, spacing: 12) {
+            Image(systemName: probe.passed ? "checkmark.circle.fill" : "xmark.circle.fill")
+                .foregroundStyle(probe.passed ? .green : .red)
+                .font(.title3)
+                .padding(.top, 2)
+            VStack(alignment: .leading, spacing: 4) {
+                Text(probe.id.title)
+                    .font(.body)
+                if !probe.detail.isEmpty {
+                    Text(probe.detail)
+                        .font(.caption.monospaced())
+                        .foregroundStyle(.secondary)
+                        .textSelection(.enabled)
+                }
+                if !probe.passed, let hint = probe.id.failureHint {
+                    HStack(alignment: .top, spacing: 6) {
+                        Image(systemName: "lightbulb")
+                            .foregroundStyle(.yellow)
+                            .font(.caption)
+                        Text(hint)
+                            .font(.caption)
+                            .foregroundStyle(.primary)
+                            .textSelection(.enabled)
+                            .fixedSize(horizontal: false, vertical: true)
+                    }
+                    .padding(8)
+                    .background(Color.yellow.opacity(0.08))
+                    .clipShape(RoundedRectangle(cornerRadius: 6))
+                }
+            }
+            Spacer(minLength: 0)
+        }
+        .padding(.horizontal, 16)
+        .padding(.vertical, 10)
+    }
+
+    private var footer: some View {
+        HStack {
+            Text("Scarf runs these over a single SSH session that mirrors the shell your dashboard reads from, so a green row here means Scarf can actually read that file at runtime.")
+                .font(.caption)
+                .foregroundStyle(.secondary)
+                .fixedSize(horizontal: false, vertical: true)
+            Spacer()
+            Button("Done") { dismiss() }
+                .keyboardShortcut(.defaultAction)
+        }
+        .padding(16)
+    }
+
+    private func copyReportToClipboard() {
+        var lines: [String] = []
+        lines.append("Scarf remote diagnostics — \(context.displayName)")
+        if case .ssh(let config) = context.kind {
+            lines.append("Host: \(config.host)" + (config.user.map { " (user: \($0))" } ?? ""))
+            if let rh = config.remoteHome { lines.append("Hermes home (override): \(rh)") }
+        }
+        lines.append("Ran at: \(viewModel.startedAt.map { ISO8601DateFormatter().string(from: $0) } ?? "?")")
+        lines.append("Result: \(viewModel.summary)")
+        lines.append("")
+        for probe in viewModel.probes {
+            let mark = probe.passed ? "PASS" : "FAIL"
+            lines.append("[\(mark)] \(probe.id.title)")
+            if !probe.detail.isEmpty { lines.append("    \(probe.detail)") }
+            if !probe.passed, let hint = probe.id.failureHint {
+                lines.append("    hint: \(hint)")
+            }
+        }
+        let text = lines.joined(separator: "\n")
+        let pb = NSPasteboard.general
+        pb.clearContents()
+        pb.setString(text, forType: .string)
+    }
+}

From f8069a44810a0a0887a4746693ae21f9013c5cfc Mon Sep 17 00:00:00 2001
From: Alan Wizemann <alan@wizemann.com>
Date: Mon, 20 Apr 2026 13:52:28 -0700
Subject: [PATCH 2/8] fix: don't log 'No such file' as warnings on remote reads
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The Result-returning readers I added for the v2.0.1 diagnostics surface
were logging EVERY failure, including routine "file doesn't exist" cases
— e.g. skill.yaml files under ~/.hermes/skills/*/ that are optional
metadata, gateway_state.json before Hermes has started, memories/USER.md
on fresh installs.

In practice this meant the Platforms view and similar feature loaders
that walk directories and read optional files now spam the Console with
warnings on every refresh. That's noisier than useful and actively hides
the signal (permission denied, connection failure, sqlite3 missing) we
added the logging to surface.

readFileDataResult now detects the "no such file" case via either:
- TransportError.fileIO(_, "No such file...") from SSHTransport
- NSCocoaErrorDomain code 260 (NSFileNoSuchFileError) from FileManager
- NSPOSIXErrorDomain code 2 (ENOENT)

and suppresses the warning log for those paths. The Result.failure is
still returned, so any caller that cares (Dashboard's banner, Remote
Diagnostics) can still distinguish missing from present-but-unreadable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 .../Core/Services/HermesFileService.swift     | 27 ++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/scarf/scarf/Core/Services/HermesFileService.swift b/scarf/scarf/Core/Services/HermesFileService.swift
index 2e3df54..cf5e980 100644
--- a/scarf/scarf/Core/Services/HermesFileService.swift
+++ b/scarf/scarf/Core/Services/HermesFileService.swift
@@ -1591,11 +1591,36 @@ struct HermesFileService: Sendable {
             let data = try transport.readFile(path)
             return .success(data)
         } catch {
-            Self.logger.warning("readFile(\(path, privacy: .public)) failed: \(error.localizedDescription, privacy: .public)")
+            // Don't log "No such file" — that's a routine, expected case
+            // for optional files (skill.yaml, gateway_state.json before
+            // Hermes starts, ~/.hermes/memories/USER.md on fresh installs,
+            // etc.). The caller still gets the Result.failure so it can
+            // distinguish missing from present-but-unreadable.
+            // Log everything else — permission denied, connection drops,
+            // sqlite3 missing — since those are actionable diagnostics.
+            if !Self.isFileNotFound(error) {
+                Self.logger.warning("readFile(\(path, privacy: .public)) failed: \(error.localizedDescription, privacy: .public)")
+            }
             return .failure(error)
         }
     }
 
+    /// `true` iff the error represents "file does not exist" as opposed to
+    /// a permission / transport / parse failure. Used to suppress routine
+    /// logging for optional files while still surfacing real problems.
+    nonisolated private static func isFileNotFound(_ error: Error) -> Bool {
+        if let transportErr = error as? TransportError,
+           case .fileIO(_, let underlying) = transportErr {
+            return underlying.lowercased().contains("no such file")
+        }
+        // Cocoa NSFileNoSuchFileError (returned by LocalTransport when
+        // reading a missing file via FileManager).
+        let ns = error as NSError
+        if ns.domain == NSCocoaErrorDomain && ns.code == 260 { return true }
+        if ns.domain == NSPOSIXErrorDomain && ns.code == 2 { return true }   // ENOENT
+        return false
+    }
+
     /// Write a UTF-8 text file atomically through the transport. Matches the
     /// old pre-transport behavior (print + swallow on error) because the
     /// callers don't have a UI path for surfacing I/O failures — that's

From ec03627bcd3f09372f52a24583a5299a390f41cb Mon Sep 17 00:00:00 2001
From: Alan Wizemann <alan@wizemann.com>
Date: Mon, 20 Apr 2026 13:56:09 -0700
Subject: [PATCH 3/8] =?UTF-8?q?fix:=20diagnostics=20sheet=20=E2=80=94=20by?=
 =?UTF-8?q?pass=20transport.runProcess=20for=20shell=20script?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

First-run of diagnostics against a working Mardon returned 0/14 passing
with "(no output)" for every probe — including the trivial "emit
connectivity PASS" that the script emits unconditionally. That meant the
script wasn't executing as written; the parser saw `__END__` but no
probe lines.

Root cause: SSHTransport.runProcess wraps every argument through
`remotePathArg`, which is designed for PATHS (it rewrites `~/` to
`$HOME/` and double-quotes the result with backslash-escapes). Passing
a multi-line shell script with embedded `"$1"` / `"$2"` / `"$3"` and
`printf '\n'` escape sequences through that is corruption — the remote
sh -c receives a scrambled script and silently emits nothing.

TestConnectionProbe already works around this: it builds the ssh argv
directly (ssh host -- /bin/sh -c <script>) so the script travels as a
single opaque argv entry and ssh forwards it to the remote shell
unchanged.

Mirror that approach. RemoteDiagnosticsViewModel.execute now:
- For remote contexts: builds ssh argv directly (ControlMaster-aware,
  uses the same socket as SSHTransport so it's effectively free after
  the first connection), then passes /bin/sh -c <script> as argv.
- For local contexts: spawns /bin/sh -c <script> via Process directly.

Also surfaces raw stdout/stderr/exit-code in a disclosure panel at the
bottom of the sheet, visible only when ALL probes fail. Makes any
future transport-level breakage self-diagnosing: the user sees exactly
what the remote returned, not just "(no output)" rows.

Expose SSHTransport.controlDirPath (already static) as a public helper
so the diagnostics probe reuses the same ControlMaster socket as the
connection itself.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 .../RemoteDiagnosticsViewModel.swift          | 155 ++++++++++++++++--
 .../Servers/Views/RemoteDiagnostics.swift     |  51 +++++-
 2 files changed, 183 insertions(+), 23 deletions(-)

diff --git a/scarf/scarf/Features/Servers/ViewModels/RemoteDiagnosticsViewModel.swift b/scarf/scarf/Features/Servers/ViewModels/RemoteDiagnosticsViewModel.swift
index 434ff90..c96b31d 100644
--- a/scarf/scarf/Features/Servers/ViewModels/RemoteDiagnosticsViewModel.swift
+++ b/scarf/scarf/Features/Servers/ViewModels/RemoteDiagnosticsViewModel.swift
@@ -101,6 +101,13 @@ final class RemoteDiagnosticsViewModel {
     private(set) var isRunning: Bool = false
     private(set) var startedAt: Date?
     private(set) var finishedAt: Date?
+    /// Raw stdout/stderr from the most recent run, preserved so the UI can
+    /// surface them in a disclosure panel when things look wrong. This is
+    /// how we debug cases where the script ran but no probes were parsed
+    /// (e.g. transport-quoting bugs, dash-vs-bash incompatibilities).
+    private(set) var rawStdout: String = ""
+    private(set) var rawStderr: String = ""
+    private(set) var rawExitCode: Int32 = 0
 
     init(context: ServerContext) {
         self.context = context
@@ -119,12 +126,18 @@ final class RemoteDiagnosticsViewModel {
 
         switch captured {
         case .connectFailure(let msg):
+            rawStdout = ""
+            rawStderr = msg
+            rawExitCode = -1
             probes = [
                 Probe(id: .connectivity, passed: false, detail: msg)
             ] + ProbeID.allCases
                 .filter { $0 != .connectivity }
                 .map { Probe(id: $0, passed: false, detail: "(skipped — SSH didn't connect)") }
         case .completed(let stdout, let stderr, let exitCode):
+            rawStdout = stdout
+            rawStderr = stderr
+            rawExitCode = exitCode
             probes = Self.parse(stdout: stdout, stderr: stderr, exitCode: exitCode)
         }
 
@@ -274,27 +287,137 @@ final class RemoteDiagnosticsViewModel {
     }
 
     private static func execute(script: String, context: ServerContext) async -> Captured {
-        let transport = context.makeTransport()
+        // Can't use `transport.runProcess(executable: "/bin/sh", args: ["-c", script])`
+        // here: SSHTransport.runProcess pipes every argument through
+        // `remotePathArg` (which double-quotes to rewrite `~/` → `$HOME/`),
+        // which mangles a multi-line shell script containing `"$1"`,
+        // nested quotes, and `printf` escape sequences. The result on the
+        // remote is a scrambled string and every probe fails to emit.
+        //
+        // Mirror TestConnectionProbe's approach: build the ssh argv
+        // directly so the script travels as a single opaque argv entry
+        // that ssh forwards to the remote shell unchanged.
+        switch context.kind {
+        case .local:
+            return await runLocally(script: script)
+        case .ssh(let config):
+            return await runOverSSH(script: script, config: config)
+        }
+    }
+
+    /// Direct ssh invocation — same style as TestConnectionProbe. Uses the
+    /// connection's ControlMaster socket so this is cheap (~5ms) after the
+    /// first connection is warm.
+    private static func runOverSSH(script: String, config: SSHConfig) async -> Captured {
+        var sshArgv: [String] = [
+            "-o", "ControlMaster=auto",
+            "-o", "ControlPath=\(controlDirPath())/%C",
+            "-o", "ControlPersist=600",
+            "-o", "ServerAliveInterval=30",
+            "-o", "ConnectTimeout=10",
+            "-o", "StrictHostKeyChecking=accept-new",
+            "-o", "LogLevel=QUIET",
+            "-o", "BatchMode=yes"
+        ]
+        if let port = config.port { sshArgv += ["-p", String(port)] }
+        if let id = config.identityFile, !id.isEmpty {
+            sshArgv += ["-i", id]
+        }
+        let hostSpec: String
+        if let user = config.user, !user.isEmpty { hostSpec = "\(user)@\(config.host)" }
+        else { hostSpec = config.host }
+        sshArgv.append(hostSpec)
+        sshArgv.append("--")
+        sshArgv.append("/bin/sh")
+        sshArgv.append("-c")
+        sshArgv.append(script)
+
         return await Task.detached { () -> Captured in
-            do {
-                let result = try transport.runProcess(
-                    executable: "/bin/sh",
-                    args: ["-c", script],
-                    stdin: nil,
-                    timeout: 30
-                )
-                return .completed(
-                    stdout: result.stdoutString,
-                    stderr: result.stderrString,
-                    exitCode: result.exitCode
-                )
-            } catch {
-                let desc = (error as? LocalizedError)?.errorDescription ?? error.localizedDescription
-                return .connectFailure(desc)
+            let proc = Process()
+            proc.executableURL = URL(fileURLWithPath: "/usr/bin/ssh")
+            proc.arguments = sshArgv
+
+            // Inherit the shell's SSH_AUTH_SOCK so ssh can reach the
+            // agent — same pattern as SSHTransport + TestConnectionProbe.
+            var env = ProcessInfo.processInfo.environment
+            let shellEnv = HermesFileService.enrichedEnvironment()
+            for key in ["SSH_AUTH_SOCK", "SSH_AGENT_PID"] {
+                if env[key] == nil, let v = shellEnv[key], !v.isEmpty {
+                    env[key] = v
+                }
             }
+            proc.environment = env
+
+            let stdoutPipe = Pipe()
+            let stderrPipe = Pipe()
+            proc.standardOutput = stdoutPipe
+            proc.standardError = stderrPipe
+            do {
+                try proc.run()
+            } catch {
+                return .connectFailure("Failed to launch ssh: \(error.localizedDescription)")
+            }
+            let deadline = Date().addingTimeInterval(30)
+            while proc.isRunning && Date() < deadline {
+                try? await Task.sleep(nanoseconds: 100_000_000)
+            }
+            if proc.isRunning {
+                proc.terminate()
+                return .connectFailure("Diagnostics timed out after 30s")
+            }
+            let out = (try? stdoutPipe.fileHandleForReading.readToEnd()) ?? Data()
+            let err = (try? stderrPipe.fileHandleForReading.readToEnd()) ?? Data()
+            return .completed(
+                stdout: String(data: out, encoding: .utf8) ?? "",
+                stderr: String(data: err, encoding: .utf8) ?? "",
+                exitCode: proc.terminationStatus
+            )
         }.value
     }
 
+    /// Local Shell invocation — runs the diagnostic script against the
+    /// user's own Mac. Less useful than the remote form (most checks will
+    /// trivially pass), but lets the same UI work for both contexts.
+    private static func runLocally(script: String) async -> Captured {
+        return await Task.detached { () -> Captured in
+            let proc = Process()
+            proc.executableURL = URL(fileURLWithPath: "/bin/sh")
+            proc.arguments = ["-c", script]
+
+            let stdoutPipe = Pipe()
+            let stderrPipe = Pipe()
+            proc.standardOutput = stdoutPipe
+            proc.standardError = stderrPipe
+            do {
+                try proc.run()
+            } catch {
+                return .connectFailure("Failed to launch /bin/sh: \(error.localizedDescription)")
+            }
+            let deadline = Date().addingTimeInterval(10)
+            while proc.isRunning && Date() < deadline {
+                try? await Task.sleep(nanoseconds: 100_000_000)
+            }
+            if proc.isRunning {
+                proc.terminate()
+                return .connectFailure("Local diagnostics timed out (should be <1s)")
+            }
+            let out = (try? stdoutPipe.fileHandleForReading.readToEnd()) ?? Data()
+            let err = (try? stderrPipe.fileHandleForReading.readToEnd()) ?? Data()
+            return .completed(
+                stdout: String(data: out, encoding: .utf8) ?? "",
+                stderr: String(data: err, encoding: .utf8) ?? "",
+                exitCode: proc.terminationStatus
+            )
+        }.value
+    }
+
+    /// Same cache directory used by SSHTransport — shared so the diagnostic
+    /// probe reuses the connection's ControlMaster socket when it already
+    /// exists (no second TCP handshake, no second auth).
+    private static func controlDirPath() -> String {
+        SSHTransport.controlDirPath()
+    }
+
     private static func parse(stdout: String, stderr: String, exitCode: Int32) -> [Probe] {
         var results: [ProbeID: Probe] = [:]
         for line in stdout.split(whereSeparator: { $0 == "\n" || $0 == "\r" }) {
diff --git a/scarf/scarf/Features/Servers/Views/RemoteDiagnostics.swift b/scarf/scarf/Features/Servers/Views/RemoteDiagnostics.swift
index 26cc440..e0cb005 100644
--- a/scarf/scarf/Features/Servers/Views/RemoteDiagnostics.swift
+++ b/scarf/scarf/Features/Servers/Views/RemoteDiagnostics.swift
@@ -128,14 +128,51 @@ struct RemoteDiagnosticsView: View {
     }
 
     private var footer: some View {
-        HStack {
-            Text("Scarf runs these over a single SSH session that mirrors the shell your dashboard reads from, so a green row here means Scarf can actually read that file at runtime.")
+        VStack(alignment: .leading, spacing: 8) {
+            // Raw-output disclosure. Shown when EVERY probe fails, since
+            // that's the signature of a script-quoting / transport-level
+            // issue rather than a real remote problem. Hidden in the normal
+            // case so it doesn't distract.
+            if !viewModel.probes.isEmpty, viewModel.passingCount == 0 {
+                DisclosureGroup("Raw remote output (for debugging)") {
+                    VStack(alignment: .leading, spacing: 6) {
+                        Text("exit code: \(viewModel.rawExitCode)")
+                            .font(.caption.monospaced())
+                        if !viewModel.rawStdout.isEmpty {
+                            Text("stdout:").font(.caption).foregroundStyle(.secondary)
+                            ScrollView {
+                                Text(viewModel.rawStdout)
+                                    .font(.system(size: 10, design: .monospaced))
+                                    .textSelection(.enabled)
+                                    .frame(maxWidth: .infinity, alignment: .leading)
+                            }
+                            .frame(maxHeight: 140)
+                        }
+                        if !viewModel.rawStderr.isEmpty {
+                            Text("stderr:").font(.caption).foregroundStyle(.secondary)
+                            ScrollView {
+                                Text(viewModel.rawStderr)
+                                    .font(.system(size: 10, design: .monospaced))
+                                    .textSelection(.enabled)
+                                    .frame(maxWidth: .infinity, alignment: .leading)
+                            }
+                            .frame(maxHeight: 140)
+                        }
+                    }
+                    .padding(8)
+                    .background(Color.secondary.opacity(0.08), in: RoundedRectangle(cornerRadius: 6))
+                }
                 .font(.caption)
-                .foregroundStyle(.secondary)
-                .fixedSize(horizontal: false, vertical: true)
-            Spacer()
-            Button("Done") { dismiss() }
-                .keyboardShortcut(.defaultAction)
+            }
+            HStack {
+                Text("Scarf runs these over a single SSH session that mirrors the shell your dashboard reads from, so a green row here means Scarf can actually read that file at runtime.")
+                    .font(.caption)
+                    .foregroundStyle(.secondary)
+                    .fixedSize(horizontal: false, vertical: true)
+                Spacer()
+                Button("Done") { dismiss() }
+                    .keyboardShortcut(.defaultAction)
+            }
         }
         .padding(16)
     }

From c51241dc72011b52b9fa008ee2388e0b897687ba Mon Sep 17 00:00:00 2001
From: Alan Wizemann <alan@wizemann.com>
Date: Mon, 20 Apr 2026 14:04:32 -0700
Subject: [PATCH 4/8] =?UTF-8?q?fix:=20diagnostics=20script=20=E2=80=94=20p?=
 =?UTF-8?q?ipe=20to=20sh=20via=20stdin,=20not=20sh=20-c=20argv?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The previous fix (direct ssh argv, bypassing transport.runProcess) got
us from 0/14 to 7/14, but \$H was empty everywhere it was referenced —
the user's 7/14 report showed:
- probe 4 (hermesHomeConfigured): PASS with empty detail
- probe 5 (hermesDirExists FAIL): "not a directory:" (empty after colon)
- probe 11 (sqlite3CanOpenStateDB FAIL): 'unable to open "/state.db"'

Root cause: `ssh host -- /bin/sh -c <script>` doesn't travel as three
argv entries to the remote. ssh concatenates them with single spaces
into one command string and sends that to the remote's LOGIN shell.
The login shell then runs `$LOGIN_SHELL -c "$string"`, and bash's
parser treats unquoted newlines inside `$string` as command separators.
So the first newline splits the script: `/bin/sh -c H="..."` becomes
one command (which runs in an ephemeral sh subprocess that exits
immediately), and every subsequent line runs in the login shell with
no \$H set.

TestConnectionProbe happens to still work because its downstream lines
don't depend on an assignment from the first line — but the diagnostic
script's \$H is used everywhere, so the entire script is effectively
running with \$H="".

Fix: pipe the script into `/bin/sh -s` on stdin via ssh's own stdin
channel. `sh -s` reads a shell program from stdin and executes it in
one process, variable scope preserved. Implementation uses
Process.standardInput with a Pipe, writing the script after proc.run()
and closing the write end so sh sees EOF. Same as
`cat script.sh | ssh host -- /bin/sh -s` from the command line.

Also: raw-output disclosure panel in the diagnostics sheet now shows
whenever ANY probe fails, not only when all fail. Partial failures are
the most common failure mode and the raw stdout is the only way to see
why a specific detail came back the way it did.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 .../RemoteDiagnosticsViewModel.swift          | 33 +++++++++++++++----
 .../Servers/Views/RemoteDiagnostics.swift     | 10 +++---
 2 files changed, 32 insertions(+), 11 deletions(-)

diff --git a/scarf/scarf/Features/Servers/ViewModels/RemoteDiagnosticsViewModel.swift b/scarf/scarf/Features/Servers/ViewModels/RemoteDiagnosticsViewModel.swift
index c96b31d..06b9649 100644
--- a/scarf/scarf/Features/Servers/ViewModels/RemoteDiagnosticsViewModel.swift
+++ b/scarf/scarf/Features/Servers/ViewModels/RemoteDiagnosticsViewModel.swift
@@ -305,9 +305,19 @@ final class RemoteDiagnosticsViewModel {
         }
     }
 
-    /// Direct ssh invocation — same style as TestConnectionProbe. Uses the
-    /// connection's ControlMaster socket so this is cheap (~5ms) after the
-    /// first connection is warm.
+    /// Direct ssh invocation. Pipes the script into `sh` on stdin rather
+    /// than passing it as `sh -c <script>` argv — because ssh concatenates
+    /// argv with spaces and sends that as a single command string to the
+    /// remote's LOGIN shell, which then parses newlines as command
+    /// separators. A multi-line `sh -c <script>` would run only the first
+    /// line inside the `sh` subprocess (any variables set there die when
+    /// `sh` exits), and the rest would run in the login shell with no
+    /// access to those variables. Symptom: `$H=""` everywhere downstream.
+    ///
+    /// Feeding the script via stdin avoids the split entirely — `sh -s`
+    /// consumes the whole stream in one process, so variable scope is
+    /// preserved and the script runs exactly the same way it would from
+    /// a local `cat script.sh | sh`.
     private static func runOverSSH(script: String, config: SSHConfig) async -> Captured {
         var sshArgv: [String] = [
             "-o", "ControlMaster=auto",
@@ -317,7 +327,8 @@ final class RemoteDiagnosticsViewModel {
             "-o", "ConnectTimeout=10",
             "-o", "StrictHostKeyChecking=accept-new",
             "-o", "LogLevel=QUIET",
-            "-o", "BatchMode=yes"
+            "-o", "BatchMode=yes",
+            "-T"  // no pty — keep stdin/stdout a clean byte stream
         ]
         if let port = config.port { sshArgv += ["-p", String(port)] }
         if let id = config.identityFile, !id.isEmpty {
@@ -329,8 +340,7 @@ final class RemoteDiagnosticsViewModel {
         sshArgv.append(hostSpec)
         sshArgv.append("--")
         sshArgv.append("/bin/sh")
-        sshArgv.append("-c")
-        sshArgv.append(script)
+        sshArgv.append("-s")   // read script from stdin
 
         return await Task.detached { () -> Captured in
             let proc = Process()
@@ -348,15 +358,26 @@ final class RemoteDiagnosticsViewModel {
             }
             proc.environment = env
 
+            let stdinPipe = Pipe()
             let stdoutPipe = Pipe()
             let stderrPipe = Pipe()
+            proc.standardInput = stdinPipe
             proc.standardOutput = stdoutPipe
             proc.standardError = stderrPipe
+
             do {
                 try proc.run()
             } catch {
                 return .connectFailure("Failed to launch ssh: \(error.localizedDescription)")
             }
+
+            // Write the script to ssh's stdin, then close the write end so
+            // remote sh sees EOF and exits after executing the whole script.
+            if let data = script.data(using: .utf8) {
+                try? stdinPipe.fileHandleForWriting.write(contentsOf: data)
+            }
+            try? stdinPipe.fileHandleForWriting.close()
+
             let deadline = Date().addingTimeInterval(30)
             while proc.isRunning && Date() < deadline {
                 try? await Task.sleep(nanoseconds: 100_000_000)
diff --git a/scarf/scarf/Features/Servers/Views/RemoteDiagnostics.swift b/scarf/scarf/Features/Servers/Views/RemoteDiagnostics.swift
index e0cb005..2aa709c 100644
--- a/scarf/scarf/Features/Servers/Views/RemoteDiagnostics.swift
+++ b/scarf/scarf/Features/Servers/Views/RemoteDiagnostics.swift
@@ -129,11 +129,11 @@ struct RemoteDiagnosticsView: View {
 
     private var footer: some View {
         VStack(alignment: .leading, spacing: 8) {
-            // Raw-output disclosure. Shown when EVERY probe fails, since
-            // that's the signature of a script-quoting / transport-level
-            // issue rather than a real remote problem. Hidden in the normal
-            // case so it doesn't distract.
-            if !viewModel.probes.isEmpty, viewModel.passingCount == 0 {
+            // Raw-output disclosure. Shown whenever anything fails — we need
+            // this visible for partial failures too since the raw stdout is
+            // the only way to see WHY a check returned its detail. Hidden
+            // only when 14/14 pass (script worked, nothing to debug).
+            if !viewModel.probes.isEmpty, !viewModel.allPassed {
                 DisclosureGroup("Raw remote output (for debugging)") {
                     VStack(alignment: .leading, spacing: 6) {
                         Text("exit code: \(viewModel.rawExitCode)")

From 8e3dafe4c6564ce6025e1d9af95ceee524d83570 Mon Sep 17 00:00:00 2001
From: Alan Wizemann <alan@wizemann.com>
Date: Mon, 20 Apr 2026 14:14:12 -0700
Subject: [PATCH 5/8] fix: remove the pill's own capsule background

The toolbar item already draws its own bezel for the principal-placement
slot; painting a `Color.secondary.opacity(0.08)` capsule on top gave the
pill a doubly-framed look. Drop the pill's background + the padding that
was only there to fit inside the capsule. The dot + label now sit
directly on the toolbar's native surface.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 .../scarf/Features/Servers/Views/ConnectionStatusPill.swift | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/scarf/scarf/Features/Servers/Views/ConnectionStatusPill.swift b/scarf/scarf/Features/Servers/Views/ConnectionStatusPill.swift
index 1e3e468..a7ede8d 100644
--- a/scarf/scarf/Features/Servers/Views/ConnectionStatusPill.swift
+++ b/scarf/scarf/Features/Servers/Views/ConnectionStatusPill.swift
@@ -23,6 +23,9 @@ struct ConnectionStatusPill: View {
                 status.retry()
             }
         } label: {
+            // No explicit background — the SwiftUI toolbar gives the item
+            // its own bezel, and painting a second capsule on top looked
+            // doubly-framed. Just the colored dot + label reads cleanly.
             HStack(spacing: 4) {
                 Circle()
                     .fill(color)
@@ -32,9 +35,6 @@ struct ConnectionStatusPill: View {
                     .foregroundStyle(.secondary)
                     .lineLimit(1)
             }
-            .padding(.horizontal, 6)
-            .padding(.vertical, 2)
-            .background(Color.secondary.opacity(0.08), in: Capsule())
         }
         .buttonStyle(.plain)
         .help(tooltip)

From ee1d705abce25dcef1f44fdb0c548646970df610 Mon Sep 17 00:00:00 2001
From: Alan Wizemann <alan@wizemann.com>
Date: Mon, 20 Apr 2026 14:16:54 -0700
Subject: [PATCH 6/8] fix: move connection pill off .principal to drop the
 emphasis bezel
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

macOS applies a centered emphasis bezel (light capsule + drop shadow)
to ToolbarItem(placement: .principal) — visible in screenshots as a
doubly-framed "capsule behind the pill" look. The pill itself doesn't
own that background; the toolbar placement does.

.primaryAction (right side of the toolbar) has no decorative
background, so the pill renders as just the colored dot + label text
directly on the toolbar surface. Fits the intended minimal look.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 scarf/scarf/ContentView.swift | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/scarf/scarf/ContentView.swift b/scarf/scarf/ContentView.swift
index 8a44b76..7571002 100644
--- a/scarf/scarf/ContentView.swift
+++ b/scarf/scarf/ContentView.swift
@@ -21,7 +21,12 @@ struct ContentView: View {
                         ServerSwitcherToolbar()
                     }
                     if serverContext.isRemote {
-                        ToolbarItem(placement: .principal) {
+                        // `.principal` placement renders the item inside a
+                        // centered emphasis bezel on macOS, which reads as
+                        // an unwanted capsule-with-shadow around the pill.
+                        // `.primaryAction` (right side of the toolbar) has
+                        // no decorative background — what we want.
+                        ToolbarItem(placement: .primaryAction) {
                             ConnectionStatusPill(status: connectionStatus)
                         }
                     }

From d3055702ef1b5e238ffa84a58833ec6f8d85f2e0 Mon Sep 17 00:00:00 2001
From: Alan Wizemann <alan@wizemann.com>
Date: Mon, 20 Apr 2026 14:22:35 -0700
Subject: [PATCH 7/8] =?UTF-8?q?fix:=20connection=20pill=20=E2=80=94=20reve?=
 =?UTF-8?q?rt=20to=20.principal,=20swap=20dot=20for=20state=20SF=20Symbol?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Rolling back the .primaryAction placement (the pill shifted right and
lost its centered position in the toolbar). The "funny background with
shadow" visible in the toolbar is macOS's own .principal emphasis bezel
— not something Scarf draws, and not something we can cleanly hide
without disabling the toolbar surface itself. The native bezel is the
pill's frame; we just have to make the pill's interior read well inside
it.

Two changes to make the pill itself look like a toolbar tool inside
that bezel:

- Drop the colored dot, replace with a state-specific SF Symbol. The
  icon's shape signals clickability (looks like a tool button), and its
  color signals state (green/orange/yellow/red hierarchical). Less
  "status chip", more "toolbar button with status".
- Icons per state:
  - connected  → checkmark.circle.fill (click to re-probe)
  - degraded   → stethoscope (click to run diagnostics, matches the
                 stethoscope on the Manage Servers row)
  - idle       → arrow.triangle.2.circlepath (checking/retry)
  - error      → exclamationmark.triangle.fill (click for stderr)

Horizontal padding = 4 so the icon-and-label sit balanced inside the
bezel rather than pushed up against its edges.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 scarf/scarf/ContentView.swift                 | 11 ++++---
 .../Servers/Views/ConnectionStatusPill.swift  | 29 ++++++++++++++-----
 2 files changed, 27 insertions(+), 13 deletions(-)

diff --git a/scarf/scarf/ContentView.swift b/scarf/scarf/ContentView.swift
index 7571002..9e60787 100644
--- a/scarf/scarf/ContentView.swift
+++ b/scarf/scarf/ContentView.swift
@@ -21,12 +21,11 @@ struct ContentView: View {
                         ServerSwitcherToolbar()
                     }
                     if serverContext.isRemote {
-                        // `.principal` placement renders the item inside a
-                        // centered emphasis bezel on macOS, which reads as
-                        // an unwanted capsule-with-shadow around the pill.
-                        // `.primaryAction` (right side of the toolbar) has
-                        // no decorative background — what we want.
-                        ToolbarItem(placement: .primaryAction) {
+                        // `.principal` centers the pill in the toolbar —
+                        // the native emphasis bezel is the intended frame;
+                        // the pill's own visual content (icon + label, no
+                        // background) sits inside it in balance.
+                        ToolbarItem(placement: .principal) {
                             ConnectionStatusPill(status: connectionStatus)
                         }
                     }
diff --git a/scarf/scarf/Features/Servers/Views/ConnectionStatusPill.swift b/scarf/scarf/Features/Servers/Views/ConnectionStatusPill.swift
index a7ede8d..6961de3 100644
--- a/scarf/scarf/Features/Servers/Views/ConnectionStatusPill.swift
+++ b/scarf/scarf/Features/Servers/Views/ConnectionStatusPill.swift
@@ -23,18 +23,20 @@ struct ConnectionStatusPill: View {
                 status.retry()
             }
         } label: {
-            // No explicit background — the SwiftUI toolbar gives the item
-            // its own bezel, and painting a second capsule on top looked
-            // doubly-framed. Just the colored dot + label reads cleanly.
-            HStack(spacing: 4) {
-                Circle()
-                    .fill(color)
-                    .frame(width: 8, height: 8)
+            // Leading SF Symbol does double duty: its color is the status
+            // signal (green/orange/yellow/red), and its shape reads as a
+            // clickable toolbar tool. No custom background — the toolbar's
+            // `.principal` emphasis bezel is the frame.
+            HStack(spacing: 5) {
+                Image(systemName: iconName)
+                    .foregroundStyle(color)
+                    .symbolRenderingMode(.hierarchical)
                 Text(label)
                     .font(.caption)
                     .foregroundStyle(.secondary)
                     .lineLimit(1)
             }
+            .padding(.horizontal, 4)
         }
         .buttonStyle(.plain)
         .help(tooltip)
@@ -55,6 +57,19 @@ struct ConnectionStatusPill: View {
         }
     }
 
+    /// State-specific SF Symbol. The icon shape itself signals what the
+    /// click will do: checkmark for connected (click to re-probe),
+    /// stethoscope for degraded (click to run diagnostics), spinning
+    /// arrows for probing, triangle for error.
+    private var iconName: String {
+        switch status.status {
+        case .connected: return "checkmark.circle.fill"
+        case .degraded: return "stethoscope"
+        case .idle: return "arrow.triangle.2.circlepath"
+        case .error: return "exclamationmark.triangle.fill"
+        }
+    }
+
     private var label: String {
         switch status.status {
         case .connected: return "Connected"

From bd21a539e647940ccef2e77b71f86450216bd3ce Mon Sep 17 00:00:00 2001
From: Alan Wizemann <alan@wizemann.com>
Date: Mon, 20 Apr 2026 14:26:33 -0700
Subject: [PATCH 8/8] docs: update v2.0.1 release notes for diagnostics fixes +
 pill UX

Reflect the three post-initial-commit fixes:
- log-noise suppression (skill.yaml / optional-file 'No such file'
  warnings no longer spam Console via the new Result-returning readers)
- diagnostics script now stdin-pipes to sh -s instead of sh -c <script>
  argv, so it runs as one sh process with variable scope preserved
- pill UX: replaced colored dot with state-specific SF Symbol
  (checkmark / stethoscope / arrows / triangle), removed custom
  background, kept .principal placement for centering

Also expanded the 'Known follow-ups' section so users know what's
explicitly deferred post-2.0.1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 releases/v2.0.1/RELEASE_NOTES.md | 38 +++++++++++++++++++++++++-------
 1 file changed, 30 insertions(+), 8 deletions(-)

diff --git a/releases/v2.0.1/RELEASE_NOTES.md b/releases/v2.0.1/RELEASE_NOTES.md
index 6de355d..d927195 100644
--- a/releases/v2.0.1/RELEASE_NOTES.md
+++ b/releases/v2.0.1/RELEASE_NOTES.md
@@ -2,15 +2,18 @@
 
 Hotfix for [#19](https://github.com/awizemann/scarf/issues/19) and the related reports from the first day of v2.0: users' remote SSH connections would show a green "Connected" pill but every view (Dashboard, Sessions, Activity, Chat) read as empty / "not running" / "not configured". Three distinct environments reported it — Docker Hermes on a LAN, homelab VM over Tailscale, Ubuntu VPS — and every one was a silent file-access failure on the remote that Scarf wasn't surfacing.
 
-### What changed
+### Errors no longer disappear
 
-**Errors no longer disappear.** Every remote read (`config.yaml`, `gateway_state.json`, `state.db`, `pgrep`) used to silently substitute an empty value on *any* failure — permission denied, missing file, `sqlite3` not installed, connection drop — they all looked identical to the UI. Now:
+Every remote read (`config.yaml`, `gateway_state.json`, `state.db`, `pgrep`) used to silently substitute an empty value on *any* failure — permission denied, missing file, `sqlite3` not installed, connection drop — they all looked identical to the UI. Now:
 
 - Each failure logs a specific warning via `os.Logger` (visible in Console.app under subsystem `com.scarf`).
 - The Dashboard shows an orange banner above the stats with the exact error (e.g. "Permission denied reading `~/.hermes/state.db`") and a **Run Diagnostics…** button.
 - `HermesDataService` exposes a `lastOpenError` so views can explain *why* state.db couldn't be opened, rather than just rendering zeros.
+- Routine "file doesn't exist" cases (optional `skill.yaml` metadata, `gateway_state.json` before Hermes starts, `memories/USER.md` on fresh installs) are detected and **not** logged as warnings — only real errors (permission denied, connection drops, `sqlite3` missing) hit the log. Prevents Console from filling with false-positive noise when directory walks encounter optional files.
 
-**New Remote Diagnostics sheet.** Accessible from **Manage Servers → 🩺** or directly by clicking the toolbar pill when it's yellow. Runs fourteen checks in a single SSH session and shows pass/fail for each, plus a targeted hint per failure:
+### New Remote Diagnostics sheet
+
+Accessible from **Manage Servers → 🩺** per-server button, or by clicking the orange connection pill when Scarf can see the server but can't read Hermes state. Runs fourteen checks in a single SSH session and shows pass/fail for each, plus a targeted hint per failure:
 
 - SSH connectivity and auth
 - Remote user identity and `$HOME` resolution
@@ -23,15 +26,27 @@ Hotfix for [#19](https://github.com/awizemann/scarf/issues/19) and the related r
 - `hermes` binary on the login `$PATH` (what the Test Connection probe uses)
 - `pgrep` available (for the "is Hermes running" check)
 
-One **Copy Full Report** button dumps every check as plain text for bug reports.
+One **Copy Full Report** button dumps every check as plain text for bug reports, and a raw-output disclosure panel shows the exact stdout/stderr the remote returned whenever any probe fails — so transport-level problems are self-diagnosing.
 
-**Connection pill gains a yellow "degraded" state.** The pill used to be green as long as SSH connected; now after connectivity passes it runs a second-tier check (`test -r ~/.hermes/config.yaml`). If that fails, the pill turns **yellow** with "Connected — can't read Hermes state" and clicking it opens Remote Diagnostics directly. This is the exact symptom mode in issue #19, and it's now one click away from a specific answer.
+The diagnostics script is piped to `/bin/sh -s` on stdin rather than passed as `sh -c <script>` argv. The latter was getting split line-by-line by the remote's login shell (newlines parsed as command separators), which stranded variables set on line 1 in an ephemeral `sh` subprocess that exited before line 2 could use them. Stdin-piping runs the whole script in one `sh` process with variable scope preserved.
 
-**Auto-suggest the correct `remoteHome` during Add Server.** When Test Connection can't find `state.db` at the configured (or default) path, it now also probes the common alternate locations — `/var/lib/hermes/.hermes`, `/opt/hermes/.hermes`, `/home/hermes/.hermes`, `/root/.hermes` — and offers a one-click "Use this" fill if it finds one. Removes the need to know that systemd-installed Hermes lives at `/var/lib/hermes/.hermes` by convention.
+### Connection pill gains a "degraded" state
 
-**Clearer copy for the `remoteHome` field.** The Add Server sheet field is now labeled "Hermes data directory" with a description explaining when you'd override it (systemd service installs, Docker sidecars) and noting that Test Connection auto-suggests.
+The pill used to be green as long as SSH connected; now after connectivity passes it runs a second-tier check (`test -r $HOME/.hermes/config.yaml`). If that fails, the pill turns **orange** with "Connected — can't read Hermes state" and clicking it opens Remote Diagnostics directly. This is the exact symptom mode in #19, and it's now one click away from a specific answer.
 
-**README has a new "Remote setup requirements" section** with the four concrete prerequisites (SSH, `sqlite3`, `pgrep`, read access to `~/.hermes`) and a troubleshooting paragraph pointing at Remote Diagnostics.
+The pill's visual also got a pass: the colored dot is replaced with a state-specific SF Symbol (`checkmark.circle.fill` / `stethoscope` / `arrow.triangle.2.circlepath` / `exclamationmark.triangle.fill`), which reads more like a clickable toolbar tool and doubles as the status signal. No custom pill background anymore — the toolbar's native `.principal` bezel is the frame.
+
+### Auto-suggest the correct `remoteHome` during Add Server
+
+When Test Connection can't find `state.db` at the configured (or default) path, it now also probes the common alternate locations — `/var/lib/hermes/.hermes`, `/opt/hermes/.hermes`, `/home/hermes/.hermes`, `/root/.hermes` — and offers a one-click "Use this" fill if it finds one. Removes the need to know that systemd-installed Hermes lives at `/var/lib/hermes/.hermes` by convention.
+
+### Clearer copy for the `remoteHome` field
+
+The Add Server sheet field is now labeled "Hermes data directory" with a description explaining when you'd override it (systemd service installs, Docker sidecars) and noting that Test Connection auto-suggests.
+
+### README has a new "Remote setup requirements" section
+
+Four concrete prerequisites (SSH, `sqlite3`, `pgrep`, read access to `~/.hermes`) and a troubleshooting paragraph pointing at Remote Diagnostics.
 
 ### Migrating from 2.0.0
 
@@ -44,3 +59,10 @@ Sparkle will offer the update automatically. Settings and server list are preser
 - `HermesDataService.open()` records `lastOpenError` with humanized hints for "sqlite3 not installed", "permission denied", and "file not found" — the three failure modes that produce 90% of issue #19 symptoms.
 - `ConnectionStatusViewModel` status enum gains `.degraded(reason:)` between `.connected` and `.error`.
 - `TestConnectionProbe` result enum gains `suggestedRemoteHome: String?` carrying any alternate-location hit.
+
+### Known follow-ups (not in 2.0.1)
+
+- `TestConnectionProbe` uses a direct-argv ssh invocation that's functionally correct but fragile (works by accident when split across the login shell). Should be ported to the stdin-pipe pattern the diagnostics sheet now uses.
+- Remaining `try?`-swallowed read paths beyond the four Dashboard-surfacing ones — Cron, Memory, Skills, MCP Servers, Platforms still silently render empty on read errors. Same fix pattern applies, low priority.
+- `hermesBinaryHint` is only populated when the user clicks Test Connection; if they skip it, ACP chat and CLI calls fall back to bare `hermes` which requires it on the non-interactive PATH (rarely true for `~/.local/bin` installs). The connection-pill's second-tier probe could auto-populate this.
+- Docker-host support: when users SSH to a Docker host, `pgrep` and `~/.hermes/` on the host don't see what's inside the container. Needs a `docker exec` wrapping option per server.