fix(ios-snapshot): harden Citadel state.db snapshot path (#56)

Reported on iOS: dashboard shows "Connection issue / Citadel.SSH
Client.CommandFailed error 1", memory files (USER.md, SOUL.md) load
fine but Sessions / Activity / Tool Calls all show 0. The snapshot
operation that pulls ~/.hermes/state.db over SFTP via `sqlite3
.backup` was failing on the remote, but the iOS user got zero
actionable context.

Two latent bugs in CitadelServerTransport.asyncSnapshotSQLite —
both fixed in v2.5.0 for asyncRunProcess but missed on this path:

1. `executeCommand` throws CommandFailed on non-zero exit AND
   discards the captured stderr buffer. So when sqlite3 is missing
   (slim Docker images, statically-linked installs) or state.db
   doesn't exist, the user only saw "error 1" and a generic
   connection-issue banner with no remediation.

2. No `PATH=...` prefix. asyncRunProcess inline-prepends
   `PATH="$HOME/.local/bin:/opt/homebrew/bin:/usr/local/bin:$PATH"`
   so bare command resolution works on Citadel's stripped-PATH
   exec channel; the snapshot path didn't, so any sqlite3 install
   outside /usr/bin failed at exit 127 ("command not found").

Mirror the asyncRunProcess hardening on the snapshot path:

- Prepend the same PATH prefix so sqlite3 resolves on hosts where
  it lives at /usr/local/bin or /opt/homebrew/bin.
- Drive `executeCommandStream` instead of `executeCommand`.
  Capture stdout + stderr regardless of exit code.
- On non-zero exit, throw an NSError carrying the real stderr (or
  stdout if stderr is empty — sqlite3 sometimes errors via stdout
  depending on the remote shell). HermesDataService.humanize
  already keys off "sqlite3: command not found" /
  "permission denied" / "no such file" substrings, so once the
  real message reaches it the dashboard banner becomes actionable
  ("sqlite3 is not installed on <host>. Install with apt install
  sqlite3..." instead of the generic CommandFailed error).
- When the stream itself fails to start (network/auth-level), throw
  with a "Failed to start snapshot stream" message so the connect-
  level error path is distinguishable from the remote-exec failure.

iOS-only — Mac path was already correct.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Alan Wizemann
2026-04-29 12:41:26 +02:00
parent 067aeda878
commit 424711c3d9
@@ -407,8 +407,76 @@ public final class CitadelServerTransport: ServerTransport, @unchecked Sendable
let remoteTmp = "/tmp/scarf-snapshot-\(UUID().uuidString).db" let remoteTmp = "/tmp/scarf-snapshot-\(UUID().uuidString).db"
// Double-quote paths; $HOME expansion happens inside double quotes. // Double-quote paths; $HOME expansion happens inside double quotes.
let rewritten = Self.rewriteHomeRelative(remotePath) let rewritten = Self.rewriteHomeRelative(remotePath)
let backupScript = #"sqlite3 "\#(rewritten)" ".backup '\#(remoteTmp)'" && sqlite3 '\#(remoteTmp)' "PRAGMA journal_mode=DELETE;" > /dev/null"#
_ = try await client.executeCommand(backupScript + " 2>&1") // Prepend the same PATH prefix `asyncRunProcess` uses so `sqlite3`
// resolves on hosts where it lives in /usr/local/bin or
// /opt/homebrew/bin (issue #56). Citadel's bare exec channel
// inherits a stripped PATH (typically `/usr/bin:/bin` on Linux);
// without this, statically-linked or custom-prefix sqlite3
// installs fail "command not found" at exit 127.
let backupScript =
#"PATH="$HOME/.local/bin:/opt/homebrew/bin:/usr/local/bin:$PATH" "#
+ #"sqlite3 "\#(rewritten)" ".backup '\#(remoteTmp)'" && sqlite3 '\#(remoteTmp)' "PRAGMA journal_mode=DELETE;" > /dev/null"#
// Drive `executeCommandStream` instead of `executeCommand` so we
// capture stderr regardless of exit code (issue #56). Pre-fix
// a non-zero exit threw `CommandFailed` and discarded the buffer
// surfaced as the unhelpful "Citadel.SSHClient.CommandFailed
// error 1" banner. Now we propagate the real stderr so
// `HermesDataService.humanize` can translate "sqlite3: command
// not found" / "no such file" / "permission denied" into the
// dashboard banner with actionable copy.
let stream: AsyncThrowingStream<ExecCommandOutput, Error>
do {
stream = try await client.executeCommandStream(backupScript)
} catch {
throw NSError(
domain: "CitadelServerTransport",
code: -1,
userInfo: [NSLocalizedDescriptionKey: "Failed to start snapshot stream: \(error.localizedDescription)"]
)
}
var stdout = Data()
var stderr = Data()
var exitCode: Int32 = 0
do {
for try await chunk in stream {
switch chunk {
case .stdout(var buf):
if let s = buf.readString(length: buf.readableBytes) {
stdout.append(Data(s.utf8))
}
case .stderr(var buf):
if let s = buf.readString(length: buf.readableBytes) {
stderr.append(Data(s.utf8))
}
}
}
} catch let failed as SSHClient.CommandFailed {
exitCode = Int32(failed.exitCode)
} catch {
stderr.append(Data(error.localizedDescription.utf8))
exitCode = -1
}
if exitCode != 0 {
// Combine stdout + stderr into the error message sqlite3
// sometimes prints "Error: ..." on stdout depending on the
// remote shell. HermesDataService.humanize keys off
// substrings like "sqlite3: command not found",
// "permission denied", "no such file", so as long as one of
// them ends up in the message we get a useful banner.
let messageBytes = stderr.isEmpty ? stdout : stderr
let message = String(data: messageBytes, encoding: .utf8)?.trimmingCharacters(in: .whitespacesAndNewlines) ?? ""
throw NSError(
domain: "CitadelServerTransport",
code: Int(exitCode),
userInfo: [
NSLocalizedDescriptionKey: message.isEmpty
? "Snapshot exited \(exitCode) with no output (likely sqlite3 missing on remote)"
: message
]
)
}
// SFTP-download the remote tmp into our local snapshot cache. // SFTP-download the remote tmp into our local snapshot cache.
let sftp = try await connectionHolder.sftp() let sftp = try await connectionHolder.sftp()