fix: diagnostics script — pipe to sh via stdin, not sh -c argv

The previous fix (direct ssh argv, bypassing transport.runProcess) got
us from 0/14 to 7/14, but \$H was empty everywhere it was referenced —
the user's 7/14 report showed:
- probe 4 (hermesHomeConfigured): PASS with empty detail
- probe 5 (hermesDirExists FAIL): "not a directory:" (empty after colon)
- probe 11 (sqlite3CanOpenStateDB FAIL): 'unable to open "/state.db"'

Root cause: `ssh host -- /bin/sh -c <script>` doesn't travel as three
argv entries to the remote. ssh concatenates them with single spaces
into one command string and sends that to the remote's LOGIN shell.
The login shell then runs `$LOGIN_SHELL -c "$string"`, and bash's
parser treats unquoted newlines inside `$string` as command separators.
So the first newline splits the script: `/bin/sh -c H="..."` becomes
one command (which runs in an ephemeral sh subprocess that exits
immediately), and every subsequent line runs in the login shell with
no \$H set.

TestConnectionProbe happens to still work because its downstream lines
don't depend on an assignment from the first line — but the diagnostic
script's \$H is used everywhere, so the entire script is effectively
running with \$H="".

Fix: pipe the script into `/bin/sh -s` on stdin via ssh's own stdin
channel. `sh -s` reads a shell program from stdin and executes it in
one process, variable scope preserved. Implementation uses
Process.standardInput with a Pipe, writing the script after proc.run()
and closing the write end so sh sees EOF. Same as
`cat script.sh | ssh host -- /bin/sh -s` from the command line.

Also: raw-output disclosure panel in the diagnostics sheet now shows
whenever ANY probe fails, not only when all fail. Partial failures are
the most common failure mode and the raw stdout is the only way to see
why a specific detail came back the way it did.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Alan Wizemann
2026-04-20 14:04:32 -07:00
parent ec03627bcd
commit c51241dc72
2 changed files with 32 additions and 11 deletions
@@ -305,9 +305,19 @@ final class RemoteDiagnosticsViewModel {
}
}
/// Direct ssh invocation same style as TestConnectionProbe. Uses the
/// connection's ControlMaster socket so this is cheap (~5ms) after the
/// first connection is warm.
/// Direct ssh invocation. Pipes the script into `sh` on stdin rather
/// than passing it as `sh -c <script>` argv because ssh concatenates
/// argv with spaces and sends that as a single command string to the
/// remote's LOGIN shell, which then parses newlines as command
/// separators. A multi-line `sh -c <script>` would run only the first
/// line inside the `sh` subprocess (any variables set there die when
/// `sh` exits), and the rest would run in the login shell with no
/// access to those variables. Symptom: `$H=""` everywhere downstream.
///
/// Feeding the script via stdin avoids the split entirely `sh -s`
/// consumes the whole stream in one process, so variable scope is
/// preserved and the script runs exactly the same way it would from
/// a local `cat script.sh | sh`.
private static func runOverSSH(script: String, config: SSHConfig) async -> Captured {
var sshArgv: [String] = [
"-o", "ControlMaster=auto",
@@ -317,7 +327,8 @@ final class RemoteDiagnosticsViewModel {
"-o", "ConnectTimeout=10",
"-o", "StrictHostKeyChecking=accept-new",
"-o", "LogLevel=QUIET",
"-o", "BatchMode=yes"
"-o", "BatchMode=yes",
"-T" // no pty keep stdin/stdout a clean byte stream
]
if let port = config.port { sshArgv += ["-p", String(port)] }
if let id = config.identityFile, !id.isEmpty {
@@ -329,8 +340,7 @@ final class RemoteDiagnosticsViewModel {
sshArgv.append(hostSpec)
sshArgv.append("--")
sshArgv.append("/bin/sh")
sshArgv.append("-c")
sshArgv.append(script)
sshArgv.append("-s") // read script from stdin
return await Task.detached { () -> Captured in
let proc = Process()
@@ -348,15 +358,26 @@ final class RemoteDiagnosticsViewModel {
}
proc.environment = env
let stdinPipe = Pipe()
let stdoutPipe = Pipe()
let stderrPipe = Pipe()
proc.standardInput = stdinPipe
proc.standardOutput = stdoutPipe
proc.standardError = stderrPipe
do {
try proc.run()
} catch {
return .connectFailure("Failed to launch ssh: \(error.localizedDescription)")
}
// Write the script to ssh's stdin, then close the write end so
// remote sh sees EOF and exits after executing the whole script.
if let data = script.data(using: .utf8) {
try? stdinPipe.fileHandleForWriting.write(contentsOf: data)
}
try? stdinPipe.fileHandleForWriting.close()
let deadline = Date().addingTimeInterval(30)
while proc.isRunning && Date() < deadline {
try? await Task.sleep(nanoseconds: 100_000_000)
@@ -129,11 +129,11 @@ struct RemoteDiagnosticsView: View {
private var footer: some View {
VStack(alignment: .leading, spacing: 8) {
// Raw-output disclosure. Shown when EVERY probe fails, since
// that's the signature of a script-quoting / transport-level
// issue rather than a real remote problem. Hidden in the normal
// case so it doesn't distract.
if !viewModel.probes.isEmpty, viewModel.passingCount == 0 {
// Raw-output disclosure. Shown whenever anything fails we need
// this visible for partial failures too since the raw stdout is
// the only way to see WHY a check returned its detail. Hidden
// only when 14/14 pass (script worked, nothing to debug).
if !viewModel.probes.isEmpty, !viewModel.allPassed {
DisclosureGroup("Raw remote output (for debugging)") {
VStack(alignment: .leading, spacing: 6) {
Text("exit code: \(viewModel.rawExitCode)")