fix(dashboard): max-wait safeguard for scheduleCoalescedTick + drop forward-looking version label

Two follow-ups from code review on this branch:

1. Add `maxWait` (1.5 s) safeguard to `HermesFileWatcher.scheduleCoalescedTick`
   so the trailing-debounce can't be starved indefinitely under sustained
   activity. Each scheduled fire now picks the earlier of (a) the
   `coalesceWindow` quiet floor and (b) `maxWait` since the FIRST fire of
   the current burst. A 10 Hz `state.db-wal` write storm coincident with
   a `gateway_state.json` Start/Stop touch now publishes within
   `maxWait` instead of waiting for the WAL activity to subside. The
   single-fire / quiet-burst case is unchanged because the quiet
   deadline is the earlier of the two in that case.

2. Drop the forward-looking "v2.8 dogfood bug report" reference from a
   comment in `DashboardViewModel.load()` per the
   `feedback_no_version_bumps.md` rule (release notes own version
   labels, not in-code comments).
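
For reviewers, the deadline rule in item 1 can be sketched as a pure function. This is an illustrative sketch, not code from this commit: `coalescedDelay` and its parameter names are hypothetical, and the defaults only mirror the `coalesceWindow` (0.5 s) and `maxWait` (1.5 s) constants.

```swift
import Foundation

// Hypothetical helper (not part of the commit): returns how long to wait
// before publishing, given the time of the current fire and the time the
// burst started. Mirrors the "earlier of quiet window and max wait" rule.
func coalescedDelay(
    now: TimeInterval,
    burstStart: TimeInterval,
    coalesceWindow: TimeInterval = 0.5,
    maxWait: TimeInterval = 1.5
) -> TimeInterval {
    let quietDeadline = now + coalesceWindow      // pushed out by every fire
    let maxWaitDeadline = burstStart + maxWait    // fixed for the whole burst
    // Clamp at zero so an overdue burst publishes immediately.
    return max(0, min(quietDeadline, maxWaitDeadline) - now)
}

// Single fire: the quiet window is the earlier deadline.
print(coalescedDelay(now: 0, burstStart: 0))      // 0.5

// A fire 1.25 s into a sustained burst: max wait is now the earlier deadline.
print(coalescedDelay(now: 1.25, burstStart: 0))   // 0.25

// A fire after the max-wait deadline has already passed: publish right away.
print(coalescedDelay(now: 2.0, burstStart: 0))    // 0.0
```

Keeping the rule in a pure function like this would make the starvation case unit-testable without a real `DispatchQueue`, though the commit itself computes the delay inline.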

Tests: full ScarfCore suite green (450/450), Mac scheme builds clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Alan Wizemann
2026-05-09 20:41:59 +02:00
parent e7096bb44c
commit ce028b065f
2 changed files with 52 additions and 16 deletions
@@ -23,12 +23,26 @@ final class HermesFileWatcher {
     /// loads on v0.13 hosts and tripped sqlite contention on the read-only
     /// state.db handle. We coalesce to at most one tick per
     /// `coalesceWindow` so a burst of FSEvents collapses into one observable
-    /// state mutation. 500 ms picks the smallest window that still feels
-    /// responsive on a single keystroke `touch dashboard.json` while
-    /// surviving v0.13's WAL-write storm.
+    /// state mutation.
+    ///
+    /// **Two limits, not one.** A pure trailing-debounce would starve under
+    /// sustained WAL writes: the timer would keep getting cancelled and
+    /// rescheduled, and a coincident `gateway_state.json` Start/Stop touch
+    /// would never propagate until WAL activity quieted down. So we publish
+    /// when EITHER (a) `coalesceWindow` of quiet has elapsed since the last
+    /// fire, OR (b) `maxWait` has elapsed since the first fire of the
+    /// current burst, whichever comes first. The max-wait guarantees a
+    /// floor of one observable mutation per `maxWait` even during sustained
+    /// activity. Numbers picked to keep the dashboard responsive on a
+    /// single `touch` while surviving v0.13's WAL-write storm.
     private var pendingCoalesceTimer: DispatchWorkItem?
     private var pendingTickDate: Date?
+    /// Wall-clock when the current burst began. Set on the first
+    /// `scheduleCoalescedTick` fire after a quiet window; cleared whenever
+    /// the timer fires. Drives the `maxWait` floor below.
+    private var burstStartDate: Date?
     private static let coalesceWindow: TimeInterval = 0.5
+    private static let maxWait: TimeInterval = 1.5

     let context: ServerContext
     private let transport: any ServerTransport
@@ -114,23 +128,44 @@ final class HermesFileWatcher {
     }

     /// Coalesce a burst of FSEvents (or remote-poll deltas) into a single
-    /// `lastChangeDate` mutation after `coalesceWindow` seconds of quiet.
-    /// Each new fire records the latest event date and pushes the timer
-    /// out, so a 100-ms-spaced burst of 50 fires collapses to one observable
-    /// state mutation `coalesceWindow` ms after the LAST fire, same shape
-    /// as a debounce. Runs on `.main` (the FSEvents queue) so observers
-    /// see the publish on MainActor without a hop.
+    /// `lastChangeDate` mutation. Two limits decide when the publish fires,
+    /// whichever comes first:
+    ///
+    /// 1. **Quiet window**: `coalesceWindow` seconds have elapsed since the
+    /// last fire. Each new fire pushes this out (pure debounce shape).
+    /// 2. **Max wait**: `maxWait` seconds have elapsed since the FIRST fire
+    /// of the current burst. This bounds the worst-case latency under
+    /// sustained activity (v0.13's ~10 Hz WAL-write storm) so a
+    /// coincident `gateway_state.json` Start/Stop touch can't be starved
+    /// indefinitely behind a continuously-rescheduling debounce timer.
+    ///
+    /// Runs on `.main` (the FSEvents queue and the remote-poll
+    /// MainActor.run) so observers see the publish on MainActor without a
+    /// hop. The work item self-clears `burstStartDate` when it fires so the
+    /// next burst starts a fresh max-wait window.
     private func scheduleCoalescedTick() {
         let now = Date()
         pendingTickDate = now
+        if burstStartDate == nil {
+            burstStartDate = now
+        }
         pendingCoalesceTimer?.cancel()
+        // Pick the deadline as the earlier of (a) `coalesceWindow` from now,
+        // and (b) `maxWait` from the burst start. The latter only matters
+        // when fires keep arriving faster than `coalesceWindow`; in the
+        // single-fire / quiet-burst case the quiet deadline is the earlier one.
+        let quietDeadline = now.addingTimeInterval(Self.coalesceWindow)
+        let maxWaitDeadline = (burstStartDate ?? now).addingTimeInterval(Self.maxWait)
+        let firingDate = min(quietDeadline, maxWaitDeadline)
+        let delay = max(0, firingDate.timeIntervalSince(now))
         let work = DispatchWorkItem { [weak self] in
             guard let self, let date = self.pendingTickDate else { return }
             self.pendingTickDate = nil
+            self.burstStartDate = nil
             self.lastChangeDate = date
         }
         pendingCoalesceTimer = work
-        DispatchQueue.main.asyncAfter(deadline: .now() + Self.coalesceWindow, execute: work)
+        DispatchQueue.main.asyncAfter(deadline: .now() + delay, execute: work)
     }

     func stopWatching() {
@@ -146,6 +181,7 @@ final class HermesFileWatcher {
         pendingCoalesceTimer?.cancel()
         pendingCoalesceTimer = nil
         pendingTickDate = nil
+        burstStartDate = nil
     }

     /// Watch each project's `dashboard.json` AND its enclosing `.scarf/`
@@ -56,12 +56,12 @@ final class DashboardViewModel {
     func load() async {
         // Coalesce overlapping triggers: the `.task` first-appear and the
         // `.onChange(fileWatcher.lastChangeDate)` observer can both fire
-        // a load in the same tick. Without this guard a v0.13 host's
-        // WAL-write storm walked over the previous load mid-snapshot
-        // (see HermesFileWatcher.scheduleCoalescedTick + the v2.8 dogfood
-        // bug report). If a load is already running, await its
-        // completion and return; the caller already has a fresh snapshot
-        // by the time `await` returns.
+        // a load in the same tick. Without this guard a Hermes v0.13
+        // host's WAL-write storm walked over the previous load
+        // mid-snapshot (see `HermesFileWatcher.scheduleCoalescedTick`).
+        // If a load is already running, await its completion and return;
+        // the caller already has a fresh snapshot by the time `await`
+        // returns.
         if let existing = inFlightLoad {
             await existing.value
             return