ADR 018: Event-Driven Gardener Triggering via Remote-Trigger Runs
Author: Joe McGinley
Status: Deprecated (gardener runs as claude.ai routines)
Created: 2026-06-14
Problem
The knowledge gardener runs as a claude.ai scheduled routine: a cron-fired Claude Code session that, on each tick, polls the monolith over MCP for work and gardens whatever it finds. With the knowledge graph now authoritative in Postgres and edits indexed synchronously on write (ADR platform/006, Phases 1-3), this triggering model has become the wrong shape for two reasons.
First, it is double indirection. A fixed-interval cron decides when to look, and a passive claude_agent.routine_jobs Postgres queue (register, then claim, then complete; SELECT FOR UPDATE SKIP LOCKED with a TTL lock) decides what to do. The Claude session is a generic claimant stitching the two together. For the gardener, that is a lot of machinery to answer a question the monolith can already answer directly: "have any notes changed since the last gardening pass?"
Second, it is poll-timed, not event-timed. The cron wakes on a schedule whether or not work exists, so it pays for empty wake-ups, and gardening latency is bounded by the poll interval (hours) rather than by when an edit actually lands. Since note edits already index synchronously into knowledge.notes, the monolith holds a clean, precise event source that the poll-based design throws away.
The claude.ai remote-trigger API exposes POST /v1/code/triggers/{id}/run, an on-demand start that is not bound to the trigger's cron schedule. That primitive lets the monolith push a gardening session the moment there is real work, which is what this ADR adopts.
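As a minimal sketch of what the monolith-side call could look like, the helper below only builds the request and does not send it. The path comes from this ADR; the base URL, the bearer-token header, the empty JSON body, and the name build_run_request are assumptions for illustration, since the auth path is still an open question.

```python
import urllib.parse
import urllib.request

# Assumption: placeholder host. The real API base URL is not stated in this ADR.
API_BASE = "https://api.example.invalid"


def build_run_request(trigger_id: str, token: str, api_base: str = API_BASE) -> urllib.request.Request:
    """Build (but do not send) an on-demand run request for one trigger."""
    safe_id = urllib.parse.quote(trigger_id, safe="")
    url = f"{api_base}/v1/code/triggers/{safe_id}/run"
    return urllib.request.Request(
        url,
        data=b"{}",  # assumption: an empty JSON body is accepted
        method="POST",
        headers={
            # Assumption: bearer auth. The token scope and lifetime are unverified.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    req = build_run_request("gardener", "example-token")
    print(req.get_method(), req.full_url)
```

Keeping request construction separate from sending makes the call easy to unit-test without network access or a real credential.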
Decision
Garden on events, not on a timer. The monolith owns a single claude.ai trigger ("gardener") carrying the gardener prompt and the homelab MCP connector. When note edits land, the monolith coalesces them and calls the trigger's run endpoint to start a Claude Code session that does the same gardening work it does today. The routine_jobs queue is removed from the gardener's path.
Triggering stays in the private monolith tier: the same process that already holds cluster secrets, owns monolith-pg, and runs the synchronous note indexer is the one that detects dirtiness and calls run. The session reaches private knowledge data only back through the authenticated MCP gateway (mcp.jomcgi.dev, Cloudflare Access OIDC, ADR agents/006), so the data boundary is unchanged: only the clock moves from claude.ai's cron into the monolith's edit stream.
This decision is about triggering only. It does not change what the gardener does once running (the tool-based knowledge CLI over Postgres, ADR platform/006 Phase 4) or which model it uses. It evolves the trigger mechanism that ADR agents/004 and the routine-job pattern established.
| Aspect | Today (cron + queue) | Decided (event-driven run) |
|---|---|---|
| Clock | claude.ai cron, fixed interval | Monolith note-edit stream (debounced) |
| Work discovery | routine_jobs pull: list, claim, complete | None: the triggering edit is the work signal |
| Latency to garden | Up to one poll interval (hours) | Seconds after a burst settles |
| Empty wake-ups | Every tick with no work still starts a session | No session unless an edit happened |
| Session start | claude.ai cron starts it | Monolith POST /v1/code/triggers/{id}/run |
| Concurrency guard | Per-job TTL lock in routine_jobs | Single "gardener running" TTL lock (same primitive) |
Architecture
graph LR
E[Note edit] --> I[indexing.py writes knowledge.notes]
I --> D{garden dirty?\ndebounce window}
D -- burst settled --> L[acquire 'gardener running' lock]
L -- acquired --> R[POST /v1/code/triggers/gardener/run]
L -- already held --> X[skip: session in flight]
R --> S[Claude Code session: gardener prompt]
S --> M[MCP gateway mcp.jomcgi.dev]
M --> K[knowledge CLI over monolith-pg]
S --> C[release lock on completion]

Two controls keep this from melting the inference budget, both reusing patterns the codebase already has:
- Coalescing. A garden_dirty_since marker plus a debounce window (idle timeout or minimum interval) collapses a burst of edits into one session, rather than one session per edit.
- Overlap guard. A single TTL lock ("gardener running"), the same primitive that routine_jobs uses per row, prevents starting a second session while one is in flight. A crashed session's lock expires on its TTL and the next dirty edit can start a fresh pass.
A slow fallback cron on the trigger (for example daily) remains available as a backstop so a missed run call never strands pending gardening indefinitely.
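The coalescing and overlap-guard controls described in this section can be sketched in-process as follows. This is an illustration under stated assumptions, not the monolith's implementation: the class names TtlLock and GardenCoalescer, the tick-based polling, and the in-memory state are all hypothetical, and the real lock would presumably live in Postgres alongside routine_jobs.

```python
import time
from typing import Callable, Optional


class TtlLock:
    """Single-holder lock that expires on its own, so a crashed holder cannot block forever."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._expires_at: Optional[float] = None

    def try_acquire(self) -> bool:
        now = self._clock()
        if self._expires_at is not None and now < self._expires_at:
            return False  # a session is presumed to be in flight
        self._expires_at = now + self._ttl
        return True

    def release(self) -> None:
        self._expires_at = None


class GardenCoalescer:
    """Collapses a burst of edits into at most one session start (idle-timeout debounce)."""

    def __init__(self, idle_seconds: float, lock: TtlLock,
                 start_session: Callable[[], None],
                 clock: Callable[[], float] = time.monotonic):
        self._idle = idle_seconds
        self._lock = lock
        self._start_session = start_session
        self._clock = clock
        self._dirty_since: Optional[float] = None  # stands in for garden_dirty_since
        self._last_edit: Optional[float] = None

    def note_edited(self) -> None:
        now = self._clock()
        if self._dirty_since is None:
            self._dirty_since = now
        self._last_edit = now

    def tick(self) -> bool:
        """Call periodically. Returns True if a session was started."""
        if self._dirty_since is None or self._last_edit is None:
            return False  # nothing to garden
        if self._clock() - self._last_edit < self._idle:
            return False  # burst has not settled yet
        if not self._lock.try_acquire():
            return False  # skip: session in flight, stay dirty for a later tick
        try:
            self._start_session()
        except Exception:
            self._lock.release()  # failed start: stay dirty so a later tick retries
            raise
        self._dirty_since = None
        return True
```

Both classes take an injectable clock so the debounce window and the TTL expiry can be exercised in tests without sleeping. Edits that land while a session is in flight simply mark the garden dirty again and are picked up once the lock is released or expires.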
Alternatives Considered
- Keep cron, drop the queue (Option A). Have the cron-fired gardener query knowledge.notes directly for "changed since last run" and garden inline, deleting the routine_jobs dance but keeping poll timing. Simpler (no new credential) but still pays empty wake-ups and hours of latency; kept as the fallback if the credential path below proves impractical.
- Generalize to a queue-replacing dispatcher (Option C). Build a generic dirty-to-webhook-to-session dispatcher for every routine kind and retire routine_jobs wholesale. Premature: the gardener is the only case with a clean event source today, so prove the pattern on it before generalizing.
- NATS / event-stream fan-out (ADR agents/016, 017). Route note-change events through the canonical event bus to a consumer that triggers the session. The bus is broader infrastructure than one gardener needs, and the Temporal/lakehouse stack that motivated much of it was decommissioned 2026-06-14; a direct in-process call from the indexer is the simplest thing that works.
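For comparison, the "changed since last run" check that Option A relies on could look roughly like the sketch below. It uses an in-memory SQLite table as a stand-in for knowledge.notes in Postgres; the columns id and updated_at and the function notes_changed_since are hypothetical, since this ADR does not describe the table's schema.

```python
import sqlite3
from typing import List


def notes_changed_since(conn: sqlite3.Connection, last_run_iso: str) -> List[str]:
    """Return ids of notes edited after the last gardening pass, oldest edit first."""
    rows = conn.execute(
        # Assumption: updated_at holds UTC ISO-8601 strings, which sort chronologically.
        "SELECT id FROM notes WHERE updated_at > ? ORDER BY updated_at",
        (last_run_iso,),
    ).fetchall()
    return [row[0] for row in rows]


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny stand-in table for illustration only."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE notes (id TEXT PRIMARY KEY, updated_at TEXT NOT NULL)")
    conn.executemany(
        "INSERT INTO notes (id, updated_at) VALUES (?, ?)",
        [
            ("note-a", "2026-06-13T09:00:00Z"),
            ("note-b", "2026-06-14T10:30:00Z"),
            ("note-c", "2026-06-14T08:15:00Z"),
        ],
    )
    return conn


if __name__ == "__main__":
    print(notes_changed_since(make_demo_db(), "2026-06-14T00:00:00Z"))
```

The same predicate answers the question posed in the Problem section, which is why Option A needs no queue, only a stored timestamp of the last pass.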
Security
Baseline per docs/security.md. The gardening session reaches private knowledge data only through the authenticated MCP gateway (Cloudflare Access OIDC, ADR agents/006); this ADR does not widen that boundary. The one new capability is outbound: the monolith gains the ability to start a claude.ai session on demand, which requires it to hold a credential for the remote-trigger API (see Open Questions). That credential is a write-scoped capability to start one named trigger, stored as a OnePasswordItem secret like every other cluster credential, never hardcoded.
Risks
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Edit storm fans out into many sessions | Medium | High | Debounce + single "gardener running" TTL lock; at most one session at a time |
| run call fails silently, gardening stalls | Medium | Medium | Slow fallback cron on the trigger as a backstop |
| Monolith holds a long-lived claude.ai credential | Low | Medium | 1Password-managed, single-trigger scope, rotatable; revisit if API allows narrower scoping |
| Lock leaks on session crash, blocks future runs | Low | Medium | TTL expiry releases the lock; next dirty edit reclaims |
Open Questions
- Credential / auth path. The RemoteTrigger tool injects an OAuth token in-process for interactive sessions. Whether the run endpoint accepts a long-lived token the monolith can hold (and at what scope) is unverified. If it does not, fall back to Option A (cron + direct Postgres query, no credential).
- Debounce tuning. The right idle window / minimum interval is empirical; start conservative (tens of minutes) and tighten once real edit cadence is observed.
- Whether routine_jobs survives elsewhere. This ADR removes the gardener from the queue's path; other routine kinds may still use it. Retiring the table entirely is out of scope here (that would be Option C).
References
| Resource | Relevance |
|---|---|
| ADR platform/006 | Postgres-authoritative notes + tool-based gardener this triggers |
| ADR agents/004 | Autonomous-agent triggering pattern this evolves |
| ADR agents/006 | Authenticated MCP gateway the session connects back through |
| ADR agents/012 | Prior gardener pipeline context |
| claude.ai remote-trigger API POST /v1/code/triggers/{id}/run | The on-demand run primitive this decision depends on |