Project Evolution

The v1.2 Milestone

v1.0 was about narrowing the tool surface. v1.2 is about widening what each response means — every tool can now carry its own sense of time, engine confidence, and causal context, on demand and without breaking existing callers.

graph LR
    subgraph v100["v1.0: Surface narrowed"]
        A[28 unified tools]
        B[Raw response shapes]
        C[Agent infers state diffs]
    end

    subgraph v120["v1.2: Response deepened"]
        D[Same 28 tools]
        E[Opt-in envelope: as_of, confidence]
        F[Opt-in causal: caused_by, based_on]
    end

    v100 --> v120

    classDef old fill:#fde2e2,stroke:#c0392b,color:#5c1f1f;
    classDef new fill:#e8f1ff,stroke:#3b6db3,color:#183257;
    class A,B,C old;
    class D,E,F new;

The shift

From inferred state to observed state.

In v1.0 the agent could reach the right control by meaning instead of by pixels. But the response itself was still a raw struct: nothing said how fresh the data was, how confident the engine felt, or whether the change in the UI was actually caused by the agent's last call. v1.2 puts that information into the response itself — opt-in, on every tool, without disturbing existing callers.

Key Evolution: Envelope responses

Time and confidence on every call.

Pass include: ["envelope"] to any tool and the response comes back wrapped:

{
  "_version": 1,
  "data": { /* the original tool result, unchanged */ },
  "as_of": "2026-05-03T09:14:22.481Z",
  "confidence": "fresh"
}

`as_of`

The timestamp of the actual observation, not of serialization. The agent can decide for itself whether 800 ms-old data is fresh enough.

`confidence`

A small enum: "fresh", "degraded", or "stale". Partial fallbacks no longer hide behind the same shape as a clean read.
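As a rough illustration of how a caller might combine the two fields, the sketch below accepts an envelope only when confidence is "fresh" and as_of falls within a caller-chosen age budget. The helper names (parse_as_of, is_fresh_enough) and the policy of rejecting "degraded" reads outright are assumptions made for this example, not part of the tool API.

```python
from datetime import datetime, timedelta


def parse_as_of(as_of: str) -> datetime:
    # Swap the trailing "Z" for an explicit offset so that
    # datetime.fromisoformat also accepts it on Python versions before 3.11.
    return datetime.fromisoformat(as_of.replace("Z", "+00:00"))


def is_fresh_enough(envelope: dict, now: datetime, max_age: timedelta) -> bool:
    # Hypothetical policy: only a "fresh" read is usable, and only if the
    # observation itself (not its serialization) is recent enough.
    if envelope.get("confidence") != "fresh":
        return False
    age = now - parse_as_of(envelope["as_of"])
    return timedelta(0) <= age <= max_age


envelope = {
    "_version": 1,
    "data": {},  # the original tool result would sit here
    "as_of": "2026-05-03T09:14:22.481Z",
    "confidence": "fresh",
}
now = parse_as_of("2026-05-03T09:14:23.281Z")  # 800 ms after the observation

print(is_fresh_enough(envelope, now, timedelta(seconds=1)))         # True
print(is_fresh_enough(envelope, now, timedelta(milliseconds=500)))  # False
```

A caller with looser requirements could treat "degraded" as acceptable under a shorter age budget; the point is that the decision moves to the agent.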

Key Evolution: Causal context

Observing your own effect.

Pass include: ["causal"] to desktop_state and the envelope gains two more fields alongside data, caused_by and based_on:

{
  "data": { /* normal desktop state */ },
  "caused_by": {
    "your_last_action": "mouse_click({\"x\":412,\"y\":287})",
    "tool_call_id": "session-abc:42",
    "elapsed_ms": 87,
    "produced_changes": ["focus:Notepad", "dirty_rect:monitor_0:1"]
  },
  "based_on": {
    "events": ["1849", "1850"],
    "sources": ["uia", "dxgi"]
  }
}

caused_by.your_last_action echoes the agent's own previous tool call. caused_by.produced_changes is what the engine observed actually changing as a direct consequence. based_on.events is the L1 event-id range covered by this observation, encoded as decimal strings so 64-bit IDs survive JSON serialization. The agent no longer has to infer whether its click landed — it can read the receipt.
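A minimal sketch of reading that receipt on the caller side follows. The helper names are hypothetical, and the "kind:detail" layout of produced_changes entries is inferred from the single example above rather than from a documented format.

```python
from typing import Tuple


def action_produced(response: dict, prefix: str) -> bool:
    # True if any observed change starts with the given prefix, e.g. "focus:".
    # Assumes entries look like "focus:Notepad", as in the example above.
    changes = response.get("caused_by", {}).get("produced_changes", [])
    return any(change.startswith(prefix) for change in changes)


def event_id_range(response: dict) -> Tuple[int, int]:
    # Event IDs arrive as decimal strings; Python ints are arbitrary
    # precision, so parsing them keeps all 64 bits intact.
    ids = [int(event) for event in response["based_on"]["events"]]
    return min(ids), max(ids)


response = {
    "data": {},
    "caused_by": {
        "your_last_action": 'mouse_click({"x":412,"y":287})',
        "tool_call_id": "session-abc:42",
        "elapsed_ms": 87,
        "produced_changes": ["focus:Notepad", "dirty_rect:monitor_0:1"],
    },
    "based_on": {"events": ["1849", "1850"], "sources": ["uia", "dxgi"]},
}

print(action_produced(response, "focus:"))  # True
print(event_id_range(response))             # (1849, 1850)

# Why strings: a 64-bit ID routed through a double-precision float is rounded.
big = "9223372036854775807"
print(int(big) == int(float(big)))          # False
```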

run_macro participates in the same record as a single boundary, so a macro execution shows up as one observable unit rather than dissolving into its constituent steps.

Compatibility

Responses to existing callers stay byte-for-byte the same.

If you don't pass include, nothing changes. Same response shape, same fields, same defaults as v1.0. Existing configs, scripts, and macros all keep working without modification. The new capability is paid for only by callers who ask for it.
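For client code that has to handle both kinds of response during a migration, one possible approach is a small unwrapping helper. This is a sketch only: the unwrap name is hypothetical, and using the presence of _version together with data as the marker for an enveloped response is an assumption based on the example envelope shown earlier.

```python
def unwrap(response: dict) -> dict:
    # Assumption: an enveloped response carries both "_version" and "data";
    # anything else is treated as a raw v1.0-style result and passed through.
    if isinstance(response, dict) and "_version" in response and "data" in response:
        return response["data"]
    return response


raw = {"focused_window": "Notepad"}  # illustrative payload, not a real schema
wrapped = {
    "_version": 1,
    "data": raw,
    "as_of": "2026-05-03T09:14:22.481Z",
    "confidence": "fresh",
}

print(unwrap(raw) == raw)      # True
print(unwrap(wrapped) == raw)  # True
```

Code written against the v1.0 shape can then keep consuming unwrap(response) whether or not a given call opted in.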
