# Reactive Perception Graph
By the time an LLM acts, the world may already be different. The Reactive Perception Graph (RPG) is one way to deal with that.
```mermaid
flowchart LR
  subgraph A["Snapshot-and-Act"]
    A1["Observe screen"]
    A2["Think"]
    A3["Act on old assumption"]
    A1 --> A2 --> A3
    AX["World changed"] -.-> A3
  end
  subgraph B["Reactive Perception Graph"]
    B1["Observe target"]
    B2["Store provisional state"]
    B3["Dirty / stale signals"]
    B4["Validate lease"]
    B5["Run guards"]
    B6["Execute or block"]
    B1 --> B2 --> B3 --> B4 --> B5 --> B6
  end
  A3 --> C["Unsafe action"]
  B6 --> D["Safer action contract"]
  classDef bad fill:#fde2e2,stroke:#c0392b,color:#5c1f1f;
  classDef good fill:#e5f6ea,stroke:#2e8b57,color:#123d28;
  class C bad;
  class D good;
```
## Seeing and touching are separated by time
Many LLM agents implicitly follow a loop like this: observe the interface, think, then act. That sounds harmless. In a dynamic interface, it is fragile.
**What can change?**
- the user focuses another window
- a modal appears
- the UI re-renders
- the target moves or disappears
**What breaks?**
The model still acts on the assumptions formed at observation time, even though those assumptions are no longer valid at action time.
## Correct intent, wrong target
Suppose an LLM wants to type `hello` into Notepad. It observes Notepad and decides where to type; then another window comes to the front, and it sends `hello` anyway, into whatever window now has focus.
The agent executes the intended action correctly, but on the wrong target.
That is not mainly an intelligence failure. It is a stale-assumption failure.
## External state should be treated as provisional
Reactive Perception Graph is a layer that treats external state as provisional and re-checks the assumptions behind an action before that action fires.
**Important clarification**
RPG is not a screenshot cache. It is a different contract between the agent and the world.
## The parts that make the contract work
```mermaid
flowchart TB
  L["Lens\nWhat am I watching?"]
  P["Provisional state\nWhat do I currently believe?"]
  G["Guard\nIs this action still safe?"]
  T["Lease\nCan I still trust this target?"]
  X["Action"]
  L --> P
  P --> G
  P --> T
  T --> G
  G --> X
  W["World changes"] -. "marks dirty" .-> P
  W -. "can revoke trust" .-> T
  classDef core fill:#eef4ff,stroke:#3b6db3,color:#183257;
  classDef edge fill:#fff6df,stroke:#b8860b,color:#5a4300;
  class L,P,G,T core;
  class X edge;
```
**Provisional state**
Keeps not only what the agent believes, but also how trustworthy that belief still is.

**Lens**
A watchpoint on something the agent currently cares about.

**Guard**
A safety check run just before an action, in case the environment has drifted.

**Lease**
A temporary trust contract for an external target.
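The four primitives above can be sketched as plain data plus a few small functions. This is a minimal illustration, assuming an in-process design; all names (`createLens`, `issueLease`, and so on) are illustrative, not a published API.

```javascript
// Lens: a watchpoint over one target the agent currently cares about.
function createLens(targetId, observeFn) {
  return { targetId, observe: observeFn };
}

// Provisional state: the belief plus a freshness flag, never "the truth".
function observe(lens) {
  return { targetId: lens.targetId, value: lens.observe(), dirty: false, observedAt: Date.now() };
}

// World changes mark the belief dirty instead of silently invalidating it.
function markDirty(state) {
  state.dirty = true;
}

// Lease: a revocable, time-limited trust contract on the target.
function issueLease(state, ttlMs) {
  return { targetId: state.targetId, expiresAt: Date.now() + ttlMs, revoked: false };
}

function revokeLease(lease) {
  lease.revoked = true;
}

function leaseValid(lease) {
  return !lease.revoked && Date.now() < lease.expiresAt;
}

// Guard: the last check before an action fires.
function guardsPass(state) {
  return !state.dirty;
}
```

Note that a world change never mutates the belief's value directly; it only flips `dirty` or revokes the lease, which forces a re-observation before the next action.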
## Execute should be the final step, not the default step
```mermaid
flowchart TD
  S["See target"] --> P["Issue lease"]
  P --> Q["Keep state as provisional"]
  Q --> R{"World changed?"}
  R -- "No" --> U["Action proposed"]
  R -- "Yes" --> T["Mark dirty / stale"]
  T --> U
  U --> V{"Lease valid?"}
  V -- "No" --> W["Refresh view"]
  V -- "Yes" --> X{"Guards pass?"}
  X -- "No" --> Y["Block or recover"]
  X -- "Yes" --> Z["Execute action"]
  classDef safe fill:#e7f8ec,stroke:#2e8b57,color:#173d29;
  classDef risk fill:#fff4d6,stroke:#b8860b,color:#5a4300;
  classDef stop fill:#fde8e8,stroke:#c0392b,color:#5c1f1f;
  class Z safe;
  class T,W risk;
  class Y stop;
```
In code, the same sequence of checks looks like this (helper names are illustrative):

```javascript
const lease = issueLease(target);             // trust contract on the target
const state = rememberAsProvisional(target);  // a belief, not the truth

if (!validateLease(lease, state)) {
  return refresh();  // target no longer trusted: re-observe before acting
}
if (!guardsPass(state)) {
  return block();    // environment drifted: block or recover
}
return execute();    // assumptions re-validated at action time
```
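The whole loop can be made concrete as a single decision function. This is a self-contained sketch assuming a toy `world` object; every name here is hypothetical, not a real agent API.

```javascript
// Decide what to do with a proposed action, re-checking the world at
// action time rather than trusting the snapshot taken at observation time.
function actSafely(target, world) {
  // Lease check: does the target still exist and can it still be trusted?
  const lease = { targetId: target.id, valid: world.exists(target.id) };

  // Provisional state: has the world drifted since we last observed it?
  const state = {
    dirty: world.changedSince(target.observedAt),
    focused: world.focusedWindow === target.id,
  };

  if (!lease.valid) return "refresh";                 // target gone: re-observe
  if (state.dirty || !state.focused) return "block";  // drifted: block or recover
  return "execute";                                   // safe to act
}
```

In the Notepad example, a window coming to the front changes `world.focusedWindow`, so the same proposed action that would have executed a moment earlier is now blocked instead.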
## This is a broader contract problem
**Browser agents**
A DOM observed earlier may no longer match the live page.

**Workflow or API agents**
A previously fetched resource handle may no longer be valid.

**Embodied agents**
An object seen a moment ago may no longer be where the agent assumes it is.
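As a hypothetical browser-agent illustration of the same contract: a page re-render can bump a version counter, and any lease issued against an older version is treated as stale. This is a sketch, not a real framework API.

```javascript
// Hypothetical page model: each re-render invalidates outstanding leases
// by advancing a version counter the lease no longer matches.
function makePage() {
  let version = 0;
  return {
    render() { version += 1; },  // the world changed
    leaseNode(selector) { return { selector, version }; },
    leaseFresh(lease) { return lease.version === version; },
  };
}
```

A node handle grabbed before a re-render fails the freshness check afterwards, so the agent refreshes its view instead of clicking a stale element.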
## The next thing that matters is evidence
To validate this direction, the project still needs to measure:
- unsafe action rate
- re-observation count
- token-heavy observation count
- task success rate
- recovery steps
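These metrics could be collected with simple per-run counters during evaluation. A minimal sketch, with metric names taken from the list above and everything else illustrative:

```javascript
// Run-level tally for the evaluation metrics listed above.
function makeMetrics() {
  const m = {
    unsafeActions: 0, reObservations: 0, tokenHeavyObservations: 0,
    tasks: 0, successes: 0, recoverySteps: 0,
  };
  return {
    record(name) { if (name in m) m[name] += 1; },  // count one event
    successRate() { return m.tasks === 0 ? 0 : m.successes / m.tasks; },
    snapshot() { return { ...m }; },                // copy for reporting
  };
}
```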