Anti-Flap Toolkit — Stable GOAP
What flapping is
Flapping is the pathology where an AI agent oscillates between two (or more)
goals or actions every replan tick, never committing long enough to make
visible progress on either. A patrolling guard who sees the player at the
exact edge of detection range and switches Patrol -> Chase -> Patrol -> Chase
once per planner tick is the canonical example: from a human's perspective the
agent is "stuttering" rather than acting.
The root cause is almost always that the planner's goal-selection score for two candidates is near-equal and a small amount of noise (sensor jitter, distance crossing a threshold, a tiny utility tweak) is enough to flip the winner every replan. The fix is not to make the planner smarter — A* with admissible heuristics is already optimal for each individual replan. The fix is to add stability on top of it, at the right layer.
Intent Forge's framing for the v0.30 release is Stable GOAP: classic GOAP optimality preserved per-replan, with a small set of orthogonal stability layers wrapped around it. Each layer lives in its natural API location rather than being collapsed into a single "pick a strategy" enum.
The pipeline
Raw sensor read <- Family 4 (EMA, One Euro) | sensor layer
|
v
Derived fact <- Family 2 (latched bool, hysteresis) | fact-derivation layer
|
v
Goal score <- Family 1 (momentum bonus) | selection layer
|
v
Selected action <- Family 3 (min-commit time) | execution layer
|
v
Running executor <- Family 5 (plan reuse) | plan-shape layer
A particular flap may be suppressible at any of these layers. You typically want to attack it at the earliest layer that makes the signal stable, so the downstream layers see a clean input.
What ships in v0.31
| Family | Layer | Status (v0.31) |
|---|---|---|
| 1 | Selection | Shipped (since v0.30) |
| 2 | Fact derivation | Shipped |
| 3 | Execution | Shipped (since v0.27) |
| 4 | Fact derivation | Shipped |
| 5 | Plan shape | Shipped (since v0.23) |
v0.31 ships the bundled fact-derivation layer: per-fact EMA smoothing
(Family 4) and Schmitt-style latched bool facts (Family 2), both inside
FWorldState::SetScalar. Configuration lives on the UFactSchemaAsset
so every producer (sensors, gameplay events, blackboard bridges) gets
the same treatment without each opting in by inheritance.
The full stack of five orthogonal stability layers is now usable from a single declarative schema; the Patrol/Chase worked example demonstrates three of them (Families 1, 2, and 4) in one actor with individual toggles.
The layered stack (v0.31)
Producer (sensor / gameplay / event)
|
v raw scalar
FWorldState::SetScalar
|
+--> [bSuppressDerivation?] -- yes (planner search) --> store raw, run latch, done
|
v no (live runtime)
|
+--> [Filter enabled on fact?]
| yes -> smoothed = Alpha*raw + (1-Alpha)*prior; store smoothed
| no -> store raw
|
v
+--> [Latch configured with this fact as source?]
yes -> evaluate Schmitt band, SetBool(OutputBoolFactId)
no -> done
Planner search nodes set bSuppressDerivation = true on every state copy
so EMA does not bleed into the deterministic action-effect simulation.
Latching is not suppressed in search — the derived bool is part of the
symbolic state the planner reasons about, so action effects that move the
source scalar must also move the derived bool inside the search.
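The flow above can be sketched in plain, engine-free C++. This is a minimal illustration and not the plugin's implementation: SketchWorldState, SketchFilter, SketchLatch, and the string-keyed maps are hypothetical stand-ins, and the latch shown is the canonical "true when low" form only. What it takes from the text is the order of operations and the fact that bSuppressDerivation skips the EMA but still runs the latch.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for the schema rows; names are illustrative only.
struct SketchFilter { float Alpha = 0.25f; };
struct SketchLatch {
    std::string OutputBool;
    float Enter = 0.25f;   // canonical form: true once the source drops below Enter
    float Exit  = 0.40f;   // and false again once it rises above Exit
};

struct SketchWorldState {
    bool bSuppressDerivation = false;                        // true on planner search copies
    std::unordered_map<std::string, float> Scalars;
    std::unordered_map<std::string, bool>  Bools;
    std::unordered_map<std::string, SketchFilter> Filters;   // keyed by scalar fact id
    std::unordered_map<std::string, SketchLatch>  Latches;   // keyed by source scalar fact id

    void SetScalar(const std::string& Id, float Raw) {
        float Stored = Raw;
        const auto Filter = Filters.find(Id);
        const auto Prior  = Scalars.find(Id);
        // EMA runs only in the live runtime, and only once a prior sample exists.
        if (!bSuppressDerivation && Filter != Filters.end() && Prior != Scalars.end()) {
            const float A = Filter->second.Alpha;
            Stored = A * Raw + (1.0f - A) * Prior->second;
        }
        Scalars[Id] = Stored;

        // The latch runs in both modes, so search nodes keep the derived bool in sync.
        const auto Latch = Latches.find(Id);
        if (Latch != Latches.end()) {
            bool& Out = Bools[Latch->second.OutputBool];
            if (Stored < Latch->second.Enter)     Out = true;
            else if (Stored > Latch->second.Exit) Out = false;
            // Inside the band: keep the previous value.
        }
    }
};
```

In this sketch a search node is just a copy of the live state with bSuppressDerivation set to true: writing 0.1 to a filtered Health fact on that copy stores exactly 0.1 and still flips LowHealth, which is the behavior the planner needs to hash and compare states deterministically.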
The five families
Family 1 — Score-domain biasing (selection layer)
A multiplicative bonus on the score of the currently-active goal during
replan ranking: score *= 1 + MomentumBonus. Switching to a different
goal requires the new goal to clear the bonus margin.
Ships in v0.30 as FPlannerHints::MomentumBonus, threaded into the planner
by the component as UIntentForgeComponent::GoalMomentumBonus (default
0.15). The 0.15 default matches Apoch's Utility Intelligence published
guidance and the broader Curvature literature: large enough to suppress
flap on near-equal goals, small enough that meaningful score changes still
win.
Special case: when both candidates score exactly 0, the bonus is inert
(0 * 1.15 = 0). The planner's comparator falls back to an
active-goal-wins tiebreak before declaration order, so the previous goal
still keeps the slot.
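A minimal sketch of the ranking rule described above, assuming goals arrive in declaration order and are identified by an integer id. SketchGoal and SelectGoal are hypothetical names for illustration; only the multiplicative bonus, the active-goal-wins tiebreak, and the declaration-order fallback come from the text.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical goal record; only the scoring rule comes from the text above.
struct SketchGoal {
    int   Id;
    float Score;   // raw utility before any momentum bonus
};

// Returns the index of the winning goal, or -1 if Goals is empty.
// Goals are assumed to be listed in declaration order.
int SelectGoal(const std::vector<SketchGoal>& Goals, int ActiveGoalId, float MomentumBonus) {
    int Best = -1;
    float BestScore = 0.0f;
    bool bBestIsActive = false;
    for (std::size_t i = 0; i < Goals.size(); ++i) {
        const bool bActive = (Goals[i].Id == ActiveGoalId);
        // score *= 1 + MomentumBonus, applied to the currently active goal only.
        const float Biased = bActive ? Goals[i].Score * (1.0f + MomentumBonus) : Goals[i].Score;
        const bool bBetter = Best < 0
            || Biased > BestScore
            || (Biased == BestScore && bActive && !bBestIsActive);  // active-goal-wins tiebreak
        // Any remaining tie keeps the earlier entry, i.e. declaration order.
        if (bBetter) {
            Best = static_cast<int>(i);
            BestScore = Biased;
            bBestIsActive = bActive;
        }
    }
    return Best;
}
```

With Patrol at 1.00 (active) and Chase at 1.05, a bonus of 0.15 keeps Patrol because 1.15 beats 1.05; Chase has to reach a score above 1.15 before it takes the slot. When both score exactly 0 the bonus changes nothing, and the explicit tiebreak is what keeps the active goal in place.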
Family 2 — Latched fact derivation (fact-derivation layer)
Schmitt-style hysteresis on the derivation of a bool fact from a scalar:
LowHealth = true once health drops below 0.25, but stays true until
health rises above 0.40. The dual-threshold band creates deliberate
hysteresis in the derived state.
Ships in v0.31 as FIntentLatchedScalarFactConfig, attached to the
schema as a sibling list (UFactSchemaAsset::LatchedBoolFacts). Each
row maps one source scalar fact id to one output bool fact id with an
EnterThreshold / ExitThreshold band. Latching runs inside
FWorldState::SetScalar so every write to the source scalar drives the
latch automatically — no extra ticking, no per-agent state on the
sensor, no special code in the planner.
Canonical form: Enter <= Exit. The output is true when the
source value is below Enter, and stays true until the source rises
above Exit. The example above (LowHealth) is canonical.
Inverse form: Enter > Exit. "True when high." The output is true
when the source rises above Enter, and stays true until the source
falls below Exit. The validator reports this as an info/warning so the
choice is explicit and not a typo.
Initial state. bInitialValue is what GetBool(OutputBoolFactId)
returns until the first source-scalar write arrives. Without this, a
just-spawned agent would read the bitset default (false) for the
output bool regardless of where the source value started.
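The canonical and inverse forms can be written as one small pure function. The sketch below is illustrative: SketchLatchRow and EvaluateLatch are hypothetical names, and the struct does not claim to match the layout of FIntentLatchedScalarFactConfig. It also assumes the caller seeds the previous value with bInitialValue before the first source write, so a first sample that lands inside the band holds the initial value.

```cpp
// Hypothetical mirror of one latch row; field names follow the text above.
struct SketchLatchRow {
    float EnterThreshold;
    float ExitThreshold;
    bool  bInitialValue;   // assumed to seed bPrevious before the first write
};

// Returns the new latched value given the previous one and a fresh source sample.
bool EvaluateLatch(const SketchLatchRow& Row, bool bPrevious, float Source) {
    if (Row.EnterThreshold <= Row.ExitThreshold) {
        // Canonical form ("true when low").
        if (Source < Row.EnterThreshold) return true;
        if (Source > Row.ExitThreshold)  return false;
    } else {
        // Inverse form ("true when high").
        if (Source > Row.EnterThreshold) return true;
        if (Source < Row.ExitThreshold)  return false;
    }
    return bPrevious;  // inside the band: hold the committed value
}
```

For the LowHealth row (Enter 0.25, Exit 0.40), health samples of 0.2, 0.3, 0.39 all read true once the latch has fired, and only a sample above 0.40 releases it.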
Family 3 — Time-domain (execution layer)
A minimum commit time on every action: once dispatched, an action cannot
be preempted for at least MinActionCommitTime seconds (default 0.3s)
unless the candidate has a higher EIntentInterruptionLevel. The
interruption-level gate is the emergency bypass — a Critical-level
goal can preempt anything regardless of how recently the current action
started.
Already shipping (since v0.27). No changes in v0.30.
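The commit gate reduces to a two-term check. The sketch below is a hedged illustration: SketchInterruptionLevel is a hypothetical three-value ordering (the text names only Critical from EIntentInterruptionLevel), and SketchRunningAction and CanPreempt are invented names, not plugin API.

```cpp
// Hypothetical ordering of interruption levels; only "Critical" is named in the text.
enum class SketchInterruptionLevel { Normal = 0, High = 1, Critical = 2 };

struct SketchRunningAction {
    SketchInterruptionLevel Level;
    float StartTimeSeconds;
};

// True if a candidate is allowed to preempt the running action at NowSeconds.
bool CanPreempt(const SketchRunningAction& Running,
                SketchInterruptionLevel CandidateLevel,
                float NowSeconds,
                float MinActionCommitTime = 0.3f) {
    const bool bCommitElapsed =
        (NowSeconds - Running.StartTimeSeconds) >= MinActionCommitTime;
    const bool bHigherLevel =
        static_cast<int>(CandidateLevel) > static_cast<int>(Running.Level);
    return bCommitElapsed || bHigherLevel;  // a higher level is the emergency bypass
}
```

A same-level candidate arriving 0.1s after dispatch is held back until the 0.3s window closes, while a Critical candidate against a Normal action passes immediately.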
Family 4 — Signal smoothing (fact-derivation layer)
An exponential moving average on a noisy raw sample before it becomes a
fact: smoothed_n = Alpha * raw_n + (1 - Alpha) * smoothed_{n-1}. The
smoothed value is what the rest of the planner sees.
Ships in v0.31 as FIntentScalarFilterConfig, attached inline to a
FFactSchemaEntry of Type == Scalar. The filter applies inside
FWorldState::SetScalar so every producer benefits — sensors, gameplay
events, blackboard bridges, BatchMutate — without each opting in by
inheritance.
First-sample passthrough. The first SetScalar for a fact stores
the raw value (no smoothing). Subsequent writes apply EMA. This avoids
a long warm-up where the smoothed value crawls up from zero.
Alpha tuning. Alpha = 1.0 is identity (passthrough). Alpha =
0.05 is heavy lag (slow tracking). The default 0.25 is a reasonable
starting point for distance-style signals. The validator warns if
Alpha is outside (0, 1].
Suppression in search. EMA does not run inside the planner's A*
expansion. Each search node has bSuppressDerivation = true, so action
effects that move a scalar produce deterministic outputs the search can
hash and compare. The live runtime never sets that flag.
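A minimal sketch of the EMA with first-sample passthrough, kept separate from any world-state plumbing. SketchEma is a hypothetical type for illustration; the assert stands in for the validator's (0, 1] range check and is an assumption of this sketch, not plugin behavior.

```cpp
#include <cassert>

// Hypothetical per-fact filter state.
struct SketchEma {
    float Alpha;
    bool  bHasSample = false;
    float Smoothed   = 0.0f;

    explicit SketchEma(float InAlpha) : Alpha(InAlpha) {
        assert(Alpha > 0.0f && Alpha <= 1.0f);  // same range the validator checks: (0, 1]
    }

    // Feeds one raw sample and returns the value the planner would see.
    float Push(float Raw) {
        if (!bHasSample) {
            Smoothed = Raw;          // first-sample passthrough, no warm-up from zero
            bHasSample = true;
        } else {
            Smoothed = Alpha * Raw + (1.0f - Alpha) * Smoothed;
        }
        return Smoothed;
    }
};
```

With Alpha = 0.25, a distance that steps from 800 to 400 reads 800, then 700, then 625 on successive writes, so a one-sample spike moves the fact by only a quarter of its size. With Alpha = 1.0 every write passes through unchanged.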
Family 5 — Structural commitment (plan-shape layer)
The shape of the plan itself encodes commitment:
- Short plans: the planner's MaxPlanDepth cap keeps plans atomic enough that "the next replan will pick something different" can't span across weeks of in-fiction time.
- Plan reuse: when the new plan's first action matches the in-flight action, the executor keeps running; only the tail of the plan rotates. See bSameFirstStep in UIntentForgeComponent::AdoptNewPlan.
- Executor lifecycle: an executor runs to terminal status before the component touches it, regardless of how many replans fire underneath.
Already shipping. No changes in v0.30.
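The plan-reuse rule can be sketched as a comparison between the new plan's first step and the in-flight action. This is a simplified illustration, not the body of UIntentForgeComponent::AdoptNewPlan: SketchExecutor, its string action ids, and RestartCount are hypothetical, and the sketch deliberately ignores the commit-time gate and the run-to-terminal-status rule so that only the bSameFirstStep branch is visible.

```cpp
#include <string>
#include <vector>

// Hypothetical component state; action ids are plain strings for the sketch.
struct SketchExecutor {
    std::string InFlightAction;          // empty when nothing is running
    std::vector<std::string> Plan;
    int RestartCount = 0;                // how many times an executor was started

    void AdoptNewPlan(const std::vector<std::string>& NewPlan) {
        const bool bSameFirstStep = !NewPlan.empty()
            && !InFlightAction.empty()
            && NewPlan.front() == InFlightAction;
        Plan = NewPlan;                  // the tail always rotates to the new plan
        if (bSameFirstStep) {
            return;                      // keep the in-flight executor running
        }
        InFlightAction = NewPlan.empty() ? std::string() : NewPlan.front();
        if (!NewPlan.empty()) {
            ++RestartCount;
        }
    }
};
```

Replanning from MoveTo, Attack to MoveTo, Flee leaves the MoveTo executor untouched and only swaps the tail; a visible restart happens only when the first step itself changes.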
Why Families 2 and 4 are bundled
Both are fact-derivation policies. Bundling them into a single v0.31 release rather than dribbling them out gave us:
- One schema-side API change. A scalar fact in v0.31 gains both an optional filter config and an optional latched-derivation config in a single schema-asset extension. Splitting the work over two releases would have meant churning the schema editor twice.
- The double-smoothing footgun caught by construction. A heavily smoothed scalar feeding into a tight Schmitt band can oscillate inside the band forever and never flip the derived bool. The v0.31 validator emits a warning (case 10) when the filter's Alpha < 0.05 AND the latch band |Enter - Exit| < 0.05; both halves live together in the same code path, so the cross-check is cheap.
- Preconditions stay pure. An FPrecondition_ScalarSchmitt would need per-agent state and would break the planner's "preconditions are pure functions of world state" invariant. Pushing both EMA and hysteresis into fact derivation means everything downstream of the schema is unchanged.
Validator coverage (v0.31)
The archetype validator (UIntentForgeArchetypeValidator) catches the
ten most common authoring footguns. Numbered by severity-and-frequency:
| # | Condition | Level |
|---|---|---|
| 1 | Filter Alpha outside (0, 1] on an enabled filter | Warning |
| 2 | Filter enabled on a non-Scalar fact | Error |
| 3 | Latch row references a SourceScalarFactId that isn't Scalar | Error |
| 4 | Latch row references an OutputBoolFactId that isn't Bool | Error |
| 5 | Two latch rows share the same OutputBoolFactId | Error |
| 6 | EnterThreshold == ExitThreshold (no hysteresis) | Warning |
| 7 | EnterThreshold > ExitThreshold (inverse "true when high") | Warning¹ |
| 8 | Latch output also written directly by an action effect | Warning |
| 9 | Latch output's bTriggersReplan == false | Warning¹ |
| 10 | Heavy filter (Alpha < 0.05) + tight band (\|E-X\| < 0.05) | Warning |
¹ Cases 7 and 9 are conceptually "informational" (the configuration is
legal, just worth a sanity check). UEditorValidatorBase only exposes
AssetPasses / AssetWarning / AssetFails — there is no Info severity
in UE 5.7 — so these surface as Warnings.
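To make a few of the numeric checks concrete, here is a hedged sketch of cases 1, 6, 7, and 10 as a pure function over one scalar fact. It is not the code of UIntentForgeArchetypeValidator: SketchScalarFact, SketchFinding, and ValidateScalarFact are hypothetical, and the fact is flattened into a single struct purely for illustration.

```cpp
#include <cmath>
#include <vector>

enum class SketchSeverity { Warning, Error };

struct SketchFinding {
    int Case;                 // row number in the table above
    SketchSeverity Severity;
};

// Hypothetical flattened view of one scalar fact plus its optional latch row.
struct SketchScalarFact {
    bool  bFilterEnabled;
    float Alpha;
    bool  bHasLatch;
    float EnterThreshold;
    float ExitThreshold;
};

std::vector<SketchFinding> ValidateScalarFact(const SketchScalarFact& F) {
    std::vector<SketchFinding> Out;
    // Case 1: Alpha outside (0, 1] on an enabled filter.
    if (F.bFilterEnabled && !(F.Alpha > 0.0f && F.Alpha <= 1.0f)) {
        Out.push_back({1, SketchSeverity::Warning});
    }
    if (F.bHasLatch) {
        // Case 6: no hysteresis band at all.
        if (F.EnterThreshold == F.ExitThreshold) {
            Out.push_back({6, SketchSeverity::Warning});
        }
        // Case 7: inverse form, legal but surfaced so it is not a typo.
        if (F.EnterThreshold > F.ExitThreshold) {
            Out.push_back({7, SketchSeverity::Warning});
        }
        // Case 10: heavy filter combined with a tight band.
        if (F.bFilterEnabled && F.Alpha < 0.05f
            && std::fabs(F.EnterThreshold - F.ExitThreshold) < 0.05f) {
            Out.push_back({10, SketchSeverity::Warning});
        }
    }
    return Out;
}
```

A fact with Alpha = 0.02 and a 0.30 / 0.32 band trips only case 10, while the LowHealth-style configuration (Alpha = 0.25, band 0.25 / 0.40) produces no findings.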
Recommended starting composition (v0.31)
For a new project on v0.31, start with the defaults and add fact-layer policies only where the signal is actually noisy:
- GoalMomentumBonus = 0.15 (component default, Family 1)
- MinActionCommitTime = 0.3s (component default, Family 3)
- Plan reuse (automatic, Family 5)
- Add FIntentScalarFilterConfig with Alpha = 0.25 to scalar facts whose producers are noisy (raw distances, sensor readings, gameplay events that fire in bursts). (Family 4)
- Add an FIntentLatchedScalarFactConfig row to the schema where a derived bool oscillates near a threshold (PlayerNear, LowHealth, CanSeeTarget, etc.). Pick a band wide enough that residual filter jitter cannot push the source value across both thresholds. (Family 2)
Order of operations when tuning a flap problem: try Family 1 first (one component knob), then Family 4 on the source scalar (one schema knob), then Family 2 to give the derived bool its own committed band (one schema row). If you still see flap after all three, the issue is almost certainly in the goal-scoring functions themselves, not the toolkit.
Worked example: Patrol/Chase
AIntentForgePatrolChaseExample (in IntentForgeExamples) is the
single-actor demonstration of the full layered stack. Drop one in any
level, hit Play.
The agent has two equally-scored goals:
- Patrol: desired PlayerNear = false
- Chase: desired PlayerNear = true
A UPatrolChaseDistanceSensor samples the distance to the player every
0.1s. Two actor toggles control how the signal flows from the sensor to
the planner:
| Toggle | Off (v0.30 baseline) | On (v0.31 demo) |
|---|---|---|
| bEnableFactFiltering | No EMA; raw distance is used. | Distance fact is smoothed with DistanceFilterAlpha. |
| bEnableLatchedBool | Sensor writes PlayerNear directly. | Schmitt latch derives PlayerNear from distance. |
To observe pure flap (every safety layer off): set
GoalMomentumBonus = 0 on the AgentComponent, leave both toggles off.
Stand on the threshold and let small position jitter cross it. The
on-screen messages show Goal changed: Patrol -> Chase and back every
replan tick — that's signal-layer flap and selection-layer flap
stacking.
To observe selection-layer stability only (Family 1): set
GoalMomentumBonus = 0.15, leave both toggles off. The agent commits
to whichever goal it picked first across small jitter, but a single
clean crossing of the threshold still flips it.
To observe signal-layer smoothing (Family 4): enable
bEnableFactFiltering. The raw distance is EMA'd before any consumer
sees it; transient spikes get absorbed. The Live Inspector's "Derived
Facts" panel shows raw and smoothed side by side.
To observe symbolic-state hysteresis (Family 2): enable
bEnableLatchedBool. Even if the smoothed distance walks across the
midpoint of the band, the derived PlayerNear only flips when the
smoothed value clears the other side of the band. The agent stays
committed to one goal across the entire band, which is the point.
The Live Inspector's Replan History tab still shows the Margin column
(WinningScore - RunnerUpScore) with a * flag on rows where the
momentum bonus did the work. The Derived Facts panel beneath the World
State panel shows current raw / smoothed / latched values live, so you
can watch the signal walk across each layer.
What we deliberately do NOT ship (and why)
- Hysteresis decay over time. A bonus that scales down based on how long the active goal has been running. Adds tuning surface area for marginal benefit; the same effect is better expressed by sizing the per-action commit time. Rejected.
- Per-goal commit time. A separate MinCommitTime per goal. Same problem: one global knob is easier to reason about than N. If you have one outlier goal that must commit longer, use the goal's InterruptionLevel + a custom executor with a built-in minimum runtime. Rejected.
- Soft-max / probabilistic goal selection. Replace argmax with a temperature-scaled sample. Solves flap by intentionally introducing exploration noise, which is the opposite of what gameplay wants. Rejected.
- Additive switching cost alongside multiplicative bonus. Score-domain biasing already covers the design space; adding a second knob in the same domain doubles the tuning load without adding new behavior. Rejected.
- EMA at a sensor base class. Smoothing is a fact-derivation policy, not a sensor-inheritance concern. v0.31 puts it on the fact. Rejected at this layer.
- Time-decaying EMA / second time constant. A separate "fast" and "slow" EMA that share state. Adds a second knob for marginal benefit; the One Euro filter is the well-known shape for that idea and we rejected One Euro itself (below). Rejected.
- Boxcar / median / One Euro filter alternatives. A flat suite of filter shapes parameterized by an enum. Adds taxonomy surface area without a concrete need. EMA (itself a one-pole low-pass filter) covers the common case; we will add an alternative filter shape only if a user reports a signal that cannot be tuned with EMA. Rejected for v0.31.
- Multi-source latched facts (OR / AND of N sources). A single derived bool driven by combining multiple scalars. Solvable by adding one latch per source plus a precondition that OR's their outputs. Rejected as built-in.
- Per-frame ticking of latch state. The latch fires inside SetScalar only, not on a timer. If the source value doesn't change, the latch state doesn't change, by design. Rejected (would defeat the purpose of "stateful derivation lives next to the fact it derives from").
- bResetOnReplan for latch state. Resetting latched bools on every replan would invalidate the hysteresis band on the very edge where it matters. The latch's job is to survive replans. Rejected.
- Custom alpha curves (non-EMA). Arbitrary smoothing functions authored by users. Saturates the configuration surface; if you need non-EMA smoothing, write a sensor that does the smoothing and writes the smoothed scalar. Rejected.