Anti-Flap Toolkit: Stable GOAP

What flapping is

Flapping is the pathology where an AI agent oscillates between two (or more) goals or actions every replan tick, never committing long enough to make visible progress on either. A patrolling guard who sees the player at the exact edge of detection range and switches Patrol -> Chase -> Patrol -> Chase once per planner tick is the canonical example: from a human's perspective the agent is "stuttering" rather than acting.

The root cause is almost always that the planner's goal-selection score for two candidates is near-equal and a small amount of noise (sensor jitter, distance crossing a threshold, a tiny utility tweak) is enough to flip the winner every replan. The fix is not to make the planner smarter — A* with admissible heuristics is already optimal for each individual replan. The fix is to add stability on top of it, at the right layer.

Intent Forge's framing for the v0.30 release is Stable GOAP: classic GOAP optimality preserved per-replan, with a small set of orthogonal stability layers wrapped around it. Each layer lives in its natural API location rather than being collapsed into a single "pick a strategy" enum.

The pipeline

  Raw sensor read       <- Family 4 (EMA smoothing)         | fact-derivation layer
       |
       v
  Derived fact          <- Family 2 (latched bool, hysteresis) | fact-derivation layer
       |
       v
  Goal score            <- Family 1 (momentum bonus)        | selection layer
       |
       v
  Selected action       <- Family 3 (min-commit time)       | execution layer
       |
       v
  Running executor      <- Family 5 (plan reuse)            | plan-shape layer

A particular flap may be suppressible at any of these layers. You typically want to attack it at the earliest layer that makes the signal stable, so the downstream layers see a clean input.

What ships in v0.31

Family	Layer	Status (v0.31)
1	Selection	Shipped (since v0.30)
2	Fact derivation	Shipped
3	Execution	Shipped (since v0.27)
4	Fact derivation	Shipped
5	Plan shape	Shipped (since v0.23)

v0.31 ships the bundled fact-derivation layer: per-fact EMA smoothing (Family 4) and Schmitt-style latched bool facts (Family 2), both inside FWorldState::SetScalar. Configuration lives on the UFactSchemaAsset so every producer (sensors, gameplay events, blackboard bridges) gets the same treatment without each opting in by inheritance.

The full stack of five orthogonal stability layers is now usable from a single declarative schema; the Patrol/Chase worked example demonstrates three of them (Families 1, 2, and 4) in one actor with individual toggles.

The layered stack (v0.31)

  Producer (sensor / gameplay / event)
       |
       v  raw scalar
  FWorldState::SetScalar
       |
       +--> [bSuppressDerivation?]  -- yes (planner search) --> store raw, run latch, done
       |
       v  no (live runtime)
       |
       +--> [Filter enabled on fact?]
       |        yes -> smoothed = Alpha*raw + (1-Alpha)*prior; store smoothed
       |        no  -> store raw
       |
       v
       +--> [Latch configured with this fact as source?]
                yes -> evaluate Schmitt band, SetBool(OutputBoolFactId)
                no  -> done

Planner search nodes set bSuppressDerivation = true on every state copy so EMA does not bleed into the deterministic action-effect simulation. Latching is not suppressed in search — the derived bool is part of the symbolic state the planner reasons about, so action effects that move the source scalar must also move the derived bool inside the search.
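The flow above can be modelled in a few lines of standalone C++. This is a minimal sketch, not the plugin's implementation: MiniWorldState, FilterConfig, LatchConfig and all their members are hypothetical names, and only the canonical "true when low" band is handled. What it takes from the text is the ordering (suppression check, then EMA, then latch), the first-sample passthrough, and the fact that the latch still runs on search copies.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for the schema rows; names are illustrative only.
struct FilterConfig {
    bool enabled = false;
    double alpha = 0.25;
};

struct LatchConfig {
    std::string sourceId;    // source scalar fact id
    std::string outputId;    // output bool fact id
    double enterThreshold = 0.0;
    double exitThreshold = 0.0;
};

struct MiniWorldState {
    bool suppressDerivation = false;  // assumed true on planner search copies
    std::map<std::string, FilterConfig> filters;
    std::vector<LatchConfig> latches;
    std::map<std::string, double> scalars;
    std::map<std::string, bool> bools;

    void SetScalar(const std::string& id, double raw) {
        double stored = raw;
        const auto filter = filters.find(id);
        const auto prior = scalars.find(id);
        // EMA runs only in the live runtime, and only once a prior sample exists.
        if (!suppressDerivation && filter != filters.end() &&
            filter->second.enabled && prior != scalars.end()) {
            const double a = filter->second.alpha;
            stored = a * raw + (1.0 - a) * prior->second;
        }
        scalars[id] = stored;

        // The latch runs in both modes so the derived bool tracks the scalar.
        // Only the canonical "true when low" band is modelled here.
        for (const LatchConfig& latch : latches) {
            if (latch.sourceId != id) continue;
            bool& out = bools[latch.outputId];  // defaults to false on first use
            if (stored < latch.enterThreshold) out = true;
            else if (stored > latch.exitThreshold) out = false;
        }
    }
};
```

Copying the state and setting suppressDerivation on the copy mimics a search node: a write of 7.0 on the copy stores exactly 7.0 and still re-evaluates the latch, while the live state keeps its smoothed value.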

The five families

Family 1: Score-domain biasing (selection layer)

A multiplicative bonus on the score of the currently-active goal during replan ranking: score *= 1 + MomentumBonus. Switching to a different goal requires the new goal to clear the bonus margin.

Ships in v0.30 as FPlannerHints::MomentumBonus, threaded into the planner by the component as UIntentForgeComponent::GoalMomentumBonus (default 0.15). The 0.15 default matches Apoch's Utility Intelligence published guidance and the broader Curvature literature: large enough to suppress flap on near-equal goals, small enough that meaningful score changes still win.

Special case: when both candidates score exactly 0, the bonus is inert (0 * 1.15 = 0). The planner's comparator falls back to an active-goal-wins tiebreak before declaration order, so the previous goal still keeps the slot.
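As a rough illustration of the ranking rule and its zero-score tiebreak, here is a standalone sketch. GoalCandidate and SelectGoal are hypothetical names, and the real comparator may differ in detail; the sketch only encodes what the text states: the active goal's score is multiplied by 1 + MomentumBonus, ties go to the active goal, and remaining ties fall back to declaration order.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical minimal goal record; not the plugin's type.
struct GoalCandidate {
    double score;
};

// Returns the index of the winning goal, or -1 if there are no candidates.
// activeIndex is the currently running goal (-1 if none).
int SelectGoal(const std::vector<GoalCandidate>& goals, int activeIndex,
               double momentumBonus = 0.15) {
    int best = -1;
    double bestScore = 0.0;
    for (std::size_t i = 0; i < goals.size(); ++i) {
        const int idx = static_cast<int>(i);
        double s = goals[i].score;
        if (idx == activeIndex) s *= 1.0 + momentumBonus;  // Family 1 bias
        const bool better = best < 0 || s > bestScore;
        // Exact tie: the active goal wins; otherwise the earlier declaration stays.
        const bool tieToActive = best >= 0 && s == bestScore && idx == activeIndex;
        if (better || tieToActive) {
            best = idx;
            bestScore = s;
        }
    }
    return best;
}
```

With scores 1.0 (active) and 1.1, the active goal is ranked at 1.15 and keeps the slot; a challenger at 1.2 clears the margin and wins. With both scores at 0 the bonus has no effect and the active-goal tiebreak decides.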

Family 2: Latched fact derivation (fact-derivation layer)

Schmitt-style hysteresis on the derivation of a bool fact from a scalar: LowHealth = true once health drops below 0.25, but stays true until health rises above 0.40. The dual-threshold band creates deliberate hysteresis in the derived state.

Ships in v0.31 as FIntentLatchedScalarFactConfig, attached to the schema as a sibling list (UFactSchemaAsset::LatchedBoolFacts). Each row maps one source scalar fact id to one output bool fact id with an EnterThreshold / ExitThreshold band. Latching runs inside FWorldState::SetScalar so every write to the source scalar drives the latch automatically — no extra ticking, no per-agent state on the sensor, no special code in the planner.

Canonical form: Enter <= Exit. The output is true when the source value is below Enter, and stays true until the source rises above Exit. The example above (LowHealth) is canonical.

Inverse form: Enter > Exit. "True when high." The output is true when the source rises above Enter, and stays true until the source falls below Exit. The validator reports this as an info/warning so the choice is explicit and not a typo.

Initial state. bInitialValue is what GetBool(OutputBoolFactId) returns until the first source-scalar write arrives. Without this, a just-spawned agent would read the bitset default (false) for the output bool regardless of where the source value started.
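The canonical form, the inverse form, and the initial value can be seen together in a small standalone latch. SchmittLatch is a hypothetical type written for this page; its constructor arguments mirror EnterThreshold, ExitThreshold, and bInitialValue, but nothing here is the plugin's code.

```cpp
// Hypothetical standalone latch; field names mirror the text, the type is illustrative.
struct SchmittLatch {
    double enterThreshold;
    double exitThreshold;
    bool value;  // starts at the configured initial value

    SchmittLatch(double enter, double exitAt, bool initialValue)
        : enterThreshold(enter), exitThreshold(exitAt), value(initialValue) {}

    // Feed one source sample; returns the latched output.
    bool Update(double source) {
        if (enterThreshold <= exitThreshold) {
            // Canonical form: true when low.
            if (source < enterThreshold) value = true;
            else if (source > exitThreshold) value = false;
        } else {
            // Inverse form: true when high.
            if (source > enterThreshold) value = true;
            else if (source < exitThreshold) value = false;
        }
        return value;  // inside the band the previous output is held
    }
};
```

For the LowHealth example, SchmittLatch(0.25, 0.40, false) turns true at a health of 0.2, stays true at 0.3 and 0.39, and only returns to false once health reaches a value above 0.40.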

Family 3: Time-domain (execution layer)

A minimum commit time on every action: once dispatched, an action cannot be preempted for at least MinActionCommitTime seconds (default 0.3s) unless the candidate has a higher EIntentInterruptionLevel. The interruption-level gate is the emergency bypass — a Critical-level goal can preempt anything regardless of how recently the current action started.

Already shipping (since v0.27). No changes in v0.30.
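The commit gate reduces to one comparison plus the interruption-level bypass. The sketch below is an assumption-laden model: InterruptionLevel stands in for EIntentInterruptionLevel (only Critical is named in the text, the other enumerators are invented for illustration), and RunningAction and CommitGateAllows are hypothetical names.

```cpp
// Hypothetical levels; the text names only Critical explicitly.
enum class InterruptionLevel { Low = 0, Normal = 1, High = 2, Critical = 3 };

struct RunningAction {
    double elapsedSeconds;    // time since the action was dispatched
    InterruptionLevel level;  // level of the goal that owns the action
};

// True if the commit gate lets a candidate preempt the running action now.
bool CommitGateAllows(const RunningAction& current, InterruptionLevel candidate,
                      double minActionCommitTime = 0.3) {
    // Once the commit window has passed, the gate no longer blocks anything.
    if (current.elapsedSeconds >= minActionCommitTime) return true;
    // Inside the window, only a strictly higher level gets through.
    return static_cast<int>(candidate) > static_cast<int>(current.level);
}
```

An action that started 0.1 s ago under a Normal goal blocks another Normal candidate but yields to a Critical one; after 0.35 s the gate is open to both.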

Family 4: Signal smoothing (fact-derivation layer)

An exponential moving average on a noisy raw sample before it becomes a fact: smoothed_n = Alpha * raw_n + (1 - Alpha) * smoothed_{n-1}. The smoothed value is what the rest of the planner sees.

Ships in v0.31 as FIntentScalarFilterConfig, attached inline to an FFactSchemaEntry of Type == Scalar. The filter applies inside FWorldState::SetScalar, so every producer benefits (sensors, gameplay events, blackboard bridges, BatchMutate) without each opting in by inheritance.

First-sample passthrough. The first SetScalar for a fact stores the raw value (no smoothing). Subsequent writes apply EMA. This avoids a long warm-up where the smoothed value crawls up from zero.

Alpha tuning. Alpha = 1.0 is identity (passthrough). Alpha = 0.05 is heavy lag (slow tracking). The default 0.25 is a reasonable starting point for distance-style signals. The validator warns if Alpha is outside (0, 1].

Suppression in search. EMA does not run inside the planner's A* expansion. Each search node has bSuppressDerivation = true, so action effects that move a scalar produce deterministic outputs the search can hash and compare. The live runtime never sets that flag.
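To get a feel for what a given Alpha costs in lag, it helps to count writes rather than guess. After a step change, the distance still to cover shrinks by a factor of (1 - Alpha) per write, so the number of writes needed to cover a given fraction of the step follows from a logarithm. The sketch below is illustrative only; EmaFilter and WritesToSettle are hypothetical helpers, not part of the plugin.

```cpp
#include <cmath>

// Hypothetical standalone EMA mirroring the rules in the text.
struct EmaFilter {
    double alpha;
    bool hasSample = false;
    double smoothed = 0.0;

    explicit EmaFilter(double a) : alpha(a) {}

    double Push(double raw) {
        if (!hasSample) {
            smoothed = raw;  // first-sample passthrough
            hasSample = true;
        } else {
            smoothed = alpha * raw + (1.0 - alpha) * smoothed;
        }
        return smoothed;
    }
};

// Writes needed after a step change before the filter has covered `fraction`
// of the step. Uses the residual (1 - alpha)^n; expects 0 < alpha <= 1 and
// 0 < fraction < 1.
int WritesToSettle(double alpha, double fraction) {
    if (alpha >= 1.0) return 1;  // identity filter lands on the first write
    return static_cast<int>(
        std::ceil(std::log(1.0 - fraction) / std::log(1.0 - alpha)));
}
```

Under this model the default Alpha = 0.25 covers 90% of a step in 9 writes, while Alpha = 0.05 needs 45. For a producer that writes every 0.1 s, that is roughly 0.9 s versus 4.5 s of lag.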

Family 5: Structural commitment (plan-shape layer)

The shape of the plan itself encodes commitment:

Short plans: the planner's MaxPlanDepth cap keeps plans atomic enough that "the next replan will pick something different" can't span across weeks of in-fiction time.
Plan reuse: when the new plan's first action matches the in-flight action, the executor keeps running; only the tail of the plan rotates. See bSameFirstStep in UIntentForgeComponent::AdoptNewPlan.
Executor lifecycle: an executor runs to terminal status before the component touches it, regardless of how many replans fire underneath.

Already shipping. No changes in v0.30.
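The plan-reuse rule can be pictured with a toy runner. PlanRunner and its members are hypothetical, and the sketch simplifies heavily: it only counts how often a new executor would have to start, and it ignores the rule that a running executor reaches terminal status first. It is meant to show the same-first-step comparison, not how UIntentForgeComponent::AdoptNewPlan is written.

```cpp
#include <string>
#include <vector>

// Hypothetical executor bookkeeping; the real component tracks far more.
struct PlanRunner {
    std::vector<std::string> plan;  // action ids, front is in flight
    int executorStarts = 0;

    // Adopt a freshly planned sequence. If its first step matches the
    // in-flight action, the executor keeps running and only the tail changes.
    void AdoptNewPlan(const std::vector<std::string>& newPlan) {
        const bool sameFirstStep = !plan.empty() && !newPlan.empty() &&
                                   plan.front() == newPlan.front();
        plan = newPlan;
        if (!sameFirstStep && !newPlan.empty()) ++executorStarts;
    }
};
```

Adopting {MoveTo, Attack} and then {MoveTo, Flee} starts one executor, not two: the in-flight MoveTo is untouched and only the tail rotates.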

Why Families 2 and 4 are bundled

Both are fact-derivation policies. Bundling them into a single v0.31 release rather than dribbling them out gave us:

One schema-side API change. A scalar fact in v0.31 gains both an optional filter config and an optional latched-derivation config in a single schema-asset extension. Splitting the work over two releases would have meant churning the schema editor twice.
The double-smoothing footgun caught by construction. A heavily smoothed scalar feeding into a tight Schmitt band can oscillate inside the band forever and never flip the derived bool. The v0.31 validator emits a warning (case 10) when the filter's Alpha < 0.05 AND the latch band |Enter - Exit| < 0.05 — both halves live together in the same code path so the cross-check is cheap.
Preconditions stay pure. An FPrecondition_ScalarSchmitt would need per-agent state and would break the planner's "preconditions are pure functions of world state" invariant. Pushing both EMA and hysteresis into fact derivation means everything downstream of the schema is unchanged.

Validator coverage (v0.31)

The archetype validator (UIntentForgeArchetypeValidator) catches the ten most common authoring footguns. Numbered by severity-and-frequency:

#	Condition	Level
1	Filter `Alpha` outside `(0, 1]` on an enabled filter	Warning
2	Filter enabled on a non-Scalar fact	Error
3	Latch row references a `SourceScalarFactId` that isn't Scalar	Error
4	Latch row references an `OutputBoolFactId` that isn't Bool	Error
5	Two latch rows share the same `OutputBoolFactId`	Error
6	`EnterThreshold == ExitThreshold` (no hysteresis)	Warning
7	`EnterThreshold > ExitThreshold` (inverse "true when high")	Warning¹
8	Latch output also written directly by an action effect	Warning
9	Latch output's `bTriggersReplan == false`	Warning¹
10	Heavy filter (`Alpha < 0.05`) + tight band (`|Enter - Exit| < 0.05`)	Warning

¹ Cases 7 and 9 are conceptually "informational" (the configuration is legal, just worth a sanity check). UEditorValidatorBase only exposes AssetPasses / AssetWarning / AssetFails — there is no Info severity in UE 5.7 — so these surface as Warnings.
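The numeric checks in the table (cases 1, 6, 7, and 10) are simple enough to state as code. The sketch below is not UIntentForgeArchetypeValidator; FilterRow, LatchRow, and CheckScalarWithLatch are hypothetical names, and it assumes case 10 only applies when the filter is enabled, which the table does not state explicitly.

```cpp
#include <cmath>
#include <vector>

// Hypothetical, trimmed-down views of one filter config and one latch row.
struct FilterRow {
    bool enabled;
    double alpha;
};

struct LatchRow {
    double enterThreshold;
    double exitThreshold;
};

// Returns the table's case numbers that fire for one filtered scalar feeding one latch.
std::vector<int> CheckScalarWithLatch(const FilterRow& f, const LatchRow& l) {
    std::vector<int> cases;
    // Case 1: Alpha must lie in (0, 1] on an enabled filter.
    if (f.enabled && !(f.alpha > 0.0 && f.alpha <= 1.0)) cases.push_back(1);
    // Case 6: equal thresholds leave no hysteresis band.
    if (l.enterThreshold == l.exitThreshold) cases.push_back(6);
    // Case 7: inverse "true when high" form, flagged so it is a deliberate choice.
    if (l.enterThreshold > l.exitThreshold) cases.push_back(7);
    // Case 10: heavy filter combined with a tight band (enabled filter assumed).
    if (f.enabled && f.alpha < 0.05 &&
        std::fabs(l.enterThreshold - l.exitThreshold) < 0.05) {
        cases.push_back(10);
    }
    return cases;
}
```

A filter with Alpha = 0.25 feeding the 0.25 / 0.40 LowHealth band passes cleanly; Alpha = 0.02 feeding a 0.25 / 0.27 band reports case 10.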

Recommended starting composition (v0.31)

For a new project on v0.31, start with the defaults and add fact-layer policies only where the signal is actually noisy:

GoalMomentumBonus = 0.15 (component default — Family 1)
MinActionCommitTime = 0.3s (component default — Family 3)
Plan reuse (automatic — Family 5)
Add FIntentScalarFilterConfig with Alpha = 0.25 to scalar facts whose producers are noisy: raw distances, sensor readings, gameplay events that fire in bursts (Family 4).
Add an FIntentLatchedScalarFactConfig row to the schema where a derived bool oscillates near a threshold (PlayerNear, LowHealth, CanSeeTarget, etc.). Pick a band wide enough that residual filter jitter cannot push the source value across both thresholds (Family 2).

Order of operations when tuning a flap problem: try Family 1 first (one component knob), then Family 4 on the source scalar (one schema knob), then Family 2 to give the derived bool its own committed band (one schema row). If you still see flap after all three, the issue is almost certainly in the goal-scoring functions themselves, not the toolkit.

Worked example: Patrol/Chase

AIntentForgePatrolChaseExample (in IntentForgeExamples) is the single-actor demonstration of the full layered stack. Drop one in any level, hit Play.

The agent has two equally-scored goals:

Patrol — desired PlayerNear = false
Chase — desired PlayerNear = true

A UPatrolChaseDistanceSensor samples the distance to the player every 0.1s. Two actor toggles control how the signal flows from the sensor to the planner:

Toggle	Off (v0.30 baseline)	On (v0.31 demo)
`bEnableFactFiltering`	No EMA — raw distance is used.	Distance fact is smoothed with `DistanceFilterAlpha`.
`bEnableLatchedBool`	Sensor writes `PlayerNear` direct.	Schmitt latch derives `PlayerNear` from distance.

To observe pure flap (every safety layer off): set GoalMomentumBonus = 0 on the AgentComponent, leave both toggles off. Stand on the threshold and let small position jitter cross it. The on-screen messages show Goal changed: Patrol -> Chase and back every replan tick — that's signal-layer flap and selection-layer flap stacking.

To observe selection-layer stability only (Family 1): set GoalMomentumBonus = 0.15, leave both toggles off. The agent commits to whichever goal it picked first across small jitter, but a single clean crossing of the threshold still flips it.

To observe signal-layer smoothing (Family 4): enable bEnableFactFiltering. The raw distance is EMA'd before any consumer sees it; transient spikes get absorbed. The Live Inspector's "Derived Facts" panel shows raw and smoothed side by side.

To observe symbolic-state hysteresis (Family 2): enable bEnableLatchedBool. Even if the smoothed distance walks across the midpoint of the band, the derived PlayerNear only flips when the smoothed value clears the other side of the band. The agent stays committed to one goal across the entire band, which is the point.

The Live Inspector's Replan History tab still shows the Margin column (WinningScore - RunnerUpScore) with a * flag on rows where the momentum bonus did the work. The Derived Facts panel beneath the World State panel shows current raw / smoothed / latched values live, so you can watch the signal walk across each layer.
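The effect of the two toggles can be approximated offline by counting how often the derived PlayerNear bool changes over a run of distance samples. The function below is a hypothetical sketch, not code from AIntentForgePatrolChaseExample: the threshold of 5.0, the 4.5 / 5.5 band, and the sample values are all made-up numbers chosen for illustration.

```cpp
#include <cstddef>
#include <vector>

// Counts changes of the derived PlayerNear bool over a run of distance samples.
// All constants are illustrative assumptions, not the example actor's values.
int CountPlayerNearFlips(const std::vector<double>& distances,
                         bool enableFactFiltering, bool enableLatchedBool,
                         double alpha = 0.25, double threshold = 5.0,
                         double enterThreshold = 4.5, double exitThreshold = 5.5) {
    int flips = 0;
    bool playerNear = false;
    double smoothed = 0.0;
    for (std::size_t i = 0; i < distances.size(); ++i) {
        // Family 4: EMA with first-sample passthrough.
        double d = distances[i];
        if (enableFactFiltering && i > 0) d = alpha * d + (1.0 - alpha) * smoothed;
        smoothed = d;
        // Family 2: Schmitt latch, or a single-threshold compare when disabled.
        bool next = playerNear;
        if (enableLatchedBool) {
            if (d < enterThreshold) next = true;
            else if (d > exitThreshold) next = false;
        } else {
            next = d < threshold;
        }
        if (i > 0 && next != playerNear) ++flips;  // the first sample only seeds the state
        playerNear = next;
    }
    return flips;
}
```

Feeding it 40 samples that alternate between 4.9 and 5.1 gives a flip on every sample with both toggles off and no flips at all once the latch is on. With filtering alone the flips are delayed and reduced but, in this model, do not disappear, because the residual ripple still straddles the single threshold. That matches the advice to size the band so residual filter jitter cannot cross both thresholds.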

What we deliberately do NOT ship (and why)

Hysteresis decay over time. A bonus that scales down based on how long the active goal has been running. Adds tuning surface area for marginal benefit; the same effect is better expressed by sizing the per-action commit time. Rejected.
Per-goal commit time. A separate MinCommitTime per goal. Same problem — one global knob is easier to reason about than N. If you have one outlier goal that must commit longer, use the goal's InterruptionLevel + a custom executor with a built-in minimum runtime. Rejected.
Soft-max / probabilistic goal selection. Replace argmax with a temperature-scaled sample. Solves flap by intentionally introducing exploration noise, which is the opposite of what gameplay wants. Rejected.
Additive switching cost alongside multiplicative bonus. Score-domain biasing already covers the design space; adding a second knob in the same domain doubles the tuning load without adding new behavior. Rejected.
EMA at a sensor base class. Smoothing is a fact-derivation policy, not a sensor-inheritance concern. v0.31 puts it on the fact. Rejected at this layer.
Time-decaying EMA / second time constant. A separate "fast" and "slow" EMA that share state. Adds a second knob for marginal benefit; the One Euro filter is the well-known shape for that idea and we rejected One Euro itself (below). Rejected.
Boxcar / median / One Euro filter alternatives. A flat suite of filter shapes parameterized by an enum. Adds taxonomy surface area without a concrete need. EMA, itself a one-pole low-pass filter, covers the common case; we will add an alternative filter shape only if a user reports a signal that cannot be tuned with EMA. Rejected for v0.31.
Multi-source latched facts (OR / AND of N sources). A single derived bool driven by combining multiple scalars. Solvable by adding one latch per source plus a precondition that OR's their outputs. Rejected as built-in.
Per-frame ticking of latch state. The latch fires inside SetScalar only, not on a timer. If the source value doesn't change, the latch state doesn't change — by design. Rejected (would defeat the purpose of "stateful derivation lives next to the fact it derives from").
bResetOnReplan for latch state. Resetting latched bools on every replan would invalidate the hysteresis band on the very edge where it matters. The latch's job is to survive replans. Rejected.
Custom alpha curves (non-EMA). Arbitrary smoothing functions authored by users. Saturates the configuration surface; if you need non-EMA smoothing, write a sensor that does the smoothing and writes the smoothed scalar. Rejected.