Fires the outgoing flow on the FIRST completing branch, then blocks until ALL branches from that cycle complete before it can fire again. Prevents duplicate triggering within a cycle while ensuring full resource cleanup before the next request is accepted.
A redundant inference system dispatches each request to three model endpoints simultaneously — a primary cluster, a warm standby, and a cold fallback. The first response activates the output path and is delivered to the caller. The discriminator fires once and blocks re-firing until all three endpoint calls have returned (with a result, a timeout, or an error), ensuring that in-flight connections are tracked and released before the next request enters the cycle.
The key insight: delivering the first response is not enough. If the discriminator re-fires immediately on the next incoming request, branches from the previous cycle may still hold open TCP connections, reserved GPU memory, or rate-limit budget. The blocking discriminator treats cycle isolation as a hard constraint — accepting a small latency penalty (waiting for stragglers) in exchange for deterministic resource accounting. Systems where GPU memory or API quotas are finite cannot afford cross-cycle branch leakage.
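The two-phase behaviour — fire on the first completion, then hold the cycle open until every branch has returned — can be sketched with `asyncio`. The endpoint stubs and their latencies are illustrative, not part of the pattern:

```python
import asyncio

# Hypothetical endpoint stubs -- names and latencies are illustrative.
async def call_endpoint(name: str, latency: float) -> str:
    await asyncio.sleep(latency)
    return f"{name}-response"

async def blocking_discriminator(request_id: int) -> str:
    """One cycle: fan out, deliver the first response, then block
    until every branch has completed before the next cycle may start."""
    branches = [
        asyncio.create_task(call_endpoint("primary", 0.01)),
        asyncio.create_task(call_endpoint("standby", 0.05)),
        asyncio.create_task(call_endpoint("fallback", 0.09)),
    ]
    # OR-join: fire on the first completing branch.
    done, pending = await asyncio.wait(branches, return_when=asyncio.FIRST_COMPLETED)
    winner = done.pop().result()
    # AND-join: do NOT cancel the losers -- await all of them, so the
    # next cycle starts with every connection and resource released.
    await asyncio.gather(*pending, return_exceptions=True)
    return winner

print(asyncio.run(blocking_discriminator(1)))  # fastest branch wins
```

The key line is the `gather` over `pending`: a cancelling discriminator would call `task.cancel()` there instead, which is exactly the operation this pattern assumes is unsafe.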
| Metric | Signal |
|---|---|
| First-response latency | User-facing quality signal — the OR-join fires at the fastest endpoint's completion time |
| Cycle block duration | Time from first-response fire to AND-join completion — quantifies the blocking overhead per cycle |
| Straggler timeout rate | Fraction of cycles where at least one branch hits the timeout threshold — indicates endpoint reliability |
| Effective throughput | Requests processed per second — bounded by the slowest branch in each cycle, not the fastest |
| Node | What it does | What it receives | What it produces |
|---|---|---|---|
| Dispatch Request | Fans the inference request out to all three model endpoints simultaneously via AND-split | Single inference request | Three concurrent endpoint activations |
| Model Endpoint A/B/C | Each endpoint executes the inference request independently and returns a response token | Inference prompt + parameters | Model response + completion signal |
| First Wins (OR-join) | Fires on the first arriving response, ignores subsequent arrivals for this cycle. Passes the winning response forward. | First arriving endpoint response | Selected response for delivery |
| Deliver Response | Sends the winning response to the caller. Executes independently of the cleanup path. | First response from OR-join | Response to caller |
| Wait All Done (AND-join) | Waits for all three endpoints to complete (success or error). Only when this gate clears is the discriminator unblocked for the next cycle. | Completion signals from all three endpoints | Cycle-complete signal |
| Origin of Value | Where it appears | How it is captured |
|---|---|---|
| Future Cashflow | Delivery latency (first-response path) | The winner's latency is the user-facing metric. The blocking overhead is invisible to the caller — it occurs after delivery. |
| Governance | Wait All Done gate | The AND-join enforces cycle isolation as a hard governance rule. No request enters the next cycle until the current one is fully accounted for. |
| Conditional Action | Each endpoint branch | All three branches consume compute. The two losing branches produce no user-facing value — their cost is pure redundancy insurance. |
| Risk Exposure | AND-join (Wait All Done) | A straggler branch that never completes blocks the entire discriminator indefinitely. Timeouts on each endpoint are mandatory, not optional. |
Contrast with 20.25. The blocking discriminator pays a latency tax (waiting for stragglers) to guarantee resource cleanup. 20.25 avoids this tax by cancelling remaining branches — but cancellation is not always possible (e.g., committed database writes, charged API calls). Choose 20.24 when "cancel" is not a safe operation on the branch workload.
One endpoint is slow or unresponsive. The AND-join never fires because the third completion signal never arrives. The discriminator is blocked and no further requests are processed. Fix: enforce a maximum branch timeout at the endpoint level. After the timeout elapses, the endpoint emits a failure token to the cleanup join, unblocking the cycle regardless of model state.
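One way to realise this fix is to wrap each endpoint call in a hard deadline and convert the timeout into a synthetic failure token, so the AND-join always receives three completions. A sketch, with a deliberately stuck endpoint:

```python
import asyncio

async def endpoint(name: str, delay: float) -> str:
    await asyncio.sleep(delay)
    return f"{name}-ok"

async def guarded_branch(name: str, delay: float, deadline: float) -> str:
    """Enforce a maximum branch timeout. On expiry, emit a failure
    token instead of raising, so the cleanup join can still clear."""
    try:
        return await asyncio.wait_for(endpoint(name, delay), timeout=deadline)
    except asyncio.TimeoutError:
        return f"{name}-timeout"   # synthetic completion token

async def cycle() -> list[str]:
    # 'stuck' simulates an unresponsive endpoint; its deadline unblocks it.
    return await asyncio.gather(
        guarded_branch("a", 0.01, 0.1),
        guarded_branch("stuck", 10.0, 0.05),
    )

print(asyncio.run(cycle()))  # ['a-ok', 'stuck-timeout']
```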
The Deliver Response step retries due to a transient caller error. The discriminator has already fired and blocked — a second delivery attempt from the same cycle sends the result twice. Fix: make delivery idempotent keyed on a cycle ID. The OR-join assigns a cycle ID at fire time; downstream delivery checks this key before writing.
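A minimal idempotent-delivery sketch: the cycle ID assigned at OR-join fire time acts as the idempotency key, and retries from the same cycle are suppressed. All names here are illustrative:

```python
delivered: set[str] = set()   # stand-in for a durable idempotency store
outbox: list[str] = []        # stand-in for the real send channel

def deliver(cycle_id: str, response: str) -> bool:
    """Send the response at most once per cycle. Returns True only on
    the attempt that actually delivered."""
    if cycle_id in delivered:
        return False          # retry from the same cycle: no-op
    delivered.add(cycle_id)
    outbox.append(response)
    return True

assert deliver("cycle-42", "winner") is True
assert deliver("cycle-42", "winner") is False   # transient retry, no double send
assert outbox == ["winner"]
```

In a real deployment the `delivered` set would live in durable storage shared by retrying workers; an in-process set only protects against retries within one process.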
The Deliver Response and Wait All Done paths both terminate at the end event. If the process engine treats the first end-event arrival as process completion, the cleanup path may be orphaned and resource release is skipped. Fix: route both paths to an explicit terminal merge node that requires both signals before marking the process instance closed.
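The terminal merge is itself a small AND-join over the two end signals. A sketch of an engine that refuses to close the instance until both paths have signalled:

```python
import asyncio

async def terminal_merge() -> str:
    """Close the process instance only when BOTH the delivery path and
    the cleanup path have signalled -- an AND-join at the end event."""
    delivery_done = asyncio.Event()
    cleanup_done = asyncio.Event()

    async def delivery_path():
        await asyncio.sleep(0.01)   # deliver winning response
        delivery_done.set()

    async def cleanup_path():
        await asyncio.sleep(0.05)   # wait out straggler branches
        cleanup_done.set()

    await asyncio.gather(delivery_path(), cleanup_path())
    # Neither signal alone marks the instance closed.
    assert delivery_done.is_set() and cleanup_done.is_set()
    return "closed"

print(asyncio.run(terminal_merge()))  # closed
```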
| Variant | Modification | When to use |
|---|---|---|
| Timeout-Enforced Blocking | Each branch has a hard deadline; timeout emits a synthetic completion token to the AND-join | Endpoints have variable latency; unbounded blocking is not acceptable but cancellation is not safe |
| Blocking Discriminator with Audit Log | All branch responses (winners and losers) are written to an audit log before AND-join fires | Compliance or model comparison requires recording all endpoint outputs, not just the winner |
| Weighted Winner Selection | OR-join scores responses by quality signal; highest-scoring early arrival wins, not purely first-arrival | Response quality varies by endpoint; raw latency is not a sufficient selection criterion |
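The weighted-winner variant can be sketched by holding a short selection window after the first arrival, then scoring whatever landed inside it. Endpoint names, delays, and quality scores below are invented for illustration:

```python
import asyncio

async def scored_endpoint(name: str, delay: float, quality: float):
    await asyncio.sleep(delay)
    return name, quality

async def weighted_winner(window: float) -> str:
    """Collect every response arriving within `window` of the first,
    then pick the highest quality score rather than raw first arrival."""
    tasks = [
        asyncio.create_task(scored_endpoint("fast-but-rough", 0.01, 0.6)),
        asyncio.create_task(scored_endpoint("slightly-slower", 0.02, 0.9)),
        asyncio.create_task(scored_endpoint("slow", 0.2, 0.95)),
    ]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    # Let near-ties land inside the selection window.
    more, pending = await asyncio.wait(pending, timeout=window)
    candidates = [t.result() for t in done | more]
    winner = max(candidates, key=lambda r: r[1])
    # Still a blocking discriminator: wait for the remaining branches.
    await asyncio.gather(*pending)
    return winner[0]

print(asyncio.run(weighted_winner(0.05)))
```

Note that the window trades latency for quality on top of the blocking overhead; the straggler is still awaited either way.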
| Pattern | Relationship |
|---|---|
| 40.45 Cancelling Discriminator | Drops the cleanup wait by cancelling remaining branches — lower latency overhead, but requires cancellable branches |
| 40.41 Multi-Choice | Selectively activates branches rather than all-or-nothing; pair with a discriminator when using multi-choice fan-out |
| 40.48 Generalised AND-Join | The Wait All Done gate is a fixed-N AND-join; use 40.48 when branch count is determined at runtime |
| 20.24 Competitive Evaluation | A higher-level pattern that uses discriminator logic to select the best output from competing agents |