30.38 Static Partial Join for Multiple Instances

Multiple instances of the same task run in parallel. Once a fixed N of the M instances have completed (both N and M are set at design time), the next task is triggered. The remaining instances run to completion but do not block the downstream step.


Motivating Scenario

A document translation system processes a batch of 10 documents by running 10 parallel translator agents — one per document. The review phase begins as soon as 7 of the 10 translations are complete. The remaining 3 translators continue running and will append their results to the batch when they finish, but they do not delay the review process. Both M=10 (total instances) and N=7 (threshold to proceed) are specified at design time in the workflow definition.

The key insight: this pattern applies to multiple instances of the same task, not to multiple distinct branches. The join therefore tracks completions from homogeneous instances of a single task node rather than from heterogeneous branches. Because M and N are known before the workflow is even compiled, the static variant is simpler and more auditable than dynamic alternatives. For batch translation, tolerating a 30% tail is an explicit engineering decision: given the expected spread of translator runtimes, waiting for all 10 is hard to justify when 7 provide sufficient coverage for review.
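To make "specified at design time" concrete, the following minimal sketch pins M and N as immutable constants and validates them once. The class name `StaticPartialJoinSpec`, its field names, and the `tail_tolerance` helper are hypothetical, chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StaticPartialJoinSpec:
    """Design-time constants for an N-of-M join (hypothetical name)."""

    total_instances: int  # M
    threshold: int        # N

    def __post_init__(self) -> None:
        # Reject definitions that could never fire.
        if self.total_instances < 1:
            raise ValueError("M must be at least 1")
        if not 1 <= self.threshold <= self.total_instances:
            raise ValueError("N must satisfy 1 <= N <= M")

    @property
    def tail_tolerance(self) -> float:
        """Fraction of instances the downstream step does not wait for."""
        return (self.total_instances - self.threshold) / self.total_instances


# The translation scenario: M=10 instances, review starts after N=7.
TRANSLATION_BATCH = StaticPartialJoinSpec(total_instances=10, threshold=7)
```

Freezing the dataclass mirrors the static qualifier: nothing at execution time can change either value, so the policy can be reviewed by reading the definition alone.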

Structure

[Diagram: concrete example, 7-of-10 translation batch review trigger]

Key Metrics

| Metric | Signal |
| --- | --- |
| Threshold fire latency | Time from batch dispatch to the N-th completion; the primary latency signal for this pattern |
| Finalize completion rate | Fraction of batches where all M instances complete; tracks archival record integrity |
| Quality delta (N vs M) | Difference in output quality between using N completions and using all M; validates whether the remaining instances would have changed the review outcome |
| Instance completion time distribution | Variance across the M instances; high variance justifies a partial join, low variance reduces its benefit |

| Node | What it does | What it receives | What it produces |
| --- | --- | --- | --- |
| Dispatch Translators | Spawns 10 translator instances simultaneously. Passes each instance a single document from the batch. | Document batch (10 documents) | 10 parallel translator activations |
| Translator Agent (×10) | Translates the assigned document and produces a structured translation artifact. Multiple instances of the same task definition. | Single document + target language config | Translated document artifact |
| Start Review (partial join, 7-of-10) | Fires when 7 translation completions arrive. Passes the 7 completed artifacts to review. Remaining 3 instances continue running independently. | 7 translation artifacts (first arriving) | 7-artifact set for review |
| Finalize All (AND-join) | Waits for the remaining 3 instances to complete. Merges all 10 translations into the final batch record. | Completion signals from remaining 3 instances | Complete 10-document batch record |
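The four nodes can be sketched with Python's `asyncio`. This is an illustration under stated assumptions, not a reference implementation: `translate`, `review`, and `run_batch` are hypothetical stand-ins, and translator runtime is simulated with a sleep.

```python
import asyncio

M, N = 10, 7  # design-time constants from the scenario


async def translate(doc_id: int, delay: float) -> dict:
    """Stand-in for one translator instance; the delay simulates its runtime."""
    await asyncio.sleep(delay)
    return {"doc_id": doc_id, "text": f"translated-{doc_id}"}


async def review(artifacts: list[dict]) -> list[int]:
    """Stand-in for the review phase; returns the ids it was given."""
    await asyncio.sleep(0)
    return [a["doc_id"] for a in artifacts]


async def run_batch(delays: list[float]) -> tuple[list[int], list[dict]]:
    if len(delays) != M:
        raise ValueError(f"expected {M} delays, got {len(delays)}")

    # Dispatch Translators: spawn all M instances at once.
    tasks = [asyncio.create_task(translate(i, d)) for i, d in enumerate(delays)]

    completed: list[dict] = []
    review_task = None
    for count, next_done in enumerate(asyncio.as_completed(tasks), start=1):
        completed.append(await next_done)
        if count == N:
            # Start Review: fires on the N-th completion with the first N
            # artifacts. The loop keeps collecting the remaining instances.
            review_task = asyncio.create_task(review(list(completed)))

    # Finalize All: the loop has now seen all M completions.
    reviewed_ids = await review_task
    batch_record = sorted(completed, key=lambda a: a["doc_id"])
    return reviewed_ids, batch_record
```

Calling `asyncio.run(run_batch(delays))` with ten delays returns the ids that reached review and the full ten-document record. Review is started as its own task so the remaining translators never wait on it, and it never waits on them.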

When to Use

Use when
Avoid when

Value Profile

| Origin of Value | Where it appears | How it is captured |
| --- | --- | --- |
| Future Cashflow | Review phase start latency | Review begins at the 7th completion, not the 10th. If the 7th completion arrives at 60% of the time the 10th would take, review starts 40% earlier than a full AND-join would allow. |
| Governance | Threshold N (design-time constant) | N is a design-time policy decision: what fraction of instances is sufficient for downstream quality? Encoding N in the workflow definition makes the policy visible and auditable. |
| Conditional Action | All 10 translator instances | All 10 instances run to completion and consume compute. The 3 post-threshold completions do not improve review quality but produce records. If records are not needed, 30.39 (cancelling) is more efficient. |
| Risk Exposure | Review quality at threshold | Review operates on N=7 results. If the 3 remaining translations contain critical errors that the reviewed 7 did not, the review process missed them. Threshold N should be validated against the distribution of translation quality across instances. |

Design-time vs. runtime specification. The "static" qualifier means both M and N are baked into the workflow definition rather than computed at execution time. This is the simplest form of partial join for MI tasks and should be the default choice when the batch size is known. The dynamic variant (30.40) adds complexity that is only justified when parameters genuinely vary per execution.
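The latency benefit can be checked with simple arithmetic on observed completion times. The sketch below is illustrative: the function name `join_latencies` and the sample runtimes are invented, not measurements from a real batch.

```python
def join_latencies(completion_times: list[float], n: int) -> tuple[float, float, float]:
    """Return (threshold fire latency, full AND-join latency, fraction of wait saved)."""
    if not 1 <= n <= len(completion_times):
        raise ValueError("n must satisfy 1 <= n <= len(completion_times)")
    ordered = sorted(completion_times)
    if ordered[0] <= 0:
        raise ValueError("completion times must be positive")
    fire_at = ordered[n - 1]   # time of the N-th completion
    all_done_at = ordered[-1]  # time of the M-th completion
    saved = (all_done_at - fire_at) / all_done_at
    return fire_at, all_done_at, saved


# Hypothetical per-instance runtimes in seconds, measured from batch dispatch.
times = [30, 35, 40, 42, 45, 50, 60, 80, 90, 100]
fire_at, all_done_at, saved = join_latencies(times, n=7)
```

With these invented numbers the 7th completion arrives at 60 s and the 10th at 100 s, so review starts 40% earlier than a full AND-join would allow. Running the same calculation over historical batches is one way to judge whether a chosen N is worth the quality risk.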

Dynamics and Failure Modes

Threshold N fires on low-quality early completers

The 7 fastest translator instances are those processing short or simple documents. The 3 slower instances handle complex documents where translation quality is most critical. The review phase fires before the most important translations are available. Fix: weight instances by document complexity when tracking completions, or require that the threshold set includes at least K high-complexity documents before firing.
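One way to express the second fix is a join that counts completions and high-complexity completions separately. This is a sketch under assumptions: the class `ComplexityAwareJoin`, the boolean `high_complexity` flag, and the way documents are classified are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ComplexityAwareJoin:
    """Fires once N instances are done AND at least K of them are high-complexity."""

    threshold: int            # N
    min_high_complexity: int  # K
    completed: int = 0
    high_complexity_completed: int = 0
    fired: bool = False

    def record_completion(self, high_complexity: bool) -> bool:
        """Record one completion; return True only on the completion that fires the join."""
        self.completed += 1
        if high_complexity:
            self.high_complexity_completed += 1
        ready = (
            self.completed >= self.threshold
            and self.high_complexity_completed >= self.min_high_complexity
        )
        if ready and not self.fired:
            self.fired = True
            return True
        return False
```

With N=7 and K=2, seven simple documents finishing first do not start review; the join fires on the completion that brings the high-complexity count to 2. If a batch contains fewer than K high-complexity documents the join never fires, so K needs the same design-time validation as N.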

Post-threshold instances never complete

After the review phase starts, 2 of the remaining 3 translators hang indefinitely. The Finalize All AND-join never receives their completion tokens. The batch record is never closed. Fix: enforce a timeout on every translator instance. Timed-out instances emit a failure artifact to the finalize join. The batch record is closed with a "partial — 2 instances timed out" status rather than blocking indefinitely.
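The timeout fix might look like the following `asyncio` sketch. The helper names `run_with_timeout` and `close_batch`, the artifact fields, and the status strings are assumptions made for illustration.

```python
import asyncio
from collections.abc import Awaitable


async def run_with_timeout(doc_id: int, work: Awaitable[str], timeout_s: float) -> dict:
    """Run one translator instance; convert a timeout into a failure artifact."""
    try:
        text = await asyncio.wait_for(work, timeout=timeout_s)
        return {"doc_id": doc_id, "status": "ok", "text": text}
    except asyncio.TimeoutError:
        # The finalize join still receives a token, so it cannot block forever.
        return {"doc_id": doc_id, "status": "timed_out", "text": None}


def close_batch(artifacts: list[dict]) -> dict:
    """Close the batch record, marking it partial if any instance timed out."""
    timed_out = [a["doc_id"] for a in artifacts if a["status"] == "timed_out"]
    if timed_out:
        status = f"partial: {len(timed_out)} instance(s) timed out"
    else:
        status = "complete"
    return {"status": status, "timed_out": timed_out, "artifacts": artifacts}


async def fake_translator(delay: float) -> str:
    """Stand-in translator whose runtime is simulated with a sleep."""
    await asyncio.sleep(delay)
    return "translated"


async def demo() -> dict:
    results = await asyncio.gather(
        run_with_timeout(0, fake_translator(0.01), timeout_s=0.5),
        run_with_timeout(1, fake_translator(5.0), timeout_s=0.05),  # hangs past its budget
    )
    return close_batch(list(results))
```

Because every instance yields either a result or a failure artifact, the AND-join always sees M tokens and the batch record always closes.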

Review modifies assumptions about remaining instances

The review phase examines the 7 translations and identifies a systematic translation error (wrong glossary applied). The remaining 3 instances are still running with the same wrong glossary. Fix: add a branch from review back to a "cancel and rerun remaining" handler that can interrupt the in-flight instances when a systematic error is detected early.
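A "cancel and rerun remaining" handler could be sketched as below, assuming the in-flight instances are `asyncio` tasks keyed by document id. The function names, the `glossary` parameter, and the simulated delays are hypothetical.

```python
import asyncio


async def translate(doc_id: int, glossary: str, delay: float = 0.01) -> dict:
    """Stand-in translator; records which glossary it used."""
    await asyncio.sleep(delay)
    return {"doc_id": doc_id, "glossary": glossary}


async def cancel_and_rerun(in_flight: dict[int, asyncio.Task], glossary: str) -> list[dict]:
    """Cancel the in-flight instances, then rerun their documents with a corrected glossary."""
    for task in in_flight.values():
        task.cancel()
    # Wait for the cancellations to settle; return_exceptions=True keeps
    # CancelledError from propagating out of gather.
    await asyncio.gather(*in_flight.values(), return_exceptions=True)
    reruns = [translate(doc_id, glossary) for doc_id in in_flight]
    return list(await asyncio.gather(*reruns))


async def demo() -> list[dict]:
    # Three post-threshold instances still running with the wrong glossary.
    in_flight = {i: asyncio.create_task(translate(i, "wrong", delay=5.0)) for i in (7, 8, 9)}
    await asyncio.sleep(0)  # let the tasks start before review interrupts them
    return await cancel_and_rerun(in_flight, "corrected")
```

The sketch only covers the remaining instances. The 7 translations already reviewed used the same wrong glossary, so a real handler would likely need to rerun those as well.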

Variants

| Variant | Modification | When to use |
| --- | --- | --- |
| Quality-Gated Partial Join | Threshold fires when N instances complete AND their aggregate quality score exceeds a minimum | Early completers may be low quality; a raw count threshold is insufficient |
| Staggered Threshold | Two thresholds: N1 triggers a preliminary action; N2 triggers the main action; M unblocks the cycle | Downstream pipeline has multiple stages that can start progressively as more instances complete |
| Adaptive N | N is computed at runtime based on current instance quality distribution (e.g., stop when variance drops below threshold) | Quality-based stopping criteria are more meaningful than a fixed count; note that this transitions toward 30.40 |
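As an illustration of the Staggered Threshold variant, the sketch below reports which stage, if any, each completion triggers. The class name `StaggeredJoin` and the stage labels are invented for this example.

```python
from dataclasses import dataclass


@dataclass
class StaggeredJoin:
    """Counts completions and reports the stages triggered at N1, N2, and M."""

    n1: int  # preliminary threshold
    n2: int  # main threshold
    m: int   # total instances
    completed: int = 0

    def __post_init__(self) -> None:
        if not 1 <= self.n1 < self.n2 <= self.m:
            raise ValueError("thresholds must satisfy 1 <= N1 < N2 <= M")

    def record_completion(self) -> list[str]:
        """Record one completion and return the stages it triggers (possibly none)."""
        if self.completed >= self.m:
            raise RuntimeError("more completions than instances")
        self.completed += 1
        stages = []
        if self.completed == self.n1:
            stages.append("preliminary")
        if self.completed == self.n2:
            stages.append("main")
        if self.completed == self.m:
            stages.append("finalize")
        return stages
```

Returning a list rather than a single label covers the case N2 = M, where the main action and finalization are triggered by the same completion.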

Related Patterns

| Pattern | Relationship |
| --- | --- |
| 30.39 Cancelling Partial Join MI | Same static M and N, but remaining instances are cancelled after N complete; use when archival value is not needed |
| 30.40 Dynamic Partial Join MI | When M or N cannot be fixed at design time; runtime-determined thresholds |
| 20.26 Structured Partial Join | The heterogeneous branch variant: 20.26 operates on distinct branches; 30.38 operates on homogeneous instances of one task |
| 60.61 MI Without Synchronization | No join at all: instances write to a shared accumulator independently; use when even N-of-M synchronization is unnecessary |