30.38 Static Partial Join for Multiple Instances

Multiple concurrent instances of the same task run in parallel. Once N of the M instances have completed (both N and M are fixed at design time), the next task is triggered. The remaining M-N instances continue to completion but do not block the downstream step.

Motivating Scenario

A document translation system processes a batch of 10 documents by running 10 parallel translator agents — one per document. The review phase begins as soon as 7 of the 10 translations are complete. The remaining 3 translators continue running and will append their results to the batch when they finish, but they do not delay the review process. Both M=10 (total instances) and N=7 (threshold to proceed) are specified at design time in the workflow definition.

The key insight: this pattern applies to multiple instances of the same task, not to multiple distinct branches. The distinction matters for implementation, because the join must track completions from homogeneous instances of a single task node, not from heterogeneous branches. When M and N are known before the workflow runs, the static variant is simpler and more auditable than dynamic alternatives. For batch translation, tolerating a 30% tail (3 of 10 instances) is an explicit engineering decision: when translator runtimes vary widely, waiting for all 10 delays review for little gain if 7 translations already give sufficient coverage.

Structure

Figure: 7-of-10 translation batch review trigger (concrete example)

Key Metrics

Metric	Signal
Threshold fire latency	Time from batch dispatch to N-th completion — the primary latency signal for this pattern
Finalize completion rate	Fraction of batches where all M instances complete — tracks archival record integrity
Quality delta (N vs M)	Difference in output quality between using N completions vs all M — validates whether the remaining instances would have changed the review outcome
Instance completion time distribution	Variance across the M instances — high variance justifies partial join; low variance reduces its benefit
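
The timing-based metrics above can be derived from per-instance completion timestamps. The following is a minimal sketch, assuming each batch records a dispatch time and one completion timestamp per instance (or None for an instance that never finished); the function and field names are hypothetical. Quality delta is omitted because it needs quality scores that timestamps alone do not provide.

```python
from statistics import pvariance


def batch_metrics(dispatched_at: float, finished_at: list, n: int) -> dict:
    """Compute timing metrics for one batch.

    finished_at holds one completion timestamp per instance, or None if the
    instance never completed. All times are in seconds.
    """
    m = len(finished_at)
    done = sorted(t for t in finished_at if t is not None)

    # Threshold fire latency: dispatch to the N-th completion.
    fire_latency = done[n - 1] - dispatched_at if len(done) >= n else None

    # The batch is finalized only if all M instances completed.
    finalized = len(done) == m

    # Share of total batch runtime spent waiting on the M-N slowest instances.
    tail_fraction = None
    if finalized and fire_latency is not None:
        total = done[-1] - dispatched_at
        tail_fraction = (total - fire_latency) / total if total > 0 else 0.0

    durations = [t - dispatched_at for t in done]
    return {
        "threshold_fire_latency": fire_latency,
        "finalized": finalized,
        "tail_fraction": tail_fraction,
        "runtime_variance": pvariance(durations) if durations else None,
    }


def finalize_completion_rate(per_batch: list) -> float:
    """Fraction of batches in which all M instances completed."""
    return sum(1 for b in per_batch if b["finalized"]) / len(per_batch)
```

A high tail_fraction together with a high runtime_variance is the situation in which the partial join pays off; values near zero suggest a plain AND-join would cost little extra latency.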

Node	What it does	What it receives	What it produces
Dispatch Translators	Spawns 10 translator instances simultaneously. Passes each instance a single document from the batch.	Document batch (10 documents)	10 parallel translator activations
Translator Agent (×10)	Translates the assigned document and produces a structured translation artifact. Multiple instances of the same task definition.	Single document + target language config	Translated document artifact
Start Review (partial join, 7-of-10)	Fires when 7 translation completions arrive. Passes the 7 completed artifacts to review. Remaining 3 instances continue running independently.	7 translation artifacts (first arriving)	7-artifact set for review
Finalize All (AND-join)	Waits for the remaining 3 instances to complete. Merges all 10 translations into the final batch record.	Completion signals from remaining 3 instances	Complete 10-document batch record
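
The four nodes above can be sketched with Python's asyncio. This is an illustrative sketch under stated assumptions, not a reference implementation: translate and review are hypothetical stand-ins for the translator agent and the review phase, and the random sleep only simulates uneven runtimes.

```python
import asyncio
import random

M = 10  # total translator instances, fixed at design time
N = 7   # completions required before review starts, fixed at design time


async def translate(doc_id: int) -> dict:
    """Stand-in for one translator instance (hypothetical workload)."""
    await asyncio.sleep(random.uniform(0.001, 0.02))
    return {"doc_id": doc_id, "text": f"translated-{doc_id}"}


async def review(artifacts: list) -> dict:
    """Stand-in for the review phase; records which documents it saw."""
    return {"reviewed": [a["doc_id"] for a in artifacts]}


async def run_batch(doc_ids: list) -> dict:
    if len(doc_ids) != M:
        raise ValueError(f"expected exactly {M} documents, got {len(doc_ids)}")

    # Dispatch Translators: spawn all M instances at once.
    tasks = [asyncio.create_task(translate(d)) for d in doc_ids]

    completed = []
    review_task = None
    # as_completed yields awaitables in the order the instances finish.
    for next_done in asyncio.as_completed(tasks):
        completed.append(await next_done)
        if len(completed) == N:
            # Start Review: fires once, on the N-th completion, with a
            # snapshot of the first N artifacts. The loop keeps draining
            # the remaining M-N instances while review runs concurrently.
            review_task = asyncio.create_task(review(list(completed)))

    # Finalize All: reached only after all M instances have completed.
    return {"review": await review_task, "batch_record": completed}
```

Calling asyncio.run(run_batch(list(range(10)))) returns a review result covering the first 7 arrivals and a batch record with all 10 artifacts. Note that this sketch has no timeout, so a hung instance would block the finalize step.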

When to Use

Use when

Total instance count M and threshold N are both fixed at design time
All instances execute the same task (homogeneous multiple instances)
Downstream step can operate on N results without the remaining M-N
Remaining instances have archival or secondary value and should not be cancelled
Task runtime variance is high — waiting for all M would be dominated by outliers

Avoid when

M or N are determined at runtime — use 30.40
Remaining instances should be cancelled after N complete — use 30.39
All M instances must complete — use standard MI with AND-join (30.35)
Instances are heterogeneous (different tasks) — use 20.26

Value Profile

Origin of Value	Where it appears	How it is captured
Future Cashflow	Review phase start latency	Review begins at the 7th completion, not the 10th. If the gap between the 7th and 10th completions is 40% of the batch's wall-clock runtime, review starts 40% earlier than a full AND-join would allow.
Governance	Threshold N (design-time constant)	N is a design-time policy decision: what fraction of instances are sufficient for downstream quality? Encoding N in the workflow definition makes the policy visible and auditable.
Conditional Action	All 10 translator instances	All 10 instances run to completion and consume compute. The 3 post-threshold completions do not improve review quality but produce records. If records are not needed, 30.39 (cancelling) is more efficient.
Risk Exposure	Review quality at threshold	Review operates on N=7 results. If the 3 remaining translations contain critical errors that the reviewed 7 do not, the review process misses them. Threshold N should be validated against the distribution of translation quality across instances.

Design-time vs. runtime specification. The "static" qualifier means both M and N are baked into the workflow definition, not computed at execution time. This is the simplest form of partial join for MI tasks and should be the default choice when the batch size is known. The dynamic variant (30.40) adds complexity that is only justified when parameters genuinely vary per execution; the cancelling variant (30.39) keeps M and N static and differs only in what happens to the remaining instances.
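
One way to make the design-time constants explicit is an immutable definition that is validated when the workflow module is loaded. The sketch below assumes a plain Python workflow definition; the class and constant names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StaticPartialJoin:
    """N-of-M join parameters, fixed when the workflow is defined."""
    m: int  # total instances
    n: int  # completions required to trigger the downstream step

    def __post_init__(self) -> None:
        # Reject thresholds that could never fire or that fire on nothing.
        if not 1 <= self.n <= self.m:
            raise ValueError(f"need 1 <= n <= m, got n={self.n}, m={self.m}")


# The policy is a visible, reviewable constant in the workflow definition.
TRANSLATION_REVIEW_JOIN = StaticPartialJoin(m=10, n=7)
```

Because the dataclass is frozen, an attempt to change n at execution time raises an error, which keeps the threshold a reviewable policy and prevents it from drifting into a runtime knob.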

Dynamics and Failure Modes

Threshold N fires on low-quality early completers

The 7 fastest translator instances tend to be those processing short or simple documents, while the 3 slower instances handle complex documents where translation quality is most critical. The review phase then fires before the most important translations are available. Fix: weight instances by document complexity when tracking completions, or require that the threshold set includes at least K high-complexity documents before firing.
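
The second fix can be expressed as a firing predicate that checks both the count and the composition of the completed set. This is a sketch assuming each document carries a precomputed high-complexity flag; K, the Completion type, and should_fire are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

N = 7  # completions required, fixed at design time
K = 2  # hypothetical minimum number of high-complexity completions


@dataclass(frozen=True)
class Completion:
    doc_id: int
    high_complexity: bool


def should_fire(completed: list, n: int = N, k: int = K) -> bool:
    """True once at least n instances are done and at least k of them
    processed high-complexity documents."""
    high = sum(1 for c in completed if c.high_complexity)
    return len(completed) >= n and high >= k
```

K should not exceed the number of high-complexity documents actually in the batch; otherwise the predicate can never become true and the join needs a fallback, such as firing when all M instances complete.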

Post-threshold instances never complete

After the review phase starts, 2 of the remaining 3 translators hang indefinitely. The Finalize All AND-join never receives their completion tokens. The batch record is never closed. Fix: enforce a timeout on every translator instance. Timed-out instances emit a failure artifact to the finalize join. The batch record is closed with a "partial — 2 instances timed out" status rather than blocking indefinitely.
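
A per-instance timeout that converts a hang into a failure artifact might look like the following asyncio sketch. The timeout value, the artifact fields, and the work callable are assumptions made for illustration.

```python
import asyncio

INSTANCE_TIMEOUT_S = 300.0  # hypothetical per-instance budget, in seconds


async def guarded(doc_id, work, timeout_s: float) -> dict:
    """Run one instance; turn a timeout into a failure artifact."""
    try:
        # wait_for cancels the inner coroutine when the timeout expires.
        text = await asyncio.wait_for(work(doc_id), timeout=timeout_s)
        return {"doc_id": doc_id, "status": "ok", "text": text}
    except asyncio.TimeoutError:
        return {"doc_id": doc_id, "status": "timed_out", "text": None}


async def finalize_all(doc_ids, work, timeout_s: float = INSTANCE_TIMEOUT_S) -> dict:
    """AND-join over guarded instances: always terminates, because every
    instance yields either a result or a failure artifact."""
    artifacts = await asyncio.gather(*(guarded(d, work, timeout_s) for d in doc_ids))
    timed_out = [a["doc_id"] for a in artifacts if a["status"] == "timed_out"]
    if timed_out:
        status = f"partial: {len(timed_out)} instances timed out"
    else:
        status = "complete"
    return {"status": status, "artifacts": list(artifacts)}
```

The finalize join still waits for one token per instance, but each token is now guaranteed to arrive within the timeout, so the batch record is closed in bounded time.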

Review modifies assumptions about remaining instances

The review phase examines the 7 translations and identifies a systematic translation error (wrong glossary applied). The remaining 3 instances are still running with the same wrong glossary. Fix: add a branch from review back to a "cancel and rerun remaining" handler that can interrupt the in-flight instances when a systematic error is detected early.
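
A "cancel and rerun remaining" handler could be sketched as below, assuming the in-flight instances are tracked as asyncio tasks keyed by document ID. The rerun callable, which would start a translator with the corrected glossary, is hypothetical.

```python
import asyncio


async def cancel_and_rerun(in_flight: dict, rerun) -> dict:
    """Cancel instances that are still running and restart each of them.

    in_flight maps doc_id -> asyncio.Task. Instances that already finished
    are left untouched. Returns a mapping doc_id -> replacement task.
    """
    pending = {d: t for d, t in in_flight.items() if not t.done()}
    for task in pending.values():
        task.cancel()
    # return_exceptions=True collects the CancelledError of each cancelled
    # task so that it does not propagate to the caller.
    await asyncio.gather(*pending.values(), return_exceptions=True)
    return {d: asyncio.create_task(rerun(d)) for d in pending}
```

Artifacts already handed to review were produced with the same faulty configuration, so a systematic error usually means they need a rerun as well; this handler covers only the instances still in flight.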

Variants

Variant	Modification	When to use
Quality-Gated Partial Join	Threshold fires when N instances complete AND their aggregate quality score exceeds a minimum	Early completers may be low quality; a raw count threshold is insufficient
Staggered Threshold	Two thresholds: N1 triggers a preliminary action; N2 triggers the main action; the M-th completion finalizes the batch and unblocks the next cycle	Downstream pipeline has multiple stages that can start progressively as more instances complete
Adaptive N	N is computed at runtime based on current instance quality distribution (e.g., stop when variance drops below threshold)	Quality-based stopping criteria are more meaningful than a fixed count — note this transitions toward 30.40
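
As an illustration of the Quality-Gated Partial Join variant, the sketch below fires once, on the first completion at which at least N instances are done and their mean quality score meets a minimum. The class name, the 0-to-1 score scale, and the use of the mean as the aggregate are assumptions.

```python
class QualityGatedJoin:
    """N-of-M join that also requires a minimum mean quality score."""

    def __init__(self, n: int, min_mean_quality: float) -> None:
        self.n = n
        self.min_mean_quality = min_mean_quality
        self.scores = []
        self.fired = False

    def record(self, score: float) -> bool:
        """Record one completion; return True only on the call that opens
        the gate, so the downstream step is triggered exactly once."""
        self.scores.append(score)
        if self.fired or len(self.scores) < self.n:
            return False
        if sum(self.scores) / len(self.scores) >= self.min_mean_quality:
            self.fired = True
            return True
        return False
```

If the mean never reaches the minimum, this gate stays closed even after all M instances complete, so a real workflow would need an explicit fallback path for that case.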

Related Patterns

Pattern	Relationship
30.39 Cancelling Partial Join MI	Same static M and N, but remaining instances are cancelled after N complete; use when archival value is not needed
30.40 Dynamic Partial Join MI	When M or N cannot be fixed at design time; thresholds are determined at runtime
20.26 Structured Partial Join	The heterogeneous branch variant: 20.26 operates on distinct branches; 30.38 operates on homogeneous instances of one task
60.61 MI Without Synchronization	No join at all — instances write to a shared accumulator independently; use when even N-of-M synchronization is unnecessary