Multiple instances of the same task run in parallel. Once N of the M instances complete (both N and M are fixed at design time), the next task is triggered. The remaining instances continue to completion but do not block the downstream step.
A document translation system processes a batch of 10 documents by running 10 parallel translator agents — one per document. The review phase begins as soon as 7 of the 10 translations are complete. The remaining 3 translators continue running and will append their results to the batch when they finish, but they do not delay the review process. Both M=10 (total instances) and N=7 (threshold to proceed) are specified at design time in the workflow definition.
The key insight: this pattern applies to multiple instances of the same task, not to multiple distinct branches. The distinction matters for implementation: the join must track completions from homogeneous instances of a single task node, not from heterogeneous branches. When M and N are known before the workflow is even compiled, the static variant is simpler and more auditable than dynamic alternatives. For batch translation, tolerating a 30% tail (3 of 10 instances) is an explicit engineering decision: when translator runtimes are long-tailed, waiting for all 10 adds latency without improving the review, provided 7 translations give sufficient coverage.
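The join behaviour described above can be sketched with `asyncio`. This is a minimal illustration, not a workflow-engine implementation: `translate`, `review`, and `run_batch` are hypothetical names, the sleeps stand in for real translation work, and failure handling is omitted.

```python
import asyncio

M = 10  # total translator instances, fixed at design time
N = 7   # completions required before review starts, fixed at design time


async def translate(doc_id: int, delay: float) -> str:
    """Hypothetical translator instance; the sleep stands in for real work."""
    await asyncio.sleep(delay)
    return f"translation-{doc_id}"


async def review(artifacts: list[str]) -> str:
    """Hypothetical review step; it only reports how many artifacts it saw."""
    return f"reviewed {len(artifacts)} artifacts"


async def run_batch(delays: list[float]) -> tuple[str, list[str]]:
    """Run M instances, start review at the N-th completion, then finalize."""
    assert len(delays) == M
    tasks = [asyncio.create_task(translate(i, d)) for i, d in enumerate(delays)]
    batch_record: list[str] = []
    review_task = None
    for next_done in asyncio.as_completed(tasks):
        batch_record.append(await next_done)
        if len(batch_record) == N:
            # Partial join fires once; review runs while the rest continue.
            review_task = asyncio.create_task(review(list(batch_record)))
    # AND-join: all M instances have completed at this point.
    return await review_task, batch_record
```

The review task receives a snapshot of the first N artifacts, so later completions extend the batch record without changing what review saw.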
| Metric | Signal |
|---|---|
| Threshold fire latency | Time from batch dispatch to N-th completion — the primary latency signal for this pattern |
| Finalize completion rate | Fraction of batches where all M instances complete — tracks archival record integrity |
| Quality delta (N vs M) | Difference in output quality between using N completions vs all M — validates whether the remaining instances would have changed the review outcome |
| Instance completion time distribution | Variance across the M instances — high variance justifies partial join; low variance reduces its benefit |
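
As a rough sketch of how the latency and variance signals in the table could be derived from per-instance completion times, assuming those times are recorded in seconds since batch dispatch. The function name and the dictionary keys are illustrative, not part of any monitoring API.

```python
import statistics


def partial_join_metrics(completion_times: list[float], n: int) -> dict[str, float]:
    """completion_times: seconds from batch dispatch to each instance's completion."""
    if not 1 <= n <= len(completion_times):
        raise ValueError("n must be between 1 and M")
    ordered = sorted(completion_times)
    threshold_latency = ordered[n - 1]  # time of the N-th completion
    full_latency = ordered[-1]          # time of the M-th completion (full AND-join)
    return {
        "threshold_fire_latency": threshold_latency,
        "full_join_latency": full_latency,
        # Share of the full batch runtime that the partial join avoids waiting for.
        "latency_saved_fraction": 1 - threshold_latency / full_latency,
        # Spread across instances; a wide spread favours a partial join.
        "completion_time_stdev": statistics.pstdev(ordered),
    }
```

For a batch whose 7th completion lands at 30 s and whose 10th lands at 50 s, the saved fraction is 0.4. Finalize completion rate and quality delta need data across batches and from review, so they are not covered here.
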
| Node | What it does | What it receives | What it produces |
|---|---|---|---|
| Dispatch Translators | Spawns 10 translator instances simultaneously. Passes each instance a single document from the batch. | Document batch (10 documents) | 10 parallel translator activations |
| Translator Agent (×10) | Translates the assigned document and produces a structured translation artifact. Multiple instances of the same task definition. | Single document + target language config | Translated document artifact |
| Start Review (partial join, 7-of-10) | Fires when 7 translation completions arrive. Passes the 7 completed artifacts to review. Remaining 3 instances continue running independently. | 7 translation artifacts (first arriving) | 7-artifact set for review |
| Finalize All (AND-join) | Waits for the remaining 3 instances to complete. Merges all 10 translations into the final batch record. | Completion signals from remaining 3 instances | Complete 10-document batch record |
| Origin of Value | Where it appears | How it is captured |
|---|---|---|
| Future Cashflow | Review phase start latency | Review begins at the 7th completion, not the 10th. If the last 3 completions occupy the final 40% of the batch's wall-clock runtime, review starts that much earlier than a full AND-join would allow. |
| Governance | Threshold N (design-time constant) | N is a design-time policy decision: what fraction of instances are sufficient for downstream quality? Encoding N in the workflow definition makes the policy visible and auditable. |
| Conditional Action | All 10 translator instances | All 10 instances run to completion and consume compute. The 3 post-threshold completions do not improve review quality but produce records. If records are not needed, 30.39 (cancelling) is more efficient. |
| Risk Exposure | Review quality at threshold | Review operates on N=7 results. If the 3 remaining translations contain critical errors that the reviewed 7 did not, the review process missed them. Threshold N should be validated against the distribution of translation quality across instances. |
Design-time vs. runtime specification. The "static" qualifier means both M and N are baked into the workflow definition, not computed at execution time. This is the simplest form of partial join for MI tasks and should be the default choice when the batch size is known. The dynamic variant (30.40) adds complexity that is only justified when parameters genuinely vary per execution.
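One way to make the design-time commitment explicit is an immutable definition object that is validated once, when the workflow is defined. The sketch below assumes a plain Python workflow definition; `StaticPartialJoin` and `TRANSLATION_JOIN` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StaticPartialJoin:
    """Design-time join parameters; frozen so they cannot change at runtime."""
    total_instances: int  # M
    threshold: int        # N

    def __post_init__(self) -> None:
        if not 1 <= self.threshold <= self.total_instances:
            raise ValueError("threshold N must satisfy 1 <= N <= M")


# Declared once in the workflow definition, visible to reviewers and auditors.
TRANSLATION_JOIN = StaticPartialJoin(total_instances=10, threshold=7)
```

Because the dataclass is frozen, an attempt to reassign `threshold` during execution raises `dataclasses.FrozenInstanceError`, which keeps N a policy decision rather than a runtime variable.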
The 7 fastest translator instances are those processing short or simple documents. The 3 slower instances handle complex documents where translation quality is most critical. The review phase fires before the most important translations are available. Fix: weight instances by document complexity when tracking completions, or require that the threshold set includes at least K high-complexity documents before firing.
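The second fix, requiring at least K high-complexity documents in the threshold set, can be expressed as a firing predicate. This sketch assumes each document is already labelled as high-complexity before dispatch; `should_fire` and its parameters are illustrative names.

```python
def should_fire(completed: list[str], high_complexity: set[str], n: int, k: int) -> bool:
    """Fire only when n instances are done and at least k of them are high-complexity."""
    high_done = sum(1 for doc_id in completed if doc_id in high_complexity)
    return len(completed) >= n and high_done >= k
```

The trade-off is that the join may fire later than the N-th completion, because it waits until enough complex documents have arrived.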
After the review phase starts, 2 of the remaining 3 translators hang indefinitely. The Finalize All AND-join never receives their completion tokens. The batch record is never closed. Fix: enforce a timeout on every translator instance. Timed-out instances emit a failure artifact to the finalize join. The batch record is closed with a "partial — 2 instances timed out" status rather than blocking indefinitely.
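A minimal sketch of the timeout fix, using `asyncio.wait_for` so that a hung instance yields a failure artifact instead of blocking the finalize join. The artifact dictionary layout and the status strings are assumptions made for illustration.

```python
import asyncio
from typing import Any, Awaitable


async def run_with_timeout(doc_id: int, work: Awaitable[Any], timeout_s: float) -> dict:
    """Wrap one translator instance so it always reports to the finalize join."""
    try:
        artifact = await asyncio.wait_for(work, timeout=timeout_s)
        return {"doc_id": doc_id, "status": "ok", "artifact": artifact}
    except asyncio.TimeoutError:
        # Failure artifact: the join still receives a token for this instance.
        return {"doc_id": doc_id, "status": "timed_out", "artifact": None}


def close_batch(results: list[dict]) -> str:
    """Close the batch record with a status that reflects any timeouts."""
    timed_out = sum(1 for r in results if r["status"] == "timed_out")
    if timed_out == 0:
        return "complete"
    return f"partial: {timed_out} instances timed out"
```

Every instance now produces exactly one result, so the AND-join can count to M regardless of how many instances hang.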
The review phase examines the 7 translations and identifies a systematic translation error (wrong glossary applied). The remaining 3 instances are still running with the same wrong glossary. Fix: add a branch from review back to a "cancel and rerun remaining" handler that can interrupt the in-flight instances when a systematic error is detected early.
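The "cancel and rerun remaining" handler could look roughly like the following, assuming the in-flight instances are `asyncio` tasks. `cancel_in_flight` is a hypothetical helper; re-dispatching the cancelled documents with the corrected glossary is left to the caller.

```python
import asyncio


async def cancel_in_flight(tasks: list[asyncio.Task]) -> int:
    """Cancel every unfinished instance and return how many were cancelled."""
    pending = [t for t in tasks if not t.done()]
    for task in pending:
        task.cancel()
    # Wait for the cancellations to take effect without raising CancelledError here.
    await asyncio.gather(*pending, return_exceptions=True)
    return len(pending)
```

Instances that had already completed are left untouched, so the caller must still decide whether their output, produced with the same wrong glossary, needs to be redone.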
| Variant | Modification | When to use |
|---|---|---|
| Quality-Gated Partial Join | Threshold fires when N instances complete AND their aggregate quality score exceeds a minimum | Early completers may be low quality; a raw count threshold is insufficient |
| Staggered Threshold | Two thresholds: N1 triggers a preliminary action; N2 triggers the main action; completion of all M instances ends the cycle | Downstream pipeline has multiple stages that can start progressively as more instances complete |
| Adaptive N | N is computed at runtime based on current instance quality distribution (e.g., stop when variance drops below threshold) | Quality-based stopping criteria are more meaningful than a fixed count — note this transitions toward 30.40 |
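
As an illustration of the Quality-Gated Partial Join row, the firing condition can combine the count threshold with an aggregate score. The sketch assumes each completed instance reports a numeric quality score and uses the mean as the aggregate; both choices, and the name `quality_gate_fires`, are assumptions.

```python
def quality_gate_fires(scores: list[float], n: int, min_mean: float) -> bool:
    """Fire when at least n instances are done and their mean score meets the bar."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if len(scores) < n:
        return False
    return sum(scores) / len(scores) >= min_mean
```

N stays a design-time constant here; only the extra quality condition is evaluated at runtime.
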
| Pattern | Relationship |
|---|---|
| 60.66 Cancelling Partial Join MI | Same static M and N, but remaining instances are cancelled after N complete — use when archival value is not needed |
| 60.67 Dynamic Partial Join MI | When M or N cannot be fixed at design time — runtime-determined thresholds |
| 40.46 Structured Partial Join | The heterogeneous branch variant: Structured Partial Join operates on distinct branches; this pattern operates on homogeneous instances of one task |
| 60.61 MI Without Synchronization | No join at all — instances write to a shared accumulator independently; use when even N-of-M synchronization is unnecessary |