A convergence point where each completing incoming branch independently triggers the outgoing flow — no synchronization. If N branches are active, the downstream task fires N times. Each branch result is processed as it arrives, not after all branches complete.
A legal document processing system fans out to three OCR engines simultaneously: a fast low-cost engine, a deep high-accuracy engine, and a handwriting specialist. Each engine has a different latency and cost profile. Rather than waiting for all three before indexing, each result triggers an independent indexing job the moment it arrives.
The index accumulates results incrementally. When the fast engine returns in 2 seconds, the index is updated immediately. When the deep engine returns in 8 seconds, the index is enriched further. When the handwriting engine returns in 12 seconds, the final enrichment is applied. Each completion is valuable on its own — the index is useful before all three are done. This is Multi-Merge: the downstream task is idempotent and accumulative, designed to fire multiple times.
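The firing behaviour described above can be sketched with `asyncio`. This is a minimal illustration, not the system's actual implementation: `run_engine`, `multi_merge`, and the `index_result` callback are hypothetical names, and the engine latencies are scaled down from seconds to hundredths of a second so the demo runs quickly.

```python
import asyncio

async def run_engine(name: str, delay: float) -> dict:
    """Stand-in for an OCR call; `delay` simulates the engine's latency."""
    await asyncio.sleep(delay)
    return {"engine": name, "text": f"text from {name}"}

async def multi_merge(document_id: str, engines: dict[str, float], index_result) -> int:
    """Fan out to every engine, then fire `index_result` once per completion."""
    tasks = [asyncio.create_task(run_engine(n, d)) for n, d in engines.items()]
    fired = 0
    # as_completed yields results in arrival order, so there is no barrier:
    # the downstream callback runs as soon as each branch finishes.
    for finished in asyncio.as_completed(tasks):
        result = await finished
        index_result(document_id, result)
        fired += 1
    return fired

def demo() -> tuple[int, list[str]]:
    arrivals: list[str] = []
    # Assumed latencies, scaled from 2 s / 8 s / 12 s.
    engines = {"fast": 0.02, "deep": 0.08, "handwriting": 0.12}
    fired = asyncio.run(
        multi_merge("doc-1", engines, lambda doc, r: arrivals.append(r["engine"]))
    )
    return fired, arrivals
```

Calling `demo()` returns the number of downstream firings and the order in which the engines arrived. With three active branches the callback runs three times.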
| Metric | Signal |
|---|---|
| Time to first result | Latency of the fastest branch — determines how quickly the index becomes useful |
| Time to full enrichment | Latency of the slowest branch — when the index reaches maximum fidelity |
| Per-engine contribution delta | How much each engine adds over the previous — quantifies marginal value of each branch |
| Downstream invocation count per document | Should equal the active branch count N. A count below N indicates lost firings; a count above N indicates duplicate firings. |
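As a rough sketch, the timing and invocation-count metrics in the table could be derived from a log of downstream firings. The `Firing` record and `merge_metrics` function are assumptions made for illustration; the per-engine contribution delta is omitted because it depends on how index quality is scored.

```python
from dataclasses import dataclass

@dataclass
class Firing:
    document_id: str
    engine_id: str
    completed_at: float  # seconds since dispatch (assumed unit)

def merge_metrics(firings: list[Firing], active_branches: int) -> dict:
    """Summarise one document's firings against the expected branch count."""
    times = [f.completed_at for f in firings]
    distinct = {f.engine_id for f in firings}
    all_arrived = len(distinct) == active_branches
    return {
        "time_to_first_result": min(times) if times else None,
        # Only defined once every active branch has reported.
        "time_to_full_enrichment": max(times) if all_arrived else None,
        "invocation_count": len(firings),
        "duplicates": len(firings) - len(distinct),
        "lost": active_branches - len(distinct),
    }
```

A healthy document shows `invocation_count == active_branches` with zero duplicates and zero lost firings.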
| Node | What it does | What it receives | What it produces |
|---|---|---|---|
| Dispatch OCR | Sends the document to all three OCR engines simultaneously via AND-split | Raw document (PDF or image) | Three simultaneous OCR jobs dispatched |
| Fast OCR | Low-latency OCR pass — high throughput, moderate accuracy, returns in ~2 seconds | Document | Structured text extraction (confidence: ~85%) |
| Deep OCR | High-accuracy OCR pass — runs full layout analysis and semantic correction | Document | Structured text extraction (confidence: ~97%) |
| Handwriting OCR | Specialist model for handwritten annotations and marginalia | Document | Handwritten section extraction + annotation metadata |
| Index Result | Incrementally enriches the document index with each arriving OCR result. Fires independently for each completing branch — designed to handle multiple invocations per document. | Single OCR result (whichever arrived) | Updated document index entry (idempotent upsert) |
| Origin of Value | Where it appears | How it is captured |
|---|---|---|
| Future Cashflow | Incremental index quality | Each branch completion improves index fidelity. The fast engine provides immediate utility (2s latency). The deep engine corrects errors later. The handwriting engine adds coverage unavailable from the others. Total value is the sum of independent contributions. |
| Governance | Index Result node | Idempotency is a correctness constraint. The indexing logic must handle out-of-order arrivals gracefully — deep engine may occasionally beat fast engine under load. Without idempotent upsert logic, the governance of "index reflects latest best result" breaks. |
| Conditional Action | Each branch independently | N branches means N index invocations. Cost is proportional to N, not to a single synchronized join. The compute model is additive, not multiplicative — each engine runs independently and charges for its own execution. |
| Risk Exposure | Index consistency window | Between the first and last branch completion, the index is in a partially-enriched state. Queries during this window may return incomplete results. The risk is latency-bounded — the window closes when the slowest branch completes. |
Contrast with AND-join and 20.23. Synchronization (AND-join) waits for all N branches, then triggers downstream once. Structured Discriminator (20.23) triggers downstream once on the first completion and ignores the rest. Multi-Merge triggers downstream N times, once per branch completion. Use Multi-Merge when each completion independently adds value to an accumulative target.
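The difference between the three joins can be made concrete with a small function that, given branch completions in arrival order, returns the inputs of each downstream firing. `downstream_firings` and its mode names are hypothetical, and the sketch assumes every listed branch has already completed.

```python
def downstream_firings(completions: list[str], mode: str) -> list[list[str]]:
    """Return one inner list per downstream firing, holding that firing's inputs."""
    if mode == "and_join":
        # One firing, after all branches, carrying every result.
        return [list(completions)] if completions else []
    if mode == "discriminator":
        # One firing on the first completion; later arrivals are ignored.
        return [completions[:1]] if completions else []
    if mode == "multi_merge":
        # One firing per completion, each carrying a single result.
        return [[c] for c in completions]
    raise ValueError(f"unknown mode: {mode}")
```

For the arrival order `["fast", "deep", "handwriting"]`, the AND-join fires once with three inputs, the discriminator fires once with `["fast"]`, and Multi-Merge fires three times.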
The indexing function appends results rather than performing an upsert: each OCR result adds a new record instead of enriching the existing one. After three completions, the document has three separate index entries with conflicting extracted text. Downstream search queries return triplicated results. Fix: the Index Result node must implement idempotent upsert semantics keyed on (document_id, engine_id), so that each document keeps a single index entry and a repeated delivery from the same engine replaces that engine's contribution instead of adding a record. An append-only design is incompatible with Multi-Merge.
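The contrast between the two designs can be sketched with in-memory stand-ins. Both classes are hypothetical; a real index would use its store's native upsert operation. The sketch assumes one entry per `document_id`, with each engine's contribution stored under its `engine_id`.

```python
from collections import defaultdict

class AppendOnlyIndex:
    """Failure mode: every firing adds a new record."""
    def __init__(self) -> None:
        self.records: list[tuple[str, str, str]] = []

    def index(self, document_id: str, engine_id: str, text: str) -> None:
        self.records.append((document_id, engine_id, text))

class UpsertIndex:
    """One entry per document; contributions keyed on (document_id, engine_id)."""
    def __init__(self) -> None:
        self.entries: dict[str, dict[str, str]] = defaultdict(dict)

    def index(self, document_id: str, engine_id: str, text: str) -> None:
        # Re-delivering the same (document_id, engine_id) overwrites in place.
        self.entries[document_id][engine_id] = text

# Three completions plus one redelivery of the fast engine's result.
DELIVERIES = [
    ("doc-1", "fast", "draft text"),
    ("doc-1", "deep", "corrected text"),
    ("doc-1", "fast", "draft text"),
    ("doc-1", "handwriting", "margin notes"),
]
```

Replaying `DELIVERIES` into `AppendOnlyIndex` leaves four records for one document, while `UpsertIndex` ends with a single entry holding three engine contributions.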
The deep engine (97% confidence) returns first due to document simplicity. The fast engine (85% confidence) returns 500ms later and overwrites the higher-quality result because the merge logic uses last-write-wins. Fix: the merge logic must compare confidence scores and retain the highest-confidence value per field, not apply the most recent write unconditionally.
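A minimal sketch of the confidence-aware merge, assuming each result maps a field name to a `(value, confidence)` pair. The `merge_fields` name and the data shape are illustrative, not taken from the system described.

```python
Fields = dict[str, tuple[str, float]]  # field name -> (value, confidence)

def merge_fields(current: Fields, incoming: Fields) -> Fields:
    """Keep the highest-confidence value per field, regardless of arrival order."""
    merged = dict(current)
    for name, (value, confidence) in incoming.items():
        # Strictly greater: on a tie the existing value is kept. A production
        # merge would need a deterministic tie-break to stay order-independent.
        if name not in merged or confidence > merged[name][1]:
            merged[name] = (value, confidence)
    return merged

deep: Fields = {"party": ("Acme Corp.", 0.97)}
fast: Fields = {"party": ("Acrne Corp.", 0.85), "date": ("2021-03-04", 0.85)}
```

Applying `deep` then `fast` gives the same result as `fast` then `deep`: the 97%-confidence `party` value survives, and the `date` field, which only the fast engine supplied, is retained.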
A downstream consumer reads the index 1 second after the fast engine completes, before the deep and handwriting engines return. The consumer sees an 85%-confidence partial extraction and makes a downstream decision based on incomplete data. The multi-merge has not failed, but its incremental enrichment model is invisible to the consumer. Fix: expose index completeness metadata alongside results, including which engines have completed and when the final enrichment is expected.
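One way to expose completeness, sketched under the assumption that the set of expected engines is known at dispatch time. `IndexEntry`, its fields, and the `expected_full_at` estimate are hypothetical names.

```python
from dataclasses import dataclass, field

@dataclass
class IndexEntry:
    document_id: str
    expected_engines: frozenset[str]
    expected_full_at: float  # estimated seconds since dispatch (assumption)
    completed: dict[str, float] = field(default_factory=dict)  # engine -> time

    def record(self, engine_id: str, at: float) -> None:
        # setdefault keeps the first completion time if a result is redelivered.
        self.completed.setdefault(engine_id, at)

    def completeness(self) -> dict:
        """Metadata a consumer can inspect before trusting the entry."""
        pending = sorted(self.expected_engines - set(self.completed))
        return {
            "completed_engines": sorted(self.completed),
            "pending_engines": pending,
            "is_complete": not pending,
            "expected_full_at": None if not pending else self.expected_full_at,
        }
```

A consumer that reads the entry after only the fast engine has reported would see `is_complete` as `False` with two pending engines, and can decide whether to act now or wait.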
| Variant | Modification | When to use |
|---|---|---|
| Threshold Multi-Merge | Downstream fires only after at least K of N branches complete, then fires once for each subsequent completion | A minimum evidence set is required before any processing is useful — partial results below K are noise, not signal |
| Weighted Multi-Merge | Each branch result carries a confidence weight; downstream applies weighted merge rather than independent upsert | Branch outputs are estimates of the same underlying value — a weighted average is more accurate than last-write-wins |
| Bounded Multi-Merge | Multi-Merge within a structured block — all branches are guaranteed to eventually complete, enabling clean close-out | Process lifecycle management requires knowing when enrichment is definitively complete — unbounded merge makes closure ambiguous |
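The Threshold variant can be sketched as a small gate in front of the downstream task. The table does not specify what the first firing carries; this sketch assumes it delivers the K buffered results as one batch, after which each later completion fires on its own. `ThresholdMultiMerge` is a hypothetical name.

```python
from typing import Callable

class ThresholdMultiMerge:
    """Hold results until K have arrived, then fire per completion."""
    def __init__(self, k: int, downstream: Callable[[list], None]) -> None:
        self.k = k
        self.downstream = downstream
        self._buffer: list = []
        self._open = False

    def on_completion(self, result) -> None:
        if self._open:
            # Threshold already met: behave like a plain Multi-Merge.
            self.downstream([result])
            return
        self._buffer.append(result)
        if len(self._buffer) >= self.k:
            self._open = True
            # Assumption: the first firing carries the buffered batch.
            self.downstream(list(self._buffer))
            self._buffer.clear()
```

With `k=2` and three branches, the downstream task fires twice: once with the first two results and once with the third.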
| Pattern | Relationship |
|---|---|
| 40.41 Multi-Choice (OR-Split) | Common upstream pairing — OR-split activates a variable subset of branches; Multi-Merge collects each completion independently. |
| 40.43 Structured Discriminator | Contrast: fires downstream once on first completion. Use when only the fastest result matters and subsequent arrivals are discarded. |
| 10.11 Pipeline (AND-join) | Contrast: fires downstream once after all branches complete. Use when all results must be present before any processing. |
Multi-Merge is the architecture of incremental enrichment pipelines. The pattern is commercially significant in any domain where multiple data sources improve the same artifact over time: document intelligence, multi-model ensembles, multi-source data fusion.
The idempotency requirement is a hidden engineering cost. Teams underestimate how difficult it is to build downstream tasks that correctly handle multiple, out-of-order, partial invocations. Systems that appear to use Multi-Merge but have non-idempotent downstream code accumulate silent data corruption at scale.
Due diligence question: does the downstream indexing/processing logic have explicit tests for out-of-order multi-invocation? If not, the system works in demos (branches arrive in expected order, small scale) and fails in production (high load, variable latency, concurrent documents).
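A test of the kind the due diligence question asks about might replay every arrival order, with a redelivery, and require the same final state each time. The sketch below assumes a highest-confidence-per-field merge as the logic under test; `apply_result`, `RESULTS`, and `check_order_independence` are hypothetical, and the sample confidences are chosen to be distinct so that no ties arise.

```python
from itertools import permutations

def apply_result(index: dict, result: dict) -> dict:
    """Merge under test: keep the highest-confidence value per field."""
    for name, (value, confidence) in result["fields"].items():
        if name not in index or confidence > index[name][1]:
            index[name] = (value, confidence)
    return index

RESULTS = [
    {"engine": "fast", "fields": {"party": ("Acrne Corp.", 0.85)}},
    {"engine": "deep", "fields": {"party": ("Acme Corp.", 0.97)}},
    {"engine": "handwriting", "fields": {"note": ("see clause 4", 0.90)}},
]

def final_states(results: list[dict]) -> list[dict]:
    """Replay every arrival order and collect the resulting index states."""
    states = []
    for order in permutations(results):
        index: dict = {}
        # Redeliver the first result at the end to exercise idempotency.
        for result in order + order[:1]:
            apply_result(index, result)
        states.append(index)
    return states

def check_order_independence(results: list[dict]) -> bool:
    states = final_states(results)
    return all(state == states[0] for state in states)
```

Three results produce six orderings. A last-write-wins merge would fail this check, because the final value of `party` would depend on which engine arrived last.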