30.40 Dynamic Partial Join for Multiple Instances

Multiple instances run concurrently and new instances can be added dynamically during execution. A runtime completion condition — evaluated continuously — determines when to proceed. Neither M nor N is fixed at design time; both emerge from the execution itself.


Motivating Scenario

A web-crawling AI starts with a seed set of 10 URLs and discovers new pages to scrape as it goes. Each scraper agent processes one URL and may surface additional URLs that are added to the work queue. New scraper instances are spawned dynamically as new URLs are discovered. The aggregation phase begins when two conditions are simultaneously satisfied: the content quality score across all completed scrapers exceeds 0.85, AND the URL discovery queue is empty (no new work is being generated). Neither of these conditions can be evaluated before execution starts — both depend on what the crawl actually finds.

The key insight: this pattern is fundamentally different from static and static-partial MI variants. The instance set is open: new members can join during execution. The completion criterion is not "N of M instances done" — it is a predicate over execution state. This makes 30.40 the most expressive MI join pattern and also the hardest to implement correctly. It maps naturally to any AI workflow with recursive or exploratory structure: the process generates its own workload as it runs, and "done" is a semantic condition, not a count.
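The open instance set and predicate-based termination can be sketched in a few lines. This is a minimal, single-threaded simulation, not a production implementation: `scrape(url)` is a hypothetical callable returning `(quality_score, new_urls)`, and the 500-instance cap and 0.85 threshold are illustrative values carried over from the scenario. A real system would run scrapers concurrently and synchronize on the shared queue.

```python
import queue

def run_dynamic_join(seed_urls, scrape, quality_threshold=0.85, max_instances=500):
    """Single-threaded sketch of a dynamic partial join over an open instance set."""
    work = queue.SimpleQueue()
    for u in seed_urls:
        work.put(u)
    scores, spawned = [], 0
    while not work.empty():
        if spawned >= max_instances:      # budget circuit breaker (see Risk Exposure)
            break
        url = work.get()
        spawned += 1
        score, discovered = scrape(url)   # one "scraper instance"
        scores.append(score)
        for u in discovered:              # dynamically extend the instance set
            work.put(u)
    avg = sum(scores) / len(scores) if scores else 0.0
    done = work.empty() and avg > quality_threshold
    return done, spawned, avg
```

Note that the instance count `spawned` is an output of the run, not an input — the defining property of this pattern.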

Structure

Diagram: dynamic web crawl with quality-gated aggregation.

Key Metrics

| Metric | Signal |
| --- | --- |
| Total instance count per run | How many scraper instances were spawned — primary cost driver; track the distribution across runs to detect pathological cases |
| Queue depth over time | Tracks whether the crawl is converging (queue shrinking) or diverging (queue growing) — early warning for non-termination |
| Aggregate quality score trajectory | How quality evolves as more scrapers complete — validates whether more instances produce meaningful quality gains |
| Time to completion condition | Wall-clock time from seed to aggregation trigger — the primary latency signal for the end-to-end workflow |
| Node | What it does | What it receives | What it produces |
| --- | --- | --- | --- |
| URL Discoverer | Reads from the URL queue. If a URL is available, routes it to a Scraper Agent instance. If the queue is empty and the completion condition is met, routes to aggregation. Loops back to itself to re-check the queue. | URL queue state + quality gate signal | URL dispatched to new scraper instance, OR aggregation trigger |
| Scraper Agent | Fetches and parses the assigned URL. Extracts structured content and discovers outbound links. Adds new URLs to the shared queue. Emits a content quality score on completion. | Single URL + scraping config | Structured page content + new URL additions to the queue + quality score |
| Quality Gate | After each scraper completion, evaluates: (1) is the aggregate quality score > 0.85? (2) is the URL queue empty? Routes to the Discoverer for more work if either condition fails; routes to aggregation if both pass. | Scraper output + aggregate quality state + queue depth | Continue signal (to Discoverer) OR completion signal (to Aggregator) |
| Aggregate Results | Collects all completed scraper outputs and produces a unified knowledge graph from the crawl. | All completed scraper content artifacts | Crawl knowledge graph |

When to Use

Use when
Avoid when

Value Profile

| Origin of Value | Where it appears | How it is captured |
| --- | --- | --- |
| Future Cashflow | Crawl coverage quality | The dynamic instance set means coverage adapts to what the crawl finds. A topic-rich seed produces more instances and higher coverage; a sparse seed terminates quickly. Quality is outcome-adaptive rather than input-sized. |
| Governance | Quality Gate completion condition | The two-part predicate (quality score AND queue empty) is the governance mechanism. Each clause is an independently tunable policy parameter. Weakening either clause terminates the crawl earlier; strengthening either extends it. |
| Conditional Action | Each scraper instance | Compute cost is entirely determined by the runtime execution path — unknown at design time. Budget caps (max instances, max runtime) are essential guardrails for cost control. |
| Risk Exposure | Non-termination | If new URLs are discovered faster than scrapers complete, the queue never empties and the completion condition never fires. The crawl runs indefinitely. Mandatory circuit breakers (max total instances, max wall-clock time) are non-negotiable. |

Semantic completion vs. count-based completion

30.38 and 30.39 complete when a number is reached. 30.40 completes when a condition is true. This is a fundamentally different termination model. The quality of a 30.40 implementation depends entirely on the precision of the completion predicate. Vague predicates ("sufficient coverage") produce non-deterministic termination. Precise predicates ("quality score > 0.85 AND queue depth = 0 for 30 consecutive seconds") produce reproducible behavior.
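A precise predicate like the one above can be made testable by injecting the clock. This is a sketch under the scenario's assumed parameters (0.85 threshold, 30-second empty-queue dwell); the class name and interface are illustrative, not part of the pattern definition.

```python
import time

class CompletionPredicate:
    """Debounced completion check: quality threshold AND queue empty for a dwell period."""

    def __init__(self, threshold=0.85, dwell_s=30.0, clock=time.monotonic):
        self.threshold = threshold
        self.dwell_s = dwell_s
        self.clock = clock          # injectable for deterministic tests
        self._empty_since = None

    def check(self, aggregate_quality, queue_depth):
        if queue_depth > 0:
            self._empty_since = None    # any new work resets the dwell timer
            return False
        if self._empty_since is None:
            self._empty_since = self.clock()
        dwell_ok = self.clock() - self._empty_since >= self.dwell_s
        return dwell_ok and aggregate_quality > self.threshold
```

Because the clock is a constructor parameter, the predicate's behavior is reproducible in unit tests with a fake clock — exactly the property vague predicates lack.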

Dynamics and Failure Modes

Non-termination (crawl depth explosion)

The crawl discovers high-density link graphs. Each scraper finds 20 new URLs. The queue grows faster than it drains. The "queue empty" condition never fires. The crawl runs indefinitely, consuming unbounded compute. Fix: implement a hard cap on total URL additions (e.g., max 500 URLs queued regardless of discovery). When the cap is hit, the queue is sealed — no new URLs are accepted — and the completion condition is re-evaluated with a relaxed criterion (quality score only; the queue clause is satisfied by "sealed and drained" rather than "empty").
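The sealed-queue fix can be sketched as a small wrapper around the work queue. The 500-URL cap is the example value from the text; everything else here (class and method names) is illustrative.

```python
class SealableQueue:
    """Work queue with a hard cap on total additions; seals itself when the cap is hit."""

    def __init__(self, max_urls=500):
        self.max_urls = max_urls
        self._added = 0
        self._items = []
        self.sealed = False

    def put(self, url):
        if self.sealed:
            return False                 # crawl frontier is frozen; discovery rejected
        self._items.append(url)
        self._added += 1
        if self._added >= self.max_urls:
            self.sealed = True           # no further discoveries accepted
        return True

    def get(self):
        return self._items.pop(0)

    def drained(self):
        # completion clause: "empty" for an unsealed queue,
        # "sealed and drained" once the cap has been hit
        return not self._items
```

The cap bounds total instances regardless of link density, turning a potentially divergent crawl into a guaranteed-terminating one.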

Race condition on completion evaluation

The Quality Gate evaluates the completion condition: queue is empty AND quality > 0.85. Both conditions are true. The gate signals aggregation. Simultaneously, a scraper adds 3 new URLs to the queue (race condition — the URL addition message was in flight when the evaluation ran). Aggregation starts on incomplete data. Fix: the "queue empty" condition must be evaluated with a distributed lock that also blocks new URL additions. "Queue empty" means "queue is empty and locked against further additions."
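One way to realize the lock requirement in a single process is to guard both URL additions and the completion check with the same mutex, and to close the frontier inside the critical section. This is a sketch of the idea (in a distributed deployment the lock would be a distributed lock, e.g. held in a coordination service); names are illustrative.

```python
import threading

class LockedFrontier:
    """URL frontier where the 'queue empty' check and additions share one lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._urls = []
        self._closed = False

    def add(self, url):
        with self._lock:
            if self._closed:
                return False        # aggregation already triggered; addition rejected
            self._urls.append(url)
            return True

    def take(self):
        with self._lock:
            return self._urls.pop(0) if self._urls else None

    def try_complete(self, quality_ok):
        # atomically: verify emptiness AND close the frontier, so no in-flight
        # discovery can land between the check and the aggregation trigger
        with self._lock:
            if quality_ok and not self._urls:
                self._closed = True
                return True
            return False
```

"Queue empty" is thus redefined as "queue empty and locked against further additions", which is exactly the fix the failure mode calls for.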

Quality score manipulation by outlier pages

One scraped page is exceptionally high-quality (score = 0.99). This single page pulls the aggregate score above the 0.85 threshold even though most scraped content is mediocre. The completion condition fires prematurely. Fix: use a robust aggregate (median or trimmed mean) rather than the arithmetic mean. Alternatively, require that both the quality threshold AND a minimum instance count are met.
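A trimmed mean is a few lines; the 10% trim fraction below is an illustrative default, not prescribed by the pattern.

```python
import statistics

def robust_aggregate(scores, trim=0.1):
    """Trimmed mean: drop the top and bottom `trim` fraction before averaging,
    so a single outlier page cannot drag the aggregate over the threshold."""
    s = sorted(scores)
    k = int(len(s) * trim)
    kept = s[k:len(s) - k] if len(s) - 2 * k > 0 else s
    return sum(kept) / len(kept)
```

`statistics.median` is an even simpler robust alternative when per-page scores are noisy.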

Instance state loss on partial failure

The Quality Gate process crashes mid-execution. On recovery, it does not know which scraper instances are active, what the current aggregate quality is, or how many URLs are queued. The completion condition cannot be evaluated. Fix: gate state (active instance registry, aggregate quality accumulator, queue depth) must be durably persisted after every scraper completion. Recovery replays from the last checkpoint, not from the beginning.
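Durable persistence of the gate state can be sketched with a write-then-atomic-rename checkpoint, so a crash never leaves a torn file. The state fields shown (active-instance registry, quality accumulator, queue depth) follow the text; the function names and JSON encoding are assumptions.

```python
import json
import os
import tempfile

def save_gate_state(path, state):
    """Checkpoint gate state durably: write to a temp file, fsync, then atomically rename."""
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())   # force bytes to disk before publishing the checkpoint
    os.replace(tmp, path)      # atomic rename: readers see old state or new, never a torn file

def load_gate_state(path):
    """Recover the last checkpoint after a crash."""
    with open(path) as f:
        return json.load(f)
```

Calling `save_gate_state` after every scraper completion gives recovery a consistent snapshot to replay from, rather than restarting the crawl from the seed set.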

Variants

| Variant | Modification | When to use |
| --- | --- | --- |
| Budget-Capped Dynamic Join | Hard cap on total instances (max M); when the cap is reached, no new instances spawn and the completion condition collapses to quality-only | Compute budget is fixed; the dynamic crawl must terminate within resource limits regardless of discovery rate |
| Time-Bounded Dynamic Join | A timeout triggers forced completion after a maximum wall-clock duration; whatever is complete at that point is passed to aggregation | A hard latency SLA exists; graceful degradation on partial results is preferable to an SLA violation |
| Incremental Aggregation | Aggregation runs continuously as scrapers complete; the quality gate monitors aggregate output quality rather than individual scraper scores | Aggregation is cheap and incremental; running it continuously surfaces the quality signal needed for the completion condition more accurately than per-instance scores |
| Convergence-Detecting Completion | Completion fires when the marginal quality gain from the last K instances falls below a threshold (diminishing-returns detection) | Quality grows sublinearly with instance count; the optimal stopping point is the knee of the quality-cost curve |
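The Convergence-Detecting Completion variant reduces to a comparison over the aggregate-quality trajectory. A minimal sketch, with illustrative values for K and the minimum gain:

```python
def converged(quality_history, k=5, min_gain=0.005):
    """Fire when the aggregate quality gain over the last k instances
    falls below min_gain (diminishing-returns stop rule)."""
    if len(quality_history) <= k:
        return False                              # not enough signal yet
    gain = quality_history[-1] - quality_history[-1 - k]
    return gain < min_gain
```

In practice this check would run inside the Quality Gate after each scraper completion, replacing (or supplementing) the fixed 0.85 threshold.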

Related Patterns

| Pattern | Relationship |
| --- | --- |
| 60.65 Static Partial Join MI | Fixed-M, fixed-N variant — use when both are known at design time; far simpler to implement and audit |
| 60.66 Cancelling Partial Join MI | Fixed M, fixed N, with cancellation — use when cost recovery after N completions is more important than dynamic expansion |
| 40.48 Generalised AND-Join | Waits for all activated branches — the full-N complement; 30.40 adds dynamic instance creation and semantic completion |
| 10.15 Evaluator-Optimizer | The Quality Gate's role in 30.40 mirrors the Evaluator role — both assess output quality and decide whether to continue or stop |
| 30.31 Feedback Loop | The Discoverer–Scraper–QualityGate cycle is a feedback loop where scrapers feed new work back into the system; 30.40 adds the termination condition |