60.64 Cancel Multiple Instance Activity

For a task with multiple concurrent instances, all incomplete instances of that task can be cancelled at any point without affecting already-completed instances. Completed instances and their outputs are preserved. Only the running and pending instances of the specific MI task are terminated.


Motivating Scenario

A batch AI translation system receives a document set of 20 files and spawns one Translation Agent per document — 20 parallel instances of the same task. After 15 instances complete successfully, the user cancels the remaining 5 in-progress translations. The 15 completed translations are kept, their output files are valid, and they will be delivered. The 5 running instances are terminated; their partial outputs are discarded.

The key insight: this is not an all-or-nothing cancel (which would discard all 20). The completed instances have produced real value — destroying them would be wasteful. The user's intent is to stop further spending on the remaining work, not to invalidate work already done. The MI activity cancel mechanism preserves the completed-instance invariant: once an instance completes and its output is committed, that output is immutable. Only running and pending instances are vulnerable to cancellation.
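
The invariant above can be illustrated with a minimal sketch, assuming the instances run as `asyncio` tasks in a single process. The names `translate`, `run_batch`, `durations`, and the in-memory `store` are hypothetical stand-ins for a real Translation Agent and a durable output store; this is one possible shape, not the pattern's reference implementation.

```python
import asyncio


async def translate(doc_id: int, seconds: float, store: dict) -> None:
    """Hypothetical Translation Agent instance; the sleep stands in for real work."""
    await asyncio.sleep(seconds)
    store[doc_id] = f"translated-{doc_id}"  # commit point: output is kept from here on


async def run_batch(durations: dict, cancel_after: float):
    store: dict = {}
    tasks = {
        doc_id: asyncio.create_task(translate(doc_id, seconds, store))
        for doc_id, seconds in durations.items()
    }
    await asyncio.sleep(cancel_after)  # the user cancel arrives at this point
    for task in tasks.values():
        if not task.done():
            task.cancel()  # completed instances are never touched
    # Wait for every instance to settle so none is left running.
    await asyncio.gather(*tasks.values(), return_exceptions=True)
    cancelled = sorted(d for d, t in tasks.items() if t.cancelled())
    return store, cancelled
```

Calling `asyncio.run(run_batch(durations, 0.3))` with 15 fast documents and 5 slow ones would leave 15 entries in `store` and report the 5 slow document IDs as cancelled, mirroring the 15-of-20 scenario.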

Structure

Diagram (concrete example): batch translation with partial user cancellation

Key Metrics

| Metric | Signal |
|---|---|
| Completion rate at cancel time | Fraction of instances completed when cancel fires. Tracks how much value is preserved vs. discarded per cancel event. |
| Cancel response latency per instance | Time from cancel signal to confirmed instance termination. Slow responses indicate cancel-unresponsive agents or external API dependency. |
| Orphaned instance count | Instances still running after collector closes. Non-zero count indicates runtime cancel delivery failures. |
| Partial delivery rate | Fraction of batch jobs that end with a partial result set due to user cancel. High rates may indicate batch sizing or timeout configuration issues upstream. |

| Node | What it does | What it receives | What it produces |
|---|---|---|---|
| Dispatch Batch | AND-split: spawns one Translation Agent instance per document in the batch. All 20 instances are active simultaneously within the MI task boundary. | Document batch: 20 files with target language | 20 parallel instance tokens, one per document |
| Translation Agent (×20) | MI task: each instance translates one document independently. Instances complete asynchronously. Completed instances commit their output immediately. Running instances can be cancelled at any time by the Cancel MI signal. | Single document + target language configuration | Translated document (committed on instance completion) |
| Cancel Remaining | Receives the user cancel signal. Fires the MI cancel trigger targeting the Translation Agent task specifically. Already-completed instances are not affected. Routes partial results to Collect Completed. | External user cancel signal | MI cancel trigger for Translation Agent; partial completion token |
| Collect Completed | OR-join: collects results as they arrive, whether from natural completion of all 20 instances or from a partial set after a cancel event. Fires downstream when all active instances have either completed or been cancelled. | Translation results from completed instances (any count from 1..20) | Collected translation set with completion manifest: which documents succeeded, which were cancelled |
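
As a rough illustration of how the four metrics could be derived, the sketch below computes them from per-instance records taken at the moment the collector closes. The `InstanceRecord` type, its field names, and the state strings are assumptions made for this example; a real runtime would expose its own telemetry schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class InstanceRecord:
    doc_id: int
    state: str  # "completed", "cancelled", or "running" when the collector closes
    cancel_sent_at: Optional[float] = None
    terminated_at: Optional[float] = None


def completion_rate(records: List[InstanceRecord]) -> float:
    """Fraction of instances that had completed when the cancel fired."""
    return sum(r.state == "completed" for r in records) / len(records)


def cancel_latencies(records: List[InstanceRecord]) -> List[float]:
    """Seconds from cancel signal to confirmed termination, per cancelled instance."""
    return [
        r.terminated_at - r.cancel_sent_at
        for r in records
        if r.state == "cancelled"
        and r.cancel_sent_at is not None
        and r.terminated_at is not None
    ]


def orphaned_count(records: List[InstanceRecord]) -> int:
    """Instances still running after the collector closed."""
    return sum(r.state == "running" for r in records)


def partial_delivery_rate(batches: List[List[InstanceRecord]]) -> float:
    """Fraction of batches that ended with at least one cancelled instance."""
    partial = sum(any(r.state == "cancelled" for r in b) for b in batches)
    return partial / len(batches)
```

For the motivating scenario (15 completed, 5 cancelled), `completion_rate` returns 0.75 and `orphaned_count` should be zero if cancel delivery worked.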

When to Use

Use when
Avoid when

Value Profile

| Origin of Value | Where it appears | How it is captured |
|---|---|---|
| Future Cashflow | Translation Agent (per instance) | Each completed instance produces an independently valuable output. Value accumulates monotonically as instances complete, unlike 60.63, where value is all-or-nothing at the AND-join. A cancel event captures the value already produced without forfeiting it. |
| Governance | Cancel Remaining + Collect Completed | The completion manifest produced by Collect Completed is the governance record: which documents were translated, which were cancelled, and why. This is the audit trail for partial delivery, essential for billing (charge for 15 of 20), SLA management, and reprocessing the cancelled 5 later. |
| Conditional Action | Running Translation Agent instances | Each running instance is an active compute spend. Cancellation stops the spend immediately for in-progress instances. The cancel mechanism is a cost control lever: the user can cap total batch spend by cancelling when a sufficient fraction has completed. |
| Risk Exposure | Collect Completed (OR-join) | OR-join with mixed completion states (some completed, some cancelled) must correctly account for all instance states. If the join fires prematurely (before all running instances have either completed or received the cancel signal), some instances will continue running as orphans with no collector downstream. |

VCM analog: Partial vesting on MI tokens. Each Translation Agent instance holds a work token. Completed instances vest their tokens: the output is committed and the value is realized. Running instances hold unvested tokens that are cancelled (forfeited) when the MI cancel signal fires. Completed tokens are untouched by the cancel event.

Dynamics and Failure Modes

Race between instance completion and cancel signal

Instance 16 is 98% complete when the cancel signal fires. The runtime attempts to cancel it, but the instance completes its final write operation before the cancel is processed. Is instance 16 "completed" or "cancelled"? The answer depends on whether the output commit preceded the cancel signal in the runtime's event ordering. Fix: define a clear commitment boundary for each instance — the instance is completed when and only when its output is written to the durable store. Any instance that has not reached this boundary when the cancel fires is cancelled. Any instance that has already crossed the boundary is completed, regardless of the cancel signal timing.
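
One way to make the commitment boundary unambiguous is to guard the state transition with a lock so that commit and cancel cannot both succeed. The sketch below is a minimal in-process illustration; the class `InstanceState` and its method names are hypothetical, and the assignment to `output` stands in for a write to a durable store.

```python
import threading


class InstanceState:
    """Tracks one MI instance; exactly one of commit or cancel can win."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.state = "running"
        self.output = None

    def try_commit(self, output) -> bool:
        """Cross the commitment boundary unless a cancel got there first."""
        with self._lock:
            if self.state != "running":
                return False
            self.output = output  # stands in for the durable write
            self.state = "completed"
            return True

    def try_cancel(self) -> bool:
        """Cancel the instance unless its output is already committed."""
        with self._lock:
            if self.state != "running":
                return False
            self.state = "cancelled"
            return True
```

With this shape, instance 16 is "completed" if `try_commit` returned `True` and "cancelled" if `try_cancel` did; the ordering is decided by whichever call acquired the lock first, not by wall-clock timing of the signals.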

Orphaned instances after partial cancel

The cancel signal cancels instances 16..20 but instance 17 is making an external API call and does not respond to the cancel. It continues running and completes 30 seconds later, producing a translation output. The Collect Completed node has already fired with 15 results — instance 17's output is produced after the collector has closed. Fix: the collector must remain open until all instances — including cancelled ones — have confirmed termination. Confirmed termination for cancelled instances means the cancel signal was acknowledged, not just sent.
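
The difference between a cancel that was sent and one that was acknowledged can be sketched with `asyncio`. In the hypothetical `instance` coroutine below, `ack_delay` models an external call that takes time to unwind after the cancel arrives; the `collect` function is an assumed collector that stays open until every task has settled.

```python
import asyncio


async def instance(doc_id, work, ack_delay, store, events):
    """Hypothetical instance; ack_delay models a slow external call unwinding."""
    try:
        await asyncio.sleep(work)
        store[doc_id] = f"translated-{doc_id}"
    except asyncio.CancelledError:
        await asyncio.sleep(ack_delay)  # cancel received but not yet acknowledged
        events.append(f"ack-{doc_id}")
        raise  # the task ends as cancelled: this is the acknowledgement


async def collect(specs, cancel_after):
    store, events = {}, []
    tasks = {
        doc_id: asyncio.create_task(instance(doc_id, work, ack, store, events))
        for doc_id, (work, ack) in specs.items()
    }
    await asyncio.sleep(cancel_after)
    for task in tasks.values():
        if not task.done():
            task.cancel()  # the cancel is only *sent* here
    await asyncio.wait(tasks.values())  # stay open until every instance has settled
    events.append("collector-closed")
    manifest = {
        d: "cancelled" if t.cancelled() else "completed" for d, t in tasks.items()
    }
    return store, manifest, events
```

Because the collector awaits all tasks after sending the cancel, the acknowledgement from a slow instance is always recorded before the collector closes, so no output can appear after the result set has been handed downstream.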

Cancel signal targeting the wrong MI task

In a nested process with two MI tasks (e.g., Translation Agent + Quality Review Agent running concurrently), the cancel signal is directed at the wrong task ID. The Quality Review agents are cancelled but the Translation Agents continue. The collector receives translated documents but no quality reviews, so the downstream system receives a partial output in an unexpected schema. Fix: cancel signals must be addressed to a specific, named MI task rather than to whatever MI task happens to be active. Validate the cancel target before firing. Log the cancel event with the task name, instance count affected, and timestamp.
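
A minimal sketch of target validation and cancel logging follows. It assumes a hypothetical in-memory `registry` mapping each MI task name to its instance states; `cancel_mi_task` and the logger name are illustrative, and an existence check like this catches unknown targets but cannot by itself detect a valid name chosen by mistake.

```python
import logging
import time

logger = logging.getLogger("mi_cancel")  # hypothetical logger name


def cancel_mi_task(registry: dict, task_name: str, now=time.time) -> dict:
    """Cancel incomplete instances of one named MI task and log the event."""
    if task_name not in registry:
        raise ValueError(f"unknown MI task: {task_name!r}")  # validate before firing
    instances = registry[task_name]
    affected = [i for i, s in instances.items() if s in ("running", "pending")]
    for instance_id in affected:
        instances[instance_id] = "cancelled"  # completed instances are left alone
    event = {
        "task": task_name,
        "instances_affected": len(affected),
        "timestamp": now(),
    }
    logger.info("MI cancel fired: %s", event)
    return event
```

Only the instances under the named task change state; every other MI task in the registry is untouched, which is the property the failure mode above violates.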

Variants

| Variant | Modification | When to use |
|---|---|---|
| Cancel with Replacement | Cancelled instances are queued for retry in a new MI batch rather than discarded. The cancel event produces a "retry list" in addition to cancelling running instances. | The user cancelled due to cost control, not disinterest. The remaining 5 documents should be translated later; the cancel event triggers deferred processing rather than permanent abandonment. |
| Priority-Based Cancel | The cancel signal targets only low-priority instances. High-priority instances continue to completion even if the cancel signal fires. | Heterogeneous batch where some documents are marked critical and must complete regardless of cost. Cancel applies only to best-effort documents. |
| Threshold-Triggered Cancel | The cancel fires automatically when K instances have completed; no external user signal is required. Remaining instances are cancelled as soon as the threshold is met. | The system only needs K results, so completing the full batch is wasteful. Threshold-triggered cancel resembles 60.65 Complete MI Activity but differs in semantics: 60.64 preserves the K completed results and cancels the rest; 60.65 forces completion and accepts partial data. |
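
The Threshold-Triggered and Cancel with Replacement variants can be combined in one short sketch, assuming `asyncio` tasks and instances that do not fail. The function names `translate` and `run_until_threshold` are hypothetical; the returned retry list is simply the IDs of the cancelled instances.

```python
import asyncio


async def translate(doc_id: int, seconds: float) -> str:
    """Hypothetical instance; the sleep stands in for translation work."""
    await asyncio.sleep(seconds)
    return f"translated-{doc_id}"


async def run_until_threshold(durations: dict, k: int):
    tasks = {d: asyncio.create_task(translate(d, s)) for d, s in durations.items()}
    pending, finished = set(tasks.values()), set()
    # Keep waiting until at least k instances have completed.
    while pending and len(finished) < k:
        done, pending = await asyncio.wait(
            pending, return_when=asyncio.FIRST_COMPLETED
        )
        finished |= done
    for task in pending:
        task.cancel()  # threshold met: cancel whatever is still running
    if pending:
        await asyncio.gather(*pending, return_exceptions=True)
    completed = {d: t.result() for d, t in tasks.items() if not t.cancelled()}
    retry_list = sorted(d for d, t in tasks.items() if t.cancelled())
    return completed, retry_list
```

Every instance that finished before the cancel is kept, even if more than K happened to complete, and `retry_list` can be fed into a later MI batch.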

Related Patterns

| Pattern | Relationship |
|---|---|
| 80.85 Cancel Region | All-or-nothing cancel across a named region of multiple task types. Use when partial results have no value and the entire region must be terminated together. |
| 80.87 Complete MI Activity | Force-completes the MI task by withdrawing remaining instances and accepting partial data as sufficient. The difference: 60.64 preserves completed outputs; 60.65 forces the task to a "done" state regardless. |
| 20.24 Competitive Evaluation | Multiple instances competing: the first to satisfy a quality threshold wins and the rest are cancelled. Combine with 60.64 semantics to preserve the winning instance's output. |

Investment Signal

Cancel MI Activity is the mechanism that makes batch AI workloads cost-controllable without sacrificing completed work. Organizations processing large document batches need this pattern to implement usage-based billing, cost caps, and dynamic re-prioritization without requiring that every batch run to full completion. The ability to cancel 5 of 20 running agents while preserving 15 completed results is a product capability, not just a technical detail.

The moat is in the collector implementation. A Collect Completed node that correctly handles mixed completion states — some natural, some cancelled, some timed out — is significantly harder to build correctly than a simple AND-join that waits for all instances. Organizations that have solved the partial-collection problem can offer more flexible batch semantics, which is a competitive advantage in markets where customers have variable and unpredictable batch sizes.

Red flag: a batch system that does not distinguish between "cancelled" and "failed" in its completion manifest is losing information that is essential for billing and reprocessing decisions. Cancelled instances should be retried at the customer's discretion. Failed instances should trigger automatic retry. Conflating the two states produces incorrect billing and incorrect SLA calculations.
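
A completion manifest that keeps the two states apart might look like the sketch below. The `Outcome` enum, the `plan_followup` function, and the three output keys are hypothetical names chosen for this example; the routing follows the rule stated above (bill completed work, auto-retry failures, retry cancellations only on request).

```python
from enum import Enum


class Outcome(Enum):
    COMPLETED = "completed"
    CANCELLED = "cancelled"
    FAILED = "failed"


def plan_followup(manifest: dict) -> dict:
    """Route each document by outcome without conflating cancelled and failed."""
    return {
        "billable": [d for d, o in manifest.items() if o is Outcome.COMPLETED],
        "auto_retry": [d for d, o in manifest.items() if o is Outcome.FAILED],
        "retry_on_request": [d for d, o in manifest.items() if o is Outcome.CANCELLED],
    }
```

A manifest that stored only "not completed" could not populate `auto_retry` and `retry_on_request` separately, which is exactly the information loss the red flag describes.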