# Execution Lifecycle

## Unit Execution Lifecycle
- Obtain the next provider client via the Model Selection Policy.
- Cast the input to the optional `InputSchema`.
- Populate the `Prompt` template using the current execution context.
- Perform inference and extract the response into the unit's `ResponseSchema`.
- Validate the response using `validate`.
- Post-process the response into the unit's `OutputSchema` using `process`.
- Run any custom logic to adapt a Unit's `OutputSchema` to the next Unit's `InputSchema` using `.propagate`.
- If an error occurs at any point, refer to the pipeline failure/termination section.
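The steps above can be sketched as a single `execute` method. This is a minimal illustration, not the library's actual implementation: the `Unit` class shape, field names other than `validate`, `process`, and `propagate`, and the `client` callable are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

@dataclass
class Unit:
    # Hypothetical sketch of a Unit; field and method names beyond those
    # mentioned in the text (validate, process, propagate) are assumptions.
    prompt_template: str
    input_schema: Optional[Callable[[dict], dict]] = None          # optional InputSchema cast
    response_schema: Callable[[str], dict] = field(
        default=lambda raw: {"text": raw})                         # raw response -> ResponseSchema
    validate: Callable[[dict], bool] = field(default=lambda r: True)
    process: Callable[[dict], dict] = field(default=lambda r: r)   # ResponseSchema -> OutputSchema
    propagate: Callable[[dict], dict] = field(default=lambda o: o) # OutputSchema -> next InputSchema

    def execute(self, context: dict, client: Callable[[str], str]) -> dict:
        data = self.input_schema(context) if self.input_schema else context  # 1. cast input
        prompt = self.prompt_template.format(**data)                         # 2. populate Prompt
        response = self.response_schema(client(prompt))                      # 3. inference + extraction
        if not self.validate(response):                                      # 4. validate
            raise ValueError("validation failed")  # hands off to the failure/termination path
        output = self.process(response)                                      # 5. post-process
        return self.propagate(output)                                        # 6. adapt for next Unit
```

With a stub client, `Unit(prompt_template="Say {word}").execute({"word": "hi"}, client=str.upper)` runs the full lifecycle end to end.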
## Pipeline Execution Lifecycle
The underlying concurrency model is a `ThreadPoolExecutor` with a configurable `max_workers` parameter. Since the bulk of execution time is spent waiting on network I/O, `max_workers` can be set to a high value. CPU-bound tasks (e.g., `MapUnit`) are assumed to be lightweight and are handled by a separate executor with its own configurable `max_workers` parameter.
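The two-executor split might look like the following sketch; the executor names, pool sizes, and helper functions are assumptions, not the library's API.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical two-executor setup: a large pool for inference calls that
# mostly block on network I/O, and a small pool for lightweight CPU-bound
# units such as MapUnit. Sizes here are illustrative defaults.
io_executor = ThreadPoolExecutor(max_workers=64)
cpu_executor = ThreadPoolExecutor(max_workers=4)

def run_io(fn, *args):
    """Submit an I/O-bound task (e.g., an inference request)."""
    return io_executor.submit(fn, *args)

def run_cpu(fn, *args):
    """Submit a lightweight CPU-bound task (e.g., a MapUnit)."""
    return cpu_executor.submit(fn, *args)
```

Because threads waiting on sockets release the GIL, oversubscribing the I/O pool is cheap, while the CPU pool stays small to avoid contention.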
For a single sample, the execution lifecycle is as follows:
- Materialize the graph of primitives into a graph of `Unit`s.
- Queue all root nodes for execution.
- Once a `Unit`'s execution lifecycle completes, store its output.
- Queue any dependent `Unit`s that are now ready for execution.
- Repeat until all `Unit`s have executed.
- Gather the outputs of all leaf nodes.
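A minimal sketch of this single-sample loop over a dependency graph, assuming a simple adjacency representation and a `run(node, inputs)` callable (both assumptions, not the library's actual interfaces):

```python
from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait

def execute_graph(units, deps, run, max_workers=8):
    """units: list of node ids; deps: {node: set of upstream nodes};
    run(node, inputs) -> output. All names here are illustrative."""
    outputs = {}
    pending = {u: set(deps.get(u, ())) for u in units}  # unmet dependencies
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {}
        for u in units:                       # queue all root nodes
            if not pending[u]:
                futures[pool.submit(run, u, {})] = u
        while futures:                        # repeat until all units have executed
            done, _ = wait(futures, return_when=FIRST_COMPLETED)
            for fut in done:
                u = futures.pop(fut)
                outputs[u] = fut.result()     # store the output
                for v in units:               # queue newly-ready dependents
                    if u in pending[v]:
                        pending[v].discard(u)
                        if not pending[v]:
                            inputs = {d: outputs[d] for d in deps.get(v, ())}
                            futures[pool.submit(run, v, inputs)] = v
    # gather the outputs of all leaf nodes (nodes nothing depends on)
    leaves = [u for u in units if all(u not in deps.get(v, ()) for v in units)]
    return {u: outputs[u] for u in leaves}
```

On a diamond graph (`a -> b`, `a -> c`, `{b, c} -> d`), `b` and `c` execute concurrently once `a` finishes, and only `d`'s output is returned.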
For an entire dataset, we perform the same steps but build a copy of the graph for each sample and queue all samples for execution on the same `ThreadPoolExecutor`. Execution therefore proceeds across all samples in parallel.
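Dataset-level fan-out can be sketched as follows; `run_dataset`, `execute_sample`, and the dict-based graph are hypothetical names for illustration.

```python
import copy
from concurrent.futures import ThreadPoolExecutor

def run_dataset(graph, samples, execute_sample, max_workers=32):
    """Give each sample its own copy of the materialized graph and submit
    all of them to one shared ThreadPoolExecutor, so work interleaves
    across samples rather than running them one at a time."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(execute_sample, copy.deepcopy(graph), sample)
            for sample in samples
        ]
        return [f.result() for f in futures]  # results in input order
```

Copying the graph per sample keeps per-sample state (stored outputs, ready queues) isolated, so no locking is needed between samples.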