# Execution Lifecycle
## Unit Execution Lifecycle
- Obtain the next provider client via the Model Selection Policy.
- Cast the input to the optional `InputSchema`.
- Populate the `Prompt` template using the current execution context.
- Perform inference and extract the response into the unit's `ResponseSchema`.
- Validate the response using `validate`.
- Post-process the response into the unit's `OutputSchema` using `process`.
- Run any custom logic to adapt a `Unit`'s `OutputSchema` to the next `Unit`'s `InputSchema` using `.propagate`.
- If at any point an error occurs, refer to the pipeline failure/termination section.
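The steps above can be sketched as a single driver function. This is a hedged illustration, not the framework's actual API: the `Unit` fields and the names `run_unit`, `validate`, `process`, and `propagate` here are stand-ins for the hooks described in the list.

```python
from dataclasses import dataclass
from typing import Any, Callable

# Illustrative sketch of the unit execution lifecycle; all names are assumptions.
@dataclass
class Unit:
    prompt_template: str                                  # Prompt template
    validate: Callable[[dict], bool] = lambda r: True     # response validation hook
    process: Callable[[dict], dict] = lambda r: r         # ResponseSchema -> OutputSchema
    propagate: Callable[[dict], dict] = lambda o: o       # adapt to next unit's InputSchema

def run_unit(unit: Unit, raw_input: dict, client: Callable[[str], dict]) -> dict:
    prompt = unit.prompt_template.format(**raw_input)     # populate the Prompt template
    response = client(prompt)                             # inference -> ResponseSchema
    if not unit.validate(response):                       # validate the response
        raise ValueError("response failed validation")
    output = unit.process(response)                       # post-process -> OutputSchema
    return unit.propagate(output)                         # hand off to the next unit
```

An error raised at any step (e.g. the `ValueError` above) would be handled per the pipeline failure/termination rules.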
## Pipeline Execution Lifecycle
The underlying concurrency model is a `ThreadPoolExecutor` with a configurable `max_workers` parameter. Since the bulk of the execution time is spent waiting on network I/O, `max_workers` can be set to a high value. CPU-bound tasks (e.g., `MapUnit`) are assumed to be lightweight and are handled by a separate executor with its own configurable `max_workers` parameter.
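A minimal sketch of this two-executor setup, with illustrative worker counts and a stub standing in for a provider call:

```python
from concurrent.futures import ThreadPoolExecutor

# I/O-bound inference calls: many workers, since threads mostly wait on the network.
io_executor = ThreadPoolExecutor(max_workers=64)
# Lightweight CPU-bound work (e.g., a MapUnit): a small, separate pool.
cpu_executor = ThreadPoolExecutor(max_workers=4)

def infer(prompt: str) -> str:
    # Stand-in for a network call to a provider client.
    return prompt[::-1]

future = io_executor.submit(infer, "hello")
result = future.result()  # blocks until the worker thread finishes
```

Because threads blocked on network I/O release the GIL, a high `max_workers` on the I/O pool costs little while keeping many requests in flight.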
For a single sample, the execution lifecycle is as follows:
- Materialize the graph of primitives into a graph of `Unit`s.
- Queue all the root nodes for execution.
- Once a `Unit`'s execution lifecycle is complete, store its output.
- Queue the ready dependent `Unit`s for execution.
- Repeat until all `Unit`s have been executed.
- Gather the outputs of all leaf nodes.
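The scheduling loop above amounts to a ready-queue traversal of the unit graph. A simplified sequential sketch of that structure (illustrative only; the real pipeline dispatches ready units to the executor instead of running them inline):

```python
from collections import deque

def execute_graph(units, deps, run):
    """units: node ids; deps: node -> set of parents; run: (node, parent_outputs) -> output."""
    indegree = {u: len(deps[u]) for u in units}
    children = {u: [] for u in units}
    for u in units:
        for p in deps[u]:
            children[p].append(u)
    ready = deque(u for u in units if indegree[u] == 0)     # queue all root nodes
    outputs = {}
    while ready:
        u = ready.popleft()
        outputs[u] = run(u, [outputs[p] for p in deps[u]])  # execute, store the output
        for c in children[u]:                               # queue ready dependents
            indegree[c] -= 1
            if indegree[c] == 0:
                ready.append(c)
    return {u: outputs[u] for u in units if not children[u]}  # gather leaf outputs
```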
For an entire dataset, we perform the same steps, but build a copy of the graph for each sample and queue all samples for execution on the same `ThreadPoolExecutor`. Hence, execution occurs across all samples in parallel.
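Dataset-level execution can be pictured as submitting one per-sample run to the shared executor. In this sketch, `run_sample` is a placeholder for building a fresh graph copy, executing it, and gathering its leaf outputs:

```python
from concurrent.futures import ThreadPoolExecutor

def run_sample(sample: dict) -> dict:
    # Placeholder: materialize a fresh graph copy for this sample,
    # execute it, and return the gathered leaf outputs.
    return {"result": sample["x"] * 2}

dataset = [{"x": i} for i in range(8)]
with ThreadPoolExecutor(max_workers=8) as executor:
    # One graph copy per sample, all queued on the same executor.
    results = list(executor.map(run_sample, dataset))
```

`executor.map` preserves input order, so `results[i]` corresponds to `dataset[i]` even though samples complete in parallel.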