# Execution Lifecycle

## Unit Execution Lifecycle
- Obtain the next provider client via the Model Selection Policy.
- Cast the input to the optional `InputSchema`.
- Populate the `Prompt` template using the current execution context.
- Perform inference and extract the response into the unit's `ResponseSchema`.
- Validate the response using `validate`.
- Post-process the response into the unit's `OutputSchema` using `process`.
- Run any custom logic to adapt a Unit's `OutputSchema` to the next Unit's `InputSchema` using `.propagate`.
- If an error occurs at any point, refer to the pipeline failure/termination section.
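The steps above can be sketched as a single `execute` method. This is a minimal illustration, not the library's actual implementation: the `Unit` class shape, field names other than `validate`, `process`, and `propagate`, and the `client` callable are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

@dataclass
class Unit:
    # Hypothetical sketch of a Unit; field and method names beyond those
    # mentioned in the text (validate, process, propagate) are assumptions.
    prompt_template: str
    input_schema: Optional[Callable[[dict], dict]] = None          # optional InputSchema cast
    response_schema: Callable[[str], dict] = field(
        default=lambda raw: {"text": raw})                         # raw response -> ResponseSchema
    validate: Callable[[dict], bool] = field(default=lambda r: True)
    process: Callable[[dict], dict] = field(default=lambda r: r)   # ResponseSchema -> OutputSchema
    propagate: Callable[[dict], dict] = field(default=lambda o: o) # OutputSchema -> next InputSchema

    def execute(self, context: dict, client: Callable[[str], str]) -> dict:
        data = self.input_schema(context) if self.input_schema else context  # 1. cast input
        prompt = self.prompt_template.format(**data)                         # 2. populate Prompt
        response = self.response_schema(client(prompt))                      # 3. inference + extraction
        if not self.validate(response):                                      # 4. validate
            raise ValueError("validation failed")  # hands off to the failure/termination path
        output = self.process(response)                                      # 5. post-process
        return self.propagate(output)                                        # 6. adapt for next Unit
```

With a stub client, `Unit(prompt_template="Say {word}").execute({"word": "hi"}, client=str.upper)` runs the full lifecycle end to end.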
## Pipeline Execution Lifecycle
The underlying concurrency model is a `ThreadPoolExecutor` with a configurable `max_workers` parameter. Since the bulk of execution time is spent waiting on network I/O, `max_workers` can be set to a high value. CPU-bound tasks (e.g., `MapUnit`) are assumed to be lightweight and are handled by a separate executor with its own configurable `max_workers` parameter.
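The two-executor split might look like the following sketch; the executor names, pool sizes, and helper functions are assumptions, not the library's API.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical two-executor setup: a large pool for inference calls that
# mostly block on network I/O, and a small pool for lightweight CPU-bound
# units such as MapUnit. Sizes here are illustrative defaults.
io_executor = ThreadPoolExecutor(max_workers=64)
cpu_executor = ThreadPoolExecutor(max_workers=4)

def run_io(fn, *args):
    """Submit an I/O-bound task (e.g., an inference request)."""
    return io_executor.submit(fn, *args)

def run_cpu(fn, *args):
    """Submit a lightweight CPU-bound task (e.g., a MapUnit)."""
    return cpu_executor.submit(fn, *args)
```

Because threads waiting on sockets release the GIL, oversubscribing the I/O pool is cheap, while the CPU pool stays small to avoid contention.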
For a single sample, the execution lifecycle is as follows:
- Materialize the graph of primitives into a graph of `Unit`s.
- Queue all root nodes for execution.
- Once a `Unit`'s execution lifecycle completes, store its output.
- Queue any dependent `Unit`s that are now ready for execution.
- Repeat until all `Unit`s have executed.
- Gather the outputs of all leaf nodes.
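A minimal sketch of this single-sample loop over a dependency graph, assuming a simple adjacency representation and a `run(node, inputs)` callable (both assumptions, not the library's actual interfaces):

```python
from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait

def execute_graph(units, deps, run, max_workers=8):
    """units: list of node ids; deps: {node: set of upstream nodes};
    run(node, inputs) -> output. All names here are illustrative."""
    outputs = {}
    pending = {u: set(deps.get(u, ())) for u in units}  # unmet dependencies
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {}
        for u in units:                       # queue all root nodes
            if not pending[u]:
                futures[pool.submit(run, u, {})] = u
        while futures:                        # repeat until all units have executed
            done, _ = wait(futures, return_when=FIRST_COMPLETED)
            for fut in done:
                u = futures.pop(fut)
                outputs[u] = fut.result()     # store the output
                for v in units:               # queue newly-ready dependents
                    if u in pending[v]:
                        pending[v].discard(u)
                        if not pending[v]:
                            inputs = {d: outputs[d] for d in deps.get(v, ())}
                            futures[pool.submit(run, v, inputs)] = v
    # gather the outputs of all leaf nodes (nodes nothing depends on)
    leaves = [u for u in units if all(u not in deps.get(v, ()) for v in units)]
    return {u: outputs[u] for u in leaves}
```

On a diamond graph (`a -> b`, `a -> c`, `{b, c} -> d`), `b` and `c` execute concurrently once `a` finishes, and only `d`'s output is returned.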
For an entire dataset, we perform the same steps but build a copy of the graph for each sample and queue all samples for execution on the same `ThreadPoolExecutor`. Execution therefore proceeds across all samples in parallel.
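Dataset-level fan-out can be sketched as follows; `run_dataset`, `execute_sample`, and the dict-based graph are hypothetical names for illustration.

```python
import copy
from concurrent.futures import ThreadPoolExecutor

def run_dataset(graph, samples, execute_sample, max_workers=32):
    """Give each sample its own copy of the materialized graph and submit
    all of them to one shared ThreadPoolExecutor, so work interleaves
    across samples rather than running them one at a time."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(execute_sample, copy.deepcopy(graph), sample)
            for sample in samples
        ]
        return [f.result() for f in futures]  # results in input order
```

Copying the graph per sample keeps per-sample state (stored outputs, ready queues) isolated, so no locking is needed between samples.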