ExecutionConfiguration class
Configuration management for DerivaML executions.
This module provides functionality for configuring and managing execution parameters in DerivaML. It includes:
- ExecutionConfiguration class: Core class for execution settings
- Parameter validation: Handles JSON and file-based parameters
- Dataset specifications: Manages dataset versions and materialization
- Asset management: Tracks required input files with optional caching
The module supports both direct parameter specification and JSON-based configuration files.
Typical usage example
    workflow = ml.lookup_workflow_by_url("https://github.com/my-org/my-repo")
    config = ExecutionConfiguration(
        workflow=workflow,
        datasets=[DatasetSpec(rid="1-abc123", version="1.0.0")],
        description="Process sample data"
    )
    execution = ml.create_execution(config)
AssetRID
dataclass
Bases: str
A string subclass representing an asset Resource ID with optional description.
Deprecated: Use AssetSpec instead for new code. AssetRID is retained for backward compatibility.
Attributes:
| Name | Type | Description |
|---|---|---|
| rid | str | The Resource ID string identifying the asset in Deriva. |
| description | str | Optional human-readable description of the asset. |
Example
    asset = AssetRID("3RA", "Pretrained model weights")
    print(asset)  # "3RA"
    print(asset.description)  # "Pretrained model weights"
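The behavior in the example above (an object that prints and compares like a plain RID string but also carries a description) can be illustrated with a small standalone sketch. `DescribedRID` is a hypothetical stand-in written for this illustration; it is not the `AssetRID` source and makes no claim about how DerivaML implements it.

```python
class DescribedRID(str):
    """Hypothetical stand-in: a string that also carries a description."""

    def __new__(cls, rid: str, description: str = ""):
        # str is immutable, so the string value has to be fixed in __new__.
        obj = super().__new__(cls, rid)
        # Extra attributes can still be attached to the new instance.
        obj.description = description
        return obj


asset = DescribedRID("3RA", "Pretrained model weights")
print(asset)                   # prints the RID itself
print(asset.description)       # prints the attached description
print(isinstance(asset, str))  # the object is still usable as a plain string
```

Because the object is still a `str`, code that expects a bare RID keeps working, which is presumably why such a type is easy to retain for backward compatibility.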
Source code in src/deriva_ml/execution/execution_configuration.py
ExecutionConfiguration
Bases: BaseModel
Configuration for a DerivaML execution.
Defines the complete configuration for a computational or manual process in DerivaML, including required datasets, input assets, workflow definition, and parameters.
Attributes:
| Name | Type | Description |
|---|---|---|
| datasets | list[DatasetSpec] | Dataset specifications, each containing: rid (Dataset Resource Identifier), version (version to use), materialize (whether to extract dataset contents). |
| assets | list[AssetSpec] | Asset specifications. Each element can be: a plain RID string (no caching); an … |
| workflow | Workflow \| None | Workflow object defining the computational process. Use … |
| description | str | Description of execution purpose (supports Markdown). |
| argv | list[str] | Command line arguments used to start execution. |
| config_choices | dict[str, str] | Hydra config group choices that were selected. Maps group names to selected config names (e.g., {"model_config": "cifar10_quick"}). Automatically populated by run_model() and get_notebook_configuration(). |
Example
    # Plain RIDs (backward compatible)
    config = ExecutionConfiguration(assets=["6-EPNR", "6-EP56"])

    # Mixed: cached model weights + uncached embeddings
    config = ExecutionConfiguration(
        assets=[
            AssetSpec(rid="6-EPNR", cache=True),
            "6-EP56",
        ]
    )
Source code in src/deriva_ml/execution/execution_configuration.py
load_configuration
staticmethod
load_configuration(
path: Path,
) -> ExecutionConfiguration
Creates an ExecutionConfiguration from a JSON file.
Loads and parses a JSON configuration file into an ExecutionConfiguration instance. The file should contain a valid configuration specification.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | Path | Path to JSON configuration file. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| ExecutionConfiguration | ExecutionConfiguration | Loaded configuration instance. |
Raises:
| Type | Description |
|---|---|
| ValueError | If JSON file is invalid or missing required fields. |
| FileNotFoundError | If configuration file doesn't exist. |
Example
    config = ExecutionConfiguration.load_configuration(Path("config.json"))
    print(f"Workflow: {config.workflow}")
    print(f"Datasets: {len(config.datasets)}")
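The error contract described above (FileNotFoundError for a missing file, ValueError for invalid JSON) can be sketched with the standard library alone. `load_config_dict` is a hypothetical helper written for this illustration and returns a plain dict; the real method returns an ExecutionConfiguration, and the JSON keys used in the demo are assumed to mirror the attribute names listed earlier rather than taken from a documented file schema.

```python
import json
import tempfile
from pathlib import Path
from typing import Any


def load_config_dict(path: Path) -> dict[str, Any]:
    """Hypothetical helper: read a JSON config file into a plain dict."""
    if not path.is_file():
        raise FileNotFoundError(f"Configuration file not found: {path}")
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        # Re-raise with the file name so the caller knows which file is bad.
        raise ValueError(f"Invalid JSON in {path}: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError(f"Expected a JSON object at the top level of {path}")
    return data


# Demo with a throwaway file; the key names are an assumption (see above).
with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "config.json"
    config_path.write_text(
        json.dumps({"description": "Process sample data", "assets": ["6-EPNR", "6-EP56"]}),
        encoding="utf-8",
    )
    loaded = load_config_dict(config_path)
    print(loaded["description"])
    print(len(loaded["assets"]))

    try:
        load_config_dict(Path(tmp) / "missing.json")
    except FileNotFoundError as exc:
        missing_error = str(exc)
        print(missing_error)
```

Callers of the real method can wrap it in the same `try`/`except` pattern to distinguish a missing file from a malformed one.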
Source code in src/deriva_ml/execution/execution_configuration.py
validate_assets
classmethod
validate_assets(value: Any) -> Any
Normalize asset entries to AssetSpec objects.
Accepts plain RID strings, AssetRID objects, DictConfig from Hydra, AssetSpec objects, or dicts with 'rid' and optional 'cache' keys.
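The normalization described above can be pictured with a standalone sketch. Everything here is a stand-in under stated assumptions: `SketchAssetSpec` and `normalize_assets` are hypothetical names, the default of `cache=False` for dicts without a 'cache' key is assumed, and Hydra's DictConfig is not handled explicitly (only generic mappings are). It is not the library's validator.

```python
from collections.abc import Mapping
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class SketchAssetSpec:
    """Hypothetical stand-in for AssetSpec with the two fields shown above."""
    rid: str
    cache: bool = False


def normalize_assets(value: Any) -> Any:
    """Convert each entry of a list of assets into a SketchAssetSpec."""
    if not isinstance(value, (list, tuple)):
        return value  # leave non-sequence input for other validation to reject
    normalized = []
    for item in value:
        if isinstance(item, SketchAssetSpec):
            normalized.append(item)  # already in the target form
        elif isinstance(item, str):
            # Plain RID strings (and str subclasses) mean "no caching".
            normalized.append(SketchAssetSpec(rid=str(item)))
        elif isinstance(item, Mapping):
            # Mapping with 'rid' and an optional 'cache' key.
            normalized.append(
                SketchAssetSpec(rid=item["rid"], cache=bool(item.get("cache", False)))
            )
        else:
            raise TypeError(f"Unsupported asset entry: {item!r}")
    return normalized


specs = normalize_assets(
    ["6-EP56", {"rid": "6-EPNR", "cache": True}, SketchAssetSpec("3RA")]
)
for spec in specs:
    print(spec.rid, spec.cache)
```

Normalizing at validation time means the rest of the code only ever has to deal with one asset type, whichever form the caller supplied.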
Source code in src/deriva_ml/execution/execution_configuration.py