Helper Classes
Shared definitions used across DerivaML modules. This module re-exports all symbols from the core submodules for backward compatibility.
BaseStrEnum
Bases: str, Enum
Base class for string-based enumerations.
Extends both str and Enum to create string enums that are both string-like and enumerated. This provides type safety while maintaining string compatibility.
Example
    class MyEnum(BaseStrEnum):
        VALUE = "value"

    isinstance(MyEnum.VALUE, str)   # True
    isinstance(MyEnum.VALUE, Enum)  # True
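The practical effect of mixing `str` into an `Enum` can be seen in a small standard-library sketch. `StrEnumSketch` and `Color` are hypothetical stand-ins defined here for illustration; they are not part of DerivaML, and the real `BaseStrEnum` may add behavior not shown.

```python
import json
from enum import Enum


class StrEnumSketch(str, Enum):
    """Hypothetical stand-in for BaseStrEnum (not the library class)."""


class Color(StrEnumSketch):
    RED = "red"
    GREEN = "green"


# Members compare equal to their plain string values.
same_as_string = Color.RED == "red"

# Lookup by value returns the enumerated member itself.
looked_up = Color("green")

# Because members are str instances, json.dumps emits a plain JSON string.
encoded = json.dumps({"color": Color.RED})
```

Code that expects plain strings (JSON encoding, dictionary lookups, string comparison) keeps working, while the set of allowed values stays closed.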
Source code in src/deriva_ml/core/enums.py
BuiltinTypes
Bases: Enum
ERMrest built-in data types.
Maps ERMrest's built-in data types to their type names. These types are used for defining column types in tables and for type validation.
Attributes:
Name | Type | Description |
---|---|---|
text | str | Text/string type. |
int2 | str | 16-bit integer. |
jsonb | str | Binary JSON. |
float8 | str | 64-bit float. |
timestamp | str | Timestamp without timezone. |
int8 | str | 64-bit integer. |
boolean | str | Boolean type. |
json | str | JSON type. |
float4 | str | 32-bit float. |
int4 | str | 32-bit integer. |
timestamptz | str | Timestamp with timezone. |
date | str | Date type. |
ermrest_rid | str | Resource identifier. |
ermrest_rcb | str | Record created by. |
ermrest_rmb | str | Record modified by. |
ermrest_rct | str | Record creation time. |
ermrest_rmt | str | Record modification time. |
markdown | str | Markdown text. |
longtext | str | Long text. |
ermrest_curie | str | Compact URI. |
ermrest_uri | str | URI type. |
color_rgb_hex | str | RGB color in hex. |
serial2 | str | 16-bit auto-incrementing. |
serial4 | str | 32-bit auto-incrementing. |
serial8 | str | 64-bit auto-incrementing. |
Source code in src/deriva_ml/core/enums.py
ColumnDefinition
Bases: BaseModel
Defines a column in an ERMrest table.
Provides a Pydantic model for defining columns with their types, constraints, and metadata. Maps to deriva_py's Column.define functionality.
Attributes:
Name | Type | Description |
---|---|---|
name | str | Name of the column. |
type | BuiltinTypes | ERMrest data type for the column. |
nullok | bool | Whether NULL values are allowed. Defaults to True. |
default | Any | Default value for the column. |
comment | str \| None | Description of the column's purpose. |
acls | dict | Access control lists. |
acl_bindings | dict | Dynamic access control bindings. |
annotations | dict | Additional metadata annotations. |
Example
    col = ColumnDefinition(
        name="score",
        type=BuiltinTypes.float4,
        nullok=False,
        comment="Confidence score between 0 and 1"
    )
Source code in src/deriva_ml/core/ermrest.py
ExecAssetType
Bases: BaseStrEnum
Execution asset type identifiers.
Defines the types of assets that can be produced during an execution.
Attributes:
Name | Type | Description |
---|---|---|
input_file | str | Input file used by the execution. |
output_file | str | Output file produced by the execution. |
notebook_output | str | Jupyter notebook output from the execution. |
Source code in src/deriva_ml/core/enums.py
ExecMetadataType
Bases: BaseStrEnum
Execution metadata type identifiers.
Defines the types of metadata that can be associated with an execution.
Attributes:
Name | Type | Description |
---|---|---|
execution_config | str | Execution configuration data. |
runtime_env | str | Runtime environment information. |
Source code in src/deriva_ml/core/enums.py
FileSpec
Bases: BaseModel
An entry in the File table.
Attributes:
Name | Type | Description |
---|---|---|
url | str | URL of the file. |
description | str \| None | The description of the file. |
md5 | str | The MD5 hash of the file. |
length | int | The length of the file in bytes. |
file_types | conlist(str) \| None | A list of file types. Each file type should be a defined term in the MLVocab.file_type vocabulary. |
Source code in src/deriva_ml/core/filespec.py
create_filespecs
classmethod
create_filespecs(
path: Path | str,
description: str,
file_types: list[str]
| Callable[[Path], list[str]]
| None = None,
) -> Generator[FileSpec, None, None]
Given a file or directory, generate the corresponding FileSpecs, suitable for populating a File table.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | Path \| str | Path to the file or directory. | required |
description | str | The description of the file(s). | required |
file_types | list[str] \| Callable[[Path], list[str]] \| None | A list of file types, or a function that takes a file path and returns a list of file types. | None |
Returns:
Type | Description |
---|---|
Generator[FileSpec, None, None] | A generator yielding one FileSpec for each file found at path. |
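The following is a minimal standard-library sketch of how a generator of this shape might work, not the library's implementation. `iter_file_records` is a hypothetical stand-in that yields plain dicts rather than `FileSpec` objects. The recursive directory walk, the sorted order, and the use of a plain `file://` URI for `url` are all assumptions made for illustration.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable, Iterator, List, Optional, Union


def iter_file_records(
    path: Union[Path, str],
    description: str,
    file_types: Optional[Union[List[str], Callable[[Path], List[str]]]] = None,
) -> Iterator[dict]:
    """Hypothetical stand-in for FileSpec.create_filespecs; yields plain dicts."""
    root = Path(path)
    # A single file yields one record; a directory is walked recursively (assumption).
    files = [root] if root.is_file() else sorted(p for p in root.rglob("*") if p.is_file())
    for file in files:
        data = file.read_bytes()
        # file_types may be a fixed list or a function of the file path.
        types = file_types(file) if callable(file_types) else (file_types or [])
        yield {
            "url": file.resolve().as_uri(),  # assumption: plain file:// URI
            "description": description,
            "md5": hashlib.md5(data).hexdigest(),
            "length": len(data),
            "file_types": list(types),
        }


# Usage: describe two files, deriving each file type from its extension.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "a.txt").write_bytes(b"hello")
    (Path(tmp) / "b.csv").write_bytes(b"x,y\n")
    records = list(
        iter_file_records(tmp, "demo files", lambda p: [p.suffix.lstrip(".")])
    )
```

Passing a callable for `file_types` lets each file be classified individually, while a plain list applies the same types to every file.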
Source code in src/deriva_ml/core/filespec.py
read_filespec
staticmethod
read_filespec(
path: Path | str,
) -> Generator[FileSpec, None, None]
Get FileSpecs from a JSON lines file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | Path \| str | Path to the .jsonl file (string or Path). | required |
Yields:
Type | Description |
---|---|
FileSpec | A FileSpec object. |
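A JSON lines file holds one JSON object per line, so a reader can yield records lazily. The sketch below shows that pattern with the standard library only. `read_jsonl_records` is a hypothetical stand-in that yields plain dicts, and skipping blank lines is an assumption, not documented behavior of `read_filespec`.

```python
import json
import tempfile
from pathlib import Path
from typing import Iterator, Union


def read_jsonl_records(path: Union[Path, str]) -> Iterator[dict]:
    """Hypothetical stand-in for FileSpec.read_filespec; yields plain dicts."""
    with Path(path).open("r", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # assumption: tolerate blank lines between records
            yield json.loads(line)


# Usage: write a two-record .jsonl file, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    jsonl = Path(tmp) / "files.jsonl"
    jsonl.write_text(
        '{"url": "file:///data/a.txt", "md5": "abc", "length": 5}\n'
        "\n"
        '{"url": "file:///data/b.txt", "md5": "def", "length": 7}\n',
        encoding="utf-8",
    )
    loaded = list(read_jsonl_records(jsonl))
```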
Source code in src/deriva_ml/core/filespec.py
validate_file_url
classmethod
validate_file_url(url: str) -> str
Examine the provided URL. If it's a local path, convert it into a tag URL.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
url | str | The URL to validate and potentially convert. | required |
Returns:
Type | Description |
---|---|
str | The validated/converted URL. |
Raises:
Type | Description |
---|---|
ValidationError | If the URL is not a file URL. |
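The conversion described above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: `to_tag_url` is a hypothetical helper, the tag layout `tag://<host>,<date>:file://<path>` is assumed, the host and date are passed in explicitly to keep the result deterministic, only absolute POSIX-style paths are handled, and `ValueError` stands in for the `ValidationError` that the validator raises.

```python
from datetime import date
from urllib.parse import urlparse


def to_tag_url(url: str, host: str, day: date) -> str:
    """Hypothetical stand-in for FileSpec.validate_file_url.

    Assumed tag layout: tag://<host>,<ISO date>:file://<absolute path>
    """
    parts = urlparse(url)
    if parts.scheme == "tag":
        return url  # already a tag URL, leave it unchanged
    if parts.scheme in ("", "file"):
        # A bare local path or a file:// URL: wrap the path in a tag URL.
        return f"tag://{host},{day.isoformat()}:file://{parts.path}"
    raise ValueError(f"not a file URL: {url}")


converted = to_tag_url("/data/a.txt", "myhost", date(2024, 1, 15))
from_file_url = to_tag_url("file:///data/a.txt", "myhost", date(2024, 1, 15))
```

Under these assumptions, a bare path and the equivalent `file://` URL produce the same tag URL, and any other scheme is rejected.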
Source code in src/deriva_ml/core/filespec.py
FileUploadState
Bases: BaseModel
Tracks the state and result of a file upload operation.
Attributes:
Name | Type | Description |
---|---|---|
state | UploadState | Current state of the upload (success, failed, etc.). |
status | str | Detailed status message. |
result | Any | Upload result data, if any. |
rid | RID \| None | Resource identifier of the uploaded file, if successful. |
Source code in src/deriva_ml/core/ermrest.py
ForeignKeyDefinition
Bases: BaseModel
Defines a foreign key relationship between tables.
Provides a Pydantic model for defining foreign key constraints with referential actions and metadata. Maps to deriva_py's ForeignKey.define functionality.
Attributes:
Name | Type | Description |
---|---|---|
colnames | Iterable[str] | Names of columns in the referencing table. |
pk_sname | str | Schema name of the referenced table. |
pk_tname | str | Name of the referenced table. |
pk_colnames | Iterable[str] | Names of columns in the referenced table. |
constraint_names | Iterable[str] | Names for the foreign key constraints. |
on_update | str | Action on update of referenced row. Defaults to "NO ACTION". |
on_delete | str | Action on delete of referenced row. Defaults to "NO ACTION". |
comment | str \| None | Description of the relationship. |
acls | dict | Access control lists. |
acl_bindings | dict | Dynamic access control bindings. |
annotations | dict | Additional metadata annotations. |
Example
    fk = ForeignKeyDefinition(
        colnames=["dataset_id"],
        pk_sname="core",
        pk_tname="dataset",
        pk_colnames=["id"],
        on_delete="CASCADE"
    )
Source code in src/deriva_ml/core/ermrest.py
KeyDefinition
Bases: BaseModel
Defines a key constraint in an ERMrest table.
Provides a Pydantic model for defining primary keys and unique constraints. Maps to deriva_py's Key.define functionality.
Attributes:
Name | Type | Description |
---|---|---|
colnames | Iterable[str] | Names of columns that form the key. |
constraint_names | Iterable[str] | Names for the key constraints. |
comment | str \| None | Description of the key's purpose. |
annotations | dict | Additional metadata annotations. |
Example
    key = KeyDefinition(
        colnames=["id", "version"],
        constraint_names=["unique_id_version"],
        comment="Unique identifier with version"
    )
Source code in src/deriva_ml/core/ermrest.py
MLAsset
Bases: BaseStrEnum
Asset type identifiers.
Defines the types of assets that can be associated with executions.
Attributes:
Name | Type | Description |
---|---|---|
execution_metadata | str | Metadata about an execution. |
execution_asset | str | Asset produced by an execution. |
Source code in src/deriva_ml/core/enums.py
MLVocab
Bases: BaseStrEnum
Controlled vocabulary type identifiers.
Defines the names of controlled vocabulary tables used in DerivaML for various types of entities and attributes.
Attributes:
Name | Type | Description |
---|---|---|
dataset_type | str | Dataset classification vocabulary. |
workflow_type | str | Workflow classification vocabulary. |
asset_type | str | Asset classification vocabulary. |
asset_role | str | Asset role classification vocabulary. |
Source code in src/deriva_ml/core/enums.py
Status
Bases: BaseStrEnum
Execution status values.
Represents the various states an execution can be in throughout its lifecycle.
Attributes:
Name | Type | Description |
---|---|---|
initializing | str | Initial setup is in progress. |
created | str | Execution record has been created. |
pending | str | Execution is queued. |
running | str | Execution is in progress. |
aborted | str | Execution was manually stopped. |
completed | str | Execution finished successfully. |
failed | str | Execution encountered an error. |
Source code in src/deriva_ml/core/enums.py
TableDefinition
Bases: BaseModel
Defines a complete table structure in ERMrest.
Provides a Pydantic model for defining tables with their columns, keys, and relationships. Maps to deriva_py's Table.define functionality.
Attributes:
Name | Type | Description |
---|---|---|
name | str | Name of the table. |
column_defs | Iterable[ColumnDefinition] | Column definitions. |
key_defs | Iterable[KeyDefinition] | Key constraint definitions. |
fkey_defs | Iterable[ForeignKeyDefinition] | Foreign key relationship definitions. |
comment | str \| None | Description of the table's purpose. |
acls | dict | Access control lists. |
acl_bindings | dict | Dynamic access control bindings. |
annotations | dict | Additional metadata annotations. |
Example
    table = TableDefinition(
        name="experiment",
        column_defs=[
            ColumnDefinition(name="id", type=BuiltinTypes.text),
            ColumnDefinition(name="date", type=BuiltinTypes.date)
        ],
        comment="Experimental data records"
    )
Source code in src/deriva_ml/core/ermrest.py
UploadState
Bases: Enum
File upload operation states.
Represents the various states a file upload operation can be in, from initiation to completion.
Attributes:
Name | Type | Description |
---|---|---|
success | int | Upload completed successfully. |
failed | int | Upload failed. |
pending | int | Upload is queued. |
running | int | Upload is in progress. |
paused | int | Upload is temporarily paused. |
aborted | int | Upload was aborted. |
cancelled | int | Upload was cancelled. |
timeout | int | Upload timed out. |
Source code in src/deriva_ml/core/enums.py
VocabularyTerm
Bases: BaseModel
Represents a term in a controlled vocabulary.
A vocabulary term is a standardized entry in a controlled vocabulary table. Each term has a primary name, optional synonyms, and identifiers for cross-referencing.
Attributes:
Name | Type | Description |
---|---|---|
name | str | Primary name of the term. |
synonyms | list[str] \| None | Alternative names for the term. |
id | str | CURIE (Compact URI) identifier. |
uri | str | Full URI for the term. |
description | str | Explanation of the term's meaning. |
rid | str | Resource identifier in the catalog. |
Example
    term = VocabularyTerm(
        Name="epithelial",
        Synonyms=["epithelium"],
        ID="tissue:0001",
        URI="http://example.org/tissue/0001",
        Description="Epithelial tissue type",
        RID="1-abc123"
    )
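Because a term carries both a primary name and synonyms, a lookup usually has to check both. The sketch below illustrates that idea with a plain dataclass. `TermSketch` and `find_term` are hypothetical names that are not part of DerivaML, and the case-insensitive matching is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TermSketch:
    """Hypothetical stand-in for VocabularyTerm (not the Pydantic model)."""

    name: str
    synonyms: List[str] = field(default_factory=list)


def find_term(terms: List[TermSketch], label: str) -> Optional[TermSketch]:
    """Return the first term whose name or synonym matches label, ignoring case."""
    wanted = label.casefold()
    for term in terms:
        labels = [term.name, *term.synonyms]
        if any(wanted == candidate.casefold() for candidate in labels):
            return term
    return None


terms = [
    TermSketch("epithelial", ["epithelium"]),
    TermSketch("connective"),
]
by_synonym = find_term(terms, "Epithelium")  # matches via the synonym list
missing = find_term(terms, "muscle")         # no such term, so None
```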
Source code in src/deriva_ml/core/ermrest.py