Helper Classes

Shared definitions that are used in different DerivaML modules. This module re-exports all symbols from the core submodules for backwards compatibility.

BaseStrEnum

Bases: str, Enum

Base class for string-based enumerations.

Extends both str and Enum to create string enums that are both string-like and enumerated. This provides type safety while maintaining string compatibility.

Example

>>> class MyEnum(BaseStrEnum):
...     VALUE = "value"
>>> isinstance(MyEnum.VALUE, str)  # True
>>> isinstance(MyEnum.VALUE, Enum)  # True

Source code in src/deriva_ml/core/enums.py
class BaseStrEnum(str, Enum):
    """Base class for string-based enumerations.

    Extends both str and Enum to create string enums that are both string-like and enumerated.
    This provides type safety while maintaining string compatibility.

    Example:
        >>> class MyEnum(BaseStrEnum):
        ...     VALUE = "value"
        >>> isinstance(MyEnum.VALUE, str)  # True
        >>> isinstance(MyEnum.VALUE, Enum)  # True
    """

    pass
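
The snippet below is a minimal sketch of what the str/Enum mix-in provides in practice, using only the standard library. `BaseStrEnum` is redefined locally so the example runs on its own, and `Color` is a hypothetical subclass introduced purely for illustration.

```python
import json
from enum import Enum


class BaseStrEnum(str, Enum):
    """Local copy of the base class so the example is self-contained."""


class Color(BaseStrEnum):  # hypothetical subclass, not part of DerivaML
    red = "Red"
    green = "Green"


# Members compare equal to their plain-string values.
assert Color.red == "Red"

# Looking up by value returns the singleton member.
assert Color("Green") is Color.green

# Members are real strings, so json.dumps needs no custom encoder.
encoded = json.dumps({"color": Color.red})
print(encoded)  # → {"color": "Red"}
```

The same properties apply to every enum on this page that derives from BaseStrEnum.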

BuiltinTypes

Bases: Enum

ERMrest built-in data types.

Maps ERMrest's built-in data types to their type names. These types are used for defining column types in tables and for type validation.

Attributes:

Name Type Description
text str

Text/string type.

int2 str

16-bit integer.

jsonb str

Binary JSON.

float8 str

64-bit float.

timestamp str

Timestamp without timezone.

int8 str

64-bit integer.

boolean str

Boolean type.

json str

JSON type.

float4 str

32-bit float.

int4 str

32-bit integer.

timestamptz str

Timestamp with timezone.

date str

Date type.

ermrest_rid str

Resource identifier.

ermrest_rcb str

Record created by.

ermrest_rmb str

Record modified by.

ermrest_rct str

Record creation time.

ermrest_rmt str

Record modification time.

markdown str

Markdown text.

longtext str

Long text.

ermrest_curie str

Compact URI.

ermrest_uri str

URI type.

color_rgb_hex str

RGB color in hex.

serial2 str

16-bit auto-incrementing.

serial4 str

32-bit auto-incrementing.

serial8 str

64-bit auto-incrementing.

Source code in src/deriva_ml/core/enums.py
class BuiltinTypes(Enum):
    """ERMrest built-in data types.

    Maps ERMrest's built-in data types to their type names. These types are used for defining
    column types in tables and for type validation.

    Attributes:
        text (str): Text/string type.
        int2 (str): 16-bit integer.
        jsonb (str): Binary JSON.
        float8 (str): 64-bit float.
        timestamp (str): Timestamp without timezone.
        int8 (str): 64-bit integer.
        boolean (str): Boolean type.
        json (str): JSON type.
        float4 (str): 32-bit float.
        int4 (str): 32-bit integer.
        timestamptz (str): Timestamp with timezone.
        date (str): Date type.
        ermrest_rid (str): Resource identifier.
        ermrest_rcb (str): Record created by.
        ermrest_rmb (str): Record modified by.
        ermrest_rct (str): Record creation time.
        ermrest_rmt (str): Record modification time.
        markdown (str): Markdown text.
        longtext (str): Long text.
        ermrest_curie (str): Compact URI.
        ermrest_uri (str): URI type.
        color_rgb_hex (str): RGB color in hex.
        serial2 (str): 16-bit auto-incrementing.
        serial4 (str): 32-bit auto-incrementing.
        serial8 (str): 64-bit auto-incrementing.
    """

    text = builtin_types.text.typename
    int2 = builtin_types.int2.typename
    jsonb = builtin_types.jsonb.typename
    float8 = builtin_types.float8.typename
    timestamp = builtin_types.timestamp.typename
    int8 = builtin_types.int8.typename
    boolean = builtin_types.boolean.typename
    json = builtin_types.json.typename
    float4 = builtin_types.float4.typename
    int4 = builtin_types.int4.typename
    timestamptz = builtin_types.timestamptz.typename
    date = builtin_types.date.typename
    ermrest_rid = builtin_types.ermrest_rid.typename
    ermrest_rcb = builtin_types.ermrest_rcb.typename
    ermrest_rmb = builtin_types.ermrest_rmb.typename
    ermrest_rct = builtin_types.ermrest_rct.typename
    ermrest_rmt = builtin_types.ermrest_rmt.typename
    markdown = builtin_types.markdown.typename
    longtext = builtin_types.longtext.typename
    ermrest_curie = builtin_types.ermrest_curie.typename
    ermrest_uri = builtin_types.ermrest_uri.typename
    color_rgb_hex = builtin_types.color_rgb_hex.typename
    serial2 = builtin_types.serial2.typename
    serial4 = builtin_types.serial4.typename
    serial8 = builtin_types.serial8.typename
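
A small sketch of how an enum of type names behaves under lookup, and why each member needs a distinct value. `TypeNames` and `Aliased` are hypothetical stand-ins that hard-code their strings instead of importing deriva's `builtin_types`; the sketch assumes each type name equals its member name.

```python
from enum import Enum


class TypeNames(Enum):  # hypothetical stand-in for BuiltinTypes
    text = "text"
    json = "json"
    jsonb = "jsonb"
    int4 = "int4"


# Value lookup turns a type name read from a schema document into a member.
assert TypeNames("int4") is TypeNames.int4

# Name lookup works when the identifier itself is known.
assert TypeNames["jsonb"].value == "jsonb"


class Aliased(Enum):  # hypothetical: two names deliberately share one value
    jsonb = "json"
    json = "json"


# The second name becomes an alias of the first, not a separate member.
assert Aliased.json is Aliased.jsonb
member_names = [m.name for m in Aliased]
print(member_names)  # → ['jsonb']
```

Iterating an Enum skips aliases, so two names that share a value show up as a single member.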

ColumnDefinition

Bases: BaseModel

Defines a column in an ERMrest table.

Provides a Pydantic model for defining columns with their types, constraints, and metadata. Maps to deriva_py's Column.define functionality.

Attributes:

Name Type Description
name str

Name of the column.

type BuiltinTypes

ERMrest data type for the column.

nullok bool

Whether NULL values are allowed. Defaults to True.

default Any

Default value for the column.

comment str | None

Description of the column's purpose.

acls dict

Access control lists.

acl_bindings dict

Dynamic access control bindings.

annotations dict

Additional metadata annotations.

Example

>>> col = ColumnDefinition(
...     name="score",
...     type=BuiltinTypes.float4,
...     nullok=False,
...     comment="Confidence score between 0 and 1"
... )

Source code in src/deriva_ml/core/ermrest.py
class ColumnDefinition(BaseModel):
    """Defines a column in an ERMrest table.

    Provides a Pydantic model for defining columns with their types, constraints, and metadata.
    Maps to deriva_py's Column.define functionality.

    Attributes:
        name (str): Name of the column.
        type (BuiltinTypes): ERMrest data type for the column.
        nullok (bool): Whether NULL values are allowed. Defaults to True.
        default (Any): Default value for the column.
        comment (str | None): Description of the column's purpose.
        acls (dict): Access control lists.
        acl_bindings (dict): Dynamic access control bindings.
        annotations (dict): Additional metadata annotations.

    Example:
        >>> col = ColumnDefinition(
        ...     name="score",
        ...     type=BuiltinTypes.float4,
        ...     nullok=False,
        ...     comment="Confidence score between 0 and 1"
        ... )
    """
    name: str
    type: BuiltinTypes
    nullok: bool = True
    default: Any = None
    comment: str | None = None
    acls: dict = Field(default_factory=dict)
    acl_bindings: dict = Field(default_factory=dict)
    annotations: dict = Field(default_factory=dict)

    @field_validator("type", mode="before")
    @classmethod
    def extract_type_name(cls, value: Any) -> Any:
        if isinstance(value, dict):
            return BuiltinTypes(value["typename"])
        else:
            return value

    @model_serializer()
    def serialize_column_definition(self):
        return em.Column.define(
            self.name,
            builtin_types[self.type.value],
            nullok=self.nullok,
            default=self.default,
            comment=self.comment,
            acls=self.acls,
            acl_bindings=self.acl_bindings,
            annotations=self.annotations,
        )
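
The "before" validator on `type` accepts either an enum member or a dict carrying a `typename` key. The sketch below mirrors that logic as a plain function so it runs without Pydantic or deriva. `TypeNames` and `coerce_type` are hypothetical names used only for this illustration.

```python
from enum import Enum
from typing import Any


class TypeNames(Enum):  # hypothetical stand-in for BuiltinTypes
    text = "text"
    float4 = "float4"


def coerce_type(value: Any) -> Any:
    """Unwrap {"typename": ...} dicts into enum members; pass other values through."""
    if isinstance(value, dict):
        return TypeNames(value["typename"])
    return value


from_dict = coerce_type({"typename": "float4"})
from_member = coerce_type(TypeNames.text)

assert from_dict is TypeNames.float4
assert from_member is TypeNames.text
```

An unknown type name fails at the enum lookup with a ValueError, so the error surfaces during validation rather than when the column is serialized.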

ExecAssetType

Bases: BaseStrEnum

Execution asset type identifiers.

Defines the types of assets that can be associated with an execution.

Attributes:

Name Type Description
input_file str

Input file used by the execution.

output_file str

Output file produced by the execution.

notebook_output str

Jupyter notebook output from the execution.

model_file str

Model file associated with the execution.

Source code in src/deriva_ml/core/enums.py
class ExecAssetType(BaseStrEnum):
    """Execution asset type identifiers.

    Defines the types of assets that can be associated with an execution.

    Attributes:
        input_file (str): Input file used by the execution.
        output_file (str): Output file produced by the execution.
        notebook_output (str): Jupyter notebook output from the execution.
        model_file (str): Model file associated with the execution.
    """

    input_file = "Input_File"
    output_file = "Output_File"
    notebook_output = "Notebook_Output"
    model_file = "Model_File"

ExecMetadataType

Bases: BaseStrEnum

Execution metadata type identifiers.

Defines the types of metadata that can be associated with an execution.

Attributes:

Name Type Description
execution_config str

Execution configuration data.

runtime_env str

Runtime environment information.

Source code in src/deriva_ml/core/enums.py
class ExecMetadataType(BaseStrEnum):
    """Execution metadata type identifiers.

    Defines the types of metadata that can be associated with an execution.

    Attributes:
        execution_config (str): Execution configuration data.
        runtime_env (str): Runtime environment information.
    """

    execution_config = "Execution_Config"
    runtime_env = "Runtime_Env"

FileSpec

Bases: BaseModel

An entry into the File table

Attributes:

Name Type Description
url str

URL of the file. A local path or file:// URL is converted to a tag URL during validation.

description str | None

The description of the file.

md5 str

The MD5 hash of the file.

length int

The length of the file in bytes.

file_types conlist(str) | None

A list of file types. Each file type should be a term defined in the MLVocab.file_type vocabulary.

Source code in src/deriva_ml/core/filespec.py
class FileSpec(BaseModel):
    """An entry into the File table

    Attributes:
        url: URL of the file. A local path or file:// URL is converted to a tag URL during validation.
        description: The description of the file.
        md5: The MD5 hash of the file.
        length: The length of the file in bytes.
        file_types: A list of file types. Each file type should be a term defined in the MLVocab.file_type vocabulary.
    """

    url: str = Field(alias="URL", validation_alias="url")
    md5: str = Field(alias="MD5", validation_alias="md5")
    length: int = Field(alias="Length", validation_alias="length")
    description: str | None = Field(default="", alias="Description", validation_alias="description")
    file_types: conlist(str) | None = []

    @field_validator("url")
    @classmethod
    def validate_file_url(cls, url: str) -> str:
        """Examine the provided URL. If it's a local path, convert it into a tag URL.

        Args:
            url: The URL to validate and potentially convert

        Returns:
            The validated/converted URL

        Raises:
            ValueError: If the URL is neither a tag URL nor a local file path. During model
                validation, Pydantic reports this as a ValidationError.
        """
        url_parts = urlparse(url)
        if url_parts.scheme == "tag":
            # Already a tag URL, so just return it.
            return url
        elif (not url_parts.scheme) or url_parts.scheme == "file":
            # There is no scheme part of the URL, or it is a file URL, so it is a local file path.
            # Convert to a tag URL.
            return f"tag://{gethostname()},{date.today()}:file://{url_parts.path}"
        else:
            raise ValueError("url is not a file URL")

    @classmethod
    def create_filespecs(
        cls, path: Path | str, description: str, file_types: list[str] | Callable[[Path], list[str]] | None = None
    ) -> Generator[FileSpec, None, None]:
        """Given a file or directory, generate the sequence of corresponding FileSpecs suitable to create a File table.

        Args:
            path: Path to the file or directory.
            description: The description of the file(s)
            file_types: A list of file types or a function that takes a file path and returns a list of file types.

        Returns:
            An iterable of FileSpecs for each file in the directory.
        """

        path = Path(path)
        file_types = file_types or []
        file_types_fn = file_types if callable(file_types) else lambda _x: file_types

        def create_spec(file_path: Path) -> FileSpec:
            hashes = hash_utils.compute_file_hashes(file_path, hashes=frozenset(["md5", "sha256"]))
            md5 = hashes["md5"][0]
            type_list = file_types_fn(file_path)
            return FileSpec(
                # Use the size of this file, not of the top-level path (which may be a directory).
                length=file_path.stat().st_size,
                md5=md5,
                description=description,
                url=file_path.as_posix(),
                file_types=type_list if "File" in type_list else ["File"] + type_list,
            )

        files = [path] if path.is_file() else [f for f in Path(path).rglob("*") if f.is_file()]
        return (create_spec(file) for file in files)

    @staticmethod
    def read_filespec(path: Path | str) -> Generator[FileSpec, None, None]:
        """Get FileSpecs from a JSON lines file.

        Args:
            path: Path to the .jsonl file (string or Path).

        Yields:
            One FileSpec per non-blank line.
        """
        path = Path(path)
        with path.open("r", encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue
                yield FileSpec(**json.loads(line))

create_filespecs classmethod

create_filespecs(
    path: Path | str,
    description: str,
    file_types: list[str]
    | Callable[[Path], list[str]]
    | None = None,
) -> Generator[FileSpec, None, None]

Given a file or directory, generate the sequence of corresponding FileSpecs suitable to create a File table.

Parameters:

Name Type Description Default
path Path | str

Path to the file or directory.

required
description str

The description of the file(s)

required
file_types list[str] | Callable[[Path], list[str]] | None

A list of file types or a function that takes a file path and returns a list of file types.

None

Returns:

Type Description
Generator[FileSpec, None, None]

An iterable of FileSpecs for each file in the directory.

Source code in src/deriva_ml/core/filespec.py
@classmethod
def create_filespecs(
    cls, path: Path | str, description: str, file_types: list[str] | Callable[[Path], list[str]] | None = None
) -> Generator[FileSpec, None, None]:
    """Given a file or directory, generate the sequence of corresponding FileSpecs suitable to create a File table.

    Args:
        path: Path to the file or directory.
        description: The description of the file(s)
        file_types: A list of file types or a function that takes a file path and returns a list of file types.

    Returns:
        An iterable of FileSpecs for each file in the directory.
    """

    path = Path(path)
    file_types = file_types or []
    file_types_fn = file_types if callable(file_types) else lambda _x: file_types

    def create_spec(file_path: Path) -> FileSpec:
        hashes = hash_utils.compute_file_hashes(file_path, hashes=frozenset(["md5", "sha256"]))
        md5 = hashes["md5"][0]
        type_list = file_types_fn(file_path)
        return FileSpec(
            # Use the size of this file, not of the top-level path (which may be a directory).
            length=file_path.stat().st_size,
            md5=md5,
            description=description,
            url=file_path.as_posix(),
            file_types=type_list if "File" in type_list else ["File"] + type_list,
        )

    files = [path] if path.is_file() else [f for f in Path(path).rglob("*") if f.is_file()]
    return (create_spec(file) for file in files)
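
The following sketch walks through the same steps with the standard library only: enumerate the files, read each file's size, hash it, and normalize the type list so that "File" is always present. It is an illustration under stated assumptions, not the library's implementation: `hashlib` stands in for `hash_utils.compute_file_hashes`, plain dicts stand in for FileSpec, and `types_for` and `describe` are hypothetical names.

```python
import hashlib
import tempfile
from pathlib import Path


def types_for(path: Path) -> list[str]:
    """Hypothetical classifier: derive file types from the suffix."""
    return ["CSV"] if path.suffix == ".csv" else []


def describe(root: Path) -> list[dict]:
    """Collect size, MD5, and normalized type list for each file under root."""
    files = [root] if root.is_file() else sorted(f for f in root.rglob("*") if f.is_file())
    records = []
    for file_path in files:
        type_list = types_for(file_path)
        records.append(
            {
                "url": file_path.as_posix(),
                "length": file_path.stat().st_size,
                "md5": hashlib.md5(file_path.read_bytes()).hexdigest(),
                # "File" is always present, mirroring create_filespecs.
                "file_types": type_list if "File" in type_list else ["File"] + type_list,
            }
        )
    return records


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.csv").write_bytes(b"x,y\n1,2\n")
    (root / "notes.txt").write_bytes(b"hello")
    records = describe(root)

for record in records:
    print(record["file_types"], record["length"])
```

Passing a callable for `file_types` lets each file receive its own type list, while a plain list applies the same types to every file.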

read_filespec staticmethod

read_filespec(
    path: Path | str,
) -> Generator[FileSpec, None, None]

Get FileSpecs from a JSON lines file.

Parameters:

Name Type Description Default
path Path | str

Path to the .jsonl file (string or Path).

required

Yields:

Type Description
FileSpec

A FileSpec object.

Source code in src/deriva_ml/core/filespec.py
@staticmethod
def read_filespec(path: Path | str) -> Generator[FileSpec, None, None]:
    """Get FileSpecs from a JSON lines file.

    Args:
        path: Path to the .jsonl file (string or Path).

    Yields:
        One FileSpec per non-blank line.
    """
    path = Path(path)
    with path.open("r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            yield FileSpec(**json.loads(line))
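
A minimal sketch of the JSON Lines format this reader expects: one JSON object per line, with blank lines skipped. The keys `url`, `md5`, and `length` follow the validation aliases in the class source. The MD5 values are placeholders, and `read_records` is a hypothetical stand-in that yields plain dicts so the example runs without Pydantic.

```python
import json
import tempfile
from pathlib import Path
from typing import Iterator


def read_records(path: Path) -> Iterator[dict]:
    """Yield one dict per non-blank line of a JSON Lines file."""
    with path.open("r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            yield json.loads(line)


rows = [
    {"url": "tag://host,2024-01-01:file:///data/a.csv", "md5": "0" * 32, "length": 8},
    {"url": "tag://host,2024-01-01:file:///data/b.csv", "md5": "1" * 32, "length": 5},
]

with tempfile.TemporaryDirectory() as tmp:
    jsonl = Path(tmp) / "files.jsonl"
    # A blank line between records is tolerated by the reader.
    jsonl.write_text("\n\n".join(json.dumps(r) for r in rows) + "\n", encoding="utf-8")
    # The generator is lazy, so consume it while the file still exists.
    loaded = list(read_records(jsonl))
```

Because the reader is a generator, records are parsed only as they are consumed, which keeps memory use flat for large manifests.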

validate_file_url classmethod

validate_file_url(url: str) -> str

Examine the provided URL. If it's a local path, convert it into a tag URL.

Parameters:

Name Type Description Default
url str

The URL to validate and potentially convert

required

Returns:

Type Description
str

The validated/converted URL

Raises:

Type Description
ValueError

If the URL is neither a tag URL nor a local file path. During model validation, Pydantic reports this as a ValidationError.

Source code in src/deriva_ml/core/filespec.py
@field_validator("url")
@classmethod
def validate_file_url(cls, url: str) -> str:
    """Examine the provided URL. If it's a local path, convert it into a tag URL.

    Args:
        url: The URL to validate and potentially convert

    Returns:
        The validated/converted URL

    Raises:
        ValueError: If the URL is neither a tag URL nor a local file path. During model
            validation, Pydantic reports this as a ValidationError.
    """
    url_parts = urlparse(url)
    if url_parts.scheme == "tag":
        # Already a tag URL, so just return it.
        return url
    elif (not url_parts.scheme) or url_parts.scheme == "file":
        # There is no scheme part of the URL, or it is a file URL, so it is a local file path.
        # Convert to a tag URL.
        return f"tag://{gethostname()},{date.today()}:file://{url_parts.path}"
    else:
        raise ValueError("url is not a file URL")
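
The conversion rule can be tried in isolation. The sketch below reproduces the validator body as a free function, `to_tag_url` (a hypothetical name), and assumes `gethostname` comes from `socket` and `date` from `datetime`, which the excerpt does not show.

```python
from datetime import date
from socket import gethostname
from urllib.parse import urlparse


def to_tag_url(url: str) -> str:
    """Return tag URLs unchanged, convert local paths and file:// URLs, reject other schemes."""
    parts = urlparse(url)
    if parts.scheme == "tag":
        return url
    if not parts.scheme or parts.scheme == "file":
        # Host name and today's date make the tag specific to this machine and day.
        return f"tag://{gethostname()},{date.today()}:file://{parts.path}"
    raise ValueError("url is not a file URL")


local = to_tag_url("/data/images/a.png")
from_file_url = to_tag_url("file:///data/images/a.png")
unchanged = to_tag_url("tag://example.org,2024-01-01:file:///data/a.png")

# A bare path and the equivalent file:// URL produce the same tag URL.
assert local == from_file_url
print(local)
```

Any other scheme, such as https, raises ValueError.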

FileUploadState

Bases: BaseModel

Tracks the state and result of a file upload operation.

Attributes:

Name Type Description
state UploadState

Current state of the upload (success, failed, etc.).

status str

Detailed status message.

result Any

Upload result data, if any.

rid RID | None

Resource identifier of the uploaded file, if successful.

Source code in src/deriva_ml/core/ermrest.py
class FileUploadState(BaseModel):
    """Tracks the state and result of a file upload operation.

    Attributes:
        state (UploadState): Current state of the upload (success, failed, etc.).
        status (str): Detailed status message.
        result (Any): Upload result data, if any.
        rid (RID | None): Resource identifier of the uploaded file, if successful.
    """
    state: UploadState
    status: str
    result: Any

    @computed_field
    @property
    def rid(self) -> RID | None:
        # Return None (not the falsy result itself) when there is no upload result.
        return self.result["RID"] if self.result else None
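
How `rid` follows from `result` can be shown with a small stand-in. `UploadOutcome` is a hypothetical dataclass used here in place of the Pydantic model, and the status strings are made up for the example.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class UploadOutcome:  # hypothetical stand-in for FileUploadState
    status: str
    result: Any

    @property
    def rid(self) -> Optional[str]:
        # Only a non-empty result mapping can carry a RID.
        return self.result["RID"] if self.result else None


ok = UploadOutcome(status="Complete", result={"RID": "1-abc123"})
failed = UploadOutcome(status="Connection reset", result=None)

print(ok.rid)      # → 1-abc123
print(failed.rid)  # → None
```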

ForeignKeyDefinition

Bases: BaseModel

Defines a foreign key relationship between tables.

Provides a Pydantic model for defining foreign key constraints with referential actions and metadata. Maps to deriva_py's ForeignKey.define functionality.

Attributes:

Name Type Description
colnames Iterable[str]

Names of columns in the referencing table.

pk_sname str

Schema name of the referenced table.

pk_tname str

Name of the referenced table.

pk_colnames Iterable[str]

Names of columns in the referenced table.

constraint_names Iterable[str]

Names for the foreign key constraints.

on_update str

Action on update of referenced row. Defaults to "NO ACTION".

on_delete str

Action on delete of referenced row. Defaults to "NO ACTION".

comment str | None

Description of the relationship.

acls dict

Access control lists.

acl_bindings dict

Dynamic access control bindings.

annotations dict

Additional metadata annotations.

Example

>>> fk = ForeignKeyDefinition(
...     colnames=["dataset_id"],
...     pk_sname="core",
...     pk_tname="dataset",
...     pk_colnames=["id"],
...     on_delete="CASCADE"
... )

Source code in src/deriva_ml/core/ermrest.py
class ForeignKeyDefinition(BaseModel):
    """Defines a foreign key relationship between tables.

    Provides a Pydantic model for defining foreign key constraints with referential actions
    and metadata. Maps to deriva_py's ForeignKey.define functionality.

    Attributes:
        colnames (Iterable[str]): Names of columns in the referencing table.
        pk_sname (str): Schema name of the referenced table.
        pk_tname (str): Name of the referenced table.
        pk_colnames (Iterable[str]): Names of columns in the referenced table.
        constraint_names (Iterable[str]): Names for the foreign key constraints.
        on_update (str): Action on update of referenced row. Defaults to "NO ACTION".
        on_delete (str): Action on delete of referenced row. Defaults to "NO ACTION".
        comment (str | None): Description of the relationship.
        acls (dict): Access control lists.
        acl_bindings (dict): Dynamic access control bindings.
        annotations (dict): Additional metadata annotations.

    Example:
        >>> fk = ForeignKeyDefinition(
        ...     colnames=["dataset_id"],
        ...     pk_sname="core",
        ...     pk_tname="dataset",
        ...     pk_colnames=["id"],
        ...     on_delete="CASCADE"
        ... )
    """
    colnames: Iterable[str]
    pk_sname: str
    pk_tname: str
    pk_colnames: Iterable[str]
    constraint_names: Iterable[str] = Field(default_factory=list)
    on_update: str = "NO ACTION"
    on_delete: str = "NO ACTION"
    comment: str | None = None
    acls: dict[str, Any] = Field(default_factory=dict)
    acl_bindings: dict[str, Any] = Field(default_factory=dict)
    annotations: dict[str, Any] = Field(default_factory=dict)

    @model_serializer()
    def serialize_fk_definition(self):
        return em.ForeignKey.define(
            fk_colnames=self.colnames,
            pk_sname=self.pk_sname,
            pk_tname=self.pk_tname,
            pk_colnames=self.pk_colnames,
            on_update=self.on_update,
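            on_delete=self.on_delete,
            # Forward the declared constraint names instead of silently dropping them.
            constraint_names=self.constraint_names,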
            comment=self.comment,
            acls=self.acls,
            acl_bindings=self.acl_bindings,
            annotations=self.annotations,
        )

KeyDefinition

Bases: BaseModel

Defines a key constraint in an ERMrest table.

Provides a Pydantic model for defining primary keys and unique constraints. Maps to deriva_py's Key.define functionality.

Attributes:

Name Type Description
colnames Iterable[str]

Names of columns that form the key.

constraint_names Iterable[str]

Names for the key constraints.

comment str | None

Description of the key's purpose.

annotations dict

Additional metadata annotations.

Example

>>> key = KeyDefinition(
...     colnames=["id", "version"],
...     constraint_names=["unique_id_version"],
...     comment="Unique identifier with version"
... )

Source code in src/deriva_ml/core/ermrest.py
class KeyDefinition(BaseModel):
    """Defines a key constraint in an ERMrest table.

    Provides a Pydantic model for defining primary keys and unique constraints.
    Maps to deriva_py's Key.define functionality.

    Attributes:
        colnames (Iterable[str]): Names of columns that form the key.
        constraint_names (Iterable[str]): Names for the key constraints.
        comment (str | None): Description of the key's purpose.
        annotations (dict): Additional metadata annotations.

    Example:
        >>> key = KeyDefinition(
        ...     colnames=["id", "version"],
        ...     constraint_names=["unique_id_version"],
        ...     comment="Unique identifier with version"
        ... )
    """
    colnames: Iterable[str]
    constraint_names: Iterable[str]
    comment: str | None = None
    annotations: dict = Field(default_factory=dict)

    @model_serializer()
    def serialize_key_definition(self):
        return em.Key.define(
            colnames=self.colnames,
            constraint_names=self.constraint_names,
            comment=self.comment,
            annotations=self.annotations,
        )

MLAsset

Bases: BaseStrEnum

Asset type identifiers.

Defines the types of assets that can be associated with executions.

Attributes:

Name Type Description
execution_metadata str

Metadata about an execution.

execution_asset str

Asset produced by an execution.

Source code in src/deriva_ml/core/enums.py
class MLAsset(BaseStrEnum):
    """Asset type identifiers.

    Defines the types of assets that can be associated with executions.

    Attributes:
        execution_metadata (str): Metadata about an execution.
        execution_asset (str): Asset produced by an execution.
    """

    execution_metadata = "Execution_Metadata"
    execution_asset = "Execution_Asset"

MLVocab

Bases: BaseStrEnum

Controlled vocabulary type identifiers.

Defines the names of controlled vocabulary tables used in DerivaML for various types of entities and attributes.

Attributes:

Name Type Description
dataset_type str

Dataset classification vocabulary.

workflow_type str

Workflow classification vocabulary.

asset_type str

Asset classification vocabulary.

asset_role str

Asset role classification vocabulary.

feature_name str

Feature name vocabulary.

Source code in src/deriva_ml/core/enums.py
class MLVocab(BaseStrEnum):
    """Controlled vocabulary type identifiers.

    Defines the names of controlled vocabulary tables used in DerivaML for various types
    of entities and attributes.

    Attributes:
        dataset_type (str): Dataset classification vocabulary.
        workflow_type (str): Workflow classification vocabulary.
        asset_type (str): Asset classification vocabulary.
        asset_role (str): Asset role classification vocabulary.
        feature_name (str): Feature name vocabulary.
    """

    dataset_type = "Dataset_Type"
    workflow_type = "Workflow_Type"
    asset_type = "Asset_Type"
    asset_role = "Asset_Role"
    feature_name = "Feature_Name"

Status

Bases: BaseStrEnum

Execution status values.

Represents the various states an execution can be in throughout its lifecycle.

Attributes:

Name Type Description
initializing str

Initial setup is in progress.

created str

Execution record has been created.

pending str

Execution is queued.

running str

Execution is in progress.

aborted str

Execution was manually stopped.

completed str

Execution finished successfully.

failed str

Execution encountered an error.

Source code in src/deriva_ml/core/enums.py
class Status(BaseStrEnum):
    """Execution status values.

    Represents the various states an execution can be in throughout its lifecycle.

    Attributes:
        initializing (str): Initial setup is in progress.
        created (str): Execution record has been created.
        pending (str): Execution is queued.
        running (str): Execution is in progress.
        aborted (str): Execution was manually stopped.
        completed (str): Execution finished successfully.
        failed (str): Execution encountered an error.
    """

    initializing = "Initializing"
    created = "Created"
    pending = "Pending"
    running = "Running"
    aborted = "Aborted"
    completed = "Completed"
    failed = "Failed"
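
One way to use these values is to parse a status string and test whether the execution has finished. The sketch below uses a trimmed local copy of the enum. The set of final states and the helper `is_finished` are assumptions made for illustration and are not defined by the source.

```python
from enum import Enum


class Status(str, Enum):  # trimmed local copy of the enum above
    running = "Running"
    aborted = "Aborted"
    completed = "Completed"
    failed = "Failed"


# Assumption: these three states are final; the source does not define this set.
TERMINAL = frozenset({Status.aborted, Status.completed, Status.failed})


def is_finished(raw_status: str) -> bool:
    """Parse a status string and test whether it names a final state."""
    return Status(raw_status) in TERMINAL


print(is_finished("Running"))    # → False
print(is_finished("Completed"))  # → True
```

Value lookup is case-sensitive, so `Status("running")` raises ValueError rather than matching `Status.running`.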

TableDefinition

Bases: BaseModel

Defines a complete table structure in ERMrest.

Provides a Pydantic model for defining tables with their columns, keys, and relationships. Maps to deriva_py's Table.define functionality.

Attributes:

Name Type Description
name str

Name of the table.

column_defs Iterable[ColumnDefinition]

Column definitions.

key_defs Iterable[KeyDefinition]

Key constraint definitions.

fkey_defs Iterable[ForeignKeyDefinition]

Foreign key relationship definitions.

comment str | None

Description of the table's purpose.

acls dict

Access control lists.

acl_bindings dict

Dynamic access control bindings.

annotations dict

Additional metadata annotations.

Example

>>> table = TableDefinition(
...     name="experiment",
...     column_defs=[
...         ColumnDefinition(name="id", type=BuiltinTypes.text),
...         ColumnDefinition(name="date", type=BuiltinTypes.date)
...     ],
...     comment="Experimental data records"
... )

Source code in src/deriva_ml/core/ermrest.py
class TableDefinition(BaseModel):
    """Defines a complete table structure in ERMrest.

    Provides a Pydantic model for defining tables with their columns, keys, and relationships.
    Maps to deriva_py's Table.define functionality.

    Attributes:
        name (str): Name of the table.
        column_defs (Iterable[ColumnDefinition]): Column definitions.
        key_defs (Iterable[KeyDefinition]): Key constraint definitions.
        fkey_defs (Iterable[ForeignKeyDefinition]): Foreign key relationship definitions.
        comment (str | None): Description of the table's purpose.
        acls (dict): Access control lists.
        acl_bindings (dict): Dynamic access control bindings.
        annotations (dict): Additional metadata annotations.

    Example:
        >>> table = TableDefinition(
        ...     name="experiment",
        ...     column_defs=[
        ...         ColumnDefinition(name="id", type=BuiltinTypes.text),
        ...         ColumnDefinition(name="date", type=BuiltinTypes.date)
        ...     ],
        ...     comment="Experimental data records"
        ... )
    """
    name: str
    column_defs: Iterable[ColumnDefinition]
    key_defs: Iterable[KeyDefinition] = Field(default_factory=list)
    fkey_defs: Iterable[ForeignKeyDefinition] = Field(default_factory=list)
    comment: str | None = None
    acls: dict = Field(default_factory=dict)
    acl_bindings: dict = Field(default_factory=dict)
    annotations: dict = Field(default_factory=dict)

    @model_serializer()
    def serialize_table_definition(self):
        return em.Table.define(
            tname=self.name,
            column_defs=[c.model_dump() for c in self.column_defs],
            key_defs=[k.model_dump() for k in self.key_defs],
            fkey_defs=[fk.model_dump() for fk in self.fkey_defs],
            comment=self.comment,
            acls=self.acls,
            acl_bindings=self.acl_bindings,
            annotations=self.annotations,
        )

UploadState

Bases: Enum

File upload operation states.

Represents the various states a file upload operation can be in, from initiation to completion.

Attributes:

Name Type Description
success int

Upload completed successfully.

failed int

Upload failed.

pending int

Upload is queued.

running int

Upload is in progress.

paused int

Upload is temporarily paused.

aborted int

Upload was aborted.

cancelled int

Upload was cancelled.

timeout int

Upload timed out.

Source code in src/deriva_ml/core/enums.py
class UploadState(Enum):
    """File upload operation states.

    Represents the various states a file upload operation can be in, from initiation to completion.

    Attributes:
        success (int): Upload completed successfully.
        failed (int): Upload failed.
        pending (int): Upload is queued.
        running (int): Upload is in progress.
        paused (int): Upload is temporarily paused.
        aborted (int): Upload was aborted.
        cancelled (int): Upload was cancelled.
        timeout (int): Upload timed out.
    """

    success = 0
    failed = 1
    pending = 2
    running = 3
    paused = 4
    aborted = 5
    cancelled = 6
    timeout = 7
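
Unlike the BaseStrEnum subclasses on this page, UploadState is a plain Enum, so its members do not compare equal to their integer values. The sketch below, with a trimmed local copy of the enum and hypothetical per-file outcomes, shows that difference and a simple way to summarize a batch of uploads.

```python
from collections import Counter
from enum import Enum


class UploadState(Enum):  # trimmed local copy of the enum above
    success = 0
    failed = 1
    timeout = 7


# Hypothetical per-file outcomes, keyed by file name.
outcomes = {
    "a.csv": UploadState.success,
    "b.csv": UploadState.failed,
    "c.csv": UploadState.success,
}

# Plain Enum members do not compare equal to their raw values.
assert UploadState.success != 0
assert UploadState(0) is UploadState.success

counts = Counter(state.name for state in outcomes.values())
retry = sorted(name for name, state in outcomes.items() if state is not UploadState.success)
print(counts["success"], retry)  # → 2 ['b.csv']
```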

VocabularyTerm

Bases: BaseModel

Represents a term in a controlled vocabulary.

A vocabulary term is a standardized entry in a controlled vocabulary table. Each term has a primary name, optional synonyms, and identifiers for cross-referencing.

Attributes:

Name Type Description
name str

Primary name of the term.

synonyms list[str] | None

Alternative names for the term.

id str

CURIE (Compact URI) identifier.

uri str

Full URI for the term.

description str

Explanation of the term's meaning.

rid str

Resource identifier in the catalog.

Example

>>> term = VocabularyTerm(
...     Name="epithelial",
...     Synonyms=["epithelium"],
...     ID="tissue:0001",
...     URI="http://example.org/tissue/0001",
...     Description="Epithelial tissue type",
...     RID="1-abc123"
... )

Source code in src/deriva_ml/core/ermrest.py
class VocabularyTerm(BaseModel):
    """Represents a term in a controlled vocabulary.

    A vocabulary term is a standardized entry in a controlled vocabulary table. Each term has
    a primary name, optional synonyms, and identifiers for cross-referencing.

    Attributes:
        name (str): Primary name of the term.
        synonyms (list[str] | None): Alternative names for the term.
        id (str): CURIE (Compact URI) identifier.
        uri (str): Full URI for the term.
        description (str): Explanation of the term's meaning.
        rid (str): Resource identifier in the catalog.

    Example:
        >>> term = VocabularyTerm(
        ...     Name="epithelial",
        ...     Synonyms=["epithelium"],
        ...     ID="tissue:0001",
        ...     URI="http://example.org/tissue/0001",
        ...     Description="Epithelial tissue type",
        ...     RID="1-abc123"
        ... )
    """
    name: str = Field(alias="Name")
    synonyms: list[str] | None = Field(alias="Synonyms")
    id: str = Field(alias="ID")
    uri: str = Field(alias="URI")
    description: str = Field(alias="Description")
    rid: str = Field(alias="RID")

    # Pydantic v2 style configuration; requires ConfigDict to be imported from pydantic.
    model_config = ConfigDict(extra="ignore")
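
The aliases mean a catalog row with capitalized column names maps directly onto the lower-case attributes, and unknown columns are dropped. The sketch below imitates that mapping with a dataclass so it runs without Pydantic. `Term`, `ALIASES`, and `term_from_row` are hypothetical names, and the extra `RCT` column in the row is only an example of a key that gets ignored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Term:  # hypothetical stand-in for VocabularyTerm
    name: str
    synonyms: Optional[list]
    id: str
    uri: str
    description: str
    rid: str


# Catalog column name for each attribute, matching the Field aliases above.
ALIASES = {
    "name": "Name",
    "synonyms": "Synonyms",
    "id": "ID",
    "uri": "URI",
    "description": "Description",
    "rid": "RID",
}


def term_from_row(row: dict) -> Term:
    """Pick the aliased columns from a catalog row and drop everything else."""
    return Term(**{attr: row[column] for attr, column in ALIASES.items()})


row = {
    "Name": "epithelial",
    "Synonyms": ["epithelium"],
    "ID": "tissue:0001",
    "URI": "http://example.org/tissue/0001",
    "Description": "Epithelial tissue type",
    "RID": "1-abc123",
    "RCT": "2024-01-01T00:00:00",  # extra column, ignored by the mapping
}
term = term_from_row(row)
print(term.name, term.rid)  # → epithelial 1-abc123
```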