Definitions & Types

Core type definitions, enums, and constants used throughout DerivaML. This includes vocabulary types, status enums, column definitions, and other foundational types.

Shared definitions for DerivaML modules.

This module serves as the central location for type definitions, constants, enums, and data models used throughout DerivaML. It re-exports symbols from specialized submodules for convenience and backwards compatibility.

The module consolidates:
  • Constants: Schema names, RID patterns, column definitions
  • Enums: Status codes, upload states, built-in types, vocabulary identifiers
  • Models: Dataclass-based models for ERMrest structures (tables, columns, keys)
  • Utilities: FileSpec for file metadata handling

Core definition classes (ColumnDef, KeyDef, ForeignKeyDef, TableDef) are provided by deriva.core.typed and re-exported here. Legacy aliases (ColumnDefinition, etc.) are maintained for backwards compatibility.

For more specialized imports, you can import directly from submodules:

>>> from deriva_ml.core.constants import ML_SCHEMA
>>> from deriva_ml.core.enums import Status
>>> from deriva.core.typed import ColumnDef

BuiltinTypes module-attribute

BuiltinTypes = BuiltinType

Alias for BuiltinType from deriva.core.typed.

This maintains backwards compatibility with existing DerivaML code that uses the plural form 'BuiltinTypes'. New code should use BuiltinType directly.

ColumnDefinition module-attribute

ColumnDefinition = ColumnDef

Alias for ColumnDef from deriva.core.typed.

This maintains backwards compatibility with existing DerivaML code. New code should use ColumnDef directly.

ForeignKeyDefinition module-attribute

ForeignKeyDefinition = ForeignKeyDef

Alias for ForeignKeyDef from deriva.core.typed.

This maintains backwards compatibility with existing DerivaML code. New code should use ForeignKeyDef directly.

KeyDefinition module-attribute

KeyDefinition = KeyDef

Alias for KeyDef from deriva.core.typed.

This maintains backwards compatibility with existing DerivaML code. New code should use KeyDef directly.

TableDefinition module-attribute

TableDefinition = TableDef

Alias for TableDef from deriva.core.typed.

This maintains backwards compatibility with existing DerivaML code. New code should use TableDef directly.

BaseStrEnum

Bases: str, Enum

Base class for string-based enumerations.

Extends both str and Enum to create string enums that are both string-like and enumerated. This provides type safety while maintaining string compatibility.

Example

>>> class MyEnum(BaseStrEnum):
...     VALUE = "value"
>>> isinstance(MyEnum.VALUE, str)   # True
>>> isinstance(MyEnum.VALUE, Enum)  # True

Source code in src/deriva_ml/core/enums.py
class BaseStrEnum(str, Enum):
    """Base class for string-based enumerations.

    Extends both str and Enum to create string enums that are both string-like and enumerated.
    This provides type safety while maintaining string compatibility.

    Example:
        >>> class MyEnum(BaseStrEnum):
        ...     VALUE = "value"
        >>> isinstance(MyEnum.VALUE, str)  # True
        >>> isinstance(MyEnum.VALUE, Enum)  # True
    """

    pass
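
The str/Enum mixin pattern above can be sketched with the standard library alone. A minimal, self-contained illustration (the real class lives in deriva_ml.core.enums; the Status subclass here is illustrative):

```python
from enum import Enum

class BaseStrEnum(str, Enum):
    """String-valued enum: members are simultaneously str and Enum instances."""
    pass

class Status(BaseStrEnum):
    running = "Running"
    completed = "Completed"

# Members compare equal to their string values and support all str methods,
# so they can be passed anywhere a plain string is expected.
print(Status.running == "Running")   # True
print(Status.running.upper())        # RUNNING
```

Because members are real strings, they serialize cleanly into URLs, JSON payloads, and catalog column values without explicit conversion.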

DerivaMLAuthenticationError

Bases: DerivaMLConfigurationError

Exception raised for authentication failures.

Raised when authentication with the catalog fails or credentials are invalid.

Example

raise DerivaMLAuthenticationError("Failed to authenticate with catalog")

Source code in src/deriva_ml/core/exceptions.py
class DerivaMLAuthenticationError(DerivaMLConfigurationError):
    """Exception raised for authentication failures.

    Raised when authentication with the catalog fails or credentials are invalid.

    Example:
        >>> raise DerivaMLAuthenticationError("Failed to authenticate with catalog")
    """

    pass

DerivaMLConfigurationError

Bases: DerivaMLException

Exception raised for configuration and initialization errors.

Raised when there are issues with DerivaML configuration, catalog initialization, or schema setup.

Example

raise DerivaMLConfigurationError("Invalid catalog configuration")

Source code in src/deriva_ml/core/exceptions.py
class DerivaMLConfigurationError(DerivaMLException):
    """Exception raised for configuration and initialization errors.

    Raised when there are issues with DerivaML configuration, catalog
    initialization, or schema setup.

    Example:
        >>> raise DerivaMLConfigurationError("Invalid catalog configuration")
    """

    pass

DerivaMLCycleError

Bases: DerivaMLDataError

Exception raised when a cycle is detected in relationships.

Raised when creating dataset hierarchies or other relationships that would result in a circular dependency.

Parameters:

  • cycle_nodes (list[str]): List of nodes involved in the cycle. Required.
  • msg (str): Additional context. Defaults to "Cycle detected".
Example

raise DerivaMLCycleError(["Dataset1", "Dataset2", "Dataset1"])

Source code in src/deriva_ml/core/exceptions.py
class DerivaMLCycleError(DerivaMLDataError):
    """Exception raised when a cycle is detected in relationships.

    Raised when creating dataset hierarchies or other relationships that
    would result in a circular dependency.

    Args:
        cycle_nodes: List of nodes involved in the cycle.
        msg: Additional context. Defaults to "Cycle detected".

    Example:
        >>> raise DerivaMLCycleError(["Dataset1", "Dataset2", "Dataset1"])
    """

    def __init__(self, cycle_nodes: list[str], msg: str = "Cycle detected") -> None:
        super().__init__(f"{msg}: {cycle_nodes}")
        self.cycle_nodes = cycle_nodes

DerivaMLDataError

Bases: DerivaMLException

Exception raised for data access and validation issues.

Base class for errors related to data lookup, validation, and integrity.

Example

raise DerivaMLDataError("Invalid data format")

Source code in src/deriva_ml/core/exceptions.py
class DerivaMLDataError(DerivaMLException):
    """Exception raised for data access and validation issues.

    Base class for errors related to data lookup, validation, and integrity.

    Example:
        >>> raise DerivaMLDataError("Invalid data format")
    """

    pass

DerivaMLDatasetNotFound

Bases: DerivaMLNotFoundError

Exception raised when a dataset cannot be found.

Raised when attempting to look up a dataset that doesn't exist in the catalog or downloaded bag.

Parameters:

  • dataset_rid (str): The RID of the dataset that was not found. Required.
  • msg (str): Additional context. Defaults to "Dataset not found".
Example

>>> raise DerivaMLDatasetNotFound("1-ABC")
DerivaMLDatasetNotFound: Dataset not found: 1-ABC

Source code in src/deriva_ml/core/exceptions.py
class DerivaMLDatasetNotFound(DerivaMLNotFoundError):
    """Exception raised when a dataset cannot be found.

    Raised when attempting to look up a dataset that doesn't exist in the
    catalog or downloaded bag.

    Args:
        dataset_rid: The RID of the dataset that was not found.
        msg: Additional context. Defaults to "Dataset not found".

    Example:
        >>> raise DerivaMLDatasetNotFound("1-ABC")
        DerivaMLDatasetNotFound: Dataset not found: 1-ABC
    """

    def __init__(self, dataset_rid: str, msg: str = "Dataset not found") -> None:
        super().__init__(f"{msg}: {dataset_rid}")
        self.dataset_rid = dataset_rid

DerivaMLException

Bases: Exception

Base exception class for all DerivaML errors.

This is the root exception for all DerivaML-specific errors. Catching this exception will catch any error raised by the DerivaML library.

Attributes:

  • _msg: The error message stored for later access.

Parameters:

  • msg (str): Descriptive error message. Defaults to empty string.
Example

>>> raise DerivaMLException("Failed to connect to catalog")
DerivaMLException: Failed to connect to catalog

Source code in src/deriva_ml/core/exceptions.py
class DerivaMLException(Exception):
    """Base exception class for all DerivaML errors.

    This is the root exception for all DerivaML-specific errors. Catching this
    exception will catch any error raised by the DerivaML library.

    Attributes:
        _msg: The error message stored for later access.

    Args:
        msg: Descriptive error message. Defaults to empty string.

    Example:
        >>> raise DerivaMLException("Failed to connect to catalog")
        DerivaMLException: Failed to connect to catalog
    """

    def __init__(self, msg: str = "") -> None:
        super().__init__(msg)
        self._msg = msg
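
Because every library error derives from DerivaMLException, a single except clause at the root catches anything DerivaML raises. A self-contained sketch (class names mirror deriva_ml.core.exceptions; the definitions here are illustrative, not imports):

```python
class DerivaMLException(Exception):
    """Root of the DerivaML exception hierarchy."""
    def __init__(self, msg: str = "") -> None:
        super().__init__(msg)
        self._msg = msg

class DerivaMLDataError(DerivaMLException):
    pass

class DerivaMLNotFoundError(DerivaMLDataError):
    pass

class DerivaMLDatasetNotFound(DerivaMLNotFoundError):
    def __init__(self, dataset_rid: str, msg: str = "Dataset not found") -> None:
        super().__init__(f"{msg}: {dataset_rid}")
        self.dataset_rid = dataset_rid

# Catching the root type handles any error in the hierarchy.
try:
    raise DerivaMLDatasetNotFound("1-ABC")
except DerivaMLException as e:
    print(e)  # Dataset not found: 1-ABC
```

Catch the most specific subclass when you can recover from it, and fall back to DerivaMLException only at application boundaries.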

DerivaMLExecutionError

Bases: DerivaMLException

Exception raised for execution lifecycle issues.

Base class for errors related to workflow execution, asset management, and provenance tracking.

Example

raise DerivaMLExecutionError("Execution failed to initialize")

Source code in src/deriva_ml/core/exceptions.py
class DerivaMLExecutionError(DerivaMLException):
    """Exception raised for execution lifecycle issues.

    Base class for errors related to workflow execution, asset management,
    and provenance tracking.

    Example:
        >>> raise DerivaMLExecutionError("Execution failed to initialize")
    """

    pass

DerivaMLInvalidTerm

Bases: DerivaMLNotFoundError

Exception raised when a vocabulary term is not found or invalid.

Raised when attempting to look up or use a term that doesn't exist in a controlled vocabulary table, or when a term name/synonym cannot be resolved.

Parameters:

  • vocabulary (str): Name of the vocabulary table being searched. Required.
  • term (str): The term name that was not found. Required.
  • msg (str): Additional context about the error. Defaults to "Term doesn't exist".
Example

>>> raise DerivaMLInvalidTerm("Diagnosis", "unknown_condition")
DerivaMLInvalidTerm: Invalid term unknown_condition in vocabulary Diagnosis: Term doesn't exist.

Source code in src/deriva_ml/core/exceptions.py
class DerivaMLInvalidTerm(DerivaMLNotFoundError):
    """Exception raised when a vocabulary term is not found or invalid.

    Raised when attempting to look up or use a term that doesn't exist in
    a controlled vocabulary table, or when a term name/synonym cannot be resolved.

    Args:
        vocabulary: Name of the vocabulary table being searched.
        term: The term name that was not found.
        msg: Additional context about the error. Defaults to "Term doesn't exist".

    Example:
        >>> raise DerivaMLInvalidTerm("Diagnosis", "unknown_condition")
        DerivaMLInvalidTerm: Invalid term unknown_condition in vocabulary Diagnosis: Term doesn't exist.
    """

    def __init__(self, vocabulary: str, term: str, msg: str = "Term doesn't exist") -> None:
        super().__init__(f"Invalid term {term} in vocabulary {vocabulary}: {msg}.")
        self.vocabulary = vocabulary
        self.term = term

DerivaMLNotFoundError

Bases: DerivaMLDataError

Exception raised when an entity cannot be found.

Raised when a lookup operation fails to find the requested entity (dataset, table, term, etc.) in the catalog or bag.

Example

raise DerivaMLNotFoundError("Entity '1-ABC' not found in catalog")

Source code in src/deriva_ml/core/exceptions.py
class DerivaMLNotFoundError(DerivaMLDataError):
    """Exception raised when an entity cannot be found.

    Raised when a lookup operation fails to find the requested entity
    (dataset, table, term, etc.) in the catalog or bag.

    Example:
        >>> raise DerivaMLNotFoundError("Entity '1-ABC' not found in catalog")
    """

    pass

DerivaMLReadOnlyError

Bases: DerivaMLException

Exception raised when attempting write operations on read-only resources.

Raised when attempting to modify data in a downloaded bag or other read-only context where write operations are not supported.

Example

raise DerivaMLReadOnlyError("Cannot create datasets in a downloaded bag")

Source code in src/deriva_ml/core/exceptions.py
class DerivaMLReadOnlyError(DerivaMLException):
    """Exception raised when attempting write operations on read-only resources.

    Raised when attempting to modify data in a downloaded bag or other
    read-only context where write operations are not supported.

    Example:
        >>> raise DerivaMLReadOnlyError("Cannot create datasets in a downloaded bag")
    """

    pass

DerivaMLSchemaError

Bases: DerivaMLConfigurationError

Exception raised for schema or catalog structure issues.

Raised when the catalog schema is invalid, missing required tables, or has structural problems that prevent normal operation.

Example

raise DerivaMLSchemaError("Ambiguous domain schema: ['Schema1', 'Schema2']")

Source code in src/deriva_ml/core/exceptions.py
class DerivaMLSchemaError(DerivaMLConfigurationError):
    """Exception raised for schema or catalog structure issues.

    Raised when the catalog schema is invalid, missing required tables,
    or has structural problems that prevent normal operation.

    Example:
        >>> raise DerivaMLSchemaError("Ambiguous domain schema: ['Schema1', 'Schema2']")
    """

    pass

DerivaMLTableNotFound

Bases: DerivaMLNotFoundError

Exception raised when a table cannot be found.

Raised when attempting to access a table that doesn't exist in the catalog schema or downloaded bag.

Parameters:

  • table_name (str): The name of the table that was not found. Required.
  • msg (str): Additional context. Defaults to "Table not found".
Example

>>> raise DerivaMLTableNotFound("MyTable")
DerivaMLTableNotFound: Table not found: MyTable

Source code in src/deriva_ml/core/exceptions.py
class DerivaMLTableNotFound(DerivaMLNotFoundError):
    """Exception raised when a table cannot be found.

    Raised when attempting to access a table that doesn't exist in the
    catalog schema or downloaded bag.

    Args:
        table_name: The name of the table that was not found.
        msg: Additional context. Defaults to "Table not found".

    Example:
        >>> raise DerivaMLTableNotFound("MyTable")
        DerivaMLTableNotFound: Table not found: MyTable
    """

    def __init__(self, table_name: str, msg: str = "Table not found") -> None:
        super().__init__(f"{msg}: {table_name}")
        self.table_name = table_name

DerivaMLTableTypeError

Bases: DerivaMLDataError

Exception raised when a RID or table is not of the expected type.

Raised when an operation requires a specific table type (e.g., Dataset, Execution) but receives a RID or table reference of a different type.

Parameters:

  • table_type (str): The expected table type (e.g., "Dataset", "Execution"). Required.
  • table (str): The actual table name or RID that was provided. Required.
Example

>>> raise DerivaMLTableTypeError("Dataset", "1-ABC123")
DerivaMLTableTypeError: Table 1-ABC123 is not of type Dataset.

Source code in src/deriva_ml/core/exceptions.py
class DerivaMLTableTypeError(DerivaMLDataError):
    """Exception raised when a RID or table is not of the expected type.

    Raised when an operation requires a specific table type (e.g., Dataset,
    Execution) but receives a RID or table reference of a different type.

    Args:
        table_type: The expected table type (e.g., "Dataset", "Execution").
        table: The actual table name or RID that was provided.

    Example:
        >>> raise DerivaMLTableTypeError("Dataset", "1-ABC123")
        DerivaMLTableTypeError: Table 1-ABC123 is not of type Dataset.
    """

    def __init__(self, table_type: str, table: str) -> None:
        super().__init__(f"Table {table} is not of type {table_type}.")
        self.table_type = table_type
        self.table = table

DerivaMLUploadError

Bases: DerivaMLExecutionError

Exception raised for asset upload failures.

Raised when uploading assets to the catalog fails, including file uploads, metadata insertion, and provenance recording.

Example

raise DerivaMLUploadError("Failed to upload execution assets")

Source code in src/deriva_ml/core/exceptions.py
class DerivaMLUploadError(DerivaMLExecutionError):
    """Exception raised for asset upload failures.

    Raised when uploading assets to the catalog fails, including file
    uploads, metadata insertion, and provenance recording.

    Example:
        >>> raise DerivaMLUploadError("Failed to upload execution assets")
    """

    pass

DerivaMLValidationError

Bases: DerivaMLDataError

Exception raised when data validation fails.

Raised when input data fails validation, such as invalid RID format, mismatched metadata, or constraint violations.

Example

raise DerivaMLValidationError("Invalid RID format: ABC")

Source code in src/deriva_ml/core/exceptions.py
class DerivaMLValidationError(DerivaMLDataError):
    """Exception raised when data validation fails.

    Raised when input data fails validation, such as invalid RID format,
    mismatched metadata, or constraint violations.

    Example:
        >>> raise DerivaMLValidationError("Invalid RID format: ABC")
    """

    pass

DerivaMLWorkflowError

Bases: DerivaMLExecutionError

Exception raised for workflow-related issues.

Raised when there are problems with workflow lookup, creation, or Git integration for workflow tracking.

Example

raise DerivaMLWorkflowError("Not executing in a Git repository")

Source code in src/deriva_ml/core/exceptions.py
class DerivaMLWorkflowError(DerivaMLExecutionError):
    """Exception raised for workflow-related issues.

    Raised when there are problems with workflow lookup, creation, or
    Git integration for workflow tracking.

    Example:
        >>> raise DerivaMLWorkflowError("Not executing in a Git repository")
    """

    pass

ExecAssetType

Bases: BaseStrEnum

Execution asset type identifiers.

Defines the types of assets that can be produced or consumed during an execution. These types are used to categorize files associated with workflow runs.

Attributes:

  • input_file (str): Input file consumed by the execution.
  • output_file (str): Output file produced by the execution.
  • notebook_output (str): Jupyter notebook output from the execution.
  • model_file (str): Machine learning model file (e.g., .pkl, .h5, .pt).

Source code in src/deriva_ml/core/enums.py
class ExecAssetType(BaseStrEnum):
    """Execution asset type identifiers.

    Defines the types of assets that can be produced or consumed during an execution.
    These types are used to categorize files associated with workflow runs.

    Attributes:
        input_file (str): Input file consumed by the execution.
        output_file (str): Output file produced by the execution.
        notebook_output (str): Jupyter notebook output from the execution.
        model_file (str): Machine learning model file (e.g., .pkl, .h5, .pt).
    """

    input_file = "Input_File"
    output_file = "Output_File"
    notebook_output = "Notebook_Output"
    model_file = "Model_File"

ExecMetadataType

Bases: BaseStrEnum

Execution metadata type identifiers.

Defines the types of metadata that can be associated with an execution.

Attributes:

  • execution_config (str): General execution configuration data.
  • runtime_env (str): Runtime environment information.
  • hydra_config (str): Hydra YAML configuration files (config.yaml, overrides.yaml).
  • deriva_config (str): DerivaML execution configuration (configuration.json).

Source code in src/deriva_ml/core/enums.py
class ExecMetadataType(BaseStrEnum):
    """Execution metadata type identifiers.

    Defines the types of metadata that can be associated with an execution.

    Attributes:
        execution_config (str): General execution configuration data.
        runtime_env (str): Runtime environment information.
        hydra_config (str): Hydra YAML configuration files (config.yaml, overrides.yaml).
        deriva_config (str): DerivaML execution configuration (configuration.json).
    """

    execution_config = "Execution_Config"
    runtime_env = "Runtime_Env"
    hydra_config = "Hydra_Config"
    deriva_config = "Deriva_Config"

FileSpec

Bases: BaseModel

Specification for a file to be added to the Deriva catalog.

Represents file metadata required for creating entries in the File table. Handles URL normalization, ensuring local file paths are converted to tag URIs that uniquely identify the file's origin.

Attributes:

  • url (str): File location as URL or local path. Local paths are converted to tag URIs.
  • md5 (str): MD5 checksum for integrity verification.
  • length (int): File size in bytes.
  • description (str | None): Optional description of the file's contents or purpose.
  • file_types (list[str] | None): List of file type classifications from the Asset_Type vocabulary.

Note

The 'File' type is automatically added to file_types if not present when using create_filespecs().

Example

>>> spec = FileSpec(
...     url="/data/results.csv",
...     md5="d41d8cd98f00b204e9800998ecf8427e",
...     length=1024,
...     description="Analysis results",
...     file_types=["CSV", "Data"]
... )

Source code in src/deriva_ml/core/filespec.py
class FileSpec(BaseModel):
    """Specification for a file to be added to the Deriva catalog.

    Represents file metadata required for creating entries in the File table.
    Handles URL normalization, ensuring local file paths are converted to
    tag URIs that uniquely identify the file's origin.

    Attributes:
        url: File location as URL or local path. Local paths are converted to tag URIs.
        md5: MD5 checksum for integrity verification.
        length: File size in bytes.
        description: Optional description of the file's contents or purpose.
        file_types: List of file type classifications from the Asset_Type vocabulary.

    Note:
        The 'File' type is automatically added to file_types if not present when
        using create_filespecs().

    Example:
        >>> spec = FileSpec(
        ...     url="/data/results.csv",
        ...     md5="d41d8cd98f00b204e9800998ecf8427e",
        ...     length=1024,
        ...     description="Analysis results",
        ...     file_types=["CSV", "Data"]
        ... )
    """

    model_config = {"populate_by_name": True}

    url: str = Field(alias="URL")
    md5: str = Field(alias="MD5")
    length: int = Field(alias="Length")
    description: str | None = Field(default="", alias="Description")
    file_types: list[str] | None = Field(default_factory=list)

    @field_validator("url")
    @classmethod
    def validate_file_url(cls, url: str) -> str:
        """Examine the provided URL. If it's a local path, convert it into a tag URL.

        Args:
            url: The URL to validate and potentially convert

        Returns:
            The validated/converted URL

        Raises:
            ValidationError: If the URL is not a file URL
        """
        url_parts = urlparse(url)
        if url_parts.scheme == "tag":
            # Already a tag URL, so just return it.
            return url
        elif (not url_parts.scheme) or url_parts.scheme == "file":
            # There is no scheme part of the URL, or it is a file URL, so it is a local file path.
            # Convert to a tag URL.
            return f"tag://{gethostname()},{date.today()}:file://{url_parts.path}"
        else:
            raise ValueError("url is not a file URL")

    @classmethod
    def create_filespecs(
        cls, path: Path | str, description: str, file_types: list[str] | Callable[[Path], list[str]] | None = None
    ) -> Generator[FileSpec, None, None]:
        """Generate FileSpec objects for a file or directory.

        Creates FileSpec objects with computed MD5 checksums for each file found.
        For directories, recursively processes all files. The 'File' type is
        automatically prepended to file_types if not already present.

        Args:
            path: Path to a file or directory. If directory, all files are processed recursively.
            description: Description to apply to all generated FileSpecs.
            file_types: Either a static list of file types, or a callable that takes a Path
                and returns a list of types for that specific file. Allows dynamic type
                assignment based on file extension, content, etc.

        Yields:
            FileSpec: A specification for each file with computed checksums and metadata.

        Example:
            Static file types:
                >>> specs = FileSpec.create_filespecs("/data/images", "Images", ["Image"])

            Dynamic file types based on extension:
                >>> def get_types(path):
                ...     ext = path.suffix.lower()
                ...     return {".png": ["PNG", "Image"], ".jpg": ["JPEG", "Image"]}.get(ext, [])
                >>> specs = FileSpec.create_filespecs("/data", "Mixed files", get_types)
        """
        path = Path(path)
        file_types = file_types or []
        # Convert static list to callable for uniform handling
        file_types_fn = file_types if callable(file_types) else lambda _x: file_types

        def create_spec(file_path: Path) -> FileSpec:
            """Create a FileSpec for a single file with computed hashes."""
            hashes = hash_utils.compute_file_hashes(file_path, hashes=frozenset(["md5", "sha256"]))
            md5 = hashes["md5"][0]
            type_list = file_types_fn(file_path)
            return FileSpec(
                length=file_path.stat().st_size,
                md5=md5,
                description=description,
                url=file_path.as_posix(),
                # Ensure 'File' type is always included
                file_types=type_list if "File" in type_list else ["File"] + type_list,
            )

        # Handle both single files and directories (recursive)
        files = [path] if path.is_file() else [f for f in Path(path).rglob("*") if f.is_file()]
        return (create_spec(file) for file in files)

    @staticmethod
    def read_filespec(path: Path | str) -> Generator[FileSpec, None, None]:
        """Read FileSpec objects from a JSON Lines file.

        Parses a JSONL file where each line is a JSON object representing a FileSpec.
        Empty lines are skipped. This is useful for batch processing pre-computed
        file specifications.

        Args:
            path: Path to the .jsonl file containing FileSpec data.

        Yields:
            FileSpec: Parsed FileSpec object for each valid line.

        Example:
            >>> for spec in FileSpec.read_filespec("files.jsonl"):
            ...     print(f"{spec.url}: {spec.md5}")
        """
        path = Path(path)
        with path.open("r", encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue
                yield FileSpec(**json.loads(line))
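
The URL normalization performed by validate_file_url can be illustrated with the standard library alone. A minimal sketch of the same logic (normalize_file_url is a hypothetical stand-in; the real validator runs inside the Pydantic model):

```python
from datetime import date
from socket import gethostname
from urllib.parse import urlparse

def normalize_file_url(url: str) -> str:
    """Sketch of FileSpec.validate_file_url: local paths become tag URIs."""
    parts = urlparse(url)
    if parts.scheme == "tag":
        # Already a tag URI; return unchanged.
        return url
    if not parts.scheme or parts.scheme == "file":
        # A bare path or file:// URL: stamp it with the hostname and date
        # so the file's origin is uniquely identified.
        return f"tag://{gethostname()},{date.today()}:file://{parts.path}"
    raise ValueError("url is not a file URL")

print(normalize_file_url("/data/results.csv"))
# e.g. tag://myhost,2024-05-01:file:///data/results.csv
```

The tag URI records where and when the file was registered, so two files with the same local path on different machines remain distinguishable in the catalog.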

create_filespecs classmethod

create_filespecs(
    path: Path | str,
    description: str,
    file_types: list[str]
    | Callable[[Path], list[str]]
    | None = None,
) -> Generator[FileSpec, None, None]

Generate FileSpec objects for a file or directory.

Creates FileSpec objects with computed MD5 checksums for each file found. For directories, recursively processes all files. The 'File' type is automatically prepended to file_types if not already present.

Parameters:

  • path (Path | str): Path to a file or directory. If a directory, all files are processed recursively. Required.
  • description (str): Description to apply to all generated FileSpecs. Required.
  • file_types (list[str] | Callable[[Path], list[str]] | None): Either a static list of file types, or a callable that takes a Path and returns a list of types for that specific file. Allows dynamic type assignment based on file extension, content, etc. Defaults to None.

Yields:

  • FileSpec: A specification for each file with computed checksums and metadata.

Example

Static file types:

>>> specs = FileSpec.create_filespecs("/data/images", "Images", ["Image"])

Dynamic file types based on extension:

>>> def get_types(path):
...     ext = path.suffix.lower()
...     return {".png": ["PNG", "Image"], ".jpg": ["JPEG", "Image"]}.get(ext, [])
>>> specs = FileSpec.create_filespecs("/data", "Mixed files", get_types)

Source code in src/deriva_ml/core/filespec.py
@classmethod
def create_filespecs(
    cls, path: Path | str, description: str, file_types: list[str] | Callable[[Path], list[str]] | None = None
) -> Generator[FileSpec, None, None]:
    """Generate FileSpec objects for a file or directory.

    Creates FileSpec objects with computed MD5 checksums for each file found.
    For directories, recursively processes all files. The 'File' type is
    automatically prepended to file_types if not already present.

    Args:
        path: Path to a file or directory. If directory, all files are processed recursively.
        description: Description to apply to all generated FileSpecs.
        file_types: Either a static list of file types, or a callable that takes a Path
            and returns a list of types for that specific file. Allows dynamic type
            assignment based on file extension, content, etc.

    Yields:
        FileSpec: A specification for each file with computed checksums and metadata.

    Example:
        Static file types:
            >>> specs = FileSpec.create_filespecs("/data/images", "Images", ["Image"])

        Dynamic file types based on extension:
            >>> def get_types(path):
            ...     ext = path.suffix.lower()
            ...     return {".png": ["PNG", "Image"], ".jpg": ["JPEG", "Image"]}.get(ext, [])
            >>> specs = FileSpec.create_filespecs("/data", "Mixed files", get_types)
    """
    path = Path(path)
    file_types = file_types or []
    # Convert static list to callable for uniform handling
    file_types_fn = file_types if callable(file_types) else lambda _x: file_types

    def create_spec(file_path: Path) -> FileSpec:
        """Create a FileSpec for a single file with computed hashes."""
        hashes = hash_utils.compute_file_hashes(file_path, hashes=frozenset(["md5", "sha256"]))
        md5 = hashes["md5"][0]
        type_list = file_types_fn(file_path)
        return FileSpec(
            length=file_path.stat().st_size,
            md5=md5,
            description=description,
            url=file_path.as_posix(),
            # Ensure 'File' type is always included
            file_types=type_list if "File" in type_list else ["File"] + type_list,
        )

    # Handle both single files and directories (recursive)
    files = [path] if path.is_file() else [f for f in path.rglob("*") if f.is_file()]
    return (create_spec(file) for file in files)

read_filespec staticmethod

read_filespec(
    path: Path | str,
) -> Generator[FileSpec, None, None]

Read FileSpec objects from a JSON Lines file.

Parses a JSONL file where each line is a JSON object representing a FileSpec. Empty lines are skipped. This is useful for batch processing pre-computed file specifications.

Parameters:

- `path` (`Path | str`): Path to the .jsonl file containing FileSpec data. Required.

Yields:

- `FileSpec`: Parsed FileSpec object for each valid line.

Example:

    >>> for spec in FileSpec.read_filespec("files.jsonl"):
    ...     print(f"{spec.url}: {spec.md5}")

Source code in src/deriva_ml/core/filespec.py
@staticmethod
def read_filespec(path: Path | str) -> Generator[FileSpec, None, None]:
    """Read FileSpec objects from a JSON Lines file.

    Parses a JSONL file where each line is a JSON object representing a FileSpec.
    Empty lines are skipped. This is useful for batch processing pre-computed
    file specifications.

    Args:
        path: Path to the .jsonl file containing FileSpec data.

    Yields:
        FileSpec: Parsed FileSpec object for each valid line.

    Example:
        >>> for spec in FileSpec.read_filespec("files.jsonl"):
        ...     print(f"{spec.url}: {spec.md5}")
    """
    path = Path(path)
    with path.open("r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            yield FileSpec(**json.loads(line))
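The JSONL handling above can be exercised standalone. This sketch mirrors the read loop (one JSON object per line, blank lines skipped) with plain dicts standing in for FileSpec instances:

```python
import json
import tempfile
from pathlib import Path

def read_jsonl(path: Path):
    """Yield one parsed object per non-empty line of a JSON Lines file."""
    with path.open("r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # blank lines are skipped, as in read_filespec
            yield json.loads(line)

with tempfile.TemporaryDirectory() as d:
    p = Path(d) / "files.jsonl"
    # Two records separated by a blank line
    p.write_text('{"url": "a.csv", "md5": "abc"}\n\n{"url": "b.csv", "md5": "def"}\n')
    records = list(read_jsonl(p))
    print(len(records))  # 2 (the blank line is skipped)
```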

validate_file_url classmethod

validate_file_url(url: str) -> str

Examine the provided URL. If it's a local path, convert it into a tag URL.

Parameters:

- `url` (`str`): The URL to validate and potentially convert. Required.

Returns:

- `str`: The validated/converted URL.

Raises:

- `ValidationError`: If the URL is not a file URL.

Source code in src/deriva_ml/core/filespec.py
@field_validator("url")
@classmethod
def validate_file_url(cls, url: str) -> str:
    """Examine the provided URL. If it's a local path, convert it into a tag URL.

    Args:
        url: The URL to validate and potentially convert

    Returns:
        The validated/converted URL

    Raises:
        ValidationError: If the URL is not a file URL
    """
    url_parts = urlparse(url)
    if url_parts.scheme == "tag":
        # Already a tag URL, so just return it.
        return url
    elif (not url_parts.scheme) or url_parts.scheme == "file":
        # There is no scheme part of the URL, or it is a file URL, so it is a local file path.
        # Convert to a tag URL.
        return f"tag://{gethostname()},{date.today()}:file://{url_parts.path}"
    else:
        raise ValueError("url is not a file URL")
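The branching logic above can be tried outside the Pydantic validator. This standalone sketch (hypothetical `to_tag_url`) reproduces the same three cases: tag URLs pass through, bare paths and `file://` URLs are rewritten, and any other scheme is rejected.

```python
from datetime import date
from socket import gethostname
from urllib.parse import urlparse

def to_tag_url(url: str) -> str:
    """Convert a local path or file:// URL into a tag URL; reject other schemes."""
    parts = urlparse(url)
    if parts.scheme == "tag":
        return url  # already a tag URL
    if not parts.scheme or parts.scheme == "file":
        # Local path: embed hostname and today's date in the tag URL
        return f"tag://{gethostname()},{date.today()}:file://{parts.path}"
    raise ValueError("url is not a file URL")

print(to_tag_url("/data/images/cell.png").startswith("tag://"))  # True
```

Note that the resulting tag URL embeds the hostname and current date, so it is not stable across machines or days.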

FileUploadState

Bases: BaseModel

Tracks the state and result of a file upload operation.

Attributes:

- `state` (`UploadState`): Current state of the upload (success, failed, etc.).
- `status` (`str`): Detailed status message.
- `result` (`Any`): Upload result data, if any.

Source code in src/deriva_ml/core/ermrest.py
class FileUploadState(BaseModel):
    """Tracks the state and result of a file upload operation.

    Attributes:
        state (UploadState): Current state of the upload (success, failed, etc.).
        status (str): Detailed status message.
        result (Any): Upload result data, if any.
    """
    state: UploadState
    status: str
    result: Any

    @computed_field
    @property
    def rid(self) -> RID | None:
        """RID of the uploaded record, or None if no result is available."""
        return self.result["RID"] if self.result else None

MLAsset

Bases: BaseStrEnum

Asset type identifiers.

Defines the types of assets that can be associated with executions.

Attributes:

- `execution_metadata` (`str`): Metadata about an execution.
- `execution_asset` (`str`): Asset produced by an execution.

Source code in src/deriva_ml/core/enums.py
class MLAsset(BaseStrEnum):
    """Asset type identifiers.

    Defines the types of assets that can be associated with executions.

    Attributes:
        execution_metadata (str): Metadata about an execution.
        execution_asset (str): Asset produced by an execution.
    """

    execution_metadata = "Execution_Metadata"
    execution_asset = "Execution_Asset"

MLTable

Bases: BaseStrEnum

Core ML schema table identifiers.

Defines the names of the core tables in the deriva-ml schema. These tables form the backbone of the ML workflow tracking system.

Attributes:

- `dataset` (`str`): Dataset table for versioned data collections.
- `workflow` (`str`): Workflow table for computational pipeline definitions.
- `file` (`str`): File table for tracking individual files.
- `asset` (`str`): Asset table for domain-specific file types.
- `execution` (`str`): Execution table for workflow run tracking.
- `execution_execution` (`str`): Execution_Execution table for nested executions.
- `dataset_version` (`str`): Dataset_Version table for version history.
- `execution_metadata` (`str`): Execution_Metadata table for run metadata.
- `execution_asset` (`str`): Execution_Asset table for run outputs.

Source code in src/deriva_ml/core/enums.py
class MLTable(BaseStrEnum):
    """Core ML schema table identifiers.

    Defines the names of the core tables in the deriva-ml schema. These tables
    form the backbone of the ML workflow tracking system.

    Attributes:
        dataset (str): Dataset table for versioned data collections.
        workflow (str): Workflow table for computational pipeline definitions.
        file (str): File table for tracking individual files.
        asset (str): Asset table for domain-specific file types.
        execution (str): Execution table for workflow run tracking.
        execution_execution (str): Execution_Execution table for nested executions.
        dataset_version (str): Dataset_Version table for version history.
        execution_metadata (str): Execution_Metadata table for run metadata.
        execution_asset (str): Execution_Asset table for run outputs.
    """

    dataset = "Dataset"
    workflow = "Workflow"
    file = "File"
    asset = "Asset"
    execution = "Execution"
    execution_execution = "Execution_Execution"
    dataset_version = "Dataset_Version"
    execution_metadata = "Execution_Metadata"
    execution_asset = "Execution_Asset"

MLVocab

Bases: BaseStrEnum

Controlled vocabulary table identifiers.

Defines the names of controlled vocabulary tables used in DerivaML. These tables store standardized terms with descriptions and synonyms for consistent data classification across the catalog.

Attributes:

- `dataset_type` (`str`): Dataset classification vocabulary (e.g., "Training", "Test").
- `workflow_type` (`str`): Workflow classification vocabulary (e.g., "Python", "Notebook").
- `asset_type` (`str`): Asset/file type classification vocabulary (e.g., "Image", "CSV").
- `asset_role` (`str`): Asset role vocabulary for execution relationships (e.g., "Input", "Output").
- `feature_name` (`str`): Feature name vocabulary for ML feature definitions.

Source code in src/deriva_ml/core/enums.py
class MLVocab(BaseStrEnum):
    """Controlled vocabulary table identifiers.

    Defines the names of controlled vocabulary tables used in DerivaML. These tables
    store standardized terms with descriptions and synonyms for consistent data
    classification across the catalog.

    Attributes:
        dataset_type (str): Dataset classification vocabulary (e.g., "Training", "Test").
        workflow_type (str): Workflow classification vocabulary (e.g., "Python", "Notebook").
        asset_type (str): Asset/file type classification vocabulary (e.g., "Image", "CSV").
        asset_role (str): Asset role vocabulary for execution relationships (e.g., "Input", "Output").
        feature_name (str): Feature name vocabulary for ML feature definitions.
    """

    dataset_type = "Dataset_Type"
    workflow_type = "Workflow_Type"
    asset_type = "Asset_Type"
    asset_role = "Asset_Role"
    feature_name = "Feature_Name"

Status

Bases: BaseStrEnum

Execution status values.

Represents the various states an execution can be in throughout its lifecycle.

Attributes:

- `initializing` (`str`): Initial setup is in progress.
- `created` (`str`): Execution record has been created.
- `pending` (`str`): Execution is queued.
- `running` (`str`): Execution is in progress.
- `aborted` (`str`): Execution was manually stopped.
- `completed` (`str`): Execution finished successfully.
- `failed` (`str`): Execution encountered an error.

Source code in src/deriva_ml/core/enums.py
class Status(BaseStrEnum):
    """Execution status values.

    Represents the various states an execution can be in throughout its lifecycle.

    Attributes:
        initializing (str): Initial setup is in progress.
        created (str): Execution record has been created.
        pending (str): Execution is queued.
        running (str): Execution is in progress.
        aborted (str): Execution was manually stopped.
        completed (str): Execution finished successfully.
        failed (str): Execution encountered an error.
    """

    initializing = "Initializing"
    created = "Created"
    pending = "Pending"
    running = "Running"
    aborted = "Aborted"
    completed = "Completed"
    failed = "Failed"
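BaseStrEnum is not shown here; assuming it mixes `str` into `Enum` (as its string-valued members suggest), members compare equal to their catalog string values and can be looked up by value. A minimal sketch of that behavior:

```python
from enum import Enum

# Sketch only: a str-mixin enum with two of the Status values above.
class Status(str, Enum):
    running = "Running"
    completed = "Completed"

print(Status.running == "Running")              # True: compares equal to its value
print(Status("Completed") is Status.completed)  # True: lookup by value
```

This is what lets code compare an execution's stored status string directly against an enum member without explicit `.value` access.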

UploadCallback

Bases: Protocol

Protocol for upload progress callbacks.

Implement this protocol to receive progress updates during file uploads. The callback is invoked with an UploadProgress object containing current upload state information.

Example:

    >>> def my_callback(progress: UploadProgress) -> None:
    ...     print(f"Uploading {progress.file_name}: {progress.percent_complete:.1f}%")
    ...
    >>> execution.upload_execution_outputs(progress_callback=my_callback)

Source code in src/deriva_ml/core/ermrest.py
class UploadCallback(Protocol):
    """Protocol for upload progress callbacks.

    Implement this protocol to receive progress updates during file uploads.
    The callback is invoked with an UploadProgress object containing current
    upload state information.

    Example:
        >>> def my_callback(progress: UploadProgress) -> None:
        ...     print(f"Uploading {progress.file_name}: {progress.percent_complete:.1f}%")
        ...
        >>> execution.upload_execution_outputs(progress_callback=my_callback)
    """
    def __call__(self, progress: UploadProgress) -> None:
        """Called with upload progress information.

        Args:
            progress: Current upload progress state.
        """
        ...

__call__

__call__(
    progress: UploadProgress,
) -> None

Called with upload progress information.

Parameters:

- `progress` (`UploadProgress`): Current upload progress state. Required.
Source code in src/deriva_ml/core/ermrest.py
def __call__(self, progress: UploadProgress) -> None:
    """Called with upload progress information.

    Args:
        progress: Current upload progress state.
    """
    ...

UploadProgress dataclass

Progress information for file uploads.

This dataclass is passed to upload callbacks to report progress during file upload operations.

Attributes:

- `file_path` (`str`): Path to the file being uploaded.
- `file_name` (`str`): Name of the file being uploaded.
- `bytes_completed` (`int`): Number of bytes uploaded so far.
- `bytes_total` (`int`): Total number of bytes to upload.
- `percent_complete` (`float`): Percentage of upload completed (0-100).
- `phase` (`str`): Current phase of the upload operation.
- `message` (`str`): Human-readable status message.

Source code in src/deriva_ml/core/ermrest.py
@dataclass
class UploadProgress:
    """Progress information for file uploads.

    This dataclass is passed to upload callbacks to report progress during
    file upload operations.

    Attributes:
        file_path: Path to the file being uploaded.
        file_name: Name of the file being uploaded.
        bytes_completed: Number of bytes uploaded so far.
        bytes_total: Total number of bytes to upload.
        percent_complete: Percentage of upload completed (0-100).
        phase: Current phase of the upload operation.
        message: Human-readable status message.
    """
    file_path: str = ""
    file_name: str = ""
    bytes_completed: int = 0
    bytes_total: int = 0
    percent_complete: float = 0.0
    phase: str = ""
    message: str = ""
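Because UploadCallback is a Protocol, any callable accepting an UploadProgress satisfies it. This sketch (hypothetical `ProgressCollector`; the dataclass mirrors the fields documented above) shows a callback that records every update rather than printing:

```python
from dataclasses import dataclass

# Mirror of the documented UploadProgress fields, for a self-contained example.
@dataclass
class UploadProgress:
    file_path: str = ""
    file_name: str = ""
    bytes_completed: int = 0
    bytes_total: int = 0
    percent_complete: float = 0.0
    phase: str = ""
    message: str = ""

class ProgressCollector:
    """Callable matching the UploadCallback protocol; keeps a history of updates."""
    def __init__(self) -> None:
        self.events: list[UploadProgress] = []

    def __call__(self, progress: UploadProgress) -> None:
        self.events.append(progress)

cb = ProgressCollector()
cb(UploadProgress(file_name="a.csv", bytes_completed=50, bytes_total=100, percent_complete=50.0))
print(cb.events[-1].percent_complete)  # 50.0
```

Collecting events like this is useful in tests or when driving a GUI progress bar, where printing is not an option.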

UploadState

Bases: Enum

File upload operation states.

Represents the various states a file upload operation can be in, from initiation to completion.

Attributes:

- `success` (`int`): Upload completed successfully.
- `failed` (`int`): Upload failed.
- `pending` (`int`): Upload is queued.
- `running` (`int`): Upload is in progress.
- `paused` (`int`): Upload is temporarily paused.
- `aborted` (`int`): Upload was aborted.
- `cancelled` (`int`): Upload was cancelled.
- `timeout` (`int`): Upload timed out.

Source code in src/deriva_ml/core/enums.py
class UploadState(Enum):
    """File upload operation states.

    Represents the various states a file upload operation can be in, from initiation to completion.

    Attributes:
        success (int): Upload completed successfully.
        failed (int): Upload failed.
        pending (int): Upload is queued.
        running (int): Upload is in progress.
        paused (int): Upload is temporarily paused.
        aborted (int): Upload was aborted.
        cancelled (int): Upload was cancelled.
        timeout (int): Upload timed out.
    """

    success = 0
    failed = 1
    pending = 2
    running = 3
    paused = 4
    aborted = 5
    cancelled = 6
    timeout = 7

VocabularyTerm

Bases: BaseModel

Represents a term in a controlled vocabulary.

A vocabulary term is a standardized entry in a controlled vocabulary table. Each term has a primary name, optional synonyms, and identifiers for cross-referencing.

Attributes:

- `name` (`str`): Primary name of the term.
- `synonyms` (`list[str] | None`): Alternative names for the term.
- `id` (`str`): CURIE (Compact URI) identifier.
- `uri` (`str`): Full URI for the term.
- `description` (`str`): Explanation of the term's meaning.
- `rid` (`str`): Resource identifier in the catalog.

Example:

    >>> term = VocabularyTerm(
    ...     Name="epithelial",
    ...     Synonyms=["epithelium"],
    ...     ID="tissue:0001",
    ...     URI="http://example.org/tissue/0001",
    ...     Description="Epithelial tissue type",
    ...     RID="1-abc123"
    ... )

Source code in src/deriva_ml/core/ermrest.py
class VocabularyTerm(BaseModel):
    """Represents a term in a controlled vocabulary.

    A vocabulary term is a standardized entry in a controlled vocabulary table. Each term has
    a primary name, optional synonyms, and identifiers for cross-referencing.

    Attributes:
        name (str): Primary name of the term.
        synonyms (list[str] | None): Alternative names for the term.
        id (str): CURIE (Compact URI) identifier.
        uri (str): Full URI for the term.
        description (str): Explanation of the term's meaning.
        rid (str): Resource identifier in the catalog.

    Example:
        >>> term = VocabularyTerm(
        ...     Name="epithelial",
        ...     Synonyms=["epithelium"],
        ...     ID="tissue:0001",
        ...     URI="http://example.org/tissue/0001",
        ...     Description="Epithelial tissue type",
        ...     RID="1-abc123"
        ... )
    """
    _name: str = PrivateAttr()
    _synonyms: list[str] | None = PrivateAttr()
    _description: str = PrivateAttr()
    id: str = Field(validation_alias=AliasChoices("ID", "id"))
    uri: str = Field(validation_alias=AliasChoices("URI", "uri"))
    rid: str = Field(validation_alias=AliasChoices("RID", "rid"))

    def __init__(self, **data):
        # Extract fields that will be private attrs before calling super
        name = data.pop("Name", None) or data.pop("name", None)
        synonyms = data.pop("Synonyms", None) or data.pop("synonyms", None)
        description = data.pop("Description", None) or data.pop("description", None)
        super().__init__(**data)
        self._name = name
        self._synonyms = synonyms
        self._description = description

    @property
    def name(self) -> str:
        """Primary name of the term."""
        return self._name

    @property
    def synonyms(self) -> tuple[str, ...]:
        """Alternative names for the term (immutable)."""
        return tuple(self._synonyms or [])

    @property
    def description(self) -> str:
        """Explanation of the term's meaning."""
        return self._description

    class Config:
        extra = "ignore"

description property

description: str

Explanation of the term's meaning.

name property

name: str

Primary name of the term.

synonyms property

synonyms: tuple[str, ...]

Alternative names for the term (immutable).

VocabularyTermHandle

Bases: VocabularyTerm

A VocabularyTerm with methods to modify it in the catalog.

This class extends VocabularyTerm to provide mutable access to vocabulary terms. Changes made through property setters are persisted to the catalog.

The synonyms property returns a tuple (immutable) to prevent accidental modification without catalog update. To modify synonyms, assign a new tuple/list to the property.

Example:

    >>> term = ml.lookup_term("Dataset_Type", "Training")
    >>> term.description = "Data used for model training"
    >>> term.synonyms = ("Train", "TrainingData")
    >>> term.delete()

Source code in src/deriva_ml/core/ermrest.py
class VocabularyTermHandle(VocabularyTerm):
    """A VocabularyTerm with methods to modify it in the catalog.

    This class extends VocabularyTerm to provide mutable access to vocabulary
    terms. Changes made through property setters are persisted to the catalog.

    The `synonyms` property returns a tuple (immutable) to prevent accidental
    modification without catalog update. To modify synonyms, assign a new
    tuple/list to the property.

    Attributes:
        Inherits all attributes from VocabularyTerm.

    Example:
        >>> term = ml.lookup_term("Dataset_Type", "Training")
        >>> term.description = "Data used for model training"
        >>> term.synonyms = ("Train", "TrainingData")
        >>> term.delete()
    """

    _ml: Any = PrivateAttr()
    _table: str = PrivateAttr()

    def __init__(self, ml: Any, table: str, **data):
        """Initialize a VocabularyTermHandle.

        Args:
            ml: DerivaML instance for catalog operations.
            table: Name of the vocabulary table containing this term.
            **data: Term data (Name, Synonyms, Description, ID, URI, RID).
        """
        super().__init__(**data)
        self._ml = ml
        self._table = table

    @property
    def description(self) -> str:
        """Explanation of the term's meaning."""
        return self._description

    @description.setter
    def description(self, value: str) -> None:
        """Update the term's description in the catalog.

        Args:
            value: New description for the term.
        """
        self._ml._update_term_description(self._table, self.name, value)
        self._description = value

    @property
    def synonyms(self) -> tuple[str, ...]:
        """Alternative names for the term (immutable).

        Returns a tuple to prevent accidental modification without catalog update.
        To modify synonyms, assign a new tuple/list to this property.
        """
        return tuple(self._synonyms or [])

    @synonyms.setter
    def synonyms(self, value: list[str] | tuple[str, ...]) -> None:
        """Replace all synonyms for this term in the catalog.

        Args:
            value: New list of synonyms (replaces all existing synonyms).
        """
        new_synonyms = list(value)
        self._ml._update_term_synonyms(self._table, self.name, new_synonyms)
        self._synonyms = new_synonyms

    def delete(self) -> None:
        """Delete this term from the vocabulary.

        Raises:
            DerivaMLException: If the term is currently in use by other records.
        """
        self._ml.delete_term(self._table, self.name)

description property writable

description: str

Explanation of the term's meaning.

name property

name: str

Primary name of the term.

synonyms property writable

synonyms: tuple[str, ...]

Alternative names for the term (immutable).

Returns a tuple to prevent accidental modification without catalog update. To modify synonyms, assign a new tuple/list to this property.

__init__

__init__(ml: Any, table: str, **data)

Initialize a VocabularyTermHandle.

Parameters:

- `ml` (`Any`): DerivaML instance for catalog operations. Required.
- `table` (`str`): Name of the vocabulary table containing this term. Required.
- `**data`: Term data (Name, Synonyms, Description, ID, URI, RID). Defaults to `{}`.
Source code in src/deriva_ml/core/ermrest.py
def __init__(self, ml: Any, table: str, **data):
    """Initialize a VocabularyTermHandle.

    Args:
        ml: DerivaML instance for catalog operations.
        table: Name of the vocabulary table containing this term.
        **data: Term data (Name, Synonyms, Description, ID, URI, RID).
    """
    super().__init__(**data)
    self._ml = ml
    self._table = table

delete

delete() -> None

Delete this term from the vocabulary.

Raises:

- `DerivaMLException`: If the term is currently in use by other records.

Source code in src/deriva_ml/core/ermrest.py
def delete(self) -> None:
    """Delete this term from the vocabulary.

    Raises:
        DerivaMLException: If the term is currently in use by other records.
    """
    self._ml.delete_term(self._table, self.name)
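The write-through setter pattern used by VocabularyTermHandle (persist to the catalog first, then update the cached copy) can be sketched with an in-memory stand-in. `FakeCatalog` and `TermHandle` here are hypothetical, not the library classes:

```python
class FakeCatalog:
    """In-memory stand-in for the catalog backing store."""
    def __init__(self) -> None:
        self.rows: dict[str, dict] = {}

    def update_description(self, table: str, name: str, value: str) -> None:
        self.rows.setdefault(f"{table}/{name}", {})["Description"] = value

class TermHandle:
    def __init__(self, catalog: FakeCatalog, table: str, name: str, description: str):
        self._catalog, self._table = catalog, table
        self._name, self._description = name, description

    @property
    def description(self) -> str:
        return self._description

    @description.setter
    def description(self, value: str) -> None:
        # Persist first; if this raises, the local cache stays consistent
        self._catalog.update_description(self._table, self._name, value)
        self._description = value

cat = FakeCatalog()
term = TermHandle(cat, "Dataset_Type", "Training", "old")
term.description = "Data used for model training"
print(cat.rows["Dataset_Type/Training"]["Description"])
```

The persist-then-cache ordering means a failed catalog write leaves the local copy untouched, so the handle never claims a value the catalog rejected.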

get_domain_schemas

get_domain_schemas(
    all_schemas: set[str] | list[str],
    ml_schema: str = ML_SCHEMA,
) -> frozenset[str]

Return all domain schemas from a collection of schema names.

Filters out system schemas (public, www, WWW) and the ML schema to return only user-defined domain schemas.

Parameters:

- `all_schemas` (`set[str] | list[str]`): Collection of schema names to filter. Required.
- `ml_schema` (`str`): Name of the ML schema to exclude. Defaults to `ML_SCHEMA` ('deriva-ml').

Returns:

- `frozenset[str]`: Frozen set of domain schema names.

Example:

    >>> get_domain_schemas(["public", "deriva-ml", "my_project", "www"])
    frozenset({'my_project'})

Source code in src/deriva_ml/core/constants.py
def get_domain_schemas(all_schemas: set[str] | list[str], ml_schema: str = ML_SCHEMA) -> frozenset[str]:
    """Return all domain schemas from a collection of schema names.

    Filters out system schemas (public, www, WWW) and the ML schema to return
    only user-defined domain schemas.

    Args:
        all_schemas: Collection of schema names to filter.
        ml_schema: Name of the ML schema to exclude (default: 'deriva-ml').

    Returns:
        Frozen set of domain schema names.

    Example:
        >>> get_domain_schemas(["public", "deriva-ml", "my_project", "www"])
        frozenset({'my_project'})
    """
    return frozenset(s for s in all_schemas if not is_system_schema(s, ml_schema))

is_system_schema

is_system_schema(
    schema_name: str,
    ml_schema: str = ML_SCHEMA,
) -> bool

Check if a schema is a system or ML schema (not a domain schema).

System schemas are Deriva infrastructure schemas (public, www, WWW) and the ML schema (deriva-ml by default). Domain schemas are user-defined schemas containing business logic tables.

Parameters:

- `schema_name` (`str`): Name of the schema to check. Required.
- `ml_schema` (`str`): Name of the ML schema. Defaults to `ML_SCHEMA` ('deriva-ml').

Returns:

- `bool`: True if the schema is a system or ML schema, False if it's a domain schema.

Example:

    >>> is_system_schema("public")
    True
    >>> is_system_schema("deriva-ml")
    True
    >>> is_system_schema("my_project")
    False

Source code in src/deriva_ml/core/constants.py
def is_system_schema(schema_name: str, ml_schema: str = ML_SCHEMA) -> bool:
    """Check if a schema is a system or ML schema (not a domain schema).

    System schemas are Deriva infrastructure schemas (public, www, WWW) and the
    ML schema (deriva-ml by default). Domain schemas are user-defined schemas
    containing business logic tables.

    Args:
        schema_name: Name of the schema to check.
        ml_schema: Name of the ML schema (default: 'deriva-ml').

    Returns:
        True if the schema is a system or ML schema, False if it's a domain schema.

    Example:
        >>> is_system_schema("public")
        True
        >>> is_system_schema("deriva-ml")
        True
        >>> is_system_schema("my_project")
        False
    """
    return schema_name.lower() in {s.lower() for s in SYSTEM_SCHEMAS} or schema_name == ml_schema
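The two helpers compose: `get_domain_schemas` is just the set of names for which `is_system_schema` is false. A self-contained sketch, assuming `SYSTEM_SCHEMAS` contains `public`, `www`, and `WWW` as the docstrings describe:

```python
# Assumed constant, per the docstrings above; the real value lives in
# deriva_ml.core.constants.
SYSTEM_SCHEMAS = {"public", "www", "WWW"}
ML_SCHEMA = "deriva-ml"

def is_system_schema(schema_name: str, ml_schema: str = ML_SCHEMA) -> bool:
    """True for Deriva infrastructure schemas and the ML schema itself."""
    return schema_name.lower() in {s.lower() for s in SYSTEM_SCHEMAS} or schema_name == ml_schema

def get_domain_schemas(all_schemas, ml_schema: str = ML_SCHEMA) -> frozenset:
    """Keep only user-defined domain schemas."""
    return frozenset(s for s in all_schemas if not is_system_schema(s, ml_schema))

print(get_domain_schemas(["public", "deriva-ml", "my_project", "www"]))  # frozenset({'my_project'})
```

Note the asymmetry: system schema names are matched case-insensitively, while the ML schema name must match exactly.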