Exceptions
DerivaML defines custom exceptions to provide clear error messages for common error conditions when working with catalogs, datasets, and executions.
Custom exceptions for the DerivaML package.
This module defines the exception hierarchy for DerivaML. All DerivaML-specific exceptions inherit from DerivaMLException, making it easy to catch all library errors with a single except clause.
Exception Hierarchy
DerivaMLException (base class for all DerivaML errors)
│
├── DerivaMLConfigurationError (configuration and initialization)
│   ├── DerivaMLSchemaError (schema/catalog structure issues)
│   ├── DerivaMLSchemaRefreshBlocked (refresh attempted with staged work pending)
│   ├── DerivaMLSchemaPinned (refresh attempted on a pinned cache)
│   ├── DerivaMLAuthenticationError (authentication failures)
│   ├── DerivaMLOfflineError (online-only operation in offline mode)
│   └── DerivaMLNoExecutionContext (write attempted on read-only handle)
│
├── DerivaMLDataError (data access and validation)
│   ├── DerivaMLNotFoundError (entity not found)
│   │   ├── DerivaMLDatasetNotFound (dataset lookup failures)
│   │   ├── DerivaMLTableNotFound (table lookup failures)
│   │   └── DerivaMLInvalidTerm (vocabulary term not found)
│   ├── DerivaMLTableTypeError (wrong table type)
│   ├── DerivaMLValidationError (data validation failures)
│   │   └── DerivaMLMaterializeLimitExceeded (result set exceeds materialize_limit)
│   ├── DerivaMLCycleError (cycle detected in relationships)
│   └── DerivaMLStateInconsistency (SQLite/catalog state disagreement)
│
├── DerivaMLExecutionError (execution lifecycle)
│   ├── DerivaMLWorkflowError (workflow issues)
│   │   └── DerivaMLDirtyWorkflowError (uncommitted changes)
│   └── DerivaMLUploadError (asset upload failures)
│
├── DerivaMLReadOnlyError (write operation on read-only resource)
│
└── DerivaMLDenormalizeError (denormalization planning errors)
    ├── DerivaMLDenormalizeMultiLeaf
    ├── DerivaMLDenormalizeNoSink
    ├── DerivaMLDenormalizeDownstreamLeaf
    ├── DerivaMLDenormalizeAmbiguousPath
    └── DerivaMLDenormalizeUnrelatedAnchor
Example
>>> from deriva_ml.core.exceptions import (
...     DerivaMLDatasetNotFound,
...     DerivaMLException,
...     DerivaMLNotFoundError,
... )
>>> try:  # doctest: +SKIP
...     dataset = ml.lookup_dataset("invalid_rid")
... except DerivaMLDatasetNotFound as e:
...     print(f"Dataset not found: {e}")
... except DerivaMLNotFoundError as e:
...     print(f"Entity not found: {e}")
... except DerivaMLException as e:
...     print(f"DerivaML error: {e}")
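Python tries `except` clauses in order, so the most specific class has to be listed first. The following sketch illustrates that with local stand-in classes that mirror a few levels of the documented hierarchy, so it runs without `deriva_ml` installed; with the real package the classes would be imported from `deriva_ml.core.exceptions`. The `classify` helper is hypothetical and exists only for this illustration.

```python
# Local stand-ins mirroring part of the documented hierarchy (assumption:
# the real classes are imported from deriva_ml.core.exceptions).
class DerivaMLException(Exception):
    pass


class DerivaMLDataError(DerivaMLException):
    pass


class DerivaMLNotFoundError(DerivaMLDataError):
    pass


class DerivaMLDatasetNotFound(DerivaMLNotFoundError):
    pass


class DerivaMLTableNotFound(DerivaMLNotFoundError):
    pass


def classify(exc: Exception) -> str:
    """Report which handler catches `exc` (hypothetical helper)."""
    try:
        raise exc
    except DerivaMLDatasetNotFound:
        return "dataset"  # most specific handler first
    except DerivaMLNotFoundError:
        return "not-found"  # any other lookup failure
    except DerivaMLException:
        return "derivaml"  # catch-all for library errors
    except Exception:
        return "other"


print(classify(DerivaMLDatasetNotFound("1-ABC")))    # dataset
print(classify(DerivaMLTableNotFound("MyTable")))    # not-found
print(classify(DerivaMLDataError("Invalid data")))   # derivaml
```

If the `DerivaMLException` clause were listed first, it would catch all three cases and the more specific handlers would never run.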
DerivaMLAuthenticationError
Bases: DerivaMLConfigurationError
Exception raised for authentication failures.
Raised when authentication with the catalog fails or credentials are invalid.
Example
raise DerivaMLAuthenticationError("Failed to authenticate with catalog") # doctest: +SKIP
Source code in src/deriva_ml/core/exceptions.py
DerivaMLConfigurationError
Bases: DerivaMLException
Exception raised for configuration and initialization errors.
Raised when there are issues with DerivaML configuration, catalog initialization, or schema setup.
Example
raise DerivaMLConfigurationError("Invalid catalog configuration") # doctest: +SKIP
Source code in src/deriva_ml/core/exceptions.py
DerivaMLCycleError
Bases: DerivaMLDataError
Exception raised when a cycle is detected in relationships.
Raised when creating dataset hierarchies or other relationships that would result in a circular dependency.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `cycle_nodes` | `list[str]` | List of nodes involved in the cycle. | required |
| `msg` | `str` | Additional context. Defaults to "Cycle detected". | `'Cycle detected'` |
Example
raise DerivaMLCycleError(["Dataset1", "Dataset2", "Dataset1"]) # doctest: +SKIP
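As an illustration of the kind of check that leads to this error, the sketch below tests whether adding a parent-to-child link to a dataset hierarchy would close a loop. It is not the library's implementation: `CycleError` is a simplified local stand-in for `DerivaMLCycleError`, its message format is made up, and `check_new_child` and the `children` mapping are hypothetical.

```python
class CycleError(Exception):
    """Simplified stand-in for DerivaMLCycleError (hypothetical)."""

    def __init__(self, cycle_nodes: list[str], msg: str = "Cycle detected"):
        # Message layout is an assumption made for this sketch.
        super().__init__(f"{msg}: {' -> '.join(cycle_nodes)}")
        self.cycle_nodes = cycle_nodes


def check_new_child(children: dict[str, list[str]], parent: str, child: str) -> None:
    """Raise CycleError if linking parent -> child would create a cycle."""
    # Walk downward from `child`. Reaching `parent` means `child` already
    # contains `parent`, so the new link would close a loop.
    stack = [(child, [parent, child])]
    seen: set[str] = set()
    while stack:
        node, path = stack.pop()
        if node == parent:
            raise CycleError(path)
        if node in seen:
            continue
        seen.add(node)
        for nxt in children.get(node, []):
            stack.append((nxt, path + [nxt]))


hierarchy = {"Dataset1": ["Dataset2"]}
check_new_child(hierarchy, "Dataset2", "Dataset3")  # fine, no cycle
try:
    check_new_child(hierarchy, "Dataset2", "Dataset1")
except CycleError as e:
    print(e.cycle_nodes)  # ['Dataset2', 'Dataset1', 'Dataset2']
```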
Source code in src/deriva_ml/core/exceptions.py
DerivaMLDataError
Bases: DerivaMLException
Exception raised for data access and validation issues.
Base class for errors related to data lookup, validation, and integrity.
Example
raise DerivaMLDataError("Invalid data format") # doctest: +SKIP
Source code in src/deriva_ml/core/exceptions.py
DerivaMLDatasetNotFound
Bases: DerivaMLNotFoundError
Exception raised when a dataset cannot be found.
Raised when attempting to look up a dataset that doesn't exist in the catalog or downloaded bag.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `dataset_rid` | `str` | The RID of the dataset that was not found. | required |
| `msg` | `str` | Additional context. Defaults to "Dataset not found". | `'Dataset not found'` |
Example
>>> raise DerivaMLDatasetNotFound("1-ABC")  # doctest: +SKIP
DerivaMLDatasetNotFound: Dataset 1-ABC not found
Source code in src/deriva_ml/core/exceptions.py
DerivaMLDenormalizeAmbiguousPath
Bases: DerivaMLDenormalizeError
Multiple FK paths between two requested tables — can't silently choose.
Raised when Rule 6 detects two or more distinct FK paths between
row_per and another requested / via table. Silent path selection
is rejected by design — the result shape would be materially
different depending on which path is chosen, and callers should be
explicit. Disambiguate by adding intermediates to include_tables
(their columns are included) or to via= (path-only, columns
excluded).
Attributes:
| Name | Type | Description |
|---|---|---|
| `from_table` | | The row_per table. |
| `to_table` | | The requested table with multiple paths. |
| `paths` | | List of path descriptions; each is a list of table names from from_table to to_table. |
| `suggested_intermediates` | | Tables that appear in at least one path but not in include_tables. |
Example
>>> try:  # doctest: +SKIP
...     d.as_dataframe(["Image", "Subject"])  # diamond schema
... except DerivaMLDenormalizeAmbiguousPath as e:
...     for p in e.paths:
...         print(" → ".join(p))
...     # Retry routing explicitly through Observation:
...     df = d.as_dataframe(
...         ["Image", "Subject"], via=e.suggested_intermediates[:1]
...     )
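To make the "diamond schema" case concrete, here is a minimal sketch of how more than one path between two tables can be detected and reported. It is an assumption-laden illustration, not the planner's actual Rule 6: `AmbiguousPath` is a local stand-in for `DerivaMLDenormalizeAmbiguousPath`, the `links` map treats FK relationships as undirected, the `Encounter` table is invented, and `via=` is not modelled.

```python
class AmbiguousPath(Exception):
    """Simplified stand-in for DerivaMLDenormalizeAmbiguousPath (hypothetical)."""

    def __init__(self, from_table, to_table, paths, suggested_intermediates):
        super().__init__(f"{len(paths)} FK paths from {from_table} to {to_table}")
        self.from_table = from_table
        self.to_table = to_table
        self.paths = paths
        self.suggested_intermediates = suggested_intermediates


def simple_paths(links, start, goal):
    """Return every simple path from start to goal in an undirected link map."""
    found = []

    def walk(node, path):
        if node == goal:
            found.append(path)
            return
        for nxt in sorted(links.get(node, ())):
            if nxt not in path:  # never revisit a table on the same path
                walk(nxt, path + [nxt])

    walk(start, [start])
    return found


def check_unambiguous(links, from_table, to_table, include_tables):
    """Return the single path, or raise AmbiguousPath if there are several."""
    paths = simple_paths(links, from_table, to_table)
    if len(paths) > 1:
        # Tables on some path that the caller did not request.
        extras = sorted({t for p in paths for t in p} - set(include_tables))
        raise AmbiguousPath(from_table, to_table, paths, extras)
    return paths


# Diamond: Image reaches Subject through Observation or through Encounter.
links = {
    "Image": {"Observation", "Encounter"},
    "Observation": {"Image", "Subject"},
    "Encounter": {"Image", "Subject"},
    "Subject": {"Observation", "Encounter"},
}
try:
    check_unambiguous(links, "Image", "Subject", ["Image", "Subject"])
except AmbiguousPath as e:
    print(e.paths)
    print(e.suggested_intermediates)
```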
Source code in src/deriva_ml/core/exceptions.py
DerivaMLDenormalizeDownstreamLeaf
Bases: DerivaMLDenormalizeError
Explicit row_per conflicts with a downstream table in include_tables.
Raised when the user specifies row_per=X but another table in
include_tables is downstream of X via FK (would require aggregation).
Attributes:
| Name | Type | Description |
|---|---|---|
| `row_per` | | The explicit row_per value. |
| `downstream_tables` | | Tables downstream of row_per that can't be hoisted. |
Source code in src/deriva_ml/core/exceptions.py
DerivaMLDenormalizeError
Bases: DerivaMLException
Base class for denormalization errors.
All errors raised by deriva_ml.local_db.denormalizer.Denormalizer
and related planning functions are instances of this class.
Example
raise DerivaMLDenormalizeError("Planner failed") # doctest: +SKIP
Source code in src/deriva_ml/core/exceptions.py
DerivaMLDenormalizeMultiLeaf
Bases: DerivaMLDenormalizeError
Multiple candidate tables for row_per — ambiguous leaf.
Raised when Rule 2 auto-inference finds more than one sink in
include_tables — i.e., multiple tables tie for "deepest in the
FK graph." The user must specify row_per explicitly to resolve.
Attributes:
| Name | Type | Description |
|---|---|---|
| `candidates` | | List of table names that all qualify as sinks. |
| `include_tables` | | The include_tables that were requested. |
Example
>>> try:  # doctest: +SKIP
...     d.as_dataframe(["Dataset", "Subject"])
... except DerivaMLDenormalizeMultiLeaf as e:
...     print(f"Pick one of {e.candidates} as row_per")
...     # Then retry: d.as_dataframe(..., row_per="Subject")
Source code in src/deriva_ml/core/exceptions.py
DerivaMLDenormalizeNoSink
Bases: DerivaMLDenormalizeError
No sink found in the FK subgraph — cycle detected.
Raised when every table in include_tables has an outbound FK to
another table in the set, forming a cycle. Pathological — rare in
real schemas.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `msg` | `str` | Descriptive error message. Should identify the tables forming the cycle. | `''` |
Example
>>> raise DerivaMLDenormalizeNoSink(  # doctest: +SKIP
...     "Cycle in FK graph between tables A, B, C"
... )
Source code in src/deriva_ml/core/exceptions.py
DerivaMLDenormalizeUnrelatedAnchor
Bases: DerivaMLDenormalizeError
Anchor has no FK path to any table in include_tables ∪ via.
Raised when Rule 8 detects anchors whose table has no FK
relationship to any requested table — those anchors would
contribute nothing to the output, which is almost always a mistake
(wrong dataset passed, stale table name, etc.). Pass
ignore_unrelated_anchors=True to silently drop them if the
heterogeneity is intentional.
Note: this is distinct from Rule 7 case 5 (table has an FK path
into include_tables ∪ via but the specific anchor RIDs don't
reach row_per). Case 5 anchors are silently dropped regardless
of the flag — only case 6 (no path at all) raises this error.
Attributes:
| Name | Type | Description |
|---|---|---|
| `unrelated_tables` | | Tables of the unrelated anchors. |
| `include_tables` | | The include_tables that were requested. |
Example
>>> try:  # doctest: +SKIP
...     d.as_dataframe(["Image", "Subject"])  # dataset has stray types
... except DerivaMLDenormalizeUnrelatedAnchor as e:
...     print(f"Dataset has unrelated members: {e.unrelated_tables}")
...     # Retry, dropping them:
...     df = d.as_dataframe(
...         ["Image", "Subject"], ignore_unrelated_anchors=True
...     )
Source code in src/deriva_ml/core/exceptions.py
DerivaMLDirtyWorkflowError
Bases: DerivaMLWorkflowError
Exception raised when workflow code has uncommitted changes.
DerivaML requires code to be committed before execution for provenance tracking. Running with uncommitted changes means the execution record cannot reliably link back to the source code.
Use allow_dirty=True in the API or --allow-dirty on the CLI
to override this check when debugging or iterating.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str` | Path to the file with uncommitted changes. | required |
Example
>>> raise DerivaMLDirtyWorkflowError("src/models/train.py")  # doctest: +SKIP
DerivaMLDirtyWorkflowError: File src/models/train.py has uncommitted changes. ...
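One way such a check could work is to inspect the output of `git status --porcelain` for the workflow file. The sketch below is an assumption about the general technique, not DerivaML's actual detection logic: `DirtyWorkflowError` is a local stand-in, and `is_dirty` and `require_clean` are hypothetical helpers. It takes the porcelain text as a string so it runs without a Git repository, and it ignores quoted paths and renames.

```python
class DirtyWorkflowError(Exception):
    """Simplified stand-in for DerivaMLDirtyWorkflowError (hypothetical)."""

    def __init__(self, path: str):
        super().__init__(f"File {path} has uncommitted changes.")
        self.path = path


def is_dirty(porcelain_output: str, path: str) -> bool:
    """True if `git status --porcelain` output lists `path`."""
    for line in porcelain_output.splitlines():
        # Porcelain v1 lines look like "XY <path>": two status characters,
        # one space, then the path.
        if line[3:] == path:
            return True
    return False


def require_clean(porcelain_output: str, path: str, allow_dirty: bool = False) -> None:
    """Raise unless the file is clean or the caller opted out of the check."""
    if is_dirty(porcelain_output, path) and not allow_dirty:
        raise DirtyWorkflowError(path)


status = " M src/models/train.py\n?? notes.txt\n"
require_clean(status, "src/models/eval.py")                     # clean file
require_clean(status, "src/models/train.py", allow_dirty=True)  # override
try:
    require_clean(status, "src/models/train.py")
except DirtyWorkflowError as e:
    print(e)
```

In a real script the status text could come from `subprocess.run(["git", "status", "--porcelain"], capture_output=True, text=True).stdout`.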
Source code in src/deriva_ml/core/exceptions.py
DerivaMLException
Bases: Exception
Base exception class for all DerivaML errors.
This is the root exception for all DerivaML-specific errors. Catching this exception will catch any error raised by the DerivaML library.
Attributes:
| Name | Type | Description |
|---|---|---|
| `_msg` | | The error message stored for later access. |
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `msg` | `str` | Descriptive error message. Defaults to empty string. | `''` |
Example
>>> raise DerivaMLException("Failed to connect to catalog")  # doctest: +SKIP
DerivaMLException: Failed to connect to catalog
Source code in src/deriva_ml/core/exceptions.py
DerivaMLExecutionError
Bases: DerivaMLException
Exception raised for execution lifecycle issues.
Base class for errors related to workflow execution, asset management, and provenance tracking.
Example
raise DerivaMLExecutionError("Execution failed to initialize") # doctest: +SKIP
Source code in src/deriva_ml/core/exceptions.py
DerivaMLInvalidTerm
Bases: DerivaMLNotFoundError
Exception raised when a vocabulary term is not found or invalid.
Raised when attempting to look up or use a term that doesn't exist in a controlled vocabulary table, or when a term name/synonym cannot be resolved.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `vocabulary` | `str` | Name of the vocabulary table being searched. | required |
| `term` | `str` | The term name that was not found. | required |
| `msg` | `str` | Additional context about the error. Defaults to "Term doesn't exist". | `"Term doesn't exist"` |
Example
>>> raise DerivaMLInvalidTerm("Diagnosis", "unknown_condition")  # doctest: +SKIP
DerivaMLInvalidTerm: Invalid term unknown_condition in vocabulary Diagnosis: Term doesn't exist.
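The sketch below shows how a lookup that accepts either a term name or a synonym might end in this error. It is illustrative only: `InvalidTerm` is a local stand-in whose message follows the format shown in the example above, while `resolve_term` and the in-memory `terms` mapping are hypothetical and do not reflect how DerivaML queries vocabulary tables.

```python
class InvalidTerm(Exception):
    """Simplified stand-in for DerivaMLInvalidTerm (hypothetical)."""

    def __init__(self, vocabulary: str, term: str, msg: str = "Term doesn't exist"):
        super().__init__(f"Invalid term {term} in vocabulary {vocabulary}: {msg}.")
        self.vocabulary = vocabulary
        self.term = term


def resolve_term(vocabulary: str, terms: dict[str, list[str]], name: str) -> str:
    """Return the canonical term for `name`, matching names or synonyms."""
    for canonical, synonyms in terms.items():
        if name == canonical or name in synonyms:
            return canonical
    raise InvalidTerm(vocabulary, name)


# Hypothetical vocabulary content: canonical name -> synonyms.
diagnosis = {"Hypertension": ["HTN", "high blood pressure"]}

print(resolve_term("Diagnosis", diagnosis, "HTN"))
try:
    resolve_term("Diagnosis", diagnosis, "unknown_condition")
except InvalidTerm as e:
    print(e)
```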
Source code in src/deriva_ml/core/exceptions.py
DerivaMLMaterializeLimitExceeded
Bases: DerivaMLValidationError
Raised when a result set exceeds the caller-supplied materialize_limit.
Surfaced by helpers (e.g. feature_values) that materialize the
full result set into memory before reduction. Callers can either
raise the limit, narrow their query (e.g. add an execution_rids
filter), or switch to a streaming consumer.
Attributes:
| Name | Type | Description |
|---|---|---|
| `actual_count` | | The actual size of the result set that triggered the limit. |
| `limit` | | The materialize_limit that was exceeded. |
Example
>>> from deriva_ml.core.exceptions import DerivaMLMaterializeLimitExceeded
>>> exc = DerivaMLMaterializeLimitExceeded(actual_count=1500, limit=1000)
>>> exc.actual_count
1500
>>> "exceeds materialize_limit" in str(exc)
True
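A helper that materializes results before reducing them could guard itself along the following lines. This is a sketch under stated assumptions: `MaterializeLimitExceeded` is a local stand-in whose exact message wording is invented (only the phrase "exceeds materialize_limit" comes from the example above), and `materialize` is a hypothetical function, not the `feature_values` implementation.

```python
from typing import Iterable, Optional


class MaterializeLimitExceeded(Exception):
    """Simplified stand-in for DerivaMLMaterializeLimitExceeded (hypothetical)."""

    def __init__(self, actual_count: int, limit: int):
        # Wording is an assumption apart from "exceeds materialize_limit".
        super().__init__(
            f"Result set of {actual_count} rows exceeds materialize_limit={limit}"
        )
        self.actual_count = actual_count
        self.limit = limit


def materialize(rows: Iterable[dict], materialize_limit: Optional[int] = None) -> list:
    """Load all rows into memory, refusing result sets above the limit."""
    loaded = list(rows)
    if materialize_limit is not None and len(loaded) > materialize_limit:
        raise MaterializeLimitExceeded(actual_count=len(loaded), limit=materialize_limit)
    return loaded


rows = ({"RID": i} for i in range(1500))
try:
    materialize(rows, materialize_limit=1000)
except MaterializeLimitExceeded as e:
    print(e.actual_count, e.limit)  # 1500 1000
```

A caller that catches the error can retry with a higher limit or a narrower query, as described above.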
Source code in src/deriva_ml/core/exceptions.py
DerivaMLNoExecutionContext
Bases: DerivaMLConfigurationError
Exception raised when an execution-scoped operation is attempted without an execution context.
Handles returned by ml.table(name) are read-only — useful for schema
introspection — but their .insert(...) and asset-file methods raise
this exception. Use exe.table(name) to get a handle bound to an
execution that permits writes.
Example
Calling a write method on a read-only handle raises this error:
>>> handle = ml.table("Subject") # doctest: +SKIP
>>> handle.record_class() # OK # doctest: +SKIP
>>> handle.insert({"Name": "x"}) # raises # doctest: +SKIP
Traceback (most recent call last):
...
DerivaMLNoExecutionContext: ml.table() handles are read-only; use exe.table() for writes
Source code in src/deriva_ml/core/exceptions.py
DerivaMLNotFoundError
Bases: DerivaMLDataError
Exception raised when an entity cannot be found.
Raised when a lookup operation fails to find the requested entity (dataset, table, term, etc.) in the catalog or bag.
Example
raise DerivaMLNotFoundError("Entity '1-ABC' not found in catalog") # doctest: +SKIP
Source code in src/deriva_ml/core/exceptions.py
DerivaMLOfflineError
Bases: DerivaMLConfigurationError
Exception raised when an online-only operation is attempted in offline mode.
The DerivaML instance was constructed with mode=ConnectionMode.offline
but the caller invoked an operation that requires server contact — most
commonly create_execution, which needs a server-assigned Execution RID.
Example
Creating an execution requires online mode because the Execution RID must be server-assigned:
>>> ml = DerivaML(..., mode=ConnectionMode.offline) # doctest: +SKIP
>>> ml.create_execution(config) # doctest: +SKIP
Traceback (most recent call last):
...
DerivaMLOfflineError: create_execution requires online mode
Source code in src/deriva_ml/core/exceptions.py
DerivaMLReadOnlyError
Bases: DerivaMLException
Exception raised when attempting write operations on read-only resources.
Raised when attempting to modify data in a downloaded bag or other read-only context where write operations are not supported.
Example
raise DerivaMLReadOnlyError("Cannot create datasets in a downloaded bag") # doctest: +SKIP
Source code in src/deriva_ml/core/exceptions.py
DerivaMLSchemaError
Bases: DerivaMLConfigurationError
Exception raised for schema or catalog structure issues.
Raised when the catalog schema is invalid, missing required tables, or has structural problems that prevent normal operation.
Example
raise DerivaMLSchemaError("Ambiguous domain schema: ['Schema1', 'Schema2']") # doctest: +SKIP
Source code in src/deriva_ml/core/exceptions.py
DerivaMLSchemaPinned
Bases: DerivaMLConfigurationError
Raised when refresh_schema() is called on a pinned cache.
The cache has been explicitly pinned via pin_schema(). Call
unpin_schema() first if you really want to refresh. Note:
force=True does NOT bypass a pin — it only bypasses the
pending-rows guard.
Example
>>> raise DerivaMLSchemaPinned(  # doctest: +SKIP
...     "refresh_schema refused: cache is pinned at snapshot s0"
... )
Source code in src/deriva_ml/core/exceptions.py
DerivaMLSchemaRefreshBlocked
Bases: DerivaMLConfigurationError
Raised when refresh_schema() is called with staged work in the workspace.
The caller should drain the workspace first (ml.upload_pending())
or call refresh_schema(force=True) to discard local state.
Draining is the safer choice — a forced refresh may leave rows
whose metadata references columns or types no longer in the new
schema, causing catalog-insert failures on the next upload.
Example
>>> raise DerivaMLSchemaRefreshBlocked(  # doctest: +SKIP
...     "refresh_schema requires a drained workspace; 3 pending rows"
... )
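The interaction between pinning, pending rows, and `force=True` can be summarised with a toy model. Everything here is a sketch: `SchemaCache` and its attributes are hypothetical, the two exception classes are local stand-ins for `DerivaMLSchemaPinned` and `DerivaMLSchemaRefreshBlocked`, and checking the pin before the pending rows is an assumption. The only behaviour taken from the descriptions above is that `force=True` bypasses the pending-rows guard but not a pin.

```python
class SchemaPinned(Exception):
    """Stand-in for DerivaMLSchemaPinned (hypothetical)."""


class SchemaRefreshBlocked(Exception):
    """Stand-in for DerivaMLSchemaRefreshBlocked (hypothetical)."""


class SchemaCache:
    """Toy model of the refresh guards; not the real workspace cache."""

    def __init__(self) -> None:
        self.pinned = False
        self.pending_rows = 0
        self.refresh_count = 0

    def pin_schema(self) -> None:
        self.pinned = True

    def unpin_schema(self) -> None:
        self.pinned = False

    def refresh_schema(self, force: bool = False) -> None:
        # A pin always wins: force=True does not bypass it.
        if self.pinned:
            raise SchemaPinned("refresh_schema refused: cache is pinned")
        # Staged work blocks the refresh unless the caller forces it.
        if self.pending_rows and not force:
            raise SchemaRefreshBlocked(
                f"refresh_schema requires a drained workspace; "
                f"{self.pending_rows} pending rows"
            )
        self.pending_rows = 0  # a forced refresh discards local state
        self.refresh_count += 1


cache = SchemaCache()
cache.pending_rows = 3
cache.pin_schema()
try:
    cache.refresh_schema(force=True)
except SchemaPinned as e:
    print(e)  # still refused while pinned
cache.unpin_schema()
cache.refresh_schema(force=True)  # succeeds and drops the 3 pending rows
```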
Source code in src/deriva_ml/core/exceptions.py
DerivaMLStateInconsistency
Bases: DerivaMLDataError
Exception raised when workspace SQLite state and catalog state disagree in an unresolvable way.
The six disagreement cases enumerated in spec §2.2 are handled automatically
by the reconciliation logic (see state_machine.reconcile_with_catalog);
anything outside those rules surfaces as this exception with enough
information for a human to intervene.
Example
A catalog-side delete of an in-flight execution produces this error:
>>> exe = ml.resume_execution("EXE-A") # doctest: +SKIP
Traceback (most recent call last):
...
DerivaMLStateInconsistency: Execution EXE-A: SQLite status 'running' but catalog returned no Execution row
Source code in src/deriva_ml/core/exceptions.py
DerivaMLTableNotFound
Bases: DerivaMLNotFoundError
Exception raised when a table cannot be found.
Raised when attempting to access a table that doesn't exist in the catalog schema or downloaded bag.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `table_name` | `str` | The name of the table that was not found. | required |
| `msg` | `str` | Additional context. Defaults to "Table not found". | `'Table not found'` |
Example
>>> raise DerivaMLTableNotFound("MyTable")  # doctest: +SKIP
DerivaMLTableNotFound: Table not found: MyTable
Source code in src/deriva_ml/core/exceptions.py
DerivaMLTableTypeError
Bases: DerivaMLDataError
Exception raised when a RID or table is not of the expected type.
Raised when an operation requires a specific table type (e.g., Dataset, Execution) but receives a RID or table reference of a different type.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `table_type` | `str` | The expected table type (e.g., "Dataset", "Execution"). | required |
| `table` | `str` | The actual table name or RID that was provided. | required |
Example
>>> raise DerivaMLTableTypeError("Dataset", "1-ABC123")  # doctest: +SKIP
DerivaMLTableTypeError: Table 1-ABC123 is not of type Dataset.
Source code in src/deriva_ml/core/exceptions.py
DerivaMLUploadError
Bases: DerivaMLExecutionError
Exception raised for asset upload failures.
Raised when uploading assets to the catalog fails, including file uploads, metadata insertion, and provenance recording.
Example
raise DerivaMLUploadError("Failed to upload execution assets") # doctest: +SKIP
Source code in src/deriva_ml/core/exceptions.py
DerivaMLValidationError
Bases: DerivaMLDataError
Exception raised when data validation fails.
Raised when input data fails validation, such as invalid RID format, mismatched metadata, or constraint violations.
Example
raise DerivaMLValidationError("Invalid RID format: ABC") # doctest: +SKIP
Source code in src/deriva_ml/core/exceptions.py
DerivaMLWorkflowError
Bases: DerivaMLExecutionError
Exception raised for workflow-related issues.
Raised when there are problems with workflow lookup, creation, or Git integration for workflow tracking.
Example
raise DerivaMLWorkflowError("Not executing in a Git repository") # doctest: +SKIP
Source code in src/deriva_ml/core/exceptions.py