ADR-0003: Dataset dev versioning — every mutation lands on dev, release is the only path to a released version
Date: 2026-05-09 Status: Accepted
Context
DerivaML datasets carry a semantic version (current_version) that
points at a row in Dataset_Version. Today, every member-mutation
operation (add_dataset_members, delete_dataset_members) auto-bumps
to a new released Dataset_Version row with a stamped catalog
Snapshot. Three problems with that model:
- No representation of "modified since last release but not yet re-released." Every mutation is its own release, so the dataset is always pointing at a frozen snapshot — there's no notational space for "the dataset has drifted, but I haven't decided what kind of release that warrants."
- No way to record indirect drift. When a feature value is added to a member of a dataset, the dataset's Dataset row and member list are unchanged, but the bag's content has changed. Today's model has nowhere to put that fact.
- Dataset.cite() cannot distinguish frozen from live state. A citation URL for "the current version" can only ever be a snapshot-pinned URL, even when the catalog has drifted underneath.
The shape of the fix: introduce a dev version state, separate from released versions. The hard design questions are which mutations land on dev versus release, how dev rows are stored, and what happens at release time.
Decision
Every mutation lands on the dev version. Release is the only operation that produces a released version.
Concretely:
- A dataset's current_version is either a released PEP 440 version (0.4.0) pinning a catalog Snapshot, or a dev PEP 440 version (0.4.0.post1.devN) with Snapshot=NULL.
- add_dataset_members, delete_dataset_members, mark_dev, and any future drift-recording operation flip the dataset to a dev version. They never produce a released version.
- release(bump, description, execution=None) is the only operation that produces a released version. It promotes the dev row in place by setting Version to the released label, stamping Snapshot, replacing Description with release notes, and overwriting the dev row's Execution link with the supplied execution (or NULL if none). The released row's Execution link records "the execution that called release()", not "the most recent mutator". Mutator authorship during the dev period is recoverable from the audit trail (RMT, per-feature-value provenance) and doesn't need the version row to carry it.
- release() raises DerivaMLValidationError when called on a dataset with no dev row. Users wanting a no-op release call mark_dev first to declare a dev period, then release to promote it. The error message points at this resolution path.
- Dev rows are lazy: a dev row exists only while the dataset is in dev state. Releases do not preemptively create the next dev row.
- Dev rows are mutable: one row per (dataset, dev period). The .devN counter advances by UPDATE, not by INSERT. The Description column is replaced on each mutation with the most recently supplied description (not appended). Prior values are recoverable from the catalog's audit log if needed.
- A dev label resolves if and only if it matches the dataset's current dev row's Version. The dev row is mutable, so Version="0.4.0.post1.dev2" is observable only at the moment that's its current value; afterward, the same row's Version says .dev3, .dev4, etc. The rule is "dev labels resolve to the live present, and only when they describe it accurately", not "dev labels never resolve". As a corollary: methods that produce a dev label (e.g., mark_dev) return None, not DatasetVersion. A returned dev label can't be passed to any version-accepting API later (the next mutation makes it unaddressable), so returning it would invite caller mistakes. Callers who want to display the new label read current_version after the call. release(), by contrast, returns DatasetVersion because a released label is addressable across time. APIs that accept a version= argument:
  - Treat version=None (or omitted) as "current version": whichever the dataset has at request time, dev or released.
  - Accept any released label as today (snapshot-pinned).
  - Accept the current dev label (matches the dev row's current Version) and resolve it to the live catalog (no @snaptime).
  - Reject a dev label that does not match the current dev row's Version by raising a clear error: dev versions are mutable, and historical or post-release .devN values are not addressable.
- The .devN counter is a generation number, not a handle to historical state. Its purpose is notational change detection: two reads of current_version at different times can be told apart by their .devN. It is not a stable identifier across time.
- Bag downloads of the current dev version use live catalog state with no @snaptime pin. Two downloads of the same dev label may differ if the catalog drifted between them. The cite-URL form follows the same rule (no @snaptime for dev versions).
- The .devN counter advances per call that actually changes at least one row: add_dataset_members, delete_dataset_members, mark_dev, or any future drift-recording operation. A call that no-ops (e.g., add_dataset_members([])) does not advance the counter and does not create a dev row. The first effective mutation after a release creates the dev row at .dev1; there is no .dev0. Per-call granularity matches setuptools-scm's .devN semantics: one commit equals one increment regardless of how many files it touched.
- create_dataset initializes a new dataset at 0.1.0 released (no dev row). The "every mutation lands on dev" rule applies after creation, where it's load-bearing; at creation time there's no drift to record.
- A dev version must never appear as the recorded version of an Execution's consumed dataset. Dev versions are notational, not citational. Executions consume released versions; live-state consumption is recorded as live, not as a moving dev label.
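The lifecycle rules above can be sketched as a small in-memory model. This is an illustrative toy, not the DerivaML implementation: ToyDataset, VersionRow, ToyValidationError, the snap-N snapshot ids, and the bump names "major" / "minor" / "patch" are all assumptions made for the example, and mutate() stands in for add_dataset_members, delete_dataset_members, and mark_dev.

```python
from dataclasses import dataclass
from typing import List, Optional


class ToyValidationError(Exception):
    """Hypothetical stand-in for DerivaMLValidationError."""


@dataclass
class VersionRow:
    version: str             # PEP 440 label
    snapshot: Optional[str]  # None marks a dev row (live state)
    description: str = ""


class ToyDataset:
    def __init__(self) -> None:
        # Mirrors create_dataset: start at a released 0.1.0, no dev row.
        self.rows: List[VersionRow] = [VersionRow("0.1.0", "snap-1", "created")]
        self._snapshots = 1

    @property
    def current(self) -> VersionRow:
        return self.rows[-1]

    def mutate(self, rows_changed: int, description: str) -> None:
        """Stand-in for the mutators. Returns None on purpose."""
        if rows_changed == 0:
            return  # a no-op call neither creates a dev row nor advances .devN
        cur = self.current
        if cur.snapshot is not None:
            # Lazy creation: the first effective mutation opens the dev period.
            self.rows.append(VersionRow(f"{cur.version}.post1.dev1", None, description))
        else:
            base, n = cur.version.rsplit(".dev", 1)
            cur.version = f"{base}.dev{int(n) + 1}"  # updated in place, no new row
            cur.description = description            # replaced, not appended

    def release(self, bump: str, description: str) -> str:
        cur = self.current
        if cur.snapshot is not None:
            raise ToyValidationError("no dev row: call mark_dev first, then release")
        parts = [int(p) for p in cur.version.split(".post", 1)[0].split(".")]
        index = {"major": 0, "minor": 1, "patch": 2}[bump]  # assumed bump names
        parts[index] += 1
        parts[index + 1:] = [0] * (2 - index)
        self._snapshots += 1
        # Promote the dev row in place: new label, stamped snapshot, release notes.
        cur.version = ".".join(str(p) for p in parts)
        cur.snapshot = f"snap-{self._snapshots}"
        cur.description = description
        return cur.version

    def resolve(self, version: Optional[str] = None) -> str:
        """Return the snapshot a label pins, or 'live' for the current dev label."""
        cur = self.current
        if version is None or version == cur.version:
            return cur.snapshot or "live"
        for row in self.rows:
            if row.version == version and row.snapshot is not None:
                return row.snapshot
        raise ToyValidationError(f"{version!r} is not addressable")
```

In this toy, two effective mutations after creation leave a single dev row labelled 0.1.0.post1.dev2. At that point resolve("0.1.0.post1.dev1") raises because the label is stale, and release("minor", ...) rewrites that same row to 0.2.0 without adding a row.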
There is no schema migration.
Dataset_Version.Snapshot is already nullable in the existing
schema (verified in create_schema.py and validation.py); the
dev-versioning work just makes that nullability load-bearing
instead of incidental. No DDL change. No data migration (no dev
rows exist yet, so nothing to backfill). The relevant code change
in create_schema.py is to update the Snapshot column's comment
to reflect its new contract — NULL means "live state, dev row" —
and to update the schema-validator's expected-columns map
(validation.py) where any expectations need sharpening.
Considered Options
Option B (rejected): Computed dev versions, no persistence
Dataset_Version would be unchanged. current_version would return
a synthetic dev label after every released bump. Rejected
because there'd be nothing to attach a description, an execution
link, or accumulated drift information to. The "notational clarity"
that motivated the work would be lost — the version label would
exist in name only.
Option C (rejected): Explicit dev-mode entry
Member-mutations would keep auto-bumping to a real release as today;
dev versions would only be reached via an explicit start_dev()
call. Rejected because mixed semantics (mutations bump to
release, but feature drift bumps to dev) are impossible to remember,
and the start_dev/end_dev API surface is the same shape as
mark_dev once we already need that — the explicit mode entry
collapses into something we already have.
Option A.b/A.c (rejected): Insert new released row at release; delete or archive the dev row
release() would INSERT a new released Dataset_Version row and
either delete the dev row or keep it as a "superseded" record.
Rejected because dev rows are already established as mutable
(advancing .devN is a row update). One more update at release —
to set Version to released and stamp Snapshot — is consistent
with that pattern. Insert-and-delete tells the same story with
extra steps; archive-superseded is the over-engineering the
notational-clarity goal was supposed to avoid.
Consequences
- increment_dataset_version is renamed to release(bump, description, execution=None) and moves to Dataset as an instance method. The argument formerly known as component is renamed to bump to match the workspace's bump-version CLI vocabulary. The execution argument changes type from RID | None to Execution | None to match the rest of the new API surface (typed objects, not bare RIDs). This is a breaking change for callers of the previous public API and ships in 1.34 with the migration guide. No deprecated alias is provided; CLAUDE.md's "no backwards-compat shims" rule applies.
- A new Dataset.is_dirty() / Dataset.release_diff() / Dataset.compare_versions(v_a, v_b) trio detects catalog drift by walking the same FK paths used to generate the dataset bag (via CatalogGraph), filtering by an RMT time predicate. The drift walk is the bag walk plus an RMT filter. The three methods all flow through one internal _diff_between(t_lower, t_upper) helper; they differ only in what they pass to it:
  - is_dirty(): fast bool predicate that short-circuits on the first non-zero count. Calls _diff_between(t_last_release, None), where None means "live state, no upper bound."
  - release_diff(): per-table change counts. Implemented as a thin wrapper around compare_versions(self.last_released_version, self.current_version). When the dataset is in dev, current_version is the dev label, which resolves to live state. When the dataset is at its last release with no drift, both endpoints coincide and the result is {}. This is the right answer in both cases.
  - compare_versions(v_a, v_b): per-table change counts between any two endpoints. Each argument may independently be a released label (resolves to that snapshot's timestamp) or the current dev label (resolves to live state). Stale or post-release dev labels error per the addressability rule. The predicate is min(t_a, t_b) < RMT <= max(t_a, t_b), so the result set does not depend on argument order.

  All three live on Dataset only, not on DatasetLike, because bags can never be dirty.
- Deletions of catalog rows referenced by a dataset are not
detected by any of these methods (a deleted row is invisible to
the bag walk too). Users who delete content rows must call
mark_dev manually. Closing this gap would require querying
the snapshots' RID sets directly and computing set differences
— a separate, deferred operation with a different cost profile.
- Cite-URL routing falls out: released versions get
snapshot-pinned URLs; dev versions get no-snapshot URLs. The
check is the PEP 440 is_devrelease property — see ADR-0004 for
why we use PEP 440.
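The RMT predicate behind the drift-detection trio can be illustrated with a minimal sketch. It assumes the bag walk has already produced (table name, RMT) pairs and that a live endpoint is passed as None. The function names echo the ADR, but the bodies are hypothetical and do not touch a catalog.

```python
from datetime import datetime
from typing import Dict, Iterable, Optional, Tuple

WalkedRow = Tuple[str, datetime]  # (table name, RMT) as produced by an assumed bag walk


def diff_between(
    rows: Iterable[WalkedRow],
    t_lower: datetime,
    t_upper: Optional[datetime] = None,
) -> Dict[str, int]:
    """Count rows per table with t_lower < RMT <= t_upper (None = live, unbounded)."""
    counts: Dict[str, int] = {}
    for table, rmt in rows:
        if rmt > t_lower and (t_upper is None or rmt <= t_upper):
            counts[table] = counts.get(table, 0) + 1
    return counts


def is_dirty(rows: Iterable[WalkedRow], t_last_release: datetime) -> bool:
    """Boolean form: any() stops at the first row modified after the last release."""
    return any(rmt > t_last_release for _, rmt in rows)


def compare_versions(
    rows: Iterable[WalkedRow],
    t_a: Optional[datetime],
    t_b: Optional[datetime],
) -> Dict[str, int]:
    """Symmetric comparison; None stands for the current dev label (live state)."""
    if t_a == t_b:
        return {}  # coinciding endpoints: nothing can lie between them
    if t_a is None or t_b is None:
        lower = t_a if t_b is None else t_b
        return diff_between(rows, lower, None)
    return diff_between(rows, min(t_a, t_b), max(t_a, t_b))
```

Because the bounds are ordered with min and max before filtering, swapping t_a and t_b yields the same counts. As the text notes, a row deleted from the catalog never shows up in the walked rows, so this style of check cannot see it.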
Read-side surface
dataset_history() and current_version are unchanged in shape but
gain new behaviors implied by dev rows being first-class
Dataset_Version entries:
- dataset_history() returns all Dataset_Version rows for the dataset, dev or released, with no filtering. Callers who want released-only filter by the PEP 440 typed property: [h for h in ds.dataset_history() if not h.dataset_version.is_devrelease]. Hiding the filter inside the method would make the API disagree with the catalog; we don't.
- dataset_history() results are sorted by dataset_version ascending. The list reads forward in time: [0.1.0, 0.2.0, ..., 0.4.0.post1.dev3]. Today's order is whatever the catalog returns; that's a fragile contract and the change is net positive.
- current_version keeps using max(history). Under PEP 440 ordering (ADR-0004), the dev version sorts after the last released version, so max correctly returns the dev version when there is one.
- current_version's defensive fallback to DatasetVersion(0, 1, 0) for empty history is removed. create_dataset always inserts a version row, so empty history is a catalog inconsistency: raise rather than silently invent a version. This matches CLAUDE.md's "no defensive code for impossible cases" rule.
- No new read methods. No current_dev_version(), no released_history() filter helper. Dev rows are not second-class; they're a different state of the same row type.
- Display methods (to_markdown, etc.) render PEP 440 version strings literally. "0.4.0.post1.dev3" is what the version is; introducing a separate display form would require keeping it in sync with the canonical form.
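The ordering and filtering behavior described above can be sketched with the standard library alone. The parser below is a deliberately simplified stand-in that handles only the two label shapes this ADR uses (X.Y.Z and X.Y.Z.post1.devN); it is not a full PEP 440 implementation, and real code would presumably rely on a PEP 440 library such as packaging.version.Version, which exposes is_devrelease. All function names here are hypothetical.

```python
import re
from typing import List, Tuple

# Accepts only "X.Y.Z" and "X.Y.Z.postP.devN"; anything else is rejected.
_LABEL = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:\.post(\d+)\.dev(\d+))?$")


def parse(label: str) -> Tuple[int, int, int, int, int]:
    """Sort key: released labels get (X, Y, Z, 0, 0); dev labels get (X, Y, Z, P, N)."""
    m = _LABEL.match(label)
    if m is None:
        raise ValueError(f"unsupported version label: {label!r}")
    major, minor, patch, post, dev = m.groups()
    return (int(major), int(minor), int(patch), int(post or 0), int(dev or 0))


def is_devrelease(label: str) -> bool:
    parse(label)  # validate the shape first
    return ".dev" in label


def sorted_history(labels: List[str]) -> List[str]:
    """All rows, dev or released, ascending."""
    return sorted(labels, key=parse)


def released_only(labels: List[str]) -> List[str]:
    """The caller-side filter the ADR asks consumers to write explicitly."""
    return [v for v in sorted_history(labels) if not is_devrelease(v)]


def current_version(labels: List[str]) -> str:
    if not labels:
        # Empty history is treated as a catalog inconsistency, not defaulted.
        raise RuntimeError("dataset has no version rows")
    return max(labels, key=parse)
```

For a history of 0.4.0, 0.1.0, 0.4.0.post1.dev3, and 0.2.0, the sorted result ends with the dev label, which is also what current_version returns, while released_only drops it.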
Concurrency model
release() and dev-row mutations both modify the same row (the
dev row). Concurrent writers are reconciled at the database level
via ERMrest's row-level conditional updates, not by application
locking:
- Each writer reads the dev row's RMT (row-modified time) before acting.
- Each writer's UPDATE carries WHERE RID=<dev_row_rid> AND RMT=<observed>. If a competing writer landed first, the predicate matches zero rows; the update is a no-op and the caller raises a clear "concurrent modification" error pointing at re-reading the dataset.
- The framework does not auto-retry. A concurrent release between read and write may change the meaning of the caller's operation (they thought they were mutating the dev period after 0.4.0; the dataset has since been released to 0.5.0 and any new mutation would land on a fresh dev period after that). The caller decides whether the new state is still what they want.
- A mutation that arrives on a just-released dataset is not a conflict; it's the normal lazy-dev-row case. The mutation observes a released row, creates a new dev row at <just_released>.post1.dev1, and proceeds.
The key reason for using ERMrest's row-level concurrency rather
than an application-level Status column or a Dataset row
generation counter: the database already solves this problem.
Introducing a parallel locking scheme would add schema and bug
surface for a corner case that the conditional-update primitive
handles cleanly.
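The read-then-conditional-update pattern can be emulated in memory to show the control flow. This sketch does not call ERMrest; ToyTable, ConcurrentModificationError, advance_dev, the integer clock used for RMT, and the RID value are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict


class ConcurrentModificationError(Exception):
    """Hypothetical error raised when the observed RMT is stale."""


@dataclass
class Row:
    version: str
    rmt: int  # row-modified time, modelled here as a monotonically increasing integer


class ToyTable:
    def __init__(self) -> None:
        self.rows: Dict[str, Row] = {}
        self._clock = 0

    def insert(self, rid: str, version: str) -> None:
        self._clock += 1
        self.rows[rid] = Row(version, self._clock)

    def conditional_update(self, rid: str, observed_rmt: int, new_version: str) -> int:
        """Emulates UPDATE ... WHERE RID=rid AND RMT=observed_rmt; returns rows matched."""
        row = self.rows.get(rid)
        if row is None or row.rmt != observed_rmt:
            return 0  # predicate matched nothing: the update is a no-op
        self._clock += 1
        row.version = new_version
        row.rmt = self._clock
        return 1


def advance_dev(table: ToyTable, rid: str, observed_rmt: int, new_version: str) -> None:
    """Apply one dev-row update, surfacing a conflict instead of retrying."""
    if table.conditional_update(rid, observed_rmt, new_version) == 0:
        raise ConcurrentModificationError(
            "dev row changed since it was read; re-read the dataset before deciding"
        )
```

If two writers read the same RMT, the first call to advance_dev succeeds and bumps the RMT, and the second raises without changing the row. Leaving the retry decision to the caller matches the no-auto-retry rule above.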
Migration impact
This work is a breaking change shipped in 1.34 (the renamed
release() method, the dev-flip behavior change for
add_dataset_members / delete_dataset_members, and the changed
Dataset.current_version / dataset_history() semantics). There
is no DDL change (the
schema's Snapshot column is already nullable), but the
semantics of several existing public methods change. The
behavior-change inventory:
| Existing call | Today | New |
|---|---|---|
| add_dataset_members / delete_dataset_members | Bumps to a new released version | Lands on a dev version (creates .dev1 if needed, advances .devN if existing) |
| increment_dataset_version(component, description, execution_rid=None) | Public mixin method on DerivaML, creates a released row | Renamed to Dataset.release(bump, description, execution=None). Instance method on Dataset. Errors if no dev row. execution_rid: RID → execution: Execution. |
| Dataset.current_version | Returns latest released | Returns dev when present, else latest released |
| Dataset.dataset_history() | Released rows in unspecified order | All rows (dev + released) sorted ascending |
Downstream callers must:
- Find every call to increment_dataset_version and rewrite to Dataset.release(). Mechanical rename plus signature update.
- Audit any logic that assumes "the version after a mutation is the released version"; that's no longer true. To produce a released version, follow the mutation with an explicit release() call.
- Audit any find_datasets / dataset_history consumers that filter on "released versions"; they'll need to add an explicit is_devrelease filter rather than relying on every version being released.
Out-of-repo blast radius (handled in dependent PRs, not this ADR):
deriva-ml-mcp's tool that exposes increment_dataset_version,
the model-template workflows, and any user notebooks. The 1.34
changelog and migration guide are written at PR-ready time and
point users at this ADR.
Schema-creation and validation touch points
Implementation must update these places to reflect the new
contract on Dataset_Version, even though the column types do not
change:
- src/deriva_ml/schema/create_schema.py: the Snapshot column's comment should make the new contract explicit: populated for released rows, NULL for dev rows. The comment today is "Catalog Snapshot ID for dataset", which is silent on the nullable case.
- src/deriva_ml/schema/create_schema.py: the Version column's comment ("Semantic version of dataset") and default ("0.1.0") should be updated. The label is no longer "semantic version" (that's semver-specific); it's a PEP 440 version string. Released rows carry MAJOR.MINOR.PATCH; dev rows carry <last_release>.post1.devN.
- src/deriva_ml/schema/validation.py: EXPECTED_TABLE_COLUMNS[MLTable.dataset_version]["Snapshot"] is already ("text", True) (nullable). Confirm this is the expected contract; no change is needed unless a stricter expectation is desired for released rows, which would require splitting the validator's view into "released-row expectation" vs "dev-row expectation" and is out of scope for this ADR.