Welcome to DerivaML

DerivaML is a library designed to improve the reproducibility of machine learning experiments. It helps you create data that is continuously Findable, Accessible, Interoperable, and Reusable (FAIR).

More details about the principles of continuous FAIRness can be found in the following article.

Dempsey, William, Ian Foster, Scott Fraser, and Carl Kesselman.
Sharing begins at home: how continuous and ubiquitous FAIRness can enhance research productivity and data reuse.
Harvard Data Science Review 4, no. 3 (2022). PDF

An overview of the design and operational aspects of DerivaML can be found in this paper.

Li, Zhiwei, Carl Kesselman, Mike D’Arcy, Michael Pazzani, and Benjamin Yizing Xu. Deriva-ML: A Continuous FAIRness Approach to Reproducible Machine Learning Models. In 2024 IEEE 20th International Conference on e-Science (e-Science), pp. 1-10. IEEE, 2024. PDF

The data model design in the Deriva catalog:

minid