AI Ready Datasets
AI-ready datasets give physical sciences researchers data they can trust and reuse. They are hosted as a PSDI Community Data Collection and are designed to support AI and machine learning workflows. They range from general-purpose collections that researchers can shape for their own models to task-specific datasets that already include annotations and predefined training, validation, and test splits. Every record includes one or more data files accompanied by a metadata description in Croissant format (https://mlcommons.org/working-groups/data/croissant/), ensuring that structure, provenance, and context are captured in a machine-readable way.
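As a rough illustration of what a machine-readable Croissant description makes possible, the sketch below reads a small Croissant-style JSON-LD document with Python's standard library and lists its data files and fields. The embedded metadata is a hypothetical, heavily trimmed example and not an actual PSDI record; the dataset name, file name, field names, and the helper `summarise_croissant` are all assumptions made for illustration.

```python
import json

# Hypothetical, heavily trimmed Croissant-style metadata.
# This is NOT a real PSDI record; real descriptions carry more context
# (e.g. licence, provenance, JSON-LD @context).
EXAMPLE_METADATA = """
{
  "@type": "sc:Dataset",
  "name": "example-dataset",
  "distribution": [
    {"@type": "cr:FileObject", "@id": "data.csv", "name": "data.csv",
     "encodingFormat": "text/csv"}
  ],
  "recordSet": [
    {"@type": "cr:RecordSet", "@id": "measurements", "name": "measurements",
     "field": [
       {"@type": "cr:Field", "name": "temperature", "dataType": "sc:Float"},
       {"@type": "cr:Field", "name": "label", "dataType": "sc:Text"}
     ]}
  ]
}
"""


def summarise_croissant(text):
    """Return the dataset name, its data files, and the fields per record set."""
    meta = json.loads(text)
    # Each entry in "distribution" describes one data file.
    files = [
        (f.get("name"), f.get("encodingFormat"))
        for f in meta.get("distribution", [])
    ]
    # Each record set groups the named fields that a model would consume.
    fields = {
        rs.get("name"): [fld.get("name") for fld in rs.get("field", [])]
        for rs in meta.get("recordSet", [])
    }
    return {"name": meta.get("name"), "files": files, "fields": fields}


summary = summarise_croissant(EXAMPLE_METADATA)
print(summary)
```

For real records, a dedicated Croissant library would normally be preferable to hand parsing, since it can validate the description and load the records themselves.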
To use this resource, go to the resource landing page.
This resource is part of the AI Ready Datasets resource theme.
Qualified Attribution
- Contributor: Claire Murray
- Contributor: Julia Parker
- Contributor: Tobias Bird
- Contributor: Andrew Stewart
- Contributor: Natalia Da Silva De Sa
- Contributor: Jaehoon Cha
- Contributor: Daniel N. Rainer
- Contributor: Simon J. Coles
- Contributor: Tahereh Nematiaram
- Contributor: Malin Zollner
- Contributor: Matthew Partridge
Further Information
Publisher
Access
Open Access
License
Contact
Citation
Please cite: Aileen E. Day. AI Ready Datasets. Online. Version 1.0.0. 19 March 2026. Available from: https://resources.psdi.ac.uk/data/1efa3139-865d-4a81-9426-3ca372217a7e. [accessed YYYY-MM-DD].

