Scientific purpose and scope

BenchmarkSet1500 is an open-access multireference excited-state database established to provide the first dedicated high-accuracy benchmark set for organic semiconductor research. The repository comprises 1,500 small organic molecules with consistently computed vertical excited-state properties obtained using state-averaged complete active space self-consistent field (SA-CASSCF) and strongly contracted N-electron valence state second-order perturbation theory (SC-NEVPT2). The dataset focuses on systems where single-reference approaches (e.g. TD-DFT) are known to fail, including molecules exhibiting strong static correlation and inverted singlet–triplet gaps.

Each molecular entry contains curated metadata, optimised geometries (at the B3LYP/6-31G* level of theory), complete electronic-structure output files, and computed excited-state energies and oscillator strengths for low-lying singlet and triplet states. In addition, a consolidated machine-learning-ready CSV file aggregates all molecules with their structural descriptors and excited-state properties, so the data can be loaded directly into data-driven workflows.
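
As an illustration of how the consolidated CSV file might be used, the following minimal Python sketch scans rows for inverted singlet–triplet gaps (S1 below T1). The column names (molecule_id, S1_energy_eV, T1_energy_eV) and the sample rows are assumptions made for this example only; they are not taken from the dataset, so check the header of the actual file before adapting it.

```python
import csv
import io

# Hypothetical column names: replace with those in the actual CSV header.
ID_COL = "molecule_id"
S1_COL = "S1_energy_eV"
T1_COL = "T1_energy_eV"


def find_inverted_gaps(csv_file):
    """Return (id, gap) pairs for rows where S1 lies below T1.

    The gap is computed as E(S1) - E(T1) in eV, so a negative value
    indicates an inverted singlet-triplet gap.
    """
    inverted = []
    for row in csv.DictReader(csv_file):
        gap = float(row[S1_COL]) - float(row[T1_COL])
        if gap < 0:
            inverted.append((row[ID_COL], gap))
    return inverted


# Placeholder rows for demonstration only; these are NOT dataset values.
SAMPLE = """molecule_id,S1_energy_eV,T1_energy_eV
mol_a,3.10,2.40
mol_b,2.50,2.75
"""

result = find_inverted_gaps(io.StringIO(SAMPLE))
for mol_id, gap in result:
    print(f"{mol_id}: S1-T1 gap = {gap:.2f} eV")
```

To run the same check on the real file, pass an open file handle (for example, open(path, newline="")) in place of the in-memory sample.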

BenchmarkSet1500 is designed to support rigorous method benchmarking, systematic assessment of theory-level performance, development of predictive models, and screening for technologically relevant organic semiconductors.

Feedback, suggestions, and contributions can be sent to support@psdi.ac.uk.