SLURM Preemption

Revision as of 15:01, 24 August 2021 by Botto

Preemption

Some hardware on CIRCE is provided by research contributors. All CIRCE users can run jobs on this hardware by specifying the contributor's partition and the "preempt" QOS (example: sbatch --partition=margres_2020 --qos=preempt ./submit-script.sh). The caveat is that, because this is contributor hardware, non-contributor jobs running on these partitions are subject to preemption.
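The same options can also be placed inside the submit script as #SBATCH directives, so the script can be submitted with a plain "sbatch ./submit-script.sh". The following is a minimal sketch only: the partition name is taken from the example above, while the job name, the time limit, and the job_label variable are illustrative assumptions rather than CIRCE requirements.

```shell
#!/bin/bash
#SBATCH --job-name=preempt-demo     # hypothetical job name
#SBATCH --partition=margres_2020    # contributor partition from the example above
#SBATCH --qos=preempt               # required for non-contributor access
#SBATCH --time=04:00:00             # assumed time limit; adjust to your workload

# SLURM_JOB_ID is set by SLURM inside a running job; fall back to "local"
# so the script can also be dry-run outside the scheduler.
job_label="job-${SLURM_JOB_ID:-local}"
echo "Starting ${job_label} on ${HOSTNAME:-unknown-host}"
```

Options given on the sbatch command line override the matching #SBATCH directives, so the command-line form shown above keeps working with such a script.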

When a contributor's job needs the hardware, non-contributor jobs receive a partition-specified grace period (listed above, typically 2 hours) before they are cancelled. Any user taking advantage of this hardware should therefore have some form of checkpointing enabled, so that interrupted jobs can be resubmitted without starting over.
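One simple way to make a job resumable is to record progress in a file and read it back at startup. The sketch below illustrates that pattern under stated assumptions: the checkpoint file name, the step count, and the placeholder work loop are all hypothetical, and real applications will usually rely on their own checkpoint/restart mechanism instead.

```shell
#!/bin/bash
#SBATCH --partition=margres_2020
#SBATCH --qos=preempt

# Hypothetical names, chosen for illustration only.
CHECKPOINT_FILE="${CHECKPOINT_FILE:-checkpoint.txt}"
TOTAL_STEPS="${TOTAL_STEPS:-5}"

# Resume from the last completed step if a checkpoint exists.
start=0
if [ -f "$CHECKPOINT_FILE" ]; then
    start=$(cat "$CHECKPOINT_FILE")
fi

step=$start
while [ "$step" -lt "$TOTAL_STEPS" ]; do
    # Placeholder for one unit of real work.
    step=$((step + 1))

    # Write to a temporary file and rename it, so a job cancelled mid-write
    # does not leave a truncated checkpoint behind.
    echo "$step" > "${CHECKPOINT_FILE}.tmp"
    mv "${CHECKPOINT_FILE}.tmp" "$CHECKPOINT_FILE"
done

echo "Completed $step of $TOTAL_STEPS steps"
```

If the job is preempted, resubmitting the same script continues from the step recorded in the checkpoint file rather than from the beginning.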