SLURM Preemption

Preemption

Some hardware on CIRCE is provided by research contributors. All CIRCE users can run jobs on this hardware by specifying the contributor partition together with the "preempt" QOS (example: sbatch --partition=margres_2020 --qos=preempt ./submit-script.sh). Because this is contributor hardware, however, non-contributor jobs running on these partitions are subject to preemption.
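The same two options can also be placed inside the submit script as #SBATCH directives instead of on the command line. The following is a minimal sketch: the partition and QOS are taken from the example above, while the job name and time limit are placeholder values chosen only for illustration.

```shell
#!/bin/bash
# Contributor partition from the example above.
#SBATCH --partition=margres_2020
# QOS that gives non-contributors access to contributor hardware.
#SBATCH --qos=preempt
# Hypothetical job name and wall-time request; adjust for your own job.
#SBATCH --job-name=preempt-demo
#SBATCH --time=01:00:00

# SLURM sets these variables inside a running job; the fallbacks only apply
# when the script is executed outside the scheduler.
summary="job=${SLURM_JOB_ID:-none} partition=${SLURM_JOB_PARTITION:-none}"
echo "$summary"
```

A script written this way can be submitted with a plain "sbatch ./submit-script.sh". Options given on the sbatch command line take precedence over #SBATCH directives in the script.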

There is a partition-specified grace period (typically 2 hours) before a contributor's job(s) will cancel a non-contributor's job(s). Users of the preempt QOS should be aware that preemptable jobs on contributor hardware can be preempted at any time, without advance warning. Any user taking advantage of this hardware should therefore have some form of checkpointing enabled, so that interrupted jobs can be re-submitted (potentially to another partition) and resume from their last saved state instead of starting over.
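How checkpointing is done depends on the application, and many codes provide their own restart mechanism. As a rough illustration only, the sketch below shows the general pattern in a submit script: record progress after each unit of work, save again if the job receives SIGTERM, and resume from the saved step on re-submission. It assumes that SLURM delivers SIGTERM to the batch shell before the job is killed, which should be confirmed for CIRCE. The names STATE_FILE, TOTAL_STEPS, do_one_step and save_checkpoint are hypothetical and not part of SLURM.

```shell
#!/bin/bash
#SBATCH --qos=preempt

# Hypothetical checkpoint file and amount of work; both are illustrative.
STATE_FILE="${STATE_FILE:-checkpoint.state}"
TOTAL_STEPS="${TOTAL_STEPS:-5}"

# Resume from the last completed step if a checkpoint already exists.
step=0
if [ -f "$STATE_FILE" ]; then
    step=$(cat "$STATE_FILE")
fi

save_checkpoint() {
    # Write to a temporary file first so an interrupted write cannot
    # leave a half-written checkpoint behind.
    echo "$step" > "$STATE_FILE.tmp" && mv "$STATE_FILE.tmp" "$STATE_FILE"
}

do_one_step() {
    # Placeholder for one short unit of real work (hypothetical).
    :
}

# Assumption: the batch shell receives SIGTERM when the job is preempted.
# The shell runs this handler once the current foreground command returns.
trap 'save_checkpoint; echo "interrupted at step $step"; exit 143' TERM

while [ "$step" -lt "$TOTAL_STEPS" ]; do
    do_one_step
    step=$((step + 1))
    save_checkpoint
done

echo "finished all $TOTAL_STEPS steps"
```

Re-submitting the same script after an interruption continues from the step stored in the checkpoint file; deleting that file starts the work from the beginning. Because the handler only runs between commands, each unit of work should be kept short relative to the grace period.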