SLURM Partitions

Dispatching

It is no longer necessary to discuss queues in the traditional sense. In the past, we created queues based on pools of hardware resources, and a user who wanted a particular hardware resource would request the corresponding queue. However, what a user wants, what is best for that user, and what is best for all users are often not the same. Allowing individuals to dictate where their jobs run inevitably leads to throughput problems, since users cannot reasonably be expected to understand the complete state and behavior of the scheduler.

Below is a general description of how jobs make their way through the queue. Please see Scheduling and Dispatch Policy for more information.

When a user submits a job to a specific partition, the scheduler determines whether the job's requested hardware and time requirements (see Using Features) match the resources the partition provides. If they do, the job is executed as soon as resources are available. If no resources are available, the job is held until the next scheduler iteration, when the scheduler checks again whether resources have become available.
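The decision described above can be sketched as a toy model. This is only an illustration and not SLURM's actual scheduling algorithm: the `Partition` and `Job` classes, their fields, the `dispatch` function, and the returned labels are all hypothetical, and the numbers in any example are made up. What the real scheduler does when a request does not match a partition is not covered here; the sketch simply reports a mismatch.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    """Hypothetical, simplified view of a partition."""
    name: str
    max_runtime_hours: int  # longest runtime the partition accepts
    free_cpus: int          # processors idle right now
    free_mem_gb: int        # memory idle right now


@dataclass
class Job:
    """Hypothetical, simplified view of a job request."""
    cpus: int
    mem_gb: int
    runtime_hours: int


def dispatch(job: Job, part: Partition) -> str:
    """Return what happens to the job on this scheduler iteration."""
    # Step 1: do the requested limits match what the partition provides?
    if job.runtime_hours > part.max_runtime_hours:
        return "mismatch"
    # Step 2: are enough processors and memory free right now?
    if job.cpus <= part.free_cpus and job.mem_gb <= part.free_mem_gb:
        part.free_cpus -= job.cpus
        part.free_mem_gb -= job.mem_gb
        return "running"
    # Otherwise the job waits and is reconsidered on the next iteration.
    return "pending"
```

Calling `dispatch` again for a pending job after other jobs have released their processors and memory models the "next scheduler iteration" in the text.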

“Available resources” include processors and memory. Processors generally correspond to the number of CPU slots in a given partition, while memory is tracked as a separate value that may be less obvious to query. If your job is waiting in the pending (PD) state, it is likely that either the processors or the memory requested exceed what the system can provide at that point in time.
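To see why a job is still waiting, `squeue` can print each job's state together with the reason the scheduler recorded for it. The following is a minimal sketch, assuming Python 3.9 or later on a login node where `squeue` is installed. The helper names (`squeue_command`, `parse_pending`, `pending_reasons`) are hypothetical; `%i`, `%T`, and `%r` are the `squeue` format fields for job ID, state, and reason.

```python
import subprocess


def squeue_command(user: str) -> list[str]:
    """Build a squeue call that prints job ID, state, and reason per job."""
    # --noheader drops the title row; '|' makes the fields easy to split.
    return ["squeue", f"--user={user}", "--noheader", "--format=%i|%T|%r"]


def parse_pending(output: str) -> dict[str, str]:
    """Map job ID -> reason for every job still in the PENDING state."""
    reasons = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        job_id, state, reason = line.strip().split("|", 2)
        if state == "PENDING":
            reasons[job_id] = reason
    return reasons


def pending_reasons(user: str) -> dict[str, str]:
    """Run squeue (needs a SLURM installation) and parse its output."""
    result = subprocess.run(squeue_command(user), capture_output=True,
                            text=True, check=True)
    return parse_pending(result.stdout)
```

A reason such as `Resources` indicates that the job is waiting for processors or memory to become free, which is the situation described above.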

CIRCE Partition Layout

The following partitions (aka queues) are available on CIRCE:

Current QOS configuration and limits

SLURM_Active_QOS'

Per Partition Hardware

CIRCE_Hardware

Queue Name	Max Runtime	QOS Required	Description (Preempt Grace Period)	Operating System	$WORK file system path	Notes
bgfsqdr	1 week	none	EL7 general-purpose queue	RHEL 7.4	/work_bgfs	General use partition
circe	1 week	none	default general-purpose queue	RHEL 6.8	/work	The default partition if no partition is specified
cms2016	1 week	cms16	CMS nodes	RHEL 6.8	/work	Limited to College of Marine Science only
cuda	1 week	none	Pseudo partition for access to Kepler K20s	RHEL 6.8	/work	CUDA GPU nodes
hchg	1 week	hchg, interactive	EL7 interactive/serial partition	RHEL 7.4	/work_bgfs	General use partition for interactive and serial jobs
mri2016	2 days	mri16, mri16_npi, preempt	MRI nodes (2 hour grace period)	RHEL 6.8	/work	*See Preemption Guidelines for more info
henderson_itn18	Per QOS	hen18	Chemical Engineering GPU nodes (2 hour grace period)	RHEL 7.4	/work_bgfs	3 GPUs (GTX 1080 Ti) per node. Uses Omni-Path.
himem	1 week	memaccess	large memory job queue (>= 64 GB)	RHEL 6.8	/work	To request access, email rc-help@usf.edu
rra	1 week	rra	Genomics Center/Restricted Research	RHEL 7.4	/work	HIPAA certification required & audited BeeGFS file system. Uses Omni-Path.
simmons_itn18	Per QOS	sim18	Chemical Engineering GPU nodes (2 hour grace period)	RHEL 7.4	/work_bgfs	24 GPUs (GTX 1080 Ti) per node (oversubscribed). Uses Omni-Path.
snsm_itn19	Per QOS	snsm19, snsm19_long	SNSM grant nodes	RHEL 7.4	/work_bgfs	1 GPU (GTX 1070 Ti) per node. Uses Omni-Path.

Note: For jobs that need more than 1 week to run, please email rc-help@usf.edu with your project details (hardware and runtime requested, duration of project, etc.).
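The partition and QOS columns of the CIRCE table can be turned into a small pre-submission check. This is a sketch under stated assumptions, not an official tool: the `REQUIRED_QOS` mapping is copied from the table, the `sbatch_args` helper and the `job.sh` script name are hypothetical, and it assumes that where several QOS values are listed for a partition, any one of them is acceptable. `--partition`, `--qos`, and `--time` are standard `sbatch` options.

```python
# Required QOS per partition, copied from the CIRCE table.
# An empty tuple means no QOS is required.
REQUIRED_QOS = {
    "bgfsqdr": (),
    "circe": (),
    "cms2016": ("cms16",),
    "cuda": (),
    "hchg": ("hchg", "interactive"),
    "mri2016": ("mri16", "mri16_npi", "preempt"),
    "henderson_itn18": ("hen18",),
    "himem": ("memaccess",),
    "rra": ("rra",),
    "simmons_itn18": ("sim18",),
    "snsm_itn19": ("snsm19", "snsm19_long"),
}


def sbatch_args(script, partition, qos=None, time_limit=None):
    """Build an sbatch argument list, checking the QOS against the table."""
    allowed = REQUIRED_QOS[partition]  # KeyError for an unknown partition
    if allowed and qos not in allowed:
        raise ValueError(
            f"partition {partition!r} requires one of these QOS values: "
            f"{', '.join(allowed)}"
        )
    args = ["sbatch", f"--partition={partition}"]
    if qos:
        args.append(f"--qos={qos}")
    if time_limit:
        # SLURM accepts days-hours:minutes:seconds, e.g. "2-00:00:00".
        args.append(f"--time={time_limit}")
    args.append(script)
    return args
```

For example, `sbatch_args("job.sh", "himem", qos="memaccess", time_limit="2-00:00:00")` produces the argument list for a two-day large-memory job, while omitting the QOS for `himem` raises a `ValueError` before anything is submitted.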

SC Partitions

SC Partition Layout

The following partitions (aka queues) are available on SC:

Queue Name	Max Runtime	QOS Required	Description	Notes
sc	2 days	none	default general-purpose queue	The default partition if no partition is specified
