SLURM Partitions
Dispatching
It is no longer necessary to discuss queues in the traditional sense. In the past, queues were created around pools of hardware resources: a user who wanted a particular resource would request the corresponding queue. In practice, however, what a user requests and what is best for that user, or for all users, are often not the same. Allowing individuals to dictate where their jobs run inevitably leads to throughput problems, since users cannot reasonably be expected to understand the complete state and behavior of the scheduler.
Below is a general description of how jobs make their way through the queue. Please see Scheduling and Dispatch Policy for more information.
When a user submits a job to a specific partition, the scheduler determines whether the job's requested hardware and time requirements (see Using Features) match the resources the partition provides. If they do, the job runs as soon as resources are available. If no resources are available, the job is held until the next scheduler iteration, when resource availability is re-evaluated.
“Available resources” include processors and memory. Processors generally correspond to the number of slots in a given partition, while memory is tracked as a consumable value that may not be obvious to query. If your job is waiting in the pending (PD) state, it is likely that either the cores or the memory requested exceed what the system can provide at that particular point in time.
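As a sketch of how these requests are expressed (the job name, program, and resource values below are illustrative; substitute your own, and partition names from the tables that follow), a minimal SLURM batch script might look like:

```shell
#!/bin/bash
#SBATCH --job-name=example     # job name shown in squeue
#SBATCH --partition=circe      # target partition (the default if omitted)
#SBATCH --nodes=1              # number of nodes
#SBATCH --ntasks=4             # total tasks (the "slots" being requested)
#SBATCH --mem=8G               # memory per node
#SBATCH --time=02:00:00        # walltime limit (HH:MM:SS)

srun ./my_program              # my_program is a placeholder for your executable
```

If the job stays pending, `squeue -u $USER` shows its state (PD) and a Reason column (e.g. `Resources` or `Priority`) indicating why it has not started.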
CIRCE Partitions
CIRCE Node Sets
The following node sets are available:
| Memory | CPU | Cores | Interconnect | Nodes | Slots | GPUs | Constraint Flags | Location |
|---|---|---|---|---|---|---|---|---|
| 24GB | Xeon E5-2630 | 12 | 4x QDR IB | 28 | 336 | n/a | ib_qdr, ib_psm, sse41, sse42, avx, cpu_xeon, xeon_E52630 | Tampa |
| 32GB | Xeon E5-2670 | 16 | 4x QDR IB | 19 | 304 | 37 | ib_qdr, ib_psm, sse4, sse41, sse42, avx, cpu_xeon, xeon_E52670, gpu_K20 | Tampa |
| 32GB | Xeon E5-2670 | 16 | 4x QDR IB | 84 | 1344 | n/a | ib_qdr, ib_psm, sse4, sse41, sse42, avx, cpu_xeon, xeon_E52670 | Tampa |
| 512GB | Xeon E5-2650 v3 | 20 | 4x QDR IB | 3 | 60 | n/a | ib_qdr, ib_psm, tpa, sse4_1, sse4_2, avx, avx2, gpfs, cpu_xeon, xeon_E52650, mem_512G | Tampa |
| 64GB | Xeon E5-2650 v4 | 24 | 4x QDR IB | 108 | 2592 | n/a | ib_qdr, ib_psm, tpa, sse4_1, sse4_2, avx, avx2, gpfs, cpu_xeon, xeon_E52650 | Tampa |
| 192GB | Xeon Silver 4114 | 20 | 100G Omni-Path | 57 | 1140 | 57 | ib_opa, ib_psm, tpa, sse4_1, sse4_2, avx, avx2, gpfs, cpu_xeon, silver_4114, gpu_gtx1080ti | Tampa |
| 192GB | Xeon Silver 4114 | 20 | 100G Omni-Path | 39 | 780 | 49 | ib_opa, ib_psm, tpa, sse4_1, sse4_2, avx, avx2, gpfs, cpu_xeon, silver_4114, gpu_gtx1070ti | Tampa |
| 96GB | Xeon Silver 4114 | 20 | 100G Omni-Path | 10 | 200 | 6 | ib_opa, ib_psm, tpa, sse4_1, sse4_2, avx, avx2, gpfs, cpu_xeon, silver_4114, gpu_gtx1080ti | Tampa |
| 96GB | Xeon Silver 4114 | 20 | 100G Omni-Path | 2 | 40 | 2 | ib_opa, ib_psm, tpa, sse4_1, sse4_2, avx, avx2, gpfs, cpu_xeon, silver_4114, gpu_titanv100 | Tampa |
| 96GB | Xeon Gold 6136 | 20 | 100G Omni-Path | 8 | 160 | 24 | ib_opa, ib_psm, tpa, sse4_1, sse4_2, avx, avx2, gpfs, cpu_xeon, gold_6136, gpu_gtx1080ti | Tampa |
| 96GB | Xeon Gold 6136 | 20 | 100G Omni-Path | 20 | 400 | 60 (480) | ib_opa, ib_psm, tpa, sse4_1, sse4_2, avx, avx2, gpfs, cpu_xeon, gold_6136, gpu_gtx1080ti | Tampa |
| **Total** | | | | 378 | 7356 | 235 (655) | | |
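The Constraint Flags column maps onto SLURM's `--constraint` option. As an illustrative sketch (flag names taken from the table above; the exact GRES name for requesting a GPU is site-dependent), a job script targeting a GTX 1080 Ti node with AVX2 support might include:

```shell
#!/bin/bash
#SBATCH --partition=cuda                   # GPU partition (see layout table below)
#SBATCH --constraint="gpu_gtx1080ti&avx2"  # require both features on the same node
#SBATCH --gres=gpu:1                       # request one GPU
#SBATCH --time=04:00:00                    # walltime limit

srun ./gpu_program                         # gpu_program is a placeholder executable
```

The `&` operator in `--constraint` requires all listed features; `|` would accept any one of them.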
CIRCE Partition Layout
The node sets are associated with the following partitions:
| Queue Name | Max Runtime | QOS Required | Description (Preempt Grace Period) | Operating System | $WORK file system path | Notes |
|---|---|---|---|---|---|---|
| bgfsqdr | Per QOS | el7, el7_cms | EL7 application testing | RHEL 7.4 | /work_bgfs | To request access, email rc-help@usf.edu |
| circe | 1 week | none | default general-purpose queue | RHEL 6.8 | /work | The default partition if no partition is specified |
| cms2016 | 1 week | cms16 | CMS nodes | RHEL 6.8 | /work | |
| cuda | 1 week | none | CUDA GPU nodes | RHEL 6.8 | /work | |
| devel | Per QOS | devel, trial | development partition | RHEL 6.8 | /work | max: 24 cores and 2 nodes per user |
| mri2016 | 2 days | mri16, mri16_npi, preempt | MRI nodes (2 hour grace period) | RHEL 6.8 | /work | See Preemption Guidelines for more info |
| henderson_itn18 | Per QOS | hen18 | Chemical Engineering GPU nodes (2 hour grace period) | RHEL 7.4 | /work_bgfs | 3 GPUs (GTX 1080 Ti) per node |
| himem | 1 week | memaccess | large memory job queue (>= 64 GB) | RHEL 6.8 | /work | To request access, email rc-help@usf.edu |
| rra | 1 week | rra | Genomics Center/Restricted Research | RHEL 7.4 | /work | HIPAA certification required & audited; BeeGFS file system |
| simmons_itn18 | Per QOS | sim18 | Chemical Engineering GPU nodes (2 hour grace period) | RHEL 7.4 | /work_bgfs | 24 GPUs (GTX 1080 Ti) per node (oversubscribed) |
| snsm_itn19 | Per QOS | snsm19, snsm19_long | SNSM grant nodes | RHEL 7.4 | /work_bgfs | 1 GPU (GTX 1070 Ti) per node |
- Note: For jobs requiring longer than 1 week to run, please email rc-help@usf.edu with your project details (hardware/runtime requested, duration of project, etc.).
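Partitions whose QOS column is not "none" require a matching `--qos` flag at submission. As a sketch (assuming access to the devel QOS has been granted, and `job.sh` is a placeholder for your batch script), a short development job could be submitted as:

```shell
# Submit to the devel partition under the devel QOS, within its
# per-user limits (max 24 cores and 2 nodes per user):
sbatch --partition=devel --qos=devel --ntasks=4 --time=01:00:00 job.sh
```

Submitting to a QOS-restricted partition without the matching QOS (or without having been granted it) will cause the job to be rejected or held.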
SC Partitions
SC Node Sets
The following node sets are available:
| Memory | CPU | Cores | Interconnect | Nodes | Slots | GPUs | Constraint Flags | Location |
|---|---|---|---|---|---|---|---|---|
| 64GB | Xeon E5-2650 v4 | 24 | 4x QDR IB | 10 | 240 | n/a | ib_qdr, ib_psm, avx, avx2, sse4_1, sse4_2, gpfs, cpu_xeon, xeon_E52650 | Tampa |
| 48GB | Xeon E5649 | 12 | 4x QDR IB | 7 | 84 | 7 | ib_qdr, ib_psm, avx, avx2, sse4_1, sse4_2, gpfs, cpu_xeon, xeon_E5649 | Tampa |
| **Total** | | | | 17 | 324 | 7 | | |
SC Partition Layout
The node sets are associated with the following partitions:
| Queue Name | Max Runtime | QOS Required | Description | Notes |
|---|---|---|---|---|
| sc | 2 days | none | default general-purpose queue | The default partition if no partition is specified |
| cuda | 2 days | none | CUDA GPU nodes | |