Revision as of 20:21, 9 January 2017

SLURM Partitions

Dispatching

It is no longer necessary to discuss queues in the traditional sense. In the past, we created queues based on pools of hardware resources, and a user who wanted a particular hardware resource would request the matching queue. However, what a user wants is often not what is best for that user, or for all users. Allowing individuals to dictate where their jobs run inevitably leads to throughput problems, since it is unreasonable to expect users to understand the complete state and behavior of the scheduler.

Below is a general description of how jobs make their way through the queue. Please see Scheduling and Dispatch Policy for more information.

When a user submits a job to a specific partition, the scheduler determines whether the hardware and time requirements of the job (see Using Features) match the resources the partition provides. If they match, the job runs as soon as resources are available. If no resources are available, the job is held until the next scheduler iteration, when the scheduler checks again whether resources have become available.
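As an illustration of the submission step described above, here is a minimal sketch of a submit script. The job name, task count, memory, and time values are made-up examples, not site recommendations, and `describe_job` is a hypothetical helper. `SLURM_JOB_ID` and `SLURM_JOB_PARTITION` are environment variables that Slurm sets inside a running job.

```shell
#!/bin/bash
#SBATCH --job-name=example      # illustrative name
#SBATCH --partition=circe       # default general-purpose partition
#SBATCH --ntasks=12             # illustrative: number of tasks requested
#SBATCH --mem=20G               # illustrative: memory per node
#SBATCH --time=02:00:00         # illustrative: wall-clock limit (HH:MM:SS)

# Hypothetical helper: report the job ID and partition the job landed on.
# Outside of Slurm the variables are unset, so fall back to "unknown".
describe_job() {
    echo "job ${SLURM_JOB_ID:-unknown} on partition ${SLURM_JOB_PARTITION:-unknown}"
}

describe_job
```

Saved as submit-script.sh, this would be submitted with sbatch ./submit-script.sh. The scheduler reads the #SBATCH lines as the job's resource request; when run directly with bash they are plain comments.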

“Available resources” include processors and memory. Processors generally correspond to the number of slots in a given partition, while memory is defined as a complex value that may be less obvious to query. If your job is waiting in the pending (PD) state, it is likely that either the slots or the memory requested exceed what the system can provide at that point in time.
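To see why jobs are waiting, squeue can print the scheduler's stated reason for each pending job. The sketch below is one possible way to summarize those reasons; `summarize_reasons` is a hypothetical helper, and it assumes each input line has the form "JOBID REASON" (only the first word of a multi-word reason is counted).

```shell
# Hypothetical helper: count pending jobs per reason.
# Reads "JOBID REASON" lines on stdin and prints "REASON COUNT", sorted.
summarize_reasons() {
    awk '{ count[$2]++ } END { for (r in count) print r, count[r] }' | sort
}

# On the cluster, feed it your own pending jobs:
#   -t PENDING  restricts output to pending jobs
#   -h          suppresses the header line
#   -o "%i %r"  prints the job ID and the pending reason
#
#   squeue -u "$USER" -t PENDING -h -o "%i %r" | summarize_reasons
```

A reason of Resources typically means the requested processors or memory are not currently free, while Priority means other jobs are ahead in line.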

Partition Node Sets

The following node sets are available:

Memory	CPU	Cores	Interconnect	Nodes	Slots	GPUs	Complex Flags	Location
24GB	Opteron 2384	12	4x DDR IB	32	384	n/a	ib_ddr, ib_psm, tpa, sse4, sse4a, cpu_amd, opteron_2384	Tampa
24GB	Xeon E5649	12	4x QDR IB	107	1284	8	ib_qdr, ib_psm, sse4, sse41, sse42, cpu_xeon, xeon_E5649	Tampa
24GB	Xeon E5-2630	12	4x QDR IB	67	804	n/a	ib_qdr, ib_psm, sse41, sse42, avx, cpu_xeon, xeon_E52630	Tampa
24GB	Xeon E5649	12	4x QDR IB	14	168	n/a	ib_qdr, ib_ofa, sse41, sse42, avx, cpu_xeon, xeon_E5649	Tampa
32GB	Xeon E5-2670	16	4x QDR IB	129	2064	40	ib_qdr, ib_psm, sse4, sse41, sse42, avx, cpu_xeon, xeon_E52670, gpu_K20	Tampa
512GB	Xeon E5-2650	20	4x QDR IB	3	60	n/a	ib_qdr, ib_psm, tpa, sse4_1, sse4_2, avx, avx2, gpfs, cpu_xeon, xeon_E52650, mem_512G	Tampa
Total					4764	48
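If the Complex Flags listed above are exposed as Slurm node features (an assumption; see Using Features for the authoritative list), a job can be steered toward matching nodes with the --constraint option of sbatch, where & means "and". The sketch below uses a hypothetical helper, `join_features`, to build such an expression.

```shell
# Hypothetical helper: join feature flags with "&", the AND operator
# accepted by sbatch --constraint.
join_features() {
    local IFS='&'
    echo "$*"
}

# Example on the cluster (flag names taken from the table above):
#   sbatch --constraint="$(join_features avx ib_qdr)" ./submit-script.sh
#
# To list the features each partition actually advertises:
#   sinfo -o "%P %D %c %m %f"   # partition, nodes, CPUs, memory, features
```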

Partition Layout

The node sets are associated with the following queues:

Queue Name	Max Runtime	QOS Required	Description/Preempt Grace Period	Notes
circe	infinite	none	default general-purpose queue
rc2016	infinite	none	general-purpose queue with latest hardware
gpfsgpu	infinite	none	CUDA GPU nodes
cuda	infinite	none	CUDA GPU nodes
cms2016	infinite	cms16, preempt	CMS nodes (2 hour grace period)	*See Preemption description below
mri2016	infinite	mri16, preempt	MRI nodes (2 hour grace period)	*See Preemption description below
himem	1 week	memaccess	large memory job queue (>= 24 GB)	To request access, email rc-help@usf.edu

Preemption

Some hardware on CIRCE is provided by research contributors. This hardware is available to all CIRCE users who specify the partition and the "preempt" QOS (example: sbatch --partition=mri2016 --qos=preempt ./submit-script.sh). The caveat is that, because this is contributor hardware, non-contributor jobs running on these partitions are subject to preemption.

There is a partition-specified grace period (listed above, typically 2 hours) before a contributor's job(s) will cancel a non-contributor's job(s). This means that any user taking advantage of this hardware should have some form of checkpointing enabled, so that interrupted jobs can be resubmitted without starting over.
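What checkpointing looks like depends on the application. As a minimal sketch only, the script below records the last completed step in a file and resumes from it on resubmission. The file name checkpoint.txt, the `TOTAL_STEPS` value, and the `run_with_checkpoints` function are hypothetical placeholders for real application logic.

```shell
#!/bin/bash
#SBATCH --partition=mri2016
#SBATCH --qos=preempt

CKPT="${CKPT:-checkpoint.txt}"      # hypothetical checkpoint file
TOTAL_STEPS="${TOTAL_STEPS:-5}"     # hypothetical number of work steps

# Run steps start..TOTAL_STEPS-1, where start is read from the checkpoint
# file if one exists. After each step, record the next step to run so a
# resubmitted job skips work that already finished.
run_with_checkpoints() {
    local start=0 step
    if [ -f "$CKPT" ]; then
        start=$(cat "$CKPT")
    fi
    for ((step = start; step < TOTAL_STEPS; step++)); do
        echo "running step $step"   # placeholder for the real work
        echo $((step + 1)) > "$CKPT"
    done
}

# In a real submit script, call run_with_checkpoints as the last line.
```

If the job is cancelled partway through, resubmitting the same script picks up at the step recorded in the checkpoint file rather than at step 0.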
