SLURM Partitions

== Dispatching ==
It is really no longer necessary to discuss queues in the traditional sense. In the past, we would create queues based on pools of hardware resources: if a user wanted to utilize a particular hardware resource, he or she would request the appropriate queue. Most of the time, however, what the user wants, what is best for that user, and what is best for all users are not the same thing. Allowing individuals to dictate where their jobs will run inevitably leads to throughput problems, since it is unreasonable to expect users to understand the complete state and behavior of the scheduler.

Below is a general description of how jobs make their way through the queue. Please see the Scheduling and Dispatch Policy page for more information.

When a user submits a job to a specific partition, the scheduler determines whether the requested hardware and time requirements of the job (see Using Features) match the resources the partition provides. If they do, the job is executed as soon as resources are available. If no resources are available, the job is held until the next scheduler iteration, which checks whether resources have become available.

“Available resources” include processors and memory. Processors generally correspond to the number of slots in a given queue, while memory is defined as a complex value that may not be as obvious to query. If your job is waiting in the pending (<code>PD</code>) state, it is likely that either the slots or the memory requested are beyond what the system can provide at that particular point in time.
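A typical batch script therefore just declares what the job needs and lets the scheduler place it. Below is a minimal sketch using the default <code>circe</code> partition described later on this page; the job name, program, and resource values are placeholders, not recommendations.

<pre>
#!/bin/bash
#SBATCH --job-name=example_job       # placeholder name
#SBATCH --partition=circe            # general-purpose partition (optional; circe is the default)
#SBATCH --time=02:00:00              # requested walltime; must fit within the partition's Max Runtime
#SBATCH --ntasks=1                   # number of tasks (slots)
#SBATCH --cpus-per-task=4            # cores per task
#SBATCH --mem=8G                     # memory per node; part of the "available resources" the scheduler checks
#SBATCH --output=%x_%j.out           # stdout/stderr file, named from job name and job ID

# Replace with your actual workload
srun ./my_program
</pre>

If the requested cores, memory, or walltime cannot currently be satisfied, the job simply remains pending until they can.
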
== CIRCE Partitions ==
=== CIRCE Node Sets ===
The following node sets are available:
{| class=wikitable
|- style="background-color:#f1edbe;"
|'''Memory'''
|'''CPU'''
|'''Cores'''
|'''Interconnect'''
|'''Nodes'''
|'''Slots'''
|'''GPUs'''
|'''Constraint Flags'''
|'''Location'''
|-
|24GB
|Opteron 2384
|12
|4x DDR IB
|32
|384
|n/a
|ib_ddr, ib_psm, tpa, sse4, sse4a, cpu_amd, opteron_2384
|Tampa
|-
|24GB
|Xeon E5649
|12
|4x QDR IB
|107
|1284
|8
|ib_qdr, ib_psm, sse4, sse41, sse42, cpu_xeon, xeon_E5649
|Tampa
|-
|24GB
|Xeon E5-2630
|12
|4x QDR IB
|67
|804
|n/a
|ib_qdr, ib_psm, sse41, sse42, avx, cpu_xeon, xeon_E52630
|Tampa
|-
|24GB
|Xeon E5649
|12
|4x QDR IB
|14
|168
|n/a
|ib_qdr, ib_ofa, sse41, sse42, avx, cpu_xeon, xeon_E5649
|Tampa
|-
|32GB
|Xeon E5-2670
|16
|4x QDR IB
|129
|2064
|40
|ib_qdr, ib_psm, sse4, sse41, sse42, avx, cpu_xeon, xeon_E52670, gpu_K20
|Tampa
|-
|512GB
|Xeon E5-2650
|20
|4x QDR IB
|3
|60
|n/a
|ib_qdr, ib_psm, tpa, sse4_1, sse4_2, avx, avx2, gpfs, cpu_xeon, xeon_E52650, mem_512G
|Tampa
|-
|'''Total'''
|
|
|
|352
|4764
|48
|
|
|}
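The constraint flags in the table above are what <code>--constraint</code> matches against when a job needs a specific node set (see Using Features). A minimal sketch, assuming these flags are defined as node features on CIRCE; several flags can be combined with <code>&</code> for an AND match:

<pre>
#!/bin/bash
#SBATCH --partition=circe
#SBATCH --time=01:00:00
#SBATCH --ntasks=16
#SBATCH --constraint="avx&ib_qdr"    # only nodes advertising both the avx and ib_qdr features

srun ./my_program
</pre>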


=== CIRCE Partition Layout ===
The following partitions (aka queues) are available on CIRCE:
 
=== Current QOS' configuration and limits ===
[[SLURM_Active_QOS']]
=== Per Partition Hardware ===
[[CIRCE_Hardware]]
{|class=wikitable
|- style="background-color:#f1edbe;text-align:center"
|'''Queue Name'''
|'''Max Runtime'''
|'''QOS' Required'''
|'''Description (Preempt Grace Period)'''
|'''Operating System'''
|'''$WORK file system path'''
|'''Notes'''
|-
|amd_2021
|1 week
|none
|AMD 2021 hardware purchase
|RHEL 7.4
|'''/work_bgfs'''
|
|-
|amdwoods_2022
|Per QOS
|amdwoods22, physics22, preempt
|Physics 2022 hardware purchase
|RHEL 7.4
|'''/work_bgfs'''
|Joint hardware purchase within Physics
|-
|bfbsm_2019
|Per QOS
|bfbsm19, preempt
|BFBSM_2019 hardware purchase
|RHEL 7.4
|'''/work_bgfs'''
|*See [[SLURM_Preemption|Preemption Guidelines]] for more info
|-
|cbcs
|Per QOS
|fawcett_access, preempt
|CBCS/Engineering queue
|RHEL 7.4
|'''/work_bgfs'''
|
|-
|charbonnier_2022
|Per QOS
|charbonnier22, preempt
|Physics 2022 hardware purchase
|RHEL 7.4
|'''/work_bgfs'''
|Joint hardware purchase within Physics
|-
|chbme_2018
|Per QOS
|chbme18, sim18, preempt
|Chemical Engineering GPU nodes
|RHEL 7.4
|'''/work_bgfs'''
|3 GPUs (GTX 1080 Ti) per node. Uses Omni-Path.
|-
|circe
|1 week
|none
|default general-purpose queue
|RHEL 7.4
|'''/work'''
|The default general-use partition if no partition is specified
|-
|cms_ocg
|Per QOS
|cms_ocg
|CMS OCG nodes
|RHEL 7.4
|'''/work_bgfs'''
|Limited to College of Marine Science OCG only
|-
|cool2022
|Per QOS
|cool22, preempt
|CMS OCG nodes
|RHEL 7.4
|'''/work'''
|College of Marine Science OOL AMD hardware purchase
|-
|hchg
|1 week
|hchg, interactive
|general-purpose interactive/serial partition
|RHEL 7.4
|'''/work_bgfs'''
|*See [[SLURM_Preemption|Preemption Guidelines]] for more info
|-
|himem
|1 week
|memaccess
|large memory job queue (&gt;= 64 GB)
|RHEL 7.4
|'''/work'''
|To request access, email {{rchelp}}
|-
|margres_2020
|Per QOS
|margres20, preempt
|Margres lab nodes, Integrative Biology
|RHEL 7.4
|'''/work_bgfs'''
|
|-
|muma_2021
|Per QOS
|muma21, preempt_short
|MUMA
|RHEL 7.4
|'''/work_bgfs'''
|
|-
|qcg_gayles_2022
|Per QOS
|qcg_gayles22, physics22, preempt
|Physics 2022 hardware purchase
|RHEL 7.4
|'''/work_bgfs'''
|Joint hardware purchase within Physics
|-
|rra
|1 week
|rra, rra_guest
|Genomics Center/Restricted Research
|RHEL 7.4
|'''/work'''
|HIPAA certification required & audited BeeGFS file system. Uses Omni-Path.
|-
|rra_con2020
|35 days
|rradl
|Genomics Center/College of Nursing deep learning partition
|RHEL 7.4
|'''/work'''
|HIPAA certification required & audited BeeGFS file system. Uses HDR InfiniBand. Access limited to approved personnel and workflows only.
|-
|simmons_itn18
|Per QOS
|sim18, chbme18, preempt, preempt_short
|Chemical Engineering GPU nodes (30 minute grace period)
|RHEL 7.4
|'''/work_bgfs'''
|*See [[SLURM_Preemption|Preemption Guidelines]] for more info. 3 GPUs (GTX 1080 Ti) per node. Uses Omni-Path.
|-
|snsm_itn19
|Per QOS
|openaccess, snsm19, snsm19_long, snsm19_special
|SNSM grant nodes
|RHEL 7.4
|'''/work_bgfs'''
|1 GPU (GTX 1070 Ti) per node. Uses Omni-Path.
|}


* Note: For jobs requiring longer than 1 week to run, please email {{rchelp}} with your project details (hardware/runtime requested, duration of project, etc).
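Partitions whose Max Runtime is listed as ''Per QOS'' must be paired with one of the QOS' shown in the table. The sketch below targets the chbme_2018 GPU partition with its chbme18 QOS; it assumes access has already been granted and that the GPUs are requested through the usual <code>--gres=gpu</code> mechanism (the GRES name is an assumption, not something documented on this page).

<pre>
#!/bin/bash
#SBATCH --partition=chbme_2018       # Chemical Engineering GPU partition (see table above)
#SBATCH --qos=chbme18                # one of the QOS' required for this partition
#SBATCH --time=12:00:00              # walltime limit is governed by the QOS
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --gres=gpu:1                 # one of the node's GTX 1080 Ti GPUs (GRES name assumed)

srun ./my_gpu_program
</pre>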


== SC Partitions ==
=== SC Node Sets ===
The following node sets are available:
{| class=wikitable
|- style="background-color:#f1edbe;"
|'''Memory'''
|'''CPU'''
|'''Cores'''
|'''Interconnect'''
|'''Nodes'''
|'''Slots'''
|'''GPUs'''
|'''Constraint Flags'''
|'''Location'''
|-
|64GB
|Xeon E5-2650 v4
|24
|4x QDR IB
|10
|240
|n/a
|ib_qdr, ib_psm, avx, avx2, sse4_1, sse4_2, gpfs, cpu_xeon, xeon_E52650
|Tampa
|-
|48GB
|Xeon E5649
|12
|4x QDR IB
|7
|84
|7
|ib_qdr, ib_psm, avx, avx2, sse4_1, sse4_2, gpfs, cpu_xeon, xeon_E5649
|Tampa
|-
|'''Total'''
|
|
|
|17
|324
|7
|
|
|}
=== SC Partition Layout ===
The following partitions (aka queues) are available on SC:


{|class=wikitable
|- style="background-color:#f1edbe;text-align:center"
|'''Queue Name'''
|'''Max Runtime'''
|'''QOS' Required'''
|'''Description'''
|'''Notes'''
|-
|sc
|2 days
|none
|default general-purpose queue
|The default partition if no partition is specified
|}
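Because limits, QOS', and node counts change over time, the scheduler itself is the authoritative source on both clusters. The standard SLURM query commands below show how to confirm what a partition or QOS currently provides; the partition name used is an example taken from this page.

<pre>
# List partitions with their time limits, node counts, and CPU availability
sinfo -o "%P %l %D %C"

# Show the full configuration of a single partition
scontrol show partition circe

# Show QOS limits (walltime, per-user job limits, preemption)
sacctmgr show qos format=Name,MaxWall,MaxJobsPU,Preempt

# List nodes together with the feature/constraint flags they advertise
sinfo -N -o "%N %f"
</pre>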
