Batch System - Overview

Computer Center

1. Available farm nodes
2. Submission hosts
3. Job runtime

1. Available farm nodes

node name	number of systems	CPU	clock frequency	cores	memory	scratch space in $TMPDIR	comment
kepler{00..15}	16	Intel Xeon E5-2660	2.2GHz	16	64GB	480GB	2x nVidia Kepler K20 GPGPU per node
kepler{16..26}	11	Intel Xeon E5-2640 v3	2.6GHz	16	64GB	480GB	2x nVidia Kepler K80 GPGPU per node
blade{g,h}*	32	Intel Xeon E5-2640 v3	2.6GHz	16	64GB	1.2TB
bladei*	16	Intel Xeon E5-2640 v4	2.4GHz	20	64GB	1.2TB
blade{j,k}*	32	Intel Xeon Gold 6130	2.1GHz	32	192GB	600GB
pascal*	6	Intel Xeon Gold 6130	2.1GHz	32	384GB	1.2TB	6x nVidia Tesla P4 GPGPU per node
pizza*	25	AMD EPYC 7702P	2.0GHz	64	512GB	1.7TB
turing*	5	Intel Xeon Gold 6130	2.1GHz	32	384GB	1.4TB	8x nVidia RTX 2080 Ti GPGPU per node
ampere*	3	AMD EPYC 7502	2.5GHz	64	512GB	1.7TB	8x nVidia RTX 3090 per node

For an up-to-date overview, see the output of the qhost command. Details can be found under job monitoring.
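
For scripted checks, the qhost listing can be read into a small data structure. The following is a minimal sketch, not an official tool. It assumes the tabular layout commonly printed by Grid Engine's qhost (a header line, a dashed separator, a "global" pseudo-host, then one line per node); column names and count differ between versions, so the parser takes them from the header line. The function names parse_qhost and query_farm, the SAMPLE text, and the host name in it are made up for illustration.

```python
import subprocess

# Illustrative sample only, NOT real output from this farm.
SAMPLE = """\
HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
-------------------------------------------------------------------------------
global                  -               -     -       -       -       -       -
bladeg1                 lx-amd64       16  0.50   62.8G    4.1G     0.0     0.0
"""


def parse_qhost(text):
    """Parse qhost-style tabular output into a list of dicts keyed by column name."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split()  # column names come from the output itself
    hosts = []
    for line in lines[1:]:
        if set(line.strip()) == {"-"}:
            continue  # dashed separator line
        fields = line.split()
        if len(fields) != len(header) or fields[0] == "global":
            continue  # skip the "global" pseudo-host and malformed lines
        hosts.append(dict(zip(header, fields)))
    return hosts


def query_farm():
    """Run qhost (must be on PATH, e.g. on a submission host) and parse its output."""
    result = subprocess.run(["qhost"], capture_output=True, text=True, check=True)
    return parse_qhost(result.stdout)
```

Calling parse_qhost(SAMPLE) returns one dictionary per node; all values stay strings, so convert fields such as NCPU to integers before doing arithmetic with them.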

2. Submission hosts

Public login machines (pub[1..6])
Workgroup server (Linux)
Linux desktops

3. Job runtime

The batch farm is configured to optimize job throughput while retaining a degree of interactivity, i.e. fast job turnaround. We therefore prefer job runtimes of 1-12 hours. Jobs running for more than 12 hours may only occupy a limited percentage of the compute farm.

General Notice

The maximum job runtime is currently limited to 48 hours.

Jobs requesting a longer runtime will never start!
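
A submission script can catch over-long requests before they reach the queue. The sketch below assumes the farm runs a Grid Engine variant (suggested by the qhost command) and that the runtime is requested through the standard Grid Engine resource h_rt in HH:MM:SS form; this page does not confirm either, so check the job submission documentation. The helper name runtime_request is hypothetical.

```python
MAX_RUNTIME_HOURS = 48  # maximum job runtime stated in the notice above


def runtime_request(hours, minutes=0):
    """Return an 'h_rt=HH:MM:SS' resource string, or raise if the request cannot start.

    Assumption: runtime is requested via the Grid Engine resource h_rt.
    Both arguments are expected to be whole numbers.
    """
    total_minutes = hours * 60 + minutes
    if total_minutes <= 0:
        raise ValueError("requested runtime must be positive")
    if total_minutes > MAX_RUNTIME_HOURS * 60:
        raise ValueError(
            f"requested runtime exceeds {MAX_RUNTIME_HOURS} hours; the job would never start"
        )
    h, m = divmod(total_minutes, 60)
    return f"h_rt={h:02d}:{m:02d}:00"
```

For example, runtime_request(12) yields "h_rt=12:00:00", which under the assumption above would be passed to the submit command as a resource request.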

job runtime	description
0-30 minutes	Allows slight CPU oversubscription: a job can start even though all available slots are currently filled. Expect slightly worse CPU performance. Use this class for test purposes only.
30 minutes - 12 hours	The preferred job runtime. Allows maximum farm usage while keeping good overall "interactivity", i.e. fast job turnaround.
12-24 hours	The number of simultaneously running jobs is limited to 75% of the available slots.
24-48 hours	The number of simultaneously running jobs is limited to 66% of the available slots.
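
The table can be expressed as a small lookup, which is handy when a workflow tool has to decide how to size its jobs. This is a sketch under one assumption: the table does not say which class a runtime of exactly 30 minutes, 12 hours or 24 hours falls into, so the code assigns boundary values to the shorter class. The names RuntimeClass and runtime_class are illustrative.

```python
from typing import NamedTuple


class RuntimeClass(NamedTuple):
    label: str
    slot_fraction: float  # cap on simultaneously used slots; 1.0 = no cap listed
    oversubscribed: bool  # True if jobs may start on already filled slots


def runtime_class(minutes):
    """Map a requested runtime in minutes to its row in the table above.

    Assumption: boundary values (30 min, 12 h, 24 h) belong to the shorter class.
    """
    if minutes <= 0:
        raise ValueError("runtime must be positive")
    if minutes > 48 * 60:
        raise ValueError("runtime exceeds the 48 hour maximum")
    if minutes <= 30:
        return RuntimeClass("0-30 minutes", 1.0, True)
    if minutes <= 12 * 60:
        return RuntimeClass("30 minutes - 12 hours", 1.0, False)
    if minutes <= 24 * 60:
        return RuntimeClass("12-24 hours", 0.75, False)
    return RuntimeClass("24-48 hours", 0.66, False)
```

A request of 18 hours, runtime_class(18 * 60), lands in the "12-24 hours" class with a slot fraction of 0.75.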

A job runtime of less than 10 minutes should be avoided to keep a good ratio between overhead at job start/end and the actual job payload.
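
One common way to avoid very short jobs is to bundle many small tasks into fewer, longer jobs. The sketch below estimates how many tasks to place in each job. The target of 60 minutes per job is an assumption chosen from within the preferred runtime range, and the function name plan_bundles is hypothetical.

```python
import math


def plan_bundles(n_tasks, task_minutes, target_minutes=60):
    """Return (tasks_per_job, n_jobs) so that each job runs for roughly target_minutes.

    n_tasks        -- total number of independent tasks
    task_minutes   -- estimated runtime of a single task in minutes
    target_minutes -- desired runtime per job (assumed default: 60)
    """
    if n_tasks <= 0 or task_minutes <= 0 or target_minutes <= 0:
        raise ValueError("all arguments must be positive")
    # A task longer than the target still gets its own job.
    tasks_per_job = max(1, int(target_minutes // task_minutes))
    n_jobs = math.ceil(n_tasks / tasks_per_job)
    return tasks_per_job, n_jobs
```

With 1000 tasks of about 2 minutes each, plan_bundles(1000, 2) returns (30, 34): 34 jobs of up to 30 tasks, each running for roughly an hour instead of 1000 jobs of 2 minutes.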