DV-Zeuthen

Batch System - Overview

1. Available farm nodes
2. Submission hosts
3. Job runtime

1. Available farm nodes

| Node name | Number of systems | CPU | Clock frequency | Cores | Memory | Scratch space in $TMPDIR | Comment |
|---|---|---|---|---|---|---|---|
| kepler{00..15} | 16 | Intel Xeon E5-2660 | 2.2 GHz | 16 | 64 GB | 480 GB | 2x NVIDIA Kepler K20 GPGPU per node |
| kepler{16..26} | 11 | Intel Xeon E5-2640 v3 | 2.6 GHz | 16 | 64 GB | 480 GB | 2x NVIDIA Kepler K80 GPGPU per node |
| blade{g,h}* | 32 | Intel Xeon E5-2640 v3 | 2.6 GHz | 16 | 64 GB | 1.2 TB | |
| bladei* | 16 | Intel Xeon E5-2640 v4 | 2.4 GHz | 20 | 64 GB | 1.2 TB | |
| blade{j,k}* | 32 | Intel Xeon Gold 6130 | 2.1 GHz | 32 | 192 GB | 600 GB | |
| pascal* | 6 | Intel Xeon Gold 6130 | 2.1 GHz | 32 | 384 GB | 1.2 TB | 6x NVIDIA Tesla P4 GPGPU per node |
| pizza* | 25 | AMD EPYC 7702P | 2.0 GHz | 64 | 512 GB | 1.7 TB | |
| turing* | 5 | Intel Xeon Gold 6130 | 2.1 GHz | 32 | 384 GB | 1.4 TB | 8x NVIDIA RTX 2080 Ti GPGPU per node |
| ampere* | 3 | AMD EPYC 7502 | 2.5 GHz | 64 | 512 GB | 1.7 TB | 8x NVIDIA RTX 3090 per node |

For an up-to-date overview, see the output of the qhost command. Details can be found under job monitoring.
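For quick capacity estimates, the node list above can be captured as a small data structure. The following is a minimal sketch, not an official tool: the names NodeGroup, FARM and totals are hypothetical, the "cores" column is assumed to be per node, and GPUs are counted as listed in the comment column. The numbers are a static snapshot of the table and may be out of date compared to the live qhost output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeGroup:
    """One row of the farm node table (hypothetical helper type)."""
    name: str
    systems: int            # number of systems in this group
    cores: int              # cores per node (assumption: the table lists per-node values)
    gpus_per_node: int = 0  # taken from the comment column, 0 if none listed


# Static snapshot of the table above.
FARM = [
    NodeGroup("kepler{00..15}", 16, 16, 2),
    NodeGroup("kepler{16..26}", 11, 16, 2),
    NodeGroup("blade{g,h}*", 32, 16),
    NodeGroup("bladei*", 16, 20),
    NodeGroup("blade{j,k}*", 32, 32),
    NodeGroup("pascal*", 6, 32, 6),
    NodeGroup("pizza*", 25, 64),
    NodeGroup("turing*", 5, 32, 8),
    NodeGroup("ampere*", 3, 64, 8),
]


def totals(groups):
    """Sum nodes, cores and GPUs over all node groups."""
    return {
        "nodes": sum(g.systems for g in groups),
        "cores": sum(g.systems * g.cores for g in groups),
        "gpus": sum(g.systems * g.gpus_per_node for g in groups),
    }


if __name__ == "__main__":
    for key, value in totals(FARM).items():
        print(f"{key}: {value}")
```

Under these assumptions the table adds up to 146 nodes, 4432 cores and 154 GPUs.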

2. Submission hosts 

3. Job runtime

The batch farm is configured to optimize job throughput while keeping the farm responsive enough for quick, interactive-style work. We therefore prefer job runtimes of 1-12 hours. Jobs running for more than 12 hours may occupy only a limited percentage of the compute farm.

General Notice

The maximum job runtime is currently limited to 48 hours. Jobs requesting a longer runtime will never start!

| Job runtime | Description |
|---|---|
| 0-30 minutes | Allows slight CPU oversubscription: a job can start even though all available slots are currently filled. Expect slightly worse CPU performance. This class should therefore be used for test purposes only. |
| 30 minutes - 12 hours | The preferred job runtime. Allows maximum farm usage while keeping good overall "interactivity", i.e. fast job turnaround. |
| 12-24 hours | The number of simultaneously running jobs is limited to 75% of the available slots. |
| 24-48 hours | The number of simultaneously running jobs is limited to 66% of the available slots. |

A job runtime of less than 10 minutes should be avoided to keep a good ratio between overhead at job start/end and the actual job payload.
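The runtime rules above can be summarized as a small lookup. This is only an illustrative sketch and not part of the batch system: the names RuntimeClass and classify_runtime and the class labels are hypothetical, and treating each upper bound as inclusive (for example, exactly 12 hours counting as "preferred") is an assumption, since the text does not say which class a boundary value falls into.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional

MAX_RUNTIME = timedelta(hours=48)       # longer requests never start
MIN_SENSIBLE = timedelta(minutes=10)    # below this, start/end overhead dominates

# (inclusive upper bound, hypothetical label, cap on running jobs as a
# fraction of available slots, or None if no cap is stated)
_CLASSES = [
    (timedelta(minutes=30), "test", None),
    (timedelta(hours=12), "preferred", None),
    (timedelta(hours=24), "long", 0.75),
    (timedelta(hours=48), "very long", 0.66),
]


@dataclass(frozen=True)
class RuntimeClass:
    label: str
    will_start: bool
    slot_cap: Optional[float]  # fraction of available slots, None if uncapped
    too_short: bool            # True if below the recommended 10 minutes


def classify_runtime(requested: timedelta) -> RuntimeClass:
    """Map a requested job runtime to the class described in the text."""
    if requested <= timedelta(0):
        raise ValueError("requested runtime must be positive")
    if requested > MAX_RUNTIME:
        return RuntimeClass("rejected", False, None, False)
    for upper, label, cap in _CLASSES:
        if requested <= upper:
            return RuntimeClass(label, True, cap, requested < MIN_SENSIBLE)
    raise AssertionError("unreachable: all runtimes up to 48 h are covered")


if __name__ == "__main__":
    for hours in (0.1, 6, 18, 36, 60):
        print(hours, classify_runtime(timedelta(hours=hours)))
```

A check like this could be run before submission to catch runtime requests above 48 hours or to flag very short jobs that are better bundled together.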