nf-core/configs: dkfz

Deutsches Krebsforschungszentrum (DKFZ) ODCF HPC cluster profile

nf-core/configs: DKFZ configuration

To use, run the pipeline with -profile dkfz. This will download and launch the dkfz.config, pre-configured for the Deutsches Krebsforschungszentrum (DKFZ) / ODCF LSF cluster in Heidelberg, Germany.

This configuration is tested with Nextflow 25.10.0 (available on the cluster as a module).

The profile only configures the cluster itself (LSF executor, dynamic queue selection, scratch, resource limits and the /omics bind-mount). Pick a container engine on the command line, e.g. -profile dkfz,apptainer or -profile dkfz,conda.

:warning: Use Apptainer/Singularity (or Conda), not Docker. On the ODCF cluster Docker is only available through LSF’s docker-generic application profile. Nextflow’s Docker support runs docker run directly on the node, which this setup does not allow, so -profile dkfz,docker will not work. Use -profile dkfz,apptainer instead.

Before you use this profile

Load Nextflow via the environment module system on a submission host. Check the pipeline’s README for the required Nextflow version:
```
module load Nextflow/25.10.0
```
Submit from a submission host (bsub01.lsf.dkfz.de / bsub02.lsf.dkfz.de). Do not run heavy work on the login/worker nodes. Wrap the Nextflow driver itself in a bsub job (see below).
The shared /omics filesystem is bind-mounted into every container automatically. Only /omics is bound by this profile, so inputs or references that live elsewhere may need their own bind mount. Point NXF_APPTAINER_CACHEDIR / NXF_SINGULARITY_CACHEDIR at a path under /omics so images are cached on shared storage:
```
export NXF_APPTAINER_CACHEDIR=/omics/groups/<your-group>/.../apptainer_cache
```
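As a minimal sketch of the step above, the helper below creates the cache directory and exports both cache variables in one go. The function name `setup_image_cache` is hypothetical and not part of the profile; the directory you pass in is your own choice under /omics.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the dkfz profile): create an image cache
# directory and point both the Apptainer and Singularity cache variables at it.
setup_image_cache() {
    local cache_dir="$1"
    if [ -z "$cache_dir" ]; then
        echo "usage: setup_image_cache <directory under /omics>" >&2
        return 1
    fi
    # Create the directory if it does not exist yet.
    mkdir -p "$cache_dir" || return 1
    export NXF_APPTAINER_CACHEDIR="$cache_dir"
    export NXF_SINGULARITY_CACHEDIR="$cache_dir"
}

# Example (replace the placeholders with your group's real path):
# setup_image_cache "/omics/groups/<your-group>/.../apptainer_cache"
```

Setting both variables keeps the cache location the same whether the run uses the apptainer or the singularity profile.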

Queues

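Queue selection is automatic, based on each task’s requested time and memory. The memory check runs first, so a task requesting more than 200 GB goes to highmem regardless of its requested time:

| Queue | Selected when | Limit |
| --- | --- | --- |
| `short` | no time given, or `time <= 10.min` | 10 min |
| `medium` | `time <= 1.h` | 1 hour |
| `long` | `time <= 10.h` | 10 hours |
| `verylong` | `time > 10.h` | no hard limit |
| `highmem` | `memory > 200.GB` | up to ~4 TB |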

Note: highmem is the only queue that accepts requests above 200 GB (and it rejects requests below 200 GB).
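To preview which CPU queue a given request would land in, the selection rules above can be mirrored in a few lines of shell. This is only a sketch: the function name `dkfz_queue` is hypothetical, it takes whole minutes and whole gigabytes as plain integers, and it ignores GPU tasks, which the profile routes separately.

```shell
#!/usr/bin/env bash
# Hypothetical preview helper mirroring the profile's CPU queue rules.
# Usage: dkfz_queue <time_minutes> <memory_gb>
dkfz_queue() {
    local minutes="${1:-0}"   # 0 stands in for "no time given"
    local mem_gb="${2:-0}"
    if [ "$mem_gb" -gt 200 ]; then
        # Memory is checked first: large requests always go to highmem.
        echo highmem
    elif [ "$minutes" -le 10 ]; then
        echo short
    elif [ "$minutes" -le 60 ]; then
        echo medium
    elif [ "$minutes" -le 600 ]; then
        echo long
    else
        echo verylong
    fi
}

dkfz_queue 5 6      # prints: short
dkfz_queue 240 32   # prints: long
dkfz_queue 30 512   # prints: highmem
```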

Resource limits, retries and containers

Every task is capped at what the cluster can provide via process.resourceLimits (64 CPUs, 1000 GB memory, 720 h); larger requests are reduced to these limits automatically.
Unlabelled processes default to a safe 1 CPU / 6 GB / 10 min.
Tasks that fail with no exit status, or with exit code 104, 255 or anything from 130 to 145, are retried up to 3 times. Any other failure lets already submitted tasks finish and then stops the run.
The shared /omics filesystem is bound into every container via containerOptions, with --nv added for accelerator tasks. If one of your modules sets its own containerOptions, re-add --bind /omics there.
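One way to re-add the bind mount is a small extra config passed with Nextflow's -c option. The sketch below writes such a file from the shell; the file name `dkfz_overrides.config`, the process name `MY_MODULE` and the extra `--writable-tmpfs` option are all placeholders chosen for illustration, not something the profile defines.

```shell
#!/usr/bin/env bash
# Write a small override config. File name and process name are hypothetical.
cat > dkfz_overrides.config <<'EOF'
process {
    withName: 'MY_MODULE' {
        // Module-specific option plus the /omics bind that the profile
        // would otherwise have supplied for this process.
        containerOptions = '--writable-tmpfs --bind /omics'
    }
}
EOF

# Then pass it alongside the profile, for example:
# nextflow run <pipeline> -profile dkfz,apptainer -c dkfz_overrides.config ...
```

If the overridden process also requests a GPU, it would presumably need --nv in the same string, since the override replaces the profile's value rather than adding to it.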

Enable GPU support

This profile turns any task that requests a GPU through Nextflow’s standard accelerator directive into a correct DKFZ GPU submission. It selects the GPU queue, builds the LSF -gpu num=<n>:j_exclusive=yes[:gmem=<n>G] request, and adds --nv so the GPU is visible inside the container.

How a task acquires an accelerator request depends on the pipeline:

nf-core pipelines mark GPU-capable processes with the process_gpu label and only switch the accelerator on when the run includes the gpu profile. So add gpu to your profile list:
```
nextflow run <pipeline> -profile dkfz,gpu,apptainer --input ... --outdir ...
```

Custom / non-nf-core pipelines just declare accelerator on the GPU process:

```
process MY_GPU_TASK {
    accelerator 1
    container 'docker://nvcr.io/...'

    script:
    "my_gpu_tool ..."
}
```

```
nextflow run main.nf -profile dkfz,apptainer --outdir ...
```

Tasks without an accelerator request are unaffected and run on the normal CPU queues.

Choosing the GPU queue

The --dkfz_gpu_queue parameter selects which GPU queue all GPU jobs are submitted to (default gpu):

gpu — default (RTX 2080 Ti … V100/A100-DGX), 72 h wall time
gpu-lowprio — same nodes as gpu but low priority; use for large job batches
gpu-pro — high-end A100/H200/L40S/GH200, 142 h wall time — requires a separate access application to the DKFZ Data Science Board

Number of GPUs and GPU memory per process

The profile builds the LSF request as -gpu num=<n>:j_exclusive=yes[:gmem=<n>G] (DKFZ requires j_exclusive=yes and rejects mode=exclusive_process). Two things are tunable per process:

Number of GPUs — the accelerator directive (default 1).
GPU memory (optional) — set ext.gpu_memory to a Nextflow memory value to pin the job to GPUs with at least that much VRAM. When ext.gpu_memory is unset, gmem is omitted and LSF assigns any free GPU.
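To make the resulting request string concrete, the sketch below assembles it the same way the description above does. The function name `dkfz_gpu_request` is hypothetical, and the GPU memory is passed as a whole number of gigabytes rather than a Nextflow memory value.

```shell
#!/usr/bin/env bash
# Hypothetical helper that mirrors how the -gpu request is put together.
# Usage: dkfz_gpu_request <num_gpus> [gmem_gb]
dkfz_gpu_request() {
    local num="${1:-1}"      # number of GPUs, defaults to 1
    local gmem_gb="${2:-}"   # optional minimum GPU memory in GB
    local req="-gpu num=${num}:j_exclusive=yes"
    if [ -n "$gmem_gb" ]; then
        # Only add gmem when a GPU memory value was given.
        req="${req}:gmem=${gmem_gb}G"
    fi
    printf '%s\n' "$req"
}

dkfz_gpu_request 2      # prints: -gpu num=2:j_exclusive=yes
dkfz_gpu_request 1 40   # prints: -gpu num=1:j_exclusive=yes:gmem=40G
```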

Approximate values to target each GPU tier (request at or just below the card’s usable VRAM):

| `ext.gpu_memory` | Targets | Queue |
| --- | --- | --- |
| `10.GB` | RTX 2080 Ti (11 GB) | `gpu` |
| `15.GB` | V100 16 GB | `gpu` |
| `23.GB` | TITAN RTX / Quadro RTX (24 GB) | `gpu` |
| `31.GB` | V100 32 GB | `gpu` |
| `40.GB` | A100 40 GB | `gpu-pro` only |
| `46.GB` | L40S | `gpu-pro` only |
| `98.GB` | GH200 | `gpu-pro` only |
| `141.GB` | H200 | `gpu-pro` only |

Set these directly on the process, or per process name from config (e.g. nf-core’s conf/modules.config):

```
process {
    // 2 GPUs, any free GPU (no gmem constraint)
    withName: 'FOO:BAR:ALIGN_GPU' {
        accelerator = 2
    }
    // 1 big-memory GPU
    withName: 'FOO:BAR:FOLD' {
        accelerator    = 1
        ext.gpu_memory = 40.GB   // -> A100/L40S/H200; also set --dkfz_gpu_queue gpu-pro
    }
}
```

:warning: Requesting 40.GB or more only works on gpu-pro. On the plain gpu queue such a request hangs in PEND forever. Use at most 12 CPUs and ~45 GB host RAM per GPU (DKFZ GPU usage policy).

Running Nextflow on the cluster

Run the Nextflow driver inside an LSF job rather than on a submission host directly. Make a script and submit it with bsub < my_script.sh:

```
#!/bin/bash
#BSUB -J nf_pipeline
#BSUB -o nf_pipeline.%J.log
#BSUB -q long
#BSUB -n 2
#BSUB -R "rusage[mem=8G]"
#BSUB -W 10:00

module load Nextflow/25.10.0

# Cache images on shared storage so worker nodes can reach them:
export NXF_APPTAINER_CACHEDIR=/omics/groups/<your-group>/.../apptainer_cache

nextflow run <pipeline> \
    -profile dkfz,apptainer \
    --input samplesheet.csv \
    --outdir results
```

Add gpu to -profile (e.g. -profile dkfz,gpu,apptainer) to send process_gpu tasks to a GPU queue.
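After submitting, it helps to capture the driver job's ID so you can find its log. The sketch below assumes bsub prints its usual confirmation line of the form `Job <12345> is submitted to queue <long>.`; the function name `job_id_from_bsub` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read bsub's confirmation line on stdin and print
# only the numeric job ID. Assumes the "Job <ID> is submitted ..." format.
job_id_from_bsub() {
    sed -n 's/^Job <\([0-9][0-9]*\)> is submitted.*/\1/p'
}

# On a submission host (my_script.sh is the driver script described above):
# job_id="$(bsub < my_script.sh | job_id_from_bsub)"
# bjobs "$job_id"                       # status of the driver job
# tail -f "nf_pipeline.${job_id}.log"   # driver log, named by '#BSUB -o nf_pipeline.%J.log'
```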

Config file

See config file on GitHub

```
// Institutional profile for the DKFZ / ODCF LSF cluster.

params {
    config_profile_description = 'Deutsches Krebsforschungszentrum (DKFZ) ODCF HPC cluster profile'
    config_profile_contact     = 'Abid Abrar (abid.abrar@dkfz-heidelberg.de), Kübra Narcı (kuebra.narci@dkfz-heidelberg.de)'
    config_profile_name        = 'DKFZ Cluster'
    config_profile_url         = 'https://www.dkfz.de'

    max_cpus   = 64
    max_memory = '1000.GB'
    max_time   = '720.h'

    // GPU queue for GPU jobs (options: gpu (default), gpu-lowprio, gpu-pro)
    dkfz_gpu_queue = 'gpu'
}

apptainer {
    enabled    = true
    autoMounts = true
}

// Ignore the custom dkfz_gpu_queue param in nf-schema validation
validation.ignoreParams = ['dkfz_gpu_queue']

process {
    executor = 'lsf'
    scratch  = '$CLUSTER_SCRATCHDIR'

    // Retry transient failures: no exit status, signals 130–145 (137 = OOM/preempt), 104/255 (I/O drops)
    errorStrategy = { (task.exitStatus == null || task.exitStatus == Integer.MAX_VALUE || task.exitStatus in ((130..145) + [104, 255])) ? 'retry' : 'finish' }
    maxRetries    = 3
    cache         = 'lenient'

    // Cap every task to the cluster ceiling: 64 cores, 1000 GB RAM, 720 h (30 day) wall time
    resourceLimits = [
        cpus  : 64,
        memory: 1000.GB,
        time  : 720.h,
    ]

    // Low defaults for unlabelled processes
    cpus   = 1
    memory = 6.GB
    time   = 10.min

    // GPU tasks go to a GPU queue; everything else to a CPU queue by time/memory.
    queue = {
        if (task.accelerator) {
            return params.dkfz_gpu_queue
        } else if (task.memory && task.memory > 200.GB) {
            return 'highmem'
        } else if (!task.time || task.time <= 10.min) {
            return 'short'
        } else if (task.time <= 1.h) {
            return 'medium'
        } else if (task.time <= 10.h) {
            return 'long'
        } else {
            return 'verylong'
        }
    }

    // GPU request, depends on `accelerator`: a nf-core `process_gpu` task without
    // `-profile gpu` has no accelerator, so it stays on CPU.
    // j_exclusive=yes is mandatory
    // optional `ext.gpu_memory` pins to GPUs with at least that much VRAM.
    clusterOptions = {
        if (!task.accelerator) {
            return null
        }
        def gpu = "-gpu num=${task.accelerator.request}:j_exclusive=yes"
        if (task.ext.gpu_memory) {
            gpu += ":gmem=${task.ext.gpu_memory.toGiga()}G"
        }
        return gpu
    }

    // Bind /omics into every container; add --nv for GPU tasks.
    containerOptions = { task.accelerator ? '--bind /omics --nv' : '--bind /omics' }
}

executor {
    name            = 'lsf'
    perJobMemLimit  = true
    perTaskReserve  = false
    queueSize       = 10
    submitRateLimit = '1 sec'
    exitReadTimeout = '30 min'
}
```
