nf-core/configs: sage
The Sage Bionetworks Nextflow Config Profile
nf-core/configs: Sage Bionetworks Global Configuration
To use this custom configuration, run the pipeline with -profile sage. This will download and load the sage.config, which contains a number of optimizations relevant to Sage employees running workflows on AWS (e.g. using Nextflow Tower). This profile will also load any applicable pipeline-specific configuration.
This global configuration includes the following tweaks:
- Update the default value for igenomes_base to s3://sage-igenomes
- Enable retries for all failures
- Allow pending jobs to finish if the number of retries is exhausted
- Increase resource allocations for specific resource-related exit codes
- Optimize resource allocations to better “fit” EC2 instance types
- Slow the increase in the number of allocated CPU cores on retries
- Increase the default time limits because we run pipelines on AWS
- Increase the amount of time allowed for file transfers
- Improve reliability of file transfers with retries and reduced concurrency
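The retry behaviour described in the list above can be modelled outside Nextflow. The following Python sketch is not part of the profile; it mirrors the multipliers used in the config file further down, where CPU requests grow with `Math.ceil(task.attempt/2)` and memory and time grow with `task.attempt`. The function names are hypothetical, and the sketch assumes that the exit status seen on a retry is the one from the failed attempt.

```python
import math

# Exit codes copied from params.exit_status_scaling in the config file.
EXIT_STATUS_SCALING = {143, 137, 104, 134, 139, 247}


def cpu_multiplier(attempt, exit_status=None):
    """CPU factor: grows only on every second attempt (1, 1, 2, 2, 3, ...)."""
    if exit_status in EXIT_STATUS_SCALING:
        return math.ceil(attempt / 2)
    return 1


def linear_multiplier(attempt, exit_status=None):
    """Memory and time factor: grows by one on each attempt (1, 2, 3, ...)."""
    if exit_status in EXIT_STATUS_SCALING:
        return attempt
    return 1


if __name__ == "__main__":
    # Compare both factors for a task that keeps failing with exit code 137.
    for attempt in range(1, 7):
        print(attempt, cpu_multiplier(attempt, 137), linear_multiplier(attempt, 137))
```

Exit codes outside the list leave both factors at 1, so the task is retried with its original request.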
Additional information about iGenomes
The following iGenomes prefixes have been copied from s3://ngi-igenomes/ (eu-west-1) to s3://sage-igenomes (us-east-1). See this script for more information. The sage-igenomes S3 bucket has been configured to be openly available, but files cannot be downloaded outside of us-east-1, which avoids egress charges. You can check the conf/igenomes.config file in each nf-core pipeline to figure out the mapping between genome IDs (i.e. for --genome) and iGenomes prefixes (example).
- Human Genome Builds
  - Homo_sapiens/Ensembl/GRCh37
  - Homo_sapiens/GATK/GRCh37
  - Homo_sapiens/UCSC/hg19
  - Homo_sapiens/GATK/GRCh38
  - Homo_sapiens/NCBI/GRCh38
  - Homo_sapiens/UCSC/hg38
- Mouse Genome Builds
  - Mus_musculus/Ensembl/GRCm38
  - Mus_musculus/UCSC/mm10
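Because only the prefixes listed above are mirrored, it can help to check a reference path before relying on the profile. The Python sketch below is one way to do that. It assumes the mirrored prefixes sit directly under the igenomes_base value set in the config file (s3://sage-igenomes/igenomes). The helper names and the example paths are hypothetical.

```python
# Prefixes copied from the list of mirrored genome builds.
MIRRORED_PREFIXES = (
    "Homo_sapiens/Ensembl/GRCh37",
    "Homo_sapiens/GATK/GRCh37",
    "Homo_sapiens/UCSC/hg19",
    "Homo_sapiens/GATK/GRCh38",
    "Homo_sapiens/NCBI/GRCh38",
    "Homo_sapiens/UCSC/hg38",
    "Mus_musculus/Ensembl/GRCm38",
    "Mus_musculus/UCSC/mm10",
)

# Value of igenomes_base in the config file.
SAGE_IGENOMES_BASE = "s3://sage-igenomes/igenomes"


def is_mirrored(relative_path):
    """Return True if a path relative to igenomes_base is under a mirrored prefix."""
    return any(
        relative_path == prefix or relative_path.startswith(prefix + "/")
        for prefix in MIRRORED_PREFIXES
    )


def sage_uri(relative_path):
    """Build the S3 URI in the Sage mirror, refusing paths that were not copied."""
    if not is_mirrored(relative_path):
        raise ValueError(f"{relative_path} is not in the sage-igenomes mirror")
    return f"{SAGE_IGENOMES_BASE}/{relative_path}"


if __name__ == "__main__":
    print(sage_uri("Homo_sapiens/UCSC/hg38"))
    print(is_mirrored("Danio_rerio/Ensembl/GRCz10"))
```

A genome whose prefix is not mirrored would need a different igenomes_base or explicit reference file parameters.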
Config file
// Config profile metadata
params {
    config_profile_description = 'The Sage Bionetworks Nextflow Config Profile'
    config_profile_contact = 'Rixing Xu (@rxu17)'
    config_profile_url = 'https://github.com/Sage-Bionetworks-Workflows'

    // Leverage us-east-1 mirror of select human and mouse genomes
    igenomes_base = 's3://sage-igenomes/igenomes'
    cpus = 4
    max_cpus = 32
    max_memory = 128.GB
    max_time = 240.h
    single_cpu_mem = 6.GB

    // Define task exit errors
    exit_status_scaling = [143,137,104,134,139,247]

    warning_message = {
        System.out.println("WARNING: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!")
        System.out.println("WARNING:")
        System.out.println("WARNING: THIS CONFIG IS NO LONGER MAINTAINED.")
        System.out.println("WARNING:")
        System.out.println("WARNING: THIS CONFIG WILL BE DEPRECATED BY THE END OF MAY 2026 DUE TO AN UPCOMING NEXTFLOW VERSION THAT WILL NOT BE BACKWARDS COMPATIBLE.")
        System.out.println("WARNING: MODIFICATIONS TO THIS CONFIG AND TESTING BY USERS OF THE INFRASTRUCTURE ARE REQUIRED TO ENSURE THE CONFIG REMAINS FUNCTIONAL.")
        System.out.println("WARNING:")
        System.out.println("WARNING: PLEASE GET IN CONTACT WITH THE NF-CORE COMMUNITY VIA SLACK (#configs CHANNEL) OR EMAIL (https://nf-co.re/join) ASAP")
        System.out.println("WARNING: TO ALLOW CONTINUED USE OF YOUR CONFIG WITH NF-CORE PIPELINES")
        System.out.println("WARNING:")
        System.out.println("WARNING: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!")
    }.call()
}
// Increase time limit to allow file transfers to finish
// The default is 12 hours, which results in timeouts
threadPool.FileTransfer.maxAwait = '24 hour'
// Configure Nextflow to be more reliable on AWS
aws {
    region = "us-east-1"
    client {
        uploadMaxThreads = 4
    }
    batch {
        retryMode = 'built-in'
        maxParallelTransfers = 1
        maxTransferAttempts = 10
        delayBetweenAttempts = '60 sec'
    }
}
// Adjust default resource allocations (see `../docs/sage.md`)
process {
    resourceLimits = [ memory: 128.GB, cpus: 32, time: 240.h ]

    maxErrors = '-1'
    maxRetries = 5

    // Enable retries globally for certain exit codes
    errorStrategy = { task.attempt <= 5 ? 'retry' : 'finish' }

    cpus = { task.exitStatus in params.exit_status_scaling ? Math.ceil(task.attempt/2) : 1 }
    memory = { 6.GB * (task.exitStatus in params.exit_status_scaling ? task.attempt : 1) }
    time = { 24.h * (task.exitStatus in params.exit_status_scaling ? task.attempt : 1) }

    // Process-specific resource requirements
    withLabel: process_single {
        cpus = { task.exitStatus in params.exit_status_scaling ? Math.ceil(task.attempt/2) : 1 }
        memory = { 6.GB * (task.exitStatus in params.exit_status_scaling ? task.attempt : 1) }
        time = { 24.h * (task.exitStatus in params.exit_status_scaling ? task.attempt : 1) }
    }
    withLabel: process_low {
        cpus = { 2 * (task.exitStatus in params.exit_status_scaling ? Math.ceil(task.attempt/2) : 1) }
        memory = { 12.GB * (task.exitStatus in params.exit_status_scaling ? task.attempt : 1) }
        time = { 24.h * (task.exitStatus in params.exit_status_scaling ? task.attempt : 1) }
    }
    withLabel: process_medium {
        cpus = { 8 * (task.exitStatus in params.exit_status_scaling ? Math.ceil(task.attempt/2) : 1) }
        memory = { 32.GB * (task.exitStatus in params.exit_status_scaling ? task.attempt : 1) }
        time = { 48.h * (task.exitStatus in params.exit_status_scaling ? task.attempt : 1) }
    }
    withLabel: process_high {
        cpus = { 16 * (task.exitStatus in params.exit_status_scaling ? Math.ceil(task.attempt/2) : 1) }
        memory = { 64.GB * (task.exitStatus in params.exit_status_scaling ? task.attempt : 1) }
        time = { 96.h * (task.exitStatus in params.exit_status_scaling ? task.attempt : 1) }
    }
    withLabel: process_long {
        time = { 96.h * (task.exitStatus in params.exit_status_scaling ? task.attempt : 1) }
    }
    withLabel: 'process_high_memory|memory_max' {
        memory = { 128.GB * (task.exitStatus in params.exit_status_scaling ? task.attempt : 1) }
    }
    withLabel: cpus_max {
        cpus = { 32 * (task.exitStatus in params.exit_status_scaling ? Math.ceil(task.attempt/2) : 1) }
    }
}
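To see how the per-label requests in the process block interact with resourceLimits, the arithmetic can be reproduced in a few lines of Python. This is an illustrative model, not code that Nextflow runs. It assumes that resourceLimits caps each request at the listed maximum and that the exit status seen on a retry is the one from the failed attempt. The names `request` and `error_strategy` are hypothetical, and only four of the labels are included.

```python
import math

# Values copied from the config file.
EXIT_STATUS_SCALING = {143, 137, 104, 134, 139, 247}
RESOURCE_LIMITS = {"cpus": 32, "memory_gb": 128, "time_h": 240}
LABEL_BASES = {
    "process_single": {"cpus": 1, "memory_gb": 6, "time_h": 24},
    "process_low": {"cpus": 2, "memory_gb": 12, "time_h": 24},
    "process_medium": {"cpus": 8, "memory_gb": 32, "time_h": 48},
    "process_high": {"cpus": 16, "memory_gb": 64, "time_h": 96},
}


def request(label, attempt, exit_status=None):
    """Resources requested for a label on a given attempt, after capping."""
    scaling = exit_status in EXIT_STATUS_SCALING
    cpu_factor = math.ceil(attempt / 2) if scaling else 1
    linear_factor = attempt if scaling else 1
    base = LABEL_BASES[label]
    raw = {
        "cpus": base["cpus"] * cpu_factor,
        "memory_gb": base["memory_gb"] * linear_factor,
        "time_h": base["time_h"] * linear_factor,
    }
    # Assumption: resourceLimits clamps each value to its configured maximum.
    return {key: min(value, RESOURCE_LIMITS[key]) for key, value in raw.items()}


def error_strategy(attempt):
    """Mirror of errorStrategy: retry through attempt 5, then 'finish'."""
    return "retry" if attempt <= 5 else "finish"


if __name__ == "__main__":
    # Third attempt of a process_high task after an out-of-memory kill (137).
    print(request("process_high", 3, 137))
    print(error_strategy(6))
```

Under these assumptions, a process_high task reaches all three limits by its third attempt, so later retries repeat the same request.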