[BI] Temporary files generated by the nf-core pipeline in root folder
Problem Description
When I ran the nf-core RNA-seq pipeline, large temporary files named “rootfs-*” (> 10 GB) were generated in /tmp on the root partition. Because the root partition does not have enough space, the location for temporary files had to be changed.
Time waste
I spent too much time trying out environment variables for nf-core, Nextflow & Singularity
- NXF_TEMP, SINGULARITY_TMPDIR & TMPDIR
Solution
I had been using Singularity v3.8.
After I replaced it with SingularityCE v4.1, the issue went away without any further configuration.
Troubleshooting Journey
I initially concentrated on identifying which part of the pipeline framework or software was creating the temp files.
- nf-core, Nextflow
- Singularity
1. nf-core pipeline & Nextflow configuration:
nf-core pipeline configuration
Priority (highest first):
- Parameters specified on the command line (--something value)
- -params-file option (.json|.yaml format). To see which parameters are available, use nf-core launch.
- -c my_config option. Warning: For nf-core pipelines, parameters defined in custom config files will not override defaults in nextflow.config! Please use -params-file.
- nextflow.config in the current directory
- nextflow.config in the workflow project directory
- $HOME/.nextflow/config
- Hardcoded pipeline defaults: values defined within the pipeline script itself (e.g. main.nf)
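As a rough sketch of the -params-file option, pipeline parameters can be kept in a YAML file and passed at launch. The file name params.yaml and the values samplesheet.csv and results are placeholders made up for illustration; input and outdir are nf-core/rnaseq parameters. The launch command is only printed here, not executed.

```shell
# Write a params file; the values are placeholders, not real paths.
cat > params.yaml <<'EOF'
input: samplesheet.csv
outdir: results
EOF

# Print the launch command instead of running it.
echo "nextflow run nf-core/rnaseq -profile singularity -params-file params.yaml"
```

Keeping parameters in a file like this avoids the -c pitfall mentioned in the warning above.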
There are three main types of pipeline configuration that you can use:
Basic pipeline configuration profiles
- base.config (in ~/.nextflow/assets/nf-core/{pipeline}/conf/)
- profile configs (test.config is located with base.config)
- predefined sets of configurations stored in the pipeline’s nextflow.config file or in separate config files within the pipeline’s conf/ directory.
- ex. -profile test,singularity (The order of arguments is important! They are loaded in sequence, so later profiles can overwrite earlier profiles.)
Shared nf-core/configs configuration profiles
- If you use a shared system with other people
Custom configuration files
- If you are the only person running this pipeline
- User’s home directory: ~/.nextflow/config
- Analysis working directory: nextflow.config
- Custom path specified on the command line: -c path/to/config (multiple can be given)
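A minimal sketch of the custom-path variant, assuming a hypothetical file name custom.config and a hypothetical output folder results. It only sets non-parameter options (the singularity scope of Nextflow), since pipeline parameters are better passed with -params-file. The launch command is printed rather than run.

```shell
# Write a small custom config; the file name is a placeholder.
cat > custom.config <<'EOF'
// Non-parameter settings only; pipeline parameters belong in -params-file.
singularity {
    enabled    = true
    autoMounts = true
}
EOF

# Print the launch command instead of running it.
echo "nextflow run nf-core/rnaseq -profile test -c custom.config --outdir results"
```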
2. Singularity configuration:
Singularity CE (v4.1) configuration
In total, three environment variables are relevant.
- NXF_TEMP: used by Nextflow. It specifies the directory where Nextflow stores temporary files generated during its execution. These temp files can include intermediate data, log files, or other temp files created during the workflow’s execution.
- SINGULARITY_TMPDIR: used by Singularity. It specifies the directory for storing temp files needed during the building or execution of Singularity container images. If setting it changes nothing, Singularity may not use this directory under certain conditions, or no relevant actions were performed.
- TMPDIR: widely used on Linux, Unix, and macOS to specify the default directory for temp files created by the system or applications. Setting it lets users customize where temporary data is stored.
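For reference, the three variables would typically be set like this before launching the pipeline. This is a sketch only: SCRATCH_DIR and its default path under $HOME are hypothetical, and any directory on a partition with enough free space would do.

```shell
# Hypothetical scratch location; pick a partition with enough free space.
SCRATCH_DIR="${SCRATCH_DIR:-$HOME/scratch/tmp}"
mkdir -p "$SCRATCH_DIR"

# Point Nextflow, Singularity, and generic tools at the same directory.
export NXF_TEMP="$SCRATCH_DIR"
export SINGULARITY_TMPDIR="$SCRATCH_DIR"
export TMPDIR="$SCRATCH_DIR"
```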
Despite numerous attempts, none of these settings changed where the temp files were written. So I switched to reinstalling Singularity, because I had confirmed that the rootfs-* folders are generated by this software.
3. Installation of SingularityCE
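One way to check whether rootfs-* folders are piling up is a small helper like the one below. The function name list_rootfs is made up for this post; it only lists entries directly under the given directory, together with their sizes.

```shell
# List rootfs-* entries directly under a directory, with their sizes.
list_rootfs() {
    # $1: directory to scan (defaults to /tmp)
    find "${1:-/tmp}" -maxdepth 1 -name 'rootfs-*' -exec du -sh {} + 2>/dev/null
}

list_rootfs /tmp   # prints nothing if no rootfs-* entries exist
df -h /tmp         # free space on the partition that holds /tmp
```

Running it while the pipeline is active, once on /tmp and once on the directory set in the environment variables, shows which location is actually being used.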
SingularityCE v4.1 installation
The old version installed in the conda environment was removed.
conda remove singularity
Dependencies were installed.
# Ensure repositories are up-to-date
sudo apt-get update
# Install debian packages for dependencies
sudo apt-get install -y \
autoconf \
automake \
cryptsetup \
fuse \
fuse2fs \
git \
libfuse-dev \
libglib2.0-dev \
libseccomp-dev \
libtool \
pkg-config \
runc \
squashfs-tools \
squashfs-tools-ng \
uidmap \
wget \
zlib1g-dev
Because SingularityCE is written in Go, I installed Go and set up my environment.
# Download Go and unpack it into /usr/local
export VERSION=1.21.6 OS=linux ARCH=amd64 && \
wget https://dl.google.com/go/go$VERSION.$OS-$ARCH.tar.gz && \
sudo tar -C /usr/local -xzvf go$VERSION.$OS-$ARCH.tar.gz && \
rm go$VERSION.$OS-$ARCH.tar.gz
# Set up
echo 'export GOPATH=${HOME}/go' >> ~/.bashrc && \
echo 'export PATH=/usr/local/go/bin:${PATH}:${GOPATH}/bin' >> ~/.bashrc && \
source ~/.bashrc
SingularityCE was installed.
# Download files (adjust VERSION as necessary)
export VERSION=4.1.0 && \
wget https://github.com/sylabs/singularity/releases/download/v${VERSION}/singularity-ce-${VERSION}.tar.gz && \
tar -xzf singularity-ce-${VERSION}.tar.gz && \
cd singularity-ce-${VERSION}
# Compile
./mconfig && \
make -C ./builddir && \
sudo make -C ./builddir install
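After installation it is worth confirming that the new version is the one found on PATH. The sketch below assumes that singularity --version prints the version number as the last field (for example "singularity-ce version 4.1.0"); the helper name singularity_major is made up for this post.

```shell
# Extract the major version from a `singularity --version` style string.
# Assumes the version is the last whitespace-separated field, e.g. "4.1.0".
singularity_major() {
    echo "$1" | awk '{print $NF}' | cut -d. -f1
}

if command -v singularity >/dev/null 2>&1; then
    major=$(singularity_major "$(singularity --version)")
    echo "Singularity major version: $major"
else
    echo "singularity not found on PATH" >&2
fi
```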
Terms
- Cloud compute infrastructure (AWS Batch & Google Cloud)
- Container engine (Docker, Singularity, Podman, Charliecloud & Shifter)
- Package management system (Conda)