!!! CAUTION !!! Please do NOT run python commands directly in the SSH terminal!
By doing so, you are running a potentially intensive computing task on the OSCER login machines. Intensive computing tasks tend to crash the login machines, preventing all of the other 1000+ OSCER users from logging in! Please make sure to read the Python Basic Setup instructions to understand how to properly set up a Python environment and submit a Python batch job on the supercomputer.
If you have trouble following the instructions below, feel free to join OSCER's weekly Zoom help sessions.
Mamba (Conda) Setup
Many Python GitHub projects, including (but not limited to) machine learning, AI deep learning, and data analytics, rely on conda/miniconda/Anaconda to set up a Python environment with the required packages installed automatically. However, as you may have already figured out, conda environment setup takes a considerable amount of time and disk space. A simple conda setup with Python 3.10 can take hours to resolve the required packages and sets you back about 3GB of your precious home storage space. Because of this, we highly recommend setting up your Python environment from raw "vanilla" Python whenever possible, following the Python Basic Setup instructions.
However, if setting up a Python environment from scratch gets so complicated for your project that you feel you must use conda, we highly recommend Mamba. It is a relatively new alternative to conda that reimplements the conda package manager in C++ (read: it's FAST). It will still set you back the same amount of storage as conda, but at least you do not have to wait hours, if not DAYS, for your conda environment to be set up.
To use mamba as a replacement for conda, please follow these steps:
1. Loading Mamba Module and Initializing Mamba
To load a Mamba module, type:
module load Mamba
(note: it's Case SenSiTive!)
Currently, OSCER supports one Mamba version: Mamba/23.1.0-4
After loading a Mamba module, type:
mamba init
Then log out, and log back into OSCER. This sign-out is required because mamba init modifies your .bashrc file so that Mamba is set up properly the next time you log in.
Once you have logged back into OSCER, you should see your terminal prompt beginning with (base). This means the conda base environment has been initialized successfully.
2. Creating a New Conda Environment and Installing Packages
The next step is to create a conda environment that meets the needs of your Python code/GitHub project. In these instructions, we take the picrust2 GitHub project as an example. According to the official installation guide of picrust2, the command to set up a conda environment for picrust2 is:
mamba create -n picrust2 -c bioconda -c conda-forge picrust2
This command uses mamba to create a conda environment named picrust2, getting the picrust2 package and all of its dependencies from conda's bioconda and conda-forge channels (i.e. conda package repositories). In our test, mamba only took about 5 minutes to resolve all dependencies and install all required packages for picrust2. If you run this command with conda instead of mamba, you may have to wait hours, if not DAYS.
To activate the picrust2 conda environment, type:
mamba activate picrust2
You will notice your terminal prompt now begins with (picrust2), indicating that you have successfully activated the picrust2 conda environment. You will soon find out that you can replace virtually all conda commands with mamba.
To install a Python package into a conda environment from a conda channel, type:
mamba install -c your_favorite_conda_channel your_favorite_python_package
For example, to install numpy from the conda-forge channel, type:
mamba install -c conda-forge numpy
You can also install a package through pip:
pip install your_favorite_package
Such as:
pip install numpy
IMPORTANT NOTE: in the future, anytime you wish to install additional packages in a mamba environment, please make sure to ACTIVATE your mamba environment BEFORE doing any mamba/pip package installation. For example, if I want to install extra packages in the picrust2 mamba environment, I need to type:
mamba activate picrust2
BEFORE any mamba install or pip install command.
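To confirm that a package actually landed in the environment you activated, a short check run from inside that environment can help. The sketch below is illustrative only: the helper name installed_version is hypothetical, and it relies on Python's standard importlib.metadata module (Python 3.8 or newer) rather than on any mamba command.

```python
import sys
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # sys.prefix points at the environment that owns the running interpreter.
    print("environment prefix:", sys.prefix)
    for name in ("numpy",):  # example package taken from the text above
        version = installed_version(name)
        print(name, "->", version if version else "not installed")
```

Saved under a name of your choice (for instance check_env.py, a made-up filename), this can be run through a batch job like any other Python script, in keeping with the caution at the top of this page.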
3. Submitting a Batch Job in a Conda Environment
To run your Python batch job in a conda environment, the safest way to make sure your job runs without any error related to the conda environment setup is to activate your conda environment AT THE END of your terminal configuration file (~/.bashrc for bash shell, or ~/.zshrc for zsh shell, or ~/.cshrc for csh shell, or ~/.tcshrc for tcsh shell - you get the idea).
That is, use a text editing program like vi or nano to edit your terminal configuration file:
nano ~/.bashrc
Then add this line to the END of it:
mamba activate your_conda_environment
For example, to load the picrust2 conda environment, add this line at the end of your terminal configuration file:
mamba activate picrust2
Your terminal prompt should now start with (picrust2) whenever you log in. Please log out and log back in for this to take effect.
Below is the content of my test_conda.sbatch script, which runs a simple test_picrust2.py script. It is submitted to the debug partition and requests 1 CPU, 1GB of memory, and 10 minutes of run time:
#!/bin/bash
#
#SBATCH --partition=debug
#SBATCH --output=conda_%J_stdout.txt
#SBATCH --error=conda_%J_stderr.txt
#SBATCH --ntasks=1
#SBATCH --mem=1G
#SBATCH --time=00:10:00
python test_picrust2.py
After that, submit your batch script by typing:
sbatch your_conda_batchscript
Such as:
sbatch test_conda.sbatch
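The contents of test_picrust2.py are not shown in this guide. As a hypothetical sketch of what such a sanity-check script might look like, the following reports which interpreter and conda environment the job actually used. It assumes that activating an environment sets the CONDA_DEFAULT_ENV and CONDA_PREFIX variables, as conda's activation scripts normally do; the function name describe_environment is made up for illustration.

```python
import os
import sys


def describe_environment(environ=None):
    """Collect basic facts about the interpreter and the active conda environment."""
    environ = os.environ if environ is None else environ
    return {
        "python_executable": sys.executable,
        "python_version": sys.version.split()[0],
        # Set on activation; reported as "<none>" if no environment is active.
        "conda_env": environ.get("CONDA_DEFAULT_ENV", "<none>"),
        "conda_prefix": environ.get("CONDA_PREFIX", "<none>"),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If the job ran inside the intended environment, the conda_env line in the job's stdout file should read picrust2.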
Extra: Completely Removing Conda/Mamba from Your Account
In case you do not want to use conda environments anymore, follow these steps to completely remove conda-related files and scripts.
- Edit ~/.bashrc (for bash shell, or ~/.zshrc for zsh shell, or ~/.cshrc for csh shell, or ~/.tcshrc for tcsh shell - you get the idea) and delete the part of the script beginning with:
# >>> conda initialize >>>
And ending with:
# <<< conda initialize <<<
This will make sure the (base) conda environment is no longer activated right after you log in to OSCER.
- Remove the ~/.conda directory recursively:
rm -rf ~/.conda
This will remove all conda environments and downloaded packages.
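Before deleting anything by hand, it can help to preview exactly which lines fall between the two markers. The sketch below is one possible way to do that on the text of your configuration file; the function name strip_conda_init is hypothetical, and it only returns the cleaned text rather than modifying any file. It assumes both markers are present; if the closing marker is missing, everything after the opening marker is treated as part of the block.

```python
START = "# >>> conda initialize >>>"
END = "# <<< conda initialize <<<"


def strip_conda_init(text):
    """Return (cleaned_text, removed_lines) with the marker block taken out."""
    kept, removed = [], []
    inside = False
    for line in text.splitlines():
        if line.strip() == START:
            inside = True
        if inside:
            removed.append(line)
        else:
            kept.append(line)
        if line.strip() == END:
            inside = False
    return "\n".join(kept), removed


if __name__ == "__main__":
    # Stand-in text; the real block written by mamba init will differ.
    sample = "\n".join([
        "export EDITOR=nano",
        START,
        "# (lines written by mamba init would appear here)",
        END,
        "alias ll='ls -l'",
    ])
    cleaned, removed = strip_conda_init(sample)
    print(cleaned)
    print(len(removed), "lines would be removed")
```

Note that a line such as mamba activate picrust2, added at the end of the file in section 3, sits outside the markers and would need to be removed separately.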