Python Basic Setup

!!! CAUTION !!! Please do NOT run your Python code directly in the SSH terminal!

By doing so, you are running a potentially intensive computing task on the OSCER login machines. Intensive computing tasks tend to crash the login machines, which prevents all of the other 1000+ OSCER users from logging in! Please scroll down and read the entire Python Basic Setup guide to understand how to properly set up a Python environment and submit a Python batch job on the supercomputer.

If you have trouble following the instructions below, feel free to join the weekly OSCER Zoom help sessions.

Python Basic Setup on OSCER

1. Loading a python module

Assuming you have already logged in to OSCER, to load a Python module, type:
module load Python

(note: it's Case-SenSiTive).
As of this writing, this command will load Python 3.10.8 by default. To see different versions of python that can be loaded, type:
module avail python

(case-insensitive)
To load a specific version of python, such as Python/3.10.8-GCCcore-12.2.0, type:
module load Python/3.10.8-GCCcore-12.2.0

(this is again Case-SenSiTive)
You can then check to make sure the module is loaded correctly by typing:
python --version

and:
which python

(this should point to a directory on /opt/oscer/software with the correct version of python in its path).
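The same two checks can also be made from inside Python. The snippet below is only a minimal sketch and not part of the OSCER setup itself; the function name describe_interpreter and the 3.10 minimum version are illustrative assumptions. It is small enough to be run from a batch job like the one in section 4.

```python
import sys


def describe_interpreter(min_version=(3, 10)):
    """Return the interpreter path, its version, and whether it meets min_version."""
    version = sys.version_info[:3]
    return {
        # Comparable to the output of `which python`
        "executable": sys.executable,
        # Comparable to the output of `python --version`
        "version": ".".join(map(str, version)),
        # Tuple comparison: (3, 10, 8) >= (3, 10) is True
        "meets_minimum": version >= min_version,
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

If the module loaded correctly, the executable path should sit under /opt/oscer/software and the version should match the module you asked for.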
 

2. Creating a Python Virtual Environment

Most likely you want to install your favorite packages that are not provided with plain "vanilla" python. However, if you attempt to install an extra python package with the pip command, chances are that either you are greeted with a message saying you cannot do that because you don't have root privileges, or (in more recent versions of pip) it automatically installs the package locally in your user account under ~/.local/. If you're confident that all of your python projects can use exactly the same set of python packages, you can skip this step and run pip with the --user flag.

We recommend instead that you set up a separate python virtual environment for each of your python projects. This way, each project uses an independent set of python packages, and you do not have to worry about package version collisions, where one of your python programs requires one version of a package while another requires a different version of the same package.

To do so, the first task is to create a directory for one of your working projects and enter it. For the purpose of this guide, I'll name it test, and put it right under the home directory:
mkdir ~/test
cd ~/test

To create a python virtual environment named test_env for the test project, type:
python -m venv test_env

This command generates a directory called test_env inside the ~/test/ directory, containing the scripts that activate and deactivate the python virtual environment. This is also where the python packages you install with the pip command will reside.
You can use a different project directory name and virtual environment name if you want to. We suggest giving each of your python projects its own virtual environment name. This way, later on, you can easily tell which virtual environment belongs to which project.
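For reference, the directory creation and the python -m venv step can also be expressed with Python's standard venv module. This is a hedged sketch rather than the recommended OSCER workflow; the helper name create_project_env and its parameters are hypothetical and chosen only for illustration.

```python
import venv
from pathlib import Path


def create_project_env(project_dir, env_name="test_env", with_pip=True):
    """Create project_dir if needed and build a virtual environment inside it."""
    project = Path(project_dir).expanduser()
    # Equivalent of `mkdir ~/test` (no error if the directory already exists)
    project.mkdir(parents=True, exist_ok=True)
    env_dir = project / env_name
    # Equivalent of `python -m venv test_env` run from inside the project directory
    venv.create(env_dir, with_pip=with_pip)
    return env_dir
```

Calling create_project_env("~/test") would produce the same ~/test/test_env layout as the commands above, using whichever Python module is currently loaded.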

3. Activating a Python Virtual Environment and Installing Packages

To activate the python virtual environment for the test project, type:
source ~/test/test_env/bin/activate

This assumes you're using bash shell. For csh (C shell), type:
source ~/test/test_env/bin/activate.csh

Note: to find out which type of shell you're using, type:
ps -p $$

If you see bash or zsh, use the regular activate script.
If you see csh or tcsh, use the activate.csh script.

You should now see (test_env) at the beginning of your terminal prompt. This means you are in the test_env python virtual environment. You can now install python packages with the pip command, WITHOUT the --user flag, AS IF you had root privileges.
It's always a good idea to upgrade pip before installing python packages, so that you have the latest installer fixes and support for newer package formats. To do so, type:
pip install --upgrade pip

And then, install your favorite packages by typing:
pip install your_favorite_package

Such as:
pip install numpy

for numpy (numerical python) package.

IMPORTANT NOTE: in the future, whenever you want to install additional packages into a python virtual environment, first load the corresponding python module and THEN activate your virtual environment, in that EXACT order, BEFORE running any pip install command (i.e. do NOT activate your python environment before loading the right python module). For example, to install extra packages in the test_env python virtual environment, I need to type the following, in this exact order:
module load Python/3.10.8-GCCcore-12.2.0
source ~/test/test_env/bin/activate

BEFORE any pip install command.
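To confirm that the environment really is active before installing anything, you can compare sys.prefix with sys.base_prefix, which differ only inside a virtual environment. The snippet below is a small sketch under that assumption; the function name in_virtual_environment is illustrative.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the environment directory, while
    # sys.base_prefix still points at the Python installation it was created from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    if in_virtual_environment():
        print(f"virtual environment active: {sys.prefix}")
        print(f"created from: {sys.base_prefix}")
    else:
        print("no virtual environment active")
```

With test_env activated after loading the module, the first path should end in test_env and the second should point under /opt/oscer/software.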

4. Creating and Submitting a Python Batch Job

We understand that it is very tempting to run your python code directly in the terminal prompt after you have successfully set up your python virtual environment and installed your favorite python packages.

However, since we cannot ALWAYS check whether your python code is computationally intensive, we have to ALWAYS ASSUME THE WORST case scenario.

That is, if you run your python code directly on the terminal, it WILL BOG DOWN the login machines of OSCER, thereby preventing all other users from logging in.

Therefore, the ONLY proper way to run python code is through a batch job submission.

To set up a batch script for python, you simply need to load the Python module used for setting up your python virtual environment and activate your python virtual environment before calling the main python command. Below is the content of my batch script test_python.sbatch, designed to submit a simple python command from my test project on partition debug, requesting 1 CPU, 1GB of memory, for 10 minutes:
#!/bin/bash
#
#SBATCH --partition=debug
#SBATCH --output=python_%J_stdout.txt
#SBATCH --error=python_%J_stderr.txt
#SBATCH --ntasks=1
#SBATCH --mem=1G
#SBATCH --time=00:10:00

module load Python/3.10.8-GCCcore-12.2.0
source $HOME/test/test_env/bin/activate

python ~/test/test.py


And my test.py file has the following content, which prints the value of the $USER environment variable:
import os
print(os.getenv("USER"))
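
A slightly longer variant of test.py can also report which job it is running in. This is a sketch, not part of the original example: it assumes the standard Slurm environment variables SLURM_JOB_ID, SLURM_JOB_PARTITION and SLURM_NTASKS are set inside the job, and the function name job_summary is hypothetical.

```python
import os


def job_summary(environ=os.environ):
    """Collect the user name and a few Slurm job variables, if present."""
    names = ("USER", "SLURM_JOB_ID", "SLURM_JOB_PARTITION", "SLURM_NTASKS")
    # Variables that are missing (for example outside a batch job) are marked as such
    return {name: environ.get(name, "<not set>") for name in names}


if __name__ == "__main__":
    for name, value in job_summary().items():
        print(f"{name}={value}")
```

Printing these values at the top of a job makes it easier to match an output file to the submission that produced it.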


To submit test_python.sbatch, type:
sbatch test_python.sbatch

You should get something like:
Submitted batch job 16041355

That 16041355 number is the job_ID of your submitted batch job.
You can type:
watch squeue -j job_ID

such as:
watch squeue -j 16041355

to watch the status of your job. Once its status (the ST column) turns to R, it means it's running.

If you use the example batch script in this guide, the job's screen output would be stored in python_job_ID_stdout.txt (such as python_16041355_stdout.txt), while the job's error output would be stored in python_job_ID_stderr.txt (such as python_16041355_stderr.txt). You can follow the output of your job while it's running by typing:
tail -f python_job_ID_stdout.txt

such as:
tail -f python_16041355_stdout.txt
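
The relationship between the sbatch message, the job_ID and the output file names can be summarized in a few lines of Python. This is an illustrative sketch only: the helper names parse_job_id and output_files are hypothetical, and the file name pattern assumes the python_%J_stdout.txt and python_%J_stderr.txt settings from the example batch script.

```python
import re


def parse_job_id(sbatch_output):
    """Extract the job ID from a line like 'Submitted batch job 16041355'."""
    match = re.search(r"Submitted batch job (\d+)", sbatch_output)
    if match is None:
        raise ValueError(f"no job ID found in: {sbatch_output!r}")
    return match.group(1)


def output_files(job_id, prefix="python"):
    """Return the stdout and stderr file names used by the example batch script."""
    return f"{prefix}_{job_id}_stdout.txt", f"{prefix}_{job_id}_stderr.txt"


if __name__ == "__main__":
    job_id = parse_job_id("Submitted batch job 16041355")
    print(job_id)
    print(output_files(job_id))
```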


5. Removing PIP Cache

If you have installed a large number of packages through pip and have almost filled up your home storage, you can remove pip's cache files to free up some space. Simply type:
rm -rf ~/.cache/pip

This removes all of pip's cache files.
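
Before deleting anything, it can be useful to know how much space the cache actually occupies. The snippet below is a minimal sketch that adds up file sizes under a directory; the function name directory_size_bytes is hypothetical, and the ~/.cache/pip location is the one used in this guide.

```python
import os
from pathlib import Path


def directory_size_bytes(path):
    """Sum the sizes of all regular files below path (0 if it does not exist)."""
    root = Path(path).expanduser()
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            file_path = Path(dirpath) / name
            # Skip symbolic links so their targets are not counted
            if not file_path.is_symlink():
                total += file_path.stat().st_size
    return total


if __name__ == "__main__":
    size = directory_size_bytes("~/.cache/pip")
    print(f"pip cache: {size / 1024**2:.1f} MiB")
```

If the reported size is small, the cache is probably not what is filling your home storage.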