!!! CAUTION !!! Please do NOT run python commands directly in the SSH terminal!
By doing so, you are running a potentially intensive computing task on the OSCER login machines. Intensive computing tasks have a tendency to crash the login machines, preventing all of the other 1000+ OSCER users from logging in! Please make sure to read the Python Basic Setup instructions to understand how to properly set up a Python environment and submit a Python batch job on the supercomputer.
If you have trouble following the instructions below, feel free to join the weekly OSCER Zoom help sessions.
Python PyTorch Setup WITHOUT Mamba/Conda
If you're doing deep learning research, PyTorch is a highly recommended, widely supported machine learning framework used in many modern AI products, including Stable Diffusion, OpenAI's ChatGPT, and Tesla's Autopilot, to name a few. There are many different ways to install PyTorch. In this guide, we show you how to install PyTorch in a Python virtual environment, WITHOUT using conda/miniconda/mamba. For the conda-based route, see our Mamba (conda) instructions.
If you haven't read the Python Basic Setup instructions, please do so first. The steps are the same as in the Python Basic Setup instructions, except:
- For the latest PyTorch built with CUDA 11.8, first install the wheel package:
pip install wheel
Then, following the official PyTorch installation instructions, type:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
PyTorch already comes bundled with the required CUDA libraries, so you typically do not need to load OSCER's CUDA or cuDNN modules, unlike with TensorFlow.
- Example of a torch batch script and Python code:
Below is the content of my batch script test_torch.sbatch, designed to submit a simple torch python command from my test project on partition debug, requesting 1 CPU, 1GB of memory, for 10 minutes:
#!/bin/bash
#
#SBATCH --partition=debug
#SBATCH --output=python_%J_stdout.txt
#SBATCH --error=python_%J_stderr.txt
#SBATCH --ntasks=1
#SBATCH --mem=1G
#SBATCH --time=00:10:00
module load Python/3.10.8-GCCcore-12.2.0
source $HOME/test/test_env/bin/activate
python ~/test/test_torch.py
And my test_torch.py file has the following content to print out the torch version:
import torch
print(torch.__version__)
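If you also want to confirm that the installed build carries its own CUDA libraries, a slightly extended test script could look like the sketch below. The function name torch_report is only illustrative and is not part of PyTorch or of OSCER's instructions. Keep in mind that a job that does not request a GPU will normally report that no GPU is visible, even when the CUDA build is installed correctly.

```python
# Hypothetical extension of test_torch.py; torch_report is an illustrative name.


def torch_report():
    """Return a dict describing the PyTorch install, or note that it is missing."""
    try:
        import torch
    except ImportError:
        # The virtual environment is not active or torch is not installed in it.
        return {"installed": False}

    return {
        "installed": True,
        "version": torch.__version__,
        # CUDA version the package was built against; None for CPU-only builds.
        "cuda_build": torch.version.cuda,
        # True only when a GPU is actually visible to this job.
        "gpu_visible": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    for key, value in torch_report().items():
        print(f"{key}: {value}")
```

As with any Python code on the supercomputer, run this through a batch job rather than directly on the login machine.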
Mamba/Conda PyTorch setup
If you insist on setting up your machine-learning/AI environment using mamba/conda, please first read our OSCER Mamba instructions. At the end of step 2, once you have created and activated your mamba environment, you can install additional packages, such as PyTorch.
In step 2, after you have activated your mamba environment, type:
mamba install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
This command is based on the official getting-started instructions on the PyTorch website. Again, there is no need to load OSCER's CUDA module to run PyTorch in your batch script, since the CUDA libraries are already bundled with mamba/conda's PyTorch.
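Whichever installation route you choose, a short smoke test submitted as a batch job can confirm that the environment works. The sketch below is a minimal example under stated assumptions: the names pick_device_name and smoke_test are hypothetical helpers, and whether a GPU is visible depends on the partition and resources your batch script requests. It falls back to the CPU when no GPU is available.

```python
# Hypothetical smoke test; pick_device_name and smoke_test are illustrative names.


def pick_device_name():
    """Return "cuda" if a GPU is visible, "cpu" otherwise, or None without torch."""
    try:
        import torch
    except ImportError:
        return None
    return "cuda" if torch.cuda.is_available() else "cpu"


def smoke_test():
    """Sum a small tensor of ones on the chosen device and return the result."""
    name = pick_device_name()
    if name is None:
        return None  # torch is not installed in the active environment
    import torch

    x = torch.ones(3, device=name)  # three ones placed on the GPU or the CPU
    return float(x.sum())


if __name__ == "__main__":
    print("device:", pick_device_name())
    print("sum of ones:", smoke_test())
```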