The GPU hosts (or nodes) each contain one of the following GPU types, organised by partition:
| Partition name (Taught/Research) | GPU | CPU |
|---|---|---|
| a40 | Tesla A40 48GB | AMD Epyc |
| a30 | Tesla A30 24GB | AMD Epyc |
| t4 | Tesla T4 16GB | Intel |
| a100 | Tesla A100 80GB | AMD Epyc |
| a16 | Tesla A16 16GB | AMD Epyc |
| training | Experimental, 4 additional GPUs | AMD Epyc |
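To check which GPU resources each partition currently advertises, `sinfo` can report the configured GRES per partition. A minimal sketch using standard Slurm output formatting:

```bash
# Show each partition with its GRES (GPU types/counts), nodes, and availability.
sinfo -o "%P %G %N %a"
```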
For example, to target a T4 GPU (taught students):
```bash
sbatch --partition t4 /path/to/script.sh
```
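A fuller job-script sketch for the same partition is shown below; the `--gres=gpu:1` request, the time limit, and the `train.py` workload are assumptions to adapt to your own job:

```bash
#!/bin/bash
#SBATCH --partition=t4        # target the Tesla T4 16GB GPUs
#SBATCH --gres=gpu:1          # request one GPU (assumed GRES name)
#SBATCH --time=01:00:00       # wall-clock limit for the job
#SBATCH --job-name=t4-example

# Confirm which GPU was allocated, then run the workload.
nvidia-smi --query-gpu=name,memory.total --format=csv
python train.py               # hypothetical training script
```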
If you request a specific GPU type, e.g. the 48GB GPUs, expect to wait longer; use `squeue --me --start` for an estimated start time. Consider whether your job really needs 48GB, or whether 24GB or 16GB will suffice.
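One way to decide is to log GPU memory use during a representative run. The sketch below samples `nvidia-smi` in the background; the 30-second interval, log file name, and `train.py` workload are illustrative:

```bash
# Sample GPU memory usage every 30 seconds while the workload runs.
nvidia-smi --query-gpu=timestamp,memory.used,memory.total \
           --format=csv -l 30 > gpu_mem_usage.csv &
MONITOR_PID=$!

python train.py               # hypothetical workload

kill "$MONITOR_PID"           # stop sampling once the workload finishes
# If peak memory.used stays well below 24GB, the a30 or t4 partitions
# should suffice and usually queue faster than the 48GB/80GB GPUs.
```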
Experimental: GPU 'shards' - run a job on a GPU in the training partition even while another shard job is running on the same GPU:
```bash
sbatch --partition training --gres=shard:1 /path/to/script.sh
```
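As a sketch, a complete shard job script could look like the following; the shard count of 1 and the lightweight workload are assumptions, and since shard jobs share a physical GPU the memory footprint should stay modest:

```bash
#!/bin/bash
#SBATCH --partition=training  # experimental partition with GPU sharding
#SBATCH --gres=shard:1        # one shard of a GPU, shared with other shard jobs
#SBATCH --time=00:30:00

nvidia-smi                    # show the shared GPU this shard landed on
python small_inference.py     # hypothetical lightweight workload
```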