1c. Quick Start (interactive shell using 'salloc')

This 'interactive' method lets you work as if you were at a terminal prompt on a lab PC with a GPU (for a maximum of three days).

Connect to gpucluster2 or gpucluster3 from a lab PC (or externally from your own device), then use 'salloc' to request your CPU, RAM and GPU resources.

ssh gpucluster2.doc.ic.ac.uk

salloc --gres=gpu:1

Typical output:

# salloc: Granted job allocation 18816
# salloc: Nodes parrot are ready for job

The command drops you straight into a shell on your allocated node, as indicated by the shell prompt, e.g. myaccount@parrot. Run 'nvidia-smi' to confirm your allocated GPU is visible. You can now start debugging and running code with an NVIDIA GPU.
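If the default allocation is too small, salloc accepts the usual Slurm resource flags. A minimal sketch (the values below are examples only; adjust them to your needs and to the cluster's limits):

salloc --gres=gpu:1 --cpus-per-task=4 --mem=16G --time=08:00:00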

If you prefer to connect to your node later, for example from VSCode, add '--no-shell':

salloc --gres=gpu:1 --no-shell

Make a note of the allocated node, or run 'squeue --me' to list your existing jobs. See the Shell server guide for how to use shell1-5 or gpucluster2/3 as jumphosts to connect directly to your node from outside the college network (see the example below).
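For example, assuming your job landed on parrot, a jumphost connection from your own device might look like this (substitute your own username and allocated node):

ssh -J username@gpucluster2.doc.ic.ac.uk username@parrot.doc.ic.ac.uk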

Reminder: exit or cancel interactive jobs when you have finished so that other users can use the GPUs; too many idle interactive jobs reduce the efficiency and effectiveness of the service.
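Typing 'exit' in the interactive shell releases the allocation. For jobs started with '--no-shell' (or any job you no longer need), cancel by job ID, e.g. using the allocation number from the example output above:

scancel 18816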

Connect to your running job via ssh

You can ssh directly to the node hosting your GPU as long as your job is running in the queue. Type the following command either on a head node (gpucluster2 or gpucluster3) or in your running session (e.g. on parrot):

squeue --me

Example output:

username@cloud-vm-40-244:~$ squeue  --me

JOBID   PARTITION  NAME      USER      ST  TIME  NODES  NODELIST(REASON)
109287  a16        sys/dash  username  R   0:31  1      parrot

Make a note of your node from the NODELIST column (note that cloud-vm-40-244 is an alias of gpucluster2, one of the head nodes). In this case the user would type:

ssh username@parrot.doc.ic.ac.uk

You can also connect directly from IDEs such as VSCode: remember to run salloc first and find your node name. If reconnecting externally, ssh to gpucluster2/3 first and then ssh to your allocated node (or connect directly from a lab PC or over the VPN); see the sketch below for a jumphost shortcut.
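A sketch of a ~/.ssh/config entry on your own device that lets VSCode Remote-SSH (or plain ssh) reach the node through gpucluster2; the hostnames and username are examples, adjust them to match your allocation:

Host parrot
    HostName parrot.doc.ic.ac.uk
    User username
    ProxyJump username@gpucluster2.doc.ic.ac.uk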

Open OnDemand

Open OnDemand is an HPC web frontend for Slurm (and other queuing systems). Access is from a college PC, or externally via Zscaler.

https://gpucluster1.doc.ic.ac.uk

https://gpucluster4.doc.ic.ac.uk

https://cpucluster1.doc.ic.ac.uk (a CPU cluster for non-GPU jobs)

Please explore the interface, discuss on Edstem, and search online for help (many institutions use this tool). There is NO email support.
