2. Data Storage
Network-accessible storage such as /vol/bitbucket or your home directory /homes/username is vital for getting your scripts running on remote GPU cluster nodes. On your own laptop or desktop you would store files on locally attached storage (internal SSD, external USB, etc.), but for the GPU cluster you must copy all necessary files to shared network storage so that your scripts can access them regardless of which GPU node they execute on.
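For example, you might stage a dataset from your own machine onto shared storage with rsync over SSH. This is a minimal sketch: the local path, the login host written here as 'shellhost', and the destination folder are placeholders to adapt to your own setup.

# Copy a local dataset to your /vol/bitbucket folder over SSH
# ('shellhost' is a placeholder for a departmental login host)
rsync -avz ~/datasets/my-project/ username@shellhost:/vol/bitbucket/username/my-project/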
Experimental CephFS storage (self-deleting, optional)
There is a new way to create folders on shared network storage: HPC workspaces. These folders are deleted automatically after 365 days (one year), but data can be retrieved for a short period after a folder is deleted. A 50GB quota is in place.
The workspace tools are only available on the GPU cluster head nodes (e.g. gpucluster2 and gpucluster3).
Create a workspace:
ws_allocate projectname 365
This will create /vol/gpudata/username-projectname. You may create more than one workspace, but an excessive number of folders will be noticeable to all!
List your spaces:
ws_list
Delete a workspace (please tidy up):
ws_release name-of-space
List all workspaces available for restoration:
ws_restore -l
Restore a deleted workspace:
ws_restore -n name -t new-folder-name
IMPORTANT: you cannot create folders with 'mkdir /vol/gpudata/folder'; ws_allocate MUST be used on gpucluster2/3 in order to allocate a folder with quota.
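Putting the commands above together, a typical workspace lifecycle might look like the following. This is a sketch only: 'myproject', the username, and the staged directory are placeholders, and the exact output of the tools may differ.

# On a head node such as gpucluster2:
ws_allocate myproject 365                                # creates /vol/gpudata/username-myproject
ws_list                                                  # confirm the workspace and its expiry
cp -r ~/training-data /vol/gpudata/username-myproject/   # stage your files into the workspace
# ... run your jobs against the workspace ...
ws_release myproject                                     # tidy up once the project is finished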
/vol/bitbucket (legacy)
There is a department-wide network share /vol/bitbucket for data and virtual environment storage. Create your personal folder as follows:
mkdir -p /vol/bitbucket/${USER}
Read the detailed Python Virtual Environments guide for best practice in using /vol/bitbucket and creating virtual environments.
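As a brief illustration of the kind of setup that guide covers, you might create a virtual environment under your /vol/bitbucket folder as below. This is a sketch under assumed defaults; the environment name 'myenv' and the installed package are placeholders, and the guide itself remains the authoritative reference.

python3 -m venv /vol/bitbucket/${USER}/myenv      # create the environment on shared storage
source /vol/bitbucket/${USER}/myenv/bin/activate  # activate it in the current shell
pip install torch                                 # install packages into the shared environment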
Please note: /vol/bitbucket is especially susceptible to slowdown when many users connect over the NFS protocol. Keep this in mind when approaching deadlines.