====== Slurm ======

  
==== Discord ====
There is a dedicated text channel ''%%#slurm%%'' on the UChicago CS Discord server. Please note that this Discord server is //only// for UChicago-affiliated users. You can find a link to our Discord server on the [[start|front page]] of this wiki.
  
===== Clusters =====
We have a couple of different clusters. If you don't know where to start, use the ''%%Peanut%%'' cluster. The ''%%AI Cluster%%'' is for GPU jobs and more advanced users.
  
  * [[slurm:peanut|Peanut Cluster]]
  
==== Peanut Cluster ====
Think of these machines as a dumping ground for discrete computing tasks that might be rude or disruptive to execute on the main (shared) shell servers (i.e., ''focal0'', ''focal1'', ..., ''focal7'').
  
Additionally, this cluster is used for courses that require it.
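
Not sure what a cluster offers? Once you are logged in, Slurm's ''sinfo'' command lists the partitions and nodes you can submit to (the exact names in the output depend on which cluster you are on):

  sinfo          # list partitions, node states, and time limits
  sinfo --long   # same, with more detail per partition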
===== Where to begin =====
  
Slurm is a set of command line utilities that can be accessed from **most** any computer science system you can log in to. Using our main shell servers (''linux.cs.uchicago.edu'') is expected to be our most common use case, so you should start there.
  
  ssh user@linux.cs.uchicago.edu
  
If you want to use the AI Cluster you will need to have previously requested access by sending in a ticket. Afterwards, you may log in to:
  
  ssh user@fe.ai.cs.uchicago.edu
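
If you just want to confirm that your account can submit jobs, a minimal batch script like the sketch below is enough (the file name, job name, and resource values are only placeholders; adjust them for the cluster you are on):

<code>
#!/bin/bash
#
# hello.sbatch -- a throwaway test job
#SBATCH --job-name=hello
#SBATCH --output=%j.stdout   # %j expands to the job id
#SBATCH --time=00:05:00      # wall clock limit
#SBATCH --mem=500M
#SBATCH --ntasks=1

hostname
date
</code>

Submit it with ''sbatch hello.sbatch'' and watch it with ''squeue -u $USER''; the output lands in ''<jobid>.stdout'' in the directory you submitted from.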
Please make sure you specify $CUDA_HOME, and if you want to take advantage of the cuDNN libraries you will need to append /usr/local/cuda-x.x/lib64 to the $LD_LIBRARY_PATH environment variable.
  
  cuda_version=11.1
  export CUDA_HOME=/usr/local/cuda-${cuda_version}
  export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CUDA_HOME/lib64
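
A quick sanity check that these paths point at a real installation (and that cuDNN is present) is to list the directory and ask the CUDA compiler for its version; what gets printed is simply whatever is installed on that node:

  ls $CUDA_HOME/lib64 | grep -i cudnn   # cuDNN libraries, if installed
  $CUDA_HOME/bin/nvcc --version         # reports the CUDA toolkit version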
The variable name is actually misleading, since it does NOT mean the number of devices, but rather the physical device number assigned by the kernel (e.g. /dev/nvidia2).
  
For example: if you requested multiple GPUs from Slurm (--gres=gpu:2), the CUDA_VISIBLE_DEVICES variable should contain two numbers (0-3 in this case) separated by a comma (e.g. 0,1).

The numbering is relative and specific to you. For example: two users, each running a job that requires two GPUs, could be assigned non-sequential GPU numbers on the node. However, CUDA_VISIBLE_DEVICES will look like this for both users: 0,1
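
An easy way to see what your own job was given is to echo the variable from inside an allocation (the --gres=gpu:2 request below is just an example):

  srun --gres=gpu:2 bash -c 'echo $CUDA_VISIBLE_DEVICES; nvidia-smi -L'

''echo'' prints the device numbers Slurm exported for your job, while ''nvidia-smi -L'' lists the GPUs the driver reports; exactly which GPUs ''nvidia-smi'' shows depends on how the cluster constrains devices.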
  
  
STDOUT will look something like this:
<code>
cnetid@focal0:~$ cat $HOME/slurm/slurm_out/12567.gpu1.stdout
Device Number: 0
  Device name: Tesla M2090
