Slurm is very easy to install. The installation took only a few minutes even without reading a tutorial.

  1. install all the needed packages
    • aptitude install slurm-llnl
  2. run the provided configurator and copy the generated slurm.conf to /etc/slurm-llnl
  3. check the config file
    • scontrol show daemons
  4. copy slurm.conf to each node
  5. create a munge key
    • /usr/sbin/create-munge-key
  6. copy /etc/munge/munge.key to each node
  7. start munge (Slurm uses it for authentication, so start it first)
    • /etc/init.d/munge start
  8. start the slurm services
    • /etc/init.d/slurm-llnl start
  9. test it
    • srun --ntasks=12 --partition=dungeon --label /bin/hostname
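
Steps 4 and 6 are the only ones that touch every node, so they are worth scripting. The following is a minimal sketch, not a tested deployment tool. It assumes root SSH access to the nodes; the node names in NODES and the DRY_RUN switch are hypothetical and exist only for this example. The chown/chmod line assumes munge runs as the "munge" user, which expects to own its key file.

```shell
#!/bin/sh
# Sketch: push slurm.conf and the munge key to every compute node.
# NODES is a hypothetical host list; replace it with your own node names.
NODES="${NODES:-node01 node02}"
# DRY_RUN=1 (the default here) only prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

distribute() {
    for node in $NODES; do
        # step 4: same slurm.conf on every node
        run scp /etc/slurm-llnl/slurm.conf "root@$node:/etc/slurm-llnl/slurm.conf"
        # step 6: same munge key on every node
        run scp /etc/munge/munge.key "root@$node:/etc/munge/munge.key"
        # assumption: the key must belong to the munge user and be private
        run ssh "root@$node" "chown munge:munge /etc/munge/munge.key && chmod 400 /etc/munge/munge.key"
    done
}

distribute
```

Run it once as is to review the printed commands, then again with DRY_RUN=0 to actually copy the files.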

When a node is in the down state, try: scontrol update NodeName=$node State=RESUME
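
If several nodes are down at once, the same command can be looped over whatever sinfo reports. This is a rough sketch under a few assumptions: the resume_nodes helper and the DRY_RUN switch are made up for this example, and it resumes every down node without asking why it went down.

```shell
#!/bin/sh
# Sketch: return every node that sinfo reports as down to service.
# DRY_RUN=1 (the default here) only prints the scontrol commands.
DRY_RUN="${DRY_RUN:-1}"

# Reads node names from stdin, one per line.
resume_nodes() {
    while read -r node; do
        [ -n "$node" ] || continue
        if [ "$DRY_RUN" = "1" ]; then
            echo "scontrol update NodeName=$node State=RESUME"
        else
            scontrol update NodeName="$node" State=RESUME
        fi
    done
}

# sinfo: -h drops the header, -N prints one line per node,
# -t down filters by state, -o '%N' prints only the node name.
# sort -u removes duplicates for nodes that sit in several partitions.
if command -v sinfo >/dev/null 2>&1; then
    sinfo -h -N -t down -o '%N' | sort -u | resume_nodes
fi
```

It is worth running sinfo -R first to see the reason recorded for each down node; a node that went down because of a real hardware or configuration problem will usually just drop out again after a resume.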