  • Addition to Troubleshooting

    Problem: A worker has status "draining" and jobs won't start anymore

    1. Restart the worker (either a software reboot or a physical power cycle)
    2. Execute: sudo scontrol update nodename=<worker_id> state=idle
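
Before resetting the state, it can help to check why Slurm drained the node in the first place. The following is a minimal sketch, not a definitive procedure: `worker01` is a placeholder node name, and the `DRY_RUN` switch and both function names are assumptions introduced for illustration only.

```shell
# Sketch only: "worker01" is a placeholder node name, and DRY_RUN is a
# hypothetical switch introduced for this example.

# Print the state and drain reason Slurm has recorded for one node.
show_node_reason() {
  scontrol show node "$1" | grep -E 'State=|Reason='
}

# Step 2 from the list above. With DRY_RUN=1 the command is only printed,
# so it can be reviewed before it is actually run.
set_idle() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "sudo scontrol update nodename=$1 state=idle"
  else
    sudo scontrol update "nodename=$1" state=idle
  fi
}

# Possible sequence on a machine with Slurm admin access:
#   sinfo -R                      # list drained/down nodes with their reasons
#   show_node_reason worker01     # inspect one node
#   DRY_RUN=1 set_idle worker01   # preview the command
#   set_idle worker01             # apply it
```

If the reported reason points to a real fault (for example a full disk), resolve that first; otherwise the node is likely to drain again.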
    Edited by leonsick
  • Problem: A worker has status "draining" with the reason "SlurmdSpoolDir is full".

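    1. SSH into the worker
    2. Execute: docker system prune -af (this deletes all stopped containers, unused networks, unused images and build cache without asking for confirmation)
    3. Check what is in /var/spool/slurmd and clean up what can be removed
    4. Set the worker back to idle: sudo scontrol update nodename=<worker_id> state=idle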
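
To see whether the cleanup actually freed space, the fill level of the spool filesystem can be checked before and after. This is a hedged sketch: the helper name `usage_pct` is made up for this example, and it assumes the filesystem name contains no spaces so that the fifth column of `df -P` is the capacity.

```shell
# Sketch only: usage_pct is a hypothetical helper, not part of Slurm or Docker.

# Print how full the filesystem holding the given path is, as a bare
# integer percentage (parsed from the POSIX output format of df).
usage_pct() {
  df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Possible sequence on the worker:
#   usage_pct /var/spool/slurmd                 # fill level before cleanup
#   docker system df                            # how much space Docker is using
#   sudo du -sh /var/spool/slurmd/* | sort -h   # largest spool entries last
#   docker system prune -af                     # step 2 from the list above
#   usage_pct /var/spool/slurmd                 # fill level after cleanup
```

Comparing the two percentages shows whether Docker data or leftover files under /var/spool/slurmd were the main cause.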
    Edited by leonsick