Taki Users:
Many jobs are currently stuck in the pending ("PD") state in Slurm. The reason Slurm reports is "launch failed requeued held", meaning the job failed to launch and was put back in the queue on hold. This is Slurm's way of telling us that something is wrong with the node configuration.
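If you want to check whether any of your own jobs are affected, the following is a minimal sketch, not an official tool. It assumes that `squeue` is on your path and that its `%i`, `%T` and `%r` format fields print the job ID, the long state name and the reason. The helper names `held_jobs` and `my_held_jobs` and the `|` separator are illustrative choices, not part of Slurm.

```python
import getpass
import subprocess

# Reason string quoted in this notice; squeue may wrap it in parentheses.
HELD_REASON = "launch failed requeued held"


def held_jobs(squeue_output):
    """Return IDs of pending jobs whose reason matches HELD_REASON.

    Expects one job per line in the form "jobid|STATE|reason".
    """
    job_ids = []
    for line in squeue_output.splitlines():
        parts = line.strip().split("|")
        if len(parts) != 3:
            continue  # skip blank or unexpected lines
        job_id, state, reason = (p.strip() for p in parts)
        if state == "PENDING" and reason.strip("()").lower() == HELD_REASON:
            job_ids.append(job_id)
    return job_ids


def my_held_jobs():
    """Run squeue for the current user and return the affected job IDs."""
    result = subprocess.run(
        ["squeue", "-h", "-u", getpass.getuser(), "-o", "%i|%T|%r"],
        capture_output=True,
        text=True,
        check=True,
    )
    return held_jobs(result.stdout)
```

Calling `my_held_jobs()` on a login node should return a list of job IDs, which is empty if none of your jobs are in this state.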
We are investigating the possible causes and have also opened high-priority tickets with slurm-support. We aim to resolve this issue as soon as possible.
Thank you for your patience.