Date: Tuesday, January 24, 2023 (06:00-10:00)
Who is affected:
All users of the CS Department Beowulf high performance computing cluster,
known as ionic.
All users of the CS Staff-managed public login systems, including the
cycles, courselab, and armlab systems.
What is happening:
Ionic nodes will have NVIDIA, CUDA, and kernel drivers updated to fix
GPU-related failures. After the upgrade, machines will be rebooted.
Cycles, courselab, and armlab machines will be rebooted during this window
to clear defunct user processes that are interfering with research work.
Why is it happening:
Ionic nodes are experiencing various GPU-related failures. In an attempt
to fix them, we will be updating NVIDIA, CUDA, and kernel drivers.
Some user processes have entered a defunct state and are interfering with
research work. Clearing them requires a system reboot.
We will post updates to the status page as necessary.
If this downtime will cause you undue hardship, please contact us
immediately so we can discuss options to reduce any negative impact. Your
patience is appreciated.
CS Staff
downtime mailing list