File System Issues

We seem to be having issues with the server that handles project file systems. We are investigating the problem.

Update 12:37pm: We are heading into the department to perform a reboot of the file server. This involves a shutdown and restart of most of the systems.

Update 5:40pm: Systems are now back up. Due to some temperamental hardware, the process did not go particularly smoothly.

Downtime: Monday, February 18, 2008

On Monday, February 18, 2008, we will have a scheduled downtime of the Linux public cycle servers during normal business hours.

Who is affected:

  • This downtime will impact users of the 32-bit Linux cycle servers (tux, opus, and willy) and the 64-bit Linux cycle servers (soak, wash, rinse, and spin).

What is happening:

  • The operating systems on these machines will be upgraded. Note that this upgrade will take place during normal hours. We will upgrade the machines in stages (one 32-bit and one 64-bit machine at a time) to minimize disruption. Users logged in will be given a 5-minute warning before each machine is brought down. We expect that each machine will be down for less than an hour.
  • Note that we will begin the upgrades at approximately 10:30am.

Special Note: because these upgrades involve a re-install of the operating system, crontab entries will be lost. If you have cron jobs that run on these machines, be sure to re-establish them after the upgrade.
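
One way to preserve cron jobs across the re-install is to save the crontab to a file beforehand and load it back afterwards. The following is a minimal sketch, not an official procedure. It relies on the standard `crontab -l` and `crontab FILE` commands. The function names and the backup file name are hypothetical, and it assumes the backup is written somewhere that survives the re-install, such as a home directory served by the file server.

```python
import subprocess
from pathlib import Path


def save_crontab(backup: Path, run=subprocess.run) -> bool:
    """Save the current user's crontab to `backup` (a hypothetical path).

    Returns False, and writes nothing, when `crontab -l` exits non-zero,
    which is what it does when the user has no crontab installed.
    """
    result = run(["crontab", "-l"], capture_output=True, text=True)
    if result.returncode != 0:
        return False
    backup.write_text(result.stdout)
    return True


def restore_crontab(backup: Path, run=subprocess.run) -> None:
    """Install the entries in `backup` as the current user's crontab."""
    # `crontab FILE` replaces the whole crontab with the contents of FILE.
    run(["crontab", str(backup)], check=True)
```

Before the upgrade, something like `save_crontab(Path.home() / "crontab-tux.bak")` could be run on each machine, followed by `restore_crontab(...)` with the same path once the machine is back. Crontabs are stored per machine, so a separate backup file per host is the safer choice.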

Why it is happening:

  • These upgrades address several critical security issues.

As of 2:09pm all updates have finished.

Downtime: Thursday, February 7, 2008

On Thursday, February 7, 2008, we will have a scheduled downtime from 4:00am to 8:00am EST.

Scheduled work includes:

  • Firmware updates to the disk array on our home directory file server
  • Upgrade virtual hosting web server to Apache 2
  • Operating system patches

Why it is happening:

  • Ordinarily, we would avoid having a downtime during the first week of classes; however, our file server vendor has classified a recent firmware update as "critical," meaning there is the potential for data loss without it.
  • While our systems are down, we plan to upgrade the virtual host web server (virtweb) that hosts URLs of the form http://[projectname].cs.princeton.edu to Apache 2. This will allow web pages to host files larger than 2G.

Downtime: Wednesday, January 30, 2008

On Wednesday, January 30, 2008, we will have a scheduled downtime of the Linux public cycle servers during normal business hours.

Who is affected:

  • This downtime will impact users of the 32-bit Linux cycle servers (tux, opus, and willy) and the 64-bit Linux cycle servers (soak, wash, rinse, and spin).

What is happening:

  • The operating systems on these machines will be upgraded to CentOS 5.1. Note that this upgrade will take place during normal hours. We will upgrade the machines in stages (one 32-bit and one 64-bit machine at a time) to minimize disruption. Users logged in will be given a 5-minute warning before each machine is brought down. We expect that each machine will be down for less than an hour.
  • Note that we will begin the upgrades at approximately 9:00am with tux and soak.

Special Note: because these upgrades involve a re-install of the operating system, crontab entries will be lost. If you have cron jobs that run on these machines, be sure to re-establish them after the upgrade.

Why it is happening:

  • These upgrades address several security issues. We are upgrading them now to avoid disruption once the Spring term begins.

Downtime: Thursday, January 17, 2008

On Thursday, January 17, 2008, we will have a scheduled downtime from 4:00am to 8:00am EST.

Scheduled work includes:

  • Upgrade the virtual host web server to PHP5 (specifically, version 5.2.5)
  • Install new LDAP server hardware and software
  • Update the SSH server software

Why it is happening:

  • The virtual host web server (virtweb) hosts websites that have URLs of the form http://[projectname].cs.princeton.edu. This server is currently running PHP4 (specifically, version 4.4.7). Because PHP4 reached its end-of-life on 12/31/07 and because there are users in the department who need PHP5, we are upgrading to version 5.2.5. For the most part, PHP code that runs on version 4 will run without change on version 5. We are contacting the owners of those project web pages that use PHP with additional information to ensure a smooth transition.
  • In the not-too-distant future, we will also transition the web server that handles user home pages to PHP5. As part of this downtime, we will create a virtual host (running PHP5) that will serve the content of user public_html directories. This host, http://web1.cs.princeton.edu/~username, will be reachable only from within the department's local network so that users can test their home pages under PHP5.
  • Note that user home pages will continue to be available (under PHP4) both inside and outside the department as http://www.cs.princeton.edu/~username. We have no plans to change this convention.

Further Emergency Downtime & Update

In response to the file server problem we have been working on since Tuesday, a vendor field engineer will be coming out tonight to help troubleshoot and diagnose the problem. He is expected to arrive around 20:00 (8:00 PM) tonight, Thursday night.

It is very likely that downtime of the server will be required while the engineer is here. This will mean that all services provided by CS Staff, including public cycle servers, clusters, email service, web service, DNS, etc., will be shut down, as they all depend on the file server. Please save any data you are working on regularly to protect against data loss when services are shut down.

We appreciate your patience throughout these last few days, and apologize for any inconvenience.

Update 2007/07/13 @ 03:20: After more than 7 grueling hours of emergency downtime and troubleshooting, the network and systems are again up and running, and initial signs look good. While we hesitate to declare victory, we ask that you please report any instability you notice, with as much detail as possible about what you were doing and what failed.

We thank you again for your patience, especially those of you working toward deadlines. If we have indeed licked this issue, look forward to some exciting announcements in the coming weeks.

Emergency Maintenance

Our systems are still not playing nice with each other after the installation of a new file server. To get everything in a known state, we must initiate a full shutdown/startup of the equipment in 218 at 8:45am this morning. We don't necessarily expect this to fix everything, but it will eliminate many variables. Thank you for your patience.

Update @ 13:45: We are continuing to experience unstable NFS performance, especially on the public Linux servers. The public Solaris machines (shades), while also affected, appear to be more usable under these conditions. We are working with vendor support to isolate the issue or issues. Further updates will appear on this site as they become available.

Update 7/12 @ 12:15: Problems continue. We are working with our vendor to determine if this is a software or hardware issue.
