News

Recent News

Posted on: Details
01/17/2025 14:46 EST SVC datacenter maintenance beginning 01/27/2025


Research Computing has been notified of planned facilities maintenance for the electrical systems within the SVC data center.

The planned work is required maintenance for the continued operation of the power backup systems in the SVC data center. During this scheduled work, available power will be reduced while circuits are bypassed for the maintenance on each system. This will affect only compute nodes in the following CIRCE partitions that are housed in SVC:

circe, amd_2021, cool2022, himem, mri2016, and hii02

Beginning at 5pm on Jan. 27, new job submissions to the above queues will be suspended. Any running jobs that remain will be stopped by 8am on Jan. 28. Users should still be able to log in to CIRCE to monitor and submit new jobs to partitions that remain operational. Any new jobs submitted to the above partitions will not start running until the maintenance has been completed. Currently, no downtime is planned for RRA, the Student Cluster, license servers, or filesystems.

The contractor conducting the work plans to complete the project as quickly as possible. The electrical maintenance is scheduled to be completed by Friday morning, Jan. 31 at 9am. There is a possibility that it may be finished sooner; however, please plan for the full downtime of the partitions until the close of business (CoB) on Friday. Once Research Computing has received the all-clear that work has been completed, the affected partitions will be powered on and resumed. While we do not anticipate a full loss of power to the data center during this time, there is always a risk, and extended downtime may be necessary if additional facilities maintenance is required. Please be aware that any unscheduled maintenance may result in additional CIRCE and other RC resources becoming unavailable.

In summary: The above listed CIRCE partitions will be unavailable starting at 5pm (EST) on Monday, Jan. 27 for facilities maintenance related to building electrical infrastructure in SVC. Service is anticipated to be restored by 9am (EST) on Jan. 31.
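For users who script their job submissions, the window summarized above can be expressed as a small check. This is only a sketch under stated assumptions: the function name `job_may_start` is hypothetical, the partition list and times are copied from this announcement, and the end time is the anticipated restoration time, which may change.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC-5 offset for EST, matching the times quoted in this announcement.
EST = timezone(timedelta(hours=-5), "EST")

# Partitions named in this announcement as affected by the SVC maintenance.
AFFECTED_PARTITIONS = {"circe", "amd_2021", "cool2022", "himem", "mri2016", "hii02"}

# Announced window: 5pm Jan. 27 through the anticipated 9am Jan. 31 restoration.
WINDOW_START = datetime(2025, 1, 27, 17, 0, tzinfo=EST)
WINDOW_END = datetime(2025, 1, 31, 9, 0, tzinfo=EST)


def job_may_start(partition: str, when: datetime) -> bool:
    """Return False if `partition` is affected and `when` is inside the window.

    Hypothetical helper for illustration; it only encodes the announced
    schedule and does not query the cluster.
    """
    if partition not in AFFECTED_PARTITIONS:
        return True
    return not (WINDOW_START <= when < WINDOW_END)
```

A script could call `job_may_start("himem", datetime.now(EST))` before submitting and choose another partition, or wait, when the result is False.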

If you have any questions about this maintenance window, please contact Research Computing at rc-help@usf.edu.

10/11/2024 10:25 EDT Update 2: Hurricane Milton


Quick update. Hope that you all are doing well and made it through the storm without too many problems. While crews assess and repair the damage on campus, Research Computing remains in emergency operations mode. Generators and backup chillers did kick in during the storm, and the filesystems and essential system servers were able to stay on.

For the time being, critical infrastructure including the login nodes, storage systems, web servers, and CIFS servers will continue to operate and will be accessible.

Currently, we are unable to provide an exact estimate for when the compute nodes can be powered back on. We will begin the process as soon as we are given the all-clear from Emergency Management that it is safe to return to campus and resume normal operations. However, it may be another day or two before this happens.

Should any change in this decision be made, we will make our best efforts to keep this Research Computing mailing list and our News page (https://wiki.rc.usf.edu/index.php/News) as up-to-date as possible.

Please contact rc-help@usf.edu with any questions or concerns.

10/08/2024 11:35 EDT Update 1: Hurricane Milton


In response to the current University guidance and based upon current tracking data with potential effects of the storm on the Tampa Bay area, Research Computing has entered emergency operations mode. RC staff will begin suspending queues on the CIRCE, RRA, and Student clusters this afternoon. This action will reduce heat within the data centers, in addition to conserving power draw in the event that emergency generators are needed. Currently running jobs may be able to finish, but all jobs will be stopped so that compute nodes can be powered down by 12 noon on Wednesday.

For the time being, critical infrastructure including the CIRCE and RRA login nodes, storage systems, and CIFS servers will continue to operate and will be accessible. Depending on the effects of the storm, these assets may need to be powered down as well.

Should any change in this decision be made, we will make our best efforts to keep this Research Computing mailing list and our News page (https://wiki.rc.usf.edu/index.php/News) as up-to-date as possible.

Please contact rc-help@usf.edu with any questions or concerns.

10/07/2024 09:48 EDT Hurricane Milton


Research Computing staff are closely monitoring the progress of Hurricane Milton. At this time, all systems are functional and remain under normal operations.

While no outages are scheduled at this time, please plan for the likely event that we may see some impact to operations of the Research Computing clusters depending on the path of the storm. The National Hurricane Center has issued a hurricane watch for the Tampa Bay area, and the storm could bring heavy rains and wind by midweek. In the event of disruption to power or cooling in the data center, we may need to execute a shutdown of systems in the following order:

1.) A shutdown of all computational systems.
2.) A shutdown of all login nodes.
3.) A shutdown of all storage systems.

We will post updates as needed and when possible. Please contact rc-help@usf.edu with any questions regarding Research Computing services.

09/27/2024 12:18 EDT Final Update: Hurricane Helene


Research Computing has been given the all-clear notice from USF facilities that the campus infrastructure is ready for the compute nodes to be returned to service.

As of 11:50 am today, all Research Computing computational resources are now back online. Users may log in and submit jobs to the CIRCE, RRA, or SC clusters as normal.

If you experience any issues, please contact the Research Computing Help Desk at rc-help@usf.edu.

09/25/2024 12:16 EDT Update 1: Hurricane Helene


In response to the current University guidance and based upon current tracking data with potential effects of the storm on the Tampa Bay area, Research Computing has entered emergency operations mode. RC staff will be partially suspending queues on the CIRCE and RRA clusters later this afternoon. This action will reduce heat within the data centers, in addition to conserving power draw in the event that emergency generators are needed. Job submissions will still be accepted and queued, but wait times may increase until full operations resume. A full shutdown is NOT expected at this time.

Currently, critical infrastructure including the CIRCE and RRA login nodes, storage systems, and CIFS servers will continue to operate and will be accessible. Depending on the severity of the storm effects and advisement from campus facilities operations, cluster assets may need to be powered down during the day on Thursday.

Should any change in this decision be made, we will make our best efforts to keep this Research Computing mailing list and our News page (https://wiki.rc.usf.edu/index.php/News) as up-to-date as possible.

Please contact rc-help@usf.edu with any questions or concerns.

09/24/2024 11:41 EDT Potential Tropical Storm Helene


Research Computing staff are closely monitoring the development of Potential Tropical Cyclone Nine. At this time, all systems are functional and remain under normal operations.

While no outages are planned at this time, we may see some impact to operations of the Research Computing clusters depending on the path of the storm. The National Hurricane Center has issued a hurricane watch for the Tampa Bay area, and the storm could bring heavy rains and wind toward the end of this week. In the event of disruption to power or cooling in the data center, we may need to execute a shutdown of systems in the following order:

1.) A shutdown of all computational systems.
2.) A shutdown of all login nodes.
3.) A shutdown of all storage systems.

We will post updates as needed and when possible. Please contact rc-help@usf.edu with any questions regarding Research Computing services.

08/05/2024 15:07 EDT Reminder 2: Network maintenance beginning August 6th, 2024


This is a reminder for the service outage related to the network maintenance beginning tomorrow (Aug 6) at 8am (EDT), and extending until Wednesday morning (Aug 7). Please plan for the possibility that services may not be fully restored until later in the week, in the event that unexpected complications arise.

Additional details can be found in the previous email below, or in the "News" section of the RC wiki site https://wiki.rc.usf.edu/index.php/News

If you have any questions about this maintenance window, please contact Research Computing at rc-help@usf.edu.

07/31/2024 12:15 EDT Reminder 1: Network maintenance beginning August 6th, 2024


This is a reminder for the service outage related to the network maintenance beginning next Tuesday (Aug 6) at 8am (EDT), and extending until Wednesday morning (Aug 7). Please plan for the possibility that services may not be fully restored until later in the week, in the event that unexpected complications arise.

During the scheduled work, the network switches, login nodes, and several attached systems will be offline for the duration. This will affect access to ALL resources attached to the CIRCE, RRA, and student clusters, as well as remote file systems and RC hosted web sites. Any jobs running after the maintenance starts will be stopped. Jobs will need to be resubmitted after the resources are back online.

Additional details can be found in the previous email below, or in the "News" section of the RC wiki site https://wiki.rc.usf.edu/index.php/News

If you have any questions about this maintenance window, please contact Research Computing at rc-help@usf.edu.

07/17/2024 12:49 EDT Network maintenance beginning August 6th, 2024


The Ethernet networking hardware used by several Research Computing systems will be replaced and/or upgraded on Tuesday, Aug 6th.

During the scheduled work, the network switches, login nodes, and several attached systems will be offline for the duration. This will affect access to ALL resources attached to the CIRCE, RRA, and student clusters, as well as remote file systems and RC hosted web sites. During the network outage, all CIRCE and RRA partitions will be unavailable. Beginning at 8am (EDT) on Tuesday, Aug 6th, all jobs will be stopped and no further access to any partition, login node, or filesystem will be available until after the new network gear has been installed and put into service.

This outage should NOT affect the license servers.

We are currently projecting a return to services by 9am (EDT) on Wednesday, Aug. 7th. However, please plan for the possibility that services may not be fully restored until later in the week.

In summary: All CIRCE, RRA, and student cluster partitions will be offline starting at 8am (EDT) on Tuesday Aug. 6 for network upgrades and maintenance. All RC hosted file systems and web sites will also be affected. Service is anticipated to be restored by 9am (EDT) on Aug. 7, but may extend until later in that week.

If you have any questions about this maintenance window, please contact Research Computing at rc-help@usf.edu.

06/03/2024 15:35 EDT PGS storage returned to service


The PGS storage system has been returned to service. Network mounts via cifs-pgs can be resumed.

The PGS filesystem had been powered down during the planned electrical maintenance that started on May 31. When the system was powered back up, the array exhibited cascading drive failures. RC staff have been able to replace the faulty systems and rebuild the filesystem structure. Any directories that were hosted on PGS that had backups have been restored from the most recent snapshot taken before the May 31st electrical maintenance. For most directories, the backup would have been from approximately 6am EDT on Friday May 31. Files placed in PGS mounted directories after the 6am May 31st backup will need to be re-transferred or recreated. If any older files are missing, they may still exist in an older daily incremental. Please contact rc-help@usf.edu if you have any questions.

06/05/2024 16:08 EDT PGS storage currently down


The PGS filesystem is currently unavailable due to issues with the filesystem related to last weekend's power maintenance in MDC. RC is working to fix the issue. Additional updates will follow when more information is available.

If you have any questions about this maintenance window, please contact Research Computing at rc-help@usf.edu.

06/04/2024 15:53 EDT Final Update: MDC datacenter maintenance beginning 05/31/2024


This is the final update regarding the May 31 data center maintenance. The electrical work in the MDC building has been completed. Login access for the RRA and SC clusters is now available. All partitions and file systems are available on the CIRCE cluster.

All queues across the 3 clusters are operational. A few nodes remain offline due to various hardware issues, but RC staff will continue working with these nodes to repair or replace them. Job submissions can resume as normal at this time.

No further updates will be mailed regarding this maintenance window. Additional details can be found in the previous email below, or in the "News" section of the RC wiki site https://wiki.rc.usf.edu/index.php/News

If you have any questions about this maintenance window, please contact Research Computing at rc-help@usf.edu.

05/23/2024 15:01 EDT Reminder 1: MDC datacenter maintenance beginning 05/31/2024


This is a reminder of the service outage related to the MDC Data Center facilities maintenance beginning next Friday (May 31), and extending until at least Wednesday evening (June 5). The CIRCE partitions listed below, as well as the RRA cluster, Student Cluster, BGFS filesystem, and PGS filesystem, will be unavailable.

  • amd_2021
  • amdwoods_2022
  • bfbsm_2019
  • cbcs
  • charbonnier_2022
  • chbme_2018
  • cms_ocg
  • margres_2020
  • muma_2021
  • qcg_gayles_2022
  • simmons_itn18
  • snsm_itn19


This outage will NOT affect the main circe and himem partitions, or the services hosting websites and license servers. Users will still be able to log in to the CIRCE cluster to access files and submit jobs. Jobs submitted to any of the offline partitions will remain in a pending state until the resources are back online.

Additional details can be found in the previous email below, or in the "News" section of the RC wiki site https://wiki.rc.usf.edu/index.php/News

If you have any questions about this maintenance window, please contact Research Computing at rc-help@usf.edu.

05/06/2024 13:43 EDT MDC datacenter maintenance beginning 05/31/2024


Research Computing has been notified of planned building facilities maintenance within the MDC data center. During the maintenance period, the CIRCE partitions in SVC will remain online. Please continue to read the full message below for details about which services will be offline, and which will remain available.

The planned work consists of several upgrades to the electrical infrastructure in the MDC building. Similar work was previously done in the SVC datacenter. During this scheduled work, all power to servers and filesystems within the MDC data center will be offline for the duration. This will affect several Research Computing services. The partitions listed below will be unavailable for the duration. In addition, all RRA and Student Cluster resources will be offline, as well as the BGFS and PGS filesystems.

  • amd_2021
  • amdwoods_2022
  • bfbsm_2019
  • cbcs
  • charbonnier_2022
  • chbme_2018
  • cms_ocg
  • margres_2020
  • muma_2021
  • qcg_gayles_2022
  • simmons_itn18
  • snsm_itn19


Beginning at 5pm on May 31, jobs on the above listed partitions will be stopped, and the partitions will not be available to run further jobs until after power to the building has been restored.

This outage will NOT affect the main circe and himem partitions, or the services hosting websites and license servers. Users will still be able to log in to the CIRCE cluster to access files and submit jobs. However, jobs submitted to any of the offline partitions will remain in a pending state until the resources are back online.

The building maintenance is scheduled to be complete by Tuesday morning, June 4th. Once Research Computing has received the all-clear that work has been completed, the process of rebooting all systems will begin. This process normally takes several hours, and while we do not anticipate any additional power disruption, it may take several days to fully restore services should any unexpected issues arise. We are currently projecting a return to services for 6pm (EDT) on June 5th. However, please plan for the possibility that services may not be fully restored until later in the week.

In summary: The CIRCE partitions listed above, as well as the RRA cluster, Student Cluster, BGFS filesystem, and PGS filesystem, will be powered down starting at 5pm (EDT) on May 31 for facilities maintenance related to building electrical infrastructure in MDC. The main "circe" and himem partitions, login nodes, web sites, and license servers will remain online. Service is anticipated to be restored by 6pm (EDT) on June 5th, but may extend until later in that week.

If you have any questions about this maintenance window, please contact Research Computing at rc-help@usf.edu.

03/08/2024 10:36 EST CIRCE resources to require USF VPN beginning March 14


On March 14, access to all Research Computing resources, including the CIRCE login nodes and mountable filesystems, will start to be moved behind the USF VPN. Access from machines on campus using either Wi-Fi or a wired Ethernet connection should remain unaffected. Starting on that date, all off-campus users will need to connect to the USF VPN before accessing CIRCE using X2Go, ssh, FileZilla, WinSCP, or other remote connection tools. At some point after 12:00 AM EDT on March 14, CIRCE and other Research Computing resources will no longer be accessible without first connecting to the USF VPN. We recommend that all users start the habit of connecting to the VPN from off campus as soon as possible. Even if some resources are available after that date, they will remain so only for a short time until all networking changes have been committed.

Please test any automated scripts and workflows that utilize CIRCE connections with the VPN prior to March 14, in order for RC staff to provide any necessary assistance in a timely manner.
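As one possible way to test a script ahead of the change, a workflow could confirm that the login node's SSH port is reachable before it attempts a transfer or job submission. The sketch below is an illustration only: the function name `can_reach` and the host name in the usage note are placeholders, not actual RC addresses, and port 22 is assumed because ssh is listed among the connection tools.

```python
import socket


def can_reach(host: str, port: int = 22, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`.

    Hypothetical helper for illustration. A False result from off campus
    may simply mean the VPN is not connected.
    """
    try:
        # create_connection resolves the name and opens a TCP connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

A script might call `can_reach("login.example.edu")` (placeholder host name) at startup and exit with a clear message to connect to the USF VPN when it returns False, rather than failing partway through a workflow.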

Hosted web sites will NOT need to use the VPN, and will still be accessible from a web browser as normal.

The PDFs in the links below from the Office of Research & Innovation provide details about connecting to the VPN.

https://www.usf.edu/research-innovation/documents/globalprotect-windows.pdf

https://www.usf.edu/research-innovation/documents/globalprotect-macos.pdf

If you have not used the USF VPN before, please use the following link to set up the VPN on your device.

https://vpn.usf.edu/

Requests for assistance with connecting to the USF VPN should be directed to the main USF IT Help Desk: help@usf.edu

If you have any questions about CIRCE access, please contact Research Computing at rc-help@usf.edu.

02/29/2024 16:25 EST SVC datacenter maintenance beginning March 25, 2024


Research Computing has been notified of scheduled maintenance to upgrade an electrical breaker panel within the SVC data center.

During the scheduled work, power to the login nodes and core network switches within the SVC data center will be offline for the duration. This will affect access to all resources attached to the CIRCE cluster. During the power outage, all CIRCE partitions will be unavailable. Beginning at 6am on Monday, Mar 25, all jobs will be stopped and no further access to any CIRCE partition, login node, or filesystem will be available until after power to the breaker panel has been restored.

The RRA and SC clusters will remain online in the MDC data center, but there is a possibility that the loss of communication with resources housed in SVC may cause instability on those systems. Therefore we recommend not relying on the availability of the RRA or SC clusters during the planned SVC outage.

This outage should NOT affect the license servers or RC websites.

The electrical maintenance is scheduled to be complete by Tuesday afternoon, Mar. 26th. Once Research Computing has received the all-clear that work has been completed, the process of rebooting all systems will begin. This process normally takes several hours. We are currently projecting a return to services for 6pm (EDT) on Tuesday, Mar. 26th. However, please plan for the possibility that services may not be fully restored until later in the week.

In summary: All CIRCE partitions will be powered down starting at 6am (EDT) on Mar. 25 for facilities maintenance related to electrical upgrades in the SVC datacenter. Other RRA and SC services may be affected. Service is anticipated to be restored by 6pm (EDT) on Mar. 26, but may extend until later in that week.

If you have any questions about this maintenance window, please contact Research Computing at rc-help@usf.edu.