Shutdown

From UF HPC Wiki


Items performed during the shutdown, 7/30/07 - 8/9/07

Hardware changes

  • Added memory to imgsrv
  • Installed IB card in imgsrv
  • Changed imgsrv CPUs from 2x248s to 2x265s
  • Replaced RAID card in racksrv
  • Replaced power supply in racksrv
  • Replaced motherboard in racksrv
  • Replaced memory on hpcio5
  • Upgraded submit from 4-way ASUS K8N-DRE to 8-way Tyan m4881
  • Replaced memory on submit
  • Replaced power supply on hpcio2
  • Installed 10-gigabit Ethernet card in submit
  • Rerouted I/O Node IB Cables
  • Recabled IPoIB bridge-groups to corresponding 6506 port-channels
  • Installed 6704 blade into the 6506
  • Wired the 4948 to the 6506 via two 10-gbit links

Network changes

  • Attached Cisco 4948 to 6506 via ISL
  • Configured NAT on 6506
  • Configured ACLs on 6506
  • Configured port-channels for bond0 interfaces on hpcio1 - hpcio8
  • Created port-channel for bond0 interface on altix to ethsw4948
  • Placed ib1 interface of each I/O node on separate subnets for IPoIB
  • Configured bridge-groups 41-48 on IPoIB gateway
  • Configured port-channels 41-48 for corresponding IPoIB gateway bridge groups
  • Moved ISL to Netgear switch from 6506 to ethsw4948
  • Moved management ethernet to Altix from ethsw02b to ethsw05a
  • Converted all ISLs from leaf switches to 2-cable port channels
  • Created link for submit via 10-gigabit Ethernet

Node changes

  • Reimaged nodes and support machines with CentOS 4.5
    • Changed the smartd configuration so that it emails drive SMART information to hpc-logs. This lets us monitor drives that may be failing and take them offline before they go down unexpectedly.
  • Upgraded RapidScale target and initiator software
  • Upgraded Torque
  • Upgraded Maui
  • Dropped the Topspin IB stack in favor of OFED-1.2
  • Switched to OFED-provided MPI implementations (with the tm interface for OpenMPI and other OFED annoyances fixed)
    • Upgraded OpenMPI, MVAPICH, and MVAPICH2
  • Installed OFED on the Altix
  • Set system-wide default MPI implementations (OpenMPI/Intel)
  • Configured Ethernet-only nodes to use Ethernet binding to the RapidScale targets
  • Retired iogw2, hpc, tp9400, osg
  • Converted iogw4 to torque
  • Converted hpcio9 to submit
  • Installed the following on submit:
    • NTP
    • DNS
    • LDAP
    • NFS
  • Installed the following on torque:
    • Torque
    • Maui
    • PBS log sweeper for website
      • perl-DBD-MySQL-2.9004-3.1.x86_64.rpm
      • mysqlclient10-3.23.58-4.RHEL4.1.x86_64.rpm
  • Set up submit as the submission node, which pushes all jobs over to Torque for the actual handling.
  • Renamed all nodes to fit the new naming scheme of rack-side and slot number
  • Upgraded LDAP to version 2.3.37
  • Removed Pathscale compiler and software versions
  • Removed older OpenMPI packages