Shutdown
From UF HPC Wiki
Items performed during the shutdown, 7/30/07 - 8/9/07
Hardware changes
- Added memory to imgsrv
- Installed IB card in imgsrv
- Changed imgsrv CPUs from 2x248s to 2x265s
- Replaced RAID card in racksrv
- Replaced power supply in racksrv
- Replaced motherboard in racksrv
- Replaced memory on hpcio5
- Upgraded submit from 4-way ASUS K8N-DRE to 8-way Tyan m4881
- Replaced memory on submit
- Replaced power supply on hpcio2
- Installed 10-gigabit Ethernet card in submit
- Rerouted I/O Node IB Cables
- Recabled IPoIB bridge-groups to corresponding 6506 port-channels
- Installed 6704 blade into the 6506
- Wired the 4948 to the 6506 via two 10-gbit links
Network changes
- Attached Cisco 4948 to 6506 via ISL
- Configured NAT on 6506
- Configured ACLs on 6506
- Configured port-channels for bond0 interfaces on hpcio1 - hpcio8
- Created port-channel for bond0 interface on altix to ethsw4948
- Placed ib1 interface of each I/O node on separate subnets for IPoIB
- Configured bridge-groups 41-48 on IPoIB gateway
- Configured port-channels 41-48 for corresponding IPoIB gateway bridge groups
- Moved ISL to Netgear switch from 6506 to ethsw4948
- Moved management ethernet to Altix from ethsw02b to ethsw05a
- Converted all ISLs from leaf switches to 2-cable port-channels
- Created 10-gigabit link for submit
Node changes
- Reimaged nodes and support machines with CentOS 4.5
- Changed smartd configuration to email drive SMART information to hpc-logs, so that failing drives can be spotted and taken offline before they go down unexpectedly.
- Upgraded RapidScale target and initiator software
- Upgrade Torque
- Upgrade Maui
- Drop the Topspin IB stack in favor of OFED-1.2
- Use OFED-provided MPI implementations (with tm interface for OpenMPI and other OFED annoyances fixed)
- Upgrades for OpenMPI, MVAPICH, MVAPICH2
- Install OFED on the Altix
- Set system-wide default MPI implementations (OpenMPI/Intel)
- Configured Ethernet-only nodes to use Ethernet binding to the RapidScale targets
- Retire iogw2, hpc, tp9400, osg
- Convert iogw4 to torque
- Convert hpcio9 to submit
- Install the following on submit:
  - NTP
  - DNS
  - LDAP
  - NFS
- Install the following on torque:
  - Torque
  - Maui
  - PBS log sweeper for website
  - perl-DBD-MySQL-2.9004-3.1.x86_64.rpm
  - mysqlclient10-3.23.58-4.RHEL4.1.x86_64.rpm
- Set up submit as the submission node, which pushes all jobs over to Torque for the actual job handling.
- Rename all nodes to fit into new naming scheme of rack-side and slot number
- Upgrade to version 2.3.37 of LDAP
- Remove Pathscale compiler and software versions
- Remove older OpenMPI packages
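
For reference, the smartd change listed under Node changes might look roughly like the fragment below. This is only a sketch, not the configuration actually deployed: the mail address is a placeholder (the real hpc-logs address is not given here), and the use of DEVICESCAN rather than per-device entries is an assumption.

```
# /etc/smartd.conf (illustrative sketch only)
# DEVICESCAN: monitor every drive smartd can detect (assumption; per-device lines also work)
# -a: enable the default set of SMART checks
# -m: send warning mail to this address (placeholder for the hpc-logs list)
DEVICESCAN -a -m hpc-logs@example.org
```

Adding `-M test` to the same line makes smartd send a single test message at startup, which is a convenient way to confirm that mail delivery to the list works.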
