Adding New Nodes

From UF HPC Wiki

This document describes the steps to follow when adding new nodes to the cluster.

  • Add node to dhcpd.conf on imgsrv, if needed.
  • Add node to /opt/cluster/config/hosts on imgsrv.
    • Run mkdns and mkrevdns
  • In /var/lib/systemimager/scripts, add a symlink for the node that points to the appropriate imaging script
  • Create a new ssh key for the system:
    • Run the mkSshKey script
  • Add node to /opt/cluster/Distfile
  • Add node to Torque:
[root@torx ~]# qmgr
Max open servers: 4
Qmgr: create node r5a-s7.ufhpc
Qmgr: set node r5a-s7.ufhpc np = 4
Qmgr: set node r5a-s7.ufhpc properties = "all,infiniband,r5a,phase2"
Qmgr: list node r5a-s7.ufhpc
Node r5a-s7.ufhpc
	state = free
	np = 4
	properties = all,infiniband,r5a,phase2
	ntype = cluster
	status = arch=x86_64,opsys=CentOS53,
                 uname=Linux r5a-s7.ufhpc 2.6.18-128.1.6.el5 #1 SMP Wed Apr 1 09:10:25 EDT 2009 x86_64,
                 sessions=? 0,nsessions=? 0,nusers=0,idletime=1359,
                 totmem=12367056kb,availmem=12219208kb,physmem=3981136kb,
                 ncpus=4,loadave=0.00,netload=5917213,state=free,jobs=,varattr=,
                 rectime=1250545922
  • Image the node
  • Distribute ssh keys to cluster:
    • Run /opt/cluster/config/mkSshKeyArchive <nodename>
    • Run make
  • Update hosts.equiv:
    • Edit /opt/cluster/config/etc/hosts.equiv
    • Run rdist:
rdist -P /usr/bin/ssh -f /opt/cluster/Distfile -M 16 hosts-equiv
  • Final note: Confirm that PBS is running on the node when you are done, and that it still starts after a reboot. In some cases a trust issue may have arisen during this process, so verify this rather than assume it.
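
The Torque step and the state check above could also be scripted instead of typed into an interactive qmgr session. The following is a minimal sketch, not the cluster's actual tooling. It assumes that qmgr (with its -c option) and pbsnodes are on the PATH of the Torque server. The function names torque_node_cmds, add_torque_node, parse_node_state and node_state are hypothetical and exist only for this illustration. It does not check that PBS survives a reboot.

```shell
#!/bin/sh
# Sketch only: function names and argument order are illustrative,
# not part of the cluster's scripts.

# Print the qmgr commands that register one node.
#   $1 = node name, $2 = processor count, $3 = comma-separated properties
torque_node_cmds() {
    node=$1
    np=$2
    props=$3
    printf 'create node %s\n' "$node"
    printf 'set node %s np = %s\n' "$node" "$np"
    printf 'set node %s properties = "%s"\n' "$node" "$props"
    printf 'list node %s\n' "$node"
}

# Feed the commands to qmgr one at a time; stop at the first failure.
# Assumes this runs as root on the Torque server.
add_torque_node() {
    torque_node_cmds "$@" | while IFS= read -r cmd; do
        qmgr -c "$cmd" || exit 1
    done
}

# Read a node listing on stdin and print the value of its "state = ..." line.
parse_node_state() {
    awk '$1 == "state" && $2 == "=" { print $3; exit }'
}

# Print the current state of one node as reported by pbsnodes.
node_state() {
    pbsnodes "$1" | parse_node_state
}
```

With these definitions loaded, `add_torque_node r5a-s7.ufhpc 4 all,infiniband,r5a,phase2` would issue the same commands as the session shown in the Torque step, and `node_state r5a-s7.ufhpc` would print the node's state. Keeping the command generation separate from the call to qmgr lets you review the commands with torque_node_cmds before anything is changed.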