How to insert an Opteron in the cluster
by FMA - Updated November 3rd, 2006

On mix# machine:

-  services on mix#

  • autofs (the autofs V4 --ghost option shows the mount points, so the symlinks could in principle be removed; this turned out to be a bad idea and the option was removed)
    rc-update add autofs default
    vi /etc/conf.d/autofs
     daemonoptions='--timeout 60'
  • nfs
    emerge -va nfs-utils
    rc-update add nfs default
  • ypbind
    emerge -v ypbind yp-tools
    rc-update add ypbind default
  • portmap

-  config ypbind

# vi /etc/yp.conf
domain clic.iap.fr server clix.clic.iap.fr
# vi /etc/conf.d/domainname
OVERRIDE=0
DNSDOMAIN="iap.fr"
NISDOMAIN="clic.iap.fr"

Very old fashioned:

# echo   clic.iap.fr > /etc/nisdomainname
# /etc/init.d/domainname restart

Very old fashioned:

# vi /etc/conf.d/ypbind
YP_DOMAIN=clic.iap.fr
# vi /etc/init.d/ypbind      (patch newer versions)
       ypdomainname "$YP_DOMAIN"
       start-stop-daemon --start --quiet --exec /usr/sbin/ypbind ${YOPTS}

-  nsswitch configuration

A machine that is part of the cluster gets its account information (passwd, shadow, group) from NIS, as well as the list of devices to automount. The lookup order was switched back to "files nis" because the other order triggered problems in the init scripts (cf. Gentoo forum).

# vi /etc/nsswitch.conf
passwd:      files nis
shadow:      files nis
group:       files nis

automount:   files nis
or:
# vi /etc/nsswitch.conf
passwd:      compat
shadow:      compat
group:       compat

automount:   nis files
# echo "+" >> /etc/passwd
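
To check which lookup order is actually in effect after editing, a small helper can print the sources configured for one database. This is a minimal sketch: the function name nss_sources is an illustration, not part of glibc or Gentoo.

```shell
# nss_sources DATABASE [FILE]
# Print the lookup sources listed for DATABASE (passwd, shadow, group,
# automount, ...) in an nsswitch.conf-style file.
# Hypothetical helper, written for this page only.
nss_sources() {
    awk -v db="$1:" '
        $1 == db {              # first field is the database name plus a colon
            $1 = ""             # drop the database name
            sub(/^ +/, "")      # and the separator left behind
            print
            exit
        }' "${2:-/etc/nsswitch.conf}"
}
```

With the first variant above, `nss_sources passwd` should print "files nis". A trailing comment on the same line, if any, is printed as well.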

-  config DNS

clic.iap.fr is the cluster's domain name and is searched first. The clix DNS server is queried first so that the cluster's nodes can be resolved. It is better to reach clix's DNS server via its public IP than via its cluster network IP, to avoid long delays if eth1 (the interface connected to the cluster's switch) is down.

Normal DNS list follows.

This applies only to machines that have a direct internet connection in addition to their connection to the local cluster network.

# vi /etc/resolv.conf

search clic.iap.fr iap.fr
# clix.clic.iap.fr
#nameserver 10.0.1.253
# clix.iap.fr
nameserver 194.57.221.40
# carignan.iap.fr
nameserver 194.167.0.147
# cairanne.iap.fr
nameserver 194.167.0.198
# rasteau.iap.fr
nameserver 194.167.0.195
# gentiane.imcce.fr
nameserver 193.48.190.1
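
Since the resolver queries the servers in the order they are listed, it can help to print only the active nameserver lines and confirm that the commented-out cluster address is really skipped. A minimal sketch, with active_nameservers being a hypothetical helper name:

```shell
# active_nameservers [FILE]
# Print the nameserver addresses of a resolv.conf-style file, in order.
# Commented-out entries are skipped because their first field is
# "#nameserver", not "nameserver".
active_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}
```

With the file above, the first address printed should be 194.57.221.40 (clix.iap.fr). Note that the glibc resolver normally uses at most the first three nameserver entries (MAXNS), so the last servers in a longer list are probably never queried.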

-  cluster's interface configuration

eth1 is the interface connected to the cluster's switch. It gets its IP address, and only its IP address, from clix's DHCP server (hence the dhcpcd_eth1 options, which keep dhcpcd from overwriting other config files). dhcpcd sets up the route to the 10.0.1.0 network by default, so every connection to a machine in the 10.0.1.0/24 network (the cluster's private network) goes through eth1. Other class A private networks can therefore exist at IAP without interference (the cluster's network actually uses a class C netmask).

# vi /etc/conf.d/net
# -N prevents dhcpcd from overwriting the existing /etc/ntp.conf file
# -Y prevents dhcpcd from replacing the existing /etc/yp.conf file
# -R prevents dhcpcd from replacing the existing /etc/resolv.conf file
# -G prevents dhcpcd from installing default routes provided by the DHCP server
config_eth1=( "dhcp" )
dhcpcd_eth1="-N -Y -R -G"
# for ganglia packets: do command `route add -host 239.2.11.71 dev eth1`
routes_eth1=( "-host 239.2.11.71 dev eth1" )
### OLD ### iface_eth1="dhcp"
### OLD ### dhcpcd_eth1="-N -Y -R -G"
# x86:
#config_eth0=( "dhcp" )
#dhcpcd_eth0="-N -Y"
# already done by dhcpcd!
#routes_eth1=" -net 10.0.1.0 netmask 255.255.255.0 dev eth1"
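
The routing argument above comes down to netmask arithmetic: an address is sent through eth1 only if it matches 10.0.1.0 under a 24-bit mask, even though it also lies in the wider 10.0.0.0/8 range. A minimal sketch of that test in plain shell arithmetic follows; ip_to_int and in_subnet are hypothetical helper names and 10.0.2.5 is a made-up address.

```shell
# ip_to_int A.B.C.D : print the address as a 32-bit integer.
ip_to_int() (
    IFS=.
    set -- $1                   # split the dotted quad into $1..$4
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# in_subnet ADDRESS NETWORK PREFIXLEN
# Succeed if ADDRESS belongs to NETWORK/PREFIXLEN.
in_subnet() (
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "$2")
    # PREFIXLEN leading one bits, e.g. 24 gives 255.255.255.0
    mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
    [ $(( addr & mask )) -eq $(( net & mask )) ]
)
```

For example, `in_subnet 10.0.1.201 10.0.1.0 24` succeeds, while `in_subnet 10.0.2.5 10.0.1.0 24` fails although `in_subnet 10.0.2.5 10.0.0.0 8` succeeds.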

Activate eth1, and make ypbind depend on it

# cd /etc/init.d
# ln -s net.eth0 net.eth1
# rc-update add net.eth1 default
# vi /etc/init.d/ypbind
       need net portmap net.eth1

-  simple node:

# cat /etc/conf.d/net
config_eth0=( "dhcp" )
dhcpcd_eth0="-N -Y"

-  build automounted directory tree

NB: with --ghost in /etc/init.d/autofs the symlinks should no longer be needed. This was tried and does not work well: there are problems with locally mounted directories.

# mkdir /home/nis
# for mac in clix pix{1,2,3,4,5,6,7,8,9,10} mix{1,2,3,4,5,6,7,8,9} fcix fcix2 ftpix efigix; do mkdir -p /mnt/data/$mac; mkdir -p /data/$mac; done
# for a in pix{1,2,3,4,5,6,7,8,9,10} ftpix mix{8,9} efigix; do cd /data/$a; ln -s /mnt/data/$a/raid .; done
# for a in mix{1,2,3,4,5,6,7} fcix fcix2; do cd /data/$a; ln -s /mnt/data/$a/raid1 .; done
# for a in mix{1,2,3,4,5,6,7} fcix fcix2; do cd /data/$a; ln -s /mnt/data/$a/raid2 .; done
# for a in pix{5,6,7,8,9}; do cd /data/$a; ln -s /mnt/data/$a/raid2 .; done
# for i in `seq 8`; do cd /data/clix; ln -s /mnt/data/clix/fc$i .; done
# cd /data/clix; ln -s /mnt/data/fcix/raid1 fc9; ln -s /mnt/data/fcix/raid2 fc10

or:

/home/nis/root.nis/fred/build_data.bash
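
The loops above follow one pattern per host: create /mnt/data/HOST and /data/HOST, then link each volume from /data/HOST to /mnt/data/HOST. A minimal sketch of that pattern as a re-runnable helper follows. The name link_volumes and its root prefix argument are assumptions made for illustration; this is not the content of build_data.bash, which is not shown here.

```shell
# link_volumes ROOT HOST VOLUME...
# ROOT is a prefix for trial runs; pass "" to work on the real tree.
link_volumes() (
    root=$1
    host=$2
    shift 2
    mkdir -p "$root/mnt/data/$host" "$root/data/$host" || exit 1
    for vol in "$@"; do
        # -f and -n replace an existing link instead of following it
        ln -sfn "/mnt/data/$host/$vol" "$root/data/$host/$vol" || exit 1
    done
)
```

For example, `link_volumes "" mix2 raid1 raid2` would reproduce the mix2 part of the loops, and `link_volumes /tmp/trial mix2 raid1 raid2` builds the same layout under a scratch directory first.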

cosmix:

# mkdir /mnt/data
# ln -s /mnt/data/cos{1,2,3,4} .

-  modify raid mounting to conform to cluster architecture

# grep raid /etc/fstab
/dev/sdb                /data/mix2/raid1          xfs             noatime                 0 0
/dev/sdc                /data/mix2/raid2          xfs             noatime                 0 0
# ln -s /data/mix2/raid1 /raid1
# ln -s /data/mix2/raid2 /raid2
# touch /data/mix2/raid1/this_is_mix2_raid1
# touch /data/mix2/raid2/this_is_mix2_raid2
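
Assuming the marker files were created while the volumes were mounted, they live on the RAID filesystems themselves, so their presence can serve as a quick check that the expected volume is mounted at the expected place. A minimal sketch, with volume_is_mounted being a hypothetical helper name:

```shell
# volume_is_mounted HOST VOLUME [ROOT]
# Succeed if the marker file this_is_HOST_VOLUME is visible under
# ROOT/data/HOST/VOLUME (ROOT defaults to the real tree).
volume_is_mounted() {
    [ -e "${3:-}/data/$1/$2/this_is_$1_$2" ]
}
```

For example: `volume_is_mounted mix2 raid1 || echo "raid1 is not mounted"`.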

-  NFS configuration

Add the insecure option so that requests coming from ports above 1024 are accepted (cf. man exports).

mix2 / # vi /etc/exports
# /etc/exports: NFS file systems being exported.  See exports(5).
/data/mix2/raid1 *.clic.iap.fr(rw,sync,insecure)
/data/mix2/raid2 *.clic.iap.fr(rw,sync,insecure)
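
Since every node exports its volumes with the same options, the lines can be generated rather than typed. A minimal sketch, assuming the /data/HOST/VOLUME layout used on this page; export_line is a hypothetical helper name.

```shell
# export_line HOST VOLUME [CLIENTS]
# Print one /etc/exports line in the form used above.
# CLIENTS defaults to the cluster domain wildcard.
export_line() {
    printf '/data/%s/%s %s(rw,sync,insecure)\n' "$1" "$2" "${3:-*.clic.iap.fr}"
}
```

For example, `export_line mix2 raid1 >> /etc/exports`. After editing the file, `exportfs -ra` should re-export everything without restarting the NFS service.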

-  change uid and gid of existing accounts to mirror clix configuration and add people to wheel group

mix2 # vi /etc/group
pipeline:x:12387:pipeline
wheel:x:10:root,ab,hjmcc,marmo,pipeline,bertin,dantel,gimi,magnard,mellier
mix2 # vi /etc/passwd
mix2 # find / -uid 1006 -print0 | xargs -0 chown 12392:100
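
A find started at / walks every mounted filesystem, including /proc and any NFS automounts. A more cautious variant, sketched below, stays on one filesystem per call and can print what it would do first. The name reown and the DRY_RUN variable are assumptions made for illustration.

```shell
# reown ROOT OLD_UID NEW_OWNER
# Change ownership of everything under ROOT that belongs to OLD_UID,
# without crossing into other filesystems (-xdev) and without following
# symlinks (chown -h). With DRY_RUN set, only print the commands.
reown() (
    root=$1
    old_uid=$2
    new_owner=$3
    if [ -n "${DRY_RUN:-}" ]; then
        find "$root" -xdev -uid "$old_uid" -exec echo chown -h "$new_owner" {} \;
    else
        find "$root" -xdev -uid "$old_uid" -exec chown -h "$new_owner" {} +
    fi
)
```

For example, `DRY_RUN=1 reown / 1006 12392:100` lists the planned changes on the root filesystem. Because of -xdev, the call has to be repeated for each local filesystem, such as /data/mix2/raid1.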

-  install and enable ganglia

# emerge -v /usr/portage/sys-cluster/ganglia-monitor-core/ganglia-monitor-core-2.5.5.ebuild
# gmond --convert my_old_gmond.conf > my_new_gmond.conf
# vi /etc/init.d/gmond
       need net net.eth1
# vi /etc/gmond.conf
mcast_if  eth1
# rc-update add gmond default
# /etc/init.d/gmond start  (to be done *after* config on clix !!)

simple node:

# vi /etc/init.d/gmond
#!/sbin/runscript

depend() {
       need net net.eth0
}

start() {
       ebegin "Starting GANGLIA gmond: "
       start-stop-daemon --start --quiet --exec /usr/sbin/gmond
       eend $? "Failed to start gmond"
}

stop() {
       ebegin "Shutting down GANGLIA gmond: "
       start-stop-daemon --stop --quiet --exec /usr/sbin/gmond
       eend $? "Failed to stop gmond"
}

# vi /etc/gmond.conf
mcast_if  eth0
# rc-update add gmond default
# /etc/init.d/gmond start  (to be done *after* config on clix !!)


-  add new machines (mapping of eth1's MAC address to host name) to the clix dhcpd config file

[root@clix etc]# vi /chroot/dhcp/etc/dhcp/dhcpd.conf
# TAG: NODE_LIST_ADMIN_END

host mix1{
   hardware ethernet 00:E0:81:51:DA:5E;
   fixed-address mix1;
}

host mix2{
   hardware ethernet 00:e0:81:51:da:8e;
   fixed-address mix2;
}
host mix3{
   hardware ethernet 00:E0:81:60:E6:E6;
   fixed-address mix3;
}
host efigix{
   hardware ethernet 00:E0:81:43:82:67;
   fixed-address 10.0.1.218;
}
[root@clix etc]# /etc/init.d/dhcp restart
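
Each host stanza has the same shape, so it can be generated from the node name and the MAC address of its eth1. A minimal sketch; dhcp_host_entry is a hypothetical helper name.

```shell
# dhcp_host_entry NAME MAC [ADDRESS]
# Print a dhcpd.conf host stanza in the form used above.
# ADDRESS defaults to NAME (resolved through DNS, as for mix1..mix3);
# pass an IP address to pin it explicitly, as done for efigix.
dhcp_host_entry() {
    printf 'host %s{\n   hardware ethernet %s;\n   fixed-address %s;\n}\n' \
        "$1" "$2" "${3:-$1}"
}
```

For example, `dhcp_host_entry mix2 00:e0:81:51:da:8e` prints the mix2 stanza shown above, ready to be pasted into dhcpd.conf.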

-  create automount files for each new machine

root@clix # cat /etc/autofs/auto.mix2
raid1  -rw,nfs,soft,timeo=2,intr,nosuid,rsize=8192,wsize=8192 mix2:/data/mix2/raid1
raid2  -rw,nfs,soft,timeo=2,intr,nosuid,rsize=8192,wsize=8192 mix2:/data/mix2/raid2
root@clix # cat /etc/autofs/auto.mix1
...
root@clix # cat /etc/autofs/auto.mix3
...
root@clix # vi /etc/autofs/auto.master
/mnt/data/mix1 /etc/autofs/auto.mix1 --timeout=600
/mnt/data/mix2 /etc/autofs/auto.mix2 --timeout=600
/mnt/data/mix3 /etc/autofs/auto.mix3 --timeout=600
root@clix # vi /etc/autofs/auto.master.nis
/mnt/data/mix1 /etc/autofs/auto.mix1 --timeout=600
/mnt/data/mix2 /etc/autofs/auto.mix2 --timeout=600
/mnt/data/mix3 /etc/autofs/auto.mix3 --timeout=600
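
The map files and the auto.master lines differ only by host and volume names, so they can be generated. A minimal sketch using the mount options shown above; auto_map and auto_master_line are hypothetical helper names.

```shell
# Mount options copied from the auto.mix2 map above.
AUTOFS_OPTS='-rw,nfs,soft,timeo=2,intr,nosuid,rsize=8192,wsize=8192'

# auto_map HOST VOLUME...
# Print one autofs map line per volume exported by HOST.
auto_map() (
    host=$1
    shift
    for vol in "$@"; do
        printf '%s  %s %s:/data/%s/%s\n' "$vol" "$AUTOFS_OPTS" "$host" "$host" "$vol"
    done
)

# auto_master_line HOST
# Print the matching auto.master (and auto.master.nis) entry.
auto_master_line() {
    printf '/mnt/data/%s /etc/autofs/auto.%s --timeout=600\n' "$1" "$1"
}
```

For example, `auto_map mix3 raid1 raid2 > /etc/autofs/auto.mix3`, then append the output of `auto_master_line mix3` to both master files.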

-  update ypserv configuration

# cd /var/yp/
[root@clix yp]# vi Makefile
AUTO_MIX1  = $(YPSRCDIR)/auto.mix1
AUTO_MIX2  = $(YPSRCDIR)/auto.mix2
...
all: .... auto.mix1 auto.mix2
...
auto.mix1: $(AUTO_MIX1) $(YPDIR)/Makefile
        @echo "Updating $@..."
        -@sed -e "/^#/d" -e s/#.*$$// $(AUTO_MIX1) | $(DBLOAD) \
                -i $(AUTO_MIX1) -o $(YPMAPDIR)/$@ - $@
        -@$(NOPUSH) || $(YPPUSH) -d $(DOMAIN) $@

auto.mix2: $(AUTO_MIX2) $(YPDIR)/Makefile
        @echo "Updating $@..."
        -@sed -e "/^#/d" -e s/#.*$$// $(AUTO_MIX2) | $(DBLOAD) \
                -i $(AUTO_MIX2) -o $(YPMAPDIR)/$@ - $@
        -@$(NOPUSH) || $(YPPUSH) -d $(DOMAIN) $@
[root@clix yp]# make
[root@clix etc]# /etc/init.d/ypserv restart
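
The Makefile rules above are identical apart from the map name, so the text for a new map can be generated and pasted in. A minimal sketch, assuming the same variable naming (auto.mix3 gives AUTO_MIX3); yp_make_rule is a hypothetical helper name.

```shell
# yp_make_rule MAPNAME
# Print the variable definition and the rule for one automount map,
# in the form used in /var/yp/Makefile above. Recipe lines start with
# a tab, as make requires.
yp_make_rule() (
    map=$1
    var=$(printf '%s' "$map" | tr 'a-z.' 'A-Z_')   # auto.mix3 -> AUTO_MIX3
    printf '%s  = $(YPSRCDIR)/%s\n\n' "$var" "$map"
    printf '%s: $(%s) $(YPDIR)/Makefile\n' "$map" "$var"
    printf '\t@echo "Updating $@..."\n'
    printf '\t-@sed -e "/^#/d" -e s/#.*$$// $(%s) | $(DBLOAD) \\\n' "$var"
    printf '\t\t-i $(%s) -o $(YPMAPDIR)/$@ - $@\n' "$var"
    printf '\t-@$(NOPUSH) || $(YPPUSH) -d $(DOMAIN) $@\n'
)
```

The variable line belongs with the other definitions near the top of the Makefile, and the map name still has to be added to the all: target by hand. After make, `ypcat -k auto.mix3` on a bound client should list the new map entries.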

-  add new machines in clix DNS server

root@clix # vi /chroot/dns/var/bind/pri/clic.iap.fr.zone
mix1   IN      A       10.0.1.200
n200     IN      CNAME   mix1.clic.iap.fr.

mix2   IN      A       10.0.1.201
n201     IN      CNAME   mix2.clic.iap.fr.

mix3   IN      A       10.0.1.202
n202     IN      CNAME   mix3.clic.iap.fr.

root@clix # vi /chroot/dns/var/bind/pri/1.0.10.zone
200     IN      PTR   mix1.clic.iap.fr.
201     IN      PTR   mix2.clic.iap.fr.
202     IN      PTR   mix3.clic.iap.fr.
root@clix # /etc/init.d/named restart
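
The forward and reverse records follow a fixed pattern: an A record for the name, a CNAME n<last octet> pointing to it, and a PTR record keyed by the last octet. A minimal sketch generating them; dns_forward and dns_reverse are hypothetical helper names.

```shell
# dns_forward NAME LAST_OCTET
# Print the A and CNAME records for clic.iap.fr.zone.
dns_forward() {
    printf '%s   IN      A       10.0.1.%s\n' "$1" "$2"
    printf 'n%s     IN      CNAME   %s.clic.iap.fr.\n' "$2" "$1"
}

# dns_reverse NAME LAST_OCTET
# Print the PTR record for 1.0.10.zone.
dns_reverse() {
    printf '%s     IN      PTR   %s.clic.iap.fr.\n' "$2" "$1"
}
```

For example, `dns_forward mix2 201` and `dns_reverse mix2 201` print the mix2 records shown above. Before restarting named, the edited file can be checked with `named-checkzone clic.iap.fr /chroot/dns/var/bind/pri/clic.iap.fr.zone`.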

-  update automounted directory tree

[root@clix yp]# NEWMACS="ftpix"
[root@clix yp]# mkdir -p /mnt/data/$NEWMACS /data/$NEWMACS; cd /data/$NEWMACS; ln -s /mnt/data/$NEWMACS/raid1 .; ln -s /mnt/data/$NEWMACS/raid2 .
[root@clix yp]# rshp $NKA -- "mkdir -p /mnt/data/$NEWMACS /data/$NEWMACS; cd /data/$NEWMACS; ln -s /mnt/data/$NEWMACS/raid1 .; ln -s /mnt/data/$NEWMACS/raid2 .;"
[root@clix yp]# for node in mix{1,2,3,4,5}; do ssh $node "mkdir -p /data/$NEWMACS /mnt/data/$NEWMACS; cd /data/$NEWMACS; ln -s /mnt/data/$NEWMACS/raid1 .; ln -s /mnt/data/$NEWMACS/raid2 ."; done


-  start all configured services on the new node:

# emerge -va net-misc/dhcpcd
# /etc/init.d/net.eth1 start
# /etc/init.d/autofs start
# /etc/init.d/gmond start
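
The order matters here, since autofs and gmond need eth1 to be up. A minimal sketch that starts the services in the given order and stops at the first failure; start_services and the DRY_RUN variable are assumptions made for illustration.

```shell
# start_services SERVICE...
# Start each init script in order and stop at the first failure.
# With DRY_RUN set, only print the commands.
start_services() {
    for svc in "$@"; do
        if [ -n "${DRY_RUN:-}" ]; then
            echo "/etc/init.d/$svc start"
        else
            "/etc/init.d/$svc" start || {
                echo "failed to start $svc, stopping here" >&2
                return 1
            }
        fi
    done
}
```

For example: `start_services net.eth1 autofs gmond`.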


© Terapix 2003-2011