Saturday, July 15, 2017

Setting Up an OpenHPC System on CentOS / RHEL 7

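Learning objectives:
  • Use the OpenHPC packages to build a high-performance computing (HPC) cluster.
  • Official OpenHPC reference architecture diagram (figure not included).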

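  • Prepare six hosts in total: one SMS (System Management Server), four compute nodes, and one storage server that provides iSCSI or NFS storage. Each host needs two network interfaces.
  • Network settings used in this exercise: compute network 172.16.1.0/24, storage network 10.1.1.0/24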

Procedure:
  1. Install CentOS 7.3 on every host.
    • The EPEL repository package (epel-release) must be installed.

  2. Configure the basic system settings on the SMS host first:
    # vim /etc/hosts
    172.16.1.1 sms sms.example.com
    172.16.1.11 node1 node1.example.com
    172.16.1.12 node2 node2.example.com
    172.16.1.13 node3 node3.example.com
    172.16.1.14 node4 node4.example.com
    10.1.1.100 storage storage.example.com
    
    # hostnamectl set-hostname sms.example.com
    # setenforce 0
    # systemctl disable firewalld
    # systemctl stop firewalld
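
The compute-node entries in /etc/hosts follow a regular pattern (node N at 172.16.1.(10+N)), so they can be generated rather than typed. The following is a minimal sketch under that assumption; `gen_hosts` is a hypothetical helper, not an OpenHPC tool, and it only prints the lines so they can be reviewed before being added to /etc/hosts.

```shell
# gen_hosts PREFIX COUNT DOMAIN: print one /etc/hosts line per compute node.
# Hypothetical helper; assumes node N uses the address PREFIX.(10+N).
gen_hosts() {
    i=1
    while [ "$i" -le "$2" ]; do
        printf '%s.%d node%d node%d.%s\n' "$1" "$((10 + i))" "$i" "$i" "$3"
        i=$((i + 1))
    done
}

# Print the four entries used in this exercise (nothing is written to disk).
gen_hosts 172.16.1 4 example.com
```

For a larger cluster, only the node count changes; the sms and storage entries would still be written by hand.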
    
  3. On the SMS host, install the OpenHPC repository package:
    # yum install http://build.openhpc.community/OpenHPC:/1.3/CentOS_7/x86_64/ohpc-release-1.3-1.el7.x86_64.rpm
    
  4. Optionally, install the OpenHPC template scripts on the SMS host to speed up installation of the other nodes:
    # yum -y install docs-ohpc
    (More on this later...)
    
  5. On the SMS host, install the base packages that OpenHPC needs:
    # yum -y install ohpc-base ohpc-warewulf
    
  6. For time synchronization, enable the ntpd service on the SMS host:
    # systemctl enable ntpd.service
    # echo "server time.google.com" >> /etc/ntp.conf
    # systemctl restart ntpd
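
Because `echo ... >>` appends unconditionally, rerunning this step adds a duplicate `server` line to /etc/ntp.conf each time. A minimal sketch of an idempotent append follows; `append_once` is a hypothetical helper name, and the demonstration writes to a scratch file rather than the real configuration.

```shell
# append_once LINE FILE: append LINE to FILE only if no identical line exists.
# Hypothetical helper for illustration; not part of ntp or OpenHPC.
append_once() {
    # -q quiet, -x match the whole line, -F treat the pattern as a fixed string
    grep -qxF -- "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}

# Demonstration on a scratch file instead of /etc/ntp.conf
ntp_demo=$(mktemp)
append_once "server time.google.com" "$ntp_demo"
append_once "server time.google.com" "$ntp_demo"   # second call changes nothing
cat "$ntp_demo"
rm -f "$ntp_demo"
```

The same helper would fit the other `echo ... >>` lines in this procedure, such as the entries added to /etc/exports and $CHROOT/etc/fstab.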
    
  7. On the SMS host, install the resource manager package:
    # yum -y install pbspro-server-ohpc
    
  8. On the SMS host, install the InfiniBand support packages:
    # yum -y groupinstall "InfiniBand Support"
    # yum -y install infinipath-psm
    # systemctl start rdma
    
  9. On the SMS host, edit the remaining configuration files:
    # vim /etc/warewulf/provision.conf
    network device = enp0s8
    
    # vim /etc/xinetd.d/tftp
    disable = no
    
    # systemctl restart xinetd
    # systemctl enable mariadb
    # systemctl restart mariadb
    # systemctl enable httpd
    # systemctl restart httpd
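
The two vim edits in this step can also be applied non-interactively, which helps when the procedure is scripted. This is a sketch only: it assumes GNU sed (as shipped with CentOS 7), the function names are hypothetical, and the demonstration runs on scratch files whose starting contents (`eth1`, `disable = yes`) are assumed rather than taken from a real installation.

```shell
# set_provision_device FILE DEV: point the "network device" setting at DEV.
set_provision_device() {
    sed -i "s/^network device *=.*/network device = $2/" "$1"
}

# enable_tftp FILE: change "disable = yes" to "disable = no", keeping spacing.
enable_tftp() {
    sed -i 's/^\([[:space:]]*disable[[:space:]]*=[[:space:]]*\)yes/\1no/' "$1"
}

# Demonstration on scratch copies, not on the files under /etc
scratch=$(mktemp -d)
printf 'network device = eth1\n' > "$scratch/provision.conf"
printf 'service tftp\n{\n\tdisable\t\t= yes\n}\n' > "$scratch/tftp"
set_provision_device "$scratch/provision.conf" enp0s8
enable_tftp "$scratch/tftp"
cat "$scratch/provision.conf" "$scratch/tftp"
rm -rf "$scratch"
```

On the SMS host the same functions would be called with /etc/warewulf/provision.conf and /etc/xinetd.d/tftp, followed by the service restarts shown above.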
    
  10. On the SMS host, build the initial base system image (for the compute nodes):
    # export CHROOT=/opt/ohpc/admin/images/centos7.3
    # wwmkchroot centos-7 $CHROOT
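
Every later command relies on $CHROOT, which is exported only in the current shell. If it is empty in a new session, a path such as `$CHROOT/etc/fstab` expands to `/etc/fstab` and the command changes the SMS host itself. A small guard can catch this; `require_chroot` is a hypothetical helper, sketched here under the assumption that the image directory already exists after wwmkchroot.

```shell
# require_chroot: succeed only if CHROOT is set and names an existing directory.
# Hypothetical guard; the variable name CHROOT follows the export above.
require_chroot() {
    if [ -z "${CHROOT:-}" ]; then
        echo "CHROOT is not set; export it again in this shell" >&2
        return 1
    fi
    if [ ! -d "$CHROOT" ]; then
        echo "CHROOT=$CHROOT is not a directory; run wwmkchroot first" >&2
        return 1
    fi
}

# Typical use before any --installroot or chroot command:
#   require_chroot && yum -y --installroot=$CHROOT install ohpc-base-compute
```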
    
  11. On the SMS host, add the OpenHPC components used by the clients (compute nodes):
    # yum -y --installroot=$CHROOT install ohpc-base-compute
    # cp -p /etc/resolv.conf $CHROOT/etc/resolv.conf
    # yum -y --installroot=$CHROOT install pbspro-execution-ohpc
    
    # vim $CHROOT/etc/pbs.conf
    PBS_SERVER=sms.example.com
    
    # chroot $CHROOT /opt/pbs/libexec/pbs_habitat
    
    # vim $CHROOT/var/spool/pbs/mom_priv/config
    $clienthost sms
    
    # echo "\$usecp *:/home /home" >> $CHROOT/var/spool/pbs/mom_priv/config
    # chroot $CHROOT systemctl enable pbs
    
    # yum -y --installroot=$CHROOT groupinstall "InfiniBand Support"
    # yum -y --installroot=$CHROOT install infinipath-psm
    # chroot $CHROOT systemctl enable rdma
    
    # yum -y --installroot=$CHROOT install ntp
    
    # yum -y --installroot=$CHROOT install kernel
    
    # yum -y --installroot=$CHROOT install lmod-ohpc
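
The PBS MoM directives in this step begin with a literal `$`, which is why the echo command escapes it as `\$usecp`. The sketch below shows the same two changes made with single quotes so that no escaping is needed. It assumes GNU sed and an existing `$clienthost` line in the file (if there is none, the sed call changes nothing); `configure_mom` is a hypothetical name and the starting value `oldhost` is made up for the demonstration.

```shell
# configure_mom FILE SERVER: set $clienthost to SERVER and add the $usecp rule once.
# Hypothetical helper; single quotes keep the "$" in the directives literal.
configure_mom() {
    # Replace the host on an existing $clienthost line (GNU sed, in place).
    sed -i 's/^\$clienthost .*/$clienthost '"$2"'/' "$1"
    # Append the $usecp mapping only if that exact line is not there yet.
    grep -qxF -- '$usecp *:/home /home' "$1" ||
        printf '%s\n' '$usecp *:/home /home' >> "$1"
}

# Demonstration on a scratch file, not on $CHROOT/var/spool/pbs/mom_priv/config
mom_demo=$(mktemp)
printf '%s\n' '$clienthost oldhost' > "$mom_demo"
configure_mom "$mom_demo" sms
cat "$mom_demo"
rm -f "$mom_demo"
```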
    
  12. On the SMS host, customize the system configuration:
    # wwinit database
    # wwinit ssh_keys
    # cat ~/.ssh/cluster.pub >> $CHROOT/root/.ssh/authorized_keys
    
    # echo "172.16.1.1:/home /home nfs nfsvers=3,rsize=1024,wsize=1024,cto 0 0" >> $CHROOT/etc/fstab
    # echo "172.16.1.1:/opt/ohpc/pub /opt/ohpc/pub nfs nfsvers=3 0 0" >> $CHROOT/etc/fstab
    # echo "/home *(rw,no_subtree_check,fsid=10,no_root_squash)" >> /etc/exports
    # echo "/opt/ohpc/pub *(ro,no_subtree_check,fsid=11)" >> /etc/exports
    # exportfs -a
    # systemctl restart nfs-server
    # systemctl enable nfs-server
    
    # chroot $CHROOT systemctl enable ntpd
    # echo "server 172.16.1.1" >> $CHROOT/etc/ntp.conf
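
Each NFS line added to $CHROOT/etc/fstab only works if the SMS host exports the matching path in /etc/exports. The following sketch cross-checks the two files with awk. `check_nfs_exports` is a hypothetical helper, and it compares paths only: it does not look at export options or client restrictions.

```shell
# check_nfs_exports FSTAB EXPORTS: for every nfs line in FSTAB, report whether
# the remote path (the part after "host:") is listed as an export in EXPORTS.
# Exit status is non-zero if any path is missing.
check_nfs_exports() {
    awk '
        FILENAME == ARGV[1] { exported[$1] = 1; next }   # first file: EXPORTS
        $3 == "nfs" {
            split($1, src, ":")                          # src[1]=host, src[2]=path
            if (src[2] in exported) print "ok      " src[2]
            else { print "missing " src[2]; bad = 1 }
        }
        END { exit bad + 0 }
    ' "$2" "$1"
}

# Demonstration with scratch copies of the lines from this step
nfs_demo=$(mktemp -d)
printf '%s\n' '172.16.1.1:/home /home nfs nfsvers=3,rsize=1024,wsize=1024,cto 0 0' \
              '172.16.1.1:/opt/ohpc/pub /opt/ohpc/pub nfs nfsvers=3 0 0' > "$nfs_demo/fstab"
printf '%s\n' '/home *(rw,no_subtree_check,fsid=10,no_root_squash)' \
              '/opt/ohpc/pub *(ro,no_subtree_check,fsid=11)' > "$nfs_demo/exports"
check_nfs_exports "$nfs_demo/fstab" "$nfs_demo/exports"
rm -rf "$nfs_demo"
```

On the SMS host the call would be `check_nfs_exports $CHROOT/etc/fstab /etc/exports`.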
    
  13. On the SMS host, raise the locked-memory (memlock) limits:
    # perl -pi -e 's/# End of file/\* soft memlock unlimited\n$&/s' /etc/security/limits.conf
    # perl -pi -e 's/# End of file/\* hard memlock unlimited\n$&/s' /etc/security/limits.conf
    
    # perl -pi -e 's/# End of file/\* soft memlock unlimited\n$&/s' $CHROOT/etc/security/limits.conf
    # perl -pi -e 's/# End of file/\* hard memlock unlimited\n$&/s' $CHROOT/etc/security/limits.conf
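
The perl one-liners insert a line before the `# End of file` marker, and running them twice inserts the line twice. Below is a sketch of the same edit that skips entries already present. It assumes GNU sed (for `\n` in the replacement) and that the file contains the `# End of file` marker; `add_memlock` is a hypothetical helper, shown here on a scratch file.

```shell
# add_memlock FILE: insert "* soft|hard memlock unlimited" before "# End of file",
# skipping any entry that is already present. Hypothetical helper.
add_memlock() {
    for kind in soft hard; do
        if grep -Eq "^\*[[:space:]]+${kind}[[:space:]]+memlock[[:space:]]+unlimited" "$1"; then
            continue
        fi
        # "&" re-inserts the matched marker after the new line
        sed -i "s/^# End of file/* ${kind} memlock unlimited\n&/" "$1"
    done
}

# Demonstration on a scratch file instead of /etc/security/limits.conf
limits_demo=$(mktemp)
printf '%s\n' '#<domain> <type> <item> <value>' '# End of file' > "$limits_demo"
add_memlock "$limits_demo"
add_memlock "$limits_demo"   # no further change on the second run
cat "$limits_demo"
rm -f "$limits_demo"
```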
    
  14. On the SMS host, add the BeeGFS file system:
    # wget -P /etc/yum.repos.d https://www.beegfs.io/release/beegfs_6/dists/beegfs-rhel7.repo
    # yum -y install kernel-devel gcc
    # yum -y install beegfs-client beegfs-helperd beegfs-utils
    
    # perl -pi -e "s/^buildArgs=-j8/buildArgs=-j8 BEEGFS_OPENTK_IBVERBS=1/" \
    /etc/beegfs/beegfs-client-autobuild.conf
    
    # /opt/beegfs/sbin/beegfs-setup-client -m sms.example.com
    # systemctl start beegfs-helperd
    # systemctl start beegfs-client (kernel-devel must be installed for this to start correctly; you may also need to set up your own BeeGFS storage nodes)
    
    # wget -P $CHROOT/etc/yum.repos.d https://www.beegfs.io/release/beegfs_6/dists/beegfs-rhel7.repo
    # yum -y --installroot=$CHROOT install beegfs-client beegfs-helperd beegfs-utils
    # perl -pi -e "s/^buildEnabled=true/buildEnabled=false/" $CHROOT/etc/beegfs/beegfs-client-autobuild.conf
    # rm -f $CHROOT/var/lib/beegfs/client/force-auto-build
    # chroot $CHROOT systemctl enable beegfs-helperd beegfs-client
    # cp /etc/beegfs/beegfs-client.conf $CHROOT/etc/beegfs/beegfs-client.conf
    # echo "drivers += beegfs" >> /etc/warewulf/bootstrap.conf
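
The note in this step says beegfs-client needs kernel-devel in order to start. One likely cause of a failed start is a kernel-devel package that does not match the running kernel, for example after an update without a reboot; that is an assumption about the failure mode, not something the BeeGFS documentation is quoted on here. The sketch below compares the two version strings. `kernel_devel_matches` is a hypothetical helper that takes the versions as arguments, so it can be tried without rpm being present.

```shell
# kernel_devel_matches RUNNING [INSTALLED...]: succeed if one of the installed
# kernel-devel versions equals the running kernel release. Hypothetical helper.
kernel_devel_matches() {
    running=$1
    shift
    for installed in "$@"; do
        [ "$installed" = "$running" ] && return 0
    done
    return 1
}

# On the SMS host this could be called as:
#   kernel_devel_matches "$(uname -r)" \
#       $(rpm -q --qf '%{VERSION}-%{RELEASE}.%{ARCH}\n' kernel-devel)
```

If the check fails, installing the kernel-devel package for the running kernel, or rebooting into the newest installed kernel, would be the things to try before starting beegfs-client again.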
    
References
  • https://github.com/openhpc/ohpc/releases/download/v1.3.1.GA/Install_guide-CentOS7-Warewulf-PBSPro-1.3.1-x86_64.pdf
  • https://www.beegfs.io/wiki/BasicConfigurationFirstStartup