https://estar987.tistory.com/164
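If you have finished installing Munge as covered in the previous post, the next step is to install Slurm. This post configures Slurm to run on the master node only. A later post will cover attaching compute nodes to the master node to build an HPC cluster and then using Slurm on it.
All of the steps below are performed on the master node.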
Slurm download
cd /engrid/slurm/src/
wget https://download.schedmd.com/slurm/slurm-23.11.6.tar.bz2
Slurm install
tar xvfj slurm-23.11.6.tar.bz2
cd slurm-23.11.6/
./configure --prefix=/engrid/slurm
make && make install
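Before moving on to configuration, it can help to confirm that the build actually landed under the prefix. The following is a minimal sketch rather than part of the original procedure: the function name `check_slurm_install` is hypothetical, and the file list is just the set of binaries this post uses later.

```shell
# Hypothetical helper: report which of the Slurm binaries used later in this
# post are present (and executable) under a given install prefix.
check_slurm_install() {
    prefix="$1"
    missing=0
    for f in sbin/slurmctld sbin/slurmd bin/sinfo; do
        if [ -x "$prefix/$f" ]; then
            echo "ok      $prefix/$f"
        else
            echo "missing $prefix/$f"
            missing=1
        fi
    done
    # Non-zero exit status if anything was missing.
    return "$missing"
}

# Example (run on the master node after make install):
#   check_slurm_install /engrid/slurm
```

If any line reports `missing`, the `configure` and `make` output is the first place to look.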
Slurm Setting
# mkdir /engrid/slurm/etc
# vim /engrid/slurm/etc/slurm.conf
---------------------------------------------
ClusterName=<your-cluster-name>
SlurmctldHost=slurm01(192.168.207.191)
SlurmctldAddr=192.168.207.191
MpiDefault=none
ProctrackType=proctrack/pgid
SlurmctldPidFile=/var/run/slurmctld.pid
SlurmdPidFile=/var/run/slurmd.pid
SlurmdSpoolDir=/var/spool/slurmd
StateSaveLocation=/var/spool/slurmctld
SwitchType=switch/none
TaskPlugin=task/affinity
SchedulerType=sched/backfill
SelectType=select/cons_tres
SelectTypeParameters=CR_CPU
SlurmctldLogFile=/var/log/slurmctld.log
SlurmdLogFile=/var/log/slurmd.log
---------------------------------------------
# /engrid/slurm/sbin/slurmd -C | head -n 1 >> /engrid/slurm/etc/slurm.conf
# echo "PartitionName=all.q Nodes=slurm01 Default=Yes OverSubscribe=FORCE:1 MaxTime=UNLIMITED" >> /engrid/slurm/etc/slurm.conf
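Because slurm.conf is assembled here from a hand-edited file plus two appended lines, a quick check that the expected entries are present can catch a skipped step. This is only a sketch: `check_slurm_conf` is a hypothetical name, and the key list is simply the set of entries the finished file in this post relies on, not an official list of mandatory parameters. It also assumes the keys are written with exactly this capitalization.

```shell
# Hypothetical helper: print each expected key that has no "Key=" line in
# the given slurm.conf, and return non-zero if any are missing.
check_slurm_conf() {
    conf="$1"
    status=0
    for key in ClusterName SlurmctldHost NodeName PartitionName; do
        if ! grep -q "^${key}=" "$conf"; then
            echo "missing: $key"
            status=1
        fi
    done
    return "$status"
}

# Example:
#   check_slurm_conf /engrid/slurm/etc/slurm.conf
```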
cat << EOL >> /etc/profile.d/slurm.sh
> SLURM_HOME=/engrid/slurm
> PATH=\${SLURM_HOME}/bin:\${PATH}
> LD_LIBRARY_PATH=\${SLURM_HOME}/lib:\${LD_LIBRARY_PATH}
> export SLURM_HOME PATH LD_LIBRARY_PATH
> EOL
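Since the heredoc appends with `>>`, running it twice leaves duplicate lines in /etc/profile.d/slurm.sh, and an empty LD_LIBRARY_PATH ends up with a trailing colon. One possible alternative for the body of that file is sketched below. The helper name `prepend_path` is hypothetical and not something Slurm provides.

```shell
# Hypothetical helper: print a colon-separated path list with DIR put first,
# unless DIR is already present in LIST.
prepend_path() {
    dir="$1"
    list="$2"
    case ":$list:" in
        *":$dir:"*) echo "$list" ;;          # already there, unchanged
        *) echo "$dir${list:+:$list}" ;;     # empty LIST gives just DIR
    esac
}

SLURM_HOME=/engrid/slurm
PATH=$(prepend_path "$SLURM_HOME/bin" "$PATH")
LD_LIBRARY_PATH=$(prepend_path "$SLURM_HOME/lib" "${LD_LIBRARY_PATH:-}")
export SLURM_HOME PATH LD_LIBRARY_PATH
```

Sourcing this more than once leaves PATH and LD_LIBRARY_PATH unchanged after the first time.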
# cat /engrid/slurm/etc/slurm.conf
---------------------------------------------
ClusterName=<your-cluster-name>
SlurmctldHost=slurm01(192.168.207.191)
SlurmctldAddr=192.168.207.191
MpiDefault=none
ProctrackType=proctrack/pgid
SlurmctldPidFile=/var/run/slurmctld.pid
SlurmdPidFile=/var/run/slurmd.pid
SlurmdSpoolDir=/var/spool/slurmd
StateSaveLocation=/var/spool/slurmctld
SwitchType=switch/none
TaskPlugin=task/affinity
SchedulerType=sched/backfill
SelectType=select/cons_tres
SelectTypeParameters=CR_CPU
SlurmctldLogFile=/var/log/slurmctld.log
SlurmdLogFile=/var/log/slurmd.log
NodeName=slurm01 CPUs=12 Boards=1 SocketsPerBoard=6 CoresPerSocket=2 ThreadsPerCore=1 RealMemory=7949
PartitionName=all.q Nodes=slurm01 Default=Yes OverSubscribe=FORCE:1 MaxTime=UNLIMITED
---------------------------------------------
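The NodeName line produced by `slurmd -C` should be internally consistent: CPUs is expected to equal Boards x SocketsPerBoard x CoresPerSocket x ThreadsPerCore (here 1 x 6 x 2 x 1 = 12). If the line is ever edited by hand, a small check like the following can catch a mismatch. This is a sketch only, and `check_node_line` is a hypothetical name.

```shell
# Hypothetical helper: read a NodeName line and check that
# CPUs == Boards * SocketsPerBoard * CoresPerSocket * ThreadsPerCore.
check_node_line() {
    echo "$1" | awk '
    {
        # Split every Key=Value field into the kv array.
        for (i = 1; i <= NF; i++) {
            n = index($i, "=")
            if (n > 0) kv[substr($i, 1, n - 1)] = substr($i, n + 1)
        }
        product = kv["Boards"] * kv["SocketsPerBoard"] * kv["CoresPerSocket"] * kv["ThreadsPerCore"]
        printf "CPUs=%d product=%d\n", kv["CPUs"], product
        exit ((kv["CPUs"] + 0 == product) ? 0 : 1)
    }'
}

# Example:
#   check_node_line "$(grep '^NodeName=' /engrid/slurm/etc/slurm.conf)"
```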
Slurm Run and Check
# /engrid/slurm/sbin/slurmctld
# /engrid/slurm/sbin/slurmd
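Both daemons go to the background right away, so the controller may need a moment before it answers client commands. A small retry wrapper, sketched below, is one way to wait for it. `retry` is a hypothetical helper, and the example assumes `sinfo` exits non-zero while the controller is still unreachable.

```shell
# Hypothetical helper: run a command up to N times, one second apart,
# until it succeeds. Returns non-zero if every attempt failed.
retry() {
    tries="$1"
    shift
    i=1
    while [ "$i" -le "$tries" ]; do
        "$@" && return 0
        i=$((i + 1))
        sleep 1
    done
    return 1
}

# Example: wait up to about 10 seconds for slurmctld to respond.
#   retry 10 /engrid/slurm/bin/sinfo
```

If the command never succeeds, /var/log/slurmctld.log and /var/log/slurmd.log (the log paths set in slurm.conf above) are the places to check.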
# cat /etc/rc.local
---------------------------------------------
#!/bin/bash
/engrid/slurm/sbin/slurmctld
/engrid/slurm/sbin/slurmd
exit 0
---------------------------------------------
chmod 755 /etc/rc.local
cat <<EOL>> /etc/systemd/system/rc-local.service
> [Unit]
> Description=/etc/rc.local Compatibility
> ConditionPathExists=/etc/rc.local
> [Service]
> Type=forking
> ExecStart=/etc/rc.local start
> TimeoutSec=0
> StandardOutput=tty
> RemainAfterExit=yes
> SysVStartPriority=99
> [Install]
> WantedBy=multi-user.target
> EOL
cat /etc/systemd/system/rc-local.service
---------------------------------------------
[Unit]
Description=/etc/rc.local Compatibility
ConditionPathExists=/etc/rc.local
[Service]
Type=forking
ExecStart=/etc/rc.local start
TimeoutSec=0
StandardOutput=tty
RemainAfterExit=yes
SysVStartPriority=99
[Install]
WantedBy=multi-user.target
---------------------------------------------
systemctl daemon-reload
systemctl enable rc-local
systemctl start rc-local
/engrid/slurm/bin/sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
all.q*       up   infinite      1   idle slurm01
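The line to look for is the node showing `idle` in the STATE column. For a scripted check, the state can be pulled out of the default `sinfo` output as in the sketch below. `node_state` is a hypothetical helper, and it assumes the default column layout shown above with a single node name in NODELIST (not a range expression such as `slurm[01-02]`).

```shell
# Hypothetical helper: read default-format sinfo output on stdin and print
# the state of the given node. Fails if the node is not listed.
node_state() {
    awk -v node="$1" '
        # Skip the header; NODELIST is the last column, STATE the one before.
        NR > 1 && $NF == node { print $(NF - 1); found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Example:
#   /engrid/slurm/bin/sinfo | node_state slurm01
```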