728x90
반응형
MIG 설정 순서
- MIG 활성화
- GPU Instance(GI) 생성
- Compute Instance(CI) 생성
MIG 활성화 전 확인
# nvidia-smi
MIG 활성화 /비활성화
nvidia-smi -i [활성화할 GPU 번호] -mig [0/1 비활성화 / 활성화]
- 5번 GPU 활성화
# nvidia-smi -i 5 -mig 1
- 0번 GPU 비활성화
# nvidia-smi -i 0 -mig 0
- 활성화 / 비활성화 후 GPU 리셋
# nvidia-smi --gpu-reset
# nvidia-smi
MIG 프로필 확인
- GPU : 각 GPU 당 7개씩 MIG 나누어진 것 확인
- Instance Free / Total : GI 생성 가능 개수 확인
- Memory GIB 유의해서 원하는 만큼 활성화 시키기
# nvidia-smi mig -lgip
# nvidia-smi mig -lgipp
# nvidia-smi mig -lgipp
GPU 0 Profile ID 19 Placements: {0,1,2,3,4,5,6}:1
GPU 0 Profile ID 20 Placements: {0,1,2,3,4,5,6}:1
GPU 0 Profile ID 15 Placements: {0,2,4,6}:2
GPU 0 Profile ID 14 Placements: {0,2,4}:2
GPU 0 Profile ID 9 Placements: {0,4}:4
GPU 0 Profile ID 5 Placement : {0}:4
GPU 0 Profile ID 0 Placement : {0}:8
GPU 1 Profile ID 19 Placements: {0,1,2,3,4,5,6}:1
GPU 1 Profile ID 20 Placements: {0,1,2,3,4,5,6}:1
GPU 1 Profile ID 15 Placements: {0,2,4,6}:2
GPU 1 Profile ID 14 Placements: {0,2,4}:2
GPU 1 Profile ID 9 Placements: {0,4}:4
GPU 1 Profile ID 5 Placement : {0}:4
GPU 1 Profile ID 0 Placement : {0}:8
MIG GI(GPU Instance) 생성
MIG 프로필 ID로 생성
- GPU를 선택하지 않으면 0번 GPU로 자동 할당된다.
# nvidia-smi mig -cgi 15
- GPU를 선택하면 선택한 GPU로 할당된다.
# nvidia-smi mig -i 6 -cgi 19
Successfully created GPU instance ID 13 on GPU 6 using profile MIG 1g.10gb (ID 19)
- 조회 (GI생성 확인)
# nvidia-smi mig -lgi
+-------------------------------------------------------+
| GPU instances: |
| GPU Name Profile Instance Placement |
| ID ID Start:Size |
|=======================================================|
| 6 MIG 1g.10gb 19 13 6:1 |
+-------------------------------------------------------+
| 7 MIG 4g.40gb 5 1 0:4 |
+-------------------------------------------------------+
| 7 MIG 1g.20gb 15 6 6:2 |
+-------------------------------------------------------+
MIG 이름로 생성
- # nvidia-smi mig -lgip 에서 나온 이름으로 생성한다.
EX
# nvidia-smi mig -lgip
+-----------------------------------------------------------------------------+
| GPU instance profiles: |
| GPU Name ID Instances Memory P2P SM DEC ENC |
| Free/Total GiB CE JPEG OFA |
|=============================================================================|
+-----------------------------------------------------------------------------+
| 6 MIG 1g.10gb 19 6/7 9.75 No 16 1 0 |
| 1 1 0 |
+-----------------------------------------------------------------------------+
| 6 MIG 1g.10gb+me 20 1/1 9.75 No 16 1 0 |
| 1 1 1 |
+-----------------------------------------------------------------------------+
| 6 MIG 1g.20gb 15 3/4 19.62 No 26 1 0 |
| 1 1 0 |
+-----------------------------------------------------------------------------+
| 6 MIG 2g.20gb 14 3/3 19.62 No 32 2 0 |
| 2 2 0 |
+-----------------------------------------------------------------------------+
| 6 MIG 3g.40gb 9 1/2 39.38 No 60 3 0 |
| 3 3 0 |
+-----------------------------------------------------------------------------+
| 6 MIG 4g.40gb 5 1/1 39.38 No 64 4 0 |
| 4 4 0 |
+-----------------------------------------------------------------------------+
| 6 MIG 7g.80gb 0 0/1 79.12 No 132 7 0 |
| 8 7 1 |
+-----------------------------------------------------------------------------+
- nvidia-smi mig -i [GPU 번호] -cgi [2. instance Name]
[실습 1] 단일 생성
# nvidia-smi mig -i 6 -cgi 1g.10gb
Successfully created GPU instance ID 11 on GPU 6 using profile MIG 1g.10gb (ID 19)
[실습 2] 다중 생성
# nvidia-smi mig -cgi 1g.10gb,”MIG 1g.10gb”
Successfully created GPU instance ID 9 on GPU 5 using profile MIG 1g.10gb (ID 19)
Successfully created GPU instance ID 7 on GPU 5 using profile MIG 1g.10gb (ID 19)
- MIG GI 추가 확인
- 6번 GPU에 Instance ID 11번 추가됨
# nvidia-smi mig -lgi
+-------------------------------------------------------+
| GPU instances: |
| GPU Name Profile Instance Placement |
| ID ID Start:Size |
|=======================================================|
| 6 MIG 1g.10gb 19 11 4:1 |
+-------------------------------------------------------+
| 7 MIG 4g.40gb 5 1 0:4 |
+-------------------------------------------------------+
| 7 MIG 1g.20gb 15 6 6:2 |
+-------------------------------------------------------+
CI(Compute Instance) 생성
- CI를 생성할 GI 조회
- 1. GPU 번호(6,7) Instance ID(11, 13, 1, 6)
# nvidia-smi mig -lgi
+-------------------------------------------------------+
| GPU instances: |
| GPU Name Profile Instance Placement |
| ID ID Start:Size |
|=======================================================|
| 6 MIG 1g.10gb 19 11 4:1 |
+-------------------------------------------------------+
| 7 MIG 4g.40gb 5 1 0:4 |
+-------------------------------------------------------+
| 7 MIG 1g.20gb 15 6 6:2 |
+-------------------------------------------------------+
- nvidia-smi mig -i [1. GPU 번호] -cci -gi [2. Instance ID]
[실습 1] 단일 생성
# nvidia-smi mig -i 6 -cci -gi 11
Successfully created compute instance ID 0 on GPU 6 GPU instance ID 11 using profile MIG 1g.
10gb (ID 0)
[실습 2] 다중 생성
# nvidia-smi mig -i 7 -cci -gi 1,6
Successfully created compute instance ID 0 on GPU 7 GPU instance ID 1 using profile MIG 1g.10gb (ID 0)
Successfully created compute instance ID 0 on GPU 7 GPU instance ID 6 using profile MIG 1g.10gb (ID 0)
- 확인(전체)
# nvidia-smi mig -lci
+--------------------------------------------------------------------+
| Compute instances: |
| GPU GPU Name Profile Instance Placement |
| Instance ID ID Start:Size |
| ID |
|====================================================================|
| 6 11 MIG 1g.10gb 0 0 0:1 |
+--------------------------------------------------------------------+
| 7 1 MIG 4g.40gb 3 0 0:4 |
+--------------------------------------------------------------------+
| 7 6 MIG 1g.20gb 7 0 0:2 |
+--------------------------------------------------------------------+
- 확인(원하는 GPU만)
# nvidia-smi mig -i 7 -lci
+--------------------------------------------------------------------+
| Compute instances: |
| GPU GPU Name Profile Instance Placement |
| Instance ID ID Start:Size |
| ID |
|====================================================================|
| 7 1 MIG 4g.40gb 3 0 0:4 |
+--------------------------------------------------------------------+
| 7 6 MIG 1g.20gb 7 0 0:2 |
+--------------------------------------------------------------------+
MIG GI & MIG CI 동시 생성
- 사전 작업으로 2번 GPU 활성화 시켜준다.
# nvidia-smi -i 2 -mig 1
Enabled MIG Mode for GPU 00000000:3F:00.0
Warning: persistence mode is disabled on device 00000000:3F:00.0. See the Known Issues section
of the nvidia-smi(1) man page for more information. Run with [--help | -h] switch to get more
information on how to enable persistence mode.
All done.
- nvidia-smi mig -i [1. GPU 번호] -cci -gi [2. Profile ID] -C
# nvidia-smi mig -i 2 -cgi 19 -C
Successfully created GPU instance ID 13 on GPU 2 using profile MIG 1g.10gb (ID 19)
Successfully created compute instance ID 0 on GPU 2 GPU instance ID 13 using profile MIG 1g.
10gb (ID 0)
- 확인(2번 GPU 생성 확인) - GI
# nvidia-smi mig -lgi
+-------------------------------------------------------+
| GPU instances: |
| GPU Name Profile Instance Placement |
| ID ID Start:Size |
|=======================================================|
| 2 MIG 1g.10gb 19 13 6:1 |
+-------------------------------------------------------+
| 6 MIG 1g.10gb 19 11 4:1 |
+-------------------------------------------------------+
| 7 MIG 4g.40gb 5 1 0:4 |
+-------------------------------------------------------+
| 7 MIG 1g.20gb 15 6 6:2 |
+-------------------------------------------------------+
- 확인(2번 GPU 생성 확인) - CI
# nvidia-smi mig -lci
+--------------------------------------------------------------------+
| Compute instances: |
| GPU GPU Name Profile Instance Placement |
| Instance ID ID Start:Size |
| ID |
|====================================================================|
| 2 13 MIG 1g.10gb 0 0 0:1 |
+--------------------------------------------------------------------+
| 6 11 MIG 1g.10gb 0 0 0:1 |
+--------------------------------------------------------------------+
| 7 1 MIG 4g.40gb 3 0 0:4 |
+--------------------------------------------------------------------+
| 7 6 MIG 1g.20gb 7 0 0:2 |
+--------------------------------------------------------------------+
MIG 활성화, MIG GI & CI 순서대로 생성 후 MIG 생성됐는지 확인
# nvidia-smi
....
+-----------------------------------------------------------------------------------------+
| MIG devices: |
+------------------+----------------------------------+-----------+-----------------------+
| GPU GI CI MIG | Memory-Usage | Vol| Shared |
| ID ID Dev | BAR1-Usage | SM Unc| CE ENC DEC OFA JPG |
| | | ECC| |
|==================+==================================+===========+=======================|
| 2 1 0 0 | 51MiB / 40320MiB | 32 0 | 4 0 4 0 4 |
| | 0MiB / 65535MiB | | |
+------------------+----------------------------------+-----------+-----------------------+
....
MIG CI 삭제
- nvidia-smi mig -dci -i [1.Compute GPU] -gi [2. GPU Instance ID]
[실습 1] 단일 삭제
# nvidia-smi mig -dci -i 7 -gi 6
Successfully destroyed compute instance ID 0 from GPU 7 GPU instance ID 6
- 확인
# nvidia-smi mig -lci
+--------------------------------------------------------------------+
| Compute instances: |
| GPU GPU Name Profile Instance Placement |
| Instance ID ID Start:Size |
| ID |
|====================================================================|
| 2 13 MIG 1g.10gb 0 0 0:1 |
+--------------------------------------------------------------------+
| 6 11 MIG 1g.10gb 0 0 0:1 |
+--------------------------------------------------------------------+
| 7 1 MIG 4g.40gb 3 0 0:4 |
+--------------------------------------------------------------------+
[실습 2] 다중 삭제
# nvidia-smi mig -dci -i 5 -gi 7,9
Successfully destroyed compute instance ID 0 from GPU 5 GPU instance ID 7
Successfully destroyed compute instance ID 0 from GPU 5 GPU instance ID 9
MIG GI 삭제
- nvidia-smi mig -dgi -i [1.Compute GPU] -gi [2.Instance ID]
- 조회
# nvidia-smi mig -lgi
+-------------------------------------------------------+
| GPU instances: |
| GPU Name Profile Instance Placement |
| ID ID Start:Size |
|=======================================================|
| 2 MIG 1g.10gb 19 13 6:1 |
+-------------------------------------------------------+
| 6 MIG 1g.10gb 19 11 4:1 |
+-------------------------------------------------------+
| 7 MIG 4g.40gb 5 1 0:4 |
+-------------------------------------------------------+
| 7 MIG 1g.20gb 15 6 6:2 |
+-------------------------------------------------------+
[실습 1] 단일 삭제
# nvidia-smi mig 2 -dgi -gi 13
[실습 2] 다중 삭제
# nvidia-smi mig -dci -i 5 -gi 7,9
Successfully destroyed compute instance ID 0 from GPU 5 GPU instance ID 7
Successfully destroyed compute instance ID 0 from GPU 5 GPU instance ID 9
[실습 3] 전체 삭제
# nvidia-smi mig -dci
Successfully destroyed compute instance ID 0 from GPU 2 GPU instance ID 13
Successfully destroyed compute instance ID 0 from GPU 6 GPU instance ID 11
Successfully destroyed compute instance ID 0 from GPU 7 GPU instance ID 1
# nvidia-smi mig -dgi
Successfully destroyed GPU instance ID 13 from GPU 2
Successfully destroyed GPU instance ID 11 from GPU 6
Successfully destroyed GPU instance ID 1 from GPU 7
Successfully destroyed GPU instance ID 6 from GPU 7
- 조회
# nvidia-smi mig -lci
No GPU instances found: Not Found
# nvidia-smi mig -lgi
No GPU instances found: Not Found
Trouble shooting
- GI 먼저 삭제하면 삭제가 안된다.
- 지우는 순서는 생성 역순으로 CI 지우고 GI 지우고 비활성화
반응형
'NVIDIA' 카테고리의 다른 글
[NVIDIA] In use by another client(프로세스 충돌) (0) | 2024.10.18 |
---|---|
[NVIDIA] MIG(Multi-Instance-GPU) Docker 컨테이너에 할당 (0) | 2024.08.03 |
[NVIDIA] MIG 활용시 배포 및 시스템 고려 사항 (0) | 2024.08.01 |
[NVIDIA] MIG를 활용한 고성능 컴퓨팅 환경 구축 (0) | 2024.07.31 |
[NVIDIA] NVIDIA Multi-Instance GPU (MIG) 개요 및 가이드 (0) | 2024.07.30 |