
😡 Diagnosing and resolving "1 node(s) had untolerated taint(node.kubernetes.io/disk-pressure:)"

by 팡펑퐁 2025. 1. 15.

🚨 Error :

 

🚨
While testing with several Pods running, I created a new Pod and it stayed in Pending for a long time before finally changing to Running.
When a Pod runs into a problem and cannot reach the Running state, you can check the Pod's Events with the command below.

 

$ kubectl describe po <pod-name>

# Output
...
Warning  FailedScheduling  6m21s  default-scheduler  0/2 nodes are available: 
1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }, 
1 node(s) had untolerated taint {node.kubernetes.io/disk-pressure: }. 
preemption: 0/2 nodes are available: 2 Preemption is not helpful for scheduling.
  • When a Pod is placed on a node, taints and tolerations are one of several factors that decide which node it lands on.
  • Put very simply, a taint is a "stain": you can think of it as smearing a stain on a node.
    • Picture the stain as mosquito repellent.
  • A toleration, conversely, means the ability to withstand a taint.
    • Picture a mosquito that is immune to the repellent.
  • An ordinary mosquito cannot get near a node coated in repellent.
  • Translated back to Pods: a Pod (mosquito) without a toleration for the repellent cannot be scheduled (run) on that node.
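
To make the analogy concrete, here is a deliberately simplified model of the check in plain shell. The function name `can_schedule` and its space-separated key lists are made up for this illustration. The real scheduler also compares each taint's value and effect and supports the `Exists` operator, all of which this sketch ignores.

```shell
# Simplified model of taint/toleration matching (illustration only).
# $1: space-separated taint keys on the node
# $2: space-separated taint keys the Pod tolerates
# Returns 0 if every taint is tolerated, 1 otherwise.
can_schedule() {
  for taint in $1; do
    case " $2 " in
      *" $taint "*) ;;   # tolerated, check the next taint
      *) return 1 ;;     # one untolerated taint is enough to reject the node
    esac
  done
  return 0
}

# A Pod with no tolerations meets a node under disk pressure.
if can_schedule "node.kubernetes.io/disk-pressure" ""; then
  echo "schedulable"
else
  echo "untolerated taint"
fi
```

A node with no taints accepts any Pod in this model, which matches the usual behaviour of an untainted worker node.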



🤓 Cause :

 

  • So the error above means that a Pod without tolerations for the two taints node-role.kubernetes.io/control-plane and node.kubernetes.io/disk-pressure could not be scheduled on the nodes carrying them.
  • By default, Pods are not scheduled on the control-plane (master node).
    • The reason is, again, a taint: node-role.kubernetes.io/control-plane.
    • This is the default taint set on the master node.
      • If you look at the Taints section of the master node with the describe command,
      • it reads Taints:             node-role.kubernetes.io/control-plane:NoSchedule.
  • Now let's work out why the error occurred.
    • The worker node carried the node.kubernetes.io/disk-pressure taint, so the scheduler ruled it out.
    • The control-plane (master node) was ruled out as well because of its node-role.kubernetes.io/control-plane taint. With neither of the two nodes passing the scheduler's filtering, the Pod stayed Pending with the error above.
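
To check which taints a node currently carries, the Taints section can be pulled out of the describe output. The helper below is a small sketch: `taints_from_describe` is a name I made up, and it assumes the layout seen in typical `kubectl describe node` output, where the first taint follows `Taints:` and any further taints sit on indented lines directly below.

```shell
# Print one taint per line from `kubectl describe node` output read on stdin.
# Assumes extra taints appear on indented continuation lines under "Taints:".
taints_from_describe() {
  awk '
    /^Taints:/            { in_taints = 1; print $2; next }
    in_taints && /^[ \t]/ { print $1; next }
    { in_taints = 0 }
  '
}

# Against a live cluster (the node name is a placeholder):
#   kubectl describe node <node-name> | taints_from_describe
printf 'Taints:             node-role.kubernetes.io/control-plane:NoSchedule\nUnschedulable:      false\n' \
  | taints_from_describe
```

Running it against the worker node while the Pod is Pending should show whether the disk-pressure taint is present at that moment.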


🚒 Solution :

 

node.kubernetes.io/disk-pressure

  • node.kubernetes.io/disk-pressure indicates that the node's disk usage is too high to allocate any more resources. Kubernetes adds this taint automatically while the node reports the DiskPressure condition.
  • While a node is under disk pressure, no new Pods can be scheduled on it.

 

💡
The problem arose because too many Pods were running for the worker node's disk capacity, so I removed every Pod that was not in use.

It happened because I did not know how much headroom the worker node had, so let's take this opportunity to check the worker node's available disk capacity.

 

$ kubectl describe nodes <worker-node-name>

# Output

Conditions:
  Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                 ------  -----------------                 ------------------                ------                       -------
  NetworkUnavailable   False   Mon, 06 Jan 2025 11:16:31 +0000   Mon, 06 Jan 2025 11:16:31 +0000   FlannelIsUp                  Flannel is running on this node
  MemoryPressure       False   Wed, 15 Jan 2025 03:58:07 +0000   Mon, 06 Jan 2025 11:15:42 +0000   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure         False   Wed, 15 Jan 2025 03:58:07 +0000   Wed, 15 Jan 2025 03:49:07 +0000   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure          False   Wed, 15 Jan 2025 03:58:07 +0000   Mon, 06 Jan 2025 11:15:42 +0000   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready                True    Wed, 15 Jan 2025 03:58:07 +0000   Mon, 06 Jan 2025 11:16:29 +0000   KubeletReady                 kubelet is posting ready status
  
  ...
  
Capacity:
  cpu:                8
  ephemeral-storage:  37206272Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             24306844Ki
  pods:               110
Allocatable:
  cpu:                8
  ephemeral-storage:  34289300219
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             24204444Ki
  pods:               110
  
  ...
Events:
  Type     Reason                 Age                 From     Message
  ----     ------                 ----                ----     -------
  Warning  EvictionThresholdMet   16m (x2 over 27m)   kubelet  Attempting to reclaim ephemeral-storage
  Normal   NodeHasDiskPressure    16m (x2 over 27m)   kubelet  Node k8s-worker1 status is now: NodeHasDiskPressure
  Normal   NodeHasNoDiskPressure  10m (x3 over 7d1h)  kubelet  Node k8s-worker1 status is now: NodeHasNoDiskPressure

 

Conditions

  • Conditions breaks the node's network, memory, disk, and other limits down by type and shows the current status of each.
  • Ready should be True when the node is up and working normally.
  • The names of the other types suggest that False is the healthy value, and the output confirms the node is currently healthy.
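
The rule above (Ready should be True, the pressure-type conditions should be False) is easy to check mechanically. The following is a rough sketch, with `unhealthy_conditions` being a hypothetical helper name, that reads describe output on stdin and prints only the conditions that deviate. It treats every type other than Ready as "should be False", which holds for the types shown above but is an assumption for anything else.

```shell
# Read `kubectl describe node` output on stdin and print every condition
# whose Status differs from the healthy value.
unhealthy_conditions() {
  awk '
    # Keep only rows whose second column is a condition status.
    $2 != "True" && $2 != "False" && $2 != "Unknown" { next }
    $1 == "Ready" && $2 != "True"  { print $1 "=" $2; next }
    $1 != "Ready" && $2 != "False" { print $1 "=" $2 }
  '
}

# Against a live cluster (the node name is a placeholder):
#   kubectl describe node <node-name> | unhealthy_conditions
printf '%s\n' \
  '  DiskPressure   True   ...   KubeletHasDiskPressure   kubelet has disk pressure' \
  '  Ready          True   ...   KubeletReady             kubelet is posting ready status' \
  | unhealthy_conditions
# prints: DiskPressure=True
```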

 

Capacity & Allocatable

  • These show the total capacity and the allocatable amount for each resource.
  • Let's take a quick look at ephemeral-storage here.
    • ephemeral-storage is the node-local temporary storage that containers use in Kubernetes, such as container writable layers, logs, and emptyDir volumes.
    • Data in a container's writable layer is kept only while the container exists and is lost when the container is deleted or restarted; an emptyDir volume lasts as long as its Pod.
    • In other words, this storage shares the lifecycle of the container or Pod that uses it.
  • So the more Pods you run on one node, the more containers are running and the less free ephemeral storage remains on that node.
  • This capacity comes from the instance's disk, and that is where the problem arose.
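
The raw figures in the output are hard to read at a glance (one is in Ki, the other in bytes), so here is a small conversion sketch. `to_gib` is a made-up helper that only understands plain byte counts and the `Ki` suffix, which are the two forms that appear in the output above.

```shell
# Convert a quantity given in bytes or Ki to GiB, rounded to two decimals.
to_gib() {
  echo "$1" | awk '
    /Ki$/ { sub(/Ki$/, ""); printf "%.2f\n", $1 / (1024 * 1024); next }
          { printf "%.2f\n", $1 / (1024 * 1024 * 1024) }
  '
}

to_gib 37206272Ki     # Capacity ephemeral-storage    -> 35.48
to_gib 34289300219    # Allocatable ephemeral-storage -> 31.93
```

So this worker node has roughly 35.5 GiB of disk in total, of which about 31.9 GiB is allocatable to Pods.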

 

Events

  • Reading the events in order: a warning was raised because ephemeral-storage hit its eviction threshold, then DiskPressure turned True because disk usage had reached the threshold, which made it impossible to schedule new Pods, and later the condition cleared.
    • It most likely cleared at the moment I deleted the Pods I was not using.

 

Conclusion

  • Since this is a service still under test, periodically deleting Pods keeps the problem in check for now, but the disk capacity will probably need to be increased later.

 

  • If you are looking for a wider range of fixes, see the follow-up post:

https://suzuworld.tistory.com/453

kubernetes node : kubelet has disk pressure, a roundup of fixes (pod status evicted, Attempting to reclaim ephemeral-storage)

 

 

🤔 Open questions :

None!

 

 

 

References

ChatGPT
