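Adding Thanos and NCP Object Storage (S3 API) to an NCP Kubernetes + Prometheus + Alertmanager environment gives you long-term GPU metric history that can back auditing and cost analysis.

Below is a standard, command-focused installation guide intended for production use.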

🧩 Overall Architecture

Prometheus
   │ (sidecar)
   ▼
Thanos Sidecar
   │
   ▼
NCP Object Storage (S3 API)
   │
   ▼
Thanos Store
   │
   ▼
Thanos Query → Grafana

🧩 1️⃣ Prepare the NCP Object Storage S3 details

In the NCP console, go to Object Storage and create a bucket.

Example:

Bucket: ncp-prometheus-metrics
Region: kr-standard

S3 API details:

Item	Value
Endpoint	https://kr.object.ncloudstorage.com
AccessKey	Issued in the NCP console
SecretKey	Issued in the NCP console

🧩 2️⃣ Create the Secret for the Thanos S3 configuration

cat <<EOF > thanos-objstore.yaml
type: S3
config:
  bucket: "ncp-prometheus-metrics"
  endpoint: "kr.object.ncloudstorage.com"
  access_key: "NCP_ACCESS_KEY"
  secret_key: "NCP_SECRET_KEY"
  insecure: false
  signature_version2: false
EOF

kubectl create secret generic thanos-objstore \
--from-file=thanos.yaml=thanos-objstore.yaml \
-n monitoring
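
To keep the access keys out of a committed file or shell history, the same config can be rendered from environment variables. The following is a minimal sketch, not part of Thanos itself: render_objstore_config is a hypothetical helper, and NCP_ACCESS_KEY, NCP_SECRET_KEY and NCP_BUCKET are assumed variable names that you export yourself.

```shell
# Hypothetical helper: write the Thanos objstore config to the given file
# (default thanos-objstore.yaml) using credentials from the environment.
render_objstore_config() {
  out="${1:-thanos-objstore.yaml}"
  # Abort with a message if either credential is missing.
  : "${NCP_ACCESS_KEY:?export NCP_ACCESS_KEY first}"
  : "${NCP_SECRET_KEY:?export NCP_SECRET_KEY first}"
  cat > "$out" <<EOF
type: S3
config:
  bucket: "${NCP_BUCKET:-ncp-prometheus-metrics}"
  endpoint: "kr.object.ncloudstorage.com"
  access_key: "${NCP_ACCESS_KEY}"
  secret_key: "${NCP_SECRET_KEY}"
  insecure: false
  signature_version2: false
EOF
  # The file holds credentials, so restrict it to the current user.
  chmod 600 "$out"
}

# Usage (then create the Secret from the rendered file as shown above):
#   export NCP_ACCESS_KEY=... NCP_SECRET_KEY=...
#   render_objstore_config thanos-objstore.yaml
```

Delete the rendered file once the Secret exists; the cluster only needs the Secret.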

🧩 3️⃣ Add the Thanos Helm repo

helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update

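🧩 4️⃣ Attach the Thanos Sidecar to Prometheus

Prometheus needs a Thanos sidecar so that its blocks are shipped to Object Storage.

Create a values file:

cat <<EOF > prometheus-thanos.yaml
prometheus:
  prometheusSpec:
    thanos:
      objectStorageConfig:
        name: thanos-objstore
        key: thanos.yaml
EOF

Depending on the kube-prometheus-stack chart version, name and key may have to be nested under objectStorageConfig.existingSecret. Check helm show values prometheus-community/kube-prometheus-stack for your version.

Upgrade (--reuse-values keeps the values already set on the release):

helm upgrade prometheus prometheus-community/kube-prometheus-stack \
-n monitoring \
--reuse-values \
-f prometheus-thanos.yaml

Verify (the sidecar is a container inside the Prometheus pod, not a separate pod, so list container names):

kubectl get pods -n monitoring -l app.kubernetes.io/name=prometheus \
-o jsonpath='{.items[*].spec.containers[*].name}'

→ The output should include thanos-sidecar.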

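🧩 5️⃣ Install Thanos Query + Store + Compactor

helm install thanos bitnami/thanos \
  -n monitoring \
  --set existingObjstoreSecret=thanos-objstore \
  --set 'existingObjstoreSecretItems[0].key=thanos.yaml' \
  --set 'existingObjstoreSecretItems[0].path=objstore.yml' \
  --set query.enabled=true \
  --set storegateway.enabled=true \
  --set compactor.enabled=true \
  --set ruler.enabled=false

The Bitnami chart takes an existing Secret through existingObjstoreSecret and expects the file to be mounted as objstore.yml, so the thanos.yaml key created in step 2 is mapped to that path.

Verify:

kubectl get pods -n monitoring | grep thanos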
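
As the number of options grows, a values file is easier to review than a chain of --set flags. The sketch below shows one possible shape, under several assumptions: the Service name prometheus-kube-prometheus-thanos-discovery is a guess based on a release named prometheus with prometheus.thanosService.enabled=true in kube-prometheus-stack, and the retention durations are illustrative only. Confirm the Service name with kubectl get svc -n monitoring before using it.

```shell
# Sketch of a values file for the bitnami/thanos release.
# Service name and retention values are assumptions; adjust to your cluster.
cat > thanos-values.yaml <<'EOF'
existingObjstoreSecret: thanos-objstore
existingObjstoreSecretItems:
  - key: thanos.yaml      # key used when the Secret was created
    path: objstore.yml    # file name the chart expects
query:
  enabled: true
  dnsDiscovery:
    enabled: true
    sidecarsService: prometheus-kube-prometheus-thanos-discovery  # assumed name
    sidecarsNamespace: monitoring
storegateway:
  enabled: true
compactor:
  enabled: true
  retentionResolutionRaw: 30d    # illustrative
  retentionResolution5m: 180d    # illustrative
  retentionResolution1h: 365d    # illustrative
ruler:
  enabled: false
EOF

# Apply with:
#   helm upgrade --install thanos bitnami/thanos -n monitoring -f thanos-values.yaml
```

The compactor retention settings decide how far back queries can reach, so they should match the history you actually need (six months, a year, and so on).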

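🧩 6️⃣ Check the Thanos Query connection

kubectl port-forward -n monitoring svc/thanos-query 9091:9090

→ Open http://localhost:9091

Check the store connections:

Under Status → Stores, the Prometheus sidecar should be listed. If it is missing, Thanos Query has not been pointed at the sidecar's gRPC endpoint.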
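
The same check can be scripted against the Thanos Query HTTP API while the port-forward is running. This is a small sketch: thanos_ready and thanos_stores are hypothetical helper names, and THANOS_QUERY_URL is an assumed variable that defaults to the forwarded port used above.

```shell
# Base URL of Thanos Query; defaults to the local port-forward.
THANOS_QUERY_URL="${THANOS_QUERY_URL:-http://localhost:9091}"

# Succeeds (exit 0) once Thanos Query reports itself ready.
thanos_ready() {
  curl -fsS "$THANOS_QUERY_URL/-/ready"
}

# Prints the store endpoints known to Thanos Query as JSON.
thanos_stores() {
  curl -fsS "$THANOS_QUERY_URL/api/v1/stores"
}

# Usage (with the port-forward active in another terminal):
#   thanos_ready && thanos_stores
```

The sidecar and the store gateway should both appear in the JSON output; an empty list points to a discovery problem rather than a storage problem.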

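🧩 7️⃣ Connect Grafana to Thanos Query

In Grafana, go to Data Sources and add a Prometheus data source.

URL:

http://thanos-query.monitoring.svc.cluster.local:9090

Switching dashboards from the existing Prometheus data source to Thanos makes historical GPU usage queryable (for example, the past six months), as far back as blocks are retained in Object Storage.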
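
If you prefer not to click through the UI, the data source can also be declared as a Grafana provisioning file. The sketch below writes one; the data source name Thanos and the file name thanos-datasource.yaml are arbitrary choices, and how the file reaches Grafana (mounted provisioning directory, or a labeled ConfigMap picked up by a sidecar) depends on how Grafana was installed.

```shell
# Sketch: Grafana data source provisioning file pointing at Thanos Query.
cat > thanos-datasource.yaml <<'EOF'
apiVersion: 1
datasources:
  - name: Thanos            # arbitrary display name
    type: prometheus        # Thanos Query speaks the Prometheus HTTP API
    access: proxy
    url: http://thanos-query.monitoring.svc.cluster.local:9090
    isDefault: false
EOF
```

Keeping isDefault set to false leaves existing dashboards on Prometheus until you switch them over deliberately.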

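🧩 8️⃣ Confirm that uploads are working

kubectl logs -n monitoring -l app.kubernetes.io/name=prometheus -c thanos-sidecar --tail=-1 | grep -i upload

If a line reporting a block upload appears (for example msg="upload new block"), it works 🎯

Prometheus cuts a new block roughly every two hours, so the first upload may take a while to show up.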

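🎯 What this setup gives you

The NCP GPU cluster now has:

Capability	Implementation
Real-time GPU monitoring	Prometheus
Alerting	Alertmanager
Long-term (e.g. one-year) GPU usage history	Thanos + Object Storage
GPU cost auditing	Thanos
Incident root-cause tracing	Thanos

→ A GPUaaS-grade platform is in place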

🚀 What to do next

Now possible:

Monthly GPU usage per user
GPU cost per team
Point-in-time analysis of past incidents
Tracking cost wasted on idle GPUs
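
As a starting point for these reports, the sketch below builds a PromQL expression for average GPU utilization per namespace and sends it to Thanos Query. It rests on assumptions not stated in this guide: that GPU metrics come from NVIDIA dcgm-exporter (metric DCGM_FI_DEV_GPU_UTIL), that the series carry a namespace label, and that Thanos Query is reachable at THANOS_QUERY_URL. The function names are hypothetical.

```shell
# Base URL of Thanos Query; defaults to the local port-forward from step 6.
THANOS_QUERY_URL="${THANOS_QUERY_URL:-http://localhost:9091}"

# Print a PromQL expression: average GPU utilization per namespace
# over the given window (default 30d).
gpu_util_by_namespace() {
  printf 'avg by (namespace) (avg_over_time(DCGM_FI_DEV_GPU_UTIL[%s]))' "${1:-30d}"
}

# Send the expression to the Thanos Query HTTP API and print the JSON result.
run_gpu_report() {
  curl -fsSG "$THANOS_QUERY_URL/api/v1/query" \
    --data-urlencode "query=$(gpu_util_by_namespace "$1")"
}

# Usage:
#   run_gpu_report 30d
```

Namespaces whose average stays near zero over the window are candidates for the idle-GPU review; mapping utilization to cost still requires a per-GPU price that this guide does not define.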
