Appearance
MariaDB 遷移(Bitnami Galera → 官方 StatefulSet)
在 AKS 環境將 Bitnami MariaDB Galera Helm Chart 遷移至官方
mariadb:12.2映像的六個 Phase SOP:mydumper/myloader 平行匯出入、匯入加速設定與正式環境切換、Rollback 策略。
概述
Bitnami 的 MariaDB Galera Helm Chart 映像從 bitnami/mariadb-galera 更名為 bitnamilegacy/mariadb-galera 後,長期維護性存疑。切換至官方 mariadb 映像(Docker Hub 官方維護)可以獲得更長的版本支援週期,並將部署架構從 Galera 多主同步降為更易維護的 Standalone StatefulSet。
遷移的核心挑戰是資料目錄路徑不同(Bitnami: /bitnami/mariadb/data;官方: /var/lib/mysql),因此無法直接複製 PVC,必須透過邏輯匯出匯入完成遷移。本文使用 mydumper/myloader 工具進行平行匯出入,相較於 mysqldump/mysql 有顯著的速度優勢。
遷移設計為「不中斷正式服務」模式:先建立新環境空實例、匯入資料、驗證完整性,最後在停機窗口(通常僅需數分鐘)切換流量,舊環境保留 7 天做 Rollback 用。
核心內容
環境對照
| 項目 | 舊環境(Bitnami Galera) | 新環境(官方) |
|---|---|---|
| Image | bitnamilegacy/mariadb-galera:12.0.2 | mariadb:12.2 |
| Pod | aks-mariadb-galera-0 | mariadb-0 |
| Service | aks-mariadb-galera | mariadb |
| 資料目錄 | /bitnami/mariadb/data | /var/lib/mysql |
| 資源 | CPU 3 / Mem 12Gi | CPU 3 / Mem 12Gi |
| PVC | data-aks-mariadb-galera-0 | data-mariadb-0(256Gi) |
mydumper/myloader 的優勢
相較於傳統 mysqldump/mysql 的優點:
- 平行處理:
--threads=8同時匯出/匯入多張表,速度遠快於單線程 - ZSTD 壓縮:
--compress=zstd壓縮率約 50-60%,大幅降低備份儲存需求 - Chunk 分割:
--rows=100000 --chunk-filesize=512將大表切成小檔案 - 一致性保證:
--trx-consistency-only確保匯出時的資料一致性 - 完整性:
--triggers --routines --events保留所有資料庫物件
匯入加速設定策略
匯入期間暫時關閉部分安全機制,匯入完成後必須立即切回 production.cnf:
| 設定 | 匯入期 | 正式環境 | 說明 |
|---|---|---|---|
innodb_doublewrite | OFF | ON | 斷電保護 |
log_bin | OFF | ON | 複製與時間點恢復 |
innodb_flush_log_at_trx_commit | 0 | 1 | 崩潰後恢復能力 |
foreign_key_checks | 0 | 1 | 匯入時跳過外鍵驗證 |
ConfigMap 同時包含 custom.cnf(匯入加速)與 production.cnf(正式環境),匯入完成後將 custom.cnf 替換為 production.cnf 內容並重啟 Pod。
流量切換策略
方式 A(推薦):修改應用程式的 DB_HOST 環境變數指向新 Service。操作最簡單,Rollback 只需改回環境變數。
方式 B:讓新 Service 接管舊 Service 名稱(aks-mariadb-galera),應用程式不需修改。步驟較多,且刪除舊 Service 後若需 Rollback 需重建。
關鍵要點
- mydumper 備份 PVC 建議 80Gi(100GB 資料 ZSTD 壓縮後約 40-60GB)
- 新 StatefulSet PVC 建議 256Gi(原始資料加索引重建空間需 1.5-2 倍)
- 匯入後必須立即切換 production.cnf 並重啟 Pod,否則資料持久性無法保證
- Phase 4 行數比對可接受 1% 以內差異(InnoDB TABLE_ROWS 為估算值),關鍵表用
COUNT(*)精確比對 - 舊環境 scale 0(保留 PVC)後觀察 7 天再清理,確保 Rollback 能力
實際應用
本遷移指南針對 AKS 叢集 mariadb namespace,從 aks-mariadb-galera Helm release 遷移至官方 StatefulSet。執行前需評估:應用程式是否仍在寫入(影響是否需要增量同步)、可接受的停機窗口長度、以及 Rollback 策略。
遷移完成後,可使用本文的 CronJob 設定建立每日定期備份(UTC 18:00 = 台灣凌晨 2:00),保留 14 天。
部署設定參考
六個 Phase 完整 Checklist
□ Phase 0 — 記錄舊環境(版本、DB 大小、行數、PVC)
□ Phase 1.1 — 建立 migration-backup PVC(80Gi)
□ Phase 1.2 — 執行 mydumper Job,確認 STATUS=Complete
□ Phase 1.3 — 備份使用者與權限至 users_backup.sql
□ Phase 2 — 部署 ConfigMap + Secret + StatefulSet
□ Phase 2.4 — 驗證空實例 mariadb-0 可連線
□ Phase 3.1 — 執行 myloader Job,確認 STATUS=Complete
□ Phase 3.2 — 匯入使用者與權限
□ Phase 3.3 — 🔴 切換為 production.cnf 並重啟(必做)
□ Phase 4.1 — 比對資料庫大小
□ Phase 4.2 — 比對表格行數(diff old_rows.txt new_rows.txt)
□ Phase 4.3 — mariadb-check 完整性檢查
□ Phase 4.4 — 應用程式帳號連線測試
□ Phase 5 — 停機窗口:(增量同步) → 切換 Service → 啟動應用
□ Phase 6.1 — 舊環境 scale 0(保留 PVC 7 天)
□ Phase 6.2 — 7 天後清理遷移資源
□ Phase 6.3 — 部署 CronJob 定期備份Phase 0:記錄現有環境
bash
kubectl exec aks-mariadb-galera-0 -n mariadb -- \
mysql -u root -p'<密碼>' -e "
SELECT VERSION();
SHOW VARIABLES LIKE 'character_set_server';
SELECT TABLE_SCHEMA,
COUNT(*) AS table_count,
ROUND(SUM(DATA_LENGTH + INDEX_LENGTH)/1024/1024/1024, 2) AS size_gb
FROM information_schema.TABLES
GROUP BY TABLE_SCHEMA ORDER BY size_gb DESC;
"
kubectl get pvc -n mariadbPhase 1:mydumper 平行匯出
yaml
# 01-pvc-migration-backup.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: mariadb-migration-backup
namespace: mariadb
spec:
accessModes: ["ReadWriteOnce"]
storageClassName: managed-premium
resources:
requests:
storage: 80Gi # ZSTD 壓縮率 50-60%,100GB 資料需約 60GB,預留空間yaml
# 02-job-mydumper.yaml
apiVersion: batch/v1
kind: Job
metadata:
name: mydumper-export
namespace: mariadb
spec:
backoffLimit: 0
template:
spec:
restartPolicy: Never
containers:
- name: mydumper
image: mydumper/mydumper:latest
command:
- /bin/sh
- -c
- |
mydumper \
--host=aks-mariadb-galera --port=3306 \
--user=root --password='<密碼>' \
--outputdir=/backup/dump \
--threads=8 --rows=100000 --chunk-filesize=512 \
--compress=zstd --build-empty-files \
--trx-consistency-only --complete-insert \
--triggers --routines --events \
--long-query-guard=3600 --verbose=3 \
--logfile=/backup/mydumper.log
volumeMounts:
- name: backup
mountPath: /backup
resources:
requests: { cpu: "2", memory: "4Gi" }
limits: { cpu: "4", memory: "8Gi" }
volumes:
- name: backup
persistentVolumeClaim:
claimName: mariadb-migration-backupbash
kubectl apply -f 01-pvc-migration-backup.yaml
kubectl apply -f 02-job-mydumper.yaml
kubectl logs -f job/mydumper-export -n mariadb
kubectl get job mydumper-export -n mariadb # 確認 STATUS=Complete備份使用者與權限
bash
kubectl exec aks-mariadb-galera-0 -n mariadb -- \
mysql -u root -p'<密碼>' --skip-column-names -e "
SELECT CONCAT(
'CREATE USER IF NOT EXISTS ''', user, '''@''', host,
''' IDENTIFIED BY PASSWORD ''', authentication_string, ''';'
) FROM mysql.global_priv
WHERE user NOT IN ('root','mysql','mariadb.sys','')
AND user NOT LIKE 'mariabackup%';
" > users_backup.sqlPhase 2:ConfigMap(含匯入加速與正式設定)
yaml
# 03-configmap-mariadb.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: mariadb-config
namespace: mariadb
data:
# 匯入階段設定:Phase 3 完成後必須切換為 production.cnf
custom.cnf: |
[mysqld]
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
innodb_buffer_pool_size = 8G
innodb_log_file_size = 512M
innodb_io_capacity = 2000
innodb_io_capacity_max = 4000
innodb_read_io_threads = 8
innodb_write_io_threads = 8
# --- 匯入加速(正式環境必須關閉)---
innodb_doublewrite = OFF
innodb_flush_log_at_trx_commit = 0
log_bin = OFF
sync_binlog = 0
foreign_key_checks = 0
unique_checks = 0
max_connections = 500
max_allowed_packet = 64M
slow_query_log = OFF
performance_schema = OFF
[client]
default-character-set = utf8mb4
# 正式環境設定:匯入完成後切換至此
production.cnf: |
[mysqld]
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
innodb_buffer_pool_size = 8G
innodb_log_file_size = 512M
innodb_io_capacity = 2000
innodb_io_capacity_max = 4000
innodb_read_io_threads = 8
innodb_write_io_threads = 8
innodb_flush_method = O_DIRECT
# --- InnoDB 安全 ---
innodb_doublewrite = ON
innodb_flush_log_at_trx_commit = 1
# --- Binary Log ---
log_bin = mysql-bin
binlog_format = ROW
sync_binlog = 1
expire_logs_days = 14
max_connections = 500
max_allowed_packet = 64M
# --- 監控 ---
slow_query_log = ON
long_query_time = 2
performance_schema = ON
[client]
default-character-set = utf8mb4Phase 2:StatefulSet
yaml
# 05-statefulset-mariadb.yaml
apiVersion: v1
kind: Service
metadata:
name: mariadb
namespace: mariadb
spec:
type: ClusterIP
ports: [{port: 3306, targetPort: 3306, name: mysql}]
selector: {app: mariadb}
---
apiVersion: v1
kind: Service
metadata:
name: mariadb-headless
namespace: mariadb
spec:
type: ClusterIP
clusterIP: None
ports: [{port: 3306, targetPort: 3306, name: mysql}]
selector: {app: mariadb}
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: mariadb
namespace: mariadb
spec:
serviceName: mariadb-headless
replicas: 1
selector:
matchLabels: {app: mariadb}
template:
metadata:
labels: {app: mariadb}
spec:
securityContext:
fsGroup: 999
containers:
- name: mariadb
image: mariadb:12.2
ports: [{containerPort: 3306, name: mysql}]
env:
- name: MARIADB_ROOT_PASSWORD
valueFrom:
secretKeyRef:
name: mariadb-secret
key: MARIADB_ROOT_PASSWORD
- name: MARIADB_AUTO_UPGRADE
value: "1"
- name: TZ
value: "Asia/Taipei"
volumeMounts:
- name: data
mountPath: /var/lib/mysql
- name: config
mountPath: /etc/mysql/conf.d/custom.cnf
subPath: custom.cnf
resources:
requests: { cpu: "3", memory: "12Gi" }
limits: { cpu: "4", memory: "14Gi" }
livenessProbe:
exec:
command: ["healthcheck.sh", "--connect", "--innodb_initialized"]
initialDelaySeconds: 120
periodSeconds: 30
failureThreshold: 5
readinessProbe:
exec:
command: ["healthcheck.sh", "--connect", "--innodb_initialized"]
initialDelaySeconds: 30
periodSeconds: 10
failureThreshold: 3
volumes:
- name: config
configMap: {name: mariadb-config}
volumeClaimTemplates:
- metadata: {name: data}
spec:
accessModes: ["ReadWriteOnce"]
storageClassName: managed-premium
resources:
requests:
storage: 256Gi # 原始 100GB+,含索引重建需 1.5-2 倍空間Phase 3:myloader 匯入
yaml
# 06-job-myloader.yaml
apiVersion: batch/v1
kind: Job
metadata:
name: myloader-import
namespace: mariadb
spec:
backoffLimit: 0
template:
spec:
restartPolicy: Never
containers:
- name: myloader
image: mydumper/mydumper:latest
command:
- /bin/sh
- -c
- |
myloader \
--host=mariadb-headless --port=3306 \
--user=root --password='<密碼>' \
--directory=/backup/dump \
--threads=8 --queries-per-transaction=10000 \
--overwrite-tables --compress-protocol \
--verbose=3 --logfile=/backup/myloader.log
volumeMounts:
- name: backup
mountPath: /backup
resources:
requests: { cpu: "2", memory: "4Gi" }
limits: { cpu: "4", memory: "8Gi" }
volumes:
- name: backup
persistentVolumeClaim:
claimName: mariadb-migration-backup🔴 匯入完成後切換 production.cnf(必做)
bash
# 將 production.cnf 內容替換為 custom.cnf
kubectl create configmap mariadb-config \
--from-file=custom.cnf=<(kubectl get configmap mariadb-config -n mariadb \
-o jsonpath='{.data.production\.cnf}') \
--from-file=production.cnf=<(kubectl get configmap mariadb-config -n mariadb \
-o jsonpath='{.data.production\.cnf}') \
-n mariadb --dry-run=client -o yaml | kubectl apply -f -
kubectl rollout restart statefulset/mariadb -n mariadb
kubectl wait --for=condition=Ready pod/mariadb-0 -n mariadb --timeout=300s
# 驗證安全設定已生效
kubectl exec mariadb-0 -n mariadb -- \
mariadb -u root -p'<密碼>' -e "
SHOW VARIABLES LIKE 'innodb_doublewrite';
SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';
SHOW VARIABLES LIKE 'log_bin';
SHOW VARIABLES LIKE 'slow_query_log';
"
# 預期:innodb_doublewrite=ON, innodb_flush_log_at_trx_commit=1, log_bin=ONPhase 4:驗證資料完整性
bash
# 比對行數(新舊環境)
kubectl exec aks-mariadb-galera-0 -n mariadb -- \
mysql -u root -p'<密碼>' --skip-column-names -e "
SELECT CONCAT(TABLE_SCHEMA, '.', TABLE_NAME, '=', TABLE_ROWS)
FROM information_schema.TABLES
WHERE TABLE_SCHEMA NOT IN ('information_schema','performance_schema','mysql','sys')
ORDER BY TABLE_SCHEMA, TABLE_NAME;
" > old_rows.txt
kubectl exec mariadb-0 -n mariadb -- \
mariadb -u root -p'<密碼>' --skip-column-names -e "
SELECT CONCAT(TABLE_SCHEMA, '.', TABLE_NAME, '=', TABLE_ROWS)
FROM information_schema.TABLES
WHERE TABLE_SCHEMA NOT IN ('information_schema','performance_schema','mysql','sys')
ORDER BY TABLE_SCHEMA, TABLE_NAME;
" > new_rows.txt
diff old_rows.txt new_rows.txt # InnoDB TABLE_ROWS 為估算值,1% 差異屬正常
# 完整性檢查
kubectl exec mariadb-0 -n mariadb -- \
mariadb-check -u root -p'<密碼>' --all-databases --checkPhase 5:切換流量(方式 A)
bash
kubectl scale deployment/<YOUR_APP> --replicas=0 -n <APP_NAMESPACE>
sleep 15 # 等待應用程式停止寫入
kubectl set env deployment/<YOUR_APP> \
DB_HOST=mariadb.mariadb.svc.cluster.local \
-n <APP_NAMESPACE>
kubectl scale deployment/<YOUR_APP> --replicas=<N> -n <APP_NAMESPACE>Phase 6:收尾
bash
# 停止舊 Pod(保留 PVC 7 天做 Rollback)
kubectl scale statefulset/aks-mariadb-galera --replicas=0 -n mariadb
# 7 天後確認穩定,清理遷移資源
kubectl delete job mydumper-export myloader-import -n mariadb
kubectl delete pvc mariadb-migration-backup -n mariadb
helm uninstall aks-mariadb-galera -n mariadb
kubectl delete pvc data-aks-mariadb-galera-0 -n mariadb定期備份 CronJob
yaml
# 07-cronjob-backup.yaml
apiVersion: batch/v1
kind: CronJob
metadata:
name: mariadb-backup
namespace: mariadb
spec:
schedule: "0 18 * * *" # UTC 18:00 = 台灣凌晨 2:00
concurrencyPolicy: Forbid
successfulJobsHistoryLimit: 3
failedJobsHistoryLimit: 3
jobTemplate:
spec:
template:
spec:
restartPolicy: OnFailure
containers:
- name: mydumper
image: mydumper/mydumper:latest
command:
- /bin/sh
- -c
- |
DIR="/backup/$(date +%Y%m%d_%H%M%S)"
mkdir -p "$DIR"
mydumper --host=mariadb --user=root --password='<密碼>' \
--outputdir="$DIR" --threads=4 --compress=zstd \
--trx-consistency-only --triggers --routines --events --verbose=2
# 保留 14 天
find /backup -maxdepth 1 -type d -mtime +14 ! -name backup -exec rm -rf {} +
echo "Backup done: $(du -sh $DIR)"
volumeMounts:
- name: backup
mountPath: /backup
resources:
requests: { cpu: "1", memory: "2Gi" }
limits: { cpu: "2", memory: "4Gi" }
volumes:
- name: backup
persistentVolumeClaim:
claimName: mariadb-backup-pvc
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: mariadb-backup-pvc
namespace: mariadb
spec:
accessModes: ["ReadWriteOnce"]
storageClassName: managed-premium
resources:
requests:
storage: 100GiRollback SOP
bash
kubectl scale deployment/<YOUR_APP> --replicas=0 -n <APP_NAMESPACE>
# 恢復舊 Bitnami(尚未 helm uninstall 時)
kubectl scale statefulset/aks-mariadb-galera --replicas=1 -n mariadb
kubectl wait --for=condition=Ready pod/aks-mariadb-galera-0 -n mariadb --timeout=300s
# 修改 DB_HOST 指回舊 Service 或重建舊 Service(若用了方式 B)
kubectl scale deployment/<YOUR_APP> --replicas=<N> -n <APP_NAMESPACE>相關概念
- MariaDB Helm 部署(Kubernetes) — Bitnami Helm Chart 部署 MariaDB 的完整說明,含 Galera 架構與 InnoDB 調優
- PgBouncer 連線池(Kubernetes 部署) — 同樣在 Kubernetes 部署資料庫相關服務的參考
- Prometheus Exporter 部署模式 — 遷移完成後可為新 MariaDB 部署 MySQL Exporter 實現監控
- DRBD Split-Brain 修復(Pacemaker HA Cluster) — 另一種叢集 split-brain 情境的診斷與修復,DRBD 雙節點手動選邊策略可與 Galera 的自動仲裁機制對照