
Chapter 9: Docker and Kubernetes Deployment

This chapter covers deploying dqlite with Docker and Kubernetes: building container images, orchestrating a cluster with Docker Compose, deploying a Kubernetes StatefulSet, and persisting data.


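9.1 Containerization Overview

Points to keep in mind when running dqlite in containers:

| Item | Notes |
| --- | --- |
| Base image | Ubuntu 22.04/24.04 or Alpine |
| Storage | Data must live on a persistent volume (PV) |
| Network | Nodes need stable network addresses to reach one another |
| Clock | The container clock must stay in sync with the host |
| File system | ext4 or xfs recommended; avoid overlay |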
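
To check the file-system recommendation in practice, a small shell sketch like the following could run inside the container before the node starts. The function names `fs_type_ok` and `check_data_dir` are illustrative and not part of dqlite, and the check assumes a `stat` that supports `-f -c %T` (GNU coreutils does). GNU `stat` reports ext4 as `ext2/ext3` because they share one magic number, so that spelling is accepted as well.

```shell
# Classify a file-system type name for use as a dqlite data directory.
# Returns 0 for the recommended types (ext4, xfs), 1 for anything else.
fs_type_ok() {
    case "$1" in
        ext4|ext2/ext3|xfs) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the file system backing a directory and whether it is recommended.
# Returns 2 if the type cannot be determined, 1 if it is not recommended.
check_data_dir() {
    dir=$1
    fstype=$(stat -f -c %T "$dir") || return 2
    if fs_type_ok "$fstype"; then
        echo "$dir: $fstype (ok)"
    else
        echo "$dir: $fstype (not ext4/xfs; mount a volume here)"
        return 1
    fi
}
```

Calling `check_data_dir /var/lib/dqlite` from an entrypoint script prints the detected type. On a container's writable layer it typically reports an overlay file system, which is the case to avoid.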

9.2 Building the Docker Image

9.2.1 Multi-stage Dockerfile

# Dockerfile.dqlite
# Stage 1: build
FROM ubuntu:24.04 AS builder

ENV DEBIAN_FRONTEND=noninteractive

RUN apt-get update && apt-get install -y \
    build-essential \
    autoconf automake libtool \
    pkg-config \
    libuv1-dev \
    zlib1g-dev \
    liblz4-dev \
    libsqlite3-dev \
    wget git \
    ca-certificates \
    && rm -rf /var/lib/apt/lists/*

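# Build dqlite. The v1.16.x source tree bundles raft; --enable-build-raft
# builds it in-tree. Without the flag, configure expects a separately
# installed libraft, which this image does not provide.
WORKDIR /build
RUN git clone --depth 1 --branch v1.16.6 https://github.com/canonical/dqlite.git \
    && cd dqlite \
    && autoreconf -i \
    && ./configure --prefix=/usr --enable-build-raft \
    && make -j$(nproc) \
    && make install DESTDIR=/build/root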

# Stage 2: runtime
FROM ubuntu:24.04

RUN apt-get update && apt-get install -y \
    libuv1 \
    zlib1g \
    liblz4-1 \
    libsqlite3-0 \
    tini \
    && rm -rf /var/lib/apt/lists/*

# Copy the libraries from the build stage
COPY --from=builder /build/root/usr/lib/ /usr/lib/
COPY --from=builder /build/root/usr/include/ /usr/include/

RUN ldconfig

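# Copy the application. dqlite-node is assumed to be a node binary you build
# yourself; dqlite ships as a library, not as a standalone server.
COPY dqlite-node /usr/local/bin/

# Create an unprivileged runtime user
RUN useradd -r -s /bin/false -d /var/lib/dqlite dqlite \
    && mkdir -p /var/lib/dqlite \
    && chown dqlite:dqlite /var/lib/dqlite

# Data directory
VOLUME /var/lib/dqlite
EXPOSE 9001

USER dqlite

# Run tini as the init process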
ENTRYPOINT ["tini", "--"]
CMD ["dqlite-node", "--data-dir", "/var/lib/dqlite", "--address", "0.0.0.0:9001"]

9.2.2 Dockerfile for a Go Application

# Dockerfile.go-app
# Stage 1: build the Go application
FROM golang:1.21-bookworm AS builder

WORKDIR /build

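# cgo needs the dqlite headers and libraries at build time. The Debian
# packages also supply the shared libraries copied into the runtime stage.
# (The packaged version may be older than your Go bindings expect; build
# dqlite from source as in 9.2.1 if so.)
RUN apt-get update && apt-get install -y libdqlite-dev libsqlite3-dev \
    && rm -rf /var/lib/apt/lists/*

# Copy the dependency manifests first so the download layer is cached
COPY go.mod go.sum ./
RUN go mod download

# Build. -tags libsqlite3 links the system SQLite, which go-dqlite's README
# calls for (this assumes the application uses go-dqlite).
COPY . .
RUN CGO_ENABLED=1 go build -tags libsqlite3 -ldflags="-s -w" -o /build/app ./cmd/server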

# Stage 2: runtime
FROM ubuntu:24.04

RUN apt-get update && apt-get install -y \
    libuv1 \
    zlib1g \
    liblz4-1 \
    libsqlite3-0 \
    ca-certificates \
    tini \
    && rm -rf /var/lib/apt/lists/*

# Install the dqlite and raft shared libraries from the build stage
COPY --from=builder /usr/lib/x86_64-linux-gnu/libdqlite* /usr/lib/x86_64-linux-gnu/
COPY --from=builder /usr/lib/x86_64-linux-gnu/libraft* /usr/lib/x86_64-linux-gnu/
RUN ldconfig

# Copy the application
COPY --from=builder /build/app /usr/local/bin/

RUN useradd -r -s /bin/false dqlite \
    && mkdir -p /var/lib/dqlite && chown dqlite:dqlite /var/lib/dqlite

VOLUME /var/lib/dqlite
EXPOSE 9001

USER dqlite
ENTRYPOINT ["tini", "--"]
CMD ["app"]

9.2.3 Build and Test

# Build the image
docker build -t my-dqlite:latest -f Dockerfile.dqlite .

# Test run
docker run --rm -it \
    -p 9001:9001 \
    -v dqlite-test-data:/var/lib/dqlite \
    my-dqlite:latest

# Follow the logs
docker logs -f <container-id>

9.3 Deploying with Docker Compose

9.3.1 Three-node Cluster

# docker-compose.yml
version: '3.8'

services:
  dqlite-1:
    image: my-dqlite:latest
    container_name: dqlite-node-1
    command: >
      dqlite-node
        --id 1
        --data-dir /var/lib/dqlite
        --address 0.0.0.0:9001
        --join dqlite-1:9001,dqlite-2:9001,dqlite-3:9001
    ports:
      - "9001:9001"
    volumes:
      - dqlite-data-1:/var/lib/dqlite
    networks:
      dqlite-net:
        ipv4_address: 172.20.0.11
    deploy:
      resources:
        limits:
          memory: 256M
          cpus: '0.5'
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "dqlite-status", "127.0.0.1:9001"]
      interval: 10s
      timeout: 5s
      retries: 3

  dqlite-2:
    image: my-dqlite:latest
    container_name: dqlite-node-2
    command: >
      dqlite-node
        --id 2
        --data-dir /var/lib/dqlite
        --address 0.0.0.0:9001
        --join dqlite-1:9001,dqlite-2:9001,dqlite-3:9001
    ports:
      - "9002:9001"
    volumes:
      - dqlite-data-2:/var/lib/dqlite
    networks:
      dqlite-net:
        ipv4_address: 172.20.0.12
    deploy:
      resources:
        limits:
          memory: 256M
          cpus: '0.5'
    restart: unless-stopped
    depends_on:
      - dqlite-1

  dqlite-3:
    image: my-dqlite:latest
    container_name: dqlite-node-3
    command: >
      dqlite-node
        --id 3
        --data-dir /var/lib/dqlite
        --address 0.0.0.0:9001
        --join dqlite-1:9001,dqlite-2:9001,dqlite-3:9001
    ports:
      - "9003:9001"
    volumes:
      - dqlite-data-3:/var/lib/dqlite
    networks:
      dqlite-net:
        ipv4_address: 172.20.0.13
    deploy:
      resources:
        limits:
          memory: 256M
          cpus: '0.5'
    restart: unless-stopped
    depends_on:
      - dqlite-1

volumes:
  dqlite-data-1:
    driver: local
  dqlite-data-2:
    driver: local
  dqlite-data-3:
    driver: local

networks:
  dqlite-net:
    driver: bridge
    ipam:
      config:
        - subnet: 172.20.0.0/24

9.3.2 Docker Compose with TLS

# docker-compose.tls.yml
version: '3.8'

services:
  dqlite-1:
    image: my-dqlite:latest
    container_name: dqlite-node-1-tls
    command: >
      dqlite-node
        --id 1
        --data-dir /var/lib/dqlite
        --address 0.0.0.0:9001
        --tls-cert /etc/dqlite/certs/cert.pem
        --tls-key /etc/dqlite/certs/key.pem
        --tls-ca /etc/dqlite/certs/ca.pem
    volumes:
      - dqlite-data-1:/var/lib/dqlite
      - ./certs/node1:/etc/dqlite/certs:ro
    networks:
      - dqlite-net
    environment:
      - DQLITE_TLS_VERIFY=true
    restart: unless-stopped

  # ... the other nodes are configured the same way

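9.3.3 Common Commands

# Start the cluster
docker compose up -d

# Check status
docker compose ps

# Follow the logs
docker compose logs -f dqlite-1

# Stop the cluster
docker compose down

# Stop the cluster and delete its data volumes
docker compose down -v

# Add a node
# --scale does not fit the file above: every node is its own service with a
# fixed container_name, static IP, and host port. Define a further service
# instead (for example dqlite-4, with its own --id and volume), then start it:
docker compose up -d dqlite-4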

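9.4 Deploying on Kubernetes

9.4.1 StatefulSet Design

On Kubernetes, deploy dqlite as a StatefulSet, because it provides what a dqlite cluster needs:

| Requirement | StatefulSet feature |
| --- | --- |
| Stable network identity | Each Pod has a fixed DNS name |
| Persistent storage | Each Pod has its own PVC |
| Ordered deployment and scaling | Pods are created and deleted in order |
| Ordered updates | Rolling updates proceed in order |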

9.4.2 Complete Kubernetes Configuration

# k8s/namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: dqlite-system

# k8s/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: dqlite-config
  namespace: dqlite-system
data:
  DQLITE_NODE_ID: "auto"
  DQLITE_BIND_ADDRESS: "0.0.0.0:9001"
  DQLITE_DATA_DIR: "/var/lib/dqlite"

# k8s/statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: dqlite
  namespace: dqlite-system
spec:
  serviceName: dqlite-headless
  replicas: 3
  selector:
    matchLabels:
      app: dqlite
  template:
    metadata:
      labels:
        app: dqlite
    spec:
      terminationGracePeriodSeconds: 30
      initContainers:
        # Derive the node ID from the Pod ordinal
        - name: init-node-id
          image: busybox:1.36
          command:
            - sh
            - -c
            - |
              # Take the ordinal from the Pod name (dqlite-0 → 1, dqlite-1 → 2, ...)
              # Parameter expansion strips everything up to the last "-"
              ORDINAL=${HOSTNAME##*-}
              NODE_ID=$((ORDINAL + 1))
              echo "$NODE_ID" > /etc/dqlite/node-id
              echo "Node ID: $NODE_ID"
          volumeMounts:
            - name: config
              mountPath: /etc/dqlite

      containers:
        - name: dqlite
          image: my-dqlite:latest
          imagePullPolicy: IfNotPresent
          command:
            - sh
            - -c
            - |
              NODE_ID=$(cat /etc/dqlite/node-id)
              NODES=""
              for i in 0 1 2; do
                NODES="${NODES:+$NODES,}dqlite-${i}.dqlite-headless.dqlite-system.svc.cluster.local:9001"
              done
              exec dqlite-node \
                --id "$NODE_ID" \
                --data-dir /var/lib/dqlite \
                --address "0.0.0.0:9001" \
                --join "$NODES"
          ports:
            - containerPort: 9001
              name: dqlite
          envFrom:
            - configMapRef:
                name: dqlite-config
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 512Mi
          readinessProbe:
            tcpSocket:
              port: 9001
            initialDelaySeconds: 10
            periodSeconds: 5
          livenessProbe:
            tcpSocket:
              port: 9001
            initialDelaySeconds: 30
            periodSeconds: 10
          volumeMounts:
            - name: dqlite-data
              mountPath: /var/lib/dqlite
            - name: config
              mountPath: /etc/dqlite

      volumes:
        - name: config
          emptyDir: {}

  volumeClaimTemplates:
    - metadata:
        name: dqlite-data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: standard  # adjust for your cluster
        resources:
          requests:
            storage: 10Gi

# k8s/service.yaml
# Headless Service (provides per-Pod DNS records for the StatefulSet)
apiVersion: v1
kind: Service
metadata:
  name: dqlite-headless
  namespace: dqlite-system
spec:
  clusterIP: None
  selector:
    app: dqlite
  ports:
    - port: 9001
      targetPort: 9001
      name: dqlite

---
# Client-facing Service (optional)
apiVersion: v1
kind: Service
metadata:
  name: dqlite-service
  namespace: dqlite-system
spec:
  selector:
    app: dqlite
  ports:
    - port: 9001
      targetPort: 9001
      name: dqlite
  type: ClusterIP

# k8s/poddisruptionbudget.yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: dqlite-pdb
  namespace: dqlite-system
spec:
  minAvailable: 2  # keep at least 2 nodes available (3-node cluster)
  selector:
    matchLabels:
      app: dqlite
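
The `minAvailable: 2` value is the Raft majority for three voters. As a rough sketch, assuming every replica is a voting member, the majority and the number of tolerable failures for a given replica count can be computed as below. `quorum` and `fault_tolerance` are illustrative helper names.

```shell
# Majority needed by an N-voter Raft cluster: floor(N / 2) + 1.
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# Number of voters that can be lost while a majority remains.
fault_tolerance() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 3 5; do
    echo "replicas=$n minAvailable=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
done
# replicas=3 minAvailable=2 tolerated_failures=1
# replicas=5 minAvailable=3 tolerated_failures=2
```

For 5 replicas this gives a `minAvailable` of 3, so the PodDisruptionBudget should be raised whenever the StatefulSet is scaled out.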

9.4.3 Deployment Commands

# Create the resources (the headless Service goes in before the StatefulSet that references it)
kubectl apply -f k8s/namespace.yaml
kubectl apply -f k8s/configmap.yaml
kubectl apply -f k8s/service.yaml
kubectl apply -f k8s/statefulset.yaml
kubectl apply -f k8s/poddisruptionbudget.yaml

# Check status
kubectl -n dqlite-system get pods
kubectl -n dqlite-system get pvc

# Follow the logs
kubectl -n dqlite-system logs -f dqlite-0

# Open a shell in the container
kubectl -n dqlite-system exec -it dqlite-0 -- /bin/bash

# Check cluster status
kubectl -n dqlite-system exec -it dqlite-0 -- dqlite-status

9.4.4 DNS Naming Rules

Each Pod in a StatefulSet gets a DNS name of the form:

<pod-name>.<service-name>.<namespace>.svc.cluster.local

dqlite-0.dqlite-headless.dqlite-system.svc.cluster.local
dqlite-1.dqlite-headless.dqlite-system.svc.cluster.local
dqlite-2.dqlite-headless.dqlite-system.svc.cluster.local
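
The naming rule can be turned into two small shell helpers: one derives the node ID from a Pod name the way the init container does, the other builds the `--join` list for any replica count instead of a hard-coded `0 1 2` loop. Both function names are illustrative, and `svc.cluster.local` assumes the default cluster domain.

```shell
# Derive the 1-based node ID from a StatefulSet Pod name (dqlite-0 -> 1).
# Fails if the name does not end in "-<number>".
node_id_from_pod() {
    ordinal=${1##*-}
    case "$ordinal" in
        ''|*[!0-9]*)
            echo "no ordinal in pod name: $1" >&2
            return 1
            ;;
    esac
    echo $(( ordinal + 1 ))
}

# Build the comma-separated peer list for a StatefulSet.
# Args: StatefulSet name, headless Service, namespace, replicas, port.
peer_list() {
    peers=""
    i=0
    while [ "$i" -lt "$4" ]; do
        peers="${peers:+$peers,}$1-$i.$2.$3.svc.cluster.local:$5"
        i=$(( i + 1 ))
    done
    echo "$peers"
}
```

With these, the container command could call `peer_list dqlite dqlite-headless dqlite-system 3 9001` and pass the result to `--join`, so that changing the replica count touches a single argument.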

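9.5 Data Persistence

9.5.1 Choosing a Storage Class

| Storage type | IOPS | Latency | Suitable for |
| --- | --- | --- | --- |
| Local SSD | Highest | Lowest | Single node, performance first |
| Cloud SSD (gp3) | | | Recommended for production |
| Cloud HDD | | | Not recommended (dqlite is I/O sensitive) |
| NFS | | | ❌ Not recommended (concurrency issues) |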

9.5.2 Local Storage

# k8s/storage-local.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: dqlite-local
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer

---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: dqlite-pv-node1
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: dqlite-local
  local:
    path: /mnt/dqlite/node1
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - worker-node-1

9.5.3 Backup Strategy

# k8s/cronjob-backup.yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: dqlite-backup
  namespace: dqlite-system
spec:
  schedule: "0 2 * * *"  # every day at 02:00
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: backup
              image: my-dqlite:latest
              command:
                - sh
                - -c
                - |
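                  # Stop at the first failing command so a failed dump
                  # is not reported as a completed backup
                  set -e
                  BACKUP_DIR="/backup/$(date +%Y%m%d)"
                  mkdir -p "$BACKUP_DIR"

                  # Export the database
                  dqlite-dump \
                    --address dqlite-0.dqlite-headless:9001 \
                    --output "$BACKUP_DIR/dump.sql"

                  # Compress (-f overwrites the archive if the job is retried)
                  gzip -f "$BACKUP_DIR/dump.sql"

                  # Upload to object storage (optional)
                  # aws s3 sync "$BACKUP_DIR" s3://my-bucket/dqlite-backup/

                  echo "Backup completed: $BACKUP_DIR"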
              volumeMounts:
                - name: backup-storage
                  mountPath: /backup
          volumes:
            - name: backup-storage
              persistentVolumeClaim:
                claimName: dqlite-backup-pvc
          restartPolicy: OnFailure
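
A backup is only useful if the archive can be read back. The sketch below, using hypothetical helper names, builds the archive path in the same layout as the CronJob and checks that the file is non-empty and passes `gzip -t`. It could run as a final step of the job or from a separate check.

```shell
# Path of a dated backup archive, matching the CronJob layout:
#   <root>/<YYYYMMDD>/dump.sql.gz
backup_path() {
    echo "$1/$2/dump.sql.gz"
}

# Succeeds only if the archive exists, is non-empty,
# and passes gzip's integrity test.
verify_backup() {
    [ -s "$1" ] && gzip -t "$1" 2>/dev/null
}

# Example (not executed here):
#   verify_backup "$(backup_path /backup "$(date +%Y%m%d)")" \
#       || echo "backup missing or corrupt" >&2
```

The check confirms only that the archive is intact, not that the SQL inside restores cleanly; a periodic restore into a scratch database remains the stronger test.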

9.6 Monitoring and Observability

9.6.1 Prometheus Metrics

# k8s/servicemonitor.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: dqlite-monitor
  namespace: dqlite-system
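spec:
  # A ServiceMonitor selects Services by label and scrapes a named Service port.
  # The Services in k8s/service.yaml carry no labels and define no "metrics"
  # port, so add an app: dqlite label and a metrics port there first.
  selector:
    matchLabels:
      app: dqlite
  endpoints:
    - port: metrics
      interval: 15s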

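An application embedding dqlite can expose key metrics such as the following (dqlite has no built-in Prometheus endpoint, so the application or an exporter has to publish them):

| Metric | Description |
| --- | --- |
| dqlite_node_role | Node role (0 = Follower, 1 = Leader) |
| dqlite_raft_term | Current Raft term |
| dqlite_raft_log_entries | Number of log entries |
| dqlite_raft_commit_index | Index of the last committed log entry |
| dqlite_db_size_bytes | Database size |
| dqlite_connections_active | Number of active connections |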

9.6.2 Log Collection

# k8s/fluentbit-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentbit-config
data:
  fluent-bit.conf: |
    [INPUT]
        Name              tail
        Tag               dqlite.*
        Path              /var/log/containers/dqlite-*.log
        Parser            docker

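    [FILTER]
        Name              kubernetes
        Match             dqlite.*
        Kube_URL          https://kubernetes.default.svc:443
        # The tail tag expands to dqlite.var.log.containers.<file>,
        # so the prefix must match it (the default assumes a kube.* tag)
        Kube_Tag_Prefix   dqlite.var.log.containers.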

    [OUTPUT]
        Name              es
        Match             dqlite.*
        Host              elasticsearch.logging
        Port              9200
        Index             dqlite-logs

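9.7 Operations

9.7.1 Rolling Updates

# Update strategy
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      # At most 1 Pod unavailable at a time. This is also the default behavior;
      # the field is honored only where the MaxUnavailableStatefulSet feature
      # gate is enabled.
      maxUnavailable: 1

# Trigger a rolling update manually
kubectl -n dqlite-system rollout restart statefulset/dqlite

# Watch the update progress
kubectl -n dqlite-system rollout status statefulset/dqlite

# Roll back
kubectl -n dqlite-system rollout undo statefulset/dqlite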

9.7.2 Scaling

# Scale out to 5 nodes
kubectl -n dqlite-system scale statefulset dqlite --replicas=5

# Scale in to 3 nodes (the nodes must be removed from the dqlite cluster first)
# 1. Remove nodes 4 and 5 from the dqlite cluster
# 2. Then reduce the number of Pods
kubectl -n dqlite-system scale statefulset dqlite --replicas=3
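
Because a StatefulSet removes the highest ordinals first and the init container maps ordinal N to node ID N + 1, the nodes to remove before a scale-in follow directly from the replica counts. A minimal sketch, with `nodes_to_remove` as an illustrative helper name:

```shell
# List the Pods (and their dqlite node IDs) that disappear when a
# StatefulSet shrinks from $1 to $2 replicas, highest ordinal first.
# Optional third argument: StatefulSet name (defaults to "dqlite").
nodes_to_remove() {
    name=${3:-dqlite}
    ordinal=$(( $1 - 1 ))
    while [ "$ordinal" -ge "$2" ]; do
        echo "pod=$name-$ordinal node_id=$(( ordinal + 1 ))"
        ordinal=$(( ordinal - 1 ))
    done
}

nodes_to_remove 5 3
# pod=dqlite-4 node_id=5
# pod=dqlite-3 node_id=4
```

How a node is actually removed from the cluster depends on the client or tool in use, so that step is left out here.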

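Chapter Summary

| Deployment option | Suitable for | Complexity |
| --- | --- | --- |
| Single Docker container | Development and testing | Lowest |
| Docker Compose | Local multi-node setups, small deployments | |
| Kubernetes StatefulSet | Production, cloud environments | |
| Local storage + SSD | High-performance workloads | |

| Key point | Notes |
| --- | --- |
| StatefulSet | Required; provides stable identities and persistent storage |
| Headless Service | DNS discovery between Pods |
| PodDisruptionBudget | Keeps the cluster available during maintenance |
| Local SSD | Recommended storage type |
| Backup CronJob | Backs up the database on a schedule |

Next Chapter

Chapter 10: Production Best Practices. Covers when to choose dqlite, capacity planning, monitoring and alerting, and operational SOPs.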