HTTP/2 & RPC In-Depth Tutorial / 14 - Containerized Deployment

Chapter 14: Containerized Deployment

From development to production: deploying RPC services with Docker, Kubernetes, and a service mesh


14.1 Dockerizing a gRPC Service

14.1.1 Multi-Stage Builds

# Dockerfile (Go gRPC service)
# Stage 1: build
FROM golang:1.21-alpine AS builder

RUN apk add --no-cache git ca-certificates

WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download

COPY . .
RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 \
    go build -ldflags="-s -w" -o /app/server ./cmd/server

# Stage 2: runtime image
FROM alpine:3.18

RUN apk add --no-cache ca-certificates tzdata
COPY --from=builder /app/server /usr/local/bin/server

# default gRPC port
EXPOSE 50051
# health check port
EXPOSE 50052

ENTRYPOINT ["server"]

# Build the image
docker build -t my-grpc-service:v1 .

# Run the container
docker run -d \
  --name grpc-service \
  -p 50051:50051 \
  -p 50052:50052 \
  my-grpc-service:v1

14.1.2 gRPC Health Checks

package main

import (
	"log"
	"net"
	"net/http"

	"google.golang.org/grpc"
	"google.golang.org/grpc/health"
	"google.golang.org/grpc/health/grpc_health_v1"
)

func main() {
	// gRPC server with the standard health service registered
	grpcServer := grpc.NewServer()
	healthServer := health.NewServer()
	grpc_health_v1.RegisterHealthServer(grpcServer, healthServer)

	// mark the service as healthy
	healthServer.SetServingStatus("user.UserService",
		grpc_health_v1.HealthCheckResponse_SERVING)

	// start the gRPC server
	lis, err := net.Listen("tcp", ":50051")
	if err != nil {
		log.Fatalf("failed to listen: %v", err)
	}
	go func() {
		if err := grpcServer.Serve(lis); err != nil {
			log.Fatalf("gRPC server error: %v", err)
		}
	}()

	// HTTP health endpoints (for Docker/K8s probes)
	http.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
		w.Write([]byte("ok"))
	})
	http.HandleFunc("/readyz", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
		w.Write([]byte("ready"))
	})

	log.Println("health check server listening on :50052")
	log.Fatal(http.ListenAndServe(":50052", nil))
}

# Docker health check (Dockerfile directive)
HEALTHCHECK --interval=10s --timeout=3s --start-period=5s --retries=3 \
  CMD grpc_health_probe -addr=:50051 || exit 1
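Note that this HEALTHCHECK (and the compose healthcheck below) assumes the grpc_health_probe binary exists inside the image; the multi-stage Dockerfile in 14.1.1 does not install it. One way to add it to the runtime stage (the pinned version is illustrative; check the project's releases page for the current one):

```dockerfile
# Runtime-stage addition: fetch grpc_health_probe from its official releases
ARG GRPC_HEALTH_PROBE_VERSION=v0.4.25
ADD https://github.com/grpc-ecosystem/grpc-health-probe/releases/download/${GRPC_HEALTH_PROBE_VERSION}/grpc_health_probe-linux-amd64 \
    /usr/local/bin/grpc_health_probe
RUN chmod +x /usr/local/bin/grpc_health_probe
```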

# docker-compose.yml
version: "3.8"
services:
  user-service:
    build: .
    ports:
      - "50051:50051"
    healthcheck:
      test: ["CMD", "grpc_health_probe", "-addr=:50051"]
      interval: 10s
      timeout: 3s
      retries: 3
    deploy:
      resources:
        limits:
          cpus: "1.0"
          memory: 512M
        reservations:
          cpus: "0.5"
          memory: 256M

14.2 Deploying on Kubernetes

14.2.1 Basic Deployment Manifests

# k8s/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: user-service
  labels:
    app: user-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: user-service
  template:
    metadata:
      labels:
        app: user-service
    spec:
      containers:
        - name: user-service
          image: my-grpc-service:v1
          ports:
            - containerPort: 50051
              name: grpc
            - containerPort: 50052
              name: http
          # gRPC health probes (native gRPC probes require K8s 1.24+)
          livenessProbe:
            grpc:
              port: 50051
            initialDelaySeconds: 10
            periodSeconds: 10
          readinessProbe:
            grpc:
              port: 50051
            initialDelaySeconds: 5
            periodSeconds: 5
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 512Mi
---
apiVersion: v1
kind: Service
metadata:
  name: user-service
spec:
  selector:
    app: user-service
  ports:
    - name: grpc
      port: 50051
      targetPort: 50051
      protocol: TCP
  type: ClusterIP

14.2.2 The gRPC Load-Balancing Problem

A default Kubernetes Service load-balances at L4 (per TCP connection):

The problem:
client ──── one TCP connection ────→ Service ────→ Pod A
                                                   (every request rides this one connection)

Because HTTP/2 multiplexes all requests over a single long-lived connection, a K8s Service ends up routing every request from that client to the same Pod!

Solutions:
1. Client-side load balancing (recommended)
2. An Envoy/Istio sidecar (L7 load balancing)
3. The gRPC-LB protocol
4. Headless Service + client-side resolution

14.2.3 Headless Service + Client-Side Load Balancing

# Headless Service (no ClusterIP is allocated)
apiVersion: v1
kind: Service
metadata:
  name: user-service
spec:
  clusterIP: None  # the key line: headless
  selector:
    app: user-service
  ports:
    - name: grpc
      port: 50051

// Client: DNS-based resolution with client-side load balancing
package main

import (
	"log"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
	_ "google.golang.org/grpc/health" // registers client-side health checking
)

func main() {
	// DNS resolution + round-robin load balancing
	conn, err := grpc.Dial(
		"dns:///user-service.default.svc.cluster.local:50051",
		grpc.WithTransportCredentials(insecure.NewCredentials()),
		grpc.WithDefaultServiceConfig(`{
			"loadBalancingConfig": [{"round_robin":{}}],
			"healthCheckConfig": {
				"serviceName": "user.UserService"
			}
		}`),
	)
	if err != nil {
		log.Fatalf("failed to connect: %v", err)
	}
	defer conn.Close()
}

14.3 Service Mesh Integration

14.3.1 Istio Configuration for gRPC

# Istio DestinationRule: gRPC load balancing
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: user-service
spec:
  host: user-service
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 1000
      http:
        h2UpgradePolicy: DEFAULT
        maxRequestsPerConnection: 1000
    loadBalancer:
      simple: ROUND_ROBIN
    outlierDetection:
      consecutive5xxErrors: 3
      interval: 10s
      baseEjectionTime: 30s
      maxEjectionPercent: 50
  subsets:  # the VirtualService below routes to these; assumes pods carry a version label
    - name: v1
      labels:
        version: v1
    - name: v2
      labels:
        version: v2
---
# Istio VirtualService: gRPC routing
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: user-service
spec:
  hosts:
    - user-service
  http:
    - match:
        - method:
            exact: POST
          uri:
            prefix: /user.v1.UserService
      route:
        - destination:
            host: user-service
            subset: v1
          weight: 90
        - destination:
            host: user-service
            subset: v2
          weight: 10  # canary release
      retries:
        attempts: 3
        retryOn: unavailable,deadline-exceeded
      timeout: 10s

14.3.2 gRPC Observability with Istio

# gRPC access-log configuration
apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: grpc-logging
spec:
  accessLogging:
    - providers:
        - name: envoy
  metrics:
    - providers:
        - name: prometheus
  tracing:
    - providers:
        - name: jaeger
      randomSamplingPercentage: 10

# Grafana dashboard queries for gRPC monitoring
# Request rate
# (grpc_service/grpc_method assume custom metric labels; stock Istio exposes
#  grpc_response_status but no per-method labels by default)
sum(rate(istio_requests_total{
  connection_security_policy!="mutual_tls",
  destination_service="user-service"
}[5m])) by (grpc_service, grpc_method)

# P99 latency
histogram_quantile(0.99,
  sum(rate(istio_request_duration_milliseconds_bucket{
    destination_service="user-service"
  }[5m])) by (le)
)

# Error rate
sum(rate(istio_requests_total{
  destination_service="user-service",
  response_code!="200",
  grpc_response_status!="0"
}[5m])) / sum(rate(istio_requests_total{
  destination_service="user-service"
}[5m]))

14.4 gRPC-Web Proxying

14.4.1 Envoy Proxy Configuration

# envoy-grpc-web.yaml
static_resources:
  listeners:
    - name: listener_0
      address:
        socket_address:
          address: 0.0.0.0
          port_value: 8080
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                codec_type: AUTO
                stat_prefix: ingress_http
                route_config:
                  virtual_hosts:
                    - name: local_service
                      domains: ["*"]
                      routes:
                        - match:
                            prefix: "/"
                          route:
                            cluster: grpc_backend
                            timeout: 60s
                      cors:
                        allow_origin_string_match:
                          - prefix: "*"
                        allow_methods: GET, PUT, DELETE, POST, OPTIONS
                        allow_headers: keep-alive,user-agent,cache-control,content-type,content-transfer-encoding,x-custom-header,x-accept-content-transfer-encoding,x-accept-response-streaming,x-user-agent,x-grpc-web,grpc-timeout
                        max_age: "1728000"
                        expose_headers: grpc-status,grpc-message
                http_filters:
                  - name: envoy.filters.http.grpc_web
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.grpc_web.v3.GrpcWeb
                  - name: envoy.filters.http.cors
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.cors.v3.Cors
                  - name: envoy.filters.http.router
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router

  clusters:
    - name: grpc_backend
      connect_timeout: 1s
      type: STRICT_DNS
      lb_policy: ROUND_ROBIN
      typed_extension_protocol_options:
        envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
          "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
          explicit_http_version_config:
            http2_protocol_options: {}
      load_assignment:
        cluster_name: grpc_backend
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: user-service
                      port_value: 50051

14.5 CI/CD Pipeline

# .github/workflows/grpc-service.yml
name: Build and Deploy gRPC Service

on:
  push:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup Go
        uses: actions/setup-go@v5
        with:
          go-version: "1.21"

      - name: Install Protobuf toolchain
        run: |
          sudo apt-get install -y protobuf-compiler
          go install google.golang.org/protobuf/cmd/protoc-gen-go@latest
          go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@latest

      - name: Setup buf
        uses: bufbuild/buf-setup-action@v1

      - name: Generate proto
        run: buf generate

      - name: Run tests
        run: go test -v -race ./...

      - name: gRPC integration test
        run: |
          # grpcurl needs server reflection enabled (or local proto files)
          go install github.com/fullstorydev/grpcurl/cmd/grpcurl@latest
          go run cmd/server/main.go &
          sleep 2
          grpcurl -plaintext localhost:50051 grpc.health.v1.Health/Check

      - name: Build Docker image
        run: docker build -t $REGISTRY/user-service:$GITHUB_SHA .

      - name: Push to Registry
        run: docker push $REGISTRY/user-service:$GITHUB_SHA

      - name: Deploy to K8s
        run: |
          kubectl set image deployment/user-service \
            user-service=$REGISTRY/user-service:$GITHUB_SHA

14.6 Caveats

⚠️ HTTP/2 connection reuse

  • Kubernetes' default L4 load balancing is ineffective for HTTP/2
  • Use L7 load balancing (Istio/Envoy) or client-side load balancing instead

⚠️ Container networking

  • Long-lived gRPC connections may traverse NAT/firewalls; watch idle timeouts
  • Enable keepalive to probe connection liveness
  • Relevant channel options: grpc.keepalive_time_ms and grpc.keepalive_timeout_ms
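
Those option names come from the C-core channel arguments; in Go the same knobs are expressed as keepalive.ClientParameters. A minimal client-side sketch (the address is a placeholder, and the server must permit this ping cadence via keepalive.EnforcementPolicy, otherwise it responds with GOAWAY ENHANCE_YOUR_CALM):

```go
package main

import (
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
	"google.golang.org/grpc/keepalive"
)

func main() {
	conn, err := grpc.Dial(
		"user-service:50051", // placeholder address
		grpc.WithTransportCredentials(insecure.NewCredentials()),
		grpc.WithKeepaliveParams(keepalive.ClientParameters{
			Time:                30 * time.Second, // ping after 30s of inactivity
			Timeout:             5 * time.Second,  // wait 5s for the ping ack
			PermitWithoutStream: true,             // ping even with no active RPCs
		}),
	)
	if err != nil {
		panic(err)
	}
	defer conn.Close()
}
```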

⚠️ Rolling updates

  • Long-lived gRPC connections are not torn down automatically when a Pod terminates
  • Implement graceful shutdown (send GOAWAY after receiving SIGTERM)
  • Set terminationGracePeriodSeconds long enough to drain in-flight RPCs
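
The graceful-shutdown point above can be sketched as follows (wiring is illustrative; GracefulStop stops accepting new RPCs, lets in-flight ones finish, and the HTTP/2 layer sends GOAWAY to connected clients):

```go
package main

import (
	"log"
	"net"
	"os"
	"os/signal"
	"syscall"

	"google.golang.org/grpc"
)

func main() {
	grpcServer := grpc.NewServer()
	lis, err := net.Listen("tcp", ":50051")
	if err != nil {
		log.Fatalf("failed to listen: %v", err)
	}

	// Drain gracefully on SIGTERM (sent by the kubelet before killing the Pod).
	go func() {
		sigCh := make(chan os.Signal, 1)
		signal.Notify(sigCh, syscall.SIGTERM, os.Interrupt)
		<-sigCh
		log.Println("SIGTERM received, draining connections...")
		grpcServer.GracefulStop() // sends GOAWAY, waits for in-flight RPCs
	}()

	if err := grpcServer.Serve(lis); err != nil {
		log.Fatalf("serve: %v", err)
	}
}
```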

💡 Best practices

  • Always implement gRPC health checks
  • Use a Headless Service or a service mesh for L7 load balancing
  • Configure sensible resource limits and an HPA
  • Monitor gRPC-specific metrics (status-code distribution, latency distribution)
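
For the resource-limits point, an HPA sketch for the Deployment from 14.2.1 (the thresholds are illustrative, not a recommendation):

```yaml
# k8s/hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: user-service
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: user-service
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```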

14.7 Further Reading


Chapter 13 - The Connect Protocol | Chapter 15 - Best Practices