强曰为道
与天地相似,故不违。知周乎万物,而道济天下,故不过。旁行而不流,乐天知命,故不忧.
文档目录

MySQL 完全指南 / 第 15 章:监控体系

第 15 章:监控体系

没有监控就没有运维。本章覆盖 Performance Schema、慢查询日志和 Prometheus 监控方案。


15.1 Performance Schema

15.1.1 概述

Performance Schema 是 MySQL 内置的性能监控框架,收集服务器运行时的各种事件数据。

-- 查看是否开启
SHOW VARIABLES LIKE 'performance_schema';

-- 查看所有消费者(Consumer)
SELECT * FROM performance_schema.setup_consumers;

-- 查看所有检测点(Instrument)
SELECT * FROM performance_schema.setup_instruments
WHERE ENABLED = 'YES' LIMIT 20;

15.1.2 常用监控查询

-- ===================== 连接与线程 =====================
-- 当前连接数
SHOW STATUS LIKE 'Threads_connected';

-- 活跃线程数
SHOW STATUS LIKE 'Threads_running';

-- 连接历史
SHOW STATUS LIKE 'Connections';

-- 连接错误
SHOW STATUS LIKE 'Connection_errors%';

-- ===================== 查询统计 =====================
-- 全局查询统计
SHOW GLOBAL STATUS LIKE 'Com_select';
SHOW GLOBAL STATUS LIKE 'Com_insert';
SHOW GLOBAL STATUS LIKE 'Com_update';
SHOW GLOBAL STATUS LIKE 'Com_delete';

-- 每秒查询数(QPS)
-- 需要两次采样计算差值

-- ===================== InnoDB 监控 =====================
-- Buffer Pool 命中率
SHOW STATUS LIKE 'Innodb_buffer_pool_read%';
-- Innodb_buffer_pool_reads:从磁盘读取次数
-- Innodb_buffer_pool_read_requests:总请求次数
-- 命中率 = 1 - reads/read_requests

-- 行操作
SHOW STATUS LIKE 'Innodb_rows%';
-- Innodb_rows_read
-- Innodb_rows_inserted
-- Innodb_rows_updated
-- Innodb_rows_deleted

-- ===================== 表锁 =====================
SHOW STATUS LIKE 'Table_locks%';
-- Table_locks_immediate:立即获得锁
-- Table_locks_waited:需要等待锁

-- ===================== 临时表 =====================
SHOW STATUS LIKE 'Created_tmp%';
-- Created_tmp_disk_tables:磁盘临时表(应尽量少)
-- Created_tmp_tables:内存临时表

15.1.3 表 I/O 监控

-- 哪些表的 I/O 最多
SELECT 
    OBJECT_SCHEMA,
    OBJECT_NAME,
    COUNT_READ,
    COUNT_WRITE,
    COUNT_FETCH,
    SUM_TIMER_READ / 1e12 AS read_time_s,
    SUM_TIMER_WRITE / 1e12 AS write_time_s
FROM performance_schema.table_io_waits_summary_by_table
WHERE OBJECT_SCHEMA NOT IN ('mysql', 'performance_schema', 'sys')
ORDER BY (SUM_TIMER_READ + SUM_TIMER_WRITE) DESC
LIMIT 20;

15.1.4 语句统计

-- 最耗时的 SQL(按总时间排序)
SELECT 
    DIGEST_TEXT,
    COUNT_STAR AS exec_count,
    ROUND(SUM_TIMER_WAIT / 1e12, 2) AS total_time_s,
    ROUND(AVG_TIMER_WAIT / 1e12, 4) AS avg_time_s,
    SUM_ROWS_EXAMINED,
    SUM_ROWS_SENT,
    FIRST_SEEN,
    LAST_SEEN
FROM performance_schema.events_statements_summary_by_digest
WHERE SCHEMA_NAME = 'myapp'
ORDER BY SUM_TIMER_WAIT DESC
LIMIT 20;

-- 扫描行数远大于返回行数的查询(可能缺索引)
SELECT 
    DIGEST_TEXT,
    COUNT_STAR,
    SUM_ROWS_EXAMINED,
    SUM_ROWS_SENT,
    ROUND(SUM_ROWS_EXAMINED / NULLIF(SUM_ROWS_SENT, 0), 2) AS examine_ratio
FROM performance_schema.events_statements_summary_by_digest
WHERE SUM_ROWS_EXAMINED > 0
  AND SUM_ROWS_EXAMINED > SUM_ROWS_SENT * 100
ORDER BY SUM_ROWS_EXAMINED DESC
LIMIT 10;

15.2 sys Schema

sys Schema 是 Performance Schema 的友好视图层。

-- 查看最耗时的语句
SELECT * FROM sys.statements_with_runtimes_in_95th_percentile LIMIT 10;

-- 查看全表扫描的语句
SELECT * FROM sys.statements_with_full_table_scans LIMIT 10;

-- 查看冗余索引
SELECT * FROM sys.schema_redundant_indexes;

-- 查看未使用的索引
SELECT * FROM sys.schema_unused_indexes WHERE object_schema = 'myapp';

-- 查看表大小
SELECT * FROM sys.schema_table_statistics WHERE table_schema = 'myapp';

-- 查看等待事件
SELECT * FROM sys.waits_global_by_latency LIMIT 10;

-- 查看 I/O 最多的文件
SELECT * FROM sys.io_global_by_file_by_bytes LIMIT 10;

-- 查看内存使用
SELECT * FROM sys.memory_global_total;

-- 用户连接统计
SELECT * FROM sys.user_summary;

-- 查看冗余索引和重复索引
SELECT * FROM sys.schema_redundant_indexes\G

15.3 SHOW STATUS 速查

-- 连接相关
SHOW GLOBAL STATUS LIKE 'Threads_connected';    -- 当前连接数
SHOW GLOBAL STATUS LIKE 'Threads_running';       -- 活跃线程数
SHOW GLOBAL STATUS LIKE 'Max_used_connections';  -- 历史最大连接数
SHOW GLOBAL STATUS LIKE 'Aborted_connects';      -- 连接失败次数
SHOW GLOBAL STATUS LIKE 'Aborted_clients';       -- 异常断开的客户端

-- 查询相关
SHOW GLOBAL STATUS LIKE 'Questions';             -- 总查询数
SHOW GLOBAL STATUS LIKE 'Slow_queries';          -- 慢查询数

-- InnoDB 相关
SHOW GLOBAL STATUS LIKE 'Innodb_row_lock%';      -- 行锁统计
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool%';   -- Buffer Pool 统计
SHOW GLOBAL STATUS LIKE 'Innodb_deadlocks';      -- 死锁次数

-- 临时表
SHOW GLOBAL STATUS LIKE 'Created_tmp_disk_tables';  -- 磁盘临时表
SHOW GLOBAL STATUS LIKE 'Created_tmp_tables';       -- 总临时表

15.4 Prometheus + Grafana 监控

15.4.1 mysqld_exporter 安装

# 下载
wget https://github.com/prometheus/mysqld_exporter/releases/download/v0.16.0/mysqld_exporter-0.16.0.linux-amd64.tar.gz
tar xzf mysqld_exporter-0.16.0.linux-amd64.tar.gz
mv mysqld_exporter-0.16.0.linux-amd64/mysqld_exporter /usr/local/bin/

# 创建监控用户
mysql -uroot -p -e "
CREATE USER 'exporter'@'localhost' IDENTIFIED BY 'Export!Pass123';
GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO 'exporter'@'localhost';
FLUSH PRIVILEGES;"

# 配置
cat > /etc/.mysqld_exporter.cnf << EOF
[client]
user=exporter
password=Export!Pass123
EOF

# 启动
mysqld_exporter --config.my-cnf=/etc/.mysqld_exporter.cnf --web.listen-address=:9104

15.4.2 Prometheus 配置

# prometheus.yml
scrape_configs:
  - job_name: 'mysql'
    static_configs:
      - targets: ['192.168.1.100:9104']
        labels:
          instance: 'mysql-master'
      - targets: ['192.168.1.101:9104']
        labels:
          instance: 'mysql-slave-1'

15.4.3 关键监控指标

指标PromQL说明
连接数mysql_global_status_threads_connected当前连接数
QPSrate(mysql_global_status_queries[5m])每秒查询数
慢查询rate(mysql_global_status_slow_queries[5m])慢查询速率
Buffer Pool 命中率1 - rate(mysql_global_status_innodb_buffer_pool_reads[5m]) / rate(mysql_global_status_innodb_buffer_pool_read_requests[5m])命中率
复制延迟mysql_slave_status_seconds_behind_master从库延迟秒数
死锁次数mysql_global_status_innodb_deadlocks死锁累计次数
磁盘临时表比rate(mysql_global_status_created_tmp_disk_tables[5m]) / rate(mysql_global_status_created_tmp_tables[5m])磁盘临时表比例

15.5 告警规则建议

# Prometheus 告警规则 (mysql_alerts.yml)
groups:
  - name: mysql_alerts
    rules:
      # 连接数过高
      - alert: MySQLTooManyConnections
        expr: mysql_global_status_threads_connected > 200
        for: 5m
        labels:
          severity: warning

      # 慢查询过多
      - alert: MySQLSlowQueriesHigh
        expr: rate(mysql_global_status_slow_queries[5m]) > 10
        for: 5m
        labels:
          severity: warning

      # Buffer Pool 命中率低
      - alert: MySQLBufferPoolHitRateLow
        expr: (1 - rate(mysql_global_status_innodb_buffer_pool_reads[5m]) / rate(mysql_global_status_innodb_buffer_pool_read_requests[5m])) < 0.99
        for: 10m
        labels:
          severity: critical

      # 复制延迟
      - alert: MySQLReplicationLag
        expr: mysql_slave_status_seconds_behind_master > 30
        for: 5m
        labels:
          severity: critical

      # 复制中断
      - alert: MySQLReplicationStopped
        expr: mysql_slave_status_slave_io_running == 0 or mysql_slave_status_slave_sql_running == 0
        for: 1m
        labels:
          severity: critical

      # 死锁
      - alert: MySQLDeadlocks
        expr: increase(mysql_global_status_innodb_deadlocks[5m]) > 5
        for: 5m
        labels:
          severity: warning

15.6 监控 Dashboard 推荐

Grafana 常用 MySQL Dashboard:

Dashboard ID名称说明
7362MySQL Overview综合监控
14057MySQL Dashboard简洁版
11323MySQL InnoDB MetricsInnoDB 深入监控
7991MySQL Replication复制监控
# 在 Grafana 中导入 Dashboard
# 1. 打开 Grafana → + → Import
# 2. 输入 Dashboard ID(如 7362)
# 3. 选择 Prometheus 数据源

业务场景

场景 1:日常巡检脚本

-- 日常巡检 SQL 脚本
-- 1. 连接状态
SELECT 'Connections' AS metric, 
       VARIABLE_VALUE AS value
FROM performance_schema.global_status 
WHERE VARIABLE_NAME = 'Threads_connected';

-- 2. Buffer Pool 命中率
SELECT 'Buffer Pool Hit Rate' AS metric,
       ROUND((1 - (
           SELECT VARIABLE_VALUE FROM performance_schema.global_status WHERE VARIABLE_NAME = 'Innodb_buffer_pool_reads'
       ) / NULLIF((
           SELECT VARIABLE_VALUE FROM performance_schema.global_status WHERE VARIABLE_NAME = 'Innodb_buffer_pool_read_requests'
       ), 0)) * 100, 2) AS value;

-- 3. 慢查询数
SELECT 'Slow Queries' AS metric, VARIABLE_VALUE AS value
FROM performance_schema.global_status WHERE VARIABLE_NAME = 'Slow_queries';

-- 4. 复制状态
SHOW REPLICA STATUS\G

-- 5. 磁盘空间
SELECT TABLE_SCHEMA,
       ROUND(SUM(DATA_LENGTH + INDEX_LENGTH) / 1024 / 1024, 0) AS total_mb
FROM information_schema.TABLES
GROUP BY TABLE_SCHEMA ORDER BY total_mb DESC;

-- 6. 长事务
SELECT * FROM information_schema.INNODB_TRX
WHERE TIMESTAMPDIFF(SECOND, trx_started, NOW()) > 60;

-- 7. 锁等待
SELECT * FROM performance_schema.data_lock_waits;

扩展阅读