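How do you analyze Nginx error logs?

Posted by lin on 2025-8-26 17:21:24

In operations work, the Nginx error log is a primary source for troubleshooting site failures and performance problems. Whether the site is returning 502, 504, 403, or 404 errors, or is suffering from slow responses or connection failures, the error log usually provides the key clues. This post covers what the Nginx error log contains, the most common errors, and how to approach analyzing them.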
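I. Where the Nginx error log lives
The error log is configured with the error_log directive in nginx.conf:
error_log /var/log/nginx/error.log warn;
Path: /var/log/nginx/error.log (the location may differ between distributions)
Log levels, from most to least verbose:

[*]debug: debugging information, the most detailed
[*]info: general information
[*]notice: notable events
[*]warn: warnings
[*]error: errors (commonly used; this is the default when no level is given)
[*]crit: critical errors
[*]alert: conditions that need immediate action
[*]emerg: emergencies, the system is unusable
A message is written only if its severity is at or above the configured level. For troubleshooting, error or warn is the usual setting; when more detail is needed, the level can be switched temporarily to debug, which only works if Nginx was built with --with-debug.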
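To confirm which log file and level are actually in effect, the error_log directives can be pulled out of the configuration text. This is a minimal sketch in POSIX shell, assuming the configuration is piped in on stdin; the function name list_error_logs is made up for illustration.

```shell
# list_error_logs (hypothetical helper): read Nginx configuration text on
# stdin and print every active error_log directive as "path level".
# Commented-out directives are skipped. A directive without a level is
# reported as "error", which is the Nginx default.
list_error_logs() {
  awk '
    $1 == "error_log" {
      path = $2; level = $3
      sub(/;$/, "", path); sub(/;$/, "", level)
      if (level == "") level = "error"
      print path, level
    }'
}
```

Usage, assuming the nginx binary is on the PATH and the user may read the configuration: nginx -T 2>/dev/null | list_error_logs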
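II. Nginx error log format
The basic format of an error log entry is:
YYYY/MM/DD HH:MM:SS [level] PID#TID: *CID message, client: IP, server: domain, request: "METHOD URL HTTP/version", upstream: "backend", host: "domain"
Example:
2025/08/26 16:22:10 [error] 1234#5678: *90 connect() failed (111: Connection refused) while connecting to upstream, client: 192.168.1.100, server: www.test.com, request: "GET /index.php HTTP/1.1", upstream: "http://127.0.0.1:9000/index.php", host: "www.test.com"
Key fields:

[*]Time: when the error occurred
[*]Level: error, warn, crit, and so on
[*]PID#TID: process ID and thread ID
[*]CID: connection ID (used to follow all entries that belong to the same connection)
[*]message: description of the error
[*]client: client IP address
[*]server: name of the virtual host (server block) that handled the request
[*]request: the request line (method, URL, protocol version)
[*]upstream: upstream server address (present when reverse proxying)
[*]host: the Host header of the request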
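The field layout above can be turned into a small parser. This is a sketch only, assuming each line carries a bracketed level such as [error] right after the timestamp (as real Nginx entries do) and includes a client field; parse_error_line is a hypothetical name.

```shell
# parse_error_line (hypothetical helper): read error-log lines on stdin and
# print "time|level|pid|tid|cid|client" for each line that matches the
# layout "DATE TIME [level] PID#TID: *CID ... client: IP, ...".
# Lines that do not match (for example entries without a client) are dropped.
parse_error_line() {
  sed -nE 's/^([0-9]{4}\/[0-9]{2}\/[0-9]{2} [0-9:]{8}) \[([a-z]+)\] ([0-9]+)#([0-9]+): \*([0-9]+) .*client: ([^,]+),.*$/\1|\2|\3|\4|\5|\6/p'
}
```

Usage, assuming the default log path: parse_error_line < /var/log/nginx/error.log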
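III. Common error log cases
The excerpts in this section omit the leading timestamp and level.
1. 502 Bad Gateway
1234#5678: *100 connect() failed (111: Connection refused) while connecting to upstream, client: 192.168.1.2, server: www.test.com, request: "GET /login HTTP/1.1", upstream: "http://127.0.0.1:9000/login"
Causes:

[*]The backend service (PHP-FPM, Tomcat, etc.) is not running
[*]The upstream port is misconfigured
[*]A firewall is blocking the connection
Troubleshooting:

[*]Check whether PHP-FPM is running: systemctl status php-fpm (the unit name varies by distribution and PHP version)
[*]Check that the upstream configuration points to the right address and port
[*]Review the firewall rules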
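When several backends are configured, it helps to see which upstream address is refusing connections most often. This is a rough sketch, assuming the message text shown in the excerpt above; refused_upstreams is a hypothetical name. It groups by scheme://host:port and keeps unix-socket upstreams unshortened.

```shell
# refused_upstreams (hypothetical helper): read error-log lines on stdin and
# print "count upstream" for every upstream that refused a connection,
# most frequent first. The request path is stripped from the upstream URL
# so that /login and /index.php on the same backend are counted together.
refused_upstreams() {
  awk '
    index($0, "connect() failed (111: Connection refused)") &&
    match($0, /upstream: "[^"]*"/) {
      # 11 = length of the prefix upstream: " ; 12 also drops the closing quote
      up = substr($0, RSTART + 11, RLENGTH - 12)
      if (up !~ /:\/\/unix:/ && match(up, /^[a-z]+:\/\/[^\/]+/))
        up = substr(up, 1, RLENGTH)
      count[up]++
    }
    END { for (u in count) print count[u], u }' | sort -rn
}
```

Usage, assuming the default log path: refused_upstreams < /var/log/nginx/error.log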
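2. 504 Gateway Timeout
1234#5678: *200 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 192.168.1.2, server: www.test.com, request: "POST /order HTTP/1.1", upstream: "http://127.0.0.1:9000/order"
Causes:

[*]The backend takes too long to process the request
[*]The timeout configured in Nginx or in the upstream is too short
Fixes:
Optimize the backend code to reduce processing time.
Adjust the timeout settings. Each of the directives below defaults to 60s, so a value has to be set above 60s to make a difference; for the "while reading response header" case shown above, proxy_read_timeout is the one that applies. FastCGI backends such as PHP-FPM use the fastcgi_* equivalents (for example fastcgi_read_timeout):

[*]proxy_connect_timeout 60s;
[*]proxy_send_timeout 60s;
[*]proxy_read_timeout 60s;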
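Before changing a timeout it is worth checking which phase is actually timing out, since each phase is governed by a different directive. This is a minimal sketch, assuming the standard "upstream timed out ... while ..." wording of the message; timeout_phases is a hypothetical name.

```shell
# timeout_phases (hypothetical helper): read error-log lines on stdin and
# count "upstream timed out" entries by the phase named in the message.
# The directive shown next to each phase is the proxy_* timeout that
# governs it; FastCGI backends use the fastcgi_* equivalents instead.
timeout_phases() {
  awk '
    /upstream timed out/ {
      if      ($0 ~ /while connecting to upstream/)      n_connect++
      else if ($0 ~ /while sending request to upstream/) n_send++
      else if ($0 ~ /while reading/)                     n_read++
      else                                               n_other++
    }
    END {
      printf "connect (proxy_connect_timeout): %d\n", n_connect
      printf "send (proxy_send_timeout): %d\n", n_send
      printf "read (proxy_read_timeout): %d\n", n_read
      printf "other: %d\n", n_other
    }'
}
```

Usage, assuming the default log path: timeout_phases < /var/log/nginx/error.log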
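3. 403 Forbidden
1234#5678: *300 access forbidden by rule, client: 192.168.1.3, server: www.test.com, request: "GET /admin HTTP/1.1"
Causes:

[*]Insufficient file permissions (the nginx user cannot access the file); this case is logged as "(13: Permission denied)" rather than "access forbidden by rule"
[*]The configuration restricts access to the directory with allow/deny rules; this is what the "access forbidden by rule" message above indicates
Troubleshooting:

[*]Check the directory permissions: ls -ld /var/www/html/admin
[*]Check the deny and allow rules in the Nginx configuration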
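For the permission case, checking only the final directory can miss a parent directory that the worker user cannot traverse. The sketch below walks the whole chain; show_path_perms is a hypothetical name and an absolute path is assumed.

```shell
# show_path_perms (hypothetical helper): print `ls -ld` for a path and for
# every parent directory up to /. The Nginx worker user needs execute (x)
# permission on each directory in the chain and read (r) on the target.
# The body runs in a subshell so the loop variables do not leak.
show_path_perms() (
  p=$1
  while :; do
    ls -ld -- "$p"
    parent=$(dirname -- "$p")
    # Stop once dirname no longer moves up (reached / or .)
    [ "$parent" = "$p" ] && break
    p=$parent
  done
)
```

Usage with the path from the example above: show_path_perms /var/www/html/admin. On systems with util-linux installed, namei -l gives a similar per-component listing.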
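4. 404 Not Found
1234#5678: *400 open() "/var/www/html/notfound.html" failed (2: No such file or directory), client: 192.168.1.4, server: www.test.com, request: "GET /notfound.html HTTP/1.1"
Causes:

[*]The file does not exist
[*]The root path is misconfigured
Troubleshooting:

[*]Confirm that the file exists at the path shown in the open() message
[*]Check that the location and root directives are correct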
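A wrong root usually shows up as many different files failing under the same unexpected directory, so listing the most frequently missing paths is a quick check. This is a sketch assuming the open() message format shown above; missing_files is a hypothetical name.

```shell
# missing_files (hypothetical helper): read error-log lines on stdin and
# print "count path" for files Nginx failed to open with
# "2: No such file or directory", most frequent first.
missing_files() {
  sed -n 's/.*open() "\([^"]*\)" failed (2: No such file or directory).*/\1/p' |
    sort | uniq -c | sort -rn |
    # Normalize the padded `uniq -c` output to "count path"
    awk '{ n = $1; sub(/^[ \t]*[0-9]+[ \t]+/, ""); print n, $0 }'
}
```

Usage, assuming the default log path: missing_files < /var/log/nginx/error.log | head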
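5. Too many open files
1234#5678: *500 open() "/var/log/nginx/access.log" failed (24: Too many open files)
Cause:

[*]The process has run out of file descriptors
Fixes:

[*]Raise the ulimit -n value
[*]Edit /etc/security/limits.conf (this does not apply to services started by systemd; there the limit is set with LimitNOFILE in the unit file, or with worker_rlimit_nofile in nginx.conf)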
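The limit that matters is the one the running process actually has, which can differ from what ulimit -n reports in an interactive shell. A Linux-only sketch that reads it from /proc follows; fd_usage is a hypothetical name.

```shell
# fd_usage (hypothetical helper, Linux only): for a given PID, print the
# number of open file descriptors and the soft/hard "Max open files"
# limits of that running process, as reported under /proc.
# The body runs in a subshell so the variables do not leak.
fd_usage() (
  pid=$1
  open=$(ls "/proc/$pid/fd" 2>/dev/null | wc -l | tr -d ' ')
  # /proc/PID/limits columns: "Max open files  <soft>  <hard>  files"
  limits=$(awk '/^Max open files/ { print $4, $5 }' "/proc/$pid/limits" 2>/dev/null)
  soft=${limits% *}
  hard=${limits#* }
  echo "pid=$pid open=$open soft=$soft hard=$hard"
)
```

Usage, assuming pgrep is available and the command is run as root so other users' /proc entries are readable: for pid in $(pgrep -f "nginx: worker"); do fd_usage "$pid"; done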
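IV. Log analysis techniques
Follow the log in real time
tail -f /var/log/nginx/error.log
Filter by keyword
grep "upstream timed out" /var/log/nginx/error.log
grep "connect() failed" /var/log/nginx/error.log
Count occurrences of an error
grep -c "connect() failed" /var/log/nginx/error.log
The error log records error messages rather than HTTP status codes, so grepping it for "502" mostly matches unrelated digits in timestamps, PIDs, and addresses. Count by message text here, and by status code in access.log.
Find the errors for a single IP (-F makes grep treat the dots literally)
grep -F "client: 192.168.1.100," /var/log/nginx/error.log
Combine with access.log

[*]error.log shows where the error occurred
[*]access.log shows request traffic and user behavior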
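To line the two logs up, it helps to bucket error entries by minute and then look at the same minute in access.log. This is a minimal sketch, assuming each entry starts with the timestamp followed by a bracketed level; errors_per_minute is a hypothetical name.

```shell
# errors_per_minute (hypothetical helper): read error-log lines on stdin and
# print "YYYY/MM/DD HH:MM level count", sorted by time, so that spikes can
# be matched against the same minute in access.log.
errors_per_minute() {
  awk '
    $1 ~ /^[0-9][0-9][0-9][0-9]\/[0-9][0-9]\/[0-9][0-9]$/ && $3 ~ /^\[[a-z]+\]$/ {
      level = substr($3, 2, length($3) - 2)     # strip the brackets
      key = $1 " " substr($2, 1, 5) " " level   # keep HH:MM only
      count[key]++
    }
    END { for (k in count) print k, count[k] }' | sort
}
```

Usage, assuming the default log path: errors_per_minute < /var/log/nginx/error.log | tail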
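V. Summary

[*]error.log is the main entry point for troubleshooting site problems and makes it possible to locate faults quickly.
[*]Common problem types: 502, 504, 403, 404, and file descriptor exhaustion.
[*]When analyzing an entry, combine the error message, the client information, and the upstream state.
[*]In day-to-day operations, use access.log and monitoring tools alongside error.log to resolve problems efficiently.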
