How to troubleshoot and fix Nginx 500 errors in a Python deployment

The server is deployed with Django + Nginx + Gunicorn, and every request to the /articles/{id} endpoint fails with a 500/502/504 error. Nginx error log:

2018/01/15 16:49:44 [error] 1006#1006: *1274 upstream prematurely closed connection while reading response header from upstream, client: 27.154.27.176, server: waterlaw.cn, request: "GET /articles/6 HTTP/1.1", upstream: "http://unix:/tmp/waterlaw.cn.socket:/articles/6", host: "waterlaw.cn", referrer: "https://waterlaw.cn/"
2018/01/15 17:07:17 [error] 1006#1006: *1338 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 27.154.27.176, server: waterlaw.cn, request: "GET / HTTP/1.1", upstream: "http://unix:/tmp/waterlaw.cn.socket:/", host: "waterlaw.cn"
2018/01/15 17:10:47 [error] 22126#22126: *7 upstream prematurely closed connection while reading response header from upstream, client: 27.154.225.154, server: waterlaw.cn, request: "GET / HTTP/1.1", upstream: "http://unix:/tmp/waterlaw.cn.socket:/", host: "waterlaw.cn"
2018/01/15 17:14:26 [error] 22236#22236: *8 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 27.154.225.154, server: waterlaw.cn, request: "GET / HTTP/1.1", upstream: "http://unix:/tmp/waterlaw.cn.socket/", host: "waterlaw.cn"
2018/01/15 17:16:03 [error] 22236#22236: *8 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 27.154.225.154, server: waterlaw.cn, request: "GET / HTTP/1.1", upstream: "http://unix:/tmp/waterlaw.cn.socket/", host: "waterlaw.cn"
2018/01/15 17:43:44 [error] 22615#22615: *1 upstream prematurely closed connection while reading response header from upstream, client: 27.154.27.176, server: waterlaw.cn, request: "GET /articles/6 HTTP/1.1", upstream: "http://unix:/tmp/waterlaw.cn.socket:/articles/6", host: "waterlaw.cn", referrer: "https://waterlaw.cn/"

Error log from the Gunicorn console:

[2018-01-15 17:51:19 +0800] [22183] [CRITICAL] WORKER TIMEOUT (pid:22730)

yibo5220 #1

The upstream is probably misconfigured, isn't it?

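sinazl #2

When Nginx returns a 500 error, it usually means something went wrong in the backend service (for example, your Python application). To track it down, start from a few key places.

First, look at the Nginx error log, usually at /var/log/nginx/error.log. It tells you why Nginx could not handle the request, for example whether the connection timed out or the backend service went down.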
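
As a rough illustration of reading that log, here is a small sketch that groups upstream errors by the status Nginx typically returns for them and by request path. The message-to-status table and the `summarize` helper are my own assumptions for illustration, not something Nginx ships, so treat the mapping as a starting point rather than a complete list.

```python
import re
from collections import Counter

# Substrings seen in Nginx upstream error messages, mapped to the status
# code Nginx typically sends to the client in that situation (assumed table).
UPSTREAM_ERRORS = {
    "upstream prematurely closed connection": 502,
    "Connection reset by peer": 502,
    "Connection refused": 502,
    "upstream timed out": 504,
}

# Pulls the request path out of: request: "GET /articles/6 HTTP/1.1"
REQUEST_RE = re.compile(r'request: "(?P<method>\S+) (?P<path>\S+)')


def summarize(log_text):
    """Count (status, path) pairs for upstream errors found in log_text."""
    counts = Counter()
    for line in log_text.splitlines():
        for needle, status in UPSTREAM_ERRORS.items():
            if needle in line:
                match = REQUEST_RE.search(line)
                path = match.group("path") if match else "?"
                counts[(status, path)] += 1
                break  # one classification per log line
    return counts
```

To use it, read the log file into a string, pass it to `summarize`, and print `most_common()` on the result. Many 502s on one path point at a worker dying on that request; 504s point at a slow response.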

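Next, focus on your Python application. The most common cause of a 500 error is an uncaught exception raised while the application code is running. Make sure your WSGI application (written with Flask or Django, for example) has proper error handling. A quick way to verify is to run the WSGI application directly and see whether the console reports any errors.

Here is a simple Flask application example, along with how to configure Nginx and Gunicorn so that errors are easier to see:

1. Python application example (app.py):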

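from flask import Flask, jsonify
from werkzeug.exceptions import HTTPException
import traceback

app = Flask(__name__)

@app.route('/')
def index():
    # Your main logic goes here.
    # For demonstration, the line below simulates a potential error.
    # result = 1 / 0  # Uncomment this line to simulate an internal error
    return jsonify({"status": "ok", "message": "Hello from Python!"})

@app.errorhandler(Exception)
def handle_all_exceptions(error):
    # Let ordinary HTTP errors (404, 405, ...) keep their own status codes
    # instead of turning them into 500s.
    if isinstance(error, HTTPException):
        return error
    # Global exception handler: returns the error details and stack trace
    # to the client (for debugging only!)
    response = {
        "status": "error",
        "message": str(error),
        "traceback": traceback.format_exc()
    }
    return jsonify(response), 500

if __name__ == '__main__':
    # Use this only for local debugging
    app.run(debug=True, host='0.0.0.0', port=5000)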

2. Run the application with Gunicorn: in production, do not call app.run() directly. Use a WSGI server such as Gunicorn.

gunicorn --bind 0.0.0.0:8000 --workers 2 --access-logfile - --error-logfile - --log-level debug app:app

--error-logfile - and --log-level debug make Gunicorn write detailed error logs to the console.
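
The same options can also live in a Gunicorn config file, which is plain Python. The sketch below is only an assumption about what might suit the setup in the question: the socket path is copied from the Nginx log in the original post, while the worker count and the 120-second timeout are illustrative values I picked. Gunicorn's default worker timeout is 30 seconds, and a worker that stays silent longer than that is killed, which is what the WORKER TIMEOUT line reports.

```python
# gunicorn.conf.py: a sketch, values below are illustrative assumptions
import multiprocessing

# Unix socket that Nginx proxies to (path taken from the Nginx log in the question)
bind = "unix:/tmp/waterlaw.cn.socket"

# A common starting point; tune for your own machine
workers = multiprocessing.cpu_count() * 2 + 1

# Seconds a worker may stay silent before the master kills it (default: 30)
timeout = 120

# "-" sends the access log to stdout and the error log to stderr
accesslog = "-"
errorlog = "-"
loglevel = "debug"
```

Start it with gunicorn -c gunicorn.conf.py app:app. Raising the timeout only hides a slow request, so it is worth finding out what makes the request slow as well.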

3. Key Nginx configuration snippet (inside the server block):

location / {
    # Proxy requests to Gunicorn
    proxy_pass http://127.0.0.1:8000;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;

    # **Important: raise the timeouts so that Nginx does not return 504 when the Python app is slow to respond**
    proxy_connect_timeout 75s;
    proxy_send_timeout 300s;
    proxy_read_timeout 300s;
}

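Summary of troubleshooting steps:

Check the Nginx log (error.log): see whether it is a connection, permission, or configuration problem.
Check the application log: run the Gunicorn command directly and read the error stack trace on the console. This is the key to finding problems in the Python code.
Check the application status: use curl http://127.0.0.1:8000 or systemctl status your-gunicorn-service to make sure the Python process is running and able to respond.
Simplify the test: temporarily change the Nginx configuration so that a test route returns a plain string directly, bypassing the Python application, to confirm whether the problem is in Nginx itself or in the backend.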
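
The "check the application status" step can also be scripted. Below is a minimal sketch using only the standard library; `probe` is a hypothetical helper name, and the URL you pass it is whatever address your backend listens on. It reports the HTTP status and how long the request took, so you can compare the time against the Gunicorn and Nginx timeouts.

```python
import time
import urllib.error
import urllib.request

# Ignore any proxy environment variables: we want to hit the backend directly.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def probe(url, timeout=5.0):
    """GET url and return (status, seconds).

    status is the HTTP status code, or None if no HTTP response arrived
    (connection refused, reset, timed out, ...).
    """
    start = time.monotonic()
    try:
        with _opener.open(url, timeout=timeout) as resp:
            status = resp.status
    except urllib.error.HTTPError as exc:
        status = exc.code  # the server answered, but with a 4xx/5xx
    except (urllib.error.URLError, OSError):
        status = None  # nothing usable came back
    return status, time.monotonic() - start
```

For example, probe("http://127.0.0.1:8000/") against the Gunicorn command shown earlier. A status of None means the process is not answering at all; a 200 that takes close to 30 seconds suggests the worker timeout is the thing to look at.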

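One-sentence advice: the key is to combine the Nginx error log with the runtime logs of the backend Python application (especially Gunicorn's detailed error output) to locate the root cause.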

sinazl #3

Want fancy 500s?
You should learn more from V2EX 😂

phonegap100 #4

Check two things: try turning off the firewall, and also try disabling SELinux.

bupafengyu #5

http://unix:/… definitely can't connect..

ionicwang #6 (OP)

proxy_pass is probably misconfigured; the problem is in the upstream part.

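yuanlaile #7

Not sure whether it was too many open file handles: every time I made a change I started gunicorn again, so several processes ended up running. After a restart it suddenly worked. Then I remembered that the blog uses jieba word segmentation, and it took 1.5 seconds when caching.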
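
If a slow step like that is suspected, one way to confirm it is to time it and log the result. This is only a sketch with the standard library; `timed`, the "slow_steps" logger name, and the one-second threshold are all names and values I made up for illustration.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("slow_steps")


@contextmanager
def timed(label, warn_after=1.0):
    """Log how long the wrapped block took.

    Logs at WARNING when the block took at least warn_after seconds,
    otherwise at DEBUG.
    """
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        level = logging.WARNING if elapsed >= warn_after else logging.DEBUG
        logger.log(level, "%s took %.3fs", label, elapsed)
```

Wrap the suspect call, for example the first word-segmentation call in the view, in `with timed("segmentation"):` and watch the Gunicorn error log. A step that is slow only on the first request in each worker usually means some one-time initialization is happening inside the request.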