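Operations inside a Python Tornado endpoint take far longer than the same code in a standalone script. How do I troubleshoot and optimize this?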

The endpoint contains the following logic:

import time
from collections import Counter

st = time.time()
most_common_id = Counter(id_list).most_common(5000)
ed = time.time()
print("most common cost time %s" % (ed - st))

Here, id_list is a list of 150,000 elements, and the resulting Counter holds 100,000 distinct keys.

Inside the endpoint this takes close to 400 ms, but running it directly as a script takes only 200 ms. Both figures are averages.

eggper #1

I've run into this before. The core of it is Tornado's asynchronous nature and the way you are measuring.

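First, directly comparing script execution time with endpoint response time is not a fair comparison. The script time covers only the CPU cost of your core logic, whereas the endpoint's response time (for example, measured by wrapping the handler with time.time()) also includes network latency, framework overhead, possible I/O waits (even with async code, some blocking operations drag down the whole event loop), and interference from other concurrent requests.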
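To check whether the gap comes from waiting rather than from the computation itself, it can help to record CPU time next to wall-clock time for the same block. This is only a rough sketch: `timed` and `count_top` are hypothetical helper names (not from the original post), and `time.thread_time()` requires Python 3.7+.

```python
import time
from collections import Counter
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Record wall-clock time and this thread's CPU time for the enclosed block."""
    wall_start = time.perf_counter()
    cpu_start = time.thread_time()
    try:
        yield
    finally:
        results[label] = {
            "wall": time.perf_counter() - wall_start,
            "cpu": time.thread_time() - cpu_start,
        }


def count_top(id_list, n=5000):
    # Same operation as in the question, wrapped so it can be timed anywhere
    return Counter(id_list).most_common(n)


if __name__ == "__main__":
    import random

    # Synthetic data roughly matching the sizes mentioned in the question
    id_list = [random.randrange(100_000) for _ in range(150_000)]
    results = {}
    with timed("most_common", results):
        count_top(id_list)
    r = results["most_common"]
    print(f"wall={r['wall'] * 1000:.1f} ms  cpu={r['cpu'] * 1000:.1f} ms")
```

If wall time is much larger than CPU time inside the endpoint, the thread was probably waiting on something (another thread holding the GIL, the OS scheduler, and so on). If the two are close, the computation itself is running slower in that process.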

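The most likely cause is blocking code mixed into your endpoint. Inside an async function (decorated with @gen.coroutine or declared with async def), if you directly call a synchronous, slow operation (such as requests.get(), time.sleep(), certain synchronous database queries, or a compute-heavy loop), it blocks the entire event loop and every concurrent request gets stuck behind it. The response time of each individual request then shoots up.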
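The effect is easy to reproduce with plain asyncio, which is the loop Tornado 5+ runs on. The sketch below is an illustration with made-up function names (`blocking_task`, `cooperative_task`, `compare`): it runs the same number of tasks concurrently, once with a blocking sleep and once with an awaitable sleep.

```python
import asyncio
import time


async def blocking_task(delay):
    # time.sleep() never yields, so the whole event loop stalls
    time.sleep(delay)


async def cooperative_task(delay):
    # asyncio.sleep() yields, so other tasks can run in the meantime
    await asyncio.sleep(delay)


async def run_concurrently(task, n, delay):
    """Start n copies of `task` at once and return the total elapsed time."""
    start = time.perf_counter()
    await asyncio.gather(*(task(delay) for _ in range(n)))
    return time.perf_counter() - start


def compare(n=5, delay=0.05):
    blocked = asyncio.run(run_concurrently(blocking_task, n, delay))
    cooperative = asyncio.run(run_concurrently(cooperative_task, n, delay))
    return blocked, cooperative


if __name__ == "__main__":
    blocked, cooperative = compare()
    print(f"blocking: {blocked:.3f}s  cooperative: {cooperative:.3f}s")
```

The blocking version takes roughly n times the delay because the tasks run one after another, while the cooperative version takes roughly one delay in total.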

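The key ideas for troubleshooting and optimizing, with code, are as follows:

First, locate where the time goes. Time each stage in the handler separately with a high-resolution timer instead of recording only the total.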

import time
import asyncio
from tornado.httpclient import AsyncHTTPClient
from tornado.web import RequestHandler, Application
from tornado.ioloop import IOLoop


def some_intensive_calculation():
    # Placeholder standing in for your real CPU-bound work
    return sum(i * i for i in range(1_000_000))


class TestHandler(RequestHandler):
    async def get(self):
        start_total = time.perf_counter()

        # Stage 1: network I/O (must be asynchronous)
        io_start = time.perf_counter()
        # Wrong: a synchronous request. Never do this in a handler!
        # import requests
        # resp = requests.get('http://httpbin.org/delay/1')  # blocks!

        # Right: use an async HTTP client such as aiohttp or AsyncHTTPClient
        http_client = AsyncHTTPClient()
        try:
            resp = await http_client.fetch("http://httpbin.org/delay/1")
        except Exception as e:
            self.write(f"HTTP Error: {e}")
            return
        io_time = time.perf_counter() - io_start

        # Stage 2: CPU-bound computation (consider moving it to a thread pool)
        cpu_start = time.perf_counter()
        # Heavy computation here blocks the event loop
        result = some_intensive_calculation()  # this may be the problem
        cpu_time = time.perf_counter() - cpu_start

        # Stage 3: database query (must use an async driver)
        db_start = time.perf_counter()
        # Wrong: a synchronous MySQL driver such as pymysql
        # db.cursor().execute("SELECT SLEEP(1)")

        # Right: use aiomysql, asyncpg, or similar
        # async with pool.acquire() as conn:
        #     await conn.execute("SELECT 1")
        db_time = time.perf_counter() - db_start

        total_time = time.perf_counter() - start_total

        self.write(f"""
        I/O time: {io_time:.3f}s<br>
        CPU time: {cpu_time:.3f}s<br>
        Database time: {db_time:.3f}s<br>
        Total time: {total_time:.3f}s
        """)

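Targeted optimizations:
- I/O operations: make sure every network request and database operation uses an async library (aiohttp, aiomysql, motor for MongoDB, and so on). For HTTP requests, Tornado ships with its own AsyncHTTPClient.
- CPU-bound tasks: if some_intensive_calculation() is slow, hand it to a thread pool with IOLoop.run_in_executor so it does not block the event loop. Keep in mind that pure-Python CPU work in a thread still competes for the GIL, so this keeps the loop responsive rather than making the computation itself faster.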
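```
import asyncio
import concurrent.futures
from tornado.web import RequestHandler

# Shared pool so worker threads are reused across requests
cpu_executor = concurrent.futures.ThreadPoolExecutor(max_workers=4)

class CPUHandler(RequestHandler):
    async def get(self):
        # Run the blocking function in a worker thread and await its result
        loop = asyncio.get_running_loop()
        result = await loop.run_in_executor(cpu_executor, some_intensive_calculation)
        self.write(f"Result: {result}")
```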
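- Check for global locks or contention on shared resources: for example, are you using a global variable guarded by a lock? Under concurrency this forces requests to wait one after another.

Verifying the setup: run a concurrent load test with ab or wrk to simulate many simultaneous requests. If a single request is fast on its own but slows down under concurrency, the cause is almost certainly the event-loop blocking described above.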
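Alongside a load test, one way to confirm that the loop is being blocked is to measure loop lag from inside the process: a background task sleeps for a short fixed interval and records how late each wake-up is. This is a minimal sketch using only asyncio; `monitor_loop_lag` and `demo` are hypothetical names, and the `time.sleep()` call merely simulates a blocking handler.

```python
import asyncio
import time


async def monitor_loop_lag(interval, samples, lags):
    """Sleep `interval` seconds repeatedly and record how late each wake-up is."""
    for _ in range(samples):
        start = time.perf_counter()
        await asyncio.sleep(interval)
        lags.append(time.perf_counter() - start - interval)


async def demo(block_for=0.2):
    lags = []
    monitor = asyncio.create_task(monitor_loop_lag(0.01, 30, lags))
    await asyncio.sleep(0.05)  # let the monitor take a few clean samples
    time.sleep(block_for)      # simulate a blocking call inside a handler
    await monitor
    return lags


if __name__ == "__main__":
    lags = asyncio.run(demo())
    print(f"max loop lag: {max(lags) * 1000:.0f} ms")
```

In a healthy loop the lag stays near zero. A spike close to the duration of one of your operations suggests that operation is running on the loop thread. Assuming Tornado 5 or later on Python 3, where the IOLoop wraps the asyncio loop, the same monitor task could be started inside the server process.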

Summary: replace every blocking call with an async library, and move CPU-bound tasks to a thread pool.