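How do I fix a deadlock in multithreaded Python?
Here is the scenario: I start multiple threads that call an external API, and after a while every thread ends up deadlocked. Environment: Python 2.6.7, CentOS 7.1, urllib2. The problem does not occur on SUSE. The stack dump is below: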
Thread 161 (Thread 0x7f80de4e9700 (LWP 12459)):
#0 0x00007f80e4cacb6c in __lll_lock_wait_private () from /lib64/libc.so.6
#1 0x00007f80e4cc2efd in _L_lock_746 () from /lib64/libc.so.6
#2 0x00007f80e4cc2cb5 in __check_pf () from /lib64/libc.so.6
#3 0x00007f80e4c88f69 in getaddrinfo () from /lib64/libc.so.6
#4 0x00007f80e12faa3c in socket_getaddrinfo (self=<optimized out>, args=<optimized out>) at /home/basic/Python-2.7.6/Modules/socketmodule.c:4198
#5 0x00000000004b5726 in call_function (oparg=<optimized out>, pp_stack=0x7f80de4e6b30) at Python/ceval.c:4021
#6 PyEval_EvalFrameEx (f=f@entry=0x7f7fa403c980, throwflag=throwflag@entry=0) at Python/ceval.c:2666
It looks like the deadlock is triggered by getaddrinfo. Has anyone else run into this? Any suggestions would be appreciated, thanks!
3 replies
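Deadlocks are fairly common in multithreaded code. The usual cause is several threads each waiting for a lock that another thread holds, so all of them stall. The core fix is simple: make every thread acquire its locks in the same fixed order.
For example, with lock A and lock B, if thread 1 takes A and then B, thread 2 must also take A and then B. That rules out the circular wait where one thread holds A and waits for B while the other holds B and waits for A.
Here is a concrete example. The code below first reproduces a classic deadlock and then fixes it by acquiring the locks in a fixed order: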
import threading
import time


# Wrong: this version can deadlock
def bad_example():
    lock_a = threading.Lock()
    lock_b = threading.Lock()

    def thread_1():
        with lock_a:
            print("Thread 1 acquired lock A")
            time.sleep(0.1)  # wait briefly so the other thread can grab B
            with lock_b:
                print("Thread 1 acquired lock B")

    def thread_2():
        with lock_b:
            print("Thread 2 acquired lock B")
            time.sleep(0.1)
            with lock_a:
                print("Thread 2 acquired lock A")

    t1 = threading.Thread(target=thread_1)
    t2 = threading.Thread(target=thread_2)
    t1.start()
    t2.start()
    t1.join()
    t2.join()


# Fixed: both threads take the locks in the same order
def fixed_example():
    lock_a = threading.Lock()
    lock_b = threading.Lock()

    # Key point: both threads acquire A first, then B
    def thread_1():
        with lock_a:
            print("Thread 1 acquired lock A")
            time.sleep(0.1)
            with lock_b:
                print("Thread 1 acquired lock B")

    def thread_2():
        with lock_a:  # note: this thread also acquires A first
            print("Thread 2 acquired lock A")
            time.sleep(0.1)
            with lock_b:
                print("Thread 2 acquired lock B")

    t1 = threading.Thread(target=thread_1)
    t2 = threading.Thread(target=thread_2)
    t1.start()
    t2.start()
    t1.join()
    t2.join()


if __name__ == "__main__":
    print("=== Broken example (may hang) ===")
    # bad_example()  # uncommenting this may deadlock and never return
    print("\n=== Fixed example ===")
    fixed_example()
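When the locks are passed around and callers cannot easily agree on an order by hand, the same rule can be enforced in one place. The following is only a sketch: `acquire_in_order` and `run_ordering_demo` are hypothetical names made up for illustration, and sorting by `id()` is just one possible way to get a consistent order within a single process.

```python
import threading
from contextlib import contextmanager


@contextmanager
def acquire_in_order(*locks):
    """Hypothetical helper: acquire locks in one consistent order (by id)."""
    # Deduplicate, then sort so every caller locks in the same sequence,
    # regardless of the order in which the locks were passed in.
    ordered = sorted(set(locks), key=id)
    acquired = []
    try:
        for lock in ordered:
            lock.acquire()
            acquired.append(lock)
        yield
    finally:
        # Release in reverse order of acquisition.
        for lock in reversed(acquired):
            lock.release()


def run_ordering_demo(iterations=1000):
    lock_a = threading.Lock()
    lock_b = threading.Lock()
    results = []

    def worker(name, first, second):
        # The two workers pass the locks in opposite orders on purpose.
        for _ in range(iterations):
            with acquire_in_order(first, second):
                results.append(name)

    t1 = threading.Thread(target=worker, args=("t1", lock_a, lock_b))
    t2 = threading.Thread(target=worker, args=("t2", lock_b, lock_a))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return results
```

Calling `run_ordering_demo()` should finish even though the two workers name the locks in opposite orders, because the helper normalizes the acquisition order before locking.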
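Also, Python's threading module provides RLock (a reentrant lock): the same thread can acquire it multiple times without blocking itself, which avoids self-deadlock in some nested-call scenarios. The most fundamental fix, though, is still to design the lock acquisition order carefully.
In short: set a rule, and have everyone take the locks in that order.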
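To illustrate the RLock point, here is a minimal sketch of a nested call that would block forever with a plain `threading.Lock` but works with `threading.RLock`. The `Counter` class and `run_counter_demo` function are hypothetical and exist only for this example.

```python
import threading


class Counter(object):
    """Hypothetical counter whose methods call each other under one lock."""

    def __init__(self):
        # With threading.Lock() here, increment_twice() would deadlock,
        # because increment() tries to take the lock the thread already holds.
        self._lock = threading.RLock()
        self.value = 0

    def increment(self):
        with self._lock:
            self.value += 1

    def increment_twice(self):
        with self._lock:
            # Re-entrant acquisition by the same thread is allowed.
            self.increment()
            self.increment()


def run_counter_demo(num_threads=4, calls_per_thread=500):
    counter = Counter()

    def worker():
        for _ in range(calls_per_thread):
            counter.increment_twice()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

Note that RLock only protects a thread from deadlocking on itself. It does not help when two different threads take two locks in opposite orders.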
I don't quite follow this, but my OS textbook says it is enough to acquire critical resources in one consistent order.

