A question about thread safety in CPython

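My understanding of Python used to be fairly shallow, so I recently started studying it in more depth, and I ran into a problem while looking at multithreading and thread safety.

https://docs.python.org/3/faq/library.html#what-kinds-of-global-value-mutation-are-thread-safe says that some operations on Python's built-in data structures are not thread-safe, for example D[x] = D[x] + 1.

So I wrote a very simple test: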

import threading
import sys

sys.setswitchinterval(1)
d = {1: 0}

def func():
    d[1] += 1

threads = []
for _ in range(1000000):
    t = threading.Thread(target=func)
    threads.append(t)
    t.start()
for thread in threads:
    thread.join()
print(d)

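But I ran it many times and the result was always correct (it prints {1: 1000000}). I looked at func() with dis, and it does compile to several bytecode instructions, so in theory there should be a race condition.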

>>> dis.dis(func)
 11           0 LOAD_GLOBAL              0 (d)
              2 LOAD_CONST               1 (1)
              4 DUP_TOP_TWO
              6 BINARY_SUBSCR
              8 LOAD_CONST               1 (1)
             10 INPLACE_ADD
             12 ROT_THREE
             14 STORE_SUBSCR
             16 LOAD_CONST               0 (None)
             18 RETURN_VALUE

I don't really understand where the problem is. Is one million just not large enough?
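To check this on your own interpreter rather than rely on the listing above, here is a small sketch that collects the opcode names for the same func. The opcode names differ between CPython versions, so no expected output is shown. The only point is that the increment spans several instructions, with the read and the store as separate steps.

```python
import dis
import sys

d = {1: 0}

def func():
    d[1] += 1

# Opcode names for func on whichever interpreter runs this script.
ops = [ins.opname for ins in dis.get_instructions(func)]
print(sys.version_info[:2], ops)
```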


sinazl #1

Go read up on the GIL.

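zlyuanteng #2

CPython's GIL (global interpreter lock) is the core of its thread-safety story. It ensures that only one thread executes Python bytecode at any moment, which protects the interpreter's internal state from data races. But that is all it protects: it does not protect your own data.

If your code does I/O (file reads and writes, network requests), the GIL is released while waiting on the I/O and other threads can run, so multithreading still speeds up I/O-bound tasks. For CPU-bound tasks, though, threads cannot run truly in parallel because of the GIL.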
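As a rough illustration of the I/O case, the sketch below uses time.sleep as a stand-in for a blocking I/O call (an assumption made for the demo; a real file or socket wait behaves similarly in CPython). Four threads that each block for 0.2 s should finish in roughly 0.2 s of wall time rather than 0.8 s.

```python
import threading
import time

def wait_a_bit():
    # Blocking call; CPython releases the GIL while the thread sleeps.
    time.sleep(0.2)

start = time.perf_counter()
threads = [threading.Thread(target=wait_a_bit) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

print(f"4 threads x 0.2 s took {elapsed:.2f} s")
```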

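Note in particular: the GIL does not make your code thread-safe! If threads share mutable data (lists, dicts), you still need to lock manually. For example, when several threads modify the same list, things can go wrong even with the GIL, because the operation is not atomic.

In short, use threading.Lock to protect modifications to shared data. For CPU-bound tasks, consider multiprocessing to get around the GIL.

Summary: know where the GIL's guarantees end, and lock when you need to.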
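A minimal sketch of that advice, applied to the dict counter from the original question. The names counter, lock and add_many are made up for this example. Holding the lock around the whole read-modify-write means the total does not depend on when the interpreter switches threads.

```python
import threading

counter = {1: 0}
lock = threading.Lock()

def add_many(n):
    for _ in range(n):
        with lock:  # the read, the add and the store all happen under the lock
            counter[1] += 1

threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # {1: 80000}
```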

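ionicwang #3

The interpreter can give up the GIL while it is running bytecode, and I also added sys.setswitchinterval(1) to make that happen more often. With a non-atomic write to a shared resource under those conditions, there ought to be a race condition.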

phonegap100 #4

Maybe func runs much faster than a thread can be created.

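yuanlaile #5

I suspected that too and added a sleep inside func, but it still returned the correct value. I only ran it once, though (a million sleeps is just too slow). Let's see whether there are other possible explanations.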

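bupafengyu #6

sys.setswitchinterval(interval)
Set the interpreter's thread switch interval (in seconds). This floating-point value determines the ideal duration of the "timeslices" allocated to concurrently running Python threads.

The argument is in seconds, so 1 asks for one-second timeslices. That makes switches rarer, not more frequent. One second is far too long.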
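For reference, a small sketch of requesting a shorter interval and restoring the old one afterwards. The switch_interval helper is made up for this example; only sys.getswitchinterval and sys.setswitchinterval are real APIs. The value 1e-4 is an arbitrary choice for the demo.

```python
import contextlib
import sys

@contextlib.contextmanager
def switch_interval(seconds):
    """Temporarily set the thread switch interval, then restore it."""
    old = sys.getswitchinterval()
    sys.setswitchinterval(seconds)
    try:
        yield
    finally:
        sys.setswitchinterval(old)

default = sys.getswitchinterval()
with switch_interval(1e-4):  # much shorter than the 1 second used in the test
    inside = sys.getswitchinterval()
after = sys.getswitchinterval()

print(default, inside, after)
```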

ionicwang #7

Loop a million times inside func instead.

eggper #8

Do the loop inside the function; you don't need that many threads.
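A sketch of what that could look like, assuming 4 threads and 100,000 iterations each (both numbers are arbitrary). It also asks for a very short switch interval. Whether any updates are actually lost depends on the CPython version and build, so no expected output is shown. A result below the expected total means the race was hit.

```python
import sys
import threading

N_THREADS = 4
N_ITERS = 100_000
d = {1: 0}

def func():
    for _ in range(N_ITERS):
        d[1] += 1  # unsynchronised read-modify-write on shared data

old = sys.getswitchinterval()
sys.setswitchinterval(1e-6)  # request very frequent thread switches
try:
    threads = [threading.Thread(target=func) for _ in range(N_THREADS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
finally:
    sys.setswitchinterval(old)  # restore the previous interval

expected = N_THREADS * N_ITERS
print(f"got {d[1]}, expected {expected}")
```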

htzhanglong #9

# encoding: utf-8
import threading
import time

d = {1: 0}

def func():
    for _ in range(10):
        d[1] += 1
        time.sleep(0.01)
        d[1] += 1

threads = []
for _ in range(1000):
    t = threading.Thread(target=func)
    threads.append(t)
    t.start()
for thread in threads:
    thread.join()
print(d)

Under Python 2 this gives the wrong result.

Under Python 3 it is fine.

https://www.reddit.com/r/Python/comments/4vyg2m/threadinglocking_is_4x_as_fast_on_python_3_vs/d62giao/

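phonegap100 #10

I tried again, and it does go wrong under Python 2. I compared the bytecode for func and it is different too. Bytecode has no standard anyway and can change at any time, so I'll stop splitting hairs. Thanks.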