Practical Reflections on Python Thread Safety and the GIL - CFANZ Programming Community

Meeting the GIL: a design people both love and hate

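I first ran into the GIL (Global Interpreter Lock) while writing a multithreaded crawler. With 10 threads fetching data, CPU usage refused to climb and performance was no better than the single-threaded version. Printing the thread identity via threading.current_thread() confirmed that the threads had really started, yet they were not executing in parallel. That was when I realized Python has something called the GIL.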

import threading

def worker():
    print(f"Thread {threading.current_thread().name} started")
    # Simulate a CPU-bound operation
    total = 0  # avoid shadowing the built-in sum()
    for i in range(10000000):
        total += i

threads = []
for i in range(4):
    t = threading.Thread(target=worker)
    threads.append(t)
    t.start()

for t in threads:
    t.join()

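Run this code and you will find that, although several threads are started, they take turns executing rather than running truly in parallel. This is the GIL at work: it ensures that only one thread executes Python bytecode at any given moment.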
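One way to observe this is to time the same CPU-bound loop run sequentially and run in threads. The following is a minimal sketch; the helper names (`cpu_work`, `run_sequential`, `run_threaded`) and the loop size are illustrative assumptions, and the exact timings depend on the machine and the interpreter build.

```python
import threading
import time

def cpu_work(n):
    # Pure-Python loop: on a standard CPython build it needs the GIL to run
    total = 0
    for i in range(n):
        total += i
    return total

def run_sequential(n, repeats):
    # Run the workload several times, one after another
    start = time.perf_counter()
    for _ in range(repeats):
        cpu_work(n)
    return time.perf_counter() - start

def run_threaded(n, repeats):
    # Run the same workload once per thread, all threads started together
    threads = [threading.Thread(target=cpu_work, args=(n,)) for _ in range(repeats)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

seq = run_sequential(2_000_000, 4)
thr = run_threaded(2_000_000, 4)
print(f"sequential: {seq:.2f}s, threaded: {thr:.2f}s")
```

On a standard CPython build the two numbers are usually close to each other, which is what you would expect if the threads are taking turns rather than running side by side.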

Why the GIL exists

After reading the source code and related material, this is how I understand the original motivation for the GIL:

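Simpler CPython implementation: memory management does not have to deal with contention between threads
Protection of core data structures: reference counts, for example, need no locking of their own
Compatibility with C extensions: many C extensions are written on the assumption that no other thread is running Python code at the same time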
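To make the reference-counting point concrete, here is a small sketch using `sys.getrefcount`. Every CPython object carries such a counter, and it is updated whenever a reference is created or dropped; the GIL is what lets the interpreter update it without a per-object lock. The variable names are illustrative.

```python
import sys

obj = object()
before = sys.getrefcount(obj)  # includes the temporary reference held by the call itself
alias = obj                    # bind a second name to the same object
after = sys.getrefcount(obj)

# Binding another name raised the count by one
print(before, after)
```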

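But this brings an obvious drawback: multithreaded CPU-bound programs gain nothing from extra threads and can even get slower. I felt this keenly when processing images: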

from PIL import Image
import threading

def process_image(img_path):
    img = Image.open(img_path)
    # Some CPU-bound operations...
    
# Processing the images with multiple threads was actually slower!

Thread safety in practice

Scenario 1: The counter trap

Early on I wrote code like this, and the result kept coming out wrong:

import threading

count = 0

def increment():
    global count
    for _ in range(100000):
        count += 1

threads = [threading.Thread(target=increment) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(count)  # can be less than 1000000; how often depends on the CPython version
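The reason `count += 1` can lose updates is that it compiles to several bytecode instructions: load the global, add, store the global. A thread switch between the load and the store lets another thread's update be overwritten. A quick way to see this is the standard `dis` module; the function name `increment_once` is just for illustration, and the exact opcode names vary between Python versions.

```python
import dis

count = 0

def increment_once():
    global count
    count += 1

# Collect the opcode names that make up the function body
ops = [ins.opname for ins in dis.get_instructions(increment_once)]
print(ops)  # contains a LOAD_GLOBAL, an add instruction, and a STORE_GLOBAL
```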

Solutions:

Use threading.Lock()
Use queue.Queue
Rely on operations that are already synchronized internally (queue.Queue, for example, does its own locking)

from threading import Lock

counter = 0
lock = Lock()

def safe_increment():
    global counter
    for _ in range(100000):
        with lock:
            counter += 1
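The queue.Queue option from the list above can be sketched as follows: each thread counts in a local variable and hands its result over through a queue, so no shared integer is ever modified by two threads. This is one possible arrangement, not the only one; `count_locally` and `results` are illustrative names.

```python
import queue
import threading

results = queue.Queue()

def count_locally(n):
    # Each thread works on its own local variable, so there is nothing to race on
    local = 0
    for _ in range(n):
        local += 1
    results.put(local)  # Queue.put is synchronized internally

threads = [threading.Thread(target=count_locally, args=(100_000,)) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# All producers have finished, so draining until empty is safe here
total = 0
while not results.empty():
    total += results.get()

print(total)  # 1000000
```

Compared with taking a lock on every increment, this touches the shared structure only once per thread.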

Scenario 2: Thread safety of lists

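Even with the GIL, not every list operation is safe. A single append() is fine: in CPython it runs as one step under the GIL, so this example reliably ends up with 200000 elements:

lst = []

def append_numbers():
    for i in range(100000):
        lst.append(i)

threads = [threading.Thread(target=append_numbers) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(lst))  # 200000 on CPython with the GIL: each append() is atomic

The catch is that only individual operations such as append() are atomic. Compound operations, for example lst[0] += 1 or checking that the list is non-empty and then calling pop(), span several bytecodes. The GIL protects each bytecode, but a thread switch can happen between them, so the operation as a whole can be interrupted and still needs a lock.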
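A typical list operation that does need a lock is check-then-act: testing whether the list has items and then removing one. The sketch below, with illustrative names (`tasks`, `done`, `drain`), keeps the check and the pop inside one critical section so that no two threads can act on the same observation.

```python
import threading

tasks = list(range(1000))
done = []
lock = threading.Lock()

def drain():
    while True:
        with lock:
            # The emptiness check and the pop must happen as one step
            if not tasks:
                return
            item = tasks.pop()
            done.append(item)

threads = [threading.Thread(target=drain) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(done))  # 1000: every task was taken exactly once
```

Without the lock, one thread could see a non-empty list, lose the CPU, and then call pop() on a list that another thread has emptied in the meantime.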

Ways around the GIL

After a good deal of trial and error, these are the options I settled on:

Multiprocessing instead of multithreading:

from multiprocessing import Pool

def cpu_bound_task(x):
    return x*x

if __name__ == '__main__':
    with Pool(4) as p:
        print(p.map(cpu_bound_task, range(10)))

C extensions: write the performance-critical code in C, where it can release the GIL while it runs
Async IO: for I/O-bound tasks

import asyncio

async def fetch_data():
    # Simulate a network request
    await asyncio.sleep(1)
    return "data"

async def main():
    tasks = [fetch_data() for _ in range(10)]
    results = await asyncio.gather(*tasks)
    print(results)

asyncio.run(main())

Jython/IronPython: these implementations have no GIL
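For the async IO option, the benefit is easy to measure: waits overlap, so ten simulated requests take roughly as long as one. The sketch below uses `asyncio.sleep` as a stand-in for a real network call; `fake_request` and the 0.2 second delay are assumptions made for the demonstration.

```python
import asyncio
import time

async def fake_request(delay):
    # Stand-in for a network call: yields control while "waiting"
    await asyncio.sleep(delay)
    return delay

async def main():
    start = time.perf_counter()
    # Schedule all ten coroutines at once and wait for every result
    results = await asyncio.gather(*(fake_request(0.2) for _ in range(10)))
    elapsed = time.perf_counter() - start
    return results, elapsed

results, elapsed = asyncio.run(main())
print(f"{len(results)} requests in {elapsed:.2f}s")
```

Run one after another, the same ten waits would add up to about 2 seconds. Everything here happens in a single thread, so the GIL is not a factor.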

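Best-practice recommendations

Based on my project experience:

For I/O-bound work, use multithreading or async IO
For CPU-bound work, use multiprocessing
Always protect shared resources with a lock
Consider the thread pool in concurrent.futures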

from concurrent.futures import ThreadPoolExecutor

def task(n):
    return n*n

with ThreadPoolExecutor(max_workers=4) as executor:
    futures = [executor.submit(task, i) for i in range(10)]
    results = [f.result() for f in futures]
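Thread pools pay off mainly for I/O-bound work. As a small variation on the example above, the sketch below uses `executor.map`, which returns results in input order, with `time.sleep` standing in for a blocking download; `fake_download` is a hypothetical function.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_download(n):
    # time.sleep releases the GIL, as real blocking I/O does
    time.sleep(0.1)
    return n * n

with ThreadPoolExecutor(max_workers=4) as executor:
    # map() yields results in the same order as the inputs
    results = list(executor.map(fake_download, range(8)))

print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```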