Is there a Pool class for worker threads, similar to the multiprocessing module's Pool class?
I like, for example, the easy way to parallelize a map function:
    import multiprocessing

    def long_running_func(p):
        c_func_no_gil(p)  # the asker's C function; its wrapper releases the GIL

    p = multiprocessing.Pool(4)
    xs = p.map(long_running_func, range(100))
However, I would like to do it without the overhead of creating new processes.
I know about the GIL. However, in my use case, the function will be an IO-bound C function for which the Python wrapper will release the GIL before the actual function call.
Do I have to write my own thread pool?
In Python 3 you can use concurrent.futures.ThreadPoolExecutor, i.e.:
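    from concurrent.futures import ThreadPoolExecutor

    executor = ThreadPoolExecutor(max_workers=10)
    a = executor.submit(my_function)  # returns a Future; a.result() blocks for the value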
See the concurrent.futures documentation for more info and examples.
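To mirror the `p.map(...)` call from the question, the executor also offers a `map` method. The following is a minimal sketch, assuming a hypothetical `long_running_func` in which `time.sleep` stands in for the GIL-releasing C call; the pool size of 4 is only illustrative.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def long_running_func(p):
    # Hypothetical stand-in for the question's C call:
    # time.sleep also releases the GIL while it waits.
    time.sleep(0.01)
    return p * 2

# Leaving the with-block waits for pending work and shuts the pool down.
with ThreadPoolExecutor(max_workers=4) as executor:
    # executor.map yields results in input order; list() collects them all.
    xs = list(executor.map(long_running_func, range(100)))
```

Unlike `multiprocessing.Pool.map`, `executor.map` returns a lazy iterator, which is why the sketch wraps it in `list()`.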
Yes, and it seems to have (more or less) the same API.
    import multiprocessing
    import multiprocessing.pool  # needed so that multiprocessing.pool.ThreadPool resolves

    def worker(lnk):
        ....
    def start_process():
        .....
    ....
    if PROCESS:
        pool = multiprocessing.Pool(processes=POOL_SIZE, initializer=start_process)
    else:
        pool = multiprocessing.pool.ThreadPool(processes=POOL_SIZE, initializer=start_process)

    pool.map(worker, inputs)
    ....
There is no built-in thread-based pool. However, it can be very quick to implement a producer/consumer queue with the Queue class.
From the Python documentation for the Queue module:
    from threading import Thread
    from queue import Queue  # the module is named Queue in Python 2

    def worker():
        while True:
            item = q.get()
            do_work(item)
            q.task_done()

    q = Queue()
    for i in range(num_worker_threads):
        t = Thread(target=worker)
        t.daemon = True  # daemon threads do not keep the interpreter alive
        t.start()

    for item in source():
        q.put(item)

    q.join()  # block until all tasks are done
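The snippet above leaves `do_work`, `source` and `num_worker_threads` undefined and discards return values. A self-contained variant might look like the sketch below, where `do_work`, the pool size, the input range and the `None` sentinel used for shutdown are all assumptions added for illustration.

```python
from queue import Queue
from threading import Thread

def do_work(item):
    # Hypothetical task; replace with the real I/O-bound call.
    return item * item

def worker(tasks, results):
    while True:
        item = tasks.get()
        if item is None:              # sentinel: no more work for this thread
            tasks.task_done()
            break
        results.put((item, do_work(item)))
        tasks.task_done()

num_worker_threads = 4                # assumed pool size
num_items = 10                        # assumed amount of work
tasks, results = Queue(), Queue()
threads = [Thread(target=worker, args=(tasks, results))
           for _ in range(num_worker_threads)]
for t in threads:
    t.start()

for item in range(num_items):
    tasks.put(item)
for _ in threads:
    tasks.put(None)                   # one sentinel per worker

tasks.join()                          # block until every task is marked done
for t in threads:
    t.join()

# Results arrive in completion order, so pair them with their input and sort.
pairs = sorted(results.get() for _ in range(num_items))
squares = [value for _, value in pairs]
```

Using sentinels instead of daemon threads lets each worker exit cleanly, at the cost of a few extra lines.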
The overhead of creating the new processes is minimal, especially when it's just 4 of them. I doubt this is a performance hot spot of your application. Keep it simple, and optimize only where you have to and where profiling results point to.
I just found out that there actually is a thread-based Pool interface in the multiprocessing module; however, it is somewhat hidden and not properly documented.
It can be imported via:
    from multiprocessing.pool import ThreadPool
It is implemented using a dummy Process class wrapping a Python thread. This thread-based Process class can be found in multiprocessing.dummy, which is mentioned briefly in the docs. This dummy module supposedly provides the whole multiprocessing interface based on threads.
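As a rough illustration, the thread-based pool can be used the same way as the process-based one from the question. In this sketch `long_running_func` is a hypothetical stand-in (a short `time.sleep` in place of the C call), and `DummyPool` is just a local alias for `multiprocessing.dummy.Pool`.

```python
import time
from multiprocessing.pool import ThreadPool
from multiprocessing.dummy import Pool as DummyPool  # thread-backed Pool

def long_running_func(p):
    # Hypothetical stand-in for the GIL-releasing C call.
    time.sleep(0.01)
    return p + 1

# Same map() interface as multiprocessing.Pool, but no new processes.
with ThreadPool(4) as pool:
    xs = pool.map(long_running_func, range(100))

# The dummy module exposes the same thread pool under the name Pool.
with DummyPool(4) as pool:
    ys = pool.map(long_running_func, range(100))
```

Because the workers are threads, the function and its arguments do not need to be picklable, which is one practical difference from the process-based pool.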
Reposted from: http://smlgj.baihongyu.com/