Why is my Python multiprocessing code slower than single-thread execution?


Forking off four CPython interpreters, and doing import multiprocessing in each, takes non-zero time. This hardly seems like a fair comparison with the single-threaded case, which the OP does not present.

"the multiprocessing version runs slower than the single-threaded version"

That hardly seems surprising, given that you asked it to do more than the single-threaded version. Plus, you did not describe the benchmark environment or report the elapsed times.

Starting up a new CPython interpreter does not happen for free; there is a lot of initialization work to attend to. You have posed an XY problem. Perhaps in the future you will tell us what your true goal is.
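For illustration, here is a minimal sketch of how one might measure that start-up cost in isolation (the no-op target and the count of four are illustrative choices, not taken from the question): each worker process is a fresh CPython interpreter whose only job is to do nothing, so the elapsed time is almost entirely interpreter start-up and teardown.

from multiprocessing import Process
import time

def noop():
    # Does no work: any time measured below is process start-up/teardown.
    pass

if __name__ == "__main__":
    start = time.perf_counter()
    procs = [Process(target=noop) for _ in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(f"4 no-op worker processes: {time.perf_counter() - start:.4f}s")

On platforms that use the spawn start method (macOS, Windows), each worker also re-imports the main module, which adds to this fixed cost.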

— J_H

This can happen when the time needed to create, run, and tear down the subprocesses exceeds the time needed to execute the compute() function itself (see the sketch at the end of this answer).

However, if we extend your code to include a single-threaded approach, the multiprocessing variant is, in fact, faster.

from multiprocessing import Pool
import time

N = 10_000_000
R = 4

def compute(n):
    total = 0
    for i in range(n):
        total += i*i
    return total

def mproc():
    start = time.perf_counter()
    with Pool(R) as p:
        p.map(compute, [N]*R)
    return time.perf_counter() - start

def tproc():
    start = time.perf_counter()
    for _ in range(R):
        compute(N)
    return time.perf_counter() - start

if __name__ == "__main__":
    for func in (mproc, tproc):
        d = func()
        print(func.__name__, f"{d:.4f}s")

Output:

mproc 0.4869s

tproc 1.3456s

Python 3.14.2 on Apple Silicon (M2), macOS 26.2
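Conversely, the overhead-dominated case mentioned above is easy to provoke by shrinking the per-task work. The sketch below reuses the same structure with an arbitrarily small n = 1_000 (an illustrative value, not a measurement); with so little work per task, the cost of spinning up the pool would be expected to outweigh any parallel speed-up.

from multiprocessing import Pool
import time

def compute(n):
    total = 0
    for i in range(n):
        total += i * i
    return total

if __name__ == "__main__":
    n, r = 1_000, 4  # deliberately tiny workload, illustrative values

    # Pool version: dominated by process creation/teardown and pickling.
    start = time.perf_counter()
    with Pool(r) as p:
        p.map(compute, [n] * r)
    print(f"pool:   {time.perf_counter() - start:.4f}s")

    # Serial version: just runs the tiny computation r times.
    start = time.perf_counter()
    for _ in range(r):
        compute(n)
    print(f"serial: {time.perf_counter() - start:.4f}s")

The break-even point depends on the machine, the start method, and how much data has to be pickled between processes, so it is worth benchmarking with your real workload.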

— jackal
