Who controls parallelism? A disagreement that leads to slower code
If you’re using NumPy, Polars, Zarr, or many other libraries, setting a single environment variable or calling a single API function might make your code run 20%-80% faster.
Or, more accurately, it may be that your code is running that much more slowly than it ought to.
The problem?
A conflict over who controls parallelism: your application, or the libraries it uses.
Let’s see an example, and how you can solve it.
The mystery of the speedy single-thread implementation
We’ll use the following example to measure how long the same work takes when run with a single Python thread, a Python thread pool, and a Python process pool.
All three variations will calculate the dot product of two arrays.
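To make the shape of that comparison concrete, here is a rough sketch of what the three variations might look like. It is an illustration, not a definitive benchmark: the array size, task count, pool size, and the helper names (`dot_task`, `run_single`, `run_threads`, `run_processes`, `timed`) are all assumptions made for this sketch, and the timings include the cost of creating the arrays.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

import numpy as np

N = 300      # assumed side length of the square arrays
TASKS = 8    # assumed number of dot products to compute
WORKERS = 4  # assumed pool size


def dot_task(seed):
    """Build two N x N arrays from a seed and return a checksum of their dot product."""
    rng = np.random.default_rng(seed)
    a = rng.random((N, N))
    b = rng.random((N, N))
    return float(np.dot(a, b).sum())


def run_single(tasks=TASKS):
    # Variation 1: everything runs in the calling Python thread.
    return [dot_task(seed) for seed in range(tasks)]


def run_threads(tasks=TASKS):
    # Variation 2: the tasks are spread across a Python thread pool.
    with ThreadPoolExecutor(max_workers=WORKERS) as pool:
        return list(pool.map(dot_task, range(tasks)))


def run_processes(tasks=TASKS):
    # Variation 3: the tasks are spread across a Python process pool.
    with ProcessPoolExecutor(max_workers=WORKERS) as pool:
        return list(pool.map(dot_task, range(tasks)))


def timed(fn):
    """Return the wall-clock seconds taken by one call to fn."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


if __name__ == "__main__":
    for fn in (run_single, run_threads, run_processes):
        print(f"{fn.__name__}: {timed(fn):.3f}s")
```

The three functions do identical work and differ only in who schedules it, so any difference in elapsed time comes from how the parallelism is arranged rather than from the calculation itself.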