Who controls parallelism? A disagreement that leads to slower code

If you’re using NumPy, Polars, Zarr, or many other libraries, setting a single environment variable or calling a single API function might make your code run 20%-80% faster.
Or, more accurately, your code may currently be running that much more slowly than it should.

The problem?
A conflict over who controls parallelism: your application, or the libraries it uses.

Let’s see an example, and how you can solve it.

The mystery of the speedy single-thread implementation

We’ll use the following example to measure how long the same work takes with a single Python thread, a Python thread pool, and a Python process pool.
All three variations will calculate the dot product of two arrays.
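
As a rough sketch of what such a comparison could look like, the code below times the same dot-product work run three ways. The array size, task count, worker count, and helper names (`dot_product`, `run_single_thread`, `run_with_pool`, `timed`) are illustrative assumptions, not a measured setup, and the timings will vary by machine and by which BLAS library NumPy is linked against.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

import numpy as np

# Hypothetical sizes, chosen only for illustration.
ARRAY_SIZE = 500
NUM_TASKS = 8


def dot_product(seed):
    """Build two square arrays from a seed and return their dot product."""
    rng = np.random.default_rng(seed)
    a = rng.random((ARRAY_SIZE, ARRAY_SIZE))
    b = rng.random((ARRAY_SIZE, ARRAY_SIZE))
    # For 2-D inputs, np.dot is a matrix multiplication, which NumPy
    # hands off to its underlying BLAS library.
    return np.dot(a, b)


def run_single_thread(seeds):
    """Run every task one after another in the calling thread."""
    return [dot_product(s) for s in seeds]


def run_with_pool(executor_class, seeds, workers=4):
    """Run the tasks in a thread pool or a process pool."""
    with executor_class(max_workers=workers) as pool:
        return list(pool.map(dot_product, seeds))


def timed(label, func, *args):
    """Call func(*args), print the elapsed wall-clock time, return the result."""
    start = time.perf_counter()
    result = func(*args)
    print(f"{label}: {time.perf_counter() - start:.3f}s")
    return result


if __name__ == "__main__":
    seeds = list(range(NUM_TASKS))
    timed("single thread", run_single_thread, seeds)
    timed("thread pool", run_with_pool, ThreadPoolExecutor, seeds)
    timed("process pool", run_with_pool, ProcessPoolExecutor, seeds)
```

Seeding each task keeps the three variations doing identical work, so any difference in elapsed time comes from how the work is scheduled rather than from the data.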