It doesn’t make sense to me. It looks like nonzero(arr != 0) just creates an intermediate array, allocating more memory. There is no way it should be faster; otherwise, why doesn’t NumPy do that optimization itself? But here is my benchmark:
```python
import numpy as np
from timeit import timeit

arr = np.random.randint(0, 2, 10_000_000)
a = timeit(lambda: np.nonzero(arr != 0), number=10)
b = timeit(lambda: np.nonzero(arr), number=10)
print(f"nonzero(arr != 0): {a}")
print(f"nonzero(arr): {b}")
```

Results:
```
nonzero(arr != 0): 0.20066774962469935
nonzero(arr): 0.5988789172843099
```

It seems that nonzero() is just much faster if you convert the input into a boolean array first. I tested it with smaller integers, and the results are consistent:
```python
arr = np.random.randint(0, 2, 10_000_000, dtype=np.uint8)
a = timeit(lambda: np.nonzero(arr != 0), number=10)
b = timeit(lambda: np.nonzero(arr), number=10)
c = timeit(lambda: np.nonzero(arr.view(bool)), number=10)
print(f"nonzero(arr != 0): {a}")
print(f"nonzero(arr): {b}")
print(f"nonzero(view): {c}")
```

Results:
```
nonzero(arr != 0): 0.15293325018137693
nonzero(arr): 0.5660374169237912
nonzero(view): 0.13332204101607203
```

So the intermediate array does add some overhead, but that overhead is much smaller than the overhead nonzero() incurs on non-boolean arrays.
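For context on why the view variant is the fastest: arr != 0 allocates a fresh boolean array, while .view(bool) on a uint8 array just reinterprets the same one-byte-per-element buffer without copying. This can be verified with np.shares_memory:

```python
import numpy as np

arr = np.random.randint(0, 2, 10, dtype=np.uint8)

# arr != 0 allocates a new boolean array in a separate buffer
mask = arr != 0
print(np.shares_memory(arr, mask))   # False: a copy was made

# .view(bool) reinterprets the same buffer (zero-copy)
view = arr.view(bool)
print(np.shares_memory(arr, view))   # True: same memory

# Both produce the same nonzero indices
print(np.array_equal(np.nonzero(mask)[0], np.nonzero(view)[0]))  # True
```

Note that the zero-copy view only works here because uint8 and bool have the same itemsize, and because the values are already 0 or 1; for other dtypes or arbitrary values, the comparison arr != 0 is the safe route.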
Is there a reason for this? My guess is that nonzero() is optimized (maybe using SIMD?) for boolean arrays, but somehow the optimization doesn’t kick in for other dtypes.
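One way to probe this guess, under the assumption that nonzero() makes an internal counting pass to size its output array: np.count_nonzero can be timed separately on the two dtypes. If the boolean path is the one with the fast (possibly SIMD) kernel, a similar dtype gap should show up here too; the exact ratio will depend on the NumPy version and CPU.

```python
import numpy as np
from timeit import timeit

arr = np.random.randint(0, 2, 10_000_000, dtype=np.uint8)

# Time the counting step alone on uint8 vs. a zero-copy bool view
t_u8 = timeit(lambda: np.count_nonzero(arr), number=10)
t_bool = timeit(lambda: np.count_nonzero(arr.view(bool)), number=10)
print(f"count_nonzero(uint8): {t_u8}")
print(f"count_nonzero(bool):  {t_bool}")

# Sanity check: both dtypes count the same elements
print(np.count_nonzero(arr) == np.count_nonzero(arr.view(bool)))  # True
```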
