Why does local variable caching in a loop behave differently when using exec() vs local scope assignment in Python 3.12?

I’m investigating some bytecode optimizations in Python 3.12 related to loop performance and local variable lookups (LOAD_FAST vs LOAD_NAME).

I noticed a bizarre performance discrepancy when running dynamically evaluated code inside a local function scope versus a standard loop.

Consider these two setups. Both execute a tight loop in which a variable x is incremented, but Setup B uses exec() to run the loop dynamically against an explicit locals dictionary instead of the function's own local scope.

    import timeit

    # Setup A: Standard local loop
    def test_standard():
        x = 0
        for _ in range(10_000_000):
            x += 1
        return x

    # Setup B: Executing loop logic dynamically
    def test_dynamic():
        x = 0
        local_vars = {'x': x}
        # Running the exact same loop structure inside exec
        exec("""
    for _ in range(10_000_000):
        x += 1
    """, globals(), local_vars)
        return local_vars['x']

    print("Standard:", timeit.timeit(test_standard, number=1))
    print("Dynamic (exec):", timeit.timeit(test_dynamic, number=1))

The Results:

On Python 3.12.2, test_dynamic() runs roughly 3x to 4x slower than test_standard().
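A single timeit run with number=1 is noisy, so the ratio is easier to trust when each setup is repeated and the minimum is taken. The following is only a sketch: N, LOOP_SRC, and best_of are hypothetical names introduced here, and N is deliberately smaller than the count used above so that repeated runs stay short.

```python
import timeit

N = 200_000  # hypothetical; smaller than the 10_000_000 used above

def test_standard():
    x = 0
    for _ in range(N):
        x += 1
    return x

# Same loop structure as Setup B, with the iteration count substituted in.
LOOP_SRC = f"""
for _ in range({N}):
    x += 1
"""

def test_dynamic():
    local_vars = {"x": 0}
    exec(LOOP_SRC, globals(), local_vars)
    return local_vars["x"]

def best_of(func, repeat=5):
    """Minimum wall time over several single-call runs."""
    return min(timeit.repeat(func, number=1, repeat=repeat))

t_std = best_of(test_standard)
t_dyn = best_of(test_dynamic)
print(f"standard: {t_std:.4f}s  exec: {t_dyn:.4f}s  ratio: {t_dyn / t_std:.2f}x")
```

The absolute numbers will vary by machine and build; only the ratio between the two is of interest here.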

My Analysis so far:

I ran dis.dis on both to see what the compiler is doing under the hood.

For test_standard, the compiler emits LOAD_FAST and STORE_FAST for the loop body, because x is resolved at compile time to a fixed slot in the frame's local variable array:

      6          22 LOAD_FAST                0 (x)
                 24 LOAD_CONST               2 (1)
                 26 BINARY_OP                0 (+)
                 30 STORE_FAST               0 (x)

However, when inspecting the code object generated inside exec(), even though it is passed a dedicated local_vars dictionary, it defaults to using LOAD_NAME and STORE_NAME.
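The opcode difference can be reproduced without running the benchmark by compiling the same source string in "exec" mode and listing the opcode names. This is a minimal sketch: SRC, opnames, and standard are illustrative names, and the range is shortened because the iteration count has no effect on which opcodes are emitted.

```python
import dis

# The same loop source that is passed to exec(), with a shortened range.
SRC = """
for _ in range(10):
    x += 1
"""

def opnames(code):
    """Return the set of opcode names used by a code object or function."""
    return {ins.opname for ins in dis.get_instructions(code)}

def standard():
    x = 0
    for _ in range(10):
        x += 1
    return x

# compile(..., "exec") produces the same kind of code object that exec()
# builds internally when it is handed a source string.
exec_ops = opnames(compile(SRC, "<exec-src>", "exec"))
func_ops = opnames(standard)

print("exec code uses LOAD_NAME and STORE_NAME:",
      {"LOAD_NAME", "STORE_NAME"} <= exec_ops)
print("function uses a STORE_FAST variant:",
      any(name.startswith("STORE_FAST") for name in func_ops))
```

The check on the function side matches by prefix because newer CPython versions may fuse fast-local opcodes into combined instructions.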

My Questions:

Since Python 3.11+ introduced the Specializing Adaptive Interpreter, why doesn't the adaptive interpreter optimize LOAD_NAME to a localized fast-path inside an exec() code block once it detects the type/dictionary structure isn't changing?

Is there a strict architectural reason why exec() code objects are fundamentally barred from utilizing LOAD_FAST optimization mechanics, even when provided an explicit, isolated locals dictionary?
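For comparison, here is a variant that suggests the opcode choice follows the scope in which a name is compiled rather than exec() as such. It is a sketch only: the loop is wrapped in a function defined by the exec'd source, and _loop, SRC, and namespace are hypothetical names that are not part of the setups above.

```python
import dis

# Hypothetical variant of Setup B: the loop lives inside a function that
# is defined by the exec'd source, so x is a function-local variable.
SRC = """
def _loop(n):
    x = 0
    for _ in range(n):
        x += 1
    return x
"""

namespace = {}
exec(SRC, namespace)          # defines _loop inside namespace
loop = namespace["_loop"]

ops = {ins.opname for ins in dis.get_instructions(loop)}

print("result:", loop(1000))
print("uses LOAD_NAME:", "LOAD_NAME" in ops)
print("uses a STORE_FAST variant:",
      any(name.startswith("STORE_FAST") for name in ops))
```

Inside _loop the compiler knows the complete set of local names, which is not the case for top-level code run against an arbitrary mapping. Whether that fully answers the architectural question is left open here.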
