How can one initialise a std::unordered_map with a given starting size (less than the default)?

8 hours ago 2
ARTICLE AD BOX

The question boils down to the following controversy: what do you mean by the size? If you mean the initial number of key-value pairs in the collection, it makes no sense due to the storage mechanism based on buckets.

Please see the constructor documentation.

As you can see, four top constructor profiles on this page gives you a way to specify the size: unordered_map(size_type bucket_count, ...).

However, this is the initial bucket size, that is, the initial count of buckets.
This is what you actually need as this is the way to adjust the performance taking into account your expectations on the input data. It can help to reduce the number of relatively expensive hash operations while you add the element with new keys.

Basically, this is how it works:
There is some default bucket_count, and this is a pretty low value, just a few buckets. When a new element is added to a map, re-hashing is triggered when the average number of elements per bucket (called load factor) exceeds its maximum value. The documented default for the maximum default factor max_load_factor is 1.0f. When the value of the load factor excees max_load_factor, the re-hashing is triggered; and this is a relatively slow operation. When the bucket_count is low and the number of actually added key-value pairs is considerable, the re-hashing can be triggered many times.

Is optimization possible?
To avoid multiple re-hashing, the user can set greater bucket_count, for example, to the value equal to the maximum number of expected element. In this case, the probability of re-hashing will be low. There will be some trade-off between the time of finding the element by its key's hash value and the re-hashing time. The overall balance can be adjusted based on the knowledge of the expectancy of the data placed in the map.

Also, max_load_factor can be adjusted. This value is returned by the function max_load_factor() and a non-default value can be set as a constructor parameter. It also can be modified dynamically using the function void max_load_factor(float ml).

See also this useful StackOverflow question and the answers: fixed size unordered_map, how to define?. (No, I don't think your question is a repost. This is just useful related matter.)

Read Entire Article