Skip to content Skip to sidebar Skip to footer

Get Dictionary Keys Hashes Without Recalculation

Is there a way to extract existing key hashes from a dictionary, without recalculating them again? What would be the risks for exposing them and consequntly, accessing the dict by

Solution 1:

I don't think Python's dictionary objects have any public API that allows you to see the hashes their objects are stored with. You can't store an object directly by hash in Python code (it may be possible by calling internal C functions in CPython). There are a few good reasons that you can't add values to a dictionary by hash value, rather than by key.

The most obvious is that multiple key objects might have the same hash. If such a hash collision happens, the second value will be inserted somewhere else in the hash table. The important thing is that it won't overwrite the previous value stored under a different key that hashes the same. If you could just pass the hash and not the key too, Python wouldn't be able to tell if you were using the same key or if you were providing a new key that happened to have a colliding hash.

A secondary reason that you can't insert by hash is that it would be a security vulnerability. The performance of a hash table like Python's dictionaries is very good when there are few hash collisions. It is very bad however if every hash is the same. If you could submit data to a Python program that all hashes to the same value, you can perform a very efficient denial of service attack (the new hash randomization for strings was added in recent versions of Python to make this kind of attack harder).


Solution 2:

A Python dict's keys must be hashable, i.e., implement the __hash__ special method (as well as some other methods irrelevant to your question), or be one of some predetermined built in types. So you actually can access the hash value of a key without the table, e.g., through

>>> '123'.__hash__()
163512108404620371

or more uniformly

>>> hash('123')
163512108404620371
>>> hash(2)
2

That being said, as noted by the comments, a hash value and a position in a table are not the same thing. In fact, as a table resizes, the hash value of a key will remain the same, but the position might change. Consequently, as:

  • the hash value is easily available to you via hash()

  • the position would expose the dictionary's internal state

  • you can "cache" hash values in your objects easily enough in the __hash__ method

there is probably no point in exposing the keys' positions.


Post a Comment for "Get Dictionary Keys Hashes Without Recalculation"