author | John Naylor <john.naylor@postgresql.org> | 2023-11-27 17:03:38 +0700 |
---|---|---|
committer | John Naylor <john.naylor@postgresql.org> | 2024-01-19 12:44:09 +0700 |
commit | e97b672c88f6e5938a2b81021bd4b590b013976f (patch) | |
tree | ac9fa324eeaa544014a6c726d02c357edfee21d4 /src/backend/utils/adt/json.c | |
parent | 04c0897d3bcafe4ca61967d5ab1b5669f3cbe80b (diff) |
Add inline incremental hash functions for in-memory use

It can be useful for a hash function to expose separate initialization,
accumulation, and finalization steps. In particular, this is useful
for building inline hash functions for simplehash. Instead of trying
to whack around hash_bytes while maintaining its current behavior on
all platforms, we base this work on fasthash (MIT licensed) which
is simple, faster than hash_bytes for inputs over 12 bytes long,
and also passes the hash function testing suite SMHasher.

The fasthash functions have been reimplemented using our added-on
incremental interface to validate that this method will still give
the same answer, provided we have the input length ahead of time.
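The split into initialization, accumulation, and finalization can be sketched roughly as follows. This is a minimal illustration, not the contents of hashfn_unstable.h: every `demo_` name is hypothetical, the mixing constants are recalled from the public fasthash reference code and should be treated as assumptions, and the tail bytes are loaded with memcpy in native byte order, so results can differ between little- and big-endian machines. The one-shot function is included only to show that feeding the same input through the incremental steps gives the same answer when the length is known up front.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Multiplier recalled from the fasthash reference code (assumption). */
#define DEMO_M UINT64_C(0x880355f21e6d1965)

/* Bit mixer applied to each chunk and to the final value. */
static inline uint64_t
demo_mix(uint64_t h)
{
	h ^= h >> 23;
	h *= UINT64_C(0x2127599bf4325c37);
	h ^= h >> 47;
	return h;
}

/* Hypothetical state carried between the three steps. */
typedef struct demo_hash_state
{
	uint64_t	accum;		/* staging area for one chunk of input */
	uint64_t	hash;		/* running hash value */
} demo_hash_state;

/* Step 1: initialization. The input length is folded in up front. */
static inline void
demo_hash_init(demo_hash_state *hs, size_t len, uint64_t seed)
{
	hs->accum = 0;
	hs->hash = seed ^ ((uint64_t) len * DEMO_M);
}

/* Step 2: accumulation of one chunk of at most 8 bytes. */
static inline void
demo_hash_accum(demo_hash_state *hs, const unsigned char *k, size_t len)
{
	assert(len <= 8);
	hs->accum = 0;
	memcpy(&hs->accum, k, len);	/* native byte order */
	hs->hash ^= demo_mix(hs->accum);
	hs->hash *= DEMO_M;
}

/* Step 3: finalization. */
static inline uint64_t
demo_hash_final(const demo_hash_state *hs)
{
	return demo_mix(hs->hash);
}

/* Drive the three steps over a whole buffer, 8 bytes at a time. */
static uint64_t
demo_hash_incremental(const void *buf, size_t len, uint64_t seed)
{
	demo_hash_state hs;
	const unsigned char *p = buf;
	size_t		rem = len;

	demo_hash_init(&hs, len, seed);
	while (rem > 0)
	{
		size_t		chunk = rem < 8 ? rem : 8;

		demo_hash_accum(&hs, p, chunk);
		p += chunk;
		rem -= chunk;
	}
	return demo_hash_final(&hs);
}

/* One-shot version written as a single loop, for comparison. */
static uint64_t
demo_hash_oneshot(const void *buf, size_t len, uint64_t seed)
{
	const unsigned char *p = buf;
	uint64_t	h = seed ^ ((uint64_t) len * DEMO_M);
	uint64_t	v;
	size_t		rem = len;

	while (rem >= 8)
	{
		memcpy(&v, p, 8);
		h ^= demo_mix(v);
		h *= DEMO_M;
		p += 8;
		rem -= 8;
	}
	if (rem > 0)
	{
		v = 0;
		memcpy(&v, p, rem);
		h ^= demo_mix(v);
		h *= DEMO_M;
	}
	return demo_mix(h);
}
```

Because the steps are small static inline functions, a caller with a fixed-size key can invoke them directly and let the compiler unroll the work, which is the property that makes this shape attractive for inlined hash table lookups.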
This functionality lives in a new header hashfn_unstable.h. The name
implies we have the freedom to change things across versions that
would be unacceptable for our other hash functions that are used for
e.g. hash indexes and hash partitioning. As such, these should only
be used for in-memory data structures like hash tables. There is also
no guarantee of being independent of endianness or pointer size.

As a demonstration, use fasthash for pgstat_hash_hash_key. Previously
this called the 32-bit murmur finalizer on the three elements,
then joined them with hash_combine(). The new function is simpler,
faster and takes up less binary space. While the collision and bias
behavior were almost certainly fine with the previous coding, now we
have objective confidence of that.
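For context, the previous approach described above (finalize each element, then combine) can be sketched like this. It is an illustration under stated assumptions rather than the removed code: the `demo_` names and the three-field key layout are hypothetical, the finalizer is the well-known 32-bit murmur3 finalizer, and the combine step follows the common boost-style formula.

```c
#include <stdint.h>

/* Hypothetical three-element key, standing in for the stats hash key. */
typedef struct demo_stat_key
{
	uint32_t	kind;
	uint32_t	dboid;
	uint32_t	objoid;
} demo_stat_key;

/* 32-bit murmur3 finalizer: scrambles a single 32-bit value. */
static inline uint32_t
demo_murmur_fmix32(uint32_t h)
{
	h ^= h >> 16;
	h *= 0x85ebca6b;
	h ^= h >> 13;
	h *= 0xc2b2ae35;
	h ^= h >> 16;
	return h;
}

/* Boost-style combine of two 32-bit hash values. */
static inline uint32_t
demo_hash_combine(uint32_t a, uint32_t b)
{
	a ^= b + 0x9e3779b9 + (a << 6) + (a >> 2);
	return a;
}

/* Old-style key hash: finalize each element, then join the results. */
static inline uint32_t
demo_key_hash_old_style(const demo_stat_key *key)
{
	uint32_t	h;

	h = demo_murmur_fmix32(key->kind);
	h = demo_hash_combine(h, demo_murmur_fmix32(key->dboid));
	h = demo_hash_combine(h, demo_murmur_fmix32(key->objoid));
	return h;
}
```

The replacement treats the key as one contiguous run of bytes and hashes it in a single pass, so the quality of the result rests on a function that has been run through SMHasher instead of on an ad hoc combination of per-element hashes.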
There are other places that could benefit from this, but that is left
for future work.

Reviewed by Jeff Davis, Heikki Linnakangas, Jian He, Junwang Zhao
Credit to Andres Freund for the idea
Discussion: https://postgr.es/m/20231122223432.lywt4yz2bn7tlp27%40awork3.anarazel.de