diff options
| author | Tomas Vondra <tomas.vondra@postgresql.org> | 2025-10-17 21:44:42 +0200 | 
|---|---|---|
| committer | Tomas Vondra <tomas.vondra@postgresql.org> | 2025-10-17 22:21:50 +0200 | 
| commit | b85c4700fc5124999e22ea7300ecc0a290c81cbc (patch) | |
| tree | de6ff52e0d766856fbb49759d6c12351bd7e3018 /src/backend/optimizer/geqo/geqo_erx.c | |
| parent | fd530650137e41bd2c3ed2b62724d5b47721a922 (diff) | |
Fix hashjoin memory balancing logic
Commit a1b4f289beec improved the hashjoin sizing to also consider the
memory used by BufFiles for batches. The code however had multiple
issues, making it ineffective or not working as expected in some cases.
* The amount of memory needed by buffers was calculated using uint32,
  so it would overflow for nbatch >= 262144. If this happened the loop
  would exit prematurely and the memory usage would not be reduced.
  The nbatch overflow is fixed by reworking the condition to not use a
  multiplication at all, so there's no risk of overflow. An explicit
  cast was added to a similar calculation in ExecHashIncreaseBatchSize.
* The loop adjusting the nbatch value used hash_table_bytes to calculate
  the old/new size, but then updated only space_allowed. The consequence
  is the total memory usage was not reduced, but all the memory saved by
  reducing the number of batches was used for the internal hash table.
  This was fixed by using only space_allowed. This is also more correct,
  because hash_table_bytes does not account for skew buckets.
* The code was also doubling multiple parameters (e.g. the number of
  buckets for hash table), but was missing overflow protections.
  The loop now checks for overflow, and terminates if needed. It'd be
  possible to cap the value and continue the loop, but it's not worth
  the complexity. And the overflow implies the in-memory hash table is
  already very large anyway.
While at it, rework the comment explaining how the memory balancing
works, to make it more concise and easier to understand.
The initial nbatch overflow issue was reported by Vaibhav Jain. The
other issues were noticed by me and Melanie Plageman. Fix by me, with a
lot of review and feedback by Melanie.
Backpatch to 18, where the hashjoin memory balancing was introduced.
Reported-by: Vaibhav Jain <jainva@google.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Backpatch-through: 18
Discussion: https://postgr.es/m/CABa-Az174YvfFq7rLS+VNKaQyg7inA2exvPWmPWqnEn6Ditr_Q@mail.gmail.com
Diffstat (limited to 'src/backend/optimizer/geqo/geqo_erx.c')
0 files changed, 0 insertions, 0 deletions
