diff options
| author | Tomas Vondra <tomas.vondra@postgresql.org> | 2025-11-04 19:30:17 +0100 |
|---|---|---|
| committer | Tomas Vondra <tomas.vondra@postgresql.org> | 2025-11-04 20:06:01 +0100 |
| commit | 1213cb475391640508d2495b2b560329897d152c (patch) | |
| tree | 701ac891cd93a8db7eb937f8de4d1ac52f2ddba3 | |
| parent | 6d2ff1de4d0b66eb0288e21021c3741b9df1df0d (diff) | |
Trim TIDs during parallel GIN builds more eagerly
The parallel GIN builds perform "freezing" of TID lists when merging
chunks built earlier. This means determining what part of the list can
no longer change, depending on the last received chunk. The frozen part
can be evicted from memory and written out.
The code attempted to freeze items right before merging the old and new
TID list, after already attempting to trim the current buffer. That
means part of the data may get frozen based on the new TID list, but
will be trimmed later (on next loop). This increases memory usage.
This inverts the order, so that we freeze data first (before trimming).
The benefits are likely relatively small, but it's also virtually free
with no other downsides.
Discussion: https://postgr.es/m/CAHLJuCWDwn-PE2BMZE4Kux7x5wWt_6RoWtA0mUQffEDLeZ6sfA@mail.gmail.com
| -rw-r--r-- | src/backend/access/gin/gininsert.c | 78 |
1 files changed, 36 insertions, 42 deletions
diff --git a/src/backend/access/gin/gininsert.c b/src/backend/access/gin/gininsert.c index 61869d9d931..c2b879b2bf6 100644 --- a/src/backend/access/gin/gininsert.c +++ b/src/backend/access/gin/gininsert.c @@ -1401,10 +1401,46 @@ GinBufferKeyEquals(GinBuffer *buffer, GinTuple *tup) * enough TIDs to trim (with values less than "first" TID from the new tuple), * we do the trim. By enough we mean at least 128 TIDs (mostly an arbitrary * number). + * + * We try freezing TIDs at the beginning of the list first, before attempting + * to trim the buffer. This may allow trimming the data earlier, reducing the + * memory usage and excluding it from the mergesort. */ static bool GinBufferShouldTrim(GinBuffer *buffer, GinTuple *tup) { + /* + * Check if the last TID in the current list is frozen. This is the case + * when merging non-overlapping lists, e.g. in each parallel worker. + */ + if ((buffer->nitems > 0) && + (ItemPointerCompare(&buffer->items[buffer->nitems - 1], + GinTupleGetFirst(tup)) == 0)) + buffer->nfrozen = buffer->nitems; + + /* + * Now find the last TID we know to be frozen, i.e. the last TID right + * before the new GIN tuple. + * + * Start with the first not-yet-frozen tuple, and walk until we find the + * first TID that's higher. If we already know the whole list is frozen + * (i.e. nfrozen == nitems), this does nothing. + * + * XXX This might do a binary search for sufficiently long lists, but it + * does not seem worth the complexity. Overlapping lists should be rare + * common, TID comparisons are cheap, and we should quickly freeze most of + * the list. + */ + for (int i = buffer->nfrozen; i < buffer->nitems; i++) + { + /* Is the TID after the first TID of the new tuple? Can't freeze. */ + if (ItemPointerCompare(&buffer->items[i], + GinTupleGetFirst(tup)) > 0) + break; + + buffer->nfrozen++; + } + /* not enough TIDs to trim (1024 is somewhat arbitrary number) */ if (buffer->nfrozen < 1024) return false; @@ -1470,48 +1506,6 @@ GinBufferStoreTuple(GinBuffer *buffer, GinTuple *tup) buffer->key = (Datum) 0; } - /* - * Try freeze TIDs at the beginning of the list, i.e. exclude them from - * the mergesort. We can do that with TIDs before the first TID in the new - * tuple we're about to add into the buffer. - * - * We do this incrementally when adding data into the in-memory buffer, - * and not later (e.g. when hitting a memory limit), because it allows us - * to skip the frozen data during the mergesort, making it cheaper. - */ - - /* - * Check if the last TID in the current list is frozen. This is the case - * when merging non-overlapping lists, e.g. in each parallel worker. - */ - if ((buffer->nitems > 0) && - (ItemPointerCompare(&buffer->items[buffer->nitems - 1], - GinTupleGetFirst(tup)) == 0)) - buffer->nfrozen = buffer->nitems; - - /* - * Now find the last TID we know to be frozen, i.e. the last TID right - * before the new GIN tuple. - * - * Start with the first not-yet-frozen tuple, and walk until we find the - * first TID that's higher. If we already know the whole list is frozen - * (i.e. nfrozen == nitems), this does nothing. - * - * XXX This might do a binary search for sufficiently long lists, but it - * does not seem worth the complexity. Overlapping lists should be rare - * common, TID comparisons are cheap, and we should quickly freeze most of - * the list. - */ - for (int i = buffer->nfrozen; i < buffer->nitems; i++) - { - /* Is the TID after the first TID of the new tuple? Can't freeze. */ - if (ItemPointerCompare(&buffer->items[i], - GinTupleGetFirst(tup)) > 0) - break; - - buffer->nfrozen++; - } - /* add the new TIDs into the buffer, combine using merge-sort */ { int nnew; |
