| field | value | date |
|---|---|---|
| author | Melanie Plageman <melanieplageman@gmail.com> | 2025-02-24 16:07:55 -0500 |
| committer | Melanie Plageman <melanieplageman@gmail.com> | 2025-02-24 16:10:19 -0500 |
| commit | bfe56cdf9a4e07edca46254a88efd9ef17421cd7 | |
| tree | 8873b768c437042a67730df2da4cd3138e0ece00 /src/include/nodes | |
| parent | b8778c4cd8bc924ce5347cb1ab10dfbf34130559 | |
Delay extraction of TIDBitmap per page offsets

Pages from the bitmap created by the TIDBitmap API can be exact or
lossy. The TIDBitmap API extracts the tuple offsets from exact pages
into an array for the convenience of the caller.

This was done in tbm_private|shared_iterate() right after advancing the
iterator. However, as long as tbm_private|shared_iterate() sets a
reference to the PagetableEntry in the TBMIterateResult, the offset
extraction can be done later.

Waiting to extract the tuple offsets has a few benefits. For the shared
iterator case, it allows us to extract the offsets after dropping the
shared iterator state lock, reducing time spent holding a contended
lock.

Separating the iteration step and extracting the offsets later also
allows us to avoid extracting the offsets for prefetched blocks. Those
offsets were never used, so the overhead of extracting and storing them
was wasted.

The real motivation for this change, however, is that future commits
will make bitmap heap scan use the read stream API. This requires a
TBMIterateResult per issued block. By removing the array of tuple
offsets from the TBMIterateResult and only extracting the offsets when
they are used, we reduce the memory required for per-buffer data
substantially.

Suggested-by: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGLHbKP3jwJ6_%2BhnGi37Pw3BD5j2amjV3oSk7j-KyCnY7Q%40mail.gmail.com
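
To put a rough number on the memory claim in the message above, here is a minimal sketch that compares the two result layouts. Everything prefixed `Sketch` or `SKETCH_` is a hypothetical stand-in, not a PostgreSQL definition. The sketch assumes 8K pages, where MaxHeapTuplesPerPage works out to 291, and it assumes the old code sized the flexible offsets array for the worst case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: MaxHeapTuplesPerPage for the default 8K block size. */
#define SKETCH_MAX_TUPLES_PER_PAGE 291

/* Stand-in for the old layout: offsets live inside the result itself. */
typedef struct SketchOldResult
{
	uint32_t	blockno;
	int			ntuples;
	bool		lossy;
	bool		recheck;
	uint16_t	offsets[];		/* flexible array, sized for the worst case */
} SketchOldResult;

/* Stand-in for the new layout: only a pointer back to the bitmap page. */
typedef struct SketchNewResult
{
	uint32_t	blockno;
	bool		lossy;
	bool		recheck;
	void	   *internal_page;
} SketchNewResult;

/* Bytes needed for one old-style result with room for every offset. */
static size_t
sketch_old_result_size(void)
{
	return offsetof(SketchOldResult, offsets) +
		SKETCH_MAX_TUPLES_PER_PAGE * sizeof(uint16_t);
}

/* Bytes saved per result when the offsets array is dropped. */
static size_t
sketch_bytes_saved_per_result(void)
{
	size_t		old_size = sketch_old_result_size();
	size_t		new_size = sizeof(SketchNewResult);

	/* Sanity check: the new layout must be the smaller one. */
	assert(old_size > new_size);
	return old_size - new_size;
}
```

Under these assumptions the offsets array alone accounts for 582 bytes per result, so a scheme that keeps one result per in-flight block pays that cost once per block unless the array is removed.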
Diffstat (limited to 'src/include/nodes')
| -rw-r--r-- | src/include/nodes/tidbitmap.h | 32 |
1 files changed, 27 insertions, 5 deletions
diff --git a/src/include/nodes/tidbitmap.h b/src/include/nodes/tidbitmap.h
index 8cd93d90a86..e185635c10b 100644
--- a/src/include/nodes/tidbitmap.h
+++ b/src/include/nodes/tidbitmap.h
@@ -22,9 +22,17 @@
 #ifndef TIDBITMAP_H
 #define TIDBITMAP_H
 
+#include "access/htup_details.h"
 #include "storage/itemptr.h"
 #include "utils/dsa.h"
 
+/*
+ * The maximum number of tuples per page is not large (typically 256 with
+ * 8K pages, or 1024 with 32K pages). So there's not much point in making
+ * the per-page bitmaps variable size. We just legislate that the size
+ * is this:
+ */
+#define TBM_MAX_TUPLES_PER_PAGE MaxHeapTuplesPerPage
 
 /*
  * Actual bitmap representation is private to tidbitmap.c. Callers can
@@ -53,12 +61,22 @@ typedef struct TBMIterator
 /* Result structure for tbm_iterate */
 typedef struct TBMIterateResult
 {
-	BlockNumber blockno;		/* page number containing tuples */
-	int			ntuples;		/* -1 when lossy */
+	BlockNumber blockno;		/* block number containing tuples */
+
 	bool		lossy;
-	bool		recheck;		/* should the tuples be rechecked? */
-	/* Note: recheck is always true if lossy */
-	OffsetNumber offsets[FLEXIBLE_ARRAY_MEMBER];
+
+	/*
+	 * Whether or not the tuples should be rechecked. This is always true if
+	 * the page is lossy but may also be true if the query requires recheck.
+	 */
+	bool		recheck;
+
+	/*
+	 * Pointer to the page containing the bitmap for this block. It is a void *
+	 * to avoid exposing the details of the tidbitmap PagetableEntry to API
+	 * users.
+	 */
+	void	   *internal_page;
 } TBMIterateResult;
 
 /* function prototypes in nodes/tidbitmap.c */
@@ -75,6 +93,10 @@ extern void tbm_add_page(TIDBitmap *tbm, BlockNumber pageno);
 extern void tbm_union(TIDBitmap *a, const TIDBitmap *b);
 extern void tbm_intersect(TIDBitmap *a, const TIDBitmap *b);
 
+extern int	tbm_extract_page_tuple(TBMIterateResult *iteritem,
+								   OffsetNumber *offsets,
+								   uint32 max_offsets);
+
 extern bool tbm_is_empty(const TIDBitmap *tbm);
 
 extern TBMPrivateIterator *tbm_begin_private_iterate(TIDBitmap *tbm);
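
The header change splits iteration from offset extraction. The following self-contained sketch models that split with mock types so it compiles without the PostgreSQL tree. All `Sketch`-prefixed names, the 64-bit per-page bitmap, and the return-value convention of `sketch_extract_page_tuple` are assumptions made for illustration; the real implementation is in tidbitmap.c and is not shown in this diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_TUPLES_PER_PAGE 64	/* hypothetical; one 64-bit word */

/* Mock of a private per-page bitmap entry. */
typedef struct SketchPageEntry
{
	uint32_t	blockno;
	bool		lossy;
	uint64_t	bits;			/* bit i set => offset i + 1 matches */
} SketchPageEntry;

/* Mock of the slimmed-down iterate result. */
typedef struct SketchIterateResult
{
	uint32_t	blockno;
	bool		lossy;
	bool		recheck;
	void	   *internal_page;	/* opaque pointer back to the page entry */
} SketchIterateResult;

typedef struct SketchIterator
{
	SketchPageEntry *pages;
	size_t		npages;
	size_t		next;
} SketchIterator;

/*
 * Step 1: advance the iterator. This only copies a few fields, so a
 * shared iterator could hold its lock for just this short step.
 */
static bool
sketch_iterate(SketchIterator *it, SketchIterateResult *res)
{
	if (it->next >= it->npages)
		return false;

	SketchPageEntry *page = &it->pages[it->next++];

	res->blockno = page->blockno;
	res->lossy = page->lossy;
	res->recheck = page->lossy; /* lossy pages always need a recheck */
	res->internal_page = page;
	return true;
}

/*
 * Step 2: extract offsets on demand, only for blocks that are actually
 * read. Writes at most max_offsets entries and returns the total number
 * of matching offsets on the page (an assumed convention in this sketch).
 */
static int
sketch_extract_page_tuple(const SketchIterateResult *res,
						  uint16_t *offsets, uint32_t max_offsets)
{
	const SketchPageEntry *page = res->internal_page;
	int			ntuples = 0;

	assert(!res->lossy);		/* lossy pages carry no per-tuple offsets */

	for (int bit = 0; bit < SKETCH_MAX_TUPLES_PER_PAGE; bit++)
	{
		if (page->bits & (UINT64_C(1) << bit))
		{
			if ((uint32_t) ntuples < max_offsets)
				offsets[ntuples] = (uint16_t) (bit + 1);
			ntuples++;
		}
	}
	return ntuples;
}
```

A caller in this model keeps a local `uint16_t offsets[SKETCH_MAX_TUPLES_PER_PAGE]` array and calls `sketch_extract_page_tuple` only after `sketch_iterate` has returned a non-lossy page it intends to scan, so a block that is only prefetched never pays for extraction.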
