summaryrefslogtreecommitdiff
path: root/src/backend/storage/buffer
AgeCommit message (Collapse)Author
2023-05-17Update parameter name context to wb_contextAndres Freund
For clarity of review, renaming the function parameter "context" in ScheduleBufferTagForWriteback() and IssuePendingWritebacks() to "wb_context" is a separate commit. The next commit adds an "io_context" parameter and "wb_context" makes it more clear which is which. Author: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/CAAKRu_acc6iL4M3hvOTeztf_ZPpsB3Pqio5aVHgZ5q=Pi3BZKg@mail.gmail.com
2023-04-18Fix various typosDavid Rowley
This fixes many spelling mistakes in comments, but a few references to invalid parameter names, function names and option names too in comments and also some in string constants Also, fix an #undef that was undefining the incorrect definition Author: Alexander Lakhin Reviewed-by: Justin Pryzby Discussion: https://postgr.es/m/d5f68d19-c0fc-91a9-118d-7c6a5a3f5fad@gmail.com
2023-04-14Support RBM_ZERO_AND_CLEANUP_LOCK in ExtendBufferedRelTo(), add testsAndres Freund
For some reason I had not implemented RBM_ZERO_AND_CLEANUP_LOCK support in ExtendBufferedRelTo(), likely thinking it not being reachable. But it is reachable, e.g. when replaying a WAL record for a page in a relation that subsequently is truncated (likely only reachable when doing crash recovery or PITR, not during ongoing streaming replication). As now all of the RBM_* modes are supported, remove assertions checking mode. As we had no test coverage for this scenario, add a new TAP test. There's plenty more that ought to be tested in this area... Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Reported-by: Alexander Lakhin <exclusion@gmail.com> Reviewed-by: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/392271.1681238924%40sss.pgh.pa.us Discussion: https://postgr.es/m/0b5eb82b-cb99-e0a4-b932-3dc60e2e3926@gmail.com
2023-04-13Harmonize some more function parameter names.Peter Geoghegan
Make sure that function declarations use names that exactly match the corresponding names from function definitions in a few places. These inconsistencies were all introduced relatively recently, after the code base had parameter name mismatches fixed in bulk (see commits starting with commits 4274dc22 and 035ce1fe). pg_bsd_indent still has a couple of similar inconsistencies, which I (pgeoghegan) have left untouched for now. Like all earlier commits that cleaned up function parameter names, this commit was written with help from clang-tidy.
2023-04-08Add io_direct setting (developer-only).Thomas Munro
Provide a way to ask the kernel to use O_DIRECT (or local equivalent) where available for data and WAL files, to avoid or minimize kernel caching. This hurts performance currently and is not intended for end users yet. Later proposed work would introduce our own I/O clustering, read-ahead, etc to replace the facilities the kernel disables with this option. The only user-visible change, if the developer-only GUC is not used, is that this commit also removes the obscure logic that would activate O_DIRECT for the WAL when wal_sync_method=open_[data]sync and wal_level=minimal (which also requires max_wal_senders=0). Those are non-default and unlikely settings, and this behavior wasn't (correctly) documented. The same effect can be achieved with io_direct=wal. Author: Thomas Munro <thomas.munro@gmail.com> Author: Andres Freund <andres@anarazel.de> Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Reviewed-by: Justin Pryzby <pryzby@telsasoft.com> Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Discussion: https://postgr.es/m/CA%2BhUKGK1X532hYqJ_MzFWt0n1zt8trz980D79WbjwnT-yYLZpg%40mail.gmail.com
2023-04-08Introduce PG_IO_ALIGN_SIZE and align all I/O buffers.Thomas Munro
In order to have the option to use O_DIRECT/FILE_FLAG_NO_BUFFERING in a later commit, we need the addresses of user space buffers to be well aligned. The exact requirements vary by OS and file system (typically sectors and/or memory pages). The address alignment size is set to 4096, which is enough for currently known systems: it matches modern sectors and common memory page size. There is no standard governing O_DIRECT's requirements so we might eventually have to reconsider this with more information from the field or future systems. Aligning I/O buffers on memory pages is also known to improve regular buffered I/O performance. Three classes of I/O buffers for regular data pages are adjusted: (1) Heap buffers are now allocated with the new palloc_aligned() or MemoryContextAllocAligned() functions introduced by commit 439f6175. (2) Stack buffers now use a new struct PGIOAlignedBlock to respect PG_IO_ALIGN_SIZE, if possible with this compiler. (3) The buffer pool is also aligned in shared memory. WAL buffers were already aligned on XLOG_BLCKSZ. It's possible for XLOG_BLCKSZ to be configured smaller than PG_IO_ALIGNED_SIZE and thus for O_DIRECT WAL writes to fail to be well aligned, but that's a pre-existing condition and will be addressed by a later commit. BufFiles are not yet addressed (there's no current plan to use O_DIRECT for those, but they could potentially get some incidental speedup even in plain buffered I/O operations through better alignment). If we can't align stack objects suitably using the compiler extensions we know about, we disable the use of O_DIRECT by setting PG_O_DIRECT to 0. This avoids the need to consider systems that have O_DIRECT but can't align stack objects the way we want; such systems could in theory be supported with more work but we don't currently know of any such machines, so it's easier to pretend there is no O_DIRECT support instead. That's an existing and tested class of system. Add assertions that all buffers passed into smgrread(), smgrwrite() and smgrextend() are correctly aligned, unless PG_O_DIRECT is 0 (= stack alignment tricks may be unavailable) or the block size has been set too small to allow arrays of buffers to be all aligned. Author: Thomas Munro <thomas.munro@gmail.com> Author: Andres Freund <andres@anarazel.de> Reviewed-by: Justin Pryzby <pryzby@telsasoft.com> Discussion: https://postgr.es/m/CA+hUKGK1X532hYqJ_MzFWt0n1zt8trz980D79WbjwnT-yYLZpg@mail.gmail.com
2023-04-07Track IO times in pg_stat_ioAndres Freund
a9c70b46dbe and 8aaa04b32S added counting of IO operations to a new view, pg_stat_io. Now, add IO timing for reads, writes, extends, and fsyncs to pg_stat_io as well. This combines the tracking for pgBufferUsage with the tracking for pg_stat_io into a new function pgstat_count_io_op_time(). This should make it a bit easier to avoid the somewhat costly instr_time conversion done for pgBufferUsage. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/flat/CAAKRu_ay5iKmnbXZ3DsauViF3eMxu4m1oNnJXqV_HyqYeg55Ww%40mail.gmail.com
2023-04-07Improve IO accounting for temp relation writesAndres Freund
Both pgstat_database and pgBufferUsage count IO timing for reads of temporary relation blocks into local buffers. However, both failed to count write IO timing for flushes of dirty local buffers. Fix. Additionally, FlushRelationBuffers() seems to have omitted counting write IO (both count and timing) stats for both pgstat_database and pgBufferUsage. Fix. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20230321023451.7rzy4kjj2iktrg2r%40awork3.anarazel.de
2023-04-07Fix copy-paste bug in 12f3867f553 triggering an assert after a write errorAndres Freund
The same condition accidentally was copied to both branches. Manual testing confirms that otherwise the error recovery path works fine. Found while reviewing the logical-decoding-on-standby patch.
2023-04-07Add VACUUM/ANALYZE BUFFER_USAGE_LIMIT optionDavid Rowley
Add new options to the VACUUM and ANALYZE commands called BUFFER_USAGE_LIMIT to allow users more control over how large to make the buffer access strategy that is used to limit the usage of buffers in shared buffers. Larger rings can allow VACUUM to run more quickly but have the drawback of VACUUM possibly evicting more buffers from shared buffers that might be useful for other queries running on the database. Here we also add a new GUC named vacuum_buffer_usage_limit which controls how large to make the access strategy when it's not specified in the VACUUM/ANALYZE command. This defaults to 256KB, which is the same size as the access strategy was prior to this change. This setting also controls how large to make the buffer access strategy for autovacuum. Per idea by Andres Freund. Author: Melanie Plageman Reviewed-by: David Rowley Reviewed-by: Andres Freund Reviewed-by: Justin Pryzby Reviewed-by: Bharath Rupireddy Discussion: https://postgr.es/m/20230111182720.ejifsclfwymw2reb@awork3.anarazel.de
2023-04-05bufmgr: Introduce infrastructure for faster relation extensionAndres Freund
The primary bottlenecks for relation extension are: 1) The extension lock is held while acquiring a victim buffer for the new page. Acquiring a victim buffer can require writing out the old page contents including possibly needing to flush WAL. 2) When extending via ReadBuffer() et al, we write a zero page during the extension, and then later write out the actual page contents. This can nearly double the write rate. 3) The existing bulk relation extension infrastructure in hio.c just amortized the cost of acquiring the relation extension lock, but none of the other costs. Unfortunately 1) cannot currently be addressed in a central manner as the callers to ReadBuffer() need to acquire the extension lock. To address that, this this commit moves the responsibility for acquiring the extension lock into bufmgr.c functions. That allows to acquire the relation extension lock for just the required time. This will also allow us to improve relation extension further, without changing callers. The reason we write all-zeroes pages during relation extension is that we hope to get ENOSPC errors earlier that way (largely works, except for CoW filesystems). It is easier to handle out-of-space errors gracefully if the page doesn't yet contain actual tuples. This commit addresses 2), by using the recently introduced smgrzeroextend(), which extends the relation, without dirtying the kernel page cache for all the extended pages. To address 3), this commit introduces a function to extend a relation by multiple blocks at a time. There are three new exposed functions: ExtendBufferedRel() for extending the relation by a single block, ExtendBufferedRelBy() to extend a relation by multiple blocks at once, and ExtendBufferedRelTo() for extending a relation up to a certain size. To avoid duplicating code between ReadBuffer(P_NEW) and the new functions, ReadBuffer(P_NEW) now implements relation extension with ExtendBufferedRel(), using a flag to tell ExtendBufferedRel() that the relation lock is already held. Note that this commit does not yet lead to a meaningful performance or scalability improvement - for that uses of ReadBuffer(P_NEW) will need to be converted to ExtendBuffered*(), which will be done in subsequent commits. Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/20221029025420.eplyow6k7tgu6he3@awork3.anarazel.de
2023-04-05bufmgr: Support multiple in-progress IOs by using resownerAndres Freund
A future patch will add support for extending relations by multiple blocks at once. To be concurrency safe, the buffers for those blocks need to be marked as BM_IO_IN_PROGRESS. Until now we only had infrastructure for recovering from an IO error for a single buffer. This commit extends that infrastructure to multiple buffers by using the resource owner infrastructure. This commit increases the size of the ResourceOwnerData struct, which appears to have a just about measurable overhead in very extreme workloads. Medium term we are planning to substantially shrink the size of ResourceOwnerData. Short term the increase is small enough to not worry about it for now. Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/20221029025420.eplyow6k7tgu6he3@awork3.anarazel.de Discussion: https://postgr.es/m/20221029200025.w7bvlgvamjfo6z44@awork3.anarazel.de
2023-04-05bufmgr: Acquire and clean victim buffer separatelyAndres Freund
Previously we held buffer locks for two buffer mapping partitions at the same time to change the identity of buffers. Particularly for extending relations needing to hold the extension lock while acquiring a victim buffer is painful.But it also creates a bottleneck for workloads that just involve reads. Now we instead first acquire a victim buffer and write it out, if necessary. Then we remove that buffer from the old partition with just the old partition's partition lock held and insert it into the new partition with just that partition's lock held. By separating out the victim buffer acquisition, future commits will be able to change relation extensions to scale better. On my workstation, a micro-benchmark exercising buffered reads strenuously and under a lot of concurrency, sees a >2x improvement. Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/20221029025420.eplyow6k7tgu6he3@awork3.anarazel.de
2023-04-05bufmgr: Add Pin/UnpinLocalBuffer()Andres Freund
So far these were open-coded in quite a few places, without a good reason. Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/20221029025420.eplyow6k7tgu6he3@awork3.anarazel.de
2023-04-05bufmgr: Add some more error checking [infrastructure] around pinningAndres Freund
This adds a few more assertions against a buffer being local in places we don't expect, and extracts the check for a buffer being pinned exactly once from LockBufferForCleanup() into its own function. Later commits will use this function. Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: http://postgr.es/m/419312fd-9255-078c-c3e3-f0525f911d7f@iki.fi
2023-04-03bufmgr: Remove buffer-write-dirty tracepointsAndres Freund
The trace point was using the relfileno / fork / block for the to-be-read-in buffer. Some upcoming work would make that more expensive to provide. We still have buffer-flush-start/done, which can serve most tracing needs that buffer-write-dirty could serve. Discussion: https://postgr.es/m/f5164e7a-eef6-8972-75a3-8ac622ed0c6e@iki.fi
2023-04-03Rename BufferAccessStrategyData.ring_size to nbuffersDavid Rowley
The new name better reflects what the field is - the size of the buffers[] array. ring_size sounded more like it is in units of bytes. An upcoming commit allows a BufferAccessStrategy of custom sizes, so it seems relevant to improve this beforehand. Author: Melanie Plageman Reviewed-by: David Rowley Discussion: https://postgr.es/m/CAAKRu_YefVbhg4rAxU2frYFdTap61UftH-kUNQZBvAs%2BYK81Zg%40mail.gmail.com
2023-03-30Track shared buffer hits in pg_stat_ioAndres Freund
Among other things, this should make it easier to calculate a useful cache hit ratio by excluding buffer reads via buffer access strategies. As buffer access strategies reuse buffers (and thus evict the prior buffer contents), it is normal to see reads on repeated scans of the same data. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CAAKRu_beMa9Hzih40%3DXPYqhDVz6tsgUGTrhZXRo%3Dunp%2Bszb%3DUA%40mail.gmail.com
2023-03-30bufmgr: Fix undefined behaviour with, unrealistically, large temp_buffersAndres Freund
Quoting Melanie: > Since if buffer is INT_MAX, then the -(buffer + 1) version invokes > undefined behavior while the -buffer - 1 version doesn't. All other places were already using the correct version. I (Andres), copied the code into more places in a patch. Melanie caught it in review, but to prevent more people from copying the bad code, fix it. Even if it is a theoretical issue. We really ought to wrap these accesses in a helper function... As this is a theoretical issue, don't backpatch. Reported-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/CAAKRu_aW2SX_LWtwHgfnqYpBrunMLfE9PD6-ioPpkh92XH0qpg@mail.gmail.com
2023-03-29Remove empty function BufmgrCommit().Tom Lane
This function has been a no-op for over a decade. Even if bufmgr regains a need to be called during commit, it seems unlikely that the most appropriate call points would be precisely here, so it's not doing us much good as a placeholder either. Now, removing it probably doesn't save any noticeable number of cycles --- but the main call is inside the commit critical section, and the less work done there the better. Matthias van de Meent Discussion: https://postgr.es/m/CAEze2Wi1=tLKbxZnXzcD+8fYKyKqBtivVakLQC_mYBsP4Y8qVA@mail.gmail.com
2023-03-22Fix memory leak and inefficiency in CREATE DATABASE ... STRATEGY WAL_LOGAndres Freund
RelationCopyStorageUsingBuffer() did not free the strategies used to access the source / target relation. They memory was released at the end of the transaction, but when using a template database with a lot of relations, the temporary leak can become big prohibitively big. RelationCopyStorageUsingBuffer() acquired the buffer for the target relation with RBM_NORMAL, therefore requiring a read of a block guaranteed to be zero. Use RBM_ZERO_AND_LOCK instead. Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/20230321070113.o2vqqxogjykwgfrr@awork3.anarazel.de Backpatch: 15-, where STRATEGY WAL_LOG was introduced
2023-02-27Update types in smgr APIPeter Eisentraut
Change data buffer to void *, from char *, and add const where appropriate. This makes it match the File API (see also 2d4f1ba6cfc2f0a977f1c30bda9848041343e248) and stdio. Discussion: https://www.postgresql.org/message-id/flat/11dda853-bb5b-59ba-a746-e168b1ce4bdb%40enterprisedb.com
2023-02-20Speedup and increase usability of set proc title functionsDavid Rowley
The setting of the process title could be seen on profiles of very fast-to-execute queries. In many locations where we call set_ps_display() we pass along a string constant, the length of which is known during compilation. Here we effectively rename set_ps_display() to set_ps_display_with_len() and then add a static inline function named set_ps_display() which calls strlen() on the given string. This allows the compiler to optimize away the strlen() call when dealing with call sites passing a string constant. We can then also use memcpy() instead of strlcpy() to copy the string into the destination buffer. That's significantly faster than strlcpy's byte-at-a-time way of copying. Here we also take measures to improve some code which was adjusting the process title to add a " waiting" suffix to it. Call sites which require this can now just call set_ps_display_suffix() to add or adjust the suffix and call set_ps_display_remove_suffix() to remove it again. Reviewed-by: Andres Freund Discussion: https://postgr.es/m/CAApHDvocBvvk-0gWNA2Gohe+sv9fMcv+fK_G+siBKJrgDG4O7g@mail.gmail.com
2023-02-09pgstat: Track more detailed relation IO statisticsAndres Freund
Commit 28e626bde00 introduced the infrastructure for tracking more detailed IO statistics. This commit adds the actual collection of the new IO statistics for relations and temporary relations. See aforementioned commit for goals and high-level design. The changes in this commit are fairly straight-forward. The bulk of the change is to passing sufficient information to the callsites of pgstat_count_io_op(). A somewhat unsightly detail is that it currently is hard to find a better place to count fsyncs than in md.c, whereas the other pgstat_count_io_op() calls are in bufmgr.c/localbuf.c. As the number of fsyncs is tied to md.c implementation details, it's not obvious there is a better answer. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20200124195226.lth52iydq2n2uilq@alap3.anarazel.de
2023-02-06Remove useless casts to (void *) in hash_search() callsPeter Eisentraut
Some of these appear to be leftovers from when hash_search() took a char * argument (changed in 5999e78fc45dcb91784b64b6e9ae43f4e4f68ca2). Since after this there is some more horizontal space available, do some light reformatting where suitable. Reviewed-by: Corey Huinker <corey.huinker@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/fd9adf5d-b1aa-e82f-e4c7-263c30145807%40enterprisedb.com
2023-01-20Zero initialize uses of instr_time about to trigger compiler warningsAndres Freund
These are all not necessary from a correctness POV. However, in the near future instr_time will be simplified to an int64, at which point gcc would otherwise start to warn about the changed places. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/20230116023639.rn36vf6ajqmfciua@awork3.anarazel.de
2023-01-13Manual cleanup and pgindent of pgstat and bufmgr related codeAndres Freund
This is in preparation for commiting a larger patch series in the area. Discussion: https://postgr.es/m/CAAKRu_bHwGEbzNxxy+MQDkrsgog6aO6iUvajJ4d6PD98gFU7+w@mail.gmail.com
2023-01-02Update copyright for 2023Bruce Momjian
Backpatch-through: 11
2022-12-20Add copyright notices to meson filesAndrew Dunstan
Discussion: https://postgr.es/m/222b43a5-2fb3-2c1b-9cd0-375d376c8246@dunslane.net
2022-10-31Clean up some inconsistencies with GUC declarationsMichael Paquier
This is similar to 7d25958, and this commit takes care of all the remaining inconsistencies between the initial value used in the C variable associated to a GUC and its default value stored in the GUC tables (as of pg_settings.boot_val). Some of the initial values of the GUCs updated rely on a compile-time default. These are refactored so as the GUC table and its C declaration use the same values. This makes everything consistent with other places, backend_flush_after, bgwriter_flush_after, port, checkpoint_flush_after doing so already, for example. Extracted from a larger patch by Peter Smith. The spots updated in the modules are from me. Author: Peter Smith, Michael Paquier Reviewed-by: Nathan Bossart, Tom Lane, Justin Pryzby Discussion: https://postgr.es/m/CAHut+PtHE0XSfjjRQ6D4v7+dqzCw=d+1a64ujra4EX8aoc_Z+w@mail.gmail.com
2022-09-30Remove useless argument from UnpinBuffer()Michael Paquier
The last caller of UnpinBuffer() that did not want to adjust CurrentResourceOwner was removed in 2d115e4, and nothing has been introduced in bufmgr.c to do the same thing since. This simplifies 10 code paths. Author: Aleksander Alekseev Reviewed-by: Nathan Bossart, Zhang Mingli, Bharath Rupireddy Discussion: https://postgr.es/m/CAJ7c6TOmmFpb6ohurLhTC7hKNJWGzdwf8s4EAtAZxD48g-e6Jw@mail.gmail.com
2022-09-21meson: Add initial version of meson based build systemAndres Freund
Autoconf is showing its age, fewer and fewer contributors know how to wrangle it. Recursive make has a lot of hard to resolve dependency issues and slow incremental rebuilds. Our home-grown MSVC build system is hard to maintain for developers not using Windows and runs tests serially. While these and other issues could individually be addressed with incremental improvements, together they seem best addressed by moving to a more modern build system. After evaluating different build system choices, we chose to use meson, to a good degree based on the adoption by other open source projects. We decided that it's more realistic to commit a relatively early version of the new build system and mature it in tree. This commit adds an initial version of a meson based build system. It supports building postgres on at least AIX, FreeBSD, Linux, macOS, NetBSD, OpenBSD, Solaris and Windows (however only gcc is supported on aix, solaris). For Windows/MSVC postgres can now be built with ninja (faster, particularly for incremental builds) and msbuild (supporting the visual studio GUI, but building slower). Several aspects (e.g. Windows rc file generation, PGXS compatibility, LLVM bitcode generation, documentation adjustments) are done in subsequent commits requiring further review. Other aspects (e.g. not installing test-only extensions) are not yet addressed. When building on Windows with msbuild, builds are slower when using a visual studio version older than 2019, because those versions do not support MultiToolTask, required by meson for intra-target parallelism. The plan is to remove the MSVC specific build system in src/tools/msvc soon after reaching feature parity. However, we're not planning to remove the autoconf/make build system in the near future. Likely we're going to keep at least the parts required for PGXS to keep working around until all supported versions build with meson. Some initial help for postgres developers is at https://wiki.postgresql.org/wiki/Meson With contributions from Thomas Munro, John Naylor, Stone Tickle and others. Author: Andres Freund <andres@anarazel.de> Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Author: Peter Eisentraut <peter@eisentraut.org> Reviewed-By: Peter Eisentraut <peter.eisentraut@enterprisedb.com> Discussion: https://postgr.es/m/20211012083721.hvixq4pnh2pixr3j@alap3.anarazel.de
2022-09-19Harmonize parameter names in storage and AM code.Peter Geoghegan
Make sure that function declarations use names that exactly match the corresponding names from function definitions in storage, catalog, access method, executor, and logical replication code, as well as in miscellaneous utility/library code. Like other recent commits that cleaned up function parameter names, this commit was written with help from clang-tidy. Later commits will do the same for other parts of the codebase. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAH2-WznJt9CMM9KJTMjJh_zbL5hD9oX44qdJ4aqZtjFi-zA3Tg@mail.gmail.com
2022-09-13Split up guc.c for better build speed and ease of maintenance.Tom Lane
guc.c has grown to be one of our largest .c files, making it a bottleneck for compilation. It's also acquired a bunch of knowledge that'd be better kept elsewhere, because of our not very good habit of putting variable-specific check hooks here. Hence, split it up along these lines: * guc.c itself retains just the core GUC housekeeping mechanisms. * New file guc_funcs.c contains the SET/SHOW interfaces and some SQL-accessible functions for GUC manipulation. * New file guc_tables.c contains the data arrays that define the built-in GUC variables, along with some already-exported constant tables. * GUC check/assign/show hook functions are moved to the variable's home module, whenever that's clearly identifiable. A few hard- to-classify hooks ended up in commands/variable.c, which was already a home for miscellaneous GUC hook functions. To avoid cluttering a lot more header files with #include "guc.h", I also invented a new header file utils/guc_hooks.h and put all the GUC hook functions' declarations there, regardless of their originating module. That allowed removal of #include "guc.h" from some existing headers. The fallout from that (hopefully all caught here) demonstrates clearly why such inclusions are best minimized: there are a lot of files that, for example, were getting array.h at two or more levels of remove, despite not having any connection at all to GUCs in themselves. There is some very minor code beautification here, such as renaming a couple of inconsistently-named hook functions and improving some comments. But mostly this just moves code from point A to point B and deals with the ensuing needs for #include adjustments and exporting a few functions that previously weren't exported. Patch by me, per a suggestion from Andres Freund; thanks also to Michael Paquier for the idea to invent guc_funcs.c. Discussion: https://postgr.es/m/587607.1662836699@sss.pgh.pa.us
2022-08-24Include RelFileLocator fields individually in BufferTag.Robert Haas
This is preparatory work for a project to increase the number of bits in a RelFileNumber from 32 to 56. Along the way, introduce static inline accessor functions for a couple of BufferTag fields. Dilip Kumar, reviewed by me. The overall patch series has also had review at various times from Andres Freund, Ashutosh Sharma, Hannu Krosing, Vignesh C, Álvaro Herrera, and Tom Lane. Discussion: http://postgr.es/m/CAFiTN-trubju5YbWAq-BSpZ90-Z6xCVBQE8BVqXqANOZAF1Znw@mail.gmail.com
2022-08-24Further reduce warnings with -Wshadow=compatible-localDavid Rowley
In a similar effort to f01592f91, here we're targetting fixing the warnings that -Wshadow=compatible-local produces that we can fix by moving a variable to an inner scope to stop that variable from being shadowed by another variable declared somewhere later in the function. All of the warnings being fixed here are changing the scope of variables which are being used as an iterator for a "for" loop. In each instance, the fix happens to be changing the for loop to use the C99 type initialization. Much of this code likely pre-dates our use of C99. Reducing the scope of the outer scoped variable seems like the safest way to fix these. Renaming seems more likely to risk patches using the wrong variable. Reducing the scope is more likely to result in a compilation failure after applying some future patch rather than introducing bugs with it. By my count, this takes the warning count from 129 down to 114. Author: Justin Pryzby Discussion: https://postgr.es/m/CAApHDvrwLGBP%2BYw9vriayyf%3DXR4uPWP5jr6cQhP9au_kaDUhbA%40mail.gmail.com
2022-08-18When using the WAL-logged CREATE DATABASE strategy, bulk extend.Robert Haas
This should improve performance, and was suggested by Andres Freund. Back-patch to v15 to keep the code consistent across branches. Dilip Kumar Discussion: http://postgr.es/m/C3458199-FEDD-4356-865A-08DFAA5D4065@anarazel.de Discussion: http://postgr.es/m/CAFiTN-sJ0vVpJrZ=R5M+g7Tr8=NN4wKOtrqOcDEsfFfnZgivVA@mail.gmail.com
2022-08-12Avoid using a fake relcache entry to own an SmgrRelation.Robert Haas
If an error occurs before we close the fake relcache entry, the the fake relcache entry will be destroyed by the SmgrRelation will survive until end of transaction. Its smgr_owner pointer ends up pointing to already-freed memory. The original reason for using a fake relcache entry here was to try to avoid reusing an SMgrRelation across a relevant invalidation. To avoid that problem, just call smgropen() again each time we need a reference to it. Hopefully someday we will come up with a more elegant approach, but accessing uninitialized memory is bad so let's do this for now. Dilip Kumar, reviewed by Andres Freund and Tom Lane. Report by Justin Pryzby. Discussion: http://postgr.es/m/20220802175043.GA13682@telsasoft.com Discussion: http://postgr.es/m/CAFiTN-vSFeE6_W9z698XNtFROOA_nSqUXWqLcG0emob_kJ+dEQ@mail.gmail.com
2022-08-06Fix data-corruption hazard in WAL-logged CREATE DATABASE.Tom Lane
RelationCopyStorageUsingBuffer thought it could skip copying empty pages, but of course that does not work at all, because subsequent blocks will be out of place. Also fix it to acquire share lock on the source buffer. It *might* be safe to not do that, but it's not very certain, and I don't think this code deserves any benefit of the doubt. Dilip Kumar, per complaint from me Discussion: https://postgr.es/m/3679800.1659654066@sss.pgh.pa.us
2022-07-27Convert macros to static inline functions (buf_internals.h)Robert Haas
Dilip Kumar, reviewed by Vignesh C, Ashutosh Sharma, and me. Discussion: http://postgr.es/m/CAFiTN-tYbM7D+2UGiNc2kAFMSQTa5FTeYvmg-Vj2HvPdVw2Gvg@mail.gmail.com
2022-07-25Fix ReadRecentBuffer for local buffers.Heikki Linnakangas
It incorrectly used GetBufferDescriptor instead of GetLocalBufferDescriptor, causing it to not find the correct buffer in most cases, and performing an out-of-bounds memory read in the corner case that temp_buffers > shared_buffers. It also bumped the usage-count on the buffer, even if it was previously pinned. That won't lead to crashes or incorrect results, but it's different from what the shared-buffer case does, and different from the usual code in LocalBufferAlloc. Fix that too, and make the code ordering match LocalBufferAlloc() more closely, so that it's easier to verify that it's doing the same thing. Currently, ReadRecentBuffer() is only used with non-temp relations, in WAL redo, so the broken code is currently dead code. However, it could be used by extensions. Backpatch-through: 14 Discussion: https://www.postgresql.org/message-id/2d74b46f-27c9-fb31-7f99-327a87184cc0%40iki.fi Reviewed-by: Thomas Munro, Zhang Mingli, Richard Guo
2022-07-12Rename some functions to mention Relation instead of RelFileLocator.Robert Haas
This is definitely shorter, and hopefully clearer. Kyotaro Horiguchi, reviewed by Dilip Kumar and by me Discussion: http://postgr.es/m/20220707.174436.1885393789789795413.horikyota.ntt@gmail.com
2022-07-11Fix mistake in comment.Robert Haas
Kyotaro Horiguchi Discussion: http://postgr.es/m/20220708.145951.382076151410075693.horikyota.ntt@gmail.com
2022-07-06Change internal RelFileNode references to RelFileNumber or RelFileLocator.Robert Haas
We have been using the term RelFileNode to refer to either (1) the integer that is used to name the sequence of files for a certain relation within the directory set aside for that tablespace/database combination; or (2) that value plus the OIDs of the tablespace and database; or occasionally (3) the whole series of files created for a relation based on those values. Using the same name for more than one thing is confusing. Replace RelFileNode with RelFileNumber when we're talking about just the single number, i.e. (1) from above, and with RelFileLocator when we're talking about all the things that are needed to locate a relation's files on disk, i.e. (2) from above. In the places where we refer to (3) as a relfilenode, instead refer to "relation storage". Since there is a ton of SQL code in the world that knows about pg_class.relfilenode, don't change the name of that column, or of other SQL-facing things that derive their name from it. On the other hand, do adjust closely-related internal terminology. For example, the structure member names dbNode and spcNode appear to be derived from the fact that the structure itself was called RelFileNode, so change those to dbOid and spcOid. Likewise, various variables with names like rnode and relnode get renamed appropriately, according to how they're being used in context. Hopefully, this is clearer than before. It is also preparation for future patches that intend to widen the relfilenumber fields from its current width of 32 bits. Variables that store a relfilenumber are now declared as type RelFileNumber rather than type Oid; right now, these are the same, but that can now more easily be changed. Dilip Kumar, per an idea from me. Reviewed also by Andres Freund. I fixed some whitespace issues, changed a couple of words in a comment, and made one other minor correction. Discussion: http://postgr.es/m/CA+TgmoamOtXbVAQf9hWFzonUo6bhhjS6toZQd7HZ-pmojtAmag@mail.gmail.com Discussion: http://postgr.es/m/CA+Tgmobp7+7kmi4gkq7Y+4AM9fTvL+O1oQ4-5gFTT+6Ng-dQ=g@mail.gmail.com Discussion: http://postgr.es/m/CAFiTN-vTe79M8uDH1yprOU64MNFE+R3ODRuA+JWf27JbhY4hJw@mail.gmail.com
2022-05-12Pre-beta mechanical code beautification.Tom Lane
Run pgindent, pgperltidy, and reformat-dat-files. I manually fixed a couple of comments that pgindent uglified.
2022-04-13Remove extraneous blank lines before block-closing bracesAlvaro Herrera
These are useless and distracting. We wouldn't have written the code with them to begin with, so there's no reason to keep them. Author: Justin Pryzby <pryzby@telsasoft.com> Discussion: https://postgr.es/m/20220411020336.GB26620@telsasoft.com Discussion: https://postgr.es/m/attachment/133167/0016-Extraneous-blank-lines.patch
2022-04-08Rename delayChkpt to delayChkptFlags.Robert Haas
Before commit 412ad7a55639516f284cd0ef9757d6ae5c7abd43, delayChkpt was a Boolean. Now it's an integer. Extensions using it need to be appropriately updated, so let's rename the field to make sure that a hard compilation failure occurs. Replacing delayChkpt with delayChkptFlags made a few comments extend past 80 characters, so I reflowed them and changed some wording very slightly. The back-branches will need a different change to restore compatibility with existing minor releases; this is just for master. Per suggestion from Tom Lane. Discussion: http://postgr.es/m/a7880f4d-1d74-582a-ada7-dad168d046d1@enterprisedb.com
2022-04-07Prefetch data referenced by the WAL, take II.Thomas Munro
Introduce a new GUC recovery_prefetch. When enabled, look ahead in the WAL and try to initiate asynchronous reading of referenced data blocks that are not yet cached in our buffer pool. For now, this is done with posix_fadvise(), which has several caveats. Since not all OSes have that system call, "try" is provided so that it can be enabled where available. Better mechanisms for asynchronous I/O are possible in later work. Set to "try" for now for test coverage. Default setting to be finalized before release. The GUC wal_decode_buffer_size limits the distance we can look ahead in bytes of decoded data. The existing GUC maintenance_io_concurrency is used to limit the number of concurrent I/Os allowed, based on pessimistic heuristics used to infer that I/Os have begun and completed. We'll also not look more than maintenance_io_concurrency * 4 block references ahead. Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Reviewed-by: Tomas Vondra <tomas.vondra@2ndquadrant.com> Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com> (earlier version) Reviewed-by: Andres Freund <andres@anarazel.de> (earlier version) Reviewed-by: Justin Pryzby <pryzby@telsasoft.com> (earlier version) Tested-by: Tomas Vondra <tomas.vondra@2ndquadrant.com> (earlier version) Tested-by: Jakub Wartak <Jakub.Wartak@tomtom.com> (earlier version) Tested-by: Dmitry Dolgov <9erthalion6@gmail.com> (earlier version) Tested-by: Sait Talha Nisanci <Sait.Nisanci@microsoft.com> (earlier version) Discussion: https://postgr.es/m/CA%2BhUKGJ4VJN8ttxScUFM8dOKX0BrBiboo5uz1cq%3DAovOddfHpA%40mail.gmail.com
2022-04-06pgstat: store statistics in shared memory.Andres Freund
Previously the statistics collector received statistics updates via UDP and shared statistics data by writing them out to temporary files regularly. These files can reach tens of megabytes and are written out up to twice a second. This has repeatedly prevented us from adding additional useful statistics. Now statistics are stored in shared memory. Statistics for variable-numbered objects are stored in a dshash hashtable (backed by dynamic shared memory). Fixed-numbered stats are stored in plain shared memory. The header for pgstat.c contains an overview of the architecture. The stats collector is not needed anymore, remove it. By utilizing the transactional statistics drop infrastructure introduced in a prior commit statistics entries cannot "leak" anymore. Previously leaked statistics were dropped by pgstat_vacuum_stat(), called from [auto-]vacuum. On systems with many small relations pgstat_vacuum_stat() could be quite expensive. Now that replicas drop statistics entries for dropped objects, it is not necessary anymore to reset stats when starting from a cleanly shut down replica. Subsequent commits will perform some further code cleanup, adapt docs and add tests. Bumps PGSTAT_FILE_FORMAT_ID. Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Author: Andres Freund <andres@anarazel.de> Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-By: Andres Freund <andres@anarazel.de> Reviewed-By: Thomas Munro <thomas.munro@gmail.com> Reviewed-By: Justin Pryzby <pryzby@telsasoft.com> Reviewed-By: "David G. Johnston" <david.g.johnston@gmail.com> Reviewed-By: Tomas Vondra <tomas.vondra@2ndquadrant.com> (in a much earlier version) Reviewed-By: Arthur Zakirov <a.zakirov@postgrespro.ru> (in a much earlier version) Reviewed-By: Antonin Houska <ah@cybertec.at> (in a much earlier version) Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de Discussion: https://postgr.es/m/20220308205351.2xcn6k4x5yivcxyd@alap3.anarazel.de Discussion: https://postgr.es/m/20210319235115.y3wz7hpnnrshdyv6@alap3.anarazel.de
2022-03-29Add new block-by-block strategy for CREATE DATABASE.Robert Haas
Because this strategy logs changes on a block-by-block basis, it avoids the need to checkpoint before and after the operation. However, because it logs each changed block individually, it might generate a lot of extra write-ahead logging if the template database is large. Therefore, the older strategy remains available via a new STRATEGY parameter to CREATE DATABASE, and a corresponding --strategy option to createdb. Somewhat controversially, this patch assembles the list of relations to be copied to the new database by reading the pg_class relation of the template database. Cross-database access like this isn't normally possible, but it can be made to work here because there can't be any connections to the database being copied, nor can it contain any in-doubt transactions. Even so, we have to use lower-level interfaces than normal, since the table scan and relcache interfaces will not work for a database to which we're not connected. The advantage of this approach is that we do not need to rely on the filesystem to determine what ought to be copied, but instead on PostgreSQL's own knowledge of the database structure. This avoids, for example, copying stray files that happen to be located in the source database directory. Dilip Kumar, with a fairly large number of cosmetic changes by me. Reviewed and tested by Ashutosh Sharma, Andres Freund, John Naylor, Greg Nancarrow, Neha Sharma. Additional feedback from Bruce Momjian, Heikki Linnakangas, Julien Rouhaud, Adam Brusselback, Kyotaro Horiguchi, Tomas Vondra, Andrew Dunstan, Álvaro Herrera, and others. Discussion: http://postgr.es/m/CA+TgmoYtcdxBjLh31DLxUXHxFVMPGzrU5_T=CYCvRyFHywSBUQ@mail.gmail.com