summaryrefslogtreecommitdiff
path: root/src/backend/utils
AgeCommit message (Collapse)Author
2023-12-26Remove unused macroPeter Eisentraut
Usage was removed in 6c5576075b but the definition was not removed.
2023-12-25Enhance checkpointer restartpoint statisticsAlexander Korotkov
Bhis commit introduces enhancements to the pg_stat_checkpointer view by adding three new columns: restartpoints_timed, restartpoints_req, and restartpoints_done. These additions aim to improve the visibility and monitoring of restartpoint processes on replicas. Previously, it was challenging to differentiate between successful and failed restartpoint requests. This limitation arises because restartpoints on replicas are dependent on checkpoint records from the primary, and cannot occur more frequently than these checkpoints. The new columns allow for clear distinction and tracking of restartpoint requests, their triggers, and successful completions. This enhancement aids database administrators and developers in better understanding and diagnosing issues related to restartpoint behavior, particularly in scenarios where restartpoint requests may fail. System catalog is changed. Catversion is bumped. Discussion: https://postgr.es/m/99b2ccd1-a77a-962a-0837-191cdf56c2b9%40inbox.ru Author: Anton A. Melnikov Reviewed-by: Kyotaro Horiguchi, Alexander Korotkov
2023-12-20Add a new WAL summarizer process.Robert Haas
When active, this process writes WAL summary files to $PGDATA/pg_wal/summaries. Each summary file contains information for a certain range of LSNs on a certain TLI. For each relation, it stores a "limit block" which is 0 if a relation is created or destroyed within a certain range of WAL records, or otherwise the shortest length to which the relation was truncated during that range of WAL records, or otherwise InvalidBlockNumber. In addition, it stores a list of blocks which have been modified during that range of WAL records, but excluding blocks which were removed by truncation after they were modified and never subsequently modified again. In other words, it tells us which blocks need to copied in case of an incremental backup covering that range of WAL records. But this doesn't yet add the capability to actually perform an incremental backup; the next patch will do that. A new parameter summarize_wal enables or disables this new background process. The background process also automatically deletes summary files that are older than wal_summarize_keep_time, if that parameter has a non-zero value and the summarizer is configured to run. Patch by me, with some design help from Dilip Kumar and Andres Freund. Reviewed by Matthias van de Meent, Dilip Kumar, Jakub Wartak, Peter Eisentraut, and Álvaro Herrera. Discussion: http://postgr.es/m/CA+TgmoYOYZfMCyOXFyC-P+-mdrZqm5pP2N7S-r0z3_402h9rsA@mail.gmail.com
2023-12-19Simplify newNode() by removing special casesHeikki Linnakangas
- Remove MemoryContextAllocZeroAligned(). It was supposed to be a faster version of MemoryContextAllocZero(), but modern compilers turn the MemSetLoop() into a call to memset() anyway, making it more or less identical to MemoryContextAllocZero(). That was the only user of MemSetTest, MemSetLoop, so remove those too, as well as palloc0fast(). - Convert newNode() to a static inline function. When this was originally originally written, it was written as a macro because testing showed that gcc didn't inline the size check as we intended. Modern compiler versions do, and now that it just calls palloc0() there is no size-check to inline anyway. One nice effect is that the palloc0() takes one less argument than MemoryContextAllocZeroAligned(), which saves a few instructions in the callers of newNode(). Reviewed-by: Peter Eisentraut, Tom Lane, John Naylor, Thomas Munro Discussion: https://www.postgresql.org/message-id/b51f1fa7-7e6a-4ecc-936d-90a8a1659e7c@iki.fi
2023-12-18Micro-optimize datum_to_json_internal() some more.Nathan Bossart
Commit dc3f9bc549 mainly targeted the JSONTYPE_NUMERIC code path. This commit applies similar optimizations (e.g., removing unnecessary runtime calls to strlen() and palloc()) to nearby code. Reviewed-by: Tom Lane Discussion: https://postgr.es/m/20231208203708.GA4126315%40nathanxps13
2023-12-16Refactor pgstat_prepare_io_time() with an input argument instead of a GUCMichael Paquier
Originally, this routine relied on track_io_timing to check if a time interval for an I/O operation stored in pg_stat_io should be initialized or not. However, the addition of WAL statistics to pg_stat_io requires that the initialization happens when track_wal_io_timing is enabled, which is dependent on the code path where the I/O operation happens. Author: Nazir Bilal Yavuz Discussion: https://postgr.es/m/CAN55FZ3AiQ+ZMxUuXnBpd0Rrh1YhwJ5FudkHg=JU0P+-W8T4Vg@mail.gmail.com
2023-12-11Remove trace_recovery_messagesMichael Paquier
This GUC was intended as a debugging help in the 9.0 area when hot standby and streaming replication were being developped, able to offer more information at LOG level rather than DEBUGn. There are more tools available these days that are able to offer rather equivalent information, like pg_waldump introduced in 9.3. It is not obvious how this facility is useful these days, so let's remove it. Author: Bharath Rupireddy Discussion: https://postgr.es/m/ZXEXEAUVFrvpquSd@paquier.xyz
2023-12-08Micro-optimize JSONTYPE_NUMERIC code path in json.c.Nathan Bossart
This commit does the following: * In datum_to_json_internal(), the call to IsValidJsonNumber() is replaced with simplified validation code. This avoids an extra call to strlen() in this path, and it avoids validating the entire string (which is okay since we know we're dealing with a numeric data type's output). * In datum_to_json_internal(), the call to escape_json() in the JSONTYPE_NUMERIC path is replaced with code that just surrounds the string with quotes. In passing, some other nearby calls to appendStringInfo() have been replaced with similar code to avoid unnecessary calls to vsnprintf(). * In composite_to_json(), the length of the separator is now determined at compile time to avoid unnecessary calls to strlen(). On my machine, this speeds up a benchmark for the proposed COPY TO (FORMAT json) command with many integers by upwards of 20%. There are likely other code paths that could be given a similar treatment, but that is left as a future exercise. Reviewed-by: Jeff Davis, Tom Lane, David Rowley, John Naylor Discussion: https://postgr.es/m/20231207231251.GB3359478%40nathanxps13
2023-12-08Cache opaque handle for GUC option to avoid repeasted lookups.Jeff Davis
When setting GUCs from proconfig, performance is important, and hash lookups in the GUC table are significant. Per suggestion from Robert Haas. Discussion: https://postgr.es/m/CA+TgmoYpKxhR3HOD9syK2XwcAUVPa0+ba0XPnwWBcYxtKLkyxA@mail.gmail.com Reviewed-by: John Naylor
2023-12-08Allow parallel CREATE INDEX for BRIN indexesTomas Vondra
Allow using multiple worker processes to build BRIN index, which until now was supported only for BTREE indexes. For large tables this often results in significant speedup when the build is CPU-bound. The work is split in a simple way - each worker builds BRIN summaries on a subset of the table, determined by the regular parallel scan used to read the data, and feeds them into a shared tuplesort which sorts them by blkno (start of the range). The leader then reads this sorted stream of ranges, merges duplicates (which may happen if the parallel scan does not align with BRIN pages_per_range), and adds the resulting ranges into the index. The number of duplicate results produced by workers (requiring merging in the leader process) should be fairly small, thanks to how parallel scans assign chunks to workers. The likelihood of duplicate results may increase for higher pages_per_range values, but then there are fewer page ranges in total. In any case, we expect the merging to be much cheaper than summarization, so this should be a win. Most of the parallelism infrastructure is a simplified copy of the code used by BTREE indexes, omitting the parts irrelevant for BRIN indexes (e.g. uniqueness checks). This also introduces a new index AM flag amcanbuildparallel, determining whether to attempt to start parallel workers for the index build. Original patch by me, with reviews and substantial reworks by Matthias van de Meent, certainly enough to make him a co-author. Author: Tomas Vondra, Matthias van de Meent Reviewed-by: Matthias van de Meent Discussion: https://postgr.es/m/c2ee7d69-ce17-43f2-d1a0-9811edbda6e6%40enterprisedb.com
2023-12-08Rename ShmemVariableCache to TransamVariablesHeikki Linnakangas
The old name was misleading: It's not a cache, the values kept in the struct are the authoritative source. Reviewed-by: Tristan Partin, Richard Guo Discussion: https://www.postgresql.org/message-id/6537d63d-4bb5-46f8-9b5d-73a8ba4720ab@iki.fi
2023-12-07Fix issues in binary_upgrade_logical_slot_has_caught_up().Amit Kapila
The commit 29d0a77fa6 labelled binary_upgrade_logical_slot_has_caught_up() as a non-strict function to allow providing a better error message to callers in case the passed slot_name is NULL. On further discussion, it seems that it is not helpful to have a different error message for NULL input in this function, so this patch marks the function as strict. This patch also removes the explicit permission check to use replication slots as this function is invoked only by superusers and instead adds an Assert. Reported-by: Masahiko Sawada Author: Hayato Kuroda Reviewed-by: Vignesh C Discussion: https://postgr.es/m/CAD21AoDSyiBKkMXBxN_gUayZZUCOgyHnG8Ge8rcPXNP3Tf6B4g@mail.gmail.com
2023-12-04Teach convert() and friends to avoid copying when possible.Nathan Bossart
Presently, pg_convert() allocates a new bytea and copies the result regardless of whether any conversion actually happened. This commit adjusts this function to return the source pointer as-is if no conversion occurred. This optimization isn't expected to make a tremendous difference, but it still seems worthwhile to avoid unnecessary memory allocations. Author: Yurii Rashkovskii Reviewed-by: Bertrand Drouvot Discussion: https://postgr.es/m/CA%2BRLCQyknBPSWXRBQGOi6aYEcdQ9RpH9Kch4GjoeY8dQ3D%2Bvhw%40mail.gmail.com
2023-12-01Adjust obsolete comment explaining set_stack_base().Thomas Munro
Commit 7389aad6 removed the notion of backends started from inside a signal handler. A stray comment still referred to them, while explaining the need for a call to set_stack_base(). That leads to the question of whether we still need the call in !EXEC_BACKEND builds. There doesn't seem to be much point in suppressing it now, as it doesn't hurt and probably helps to measure the stack base from the exact same place in EXEC_BACKEND and !EXEC_BACKEND builds. Back-patch to 16. Reported-by: Heikki Linnakangas <hlinnaka@iki.fi> Reported-by: Tristan Partin <tristan@neon.tech> Reported-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CA%2BhUKG%2BEJHcevNGNOxVWxTNFbuB%3Dvjf4U68%2B85rAC_Sxvy2zEQ%40mail.gmail.com
2023-11-30Apply quotes more consistently to GUC names in logsMichael Paquier
Quotes are applied to GUCs in a very inconsistent way across the code base, with a mix of double quotes or no quotes used. This commit removes double quotes around all the GUC names that are obviously referred to as parameters with non-English words (use of underscore, mixed case, etc). This is the result of a discussion with Álvaro Herrera, Nathan Bossart, Laurenz Albe, Peter Eisentraut, Tom Lane and Daniel Gustafsson. Author: Peter Smith Discussion: https://postgr.es/m/CAHut+Pv-kSN8SkxSdoHano_wPubqcg5789ejhCDZAcLFceBR-w@mail.gmail.com
2023-11-29Use larger segment file names for pg_notifyAlexander Korotkov
This avoids the wraparound in async.c and removes the corresponding code complexity. The maximum amount of allocated SLRU pages for NOTIFY / LISTEN queue is now determined by the max_notify_queue_pages GUC. The default value is 1048576. It allows to consume up to 8 GB of disk space which is exactly the limit we had previously. Author: Maxim Orlov, Aleksander Alekseev, Alexander Korotkov, Teodor Sigaev Author: Nikita Glukhov, Pavel Borisov, Yura Sokolov Reviewed-by: Jacob Champion, Heikki Linnakangas, Alexander Korotkov Reviewed-by: Japin Li, Pavel Borisov, Tom Lane, Peter Eisentraut, Andres Freund Reviewed-by: Andrey Borodin, Dilip Kumar, Aleksander Alekseev Discussion: https://postgr.es/m/CACG%3DezZe1NQSCnfHOr78AtAZxJZeCvxrts0ygrxYwe%3DpyyjVWA%40mail.gmail.com Discussion: https://postgr.es/m/CAJ7c6TPDOYBYrnCAeyndkBktO0WG2xSdYduTF0nxq%2BvfkmTF5Q%40mail.gmail.com
2023-11-28Don't use bms_membership() in cases where we don't need toDavid Rowley
00b41463c adjusted Bitmapset so that an empty set is always represented as NULL. This makes checking for empty sets far cheaper than it used to be. There were various places in the code where we'd call bms_membership() to handle the 3 possible BMS_Membership values. For the BMS_SINGLETON case, we'd also call bms_singleton_member() to find the single set member. This can now be done in a more optimal way by first checking if the set is NULL and then not bothering with bms_membership() and simply call bms_get_singleton_member() instead to find the single member. This function will return false if there are multiple members in the set. Here we also tidy up some logic in examine_variable() for the single member case. There's now no need to call bms_is_member() as we've already established that we're working with a singleton Bitmapset, so we can just check if varRelid matches the singleton member. Reviewed-by: Richard Guo Discussion: https://postgr.es/m/CAApHDvqW+CxNPcY245GaWiuqkkqgTudtG2ncGvvSjGn2wdTZLA@mail.gmail.com
2023-11-24Use SECS_PER_HOUR macro in tzparser.c, instead of constantsBruce Momjian
Reported-by: CharSyam Discussion: https://postgr.es/m/CAMrLSE5j_aWfoBDMrSvk14oBKSy+-2cjzNNH_FciirA7Kwo9TA@mail.gmail.com Author: CharSyam Backpatch-through: master
2023-11-18Guard against overflow in interval_mul() and interval_div().Dean Rasheed
Commits 146604ec43 and a898b409f6 added overflow checks to interval_mul(), but not to interval_div(), which contains almost identical code, and so is susceptible to the same kinds of overflows. In addition, those checks did not catch all possible overflow conditions. Add additional checks to the "cascade down" code in interval_mul(), and copy all the overflow checks over to the corresponding code in interval_div(), so that they both generate "interval out of range" errors, rather than returning bogus results. Given that these errors are relatively easy to hit, back-patch to all supported branches. Per bug #18200 from Alexander Lakhin, and subsequent investigation. Discussion: https://postgr.es/m/18200-5ea288c7b2d504b1%40postgresql.org
2023-11-17Extract column statistics from CTE references, if possible.Tom Lane
examine_simple_variable() left this as an unimplemented case years ago, with the result that plans for queries involving un-flattened CTEs might be much stupider than necessary. It's not hard to extend the existing logic for RTE_SUBQUERY cases to also be able to drill down into CTEs, so let's do that. There was some discussion of whether this patch breaks the idea of a MATERIALIZED CTE being an optimization fence. We concluded it's okay, because we already allow the outer planner level to see the estimated width and rowcount of the CTE result, and letting it see column statistics too seems fairly equivalent. Basically, what we expect of the optimization fence is that the outer query should not affect the plan chosen for the CTE query. Once that plan is chosen, it's okay for the outer planner level to make use of whatever information we have about it. Jian Guo and Tom Lane, per complaint from Hans Buschmann Discussion: https://postgr.es/m/4504e67078d648cdac3651b2960da6e7@nidsa.net
2023-11-17Don't specify number of dimensions in cases where we don't know it.Tom Lane
A few places in array_in() and plperl would report a misleading value (always MAXDIM+1) for the number of dimensions in the input, because we'd error out as soon as that was clearly too large rather than scanning the entire input. There doesn't seem to be much value in offering the true number, at least not enough to justify the extra complication involved in trying to get it. So just remove that parenthetical remark. We already have other places that do it like that, anyway. Per suggestions from Alexander Lakhin and Heikki Linnakangas. Discussion: https://postgr.es/m/2794005.1683042087@sss.pgh.pa.us
2023-11-17Change logtape/tuplestore code to use int64 for block numbersMichael Paquier
The code previously relied on "long" as type to track block numbers, which would be 4 bytes in all Windows builds or any 32-bit builds. This limited the code to be able to handle up to 16TB of data with the default block size of 8kB, like during a CLUSTER. This code now relies on a more portable int64, which should be more than enough for at least the next 20 years to come. This issue has been reported back in 2017, but nothing was done about it back then, so here we go now. Reported-by: Peter Geoghegan Reviewed-by: Heikki Linnakangas Discussion: https://postgr.es/m/CAH2-WznCscXnWmnj=STC0aSa7QG+BRedDnZsP=Jo_R9GUZvUrg@mail.gmail.com
2023-11-16Add target "slru" to pg_stat_reset_shared()Michael Paquier
Currently, pg_stat_reset_shared() cannot reset the counters in the view pg_stat_slru even if it is a type of shared stats. This patch adds support for a new value in pg_stat_reset_shared(), called "slru", able to do that. Note that pg_stat_reset_shared(NULL) also resets SLRU counters. There may be a point in removing pg_stat_reset_slru() that was introduced in 28cac71bd368 (v13~) as the new option overlaps with this function, but we would lose the ability to reset individual SLRU counters. This is left for future reconsideration. Author: Atsushi Torikoshi Discussion: https://postgr.es/m/e3c25d72e81378e7b64f3c52e0306fc9@oss.nttdata.com
2023-11-15Retire MemoryContextResetAndDeleteChildren() macro.Nathan Bossart
As of commit eaa5808e8e, MemoryContextResetAndDeleteChildren() is just a backwards compatibility macro for MemoryContextReset(). Now that some time has passed, this macro seems more likely to create confusion. This commit removes the macro and replaces all remaining uses with calls to MemoryContextReset(). Any third-party code that use this macro will need to be adjusted to call MemoryContextReset() instead. Since the two have behaved the same way since v9.5, such adjustments won't produce any behavior changes for all currently-supported versions of PostgreSQL. Reviewed-by: Amul Sul, Tom Lane, Alvaro Herrera, Dagfinn Ilmari Mannsåker Discussion: https://postgr.es/m/20231113185950.GA1668018%40nathanxps13
2023-11-15Fix dsa.c with different resource owners.Heikki Linnakangas
The comments in dsa.c suggested that areas were owned by resource owners, but it was not in fact tracked explicitly. The DSM attachments held by the dsa were owned by resource owners, but not the area itself. That led to confusion if you used one resource owner to attach or create the area, but then switched to a different resource owner before allocating or even just accessing the allocations in the area with dsa_get_address(). The additional DSM segments associated with the area would get owned by a different resource owner than the initial segment. To fix, add an explicit 'resowner' field to dsa_area. It replaces the 'mapping_pinned' flag; resowner == NULL now indicates that the mapping is pinned. This is arguably a bug fix, but I'm not backpatching because it doesn't seem to be a live bug in the back branches. In 'master', it is a bug because commit b8bff07daa made ResourceOwners more strict so that you are no longer allowed to remember new resources in a ResourceOwner after you have started to release it. Merely accessing a dsa pointer might need to attach a new DSM segment, and before this commit it was temporarily remembered in the current owner for a very brief period even if the DSA was pinned. And that could happen in AtEOXact_PgStat(), which is called after the owner is already released. Reported-by: Alexander Lakhin Reviewed-by: Alexander Lakhin, Thomas Munro, Andres Freund Discussion: https://www.postgresql.org/message-id/11b70743-c5f3-3910-8e5b-dd6c115ff829%40gmail.com
2023-11-14Support +/- infinity in the interval data type.Dean Rasheed
This adds support for infinity to the interval data type, using the same input/output representation as the other date/time data types that support infinity. This allows various arithmetic operations on infinite dates, timestamps and intervals. The new values are represented by setting all fields of the interval to INT32/64_MIN for -infinity, and INT32/64_MAX for +infinity. This ensures that they compare as less/greater than all other interval values, without the need for any special-case comparison code. Note that, since those 2 values were formerly accepted as legal finite intervals, pg_upgrade and dump/restore from an old database will turn them from finite to infinite intervals. That seems OK, since those exact values should be extremely rare in practice, and they are outside the documented range supported by the interval type, which gives us a certain amount of leeway. Bump catalog version. Joseph Koshakow, Jian He, and Ashutosh Bapat, reviewed by me. Discussion: https://postgr.es/m/CAAvxfHea4%2BsPybKK7agDYOMo9N-Z3J6ZXf3BOM79pFsFNcRjwA%40mail.gmail.com
2023-11-14Replace Gen_dummy_probes.sed with Gen_dummy_probes.plPeter Eisentraut
To generate a dummy probes.h file when dtrace is not available, we had two different scripts: A sed version, which is the original version, and a Perl version, which was generated by s2p. This split was necessary because Perl was not a mandatory build dependency on Unix, but sed was not guaranteed to be available on Windows. (The Meson build system used the sed version even on Windows, which was probably incorrect and probably would have had to be fixed before elevating that build system from experimental status.) As of 721856ff24, Perl is a required build dependency, so this split is no longer necessary. We can just use the Perl script in all build environments and remove a whole bunch of infrastructure to keep the two variants in sync. The new Gen_dummy_probes.pl is not the version generated by s2p but a new implementation written by hand by adapting the sed version to Perl syntax. Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://www.postgresql.org/message-id/3fd0f1bc-4483-4ba9-8aa0-64765b052039@eisentraut.org
2023-11-13Improve readability and error detection of array_in().Tom Lane
Rewrite array_in() and its subroutines so that we make only one pass over the input text, rather than two. This requires potentially re-pallocing the working arrays values[] and nulls[] larger than our initial guess, but that cost will hopefully be made up by avoiding duplicate parsing. In any case this coding seems much clearer and more straightforward than what we had before. This also fixes array_in() to reject non-rectangular input (that is, different brace depths in different parts of the input) more reliably than before, and to give a better error message when it does so. This is analogous to the plpython and plperl fixes in 0553528e7 and f47004add. Like those PLs, we now accept input such as '{{},{}}' as a valid representation of an empty array, which we did not before. Additionally, reject explicit array subscripts that are outside the integer range (previously you just got whatever atoi() converted them to), and make some other minor improvements in error reporting. Although this is arguably a bug fix, it's also a behavioral change that might trip somebody up, so no back-patch. Tom Lane, Heikki Linnakangas, and Jian He. Thanks to Alexander Lakhin for the initial report and for review/testing. Discussion: https://postgr.es/m/2794005.1683042087@sss.pgh.pa.us
2023-11-12Add ability to reset all shared stats types in pg_stat_reset_shared()Michael Paquier
Currently, pg_stat_reset_shared() can use an argument to specify the target of statistics to reset, doing nothing for NULL as it is strict. This patch adds to pg_stat_reset_shared() the possibility to reset all the stats types already handled in this function rather than do nothing if the argument value given is NULL or if nothing is specified (proisstrict is switched to false). Like previously, SLRUs are not included in what gets reset. The idea to use NULL or no argument to control if all the shared stats already covered by this function should be reset has been proposed by Andres Freund. Bump catalog version. Author: Atsushi Torikoshi Reviewed-by: Kyotaro Horiguchi, Michael Paquier, Bharath Rupireddy, Matthias van de Meent Discussion: https://postgr.es/m/4291a55137ddda77cf7cc5f46e846daf@oss.nttdata.com
2023-11-10Prohibit max_slot_wal_keep_size to value other than -1 during upgrade.Amit Kapila
We don't want existing slots in the old cluster to get invalidated during the upgrade. During an upgrade, we set this variable to -1 via the command line in an attempt to prevent such invalidations, but users have ways to override it. This patch ensures the value is not overridden by the user. Author: Kyotaro Horiguchi Reviewed-by: Peter Smith, Alvaro Herrera Discussion: http://postgr.es/m/20231027.115759.2206827438943188717.horikyota.ntt@gmail.com
2023-11-09Avoid integer overflow hazard in interval_time().Dean Rasheed
When casting an interval to a time, the original code suffered from 64-bit integer overflow for inputs with a sufficiently large negative "time" field, leading to bogus results. Fix by rewriting the algorithm in a simpler form, that more obviously cannot overflow. While at it, improve the test coverage to include negative interval inputs. Discussion: https://postgr.es/m/CAEZATCXoUKHkcuq4q63hkiPsKZJd0kZWzgKtU%2BNT0aU4wbf_Pw%40mail.gmail.com
2023-11-10Ensure we use the correct spelling of "ensure"David Rowley
We seem to have accidentally used "insure" in a few places. Correct that. Author: Peter Smith Discussion: https://postgr.es/m/CAHut+Pv0biqrhA3pMhu40aDsj343mTsD75khKnHsLqR8P04f=Q@mail.gmail.com Backpatch-through: 12, oldest supported version
2023-11-09Fix bug in the new ResourceOwner implementation.Heikki Linnakangas
When the hash table is in use, ResoureOwnerSort() moves any elements from the small fixed-size array to the hash table, and sorts it. When the hash table is not in use, it sorts the elements in the small fixed-size array directly. However, ResourceOwnerSort() and ResourceOwnerReleaseAll() had different idea on when the hash table is in use: ResourceOwnerSort() checked owner->nhash != 0, and ResourceOwnerReleaseAll() checked owner->hash != NULL. If the hash table was allocated but was currently empty, you hit an assertion failure. Reported-by: Alexander Lakhin <exclusion@gmail.com> Discussion: https://www.postgresql.org/message-id/be58d565-9e95-d417-4e47-f6bd408dea4b@gmail.com
2023-11-08Use a faster hash function in resource owners.Heikki Linnakangas
This buys back some of the performance loss that we otherwise saw from the previous commit. Reviewed-by: Aleksander Alekseev, Michael Paquier, Julien Rouhaud Reviewed-by: Kyotaro Horiguchi, Hayato Kuroda, Álvaro Herrera, Zhihong Yu Reviewed-by: Peter Eisentraut, Andres Freund Discussion: https://www.postgresql.org/message-id/d746cead-a1ef-7efe-fb47-933311e876a3%40iki.fi
2023-11-08Make ResourceOwners more easily extensible.Heikki Linnakangas
Instead of having a separate array/hash for each resource kind, use a single array and hash to hold all kinds of resources. This makes it possible to introduce new resource "kinds" without having to modify the ResourceOwnerData struct. In particular, this makes it possible for extensions to register custom resource kinds. The old approach was to have a small array of resources of each kind, and if it fills up, switch to a hash table. The new approach also uses an array and a hash, but now the array and the hash are used at the same time. The array is used to hold the recently added resources, and when it fills up, they are moved to the hash. This keeps the access to recent entries fast, even when there are a lot of long-held resources. All the resource-specific ResourceOwnerEnlarge*(), ResourceOwnerRemember*(), and ResourceOwnerForget*() functions have been replaced with three generic functions that take resource kind as argument. For convenience, we still define resource-specific wrapper macros around the generic functions with the old names, but they are now defined in the source files that use those resource kinds. The release callback no longer needs to call ResourceOwnerForget on the resource being released. ResourceOwnerRelease unregisters the resource from the owner before calling the callback. That needed some changes in bufmgr.c and some other files, where releasing the resources previously always called ResourceOwnerForget. Each resource kind specifies a release priority, and ResourceOwnerReleaseAll releases the resources in priority order. To make that possible, we have to restrict what you can do between phases. After calling ResourceOwnerRelease(), you are no longer allowed to remember any more resources in it or to forget any previously remembered resources by calling ResourceOwnerForget. There was one case where that was done previously. At subtransaction commit, AtEOSubXact_Inval() would handle the invalidation messages and call RelationFlushRelation(), which temporarily increased the reference count on the relation being flushed. We now switch to the parent subtransaction's resource owner before calling AtEOSubXact_Inval(), so that there is a valid ResourceOwner to temporarily hold that relcache reference. Other end-of-xact routines make similar calls to AtEOXact_Inval() between release phases, but I didn't see any regression test failures from those, so I'm not sure if they could reach a codepath that needs remembering extra resources. There were two exceptions to how the resource leak WARNINGs on commit were printed previously: llvmjit silently released the context without printing the warning, and a leaked buffer io triggered a PANIC. Now everything prints a WARNING, including those cases. Add tests in src/test/modules/test_resowner. Reviewed-by: Aleksander Alekseev, Michael Paquier, Julien Rouhaud Reviewed-by: Kyotaro Horiguchi, Hayato Kuroda, Álvaro Herrera, Zhihong Yu Reviewed-by: Peter Eisentraut, Andres Freund Discussion: https://www.postgresql.org/message-id/cbfabeb0-cd3c-e951-a572-19b365ed314d%40iki.fi
2023-11-08Move a few ResourceOwnerEnlarge() calls for safety and clarity.Heikki Linnakangas
These are functions where a lot of things happen between the ResourceOwnerEnlarge and ResourceOwnerRemember calls. It's important that there are no unrelated ResourceOwnerRemember calls in the code in between, otherwise the reserved entry might be used up by the intervening ResourceOwnerRemember and not be available at the intended ResourceOwnerRemember call anymore. I don't see any bugs here, but the longer the code path between the calls is, the harder it is to verify. In bufmgr.c, there is a function similar to ResourceOwnerEnlarge, ReservePrivateRefCountEntry(), to ensure that the private refcount array has enough space. The ReservePrivateRefCountEntry() calls were made at different places than the ResourceOwnerEnlargeBuffers() calls. Move the ResourceOwnerEnlargeBuffers() and ReservePrivateRefCountEntry() calls together for consistency. Reviewed-by: Aleksander Alekseev, Michael Paquier, Julien Rouhaud Reviewed-by: Kyotaro Horiguchi, Hayato Kuroda, Álvaro Herrera, Zhihong Yu Reviewed-by: Peter Eisentraut, Andres Freund Discussion: https://www.postgresql.org/message-id/cbfabeb0-cd3c-e951-a572-19b365ed314d%40iki.fi
2023-11-07Stop including parsenodes.h in plannodes.hAlvaro Herrera
I added it by mistake in commit 7103ebb7aae8. To clean up, struct MergeAction needs to be moved to primnodes.h from parsenodes.h. (This forces us to also move OverridingKind to primnodes.h). Having to add parsenodes.h to bootstrap.h as fallout is a bit surprising, since nothing nominally needs it there. However, per comments in bootscanner.l, it is needed so that YYSTYPE can be declared. I think this only started with commit dac048f71ebb, but I didn't actually verify that. In passing, stop including parsenodes.h in tcopprot.h. Nothing needs it there. Per discussion on a patch by Ashutosh Bapat. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/202311071106.6y7b2ascqjlz@alvherre.pgsql
2023-11-07Reorder two functions in inval.cMichael Paquier
This file separates public and static functions with a separator comment, but two routines were not defined in a location reflecting that, so reorder them. Author: Aleksander Alekseev Reviewed-by: Álvaro Herrera, Michael Paquier Discussion: https://postgr.es/m/CAJ7c6TMX2dd0g91UKvcC+CVygKQYJkKJq1+ZzT4rOK42+b53=w@mail.gmail.com
2023-11-06Detect integer overflow while computing new array dimensions.Tom Lane
array_set_element() and related functions allow an array to be enlarged by assigning to subscripts outside the current array bounds. While these places were careful to check that the new bounds are allowable, they neglected to consider the risk of integer overflow in computing the new bounds. In edge cases, we could compute new bounds that are invalid but get past the subsequent checks, allowing bad things to happen. Memory stomps that are potentially exploitable for arbitrary code execution are possible, and so is disclosure of server memory. To fix, perform the hazardous computations using overflow-detecting arithmetic routines, which fortunately exist in all still-supported branches. The test cases added for this generate (after patching) errors that mention the value of MaxArraySize, which is platform-dependent. Rather than introduce multiple expected-files, use psql's VERBOSITY parameter to suppress the printing of the message text. v11 psql lacks that parameter, so omit the tests in that branch. Our thanks to Pedro Gallegos for reporting this problem. Security: CVE-2023-5869
2023-11-06Remove distprepPeter Eisentraut
A PostgreSQL release tarball contains a number of prebuilt files, in particular files produced by bison, flex, perl, and well as html and man documentation. We have done this consistent with established practice at the time to not require these tools for building from a tarball. Some of these tools were hard to get, or get the right version of, from time to time, and shipping the prebuilt output was a convenience to users. Now this has at least two problems: One, we have to make the build system(s) work in two modes: Building from a git checkout and building from a tarball. This is pretty complicated, but it works so far for autoconf/make. It does not currently work for meson; you can currently only build with meson from a git checkout. Making meson builds work from a tarball seems very difficult or impossible. One particular problem is that since meson requires a separate build directory, we cannot make the build update files like gram.h in the source tree. So if you were to build from a tarball and update gram.y, you will have a gram.h in the source tree and one in the build tree, but the way things work is that the compiler will always use the one in the source tree. So you cannot, for example, make any gram.y changes when building from a tarball. This seems impossible to fix in a non-horrible way. Second, there is increased interest nowadays in precisely tracking the origin of software. We can reasonably track contributions into the git tree, and users can reasonably track the path from a tarball to packages and downloads and installs. But what happens between the git tree and the tarball is obscure and in some cases non-reproducible. The solution for both of these issues is to get rid of the step that adds prebuilt files to the tarball. The tarball now only contains what is in the git tree (*). Getting the additional build dependencies is no longer a problem nowadays, and the complications to keep these dual build modes working are significant. And of course we want to get the meson build system working universally. This commit removes the make distprep target altogether. The make dist target continues to do its job, it just doesn't call distprep anymore. (*) - The tarball also contains the INSTALL file that is built at make dist time, but not by distprep. This is unchanged for now. The make maintainer-clean target, whose job it is to remove the prebuilt files in addition to what make distclean does, is now just an alias to make distprep. (In practice, it is probably obsolete given that git clean is available.) The following programs are now hard build requirements in configure (they were already required by meson.build): - bison - flex - perl Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/flat/e07408d9-e5f2-d9fd-5672-f53354e9305e@eisentraut.org
2023-11-06Set GUC "is_superuser" in all processes that set AuthenticatedUserId.Noah Misch
It was always false in single-user mode, in autovacuum workers, and in background workers. This had no specifically-identified security consequences, but non-core code or future work might make it security-relevant. Back-patch to v11 (all supported versions). Jelte Fennema-Nio. Reported by Jelte Fennema-Nio.
2023-11-06Add XMLText function (SQL/XML X038)Daniel Gustafsson
This function implements the standard XMLTest function, which converts text into xml text nodes. It uses the libxml2 function xmlEncodeSpecialChars to escape predefined entities (&"<>), so that those do not cause any conflict when concatenating the text node output with existing xml documents. This also adds a note in features.sgml about not supporting XML(SEQUENCE). The SQL specification defines a RETURNING clause to a set of XML functions, where RETURNING CONTENT or RETURNING SEQUENCE can be defined. Since PostgreSQL doesn't support XML(SEQUENCE) all of these functions operate with an implicit RETURNING CONTENT. Author: Jim Jones <jim.jones@uni-muenster.de> Reviewed-by: Vik Fearing <vik@postgresfriends.org> Discussion: https://postgr.es/m/86617a66-ec95-581f-8d54-08059cca8885@uni-muenster.de
2023-11-02Make GetConfigOption/GetConfigOptionResetString return "" for NULL.Tom Lane
As per the preceding commit, GUC APIs generally expose NULL-valued string variables as empty strings. Extend that policy to GetConfigOption() and GetConfigOptionResetString(), eliminating a crash hazard for unwary callers, as well as a fundamental ambiguity in GetConfigOption()'s API. No back-patch, since this is an API change and conceivably somebody somewhere is depending on this corner case. Xing Guo, Aleksander Alekseev, Tom Lane Discussion: https://postgr.es/m/CACpMh+AyDx5YUpPaAgzVwC1d8zfOL4JoD-uyFDnNSa1z0EsDQQ@mail.gmail.com
2023-11-02Be more wary about NULL values for GUC string variables.Tom Lane
get_explain_guc_options() crashed if a string GUC marked GUC_EXPLAIN has a NULL boot_val. Nosing around found a couple of other places that seemed insufficiently cautious about NULL string values, although those are likely unreachable in practice. Add some commentary defining the expectations for NULL values of string variables, in hopes of forestalling future additions of more such bugs. Xing Guo, Aleksander Alekseev, Tom Lane Discussion: https://postgr.es/m/CACpMh+AyDx5YUpPaAgzVwC1d8zfOL4JoD-uyFDnNSa1z0EsDQQ@mail.gmail.com
2023-11-01Additional unicode primitive functions.Jeff Davis
Introduce unicode_version(), icu_unicode_version(), and unicode_assigned(). The latter requires introducing a new lookup table for the Unicode General Category, which is generated along with the other Unicode lookup tables. Discussion: https://postgr.es/m/CA+TgmoYzYR-yhU6k1XFCADeyj=Oyz2PkVsa3iKv+keM8wp-F_A@mail.gmail.com Reviewed-by: Peter Eisentraut
2023-11-01Fix function name in commentDaniel Gustafsson
The name of the function resulting from the macro expansion was incorrectly stated. Backpatch to 16 where it was introduced. Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/20231101.172308.1740861597185391383.horikyota.ntt@gmail.com Backpatch-through: v16
2023-10-31doc: 1-byte varlena headers can be used for user PLAIN storageBruce Momjian
This also updates some C comments. Reported-by: suchithjn22@gmail.com Discussion: https://postgr.es/m/167336599095.2667301.15497893107226841625@wrigleys.postgresql.org Author: Laurenz Albe (doc patch) Backpatch-through: 11
2023-10-31improve alignment of postgresql.conf commentsBruce Momjian
Discussion: https://postgr.es/m/CAHut+Ps5MdQ1b4jp9rd63zfE2X25mV58y1W+hm2v53svtGDxBQ@mail.gmail.com Author: Peter Smith Backpatch-through: master
2023-10-30Introduce pg_stat_checkpointerMichael Paquier
Historically, the statistics of the checkpointer have been always part of pg_stat_bgwriter. This commit removes a few columns from pg_stat_bgwriter, and introduces pg_stat_checkpointer with equivalent, renamed columns (plus a new one for the reset timestamp): - checkpoints_timed -> num_timed - checkpoints_req -> num_requested - checkpoint_write_time -> write_time - checkpoint_sync_time -> sync_time - buffers_checkpoint -> buffers_written The fields of PgStat_CheckpointerStats and its SQL functions are renamed to match with the new field names, for consistency. Note that background writer and checkpointer have been split into two different processes in commits 806a2aee3791 and bf405ba8e460. The pgstat structures were already split, making this change straight-forward. Bump catalog version. Author: Bharath Rupireddy Reviewed-by: Bertrand Drouvot, Andres Freund, Michael Paquier Discussion: https://postgr.es/m/CALj2ACVxX2ii=66RypXRweZe2EsBRiPMj0aHfRfHUeXJcC7kHg@mail.gmail.com
2023-10-30Refactor some code related to transaction-level statistics for relationsMichael Paquier
This commit refactors find_tabstat_entry() so as transaction counters for inserted, updated and deleted tuples are included in the result returned. If a shared entry is found for a relation, its result is now a copy of the PgStat_TableStatus entry retrieved from shared memory. This idea has been proposed by Andres Freund. While on it, the following SQL functions, used in system views, are refactored with macros, in the same spirit as 83a1a1b56645, reducing the amount of code: - pg_stat_get_xact_tuples_deleted() - pg_stat_get_xact_tuples_inserted() - pg_stat_get_xact_tuples_updated() There is now only one caller of find_tabstat_entry() in the tree. Author: Bertrand Drouvot Discussion: https://postgr.es/m/b9e1f543-ee93-8168-d530-d961708ad9d3@gmail.com