user/sven/postgresql.git

Age	Commit message (Collapse)	Author
2025-12-09	Add started_by column to pg_stat_progress_analyze view.	Masahiko Sawada
	The new column, started_by, indicates the initiator of the analyze ('manual' or 'autovacuum'), helping users and monitoring tools to better understand ANALYZE behavior. Bump catalog version. Author: Shinya Kato <shinya11.kato@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Yu Wang <wangyu_runtime@163.com> Discussion: https://postgr.es/m/CAA5RZ0suoicwxFeK_eDkUrzF7s0BVTaE7M%2BehCpYcCk5wiECpw%40mail.gmail.com
2025-12-09	Add mode and started_by columns to pg_stat_progress_vacuum view.	Masahiko Sawada
	The new columns, mode and started_by, indicate the vacuum mode ('normal', 'aggressive', or 'failsafe') and the initiator of the vacuum ('manual', 'autovacuum', or 'autovacuum_wraparound'), respectively. This allows users and monitoring tools to better understand VACUUM behavior. Bump catalog version. Author: Shinya Kato <shinya11.kato@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Robert Treat <rob@xzilla.net> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Yu Wang <wangyu_runtime@163.com> Discussion: https://postgr.es/m/CAOzEurQcOY-OBL_ouEVfEaFqe_md3vB5pXjR_m6L71Dcp1JKCQ@mail.gmail.com
2025-12-09	Support "j" length modifier in snprintf.c.	Tom Lane
	POSIX has for a long time defined the "j" length modifier for printf conversions as meaning the size of intmax_t or uintmax_t. We got away without supporting that so far, because we were not using intmax_t anywhere. However, commit e6be84356 re-introduced upstream's use of intmax_t and PRIdMAX into zic.c. It emerges that on some platforms (at least FreeBSD and macOS), <inttypes.h> defines PRIdMAX as "jd", so that snprintf.c falls over if that is used. (We hadn't noticed yet because it would only be apparent if bad data is fed to zic, resulting in an error report, and even then the only visible symptom is a missing line number in the error message.) We could revert that decision from our copy of zic.c, but on the whole it seems better to update snprintf.c to support this standard modifier. There might well be extensions, now or in future, that expect it to work. I did this in the lazy man's way of translating "j" to either "l" or "ll" depending on a compile-time sizeof() check, just as was done long ago to support "z" for size_t. One could imagine promoting intmax_t to have full support in snprintf.c, for example converting fmtint()'s value argument and internal arithmetic to use [u]intmax_t not [unsigned] long long. But that'd be more work and I'm hesitant to do it anyway: if there are any platforms out there where intmax_t is actually wider than "long long", this would doubtless result in a noticeable speed penalty to snprintf(). Let's not go there until we have positive evidence that there's a reason to, and some way to measure what size of penalty we're taking. Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/3210703.1765236740@sss.pgh.pa.us
2025-12-09	Add wait event for the group commit delay before WAL flush	Heikki Linnakangas
	Author: Rafia Sabih <rafia.pghackers@gmail.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Discussion: https://www.postgresql.org/message-id/CA%2BFpmFf-hWXtrC0Q3Cr_Xo78zuP_M_VC5xgWPOYOkwqOD0T8eg@mail.gmail.com
2025-12-09	Fix warning about wrong format specifier for off_t type	Heikki Linnakangas
	Per OS X buildfarm members.
2025-12-09	Widen MultiXactOffset to 64 bits	Heikki Linnakangas
	This eliminates MultiXactOffset wraparound and the 2^32 limit on the total number of multixid members. Multixids are still limited to 2^31, but this is a nice improvement because 'members' can grow much faster than the number of multixids. On such systems, you can now run longer before hitting hard limits or triggering anti-wraparound vacuums. Not having to deal with MultiXactOffset wraparound also simplifies the code and removes some gnarly corner cases. We no longer need to perform emergency anti-wraparound freezing because of running out of 'members' space, so the offset stop limit is gone. But you might still not want 'members' to consume huge amounts of disk space. For that reason, I kept the logic for lowering vacuum's multixid freezing cutoff if a large amount of 'members' space is used. The thresholds for that are roughly the same as the "safe" and "danger" thresholds used before, 2 billion transactions and 4 billion transactions. This keeps the behavior for the freeze cutoff roughly the same as before. It might make sense to make this smarter or configurable, now that the threshold is only needed to manage disk usage, but that's left for the future. Add code to pg_upgrade to convert multitransactions from the old to the new format, rewriting the pg_multixact SLRU files. Because pg_upgrade now rewrites the files, we can get rid of some hacks we had put in place to deal with old bugs and upgraded clusters. Bump catalog version for the pg_multixact/offsets format change. Author: Maxim Orlov <orlovmg@gmail.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com> Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com> Discussion: https://www.postgresql.org/message-id/CACG%3DezaWg7_nt-8ey4aKv2w9LcuLthHknwCawmBgEeTnJrJTcw@mail.gmail.com
2025-12-09	Move pg_multixact SLRU page format definitions to a separate header	Heikki Linnakangas
	This makes them accessible from pg_upgrade, needed by the next commit. I'm doing this mechanical move as a separate commit to make the next commit's changes to these definitions more obvious. Author: Maxim Orlov <orlovmg@gmail.com> Discussion: https://www.postgresql.org/message-id/CACG%3DezbZo_3_fnx%3DS5BfepwRftzrpJ%2B7WET4EkTU6wnjDTsnjg@mail.gmail.com
2025-12-09	Fix distinctness check for queries with grouping sets	Richard Guo
	query_is_distinct_for() is intended to determine whether a query never returns duplicates of the specified columns. For queries using grouping sets, if there are no grouping expressions, the query may contain one or more empty grouping sets. The goal is to detect whether there is exactly one empty grouping set, in which case the query would return a single row and thus be distinct. The previous logic in query_is_distinct_for() was incomplete because the check was insufficiently thorough and could return false when it could have returned true. It failed to consider cases where the DISTINCT clause is used on the GROUP BY, in which case duplicate empty grouping sets are removed, leaving only one. It also did not correctly handle all possible structures of GroupingSet nodes that represent a single empty grouping set. To fix, add a check for the groupDistinct flag, and expand the query's groupingSets tree into a flat list, then verify that the expanded list contains only one element. No backpatch as this could result in plan changes. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAMbWs480Z04NtP8-O55uROq2Zego309+h3hhaZhz6ztmgWLEBw@mail.gmail.com
2025-12-09	Fix const-simplification for index expressions and predicate	Richard Guo
	Similar to the issue with constraint and statistics expressions fixed in 317c117d6, index expressions and predicate can also suffer from incorrect reduction of NullTest clauses during const-simplification, due to unfixed varnos and the use of a NULL root. It has been reported that this issue can cause the planner to fail to pick up a partial index that it previously matched successfully. Because we need to cache the const-simplified index expressions and predicate in the relcache entry, we cannot fix the Vars before applying eval_const_expressions. To ensure proper reduction of NullTest clauses, this patch runs eval_const_expressions a second time -- after the Vars have been fixed and with a valid root. It could be argued that the additional call to eval_const_expressions might increase planning time, but I don't think that's a concern. It only runs when index expressions and predicate are present; it is relatively cheap when run on small expression trees (which is typically the case for index expressions and predicate), and it runs on expressions that have already been const-simplified once, making the second pass even cheaper. In return, in cases like the one reported, it allows the planner to match and use partial indexes, which can lead to significant execution-time improvements. Bug: #19007 Reported-by: Bryan Fox <bryfox@gmail.com> Author: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/19007-4cc6e252ed8aa54a@postgresql.org
2025-12-09	Fix LOCK_TIMEOUT handling in slotsync worker.	Amit Kapila
	Previously, the slotsync worker relied on SIGINT for graceful shutdown during promotion. However, SIGINT is also used by the LOCK_TIMEOUT handler to cancel queries. Since the slotsync worker can lock catalog tables while parsing libpq tuples, this overlap caused it to ignore LOCK_TIMEOUT signals and potentially wait indefinitely on locks. This patch replaces the slotsync worker's SIGINT handler with StatementCancelHandler to correctly process query-cancel interrupts. Additionally, the startup process now uses SIGUSR1 to signal the slotsync worker to stop during promotion. The worker exits after detecting that the shared memory flag stopSignaled is set. Author: Hou Zhijie <houzj.fnst@fujitsu.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Backpatch-through: 17, here it was introduced Discussion: https://postgr.es/m/TY4PR01MB169078F33846E9568412D878C94A2A@TY4PR01MB16907.jpnprd01.prod.outlook.com
2025-12-09	Remove useless casts in format arguments	Peter Eisentraut
	There were a number of useless casts in format arguments, either where the input to the cast was already in the right type, or seemingly uselessly casting between types instead of just using the right format placeholder to begin with. Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/07fa29f9-42d7-4aac-8834-197918cbbab6%40eisentraut.org
2025-12-09	Clean up int64-related format strings	Peter Eisentraut
	Remove some gratuitous uses of INT64_FORMAT. Make use of PRIu64/PRId64 were appropriate, remove unnecessary casts. Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/07fa29f9-42d7-4aac-8834-197918cbbab6%40eisentraut.org
2025-12-09	Remove unnecessary casts in printf format arguments (%zu/%zd)	Peter Eisentraut
	Many of these are probably left over from before use of %zu/%zd was portable. Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/07fa29f9-42d7-4aac-8834-197918cbbab6%40eisentraut.org
2025-12-09	Use palloc_object() and palloc_array() in more areas of the tree	Michael Paquier
	The idea is to encourage more the use of these new routines across the tree, as these offer stronger type safety guarantees than palloc(). The following paths are included in this batch, treating all the areas proposed by the author for the most trivial changes, except src/backend (by far the largest batch): src/bin/ src/common/ src/fe_utils/ src/include/ src/pl/ src/test/ src/tutorial/ Similar work has been done in 31d3847a37be. The code compiles the same before and after this commit, with the following exceptions due to changes in line numbers because some of the new allocation formulas are shorter: blkreftable.c pgfnames.c pl_exec.c Author: David Geier <geidav.pg@gmail.com> Discussion: https://postgr.es/m/ad0748d4-3080-436e-b0bc-ac8f86a3466a@gmail.com
2025-12-08	Improve documentation for pg_atomic_unlocked_write_u32()	Andres Freund
	After my recent commit 7902a47c20b, Nathan noticed that pg_atomic_unlocked_write_u64() was not accurately described by the comments for the 32bit version. Turns out the 32bit version has suffered from copy-and-paste-itis since its introduction. Fix. Reported-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://postgr.es/m/aTGt7q4Jvn97uGAx@nathan
2025-12-09	Doc: fix typo in hash index documentation	David Rowley
	Plus a similar fix to the README. Backpatch as far back as the sgml issue exists. The README issue does exist in v14, but that seems unlikely to harm anyone. Author: David Geier <geidav.pg@gmail.com> Discussion: https://postgr.es/m/ed3db7ea-55b4-4809-86af-81ad3bb2c7d3@gmail.com Backpatch-through: 15
2025-12-09	libpq: Refactor logic checking for exit() in shared library builds	Michael Paquier
	This commit refactors the sanity check done by libpq to ensure that there is no exit() reference in the build, moving the check from a standalone Makefile rule to a perl script. Platform-specific checks are now part of the script, avoiding most of the duplication created by the introduction of this check for meson, but not all of them: - Solaris and Windows skipped in the script. - Whitelist of symbols is in the script. - nm availability, with its path given as an option of the script. Its execution is checked in the script. - Check is disabled if coverage reports are enabled. This part is not pushed down to the script. - Check is disabled for static builds of libpq. This part is filtered out in each build script. A trick is required for the stamp file, in the shape of an optional argument that can be given to the script. Meson expects the stamp in output and uses this argument, generating the stamp file in the script. Meson is able to handle the removal of the stamp file internally when libpq needs to be rebuilt and the check done again. This refactoring piece has come up while discussing the addition of more items in the symbols considered as acceptable. This sanity check has never been run by meson since its introduction in dc227eb82ea8, so it is possible that this fails in some of the buildfarm members. At least the CI is happy with it, but let's see how it goes. Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Co-authored-by: VASUKI M <vasukim1992002@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/19095-6d8256d0c37d4be2@postgresql.org
2025-12-08	Fix minor portability issue in pg_resetwal.c.	Tom Lane
	The argument of isspace() (like other <ctype.h> functions) must be cast to unsigned char to ensure portable results. Per NetBSD buildfarm members. Oversight in 636c1914b.
2025-12-08	Avoid pointer chasing in _bt_readpage inner loop.	Peter Geoghegan
	Make _bt_readpage pass down the current scan direction to various utility functions within its pstate variable. Also have _bt_readpage work off of a local copy of scan->ignore_killed_tuples within its per-tuple loop (rather than using scan->ignore_killed_tuples directly). Testing has shown that this significantly benefits large range scans, which are naturally able to take full advantage of the pstate.startikey optimization added by commit 8a510275. Running a pgbench script with a "SELECT abalance FROM pgbench_accounts WHERE aid BETWEEN ..." query shows an increase in transaction throughput of over 5%. There also appears to be a small performance benefit when running pgbench's built-in select-only script. Follow-up to commit 65d6acbc. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Victor Yegorov <vyegorov@gmail.com> Discussion: https://postgr.es/m/CAH2-WzmwMwcwKFgaf+mYPwiz3iL4AqpXnwtW_O0vqpWPXRom9Q@mail.gmail.com
2025-12-08	Unify some more messages	Álvaro Herrera
	No backpatch here because of message wording changes. Author: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://postgr.es/m/202512081537.ahw5gwoencou@alvherre.pgsql
2025-12-08	Relocate _bt_readpage and related functions.	Peter Geoghegan
	Quite a bit of code within nbtutils.c is only called by _bt_readpage. Move _bt_readpage and all of the nbtutils.c functions it depends on into a new .c file, nbtreadpage.c. Also reorder some of the functions within the new file for clarity. This commit has no functional impact. It is strictly mechanical. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Victor Yegorov <vyegorov@gmail.com> Discussion: https://postgr.es/m/CAH2-WzmwMwcwKFgaf+mYPwiz3iL4AqpXnwtW_O0vqpWPXRom9Q@mail.gmail.com
2025-12-08	Unify error messages	Álvaro Herrera
	No visible changes, just refactor how messages are constructed.
2025-12-08	pg_resetwal: Use separate flags for whether an option is given	Heikki Linnakangas
	Currently, we use special values that are otherwise invalid for each option to indicate "option was not given". Replace that with separate boolean variables for each option. It seems more clear to be explicit. We were already doing that for the -m option, because there were no invalid values for nextMulti that we could use (since commit 94939c5f3a). Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/81adf5f3-36ad-4bcd-9ba5-1b95c7b7a807@iki.fi
2025-12-08	pg_resetwal: Reject negative and out of range arguments	Heikki Linnakangas
	The strtoul() function that we used to parse many of the options accepts negative values, and silently wraps them to the equivalent unsigned values. For example, -1 becomes 0xFFFFFFFF, on platforms where unsigned long is 32 bits wide. Also, on platforms where "unsigned long" is 64 bits wide, we silently casted values larger than UINT32_MAX to the equivalent 32-bit value. Both of those behaviors seem undesirable, so tighten up the parsing to reject them. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/81adf5f3-36ad-4bcd-9ba5-1b95c7b7a807@iki.fi
2025-12-08	Make ecpg parse.pl more robust with braces	Peter Eisentraut
	When parse.pl processes braces, it does not take into account that braces could also be their own token if single quoted ('{', '}'). This is not currently used but a future patch wants to make use of it. This fixes that by using lookaround assertions to detect the quotes. To make sure all Perl versions in play support this and to avoid surprises later on, let's give this a spin on the buildfarm now. It can exist independently of future work. Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/a855795d-e697-4fa5-8698-d20122126567@eisentraut.org
2025-12-08	Use PGAlignedXLogBlock for some code simplification	Peter Eisentraut
	The code in BootStrapXLOG() and in pg_test_fsync.c tried to align WAL buffers in complicated ways. Also, they still used XLOG_BLCKSZ for the alignment, even though that should now be PG_IO_ALIGN_SIZE. This can now be simplified and made more consistent by using PGAlignedXLogBlock, either directly in BootStrapXLOG() and using alignas in pg_test_fsync.c. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/f462a175-b608-44a1-b428-bdf351e914f4%40eisentraut.org
2025-12-08	test_custom_stats: Test module for custom cumulative statistics	Michael Paquier
	This test module acts as a replacement that existed prior to d52c24b0f808 in the test module injection_points. It uses a more flexible structure than its ancestor: - Two libraries are built, one for fixed-sized stats and one for variable-sized stats. - No GUCs required. The stats are enabled only if one or both libraries are loaded with shared_preload_libraries. - Same kind IDs reserved: 25 (variable-sized) and 26 (fixed-sized) The goal of this redesign is to be able to easier extend the code coverage provided by this module for other changes that are currently under discussion, and injection_points was not suited for these. Injection points are also now widely used in the tree now, so extending more the test coverage for custom pgstats in the test module injection_points would be a riskier long-term move. The new code is mostly a copy of what existed previously in the test module injection_points, with the same callbacks defined for fixed-sized and variable-sized stats, but a simpler overall structure in terms of the stats counters updated. The test coverage should remain the same as previously: one TAP test is used to check data reports, crash recovery and clean restart scenarios. Tests are added for the manual reset of fixed-sized stats, something not tested until now. Author: Sami Imseih <samimseih@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAA5RZ0sJgO6GAwgFxmzg9MVP=rM7Us8KKcWpuqxe-f5qxmpE0g@mail.gmail.com
2025-12-08	Prevent invalidation of newly created replication slots.	Amit Kapila
	A race condition could cause a newly created replication slot to become invalidated between WAL reservation and a checkpoint. Previously, if the required WAL was removed, we retried the reservation process. However, the slot could still be invalidated before the retry if the WAL was not yet removed but the checkpoint advanced the redo pointer beyond the slot's intended restart LSN and computed the minimum LSN that needs to be preserved for the slots. The fix is to acquire an exclusive lock on ReplicationSlotAllocationLock during WAL reservation to serialize WAL reservation and checkpoint's minimum restart_lsn computation. This ensures that, if WAL reservation occurs first, the checkpoint waits until restart_lsn is updated before removing WAL. If the checkpoint runs first, subsequent WAL reservations pick a position at or after the latest checkpoint's redo pointer. We can't use the same fix for branch 17 and prior because commit 2090edc6f3 changed to compute to the minimum restart_LSN among slot's at the beginning of checkpoint (or restart point). The fix for 17 and prior branches is under discussion and will be committed separately. Reported-by: suyu.cmj <mengjuan.cmj@alibaba-inc.com> Author: Hou Zhijie <houzj.fnst@fujitsu.com> Reviewed-by: Vitaly Davydov <v.davydov@postgrespro.ru> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Backpatch-through: 18 Discussion: https://postgr.es/m/5e045179-236f-4f8f-84f1-0f2566ba784c.mengjuan.cmj@alibaba-inc.com
2025-12-08	injection_points: Remove portions related to custom pgstats	Michael Paquier
	The test module injection_points has been used as a landing spot to provide coverage for the custom pgstats APIs, for both fixed-sized and variable-sized stats kinds. Some recent work related to pgstats is proving that this structure makes the implementation of new tests harder. This commit removes the code related to pgstats from injection_points, and an equivalent will be reintroduced as a separate test module in a follow-up commit. This removal is done in its own commit for clarity. Using injection_points for this test coverage was perhaps not the best way to design things, but this was good enough while working on the first flavor of the custom pgstats APIs. Using a new test module will make easier the introduction of new tests, and we will not need to worry about the impact of new changes related to custom pgstats could have with the internals of injection_points. Author: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/CAA5RZ0sJgO6GAwgFxmzg9MVP=rM7Us8KKcWpuqxe-f5qxmpE0g@mail.gmail.com
2025-12-08	Improve error messages of input functions for pg_dependencies and pg_ndistinct	Michael Paquier
	The error details updated in this commit can be reached in the regression tests. They did not follow the project style, and they should be written them as full sentences. Some of the errors are switched to use an elog(), for cases that involve paths that cannot be reached based on the previous state of the parser processing the input data (array start, object end, etc.). The error messages for these cases use now a more consistent style across the board, with the state of the parser reported for debugging. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Author: Michael Paquier <michael@paquier.xyz> Co-authored-by: Corey Huinker <corey.huinker@gmail.com> Discussion: https://postgr.es/m/1353179.1764901790@sss.pgh.pa.us
2025-12-07	ecpg: refactor to eliminate cast-away-const in find_variable().	Tom Lane
	find_variable() and its subroutines transiently scribble on the passed-in "name" string, even though we've declared that "const". The string is in fact temporary, so this is not very harmful, but it's confusing and will produce compiler warnings with late-model gcc. Rearrange the code so that instead of modifying the given string, we make temporary copies of the parts that we need separated out. (I used loc_alloc so that the copies are short-lived and don't need to be freed explicitly.) This code is poorly structured and confusing, to the point where my first attempt to fix it was wrong. It is also under-tested, allowing the broken v1 patch to nonetheless pass regression. I'll restrain myself from rewriting it completely, and just add some comments and more test cases. We will probably want to back-patch this once gcc 15.2 becomes more widespread, but for now just put it in master. Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/1324889.1764886170@sss.pgh.pa.us
2025-12-07	Micro-optimize datatype conversions in datum_to_jsonb_internal.	Tom Lane
	The general case for converting to a JSONB numeric value is to run the source datatype's output function and then numeric_in, but we can do substantially better than that for integer and numeric source values. This patch improves the speed of jsonb_agg by 30% for integer input, and nearly 2X for numeric input. Sadly, the obvious idea of using float4_numeric and float8_numeric to speed up those cases doesn't work: they are actually slower than the generic coerce-via-I/O method, and not by a small amount. They might round off differently than this code has historically done, too. Leave that alone pending possible changes in those functions. We can also do better than the existing code for text/varchar/bpchar source data; this optimization is similar to one that already exists in the json_agg() code. That saves 20% or so for such inputs. Also make a couple of other minor improvements, such as not giving JSONTYPE_CAST its own special case outside the switch when it could perfectly well be handled inside, and not using dubious string hacking to detect infinity and NaN results. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: jian he <jian.universality@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/1060917.1753202222@sss.pgh.pa.us
2025-12-07	Remove fundamentally-redundant processing in jsonb_agg() et al.	Tom Lane
	The various variants of jsonb_agg() operate as follows, for each aggregate input value: 1. Build a JsonbValue tree representation of the input value. 2. Flatten the JsonbValue tree into a Jsonb in on-disk format. 3. Iterate through the Jsonb, building a JsonbValue that is part of the aggregate's state stored in aggcontext, but is otherwise identical to what phase 1 built. This is very slightly less silly than it sounds, because phase 1 involves calling non-JSONB code such as datatype output functions, which are likely to leak memory, and we don't want to leak into the aggcontext. Nonetheless, phases 2 and 3 are accomplishing exactly nothing that is useful if we can make phase 1 put the JsonbValue tree where we need it. We could probably do that with a bunch of MemoryContextSwitchTo's, but what seems more robust is to give pushJsonbValue the responsibility of building the JsonbValue tree in a specified non-current memory context. The previous patch created the infrastructure for that, and this patch simply makes the aggregate functions use it and then rips out phases 2 and 3. For me, this makes jsonb_agg() with a text column as input run about 2X faster than before. It's not yet on par with json_agg(), but this removes a whole lot of the difference. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: jian he <jian.universality@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/1060917.1753202222@sss.pgh.pa.us
2025-12-07	Revise APIs for pushJsonbValue() and associated routines.	Tom Lane
	Instead of passing "JsonbParseState **" to pushJsonbValue(), pass a pointer to a JsonbInState, which will contain the parseState stack pointer as well as other useful fields. Also, instead of returning a JsonbValue pointer that is often meaningless/ignored, return the top-level JsonbValue pointer in the "result" field of the JsonbInState. This involves a lot of (mostly mechanical) edits, but I think the results are notationally cleaner and easier to understand. Certainly the business with sometimes capturing the result of pushJsonbValue() and sometimes not was bug-prone and incapable of mechanical verification. In the new arrangement, JsonbInState.result remains null until we've completed a valid sequence of pushes, so that an incorrect sequence will result in a null-pointer dereference, not mistaken use of a partial result. However, this isn't simply an exercise in prettier notation. The real reason for doing it is to provide a mechanism whereby pushJsonbValue() can be told to construct the JsonbValue tree in a context that is not CurrentMemoryContext. That happens when a non-null "outcontext" is specified in the JsonbInState. No callers exercise that option in this patch, but the next patch in the series will make use of it. I tried to improve the comments in this area too. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: jian he <jian.universality@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/1060917.1753202222@sss.pgh.pa.us
2025-12-07	Add a macro for the declared typlen of type timetz.	Tom Lane
	pg_type.typlen says 12 for the size of timetz, but sizeof(TimeTzADT) will be 16 on most platforms due to alignment padding. Using the sizeof number is no problem for usages such as palloc'ing a result datum, but in usages such as datumCopy we really ought to match what pg_type says. Add a macro TIMETZ_TYPLEN so that we have a symbolic way to write that rather than hard-coding "12". I cannot find any place where we've needed this so far, but an upcoming patch requires it. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/2329959.1765047648@sss.pgh.pa.us
2025-12-06	Handle constant inputs to corr() and related aggregates more precisely.	Tom Lane
	The SQL standard says that corr() and friends should return NULL in the mathematically-undefined case where all the inputs in one of the columns have the same value. We were checking that by seeing if the sums Sxx and Syy were zero, but that approach is very vulnerable to roundoff error: if a sum is close to zero but not exactly that, we'd come out with a pretty silly non-NULL result. Instead, directly track whether the inputs are all equal by remembering the common value in each column. Once we detect that a new input is different from before, represent that by storing NaN for the common value. (An objection to this scheme is that if the inputs are all NaN, we will consider that they were not all equal. But under IEEE float arithmetic rules, one NaN is never equal to another, so this behavior is arguably correct. Moreover it matches what we did before in such cases.) Then, leave the sums at their exact value of zero for as long as we haven't detected different input values. This solution requires the aggregate transition state to contain 8 float values not 6, which is not problematic, and it seems to add less than 1% to the aggregates' runtime, which seems acceptable. While we're here, improve corr()'s final function to cope with overflow/underflow in the final calculation, and to clamp its result to [-1, 1] in case of roundoff error. Although this is arguably a bug fix, it requires a catversion bump due to the change in aggregates' initial states, so it can't be back-patched. Patch written by me, but many of the ideas are due to Dean Rasheed, who also did a deal of testing. Bug: #19340 Reported-by: Oleg Ivanov <o15611@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Co-authored-by: Dean Rasheed <dean.a.rasheed@gmail.com> Discussion: https://postgr.es/m/19340-6fb9f6637f562092@postgresql.org
2025-12-06	Improve error reporting of recovery test 027_stream_regress	Michael Paquier
	Previously, the 027_stream_regress test reported the full contents of regression.diffs upon a test failure, when the standby and the primary were still alive. If a test fails quite badly, the amount of information reported can be really high, bloating the reports in the buildfarm, the CI, or even local runs. In most cases, we have noticed that having all this information is not necessary when attempting to identify the source of a problem in this test. This commit changes the situation by including the head and tail of regression.diffs in the reports generated on failure rather than its full contents, building upon b93f4e2f98b3 to optionally control the size of the reports with the new environment variable PG_TEST_FILE_READ_LINES. This will perhaps require some more tuning, but the hope is to reduce some of the buildfarm report bloat while making the information good enough to deduce what is happening when something is going wrong, be it in the buildfarm or some tests run in the CI, at least. Suggested-by: Andres Freund <andres@anarazel.de> Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAN55FZ1D6KXvjSs7YGsDeadqCxNF3UUhjRAfforzzP0k-cE=bA@mail.gmail.com
2025-12-06	Add PostgreSQL::Test::Cluster::read_head_tail() helper to PostgreSQL/Utils.pm	Michael Paquier
	This function reads the lines from a file and filters its contents to report its head and tail contents. The amount of contents to read from a file can be tuned by the environment variable PG_TEST_FILE_READ_LINES, that can be used to override the default of 50 lines. If the file whose content is read has less lines than two times PG_TEST_FILE_READ_LINES, the whole file is returned. This will be used in a follow-up commit to limit the amount of information reported by some of the TAP tests on failure, where we have noticed that the contents reported by the buildfarm can be heavily bloated in some cases, with the head and tail contents of a report being able to provide enough information to be useful for debugging. Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Co-authored-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAN55FZ1D6KXvjSs7YGsDeadqCxNF3UUhjRAfforzzP0k-cE=bA@mail.gmail.com
2025-12-05	Fix text substring search for non-deterministic collations.	Tom Lane
	Due to an off-by-one error, the code failed to find matches at the end of the haystack. Fix by rewriting the loop. While at it, fix a comment that claimed that the function could find a zero-length match. Such a match could send a caller into an endless loop. However, zero-length matches only make sense with an empty search string, and that case is explicitly excluded by all callers. To make sure it stays that way, add an Assert and a comment. Bug: #19341 Reported-by: Adam Warland <adam.warland@infor.com> Author: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/19341-1d9a22915edfec58@postgresql.org Backpatch-through: 18
2025-12-05	Fix test to work with non-8kB block sizes	Heikki Linnakangas
	Author: Maxim Orlov <orlovmg@gmail.com> Discussion: https://www.postgresql.org/message-id/CACG%3Dezbtm%2BLOzEMyLX7rzGcAv3ez3F6nNpSJjvZeMzed0Oe6Pw%40mail.gmail.com
2025-12-05	Don't reset the pathlist of partitioned joinrels.	Robert Haas
	apply_scanjoin_target_to_paths wants to avoid useless work and platform-specific dependencies by throwing away the path list created prior to applying the final scan/join target and constructing a whole new one using the final scan/join target. However, this is only valid when we'll consider all the same strategies after the pathlist reset as before. After resetting the path list, we reconsider Append and MergeAppend paths with the modified target list; therefore, it's only valid for a partitioned relation. However, what the previous coding missed is that it cannot be a partitioned join relation, because that also has paths that are not Append or MergeAppend paths and will not be reconsidered. Thus, before this patch, we'd sometimes choose a partitionwise strategy with a higher total cost than cheapest non-partitionwise strategy, which is not good. We had a surprising number of tests cases that were relying on this bug to work as they did. A big part of the reason for this is that row counts in regression test cases tend to be low, which brings the cost of partitionwise and non-partitionwise strategies very close together, especially for merge joins, where the real and perceived advantages of a partitionwise approach are minimal. In addition, one test case included a row-count-inflating join. In such cases, a partitionwise join can easily be a loser on cost, because the total number of tuples passing through an Append node is much higher than it is with a non-partitionwise strategy. That test case is adjusted by adding additional join clauses to avoid the row count inflation. Although the failure of the planner to choose the lowest-cost path is a bug, we generally do not back-patch fixes of this type, because planning is not an exact science and there is always a possibility that some user will end up with a plan that has a lower estimated cost but actually runs more slowly. Hence, no backpatch here, either. The code change here is exactly what was originally proposed by Ashutosh, but the changes to the comments and test cases have been very heavily rewritten by me, helped along by some very useful advice from Richard Guo. Reported-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Author: Robert Haas <rhaas@postgresql.org> Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com> Reviewed-by: Arne Roland <arne.roland@malkut.net> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Discussion: http://postgr.es/m/CAExHW5toze58+jL-454J3ty11sqJyU13Sz5rJPQZDmASwZgWiA@mail.gmail.com
2025-12-05	Fix some cases of indirectly casting away const.	Tom Lane
	Newest versions of gcc are able to detect cases where code implicitly casts away const by assigning the result of strchr() or a similar function applied to a "const char " value to a target variable that's just "char ". This of course creates a hazard of not getting a compiler warning about scribbling on a string one was not supposed to, so fixing up such cases is good. This patch fixes a dozen or so places where we were doing that. Most are trivial additions of "const" to the target variable, since no actually-hazardous change was occurring. There is one place in ecpg.trailer where we were indeed violating the intention of not modifying a string passed in as "const char *". I believe that's harmless not a live bug, but let's fix it by copying the string before modifying it. There is a remaining trouble spot in ecpg/preproc/variable.c, which requires more complex surgery. I've left that out of this commit because I want to study that code a bit more first. We probably will want to back-patch this once compilers that detect this pattern get into wider circulation, but for now I'm just going to apply it to master to see what the buildfarm says. Thanks to Bertrand Drouvot for finding a couple more spots than I had. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/1324889.1764886170@sss.pgh.pa.us
2025-12-05	Stabilize tests some more	Álvaro Herrera
	Tests added by commits 90eae926abbb, 2bc7e886fc1b, bc32a12e0db2 have occasionally failed, depending on timing. Add some dependency markers to the spec to try and remove the instability. Author: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Discussion: https://postgr.es/m/202512041739.sgg3tb2yobe2@alvherre.pgsql
2025-12-05	Fix setting next multixid's offset at offset wraparound	Heikki Linnakangas
	In commit 789d65364c, we started updating the next multixid's offset too when recording a multixid, so that it can always be used to calculate the number of members. I got it wrong at offset wraparound: we need to skip over offset 0. Fix that. Discussion: https://www.postgresql.org/message-id/d9996478-389a-4340-8735-bfad456b313c@iki.fi Backpatch-through: 14
2025-12-05	Improve test output of extended statistics for ndistinct and dependencies	Michael Paquier
	Corey Huinker has come up with a recipe that is more compact and more pleasant to the eye for extended stats because we know that all of them are 1-dimension JSON arrays. This commit switches the extended stats tests to use replace() instead of jsonb_pretty(), splitting the data so as one line is used for each item in the extended stats object. This results in the removal of a good chunk of test output, that is now easier to debug with one line used for each item in a stats object. This patch has not been provided by Corey. This is some post-commit cleanup work that I have noticed as good enough to do on its own while reviewing the rest of the patch set Corey has posted. Discussion: https://postgr.es/m/CADkLM=csMd52i39Ye8-PUUHyzBb3546eSCUTh-FBQ7bzT2uZ4Q@mail.gmail.com
2025-12-05	Rename column slotsync_skip_at to slotsync_last_skip.	Amit Kapila
	Commit 76b78721ca introduced two new columns in pg_stat_replication_slots to improve monitoring of slot synchronization. One of these columns was named slotsync_skip_at, which is inconsistent with the naming convention used for similar columns in other system views. Columns that store timestamps of the most recent event typically use the 'last_' in the column name (e.g., last_autovacuum, checksum_last_failure). Renaming slotsync_skip_at to slotsync_last_skip aligns with this pattern, making the purpose of the column clearer and improving overall consistency across the views. Author: Shlok Kyal <shlok.kyal.oss@gmail.com> Reviewed-by: Michael Banck <mbanck@gmx.net> Discussion: https://postgr.es/m/20251128091552.GB13635@p46.dedyn.io;lightning.p46.dedyn.io Discussion: https://postgr.es/m/CAE9k0PkhfKrTEAsGz4DjOhEj1nQ+hbQVfvWUxNacD38ibW3a1g@mail.gmail.com
2025-12-05	Fix some compiler warnings	Michael Paquier
	Some of the buildfarm members with some old gcc versions have been complaining about an always-true test for a NULL pointer caused by a combination of SOFT_ERROR_OCCURRED() and a local ErrorSaveContext variable. These warnings are taken care of by removing SOFT_ERROR_OCCURRED(), switching to a direct variable check, like 56b1e88c804b. Oversights in e1405aa5e3ac and 44eba8f06e55. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/1341064.1764895052@sss.pgh.pa.us
2025-12-05	Show version of nodes in output of TAP tests	Michael Paquier
	This commit adds the version information of a node initialized by Cluster.pm, that may vary depending on the install_path given by the test. The code was written so as the node information, that includes the version number, was dumped before the version number was set. This is particularly useful for the pg_upgrade TAP tests, that may mix several versions for cross-version runs. The TAP infrastructure also allows mixing nodes with different versions, so this information can be useful for out-of-core tests. Backpatch down to v15, where Cluster.pm and the pg_upgrade TAP tests have been introduced. Author: Potapov Alexander <a.potapov@postgrespro.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/e59bb-692c0a80-5-6f987180@170377126 Backpatch-through: 15
2025-12-04	Suppress spurious Coverity warning in prune freeze logic	Melanie Plageman
	Adjust the prune_freeze_setup() parameter types of new_relfrozen_xid and new_relmin_mxid to prevent misleading Coverity analysis. heap_page_prune_and_freeze() compared these values against NULL when passing them to prune_freeze_setup(), causing Coverity to assume they could be NULL and flag a possible null-pointer dereference later, even though it occurs inside a directly related conditional. Reported-by: Coverity Author: Melanie Plageman <melanieplageman@gmail.com>
2025-12-04	Fix key size of PrivateRefCountHash.	Nathan Bossart
	The key is the first member of PrivateRefCountEntry, which has type Buffer. This commit changes the key size from sizeof(int32) to sizeof(Buffer). This appears to be an oversight in commit 4b4b680c3d, but it's of no consequence because Buffer has been a signed 32-bit integer for a long time. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/aS77DTpl0fOkIKSZ%40ip-10-97-1-34.eu-west-3.compute.internal