summaryrefslogtreecommitdiff
path: root/src/backend/utils
AgeCommit message (Collapse)Author
19 hoursEnsure sanity of hash-join costing when there are no MCV statistics.HEADorigin/masterorigin/HEADmasterTom Lane
estimate_hash_bucket_stats is defined to return zero to *mcv_freq if it cannot obtain a value for the frequency of the most common value. Its sole caller final_cost_hashjoin ignored this provision and would blindly believe the zero value, resulting in computing zero for the largest bucket size. In consequence, the safety check that intended to prevent the largest bucket from exceeding get_hash_memory_limit() was ineffective, allowing very silly plans to be chosen if statistics were missing. After fixing final_cost_hashjoin to disregard zero results for mcv_freq, a second problem appeared: some cases that should use hash joins failed to. This is because estimate_hash_bucket_stats was unaware of the fact that ANALYZE won't store MCV statistics if it doesn't find any multiply-occurring values. Thus the lack of an MCV stats entry doesn't necessarily mean that we know nothing; we may well know that the column is unique. The former coding returned zero for *mcv_freq in this case, which was pretty close to correct, but now final_cost_hashjoin doesn't believe it and disables the hash join. So check to see if there is a HISTOGRAM stats entry; if so, ANALYZE has in fact run for this column and must have found it to be unique. In that case report the MCV frequency as 1 / rows, instead of claiming ignorance. Reporting a more accurate *mcv_freq in this case can also affect the bucket-size skew adjustment further down in estimate_hash_bucket_stats, causing hash-join cost estimates to change slightly. This affects some plan choices in the core regression tests. The first diff in join.out corresponds to a case where we have no stats and should not risk a hash join, but the remaining changes are caused by producing a better bucket-size estimate for unique join columns. Those are all harmless changes so far as I can tell. The existing behavior was introduced in commit 4867d7f62 in v11. It appears from the commit log that disabling the bucket-size safety check in the absence of statistics was intentional; but we've now seen a case where the ensuing behavior is bad enough to make that seem like a poor decision. In any case the lack of other problems with that safety check after several years helps to justify enforcing it more strictly. However, we won't risk back-patching this, in case any applications are depending on the existing behavior. Bug: #19363 Reported-by: Jinhui Lai <jinhui.lai@qq.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/2380165.1766871097@sss.pgh.pa.us Discussion: https://postgr.es/m/19363-8dd32fc7600a1153@postgresql.org
35 hoursIgnore PlaceHolderVars when looking up statisticsRichard Guo
When looking up statistical data about an expression, we failed to look through PlaceHolderVar nodes, treating them as opaque. This could prevent us from matching an expression to base columns, index expressions, or extended statistics, as examine_variable() relies on strict structural matching. As a result, queries involving PlaceHolderVar nodes often fell back to default selectivity estimates, potentially leading to poor plan choices. This patch updates examine_variable() to strip PlaceHolderVars before analysis. This is safe during estimation because PlaceHolderVars are transparent for the purpose of statistics lookup: they do not alter the value distribution of the underlying expression. To minimize performance overhead on this hot path, a lightweight walker first checks for the presence of PlaceHolderVars. The more expensive mutator is invoked only when necessary. There is one ensuing plan change in the regression tests, which is expected and demonstrates the fix: the rowcount estimate becomes much more accurate with this patch. Back-patch to v18. Although this issue exists before that, changes in this version made it common enough to notice. Given the lack of field reports for older versions, I am not back-patching further. Reported-by: Haowu Ge <gehaowu@bitmoe.com> Author: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/62af586c-c270-44f3-9c5e-02c81d537e3d.gehaowu@bitmoe.com Backpatch-through: 18
48 hoursChange some Datum to void * for opaque pass-through pointerPeter Eisentraut
Here, Datum was used to pass around an opaque pointer between a group of functions. But one might as well use void * for that; the use of Datum doesn't achieve anything here and is just distracting. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/1c5d23cb-288b-4154-b1cd-191fe2301707%40eisentraut.org
3 daysFix pg_stat_get_backend_activity() to use multi-byte truncated resultMichael Paquier
pg_stat_get_backend_activity() calls pgstat_clip_activity() to ensure that the reported query string is correctly truncated when it finishes with an incomplete multi-byte sequence. However, the result returned by the function was not what pgstat_clip_activity() generated, but the non-truncated, original, contents from PgBackendStatus.st_activity_raw. Oversight in 54b6cd589ac2, so backpatch all the way down. Author: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAEoWx2mDzwc48q2EK9tSXS6iJMJ35wvxNQnHX+rXjy5VgLvJQw@mail.gmail.com Backpatch-through: 14
6 daysImprove comment in pgstatfuncs.cMichael Paquier
Author: Zizhen Qiao <zizhen_qiao@163.com> Discussion: https://postgr.es/m/5ee635f9.49f7.19b4ed9e803.Coremail.zizhen_qiao@163.com
7 daysToggle logical decoding dynamically based on logical slot presence.Masahiko Sawada
Previously logical decoding required wal_level to be set to 'logical' at server start. This meant that users had to incur the overhead of logical-level WAL logging even when no logical replication slots were in use. This commit adds functionality to automatically control logical decoding availability based on logical replication slot presence. The newly introduced module logicalctl.c allows logical decoding to be dynamically activated when needed when wal_level is set to 'replica'. When the first logical replication slot is created, the system automatically increases the effective WAL level to maintain logical-level WAL records. Conversely, after the last logical slot is dropped or invalidated, it decreases back to 'replica' WAL level. While activation occurs synchronously right after creating the first logical slot, deactivation happens asynchronously through the checkpointer process. This design avoids a race condition at the end of recovery; a concurrent deactivation could happen while the startup process enables logical decoding at the end of recovery, but WAL writes are still not permitted until recovery fully completes. The checkpointer will handle it after recovery is done. Asynchronous deactivation also avoids excessive toggling of the logical decoding status in workloads that repeatedly create and drop a single logical slot. On the other hand, this lazy approach can delay changes to effective_wal_level and the disabling logical decoding, especially when the checkpointer is busy with other tasks. We chose this lazy approach in all deactivation paths to keep the implementation simple, even though laziness is strictly required only for end-of-recovery cases. Future work might address this limitation either by using a dedicated worker instead of the checkpointer, or by implementing synchronous waiting during slot drops if workloads are significantly affected by the lazy deactivation of logical decoding. The effective WAL level, determined internally by XLogLogicalInfo, is allowed to change within a transaction until an XID is assigned. Once an XID is assigned, the value becomes fixed for the remainder of the transaction. This behavior ensures that the logging mode remains consistent within a writing transaction, similar to the behavior of GUC parameters. A new read-only GUC parameter effective_wal_level is introduced to monitor the actual WAL level in effect. This parameter reflects the current operational WAL level, which may differ from the configured wal_level setting. Bump PG_CONTROL_VERSION as it adds a new field to CheckPoint struct. Reviewed-by: Shveta Malik <shveta.malik@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Shlok Kyal <shlok.kyal.oss@gmail.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://postgr.es/m/CAD21AoCVLeLYq09pQPaWs+Jwdni5FuJ8v2jgq-u9_uFbcp6UbA@mail.gmail.com
8 daysSwitch buffile.c/h to use pgoff_t instead of off_tMichael Paquier
off_t was previously used for offsets, which is 4 bytes on Windows, hence limiting the backend code to a hard limit for files longer than 2GB. This leads to some simplification in these files, removing some casts based on long, also 4 bytes on Windows. This commit removes one comment introduced in db3c4c3a2d98, not relevant anymore as pgoff_t is a safe 8-byte alternative on Windows. This change is surprisingly not invasive, as the callers of BufFileTell(), BufFileSeek() and BufFileTruncateFileSet() (worker.c, tuplestore.c, etc.) track offsets in local structures that just to switch from off_t to pgoff_t for the most part. The file is still relying on a maximum file size of MAX_PHYSICAL_FILESIZE (1GB). This change allows the code to make this maximum potentially larger in the future, or larger on a per-demand basis. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/aUStrqoOCDRFAq1M@paquier.xyz
11 daysUse proper type for RestoreTransactionSnapshot's PGPROC argHeikki Linnakangas
Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/08cbaeb5-aaaf-47b6-9ed8-4f7455b0bc4b@iki.fi
11 daysAdd guard to prevent recursive memory context logging.Fujii Masao
Previously, if memory context logging was triggered repeatedly and rapidly while a previous request was still being processed, it could result in recursive calls to ProcessLogMemoryContextInterrupt(). This could lead to infinite recursion and potentially crash the process. This commit adds a guard to prevent such recursion. If ProcessLogMemoryContextInterrupt() is already in progress and logging memory contexts, subsequent calls will exit immediately, avoiding unintended recursive calls. While this scenario is unlikely in practice, it's not impossible. This change adds a safety check to prevent such failures. Back-patch to v14, where memory context logging was introduced. Reported-by: Robert Haas <robertmhaas@gmail.com> Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Atsushi Torikoshi <torikoshia@oss.nttdata.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Artem Gavrilov <artem.gavrilov@percona.com> Discussion: https://postgr.es/m/CA+TgmoZMrv32tbNRrFTvF9iWLnTGqbhYSLVcrHGuwZvCtph0NA@mail.gmail.com Backpatch-through: 14
13 daysChange pgstat_report_vacuum() to use RelationMichael Paquier
This change makes pgstat_report_vacuum() more consistent with pgstat_report_analyze(), that also uses a Relation. This enforces a policy that callers of this routine should open and lock the relation whose statistics are updated before calling this routine. We will unlikely have a lot of callers of this routine in the tree, but it seems like a good idea to imply this requirement in the long run. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Suggested-by: Andres Freund <andres@anarazel.de> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/aUEA6UZZkDCQFgSA@ip-10-97-1-34.eu-west-3.compute.internal
14 daysRemove useless code in InjectionPointAttach()Michael Paquier
strlcpy() ensures that a target string is zero-terminated, so there is no need to enforce it a second time in this code. This simplification could have been done in 0eb23285a257. Author: Feilong Meng <feelingmeng@foxmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/tencent_771178777C5BC17FCB7F7A1771CD1FFD5708@qq.com
14 daysAvoid global LC_CTYPE dependency in pg_locale_icu.c.Jeff Davis
ICU still depends on libc for compatibility with certain historical behavior for single-byte encodings. Make the dependency explicit by holding a locale_t object when required. We should consider a better solution in the future, such as decoding the text to UTF-32 and using u_tolower(). That would be a behavior change and require additional infrastructure though; so for now, just avoid the global LC_CTYPE dependency. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/450ceb6260cad30d7afdf155d991a9caafee7c0d.camel@j-davis.com
14 daysdowncase_identifier(): use method table from locale provider.Jeff Davis
Previously, libc's tolower() was always used for lowercasing identifiers, regardless of the database locale (though only characters beyond 127 in single-byte encodings were affected). Refactor to allow each provider to supply its own implementation of identifier downcasing. For historical compatibility, when using a single-byte encoding, ICU still relies on tolower(). One minor behavior change is that, before the database default locale is initialized, it uses ASCII semantics to downcase the identifiers. Previously, it would use the postmaster's LC_CTYPE setting from the environment. While that could have some effect during GUC processing, for example, it would have been fragile to rely on the environment setting anyway. (Also, it only matters when the encoding is single-byte.) Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/450ceb6260cad30d7afdf155d991a9caafee7c0d.camel@j-davis.com
2025-12-16Separate out bytea sort support from varlena.cJohn Naylor
In the wake of commit b45242fd3, bytea_sortsupport() still called out to varstr_sortsupport(). Treating bytea as a kind of text/varchar required varstr_sortsupport() to allow for the possibility of NUL bytes, but only for C collation. This was confusing. For better separation of concerns, create an independent sortsupport implementation in bytea.c. The heuristics for bytea_abbrev_abort() remain the same as for varstr_abbrev_abort(). It's possible that the bytea case warrants different treatment, but that is left for future investigation. In passing, adjust some strange looking comparisons in varstr_abbrev_abort(). Author: Aleksander Alekseev <aleksander@tigerdata.com> Reviewed-by: John Naylor <johncnaylorls@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAJ7c6TP1bAbEhUJa6+rgceN6QJWMSsxhg1=mqfSN=Nb-n6DAKg@mail.gmail.com
2025-12-15Revisit cosmetics of "For inplace update, send nontransactional invalidations."Noah Misch
This removes a never-used CacheInvalidateHeapTupleInplace() parameter. It adds README content about inplace update visibility in logical decoding. It rewrites other comments. Back-patch to v18, where commit 243e9b40f1b2dd09d6e5bf91ebf6e822a2cd3704 first appeared. Since this removes a CacheInvalidateHeapTupleInplace() parameter, expect a v18 ".abi-compliance-history" edit to follow. PGXN contains no calls to that function. Reported-by: Paul A Jungwirth <pj@illuminatedcomputing.com> Reported-by: Ilyasov Ian <ianilyasov@outlook.com> Reviewed-by: Paul A Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: Surya Poondla <s_poondla@apple.com> Discussion: https://postgr.es/m/CA+renyU+LGLvCqS0=fHit-N1J-2=2_mPK97AQxvcfKm+F-DxJA@mail.gmail.com Backpatch-through: 18
2025-12-15Remove unused single-byte char_is_cased() API.Jeff Davis
https://postgr.es/m/450ceb6260cad30d7afdf155d991a9caafee7c0d.camel@j-davis.com
2025-12-15Use multibyte-aware extraction of pattern prefixes.Jeff Davis
Previously, like_fixed_prefix() used char-at-a-time logic, which forced it to be too conservative for case-insensitive matching. Introduce like_fixed_prefix_ci(), and use that for case-insensitive pattern prefixes. It uses multibyte and locale-aware logic, along with the new pg_iswcased() API introduced in 630706ced0. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/450ceb6260cad30d7afdf155d991a9caafee7c0d.camel@j-davis.com
2025-12-15Add retry logic to pg_sync_replication_slots().Amit Kapila
Previously, pg_sync_replication_slots() would finish without synchronizing slots that didn't meet requirements, rather than failing outright. This could leave some failover slots unsynchronized if required catalog rows or WAL segments were missing or at risk of removal, while the standby continued removing needed data. To address this, the function now waits for the primary slot to advance to a position where all required data is available on the standby before completing synchronization. It retries cyclically until all failover slots that existed on the primary at the start of the call are synchronized. Slots created after the function begins are not included. If the standby is promoted during this wait, the function exits gracefully and the temporary slots will be removed. Author: Ajin Cherian <itsajin@gmail.com> Author: Hou Zhijie <houzj.fnst@fujitsu.com> Reviewed-by: Shveta Malik <shveta.malik@gmail.com> Reviewed-by: Japin Li <japinli@hotmail.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Ashutosh Sharma <ashu.coek88@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Yilin Zhang <jiezhilove@126.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/CAFPTHDZAA%2BgWDntpa5ucqKKba41%3DtXmoXqN3q4rpjO9cdxgQrw%40mail.gmail.com
2025-12-15Allow cumulative statistics to read/write auxiliary data from/to diskMichael Paquier
Cumulative stats kinds gain the capability to write additional per-entry data when flushing the stats at shutdown, and read this data when loading back the stats at startup. This can be fit for example in the case of variable-length data (like normalized query strings), so as it becomes possible to link the shared memory stats entries to data that is stored in a different area, like a DSA segment. Three new optional callbacks are added to PgStat_KindInfo, available to variable-numbered stats kinds: * to_serialized_data: writes auxiliary data for an entry. * from_serialized_data: reads auxiliary data for an entry. * finish: performs actions after read/write/discard operations. This is invoked after processing all the entries of a kind, allowing extensions to close file handles and clean up resources. Stats kinds have the option to store this data in the existing pgstats file, but can as well store it in one or more additional files whose names can be built upon the entry keys. The new serialized callbacks are called once an entry key is read or written from the main stats file. A file descriptor to the main pgstats file is available in the arguments of the callbacks. Author: Sami Imseih <samimseih@gmail.com> Co-authored-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAA5RZ0s9SDOu+Z6veoJCHWk+kDeTktAtC-KY9fQ9Z6BJdDUirQ@mail.gmail.com
2025-12-14Update typedefs.list to match what the buildfarm currently reports.Tom Lane
The current list from the buildfarm includes quite a few typedef names that it used to miss. The reason is a bit obscure, but it seems likely to have something to do with our recent increased use of palloc_object and palloc_array. In any case, this makes the relevant struct declarations be much more nicely formatted, so I'll take it. Install the current list and re-run pgindent to update affected code. Syncing with the current list also removes some obsolete typedef names and fixes some alphabetization errors. Discussion: https://postgr.es/m/1681301.1765742268@sss.pgh.pa.us
2025-12-13Fix jsonb_object_agg crash after eliminating null-valued pairs.Tom Lane
In commit b61aa76e4 I added an assumption in jsonb_object_agg_finalfn that it'd be okay to apply uniqueifyJsonbObject repeatedly to a JsonbValue. I should have studied that code more closely first, because in skip_nulls mode it removed leading nulls by changing the "pairs" array start pointer. This broke the data structure's invariants in two ways: pairs no longer references a repalloc-able chunk, and the distance from pairs to the end of its array is less than parseState->size. So any subsequent addition of more pairs is at high risk of clobbering memory and/or causing repalloc to crash. Unfortunately, adding more pairs is exactly what will happen when the aggregate is being used as a window function. Fix by rewriting uniqueifyJsonbObject to not do that. The prior coding had little to recommend it anyway. Reported-by: Alexander Lakhin <exclusion@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/ec5e96fb-ee49-4e5f-8a09-3f72b4780538@gmail.com
2025-12-13Clarify comment about temporal foreign keysPeter Eisentraut
In RI_ConstraintInfo, period_contained_by_oper and period_intersect_oper can take either anyrange or anymultirange. Author: Paul A. Jungwirth <pj@illuminatedcomputing.com> Discussion: https://www.postgresql.org/message-id/CA%2BrenyWzDth%2BjqLZA2L2Cezs3wE%2BWX-5P8W2EOVx_zfFD%3Daicg%40mail.gmail.com
2025-12-12Replace most StaticAssertStmt() with StaticAssertDecl()Peter Eisentraut
Similar to commit 75f49221c22, it is preferable to use StaticAssertDecl() instead of StaticAssertStmt() when possible. Discussion: https://www.postgresql.org/message-id/flat/CA%2BhUKGKvr0x_oGmQTUkx%3DODgSksT2EtgCA6LmGx_jQFG%3DsDUpg%40mail.gmail.com
2025-12-11Fix some comments.Nathan Bossart
Like commit 123661427b, these were discovered while reviewing Aleksander Alekseev's proposed changes to pgindent.
2025-12-10Add pg_iswcased().Jeff Davis
True if character has multiple case forms. Will be a useful multibyte-aware replacement for char_is_cased(). Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/450ceb6260cad30d7afdf155d991a9caafee7c0d.camel@j-davis.com
2025-12-10Remove char_tolower() API.Jeff Davis
It's only useful for an ILIKE optimization for the libc provider using a single-byte encoding and a non-C locale, but it creates significant internal complexity. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/450ceb6260cad30d7afdf155d991a9caafee7c0d.camel@j-davis.com
2025-12-10Fix some near-bugs related to ResourceOwner function argumentsHeikki Linnakangas
These functions took a ResourceOwner argument, but only checked if it was NULL, and then used CurrentResourceOwner for the actual work. Surely the intention was to use the passed-in resource owner. All current callers passed CurrentResourceOwner or NULL, so this has no consequences at the moment, but it's an accident waiting to happen for future caller and extensions. Author: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://www.postgresql.org/message-id/CAEze2Whnfv8VuRZaohE-Af+GxBA1SNfD_rXfm84Jv-958UCcJA@mail.gmail.com Backpatch-through: 17
2025-12-10Fix misleading comment in tuplesort.cDavid Rowley
A comment in tuplesort.c was claiming that the code was defining INITIAL_MEMTUPSIZE so that it *does not* exceed ALLOCSET_SEPARATE_THRESHOLD, but the code actually ensures that we purposefully *do* exceed ALLOCSET_SEPARATE_THRESHOLD for the initial allocation of the tuples array, as per reasons detailed in the commentary of grow_memtuples(). Also, there's not much need to repeat the mention about ALLOCSET_SEPARATE_THRESHOLD in each location where INITIAL_MEMTUPSIZE is used, so remove those comments. Author: ChangAo Chen <cca5507@qq.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com> Discussion: https://postgr.es/m/tencent_6FA14F85D6B5B5291532D6789E07F4765C08%40qq.com
2025-12-10Use palloc_object() and palloc_array() in backend codeMichael Paquier
The idea is to encourage more the use of these new routines across the tree, as these offer stronger type safety guarantees than palloc(). This batch of changes includes most of the trivial changes suggested by the author for src/backend/. A total of 334 files are updated here. Among these files, 48 of them have their build change slightly; these are caused by line number changes as the new allocation formulas are simpler, shaving around 100 lines of code in total. Similar work has been done in 0c3c5c3b06a3 and 31d3847a37be. Author: David Geier <geidav.pg@gmail.com> Discussion: https://postgr.es/m/ad0748d4-3080-436e-b0bc-ac8f86a3466a@gmail.com
2025-12-09Add wait event for the group commit delay before WAL flushHeikki Linnakangas
Author: Rafia Sabih <rafia.pghackers@gmail.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Discussion: https://www.postgresql.org/message-id/CA%2BFpmFf-hWXtrC0Q3Cr_Xo78zuP_M_VC5xgWPOYOkwqOD0T8eg@mail.gmail.com
2025-12-09Remove useless casts in format argumentsPeter Eisentraut
There were a number of useless casts in format arguments, either where the input to the cast was already in the right type, or seemingly uselessly casting between types instead of just using the right format placeholder to begin with. Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/07fa29f9-42d7-4aac-8834-197918cbbab6%40eisentraut.org
2025-12-09Remove unnecessary casts in printf format arguments (%zu/%zd)Peter Eisentraut
Many of these are probably left over from before use of %zu/%zd was portable. Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/07fa29f9-42d7-4aac-8834-197918cbbab6%40eisentraut.org
2025-12-08Improve error messages of input functions for pg_dependencies and pg_ndistinctMichael Paquier
The error details updated in this commit can be reached in the regression tests. They did not follow the project style, and they should be written them as full sentences. Some of the errors are switched to use an elog(), for cases that involve paths that cannot be reached based on the previous state of the parser processing the input data (array start, object end, etc.). The error messages for these cases use now a more consistent style across the board, with the state of the parser reported for debugging. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Author: Michael Paquier <michael@paquier.xyz> Co-authored-by: Corey Huinker <corey.huinker@gmail.com> Discussion: https://postgr.es/m/1353179.1764901790@sss.pgh.pa.us
2025-12-07Micro-optimize datatype conversions in datum_to_jsonb_internal.Tom Lane
The general case for converting to a JSONB numeric value is to run the source datatype's output function and then numeric_in, but we can do substantially better than that for integer and numeric source values. This patch improves the speed of jsonb_agg by 30% for integer input, and nearly 2X for numeric input. Sadly, the obvious idea of using float4_numeric and float8_numeric to speed up those cases doesn't work: they are actually slower than the generic coerce-via-I/O method, and not by a small amount. They might round off differently than this code has historically done, too. Leave that alone pending possible changes in those functions. We can also do better than the existing code for text/varchar/bpchar source data; this optimization is similar to one that already exists in the json_agg() code. That saves 20% or so for such inputs. Also make a couple of other minor improvements, such as not giving JSONTYPE_CAST its own special case outside the switch when it could perfectly well be handled inside, and not using dubious string hacking to detect infinity and NaN results. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: jian he <jian.universality@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/1060917.1753202222@sss.pgh.pa.us
2025-12-07Remove fundamentally-redundant processing in jsonb_agg() et al.Tom Lane
The various variants of jsonb_agg() operate as follows, for each aggregate input value: 1. Build a JsonbValue tree representation of the input value. 2. Flatten the JsonbValue tree into a Jsonb in on-disk format. 3. Iterate through the Jsonb, building a JsonbValue that is part of the aggregate's state stored in aggcontext, but is otherwise identical to what phase 1 built. This is very slightly less silly than it sounds, because phase 1 involves calling non-JSONB code such as datatype output functions, which are likely to leak memory, and we don't want to leak into the aggcontext. Nonetheless, phases 2 and 3 are accomplishing exactly nothing that is useful if we can make phase 1 put the JsonbValue tree where we need it. We could probably do that with a bunch of MemoryContextSwitchTo's, but what seems more robust is to give pushJsonbValue the responsibility of building the JsonbValue tree in a specified non-current memory context. The previous patch created the infrastructure for that, and this patch simply makes the aggregate functions use it and then rips out phases 2 and 3. For me, this makes jsonb_agg() with a text column as input run about 2X faster than before. It's not yet on par with json_agg(), but this removes a whole lot of the difference. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: jian he <jian.universality@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/1060917.1753202222@sss.pgh.pa.us
2025-12-07Revise APIs for pushJsonbValue() and associated routines.Tom Lane
Instead of passing "JsonbParseState **" to pushJsonbValue(), pass a pointer to a JsonbInState, which will contain the parseState stack pointer as well as other useful fields. Also, instead of returning a JsonbValue pointer that is often meaningless/ignored, return the top-level JsonbValue pointer in the "result" field of the JsonbInState. This involves a lot of (mostly mechanical) edits, but I think the results are notationally cleaner and easier to understand. Certainly the business with sometimes capturing the result of pushJsonbValue() and sometimes not was bug-prone and incapable of mechanical verification. In the new arrangement, JsonbInState.result remains null until we've completed a valid sequence of pushes, so that an incorrect sequence will result in a null-pointer dereference, not mistaken use of a partial result. However, this isn't simply an exercise in prettier notation. The real reason for doing it is to provide a mechanism whereby pushJsonbValue() can be told to construct the JsonbValue tree in a context that is not CurrentMemoryContext. That happens when a non-null "outcontext" is specified in the JsonbInState. No callers exercise that option in this patch, but the next patch in the series will make use of it. I tried to improve the comments in this area too. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: jian he <jian.universality@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/1060917.1753202222@sss.pgh.pa.us
2025-12-06Handle constant inputs to corr() and related aggregates more precisely.Tom Lane
The SQL standard says that corr() and friends should return NULL in the mathematically-undefined case where all the inputs in one of the columns have the same value. We were checking that by seeing if the sums Sxx and Syy were zero, but that approach is very vulnerable to roundoff error: if a sum is close to zero but not exactly that, we'd come out with a pretty silly non-NULL result. Instead, directly track whether the inputs are all equal by remembering the common value in each column. Once we detect that a new input is different from before, represent that by storing NaN for the common value. (An objection to this scheme is that if the inputs are all NaN, we will consider that they were not all equal. But under IEEE float arithmetic rules, one NaN is never equal to another, so this behavior is arguably correct. Moreover it matches what we did before in such cases.) Then, leave the sums at their exact value of zero for as long as we haven't detected different input values. This solution requires the aggregate transition state to contain 8 float values not 6, which is not problematic, and it seems to add less than 1% to the aggregates' runtime, which seems acceptable. While we're here, improve corr()'s final function to cope with overflow/underflow in the final calculation, and to clamp its result to [-1, 1] in case of roundoff error. Although this is arguably a bug fix, it requires a catversion bump due to the change in aggregates' initial states, so it can't be back-patched. Patch written by me, but many of the ideas are due to Dean Rasheed, who also did a deal of testing. Bug: #19340 Reported-by: Oleg Ivanov <o15611@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Co-authored-by: Dean Rasheed <dean.a.rasheed@gmail.com> Discussion: https://postgr.es/m/19340-6fb9f6637f562092@postgresql.org
2025-12-05Fix text substring search for non-deterministic collations.Tom Lane
Due to an off-by-one error, the code failed to find matches at the end of the haystack. Fix by rewriting the loop. While at it, fix a comment that claimed that the function could find a zero-length match. Such a match could send a caller into an endless loop. However, zero-length matches only make sense with an empty search string, and that case is explicitly excluded by all callers. To make sure it stays that way, add an Assert and a comment. Bug: #19341 Reported-by: Adam Warland <adam.warland@infor.com> Author: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/19341-1d9a22915edfec58@postgresql.org Backpatch-through: 18
2025-12-05Fix some cases of indirectly casting away const.Tom Lane
Newest versions of gcc are able to detect cases where code implicitly casts away const by assigning the result of strchr() or a similar function applied to a "const char *" value to a target variable that's just "char *". This of course creates a hazard of not getting a compiler warning about scribbling on a string one was not supposed to, so fixing up such cases is good. This patch fixes a dozen or so places where we were doing that. Most are trivial additions of "const" to the target variable, since no actually-hazardous change was occurring. There is one place in ecpg.trailer where we were indeed violating the intention of not modifying a string passed in as "const char *". I believe that's harmless not a live bug, but let's fix it by copying the string before modifying it. There is a remaining trouble spot in ecpg/preproc/variable.c, which requires more complex surgery. I've left that out of this commit because I want to study that code a bit more first. We probably will want to back-patch this once compilers that detect this pattern get into wider circulation, but for now I'm just going to apply it to master to see what the buildfarm says. Thanks to Bertrand Drouvot for finding a couple more spots than I had. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/1324889.1764886170@sss.pgh.pa.us
2025-12-05Rename column slotsync_skip_at to slotsync_last_skip.Amit Kapila
Commit 76b78721ca introduced two new columns in pg_stat_replication_slots to improve monitoring of slot synchronization. One of these columns was named slotsync_skip_at, which is inconsistent with the naming convention used for similar columns in other system views. Columns that store timestamps of the most recent event typically use the 'last_' in the column name (e.g., last_autovacuum, checksum_last_failure). Renaming slotsync_skip_at to slotsync_last_skip aligns with this pattern, making the purpose of the column clearer and improving overall consistency across the views. Author: Shlok Kyal <shlok.kyal.oss@gmail.com> Reviewed-by: Michael Banck <mbanck@gmx.net> Discussion: https://postgr.es/m/20251128091552.GB13635@p46.dedyn.io;lightning.p46.dedyn.io Discussion: https://postgr.es/m/CAE9k0PkhfKrTEAsGz4DjOhEj1nQ+hbQVfvWUxNacD38ibW3a1g@mail.gmail.com
2025-12-05Fix some compiler warningsMichael Paquier
Some of the buildfarm members with some old gcc versions have been complaining about an always-true test for a NULL pointer caused by a combination of SOFT_ERROR_OCCURRED() and a local ErrorSaveContext variable. These warnings are taken care of by removing SOFT_ERROR_OCCURRED(), switching to a direct variable check, like 56b1e88c804b. Oversights in e1405aa5e3ac and 44eba8f06e55. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/1341064.1764895052@sss.pgh.pa.us
2025-12-04Remove no longer needed casts from PointerPeter Eisentraut
These casts used to be required when Pointer was char *, but now it's void * (commit 1b2bb5077e9), so they are not needed anymore. Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Discussion: https://www.postgresql.org/message-id/4154950a-47ae-4223-bd01-1235cc50e933%40eisentraut.org
2025-12-04Remove no longer needed casts to PointerPeter Eisentraut
These casts used to be required when Pointer was char *, but now it's void * (commit 1b2bb5077e9), so they are not needed anymore. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/4154950a-47ae-4223-bd01-1235cc50e933%40eisentraut.org
2025-12-03Rename BUFFERPIN wait event class to BUFFERAndres Freund
In an upcoming patch more wait events will be added to the wait event class (for buffer locking), making the current name too specific. Alternatively we could introduce a dedicated wait event class for those, but it seems somewhat confusing to have a BUFFERPIN and a BUFFER wait event class. Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/fvfmkr5kk4nyex56ejgxj3uzi63isfxovp2biecb4bspbjrze7@az2pljabhnff
2025-12-03Use "foo(void)" for definitions of functions with no parameters.Nathan Bossart
Standard practice in PostgreSQL is to use "foo(void)" instead of "foo()", as the latter looks like an "old-style" function declaration. Similar changes were made in commits cdf4b9aff2, 0e72b9d440, 7069dbcc31, f1283ed6cc, 7b66e2c086, e95126cf04, and 9f7c527af3. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://postgr.es/m/aTBObQPg%2Bps5I7vl%40ip-10-97-1-34.eu-west-3.compute.internal
2025-12-03Fix stray references to SubscriptRefPeter Eisentraut
This type never existed. SubscriptingRef was meant instead. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/2eaa45e3-efc5-4d75-b082-f8159f51445f%40eisentraut.org
2025-12-03Don't rely on pointer arithmetic with Pointer typePeter Eisentraut
The comment for the Pointer type says 'XXX Pointer arithmetic is done with this, so it can't be void * under "true" ANSI compilers.'. This fixes that. Change from Pointer to use char * explicitly where pointer arithmetic is needed. This makes the meaning of the code clearer locally and removes a dependency on the actual definition of the Pointer type. (The definition of the Pointer type is not changed in this commit.) Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/4154950a-47ae-4223-bd01-1235cc50e933%40eisentraut.org
2025-12-03Remove useless casts to PointerPeter Eisentraut
in arguments of memcpy() and memmove() calls Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/4154950a-47ae-4223-bd01-1235cc50e933%40eisentraut.org
2025-12-02Use LW_SHARED in dsa.c where possible.Nathan Bossart
Both dsa_get_total_size() and dsa_get_total_size_from_handle() take an exclusive lock just to read a variable. This commit reduces the lock level to LW_SHARED in those functions. Reviewed-by: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/aS8fMzWs9e8iHxk2%40nathan
2025-12-02Show size of DSAs and dshashes in pg_dsm_registry_allocations.Nathan Bossart
Presently, this view reports NULL for the size of DSAs and dshash tables because 1) the current backend might not be attached to them and 2) the registry doesn't save the pointers to the dsa_area or dshash_table in local memory. Also, the view doesn't show partially-initialized entries to avoid ambiguity, since those entries would report a NULL size as well. This commit introduces a function that looks up the size of a DSA given its handle (transiently attaching to the control segment if needed) and teaches pg_dsm_registry_allocations to use it to show the size of successfully-initialized DSA and dshash entries. Furthermore, the view now reports partially-initialized entries with a NULL size. Reviewed-by: Rahila Syed <rahilasyed90@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/aSeEDeznAsHR1_YF%40nathan