summaryrefslogtreecommitdiff
path: root/src/backend/utils
AgeCommit message (Collapse)Author
2025-10-08Add stats_reset to pg_stat_user_functionsMichael Paquier
It is possible to call pg_stat_reset_single_function_counters() for a single function, but the reset time was missing the system view showing its statistics. Like all the fields of pg_stat_user_functions, the GUC track_functions needs to be enabled to show the statistics about function executions. Bump catalog version. Bump PGSTAT_FILE_FORMAT_ID, as a result of the new field added to PgStat_StatFuncEntry. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/aONjnsaJSx-nEdfU@paquier.xyz
2025-10-07Assign each subquery a unique name prior to planning it.Robert Haas
Previously, subqueries were given names only after they were planned, which makes it difficult to use information from a previous execution of the query to guide future planning. If, for example, you knew something about how you want "InitPlan 2" to be planned, you won't know whether the subquery you're currently planning will end up being "InitPlan 2" until after you've finished planning it, by which point it's too late to use the information that you had. To fix this, assign each subplan a unique name before we begin planning it. To improve consistency, use textual names for all subplans, rather than, as we did previously, a mix of numbers (such as "InitPlan 1") and names (such as "CTE foo"), and make sure that the same name is never assigned more than once. We adopt the somewhat arbitrary convention of using the type of sublink to set the plan name; for example, a query that previously had two expression sublinks shown as InitPlan 2 and InitPlan 1 will now end up named expr_1 and expr_2. Because names are assigned before rather than after planning, some of the regression test outputs show the numerical part of the name switching positions: what was previously SubPlan 2 was actually the first one encountered, but we finished planning it later. We assign names even to subqueries that aren't shown as such within the EXPLAIN output. These include subqueries that are a FROM clause item or a branch of a set operation, rather than something that will be turned into an InitPlan or SubPlan. The purpose of this is to make sure that, below the topmost query level, there's always a name for each subquery that is stable from one planning cycle to the next (assuming no changes to the query or the database schema). Author: Robert Haas <rhaas@postgresql.org> Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Junwang Zhao <zhjwpku@gmail.com> Discussion: http://postgr.es/m/3641043.1758751399@sss.pgh.pa.us
2025-10-06Optimize hex_encode() and hex_decode() using SIMD.Nathan Bossart
The hex_encode() and hex_decode() functions serve as the workhorses for hexadecimal data for bytea's text format conversion functions, and some workloads are sensitive to their performance. This commit adds new implementations that use routines from port/simd.h, which testing indicates are much faster for larger inputs. For small or invalid inputs, we fall back on the existing scalar versions. Since we are using port/simd.h, these optimizations apply to both x86-64 and AArch64. Author: Nathan Bossart <nathandbossart@gmail.com> Co-authored-by: Chiranmoy Bhattacharya <chiranmoy.bhattacharya@fujitsu.com> Co-authored-by: Susmitha Devanga <devanga.susmitha@fujitsu.com> Reviewed-by: John Naylor <johncnaylorls@gmail.com> Discussion: https://postgr.es/m/aLhVWTRy0QPbW2tl%40nathan
2025-10-06Add stats_reset to pg_stat_all_{tables,indexes} and related viewsMichael Paquier
It is possible to call pg_stat_reset_single_table_counters() on a relation (index or table) but the reset time was missing from the system views showing their statistics. This commit adds the reset time as an attribute of pg_stat_all_tables, pg_stat_all_indexes, and other relations related to them. Bump catalog version. Bump PGSTAT_FILE_FORMAT_ID, as a result of the new field added to PgStat_StatTabEntry. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/aN8l182jKxEq1h9f@paquier.xyz
2025-10-06Fix two comments in numeric.cMichael Paquier
The comments at the top of numeric_int4_safe() and numeric_int8_safe() mentioned respectively int4_numeric() and int8_numeric(). The intention is to refer to numeric_int4() and numeric_int8(). Oversights in 4246a977bad6. Reported-by: jian he <jian.universality@gmail.com> Discussion: https://postgr.es/m/CACJufxFfVt7Jx9_j=juxXyP-6tznN8OcvS9E-QSgp0BrD8KUgA@mail.gmail.com
2025-10-05Don't include access/htup_details.h in executor/tuptable.hÁlvaro Herrera
This is not at all needed; I suspect it was a simple mistake in commit 5408e233f066. It causes htup_details.h to bleed into a huge number of places via execnodes.h. Remove it and fix fallout. Discussion: https://postgr.es/m/202510021240.ptc2zl5cvwen@alvherre.pgsql
2025-10-03Fix incorrect function reference in commentRichard Guo
The comment incorrectly references the defunct function BufFileOpenShared(), which was replaced in commit dcac5e7ac. This patch updates the comment to refer to the current function BufFileOpenFileSet(). Author: Zhang Mingli <zmlpostgres@gmail.com> Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/1cb48b4c-54ab-40cc-b355-0b3c2af6d3f7@Spark
2025-10-03Add IGNORE NULLS/RESPECT NULLS option to Window functions.Tatsuo Ishii
Add IGNORE NULLS/RESPECT NULLS option (null treatment clause) to lead, lag, first_value, last_value and nth_value window functions. If unspecified, the default is RESPECT NULLS which includes NULL values in any result calculation. IGNORE NULLS ignores NULL values. Built-in window functions are modified to call new API WinCheckAndInitializeNullTreatment() to indicate whether they accept IGNORE NULLS/RESPECT NULLS option or not (the API can be called by user defined window functions as well). If WinGetFuncArgInPartition's allowNullTreatment argument is true and IGNORE NULLS option is given, WinGetFuncArgInPartition() or WinGetFuncArgInFrame() will return evaluated function's argument expression on specified non NULL row (if it exists) in the partition or the frame. When IGNORE NULLS option is given, window functions need to visit and evaluate same rows over and over again to look for non null rows. To mitigate the issue, 2-bit not null information array is created while executing window functions to remember whether the row has been already evaluated to NULL or NOT NULL. If already evaluated, we could skip the evaluation work, thus we could get better performance. Author: Oliver Ford <ojford@gmail.com> Co-authored-by: Tatsuo Ishii <ishii@postgresql.org> Reviewed-by: Krasiyan Andreev <krasiyan@gmail.com> Reviewed-by: Andrew Gierth <andrew@tao11.riddles.org.uk> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: David Fetter <david@fetter.org> Reviewed-by: Vik Fearing <vik@postgresfriends.org> Reviewed-by: "David G. Johnston" <david.g.johnston@gmail.com> Reviewed-by: Chao Li <lic@highgo.com> Discussion: https://postgr.es/m/flat/CAGMVOdsbtRwE_4+v8zjH1d9xfovDeQAGLkP_B6k69_VoFEgX-A@mail.gmail.com
2025-10-02Generate EUC_CN mappings from gb18030-2022.ucmJohn Naylor
In the wake of cfa6cd292, EUC_CN was the only encoding that used gb-18030-2000.xml to generate the .map files. Since EUC_CN is a subset of GB18030, we can easily use the same UCM file. This allows deleting the XML file from our repository. Author: Chao Li <lic@highgo.com> Discussion: https://postgr.es/m/CANWCAZaNRXZ-5NuXmsaMA2mKvMZnCGHZqQusLkpE%2B8YX%2Bi5OYg%40mail.gmail.com
2025-10-01Fix typo in pgstat_relation.c header commentDavid Rowley
Looks like a copy and paste error from pgstat_function.c Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/aNuaVMAdTGbgBgqh@ip-10-97-1-34.eu-west-3.compute.internal
2025-09-30Make some use of anonymous unions [pg_locale_t]Peter Eisentraut
Make some use of anonymous unions, which are allowed as of C11, as examples and encouragement for future code, and to test compilers. This commit changes the pg_locale_t type. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/f00a9968-388e-4f8c-b5ef-5102e962d997%40eisentraut.org
2025-09-30Do a tiny bit of header file maintenanceÁlvaro Herrera
Stop including utils/relcache.h in access/genam.h, and stop including htup_details.h in nodes/tidbitmap.h. Both these files (genam.h and tidbitmap.h) are widely used in other header files, so it's in our best interest that they remain as lean as reasonable. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/202509291356.o5t6ny2hoa3q@alvherre.pgsql
2025-09-29Add GROUP BY ALL.Tom Lane
GROUP BY ALL is a form of GROUP BY that adds any TargetExpr that does not contain an aggregate or window function into the groupClause of the query, making it exactly equivalent to specifying those same expressions in an explicit GROUP BY list. This feature is useful for certain kinds of data exploration. It's already present in some other DBMSes, and the SQL committee recently accepted it into the standard, so we can be reasonably confident in the syntax being stable. We do have to invent part of the semantics, as the standard doesn't allow for expressions in GROUP BY, so they haven't specified what to do with window functions. We assume that those should be treated like aggregates, i.e., left out of the constructed GROUP BY list. In passing, wordsmith some existing documentation about GROUP BY, and update some neglected synopsis entries in select_into.sgml. Author: David Christensen <david@pgguru.net> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAHM0NXjz0kDwtzoe-fnHAqPB1qA8_VJN0XAmCgUZ+iPnvP5LbA@mail.gmail.com
2025-09-29Add support for tracking of entry count in pgstatsMichael Paquier
Stats kinds can set a new option called "track_entry_count" (disabled by default, available for variable-numbered stats) that will make pgstats track the number of entries that exist in its shared hashtable. As there is only one code path where a new entry is added, and one code path where entries are freed, the count tracking is straight-forward in its implementation. Reads of these counters are optimistic, and may change across two calls. The counter is incremented when an entry is created (not when reused), and is decremented when an entry is freed from the hashtable (marked for drop with its refcount reaching 0), which is something that pgstats decides internally. A first use case of this facility would be pg_stat_statements, where we need to be able to cap the number of entries that would be stored in the shared hashtable, based on its "max" GUC. The module currently relies on hash_get_num_entries(), which offers a cheap way to count how many entries are in its hash table, but we cannot do that in pgstats for variable-sized stats kinds as a single hashtable is used for all the stats kinds. Independently of PGSS, this is useful for other custom stats kinds that want to cap, control, or track the number of entries they have, without depending on a potentially expensive sequential scan to know the number of entries while holding an extra exclusive lock. Author: Michael Paquier <michael@paquier.xyz> Reviewed-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Keisuke Kuroda <keisuke.kuroda.3862@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/aMPKWR81KT5UXvEr@paquier.xyz
2025-09-26Create a separate file listing backend typesÁlvaro Herrera
Use our established coding pattern to reduce maintenance pain when adding other per-process-type characteristics. Like PG_KEYWORD, PG_CMDTAG, PG_RMGR. To keep the strings translatable, the relevant makefile now also scans src/include for this specific file. I didn't want to have it scan all .h files, as then gettext would have to scan all header files. I didn't find any way to affect the meson behavior in this respect though. Author: Álvaro Herrera <alvherre@kurilemu.de> Co-authored-by: Jonathan Gonzalez V. <jonathan.abdiel@gmail.com> Discussion: https://postgr.es/m/202507151830.dwgz5nmmqtdy@alvherre.pgsql
2025-09-26Fix misleading comment in pg_get_statisticsobjdef_string()David Rowley
The comment claimed that a TABLESPACE reference was added to the resulting string, but that's not true. Looks like the comment was copied from pg_get_indexdef_string() without being adjusted correctly. Reported-by: jian he <jian.universality@gmail.com> Discussion: https://postgr.es/m/CACJufxHwVPgeu8o9D8oUeDQYEHTAZGt-J5uaJNgYMzkAW7MiCA@mail.gmail.com
2025-09-25Try to avoid floating-point roundoff error in pg_sleep().Tom Lane
I noticed the surprising behavior that pg_sleep(0.001) will sleep for 2ms not the expected 1ms. Apparently the float8 calculation of time-to-sleep is managing to produce something a hair over 1, which ceil() rounds up to 2, and then WaitLatch() faithfully waits 2ms. It could be that this works as-expected for some ranges of current timestamp but not others, which would account for not having seen it before. In any case, let's try to avoid it by removing the float arithmetic in the delay calculation. We're stuck with the declared input type being float8, but we can convert that to integer microseconds right away, and then work strictly with integral values. There might still be roundoff surprises for certain input values, but at least the behavior won't be time-varying. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://postgr.es/m/3879137.1758825752@sss.pgh.pa.us
2025-09-25Don't include execnodes.h in replication/conflict.hÁlvaro Herrera
... which silently propagates a lot of headers into many places via pgstat.h, as evidenced by the variety of headers that this patch needs to add to seemingly random places. Add a minimum of typedefs to conflict.h to be able to remove execnodes.h, and fix the fallout. Backpatch to 18, where conflict.h first appeared. Discussion: https://postgr.es/m/202509191927.uj2ijwmho7nv@alvherre.pgsql
2025-09-24Ensure guc_tables.o's dependency on guc_tables.inc.c is known.Tom Lane
Without this, rebuilds can malfunction unless --enable-depend is used. Historically we've expected that you can get away without --enable-depend as long as you manually clean after changing *.h files; the makefiles are supposed to handle other sorts of dependencies. So add this one. Follow-on to 635998965, so no need for back-patch. Discussion: https://postgr.es/m/3121329.1758650878@sss.pgh.pa.us
2025-09-24Remove PointerIsValid()Peter Eisentraut
This doesn't provide any value over the standard style of checking the pointer directly or comparing against NULL. Also remove related: - AllocPointerIsValid() [unused] - IndexScanIsValid() [had one user] - HeapScanIsValid() [unused] - InvalidRelation [unused] Leaving HeapTupleIsValid(), ItemIdIsValid(), PortalIsValid(), RelationIsValid for now, to reduce code churn. Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/ad50ab6b-6f74-4603-b099-1cd6382fb13d%40eisentraut.org Discussion: https://www.postgresql.org/message-id/CA+hUKG+NFKnr=K4oybwDvT35dW=VAjAAfiuLxp+5JeZSOV3nBg@mail.gmail.com Discussion: https://www.postgresql.org/message-id/bccf2803-5252-47c2-9ff0-340502d5bd1c@iki.fi
2025-09-24Fix incorrect option name in usage screenDaniel Gustafsson
The usage screen incorrectly refered to the --docs option as --sgml. Backpatch down to v17 where this script was introduced. Author: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/20250729.135638.1148639539103758555.horikyota.ntt@gmail.com Backpatch-through: 17
2025-09-24Consistently handle tab delimiters for wait event namesDaniel Gustafsson
Format validation and element extraction for intermediate line strings were inconsistent in their handling of tab delimiters, which resulted in an unclear error when multiple tab characters were used as a delimiter. This fixes it by using captures from the validation regex instead of a separate split() to avoid the inconsistency. Also, it ensures that \t+ is used consistently when inspecting the strings. Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/20250729.135638.1148639539103758555.horikyota.ntt@gmail.com
2025-09-24Update GB18030 encoding from version 2000 to 2022John Naylor
Mappings for 18 characters have changed, affecting 36 code points. This is a break in compatibility, but these characters are rarely used. U+E5E5 (Private Use Area) was previously mapped to \xA3A0. This code point now maps to \x65356535. Attempting to convert \xA3A0 will now raise an error. Separate from the 2022 update, the following mappings were previously swapped, and subsequently corrected in 2000 and later versions: * U+E7C7 (Private Use Area) now maps to \x8135F437 * U+1E3F (Latin Small Letter M with Acute) now maps to \xA8BC The 2022 standard mentions the following policy changes, but they have no effect in our implementation: 66 new ideographs are now required, but these are mapped algorithmically so were already handled by utf8_and_gb18030.c. Nine CJK compatibility ideographs are no longer required, but implementations may retain them, as does the source we use from the Unicode Consortium. Release notes: Compatibility section For further details, see: https://www.unicode.org/L2/L2022/22274-disruptive-changes.pdf https://ken-lunde.medium.com/the-gb-18030-2022-standard-3d0ebaeb4132 Author: Chao Li <lic@highgo.com> Author: Zheng Tao <taoz@highgo.com> Discussion: https://postgr.es/m/966d9fc.169.198741fe60b.Coremail.jiaoshuntian%40highgo.com
2025-09-20Add support for base64url encoding and decodingDaniel Gustafsson
This adds support for base64url encoding and decoding, a base64 variant which is safe to use in filenames and URLs. base64url replaces '+' in the base64 alphabet with '-' and '/' with '_', thus making it safe for URL addresses and file systems. Support for base64url was originally suggested by Przemysław Sztoch. Author: Florents Tselai <florents.tselai@gmail.com> Reviewed-by: Aleksander Alekseev <aleksander@timescale.com> Reviewed-by: David E. Wheeler <david@justatheory.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Chao Li (Evan) <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/70f2b6a8-486a-4fdb-a951-84cef35e22ab@sztoch.pl
2025-09-20Track the maximum possible frequency of non-MCE array elements.Tom Lane
The lossy-counting algorithm that ANALYZE uses to identify most-common array elements has a notion of cutoff frequency: elements with frequency greater than that are guaranteed to be collected, elements with smaller frequencies are not. In cases where we find fewer MCEs than the stats target would permit us to store, the cutoff frequency provides valuable additional information, to wit that there are no non-MCEs with frequency greater than that. What the selectivity estimation functions actually use the "minfreq" entry for is as a ceiling on the possible frequency of non-MCEs, so using the cutoff rather than the lowest stored MCE frequency provides a tighter bound and more accurate estimates. Therefore, instead of redundantly storing the minimum observed MCE frequency, store the cutoff frequency when there are fewer tracked values than we want. (When there are more, then of course we cannot assert that no non-stored elements are above the cutoff frequency, since we're throwing away some that are; so we still use the minimum stored frequency in that case.) Notably, this works even when none of the values are common enough to be called MCEs. In such cases we previously stored nothing in the STATISTIC_KIND_MCELEM pg_statistic slot, which resulted in the selectivity functions falling back to default estimates. So in that case we want to construct a STATISTIC_KIND_MCELEM entry that contains no "values" but does have "numbers", to wit the three extra numbers that the MCELEM entry type defines. A small obstacle is that update_attstats() has traditionally stored a null, not an empty array, when passed zero "values" for a slot. That gives rise to an MCELEM entry that get_attstatsslot() will spit up on. The least risky solution seems to be to adjust update_attstats() so that it will emit a non-null (but possibly empty) array when the passed stavalues array pointer isn't NULL, rather than conditioning that on numvalues > 0. In other existing cases I don't believe that that changes anything. For consistency, handle the stanumbers array the same way. In passing, improve the comments in routines that use STATISTIC_KIND_MCELEM data. Particularly, explain why we use minfreq / 2 not minfreq as the estimate for non-MCE values. Thanks to Matt Long for the suggestion that we could apply this idea even when there are more than zero MCEs. Reported-by: Mark Frost <FROSTMAR@uk.ibm.com> Reported-by: Matt Long <matt@mattlong.org> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/PH3PPF1C905D6E6F24A5C1A1A1D8345B593E16FA@PH3PPF1C905D6E6.namprd15.prod.outlook.com
2025-09-20Re-allow using statistics for bool-valued functions in WHERE.Tom Lane
Commit a391ff3c3, which added the ability for a function's support function to provide a custom selectivity estimate for "WHERE f(...)", unintentionally removed the possibility of applying expression statistics after finding there's no applicable support function. That happened because we no longer fell through to boolvarsel() as before. Refactor to do so again, putting the 0.3333333 default back into boolvarsel() where it had been (cf. commit 39df0f150). I surely wouldn't have made this error if 39df0f150 had included a test case, so add one now. At the time we did not have the "extended statistics" infrastructure, but we do now, and it is also unable to work in this scenario because of this error. So make use of that for the test case. This is very clearly a bug fix, but I'm afraid to put it into released branches because of the likelihood of altering plan choices, which we avoid doing in minor releases. So, master only. Reported-by: Frédéric Yhuel <frederic.yhuel@dalibo.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/a8b99dce-1bfb-4d97-af73-54a32b85c916@dalibo.com
2025-09-19Document and check that PgStat_HashKey has no paddingMichael Paquier
This change is a tighter rework of 7d85d87f4d5c, which tried to improve the code so as it would work should PgStat_HashKey gain new fields that create padding bytes. However, the previous change is proving to not be enough as some code paths of pgstats do not pass PgStat_HashKey by reference (valgrind would warn when padding is added to the structure, through a new field). Per discussion, let's document and check that PgStat_HashKey has no padding rather than try to complicate the code of pgstats so as it is able to work around that. This removes a couple of memset(0) calls that should not be required. While on it, this commit adds a static assertion checking that no padding is introduced in the structure, by checking that the size of PgStat_HashKey matches with the sum of the size of all its fields. The object ID part of the hash key is already 8 bytes, which should be plenty enough already. A comment is added to discourage the addition of new fields. Author: Michael Paquier <michael@paquier.xyz> Reviewed-by: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/CAA5RZ0t9omat+HVSakJXwTMWvhpYFcAZb41RPWKwrKFUgmAFBQ@mail.gmail.com
2025-09-16Provide more-specific error details/hints for function lookup failures.Tom Lane
Up to now we've contented ourselves with a one-size-fits-all error hint when we fail to find any match to a function or procedure call. That was mostly okay in the beginning, but it was never great, and since the introduction of named arguments it's really not adequate. We at least ought to distinguish "function name doesn't exist" from "function name exists, but not with those argument names". And the rules for named-argument matching are arcane enough that some more detail seems warranted if we match the argument names but the call still doesn't work. This patch creates a framework for dealing with these problems: FuncnameGetCandidates and related code will now pass back a bitmask of flags showing how far the match succeeded. This allows a considerable amount of granularity in the reports. The set-bits-in-a-bitmask approach means that when there are multiple candidate functions, the report will reflect the match(es) that got the furthest, which seems correct. Also, we can avoid mentioning "maybe add casts" unless failure to match argument types is actually the issue. Extend the same return-a-bitmask approach to OpernameGetCandidates. The issues around argument names don't apply to operator syntax, but it still seems worth distinguishing between "there is no operator of that name" and "we couldn't match the argument types". While at it, adjust these messages and related ones to more strictly separate "detail" from "hint", following our message style guidelines' distinction between those. Reported-by: Dominique Devienne <ddevienne@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/1756041.1754616558@sss.pgh.pa.us
2025-09-16Generate GB18030 mappings from the Unicode Consortium's UCM fileJohn Naylor
Previously we built the .map files for GB18030 (version 2000) from an XML file. The 2022 version for this encoding is only available as a Unicode Character Mapping (UCM) file, so as preparatory refactoring switch to this format as the source for building version 2000. As we do with most input files for the conversion mappings, download the file on demand. In order to generate the same mappings we have now, we must download from a previous upstream commit, rather than the head since the latter contains a correction not present in our current .map files. The XML file is still used by EUC_CN, so we cannot delete it from our repository. GB18030 is a superset of EUC_CN, so it may be possible to build EUC_CN from the same UCM file, but that is left for future work. Author: Chao Li <lic@highgo.com> Discussion: https://postgr.es/m/966d9fc.169.198741fe60b.Coremail.jiaoshuntian%40highgo.com
2025-09-15Change fmgr.h typedefs to use original namesPeter Eisentraut
fmgr.h defined some types such as fmNodePtr which is just Node *, but it made its own types to avoid having to include various header files. With C11, we can now instead typedef the original names without fear of conflicts. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/10d32190-f31b-40a5-b177-11db55597355@eisentraut.org
2025-09-13Amend recent fix for SIMILAR TO regex conversion.Tom Lane
Commit e3ffc3e91 fixed the translation of character classes in SIMILAR TO regular expressions. Unfortunately the fix broke a corner case: if there is an escape character right after the opening bracket (for example in "[\q]"), a closing bracket right after the escape sequence would not be seen as closing the character class. There were two more oversights: a backslash or a nested opening bracket right at the beginning of a character class should remove the special meaning from any following caret or closing bracket. This bug suggests that this code needs to be more readable, so also rename the variables "charclass_depth" and "charclass_start" to something more meaningful, rewrite an "if" cascade to be more consistent, and improve the commentary. Reported-by: Dominique Devienne <ddevienne@gmail.com> Reported-by: Stephan Springl <springl-psql@bfw-online.de> Author: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAFCRh-8NwJd0jq6P=R3qhHyqU7hw0BTor3W0SvUcii24et+zAw@mail.gmail.com Backpatch-through: 13
2025-09-12Allow redeclaration of typedef yyscan_tPeter Eisentraut
This is allowed in C11, so we don't need the workaround guards against it anymore. This effectively reverts commit 382092a0cd2 that put these guards in place. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/10d32190-f31b-40a5-b177-11db55597355@eisentraut.org
2025-09-12Default to log_lock_waits=onPeter Eisentraut
If someone is stuck behind a lock for more than a second, that is almost always a problem that is worth a log entry. Author: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-By: Michael Banck <mbanck@gmx.net> Reviewed-By: Robert Haas <robertmhaas@gmail.com> Reviewed-By: Christoph Berg <myon@debian.org> Reviewed-By: Stephen Frost <sfrost@snowman.net> Discussion: https://postgr.es/m/b8b8502915e50f44deb111bc0b43a99e2733e117.camel%40cybertec.at
2025-09-10meson: Build numeric.c with -ftree-vectorize.Nathan Bossart
autoconf builds have compiled this file with -ftree-vectorize since commit 8870917623, but meson builds seem to have missed the memo. Reviewed-by: Jeff Davis <pgsql@j-davis.com> Discussion: https://postgr.es/m/aL85CeasM51-0D1h%40nathan Backpatch-through: 16
2025-09-10Remove dynahash.hMichael Paquier
All the callers of my_log2() are now limited inside dynahash.c, so let's remove this header. The same capability is provided by pg_bitutils.h already. Discussion: https://postgr.es/m/CAEZATCUJPQD_7sC-wErak2CQGNa6bj2hY-mr8wsBki=kX7f2_A@mail.gmail.com
2025-09-09Fix typo in commentPeter Eisentraut
Author: Alexandra Wang <alexandra.wang.oss@gmail.com> Discussion: https://www.postgresql.org/message-id/CAK98qZ0whQ%3Dc%2BJGXbGSEBxCtLgy6sf-YGYqsKTAGsS-wt0wj%2BA%40mail.gmail.com
2025-09-09Add date and timestamp variants of random(min, max).Dean Rasheed
This adds 3 new variants of the random() function: random(min date, max date) returns date random(min timestamp, max timestamp) returns timestamp random(min timestamptz, max timestamptz) returns timestamptz Each returns a random value x in the range min <= x <= max. Author: Damien Clochard <damien@dalibo.info> Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com> Reviewed-by: Vik Fearing <vik@postgresfriends.org> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/f524d8cab5914613d9e624d9ce177d3d@dalibo.info
2025-09-08Fix corruption of pgstats shared hashtable due to OOM failuresMichael Paquier
A new pgstats entry is created as a two-step process: - The entry is looked at in the shared hashtable of pgstats, and is inserted if not found. - When not found and inserted, its fields are then initialized. This part include a DSA chunk allocation for the stats data of the new entry. As currently coded, if the DSA chunk allocation fails due to an out-of-memory failure, an ERROR is generated, leaving in the pgstats shared hashtable an inconsistent entry due to the first step, as the entry has already been inserted in the hashtable. These broken entries can then be found by other backends, crashing them. There are only two callers of pgstat_init_entry(), when loading the pgstats file at startup and when creating a new pgstats entry. This commit changes pgstat_init_entry() so as we use dsa_allocate_extended() with DSA_ALLOC_NO_OOM, making it return NULL on allocation failure instead of failing. This way, a backend failing an entry creation can take appropriate cleanup actions in the shared hashtable before throwing an error. Currently, this means removing the entry from the shared hashtable before throwing the error for the allocation failure. Out-of-memory errors unlikely happen in the wild, and we do not bother with back-patches when these are fixed, usually. However, the problem dealt with here is a degree worse as it breaks the shared memory state of pgstats, impacting other processes that may look at an inconsistent entry that a different process has failed to create. Author: Mikhail Kot <mikhail.kot@databricks.com> Discussion: https://postgr.es/m/CAAi9E7jELo5_-sBENftnc2E8XhW2PKZJWfTC3i2y-GMQd2bcqQ@mail.gmail.com Backpatch-through: 15
2025-09-06Allow to log raw parse tree.Tatsuo Ishii
This commit allows to log the raw parse tree in the same way we currently log the parse tree, rewritten tree, and plan tree. To avoid unnecessary log noise for users not interested in this detail, a new GUC option, "debug_print_raw_parse", has been added. When starting the PostgreSQL process with "-d N", and N is 3 or higher, debug_print_raw_parse is enabled automatically, alongside debug_print_parse. Author: Chao Li <lic@highgo.com> Reviewed-by: Tender Wang <tndrwang@gmail.com> Reviewed-by: Tatsuo Ishii <ishii@postgresql.org> Reviewed-by: John Naylor <johncnaylorls@gmail.com> Discussion: https://postgr.es/m/CAEoWx2mcO0Gpo4vd8kPMAFWeJLSp0MeUUnaLdE1x0tSVd-VzUw%40mail.gmail.com
2025-09-05Switch some numeric-related functions to use soft error reportingMichael Paquier
This commit changes some functions related to the data type numeric to use the soft error reporting rather than a custom boolean flag (called "have_error") that callers of these functions could rely on to bypass the generation of ERROR reports, letting the callers do their own error handling (timestamp, jsonpath and numeric_to_char() require them). This results in the removal of some boilerplate code that was required to handle both the ereport() and the "have_error" code paths bypassing ereport(), unifying everything under the soft error reporting facility. While on it, some duplicated error messages are removed. The function upgraded in this commit were suffixed with "_opt_error" in their names. They are renamed to "_safe" instead. This change relies on d9f7f5d32f20, that has introduced the soft error reporting infrastructure. Author: Amul Sul <sulamul@gmail.com> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com> Discussion: https://postgr.es/m/CAAJ_b96No5h5tRuR+KhcC44YcYUCw8WAHuLoqqyyop8_k3+JDQ@mail.gmail.com
2025-09-05Change pg_lsn_in_internal() to use soft error reportingMichael Paquier
pg_lsn includes pg_lsn_in_internal() for the purpose of parsing a LSN position for the GUC recovery_target_lsn (21f428ebde39). It relies on a boolean called "have_error" that would be set when the LSN parsing fails, then let its callers handle any errors. d9f7f5d32f20 has added support for soft error reporting. This commit removes some boilerplate code and switches the routine to use soft error reporting directly, giving to the callers of pg_lsn_in_internal() the possibility to be fed the error message generated on failure. pg_lsn_in_internal() routine is renamed to pg_lsn_in_safe(), for consistency with other similar routines that are given an escontext. Author: Amul Sul <sulamul@gmail.com> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com> Discussion: https://postgr.es/m/CAAJ_b96No5h5tRuR+KhcC44YcYUCw8WAHuLoqqyyop8_k3+JDQ@mail.gmail.com
2025-09-04Adjust commentary for WaitEventLWLock in wait_event_names.txt.Nathan Bossart
In addition to changing a couple of references for clarity, this commit combines the two similar comments.
2025-09-04Fix incorrect comment in pgstat_backend.cMichael Paquier
The counters saved from pgWalUsage, used for the difference calculations when flushing the backend WAL stats, are updated when calling pgstat_flush_backend() under PGSTAT_BACKEND_FLUSH_WAL, and not pgstat_report_wal(). The comment updated in this commit referenced the latter, but it is perfectly OK to flush the backend stats independently of the WAL stats. Noticed while looking at this area of the code, introduced by 76def4cdd7c2 as a copy-pasto. Backpatch-through: 18
2025-09-03Fix mistake in new GUC tables sourcePeter Eisentraut
Commit 63599896545 had it so that the parameter "debug_discard_caches" did not exist unless DISCARD_CACHES_ENABLED was defined (typically via enabling asserts). This was a mistake, it did not correspond to the prior setup. Several tests use this parameter, so they were now failing if you did not have asserts enabled.
2025-09-03Generate GUC tables from .dat filePeter Eisentraut
Store the information in guc_tables.c in a .dat file similar to the catalog data in src/include/catalog/, and generate a part of guc_tables.c from that. The goal is to make it easier to edit that information, and to be able to make changes to the downstream data structures more easily. (Essentially, those are the same reasons as for the original adoption of the .dat format.) Reviewed-by: John Naylor <johncnaylorls@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: David E. Wheeler <david@justatheory.com> Discussion: https://www.postgresql.org/message-id/flat/dae6fe89-1e0c-4c3f-8d92-19d23374fb10%40eisentraut.org
2025-09-02Generate pgstat_count_slru*() functions for slru using macrosMichael Paquier
This change replaces seven functions definitions by macros, reducing a bit some repetitive patterns in the code. An interesting side effect is that this removes an inconsistency in the naming of SLRU increment functions with the field names. This change is similar to 850f4b4c8cab, 8018ffbf5895 or 83a1a1b56645. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/aLHA//gr4dTpDHHC@ip-10-97-1-34.eu-west-3.compute.internal
2025-08-28Avoid including commands/dbcommands.h in so many placesÁlvaro Herrera
This has been done historically because of get_database_name (which since commit cb98e6fb8fd4 belongs in lsyscache.c/h, so let's move it there) and get_database_oid (which is in the right place, but whose declaration should appear in pg_database.h rather than dbcommands.h). Clean this up. Also, xlogreader.h and stringinfo.h are no longer needed by dbcommands.h since commit f1fd515b393a, so remove them. Author: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/202508191031.5ipojyuaswzt@alvherre.pgsql
2025-08-27Fix: Don't strip $libdir from nested module_pathnamesPeter Eisentraut
This patch fixes a bug in how 'load_external_function' handles '$libdir/ prefixes in module paths. Previously, 'load_external_function' would unconditionally strip '$libdir/' from the beginning of the 'filename' string. This caused an issue when the path was nested, such as "$libdir/nested/my_lib". Stripping the prefix resulted in a path of "nested/my_lib", which would fail to be found by the expand_dynamic_library_name function because the original '$libdir' macro was removed. To fix this, the code now checks for the presence of an additional directory separator ('/' or '\') after the '$libdir/' prefix. The prefix is only stripped if the remaining string does not contain a separator. This ensures that simple filenames like '"$libdir/my_lib"' are correctly handled, while nested paths are left intact for 'expand_dynamic_library_name' to process correctly. Reported-by: Dilip Kumar <dilipbalaut@gmail.com> Co-authored-by: Matheus Alcantara <matheusssilv97@gmail.com> Co-authored-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Srinath Reddy Sadipiralla <srinath2133@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAFiTN-uKNzAro4tVwtJhF1UqcygfJ%2BR%2BRL%3Db-_ZMYE3LdHoGhA%40mail.gmail.com
2025-08-25Formatting cleanup of guc_tables.cPeter Eisentraut
This cleans up a few minor formatting inconsistencies. Reviewed-by: John Naylor <johncnaylorls@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/dae6fe89-1e0c-4c3f-8d92-19d23374fb10%40eisentraut.org
2025-08-22Revert "Get rid of WALBufMappingLock"Alexander Korotkov
This reverts commit bc22dc0e0ddc2dcb6043a732415019cc6b6bf683. It appears that conditional variables are not suitable for use inside critical sections. If WaitLatch()/WaitEventSetWaitBlock() face postmaster death, they exit, releasing all locks instead of PANIC. In certain situations, this leads to data corruption. Reported-by: Andrey Borodin <x4mmm@yandex-team.ru> Discussion: https://postgr.es/m/B3C69B86-7F82-4111-B97F-0005497BB745%40yandex-team.ru Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru> Reviewed-by: Aleksander Alekseev <aleksander@tigerdata.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: Tomas Vondra <tomas@vondra.me> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Yura Sokolov <y.sokolov@postgrespro.ru> Reviewed-by: Michael Paquier <michael@paquier.xyz> Backpatch-through: 18