| Age | Commit message (Collapse) | Author |
|
Run renumber_oids.pl to move high-numbered OIDs down, as per pre-beta
tasks specified by RELEASE_CHANGES. For reference, the command was
./renumber_oids.pl --first-mapped-oid 8000 --target-oid 6300
This should have been done prior to beta1, but it was forgotten. This
will ensure we get the correct numbering for beta2 onward.
|
|
This is required before the creation of a new branch. pgindent is
clean, as well as is reformat-dat-files.
perltidy version is v20230309, as documented in pgindent's README.
|
|
Just like selecting from a view is exploitable (CVE-2024-7348),
selecting from a table with virtual generated columns is exploitable.
Users who are concerned about this can avoid selecting from views, but
telling them to avoid selecting from tables is less practical.
To address this, this changes it so that generation expressions for
virtual generated columns are restricted to using built-in functions
and types, and the columns are restricted to having a built-in type.
We assume that built-in functions and types cannot be exploited for
this purpose.
In the future, this could be expanded by some new mechanism to declare
other functions and types as safe or trusted for this purpose, but
that is to be designed.
(An alternative approach might have been to expand the
restrict_nonsystem_relation_kind GUC to handle this, like the fix for
CVE-2024-7348. But that is kind of an ugly approach. That fix had to
fit in the constraints of fixing an ancient vulnerability in all
branches. Since virtual generated columns are new, we're free from
the constraints of the past, and we can and should use cleaner
options.)
Reported-by: Feike Steenbergen <feikesteenbergen@gmail.com>
Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAK_s-G2Q7de8Q0qOYUR%3D_CTB5FzzVBm5iZjOp%2BmeVWpMpmfO0w%40mail.gmail.com
|
|
The algorithm to squash lists of constants added by commit 62d712ecfd94
was a bit too simplistic; we wanted to avoid adding unnecessary
complexity, but cases like direct function calls of typecasting
functions (and others) were missed, and bogus SQL syntax was being shown
in pg_stat_statements normalized query text field. To fix normalization
for those cases, we need the parser to transmit information about were
each list of constant values starts and ends, so add that to a couple of
nodes. Also add a few more test cases to make sure we're doing the
right thing.
The patch initially submitted by Sami added a new private struct in
gram.y to carry the start/end information for A_Expr, but I (Álvaro)
decided that a better fix was to remove the parser indirection via the
in_expr production, and instead create separate components in the a_expr
rule. I'm surprised that this works and doesn't require more changes,
but I assume (without checking) that the grammar used to be more complex
and got simplified at some point.
Bump catversion.
Author: Sami Imseih <samimseih@gmail.com>
Author: Dmitry Dolgov <9erthalion6@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAA5RZ0tRXoPG2y6bMgBCWNDt0Tn=unRerbzYM=oW0syi1=C1OA@mail.gmail.com
|
|
Commit 7406ab623fe added a gist support function that we internally
refer to by the symbol GIST_STRATNUM_PROC. This translated from
"well-known" strategy numbers to opfamily-specific strategy numbers.
However, we later (commit 630f9a43cec) changed this to fit into
index-AM-level compare type mapping, so this function actually now
maps from compare type to opfamily-specific strategy numbers. So this
name is no longer fitting.
Moreover, the index AM level also supports the opposite, a function to
map from strategy number to compare type. This is currently not
supported in gist, but one might wonder what this function is supposed
to be called when it is added.
This patch changes the naming of the gist-level functionality to be
more in line with the index-AM-level functionality. This makes sense
because these are essentially the same thing on different levels.
This also changes the names of the externally visible functions that
are provided for use as such a support function.
Reviewed-by: Paul A Jungwirth <pj@illuminatedcomputing.com>
Discussion: https://www.postgresql.org/message-id/37ebb1d9-9036-485f-a215-e55435689917%40eisentraut.org
|
|
Due to concerns raised about the approach, and memory leaks found
in sensitive contexts the functionality is reverted. This reverts
commits 45e7e8ca9, f8c115a6c, d2a1ed172, 55ef7abf8 and 042a66291
for v18 with an intent to revisit this patch for v19.
Discussion: https://postgr.es/m/594293.1747708165@sss.pgh.pa.us
|
|
A few places that access this catalog don't set up an active
snapshot before potentially accessing its TOAST table. However,
roname (the replication origin name) is the only varlena column, so
this is only a problem if the name requires out-of-line storage.
This commit removes its TOAST table to avoid needing to set up a
snapshot. It also places a limit on replication origin names so
that attempts to set long names will fail with a more user-friendly
error. Those chosen limit of 512 bytes should be sufficient to
avoid "row is too big" errors independent of BLCKSZ, but it should
also be lenient enough for all reasonable use-cases.
Bumps catversion.
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Euler Taveira <euler@eulerto.com>
Reviewed-by: Nisha Moond <nisha.moond412@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan
|
|
Commit 0bada39c83a150079567a6e97b1a25a198f30ea3 fixed a bug of this kind,
which existed in all branches for six days before detection. While the
probability of reaching the trouble was low, the disruption was extreme. No
new backends could start, and service restoration needed an immediate
shutdown. Hence, add this to catch the next bug like it.
The new check in RelationIdGetRelation() suffices to make autovacuum detect
the bug in commit 243e9b40f1b2dd09d6e5bf91ebf6e822a2cd3704 that led to commit
0bada39. This also checks in a number of similar places. It replaces each
Assert(IsTransactionState()) that pertained to a conditional catalog read.
No back-patch for now, but a back-patch of commit 243e9b4 should back-patch
this, too. A back-patch could omit the src/test/regress changes, since back
branches won't gain new index columns.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/20250410191830.0e.nmisch@google.com
Discussion: https://postgr.es/m/10ec0bc3-5933-1189-6bb8-5dec4114558e@gmail.com
|
|
Commit c57971034e69ca renamed an argument for a function but missed
to bump the catversion to reflect this.
Reported-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAApHDvqOega=dPtu3h2C5fJWJEuaGCMDib_sVfhKQqgUNJVmFA@mail.gmail.com
|
|
During development the third argument to pg_get_process_memory_contexts
was a retry count, but it was changed to a timeout instead. The param
name was accidentally left in pg_proc.dat though. Fix by renaming to
the correct parameter name.
Author: Fujii Masao <masao.fujii@oss.nttdata.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/3eb40b3e-45c7-426a-b7f8-81f7d05a9b53@oss.nttdata.com
|
|
This adds a function for retrieving memory context statistics
and information from backends as well as auxiliary processes.
The intended usecase is cluster debugging when under memory
pressure or unanticipated memory usage characteristics.
When calling the function it sends a signal to the specified
process to submit statistics regarding its memory contexts
into dynamic shared memory. Each memory context is returned
in detail, followed by a cumulative total in case the number
of contexts exceed the max allocated amount of shared memory.
Each process is limited to use at most 1Mb memory for this.
A summary can also be explicitly requested by the user, this
will return the TopMemoryContext and a cumulative total of
all lower contexts.
In order to not block on busy processes the caller specifies
the number of seconds during which to retry before timing out.
In the case where no statistics are published within the set
timeout, the last known statistics are returned, or NULL if
no previously published statistics exist. This allows dash-
board type queries to continually publish even if the target
process is temporarily congested. Context records contain a
timestamp to indicate when they were submitted.
Author: Rahila Syed <rahilasyed90@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Reviewed-by: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Discussion: https://postgr.es/m/CAH2L28v8mc9HDt8QoSJ8TRmKau_8FM_HKS41NeO9-6ZAkuZKXw@mail.gmail.com
|
|
Introduce new pg_shmem_alloctions_numa view with information about how
shared memory is distributed across NUMA nodes. For each shared memory
segment, the view returns one row for each NUMA node backing it, with
the total amount of memory allocated from that node.
The view may be relatively expensive, especially when executed for the
first time in a backend, as it has to touch all memory pages to get
reliable information about the NUMA node. This may also force allocation
of the shared memory.
Unlike pg_shmem_allocations, the view does not show anonymous shared
memory allocations. It also does not show memory allocated using the
dynamic shared memory infrastructure.
Author: Jakub Wartak <jakub.wartak@enterprisedb.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CAKZiRmxh6KWo0aqRqvmcoaX2jUxZYb4kGp3N%3Dq1w%2BDiH-696Xw%40mail.gmail.com
|
|
Add basic NUMA awareness routines, using a minimal src/port/pg_numa.c
portability wrapper and an optional build dependency, enabled by
--with-libnuma configure option. For now this is Linux-only, other
platforms may be supported later.
A built-in SQL function pg_numa_available() allows checking NUMA
support, i.e. that the server was built/linked with the NUMA library.
The main function introduced is pg_numa_query_pages(), which allows
determining the NUMA node for individual memory pages. Internally the
function uses move_pages(2) syscall, as it allows batching, and is more
efficient than get_mempolicy(2).
Author: Jakub Wartak <jakub.wartak@enterprisedb.com>
Co-authored-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CAKZiRmxh6KWo0aqRqvmcoaX2jUxZYb4kGp3N%3Dq1w%2BDiH-696Xw%40mail.gmail.com
|
|
The list of transform objects that a function should use is specified
in CREATE FUNCTION's TRANSFORM clause, and then represented indirectly
in pg_proc.protrftypes. However, ProcedureCreate completely ignored
that for purposes of constructing pg_depend entries, and instead made
the function depend on any transforms that exist for its parameter or
return data types. This is bad in both directions: the function could
be made dependent on a transform it does not actually use, or it
could try to use a transform that's since been dropped. (The latter
scenario would require use of a transform that's not for any of the
parameter or return types, but that seems legit for cases where the
function performs SQL operations internally.)
To fix, pass in the list of transform objects that CreateFunction
identified, and build pg_depend entries from that not from the
parameter/return types. This results in changes in the expected
test outputs in contrib/bool_plperl, which I guess are due to
different ordering of pg_depend entries -- that test case is
surely not exercising either of the problem scenarios.
This fix is not back-patchable as-is: changing the signature of
ProcedureCreate seems too risky in stable branches. We could
do something like making ProcedureCreate a wrapper around
ProcedureCreateExt or so. However, I'm more inclined to do
nothing in the back branches. We had no field complaints up to
now, so the hazards don't seem to be a big issue in practice.
And we couldn't do anything about existing pg_depend entries,
so a back-patched fix would result in a mishmash of dependencies
created according to different rules. That cure could be worse
than the disease, perhaps.
I bumped catversion just to lay down a marker that the expected
contents of pg_depend are a bit different than before.
Reported-by: Chapman Flack <jcflack@acm.org>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/3112950.1743984111@sss.pgh.pa.us
|
|
This allows them to be added without scanning the table, and validating
them afterwards without holding access exclusive lock on the table after
any violating rows have been deleted or fixed.
Doing ALTER TABLE ... SET NOT NULL for a column that has an invalid
not-null constraint validates that constraint. ALTER TABLE .. VALIDATE
CONSTRAINT is also supported. There are various checks on whether an
invalid constraint is allowed in a child table when the parent table has
a valid constraint; this should match what we do for enforced/not
enforced constraints.
pg_attribute.attnotnull is now only an indicator for whether a not-null
constraint exists for the column; whether it's valid or invalid must be
queried in pg_constraint. Applications can continue to query
pg_attribute.attnotnull as before, but now it's possible that NULL rows
are present in the column even when that's set to true.
For backend internal purposes, we cache the nullability status in
CompactAttribute->attnullability that each tuple descriptor carries
(replacing CompactAttribute.attnotnull, which was a mirror of
Form_pg_attribute.attnotnull). During the initial tuple descriptor
creation, based on the pg_attribute scan, we set this to UNRESTRICTED if
pg_attribute.attnotnull is false, or to UNKNOWN if it's true; then we
update the latter to VALID or INVALID depending on the pg_constraint
scan. This flag is also copied when tupledescs are copied.
Comparing tuple descs for equality must also compare the
CompactAttribute.attnullability flag and return false in case of a
mismatch.
pg_dump deals with these constraints by storing the OIDs of invalid
not-null constraints in a separate array, and running a query to obtain
their properties. The regular table creation SQL omits them entirely.
They are then dealt with in the same way as "separate" CHECK
constraints, and dumped after the data has been loaded. Because no
additional pg_dump infrastructure was required, we don't bump its
version number.
I decided not to bump catversion either, because the old catalog state
works perfectly in the new world. (Trying to run with new catalog state
and the old server version would likely run into issues, however.)
System catalogs do not support invalid not-null constraints (because
commit 14e87ffa5c54 didn't allow them to have pg_constraint rows
anyway.)
Author: Rushabh Lathia <rushabh.lathia@gmail.com>
Author: Jian He <jian.universality@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Tested-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/CAGPqQf0KitkNack4F5CFkFi-9Dqvp29Ro=EpcWt=4_hs-Rt+bQ@mail.gmail.com
|
|
Teach nbtree multi-column index scans to opportunistically skip over
irrelevant sections of the index given a query with no "=" conditions on
one or more prefix index columns. When nbtree is passed input scan keys
derived from a predicate "WHERE b = 5", new nbtree preprocessing steps
output "WHERE a = ANY(<every possible 'a' value>) AND b = 5" scan keys.
That is, preprocessing generates a "skip array" (and an output scan key)
for the omitted prefix column "a", which makes it safe to mark the scan
key on "b" as required to continue the scan. The scan is therefore able
to repeatedly reposition itself by applying both the "a" and "b" keys.
A skip array has "elements" that are generated procedurally and on
demand, but otherwise works just like a regular ScalarArrayOp array.
Preprocessing can freely add a skip array before or after any input
ScalarArrayOp arrays. Index scans with a skip array decide when and
where to reposition the scan using the same approach as any other scan
with array keys. This design builds on the design for array advancement
and primitive scan scheduling added to Postgres 17 by commit 5bf748b8.
Testing has shown that skip scans of an index with a low cardinality
skipped prefix column can be multiple orders of magnitude faster than an
equivalent full index scan (or sequential scan). In general, the
cardinality of the scan's skipped column(s) limits the number of leaf
pages that can be skipped over.
The core B-Tree operator classes on most discrete types generate their
array elements with the help of their own custom skip support routine.
This infrastructure gives nbtree a way to generate the next required
array element by incrementing (or decrementing) the current array value.
It can reduce the number of index descents in cases where the next
possible indexable value frequently turns out to be the next value
stored in the index. Opclasses that lack a skip support routine fall
back on having nbtree "increment" (or "decrement") a skip array's
current element by setting the NEXT (or PRIOR) scan key flag, without
directly changing the scan key's sk_argument. These sentinel values
behave just like any other value from an array -- though they can never
locate equal index tuples (they can only locate the next group of index
tuples containing the next set of non-sentinel values that the scan's
arrays need to advance to).
A skip array's range is constrained by "contradictory" inequality keys.
For example, a skip array on "x" will only generate the values 1 and 2
given a qual such as "WHERE x BETWEEN 1 AND 2 AND y = 66". Such a skip
array qual usually has near-identical performance characteristics to a
comparable SAOP qual "WHERE x = ANY('{1, 2}') AND y = 66". However,
improved performance isn't guaranteed. Much depends on physical index
characteristics.
B-Tree preprocessing is optimistic about skipping working out: it
applies static, generic rules when determining where to generate skip
arrays, which assumes that the runtime overhead of maintaining skip
arrays will pay for itself -- or lead to only a modest performance loss.
As things stand, these assumptions are much too optimistic: skip array
maintenance will lead to unacceptable regressions with unsympathetic
queries (queries whose scan can't skip over many irrelevant leaf pages).
An upcoming commit will address the problems in this area by enhancing
_bt_readpage's approach to saving cycles on scan key evaluation, making
it work in a way that directly considers the needs of = array keys
(particularly = skip array keys).
Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Masahiro Ikeda <masahiro.ikeda@nttdata.com>
Reviewed-By: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Reviewed-By: Tomas Vondra <tomas@vondra.me>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Alena Rybakina <a.rybakina@postgrespro.ru>
Discussion: https://postgr.es/m/CAH2-Wzmn1YsLzOGgjAQZdn1STSG_y8qP__vggTaPAYXJP+G4bw@mail.gmail.com
|
|
Previously, ALTER DEFAULT PRIVILEGES did not support large objects.
This meant that to grant privileges to users other than the owner,
permissions had to be manually assigned each time a large object
was created, which was inconvenient.
This commit extends ALTER DEFAULT PRIVILEGES to allow defining default
access privileges for large objects. With this change, specified privileges
will automatically apply to newly created large objects, making privilege
management more efficient.
As a side effect, this commit introduces the new keyword OBJECTS
since it's used in the syntax of ALTER DEFAULT PRIVILEGES.
Original patch by Haruka Takatsuka, with some fixes and tests by Yugo Nagata,
and rebased by Laurenz Albe.
Author: Takatsuka Haruka <harukat@sraoss.co.jp>
Co-authored-by: Yugo Nagata <nagata@sraoss.co.jp>
Co-authored-by: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Masao Fujii <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/20240424115242.236b499b2bed5b7a27f7a418@sraoss.co.jp
|
|
The issue is that the transactions prepared before two-phase decoding is
enabled can fail to replicate to the subscriber after being committed on a
promoted standby following a failover. This is because the two_phase_at
field of a slot, which tracks the LSN from which two-phase decoding
starts, is not synchronized to standby servers. Without two_phase_at, the
logical decoding might incorrectly identify prepared transaction as
already replicated to the subscriber after promotion of standby server,
causing them to be skipped.
To address the issue on HEAD, the two_phase_at field of the slot is
exposed by the pg_replication_slots view and allows the slot
synchronization to copy this value to the corresponding synced slot on the
standby server.
This bug is likely to occur if the user toggles the two_phase option to
true after initial slot creation. Given that altering the two_phase option
of a replication slot is not allowed in PostgreSQL 17, this bug is less
likely to occur. We can't change the view/function definition in
backbranch so we can't push the same fix but we are brainstorming an
appropriate solution for PG17.
Author: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/TYAPR01MB5724CC7C288535BBCEEE65DA94A72@TYAPR01MB5724.jpnprd01.prod.outlook.com
|
|
For GiST, having a sortsupport function allows building the index
using the "sorted build" method, which is much faster.
For b-tree, the sortsupport routine doesn't give any new
functionality, but speeds up sorting a tiny bit. The difference is not
very significant, about 2% in cursory testing on my laptop, because
the range type comparison function has quite a lot of overhead from
detoasting. In any case, since we have the function for GiST anyway,
we might as well register it for the btree opfamily too.
Author: Bernd Helmle <mailings@oopsware.de>
Discussion: https://www.postgresql.org/message-id/64d324ce2a6d535d3f0f3baeeea7b25beff82ce4.camel@oopsware.de
|
|
Create a function that will sort the elements of an array
according to the element type's sort order. If the array
has more than one dimension, the sub-arrays of the first
dimension are sorted per normal array-comparison rules,
leaving their contents alone.
In support of this, add pg_type.typarray to the set of fields
cached by the typcache.
Author: Junwang Zhao <zhjwpku@gmail.com>
Co-authored-by: Jian He <jian.universality@gmail.com>
Reviewed-by: Aleksander Alekseev <aleksander@timescale.com>
Discussion: https://postgr.es/m/CAEG8a3J41a4dpw_-F94fF-JPRXYxw-GfsgoGotKcjs9LVfEEvw@mail.gmail.com
|
|
The new view lists all IO handles that are currently in use and is mainly
useful for PG developers, but may also be useful when tuning PG.
Bumps catversion.
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
|
|
This macro allows dynamically loaded shared libraries (modules) to
provide a wired-in module name and version, and possibly other
compile-time-constant fields in future. This information can be
retrieved with the new pg_get_loaded_modules() function.
This feature is expected to be particularly useful for modules
that do not have any exposed SQL functionality and thus are
not associated with a SQL-level extension object. But even for
modules that do belong to extensions, being able to verify the
actual code version can be useful.
Author: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Yurii Rashkovskii <yrashk@omnigres.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/dd4d1b59-d0fe-49d5-b28f-1e463b68fa32@gmail.com
|
|
These are useful general-purpose math functions which are included in
POSIX and C99, and are commonly included in other math libraries, so
expose them as SQL-callable functions.
Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Stepan Neretin <sncfmgg@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Dmitry Koval <d.koval@postgrespro.ru>
Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com>
Discussion: https://postgr.es/m/CAEZATCXpGyfjXCirFk9au+FvM0y2Ah+2-0WSJx7MO368ysNUPA@mail.gmail.com
|
|
For import and export, use schemaname/relname rather than
regclass.
This is more natural during export, fits with the other arguments
better, and it gives better control over error handling in case we
need to downgrade more errors to warnings.
Also, use text for the argument types for schemaname, relname, and
attname so that casts to "name" are not required.
Author: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/CADkLM=ceOSsx_=oe73QQ-BxUFR2Cwqum7-UP_fPe22DBY0NerA@mail.gmail.com
|
|
Introduce a new conflict type, multiple_unique_conflicts, to handle cases
where an incoming row during logical replication violates multiple UNIQUE
constraints.
Previously, the apply worker detected and reported only the first
encountered key conflict (insert_exists/update_exists), causing repeated
failures as each constraint violation needs to be handled one by one
making the process slow and error-prone.
With this patch, the apply worker checks all unique constraints upfront
once the first key conflict is detected and reports
multiple_unique_conflicts if multiple violations exist. This allows users
to resolve all conflicts at once by deleting all conflicting tuples rather
than dealing with them individually or skipping the transaction.
In the future, this will also allow us to specify different resolution
handlers for such a conflict type.
Add the stats for this conflict type in pg_stat_subscription_stats.
Author: Nisha Moond <nisha.moond412@gmail.com>
Author: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Discussion: https://postgr.es/m/CABdArM7FW-_dnthGkg2s0fy1HhUB8C3ELA0gZX1kkbs1ZZoV3Q@mail.gmail.com
|
|
errdetail_relkind_not_supported() was declared within
EXPOSE_TO_CLIENT_CODE, which is mistaken since that function
isn't available client-side. While relatively harmless,
this isn't good precedent.
Discussion: https://postgr.es/m/1134562.1742507765@sss.pgh.pa.us
|
|
This commit introduces a function for reversing the order of the
bytes in binary strings.
Bumps catversion.
Author: Aleksander Alekseev <aleksander@timescale.com>
Discussion: https://postgr.es/m/CAJ7c6TMe0QVRuNssUArbMi0bJJK32%2BzNA3at5m3osrBQ25MHuw%40mail.gmail.com
|
|
This commit adds per-backend WAL statistics, providing the same
information as pg_stat_wal, except that it is now possible to know how
much WAL activity is happening in each backend rather than an overall
aggregate of all the activity. Like pg_stat_wal, the implementation
relies on pgWalUsage, tracking the difference of activity between two
reports to pgstats.
This data can be retrieved with a new system function called
pg_stat_get_backend_wal(), that returns one tuple based on the PID
provided in input. Like pg_stat_get_backend_io(), this is useful when
joined with pg_stat_activity to get a live picture of the WAL generated
for each running backend, showing how the activity is [un]balanced.
pgstat_flush_backend() gains a new flag value, able to control the flush
of the WAL stats.
This commit relies mostly on the infrastructure provided by
9aea73fc61d4, that has introduced backend statistics.
Bump catalog version. A bump of PGSTAT_FILE_FORMAT_ID is not required,
as backend stats do not persist on disk.
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
Discussion: https://postgr.es/m/Z3zqc4o09dM/Ezyz@ip-10-97-1-34.eu-west-3.compute.internal
|
|
This allows smallint, integer, and bigint values to be cast to and
from bytea. The bytea value is the two's complement representation of
the integer, with the most significant byte first. For example:
1234::bytea -> \x000004d2
(-1234)::bytea -> \xfffffb2e
Author: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-by: Joel Jacobson <joel@compiler.org>
Reviewed-by: Yugo Nagata <nagata@sraoss.co.jp>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/CAJ7c6TPtOp6%2BkFX5QX3fH1SVr7v65uHr-7yEJ%3DGMGQi5uhGtcA%40mail.gmail.com
|
|
An additional paramater ("strip_in_arrays") is added to these functions.
It defaults to false. If true, then null array elements are removed as
well as null valued object fields. JSON that just consists of a single
null is not affected.
Author: Florents Tselai <florents.tselai@gmail.com>
Discussion: https://postgr.es/m/4BCECCD5-4F40-4313-9E98-9E16BEB0B01D@gmail.com
|
|
StoreAttrDefault() is no longer responsible for filling
attmissingval, so remove the code for that.
Get rid of RawColumnDefault.missingMode, too, as we no longer
need that to pass information around.
While here, clean up some sloppy coding in StoreAttrDefault(),
such as failure to use XXXGetDatum macros. These aren't bugs
but they're not good code either.
Reported-by: jian he <jian.universality@gmail.com>
Author: jian he <jian.universality@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CACJufxHFssPvkP1we7WMhPD_1kwgbG52o=kQgL+TnVoX5LOyCQ@mail.gmail.com
|
|
If a domain type has a default, adding a column of that type (without
any explicit DEFAULT clause) failed to install the domain's default
value in existing rows, instead leaving the new column null. This
is unexpected, and it used to work correctly before v11. The cause
is confusion in the atthasmissing mechanism about which default value
to install: we'd only consider installing an explicitly-specified
default, and then we'd decide that no table rewrite is needed.
To fix, take the responsibility for filling attmissingval out of
StoreAttrDefault, and instead put it into ATExecAddColumn's existing
logic that derives the correct value to fill the new column with.
Also, centralize the logic that determines the need for
default-related table rewriting there, instead of spreading it over
four or five places.
In the back branches, we'll leave the attmissingval-filling code
in StoreAttrDefault even though it's now dead, for fear that some
extension may be depending on that functionality to exist there.
A separate HEAD-only patch will clean up the now-useless code.
Reported-by: jian he <jian.universality@gmail.com>
Author: jian he <jian.universality@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CACJufxHFssPvkP1we7WMhPD_1kwgbG52o=kQgL+TnVoX5LOyCQ@mail.gmail.com
Backpatch-through: 13
|
|
Add relallfrozen, an estimate of the number of pages marked all-frozen
in the visibility map.
pg_class already has relallvisible, an estimate of the number of pages
in the relation marked all-visible in the visibility map. This is used
primarily for planning.
relallfrozen, together with relallvisible, is useful for estimating the
outstanding number of all-visible but not all-frozen pages in the
relation for the purposes of scheduling manual VACUUMs and tuning vacuum
freeze parameters.
A future commit will use relallfrozen to trigger more frequent vacuums
on insert-focused workloads with significant volume of frozen data.
Bump catalog version
Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com>
Discussion: https://postgr.es/m/flat/CAAKRu_aj-P7YyBz_cPNwztz6ohP%2BvWis%3Diz3YcomkB3NpYA--w%40mail.gmail.com
|
|
Reported-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/Z77IkjmmfbFfNh3f@paquier.xyz
|
|
After commit f3dae2ae58, the primary purpose of separating the
pg_set_*_stats() from the pg_restore_*_stats() variants was
eliminated.
Leave pg_restore_relation_stats() and pg_restore_attribute_stats(),
which satisfy both purposes, and remove pg_set_relation_stats() and
pg_set_attribute_stats().
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/1457469.1740419458@sss.pgh.pa.us
|
|
The four following attributes are removed from pg_stat_wal:
* wal_write
* wal_sync
* wal_write_time
* wal_sync_time
a051e71e28a1 has added an equivalent of this information in pg_stat_io
with more granularity as this now spreads across the backend types, IO
context and IO objects. So, keeping the same information in pg_stat_wal
has little benefits.
Another benefit of this commit is the removal of PendingWalStats,
simplifying an upcoming patch to add per-backend WAL statistics, which
already support IO statistics and which have access to the write/sync
stats data of WAL.
The GUC track_wal_io_timing, that was used to enable or disable the
aggregation of the write and sync timings for WAL, is also removed.
pgstat_prepare_io_time() is simplified.
Bump catalog version.
Bump PGSTAT_FILE_FORMAT_ID, due to the update of PgStat_WalStats.
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/Z7RkQ0EfYaqqjgz/@ip-10-97-1-34.eu-west-3.compute.internal
|
|
The signedness of the 'char' type in C is
implementation-dependent. For instance, 'signed char' is used by
default on x86 CPUs, while 'unsigned char' is used on aarch
CPUs. Previously, we accidentally let C implementation signedness
affect persistent data. This led to inconsistent results when
comparing char data across different platforms.
This commit introduces a new 'default_char_signedness' field in
ControlFileData to store the signedness of the 'char' type. While this
change does not encourage the use of 'char' without explicitly
specifying its signedness, this field can be used as a hint to ensure
consistent behavior for pre-v18 data files that store data sorted by
the 'char' type on disk (e.g., GIN and GiST indexes), especially in
cross-platform replication scenarios.
Newly created database clusters unconditionally set the default char
signedness to true. pg_upgrade (with an upcoming commit) changes this
flag for clusters if the source database cluster has
signedness=false. As a result, signedness=false setting will become
rare over time. If we had known about the problem during the last
development cycle that forced initdb (v8.3), we would have made all
clusters signed or all clusters unsigned. Making pg_upgrade the only
source of signedness=false will cause the population of database
clusters to converge toward that retrospective ideal.
Bump catalog version (for the catalog changes) and PG_CONTROL_VERSION
(for the additions in ControlFileData).
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/CB11ADBC-0C3F-4FE0-A678-666EE80CBB07%40amazon.com
|
|
The type argument wasn't actually really necessary. It was a remnant
of converting the API of the gist strategy translation from using
opclass to using opfamily+opcintype (commits c09e5a6a016,
622f678c102). For looking up the gist translation function, we used
the convention "amproclefttype = amprocrighttype = opclass's
opcintype" (see pg_amproc.h). But each operator family should only
have one translation function, and getting the right type for the
lookup is sometimes cumbersome and fragile, so this is all
unnecessarily complicated.
To simplify this, change the gist stategy support procedure to take
"any", "any" as argument. (This is arbitrary but seems intuitive.
The alternative of using InvalidOid as argument(s) upsets various DDL
commands, so it's not practical.) Then we don't need opcintype for
the lookup, and we can remove it from all the API layers introduced by
commit c09e5a6a016.
This also adds some more documentation about the correct signature of
the gist support function and adds more checks in gistvalidate().
This was previously underspecified. (It relied implicitly on
convention mentioned above.)
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
|
|
This commit adds the amount of time spent sleeping due to
cost-based delay to the pg_stat_progress_vacuum and
pg_stat_progress_analyze system views. A new configuration
parameter named track_cost_delay_timing, which is off by default,
controls whether this information is gathered. For vacuum, the
reported value includes the sleep time of any associated parallel
workers. However, parallel workers only report their sleep time
once per second to avoid overloading the leader process.
Bumps catversion.
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Co-authored-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Masahiro Ikeda <ikedamsh@oss.nttdata.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: Sergei Kornilov <sk@zsrv.org>
Discussion: https://postgr.es/m/ZmaXmWDL829fzAVX%40ip-10-97-1-34.eu-west-3.compute.internal
|
|
Commit 1dc5ebc90 allowed PL/pgSQL to perform in-place updates
of expanded-object variables that are being updated with
assignments like "x := f(x, ...)". However this was allowed
only for a hard-wired list of functions f(), since we need to
be sure that f() will not modify the variable if it fails.
It was always envisioned that we should make that extensible,
but at the time we didn't have a good way to do so. Since
then we've invented the idea of "support functions" to allow
attaching specialized optimization knowledge to functions,
and that is a perfect mechanism for doing this.
Hence, adjust PL/pgSQL to use a support function request instead
of hard-wired logic to decide if in-place update is safe.
Preserve the previous optimizations by creating support functions
for the three functions that were previously hard-wired.
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Reviewed-by: Pavel Borisov <pashkin.elfe@gmail.com>
Discussion: https://postgr.es/m/CACxu=vJaKFNsYxooSnW1wEgsAO5u_v1XYBacfVJ14wgJV_PYeg@mail.gmail.com
|
|
This adds a new variant of generated columns that are computed on read
(like a view, unlike the existing stored generated columns, which are
computed on write, like a materialized view).
The syntax for the column definition is
... GENERATED ALWAYS AS (...) VIRTUAL
and VIRTUAL is also optional. VIRTUAL is the default rather than
STORED to match various other SQL products. (The SQL standard makes
no specification about this, but it also doesn't know about VIRTUAL or
STORED.) (Also, virtual views are the default, rather than
materialized views.)
Virtual generated columns are stored in tuples as null values. (A
very early version of this patch had the ambition to not store them at
all. But so much stuff breaks or gets confused if you have tuples
where a column in the middle is completely missing. This is a
compromise, and it still saves space over being forced to use stored
generated columns. If we ever find a way to improve this, a bit of
pg_upgrade cleverness could allow for upgrades to a newer scheme.)
The capabilities and restrictions of virtual generated columns are
mostly the same as for stored generated columns. In some cases, this
patch keeps virtual generated columns more restricted than they might
technically need to be, to keep the two kinds consistent. Some of
that could maybe be relaxed later after separate careful
considerations.
Some functionality that is currently not supported, but could possibly
be added as incremental features, some easier than others:
- index on or using a virtual column
- hence also no unique constraints on virtual columns
- extended statistics on virtual columns
- foreign-key constraints on virtual columns
- not-null constraints on virtual columns (check constraints are supported)
- ALTER TABLE / DROP EXPRESSION
- virtual column cannot have domain type
- virtual columns are not supported in logical replication
The tests in generated_virtual.sql have been copied over from
generated_stored.sql with the keyword replaced. This way we can make
sure the behavior is mostly aligned, and the differences can be
visible. Some tests for currently not supported features are
currently commented out.
Reviewed-by: Jian He <jian.universality@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Tested-by: Shlok Kyal <shlok.kyal.oss@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/a368248e-69e4-40be-9c07-6c3b5880b0a6@eisentraut.org
|
|
The column added in commit e65dbc9927, pubgencols_type, was inconsistent
with the naming conventions of other columns in the pg_publication
catalog.
Author: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Discussion: https://postgr.es/m/CALDaNm1u-ufVOW-RUsXSooqzkpohxfZYy=z78fbcr_9Pq5hbCg@mail.gmail.com
|
|
This commit adds four fields to the statistics of relations, aggregating
the amount of time spent for each operation on a relation:
- total_vacuum_time, for manual vacuum.
- total_autovacuum_time, for vacuum done by the autovacuum daemon.
- total_analyze_time, for manual analyze.
- total_autoanalyze_time, for analyze done by the autovacuum daemon.
This gives users the option to derive the average time spent for these
operations with the help of the related "count" fields.
Bump catalog version (for the catalog changes) and PGSTAT_FILE_FORMAT_ID
(for the additions in PgStat_StatTabEntry).
Author: Sami Imseih
Reviewed-by: Bertrand Drouvot, Michael Paquier
Discussion: https://postgr.es/m/CAA5RZ0uVOGBYmPEeGF2d1B_67tgNjKx_bKDuL+oUftuoz+=Y1g@mail.gmail.com
|
|
Useful for caseless matching. Similar to LOWER(), but avoids edge-case
problems with using LOWER() for caseless matching.
For collations that support it, CASEFOLD() handles characters with
more than two case variations or multi-character case variations. Some
characters may fold to uppercase. The results of case folding are also
more stable across Unicode versions than LOWER() or UPPER().
Discussion: https://postgr.es/m/a1886ddfcd8f60cb3e905c93009b646b4cfb74c5.camel%40j-davis.com
Reviewed-by: Ian Lawrence Barwick
|
|
The current boolean publish_generated_columns option only supports a
binary choice, which is insufficient for future enhancements where
generated columns can be of different types (e.g., stored or virtual). The
supported values for the publish_generated_columns option are 'none' and
'stored'.
Author: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/d718d219-dd47-4a33-bb97-56e8fc4da994@eisentraut.org
Discussion: https://postgr.es/m/B80D17B2-2C8E-4C7D-87F2-E5B4BE3C069E@gmail.com
|
|
If a referenced UPDATE changes the temporal start/end times, shrinking
the span the row is valid, we get a false return from
ri_Check_Pk_Match(), but overlapping references may still be valid, if
their reference didn't overlap with the removed span.
We need to consider what span(s) are still provided in the referenced
table. Instead of returning that from ri_Check_Pk_Match(), we can
just look it up in the main SQL query.
Reported-by: Sam Gabrielsson <sam@movsom.se>
Author: Paul Jungwirth <pj@illuminatedcomputing.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
|
|
The PG_UNICODE_FAST locale uses code point sort order (fast,
memcmp-based) combined with Unicode character semantics. The character
semantics are based on Unicode full case mapping.
Full case mapping can map a single codepoint to multiple codepoints,
such as "ß" uppercasing to "SS". Additionally, it handles
context-sensitive mappings like the "final sigma", and it uses
titlecase mappings such as "Dž" when titlecasing (rather than plain
uppercase mappings).
Importantly, the uppercasing of "ß" as "SS" is specifically mentioned
by the SQL standard. In Postgres, UCS_BASIC uses plain ASCII semantics
for case mapping and pattern matching, so if we changed it to use the
PG_UNICODE_FAST locale, it would offer better compliance with the
standard. For now, though, do not change the behavior of UCS_BASIC.
Discussion: https://postgr.es/m/ddfd67928818f138f51635712529bc5e1d25e4e7.camel@j-davis.com
Discussion: https://postgr.es/m/27bb0e52-801d-4f73-a0a4-02cfdd4a9ada@eisentraut.org
Reviewed-by: Peter Eisentraut, Daniel Verite
|
|
If a time zone abbreviation used in datetime input is defined in
the currently active timezone, use that definition in preference
to looking in the timezone_abbreviations list. That allows us to
correctly handle abbreviations that have different meanings in
different timezones. Also, it eliminates an inconsistency between
datetime input and datetime output: the non-ISO datestyles for
timestamptz have always printed abbreviations taken from the IANA
data, not from timezone_abbreviations. Before this fix, it was
possible to demonstrate cases where casting a timestamp to text
and back fails or changes the value significantly because of that
inconsistency.
While this change removes the ability to override the IANA data about
an abbreviation known in the current zone, it's not clear that there's
any real use-case for doing so. But it is clear that this makes life
a lot easier for dealing with abbreviations that have conflicts across
different time zones.
Also update the pg_timezone_abbrevs view to report abbreviations
that are recognized via the IANA data, and *not* report any
timezone_abbreviations entries that are thereby overridden.
Under the hood, there are now two SRFs, one that pulls the IANA
data and one that pulls timezone_abbreviations entries. They're
combined by logic in the view. This approach was useful for
debugging (since the functions can be called on their own).
While I don't intend to document the functions explicitly,
they might be useful to call directly.
Also improve DecodeTimezoneAbbrev's caching logic so that it can
cache zone abbreviations found in the IANA data. Without that,
this patch would have caused a noticeable degradation of the
runtime of timestamptz_in.
Per report from Aleksander Alekseev and additional investigation.
Discussion: https://postgr.es/m/CAJ7c6TOATjJqvhnYsui0=CO5XFMF4dvTGH+skzB--jNhqSQu5g@mail.gmail.com
|
|
This allows the RETURNING list of INSERT/UPDATE/DELETE/MERGE queries
to explicitly return old and new values by using the special aliases
"old" and "new", which are automatically added to the query (if not
already defined) while parsing its RETURNING list, allowing things
like:
RETURNING old.colname, new.colname, ...
RETURNING old.*, new.*
Additionally, a new syntax is supported, allowing the names "old" and
"new" to be changed to user-supplied alias names, e.g.:
RETURNING WITH (OLD AS o, NEW AS n) o.colname, n.colname, ...
This is useful when the names "old" and "new" are already defined,
such as inside trigger functions, allowing backwards compatibility to
be maintained -- the interpretation of any existing queries that
happen to already refer to relations called "old" or "new", or use
those as aliases for other relations, is not changed.
For an INSERT, old values will generally be NULL, and for a DELETE,
new values will generally be NULL, but that may change for an INSERT
with an ON CONFLICT ... DO UPDATE clause, or if a query rewrite rule
changes the command type. Therefore, we put no restrictions on the use
of old and new in any DML queries.
Dean Rasheed, reviewed by Jian He and Jeff Davis.
Discussion: https://postgr.es/m/CAEZATCWx0J0-v=Qjc6gXzR=KtsdvAE7Ow=D=mu50AgOe+pvisQ@mail.gmail.com
|
|
Add "IWYU pragma: export" annotations in each catalog header file so
that, for instance, including "catalog/pg_aggregate.h" is considered
acceptable in place of "catalog/pg_aggregate_d.h". This is very
common and it seems better to silence IWYU about it than trying to fix
this up.
Discussion: https://www.postgresql.org/message-id/flat/9395d484-eff4-47c2-b276-8e228526c8ae@eisentraut.org
|