Age | Commit message (Collapse) | Author |
|
CREATE SCHEMA AUTHORIZATION with appended schema elements can lead to
crashes when comparing the schema name of the query with the schemas
used in the qualification of some clauses in the elements' queries.
The origin of the problem is that the transformation routine for the
elements listed in a CREATE SCHEMA query uses as new, expected, schema
name the one listed in CreateSchemaStmt itself. However, depending on
the query, CreateSchemaStmt.schemaname may be NULL, being computed
instead from the role specification of the query given by the
AUTHORIZATION clause, that could be either:
- A user name string, with the new schema name being set to the same
value as the role given.
- Guessed from CURRENT_ROLE, SESSION_ROLE or CURRENT_ROLE, with a new
schema name computed from the security context where CREATE SCHEMA is
running.
Regression tests are added for CREATE SCHEMA with some appended elements
(some of them with schema qualifications), covering also some role
specification patterns.
While on it, this simplifies the context structure used during the
transformation of the elements listed in a CREATE SCHEMA query by
removing the fields for the role specification and the role type. They
were not used, and for the role specification this could be confusing as
the schema name may by extracted from that at the beginning of
CreateSchemaCommand().
This issue exists for a long time, so backpatch down to all the versions
supported.
Reported-by: Song Hongyu
Author: Michael Paquier
Reviewed-by: Richard Guo
Discussion: https://postgr.es/m/17909-f65c12dfc5f0451d@postgresql.org
Backpatch-through: 11
|
|
In v13 and v14, the ENABLE/DISABLE TRIGGER USER variant malfunctioned
on cloned triggers, failing to find the clones because it thought they
were system triggers. Other variants of ENABLE/DISABLE TRIGGER would
improperly apply a superuserness check. Fix by adjusting the is-it-
a-system-trigger check to match reality in those branches. (As far
as I can find, this is the only place that got it wrong.)
There's no such bug in v15/HEAD, because we revised the catalog
representation of system triggers to be what this code was expecting.
However, add the test case to these branches anyway, because this area
is visibly pretty fragile. Also remove an obsoleted comment.
The recent v15/HEAD commit 6949b921d fixed a nearby bug. I now see
that my commit message for that was inaccurate: the behavior of
recursing to clone triggers is older than v15, but it didn't apply
to the case in v13/v14 because in those branches parent partitioned
tables have no pg_trigger entries for foreign-key triggers. But add
the test case from that commit to v13/v14, just to show what is
happening there.
Per bug #17886 from DzmitryH.
Discussion: https://postgr.es/m/17886-5406d5d828aa4aa3@postgresql.org
|
|
Up through v11 it was sensible to use the "oid" system column as
a foreign key column, but since that was removed there's no visible
usefulness in making any of the remaining system columns a foreign
key. Moreover, since the TupleTableSlot rewrites in v12, such cases
actively fail because of implicit assumptions that only user columns
appear in foreign keys. The lack of complaints about that seems
like good evidence that no one is trying to do it. Hence, rather
than trying to repair those assumptions (of which there are at least
two, maybe more), let's just forbid the case up front.
Per this patch, a system column in either the referenced or
referencing side of a foreign key will draw this error; however,
putting one in the referenced side would have failed later anyway,
since we don't allow unique indexes to be made on system columns.
Per bug #17877 from Alexander Lakhin. Back-patch to v12; the
case still appears to work in v11, so we shouldn't break it there.
Discussion: https://postgr.es/m/17877-4bcc658e33df6de1@postgresql.org
|
|
The totalrows/totaldeadrows outputs were left uninitialized in cases
where we find no analyzable child tables of a partitioned table. This
could lead to setting the partitioned table's pg_class.reltuples value
to garbage. It's not clear that that would have any very bad effects
in practice, but fix it anyway because it's making valgrind unhappy.
Reported and diagnosed by Alexander Lakhin (bug #17880).
Discussion: https://postgr.es/m/17880-9282037c923d856e@postgresql.org
|
|
find_composite_type_dependencies() ignored indexes, which is a poor
decision because an expression index could have a stored column of
a composite (or other container) type even when the underlying table
does not. Teach it to detect such cases and error out. We have to
work a bit harder than for other relations because the pg_depend entry
won't identify the specific index column of concern, but it's not much
new code.
This does not address bug #17872's original complaint that dropping
a column in such a type might lead to violations of the uniqueness
property that a unique index is supposed to ensure. That seems of
much less concern to me because it won't lead to crashes.
Per bug #17872 from Alexander Lakhin. Back-patch to all supported
branches.
Discussion: https://postgr.es/m/17872-d0fbb799dc3fd85d@postgresql.org
|
|
The COPY documentation is quite clear that "COPY relation TO" copies
rows from only the named table, not any inheritance children it may
have. However, if you enabled row-level security on the table then
this stopped being true, because the code forgot to apply the ONLY
modifier in the "SELECT ... FROM relation" query that it constructs
in order to allow RLS predicates to be attached. Fix that.
Report and patch by Antonin Houska (comment adjustments and test case
by me). Back-patch to all supported branches.
Discussion: https://postgr.es/m/3472.1675251957@antos
|
|
The motivation for this change is that when pg_dump dumps a
partitioned index that's marked REPLICA IDENTITY, it generates a
command sequence that applies REPLICA IDENTITY before the partitioned
index has been marked valid, causing restore to fail. We could
perhaps change pg_dump to not do it like that, but that would be
difficult and would not fix existing dump files with the problem.
There seems to be very little reason for the backend to disallow
this anyway --- the code ignores indisreplident when the index
isn't valid --- so instead let's fix it by allowing the case.
Commit 9511fb37a previously expressed a concern that allowing
indisreplident to be set on invalid indexes might allow us to
wind up in a situation where a table could have indisreplident
set on multiple indexes. I'm not sure I follow that concern
exactly, but in any case the only way that could happen is because
relation_mark_replica_identity is too trusting about the existing set
of markings being valid. Let's just rip out its early-exit code path
(which sure looks like premature optimization anyway; what are we
doing expending code to make redundant ALTER TABLE ... REPLICA
IDENTITY commands marginally faster and not-redundant ones marginally
slower?) and fix it to positively guarantee that no more than one
index is marked indisreplident.
The pg_dump failure can be demonstrated in all supported branches,
so back-patch all the way. I chose to back-patch 9511fb37a as well,
just to keep indisreplident handling the same in all branches.
Per bug #17756 from Sergey Belyashov.
Discussion: https://postgr.es/m/17756-dd50e8e0c8dd4a40@postgresql.org
|
|
pg_database.datfrozenxid gets updated using an in-place update at the
end of vacuum or autovacuum. Since 96cdeae, as pg_database has a toast
relation, it is possible for a pg_database tuple to have toast values
if there is a large set of ACLs in place. In such a case, the in-place
update would fail because of the flattening of the toast values done for
the catcache entry fetched. Instead of using a copy from the catcache,
this changes the logic to fetch the copy of the tuple by directly
scanning pg_database.
Note that before 96cdeae, attempting to insert such a tuple to
pg_database would cause a "row is too big" error, so the end-of-vacuum
problem was not reachable.
This issue has been originally fixed in 947789f on v14~, and there have
been reports about this problem on v12 and v13, causing failures at the
end of VACUUM. This completes the fix on all the stable branches where
pg_database can use a toast table, down to 12.
Author: Ashwin Agrawal, Junfeng Yang
Discussion: https://postgr.es/m/DM5PR0501MB38800D9E4605BCA72DD35557CCE10@DM5PR0501MB3880.namprd05.prod.outlook.com
Discussion: https://postgr.es/m/Y70XNVbUWQsR2Car@paquier.xyz
Backpatch-through: 12
|
|
This is a back-patch of the v15-era commit f10f0ae42 into older
supported branches. The idea is to design out bugs in which an
ill-timed relcache flush clears rel->rd_smgr partway through
some code sequence that wasn't expecting that. We had another
report today of a corner case that reliably crashes v14 under
debug_discard_caches (nee CLOBBER_CACHE_ALWAYS), and therefore
would crash once in a blue moon in the field. We're unlikely
to get rid of all such code paths unless we adopt the more
rigorous coding rules instituted by f10f0ae42. Therefore,
even though this is a bit invasive, it's time to back-patch.
Some comfort can be taken in the fact that f10f0ae42 has been
in v15 for 16 months without problems.
I left the RelationOpenSmgr macro present in the back branches,
even though no core code should use it anymore, in order to not break
third-party extensions in minor releases. Such extensions might opt
to start using RelationGetSmgr instead, to reduce their code
differential between v15 and earlier branches. This carries a hazard
of failing to compile against headers from existing minor releases.
However, once compiled the extension should work fine even with such
releases, because RelationGetSmgr is a "static inline" function so
it creates no link-time dependency. So depending on distribution
practices, that might be an OK tradeoff.
Per report from Spyridon Dimitrios Agathos. Original patch
by Amul Sul.
Discussion: https://postgr.es/m/CAFM5RaqdgyusQvmWkyPYaWMwoK5gigdtW-7HcgHgOeAw7mqJ_Q@mail.gmail.com
Discussion: https://postgr.es/m/CANiYTQsU7yMFpQYnv=BrcRVqK_3U3mtAzAsJCaqtzsDHfsUbdQ@mail.gmail.com
|
|
partitioned tables.
"Triggers on partitioned tables cannot have transition tables." is
incorrect as we allow statement-level triggers on partitioned tables to
have transition tables.
This has been wrong since commit 86f575948; back-patch to v11 where that
commit came in.
Reviewed by Tom Lane.
Discussion: https://postgr.es/m/CAPmGK17gk4vXLzz2iG%2BG4LWRWCoVyam70nZ3OuGm1hMJwDrhcg%40mail.gmail.com
|
|
Commit f56f8f8da6af added some code in CloneFkReferencing that's way too
lax about a Constraint node it manufactures, not initializing enough
struct members -- initially_valid in particular was forgotten. This
causes some FKs in partitions added by ALTER TABLE ATTACH PARTITION to
be marked as not validated. Set initially_valid true, which fixes the
bug.
While at it, make the struct initialization more complete. Very similar
code was added in two other places by the same commit; make them all
follow the same pattern for consistency, though no bugs are apparent
there.
This bug has never been reported: I only happened to notice while
working on commit 614a406b4ff1. The test case that was added there with
the improper result is repaired.
Backpatch to 12.
Discussion: https://postgr.es/m/20221005105523.bhuhkdx4olajboof@alvherre.pgsql
|
|
There are a number of bugs in this area. Two of them are fixed here,
namely:
1. get_relation_idx_constraint_oid does not restrict the type of
constraint that's returned, so with sufficient bad luck it can
return the OID of a foreign key constraint. This has the effect that
a primary key in a partition can end up as a child of a foreign key,
which makes no sense (it needs to be the child of the equivalent
primary key.)
Change the API contract so that only index-backed constraints are
returned, mimicking get_constraint_index().
2. Both CloneFkReferenced and CloneFkReferencing clone a
self-referencing foreign key, so the partition ends up with
a duplicate foreign key. Change the former function to ignore such
constraints.
Add some tests to verify that things are better now. (However, these
new tests show some additional misbehavior that will be fixed later --
namely that there's a constraint marked NOT VALID.)
Backpatch to 12, where these constraints are possible at all.
Author: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
Discussion: https://postgr.es/m/20220603154232.1715b14c@karst
|
|
This prevents marking the argument string for translation for gettext,
and it also prevents the given string (which is already translated) from
being translated at runtime.
Also, mark the strings used as arguments to check_rolespec_name for
translation.
Backpatch all the way back as appropriate. None of this is caught by
any tests (necessarily so), so I verified it manually.
|
|
Commit 25936fd46 adjusted things so that the "storeslot" we use
for remapping trigger tuples would have adequate lifespan, but it
neglected to consider the lifespan of the tuple descriptor that
the slot depends on. It turns out that in at least some cases, the
tupdesc we are passing is a refcounted tupdesc, and the refcount for
the slot's reference can get assigned to a resource owner having
different lifespan than the slot does. That leads to an error like
"tupdesc reference 0x7fdef236a1b8 is not owned by resource owner
SubTransaction". Worse, because of a second oversight in the same
commit, we'd try to free the same tupdesc refcount again while
cleaning up after that error, leading to recursive errors and an
"ERRORDATA_STACK_SIZE exceeded" PANIC.
To fix the initial problem, let's just make a non-refcounted copy
of the tupdesc we're supposed to use. That seems likely to guard
against additional problems, since there's no strong reason for
this code to assume that what it's given is a refcounted tupdesc;
in which case there's an independent hazard of the tupdesc having
shorter lifespan than the slot does. (I didn't bother trying to
free said copy, since it should go away anyway when the (sub)
transaction context is cleaned up.)
The other issue can be fixed by making the code added to
AfterTriggerFreeQuery work like the rest of that function, ie be
sure that it doesn't try to free the same slot twice in the event
of recursive error cleanup.
While here, also clean up minor stylistic issues in the test case
added by 25936fd46: don't use "create or replace function", as any
name collision within the tests is likely to have ill effects
that that won't mask; and don't use function names as generic as
trigger_function1, especially if you're not going to drop them
at the end of the test stanza.
Per bug #17607 from Thomas Mc Kay. Back-patch to v12, as the
previous fix was.
Discussion: https://postgr.es/m/17607-bd8ccc81226f7f80@postgresql.org
|
|
This comment has been wrong since its introduction in commit 0d5f05cde;
backpatch to v12 where that came in.
Discussion: https://postgr.es/m/CAPmGK14VGf-xQjGQN4o1QyAbXAaxugU5%3DqfcmTDh1iufUDnV_w%40mail.gmail.com
|
|
During ALTER TABLE ATTACH PARTITION, if the name of a parent's foreign
key constraint is already used on the partition, the code tries to
choose another one before the FK attributes list has been populated,
so the resulting constraint name was "<relname>__fkey" instead of
"<relname>_<attrs>_fkey". Repair, and add a test case.
Backpatch to 12. In 11, the code to attach a partition was not smart
enough to cope with conflicting constraint names, so the problem doesn't
exist there.
Author: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
Discussion: https://postgr.es/m/20220901184156.738ebee5@karst
|
|
|
|
When creating a partitioned index, DefineIndex tries to identify
any existing indexes on the partitions that match the partitioned
index, so that it can absorb those as child indexes instead of
building new ones. Part of the matching is to compare IndexInfo
structs --- but that wasn't done quite right. We're comparing
the IndexInfo built within DefineIndex itself to one made from
existing catalog contents by BuildIndexInfo. Notably, while
BuildIndexInfo will run index expressions and predicates through
expression preprocessing, that has not happened to DefineIndex's
struct. The result is failure to match and subsequent creation
of duplicate indexes.
The easiest and most bulletproof fix is to build a new IndexInfo
using BuildIndexInfo, thereby guaranteeing that the processing done
is identical.
While here, let's also extract the opfamily and collation data
from the new partitioned index, removing ad-hoc logic that
duplicated knowledge about how those are constructed.
Per report from Christophe Pettus. Back-patch to v11 where
we invented partitioned indexes.
Richard Guo and Tom Lane
Discussion: https://postgr.es/m/8864BFAA-81FD-4BF9-8E06-7DEB8D4164ED@thebuild.com
|
|
Previously, if an extension script did CREATE OR REPLACE and there was
an existing object not belonging to the extension, it would overwrite
the object and adopt it into the extension. This is problematic, first
because the overwrite is probably unintentional, and second because we
didn't change the object's ownership. Thus a hostile user could create
an object in advance of an expected CREATE EXTENSION command, and would
then have ownership rights on an extension object, which could be
modified for trojan-horse-type attacks.
Hence, forbid CREATE OR REPLACE of an existing object unless it already
belongs to the extension. (Note that we've always forbidden replacing
an object that belongs to some other extension; only the behavior for
previously-free-standing objects changes here.)
For the same reason, also fail CREATE IF NOT EXISTS when there is
an existing object that doesn't belong to the extension.
Our thanks to Sven Klemm for reporting this problem.
Security: CVE-2022-2625
|
|
Using ATSimpleRecursion() in ATPrepCmd() to do so as bbb927b4db9b did is
not correct, because ATPrepCmd() can't distinguish between triggers that
may be cloned and those that may not, so would wrongly try to recurse
for the latter category of triggers.
So this commit restores the code in EnableDisableTrigger() that
86f575948c77 had added to do the recursion, which would do it only for
triggers that may be cloned, that is, row-level triggers. This also
changes tablecmds.c such that ATExecCmd() is able to pass the value of
ONLY flag down to EnableDisableTrigger() using its new 'recurse'
parameter.
This also fixes what seems like an oversight of 86f575948c77 that the
recursion to partition triggers would only occur if EnableDisableTrigger()
had actually changed the trigger. It is more apt to recurse to inspect
partition triggers even if the parent's trigger didn't need to be
changed: only then can we be certain that all descendants share the same
state afterwards.
Backpatch all the way back to 11, like bbb927b4db9b. Care is taken not
to break ABI compatibility (and that no catversion bump is needed.)
Co-authored-by: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Dmitry Koval <d.koval@postgrespro.ru>
Discussion: https://postgr.es/m/CA+HiwqG-cZT3XzGAnEgZQLoQbyfJApVwOTQaCaas1mhpf+4V5A@mail.gmail.com
|
|
Crash recovery on standby may encounter missing directories
when replaying database-creation WAL records. Prior to this
patch, the standby would fail to recover in such a case;
however, the directories could be legitimately missing.
Consider the following sequence of commands:
CREATE DATABASE
DROP DATABASE
DROP TABLESPACE
If, after replaying the last WAL record and removing the
tablespace directory, the standby crashes and has to replay the
create database record again, crash recovery must be able to continue.
A fix for this problem was already attempted in 49d9cfc68bf4, but it
was reverted because of design issues. This new version is based
on Robert Haas' proposal: any missing tablespaces are created
during recovery before reaching consistency. Tablespaces
are created as real directories, and should be deleted
by later replay. CheckRecoveryConsistency ensures
they have disappeared.
The problems detected by this new code are reported as PANIC,
except when allow_in_place_tablespaces is set to ON, in which
case they are WARNING. Apart from making tests possible, this
gives users an escape hatch in case things don't go as planned.
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Author: Asim R Praveen <apraveen@pivotal.io>
Author: Paul Guo <paulguo@gmail.com>
Reviewed-by: Anastasia Lubennikova <lubennikovaav@gmail.com> (older versions)
Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> (older versions)
Reviewed-by: Michaël Paquier <michael@paquier.xyz>
Diagnosed-by: Paul Guo <paulguo@gmail.com>
Discussion: https://postgr.es/m/CAEET0ZGx9AvioViLf7nbR_8tH9-=27DN5xWJ2P9-ROH16e4JUA@mail.gmail.com
|
|
This is a backpatch to branches 10-14 of the following commits:
7170f2159fb2 Allow "in place" tablespaces.
c6f2f01611d4 Fix pg_basebackup with in-place tablespaces.
f6f0db4d6240 Fix pg_tablespace_location() with in-place tablespaces
7a7cd84893e0 doc: Remove mention to in-place tablespaces for pg_tablespace_location()
5344723755bd Remove unnecessary Windows-specific basebackup code.
In-place tablespaces were introduced as a testing helper mechanism, but
they are going to be used for a bugfix in WAL replay to be backpatched
to all stable branches.
I (Álvaro) had to adjust some code to account for lack of
get_dirent_type() in branches prior to 14.
Author: Thomas Munro <thomas.munro@gmail.com>
Author: Michaël Paquier <michael@paquier.xyz>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20220722081858.omhn2in5zt3g4nek@alvherre.pgsql
|
|
Commit a117cebd638dd02e5c2e791c25e43745f233111b used the original userid
for ACL checks located directly in DefineIndex(), but it still adopted
the table owner userid for more ACL checks than intended. That broke
dump/reload of indexes that refer to an operator class, collation, or
exclusion operator in a schema other than "public" or "pg_catalog".
Back-patch to v10 (all supported versions), like the earlier commit.
Nathan Bossart and Noah Misch
Discussion: https://postgr.es/m/f8a4105f076544c180a87ef0c4822352@stmuk.bayern.de
|
|
When an implicit operator family is created, it wasn't getting reported.
Make it do so.
This has always been missing. Backpatch to 10.
Author: Masahiko Sawada <sawada.mshk@gmail.com>
Reported-by: Leslie LEMAIRE <leslie.lemaire@developpement-durable.gouv.fr>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Michael Paquiër <michael@paquier.xyz>
Discussion: https://postgr.es/m/f74d69e151b22171e8829551b1159e77@developpement-durable.gouv.fr
|
|
It intended to, but did not, achieve this. Adopt the new standard of
setting user ID just after locking the relation. Back-patch to v10 (all
supported versions).
Reviewed by Simon Riggs. Reported by Alvaro Herrera.
Security: CVE-2022-1552
|
|
When a feature enumerates relations and runs functions associated with
all found relations, the feature's user shall not need to trust every
user having permission to create objects. BRIN-specific functionality
in autovacuum neglected to account for this, as did pg_amcheck and
CLUSTER. An attacker having permission to create non-temp objects in at
least one schema could execute arbitrary SQL functions under the
identity of the bootstrap superuser. CREATE INDEX (not a
relation-enumerating operation) and REINDEX protected themselves too
late. This change extends to the non-enumerating amcheck interface.
Back-patch to v10 (all supported versions).
Sergey Shinderuk, reviewed (in earlier versions) by Alexander Lakhin.
Reported by Alexander Lakhin.
Security: CVE-2022-1552
|
|
An ALTER FUNCTION command that tried to update both the function's
proparallel property and its proconfig list failed to do the former,
because it stored the new proparallel value into a tuple that was
no longer the interesting one. Carelessness in 7aea8e4f2.
(I did not bother with a regression test, because the only likely
future breakage would be for someone to ignore the comment I added
and add some other field update after the heap_modify_tuple step.
A test using existing function properties could not catch that.)
Per report from Bryn Llewellyn. Back-patch to all supported branches.
Discussion: https://postgr.es/m/8AC9A37F-99BD-446F-A2F7-B89AD0022774@yugabyte.com
|
|
|
|
This reverts commit 49d9cfc68bf4. The approach taken by this patch has
problems, so we'll come up with a radically different fix.
Discussion: https://postgr.es/m/CA+TgmoYcUPL+WOJL2ZzhH=zmrhj0iOQ=iCFM0SuYqBbqZEamEg@mail.gmail.com
|
|
Crash recovery on standby may encounter missing directories when
replaying create database WAL records. Prior to this patch, the standby
would fail to recover in such a case. However, the directories could be
legitimately missing. Consider a sequence of WAL records as follows:
CREATE DATABASE
DROP DATABASE
DROP TABLESPACE
If, after replaying the last WAL record and removing the tablespace
directory, the standby crashes and has to replay the create database
record again, the crash recovery must be able to move on.
This patch adds a mechanism similar to invalid-page tracking, to keep a
tally of missing directories during crash recovery. If all the missing
directory references are matched with corresponding drop records at the
end of crash recovery, the standby can safely continue following the
primary.
Backpatch to 13, at least for now. The bug is older, but fixing it in
older branches requires more careful study of the interactions with
commit e6d8069522c8, which appeared in 13.
A new TAP test file is added to verify the condition. However, because
it depends on commit d6d317dbf615, it can only be added to branch
master. I (Álvaro) manually verified that the code behaves as expected
in branch 14. It's a bit nervous-making to leave the code uncovered by
tests in older branches, but leaving the bug unfixed is even worse.
Also, the main reason this fix took so long is precisely that we
couldn't agree on a good strategy to approach testing for the bug, so
perhaps this is the best we can do.
Diagnosed-by: Paul Guo <paulguo@gmail.com>
Author: Paul Guo <paulguo@gmail.com>
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Author: Asim R Praveen <apraveen@pivotal.io>
Discussion: https://postgr.es/m/CAEET0ZGx9AvioViLf7nbR_8tH9-=27DN5xWJ2P9-ROH16e4JUA@mail.gmail.com
|
|
DROP INDEX needs to lock the index's table before the index itself,
else it will deadlock against ordinary queries that acquire the
relation locks in that order. This is correctly mechanized for
plain indexes by RangeVarCallbackForDropRelation; but in the case of
a partitioned index, we neglected to lock the child tables in advance
of locking the child indexes. We can fix that by traversing the
inheritance tree and acquiring the needed locks in RemoveRelations,
after we have acquired our locks on the parent partitioned table and
index.
While at it, do some refactoring to eliminate confusion between
the actual and expected relkind in RangeVarCallbackForDropRelation.
We can save a couple of syscache lookups too, by having that function
pass back info that RemoveRelations will need.
Back-patch to v11 where partitioned indexes were added.
Jimmy Yih, Gaurab Dey, Tom Lane
Discussion: https://postgr.es/m/BYAPR05MB645402330042E17D91A70C12BD5F9@BYAPR05MB6454.namprd05.prod.outlook.com
|
|
The syscache lookup may return NULL even for valid OID, for example due
to a concurrent DROP STATISTICS, so a HeapTupleIsValid is necessary.
Without it, it may fail with a segfault.
Reported by Alexander Lakhin, patch by me. Backpatch to 13, where ALTER
STATISTICS ... SET STATISTICS was introduced.
Backpatch-through: 13
Discussion: https://postgr.es/m/17372-bf3b6e947e35ae77%40postgresql.org
|
|
Commit 859b3003de disabled building of extended stats for inheritance
trees, to prevent updating the same catalog row twice. While that
resolved the issue, it also means there are no extended stats for
declaratively partitioned tables, because there are no data in the
non-leaf relations.
That also means declaratively partitioned tables were not affected by
the issue 859b3003de addressed, which means this is a regression
affecting queries that calculate estimates for the whole inheritance
tree as a whole (which includes e.g. GROUP BY queries).
But because partitioned tables are empty, we can invert the condition
and build statistics only for the case with inheritance, without losing
anything. And we can consider them when calculating estimates.
It may be necessary to run ANALYZE on partitioned tables, to collect
proper statistics. For declarative partitioning there should no prior
statistics, and it might take time before autoanalyze is triggered. For
tables partitioned by inheritance the statistics may include data from
child relations (if built 859b3003de), contradicting the current code.
Report and patch by Justin Pryzby, minor fixes and cleanup by me.
Backpatch all the way back to PostgreSQL 10, where extended statistics
were introduced (same as 859b3003de).
Author: Justin Pryzby
Reported-by: Justin Pryzby
Backpatch-through: 10
Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com
|
|
We disallow altering a column datatype within a regular table,
if the table's rowtype is used as a column type elsewhere,
because we lack code to go around and rewrite the other tables.
This restriction should apply to partitioned tables as well, but it
was not checked because ATRewriteTables and ATPrepAlterColumnType
were not on the same page about who should do it for which relkinds.
Per bug #17351 from Alexander Lakhin. Back-patch to all supported
branches.
Discussion: https://postgr.es/m/17351-6db1870f3f4f612a@postgresql.org
|
|
Commit 4ace45677 failed to fix the problem fully, because the
same issue of attempting to fetch a non-returnable index column
can occur when rechecking the indexqual after using a lossy index
operator. Moreover, it broke EXPLAIN for such indexquals (which
indicates a gap in our test cases :-().
Revert the code changes of 4ace45677 in favor of adding a new field
to struct IndexOnlyScan, containing a version of the indexqual that
can be executed against the index-returned tuple without using any
non-returnable columns. (The restrictions imposed by check_index_only
guarantee this is possible, although we may have to recompute indexed
expressions.) Support construction of that during setrefs.c
processing by marking IndexOnlyScan.indextlist entries as resjunk
if they can't be returned, rather than removing them entirely.
(We could alternatively require setrefs.c to look up the IndexOptInfo
again, but abusing resjunk this way seems like a reasonably safe way
to avoid needing to do that.)
This solution isn't great from an API-stability standpoint: if there
are any extensions out there that build IndexOnlyScan structs directly,
they'll be broken in the next minor releases. However, only a very
invasive extension would be likely to do such a thing. There's no
change in the Path representation, so typical planner extensions
shouldn't have a problem.
As before, back-patch to all supported branches.
Discussion: https://postgr.es/m/3179992.1641150853@sss.pgh.pa.us
Discussion: https://postgr.es/m/17350-b5bdcf476e5badbb@postgresql.org
|
|
Replica identities that depend directly on an index rely on a set of
properties, one of them being that all the columns defined in this index
have to be marked as NOT NULL. There was a hole in the logic with ALTER
TABLE DROP NOT NULL, where it was possible to remove the NOT NULL
property of a column part of an index used as replica identity, so block
it to avoid problems with logical decoding down the road.
The same check was already done columns part of a primary key, so the
fix is straight-forward.
Author: Haiying Tang, Hou Zhijie
Reviewed-by: Dilip Kumar, Michael Paquier
Discussion: https://postgr.es/m/OS0PR01MB6113338C102BEE8B2FFC5BD9FB619@OS0PR01MB6113.jpnprd01.prod.outlook.com
Backpatch-through: 10
|
|
When changing REPLICA IDENTITY INDEX to another one, the target table's
relcache was not being invalidated. This leads to skipping update/delete
operations during apply on the subscriber side as the columns required to
search corresponding rows won't get logged.
Author: Tang Haiying, Hou Zhijie
Reviewed-by: Euler Taveira, Amit Kapila
Backpatch-through: 10
Discussion: https://postgr.es/m/OS0PR01MB61133CA11630DAE45BC6AD95FB939@OS0PR01MB6113.jpnprd01.prod.outlook.com
|
|
Commit 1b5d797cd4f7 intended to relax the lock level used to rename
indexes, but inadvertently allowed *any* relation to be renamed with a
lowered lock level, as long as the command is spelled ALTER INDEX.
That's undesirable for other relation types, so retry the operation with
the higher lock if the relation turns out not to be an index.
After this fix, ALTER INDEX <sometable> RENAME will require access
exclusive lock, which it didn't before.
Author: Nathan Bossart <bossartn@amazon.com>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reported-by: Onder Kalaci <onderk@microsoft.com>
Discussion: https://postgr.es/m/PH0PR21MB1328189E2821CDEC646F8178D8AE9@PH0PR21MB1328.namprd21.prod.outlook.com
|
|
The grammar of this command run on indexes with column names has always
been authorized by the parser, and it has never been documented.
Since 911e702, it is possible to define opclass parameters as of CREATE
INDEX, which actually broke the old case of ALTER INDEX/TABLE where
relation-level parameters n_distinct and n_distinct_inherited could be
defined for an index (see 76a47c0 and its thread where this point has
been touched, still remained unused). Attempting to do that in v13~
would cause the index to become unusable, as there is a new dedicated
code path to load opclass parameters instead of the relation-level ones
previously available. Note that it is possible to fix things with a
manual catalog update to bring the relation back online.
This commit disables this command for now as the use of column names for
indexes does not make sense anyway, particularly when it comes to index
expressions where names are automatically computed. One way to properly
support this case properly in the future would be to use column numbers
when it comes to indexes, in the same way as ALTER INDEX .. ALTER COLUMN
.. SET STATISTICS.
Partitioned indexes were already blocked, but not indexes. Some tests
are added for both cases.
There was some code in ANALYZE to enforce n_distinct to be used for an
index expression if the parameter was defined, but just remove it for
now until/if there is support for this (note that index-level parameters
never had support in pg_dump either, previously), so this was just dead
code.
Reported-by: Matthijs van der Vleuten
Author: Nathan Bossart, Michael Paquier
Reviewed-by: Vik Fearing, Dilip Kumar
Discussion: https://postgr.es/m/17220-15d684c6c2171a83@postgresql.org
Backpatch-through: 13
|
|
Failing to do that, any direct inserts/updates of those partitions
would fail to enforce the correct constraint, that is, one that
considers the new partition constraint of their parent table.
Backpatch to 10.
Reported by: Hou Zhijie <houzj.fnst@fujitsu.com>
Author: Amit Langote <amitlangote09@gmail.com>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Nitin Jadhav <nitinjadhavpostgres@gmail.com>
Reviewed-by: Pavel Borisov <pashkin.elfe@gmail.com>
Discussion: https://postgr.es/m/OS3PR01MB5718DA1C4609A25186D1FBF194089%40OS3PR01MB5718.jpnprd01.prod.outlook.com
|
|
An extension may want to call GetSecurityLabel() on a shared object
before the shared relcaches are fully initialized. For instance, a
ClientAuthentication_hook might want to retrieve the security label on
a role.
Discussion: https://postgr.es/m/ecb7af0b26e3be1d96d291c8453a86f1f82d9061.camel@j-davis.com
Backpatch-through: 9.6
|
|
Both bugs #16676[1] and #17141[2] illustrate that the combination of
SKIP LOCKED and FETCH FIRST WITH TIES break expectations when it comes
to rows returned to other sessions accessing the same row. Since this
situation is detectable from the syntax and hard to fix otherwise,
forbid for now, with the potential to fix in the future.
[1] https://postgr.es/m/16676-fd62c3c835880da6@postgresql.org
[2] https://postgr.es/m/17141-913d78b9675aac8e@postgresql.org
Backpatch-through: 13, where WITH TIES was introduced
Author: David Christensen <david.christensen@crunchydata.com>
Discussion: https://postgr.es/m/CAOxo6XLPccCKru3xPMaYDpa+AXyPeWFs+SskrrL+HKwDjJnLhg@mail.gmail.com
|
|
Updates/Deletes on a partition were allowed even without replica identity
after the parent table was added to a publication. This would later lead
to an error on subscribers. The reason was that we were not invalidating
the partition's relcache and the publication information for partitions
was not getting rebuilt. Similarly, we were not invalidating the
partitions' relcache after dropping a partitioned table from a publication
which will prohibit Updates/Deletes on its partition without replica
identity even without any publication.
Reported-by: Haiying Tang
Author: Hou Zhijie and Vignesh C
Reviewed-by: Vignesh C and Amit Kapila
Backpatch-through: 13
Discussion: https://postgr.es/m/OS0PR01MB6113D77F583C922F1CEAA1C3FBD29@OS0PR01MB6113.jpnprd01.prod.outlook.com
|
|
Before commit 84f5c2908, a STABLE function in a plpgsql CALL
statement's argument list would see an up-to-date snapshot,
because exec_stmt_call would push a new snapshot. I got rid of
that because the possibility of the snapshot disappearing within
COMMIT made it too hard to manage a snapshot across the CALL
statement. That's fine so far as the procedure itself goes,
but I forgot to think about the possibility of STABLE functions
within the CALL argument list. As things now stand, those'll
be executed with the Portal's snapshot as ActiveSnapshot,
keeping them from seeing updates more recent than Portal startup.
(VOLATILE functions don't have a problem because they take their
own snapshots; which indeed is also why the procedure itself
doesn't have a problem. There are no STABLE procedures.)
We can fix this by pushing a new snapshot transiently within
ExecuteCallStmt itself. Popping the snapshot before we get
into the procedure proper eliminates the management problem.
The possibly-useless extra snapshot-grab is slightly annoying,
but it's no worse than what happened before 84f5c2908.
Per bug #17199 from Alexander Nawratil. Back-patch to v11,
like the previous patch.
Discussion: https://postgr.es/m/17199-1ab2561f0d94af92@postgresql.org
|
|
Formerly, we sent signals for outgoing NOTIFY messages within
ProcessCompletedNotifies, which was also responsible for sending
relevant ones of those messages to our connected client. It therefore
had to run during the main-loop processing that occurs just before
going idle. This arrangement had two big disadvantages:
* Now that procedures allow intra-command COMMITs, it would be
useful to send NOTIFYs to other sessions immediately at COMMIT
(though, for reasons of wire-protocol stability, we still shouldn't
forward them to our client until end of command).
* Background processes such as replication workers would not send
NOTIFYs at all, since they never execute the client communication
loop. We've had requests to allow triggers running in replication
workers to send NOTIFYs, so that's a problem.
To fix these things, move transmission of outgoing NOTIFY signals
into AtCommit_Notify, where it will happen during CommitTransaction.
Also move the possible call of asyncQueueAdvanceTail there, to
ensure we don't bloat the async SLRU if a background worker sends
many NOTIFYs with no one listening.
We can also drop the call of asyncQueueReadAllNotifications,
allowing ProcessCompletedNotifies to go away entirely. That's
because commit 790026972 added a call of ProcessNotifyInterrupt
adjacent to PostgresMain's call of ProcessCompletedNotifies,
and that does its own call of asyncQueueReadAllNotifications,
meaning that we were uselessly doing two such calls (inside two
separate transactions) whenever inbound notify signals coincided
with an outbound notify. We need only set notifyInterruptPending
to ensure that ProcessNotifyInterrupt runs, and we're done.
The existing documentation suggests that custom background workers
should call ProcessCompletedNotifies if they want to send NOTIFY
messages. To avoid an ABI break in the back branches, reduce it
to an empty routine rather than removing it entirely. Removal
will occur in v15.
Although the problems mentioned above have existed for awhile,
I don't feel comfortable back-patching this any further than v13.
There was quite a bit of churn in adjacent code between 12 and 13.
At minimum we'd have to also backpatch 51004c717, and a good deal
of other adjustment would also be needed, so the benefit-to-risk
ratio doesn't look attractive.
Per bug #15293 from Michael Powers (and similar gripes from others).
Artur Zakirov and Tom Lane
Discussion: https://postgr.es/m/153243441449.1404.2274116228506175596@wrigleys.postgresql.org
|
|
We have long forbidden fetching backwards from a NO SCROLL cursor,
but the prohibition didn't extend to cases in which we rewind the
query altogether and then re-fetch forwards. I think the reason is
that this logic was mainly meant to protect plan nodes that can't
be run in the reverse direction. However, re-reading the query output
is problematic if the query is volatile (which includes SELECT FOR
UPDATE, not just queries with volatile functions): the re-read can
produce different results, which confuses the cursor navigation logic
completely. Another reason for disliking this approach is that some
code paths will either fetch backwards or rewind-and-fetch-forwards
depending on the distance to the target row; so that seemingly
identical use-cases may or may not draw the "cursor can only scan
forward" error. Hence, let's clean things up by disallowing rewind
as well as fetch-backwards in a NO SCROLL cursor.
Ordinarily we'd only make such a definitional change in HEAD, but
there is a third reason to consider this change now. Commit ba2c6d6ce
created some new user-visible anomalies for non-scrollable cursors
WITH HOLD, in that navigation in the cursor result got confused if the
cursor had been partially read before committing. The only good way
to resolve those anomalies is to forbid rewinding such a cursor, which
allows removal of the incorrect cursor state manipulations that
ba2c6d6ce added to PersistHoldablePortal.
To minimize the behavioral change in the back branches (including
v14), refuse to rewind a NO SCROLL cursor only when it has a holdStore,
ie has been held over from a previous transaction due to WITH HOLD.
This should avoid breaking most applications that have been sloppy
about whether to declare cursors as scrollable. We'll enforce the
prohibition across-the-board beginning in v15.
Back-patch to v11, as ba2c6d6ce was.
Discussion: https://postgr.es/m/3712911.1631207435@sss.pgh.pa.us
|
|
Some plan node types don't react well to being called again after
they've already returned NULL. PortalRunSelect() has long dealt
with this by calling the executor with NoMovementScanDirection
if it sees that we've already run the portal to the end. However,
commit ba2c6d6ce overlooked this point, so that persisting an
already-fully-fetched cursor would fail if it had such a plan.
Per report from Tomas Barton. Back-patch to v11, as the faulty
commit was. (I've omitted a test case because the type of plan
that causes a problem isn't all that stable.)
Discussion: https://postgr.es/m/CAPV2KRjd=ErgVGbvO2Ty20tKTEZZr6cYsYLxgN_W3eAo9pf5sw@mail.gmail.com
|
|
Updates/Deletes on a relation were allowed even without replica identity
after we define the publication for all tables. This would later lead to
an error on subscribers. The reason was that for such publications we were
not invalidating the relcache and the publication information for
relations was not getting rebuilt. Similarly, we were not invalidating the
relcache after dropping of such publications which will prohibit
Updates/Deletes without replica identity even without any publication.
Author: Vignesh C and Hou Zhijie
Reviewed-by: Hou Zhijie, Kyotaro Horiguchi, Amit Kapila
Backpatch-through: 10, where it was introduced
Discussion: https://postgr.es/m/CALDaNm0pF6zeWqCA8TCe2sDuwFAy8fCqba=nHampCKag-qLixg@mail.gmail.com
|
|
Previously this was allowed, but the collation effectively vanished
into the ether because of the way lookup_collation() works: you could
not use the collation, nor even drop it. Seems better to give an
error up front than to leave the user wondering why it doesn't work.
(Because this test is in DefineCollation not CreateCollation, it does
not prevent pg_import_system_collations from creating ICU collations,
regardless of the initially-chosen encoding.)
Per bug #17170 from Andrew Bille. Back-patch to v10 where ICU support
was added.
Discussion: https://postgr.es/m/17170-95845cf3f0a9c36d@postgresql.org
|
|
If the system crashed between CREATE TABLESPACE and the next checkpoint,
the result could be some files in the tablespace unexpectedly containing
no rows. Affected files would be those for which the system did not
write WAL; see the wal_skip_threshold documentation. Before v13, a
different set of conditions governed the writing of WAL; see v12's
<sect2 id="populate-pitr">. (The v12 conditions were broader in some
ways and narrower in others.) Users may want to audit non-default
tablespaces for unexpected short files. The bug could have truncated an
index without affecting the associated table, and reindexing the index
would fix that particular problem.
This fixes the bug by making create_tablespace_directories() more like
TablespaceCreateDbspace(). create_tablespace_directories() was
recursively removing tablespace contents, reasoning that WAL redo would
recreate everything removed that way. That assumption holds for other
wal_level values. Under wal_level=minimal, the old approach could
delete files for which no other copy existed. Back-patch to 9.6 (all
supported versions).
Reviewed by Robert Haas and Prabhat Sahu. Reported by Robert Haas.
Discussion: https://postgr.es/m/CA+TgmoaLO9ncuwvr2nN-J4VEP5XyAcy=zKiHxQzBbFRxxGxm0w@mail.gmail.com
|