summaryrefslogtreecommitdiff
path: root/src/backend/commands
AgeCommit message (Collapse)Author
2023-04-28Fix crashes with CREATE SCHEMA AUTHORIZATION and schema elementsMichael Paquier
CREATE SCHEMA AUTHORIZATION with appended schema elements can lead to crashes when comparing the schema name of the query with the schemas used in the qualification of some clauses in the elements' queries. The origin of the problem is that the transformation routine for the elements listed in a CREATE SCHEMA query uses as new, expected, schema name the one listed in CreateSchemaStmt itself. However, depending on the query, CreateSchemaStmt.schemaname may be NULL, being computed instead from the role specification of the query given by the AUTHORIZATION clause, that could be either: - A user name string, with the new schema name being set to the same value as the role given. - Guessed from CURRENT_ROLE, SESSION_ROLE or CURRENT_ROLE, with a new schema name computed from the security context where CREATE SCHEMA is running. Regression tests are added for CREATE SCHEMA with some appended elements (some of them with schema qualifications), covering also some role specification patterns. While on it, this simplifies the context structure used during the transformation of the elements listed in a CREATE SCHEMA query by removing the fields for the role specification and the role type. They were not used, and for the role specification this could be confusing as the schema name may by extracted from that at the beginning of CreateSchemaCommand(). This issue exists for a long time, so backpatch down to all the versions supported. Reported-by: Song Hongyu Author: Michael Paquier Reviewed-by: Richard Guo Discussion: https://postgr.es/m/17909-f65c12dfc5f0451d@postgresql.org Backpatch-through: 11
2023-04-05Fix another issue with ENABLE/DISABLE TRIGGER on partitioned tables.Tom Lane
In v13 and v14, the ENABLE/DISABLE TRIGGER USER variant malfunctioned on cloned triggers, failing to find the clones because it thought they were system triggers. Other variants of ENABLE/DISABLE TRIGGER would improperly apply a superuserness check. Fix by adjusting the is-it- a-system-trigger check to match reality in those branches. (As far as I can find, this is the only place that got it wrong.) There's no such bug in v15/HEAD, because we revised the catalog representation of system triggers to be what this code was expecting. However, add the test case to these branches anyway, because this area is visibly pretty fragile. Also remove an obsoleted comment. The recent v15/HEAD commit 6949b921d fixed a nearby bug. I now see that my commit message for that was inaccurate: the behavior of recursing to clone triggers is older than v15, but it didn't apply to the case in v13/v14 because in those branches parent partitioned tables have no pg_trigger entries for foreign-key triggers. But add the test case from that commit to v13/v14, just to show what is happening there. Per bug #17886 from DzmitryH. Discussion: https://postgr.es/m/17886-5406d5d828aa4aa3@postgresql.org
2023-03-31Reject system columns as elements of foreign keys.Tom Lane
Up through v11 it was sensible to use the "oid" system column as a foreign key column, but since that was removed there's no visible usefulness in making any of the remaining system columns a foreign key. Moreover, since the TupleTableSlot rewrites in v12, such cases actively fail because of implicit assumptions that only user columns appear in foreign keys. The lack of complaints about that seems like good evidence that no one is trying to do it. Hence, rather than trying to repair those assumptions (of which there are at least two, maybe more), let's just forbid the case up front. Per this patch, a system column in either the referenced or referencing side of a foreign key will draw this error; however, putting one in the referenced side would have failed later anyway, since we don't allow unique indexes to be made on system columns. Per bug #17877 from Alexander Lakhin. Back-patch to v12; the case still appears to work in v11, so we shouldn't break it there. Discussion: https://postgr.es/m/17877-4bcc658e33df6de1@postgresql.org
2023-03-31Ensure acquire_inherited_sample_rows sets its output parameters.Tom Lane
The totalrows/totaldeadrows outputs were left uninitialized in cases where we find no analyzable child tables of a partitioned table. This could lead to setting the partitioned table's pg_class.reltuples value to garbage. It's not clear that that would have any very bad effects in practice, but fix it anyway because it's making valgrind unhappy. Reported and diagnosed by Alexander Lakhin (bug #17880). Discussion: https://postgr.es/m/17880-9282037c923d856e@postgresql.org
2023-03-27Reject attempts to alter composite types used in indexes.Tom Lane
find_composite_type_dependencies() ignored indexes, which is a poor decision because an expression index could have a stored column of a composite (or other container) type even when the underlying table does not. Teach it to detect such cases and error out. We have to work a bit harder than for other relations because the pg_depend entry won't identify the specific index column of concern, but it's not much new code. This does not address bug #17872's original complaint that dropping a column in such a type might lead to violations of the uniqueness property that a unique index is supposed to ensure. That seems of much less concern to me because it won't lead to crashes. Per bug #17872 from Alexander Lakhin. Back-patch to all supported branches. Discussion: https://postgr.es/m/17872-d0fbb799dc3fd85d@postgresql.org
2023-03-10Ensure COPY TO on an RLS-enabled table copies no more than it should.Tom Lane
The COPY documentation is quite clear that "COPY relation TO" copies rows from only the named table, not any inheritance children it may have. However, if you enabled row-level security on the table then this stopped being true, because the code forgot to apply the ONLY modifier in the "SELECT ... FROM relation" query that it constructs in order to allow RLS predicates to be attached. Fix that. Report and patch by Antonin Houska (comment adjustments and test case by me). Back-patch to all supported branches. Discussion: https://postgr.es/m/3472.1675251957@antos
2023-01-21Allow REPLICA IDENTITY to be set on an index that's not (yet) valid.Tom Lane
The motivation for this change is that when pg_dump dumps a partitioned index that's marked REPLICA IDENTITY, it generates a command sequence that applies REPLICA IDENTITY before the partitioned index has been marked valid, causing restore to fail. We could perhaps change pg_dump to not do it like that, but that would be difficult and would not fix existing dump files with the problem. There seems to be very little reason for the backend to disallow this anyway --- the code ignores indisreplident when the index isn't valid --- so instead let's fix it by allowing the case. Commit 9511fb37a previously expressed a concern that allowing indisreplident to be set on invalid indexes might allow us to wind up in a situation where a table could have indisreplident set on multiple indexes. I'm not sure I follow that concern exactly, but in any case the only way that could happen is because relation_mark_replica_identity is too trusting about the existing set of markings being valid. Let's just rip out its early-exit code path (which sure looks like premature optimization anyway; what are we doing expending code to make redundant ALTER TABLE ... REPLICA IDENTITY commands marginally faster and not-redundant ones marginally slower?) and fix it to positively guarantee that no more than one index is marked indisreplident. The pg_dump failure can be demonstrated in all supported branches, so back-patch all the way. I chose to back-patch 9511fb37a as well, just to keep indisreplident handling the same in all branches. Per bug #17756 from Sergey Belyashov. Discussion: https://postgr.es/m/17756-dd50e8e0c8dd4a40@postgresql.org
2023-01-11Avoid using tuple from syscache for update of pg_database.datfrozenxidMichael Paquier
pg_database.datfrozenxid gets updated using an in-place update at the end of vacuum or autovacuum. Since 96cdeae, as pg_database has a toast relation, it is possible for a pg_database tuple to have toast values if there is a large set of ACLs in place. In such a case, the in-place update would fail because of the flattening of the toast values done for the catcache entry fetched. Instead of using a copy from the catcache, this changes the logic to fetch the copy of the tuple by directly scanning pg_database. Note that before 96cdeae, attempting to insert such a tuple to pg_database would cause a "row is too big" error, so the end-of-vacuum problem was not reachable. This issue has been originally fixed in 947789f on v14~, and there have been reports about this problem on v12 and v13, causing failures at the end of VACUUM. This completes the fix on all the stable branches where pg_database can use a toast table, down to 12. Author: Ashwin Agrawal, Junfeng Yang Discussion: https://postgr.es/m/DM5PR0501MB38800D9E4605BCA72DD35557CCE10@DM5PR0501MB3880.namprd05.prod.outlook.com Discussion: https://postgr.es/m/Y70XNVbUWQsR2Car@paquier.xyz Backpatch-through: 12
2022-11-17Replace RelationOpenSmgr() with RelationGetSmgr().Tom Lane
This is a back-patch of the v15-era commit f10f0ae42 into older supported branches. The idea is to design out bugs in which an ill-timed relcache flush clears rel->rd_smgr partway through some code sequence that wasn't expecting that. We had another report today of a corner case that reliably crashes v14 under debug_discard_caches (nee CLOBBER_CACHE_ALWAYS), and therefore would crash once in a blue moon in the field. We're unlikely to get rid of all such code paths unless we adopt the more rigorous coding rules instituted by f10f0ae42. Therefore, even though this is a bit invasive, it's time to back-patch. Some comfort can be taken in the fact that f10f0ae42 has been in v15 for 16 months without problems. I left the RelationOpenSmgr macro present in the back branches, even though no core code should use it anymore, in order to not break third-party extensions in minor releases. Such extensions might opt to start using RelationGetSmgr instead, to reduce their code differential between v15 and earlier branches. This carries a hazard of failing to compile against headers from existing minor releases. However, once compiled the extension should work fine even with such releases, because RelationGetSmgr is a "static inline" function so it creates no link-time dependency. So depending on distribution practices, that might be an OK tradeoff. Per report from Spyridon Dimitrios Agathos. Original patch by Amul Sul. Discussion: https://postgr.es/m/CAFM5RaqdgyusQvmWkyPYaWMwoK5gigdtW-7HcgHgOeAw7mqJ_Q@mail.gmail.com Discussion: https://postgr.es/m/CANiYTQsU7yMFpQYnv=BrcRVqK_3U3mtAzAsJCaqtzsDHfsUbdQ@mail.gmail.com
2022-11-04Correct error message for row-level triggers with transition tables on ↵Etsuro Fujita
partitioned tables. "Triggers on partitioned tables cannot have transition tables." is incorrect as we allow statement-level triggers on partitioned tables to have transition tables. This has been wrong since commit 86f575948; back-patch to v11 where that commit came in. Reviewed by Tom Lane. Discussion: https://postgr.es/m/CAPmGK17gk4vXLzz2iG%2BG4LWRWCoVyam70nZ3OuGm1hMJwDrhcg%40mail.gmail.com
2022-11-03Create FKs properly when attaching table as partitionAlvaro Herrera
Commit f56f8f8da6af added some code in CloneFkReferencing that's way too lax about a Constraint node it manufactures, not initializing enough struct members -- initially_valid in particular was forgotten. This causes some FKs in partitions added by ALTER TABLE ATTACH PARTITION to be marked as not validated. Set initially_valid true, which fixes the bug. While at it, make the struct initialization more complete. Very similar code was added in two other places by the same commit; make them all follow the same pattern for consistency, though no bugs are apparent there. This bug has never been reported: I only happened to notice while working on commit 614a406b4ff1. The test case that was added there with the improper result is repaired. Backpatch to 12. Discussion: https://postgr.es/m/20221005105523.bhuhkdx4olajboof@alvherre.pgsql
2022-10-07Fix self-referencing foreign keys with partitioned tablesAlvaro Herrera
There are a number of bugs in this area. Two of them are fixed here, namely: 1. get_relation_idx_constraint_oid does not restrict the type of constraint that's returned, so with sufficient bad luck it can return the OID of a foreign key constraint. This has the effect that a primary key in a partition can end up as a child of a foreign key, which makes no sense (it needs to be the child of the equivalent primary key.) Change the API contract so that only index-backed constraints are returned, mimicking get_constraint_index(). 2. Both CloneFkReferenced and CloneFkReferencing clone a self-referencing foreign key, so the partition ends up with a duplicate foreign key. Change the former function to ignore such constraints. Add some tests to verify that things are better now. (However, these new tests show some additional misbehavior that will be fixed later -- namely that there's a constraint marked NOT VALID.) Backpatch to 12, where these constraints are possible at all. Author: Jehan-Guillaume de Rorthais <jgdr@dalibo.com> Discussion: https://postgr.es/m/20220603154232.1715b14c@karst
2022-09-28Change some errdetail() to errdetail_internal()Alvaro Herrera
This prevents marking the argument string for translation for gettext, and it also prevents the given string (which is already translated) from being translated at runtime. Also, mark the strings used as arguments to check_rolespec_name for translation. Backpatch all the way back as appropriate. None of this is caught by any tests (necessarily so), so I verified it manually.
2022-09-25Fix tupdesc lifespan bug with AfterTriggersTableData.storeslot.Tom Lane
Commit 25936fd46 adjusted things so that the "storeslot" we use for remapping trigger tuples would have adequate lifespan, but it neglected to consider the lifespan of the tuple descriptor that the slot depends on. It turns out that in at least some cases, the tupdesc we are passing is a refcounted tupdesc, and the refcount for the slot's reference can get assigned to a resource owner having different lifespan than the slot does. That leads to an error like "tupdesc reference 0x7fdef236a1b8 is not owned by resource owner SubTransaction". Worse, because of a second oversight in the same commit, we'd try to free the same tupdesc refcount again while cleaning up after that error, leading to recursive errors and an "ERRORDATA_STACK_SIZE exceeded" PANIC. To fix the initial problem, let's just make a non-refcounted copy of the tupdesc we're supposed to use. That seems likely to guard against additional problems, since there's no strong reason for this code to assume that what it's given is a refcounted tupdesc; in which case there's an independent hazard of the tupdesc having shorter lifespan than the slot does. (I didn't bother trying to free said copy, since it should go away anyway when the (sub) transaction context is cleaned up.) The other issue can be fixed by making the code added to AfterTriggerFreeQuery work like the rest of that function, ie be sure that it doesn't try to free the same slot twice in the event of recursive error cleanup. While here, also clean up minor stylistic issues in the test case added by 25936fd46: don't use "create or replace function", as any name collision within the tests is likely to have ill effects that that won't mask; and don't use function names as generic as trigger_function1, especially if you're not going to drop them at the end of the test stanza. Per bug #17607 from Thomas Mc Kay. Back-patch to v12, as the previous fix was. Discussion: https://postgr.es/m/17607-bd8ccc81226f7f80@postgresql.org
2022-09-22Fix thinko in comment.Etsuro Fujita
This comment has been wrong since its introduction in commit 0d5f05cde; backpatch to v12 where that came in. Discussion: https://postgr.es/m/CAPmGK14VGf-xQjGQN4o1QyAbXAaxugU5%3DqfcmTDh1iufUDnV_w%40mail.gmail.com
2022-09-08Choose FK name correctly during partition attachmentAlvaro Herrera
During ALTER TABLE ATTACH PARTITION, if the name of a parent's foreign key constraint is already used on the partition, the code tries to choose another one before the FK attributes list has been populated, so the resulting constraint name was "<relname>__fkey" instead of "<relname>_<attrs>_fkey". Repair, and add a test case. Backpatch to 12. In 11, the code to attach a partition was not smart enough to cope with conflicting constraint names, so the problem doesn't exist there. Author: Jehan-Guillaume de Rorthais <jgdr@dalibo.com> Discussion: https://postgr.es/m/20220901184156.738ebee5@karst
2022-08-26Fix typo in comment.Etsuro Fujita
2022-08-18Fix subtly-incorrect matching of parent and child partitioned indexes.Tom Lane
When creating a partitioned index, DefineIndex tries to identify any existing indexes on the partitions that match the partitioned index, so that it can absorb those as child indexes instead of building new ones. Part of the matching is to compare IndexInfo structs --- but that wasn't done quite right. We're comparing the IndexInfo built within DefineIndex itself to one made from existing catalog contents by BuildIndexInfo. Notably, while BuildIndexInfo will run index expressions and predicates through expression preprocessing, that has not happened to DefineIndex's struct. The result is failure to match and subsequent creation of duplicate indexes. The easiest and most bulletproof fix is to build a new IndexInfo using BuildIndexInfo, thereby guaranteeing that the processing done is identical. While here, let's also extract the opfamily and collation data from the new partitioned index, removing ad-hoc logic that duplicated knowledge about how those are constructed. Per report from Christophe Pettus. Back-patch to v11 where we invented partitioned indexes. Richard Guo and Tom Lane Discussion: https://postgr.es/m/8864BFAA-81FD-4BF9-8E06-7DEB8D4164ED@thebuild.com
2022-08-08In extensions, don't replace objects not belonging to the extension.Tom Lane
Previously, if an extension script did CREATE OR REPLACE and there was an existing object not belonging to the extension, it would overwrite the object and adopt it into the extension. This is problematic, first because the overwrite is probably unintentional, and second because we didn't change the object's ownership. Thus a hostile user could create an object in advance of an expected CREATE EXTENSION command, and would then have ownership rights on an extension object, which could be modified for trojan-horse-type attacks. Hence, forbid CREATE OR REPLACE of an existing object unless it already belongs to the extension. (Note that we've always forbidden replacing an object that belongs to some other extension; only the behavior for previously-free-standing objects changes here.) For the same reason, also fail CREATE IF NOT EXISTS when there is an existing object that doesn't belong to the extension. Our thanks to Sven Klemm for reporting this problem. Security: CVE-2022-2625
2022-08-05Fix ENABLE/DISABLE TRIGGER to handle recursion correctlyAlvaro Herrera
Using ATSimpleRecursion() in ATPrepCmd() to do so as bbb927b4db9b did is not correct, because ATPrepCmd() can't distinguish between triggers that may be cloned and those that may not, so would wrongly try to recurse for the latter category of triggers. So this commit restores the code in EnableDisableTrigger() that 86f575948c77 had added to do the recursion, which would do it only for triggers that may be cloned, that is, row-level triggers. This also changes tablecmds.c such that ATExecCmd() is able to pass the value of ONLY flag down to EnableDisableTrigger() using its new 'recurse' parameter. This also fixes what seems like an oversight of 86f575948c77 that the recursion to partition triggers would only occur if EnableDisableTrigger() had actually changed the trigger. It is more apt to recurse to inspect partition triggers even if the parent's trigger didn't need to be changed: only then can we be certain that all descendants share the same state afterwards. Backpatch all the way back to 11, like bbb927b4db9b. Care is taken not to break ABI compatibility (and that no catversion bump is needed.) Co-authored-by: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Dmitry Koval <d.koval@postgrespro.ru> Discussion: https://postgr.es/m/CA+HiwqG-cZT3XzGAnEgZQLoQbyfJApVwOTQaCaas1mhpf+4V5A@mail.gmail.com
2022-07-28Fix replay of create database records on standbyAlvaro Herrera
Crash recovery on standby may encounter missing directories when replaying database-creation WAL records. Prior to this patch, the standby would fail to recover in such a case; however, the directories could be legitimately missing. Consider the following sequence of commands: CREATE DATABASE DROP DATABASE DROP TABLESPACE If, after replaying the last WAL record and removing the tablespace directory, the standby crashes and has to replay the create database record again, crash recovery must be able to continue. A fix for this problem was already attempted in 49d9cfc68bf4, but it was reverted because of design issues. This new version is based on Robert Haas' proposal: any missing tablespaces are created during recovery before reaching consistency. Tablespaces are created as real directories, and should be deleted by later replay. CheckRecoveryConsistency ensures they have disappeared. The problems detected by this new code are reported as PANIC, except when allow_in_place_tablespaces is set to ON, in which case they are WARNING. Apart from making tests possible, this gives users an escape hatch in case things don't go as planned. Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Author: Asim R Praveen <apraveen@pivotal.io> Author: Paul Guo <paulguo@gmail.com> Reviewed-by: Anastasia Lubennikova <lubennikovaav@gmail.com> (older versions) Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> (older versions) Reviewed-by: Michaël Paquier <michael@paquier.xyz> Diagnosed-by: Paul Guo <paulguo@gmail.com> Discussion: https://postgr.es/m/CAEET0ZGx9AvioViLf7nbR_8tH9-=27DN5xWJ2P9-ROH16e4JUA@mail.gmail.com
2022-07-27Allow "in place" tablespaces.Alvaro Herrera
This is a backpatch to branches 10-14 of the following commits: 7170f2159fb2 Allow "in place" tablespaces. c6f2f01611d4 Fix pg_basebackup with in-place tablespaces. f6f0db4d6240 Fix pg_tablespace_location() with in-place tablespaces 7a7cd84893e0 doc: Remove mention to in-place tablespaces for pg_tablespace_location() 5344723755bd Remove unnecessary Windows-specific basebackup code. In-place tablespaces were introduced as a testing helper mechanism, but they are going to be used for a bugfix in WAL replay to be backpatched to all stable branches. I (Álvaro) had to adjust some code to account for lack of get_dirent_type() in branches prior to 14. Author: Thomas Munro <thomas.munro@gmail.com> Author: Michaël Paquier <michael@paquier.xyz> Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/20220722081858.omhn2in5zt3g4nek@alvherre.pgsql
2022-06-25CREATE INDEX: use the original userid for more ACL checks.Noah Misch
Commit a117cebd638dd02e5c2e791c25e43745f233111b used the original userid for ACL checks located directly in DefineIndex(), but it still adopted the table owner userid for more ACL checks than intended. That broke dump/reload of indexes that refer to an operator class, collation, or exclusion operator in a schema other than "public" or "pg_catalog". Back-patch to v10 (all supported versions), like the earlier commit. Nathan Bossart and Noah Misch Discussion: https://postgr.es/m/f8a4105f076544c180a87ef0c4822352@stmuk.bayern.de
2022-05-20Fix DDL deparse of CREATE OPERATOR CLASSAlvaro Herrera
When an implicit operator family is created, it wasn't getting reported. Make it do so. This has always been missing. Backpatch to 10. Author: Masahiko Sawada <sawada.mshk@gmail.com> Reported-by: Leslie LEMAIRE <leslie.lemaire@developpement-durable.gouv.fr> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Michael Paquiër <michael@paquier.xyz> Discussion: https://postgr.es/m/f74d69e151b22171e8829551b1159e77@developpement-durable.gouv.fr
2022-05-09In REFRESH MATERIALIZED VIEW, set user ID before running user code.Noah Misch
It intended to, but did not, achieve this. Adopt the new standard of setting user ID just after locking the relation. Back-patch to v10 (all supported versions). Reviewed by Simon Riggs. Reported by Alvaro Herrera. Security: CVE-2022-1552
2022-05-09Make relation-enumerating operations be security-restricted operations.Noah Misch
When a feature enumerates relations and runs functions associated with all found relations, the feature's user shall not need to trust every user having permission to create objects. BRIN-specific functionality in autovacuum neglected to account for this, as did pg_amcheck and CLUSTER. An attacker having permission to create non-temp objects in at least one schema could execute arbitrary SQL functions under the identity of the bootstrap superuser. CREATE INDEX (not a relation-enumerating operation) and REINDEX protected themselves too late. This change extends to the non-enumerating amcheck interface. Back-patch to v10 (all supported versions). Sergey Shinderuk, reviewed (in earlier versions) by Alexander Lakhin. Reported by Alexander Lakhin. Security: CVE-2022-1552
2022-04-19Fix breakage in AlterFunction().Tom Lane
An ALTER FUNCTION command that tried to update both the function's proparallel property and its proconfig list failed to do the former, because it stored the new proparallel value into a tuple that was no longer the interesting one. Carelessness in 7aea8e4f2. (I did not bother with a regression test, because the only likely future breakage would be for someone to ignore the comment I added and add some other field update after the heap_modify_tuple step. A test using existing function properties could not catch that.) Per report from Bryn Llewellyn. Back-patch to all supported branches. Discussion: https://postgr.es/m/8AC9A37F-99BD-446F-A2F7-B89AD0022774@yugabyte.com
2022-03-30Fix typo in comment.Etsuro Fujita
2022-03-29Revert "Fix replay of create database records on standby"Alvaro Herrera
This reverts commit 49d9cfc68bf4. The approach taken by this patch has problems, so we'll come up with a radically different fix. Discussion: https://postgr.es/m/CA+TgmoYcUPL+WOJL2ZzhH=zmrhj0iOQ=iCFM0SuYqBbqZEamEg@mail.gmail.com
2022-03-25Fix replay of create database records on standbyAlvaro Herrera
Crash recovery on standby may encounter missing directories when replaying create database WAL records. Prior to this patch, the standby would fail to recover in such a case. However, the directories could be legitimately missing. Consider a sequence of WAL records as follows: CREATE DATABASE DROP DATABASE DROP TABLESPACE If, after replaying the last WAL record and removing the tablespace directory, the standby crashes and has to replay the create database record again, the crash recovery must be able to move on. This patch adds a mechanism similar to invalid-page tracking, to keep a tally of missing directories during crash recovery. If all the missing directory references are matched with corresponding drop records at the end of crash recovery, the standby can safely continue following the primary. Backpatch to 13, at least for now. The bug is older, but fixing it in older branches requires more careful study of the interactions with commit e6d8069522c8, which appeared in 13. A new TAP test file is added to verify the condition. However, because it depends on commit d6d317dbf615, it can only be added to branch master. I (Álvaro) manually verified that the code behaves as expected in branch 14. It's a bit nervous-making to leave the code uncovered by tests in older branches, but leaving the bug unfixed is even worse. Also, the main reason this fix took so long is precisely that we couldn't agree on a good strategy to approach testing for the bug, so perhaps this is the best we can do. Diagnosed-by: Paul Guo <paulguo@gmail.com> Author: Paul Guo <paulguo@gmail.com> Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Author: Asim R Praveen <apraveen@pivotal.io> Discussion: https://postgr.es/m/CAEET0ZGx9AvioViLf7nbR_8tH9-=27DN5xWJ2P9-ROH16e4JUA@mail.gmail.com
2022-03-21Fix risk of deadlock failure while dropping a partitioned index.Tom Lane
DROP INDEX needs to lock the index's table before the index itself, else it will deadlock against ordinary queries that acquire the relation locks in that order. This is correctly mechanized for plain indexes by RangeVarCallbackForDropRelation; but in the case of a partitioned index, we neglected to lock the child tables in advance of locking the child indexes. We can fix that by traversing the inheritance tree and acquiring the needed locks in RemoveRelations, after we have acquired our locks on the parent partitioned table and index. While at it, do some refactoring to eliminate confusion between the actual and expected relkind in RangeVarCallbackForDropRelation. We can save a couple of syscache lookups too, by having that function pass back info that RemoveRelations will need. Back-patch to v11 where partitioned indexes were added. Jimmy Yih, Gaurab Dey, Tom Lane Discussion: https://postgr.es/m/BYAPR05MB645402330042E17D91A70C12BD5F9@BYAPR05MB6454.namprd05.prod.outlook.com
2022-01-23Check syscache result in AlterStatisticsTomas Vondra
The syscache lookup may return NULL even for valid OID, for example due to a concurrent DROP STATISTICS, so a HeapTupleIsValid is necessary. Without it, it may fail with a segfault. Reported by Alexander Lakhin, patch by me. Backpatch to 13, where ALTER STATISTICS ... SET STATISTICS was introduced. Backpatch-through: 13 Discussion: https://postgr.es/m/17372-bf3b6e947e35ae77%40postgresql.org
2022-01-15Build inherited extended stats on partitioned tablesTomas Vondra
Commit 859b3003de disabled building of extended stats for inheritance trees, to prevent updating the same catalog row twice. While that resolved the issue, it also means there are no extended stats for declaratively partitioned tables, because there are no data in the non-leaf relations. That also means declaratively partitioned tables were not affected by the issue 859b3003de addressed, which means this is a regression affecting queries that calculate estimates for the whole inheritance tree as a whole (which includes e.g. GROUP BY queries). But because partitioned tables are empty, we can invert the condition and build statistics only for the case with inheritance, without losing anything. And we can consider them when calculating estimates. It may be necessary to run ANALYZE on partitioned tables, to collect proper statistics. For declarative partitioning there should no prior statistics, and it might take time before autoanalyze is triggered. For tables partitioned by inheritance the statistics may include data from child relations (if built 859b3003de), contradicting the current code. Report and patch by Justin Pryzby, minor fixes and cleanup by me. Backpatch all the way back to PostgreSQL 10, where extended statistics were introduced (same as 859b3003de). Author: Justin Pryzby Reported-by: Justin Pryzby Backpatch-through: 10 Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com
2022-01-06Prevent altering partitioned table's rowtype, if it's used elsewhere.Tom Lane
We disallow altering a column datatype within a regular table, if the table's rowtype is used as a column type elsewhere, because we lack code to go around and rewrite the other tables. This restriction should apply to partitioned tables as well, but it was not checked because ATRewriteTables and ATPrepAlterColumnType were not on the same page about who should do it for which relkinds. Per bug #17351 from Alexander Lakhin. Back-patch to all supported branches. Discussion: https://postgr.es/m/17351-6db1870f3f4f612a@postgresql.org
2022-01-03Fix index-only scan plans, take 2.Tom Lane
Commit 4ace45677 failed to fix the problem fully, because the same issue of attempting to fetch a non-returnable index column can occur when rechecking the indexqual after using a lossy index operator. Moreover, it broke EXPLAIN for such indexquals (which indicates a gap in our test cases :-(). Revert the code changes of 4ace45677 in favor of adding a new field to struct IndexOnlyScan, containing a version of the indexqual that can be executed against the index-returned tuple without using any non-returnable columns. (The restrictions imposed by check_index_only guarantee this is possible, although we may have to recompute indexed expressions.) Support construction of that during setrefs.c processing by marking IndexOnlyScan.indextlist entries as resjunk if they can't be returned, rather than removing them entirely. (We could alternatively require setrefs.c to look up the IndexOptInfo again, but abusing resjunk this way seems like a reasonably safe way to avoid needing to do that.) This solution isn't great from an API-stability standpoint: if there are any extensions out there that build IndexOnlyScan structs directly, they'll be broken in the next minor releases. However, only a very invasive extension would be likely to do such a thing. There's no change in the Path representation, so typical planner extensions shouldn't have a problem. As before, back-patch to all supported branches. Discussion: https://postgr.es/m/3179992.1641150853@sss.pgh.pa.us Discussion: https://postgr.es/m/17350-b5bdcf476e5badbb@postgresql.org
2021-11-25Block ALTER TABLE .. DROP NOT NULL on columns in replica identity indexMichael Paquier
Replica identities that depend directly on an index rely on a set of properties, one of them being that all the columns defined in this index have to be marked as NOT NULL. There was a hole in the logic with ALTER TABLE DROP NOT NULL, where it was possible to remove the NOT NULL property of a column part of an index used as replica identity, so block it to avoid problems with logical decoding down the road. The same check was already done columns part of a primary key, so the fix is straight-forward. Author: Haiying Tang, Hou Zhijie Reviewed-by: Dilip Kumar, Michael Paquier Discussion: https://postgr.es/m/OS0PR01MB6113338C102BEE8B2FFC5BD9FB619@OS0PR01MB6113.jpnprd01.prod.outlook.com Backpatch-through: 10
2021-11-16Invalidate relcache when changing REPLICA IDENTITY index.Amit Kapila
When changing REPLICA IDENTITY INDEX to another one, the target table's relcache was not being invalidated. This leads to skipping update/delete operations during apply on the subscriber side as the columns required to search corresponding rows won't get logged. Author: Tang Haiying, Hou Zhijie Reviewed-by: Euler Taveira, Amit Kapila Backpatch-through: 10 Discussion: https://postgr.es/m/OS0PR01MB61133CA11630DAE45BC6AD95FB939@OS0PR01MB6113.jpnprd01.prod.outlook.com
2021-10-19Ensure correct lock level is used in ALTER ... RENAMEAlvaro Herrera
Commit 1b5d797cd4f7 intended to relax the lock level used to rename indexes, but inadvertently allowed *any* relation to be renamed with a lowered lock level, as long as the command is spelled ALTER INDEX. That's undesirable for other relation types, so retry the operation with the higher lock if the relation turns out not to be an index. After this fix, ALTER INDEX <sometable> RENAME will require access exclusive lock, which it didn't before. Author: Nathan Bossart <bossartn@amazon.com> Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Reported-by: Onder Kalaci <onderk@microsoft.com> Discussion: https://postgr.es/m/PH0PR21MB1328189E2821CDEC646F8178D8AE9@PH0PR21MB1328.namprd21.prod.outlook.com
2021-10-19Block ALTER INDEX/TABLE index_name ALTER COLUMN colname SET (options)Michael Paquier
The grammar of this command run on indexes with column names has always been authorized by the parser, and it has never been documented. Since 911e702, it is possible to define opclass parameters as of CREATE INDEX, which actually broke the old case of ALTER INDEX/TABLE where relation-level parameters n_distinct and n_distinct_inherited could be defined for an index (see 76a47c0 and its thread where this point has been touched, still remained unused). Attempting to do that in v13~ would cause the index to become unusable, as there is a new dedicated code path to load opclass parameters instead of the relation-level ones previously available. Note that it is possible to fix things with a manual catalog update to bring the relation back online. This commit disables this command for now as the use of column names for indexes does not make sense anyway, particularly when it comes to index expressions where names are automatically computed. One way to properly support this case properly in the future would be to use column numbers when it comes to indexes, in the same way as ALTER INDEX .. ALTER COLUMN .. SET STATISTICS. Partitioned indexes were already blocked, but not indexes. Some tests are added for both cases. There was some code in ANALYZE to enforce n_distinct to be used for an index expression if the parameter was defined, but just remove it for now until/if there is support for this (note that index-level parameters never had support in pg_dump either, previously), so this was just dead code. Reported-by: Matthijs van der Vleuten Author: Nathan Bossart, Michael Paquier Reviewed-by: Vik Fearing, Dilip Kumar Discussion: https://postgr.es/m/17220-15d684c6c2171a83@postgresql.org Backpatch-through: 13
2021-10-18Invalidate partitions of table being attached/detachedAlvaro Herrera
Failing to do that, any direct inserts/updates of those partitions would fail to enforce the correct constraint, that is, one that considers the new partition constraint of their parent table. Backpatch to 10. Reported by: Hou Zhijie <houzj.fnst@fujitsu.com> Author: Amit Langote <amitlangote09@gmail.com> Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Nitin Jadhav <nitinjadhavpostgres@gmail.com> Reviewed-by: Pavel Borisov <pashkin.elfe@gmail.com> Discussion: https://postgr.es/m/OS3PR01MB5718DA1C4609A25186D1FBF194089%40OS3PR01MB5718.jpnprd01.prod.outlook.com
2021-10-14Check criticalSharedRelcachesBuilt in GetSharedSecurityLabel().Jeff Davis
An extension may want to call GetSecurityLabel() on a shared object before the shared relcaches are fully initialized. For instance, a ClientAuthentication_hook might want to retrieve the security label on a role. Discussion: https://postgr.es/m/ecb7af0b26e3be1d96d291c8453a86f1f82d9061.camel@j-davis.com Backpatch-through: 9.6
2021-10-01Error out if SKIP LOCKED and WITH TIES are both specifiedAlvaro Herrera
Both bugs #16676[1] and #17141[2] illustrate that the combination of SKIP LOCKED and FETCH FIRST WITH TIES break expectations when it comes to rows returned to other sessions accessing the same row. Since this situation is detectable from the syntax and hard to fix otherwise, forbid for now, with the potential to fix in the future. [1] https://postgr.es/m/16676-fd62c3c835880da6@postgresql.org [2] https://postgr.es/m/17141-913d78b9675aac8e@postgresql.org Backpatch-through: 13, where WITH TIES was introduced Author: David Christensen <david.christensen@crunchydata.com> Discussion: https://postgr.es/m/CAOxo6XLPccCKru3xPMaYDpa+AXyPeWFs+SskrrL+HKwDjJnLhg@mail.gmail.com
2021-09-22Invalidate all partitions for a partitioned table in publication.Amit Kapila
Updates/Deletes on a partition were allowed even without replica identity after the parent table was added to a publication. This would later lead to an error on subscribers. The reason was that we were not invalidating the partition's relcache and the publication information for partitions was not getting rebuilt. Similarly, we were not invalidating the partitions' relcache after dropping a partitioned table from a publication which will prohibit Updates/Deletes on its partition without replica identity even without any publication. Reported-by: Haiying Tang Author: Hou Zhijie and Vignesh C Reviewed-by: Vignesh C and Amit Kapila Backpatch-through: 13 Discussion: https://postgr.es/m/OS0PR01MB6113D77F583C922F1CEAA1C3FBD29@OS0PR01MB6113.jpnprd01.prod.outlook.com
2021-09-21Fix misevaluation of STABLE parameters in CALL within plpgsql.Tom Lane
Before commit 84f5c2908, a STABLE function in a plpgsql CALL statement's argument list would see an up-to-date snapshot, because exec_stmt_call would push a new snapshot. I got rid of that because the possibility of the snapshot disappearing within COMMIT made it too hard to manage a snapshot across the CALL statement. That's fine so far as the procedure itself goes, but I forgot to think about the possibility of STABLE functions within the CALL argument list. As things now stand, those'll be executed with the Portal's snapshot as ActiveSnapshot, keeping them from seeing updates more recent than Portal startup. (VOLATILE functions don't have a problem because they take their own snapshots; which indeed is also why the procedure itself doesn't have a problem. There are no STABLE procedures.) We can fix this by pushing a new snapshot transiently within ExecuteCallStmt itself. Popping the snapshot before we get into the procedure proper eliminates the management problem. The possibly-useless extra snapshot-grab is slightly annoying, but it's no worse than what happened before 84f5c2908. Per bug #17199 from Alexander Nawratil. Back-patch to v11, like the previous patch. Discussion: https://postgr.es/m/17199-1ab2561f0d94af92@postgresql.org
2021-09-14Send NOTIFY signals during CommitTransaction.Tom Lane
Formerly, we sent signals for outgoing NOTIFY messages within ProcessCompletedNotifies, which was also responsible for sending relevant ones of those messages to our connected client. It therefore had to run during the main-loop processing that occurs just before going idle. This arrangement had two big disadvantages: * Now that procedures allow intra-command COMMITs, it would be useful to send NOTIFYs to other sessions immediately at COMMIT (though, for reasons of wire-protocol stability, we still shouldn't forward them to our client until end of command). * Background processes such as replication workers would not send NOTIFYs at all, since they never execute the client communication loop. We've had requests to allow triggers running in replication workers to send NOTIFYs, so that's a problem. To fix these things, move transmission of outgoing NOTIFY signals into AtCommit_Notify, where it will happen during CommitTransaction. Also move the possible call of asyncQueueAdvanceTail there, to ensure we don't bloat the async SLRU if a background worker sends many NOTIFYs with no one listening. We can also drop the call of asyncQueueReadAllNotifications, allowing ProcessCompletedNotifies to go away entirely. That's because commit 790026972 added a call of ProcessNotifyInterrupt adjacent to PostgresMain's call of ProcessCompletedNotifies, and that does its own call of asyncQueueReadAllNotifications, meaning that we were uselessly doing two such calls (inside two separate transactions) whenever inbound notify signals coincided with an outbound notify. We need only set notifyInterruptPending to ensure that ProcessNotifyInterrupt runs, and we're done. The existing documentation suggests that custom background workers should call ProcessCompletedNotifies if they want to send NOTIFY messages. To avoid an ABI break in the back branches, reduce it to an empty routine rather than removing it entirely. Removal will occur in v15. Although the problems mentioned above have existed for awhile, I don't feel comfortable back-patching this any further than v13. There was quite a bit of churn in adjacent code between 12 and 13. At minimum we'd have to also backpatch 51004c717, and a good deal of other adjustment would also be needed, so the benefit-to-risk ratio doesn't look attractive. Per bug #15293 from Michael Powers (and similar gripes from others). Artur Zakirov and Tom Lane Discussion: https://postgr.es/m/153243441449.1404.2274116228506175596@wrigleys.postgresql.org
2021-09-10Fix some anomalies with NO SCROLL cursors.Tom Lane
We have long forbidden fetching backwards from a NO SCROLL cursor, but the prohibition didn't extend to cases in which we rewind the query altogether and then re-fetch forwards. I think the reason is that this logic was mainly meant to protect plan nodes that can't be run in the reverse direction. However, re-reading the query output is problematic if the query is volatile (which includes SELECT FOR UPDATE, not just queries with volatile functions): the re-read can produce different results, which confuses the cursor navigation logic completely. Another reason for disliking this approach is that some code paths will either fetch backwards or rewind-and-fetch-forwards depending on the distance to the target row; so that seemingly identical use-cases may or may not draw the "cursor can only scan forward" error. Hence, let's clean things up by disallowing rewind as well as fetch-backwards in a NO SCROLL cursor. Ordinarily we'd only make such a definitional change in HEAD, but there is a third reason to consider this change now. Commit ba2c6d6ce created some new user-visible anomalies for non-scrollable cursors WITH HOLD, in that navigation in the cursor result got confused if the cursor had been partially read before committing. The only good way to resolve those anomalies is to forbid rewinding such a cursor, which allows removal of the incorrect cursor state manipulations that ba2c6d6ce added to PersistHoldablePortal. To minimize the behavioral change in the back branches (including v14), refuse to rewind a NO SCROLL cursor only when it has a holdStore, ie has been held over from a previous transaction due to WITH HOLD. This should avoid breaking most applications that have been sloppy about whether to declare cursors as scrollable. We'll enforce the prohibition across-the-board beginning in v15. Back-patch to v11, as ba2c6d6ce was. Discussion: https://postgr.es/m/3712911.1631207435@sss.pgh.pa.us
2021-09-09Avoid fetching from an already-terminated plan.Tom Lane
Some plan node types don't react well to being called again after they've already returned NULL. PortalRunSelect() has long dealt with this by calling the executor with NoMovementScanDirection if it sees that we've already run the portal to the end. However, commit ba2c6d6ce overlooked this point, so that persisting an already-fully-fetched cursor would fail if it had such a plan. Per report from Tomas Barton. Back-patch to v11, as the faulty commit was. (I've omitted a test case because the type of plan that causes a problem isn't all that stable.) Discussion: https://postgr.es/m/CAPV2KRjd=ErgVGbvO2Ty20tKTEZZr6cYsYLxgN_W3eAo9pf5sw@mail.gmail.com
2021-09-08Invalidate relcache for publications defined for all tables.Amit Kapila
Updates/Deletes on a relation were allowed even without replica identity after we define the publication for all tables. This would later lead to an error on subscribers. The reason was that for such publications we were not invalidating the relcache and the publication information for relations was not getting rebuilt. Similarly, we were not invalidating the relcache after dropping of such publications which will prohibit Updates/Deletes without replica identity even without any publication. Author: Vignesh C and Hou Zhijie Reviewed-by: Hou Zhijie, Kyotaro Horiguchi, Amit Kapila Backpatch-through: 10, where it was introduced Discussion: https://postgr.es/m/CALDaNm0pF6zeWqCA8TCe2sDuwFAy8fCqba=nHampCKag-qLixg@mail.gmail.com
2021-09-03Disallow creating an ICU collation if the DB encoding won't support it.Tom Lane
Previously this was allowed, but the collation effectively vanished into the ether because of the way lookup_collation() works: you could not use the collation, nor even drop it. Seems better to give an error up front than to leave the user wondering why it doesn't work. (Because this test is in DefineCollation not CreateCollation, it does not prevent pg_import_system_collations from creating ICU collations, regardless of the initially-chosen encoding.) Per bug #17170 from Andrew Bille. Back-patch to v10 where ICU support was added. Discussion: https://postgr.es/m/17170-95845cf3f0a9c36d@postgresql.org
2021-08-27Fix data loss in wal_level=minimal crash recovery of CREATE TABLESPACE.Noah Misch
If the system crashed between CREATE TABLESPACE and the next checkpoint, the result could be some files in the tablespace unexpectedly containing no rows. Affected files would be those for which the system did not write WAL; see the wal_skip_threshold documentation. Before v13, a different set of conditions governed the writing of WAL; see v12's <sect2 id="populate-pitr">. (The v12 conditions were broader in some ways and narrower in others.) Users may want to audit non-default tablespaces for unexpected short files. The bug could have truncated an index without affecting the associated table, and reindexing the index would fix that particular problem. This fixes the bug by making create_tablespace_directories() more like TablespaceCreateDbspace(). create_tablespace_directories() was recursively removing tablespace contents, reasoning that WAL redo would recreate everything removed that way. That assumption holds for other wal_level values. Under wal_level=minimal, the old approach could delete files for which no other copy existed. Back-patch to 9.6 (all supported versions). Reviewed by Robert Haas and Prabhat Sahu. Reported by Robert Haas. Discussion: https://postgr.es/m/CA+TgmoaLO9ncuwvr2nN-J4VEP5XyAcy=zKiHxQzBbFRxxGxm0w@mail.gmail.com