path: root/src/backend/executor
Age  Commit message  Author
2024-03-14  Remove redundant snapshot copying from parallel leader to workers  Heikki Linnakangas

The parallel query infrastructure copies the leader backend's active snapshot to the worker processes. But the BitmapHeapScan node also had bespoke code to pass the snapshot from leader to the workers. That was redundant, so remove it.

The removed code was analogous to the snapshot serialization in table_parallelscan_initialize(), but that was the wrong role model. A parallel bitmap heap scan is more like an independent non-parallel bitmap heap scan in each parallel worker as far as the table AM is concerned, because the coordination is done in nodeBitmapHeapscan.c, and the table AM doesn't need to know anything about it.

This relies on the assumption that es_snapshot == GetActiveSnapshot(). That's not a new assumption; things would get weird if you used the QueryDesc's snapshot for visibility checks in the scans, but the active snapshot for evaluating quals, for example. This could use some refactoring and cleanup, but for now, just add some assertions.

Reviewed-by: Dilip Kumar, Robert Haas
Discussion: https://www.postgresql.org/message-id/5f3b9d59-0f43-419d-80ca-6d04c07cf61a@iki.fi
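A minimal sketch of the kind of assertion the last paragraph refers to (illustrative, not the exact lines added to the scan code):

    /* Assumption being asserted: the executor's snapshot is the active one. */
    Assert(estate->es_snapshot == GetActiveSnapshot());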
2024-03-12  Fix confusion about the return rowtype of SQL-language procedures.  Tom Lane
There is a very ancient hack in check_sql_fn_retval that allows a single SELECT targetlist entry of composite type to be taken as supplying all the output columns of a function returning composite. (This is grotty and fundamentally ambiguous, but it's really hard to do nested composite-returning functions without it.) As far as I know, that doesn't cause any problems in ordinary functions. It's disastrous for procedures however. All procedures that have any output parameters are labeled with prorettype RECORD, and the CALL code expects it will get back a record with one column per output parameter, regardless of whether any of those parameters is composite. Doing something else leads to an assertion failure or core dump. This is simple enough to fix: we just need to not apply that rule when considering procedures. However, that requires adding another argument to check_sql_fn_retval, which at least in principle might be getting called by external callers. Therefore, in the back branches convert check_sql_fn_retval into an ABI-preserving wrapper around a new function check_sql_fn_retval_ext. Per report from Yahor Yuzefovich. This has been broken since we implemented procedures, so back-patch to all supported branches. Discussion: https://postgr.es/m/CABz5gWHSjj2df6uG0NRiDhZ_Uz=Y8t0FJP-_SVSsRsnrQT76Gg@mail.gmail.com
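The ABI-preserving wrapper described above follows a common pattern; here is a standalone sketch with placeholder types and argument lists (the real signatures in functions.c differ):

    #include <stdbool.h>

    typedef struct List List;       /* opaque stand-in for the real List type */

    /* New function: the extra flag says whether the expression belongs to a
     * procedure, so the single-composite-column shortcut can be skipped there. */
    bool
    check_sql_fn_retval_ext(List *queryTreeLists, unsigned rettype,
                            bool insertDroppedCols, bool isProcedure)
    {
        (void) queryTreeLists;
        (void) rettype;
        (void) insertDroppedCols;
        (void) isProcedure;
        /* ... result-type checking would happen here ... */
        return true;
    }

    /* Old entry point keeps its exact signature, so already-compiled external
     * callers continue to link and behave as before. */
    bool
    check_sql_fn_retval(List *queryTreeLists, unsigned rettype,
                        bool insertDroppedCols)
    {
        return check_sql_fn_retval_ext(queryTreeLists, rettype,
                                       insertDroppedCols, false);
    }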
2024-03-11  Fix incorrect accessing of pfree'd memory in Memoize  David Rowley

For pass-by-reference types, the code added in 0b053e78b, which aimed to resolve a memory leak, was overly aggressive in resetting the per-tuple memory context. This could result in pfree'd memory being accessed, and in failures to find previously cached results in the hash table.

What was happening was that prepare_probe_slot() was switching to the per-tuple memory context and calling ExecEvalExpr(). ExecEvalExpr() may have required a memory allocation. Both MemoizeHash_hash() and MemoizeHash_equal() were aggressively resetting the per-tuple context, so after determining the hash value, the context would have gotten reset before MemoizeHash_equal() was called. This could have resulted in MemoizeHash_equal() looking at pfree'd memory.

This is less likely to have caused issues on a production build, as some other allocation would have had to reuse the pfree'd memory to overwrite it. Otherwise, the original contents would have been intact. However, this clearly caused issues on MEMORY_CONTEXT_CHECKING builds.

Author: Tender Wang, Andrei Lepikhov
Reported-by: Tender Wang (using SQLancer)
Reviewed-by: Andrei Lepikhov, Richard Guo, David Rowley
Discussion: https://postgr.es/m/CAHewXNnT6N6UJkya0z-jLFzVxcwGfeRQSfhiwA+NyLg-x8iGew@mail.gmail.com
Backpatch-through: 14, where Memoize was added
2024-03-07  Fix handling of self-modified tuples in MERGE.  Dean Rasheed

When an UPDATE or DELETE action in MERGE returns TM_SelfModified, there are 2 possible causes:

1. The target tuple was already updated or deleted by the current command. This can happen if the target row joins to more than one source row, and the SQL standard explicitly says that this must be an error.

2. The target tuple was already updated or deleted by a later command in the current transaction. This can happen if the tuple is modified by a BEFORE trigger or a volatile function used in the query, and should be an error for the same reason that it is in a plain UPDATE or DELETE command.

In MERGE's primary error handling block, it failed to check for (2), causing it to return a misleading error message in such cases.

In the secondary error handling block, following a concurrent update from another session, it failed to check for (1), causing it to silently ignore target rows joined to more than one source row, instead of reporting an error.

Fix this, and add tests for both of these cases.

Per report from Wenjiang Zhang. Back-patch to v15, where MERGE was introduced.

Discussion: https://postgr.es/m/tencent_41DE0FF443FE14B94A5898D373792109E408%40qq.com
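The corrected distinction might take roughly the following shape (a sketch using the executor's usual error-reporting APIs and the standard MERGE messages; not the exact committed hunk):

    /* Case (1): modified by the current command itself, i.e. the target row
     * joined to more than one source row. */
    if (tmfd.cmax == estate->es_output_cid)
        ereport(ERROR,
                (errcode(ERRCODE_CARDINALITY_VIOLATION),
                 errmsg("MERGE command cannot affect row a second time"),
                 errhint("Ensure that not more than one source row matches any one target row.")));
    else
        /* Case (2): modified by a later command in the same transaction,
         * e.g. by a BEFORE trigger or a volatile function. */
        ereport(ERROR,
                (errcode(ERRCODE_TRIGGERED_DATA_CHANGE_VIOLATION),
                 errmsg("tuple to be updated or deleted was already modified by an operation triggered by the current command")));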
2024-03-06  Fix type-checking of RECORD-returning functions in FROM.  Tom Lane
In the corner case where a function returning RECORD has been simplified to a RECORD constant or an inlined ROW() expression, ExecInitFunctionScan failed to cross-check the function's result rowtype against the coldeflist provided by the calling query. That happened because get_expr_result_type is able to extract a tupdesc from such expressions, which led ExecInitFunctionScan to ignore the coldeflist. (Instead, it used the extracted tupdesc to check the function's output, which of course always succeeds.) I have not been able to demonstrate any really serious consequences from this, because if some column of the result is of the wrong type and is directly referenced by a Var of the calling query, CheckVarSlotCompatibility will catch it. However, we definitely do fail to report the case where the function returns more columns than the coldeflist expects, and in the converse case where it returns fewer columns, we get an assert failure (but, seemingly, no worse results in non-assert builds). To fix, always build the expected tupdesc from the coldeflist if there is one, and consult get_expr_result_type only when there isn't one. Also remove the failing Assert, even though it is no longer reached after this fix. It doesn't seem to be adding anything useful, since later checking will deal with cases with the wrong number of columns. The only other place I could find that is doing something similar is inline_set_returning_function. There's no live bug there because we cannot be looking at a Const or RowExpr, but for consistency change that code to agree with ExecInitFunctionScan. Per report from PetSerAl. After some debate I've concluded that this should be back-patched. There is a small risk that somebody has been relying on such a case not throwing an error, but I judge this outweighed by the risk that I've missed some way in which the failure to cross-check has worse consequences than sketched above. Discussion: https://postgr.es/m/CAKygsHSerA1eXsJHR9wft3Gn3wfHQ5RfP8XHBzF70_qcrrRvEg@mail.gmail.com
2024-03-04  Remove unused #include's from backend .c files  Peter Eisentraut

as determined by include-what-you-use (IWYU)

While IWYU also suggests to *add* a bunch of #include's (which is its main purpose), this patch does not do that. In some cases, a more specific #include replaces another less specific one.

Some manual adjustments of the automatic result:

- IWYU currently doesn't know about includes that provide global variable declarations (like -Wmissing-variable-declarations), so those includes are being kept manually.

- All includes for port(ability) headers are being kept for now, to play it safe.

- No changes of catalog/pg_foo.h to catalog/pg_foo_d.h, to keep the patch from exploding in size.

Note that this patch touches just *.c files, so nothing declared in header files changes in hidden ways.

As a small example, in src/backend/access/transam/rmgr.c, some IWYU pragma annotations are added to handle a special case there.

Discussion: https://www.postgresql.org/message-id/flat/af837490-6b2f-46df-ba05-37ea6a6653fc%40eisentraut.org
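For reference, the annotations mentioned for rmgr.c use IWYU's standard pragma comments; an illustrative example (the header name here is only an example):

    #include "access/gistxlog.h"        /* IWYU pragma: keep */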
2024-03-03  Redefine backend ID to be an index into the proc array  Heikki Linnakangas

Previously, backend ID was an index into the ProcState array, in the shared cache invalidation manager (sinvaladt.c). The entry in the ProcState array was reserved at backend startup by scanning the array for a free entry, and that was also when the backend got its backend ID.

Things become slightly simpler if we redefine backend ID to be the index into the PGPROC array, and directly use it also as an index to the ProcState array. This uses a little more memory, as we reserve a few extra slots in the ProcState array for aux processes that don't need them, but the simplicity is worth it.

Aux processes now also have a backend ID. This simplifies the reservation of BackendStatusArray and ProcSignal slots.

You can now convert a backend ID into an index into the PGPROC array simply by subtracting 1. We still use 0-based "pgprocnos" in various places, for indexes into the PGPROC array, but the only difference now is that backend IDs start at 1 while pgprocnos start at 0. (The next commit will get rid of the term "backend ID" altogether and make everything 0-based.)

There is still a 'backendId' field in PGPROC, now part of 'vxid' which encapsulates the backend ID and local transaction ID together. It's needed for prepared xacts. For regular backends, the backendId is always equal to pgprocno + 1, but for prepared xact PGPROC entries, it's the ID of the original backend that processed the transaction.

Reviewed-by: Andres Freund, Reid Thompson
Discussion: https://www.postgresql.org/message-id/8171f1aa-496f-46a6-afc3-c46fe7a9b407@iki.fi
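The new relationship between the two numbering schemes is simple enough to state in code; an illustrative pair of helpers (hypothetical names, not actual backend functions):

    /* Backend IDs are 1-based, pgprocnos are 0-based. */
    static inline int
    backend_id_to_pgprocno(int backend_id)
    {
        return backend_id - 1;
    }

    static inline int
    pgprocno_to_backend_id(int pgprocno)
    {
        return pgprocno + 1;
    }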
2024-02-29  Support MERGE into updatable views.  Dean Rasheed
This allows the target relation of MERGE to be an auto-updatable or trigger-updatable view, and includes support for WITH CHECK OPTION, security barrier views, and security invoker views. A trigger-updatable view must have INSTEAD OF triggers for every type of action (INSERT, UPDATE, and DELETE) mentioned in the MERGE command. An auto-updatable view must not have any INSTEAD OF triggers. Mixing auto-update and trigger-update actions (i.e., having a partial set of INSTEAD OF triggers) is not supported. Rule-updatable views are also not supported, since there is no rewriter support for non-SELECT rules with MERGE operations. Dean Rasheed, reviewed by Jian He and Alvaro Herrera. Discussion: https://postgr.es/m/CAEZATCVcB1g0nmxuEc-A+gGB0HnfcGQNGYH7gS=7rq0u0zOBXA@mail.gmail.com
2024-02-29  Remove field UpdateContext->updated in nodeModifyTable.c  Dean Rasheed
This field has been redundant ever since it was added by commit 25e777cf8e, which split up ExecUpdate() and ExecDelete() into reusable pieces. The only place that reads it is ExecMergeMatched(), if the result from ExecUpdateAct() is TM_Ok. However, all paths through ExecUpdateAct() that return TM_Ok also set this field to true, so the return status by itself is sufficient to tell if the update happened. Removing this field is a modest simplification, and it brings the UPDATE path in ExecMergeMatched() more into line with ExecUpdate(), ensuring that ExecUpdateEpilogue() is always called if ExecUpdateAct() returns TM_Ok, reducing the chance of bugs. Dean Rasheed, reviewed by Alvaro Herrera. Discussion: https://postgr.es/m/CAEZATCWGGmigGBzLHkJm5Ccv2mMxXmwi3%2Buq0yhwDHm-tsvSLg%40mail.gmail.com
2024-01-30  Delay build of Memoize hash table until executor run  David Rowley
Previously this hash table was built during executor startup. This could cause long delays in EXPLAIN (without ANALYZE) when the planner opts to use a large Memoize hash table. No backpatch for now due to lack of complaints. Author: David Rowley Discussion: https://postgr.es/m/CAApHDvoJktJ5XL=Kjh2a2TFr64R-7eQZV-+jcJrUwoES2GLiWg@mail.gmail.com
2024-01-24  Add soft error handling to some expression nodes  Amit Langote

This adjusts the code for CoerceViaIO and CoerceToDomain expression nodes to handle errors softly.

For CoerceViaIO, this adds a new ExprEvalStep opcode EEOP_IOCOERCE_SAFE, which is implemented in the new accompanying function ExecEvalCoerceViaIOSafe(). The only difference from EEOP_IOCOERCE's inline implementation is that the input function receives an ErrorSaveContext via the function's FunctionCallInfo.context, which it can use to handle errors softly.

For CoerceToDomain, this simply entails replacing the ereport() in ExecEvalConstraintNotNull() and ExecEvalConstraintCheck() by errsave(), passing it the ErrorSaveContext passed in the expression's ExprEvalStep.

In both cases, the ErrorSaveContext to be used is passed by setting ExprState.escontext to point to it before calling ExecInitExprRec() on the expression tree whose errors are to be handled softly.

Note that there's no functional change as of this commit, as no call site of ExecInitExprRec() has been changed. This is intended for implementing new SQL/JSON expression nodes in future commits.

Extracted from a much larger patch to add SQL/JSON query functions.

Author: Nikita Glukhov <n.gluhov@postgrespro.ru>
Author: Teodor Sigaev <teodor@sigaev.ru>
Author: Oleg Bartunov <obartunov@gmail.com>
Author: Alexander Korotkov <aekorotkov@gmail.com>
Author: Andrew Dunstan <andrew@dunslane.net>
Author: Amit Langote <amitlangote09@gmail.com>

Reviewers have included (in no particular order) Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby, Álvaro Herrera, Jian He, Peter Eisentraut

Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru
Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de
Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org
Discussion: https://postgr.es/m/CA+HiwqHROpf9e644D8BRqYvaAPmgBZVup-xKMDPk-nd4EpgzHw@mail.gmail.com
Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
2024-01-08  Fix indentation in ExecParallelHashIncreaseNumBatches()  Alexander Korotkov
Backpatch-through: 12
2024-01-07  Fix oversized memory allocation in Parallel Hash Join  Alexander Korotkov

During the calculation of the maximum number of buckets, take into account that we later round that number up to the next power of 2.

Reported-by: Karen Talarico
Bug: #16925
Discussion: https://postgr.es/m/16925-ec96d83529d0d629%40postgresql.org
Author: Thomas Munro, Andrei Lepikhov, Alexander Korotkov
Reviewed-by: Alena Rybakina
Backpatch-through: 12
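The hazard is that a cap applied before the round-up can be exceeded after it; a standalone sketch of the corrected calculation (illustrative only, not the hash join code itself):

    #include <stddef.h>

    /* Round n up to the next power of 2. */
    static size_t
    next_pow2(size_t n)
    {
        size_t      p = 1;

        while (p < n)
            p <<= 1;
        return p;
    }

    /* Pick a bucket count, accounting for the later round-up to a power of 2. */
    static size_t
    choose_nbuckets(size_t requested, size_t max_bytes, size_t bucket_size)
    {
        size_t      max_buckets = max_bytes / bucket_size;
        size_t      cap = next_pow2(max_buckets);
        size_t      nbuckets = next_pow2(requested);

        if (cap > max_buckets)
            cap >>= 1;              /* largest power of 2 that still fits */
        return (nbuckets > cap) ? cap : nbuckets;
    }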
2024-01-04  Add macros for looping through a List without a ListCell.  Nathan Bossart

Many foreach loops only use the ListCell pointer to retrieve the content of the cell, like so:

    ListCell   *lc;

    foreach(lc, mylist)
    {
        int         myint = lfirst_int(lc);

        ...
    }

This commit adds a few convenience macros that automatically declare the loop variable and retrieve the current cell's contents. This allows us to rewrite the previous loop like this:

    foreach_int(myint, mylist)
    {
        ...
    }

This commit also adjusts a few existing loops in order to add coverage for the new/adjusted macros. There is presently no plan to bulk update all foreach loops, as that could introduce a significant amount of back-patching pain. Instead, these macros are primarily intended for use in new code.

Author: Jelte Fennema-Nio
Reviewed-by: David Rowley, Alvaro Herrera, Vignesh C, Tom Lane
Discussion: https://postgr.es/m/CAGECzQSwXKnxGwW1_Q5JE%2B8Ja20kyAbhBHO04vVrQsLcDciwXA%40mail.gmail.com
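The trick is that the macro itself declares the loop variable. A minimal standalone illustration of the idea over a plain array (not the actual pg_list.h implementation, which walks the List's internal array through a small iterator struct):

    #include <stdio.h>

    /* Declare the index and element variables in the for-init, assign the
     * element in the condition, and stop at the end of the array. */
    #define FOREACH_INT(var, arr, len) \
        for (int var##_i = 0, var = 0; \
             var##_i < (len) && ((var = (arr)[var##_i]), 1); \
             var##_i++)

    int
    main(void)
    {
        int         xs[] = {4, 8, 15};

        FOREACH_INT(x, xs, 3)
            printf("%d\n", x);
        return 0;
    }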
2024-01-04  Fix use of incorrect TupleTableSlot in DISTINCT aggregates  David Rowley

1349d2790 added code to allow DISTINCT and ORDER BY aggregates to work more efficiently by using presorted input. That commit added some code that made use of the AggState's tmpcontext and adjusted the ecxt_outertuple and ecxt_innertuple slots before checking if the current row is distinct from the previously seen row. That code forgot to set the TupleTableSlots back to what they were originally, which could result in errors such as:

    ERROR:  attribute 1 of type record has wrong type

This only affects aggregate functions which have multiple arguments when DISTINCT is used. For example: string_agg(DISTINCT col, ', ')

Thanks to Tom Lane for identifying the breaking commit.

Bug: #18264
Reported-by: Vojtěch Beneš
Discussion: https://postgr.es/m/18264-e363593d7e9feb7d@postgresql.org
Backpatch-through: 16, where 1349d2790 was added
2024-01-03  Update copyright for 2024  Bruce Momjian

Reported-by: Michael Paquier
Discussion: https://postgr.es/m/ZZKTDPxBBMt3C0J9@paquier.xyz
Backpatch-through: 12
2024-01-02  Fix typos in comments and in one isolation test.  Robert Haas
Dagfinn Ilmari Mannsåker, reviewed by Shubham Khanna. Some subtractions by me. Discussion: http://postgr.es/m/87le9fmi01.fsf@wibble.ilmari.org
2023-12-21  Fix BEFORE ROW trigger handling in cross-partition MERGE update.  Dean Rasheed
Fix a bug during MERGE if a cross-partition update is attempted on a partitioned table with a BEFORE DELETE ROW trigger that returns NULL, to prevent the update. This would cause an error to be thrown, or an assert failure in an assert-enabled build. This was an oversight in 9321c79c86, which failed to properly distinguish a DELETE prevented by a trigger from one prevented by a concurrent update. Fix by having ExecDelete() return the TM_Result status to ExecCrossPartitionUpdate(), so that it can distinguish the two cases, and make ExecCrossPartitionUpdate() return the TM_Result status to ExecUpdateAct(), so that it can return the correct status from a concurrent update. In addition, ensure that the command tag is correctly updated by having ExecMergeMatched() pass canSetTag to ExecUpdateAct(), rather than passing false, so that it updates the command tag if it does a cross-partition update, making this code path in ExecMergeMatched() consistent with ExecUpdate(). Per bug #18238 from Alexander Lakhin. Back-patch to v15, where MERGE was introduced. Dean Rasheed, reviewed by Richard Guo and Jian He. Discussion: https://postgr.es/m/18238-2f2bdc7f720180b9%40postgresql.org
2023-12-07  Verify that attribute counts match in ExecCopySlot  David Rowley

tts_virtual_copyslot() contained an Assert that checked that the srcslot contained no more attributes than the dstslot. This seems to be backwards, as if the srcslot contained fewer attributes then the dstslot could be left with stale Datum values from the previously stored tuple.

It might make more sense to allow the source to contain additional attributes and only copy the leading ones that also exist in the destination; however, that's not what we're doing here. Here we just remove the Assert from tts_virtual_copyslot() and add an Assert to ExecCopySlot() to verify the attribute counts match.

There do not seem to be any places where the destination contains fewer attributes, so instead of going to the trouble of making the code properly handle this, let's just ensure the attribute counts match. If this Assert fails then that will indicate that we do have cases that require us to handle the srcslot with more attributes than the dstslot. It seems better to only write this code if there's a genuine requirement for it rather than write it now only to leave it untested.

Thanks to Andres Freund for helping with the analysis of this.

Discussion: https://postgr.es/m/CAApHDvpMAvBL0T+TRORquyx1iqFQKMVTXtqNocOw0Pa2uh1heg@mail.gmail.com
2023-11-25  Reuse BrinDesc and BrinRevmap in brininsert  Tomas Vondra

The brininsert code used to initialize (and destroy) BrinDesc and BrinRevmap for each tuple, which is not free. This patch initializes these structures only once, and reuses them for all inserts in the same command. The data is passed through indexInfo->ii_AmCache.

This also introduces an optional AM callback "aminsertcleanup" that allows performing custom cleanup in case simply pfree-ing ii_AmCache is not sufficient (which is the case when the cache contains TupleDesc, Buffers, and so on).

Author: Soumyadeep Chakraborty
Reviewed-by: Alvaro Herrera, Matthias van de Meent, Tomas Vondra
Discussion: https://postgr.es/m/CAE-ML%2B9r2%3DaO1wwji1sBN9gvPz2xRAtFUGfnffpd0ZqyuzjamA%40mail.gmail.com
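The caching arrangement follows the usual ii_AmCache idiom; a rough sketch (the struct name is hypothetical, and the real code must also allocate the cache in a sufficiently long-lived memory context):

    /* On the first insert of the command, build the expensive state once. */
    if (indexInfo->ii_AmCache == NULL)
    {
        BrinInsertCache *cache = palloc0(sizeof(BrinInsertCache)); /* hypothetical struct */

        /* build the BrinDesc and revmap here, then stash them */
        indexInfo->ii_AmCache = cache;
    }

    /* Subsequent inserts in the same command reuse indexInfo->ii_AmCache;
     * aminsertcleanup, if provided, releases whatever pfree alone cannot. */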
2023-11-23  Use ResourceOwner to track WaitEventSets.  Heikki Linnakangas

A WaitEventSet holds file descriptors or event handles (on Windows). If FreeWaitEventSet is not called, those fds or handles are leaked. Use ResourceOwners to track WaitEventSets, to clean those up automatically on error.

This was a live bug in async Append nodes, if an FDW's ForeignAsyncRequest function failed. (In back branches, I will apply a more localized fix for that based on PG_TRY-PG_FINALLY.)

The added test doesn't check for leaking resources, so it passed even before this commit. But at least it covers the code path.

In passing, fix a misleading comment on what the 'nevents' argument to WaitEventSetWait means.

Report by Alexander Lakhin, analysis and suggestion for the fix by Tom Lane. Fixes bug #17828.

Reviewed-by: Alexander Lakhin, Thomas Munro
Discussion: https://www.postgresql.org/message-id/472235.1678387869@sss.pgh.pa.us
2023-11-15  Retire MemoryContextResetAndDeleteChildren() macro.  Nathan Bossart

As of commit eaa5808e8e, MemoryContextResetAndDeleteChildren() is just a backwards compatibility macro for MemoryContextReset(). Now that some time has passed, this macro seems more likely to create confusion.

This commit removes the macro and replaces all remaining uses with calls to MemoryContextReset(). Any third-party code that uses this macro will need to be adjusted to call MemoryContextReset() instead. Since the two have behaved the same way since v9.5, such adjustments won't produce any behavior changes for any currently-supported version of PostgreSQL.

Reviewed-by: Amul Sul, Tom Lane, Alvaro Herrera, Dagfinn Ilmari Mannsåker
Discussion: https://postgr.es/m/20231113185950.GA1668018%40nathanxps13
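The adjustment needed in third-party code is mechanical; for example (real function names, identical behavior on all supported versions):

    /* before */
    MemoryContextResetAndDeleteChildren(mycontext);

    /* after */
    MemoryContextReset(mycontext);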
2023-11-09  Fix AFTER ROW trigger execution in MERGE cross-partition update.  Dean Rasheed
When executing a MERGE UPDATE action, if the UPDATE is turned into a cross-partition DELETE then INSERT, do not attempt to invoke AFTER UPDATE ROW triggers, or any of the other post-update actions in ExecUpdateEpilogue(). For consistency with a plain UPDATE command, such triggers should not be fired (and typically fail anyway), and similarly, other post-update actions, such as WCO/RLS checks should not be executed, and might also lead to unexpected failures. Therefore, as with ExecUpdate(), make ExecMergeMatched() return immediately if ExecUpdateAct() reports that a cross-partition update was done, to be sure that no further processing is done for that tuple. Back-patch to v15, where MERGE was introduced. Discussion: https://postgr.es/m/CAEZATCWjBgagyNZs02vgDF0DvASYj-iHTFtXG2-nP3orZhmtcw%40mail.gmail.com
2023-10-28  Fix intra-query memory leak when a SRF returns zero rows.  Tom Lane
When looping around after finding that the set-returning function returned zero rows for the current input tuple, ExecProjectSet neglected to reset either of the two memory contexts it's responsible for cleaning out. Typically this wouldn't cause much problem, because once the SRF does return at least one row, the contexts would get reset on the next call. However, if the SRF returns no rows for many input tuples in succession, quite a lot of memory could be transiently consumed. To fix, make sure we reset both contexts while looping around. Per bug #18172 from Sergei Kornilov. Back-patch to all supported branches. Discussion: https://postgr.es/m/18172-9b8c5fc1d676ded3@postgresql.org
2023-10-26  Add trailing commas to enum definitions  Peter Eisentraut

Since C99, there can be a trailing comma after the last value in an enum definition. A lot of new code has been introducing this style on the fly. Some new patches are now taking an inconsistent approach to this. Some add the last comma on the fly if they add a new last value, some are trying to preserve the existing style in each place, some are even dropping the last comma if there was one. We could nudge this all in a consistent direction if we just add the trailing commas everywhere once.

I omitted a few places where there was a fixed "last" value that will always stay last. I also skipped the header files of libpq and ecpg, in case people want to use those with older compilers. There were also a small number of cases where the enum type wasn't used anywhere (but the enum values were), which ended up confusing pgindent a bit, so I left those alone.

Discussion: https://www.postgresql.org/message-id/flat/386f8c45-c8ac-4681-8add-e3b0852c1620%40eisentraut.org
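For illustration, the style being settled on (legal since C99) looks like this, using a made-up enum:

    typedef enum WidgetState
    {
        WIDGET_IDLE,
        WIDGET_RUNNING,
        WIDGET_STOPPED,     /* trailing comma: adding a value touches one line */
    } WidgetState;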
2023-10-24  Fix problems when a plain-inheritance parent table is excluded.  Tom Lane

When an UPDATE/DELETE/MERGE's target table is an old-style inheritance tree, it's possible for the parent to get excluded from the plan while some children are not. (I believe this is only possible if we can prove that a CHECK ... NO INHERIT constraint on the parent contradicts the query WHERE clause, so it's a very unusual case.) In such a case, ExecInitModifyTable mistakenly concluded that the first surviving child is the target table, leading to at least two bugs:

1. The wrong table's statement-level triggers would get fired.

2. In v16 and up, it was possible to fail with "invalid perminfoindex 0 in RTE with relid nnnn" due to the child RTE not having permissions data included in the query plan. This was hard to reproduce reliably because it did not occur unless the update triggered some non-HOT index updates.

In v14 and up, this is easy to fix by defining ModifyTable.rootRelation to be the parent RTE in plain inheritance as well as partitioned cases.

While the wrong-triggers bug also appears in older branches, the relevant code in both the planner and executor is quite a bit different, so it would take a good deal of effort to develop and test a suitable patch. Given the lack of field complaints about the trigger issue, I'll desist for now. (Patching v11 for this seems unwise anyway, given that it will have no more releases after next month.)

Per bug #18147 from Hans Buschmann.

Amit Langote and Tom Lane

Discussion: https://postgr.es/m/18147-6fc796538913ee88@postgresql.org
2023-10-19  Add local_blk_{read|write}_time I/O timing statistics for local blocks  Michael Paquier

There were no I/O timing statistics for counting read and write timings on local blocks, contrary to the counterparts for temp and shared blocks. This information is available when track_io_timing is enabled.

The output of EXPLAIN is updated to show this information. An update of pg_stat_statements is planned next.

Author: Nazir Bilal Yavuz
Reviewed-by: Robert Haas, Melanie Plageman
Discussion: https://postgr.es/m/CAN55FZ19Ss279mZuqGbuUNxka0iPbLgYuOQXqAKewrjNrp27VA@mail.gmail.com
2023-10-19  Rename I/O timing statistics columns to shared_blk_{read|write}_time  Michael Paquier

These two counters, defined in BufferUsage to track respectively the time spent while reading and writing blocks, have historically only tracked data related to shared buffers, when track_io_timing is enabled.

An upcoming patch to add specific counters for local buffers will take advantage of this rename, as it has come up that no data is currently tracked for local buffers, and tracking local and shared buffers using the same fields would be inconsistent with the treatment done for temp buffers. Renaming the existing fields clarifies what the block type of each stats field is.

pg_stat_statements is updated to reflect the rename. No extension version bump is required as 5a3423ad8ee17 has done one, affecting v17~.

Author: Nazir Bilal Yavuz
Reviewed-by: Robert Haas, Melanie Plageman
Discussion: https://postgr.es/m/CAN55FZ19Ss279mZuqGbuUNxka0iPbLgYuOQXqAKewrjNrp27VA@mail.gmail.com
2023-10-13  Fix runtime partition pruning for HASH partitioned tables  David Rowley
This could only affect HASH partitioned tables with at least 2 partition key columns. If partition pruning was delayed until execution and the query contained an IS NULL qual on one of the partitioned keys, and some subsequent partitioned key was being compared to a non-Const, then this could result in a crash due to the incorrect keyno being used to calculate the stateidx for the expression evaluation code. Here we fix this by properly skipping partitioned keys which have a nullkey set. Effectively, this must be the same as what's going on inside perform_pruning_base_step(). Sergei Glukhov also provided a patch, but that's not what's being used here. Reported-by: Sergei Glukhov Reviewed-by: tender wang, Sergei Glukhov Discussion: https://postgr.es/m/d05b26fa-af54-27e1-f693-6c31590802fa@postgrespro.ru Backpatch-through: 11, where runtime partition pruning was added.
2023-10-10  Add const to values and nulls arguments  Peter Eisentraut

This excludes any changes that would change the external AM APIs.

Reviewed-by: Aleksander Alekseev <aleksander@timescale.com>
Discussion: https://www.postgresql.org/message-id/flat/14c31f4a-0347-0805-dce8-93a9072c05a5%40eisentraut.org
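The flavor of change is const-qualification of read-only array parameters; an illustrative before/after with a made-up function (the commit applies this to many real declarations):

    /* before */
    void        form_row(Datum *values, bool *isnull, int natts);

    /* after: the arrays are promised not to be modified */
    void        form_row(const Datum *values, const bool *isnull, int natts);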
2023-10-09  Remove duplicate words in docs and code comments.  Amit Kapila
Additionally, add a missing "the" in a couple of places. Author: Vignesh C, Dagfinn Ilmari Mannsåker Discussion: http://postgr.es/m/CALDaNm28t+wWyPfuyqEaARS810Je=dRFkaPertaLAEJYY2cWYQ@mail.gmail.com
2023-10-05  Fix memory leak in Memoize code  David Rowley
Ensure we switch to the per-tuple memory context to prevent any memory leaks of detoasted Datums in MemoizeHash_hash() and MemoizeHash_equal(). Reported-by: Orlov Aleksej Author: Orlov Aleksej, David Rowley Discussion: https://postgr.es/m/83281eed63c74e4f940317186372abfd%40cft.ru Backpatch-through: 14, where Memoize was added
2023-10-02  Revert "Add soft error handling to some expression nodes"  Amit Langote
This reverts commit 7fbc75b26ed8ec70c729c5e7f8233896c54c900f. Looks like the LLVM additions may not be totally correct.
2023-10-02  Add soft error handling to some expression nodes  Amit Langote

This adjusts the expression evaluation code for CoerceViaIO and CoerceToDomain to handle errors softly if needed.

For CoerceViaIO, this means using InputFunctionCallSafe(), which provides the option to handle errors softly, instead of calling the type input function directly.

For CoerceToDomain, this simply entails replacing the ereport() in ExecEvalConstraintCheck() by errsave().

In both cases, the ErrorSaveContext to be used when evaluating the expression is stored by ExecInitExprRec() in the expression's ExprEvalStep. The ErrorSaveContext is passed by setting ExprState.escontext to point to it when calling ExecInitExprRec() on the expression whose errors are to be handled softly.

Note that no call site of ExecInitExprRec() has been changed in this commit, so there's no functional change. This is intended for implementing new SQL/JSON expression nodes in future commits that will use it to suppress errors that may occur during type coercions.

Reviewed-by: Álvaro Herrera
Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
2023-09-30  Fix EvalPlanQual rechecking during MERGE.  Dean Rasheed

Under some circumstances, concurrent MERGE operations could lead to inconsistent results that varied according to the plan chosen. This was caused by a lack of rowmarks on the source relation, which meant that EvalPlanQual rechecking was not guaranteed to return the same source tuples when re-running the join query.

Fix by ensuring that preprocess_rowmarks() sets up PlanRowMarks for all non-target relations used in MERGE, in the same way that it does for UPDATE and DELETE.

Per bug #18103. Back-patch to v15, where MERGE was introduced.

Dean Rasheed, reviewed by Richard Guo.

Discussion: https://postgr.es/m/18103-c4386baab8e355e3%40postgresql.org
2023-09-28  Remove obsolete executor cleanup code  Amit Langote
This commit removes unnecessary ExecExprFreeContext() calls in ExecEnd* routines because the actual cleanup is managed by FreeExecutorState(). With no callers remaining for ExecExprFreeContext(), this commit also removes the function. This commit also drops redundant ExecClearTuple() calls, because ExecResetTupleTable() in ExecEndPlan() already takes care of resetting and dropping all TupleTableSlots initialized with ExecInitScanTupleSlot() and ExecInitExtraTupleSlot(). After these modifications, the ExecEnd*() routines for ValuesScan, NamedTuplestoreScan, and WorkTableScan became redundant. So, this commit removes them. Reviewed-by: Robert Haas Discussion: https://postgr.es/m/CA+HiwqFGkMSge6TgC9KQzde0ohpAycLQuV7ooitEEpbKB0O_mg@mail.gmail.com
2023-08-07  Fix RLS policy usage in MERGE.  Dean Rasheed
If MERGE executes an UPDATE action on a table with row-level security, the code incorrectly applied the WITH CHECK clauses from the target table's INSERT policies to new rows, instead of the clauses from the table's UPDATE policies. In addition, it failed to check new rows against the target table's SELECT policies, if SELECT permissions were required (likely to always be the case). In addition, if MERGE executes a DO NOTHING action for matched rows, the code incorrectly applied the USING clauses from the target table's DELETE policies to existing target tuples. These policies were applied as checks that would throw an error, if they did not pass. Fix this, so that a MERGE UPDATE action applies the same RLS policies as a plain UPDATE query with a WHERE clause, and a DO NOTHING action does not apply any RLS checks (other than adding clauses from SELECT policies to the join). Back-patch to v15, where MERGE was introduced. Dean Rasheed, reviewed by Stephen Frost. Security: CVE-2023-39418
2023-07-26  Add more SQL/JSON constructor functions  Amit Langote

This patch introduces three SQL standard JSON functions:

    JSON()
    JSON_SCALAR()
    JSON_SERIALIZE()

JSON() produces json values from text, bytea, json or jsonb values, and has facilities for handling duplicate keys. JSON_SCALAR() produces a json value from any scalar SQL value, including json and jsonb. JSON_SERIALIZE() produces text or bytea from input which contains or represents json or jsonb.

For the most part these functions don't add any significant new capabilities, but they will be of use to users wanting standard-compliant JSON handling.

Catversion bumped as this changes ruleutils.c.

Author: Nikita Glukhov <n.gluhov@postgrespro.ru>
Author: Teodor Sigaev <teodor@sigaev.ru>
Author: Oleg Bartunov <obartunov@gmail.com>
Author: Alexander Korotkov <aekorotkov@gmail.com>
Author: Andrew Dunstan <andrew@dunslane.net>
Author: Amit Langote <amitlangote09@gmail.com>

Reviewers have included (in no particular order) Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby, Álvaro Herrera, Peter Eisentraut

Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru
Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de
Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org
Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
2023-07-25  Remove unnecessary checks for indexes for REPLICA IDENTITY FULL tables.  Masahiko Sawada

Previously, when selecting a usable index for update/delete for a REPLICA IDENTITY FULL table, IsIndexOnlyExpression() was used to check that no index field is an expression. However, that was not necessary, because it is enough to check that only the leftmost index field is not an expression (and references the remote table column), and this check has already been done by RemoteRelContainsLeftMostColumnOnIdx().

This commit removes IsIndexOnlyExpression() and RemoteRelContainsLeftMostColumnOnIdx(); all checks for usable indexes for REPLICA IDENTITY FULL tables are now performed by IsIndexUsableForReplicaIdentityFull().

Backpatch this to keep the code consistent.

Reported-by: Peter Smith
Reviewed-by: Amit Kapila, Önder Kalacı
Discussion: https://postgr.es/m/CAHut%2BPsGRE5WSsY0jcLHJEoA17MrbP9yy8FxdjC_ZOAACxbt%2BQ%40mail.gmail.com
Backpatch-through: 16
2023-07-14  Allow the use of a hash index on the subscriber during replication.  Amit Kapila
Commit 89e46da5e5 allowed using BTREE indexes that are neither PRIMARY KEY nor REPLICA IDENTITY on the subscriber during apply of update/delete. This patch extends that functionality to also allow HASH indexes. We explored supporting other index access methods as well but they don't have a fixed strategy for equality operation which is required by the current infrastructure in logical replication to scan the indexes. Author: Kuroda Hayato Reviewed-by: Peter Smith, Onder Kalaci, Amit Kapila Discussion: https://postgr.es/m/TYAPR01MB58669D7414E59664E17A5827F522A@TYAPR01MB5866.jpnprd01.prod.outlook.com
2023-07-13  Doc: clarify the conditions of usable indexes for REPLICA IDENTITY FULL tables.  Masahiko Sawada
Commit 89e46da5e allowed REPLICA IDENTITY FULL tables to use an index on the subscriber during apply of update/delete. This commit clarifies in the documentation that the leftmost field of candidate indexes must be a column (not an expression) that references the published relation column. The source code comments are also updated accordingly. Reviewed-by: Peter Smith, Amit Kapila Discussion: https://postgr.es/m/CAD21AoDJjffEvUFKXT27Q5U8-UU9JHv4rrJ9Ke8Zkc5UPWHLvA@mail.gmail.com Backpatch-through: 16
2023-07-13  Don't include CaseTestExpr in JsonValueExpr.formatted_expr  Amit Langote

A CaseTestExpr is currently being put into JsonValueExpr.formatted_expr as a placeholder for the result of evaluating JsonValueExpr.raw_expr, which in turn is evaluated separately. However, there's no need for this indirection if raw_expr itself can be embedded into formatted_expr and evaluated as part of evaluating the latter, especially as there is no special reason to evaluate it separately. So this commit makes it so.

As a result, JsonValueExpr.raw_expr no longer needs to be evaluated in ExecInterpExpr(), eval_const_exprs_mutator(), etc., and is now only used for displaying the original "unformatted" expression in ruleutils.c.

While at it, this also removes the function makeCaseTestExpr(), because the code in makeJsonConstructorExpr() looks more readable without it IMO and it isn't used by anyone else either.

Finally, a note is added in the comment above CaseTestExpr's definition that JsonConstructorExpr is also using it.

Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
2023-07-02  Fix memory leak in Incremental Sort rescans  Tomas Vondra

The Incremental Sort had a couple of issues, resulting in leaking memory during rescans, possibly triggering OOM. The code had a couple of related flaws:

1. During rescans, the sort states were reset but then also set to NULL (despite the comment saying otherwise). ExecIncrementalSort then sees NULL and initializes a new sort state, leaking the memory used by the old one.

2. Initializing the sort state also automatically rebuilt the info about presorted keys, leaking the already initialized info. presorted_keys was also unnecessarily reset to NULL.

Patch by James Coleman, based on patches by Laurenz Albe and Tom Lane. Backpatch to 13, where Incremental Sort was introduced.

Author: James Coleman, Laurenz Albe, Tom Lane
Reported-by: Laurenz Albe, Zu-Ming Jiang
Backpatch-through: 13
Discussion: https://postgr.es/m/b2bd02dff61af15e3526293e2771f874cf2a3be7.camel%40cybertec.at
Discussion: https://postgr.es/m/db03c582-086d-e7cd-d4a1-3bc722f81765%40inf.ethz.ch
2023-06-29  Fix order of operations in ExecEvalFieldStoreDeForm().  Tom Lane
If the given composite datum is toasted out-of-line, DatumGetHeapTupleHeader will perform database accesses to detoast it. That can invalidate the result of get_cached_rowtype, as documented (perhaps not plainly enough) in that function's API spec; which leads to strange errors or crashes when we try to use the TupleDesc to read the tuple. In short then, trying to update a field of a composite column could fail intermittently if the overall column value is wide enough to require toasting. We can fix the bug at no cost by just changing the order of operations, since we don't need the TupleDesc until after detoasting. (Other callers of get_cached_rowtype appear to get this right already, so there's only one bug.) Note that the added regression test case reveals this bug reliably only with debug_discard_caches/CLOBBER_CACHE_ALWAYS. Per bug #17994 from Alexander Lakhin. Sadly, this patch does not fix the missing-values issue revealed in the bug discussion; we'll need some more work to cover that. Discussion: https://postgr.es/m/17994-5c7100b51b4790e9@postgresql.org
2023-06-20  Pre-beta2 mechanical code beautification.  Tom Lane
Run pgindent and pgperltidy. It seems we're still some ways away from all committers doing this automatically. Now that we have a buildfarm animal that will whine about poorly-indented code, we'll try to keep the tree more tidy. Discussion: https://postgr.es/m/3156045.1687208823@sss.pgh.pa.us
2023-06-14  Retain relkind too in RTE_SUBQUERY entries for views.  Amit Langote
47bb9db75 modified the ApplyRetrieveRule()'s conversion of a view's original RTE_RELATION entry into an RTE_SUBQUERY one to retain relid, rellockmode, and perminfoindex so that the executor can lock the view and check its permissions. It seems better to also retain relkind for cross-checking that the exception of an RTE_SUBQUERY entry being allowed to carry relation details only applies to views, so do so. Bump catversion because this changes the output format of RTE_SUBQUERY RTEs. Suggested-by: David Steele <david@pgmasters.net> Reviewed-by: David Steele <david@pgmasters.net> Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/3953179e-9540-e5d1-a743-4bef368785b0%40pgmasters.net
2023-06-07  Use per-tuple context in ExecGetAllUpdatedCols  Tomas Vondra

Commit fc22b6623b (generated columns) replaced ExecGetUpdatedCols() with ExecGetAllUpdatedCols() in a couple of places handling UPDATE (triggers and lock mode). However, ExecGetUpdatedCols() did exec_rt_fetch() while ExecGetAllUpdatedCols() also allocates memory through bms_union() without paying attention to the memory context, and happened to use the long-lived ExecutorState, leaking the memory until the end of the query.

The amount of leaked memory is proportional to the number of (updated) attributes, types of UPDATE triggers, and the number of processed rows (which for UPDATE ... FROM ... may be much higher than the number of updated rows).

Fixed by switching to the per-tuple context in ExecGetAllUpdatedCols(). This is fine for all in-core callers, but external callers may need to copy the result. We're not aware of any such callers, though.

Note the issue was introduced by fc22b6623b, but the macros were later renamed by f50e888990.

Backpatch to 12, where the issue was introduced.

Reported-by: Tomas Vondra
Reviewed-by: Andres Freund, Tom Lane, Jakub Wartak
Backpatch-through: 12
Discussion: https://postgr.es/m/222a3442-7f7d-246c-ed9b-a76209d19239@enterprisedb.com
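The fix follows the standard executor idiom of doing per-row scratch allocations in the per-tuple context; a sketch of the shape (variable names illustrative, not the committed hunk):

    MemoryContext oldcxt;
    Bitmapset  *allUpdatedCols;

    /* Allocate the bms_union() result in memory that is reset for every tuple,
     * rather than in the long-lived ExecutorState context. */
    oldcxt = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
    allUpdatedCols = bms_union(updatedCols, extraUpdatedCols);
    MemoryContextSwitchTo(oldcxt);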
2023-05-19  Pre-beta mechanical code beautification.  Tom Lane
Run pgindent, pgperltidy, and reformat-dat-files. This set of diffs is a bit larger than typical. We've updated to pg_bsd_indent 2.1.2, which properly indents variable declarations that have multi-line initialization expressions (the continuation lines are now indented one tab stop). We've also updated to perltidy version 20230309 and changed some of its settings, which reduces its desire to add whitespace to lines to make assignments etc. line up. Going forward, that should make for fewer random-seeming changes to existing code. Discussion: https://postgr.es/m/20230428092545.qfb3y5wcu4cm75ur@alvherre.pgsql
2023-05-19  Fix misbehavior of EvalPlanQual checks with multiple result relations.  Tom Lane

The idea of EvalPlanQual is that we replace the query's scan of the result relation with a single injected tuple, and see if we get a tuple out, thereby implying that the injected tuple still passes the query quals. (In join cases, other relations in the query are still scanned normally.)

This logic was not updated when commit 86dc90056 made it possible for a single DML query plan to have multiple result relations, when the query target relation has inheritance or partition children. We replaced the output for the current result relation successfully, but other result relations were still scanned normally; thus, if any other result relation contained a tuple satisfying the quals, we'd think the EPQ check passed, even if it did not pass for the injected tuple itself. This would lead to update or delete actions getting performed when they should have been skipped due to a conflicting concurrent update in READ COMMITTED isolation mode.

Fix by blocking all sibling result relations from emitting tuples during an EvalPlanQual recheck. In the back branches, the fix is complicated a bit by the need to not change the size of struct EPQState (else we'd have ABI-breaking changes in offsets in struct ModifyTableState). Like the back-patches of 3f7836ff6 and 4b3e37993, add a separately palloc'd struct to avoid that. The logic is the same as in HEAD otherwise.

This is only a live bug back to v14 where 86dc90056 came in. However, I chose to back-patch the test cases further, on the grounds that this whole area is none too well tested. I skipped doing so in v11 though, because none of the tests applied cleanly, and it didn't quite seem worth extra work for a branch with only six months to live.

Per report from Ante Krešić (via Aleksander Alekseev)

Discussion: https://postgr.es/m/CAJ7c6TMBTN3rcz4=AjYhLPD_w3FFT0Wq_C15jxCDn8U4tZnH1g@mail.gmail.com
2023-05-19  Allocate hash join files in a separate memory context  Tomas Vondra

Should a hash join exceed memory limit, the hashtable is split up into multiple batches. The number of batches is doubled each time a given batch is determined not to fit in memory. Each batch file is allocated with a block-sized buffer for buffering tuples and parallel hash join has additional sharedtuplestore accessor buffers.

In some pathological cases requiring a lot of batches, often with skewed data, bad stats, or very large datasets, users can run out-of-memory solely from the memory overhead of all the batch files' buffers.

Batch files were allocated in the ExecutorState memory context, making it very hard to identify when this batch explosion was the source of an OOM. This commit allocates the batch files in a dedicated memory context, making it easier to identify the cause of an OOM and work to avoid it.

Based on initial draft by Tomas Vondra, with significant reworks and improvements by Jehan-Guillaume de Rorthais.

Author: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
Author: Tomas Vondra <tomas.vondra@enterprisedb.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/20190421114618.z3mpgmimc3rmubi4@development
Discussion: https://postgr.es/m/20230504193006.1b5b9622%40karst#273020ff4061fc7a2fbb1ba96b281f17
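A sketch of the approach (context and field names are illustrative): give the batch files their own child context, so their consumption shows up as a separate line in memory context dumps and can be released in one step:

    /* once, when the hash table is built */
    hashtable->fileCxt = AllocSetContextCreate(hashtable->hashCxt,
                                               "HashBatchFiles",
                                               ALLOCSET_DEFAULT_SIZES);

    /* whenever a batch file is needed */
    oldcxt = MemoryContextSwitchTo(hashtable->fileCxt);
    batchFile = BufFileCreateTemp(false);
    MemoryContextSwitchTo(oldcxt);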