path: root/src
Age | Commit message | Author
2021-09-29 | pgbench: Fix handling of socket errors during benchmark. | Fujii Masao
Previously socket errors such as invalid socket or socket wait method failures during benchmark caused pgbench to exit with status 0. Instead, errors during the run should result in exit status 2. Back-patch to v12 where pgbench started reporting exit status. Original complaint and patch by Hayato Kuroda. Author: Yugo Nagata, Fabien COELHO Reviewed-by: Kyotaro Horiguchi, Fujii Masao Discussion: https://postgr.es/m/TYCPR01MB5870057375ACA8A73099C649F5349@TYCPR01MB5870.jpnprd01.prod.outlook.com
2021-09-29 | pgbench: Correct log level of message output when socket wait method fails. | Fujii Masao
The failure of socket wait method like "select()" doesn't terminate pgbench. So the log level of error message when that failure happens should be ERROR. But previously FATAL was used in that case. Back-patch to v13 where pgbench started using common logging API. Author: Yugo Nagata, Fabien COELHO Reviewed-by: Kyotaro Horiguchi, Fujii Masao Discussion: https://postgr.es/m/20210617005934.8bd37bf72efd5f1b38e6f482@sraoss.co.jp
2021-09-23 | Split macros from visibilitymap.h into a separate header | Alexander Korotkov
That allows including just visibilitymapdefs.h from file.c and, in turn, removing the include of postgres.h from relcache.h. Reported-by: Andres Freund Discussion: https://postgr.es/m/20210913232614.czafiubr435l6egi%40alap3.anarazel.de Author: Alexander Korotkov Reviewed-by: Andres Freund, Tom Lane, Alvaro Herrera Backpatch-through: 13
2021-09-23 | Release memory allocated by dependency_degree | Tomas Vondra
Calculating degree of a functional dependency may allocate a lot of memory - we have released most of the explicitly allocated memory, but e.g. detoasted varlena values were left behind. That may be an issue, because we consider a lot of dependencies (all combinations), and the detoasting may happen for each one again. Fixed by calling dependency_degree() in a dedicated context, and resetting it after each call. We only need the calculated dependency degree, so we don't need to copy anything. Backpatch to PostgreSQL 10, where extended statistics were introduced. Backpatch-through: 10 Discussion: https://www.postgresql.org/message-id/20210915200928.GP831%40telsasoft.com
2021-09-23 | Free memory after building each statistics object | Tomas Vondra
Until now, all extended statistics on a given relation were built in the same memory context, without resetting. Some of the memory was released explicitly, but not all of it - for example memory allocated while detoasting values is hard to free. This is how it worked since extended statistics were introduced in PostgreSQL 10, but adding support for extended stats on expressions made the issue somewhat worse as it increases the number of statistics to build. Fixed by adding a memory context which gets reset after building each statistics object (all the statistics kinds included in it). Resetting it after building each statistics kind would be even better, but it would require more invasive changes and copying of results, making it harder to backpatch. Backpatch to PostgreSQL 10, where extended statistics were introduced. Author: Justin Pryzby Reported-by: Justin Pryzby Reviewed-by: Tomas Vondra Backpatch-through: 10 Discussion: https://www.postgresql.org/message-id/20210915200928.GP831%40telsasoft.com
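The effect of resetting a scratch context after each statistics object can be sketched roughly as follows. This is a toy Python model, not the PostgreSQL code: `ScratchContext`, `build_all`, and the byte counts are hypothetical names and numbers chosen only to show how peak usage changes.

```python
from typing import Iterable


class ScratchContext:
    """Hypothetical stand-in for a memory context that can be reset in bulk."""

    def __init__(self) -> None:
        self.live_bytes = 0
        self.peak_bytes = 0

    def alloc(self, nbytes: int) -> None:
        # Track transient allocations (for example, detoasted values).
        self.live_bytes += nbytes
        self.peak_bytes = max(self.peak_bytes, self.live_bytes)

    def reset(self) -> None:
        # Release everything allocated in this context at once.
        self.live_bytes = 0


def build_all(transient_costs: Iterable[int], reset_between: bool) -> int:
    """Return peak scratch usage while building one object per cost entry."""
    ctx = ScratchContext()
    for cost in transient_costs:
        ctx.alloc(cost)
        if reset_between:
            ctx.reset()
    return ctx.peak_bytes
```

Without the reset, the peak is the sum of all transient allocations; with it, the peak is bounded by the most expensive single object.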
2021-09-22 | Invalidate all partitions for a partitioned table in publication. | Amit Kapila
Updates/Deletes on a partition were allowed even without replica identity after the parent table was added to a publication. This would later lead to an error on subscribers. The reason was that we were not invalidating the partition's relcache and the publication information for partitions was not getting rebuilt. Similarly, we were not invalidating the partitions' relcache after dropping a partitioned table from a publication which will prohibit Updates/Deletes on its partition without replica identity even without any publication. Reported-by: Haiying Tang Author: Hou Zhijie and Vignesh C Reviewed-by: Vignesh C and Amit Kapila Backpatch-through: 13 Discussion: https://postgr.es/m/OS0PR01MB6113D77F583C922F1CEAA1C3FBD29@OS0PR01MB6113.jpnprd01.prod.outlook.com
2021-09-22 | Fix places in TestLib.pm in need of adaptation to the output of Msys perl | Michael Paquier
Contrary to the output of native perl, Msys perl generates output with CRLF line endings. There are already places in the TAP code where CRLFs (\r\n) are automatically converted to LF (\n) on Msys, but we missed a couple of places when running commands and using their output for comparison, which would lead to failures. This problem has been found thanks to the test added in 5adb067 using TestLib::command_checks_all(), but after a closer look more code paths were missing a filter. This is backpatched all the way down to prevent any surprises if a new test is introduced in stable branches. Reviewed-by: Andrew Dunstan, Álvaro Herrera Discussion: https://postgr.es/m/1252480.1631829409@sss.pgh.pa.us Backpatch-through: 9.6
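The kind of filter described here can be sketched as below. The real fix lives in Perl in TestLib.pm; this Python version only illustrates the idea, and `normalize_newlines`, `command_output_matches`, and the `is_msys` flag are hypothetical names.

```python
def normalize_newlines(output: str, is_msys: bool) -> str:
    """Convert CRLF to LF, but only on the platform that emits CRLF."""
    return output.replace("\r\n", "\n") if is_msys else output


def command_output_matches(actual: str, expected: str, is_msys: bool) -> bool:
    """Compare command output to the expected text after filtering."""
    return normalize_newlines(actual, is_msys) == expected
```

Any code path that compares command output without passing through such a filter would fail on Msys even though the command behaved correctly.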
2021-09-21 | Fix misevaluation of STABLE parameters in CALL within plpgsql. | Tom Lane
Before commit 84f5c2908, a STABLE function in a plpgsql CALL statement's argument list would see an up-to-date snapshot, because exec_stmt_call would push a new snapshot. I got rid of that because the possibility of the snapshot disappearing within COMMIT made it too hard to manage a snapshot across the CALL statement. That's fine so far as the procedure itself goes, but I forgot to think about the possibility of STABLE functions within the CALL argument list. As things now stand, those'll be executed with the Portal's snapshot as ActiveSnapshot, keeping them from seeing updates more recent than Portal startup. (VOLATILE functions don't have a problem because they take their own snapshots; which indeed is also why the procedure itself doesn't have a problem. There are no STABLE procedures.) We can fix this by pushing a new snapshot transiently within ExecuteCallStmt itself. Popping the snapshot before we get into the procedure proper eliminates the management problem. The possibly-useless extra snapshot-grab is slightly annoying, but it's no worse than what happened before 84f5c2908. Per bug #17199 from Alexander Nawratil. Back-patch to v11, like the previous patch. Discussion: https://postgr.es/m/17199-1ab2561f0d94af92@postgresql.org
2021-09-20 | Remove overzealous index deletion assertion. | Peter Geoghegan
A broken HOT chain is not an unexpected condition, even when the offset number points past the end of the page's line pointer array. heap_prune_chain() does not (and never has) treated this condition as unexpected, so derivative code in heap_index_delete_tuples() shouldn't do so either. Oversight in commit 4228817449. The assertion can probably only fail on Postgres 14 and master. Earlier releases don't have commit 3c3b8a4b, which taught VACUUM to truncate the line pointer array of heap pages. Backpatch all the same, just to be consistent. Author: Peter Geoghegan <pg@bowt.ie> Reported-By: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/17197-9438f31f46705182@postgresql.org Backpatch: 12-, just like commit 4228817449.
2021-09-20 | Don't elide casting to typmod -1. | Tom Lane
Casting a value that's already of a type with a specific typmod to an unspecified typmod doesn't do anything so far as run-time behavior is concerned. However, it really ought to change the exposed type of the expression to match. Up to now, coerce_type_typmod hasn't bothered with that, which creates gotchas in contexts such as recursive unions. If for example one side of the union is numeric(18,3), but it needs to be plain numeric to match the other side, there's no direct way to express that. This is easy enough to fix, by inserting a RelabelType to update the exposed type of the expression. However, it's a bit nervous-making to change this behavior, because it's stood for a really long time. But no complaints have emerged about 14beta3, so go ahead and back-patch. Back-patch of 5c056b0c2 into previous supported branches. Discussion: https://postgr.es/m/CABNQVagu3bZGqiTjb31a8D5Od3fUMs7Oh3gmZMQZVHZ=uWWWfQ@mail.gmail.com Discussion: https://postgr.es/m/1488389.1631984807@sss.pgh.pa.us
2021-09-17 | Fix pull_varnos to cope with translated PlaceHolderVars. | Tom Lane
Commit 55dc86eca changed pull_varnos to use (if possible) the associated ph_eval_at for a PlaceHolderVar. I missed a fine point though: we might be looking at a PHV in the quals or tlist of a child appendrel, in which case we need to compute a ph_eval_at value that's been translated in the same way that the PHV itself has been (cf. adjust_appendrel_attrs). Fortunately, enough info is available in the PlaceHolderInfo to make such translation possible without additional outside data, so we don't need another round of uglification of planner APIs. This is a little bit complicated, but since it's a hard-to-hit corner case, I'm not much worried about adding cycles here. Per report from Jaime Casanova. Back-patch to v12, like the previous commit. Discussion: https://postgr.es/m/20210915230959.GB17635@ahch-to
2021-09-16 | Fix variable shadowing in procarray.c. | Fujii Masao
ProcArrayGroupClearXid function has a parameter named "proc", but the same name was used for its local variables. This commit fixes this variable shadowing, to improve code readability. Back-patch to all supported versions, to make future back-patching easy though this patch is classified as refactoring only. Reported-by: Ranier Vilela Author: Ranier Vilela, Aleksander Alekseev https://postgr.es/m/CAEudQAqyoTZC670xWi6w-Oe2_Bk1bfu2JzXz6xRfiOUzm7xbyQ@mail.gmail.com
2021-09-15 | Disallow LISTEN in background workers. | Tom Lane
It's possible to execute user-defined SQL in some background processes; for example, logical replication workers can fire triggers. This opens the possibility that someone would try to execute LISTEN in such a context. But since only regular backends ever call ProcessNotifyInterrupt, no messages would actually be received, and thus the registered listener would simply prevent the message queue from being cleaned. Eventually NOTIFY would stop working, which is bad. Perhaps someday somebody will invent infrastructure to make listening in a background worker actually useful. In the meantime, forbid it. Back-patch to v13, which is where we introduced the MyBackendType variable. It'd be a lot harder to implement the check without that, and it doesn't seem worth the trouble. Discussion: https://postgr.es/m/153243441449.1404.2274116228506175596@wrigleys.postgresql.org
2021-09-14 | Send NOTIFY signals during CommitTransaction. | Tom Lane
Formerly, we sent signals for outgoing NOTIFY messages within ProcessCompletedNotifies, which was also responsible for sending relevant ones of those messages to our connected client. It therefore had to run during the main-loop processing that occurs just before going idle. This arrangement had two big disadvantages:
* Now that procedures allow intra-command COMMITs, it would be useful to send NOTIFYs to other sessions immediately at COMMIT (though, for reasons of wire-protocol stability, we still shouldn't forward them to our client until end of command).
* Background processes such as replication workers would not send NOTIFYs at all, since they never execute the client communication loop. We've had requests to allow triggers running in replication workers to send NOTIFYs, so that's a problem.
To fix these things, move transmission of outgoing NOTIFY signals into AtCommit_Notify, where it will happen during CommitTransaction. Also move the possible call of asyncQueueAdvanceTail there, to ensure we don't bloat the async SLRU if a background worker sends many NOTIFYs with no one listening. We can also drop the call of asyncQueueReadAllNotifications, allowing ProcessCompletedNotifies to go away entirely. That's because commit 790026972 added a call of ProcessNotifyInterrupt adjacent to PostgresMain's call of ProcessCompletedNotifies, and that does its own call of asyncQueueReadAllNotifications, meaning that we were uselessly doing two such calls (inside two separate transactions) whenever inbound notify signals coincided with an outbound notify. We need only set notifyInterruptPending to ensure that ProcessNotifyInterrupt runs, and we're done. The existing documentation suggests that custom background workers should call ProcessCompletedNotifies if they want to send NOTIFY messages. To avoid an ABI break in the back branches, reduce it to an empty routine rather than removing it entirely. Removal will occur in v15.
Although the problems mentioned above have existed for a while, I don't feel comfortable back-patching this any further than v13. There was quite a bit of churn in adjacent code between 12 and 13. At minimum we'd have to also backpatch 51004c717, and a good deal of other adjustment would also be needed, so the benefit-to-risk ratio doesn't look attractive. Per bug #15293 from Michael Powers (and similar gripes from others). Artur Zakirov and Tom Lane Discussion: https://postgr.es/m/153243441449.1404.2274116228506175596@wrigleys.postgresql.org
2021-09-13 | jit: Do not try to shut down LLVM state in case of LLVM triggered errors. | Andres Freund
If an allocation failed within LLVM it is not safe to call back into LLVM as LLVM is not generally safe against exceptions / stack-unwinding. Thus errors while in LLVM code are promoted to FATAL. However llvm_shutdown() did call back into LLVM even in such cases, while llvm_release_context() was careful not to do so. We cannot generally skip shutting down LLVM, as that can break profiling. But it's OK to do so if there was an error from within LLVM. Reported-By: Jelte Fennema <Jelte.Fennema@microsoft.com> Author: Andres Freund <andres@anarazel.de> Author: Justin Pryzby <pryzby@telsasoft.com> Discussion: https://postgr.es/m/AM5PR83MB0178C52CCA0A8DEA0207DC14F7FF9@AM5PR83MB0178.EURPRD83.prod.outlook.com Backpatch: 11-, where jit was introduced
2021-09-13 | Fix EXIT out of outermost block in plpgsql. | Tom Lane
Ordinarily, using EXIT this way would draw "control reached end of function without RETURN". However, if the function is one where we don't require an explicit RETURN (such as a DO block), that should not happen. It did anyway, because add_dummy_return() neglected to account for the case. Per report from Herwig Goemans. Back-patch to all supported branches. Discussion: https://postgr.es/m/868ae948-e3ca-c7ec-95a6-83cfc08ef750@gmail.com
2021-09-13 | Fix reorder buffer memory accounting for toast changes. | Amit Kapila
While processing toast changes in logical decoding, we rejigger the tuple change to point to in-memory toast tuples instead of on-disk toast tuples. And, to make sure the memory accounting is correct, we were subtracting the old change size and then after re-computing the new tuple, re-adding its size at the end. Now, if there is any error before we add the new size, we will release the changes and that will update the accounting info (subtracting the size from the counters). And we were underflowing there which leads to an assertion failure in assert-enabled builds and wrong memory accounting in reorder buffer otherwise. Author: Bertrand Drouvot Reviewed-by: Amit Kapila Backpatch-through: 13, where memory accounting was introduced Discussion: https://postgr.es/m/92b0ee65-b8bd-e42d-c082-4f3f4bf12d34@amazon.com
2021-09-13 | Fix error handling with threads on OOM in ECPG connection logic | Michael Paquier
An out-of-memory failure happening when allocating the structures to store the connection parameter keywords and values would mess up the set of saved connections, as on failure the pthread mutex would still be held with the new connection object listed but free()'d. Rather than just unlocking the mutex, which would leave the static list of connections in an inconsistent state, move the allocation for the structures of the connection parameters before beginning the test manipulation. This ensures that the list of connections and the connection mutex remain consistent all the time in this code path. This error is unlikely to happen, but it could badly mess up ECPG clients in surprising ways, so backpatch all the way down. Reported-by: ryancaicse Discussion: https://postgr.es/m/17186-b4cfd8f0eb4d1dee@postgresql.org Backpatch-through: 9.6
2021-09-11 | Make pg_regexec() robust against out-of-range search_start. | Tom Lane
If search_start is greater than the length of the string, we should just return REG_NOMATCH immediately. (Note that the equality case should *not* be rejected, since the pattern might be able to match zero characters.) This guards various internal assumptions that the min of a range of string positions is not more than the max. Violation of those assumptions could allow an attempt to fetch string[search_start-1], possibly causing a crash. Jaime Casanova pointed out that this situation is reachable with the new regexp_xxx functions that accept a user-specified start position. I don't believe it's reachable via any in-core call site in v14 and below. However, extensions could possibly call pg_regexec with an out-of-range search_start, so let's back-patch the fix anyway. Discussion: https://postgr.es/m/20210911180357.GA6870@ahch-to
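The boundary rule described above (reject a start position past the end, but allow one exactly at the end) can be illustrated with a small Python sketch. It uses Python's `re` module rather than PostgreSQL's regex engine, and `regexec_from` and the numeric values of the return codes are assumptions made for illustration.

```python
import re

# Illustrative return codes; the numeric values are assumptions.
REG_OKAY = 0
REG_NOMATCH = 1


def regexec_from(pattern: str, text: str, search_start: int) -> int:
    """Search text starting at search_start, guarding the out-of-range case."""
    if search_start > len(text):
        # Past the end of the string: nothing can match, so bail out early.
        return REG_NOMATCH
    # search_start == len(text) is deliberately allowed, because a pattern
    # may still match zero characters at the very end of the string.
    found = re.compile(pattern).search(text, search_start)
    return REG_OKAY if found else REG_NOMATCH
```

With the string `"abc"`, the pattern `b*` still matches the empty string at position 3, while position 4 is rejected before any matching is attempted.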
2021-09-10 | Fix some anomalies with NO SCROLL cursors. | Tom Lane
We have long forbidden fetching backwards from a NO SCROLL cursor, but the prohibition didn't extend to cases in which we rewind the query altogether and then re-fetch forwards. I think the reason is that this logic was mainly meant to protect plan nodes that can't be run in the reverse direction. However, re-reading the query output is problematic if the query is volatile (which includes SELECT FOR UPDATE, not just queries with volatile functions): the re-read can produce different results, which confuses the cursor navigation logic completely. Another reason for disliking this approach is that some code paths will either fetch backwards or rewind-and-fetch-forwards depending on the distance to the target row; so that seemingly identical use-cases may or may not draw the "cursor can only scan forward" error. Hence, let's clean things up by disallowing rewind as well as fetch-backwards in a NO SCROLL cursor. Ordinarily we'd only make such a definitional change in HEAD, but there is a third reason to consider this change now. Commit ba2c6d6ce created some new user-visible anomalies for non-scrollable cursors WITH HOLD, in that navigation in the cursor result got confused if the cursor had been partially read before committing. The only good way to resolve those anomalies is to forbid rewinding such a cursor, which allows removal of the incorrect cursor state manipulations that ba2c6d6ce added to PersistHoldablePortal. To minimize the behavioral change in the back branches (including v14), refuse to rewind a NO SCROLL cursor only when it has a holdStore, ie has been held over from a previous transaction due to WITH HOLD. This should avoid breaking most applications that have been sloppy about whether to declare cursors as scrollable. We'll enforce the prohibition across-the-board beginning in v15. Back-patch to v11, as ba2c6d6ce was. Discussion: https://postgr.es/m/3712911.1631207435@sss.pgh.pa.us
2021-09-09 | Avoid fetching from an already-terminated plan. | Tom Lane
Some plan node types don't react well to being called again after they've already returned NULL. PortalRunSelect() has long dealt with this by calling the executor with NoMovementScanDirection if it sees that we've already run the portal to the end. However, commit ba2c6d6ce overlooked this point, so that persisting an already-fully-fetched cursor would fail if it had such a plan. Per report from Tomas Barton. Back-patch to v11, as the faulty commit was. (I've omitted a test case because the type of plan that causes a problem isn't all that stable.) Discussion: https://postgr.es/m/CAPV2KRjd=ErgVGbvO2Ty20tKTEZZr6cYsYLxgN_W3eAo9pf5sw@mail.gmail.com
2021-09-09 | Check for relation length overrun soon enough. | Tom Lane
We don't allow relations to exceed 2^32-1 blocks, because block numbers are 32 bits and the last possible block number is reserved to mean InvalidBlockNumber. There is a check for this in mdextend, but that's really way too late, because the smgr API requires us to create a buffer for the block-to-be-added, and we do not want to have any buffer with blocknum InvalidBlockNumber. (Such a case can trigger assertions in bufmgr.c, plus I think it might confuse ReadBuffer's logic for data-past-EOF later on.) So put the check into ReadBuffer. Per report from Christoph Berg. It's been like this forever, so back-patch to all supported branches. Discussion: https://postgr.es/m/YTn1iTkUYBZfcODk@msg.credativ.de
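The arithmetic behind this limit can be sketched as follows: block numbers are 32-bit, and the last value is reserved as InvalidBlockNumber, so the check must run before a buffer is created for the new block. This Python sketch is illustrative only; `block_number_for_extension` and `RelationTooLarge` are hypothetical names, not PostgreSQL functions.

```python
INVALID_BLOCK_NUMBER = 0xFFFFFFFF  # reserved 32-bit value, never a real block
MAX_BLOCK_NUMBER = 0xFFFFFFFE      # largest usable block number


class RelationTooLarge(Exception):
    """Raised when extending would need the reserved block number."""


def block_number_for_extension(current_nblocks: int) -> int:
    """Return the number of the block to add, checking the limit first."""
    # The new block's number equals the current block count, so a relation
    # that already has 2^32-1 blocks cannot be extended any further.
    if current_nblocks >= INVALID_BLOCK_NUMBER:
        raise RelationTooLarge("cannot extend relation beyond 2^32-1 blocks")
    return current_nblocks
```

Doing the check at this point means no buffer is ever tagged with the reserved block number.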
2021-09-09 | Fix issue with WAL archiving in standby. | Fujii Masao
Previously, walreceiver always closed the currently-opened WAL segment and created its archive notification file, after it finished writing the current segment up and received any WAL data that should be written into the next segment. If walreceiver exited just before any WAL data in the next segment arrived at standby, it did not create the archive notification file of the current segment even though that's known completed. This behavior could cause WAL archiving of the segment to be delayed until subsequent restartpoints or checkpoints created its notification file. To fix the issue, this commit changes walreceiver so that it creates an archive notification file of a current WAL segment immediately if that's known completed before receiving next WAL data. Back-patch to all supported branches. Reported-by: Kyotaro Horiguchi Author: Fujii Masao Reviewed-by: Kyotaro Horiguchi Discussion: https://postgr.es/m/20200630.165503.1465894182551545886.horikyota.ntt@gmail.com
2021-09-08 | Avoid useless malloc/free traffic around getFormattedTypeName(). | Tom Lane
Coverity complained that one caller of getFormattedTypeName() failed to free the returned string. Which is true, but rather than fixing that one, let's get rid of this tedious and error-prone requirement. Now that getFormattedTypeName() caches its result, strdup'ing that result and expecting the caller to free it accomplishes little except to waste cycles. We do create a leak in the case where getTypes didn't make a TypeInfo for the type, but that basically shouldn't ever happen. Back-patch, as commit 6c450a861 was. This isn't a particularly interesting bug fix, but the API change seems like a hazard for future back-patching activity if we don't back-patch it.
2021-09-08 | Fix rewriter to set hasModifyingCTE correctly on rewritten queries. | Tom Lane
If we copy data-modifying CTEs from the original query to a replacement query (from a DO INSTEAD rule), we must set hasModifyingCTE properly in the replacement query. Failure to do this can cause various unpleasantness, such as unsafe usage of parallel plans. The code also neglected to propagate hasRecursive, though that's only cosmetic at the moment. A difficulty arises if the rule action is an INSERT...SELECT. We attach the original query's RTEs and CTEs to the sub-SELECT Query, but data-modifying CTEs are only allowed to appear in the topmost Query. For the moment, throw an error in such cases. It would probably be possible to avoid this error by attaching the CTEs to the top INSERT Query instead; but that would require a bunch of new code to adjust ctelevelsup references. Given the narrowness of the use-case, and the need to back-patch this fix, it does not seem worth the trouble for now. We can revisit this if we get field complaints. Per report from Greg Nancarrow. Back-patch to all supported branches. (The test case added here does not fail before v10, but there are plenty of places checking top-level hasModifyingCTE in 9.6, so I have no doubt that this code change is necessary there too.) Greg Nancarrow and Tom Lane Discussion: https://postgr.es/m/CAJcOf-f68DT=26YAMz_i0+Au3TcLO5oiHY5=fL6Sfuits6r+_w@mail.gmail.com Discussion: https://postgr.es/m/CAJcOf-fAdj=nDKMsRhQzndm-O13NY4dL6xGcEvdX5Xvbbi0V7g@mail.gmail.com
2021-09-08 | Invalidate relcache for publications defined for all tables. | Amit Kapila
Updates/Deletes on a relation were allowed even without replica identity after we define the publication for all tables. This would later lead to an error on subscribers. The reason was that for such publications we were not invalidating the relcache and the publication information for relations was not getting rebuilt. Similarly, we were not invalidating the relcache after dropping of such publications which will prohibit Updates/Deletes without replica identity even without any publication. Author: Vignesh C and Hou Zhijie Reviewed-by: Hou Zhijie, Kyotaro Horiguchi, Amit Kapila Backpatch-through: 10, where it was introduced Discussion: https://postgr.es/m/CALDaNm0pF6zeWqCA8TCe2sDuwFAy8fCqba=nHampCKag-qLixg@mail.gmail.com
2021-09-06 | AIX: Fix missing libpq symbols by respecting SHLIB_EXPORTS. | Noah Misch
We make each AIX shared library export all globals found in .o files that originate in the library. That doesn't include symbols acquired by -lpgcommon_shlib. That is good on average, but it became a problem for libpq when commit e6afa8918c461c1dd80c5063a950518fa4e950cd moved five official libpq API symbols into src/common. Fix this by implementing the SHLIB_EXPORTS mechanism for AIX, so affected libraries export the same symbols that they export on Linux. This reintroduces symbols pg_encoding_to_char, pg_utf_mblen, pg_char_to_encoding, pg_valid_server_encoding, and pg_valid_server_encoding_id. Back-patch to v13, where the aforementioned commit first appeared. While a minor release is usually the wrong time to add or remove symbol exports in libpq or libecpg, we should expect users to want each documented symbol. Tony Reix Discussion: https://postgr.es/m/PR3PR02MB6396742E2FC3E77D37A920BC86C79@PR3PR02MB6396.eurprd02.prod.outlook.com
2021-09-06 | Fix bogus timetz_zone() results for DYNTZ abbreviations. | Tom Lane
timetz_zone() delivered completely wrong answers if the zone was specified by a dynamic TZ abbreviation, because it failed to account for the difference between the POSIX conventions for field values in struct pg_tm and the conventions used in PG-specific datetime code. As a stopgap fix, just adjust the tm_year and tm_mon fields to match PG conventions. This is fixed in a different way in HEAD (388e71af8) but I don't want to back-patch the change of reference point. Discussion: https://postgr.es/m/CAJ7c6TOMG8zSNEZtCn5SPe+cCk3Lfxb71ZaQwT2F4T7PJ_t=KA@mail.gmail.com
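The mismatch between field conventions can be sketched as below. POSIX `struct tm` stores the year as years since 1900 and the month as 0..11. The sketch assumes the PG-specific code expects the full calendar year and a 1-based month; that assumption, along with the names `PosixTm` and `to_calendar_fields`, is hypothetical and not taken from the commit message.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class PosixTm:
    tm_year: int  # years since 1900 (POSIX convention)
    tm_mon: int   # months since January, 0..11 (POSIX convention)


def to_calendar_fields(tm: PosixTm) -> Tuple[int, int]:
    """Convert POSIX-style fields to (calendar year, 1-based month)."""
    return (tm.tm_year + 1900, tm.tm_mon + 1)
```

Passing POSIX-style values to code that expects calendar-style values shifts the date by roughly nineteen centuries and one month, which is enough to pick the wrong UTC offset for a dynamic abbreviation.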
2021-09-06 | Fix pkg-config files for static linking | Peter Eisentraut
Since ea53100d5 (PostgreSQL 12), the shipped pkg-config files have been broken for statically linking libpq because libpgcommon and libpgport are missing. This patch adds those two missing private dependencies (in a non-hardcoded way). Reported-by: Filip Gospodinov <f@gospodinov.ch> Discussion: https://www.postgresql.org/message-id/flat/c7108bde-e051-11d5-a234-99beec01ce2a@gospodinov.ch
2021-09-04 | Further portability tweaks for float4/float8 hash functions. | Tom Lane
Attempting to make hashfloat4() look as much as possible like hashfloat8(), I'd figured I could replace NaNs with get_float4_nan() before widening to float8. However, results from protosciurus and topminnow show that on some platforms that produces a different bit-pattern from get_float8_nan(), breaking the intent of ce773f230. Rearrange so that we use the result of get_float8_nan() for all NaN cases. As before, back-patch.
2021-09-04 | Revert "Avoid creating archive status ".ready" files too early" | Alvaro Herrera
This reverts commit 515e3d84a0b5 and equivalent commits in back branches. This solution to the problem has a number of problems, so we'll try again with a different approach. Per note from Andres Freund Discussion: https://postgr.es/m/20210831042949.52eqp5xwbxgrfank@alap3.anarazel.de
2021-09-03 | Remove arbitrary MAXPGPATH limit on command lengths in pg_ctl. | Tom Lane
Replace fixed-length command buffers with psprintf() calls. We didn't have anything as convenient as psprintf() when this code was written, but now that we do, there's little reason for the limitation to stand. Removing it eliminates some corner cases where (for example) starting the postmaster with a whole lot of options fails. Most individual file names that pg_ctl deals with are still restricted to MAXPGPATH, but we've seldom had complaints about that limitation so long as it only applies to one filename. Back-patch to all supported branches. Phil Krylov Discussion: https://postgr.es/m/567e199c6b97ee19deee600311515b86@krylov.eu
2021-09-03 | Disallow creating an ICU collation if the DB encoding won't support it. | Tom Lane
Previously this was allowed, but the collation effectively vanished into the ether because of the way lookup_collation() works: you could not use the collation, nor even drop it. Seems better to give an error up front than to leave the user wondering why it doesn't work. (Because this test is in DefineCollation not CreateCollation, it does not prevent pg_import_system_collations from creating ICU collations, regardless of the initially-chosen encoding.) Per bug #17170 from Andrew Bille. Back-patch to v10 where ICU support was added. Discussion: https://postgr.es/m/17170-95845cf3f0a9c36d@postgresql.org
2021-09-03 | Fix portability issue in tests from commit ce773f230. | Tom Lane
Modern POSIX seems to require strtod() to accept "-NaN", but there's nothing about NaN in SUSv2, and some of our oldest buildfarm members don't like it. Let's try writing it as -'NaN' instead; that seems to produce the same result, at least on Intel hardware. Per buildfarm.
2021-09-02 | Fix float4/float8 hash functions to produce uniform results for NaNs. | Tom Lane
The IEEE 754 standard allows a wide variety of bit patterns for NaNs, of which at least two ("NaN" and "-NaN") are pretty easy to produce from SQL on most machines. This is problematic because our btree comparison functions deem all NaNs to be equal, but our float hash functions know nothing about NaNs and will happily produce varying hash codes for them. That causes unexpected results from queries that hash a column containing different NaN values. It could also produce unexpected lookup failures when using a hash index on a float column, i.e. "WHERE x = 'NaN'" will not find all the rows it should. To fix, special-case NaN in the float hash functions, not too much unlike the existing special case that forces zero and minus zero to hash the same. I arranged for the most vanilla sort of NaN (that coming from the C99 NAN constant) to still have the same hash code as before, to reduce the risk to existing hash indexes. I dithered about whether to back-patch this into stable branches, but ultimately decided to do so. It's a clear improvement for queries that hash internally. If there is anybody who has -NaN in a hash index, they'd be well advised to re-index after applying this patch ... but the misbehavior if they don't will not be much worse than the misbehavior they had before. Per bug #17172 from Ma Liangzhu. Discussion: https://postgr.es/m/17172-7505bea9e04e230f@postgresql.org
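The special-casing described above can be sketched in Python. This is not PostgreSQL's hash function: CRC32 stands in for the real hash, and `naive_hash`, `hash_float8`, and the two NaN constants are hypothetical names. The point is only that values the comparison operators treat as equal get canonicalized before their bytes are hashed.

```python
import math
import struct
import zlib

# Two quiet-NaN bit patterns that differ only in the sign bit.
CANONICAL_NAN = struct.unpack("<d", struct.pack("<Q", 0x7FF8000000000000))[0]
NEGATIVE_NAN = struct.unpack("<d", struct.pack("<Q", 0xFFF8000000000000))[0]


def naive_hash(x: float) -> int:
    """Hash the raw bytes; distinct bit patterns give distinct inputs."""
    return zlib.crc32(struct.pack("<d", x))


def hash_float8(x: float) -> int:
    """Hash after canonicalizing values that compare as equal."""
    if x == 0.0:
        x = 0.0            # collapse minus zero into plus zero
    elif math.isnan(x):
        x = CANONICAL_NAN  # collapse every NaN into one bit pattern
    return zlib.crc32(struct.pack("<d", x))
```

Keeping one particular NaN pattern as the canonical one mirrors the commit's choice to leave the most common NaN's hash code unchanged, which limits the impact on existing hash indexes.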
2021-09-01 | Fix the random test failure in 001_rep_changes. | Amit Kapila
The check to test whether the subscription workers were restarting after a change in the subscription was failing. The reason was that the test was assuming the walsender started before it reaches the 'streaming' state and the walsender was exiting due to an error before that. Now, the walsender was erroring out before reaching the 'streaming' state because it tries to acquire the slot before the previous walsender has exited. In passing, improve the die messages so that it is easier to investigate the failures in the future if any. Reported-by: Michael Paquier, as per buildfarm Author: Ajin Cherian Reviewed-by: Masahiko Sawada, Amit Kapila Backpatch-through: 10, where this test was introduced Discussion: https://postgr.es/m/YRnhFxa9bo73wfpV@paquier.xyz
2021-08-31In pg_dump, avoid doing per-table queries for RLS policies.Tom Lane
For no particularly good reason, getPolicies() queried pg_policy separately for each table. We can collect all the policies in a single query instead, and attach them to the correct TableInfo objects using findTableByOid() lookups. On the regression database, this reduces the number of queries substantially, and provides a visible savings even when running against a local server. Per complaint from Hubert Depesz Lubaczewski. Since this is such a simple fix and can have a visible performance benefit, back-patch to all supported branches. Discussion: https://postgr.es/m/20210826084430.GA26282@depesz.com
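The shape of this change, one bulk fetch followed by in-memory lookups by OID, can be sketched as follows. This is an illustration only, not pg_dump's C code: the dictionary layout, the `attach_policies` helper, and the sample OIDs and policy names are hypothetical stand-ins for the TableInfo objects and findTableByOid() mentioned above.

```python
from typing import Dict, Iterable, List, Tuple


def attach_policies(
    tables_by_oid: Dict[int, dict],
    policy_rows: Iterable[Tuple[int, str]],
) -> int:
    """Attach each (table_oid, policy_name) row to its table; return rows attached."""
    attached = 0
    for table_oid, policy_name in policy_rows:
        table = tables_by_oid.get(table_oid)  # stands in for a lookup by OID
        if table is None:
            continue  # policy on a table that is not being dumped
        table["policies"].append(policy_name)
        attached += 1
    return attached


# Hypothetical tables being dumped, keyed by OID.
tables: Dict[int, dict] = {
    16384: {"name": "accounts", "policies": []},
    16390: {"name": "orders", "policies": []},
}

# Hypothetical result of a single query over all policies.
rows: List[Tuple[int, str]] = [
    (16384, "accounts_owner"),
    (16390, "orders_owner"),
    (16384, "accounts_admin"),
    (99999, "orphan_policy"),
]

attached_count = attach_policies(tables, rows)
```

The number of round trips to the server no longer depends on the number of tables; only the cheap in-memory lookup is repeated per row.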
2021-08-31Cache the results of format_type() queries in pg_dump.Tom Lane
There's long been a "TODO: there might be some value in caching the results" annotation on pg_dump's getFormattedTypeName function; but we hadn't gotten around to checking what it was costing us to repetitively look up type names. It turns out that when dumping the current regression database, about 10% of the total number of queries issued are duplicative format_type() queries. However, Hubert Depesz Lubaczewski reported a not-unusual case where these account for over half of the queries issued by pg_dump. Individually these queries aren't expensive, but when network lag is a factor, they add up to a problem. We can very easily add some caching to getFormattedTypeName to solve it. Since this is such a simple fix and can have a visible performance benefit, back-patch to all supported branches. Discussion: https://postgr.es/m/20210826084430.GA26282@depesz.com
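The caching idea can be sketched in a few lines. This is an illustration under stated assumptions, not the actual getFormattedTypeName implementation: the `FormattedTypeNameCache` class, its query counter, and the `fake_format_type` function that stands in for a server round trip are all hypothetical.

```python
from typing import Callable, Dict


class FormattedTypeNameCache:
    """Memoize type-name lookups so each type OID is queried at most once."""

    def __init__(self, run_query: Callable[[int], str]) -> None:
        self._run_query = run_query
        self._cache: Dict[int, str] = {}
        self.queries_issued = 0

    def get(self, type_oid: int) -> str:
        name = self._cache.get(type_oid)
        if name is None:
            # Cache miss: pay for one round trip, then remember the answer.
            self.queries_issued += 1
            name = self._run_query(type_oid)
            self._cache[type_oid] = name
        return name


def fake_format_type(type_oid: int) -> str:
    """Stand-in for a format_type() query sent to the server."""
    return {23: "integer", 25: "text"}[type_oid]


cache = FormattedTypeNameCache(fake_format_type)
names = [cache.get(oid) for oid in (23, 25, 23, 23, 25)]
```

Five lookups here cost two simulated queries. The saving grows with the number of columns that share a type, which is why it matters most when network lag makes each query expensive.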
2021-08-31Rename the role in stats_ext to have regress_ prefixTomas Vondra
Commit 5be8ce82e8 added a new role to the stats_ext regression suite, but the role name did not start with regress_ causing failures when running with ENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS. Fixed by renaming the role to start with the expected regress_ prefix. Backpatch-through: 10, same as the new regression test Discussion: https://postgr.es/m/1F238937-7CC2-4703-A1B1-6DC225B8978A%40enterprisedb.com
2021-08-31Fix lookup error in extended stats ownership checkTomas Vondra
When an ownership check on extended statistics object failed, the code was calling aclcheck_error_type to report the failure, which is clearly wrong, resulting in cache lookup errors. Fix by calling aclcheck_error. This issue exists since the introduction of extended statistics, so backpatch all the way back to PostgreSQL 10. It went unnoticed because there were no tests triggering the error, so add one. Reported-by: Mark Dilger Backpatch-through: 10, where extended stats were introduced Discussion: https://postgr.es/m/1F238937-7CC2-4703-A1B1-6DC225B8978A%40enterprisedb.com
2021-08-30Report tuple address in data-corruption error messageAlvaro Herrera
Most data-corruption reports mention the location of the problem, but this one failed to. Add it. Backpatch all the way back. In 12 and older, also assign the ERRCODE_DATA_CORRUPTED error code as was done in commit fd6ec93bf890 for 13 and later. Discussion: https://postgr.es/m/202108191637.oqyzrdtnheir@alvherre.pgsql
2021-08-30Fix incorrect error code in StartupReplicationOrigin().Amit Kapila
ERRCODE_CONFIGURATION_LIMIT_EXCEEDED was used for checksum failure; use ERRCODE_DATA_CORRUPTED instead. Reported-by: Tatsuhito Kasahara Author: Tatsuhito Kasahara Backpatch-through: 9.6, where it was introduced Discussion: https://postgr.es/m/CAP0=ZVLHtYffs8SOWcFJWrBGoRzT9QQbk+_aP+E5AHLNXiOorA@mail.gmail.com
2021-08-28psql \dP: reference regclass with "pg_catalog." prefixAlvaro Herrera
Strictly speaking this isn't a bug, but since all references to catalog objects are schema-qualified, we might as well be consistent. The omission first appeared in commit 1c5d9270e339, so backpatch to 12. Author: Justin Pryzby <pryzbyj@telsasoft.com> Discussion: https://postgr.es/m/20210827193151.GN26465@telsasoft.com
2021-08-27Fix data loss in wal_level=minimal crash recovery of CREATE TABLESPACE.Noah Misch
If the system crashed between CREATE TABLESPACE and the next checkpoint, the result could be some files in the tablespace unexpectedly containing no rows. Affected files would be those for which the system did not write WAL; see the wal_skip_threshold documentation. Before v13, a different set of conditions governed the writing of WAL; see v12's <sect2 id="populate-pitr">. (The v12 conditions were broader in some ways and narrower in others.) Users may want to audit non-default tablespaces for unexpected short files. The bug could have truncated an index without affecting the associated table, and reindexing the index would fix that particular problem. This fixes the bug by making create_tablespace_directories() more like TablespaceCreateDbspace(). create_tablespace_directories() was recursively removing tablespace contents, reasoning that WAL redo would recreate everything removed that way. That assumption holds for other wal_level values. Under wal_level=minimal, the old approach could delete files for which no other copy existed. Back-patch to 9.6 (all supported versions). Reviewed by Robert Haas and Prabhat Sahu. Reported by Robert Haas. Discussion: https://postgr.es/m/CA+TgmoaLO9ncuwvr2nN-J4VEP5XyAcy=zKiHxQzBbFRxxGxm0w@mail.gmail.com
2021-08-27Count SP-GiST index scans in pg_stat statistics.Tom Lane
Somehow, spgist overlooked the need to call pgstat_count_index_scan(). Hence, pg_stat_all_indexes.idx_scan and equivalent columns never became nonzero for an SP-GiST index, although the related per-tuple counters worked fine. This fix works a bit differently from other index AMs, in that the counter increment occurs in spgrescan not spggettuple/spggetbitmap. It looks like this won't make the user-visible semantics noticeably different, so I won't go to the trouble of introducing an is-this-the-first-call flag just to make the counter bumps happen in the same places. Per bug #17163 from Christian Quest. Back-patch to all supported versions. Discussion: https://postgr.es/m/17163-b8c5cc88322a5e92@postgresql.org
2021-08-25Fix broken snapshot handling in parallel workers.Robert Haas
Pengchengliu reported an assertion failure in a parallel worker while performing a parallel scan using an overflowed snapshot. The proximate cause is that TransactionXmin was set to an incorrect value. The underlying cause is incorrect snapshot handling in parallel.c. In particular, InitializeParallelDSM() was unconditionally calling GetTransactionSnapshot(), because I (rhaas) mistakenly thought that was always retrieving an existing snapshot whereas, at isolation levels less than REPEATABLE READ, it's actually taking a new one. So instead do this only at higher isolation levels where there actually is a single snapshot for the whole transaction. By itself, this is not a sufficient fix, because we still need to guarantee that TransactionXmin gets set properly in the workers. The easiest way to do that seems to be to install the leader's active snapshot as the transaction snapshot if the leader did not serialize a transaction snapshot. This doesn't affect the results of future GetTransactionSnapshot() calls since those have to take a new snapshot anyway; what we care about is the side effect of setting TransactionXmin. Report by Pengchengliu. Patch by Greg Nancarrow, except for some comment text which I supplied. Discussion: https://postgr.es/m/002f01d748ac$eaa781a0$bff684e0$@tju.edu.cn
2021-08-25Fix toast rewrites in logical decoding.Amit Kapila
Commit 325f2ec555 introduced pg_class.relwrite to skip operations on tables created as part of a heap rewrite during DDL. It links such transient heaps to the original relation OID via this new field in pg_class but forgot to do anything about toast tables. So, logical decoding was not able to skip operations on internally created toast tables. This led to an error when decoding the WAL for the next operation, because that operation appeared to have toast data when it actually had none. To fix this, set pg_class.relwrite for internally created toast tables as well, which allows logical decoding to skip operations on them. Author: Bertrand Drouvot Reviewed-by: David Zhang, Amit Kapila Backpatch-through: 11, where it was introduced Discussion: https://postgr.es/m/b5146fb1-ad9e-7d6e-f980-98ed68744a7c@amazon.com
2021-08-25Avoid using ambiguous word "positive" in error message.Fujii Masao
There are two identical error messages about the valid value of the modulus for a hash partition in the PostgreSQL source code. Commit 0e1275fb07 improved only one of them, avoiding the ambiguous word "positive" there, and forgot to improve the other. This commit improves the other, which should also reduce the translator burden. Back-patch to v11 where the error message exists. Author: Kyotaro Horiguchi Reviewed-by: Fujii Masao Discussion: https://postgr.es/m/20210819.170315.1413060634876301811.horikyota.ntt@gmail.com
2021-08-25Improve error message about valid value for distance in phrase operator.Fujii Masao
The distance in phrase operator must be an integer value between zero and MAXENTRYPOS inclusive. But previously the error message about its valid value included the information about its upper limit but not lower limit (i.e., zero). This commit improves the error message so that it also includes the information about its lower limit. Back-patch to v9.6 where full-text phrase search was supported. Author: Kyotaro Horiguchi Reviewed-by: Fujii Masao Discussion: https://postgr.es/m/20210819.170315.1413060634876301811.horikyota.ntt@gmail.com
2021-08-24Fix regexp misbehavior with capturing parens inside "{0}".Tom Lane
Regexps like "(.){0}...\1" drew an "invalid backreference number". That's not unreasonable on its face, since the capture group will never be matched if it's iterated zero times. However, other engines such as Perl's don't complain about this, nor do we throw an error for related cases such as "(.)|\1", even though that backref can never succeed either. Also, if the zero-iterations case happens at runtime rather than compile time --- say, "(x)*...\1" when there's no "x" to be found --- that's not an error, we just deem the backref to not match. Making this even less defensible, no error was thrown for nested cases such as "((.)){0}...\2"; and to add insult to injury, those cases could result in assertion failures instead. (It seems that nothing especially bad happened in non-assert builds, though.) Let's just fix it so that no error is thrown and instead the backref is deemed to never match, so that compile-time detection of no iterations behaves the same as run-time detection. Per report from Mark Dilger. This appears to be an aboriginal error in Spencer's library, so back-patch to all supported versions. Pre-v14, it turns out to also be necessary to back-patch one aspect of commits cb76fbd7e/00116dee5, namely to create capture-node subREs with the begin/end states of their subexpressions, not the current lp/rp of the outer parseqatom invocation. Otherwise delsub complains that we're trying to disconnect a state from itself. This is a bit scary but code examination shows that it's safe: in the pre-v14 code, if we want to wrap iteration around the subexpression, the first thing we do is overwrite the atom's begin/end fields with new states. So the bogus values didn't survive long enough to be used for anything, except if no iteration is required, in which case it doesn't matter. Discussion: https://postgr.es/m/A099E4A8-4377-4C64-A98C-3DEDDC075502@enterprisedb.com