summaryrefslogtreecommitdiff
path: root/src
AgeCommit message (Collapse)Author
2018-12-13Prevent GIN deleted pages from being reclaimed too earlyAlexander Korotkov
When GIN vacuum deletes a posting tree page, it assumes that no concurrent searchers can access it, thanks to ginStepRight() locking two pages at once. However, since 9.4 searches can skip parts of posting trees descending from the root. That leads to the risk that page is deleted and reclaimed before concurrent search can access it. This commit prevents the risk of above by waiting for every transaction, which might wait to reference this page, to finish. Due to binary compatibility we can't change GinPageOpaqueData to store corresponding transaction id. Instead we reuse page header pd_prune_xid field, which is unused in index pages. Discussion: https://postgr.es/m/31a702a.14dd.166c1366ac1.Coremail.chjischj%40163.com Author: Andrey Borodin, Alexander Korotkov Reviewed-by: Alexander Korotkov Backpatch-through: 9.4
2018-12-13Prevent deadlock in ginRedoDeletePage()Alexander Korotkov
On standby ginRedoDeletePage() can work concurrently with read-only queries. Those queries can traverse posting tree in two ways. 1) Using rightlinks by ginStepRight(), which locks the next page before unlocking its left sibling. 2) Using downlinks by ginFindLeafPage(), which locks at most one page at time. Original lock order was: page, parent, left sibling. That lock order can deadlock with ginStepRight(). In order to prevent deadlock this commit changes lock order to: left sibling, page, parent. Note, that position of parent in locking order seems insignificant, because we only lock one page at time while traversing downlinks. Reported-by: Chen Huajun Diagnosed-by: Chen Huajun, Peter Geoghegan, Andrey Borodin Discussion: https://postgr.es/m/31a702a.14dd.166c1366ac1.Coremail.chjischj%40163.com Author: Alexander Korotkov Backpatch-through: 9.4
2018-12-10Add stack depth checks to key recursive functions in backend/nodes/*.c.Tom Lane
Although copyfuncs.c has a check_stack_depth call in its recursion, equalfuncs.c, outfuncs.c, and readfuncs.c lacked one. This seems unwise. Likewise fix planstate_tree_walker(), in branches where that exists. Discussion: https://postgr.es/m/30253.1544286631@sss.pgh.pa.us
2018-12-06Improve our response to invalid format strings, and detect more cases.Tom Lane
Places that are testing for *printf failure ought to include the format string in their error reports, since bad-format-string is one of the more likely causes of such failure. This both makes it easier to find and repair the mistake, and provides at least some useful info to the user who stumbles across such a problem. Also, tighten snprintf.c to report EINVAL for an invalid flag or final character in a format %-spec (including the case where the %-spec is missing a final character altogether). This seems like better project policy, and it also allows removing an instruction or two from the hot code path. Back-patch the error reporting change in pvsnprintf, since it should be harmless and may be helpful; but not the snprintf.c change. Per discussion of bug #15511 from Ertuğrul Kahveci, which reported an invalid translated format string. These changes don't fix that error, but they should improve matters next time we make such a mistake. Discussion: https://postgr.es/m/15511-1d8b6a0bc874112f@postgresql.org
2018-11-29Ensure static libraries have correct mod time even if ranlib messes it up.Tom Lane
In at least Apple's version of ranlib, the output file is updated to have a mod time equal to the max of the timestamps of its components, and that data only has seconds precision. On a filesystem with sub-second file timestamp precision --- say, APFS --- this can result in the finished static library appearing older than its input files, which causes useless rebuilds and possible outright failures in parallel makes. We've only seen this reported in the field from people using Apple's ranlib with a non-Apple make, because Apple's make doesn't know about sub-second timestamps either so it doesn't decide rebuilds are needed. But Apple's ranlib presumably shares code with at least some BSDen, so it's not that unlikely that the same problem could arise elsewhere. To fix, just "touch" the output file after ranlib finishes. We seem to need this in only one place. There are other calls of ranlib in our makefiles, but they are working on intermediate files whose timestamps are not actually important, or else on an installed static library for which sub-second timestamp precision is unlikely to matter either. (Also, so far as I can tell, Apple's ranlib doesn't mess up the file timestamp in the latter usage anyhow.) In passing, change "ranlib" to "$(RANLIB)" in one place that was bypassing the make macro for no good reason. Per bug #15525 from Jack Kelly (via Alyssa Ross). Back-patch to all supported branches. Discussion: https://postgr.es/m/15525-a30da084f17a1faa@postgresql.org
2018-11-29Fix handling of synchronous replication for stopping WAL sendersMichael Paquier
This fixes an oversight from c6c3334 which has introduced a more strict ordering in the way WAL senders are stopped to prevent current WAL activity when a shutdown checkpoint is created. After all backends are stopped, all WAL senders are requested to stop which makes them stop any activity, and switching their state as stopping. Once the checkpointer knows that all WAL senders are in a stopping state, the shutdown checkpoint can begin, with all WAL senders activated, waiting for their clients to flush the shutdown checkpoint record. If a subset of WAL senders are stopping and in a sync state, other WAL senders could still be waiting for a WAL position to be synced while committing a transaction, however the subset of stopping senders would not release waiters, potentially breaking synchronous replication guarantees. This commit makes sure that even WAL senders stopping are able to release waiters properly. On 9.4, this can also trigger an assertion failure when setting for example max_wal_senders to 1 where a WAL sender is not able to find itself as in synchronous state when the instance stops. Reported-by: Paul Guo Author: Paul Guo, Michael Paquier Discussion: https://postgr.es/m/CAEET0ZEv8VFqT3C-cQm6byOB4r4VYWcef1J21dOX-gcVhCSpmA@mail.gmail.com Backpatch-through: 9.4
2018-11-28Do not decode TOAST data for table rewritesTomas Vondra
During table rewrites (VACUUM FULL and CLUSTER), the main heap is logged using XLOG / FPI records, and thus (correctly) ignored in decoding. But the associated TOAST table is WAL-logged as plain INSERT records, and so was logically decoded and passed to reorder buffer. That has severe consequences with TOAST tables of non-trivial size. Firstly, reorder buffer has to keep all those changes, possibly spilling them to a file, incurring I/O costs and disk space. Secondly, ReoderBufferCommit() was stashing all those TOAST chunks into a hash table, which got discarded only after processing the row from the main heap. But as the main heap is not decoded for rewrites, this never happened, so all the TOAST data accumulated in memory, resulting either in excessive memory consumption or OOM. The fix is simple, as commit e9edc1ba already introduced infrastructure (namely HEAP_INSERT_NO_LOGICAL flag) to skip logical decoding of TOAST tables, but it only applied it to system tables. So simply use it for all TOAST data in raw_heap_insert(). That would however solve only the memory consumption issue - the TOAST changes would still be decoded and added to the reorder buffer, and spilled to disk (although without TOAST tuple data, so much smaller). But we can solve that by tweaking DecodeInsert() to just ignore such INSERT records altogether, using XLH_INSERT_CONTAINS_NEW_TUPLE flag, instead of skipping them later in ReorderBufferCommit(). Review: Masahiko Sawada Discussion: https://www.postgresql.org/message-id/flat/1a17c643-e9af-3dba-486b-fbe31bc1823a%402ndquadrant.com Backpatch: 9.4-, where logical decoding was introduced
2018-11-26Fix translation of special characters in psql's LaTeX output modes.Tom Lane
latex_escaped_print() mistranslated \ and failed to provide any translation for # ^ and ~, all of which would typically lead to LaTeX document syntax errors. In addition it didn't translate < > and |, which would typically render as unexpected characters. To some extent this represents shortcomings in ancient versions of LaTeX, which if memory serves had no easy way to render these control characters as ASCII text. But that's been fixed for, um, decades. In any case there is no value in emitting guaranteed-to-fail output for these characters. Noted while fooling with test cases added by commit 9a98984f4. Back-patch the code change to all supported versions.
2018-11-24Update additional float4/8 expected-output files.Tom Lane
I forgot that the back branches have more variant files than HEAD :-(. Per buildfarm. Discussion: https://postgr.es/m/15519-4fc785b483201ff1@postgresql.org
2018-11-24Fix float-to-integer coercions to handle edge cases correctly.Tom Lane
ftoi4 and its sibling coercion functions did their overflow checks in a way that looked superficially plausible, but actually depended on an assumption that the MIN and MAX comparison constants can be represented exactly in the float4 or float8 domain. That fails in ftoi4, ftoi8, and dtoi8, resulting in a possibility that values near the MAX limit will be wrongly converted (to negative values) when they need to be rejected. Also, because we compared before rounding off the fractional part, the other three functions threw errors for values that really ought to get rounded to the min or max integer value. Fix by doing rint() first (requiring an assumption that it handles NaN and Inf correctly; but dtoi8 and ftoi8 were assuming that already), and by comparing to values that should coerce to float exactly, namely INTxx_MIN and -INTxx_MIN. Also remove some random cosmetic discrepancies between these six functions. This back-patches commits cbdb8b4c0 and 452b637d4. In the 9.4 branch, also back-patch the portion of 62e2a8dc2 that added PG_INTnn_MIN and related constants to c.h, so that these functions can rely on them. Per bug #15519 from Victor Petrovykh. Patch by me; thanks to Andrew Gierth for analysis and discussion. Discussion: https://postgr.es/m/15519-4fc785b483201ff1@postgresql.org
2018-11-19Fix old TAP tests' method for selecting a valid PGPORT value.Tom Lane
This code was trying to be paranoid, but it wasn't paranoid enough. It only ensured that the selected port is in 0..65535, while most Unix systems will refuse unprivileged attempts to use TCP port numbers below 1024. Change it to allow specification of ports 1024..65535, while if the port is outside that range, map it into 49152..65535 which is the port range used by our later branches. The main reason we've not noticed this up to now is that it's not important when testing over Unix-socket connections, only TCP, and most of our test code deliberately prevents the postmaster from opening any TCP ports. However, the SSL tests do open up a TCP port, and I believe this explains why buildfarm member chipmunk has been failing the SSL tests in 9.5: it's picking a reserved port number. Patch in 9.5 and 9.4. Later branches do not use this code.
2018-11-19Back-patch updated thread flags tests into 9.4 and 9.5.Tom Lane
This commit back-patches these 9.6-era commits into 9.4 and 9.5: e97af6c8b Replace our hacked version of ax_pthread.m4 with latest upstream version. 3b14a17c8 Move pthread-tests earlier in the autoconf script. 01051a987 Use AS_IF rather than plain shell "if" in pthread-check. a2932283c Update ax_pthread.m4 to an experimental draft version from upstream. The net result is to sync configure's checks for threading-related flags and libraries with the version we've been using since 9.6. The motivation for doing so now is that it seems the older code does not work correctly on very recent RHEL7/ppc64, as evidenced by buildfarm member quokka. The newer code is pretty battle-hardened by now, so this seems like a low-risk fix. Discussion: https://postgr.es/m/3320.1542647565@sss.pgh.pa.us
2018-11-19PANIC on fsync() failure.Thomas Munro
On some operating systems, it doesn't make sense to retry fsync(), because dirty data cached by the kernel may have been dropped on write-back failure. In that case the only remaining copy of the data is in the WAL. A subsequent fsync() could appear to succeed, but not have flushed the data. That means that a future checkpoint could apparently complete successfully but have lost data. Therefore, violently prevent any future checkpoint attempts by panicking on the first fsync() failure. Note that we already did the same for WAL data; this change extends that behavior to non-temporary data files. Provide a GUC data_sync_retry to control this new behavior, for users of operating systems that don't eject dirty data, and possibly forensic/testing uses. If it is set to on and the write-back error was transient, a later checkpoint might genuinely succeed (on a system that does not throw away buffers on failure); if the error is permanent, later checkpoints will continue to fail. The GUC defaults to off, meaning that we panic. Back-patch to all supported releases. There is still a narrow window for error-loss on some operating systems: if the file is closed and later reopened and a write-back error occurs in the intervening time, but the inode has the bad luck to be evicted due to memory pressure before we reopen, we could miss the error. A later patch will address that with a scheme for keeping files with dirty data open at all times, but we judge that to be too complicated to back-patch. Author: Craig Ringer, with some adjustments by Thomas Munro Reported-by: Craig Ringer Reviewed-by: Robert Haas, Thomas Munro, Andres Freund Discussion: https://postgr.es/m/20180427222842.in2e4mibx45zdth5%40alap3.anarazel.de
2018-11-19Don't forget about failed fsync() requests.Thomas Munro
If fsync() fails, md.c must keep the request in its bitmap, so that future attempts will try again. Back-patch to all supported releases. Author: Thomas Munro Reviewed-by: Amit Kapila Reported-by: Andrew Gierth Discussion: https://postgr.es/m/87y3i1ia4w.fsf%40news-spur.riddles.org.uk
2018-11-18Add valgrind suppressions for wcsrtombs optimizationsTomas Vondra
wcsrtombs (called through wchar2char from common functions like lower, upper, etc.) uses various optimizations that may look like access to uninitialized data, triggering valgrind reports. For example AVX2 instructions load data in 256-bit chunks, and gconv does something similar with 32-bit chunks. This is faster than accessing the bytes one by one, and the uninitialized part of the buffer is not actually used. So suppress the bogus reports. The exact stack depends on possible optimizations - it might be AVX, SSE (as in the report by Aleksander Alekseev) or something else. Hence the last frame is wildcarded, to deal with this. Backpatch all the way back to 9.4. Author: Tomas Vondra Discussion: https://www.postgresql.org/message-id/flat/90ac0452-e907-e7a4-b3c8-15bd33780e62%402ndquadrant.com Discussion: https://www.postgresql.org/message-id/20180220150838.GD18315@e733.localdomain
2018-11-14Second try at fixing numeric data passed through an ECPG SQLDA.Tom Lane
In commit ecfd55795, I removed sqlda.c's checks for ndigits != 0 on the grounds that we should duplicate the state of the numeric value's digit buffer even when all the digits are zeroes. However, that still isn't quite right, because another possible state of the digit buffer is buf == digits == NULL (this occurs for a NaN). As the code now stands, it'll invoke memcpy with a NULL source address and zero bytecount, which we know a few platforms crash on. Hence, reinstate the no-copy short-circuit, but make it test specifically for buf != NULL rather than some other condition. In hindsight, the ndigits test (added by commit f2ae9f9c3) was almost certainly meant to fix the NaN case not the all-zeroes case as the associated thread alleged. As before, back-patch to all supported versions. Discussion: https://postgr.es/m/1803D792815FC24D871C00D17AE95905C71161@g01jpexmbkw24
2018-11-14Initialize TransactionState and user ID consistently at transaction startMichael Paquier
If a failure happens when a transaction is starting between the moment the transaction status is changed from TRANS_DEFAULT to TRANS_START and the moment the current user ID and security context flags are fetched via GetUserIdAndSecContext(), or before initializing its basic fields, then those may get reset to incorrect values when the transaction aborts, leaving the session in an inconsistent state. One problem reported is that failing a starting transaction at the first query of a session could cause several kinds of system crashes on the follow-up queries. In order to solve that, move the initialization of the transaction state fields and the call of GetUserIdAndSecContext() in charge of fetching the current user ID close to the point where the transaction status is switched to TRANS_START, where there cannot be any error triggered in-between, per an idea of Tom Lane. This properly ensures that the current user ID, the security context flags and that the basic fields of TransactionState remain consistent even if the transaction fails while starting. Reported-by: Richard Guo Diagnosed-By: Richard Guo Author: Michael Paquier Reviewed-by: Tom Lane Discussion: https://postgr.es/m/CAN_9JTxECSb=pEPcb0a8d+6J+bDcOZ4=DgRo_B7Y5gRHJUM=Rw@mail.gmail.com Backpatch-through: 9.4
2018-11-13Fix incorrect results for numeric data passed through an ECPG SQLDA.Tom Lane
Numeric values with leading zeroes were incorrectly copied into a SQLDA (SQL Descriptor Area), leading to wrong results in ECPG programs. Report and patch by Daisuke Higuchi. Back-patch to all supported versions. Discussion: https://postgr.es/m/1803D792815FC24D871C00D17AE95905C71161@g01jpexmbkw24
2018-11-12Limit the number of index clauses considered in choose_bitmap_and().Tom Lane
classify_index_clause_usage() is O(N^2) in the number of distinct index qual clauses it considers, because of its use of a simple search list to store them. For nearly all queries, that's fine because only a few clauses will be considered. But Alexander Kuzmenkov reported a machine-generated query with 80000 (!) index qual clauses, which caused this code to take forever. Somewhat remarkably, this is the only O(N^2) behavior we now have for such a query, so let's fix it. We can get rid of the O(N^2) runtime for cases like this without much damage to the functionality of choose_bitmap_and() by separating out paths with "too many" qual or pred clauses, and deeming them to always be nonredundant with other paths. Then their clauses needn't go into the search list, so it doesn't get too long, but we don't lose the ability to consider bitmap AND plans altogether. I set the threshold for "too many" to be 100 clauses per path, which should be plenty to ensure no change in planning behavior for normal queries. There are other things we could do to make this go faster, but it's not clear that it's worth any additional effort. 80000 qual clauses require a whole lot of work in many other places, too. The code's been like this for a long time, so back-patch to all supported branches. The troublesome query only works back to 9.5 (in 9.4 it fails with stack overflow in the parser); so I'm not sure that fixing this in 9.4 has any real-world benefit, but perhaps it does. Discussion: https://postgr.es/m/90c5bdfa-d633-dabe-9889-3cf3e1acd443@postgrespro.ru
2018-11-09Fix missing role dependencies for some schema and type ACLs.Tom Lane
This patch fixes several related cases in which pg_shdepend entries were never made, or were lost, for references to roles appearing in the ACLs of schemas and/or types. While that did no immediate harm, if a referenced role were later dropped, the drop would be allowed and would leave a dangling reference in the object's ACL. That still wasn't a big problem for normal database usage, but it would cause obscure failures in subsequent dump/reload or pg_upgrade attempts, taking the form of attempts to grant privileges to all-numeric role names. (I think I've seen field reports matching that symptom, but can't find any right now.) Several cases are fixed here: 1. ALTER DOMAIN SET/DROP DEFAULT would lose the dependencies for any existing ACL entries for the domain. This case is ancient, dating back as far as we've had pg_shdepend tracking at all. 2. If a default type privilege applies, CREATE TYPE recorded the ACL properly but forgot to install dependency entries for it. This dates to the addition of default privileges for types in 9.2. 3. If a default schema privilege applies, CREATE SCHEMA recorded the ACL properly but forgot to install dependency entries for it. This dates to the addition of default privileges for schemas in v10 (commit ab89e465c). Another somewhat-related problem is that when creating a relation rowtype or implicit array type, TypeCreate would apply any available default type privileges to that type, which we don't really want since such an object isn't supposed to have privileges of its own. (You can't, for example, drop such privileges once they've been added to an array type.) ab89e465c is also to blame for a race condition in the regression tests: privileges.sql transiently installed globally-applicable default privileges on schemas, which sometimes got absorbed into the ACLs of schemas created by concurrent test scripts. This should have resulted in failures when privileges.sql tried to drop the role holding such privileges; but thanks to the bug fixed here, it instead led to dangling ACLs in the final state of the regression database. We'd managed not to notice that, but it became obvious in the wake of commit da906766c, which allowed the race condition to occur in pg_upgrade tests. To fix, add a function recordDependencyOnNewAcl to encapsulate what callers of get_user_default_acl need to do; while the original call sites got that right via ad-hoc code, none of the later-added ones have. Also change GenerateTypeDependencies to generate these dependencies, which requires adding the typacl to its parameter list. (That might be annoying if there are any extensions calling that function directly; but if there are, they're most likely buggy in the same way as the core callers were, so they need work anyway.) While I was at it, I changed GenerateTypeDependencies to accept most of its parameters in the form of a Form_pg_type pointer, making its parameter list a bit less unwieldy and mistake-prone. The test race condition is fixed just by wrapping the addition and removal of default privileges into a single transaction, so that that state is never visible externally. We might eventually prefer to separate out tests of default privileges into a script that runs by itself, but that would be a bigger change and would make the tests run slower overall. Back-patch relevant parts to all supported branches. Discussion: https://postgr.es/m/15719.1541725287@sss.pgh.pa.us
2018-11-08Disallow setting client_min_messages higher than ERROR.Tom Lane
Previously it was possible to set client_min_messages to FATAL or PANIC, which had the effect of suppressing transmission of regular ERROR messages to the client. Perhaps that seemed like a useful option in the past, but the trouble with it is that it breaks guarantees that are explicitly made in our FE/BE protocol spec about how a query cycle can end. While libpq and psql manage to cope with the omission, that's mostly because they are not very bright; client libraries that have more semantic knowledge are likely to get confused. Notably, pgODBC doesn't behave very sanely. Let's fix this by getting rid of the ability to set client_min_messages above ERROR. In HEAD, just remove the FATAL and PANIC options from the set of allowed enum values for client_min_messages. (This change also affects trace_recovery_messages, but that's OK since these aren't useful values for that variable either.) In the back branches, there was concern that rejecting these values might break applications that are explicitly setting things that way. I'm pretty skeptical of that argument, but accommodate it by accepting these values and then internally setting the variable to ERROR anyway. In all branches, this allows a couple of tiny simplifications in the logic in elog.c, so do that. Also respond to the point that was made that client_min_messages has exactly nothing to do with the server's logging behavior, and therefore does not belong in the "When To Log" subsection of the documentation. The "Statement Behavior" subsection is a better match, so move it there. Jonah Harris and Tom Lane Discussion: https://postgr.es/m/7809.1541521180@sss.pgh.pa.us Discussion: https://postgr.es/m/15479-ef0f4cc2fd995ca2@postgresql.org
2018-11-06GUC: adjust effective_cache_size SQL descriptionsBruce Momjian
Follow on patch for commit 3e0f1a4741f564c1a2fa6e944729d6967355d8c7. Reported-by: Peter Eisentraut Discussion: https://postgr.es/m/369ec766-b947-51bd-4dad-6fb9e026439f@2ndquadrant.com Backpatch-through: 9.4
2018-11-05Stamp 9.4.20.REL9_4_20Tom Lane
2018-11-05Fix copy-paste error in errhint() introduced in 691d79a07933.Andres Freund
Reported-By: Petr Jelinek Discussion: https://postgr.es/m/c95a620b-34f0-7930-aeb5-f7ab804f26cb@2ndquadrant.com Backpatch: 9.4-, like the previous commit
2018-11-05Translation updatesPeter Eisentraut
Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 23063751d2d17da76d34ddfdead3f633041a6cbe
2018-11-03Make ts_locale.c's character-type functions cope with UTF-16.Tom Lane
On Windows, in UTF8 database encoding, what char2wchar() produces is UTF16 not UTF32, ie, characters above U+FFFF will be represented by surrogate pairs. t_isdigit() and siblings did not account for this and failed to provide a large enough result buffer. That in turn led to bogus "invalid multibyte character for locale" errors, because contrary to what you might think from char2wchar()'s documentation, its Windows code path doesn't cope sanely with buffer overflow. The solution for t_isdigit() and siblings is pretty clear: provide a 3-wchar_t result buffer not 2. char2wchar() also needs some work to provide more consistent, and more accurately documented, buffer overrun behavior. But that's a bigger job and it doesn't actually have any immediate payoff, so leave it for later. Per bug #15476 from Kenji Uno, who deserves credit for identifying the cause of the problem. Back-patch to all active branches. Discussion: https://postgr.es/m/15476-4314f480acf0f114@postgresql.org
2018-11-02Yet further rethinking of build changes for macOS Mojave.Tom Lane
The solution arrived at in commit e74dd00f5 presumes that the compiler has a suitable default -isysroot setting ... but further experience shows that in many combinations of macOS version, XCode version, Xcode command line tools version, and phase of the moon, Apple's compiler will *not* supply a default -isysroot value. We could potentially go back to the approach used in commit 68fc227dd, but I don't have a lot of faith in the reliability or life expectancy of that either. Let's just revert to the approach already shipped in 11.0, namely specifying an -isysroot switch globally. As a partial response to the concerns raised by Jakob Egger, adjust the contents of Makefile.global to look like CPPFLAGS = -isysroot $(PG_SYSROOT) ... PG_SYSROOT = /path/to/sysroot This allows overriding the sysroot path at build time in a relatively painless way. Add documentation to installation.sgml about how to use the PG_SYSROOT option. I also took the opportunity to document how to work around macOS's "System Integrity Protection" feature. As before, back-patch to all supported versions. Discussion: https://postgr.es/m/20840.1537850987@sss.pgh.pa.us
2018-11-02GUC: adjust effective_cache_size docs and SQL descriptionBruce Momjian
Clarify that effective_cache_size is both kernel buffers and shared buffers. Reported-by: nat@makarevitch.org Discussion: https://postgr.es/m/153685164808.22334.15432535018443165207@wrigleys.postgresql.org Backpatch-through: 9.3
2018-11-01Fix error message typo introduced 691d79a07933.Andres Freund
Reported-By: Michael Paquier Discussion: https://postgr.es/m/20181101003405.GB1727@paquier.xyz Backpatch: 9.4-, like the previous commit
2018-10-31Disallow starting server with insufficient wal_level for existing slot.Andres Freund
Previously it was possible to create a slot, change wal_level, and restart, even if the new wal_level was insufficient for the slot. That's a problem for both logical and physical slots, because the necessary WAL records are not generated. This removes a few tests in newer versions that, somewhat inexplicably, whether restarting with a too low wal_level worked (a buggy behaviour!). Reported-By: Joshua D. Drake Author: Andres Freund Discussion: https://postgr.es/m/20181029191304.lbsmhshkyymhw22w@alap3.anarazel.de Backpatch: 9.4-, where replication slots where introduced
2018-10-31Fix memory leak in repeated SPGIST index scans.Tom Lane
spgendscan neglected to pfree all the memory allocated by spgbeginscan. It's possible to get away with that in most normal queries, since the memory is allocated in the executor's per-query context which is about to get deleted anyway; but it causes severe memory leakage during creation or filling of large exclusion-constraint indexes. Also, document that amendscan is supposed to free what ambeginscan allocates. The docs' lack of clarity on that point probably caused this bug to begin with. (There is discussion of changing that API spec going forward, but I don't think it'd be appropriate for the back branches.) Per report from Bruno Wolff. It's been like this since the beginning, so back-patch to all active branches. In HEAD, also fix an independent leak caused by commit 2a6368343 (allocating memory during spgrescan instead of spgbeginscan, which might be all right if it got cleaned up, but it didn't). And do a bit of code beautification on that commit, too. Discussion: https://postgr.es/m/20181024012314.GA27428@wolff.to
2018-10-31Sync our copy of the timezone library with IANA release tzcode2018g.Tom Lane
This patch absorbs an upstream fix to "zic" for a recently-introduced bug that made it output data that some 32-bit clients couldn't read. Given the current source data, the bug only manifests in zones with leap seconds, which we don't generate, so that there's no actual change in our installed timezone data files from this. Still, in case somebody uses our copy of "zic" to do something else, it seems best to apply the fix promptly. Also, update the README's notes about converting upstream code to our conventions.
2018-10-31Update time zone data files to tzdata release 2018g.Tom Lane
DST law changes in Morocco (with, effectively, zero notice). Historical corrections for Hawaii.
2018-10-28Fix perl searchpath for modern perl for MSVC toolsAndrew Dunstan
Modern versions of perl no longer include the current directory in the perl searchpath, as it's insecure. Instead of adding the current directory, we get around the problem by adding the directory where the script lives. Problem noted by Victor Wagner. Solution adapted from buildfarm client code. Backpatch to all live versions.
2018-10-19Sync our copy of the timezone library with IANA release tzcode2018f.Tom Lane
About half of this is purely cosmetic changes to reduce the diff between our code and theirs, like inserting "const" markers where they have them. The other half is tracking actual code changes in zic.c and localtime.c. I don't think any of these represent near-term compatibility hazards, but it seems best to stay up to date. I also fixed longstanding bugs in our code for producing the known_abbrevs.txt list, which by chance hadn't been exposed before, but which resulted in some garbage output after applying the upstream changes in zic.c. Notably, because upstream removed their old phony transitions at the Big Bang, it's now necessary to cope with TZif files containing no DST transition times at all.
2018-10-19Update time zone data files to tzdata release 2018f.Tom Lane
DST law changes in Chile, Fiji, and Russia (Volgograd). Historical corrections for China, Japan, Macau, and North Korea. Note: like the previous tzdata update, this involves a depressingly large amount of semantically-meaningless churn in tzdata.zi. That is a consequence of upstream's data compression method assigning unstable abbreviations to DST rulesets. I complained about that to them last time, and this version now uses an assignment method that pays some heed to not changing abbreviations unnecessarily. So hopefully, that'll be better going forward.
2018-10-18Still further rethinking of build changes for macOS Mojave.Tom Lane
To avoid the sorts of problems complained of by Jakob Egger, it'd be best if configure didn't emit any references to the sysroot path at all. In the case of PL/Tcl, we can do that just by keeping our hands off the TCL_INCLUDE_SPEC string altogether. In the case of PL/Perl, we need to substitute -iwithsysroot for -I in the compile commands, which is easily handled if we change to using a configure output variable that includes the switch not only the directory name. Since PL/Tcl and PL/Python already do it like that, this seems like good consistency cleanup anyway. Hence, this replaces the advice given to Perl-related extensions in commit 5e2217131; instead of writing "-I$(perl_archlibexp)/CORE", they should just write "$(perl_includespec)". (The old way continues to work, but not on recent macOS.) It's still the case that configure needs to be aware of the sysroot path internally, but that's cleaner than what we had before. As before, back-patch to all supported versions. Discussion: https://postgr.es/m/20840.1537850987@sss.pgh.pa.us
2018-10-17Fix minor bug in isolationtester.Tom Lane
If the lock wait query failed, isolationtester would report the PQerrorMessage from some other connection, meaning there would be no message or an unrelated one. This seems like a pretty unlikely occurrence, but if it did happen, this bug could make it really difficult/confusing to figure out what happened. That seems to justify patching all the way back. In passing, clean up another place where the "wrong" conn was used for an error report. That one's not actually buggy because it's a different alias for the same connection, but it's still confusing to the reader.
2018-10-17Improve tzparse's handling of TZDEFRULES ("posixrules") zone data.Tom Lane
In the IANA timezone code, tzparse() always tries to load the zone file named by TZDEFRULES ("posixrules"). Previously, we'd hacked that logic to skip the load in the "lastditch" code path, which we use only to initialize the default "GMT" zone during GUC initialization. That's critical for a couple of reasons: since we do not support leap seconds, we *must not* allow "GMT" to have leap seconds, and since this case runs before the GUC subsystem is fully alive, we'd really rather not take the risk of pg_open_tzfile throwing any errors. However, that still left the code reading TZDEFRULES on every other call, something we'd noticed to the extent of having added code to cache the result so it was only done once per process not a lot of times. Andres Freund complained about the static data space used up for the cache; but as long as the logic was like this, there was no point in trying to get rid of that space. We can improve matters by looking a bit more closely at what the IANA code actually needs the TZDEFRULES data for. One thing it does is that if "posixrules" is a leap-second-aware zone, the leap-second behavior will be absorbed into every POSIX-style zone specification. However, that's a behavior we'd really prefer to do without, since for our purposes the end effect is to render every POSIX-style zone name unsupported. Otherwise, the TZDEFRULES data is used only if the POSIX zone name specifies DST but doesn't include a transition date rule (e.g., "EST5EDT" rather than "EST5EDT,M3.2.0,M11.1.0"). That is a minority case for our purposes --- in particular, it never happens when tzload() invokes tzparse() to interpret a transition date rule string found in a tzdata zone file. Hence, if we legislate that we're going to ignore leap-second data from "posixrules", we can postpone the TZDEFRULES load into the path where we actually need to substitute for a missing date rule string. That means it will never happen at all in common scenarios, making it reasonable to dynamically allocate the cache space when it does happen. Even when the data is already loaded, this saves some cycles in the common code path since we avoid a memcpy of 23KB or so. And, IMO at least, this is a less ugly hack on the IANA logic than what we had before, since it's not messing with the lastditch-vs-regular code paths. Back-patch to all supported branches, not so much because this is a critical change as that I want to keep all our copies of the IANA timezone code in sync. Discussion: https://postgr.es/m/20181015200754.7y7zfuzsoux2c4ya@alap3.anarazel.de
2018-10-16Back off using -isysroot on Darwin.Tom Lane
Rethink the solution applied in commit 5e2217131 to get PL/Tcl to build on macOS Mojave. I feared that adding -isysroot globally might have undesirable consequences, and sure enough Jakob Egger reported one: it complicates building extensions with a different Xcode version than was used for the core server. (I find that a risky proposition in general, but apparently it works most of the time, so we shouldn't break it if we don't have to.) We'd already adopted the solution for PL/Perl of inserting the sysroot path directly into the -I switches used to find Perl's headers, and we can do the same thing for PL/Tcl by changing the -iwithsysroot switch that Apple's tclConfig.sh reports. This restricts the risks to PL/Perl and PL/Tcl themselves and directly-dependent extensions, which is a lot more pleasing in general than a global -isysroot switch. Along the way, tighten the test to see if we need to inject the sysroot path into $perl_includedir, as I'd speculated about upthread but not gotten round to doing. As before, back-patch to all supported versions. Discussion: https://postgr.es/m/20840.1537850987@sss.pgh.pa.us
2018-10-16Avoid rare race condition in privileges.sql regression test.Tom Lane
We created a temp table, then switched to a new session, leaving the old session to clean up its temp objects in background. If that took long enough, the eventual attempt to drop the user that owns the temp table could fail, as exhibited today by sidewinder. Fix by dropping the temp table explicitly when we're done with it. It's been like this for quite some time, so back-patch to all supported branches. Report: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sidewinder&dt=2018-10-16%2014%3A45%3A00
2018-10-16Avoid statically allocating gmtsub()'s timezone workspace.Tom Lane
localtime.c's "struct state" is a rather large object, ~23KB. We were statically allocating one for gmtsub() to use to represent the GMT timezone, even though that function is not at all heavily used and is never reached in most backends. Let's malloc it on-demand, instead. This does pose the question of how to handle a malloc failure, but there's already a well-defined error report convention here, ie set errno and return NULL. We have but one caller of pg_gmtime in HEAD, and two in back branches, neither of which were troubling to check for error. Make them do so. The possible errors are sufficiently unlikely (out-of-range timestamp, and now malloc failure) that I think elog() is adequate. Back-patch to all supported branches to keep our copies of the IANA timezone code in sync. This particular change is in a stanza that already differs from upstream, so it's a wash for maintenance purposes --- but only as long as we keep the branches the same. Discussion: https://postgr.es/m/20181015200754.7y7zfuzsoux2c4ya@alap3.anarazel.de
2018-10-15Check for stack overrun in standard_ProcessUtility().Tom Lane
ProcessUtility can recurse, and indeed can be driven to infinite recursion, so it ought to have a check_stack_depth() call. This covers the reported bug (portal trying to execute itself) and a bunch of other cases that could perhaps arise somewhere. Per bug #15428 from Malthe Borch. Back-patch to all supported branches. Discussion: https://postgr.es/m/15428-b3c2915ec470b033@postgresql.org
2018-10-14Avoid duplicate XIDs at recovery when building initial snapshotMichael Paquier
On a primary, sets of XLOG_RUNNING_XACTS records are generated on a periodic basis to allow recovery to build the initial state of transactions for a hot standby. The set of transaction IDs is created by scanning all the entries in ProcArray. However it happens that its logic never counted on the fact that two-phase transactions finishing to prepare can put ProcArray in a state where there are two entries with the same transaction ID, one for the initial transaction which gets cleared when prepare finishes, and a second, dummy, entry to track that the transaction is still running after prepare finishes. This way ensures a continuous presence of the transaction so as callers of for example TransactionIdIsInProgress() are always able to see it as alive. So, if a XLOG_RUNNING_XACTS takes a standby snapshot while a two-phase transaction finishes to prepare, the record can finish with duplicated XIDs, which is a state expected by design. If this record gets applied on a standby to initial its recovery state, then it would simply fail, so the odds of facing this failure are very low in practice. It would be tempting to change the generation of XLOG_RUNNING_XACTS so as duplicates are removed on the source, but this requires to hold on ProcArrayLock for longer and this would impact all workloads, particularly those using heavily two-phase transactions. XLOG_RUNNING_XACTS is also actually used only to initialize the standby state at recovery, so instead the solution is taken to discard duplicates when applying the initial snapshot. Diagnosed-by: Konstantin Knizhnik Author: Michael Paquier Discussion: https://postgr.es/m/0c96b653-4696-d4b4-6b5d-78143175d113@postgrespro.ru Backpatch-through: 9.3
2018-10-12Remove abstime, reltime, tinterval tables from old regression databases.Tom Lane
In the back branches, drop these tables after the regression tests are done with them. This fixes failures of cross-branch pg_upgrade testing caused by these types having been removed in v12. We do lose the ability to test dump/restore behavior with these types in the back branches, but the actual loss of code coverage seems to be nil given that there's nothing very special about these types. Discussion: https://postgr.es/m/20181009192237.34wjp3nmw7oynmmr@alap3.anarazel.de
2018-10-12Back-patch addition of the ALLOCSET_FOO_SIZES macros.Tom Lane
These macros were originally added in commit ea268cdc9, and back-patched into 9.6 before 9.6.0. However, some extensions would like to use them in older branches, and there seems no harm in providing them. So add them to all supported branches. Per suggestions from Christoph Berg and Andres Freund. Discussion: https://postgr.es/m/20181012170355.bhxi273skjt6sag4@alap3.anarazel.de
2018-10-10Fix logical decoding error when system table w/ toast is repeatedly rewritten.Andres Freund
Repeatedly rewriting a mapped catalog table with VACUUM FULL or CLUSTER could cause logical decoding to fail with: ERROR, "could not map filenode \"%s\" to relation OID" To trigger the problem the rewritten catalog had to have live tuples with toasted columns. The problem was triggered as during catalog table rewrites the heap_insert() check that prevents logical decoding information to be emitted for system catalogs, failed to treat the new heap's toast table as a system catalog (because the new heap is not recognized as a catalog table via RelationIsLogicallyLogged()). The relmapper, in contrast to the normal catalog contents, does not contain historical information. After a single rewrite of a mapped table the new relation is known to the relmapper, but if the table is rewritten twice before logical decoding occurs, the relfilenode cannot be mapped to a relation anymore. Which then leads us to error out. This only happens for toast tables, because the main table contents aren't re-inserted with heap_insert(). The fix is simple, add a new heap_insert() flag that prevents logical decoding information from being emitted, and accept during decoding that there might not be tuple data for toast tables. Unfortunately that does not fix pre-existing logical decoding errors. Doing so would require not throwing an error when a filenode cannot be mapped to a relation during decoding, and that seems too likely to hide bugs. If it's crucial to fix decoding for an existing slot, temporarily changing the ERROR in ReorderBufferCommit() to a WARNING appears to be the best fix. Author: Andres Freund Discussion: https://postgr.es/m/20180914021046.oi7dm4ra3ot2g2kt@alap3.anarazel.de Backpatch: 9.4-, where logical decoding was introduced
2018-10-05Allow btree comparison functions to return INT_MIN.Tom Lane
Historically we forbade datatype-specific comparison functions from returning INT_MIN, so that it would be safe to invert the sort order just by negating the comparison result. However, this was never really safe for comparison functions that directly return the result of memcmp(), strcmp(), etc, as POSIX doesn't place any such restriction on those library functions. Buildfarm results show that at least on recent Linux on s390x, memcmp() actually does return INT_MIN sometimes, causing sort failures. The agreed-on answer is to remove this restriction and fix relevant call sites to not make such an assumption; code such as "res = -res" should be replaced by "INVERT_COMPARE_RESULT(res)". The same is needed in a few places that just directly negated the result of memcmp or strcmp. To help find places having this problem, I've also added a compile option to nbtcompare.c that causes some of the commonly used comparators to return INT_MIN/INT_MAX instead of their usual -1/+1. It'd likely be a good idea to have at least one buildfarm member running with "-DSTRESS_SORT_INT_MIN". That's far from a complete test of course, but it should help to prevent fresh introductions of such bugs. This is a longstanding portability hazard, so back-patch to all supported branches. Discussion: https://postgr.es/m/20180928185215.ffoq2xrq5d3pafna@alap3.anarazel.de
2018-10-02Set snprintf.c's maximum number of NL arguments to be 31.Tom Lane
Previously, we used the platform's NL_ARGMAX if any, otherwise 16. The trouble with this is that the platform value is hugely variable, ranging from the POSIX-minimum 9 to as much as 64K on recent FreeBSD. Values of more than a dozen or two have no practical use and slow down the initialization of the argtypes array. Worse, they cause snprintf.c to consume far more stack space than was the design intention, possibly resulting in stack-overflow crashes. Standardize on 31, which is comfortably more than we need (it looks like no existing translatable message has more than about 10 parameters). I chose that, not 32, to make the array sizes powers of 2, for some possible small gain in speed of the memset. The lack of reported crashes suggests that the set of platforms we use snprintf.c on (in released branches) may have no overlap with the set where NL_ARGMAX has unreasonably large values. But that's not entirely clear, so back-patch to all supported branches. Per report from Mateusz Guzik (via Thomas Munro). Discussion: https://postgr.es/m/CAEepm=3VF=PUp2f8gU8fgZB22yPE_KBS0+e1AHAtQ=09schTHg@mail.gmail.com
2018-10-02Fix corner-case failures in has_foo_privilege() family of functions.Tom Lane
The variants of these functions that take numeric inputs (OIDs or column numbers) are supposed to return NULL rather than failing on bad input; this rule reduces problems with snapshot skew when queries apply the functions to all rows of a catalog. has_column_privilege() had careless handling of the case where the table OID didn't exist. You might get something like this: select has_column_privilege(9999,'nosuchcol','select'); ERROR: column "nosuchcol" of relation "(null)" does not exist or you might get a crash, depending on the platform's printf's response to a null string pointer. In addition, while applying the column-number variant to a dropped column returned NULL as desired, applying the column-name variant did not: select has_column_privilege('mytable','........pg.dropped.2........','select'); ERROR: column "........pg.dropped.2........" of relation "mytable" does not exist It seems better to make this case return NULL as well. Also, the OID-accepting variants of has_foreign_data_wrapper_privilege, has_server_privilege, and has_tablespace_privilege didn't follow the principle of returning NULL for nonexistent OIDs. Superusers got TRUE, everybody else got an error. Per investigation of Jaime Casanova's report of a new crash in HEAD. These behaviors have been like this for a long time, so back-patch to all supported branches. Patch by me; thanks to Stephen Frost for discussion and review Discussion: https://postgr.es/m/CAJGNTeP=-6Gyqq5TN9OvYEydi7Fv1oGyYj650LGTnW44oAzYCg@mail.gmail.com