summaryrefslogtreecommitdiff
path: root/src
AgeCommit message (Collapse)Author
2019-02-17Fix CREATE VIEW to allow zero-column views.Tom Lane
We should logically have allowed this case when we allowed zero-column tables, but it was overlooked. Although this might be thought a feature addition, it's really a bug fix, because it was possible to create a zero-column view via the convert-table-to-view code path, and then you'd have a situation where dump/reload would fail. Hence, back-patch to all supported branches. Arrange the added test cases to provide coverage of the related pg_dump code paths (since these views will be dumped and reloaded during the pg_upgrade regression test). I also made them test the case where pg_dump has to postpone the view rule into post-data, which disturbingly had no regression coverage before. Report and patch by Ashutosh Sharma (test case by me) Discussion: https://postgr.es/m/CAE9k0PkmHdeSaeZt2ujnb_cKucmK3sDDceDzw7+d5UZoNJPYOg@mail.gmail.com
2019-02-15Fix race in dsm_attach() when handles are reused.Thomas Munro
DSM handle values can be reused as soon as the underlying shared memory object has been destroyed. That means that for a brief moment we might have two DSM slots with the same handle. While trying to attach, if we encounter a slot with refcnt == 1, meaning that it is currently being destroyed, we should continue our search in case the same handle exists in another slot. The race manifested as a rare "dsa_area could not attach to segment" error, and was more likely in 10 and 11 due to the lack of distinct seed for random() in parallel workers. It was made very unlikely in in master by commit 197e4af9, and older releases don't usually create new DSM segments in background workers so it was also unlikely there. This fixes the root cause of bug report #15585, in which the error could also sometimes result in a self-deadlock in the error path. It's not yet clear if further changes are needed to avoid that failure mode. Back-patch to 9.4, where dsm.c arrived. Author: Thomas Munro Reported-by: Justin Pryzby, Sergei Kornilov Discussion: https://postgr.es/m/20190207014719.GJ29720@telsasoft.com Discussion: https://postgr.es/m/15585-324ff6a93a18da46@postgresql.org
2019-02-12Relax overly strict assertionAlvaro Herrera
Ever since its birth, ReorderBufferBuildTupleCidHash() has contained an assertion that a catalog tuple cannot change Cmax after acquiring one. But that's wrong: if a subtransaction executes DDL that affects that catalog tuple, and later aborts and another DDL affects the same tuple, it will change Cmax. Relax the assertion to merely verify that the Cmax remains valid and monotonically increasing, instead. Add a test that tickles the relevant code. Diagnosed by, and initial patch submitted by: Arseny Sher Co-authored-by: Arseny Sher Discussion: https://postgr.es/m/874l9p8hyw.fsf@ars-thinkpad
2019-02-12Fix erroneous error reports in snapbuild.c.Tom Lane
It's pretty unhelpful to report the wrong file name in a complaint about syscall failure, but SnapBuildSerialize managed to do that twice in a span of 50 lines. Also fix half a dozen missing or poorly-chosen errcode assignments; that's mostly cosmetic, but still wrong. Noted while studying recent failures on buildfarm member nightjar. I'm not sure whether those reports are actually giving the wrong filename, because there are two places here with identically spelled error messages. The other one is specifically coded not to report ENOENT, but if it's this one, how could we be getting ENOENT from open() with O_CREAT? Need to sit back and await results. However, these ereports are clearly broken from birth, so back-patch.
2019-02-11Stamp 9.4.21.REL9_4_21Tom Lane
2019-02-11Translation updatesPeter Eisentraut
Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: db7a038785d2919e23e222be1ffa1866087fa803
2019-02-09Repair unsafe/unportable snprintf usage in pg_restore.Tom Lane
warn_or_exit_horribly() was blithely passing a potentially-NULL string pointer to a %s format specifier. That works (at least to the extent of not crashing) on some platforms, but not all, and since we switched to our own snprintf.c it doesn't work for us anywhere. Of the three string fields being handled this way here, I think that only "owner" is supposed to be nullable ... but considering that this is error-reporting code, it has very little business assuming anything, so put in defenses for all three. Per a crash observed on buildfarm member crake and then reproduced here. Because of the portability aspect, back-patch to all supported versions.
2019-02-08Defend against null error message reported by libxml2.Tom Lane
While this isn't really supposed to happen, it can occur in OOM situations and perhaps others. Instead of crashing, substitute "(no message provided)". I didn't worry about localizing this text, since we aren't localizing anything else here; besides, if we're on the edge of OOM, it's unlikely gettext() would work. Report and fix by Sergio Conde Gómez in bug #15624. Discussion: https://postgr.es/m/15624-4dea54091a2864e6@postgresql.org
2019-02-07Ensure that foreign scans with lateral refs are planned correctly.Tom Lane
As reported in bug #15613 from Srinivasan S A, file_fdw and postgres_fdw neglected to mark plain baserel foreign paths as parameterized when the relation has lateral_relids. Other FDWs have surely copied this mistake, so rather than just patching those two modules, install a band-aid fix in create_foreignscan_path to rectify the mistake centrally. Although the band-aid is enough to fix the visible symptom, correct the calls in file_fdw and postgres_fdw anyway, so that they are valid examples for external FDWs. Also, since the band-aid isn't enough to make this work for parameterized foreign joins, throw an elog(ERROR) if such a case is passed to create_foreignscan_path. This shouldn't pose much of a problem for existing external FDWs, since it's likely they aren't trying to make such paths anyway (though some of them may need a defense against joins with lateral_relids, similar to the one this patch installs into postgres_fdw). Add some assertions in relnode.c to catch future occurrences of the same error --- in particular, as backstop against core-code mistakes like the one fixed by commit bdd9a99aa. Discussion: https://postgr.es/m/15613-092be1be9576c728@postgresql.org
2019-02-06Propagate lateral-reference information to indirect descendant relations.Tom Lane
create_lateral_join_info() computes a bunch of information about lateral references between base relations, and then attempts to propagate those markings to appendrel children of the original base relations. But the original coding neglected the possibility of indirect descendants (grandchildren etc). During v11 development we noticed that this was wrong for partitioned-table cases, but failed to realize that it was just as wrong for any appendrel. While the case can't arise for appendrels derived from traditional table inheritance (because we make a flat appendrel for that), nested appendrels can arise from nested UNION ALL subqueries. Failure to mark the lower-level relations as having lateral references leads to confusion in add_paths_to_append_rel about whether unparameterized paths can be built. It's not very clear whether that leads to any user-visible misbehavior; the lack of field reports suggests that it may cause nothing worse than minor cost misestimation. Still, it's a bug, and it leads to failures of Asserts that I intend to add later. To fix, we need to propagate information from all appendrel parents, not just those that are RELOPT_BASERELs. We can still do it in one pass, if we rely on the append_rel_list to be ordered with ancestor relationships before descendant ones; add assertions checking that. While fixing this, we can make a small performance improvement by traversing the append_rel_list just once instead of separately for each appendrel parent relation. Noted while investigating bug #15613, though this patch does not fix that (which is why I'm not committing the related Asserts yet). Discussion: https://postgr.es/m/3951.1549403812@sss.pgh.pa.us
2019-02-06Unify searchpath and do file logic in MSVC build scripts.Andrew Dunstan
Commit f83419b739 failed to notice that mkvcbuild.pl and build.pl use different searchpath and do-file logic, breaking the latter, so it is adjusted to use the same logic as mkvcbuild.pl.
2019-02-05Fix included file path for modern perlAndrew Dunstan
Contrary to the comment on 772d4b76, only paths starting with "./" or "../" are considered relative to the current working directory by perl's "do" function. So this patch converts all the relevant cases to use "./" paths. This only affects MSVC. Backpatch to all live branches.
2019-02-05More fixed for modern perl on back branchesAndrew Dunstan
Use "do" instead of "require" for included files, as it doesn't look for them in the search path but relative to the current working directory. These changes have already been made to REL_10_STABLE and later, to satisfy the demands of perlcritic, but need backporting now to earlier branches.
2019-02-05Keep perl style checker happyAndrew Dunstan
It doesn't like code before "use strict;".
2019-02-05Update time zone data files to tzdata release 2018i.Tom Lane
DST law changes in Kazakhstan, Metlakatla, and São Tomé and Príncipe. Kazakhstan's Qyzylorda zone is split in two, creating a new zone Asia/Qostanay, as some areas did not change UTC offset. Historical corrections for Hong Kong and numerous Pacific islands.
2019-02-05Fix searchpath for modern Perl for genbki.plAndrew Dunstan
This was fixed for MSVC tools by commit 1df92eeafefac4, but per buildfarm member bowerbird genbki.pl needs the same treatment. Backpatch to all live branches.
2019-02-04Fix dumping of matviews with indirect dependencies on primary keys.Tom Lane
Commit 62215de29 turns out to have been not quite on-the-mark. When we are forced to postpone dumping of a materialized view into the dump's post-data section (because it depends on a unique index that isn't created till that section), we may also have to postpone dumping other matviews that depend on said matview. The previous fix didn't reliably work for such cases: it'd break the dependency loops properly, producing a workable object ordering, but it didn't necessarily mark all the matviews as "postponed_def". This led to harmless bleating about "archive items not in correct section order", as reported by Tom Cassidy in bug #15602. Less harmlessly, selective-restore options such as --section might misbehave due to the matview dump objects not being properly labeled. The right way to fix it is to consider that each pre-data dependency we break amounts to moving the no-longer-dependent object into post-data, and hence we should mark that object if it's a matview. Back-patch to all supported versions, since the issue's been there since matviews were introduced. Discussion: https://postgr.es/m/15602-e895445f73dc450b@postgresql.org
2019-02-03Add PG_CFLAGS, PG_CXXFLAGS, and PG_LDFLAGS variables to PGXSMichael Paquier
Add PG_CFLAGS, PG_CXXFLAGS, and PG_LDFLAGS variables to pgxs.mk which will be appended or prepended to the corresponding make variables. Notably, there was previously no way to pass custom CXXFLAGS to third party extension module builds, COPT and PROFILE supporting only CFLAGS and LDFLAGS. Backpatch all the way down to ease integration with existing extensions. Author: Christoph Berg Reviewed-by: Andres Freund, Tom Lane, Michael Paquier Discussion: https://postgr.es/m/20181113104005.GA32154@msg.credativ.de Backpatch-through: 9.4
2019-02-02Avoid possible deadlock while locking multiple heap pages.Amit Kapila
To avoid deadlock, backend acquires a lock on heap pages in block number order. In certain cases, lock on heap pages is dropped and reacquired. In this case, the locks are dropped for reading in corresponding VM page/s. The issue is we re-acquire locks in bufferId order whereas the intention was to acquire in blockid order. This commit ensures that we will always acquire locks on heap pages in blockid order. Reported-by: Nishant Fnu Author: Nishant Fnu Reviewed-by: Amit Kapila and Robert Haas Backpatch-through: 9.4 Discussion: https://postgr.es/m/5883C831-2ED1-47C8-BFAC-2D5BAE5A8CAE@amazon.com
2019-02-01Fix use of dangling pointer in heap_delete() when logging replica identityMichael Paquier
When logging the replica identity of a deleted tuple, XLOG_HEAP_DELETE records include references of the old tuple. Its data is stored in an intermediate variable used to register this information for the WAL record, but this variable gets away from the stack when the record gets actually inserted. Spotted by clang's AddressSanitizer. Author: Stas Kelvish Discussion: https://postgr.es/m/085C8825-AD86-4E93-AF80-E26CDF03D1EA@postgrespro.ru Backpatch-through: 9.4
2019-01-30Fix a crash in logical replicationPeter Eisentraut
The bug was that determining which columns are part of the replica identity index using RelationGetIndexAttrBitmap() would run eval_const_expressions() on index expressions and predicates across all indexes of the table, which in turn might require a snapshot, but there wasn't one set, so it crashes. There were actually two separate bugs, one on the publisher and one on the subscriber. To trigger the bug, a table that is part of a publication or subscription needs to have an index with a predicate or expression that lends itself to constant expressions simplification. The fix is to avoid the constant expressions simplification in RelationGetIndexAttrBitmap(), so that it becomes safe to call in these contexts. The constant expressions simplification comes from the calls to RelationGetIndexExpressions()/RelationGetIndexPredicate() via BuildIndexInfo(). But RelationGetIndexAttrBitmap() calling BuildIndexInfo() is overkill. The latter just takes pg_index catalog information, packs it into the IndexInfo structure, which former then just unpacks again and throws away. We can just do this directly with less overhead and skip the troublesome calls to eval_const_expressions(). This also removes the awkward cross-dependency between relcache.c and index.c. Bug: #15114 Reported-by: Петър Славов <pet.slavov@gmail.com> Reviewed-by: Noah Misch <noah@leadboat.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://www.postgresql.org/message-id/flat/152110589574.1223.17983600132321618383@wrigleys.postgresql.org/
2019-01-25Allow UNLISTEN in hot-standby mode.Tom Lane
Since LISTEN is (still) disallowed, UNLISTEN must be a no-op in a hot-standby session, and so there's no harm in allowing it. This change allows client code to not worry about whether it's connected to a primary or standby server when performing session-state-reset type activities. (Note that DISCARD ALL, which includes UNLISTEN, was already allowed, making it inconsistent to reject UNLISTEN.) Per discussion, back-patch to all supported versions. Shay Rojansky, reviewed by Mi Tar Discussion: https://postgr.es/m/CADT4RqCf2gA_TJtPAjnGzkC3ZiexfBZiLmA-mV66e4UyuVv8bA@mail.gmail.com
2019-01-24Remove infinite-loop hazards in ecpg test suite.Tom Lane
A report from Andrew Dunstan showed that an ecpglib breakage that causes repeated query failures could lead to infinite loops in some ecpg test scripts, because they contain "while(1)" loops with no exit condition other than successful test completion. That might be all right for manual testing, but it seems entirely unacceptable for automated test environments such as our buildfarm. We don't want buildfarm owners to have to intervene manually when a test goes wrong. To fix, just change all those while(1) loops to exit after at most 100 iterations (which is more than any of them expect to iterate). This seems sufficient since we'd see discrepancies in the test output if any loop executed the wrong number of times. I tested this by dint of intentionally breaking ecpg_do_prologue to always fail, and verifying that the tests still got to completion. Back-patch to all supported branches, since the whole point of this exercise is to protect the buildfarm against future mistakes. Discussion: https://postgr.es/m/18693.1548302004@sss.pgh.pa.us
2019-01-23Blind attempt to fix _configthreadlocale() failures on MinGW.Tom Lane
Apparently, some builds of MinGW contain a version of _configthreadlocale() that always returns -1, indicating failure. Rather than treating that as a curl-up-and-die condition, soldier on as though the function didn't exist. This leaves us without thread safety on such MinGW versions, but we didn't have it anyway. Discussion: https://postgr.es/m/d06a16bc-52d6-9f0d-2379-21242d7dbe81@2ndQuadrant.com
2019-01-23Fix misc typos in comments.Heikki Linnakangas
Spotted mostly by Fabien Coelho. Discussion: https://www.postgresql.org/message-id/alpine.DEB.2.21.1901230947050.16643@lancre
2019-01-21Avoid thread-safety problem in ecpglib.Tom Lane
ecpglib attempts to force the LC_NUMERIC locale to "C" while reading server output, to avoid problems with strtod() and related functions. Historically it's just issued setlocale() calls to do that, but that has major problems if we're in a threaded application. setlocale() itself is not required by POSIX to be thread-safe (and indeed is not, on recent OpenBSD). Moreover, its effects are process-wide, so that we could cause unexpected results in other threads, or another thread could change our setting. On platforms having uselocale(), which is required by POSIX:2008, we can avoid these problems by using uselocale() instead. Windows goes its own way as usual, but we can make it safe by using _configthreadlocale(). Platforms having neither continue to use the old code, but that should be pretty much nobody among current systems. (Subsequent buildfarm results show that recent NetBSD versions still lack uselocale(), but it's not a big problem because they also do not support non-"C" settings for LC_NUMERIC.) Back-patch of commits 8eb4a9312 and ee27584c4. Michael Meskes and Tom Lane; thanks also to Takayuki Tsunakawa. Discussion: https://postgr.es/m/31420.1547783697@sss.pgh.pa.us
2019-01-19Revert "Add valgrind suppressions for wcsrtombs optimizations"Tomas Vondra
This reverts commit 41344896364c4bf2229ec590c95cf23a6bec928e. Per discussion, it's not desirable to add valgrind suppressions for outside our own code base (e.g. glibc in this case), especially when the suppressions may be platform-specific. There are better ways to deal with that, e.g. by providing local suppressions. Discussion: https://www.postgresql.org/message-id/flat/90ac0452-e907-e7a4-b3c8-15bd33780e62%402ndquadrant.com
2019-01-18Use our own getopt() on OpenBSD.Tom Lane
Recent OpenBSD (at least 5.9 and up) has a version of getopt(3) that will not cope with the "-:" spec we use to accept double-dash options in postgres.c and postmaster.c. Admittedly, that's a hack because POSIX only requires getopt() to allow alphanumeric option characters. I have no desire to find another way, however, so let's just do what we were already doing on Solaris: force use of our own src/port/getopt.c implementation. In passing, improve some of the comments around said implementation. Per buildfarm and local testing. Back-patch to all supported branches. Discussion: https://postgr.es/m/30197.1547835700@sss.pgh.pa.us
2019-01-17Postpone aggregate checks until after collation is assigned.Andrew Gierth
Previously, parseCheckAggregates was run before assign_query_collations, but this causes problems if any expression has already had a collation assigned by some transform function (e.g. transformCaseExpr) before parseCheckAggregates runs. The differing collations would cause expressions not to be recognized as equal to the ones in the GROUP BY clause, leading to spurious errors about unaggregated column references. The result was that CASE expr WHEN val ... would fail when "expr" contained a GROUPING() expression or matched one of the group by expressions, and where collatable types were involved; whereas the supposedly identical CASE WHEN expr = val ... would succeed. Backpatch all the way; this appears to have been wrong ever since collations were introduced. Per report from Guillaume Lelarge, analysis and patch by me. Discussion: https://postgr.es/m/CAECtzeVSO_US8C2Khgfv54ZMUOBR4sWq+6_bLrETnWExHT=rFg@mail.gmail.com Discussion: https://postgr.es/m/87muo0k0c7.fsf@news-spur.riddles.org.uk
2019-01-13fix typoAndrew Dunstan
2019-01-13Make DLSUFFIX easily discoverable by build scriptsAndrew Dunstan
This will enable things like the buildfarm client to discover more reliably if certain libraries have been installed. Discussion: https://postgr.es/m/859e7c91-7ef4-d4b4-2ca2-8046e0cbee09@2ndQuadrant.com Backpatch to all live branches.
2019-01-03Improve ANALYZE's handling of concurrent-update scenarios.Tom Lane
This patch changes the rule for whether or not a tuple seen by ANALYZE should be included in its sample. When we last touched this logic, in commit 51e1445f1, we weren't thinking very hard about tuples being UPDATEd by a long-running concurrent transaction. In such a case, we might see the pre-image as either LIVE or DELETE_IN_PROGRESS depending on timing; and we might see the post-image not at all, or as INSERT_IN_PROGRESS. Since the existing code will not sample either DELETE_IN_PROGRESS or INSERT_IN_PROGRESS tuples, this leads to concurrently-updated rows being omitted from the sample entirely. That's not very helpful, and it's especially the wrong thing if the concurrent transaction ends up rolling back. The right thing seems to be to sample DELETE_IN_PROGRESS rows just as if they were live. This makes the "sample it" and "count it" decisions the same, which seems good for consistency. It's clearly the right thing if the concurrent transaction ends up rolling back; in effect, we are sampling as though IN_PROGRESS transactions haven't happened yet. Also, this combination of choices ensures maximum robustness against the different combinations of whether and in which state we might see the pre- and post-images of an update. It's slightly annoying that we end up recording immediately-out-of-date stats in the case where the transaction does commit, but on the other hand the stats are fine for columns that didn't change in the update. And the alternative of sampling INSERT_IN_PROGRESS rows instead seems like a bad idea, because then the sampling would be inconsistent with the way rows are counted for the stats report. Per report from Mark Chambers; thanks to Jeff Janes for diagnosing what was happening. Back-patch to all supported versions. Discussion: https://postgr.es/m/CAFh58O_Myr6G3tcH3gcGrF-=OExB08PJdWZcSBcEcovaiPsrHA@mail.gmail.com
2019-01-02Don't believe MinMaxExpr is leakproof without checking.Tom Lane
MinMaxExpr invokes the btree comparison function for its input datatype, so it's only leakproof if that function is. Many such functions are indeed leakproof, but others are not, and we should not just assume that they are. Hence, adjust contain_leaked_vars to verify the leakproofness of the referenced function explicitly. I didn't add a regression test because it would need to depend on some particular comparison function being leaky, and that's a moving target, per discussion. This has been wrong all along, so back-patch to supported branches. Discussion: https://postgr.es/m/31042.1546194242@sss.pgh.pa.us
2018-12-31pg_regress: Promptly detect failed postmaster startup.Noah Misch
Detect it the way pg_ctl's wait_for_postmaster() does. When pg_regress spawned a postmaster that failed startup, we were detecting that only with "pg_regress: postmaster did not respond within 60 seconds". Back-patch to 9.4 (all supported versions). Reviewed by Tom Lane. Discussion: https://postgr.es/m/20181231172922.GA199150@gust.leadboat.com
2018-12-27Have DISCARD ALL/TEMP remove leftover temp tablesAlvaro Herrera
Previously, it would only remove temp tables created in the same session; but if the session uses the BackendId of a previously crashed backend that left temp tables around, those would not get removed. Since autovacuum would not drop them either (because it sees that the BackendId is in use by the current session) these can cause annoying xid-wraparound warnings. Apply to branches 9.4 to 10. This is not a problem since version 11, because commit 943576bddcb5 added state tracking that makes autovacuum realize that those temp tables are not ours, so it removes them. This is useful to handle in DISCARD, because even though it does not handle all situations, it does handle the common one where a connection pooler keeps the same session open for an indefinitely long time. Discussion: https://postgr.es/m/20181226190834.wsk2wzott5yzrjiq@alvherre.pgsql Reviewed-by: Takayuki Tsunakawa, Michaël Paquier
2018-12-27Make autovacuum more selective about temp tables to keepAlvaro Herrera
When temp tables are in danger of XID wraparound, autovacuum drops them; however, it preserves those that are owned by a working session. This is desirable, except when the session is connected to a different database (because the temp tables cannot be from that session), so make it only keep the temp tables only if the backend is in the same database as the temp tables. This is not bulletproof: it fails to detect temp tables left by a session whose backend ID is reused in the same database but the new session does not use temp tables. Commit 943576bddcb5 fixes that case too, for branches 11 and up (which is why we don't apply this fix to those branches), but back-patching that one is not universally agreed on. Discussion: https://postgr.es/m/20181214162843.37g6h3txto43akrb@alvherre.pgsql Reviewed-by: Takayuki Tsunakawa, Michaël Paquier
2018-12-27Ignore inherited temp relations from other sessions when truncatingMichael Paquier
Inheritance trees can include temporary tables if the parent is permanent, which makes possible the presence of multiple temporary children from different sessions. Trying to issue a TRUNCATE on the parent in this scenario causes a failure, so similarly to any other queries just ignore such cases, which makes TRUNCATE work transparently. This makes truncation behave similarly to any other DML query working on the parent table with queries which need to be issues on children. A set of isolation tests is added to cover basic cases. Reported-by: Zhou Digoal Author: Amit Langote, Michael Paquier Discussion: https://postgr.es/m/15565-ce67a48d0244436a@postgresql.org Backpatch-through: 9.4
2018-12-26Fix portability failure introduced in commits d2b0b60e7 et al.Tom Lane
I made a frontend fprintf() format use %m, forgetting that that's only safe in HEAD not the back branches; prior to 96bf88d52 and d6c55de1f, it would work on glibc platforms but not elsewhere. Revert to using %s ... strerror(errno) as the code did before. We could have left HEAD as-is, but for code consistency across branches, I chose to apply this patch there too. Per Coverity and a few buildfarm members.
2018-12-22Fix ancient compiler warnings and typos in !HAVE_SYMLINK codePeter Eisentraut
This has never been correct since this code was introduced.
2018-12-18Fix ancient thinko in mergejoin cost estimation.Tom Lane
"rescanratio" was computed as 1 + rescanned-tuples / total-inner-tuples, which is sensible if it's to be multiplied by total-inner-tuples or a cost value corresponding to scanning all the inner tuples. But in reality it was (mostly) multiplied by inner_rows or a related cost, numbers that take into account the possibility of stopping short of scanning the whole inner relation thanks to a limited key range in the outer relation. This'd still make sense if we could expect that stopping short would result in a proportional decrease in the number of tuples that have to be rescanned. It does not, however. The argument that establishes the validity of our estimate for that number is independent of whether we scan all of the inner relation or stop short, and experimentation also shows that stopping short doesn't reduce the number of rescanned tuples. So the correct calculation is 1 + rescanned-tuples / inner_rows, and we should be sure to multiply that by inner_rows or a corresponding cost value. Most of the time this doesn't make much difference, but if we have both a high rescan rate (due to lots of duplicate values) and an outer key range much smaller than the inner key range, then the error can be significant, leading to a large underestimate of the cost associated with rescanning. Per report from Vijaykumar Jain. This thinko appears to go all the way back to the introduction of the rescan estimation logic in commit 70fba7043, so back-patch to all supported branches. Discussion: https://postgr.es/m/CAE7uO5hMb_TZYJcZmLAgO6iD68AkEK6qCe7i=vZUkCpoKns+EQ@mail.gmail.com
2018-12-17Fix use-after-free bug when renaming constraintsMichael Paquier
This is an oversight from recent commit b13fd344. While on it, tweak the previous test with a better name for the renamed primary key. Detected by buildfarm member prion which forces relation cache release with -DRELCACHE_FORCE_RELEASE. Back-patch down to 9.4 as the previous commit.
2018-12-17Make constraint rename issue relcache invalidation on target relationMichael Paquier
When a constraint gets renamed, it may have associated with it a target relation (for example domain constraints don't have one). Not invalidating the target relation cache when issuing the renaming can result in issues with subsequent commands that refer to the old constraint name using the relation cache, causing various failures. One pattern spotted was using CREATE TABLE LIKE after a constraint renaming. Reported-by: Stuart <sfbarbee@gmail.com> Author: Amit Langote Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/2047094.V130LYfLq4@station53.ousa.org
2018-12-13Fix wrong backpatching of ginRedoDeletePage() deadlock fixAlexander Korotkov
19cf52e6cc changes lock order in ginRedoDeletePage(). But did it in a wrong way due to oversight during backpatching. This commit fixes that. Reported-by: Bruce Momjian Discussion: https://postgr.es/m/20181213153232.GA10664%40momjian.us
2018-12-13Prevent GIN deleted pages from being reclaimed too earlyAlexander Korotkov
When GIN vacuum deletes a posting tree page, it assumes that no concurrent searchers can access it, thanks to ginStepRight() locking two pages at once. However, since 9.4 searches can skip parts of posting trees descending from the root. That leads to the risk that page is deleted and reclaimed before concurrent search can access it. This commit prevents the risk of above by waiting for every transaction, which might wait to reference this page, to finish. Due to binary compatibility we can't change GinPageOpaqueData to store corresponding transaction id. Instead we reuse page header pd_prune_xid field, which is unused in index pages. Discussion: https://postgr.es/m/31a702a.14dd.166c1366ac1.Coremail.chjischj%40163.com Author: Andrey Borodin, Alexander Korotkov Reviewed-by: Alexander Korotkov Backpatch-through: 9.4
2018-12-13Prevent deadlock in ginRedoDeletePage()Alexander Korotkov
On standby ginRedoDeletePage() can work concurrently with read-only queries. Those queries can traverse posting tree in two ways. 1) Using rightlinks by ginStepRight(), which locks the next page before unlocking its left sibling. 2) Using downlinks by ginFindLeafPage(), which locks at most one page at time. Original lock order was: page, parent, left sibling. That lock order can deadlock with ginStepRight(). In order to prevent deadlock this commit changes lock order to: left sibling, page, parent. Note, that position of parent in locking order seems insignificant, because we only lock one page at time while traversing downlinks. Reported-by: Chen Huajun Diagnosed-by: Chen Huajun, Peter Geoghegan, Andrey Borodin Discussion: https://postgr.es/m/31a702a.14dd.166c1366ac1.Coremail.chjischj%40163.com Author: Alexander Korotkov Backpatch-through: 9.4
2018-12-10Add stack depth checks to key recursive functions in backend/nodes/*.c.Tom Lane
Although copyfuncs.c has a check_stack_depth call in its recursion, equalfuncs.c, outfuncs.c, and readfuncs.c lacked one. This seems unwise. Likewise fix planstate_tree_walker(), in branches where that exists. Discussion: https://postgr.es/m/30253.1544286631@sss.pgh.pa.us
2018-12-06Improve our response to invalid format strings, and detect more cases.Tom Lane
Places that are testing for *printf failure ought to include the format string in their error reports, since bad-format-string is one of the more likely causes of such failure. This both makes it easier to find and repair the mistake, and provides at least some useful info to the user who stumbles across such a problem. Also, tighten snprintf.c to report EINVAL for an invalid flag or final character in a format %-spec (including the case where the %-spec is missing a final character altogether). This seems like better project policy, and it also allows removing an instruction or two from the hot code path. Back-patch the error reporting change in pvsnprintf, since it should be harmless and may be helpful; but not the snprintf.c change. Per discussion of bug #15511 from Ertuğrul Kahveci, which reported an invalid translated format string. These changes don't fix that error, but they should improve matters next time we make such a mistake. Discussion: https://postgr.es/m/15511-1d8b6a0bc874112f@postgresql.org
2018-11-29Ensure static libraries have correct mod time even if ranlib messes it up.Tom Lane
In at least Apple's version of ranlib, the output file is updated to have a mod time equal to the max of the timestamps of its components, and that data only has seconds precision. On a filesystem with sub-second file timestamp precision --- say, APFS --- this can result in the finished static library appearing older than its input files, which causes useless rebuilds and possible outright failures in parallel makes. We've only seen this reported in the field from people using Apple's ranlib with a non-Apple make, because Apple's make doesn't know about sub-second timestamps either so it doesn't decide rebuilds are needed. But Apple's ranlib presumably shares code with at least some BSDen, so it's not that unlikely that the same problem could arise elsewhere. To fix, just "touch" the output file after ranlib finishes. We seem to need this in only one place. There are other calls of ranlib in our makefiles, but they are working on intermediate files whose timestamps are not actually important, or else on an installed static library for which sub-second timestamp precision is unlikely to matter either. (Also, so far as I can tell, Apple's ranlib doesn't mess up the file timestamp in the latter usage anyhow.) In passing, change "ranlib" to "$(RANLIB)" in one place that was bypassing the make macro for no good reason. Per bug #15525 from Jack Kelly (via Alyssa Ross). Back-patch to all supported branches. Discussion: https://postgr.es/m/15525-a30da084f17a1faa@postgresql.org
2018-11-29Fix handling of synchronous replication for stopping WAL sendersMichael Paquier
This fixes an oversight from c6c3334 which has introduced a more strict ordering in the way WAL senders are stopped to prevent current WAL activity when a shutdown checkpoint is created. After all backends are stopped, all WAL senders are requested to stop which makes them stop any activity, and switching their state as stopping. Once the checkpointer knows that all WAL senders are in a stopping state, the shutdown checkpoint can begin, with all WAL senders activated, waiting for their clients to flush the shutdown checkpoint record. If a subset of WAL senders are stopping and in a sync state, other WAL senders could still be waiting for a WAL position to be synced while committing a transaction, however the subset of stopping senders would not release waiters, potentially breaking synchronous replication guarantees. This commit makes sure that even WAL senders stopping are able to release waiters properly. On 9.4, this can also trigger an assertion failure when setting for example max_wal_senders to 1 where a WAL sender is not able to find itself as in synchronous state when the instance stops. Reported-by: Paul Guo Author: Paul Guo, Michael Paquier Discussion: https://postgr.es/m/CAEET0ZEv8VFqT3C-cQm6byOB4r4VYWcef1J21dOX-gcVhCSpmA@mail.gmail.com Backpatch-through: 9.4
2018-11-28Do not decode TOAST data for table rewritesTomas Vondra
During table rewrites (VACUUM FULL and CLUSTER), the main heap is logged using XLOG / FPI records, and thus (correctly) ignored in decoding. But the associated TOAST table is WAL-logged as plain INSERT records, and so was logically decoded and passed to reorder buffer. That has severe consequences with TOAST tables of non-trivial size. Firstly, reorder buffer has to keep all those changes, possibly spilling them to a file, incurring I/O costs and disk space. Secondly, ReoderBufferCommit() was stashing all those TOAST chunks into a hash table, which got discarded only after processing the row from the main heap. But as the main heap is not decoded for rewrites, this never happened, so all the TOAST data accumulated in memory, resulting either in excessive memory consumption or OOM. The fix is simple, as commit e9edc1ba already introduced infrastructure (namely HEAP_INSERT_NO_LOGICAL flag) to skip logical decoding of TOAST tables, but it only applied it to system tables. So simply use it for all TOAST data in raw_heap_insert(). That would however solve only the memory consumption issue - the TOAST changes would still be decoded and added to the reorder buffer, and spilled to disk (although without TOAST tuple data, so much smaller). But we can solve that by tweaking DecodeInsert() to just ignore such INSERT records altogether, using XLH_INSERT_CONTAINS_NEW_TUPLE flag, instead of skipping them later in ReorderBufferCommit(). Review: Masahiko Sawada Discussion: https://www.postgresql.org/message-id/flat/1a17c643-e9af-3dba-486b-fbe31bc1823a%402ndquadrant.com Backpatch: 9.4-, where logical decoding was introduced