summaryrefslogtreecommitdiff
path: root/src
AgeCommit message (Collapse)Author
2018-03-22Fix tuple counting in SP-GiST index build.Tom Lane
Count the number of tuples in the index honestly, instead of assuming that it's the same as the number of tuples in the heap. (It might be different if the index is partial.) Back-patch to all supported versions. Tomas Vondra Discussion: https://postgr.es/m/3b3d8eac-c709-0d25-088e-b98339a1b28a@2ndquadrant.com
2018-03-21Fix mishandling of quoted-list GUC values in pg_dump and ruleutils.c.Tom Lane
Code that prints out the contents of setconfig or proconfig arrays in SQL format needs to handle GUC_LIST_QUOTE variables differently from other ones, because for those variables, flatten_set_variable_args() already applied a layer of quoting. The value can therefore safely be printed as-is, and indeed must be, or flatten_set_variable_args() will muck it up completely on reload. For all other GUC variables, it's necessary and sufficient to quote the value as a SQL literal. We'd recognized the need for this long ago, but mis-analyzed the need slightly, thinking that all GUC_LIST_INPUT variables needed the special treatment. That's actually wrong, since a valid value of a LIST variable might include characters that need quoting, although no existing variables accept such values. More to the point, we hadn't made any particular effort to keep the various places that deal with this up-to-date with the set of variables that actually need special treatment, meaning that we'd do the wrong thing with, for example, temp_tablespaces values. This affects dumping of SET clauses attached to functions, as well as ALTER DATABASE/ROLE SET commands. In ruleutils.c we can fix it reasonably honestly by exporting a guc.c function that allows discovering the flags for a given GUC variable. But pg_dump doesn't have easy access to that, so continue the old method of having a hard-wired list of affected variable names. At least we can fix it to have just one list not two, and update the list to match current reality. A remaining problem with this is that it only works for built-in GUC variables. pg_dump's list obvious knows nothing of third-party extensions, and even the "ask guc.c" method isn't bulletproof since the relevant extension might not be loaded. There's no obvious solution to that, so for now, we'll just have to discourage extension authors from inventing custom GUCs that need GUC_LIST_QUOTE. This has been busted for a long time, so back-patch to all supported branches. Michael Paquier and Tom Lane, reviewed by Kyotaro Horiguchi and Pavel Stehule Discussion: https://postgr.es/m/20180111064900.GA51030@paquier.xyz
2018-03-19Prevent query-lifespan memory leakage of SP-GiST traversal values.Tom Lane
The original coding of the SP-GiST scan traversalValue feature (commit ccd6eb49a) arranged for traversal values to be stored in the query's main executor context. That's fine if there's only one index scan per query, but if there are many, we have a memory leak as successive scans create new traversal values. Fix it by creating a separate memory context for traversal values, which we can reset during spgrescan(). Back-patch to 9.6 where this code was introduced. In principle, adding the traversalCxt field to SpGistScanOpaqueData creates an ABI break in the back branches. But I (tgl) have little sympathy for extensions including spgist_private.h, so I'm not very worried about that. Alternatively we could stick the new field at the end of the struct in back branches, but that has its own downsides. Anton Dignös, reviewed by Alexander Kuzmenkov Discussion: https://postgr.es/m/CALNdv1jb6y2Te-m8xHLxLX12RsBmZJ1f4hESX7J0HjgyOhA9eA@mail.gmail.com
2018-03-19Fix some corner-case issues in REFRESH MATERIALIZED VIEW CONCURRENTLY.Tom Lane
refresh_by_match_merge() has some issues in the way it builds a SQL query to construct the "diff" table: 1. It doesn't require the selected unique index(es) to be indimmediate. 2. It doesn't pay attention to the particular equality semantics enforced by a given index, but just assumes that they must be those of the column datatype's default btree opclass. 3. It doesn't check that the indexes are btrees. 4. It's insufficiently careful to ensure that the parser will pick the intended operator when parsing the query. (This would have been a security bug before CVE-2018-1058.) 5. It's not careful about indexes on system columns. The way to fix #4 is to make use of the existing code in ri_triggers.c for generating an arbitrary binary operator clause. I chose to move that to ruleutils.c, since that seems a more reasonable place to be exporting such functionality from than ri_triggers.c. While #1, #3, and #5 are just latent given existing feature restrictions, and #2 doesn't arise in the core system for lack of alternate opclasses with different equality behaviors, #4 seems like an issue worth back-patching. That's the bulk of the change anyway, so just back-patch the whole thing to 9.4 where this code was introduced. Discussion: https://postgr.es/m/13836.1521413227@sss.pgh.pa.us
2018-03-19Fix performance hazard in REFRESH MATERIALIZED VIEW CONCURRENTLY.Tom Lane
Jeff Janes discovered that commit 7ca25b7de made one of the queries run by REFRESH MATERIALIZED VIEW CONCURRENTLY perform badly. The root cause is bad cardinality estimation for correlated quals, but a principled solution to that problem is some way off, especially since the planner lacks any statistics about whole-row variables. Moreover, in non-error cases this query produces no rows, meaning it must be run to completion; but use of LIMIT 1 encourages the planner to pick a fast-start, slow-completion plan, exactly not what we want. Remove the LIMIT clause, and instead rely on the count parameter we pass to SPI_execute() to prevent excess work if the query does return some rows. While we've heard no field reports of planner misbehavior with this query, it could be that people are having performance issues that haven't reached the level of pain needed to cause a bug report. In any case, that LIMIT clause can't possibly do anything helpful with any existing version of the planner, and it demonstrably can cause bad choices in some cases, so back-patch to 9.4 where the code was introduced. Thomas Munro Discussion: https://postgr.es/m/CAMkU=1z-JoGymHneGHar1cru4F1XDfHqJDzxP_CtK5cL3DOfmg@mail.gmail.com
2018-03-18Fix pg_recvlogical for pre-10 versionsMagnus Hagander
In e170b8c8, protection against modified search_path was added. However, PostgreSQL versions prior to 10 does not accept SQL commands over a replication connection, so the protection would generate a syntax error. Since we cannot run SQL commands on it, we are also not vulnerable to the issue that e170b8c8 fixes, so we can just skip this command for older versions. Author: Michael Paquier <michael@paquier.xyz>
2018-03-17Fix overflow handling in plpgsql's integer FOR loops.Tom Lane
The test to exit the loop if the integer control value would overflow an int32 turns out not to work on some ICC versions, as it's dependent on the assumption that the compiler will execute the code as written rather than "optimize" it. ICC lacks any equivalent of gcc's -fwrapv switch, so it was optimizing on the assumption of no integer overflow, and that breaks this. Rewrite into a form that in fact does not do any overflowing computations. Per Tomas Vondra and buildfarm member fulmar. It's been like this for a long time, although it was not till we added a regression test case covering the behavior (in commit dd2243f2a) that the problem became apparent. Back-patch to all supported versions. Discussion: https://postgr.es/m/50562fdc-0876-9843-c883-15b8566c7511@2ndquadrant.com
2018-03-17Fix WHERE CURRENT OF when the referenced cursor uses an index-only scan.Tom Lane
"UPDATE/DELETE WHERE CURRENT OF cursor_name" failed, with an error message like "cannot extract system attribute from virtual tuple", if the cursor was using a index-only scan for the target table. Fix it by digging the current TID out of the indexscan state. It seems likely that the same failure could occur for CustomScan plans and perhaps some FDW plan types, so that leaving this to be treated as an internal error with an obscure message isn't as good an idea as it first seemed. Hence, add a bit of heaptuple.c infrastructure to let us deliver a more on-topic message. I chose to make the message match what you get for the case where execCurrentOf can't identify the target scan node at all, "cursor "foo" is not a simply updatable scan of table "bar"". Perhaps it should be different, but we can always adjust that later. In the future, it might be nice to provide hooks that would let custom scan providers and/or FDWs deal with this in other ways; but that's not a suitable topic for a back-patchable bug fix. It's been like this all along, so back-patch to all supported branches. Yugo Nagata and Tom Lane Discussion: https://postgr.es/m/20180201013349.937dfc5f.nagata@sraoss.co.jp
2018-03-16Fix query-lifespan memory leakage in repeatedly executed hash joins.Tom Lane
ExecHashTableCreate allocated some memory that wasn't freed by ExecHashTableDestroy, specifically the per-hash-key function information. That's not a huge amount of data, but if one runs a query that repeats a hash join enough times, it builds up. Fix by arranging for the data in question to be kept in the hashtable's hashCxt instead of leaving it "loose" in the query-lifespan executor context. (This ensures that we'll also clean up anything that the hash functions allocate in fn_mcxt.) Per report from Amit Khandekar. It's been like this forever, so back-patch to all supported branches. Discussion: https://postgr.es/m/CAJ3gD9cFofAWGvcxLOxDHC=B0hjtW8yGmUsF2hdGh97CM38=7g@mail.gmail.com
2018-03-15test_ddl_deparse: rename matviewAlvaro Herrera
Should have done this in e69f5e0efca ... Per note from Tom Lane.
2018-03-15test_ddl_deparse: Don't use pg_class as source for a matviewAlvaro Herrera
Doing so causes a pg_upgrade of a database containing these objects to fail whenever pg_class changes. And it's pointless anyway: we have more interesting tables anyhow. Discussion: https://postgr.es/m/CAD5tBc+s8pW9WvH2+_z=B4x95FD4QuzZKcaMpff_9H4rS0VU1A@mail.gmail.com
2018-03-14Fix double frees in ecpg.Michael Meskes
Patch by Patrick Krecker <patrick@judicata.com>
2018-03-13When updating reltuples after ANALYZE, just extrapolate from our sample.Tom Lane
The existing logic for updating pg_class.reltuples trusted the sampling results only for the pages ANALYZE actually visited, preferring to believe the previous tuple density estimate for all the unvisited pages. While there's some rationale for doing that for VACUUM (first that VACUUM is likely to visit a very nonrandom subset of pages, and second that we know for sure that the unvisited pages did not change), there's no such rationale for ANALYZE: by assumption, it's looked at an unbiased random sample of the table's pages. Furthermore, in a very large table ANALYZE will have examined only a tiny fraction of the table's pages, meaning it cannot slew the overall density estimate very far at all. In a table that is physically growing, this causes reltuples to increase nearly proportionally to the change in relpages, regardless of what is actually happening in the table. This has been observed to cause reltuples to become so much larger than reality that it effectively shuts off autovacuum, whose threshold for doing anything is a fraction of reltuples. (Getting to the point where that would happen seems to require some additional, not well understood, conditions. But it's undeniable that if reltuples is seriously off in a large table, ANALYZE alone will not fix it in any reasonable number of iterations, especially not if the table is continuing to grow.) Hence, restrict the use of vac_estimate_reltuples() to VACUUM alone, and in ANALYZE, just extrapolate from the sample pages on the assumption that they provide an accurate model of the whole table. If, by very bad luck, they don't, at least another ANALYZE will fix it; in the old logic a single bad estimate could cause problems indefinitely. In HEAD, let's remove vac_estimate_reltuples' is_analyze argument altogether; it was never used for anything and now it's totally pointless. But keep it in the back branches, in case any third-party code is calling this function. Per bug #15005. Back-patch to all supported branches. David Gould, reviewed by Alexander Kuzmenkov, cosmetic changes by me Discussion: https://postgr.es/m/20180117164916.3fdcf2e9@engels
2018-03-13Avoid holding AutovacuumScheduleLock while rechecking table statistics.Tom Lane
In databases with many tables, re-fetching the statistics takes some time, so that this behavior seriously decreases the available concurrency for multiple autovac workers. There's discussion afoot about more complete fixes, but a simple and back-patchable amelioration is to claim the table and release the lock before rechecking stats. If we find out there's no longer a reason to process the table, re-taking the lock to un-claim the table is cheap enough. (This patch is quite old, but got lost amongst a discussion of more aggressive fixes. It's not clear when or if such a fix will be accepted, but in any case it'd be unlikely to get back-patched. Let's do this now so we have some improvement for the back branches.) In passing, make the normal un-claim step take AutovacuumScheduleLock not AutovacuumLock, since that is what is documented to protect the wi_tableoid field. This wasn't an actual bug in view of the fact that readers of that field hold both locks, but it creates some concurrency penalty against operations that need only AutovacuumLock. Back-patch to all supported versions. Jeff Janes Discussion: https://postgr.es/m/26118.1520865816@sss.pgh.pa.us
2018-03-12Set connection back to NULL after freeing it.Michael Meskes
Patch by Jeevan Ladhe <jeevan.ladhe@enterprisedb.com>
2018-03-11Fix improper uses of canonicalize_qual().Tom Lane
One of the things canonicalize_qual() does is to remove constant-NULL subexpressions of top-level AND/OR clauses. It does that on the assumption that what it's given is a top-level WHERE clause, so that NULL can be treated like FALSE. Although this is documented down inside a subroutine of canonicalize_qual(), it wasn't mentioned in the documentation of that function itself, and some callers hadn't gotten that memo. Notably, commit d007a9505 caused get_relation_constraints() to apply canonicalize_qual() to CHECK constraints. That allowed constraint exclusion to misoptimize situations in which a CHECK constraint had a provably-NULL subclause, as seen in the regression test case added here, in which a child table that should be scanned is not. (Although this thinko is ancient, the test case doesn't fail before 9.2, for reasons I've not bothered to track down in detail. There may be related cases that do fail before that.) More recently, commit f0e44751d added an independent bug by applying canonicalize_qual() to index expressions, which is even sillier since those might not even be boolean. If they are, though, I think this could lead to making incorrect index entries for affected index expressions in v10. I haven't attempted to prove that though. To fix, add an "is_check" parameter to canonicalize_qual() to specify whether it should assume WHERE or CHECK semantics, and make it perform NULL-elimination accordingly. Adjust the callers to apply the right semantics, or remove the call entirely in cases where it's not known that the expression has one or the other semantics. I also removed the call in some cases involving partition expressions, where it should be a no-op because such expressions should be canonical already ... and was a no-op, independently of whether it could in principle have done something, because it was being handed the qual in implicit-AND format which isn't what it expects. In HEAD, add an Assert to catch that type of mistake in future. This represents an API break for external callers of canonicalize_qual(). While that's intentional in HEAD to make such callers think about which case applies to them, it seems like something we probably wouldn't be thanked for in released branches. Hence, in released branches, the extra parameter is added to a new function canonicalize_qual_ext(), and canonicalize_qual() is a wrapper that retains its old behavior. Patch by me with suggestions from Dean Rasheed. Back-patch to all supported branches. Discussion: https://postgr.es/m/24475.1520635069@sss.pgh.pa.us
2018-03-06Refrain from duplicating data in reorderbuffersAlvaro Herrera
If a walsender exits leaving data in reorderbuffers, the next walsender that tries to decode the same transaction would append its decoded data in the same spill files without truncating it first, which effectively duplicate the data. Avoid that by removing any leftover reorderbuffer spill files when a walsender starts. Backpatch to 9.4; this bug has been there from the very beginning of logical decoding. Author: Craig Ringer, revised by me Reviewed by: Álvaro Herrera, Petr Jelínek, Masahiko Sawada
2018-03-06Fix pg_rewind to handle relation data files in tablespaces properly.Fujii Masao
pg_rewind checks whether each file is a relation data file, from its path. Previously this check logic had the bug which made pg_rewind fail to recognize any relation data files in tablespaces. Which also caused an assertion failure in pg_rewind. Back-patch to 9.5 where pg_rewind was added. Author: Takayuki Tsunakawa Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/0A3221C70F24FB45833433255569204D1F8D6C7A@G01JPEXMBYT05
2018-03-03Fix assorted issues in convert_to_scalar().Tom Lane
If convert_to_scalar is passed a pair of datatypes it can't cope with, its former behavior was just to elog(ERROR). While this is OK so far as the core code is concerned, there's extension code that would like to use scalarltsel/scalargtsel/etc as selectivity estimators for operators that work on non-core datatypes, and this behavior is a show-stopper for that use-case. If we simply allow convert_to_scalar to return FALSE instead of outright failing, then the main logic of scalarltsel/scalargtsel will work fine for any operator that behaves like a scalar inequality comparison. The lack of conversion capability will mean that we can't estimate to better than histogram-bin-width precision, since the code will effectively assume that the comparison constant falls at the middle of its bin. But that's still a lot better than nothing. (Someday we should provide a way for extension code to supply a custom version of convert_to_scalar, but today is not that day.) While poking at this issue, we noted that the existing code for handling type bytea in convert_to_scalar is several bricks shy of a load. It assumes without checking that if the comparison value is type bytea, the bounds values are too; in the worst case this could lead to a crash. It also fails to detoast the input values, so that the comparison result is complete garbage if any input is toasted out-of-line, compressed, or even just short-header. I'm not sure how often such cases actually occur --- the bounds values, at least, are probably safe since they are elements of an array and hence can't be toasted. But that doesn't make this code OK. Back-patch to all supported branches, partly because author requested that, but mostly because of the bytea bugs. The change in API for the exposed routine convert_network_to_scalar() is theoretically a back-patch hazard, but it seems pretty unlikely that any third-party code is calling that function directly. Tomas Vondra, with some adjustments by me Discussion: https://postgr.es/m/b68441b6-d18f-13ab-b43b-9a72188a4e02@2ndquadrant.com
2018-03-02Fix VM buffer pin management in heap_lock_updated_tuple_rec().Tom Lane
Sloppy coding in this function could lead to leaking a VM buffer pin, or to attempting to free the same pin twice. Repair. While at it, reduce the code's tendency to free and reacquire the same page pin. Back-patch to 9.6; before that, this routine did not concern itself with VM pages. Amit Kapila and Tom Lane Discussion: https://postgr.es/m/CAA4eK1KJKwhc=isgTQHjM76CAdVswzNeAuZkh_cx-6QgGkSEgA@mail.gmail.com
2018-03-02Make gistvacuumcleanup() count the actual number of index tuples.Tom Lane
Previously, it just returned the heap tuple count, which might be only an estimate, and would be completely the wrong thing if the index is partial. Since this function scans every index page anyway to find free pages, it's practically free to count the surviving index tuples. Let's do that and return an accurate count. This is easily visible as a wrong reltuples value for a partial GiST index following VACUUM, so back-patch to all supported branches. Andrey Borodin, reviewed by Michail Nikolaev Discussion: https://postgr.es/m/151956654251.6915.675951950408204404.pgcf@coridan.postgresql.org
2018-03-01Use ereport not elog for some corrupt-HOT-chain reports.Tom Lane
These errors have been seen in the field in corrupted-data situations. It seems worthwhile to report them with ERRCODE_DATA_CORRUPTED, rather than the generic ERRCODE_INTERNAL_ERROR, for the benefit of log monitoring and tools like amcheck. However, use errmsg_internal so that the text strings still aren't translated; it seems unlikely to be worth translators' time to do so. Back-patch to 9.3, like the predecessor commit d70cf811f that introduced these elog calls originally (replacing Asserts). Peter Geoghegan Discussion: https://postgr.es/m/CAH2-Wzmn4-Pg-UGFwyuyK-wiTih9j32pwg_7T9iwqXpAUZr=Mg@mail.gmail.com
2018-03-01Relax overly strict sanity check for upgraded ancient databasesAlvaro Herrera
Commit 4800f16a7ad0 added some sanity checks to ensure we don't accidentally corrupt data, but in one of them we failed to consider the effects of a database upgraded from 9.2 or earlier, where a tuple exclusively locked prior to the upgrade has a slightly different bit pattern. Fix that by using the macro that we fixed in commit 74ebba84aeb6 for similar situations. Reported-by: Alexandre Garcia Reviewed-by: Andres Freund Discussion: https://postgr.es/m/CAPYLKR6yxV4=pfW0Gwij7aPNiiPx+3ib4USVYnbuQdUtmkMaEA@mail.gmail.com Andres suspects that this bug may have wider ranging consequences, but I couldn't find anything.
2018-03-01Fix IOS planning when only some index columns can return an attribute.Tom Lane
Since 9.5, it's possible that some but not all columns of an index support returning the indexed value for index-only scans. If the same indexed column appears in index columns that behave both ways, check_index_only() supposed that it'd be OK to do an index-only scan testing that column; but that fails if we have to recheck the indexed condition on one of the columns that doesn't support this. In principle we could make this work by remapping the recheck expressions to pull the value from a column that does support returning the indexed value. But such cases are so weird and rare that, at least for now, it doesn't seem worth the trouble. Instead, just teach check_index_only that a value is returnable only if all the index columns containing it are returnable, rather than any of them. Per report from David Pereiro Lagares. Back-patch to 9.5 where the possibility of this situation appeared. Kyotaro Horiguchi Discussion: https://postgr.es/m/1516210494.1798.16.camel@nlpgo.com
2018-02-28Rename base64 routines to avoid conflict with Solaris built-in functions.Tom Lane
Solaris 11.4 has built-in functions named b64_encode and b64_decode. Rename ours to something else to avoid the conflict (fortunately, ours are static so the impact is limited). One could wish for less duplication of code in this area, but that would be a larger patch and not very suitable for back-patching. Since this is a portability fix, we want to put it into all supported branches. Report and initial patch by Rainer Orth, reviewed and adjusted a bit by Michael Paquier Discussion: https://postgr.es/m/ydd372wk28h.fsf@CeBiTec.Uni-Bielefeld.DE
2018-02-28Remove restriction on SQL block length in isolationtester scanner.Tom Lane
specscanner.l had a fixed limit of 1024 bytes on the length of individual SQL stanzas in an isolation test script. People are starting to run into that, so fix it by making the buffer resizable. Once we allow this in HEAD, it seems inevitable that somebody will try to back-patch a test that exceeds the old limit, so back-patch this change as a preventive measure. Daniel Gustafsson Discussion: https://postgr.es/m/8D628BE4-6606-4FF6-A3FF-8B2B0E9B43D0@yesql.se
2018-02-27Fix up ecpg's configuration so it handles "long long int" in MSVC builds.Tom Lane
Although configure-based builds correctly define HAVE_LONG_LONG_INT when appropriate (in both pg_config.h and ecpg_config.h), builds using the MSVC scripts failed to do so. This currently has no impact on the backend, since it uses that symbol nowhere; but it does prevent ecpg from supporting "long long int". Fix that. Also, adjust Solution.pm so that in the constructed ecpg_config.h file, the "#if (_MSC_VER > 1200)" covers only the LONG_LONG_INT-related #defines, not the whole file. AFAICS this was a thinko on somebody's part: ENABLE_THREAD_SAFETY should always be defined in Windows builds, and in branches using USE_INTEGER_DATETIMES, the setting of that shouldn't depend on the compiler version either. If I'm wrong, I imagine the buildfarm will say so. Per bug #15080 from Jonathan Allen; issue diagnosed by Michael Meskes and Andrew Gierth. Back-patch to all supported branches. Discussion: https://postgr.es/m/151935568942.1461.14623890240535309745@wrigleys.postgresql.org
2018-02-27Remove regression tests' CREATE FUNCTION commands for unused C functions.Tom Lane
I removed these functions altogether in HEAD, in commit db3af9feb, and it emerges that that causes trouble for cross-branch upgrade testing. We could put back stub functions but that seems pretty silly. Instead, back-patch a minimal subset of db3af9feb, namely just removing the CREATE FUNCTION commands. Discussion: https://postgr.es/m/11927.1519756619@sss.pgh.pa.us
2018-02-27Prevent dangling-pointer access when update trigger returns old tuple.Tom Lane
A before-update row trigger may choose to return the "new" or "old" tuple unmodified. ExecBRUpdateTriggers failed to consider the second possibility, and would proceed to free the "old" tuple even if it was the one returned, leading to subsequent access to already-deallocated memory. In debug builds this reliably leads to an "invalid memory alloc request size" failure; in production builds it might accidentally work, but data corruption is also possible. This is a very old bug. There are probably a couple of reasons it hasn't been noticed up to now. It would be more usual to return NULL if one wanted to suppress the update action; returning "old" is significantly less efficient since the update will occur anyway. Also, none of the standard PLs would ever cause this because they all returned freshly-manufactured tuples even if they were just copying "old". But commit 4b93f5799 changed that for plpgsql, making it possible to see the bug with a plpgsql trigger. Still, this is certainly legal behavior for a trigger function, so it's ExecBRUpdateTriggers's fault not plpgsql's. It seems worth creating a test case that exercises returning "old" directly with a C-language trigger; testing this through plpgsql seems unreliable because its behavior might change again. Report and fix by Rushabh Lathia; regression test case by me. Back-patch to all supported branches. Discussion: https://postgr.es/m/CAGPqQf1P4pjiNPrMof=P_16E-DFjt457j+nH2ex3=nBTew7tXw@mail.gmail.com
2018-02-26Stamp 9.6.8.REL9_6_8Tom Lane
2018-02-26Schema-qualify references in test_ddl_deparse test script.Tom Lane
This omission seems to be what is causing buildfarm failures on crake. Security: CVE-2018-1058
2018-02-26Document security implications of search_path and the public schema.Noah Misch
The ability to create like-named objects in different schemas opens up the potential for users to change the behavior of other users' queries, maliciously or accidentally. When you connect to a PostgreSQL server, you should remove from your search_path any schema for which a user other than yourself or superusers holds the CREATE privilege. If you do not, other users holding CREATE privilege can redefine the behavior of your commands, causing them to perform arbitrary SQL statements under your identity. "SET search_path = ..." and "SELECT pg_catalog.set_config(...)" are not vulnerable to such hijacking, so one can use either as the first command of a session. As special exceptions, the following client applications behave as documented regardless of search_path settings and schema privileges: clusterdb createdb createlang createuser dropdb droplang dropuser ecpg (not programs it generates) initdb oid2name pg_archivecleanup pg_basebackup pg_config pg_controldata pg_ctl pg_dump pg_dumpall pg_isready pg_receivewal pg_recvlogical pg_resetwal pg_restore pg_rewind pg_standby pg_test_fsync pg_test_timing pg_upgrade pg_waldump reindexdb vacuumdb vacuumlo. Not included are core client programs that run user-specified SQL commands, namely psql and pgbench. PostgreSQL encourages non-core client applications to do likewise. Document this in the context of libpq connections, psql connections, dblink connections, ECPG connections, extension packaging, and schema usage patterns. The principal defense for applications is "SELECT pg_catalog.set_config('search_path', '', false)", and the principal defense for databases is "REVOKE CREATE ON SCHEMA public FROM PUBLIC". Either one is sufficient to prevent attack. After a REVOKE, consider auditing the public schema for objects named like pg_catalog objects. Authors of SECURITY DEFINER functions use some of the same defenses, and the CREATE FUNCTION reference page already covered them thoroughly. This is a good opportunity to audit SECURITY DEFINER functions for robust security practice. Back-patch to 9.3 (all supported versions). Reviewed by Michael Paquier and Jonathan S. Katz. Reported by Arseniy Sharoglazov. Security: CVE-2018-1058
2018-02-26Empty search_path in Autovacuum and non-psql/pgbench clients.Noah Misch
This makes the client programs behave as documented regardless of the connect-time search_path and regardless of user-created objects. Today, a malicious user with CREATE permission on a search_path schema can take control of certain of these clients' queries and invoke arbitrary SQL functions under the client identity, often a superuser. This is exploitable in the default configuration, where all users have CREATE privilege on schema "public". This changes behavior of user-defined code stored in the database, like pg_index.indexprs and pg_extension_config_dump(). If they reach code bearing unqualified names, "does not exist" or "no schema has been selected to create in" errors might appear. Users may fix such errors by schema-qualifying affected names. After upgrading, consider watching server logs for these errors. The --table arguments of src/bin/scripts clients have been lax; for example, "vacuumdb -Zt pg_am\;CHECKPOINT" performed a checkpoint. That now fails, but for now, "vacuumdb -Zt 'pg_am(amname);CHECKPOINT'" still performs a checkpoint. Back-patch to 9.3 (all supported versions). Reviewed by Tom Lane, though this fix strategy was not his first choice. Reported by Arseniy Sharoglazov. Security: CVE-2018-1058
2018-02-26Avoid using unsafe search_path settings during dump and restore.Tom Lane
Historically, pg_dump has "set search_path = foo, pg_catalog" when dumping an object in schema "foo", and has also caused that setting to be used while restoring the object. This is problematic because functions and operators in schema "foo" could capture references meant to refer to pg_catalog entries, both in the queries issued by pg_dump and those issued during the subsequent restore run. That could result in dump/restore misbehavior, or in privilege escalation if a nefarious user installs trojan-horse functions or operators. This patch changes pg_dump so that it does not change the search_path dynamically. The emitted restore script sets the search_path to what was used at dump time, and then leaves it alone thereafter. Created objects are placed in the correct schema, regardless of the active search_path, by dint of schema-qualifying their names in the CREATE commands, as well as in subsequent ALTER and ALTER-like commands. Since this change requires a change in the behavior of pg_restore when processing an archive file made according to this new convention, bump the archive file version number; old versions of pg_restore will therefore refuse to process files made with new versions of pg_dump. Security: CVE-2018-1058
2018-02-26Translation updatesPeter Eisentraut
Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: b97fb93ecf9b04faed73a68db33226f33ae3065e
2018-02-23Synchronize doc/ copies of src/test/examples/.Noah Misch
This is mostly cosmetic, but it might fix build failures, on some platform, when copying from the documentation. Back-patch to 9.3 (all supported versions).
2018-02-23Fix planner failures with overlapping mergejoin clauses in an outer join.Tom Lane
Given overlapping or partially redundant join clauses, for example t1 JOIN t2 ON t1.a = t2.x AND t1.b = t2.x the planner's EquivalenceClass machinery will ordinarily refactor the clauses as "t1.a = t1.b AND t1.a = t2.x", so that join processing doesn't see multiple references to the same EquivalenceClass in a list of join equality clauses. However, if the join is outer, it's incorrect to derive a restriction clause on the outer side from the join conditions, so the clause refactoring does not happen and we end up with overlapping join conditions. The code that attempted to deal with such cases had several subtle bugs, which could result in "left and right pathkeys do not match in mergejoin" or "outer pathkeys do not match mergeclauses" planner errors, if the selected join plan type was a mergejoin. (It does not appear that any actually incorrect plan could have been emitted.) The core of the problem really was failure to recognize that the outer and inner relations' pathkeys have different relationships to the mergeclause list. A join's mergeclause list is constructed by reference to the outer pathkeys, so it will always be ordered the same as the outer pathkeys, but this cannot be presumed true for the inner pathkeys. If the inner sides of the mergeclauses contain multiple references to the same EquivalenceClass ({t2.x} in the above example) then a simplistic rendering of the required inner sort order is like "ORDER BY t2.x, t2.x", but the pathkey machinery recognizes that the second sort column is redundant and throws it away. The mergejoin planning code failed to account for that behavior properly. One error was to try to generate cut-down versions of the mergeclause list from cut-down versions of the inner pathkeys in the same way as the initial construction of the mergeclause list from the outer pathkeys was done; this could lead to choosing a mergeclause list that fails to match the outer pathkeys. The other problem was that the pathkey cross-checking code in create_mergejoin_plan treated the inner and outer pathkey lists identically, whereas actually the expectations for them must be different. That led to false "pathkeys do not match" failures in some cases, and in principle could have led to failure to detect bogus plans in other cases, though there is no indication that such bogus plans could be generated. Reported by Alexander Kuzmenkov, who also reviewed this patch. This has been broken for years (back to around 8.3 according to my testing), so back-patch to all supported branches. Discussion: https://postgr.es/m/5dad9160-4632-0e47-e120-8e2082000c01@postgrespro.ru
2018-02-22Backport: Mark assorted GUC variables as PGDLLIMPORT.Andres Freund
This backpatches 935dee9ad5a8d12f4d3b772a6e6c99d245e5ad44 to the the branches requested by extension authors. Original-Author: Metin Doslu Original-Committer: Robert Haas Author: Brian Cloutier
2018-02-21Repair pg_upgrade's failure to preserve relfrozenxid for matviews.Tom Lane
This oversight led to data corruption in matviews, manifesting as "could not access status of transaction" before our most recent releases, and "found xmin from before relfrozenxid" errors since then. The proximate cause of the problem seems to have been confusion between the task of preserving dropped-column status and the task of preserving frozenxid status. Those are required for distinct sets of relkinds, and the reasoning was entirely undocumented in the source code. In hopes of forestalling future errors of the same kind, try to improve the commentary in this area. In passing, also improve the remarkably unhelpful comments around pg_upgrade's set_frozenxids(). That's not actually buggy AFAICS, but good luck figuring out what it does from the old comments. Per report from Claudio Freire. It appears that bug #14852 from Alexey Ermakov is an earlier report of the same issue, and there may be other cases that we failed to identify at the time. Patch by me based on analysis by Andres Freund. The bug dates back to the introduction of matviews, so back-patch to all supported branches. Discussion: https://postgr.es/m/CAGTBQpbrY9CdRGGhyBZ9yqY4jWaGC85rUF4X+R7d-aim=mBNsw@mail.gmail.com Discussion: https://postgr.es/m/20171013115320.28049.86457@wrigleys.postgresql.org
2018-02-19Fix misbehavior of CTE-used-in-a-subplan during EPQ rechecks.Tom Lane
An updating query that reads a CTE within an InitPlan or SubPlan could get incorrect results if it updates rows that are concurrently being modified. This is caused by CteScanNext supposing that nothing inside its recursive ExecProcNode call could change which read pointer is selected in the CTE's shared tuplestore. While that's normally true because of scoping considerations, it can break down if an EPQ plan tree gets built during the call, because EvalPlanQualStart builds execution trees for all subplans whether they're going to be used during the recheck or not. And it seems like a pretty shaky assumption anyway, so let's just reselect our own read pointer here. Per bug #14870 from Andrei Gorita. This has been broken since CTEs were implemented, so back-patch to all supported branches. Discussion: https://postgr.es/m/20171024155358.1471.82377@wrigleys.postgresql.org
2018-02-14Fix broken logic for reporting PL/Python function names in errcontext.Tom Lane
plpython_error_callback() reported the name of the function associated with the topmost PL/Python execution context. This was not merely wrong if there were nested PL/Python contexts, but it risked a core dump if the topmost one is an inline code block rather than a named function. That will have proname = NULL, and so we were passing a NULL pointer to snprintf("%s"). It seems that none of the PL/Python-testing machines in the buildfarm will dump core for that, but some platforms do, as reported by Marina Polyakova. Investigation finds that there actually is an existing regression test that used to prove that the behavior was wrong, though apparently no one had noticed that it was printing the wrong function name. It stopped showing the problem in 9.6 when we adjusted psql to not print CONTEXT by default for NOTICE messages. The problem is masked (if your platform avoids the core dump) in error cases, because PL/Python will throw away the originally generated error info in favor of a new traceback produced at the outer level. Repair by using ErrorContextCallback.arg to pass the correct context to the error callback. Add a regression test illustrating correct behavior. Back-patch to all supported branches, since they're all broken this way. Discussion: https://postgr.es/m/156b989dbc6fe7c4d3223cf51da61195@postgrespro.ru
2018-02-05Stamp 9.6.7.REL9_6_7Tom Lane
2018-02-05Translation updatesPeter Eisentraut
Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 8c0c2fe449f6c310f83664b946ba85309a69bdf6
2018-02-05Ensure that all temp files made during pg_upgrade are non-world-readable.Tom Lane
pg_upgrade has always attempted to ensure that the transient dump files it creates are inaccessible except to the owner. However, refactoring in commit 76a7650c4 broke that for the file containing "pg_dumpall -g" output; since then, that file was protected according to the process's default umask. Since that file may contain role passwords (hopefully encrypted, but passwords nonetheless), this is a particularly unfortunate oversight. Prudent users of pg_upgrade on multiuser systems would probably run it under a umask tight enough that the issue is moot, but perhaps some users are depending only on pg_upgrade's umask changes to protect their data. To fix this in a future-proof way, let's just tighten the umask at process start. There are no files pg_upgrade needs to write at a weaker security level; and if there were, transiently relaxing the umask around where they're created would be a safer approach. Report and patch by Tom Lane; the idea for the fix is due to Noah Misch. Back-patch to all supported branches. Security: CVE-2018-1053
2018-01-29psql documentation fixesPeter Eisentraut
Update the documentation for \pset to mention columns|linestyle|pager_min_lines. Author: Дилян Палаузов <dpa-postgres@aegee.org>
2018-01-28Add stack-overflow guards in set-operation planning.Tom Lane
create_plan_recurse lacked any stack depth check. This is not per our normal coding rules, but I'd supposed it was safe because earlier planner processing is more complex and presumably should eat more stack. But bug #15033 from Andrew Grossman shows this isn't true, at least not for queries having the form of a many-thousand-way INTERSECT stack. Further testing showed that recurse_set_operations is also capable of being crashed in this way, since it likewise will recurse to the bottom of a parsetree before calling any support functions that might themselves contain any stack checks. However, its stack consumption is only perhaps a third of create_plan_recurse's. It's possible that this particular problem with create_plan_recurse can only manifest in 9.6 and later, since before that we didn't build a Path tree for set operations. But having seen this example, I now have no faith in the proposition that create_plan_recurse doesn't need a stack check, so back-patch to all supported branches. Discussion: https://postgr.es/m/20180127050845.28812.58244@wrigleys.postgresql.org
2018-01-27Update time zone data files to tzdata release 2018c.Tom Lane
DST law changes in Brazil, Sao Tome and Principe. Historical corrections for Bolivia, Japan, and South Sudan. The "US/Pacific-New" zone has been removed (it was only a link to America/Los_Angeles anyway).
2018-01-23Teach reparameterize_path() to handle AppendPaths.Tom Lane
If we're inside a lateral subquery, there may be no unparameterized paths for a particular child relation of an appendrel, in which case we *must* be able to create similarly-parameterized paths for each other child relation, else the planner will fail with "could not devise a query plan for the given query". This means that there are situations where we'd better be able to reparameterize at least one path for each child. This calls into question the assumption in reparameterize_path() that it can just punt if it feels like it. However, the only case that is known broken right now is where the child is itself an appendrel so that all its paths are AppendPaths. (I think possibly I disregarded that in the original coding on the theory that nested appendrels would get folded together --- but that only happens *after* reparameterize_path(), so it's not excused from handling a child AppendPath.) Given that this code's been like this since 9.3 when LATERAL was introduced, it seems likely we'd have heard of other cases by now if there were a larger problem. Per report from Elvis Pranskevichus. Back-patch to 9.3. Discussion: https://postgr.es/m/5981018.zdth1YWmNy@hammer.magicstack.net
2018-01-23Report an ERROR if a parallel worker fails to start properly.Robert Haas
Commit 28724fd90d2f85a0573a8107b48abad062a86d83 fixed things so that if a background worker fails to start due to fork() failure or because it is terminated before startup succeeds, BGWH_STOPPED will be reported. However, that only helps if the code that uses the background worker machinery notices the change in status, and the code in parallel.c did not. To fix that, do two things. First, make sure that when a worker exits, it triggers the leader to read from error queues. That way, if a worker which has attached to an error queue exits uncleanly, the leader is sure to throw some error, either the contents of the ErrorResponse sent by the worker, or "lost connection to parallel worker" if it exited without sending one. To cover the case where the worker never starts up in the first place or exits before attaching to the error queue, the ParallelContext now keeps track of which workers have sent at least one message via the error queue. A worker which sends no messages by the time the parallel operation finishes will be checked to see whether it exited before attaching to the error queue; if so, a new error message, "parallel worker failed to initialize", will be reported. If not, we'll continue to wait until it either starts up and exits cleanly, starts up and exits uncleanly, or fails to start, and then take the appropriate action. Patch by me, reviewed by Amit Kapila. Discussion: http://postgr.es/m/CA+TgmoYnBgXgdTu6wk5YPdWhmgabYc9nY_pFLq=tB=FSLYkD8Q@mail.gmail.com
2018-01-22Make pg_dump's ACL, sec label, and comment entries reliably identifiable.Tom Lane
_tocEntryRequired() expects that it can identify ACL, SECURITY LABEL, and COMMENT TOC entries that are for large objects by seeing whether the tag for them starts with "LARGE OBJECT ". While that works fine for actual large objects, which are indeed tagged that way, it's subject to false positives unless every such entry's tag starts with an appropriate type ID. And in fact it does not work for ACLs, because up to now we customarily tagged those entries with just the bare name of the object. This means that an ACL for an object named "LARGE OBJECT something" would be misclassified as data not schema, with undesirable results in a schema-only or data-only dump --- although pg_upgrade seems unaffected, due to the special case for binary-upgrade mode further down in _tocEntryRequired(). We can fix this by changing all the dumpACL calls to use the label strings already in use for comments and security labels, which do follow the convention of starting with an object type indicator. Well, mostly they follow it. dumpDatabase() got it wrong, using just the bare database name for those purposes, so that a database named "LARGE OBJECT something" would similarly be subject to having its comment or security label dropped or included when not wanted. Bring that into line too. (Note that up to now, database ACLs have not been processed by pg_dump, so that this issue doesn't affect them.) _tocEntryRequired() itself is not free of fault: it was overly liberal about matching object tags to "LARGE OBJECT " in binary-upgrade mode. This looks like it is probably harmless because there would be no data component to strip anyway in that mode, but at best it's trouble waiting to happen, so tighten that up too. The possible misclassification of SECURITY LABEL entries for databases is in principle a security problem, but the opportunities for actual exploits seem too narrow to be interesting. The other cases seem like just bugs, since an object owner can change its ACL or comment for himself, he needn't try to trick someone else into doing it by choosing a strange name. This has been broken since per-large-object TOC entries were introduced in 9.0, so back-patch to all supported branches. Discussion: https://postgr.es/m/21714.1516553459@sss.pgh.pa.us