summaryrefslogtreecommitdiff
path: root/src
AgeCommit message (Collapse)Author
2018-06-18Prevent hard failures of standbys caused by recycled WAL segmentsMichael Paquier
When a standby's WAL receiver stops reading WAL from a WAL stream, it writes data to the current WAL segment without having priorily zero'ed the page currently written to, which can cause the WAL reader to read junk data from a past recycled segment and then it would try to get a record from it. While sanity checks in place provide most of the protection needed, in some rare circumstances, with chances increasing when a record header crosses a page boundary, then the startup process could fail violently on an allocation failure, as follows: FATAL: invalid memory alloc request size XXX This is confusing for the user and also unhelpful as this requires in the worst case a manual restart of the instance, impacting potentially the availability of the cluster, and this also makes WAL data look like it is in a corrupted state. The chances of seeing failures are higher if the connection between the standby and its root node is unstable, causing WAL pages to be written in the middle. A couple of approaches have been discussed, like zero-ing new WAL pages within the WAL receiver itself but this has the disadvantage of impacting performance of any existing instances as this breaks the sequential writes done by the WAL receiver. This commit deals with the problem with a more simple approach, which has no performance impact without reducing the detection of the problem: if a record is found with a length higher than 1GB for backends, then do not try any allocation and report a soft failure which will force the standby to retry reading WAL. It could be possible that the allocation call passes and that an unnecessary amount of memory is allocated, however follow-up checks on records would just fail, making this allocation short-lived anyway. This patch owes a great deal to Tsunakawa Takayuki for reporting the failure first, and then discussing a couple of potential approaches to the problem. Backpatch down to 9.5, which is where palloc_extended has been introduced. Reported-by: Tsunakawa Takayuki Reviewed-by: Tsunakawa Takayuki Author: Michael Paquier Discussion: https://postgr.es/m/0A3221C70F24FB45833433255569204D1F8B57AD@G01JPEXMBYT05
2018-06-16Avoid unnecessary use of strncpy in a couple of places in ecpg.Tom Lane
Use of strncpy with a length limit based on the source, rather than the destination, is non-idiomatic and draws warnings from gcc 8. Replace with memcpy, which does exactly the same thing in these cases, but with less chance for confusion. Backpatch to all supported branches. Discussion: https://postgr.es/m/21789.1529170195@sss.pgh.pa.us
2018-06-16Use snprintf not sprintf in pg_waldump's timestamptz_to_str.Tom Lane
This could only cause an issue if strftime returned a ridiculously long timezone name, which seems unlikely; and it wouldn't qualify as a security problem even then, since pg_waldump (nee pg_xlogdump) is a debug tool not part of the server. But gcc 8 has started issuing warnings about it, so let's use snprintf and be safe. Backpatch to 9.3 where this code was added. Discussion: https://postgr.es/m/21789.1529170195@sss.pgh.pa.us
2018-06-14Fail BRIN control functions during recovery explicitlyAlvaro Herrera
They already fail anyway, but prior to this patch they raise an ugly error message about a lock that cannot be acquired. This just improves the message. Author: Masahiko Sawada Reported-by: Masahiko Sawada Discussion: https://postgr.es/m/CAD21AoBZau4g4_NUf3BKNd=CdYK+xaPdtJCzvOC1TxGdTiJx_Q@mail.gmail.com Reviewed-by: Kuntal Ghosh, Alexander Korotkov, Simon Riggs, Michaël Paquier, Álvaro Herrera
2018-06-12Fix bugs in vacuum of shared rels, by keeping their relcache entries current.Andres Freund
When vacuum processes a relation it uses the corresponding relcache entry's relfrozenxid / relminmxid as a cutoff for when to remove tuples etc. Unfortunately for nailed relations (i.e. critical system catalogs) bugs could frequently lead to the corresponding relcache entry being stale. This set of bugs could cause actual data corruption as vacuum would potentially not remove the correct row versions, potentially reviving them at a later point. After 699bf7d05c some corruptions in this vein were prevented, but the additional error checks could also trigger spuriously. Examples of such errors are: ERROR: found xmin ... from before relfrozenxid ... and ERROR: found multixact ... from before relminmxid ... To be caused by this bug the errors have to occur on system catalog tables. The two bugs are: 1) Invalidations for nailed relations were ignored, based on the theory that the relcache entry for such tables doesn't change. Which is largely true, except for fields like relfrozenxid etc. This means that changes to relations vacuumed in other sessions weren't picked up by already existing sessions. Luckily autovacuum doesn't have particularly longrunning sessions. 2) For shared *and* nailed relations, the shared relcache init file was never invalidated while running. That means that for such tables (e.g. pg_authid, pg_database) it's not just already existing sessions that are affected, but even new connections are as well. That explains why the reports usually were about pg_authid et. al. To fix 1), revalidate the rd_rel portion of a relcache entry when invalid. This implies a bit of extra complexity to deal with bootstrapping, but it's not too bad. The fix for 2) is simpler, simply always remove both the shared and local init files. Author: Andres Freund Reviewed-By: Alvaro Herrera Discussion: https://postgr.es/m/20180525203736.crkbg36muzxrjj5e@alap3.anarazel.de https://postgr.es/m/CAMa1XUhKSJd98JW4o9StWPrfS=11bPgG+_GDMxe25TvUY4Sugg@mail.gmail.com https://postgr.es/m/CAKMFJucqbuoDRfxPDX39WhA3vJyxweRg_zDVXzncr6+5wOguWA@mail.gmail.com https://postgr.es/m/CAGewt-ujGpMLQ09gXcUFMZaZsGJC98VXHEFbF-tpPB0fB13K+A@mail.gmail.com Backpatch: 9.3-
2018-06-06Fix function code in error reportAlvaro Herrera
This bug causes a lseek() failure to be reported as a "could not open" failure in the error message, muddling bug reports. I introduced this copy-and-pasteo in commit 78e122010422. Noticed while reviewing code for bug report #15221, from lily liang. In version 10 the affected function is only used by multixact.c and commit_ts, and only in corner-case circumstances, neither of which are involved in the reported bug (a pg_subtrans failure.) Author: Álvaro Herrera
2018-05-25Fix misidentification of SQL statement type in plpgsql's exec_stmt_execsql.Tom Lane
To distinguish SQL statements that are INSERT/UPDATE/DELETE from other ones, exec_stmt_execsql looked at the post-rewrite form of the statement rather than the original. This is problematic because it did that only during first execution of the statement (in a session), but the correct answer could change later due to addition or removal of DO INSTEAD rules during the session. That could lead to an Assert failure, as reported by Tushar Ahuja and Robert Haas. In non-assert builds, there's a hazard that we would fail to enforce STRICT behavior when we'd be expected to. That would happen if an initially present DO INSTEAD, that replaced the original statement with one of a different type, were removed; after that the statement should act "normally", including strictness enforcement, but it didn't. (The converse case of enforcing strictness when we shouldn't doesn't seem to be a hazard, as addition of a DO INSTEAD that changes the statement type would always lead to acting as though the statement returned zero rows, so that the strictness error could not fire.) To fix, inspect the original form of the statement not the post-rewrite form, making it valid to assume the answer can't change intra-session. This should lead to the same answer in every case except when there is a DO INSTEAD that changes the statement type; we will now set mod_stmt=true anyway, while we would not have done so before. That breaks the Assert in the SPI_OK_REWRITTEN code path, which expected the latter behavior. It might be all right to assert mod_stmt rather than !mod_stmt there, but I'm not entirely convinced that that'd always hold, so just remove the assertion altogether. This has been broken for a long time, so back-patch to all supported branches. Discussion: https://postgr.es/m/CA+TgmoZUrRN4xvZe_BbBn_Xp0BDwuMEue-0OyF0fJpfvU2Yc7Q@mail.gmail.com
2018-05-24Properly schema-qualify additional object types in getObjectDescription().Tom Lane
Collations, conversions, extended statistics objects (in >= v10), and all four types of text search objects have schema-qualified names. getObjectDescription() ignored that and would emit just the base name of the object, potentially producing wrong or at least highly misleading output. Fix it to add the schema name whenever the object is not "visible" in the current search path, as is the rule for other schema-qualifiable object types. Although in common situations the output won't change, this seems to me (tgl) to be a bug worthy of back-patching, hence do so. Kyotaro Horiguchi, per a complaint from me Discussion: https://postgr.es/m/20180522.182020.114074746.horiguchi.kyotaro@lab.ntt.co.jp
2018-05-23Fix simple_prompt() to disable echo on Windows when stdin != terminal.Tom Lane
If echo = false, simple_prompt() is supposed to prevent echoing the input (for password input). However, the Windows implementation applied the mode change to STD_INPUT_HANDLE. That would not have the desired effect if stdin isn't actually the terminal, for instance if the user is piping something into psql. Fix it to apply the mode change to the correct input file, so that passwords do not echo in such cases. In passing, shorten and de-uglify this code by using #elif rather than an #if nest and removing some duplicated code. Back-patch to all supported versions. To simplify that, also back-patch the portions of commit 9daec77e1 that got rid of an unnecessary malloc/free in the same area. Matthew Stickney (cosmetic changes by me) Discussion: https://postgr.es/m/502a1fff-862b-da52-1031-f68df6ed5a2d@gmail.com
2018-05-22Widen COPY FROM's current-line-number counter from 32 to 64 bits.Tom Lane
Because the code for the HEADER option skips a line when this counter is zero, a very long COPY FROM WITH HEADER operation would drop a line every 2^32 lines. A lesser but still unfortunate problem is that errors would show a wrong input line number for errors occurring beyond the 2^31'st input line. While such large input streams seemed impractical when this code was first written, they're not any more. Widening the counter (and some associated variables) to uint64 should be enough to prevent problems for the foreseeable future. David Rowley Discussion: https://postgr.es/m/CAKJS1f88yh-6wwEfO6QLEEvH3BEugOq2QX1TOja0vCauoynmOQ@mail.gmail.com
2018-05-21Fix SQL:2008 FETCH FIRST syntax to allow parameters.Andrew Gierth
OFFSET <x> ROWS FETCH FIRST <y> ROWS ONLY syntax is supposed to accept <simple value specification>, which includes parameters as well as literals. When this syntax was added all those years ago, it was done inconsistently, with <x> and <y> being different subsets of the standard syntax. Rectify that by making <x> and <y> accept the same thing, and allowing either a (signed) numeric literal or a c_expr there, which allows for parameters, variables, and parenthesized arbitrary expressions. Per bug #15200 from Lukas Eder. Backpatch all the way, since this has been broken from the start. Discussion: https://postgr.es/m/877enz476l.fsf@news-spur.riddles.org.uk Discussion: http://postgr.es/m/152647780335.27204.16895288237122418685@wrigleys.postgresql.org
2018-05-21Fix unsafe usage of strerror(errno) within ereport().Tom Lane
This is the converse of the unsafe-usage-of-%m problem: the reason ereport/elog provide that format code is mainly to dodge the hazard of errno getting changed before control reaches functions within the arguments of the macro. I only found one instance of this hazard, but it's been there since 9.4 :-(.
2018-05-20printf("%lf") is not portable, so omit the "l".Tom Lane
The "l" (ell) width spec means something in the corresponding scanf usage, but not here. While modern POSIX says that applying "l" to "f" and other floating format specs is a no-op, SUSv2 says it's undefined. Buildfarm experience says that some old compilers emit warnings about it, and at least one old stdio implementation (mingw's "ANSI" option) actually produces wrong answers and/or crashes. Discussion: https://postgr.es/m/21670.1526769114@sss.pgh.pa.us Discussion: https://postgr.es/m/c085e1da-0d64-1c15-242d-c921f32e0d5c@dunslane.net
2018-05-19Support platforms where strtoll/strtoull are spelled __strtoll/__strtoull.Tom Lane
Ancient HPUX, for one, does this. We hadn't noticed due to the lack of regression tests that required a working strtoll. (I was slightly tempted to remove the other historical spelling, strto[u]q, since it seems we have no buildfarm members testing that case. But I refrained.) Discussion: https://postgr.es/m/151935568942.1461.14623890240535309745@wrigleys.postgresql.org
2018-05-18Arrange to supply declarations for strtoll/strtoull if needed.Tom Lane
Buildfarm member dromedary is still unhappy about the recently-added ecpg "long long" tests. The reason turns out to be that it includes "-ansi" in its CFLAGS, and in their infinite wisdom Apple have decided to hide the declarations of strtoll/strtoull in C89-compliant builds. (I find it pretty curious that they hide those function declarations when you can nonetheless declare a "long long" variable, but anyway that is their behavior, both on dromedary's obsolete macOS version and the newest and shiniest.) As a result, gcc assumes these functions return "int", leading naturally to wrong results. (Looking at dromedary's past build results, it's evident that this problem also breaks pg_strtouint64() on 32-bit platforms; but we evidently have no regression tests that exercise that function with values above 32 bits.) To fix, supply declarations for these functions when the platform provides the functions but not the declarations, using the same type of mechanism as we use for some other similar cases. Discussion: https://postgr.es/m/151935568942.1461.14623890240535309745@wrigleys.postgresql.org
2018-05-18Hot-fix ecpg regression test for missing ecpg_config.h inclusion.Tom Lane
I don't think this is really the best long-term answer, and in particular it doesn't fix the pre-existing hazard in sqltypes.h. But for the moment let's just try to make the buildfarm green again. Discussion: https://postgr.es/m/151935568942.1461.14623890240535309745@wrigleys.postgresql.org
2018-05-18Add some test coverage for ecpg's "long long" support.Tom Lane
This will only actually exercise the "long long" code paths on platforms where "long" is 32 bits --- otherwise, the SQL bigint type maps to plain "long", and we will test that code path instead. But that's probably sufficient coverage, and anyway we weren't testing either code path before. Dang Minh Huong, tweaked a bit by me Discussion: https://postgr.es/m/151935568942.1461.14623890240535309745@wrigleys.postgresql.org
2018-05-18Recognize that MSVC can support strtoll() and strtoull().Tom Lane
This is needed for full support of "long long" variables in ecpg, but the previous patch for bug #15080 (commits 51057feaa et al) missed it. In MSVC versions where the functions don't exist under those names, we can nonetheless use _strtoi64() and _strtoui64(). Like the previous patch, back-patch all the way. Dang Minh Huong Discussion: https://postgr.es/m/151935568942.1461.14623890240535309745@wrigleys.postgresql.org
2018-05-18Fix error message on short read of pg_controlMagnus Hagander
Instead of saying "error: success", indicate that we got a working read but it was too short.
2018-05-16Fix misprocessing of equivalence classes involving record_eq().Tom Lane
canonicalize_ec_expression() is supposed to agree with coerce_type() as to whether a RelabelType should be inserted to make a subexpression be valid input for the operators of a given opclass. However, it did the wrong thing with named-composite-type inputs to record_eq(): it put in a RelabelType to RECORDOID, which the parser doesn't. In some cases this was harmless because all code paths involving a particular equivalence class did the same thing, but in other cases this would result in failing to recognize a composite-type expression as being a member of an equivalence class that it actually is a member of. The most obvious bad effect was to fail to recognize that an index on a composite column could provide the sort order needed for a mergejoin on that column, as reported by Teodor Sigaev. I think there might be other, subtler, cases that result in misoptimization. It also seems possible that an unwanted RelabelType would sometimes get into an emitted plan --- but because record_eq and friends don't examine the declared type of their input expressions, that would not create any visible problems. To fix, just treat RECORDOID as if it were a polymorphic type, which in some sense it is. We might want to consider formalizing that a bit more someday, but for the moment this seems to be the only place where an IsPolymorphicType() test ought to include RECORDOID as well. This has been broken for a long time, so back-patch to all supported branches. Discussion: https://postgr.es/m/a6b22369-e3bf-4d49-f59d-0c41d3551e81@sigaev.ru
2018-05-15Fix type checking for support functions of parallel VARIADIC aggregates.Tom Lane
The impact of VARIADIC on the combine/serialize/deserialize support functions of an aggregate wasn't thought through carefully. There is actually no impact, because variadicity isn't passed through to these functions (and it doesn't seem like it would need to be). However, lookup_agg_function was mistakenly told to check things as though it were passed through. The net result was that it was impossible to declare an aggregate that had both VARIADIC input and parallelism support functions. In passing, fix a runtime check in nodeAgg.c for the combine function's strictness to make its error message agree with the creation-time check. The previous message was actually backwards, and it doesn't seem like there's a good reason to have two versions of this message text anyway. Back-patch to 9.6 where parallel aggregation was introduced. Alexey Bashtanov; message fix by me Discussion: https://postgr.es/m/f86dde87-fef4-71eb-0480-62754aaca01b@imap.cc
2018-05-09Update time zone data files to tzdata release 2018e.Tom Lane
DST law changes in North Korea. Redefinition of "daylight savings" in Ireland, as well as for some past years in Namibia and Czechoslovakia. Additional historical corrections for Czechoslovakia. With this change, the IANA database models Irish timekeeping as following "standard time" in summer, and "daylight savings" in winter, so that the daylight savings offset is one hour behind standard time not one hour ahead. This does not change their UTC offset (+1:00 in summer, 0:00 in winter) nor their timezone abbreviations (IST in summer, GMT in winter), though now "IST" is more correctly read as "Irish Standard Time" not "Irish Summer Time". However, the "is_dst" column in the pg_timezone_names view will now be true in winter and false in summer for the Europe/Dublin zone. Similar changes were made for Namibia between 1994 and 2017, and for Czechoslovakia between 1946 and 1947. So far as I can find, no Postgres internal logic cares about which way tm_isdst is reported; in particular, since commit b2cbced9e we do not rely on it to decide how to interpret ambiguous timestamps during DST transitions. So I don't think this change will affect any Postgres behavior other than the timezone-view outputs. Discussion: https://postgr.es/m/30996.1525445902@sss.pgh.pa.us
2018-05-08Improve inefficient regexes in vacuumdb TAP test.Tom Lane
The regexes used in 102_vacuumdb_stages.pl to check the postmaster log for expected output contained several places with ".*.*", which is underdetermined and can cause exponential runtime growth in Perl's regex matcher (since it's not bright enough not to waste time seeing whether different splits of the same substring would allow a match). We were fortunate that the amount of text in the postmaster log was generally not enough to make the runtime go to the moon; although commit 6271fceb8 had been on the hairy edge of an obvious problem, thanks to its increasing the default log verbosity to DEBUG1. Experimentation shows that anyone who tried to run this test case with an even higher log verbosity would have been in for serious pain. But even at default logging level, fixing this saves several hundred ms on my workstation, more on slower buildfarm members. Remove the extra ".*"s, restoring more-or-less-linear matching speed. Back-patch to 9.4 where the test case was added, mostly in case anyone tries to do related debugging in a back branch. Discussion: https://postgr.es/m/32459.1525657786@sss.pgh.pa.us
2018-05-07Stamp 9.6.9.REL9_6_9Tom Lane
2018-05-07Translation updatesPeter Eisentraut
Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: ffe3ce875edb7ff07db5351ed1640897fe9eb92d
2018-05-06Clear severity 5 perlcritic warnings from vcregress.plAndrew Dunstan
My recent update for python3 support used some idioms that are unapproved. This fixes them. Backpatch to all live branches like the original.
2018-05-05Tweak tests to support Python 3.7Peter Eisentraut
Python 3.7 removes the trailing comma in the repr() of BaseException (see <https://bugs.python.org/issue30399>), leading to test output differences. Work around that by composing the equivalent test output in a more manual way.
2018-05-05Remove extra newlines after PQerrorMessage()Peter Eisentraut
2018-05-05Fix scenario where streaming standby gets stuck at a continuation record.Heikki Linnakangas
If a continuation record is split so that its first half has already been removed from the master, and is only present in pg_wal, and there is a recycled WAL segment in the standby server that looks like it would contain the second half, recovery would get stuck. The code in XLogPageRead() incorrectly started streaming at the beginning of the WAL record, even if we had already read the first page. Backpatch to 9.4. In principle, older versions have the same problem, but without replication slots, there was no straightforward mechanism to prevent the master from recycling old WAL that was still needed by standby. Without such a mechanism, I think it's reasonable to assume that there's enough slack in how many old segments are kept around to not run into this, or you have a WAL archive. Reported by Jonathon Nelson. Analysis and patch by Kyotaro HORIGUCHI, with some extra comments by me. Discussion: https://www.postgresql.org/message-id/CACJqAM3xVz0JY1XFDKPP%2BJoJAjoGx%3DGNuOAshEDWCext7BFvCQ%40mail.gmail.com
2018-05-04Don't mark pages all-visible spuriouslyAlvaro Herrera
Dan Wood diagnosed a long-standing problem that pages containing tuples that are locked by multixacts containing live lockers may spuriously end up as candidates for getting their all-visible flag set. This has the long-term effect that multixacts remain unfrozen; this may previously pass undetected, but since commit XYZ it would be reported as "ERROR: found multixact 134100944 from before relminmxid 192042633" because when a later vacuum tries to freeze the page it detects that a multixact that should have gotten frozen, wasn't. Dan proposed a (correct) patch that simply sets a variable to its correct value, after a bogus initialization. But, per discussion, it seems better coding to avoid the bogus initializations altogether, since they could give rise to more bugs later. Therefore this fix rewrites the logic a little bit to avoid depending on the bogus initializations. This bug was part of a family introduced in 9.6 by commit a892234f830e; later, commit 38e9f90a227d fixed most of them, but this one was unnoticed. Authors: Dan Wood, Pavan Deolasee, Álvaro Herrera Reviewed-by: Masahiko Sawada, Pavan Deolasee, Álvaro Herrera Discussion: https://postgr.es/m/84EBAC55-F06D-4FBE-A3F3-8BDA093CE3E3@amazon.com
2018-05-04Provide for testing on python3 modules when under MSVCAndrew Dunstan
This should have been done some years ago as promised in commit c4dcdd0c2. However, better late than never. Along the way do a little housekeeping, including using a simpler test for the python version being tested, and removing a redundant subroutine parameter. These changes only apply back to release 9.5. Backpatch to all live releases.
2018-05-04Allow MSYS as well as MINGW in Msys unameAndrew Dunstan
Msys2's uname -s outputs a string beginning MSYS rather than MINGW as is output by Msys. Allow either in pg_upgrade's test.sh. Backpatch to all live branches.
2018-05-04Sync our copy of the timezone library with IANA release tzcode2018e.Tom Lane
The non-cosmetic changes involve teaching the "zic" tzdata compiler about negative DST. While I'm not currently intending that we start using negative-DST data right away, it seems possible that somebody would try to use our copy of zic with bleeding-edge IANA data. So we'd better be out in front of this change code-wise, even though it doesn't matter for the data file we're shipping. Discussion: https://postgr.es/m/30996.1525445902@sss.pgh.pa.us
2018-05-03Add HOLD_INTERRUPTS section into FinishPreparedTransaction.Teodor Sigaev
If an interrupt arrives in the middle of FinishPreparedTransaction and any callback decide to call CHECK_FOR_INTERRUPTS (e.g. RemoveTwoPhaseFile can write a warning with ereport, which checks for interrupts) then it's possible to leave current GXact undeleted. Backpatch to all supported branches Stas Kelvich Discussion: ihttps://www.postgresql.org/message-id/3AD85097-A3F3-4EBA-99BD-C38EDF8D2949@postgrespro.ru
2018-05-02Revert back-branch changes in power()'s behavior for NaN inputs.Tom Lane
Per discussion, the value of fixing these bugs in the back branches doesn't outweigh the downsides of changing corner-case behavior in a minor release. Hence, revert commits 217d8f3a1 and 4d864de48 in the v10 branch and the corresponding commits in 9.3-9.6. Discussion: https://postgr.es/m/75DB81BEEA95B445AE6D576A0A5C9E936A73E741@BPXM05GP.gisp.nec.co.jp
2018-04-29Fix bogus list-iteration code in pg_regress.c, affecting ecpg tests only.Tom Lane
While looking at a recent buildfarm failure in the ecpg tests, I wondered why the pg_regress output claimed the stderr part of the test failed, when the regression diffs were clearly for the stdout part. Looking into it, the reason is that pg_regress.c's logic for iterating over three parallel lists is wrong, and has been wrong since it was written: it advances the "tag" pointer at a different place in the loop than the other two pointers. Fix that.
2018-04-29Avoid wrong results for power() with NaN input on more platforms.Tom Lane
Buildfarm results show that the modern POSIX rule that 1 ^ NaN = 1 is not honored on *BSD until relatively recently, and really old platforms don't believe that NaN ^ 0 = 1 either. (This is unsurprising, perhaps, since SUSv2 doesn't require either behavior.) In hopes of getting to platform independent behavior, let's deal with all the NaN-input cases explicitly in dpow(). Note that numeric_power() doesn't know either of these special cases. But since that behavior is platform-independent, I think it should be addressed separately, and probably not back-patched. Discussion: https://postgr.es/m/75DB81BEEA95B445AE6D576A0A5C9E936A73E741@BPXM05GP.gisp.nec.co.jp
2018-04-29Update time zone data files to tzdata release 2018d.Tom Lane
DST law changes in Palestine and Antarctica (Casey Station). Historical corrections for Portugal and its colonies, as well as Enderbury, Jamaica, Turks & Caicos Islands, and Uruguay.
2018-04-29Avoid wrong results for power() with NaN input on some platforms.Tom Lane
Per spec, the result of power() should be NaN if either input is NaN. It appears that on some versions of Windows, the libc function does return NaN, but it also sets errno = EDOM, confusing our code that attempts to work around shortcomings of other platforms. Hence, add guard tests to avoid substituting a wrong result for the right one. It's been like this for a long time (and the odd behavior only appears in older MSVC releases, too) so back-patch to all supported branches. Dang Minh Huong, reviewed by David Rowley Discussion: https://postgr.es/m/75DB81BEEA95B445AE6D576A0A5C9E936A73E741@BPXM05GP.gisp.nec.co.jp
2018-04-25Correct pg_recvlogical server version test.Noah Misch
The predecessor test boiled down to "PQserverVersion(NULL) >= 100000", which is always false. No release includes that, so it could not have reintroduced CVE-2018-1058. Back-patch to 9.4, like the addition of the predecessor in commit 8d2814f274def85f39fbe997d454b01628cb5667. Discussion: https://postgr.es/m/20180422215551.GB2676194@rfd.leadboat.com
2018-04-20Fix race conditions when an event trigger is added concurrently with DDL.Tom Lane
EventTriggerTableRewrite crashed if there were table_rewrite triggers present, but there had not been when the calling command started. EventTriggerDDLCommandEnd called ddl_command_end triggers if present, even if there had been no such triggers when the calling command started, which would lead to a failure in pg_event_trigger_ddl_commands. In both cases, fix by doing nothing; it's better to wait till the next command when things will be properly initialized. In passing, remove an elog(DEBUG1) call that might have seemed interesting four years ago but surely isn't today. We found this because of intermittent failures in the buildfarm. Thanks to Alvaro Herrera and Andrew Gierth for analysis. Back-patch to 9.5; some of this code exists before that, but the specific hazards we need to guard against don't. Discussion: https://postgr.es/m/5767.1523995174@sss.pgh.pa.us
2018-04-20Change more places to be less trusting of RestrictInfo.is_pushed_down.Tom Lane
On further reflection, commit e5d83995e didn't go far enough: pretty much everywhere in the planner that examines a clause's is_pushed_down flag ought to be changed to use the more complicated behavior where we also check the clause's required_relids. Otherwise we could make incorrect decisions about whether, say, a clause is safe to use as a hash clause. Some (many?) of these places are safe as-is, either because they are never reached while considering a parameterized path, or because there are additional checks that would reject a pushed-down clause anyway. However, it seems smarter to just code them all the same way rather than rely on easily-broken reasoning of that sort. In support of that, invent a new macro RINFO_IS_PUSHED_DOWN that should be used in place of direct tests on the is_pushed_down flag. Like the previous patch, back-patch to all supported branches. Discussion: https://postgr.es/m/f8128b11-c5bf-3539-48cd-234178b2314d@proxel.se
2018-04-19Fix incorrect handling of join clauses pushed into parameterized paths.Tom Lane
In some cases a clause attached to an outer join can be pushed down into the outer join's RHS even though the clause is not degenerate --- this can happen if we choose to make a parameterized path for the RHS. If the clause ends up attached to a lower outer join, we'd misclassify it as being a "join filter" not a plain "filter" condition at that node, leading to wrong query results. To fix, teach extract_actual_join_clauses to examine each join clause's required_relids, not just its is_pushed_down flag. (The latter now seems vestigial, or at least in need of rethinking, but we won't do anything so invasive as redefining it in a bug-fix patch.) This has been wrong since we introduced parameterized paths in 9.2, though it's evidently hard to hit given the lack of previous reports. The test case used here involves a lateral function call, and I think that a lateral reference may be required to get the planner to select a broken plan; though I wouldn't swear to that. In any case, even if LATERAL is needed to trigger the bug, it still affects all supported branches, so back-patch to all. Per report from Andreas Karlsson. Thanks to Andrew Gierth for preliminary investigation. Discussion: https://postgr.es/m/f8128b11-c5bf-3539-48cd-234178b2314d@proxel.se
2018-04-19Enlarge find_other_exec's meager fgets bufferAlvaro Herrera
The buffer was 100 bytes long, which is barely sufficient when the version string gets longer (such as by configure --with-extra-version). Set it to MAXPGPATH. Author: Nikhil Sontakke Discussion: https://postgr.es/m/CAMGcDxfLfpYU_Jru++L6ARPCOyxr0W+2O3Q54TDi5XdYeU36ow@mail.gmail.com
2018-04-18Better fix for deadlock hazard in CREATE INDEX CONCURRENTLY.Tom Lane
Commit 54eff5311 did not account for the possibility that we'd have a transaction snapshot due to default_transaction_isolation being set high enough to require one. The transaction snapshot is enough to hold back our advertised xmin and thus risk deadlock anyway. The only way to get rid of that snap is to start a new transaction, so let's do that instead. Also throw in an assert checking that we really have gotten to a state where no xmin is being advertised. Back-patch to 9.4, like the previous commit. Discussion: https://postgr.es/m/CAMkU=1ztk3TpQdcUNbxq93pc80FrXUjpDWLGMeVBDx71GHNwZQ@mail.gmail.com
2018-04-16Fix broken collation-aware searches in SP-GiST text opclass.Tom Lane
spg_text_leaf_consistent() supposed that it should compare only Min(querylen, entrylen) bytes of the two strings, and then deal with any excess bytes in one string or the other by assuming the longer string is greater if the prefixes are equal. Quite aside from the fact that that's just wrong in some locales (e.g., 'ch' is not less than 'd' in cs_CZ), it also risked passing incomplete multibyte characters to strcoll(), with ensuing bad results. Instead, just pass the full strings to varstr_cmp, and let it decide what to do about unequal-length strings. Fortunately, this error doesn't imply any index corruption, it's just that searches might return the wrong set of entries. Per report from Emre Hasegeli, though this is not his patch. Thanks to Peter Geoghegan for review and discussion. This code was born broken, so back-patch to all supported branches. In HEAD, I failed to resist the temptation to do a bit of cosmetic cleanup/pgindent'ing on 710d90da1, too. Discussion: https://postgr.es/m/CAE2gYzzb6K51VnTq5i5p52z+j9p2duEa-K1T3RrC_GQEynAKEg@mail.gmail.com
2018-04-13In libpq, free any partial query result before collecting a server error.Tom Lane
We'd throw away the partial result anyway after parsing the error message. Throwing it away beforehand costs nothing and reduces the risk of out-of-memory failure. Also, at least in systems that behave like glibc/Linux, if the partial result was very large then the error PGresult would get allocated at high heap addresses, preventing the heap storage used by the partial result from being released to the OS until the error PGresult is freed. In psql >= 9.6, we hold onto the error PGresult until another error is received (for \errverbose), so that this behavior causes a seeming memory leak to persist for awhile, as in a recent complaint from Darafei Praliaskouski. This is a potential performance regression from older versions, justifying back-patching at least that far. But similar behavior may occur in other client applications, so it seems worth just back-patching to all supported branches. Discussion: https://postgr.es/m/CAC8Q8tJ=7cOkPePyAbJE_Pf691t8nDFhJp0KZxHvnq_uicfyVg@mail.gmail.com
2018-04-12Fix bogus affix-merging code.Tom Lane
NISortAffixes() compared successive compound affixes incorrectly, thus possibly failing to merge identical affixes, or (less likely) merging ones that shouldn't be merged. The user-visible effects of this are unclear, to me anyway. Per bug #15150 from Alexander Lakhin. It's been broken for a long time, so back-patch to all supported branches. Arthur Zakirov Discussion: https://postgr.es/m/152353327780.31225.13445405496721177988@wrigleys.postgresql.org
2018-04-11Ignore nextOid when replaying an ONLINE checkpoint.Tom Lane
The nextOid value is from the start of the checkpoint and may well be stale compared to values from more recent XLOG_NEXTOID records. Previously, we adopted it anyway, allowing the OID counter to go backwards during a crash. While this should be harmless, it contributed to the severity of the bug fixed in commit 0408e1ed5, by allowing duplicate TOAST OIDs to be assigned immediately following a crash. Without this error, that issue would only have arisen when TOAST objects just younger than a multiple of 2^32 OIDs were deleted and then not vacuumed in time to avoid a conflict. Pavan Deolasee Discussion: https://postgr.es/m/CABOikdOgWT2hHkYG3Wwo2cyZJq2zfs1FH0FgX-=h4OLosXHf9w@mail.gmail.com
2018-04-11Do not select new object OIDs that match recently-dead entries.Tom Lane
When selecting a new OID, we take care to avoid picking one that's already in use in the target table, so as not to create duplicates after the OID counter has wrapped around. However, up to now we used SnapshotDirty when scanning for pre-existing entries. That ignores committed-dead rows, so that we could select an OID matching a deleted-but-not-yet-vacuumed row. While that mostly worked, it has two problems: * If recently deleted, the dead row might still be visible to MVCC snapshots, creating a risk for duplicate OIDs when examining the catalogs within our own transaction. Such duplication couldn't be visible outside the object-creating transaction, though, and we've heard few if any field reports corresponding to such a symptom. * When selecting a TOAST OID, deleted toast rows definitely *are* visible to SnapshotToast, and will remain so until vacuumed away. This leads to a conflict that will manifest in errors like "unexpected chunk number 0 (expected 1) for toast value nnnnn". We've been seeing reports of such errors from the field for years, but the cause was unclear before. The fix is simple: just use SnapshotAny to search for conflicting rows. This results in a slightly longer window before object OIDs can be recycled, but that seems unlikely to create any large problems. Pavan Deolasee Discussion: https://postgr.es/m/CABOikdOgWT2hHkYG3Wwo2cyZJq2zfs1FH0FgX-=h4OLosXHf9w@mail.gmail.com