user/sven/postgresql.git

Age	Commit message	Author
2015-06-25	Fix the logic for putting relations into the relcache init file.	Tom Lane
	Commit f3b5565dd4e59576be4c772da364704863e6a835 was a couple of bricks shy of a load; specifically, it missed putting pg_trigger_tgrelid_tgname_index into the relcache init file, because that index is not used by any syscache. However, we have historically nailed that index into cache for performance reasons. The upshot was that load_relcache_init_file always decided that the init file was busted and silently ignored it, resulting in a significant hit to backend startup speed. To fix, reinstantiate RelationIdIsInInitFile() as a wrapper around RelationSupportsSysCache(), which can know about additional relations that should be in the init file despite being unknown to syscache.c. Also install some guards against future mistakes of this type: make write_relcache_init_file Assert that all nailed relations get written to the init file, and make load_relcache_init_file emit a WARNING if it takes the "wrong number of nailed relations" exit path. Now that we remove the init files during postmaster startup, that case should never occur in the field, even if we are starting a minor-version update that added or removed rels from the nailed set. So the warning shouldn't ever be seen by end users, but it will show up in the regression tests if somebody breaks this logic. Back-patch to all supported branches, like the previous commit.
2015-06-23	Update get_relation_info comment.	Robert Haas
	Thomas Munro
2015-06-23	Add missing newline to debug-message.	Heikki Linnakangas
	Michael Paquier
2015-06-22	pg_rewind: Improve message wording	Peter Eisentraut

2015-06-22	pg_basebackup: Remove redundant newline in error message	Peter Eisentraut

2015-06-22	Improve inheritance_planner()'s performance for large inheritance sets.	Tom Lane
	Commit c03ad5602f529787968fa3201b35c119bbc6d782 introduced a planner performance regression for UPDATE/DELETE on large inheritance sets. It required copying the append_rel_list (which is of size proportional to the number of inherited tables) once for each inherited table, thus resulting in O(N^2) time and memory consumption. While it's difficult to avoid that in general, the extra work only has to be done for append_rel_list entries that actually reference subquery RTEs, which inheritance-set entries will not. So we can buy back essentially all of the loss in cases without subqueries in FROM; and even for those, the added work is mainly proportional to the number of UNION ALL subqueries. Back-patch to 9.2, like the previous commit. Tom Lane and Dean Rasheed, per a complaint from Thomas Munro.
2015-06-22	psql: Add some tab completion for TABLESAMPLE.	Robert Haas
	Petr Jelinek, reviewed by Brendan Jurd
2015-06-21	Truncate strings in tarCreateHeader() with strlcpy(), not sprintf().	Noah Misch
	This supplements the GNU libc bug #6530 workarounds introduced in commit 54cd4f04576833abc394e131288bf3dd7dcf4806. On affected systems, a tar-format pg_basebackup failed when some filename beneath the data directory was not valid character data in the postmaster/walsender locale. Back-patch to 9.1, where pg_basebackup was introduced. Extant, bug-prone conversion specifications receive only ASCII bytes or involve low-importance messages.
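The byte-oriented truncation this commit relies on can be sketched as below. `bounded_copy` is a hypothetical stand-in with strlcpy()-like semantics, not the code from the commit. It copies raw bytes and never interprets them as characters in the current locale, which is the property that matters when a filename is not valid character data.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper with strlcpy()-like semantics (an assumption for
 * illustration, not PostgreSQL's implementation): copy at most
 * dstsize - 1 bytes of src into dst, always NUL-terminate when
 * dstsize > 0, and return strlen(src) so callers can detect truncation.
 */
static size_t
bounded_copy(char *dst, const char *src, size_t dstsize)
{
    size_t srclen;

    assert(dst != NULL && src != NULL);
    srclen = strlen(src);

    if (dstsize > 0)
    {
        size_t n = (srclen < dstsize - 1) ? srclen : dstsize - 1;

        memcpy(dst, src, n);    /* plain byte copy, no locale involved */
        dst[n] = '\0';
    }
    return srclen;
}
```

A return value greater than or equal to the buffer size means the name was truncated; the bytes that did fit are copied unchanged whatever the locale.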
2015-06-21	Add transforms to pg_get_object_address and friends	Alvaro Herrera
	This was missed when transforms were added by commit cac76582053ef8e. Extracted from a larger patch. Author: Michael Paquier
2015-06-21	Improve multixact emergency autovacuum logic.	Andres Freund
	Previously autovacuum was not necessarily triggered if space in the members slru got tight. The first problem was that the signalling was tied to values in the offsets slru, but members can advance much faster. That's especially a problem if old sessions had been around that previously prevented the multixact horizon from increasing. Secondly the skipping logic doesn't work if the database was restarted after autovacuum was triggered - that knowledge is not preserved across restart. This is especially a problem because it's a common panic-reaction to restart the database if it gets slow due to anti-wraparound vacuums. Fix the first problem by separating the logic for members from offsets. Trigger autovacuum whenever a multixact crosses a segment boundary, as the current member offset increases in irregular values, so we can't use a simple modulo logic as for offsets. Add a stopgap for the second problem, by signalling autovacuum whenever ERRORing out because of boundaries. Discussion: 20150608163707.GD20772@alap3.anarazel.de Backpatch into 9.3, where it became more likely that multixacts wrap around.
2015-06-21	Add missing check for wal_debug GUC.	Andres Freund
	9a20a9b2 added a new elog(), enabled when WAL_DEBUG is defined. The other WAL_DEBUG dependent messages check for the wal_debug GUC, but this one did not. While at it, replace 'upto' with 'up to'. Discussion: 20150610110253.GF3832@alap3.anarazel.de Backpatch to 9.4, the first release containing 9a20a9b2.
2015-06-21	PL/Perl: Add alternative expected file for Perl 5.22	Peter Eisentraut

2015-06-20	Fix failure to copy setlocale() return value.	Noah Misch
	POSIX permits setlocale() calls to invalidate any previous setlocale() return values, but commit 5f538ad004aa00cf0881f179f0cde789aad4f47e neglected to account for setlocale(LC_CTYPE, NULL) doing so. The effect was to set the LC_CTYPE environment variable to an unintended value. pg_perm_setlocale() sets this variable to assist PL/Perl; without it, Perl would undo PostgreSQL's locale settings. The known-affected configurations are 32-bit, release builds using Visual Studio 2012 or Visual Studio 2013. Visual Studio 2010 is unaffected, as were all buildfarm-attested configurations. In principle, this bug could leave the wrong LC_CTYPE in effect after PL/Perl use, which could in turn facilitate problems like corrupt tsvector datums. No known platform experiences that consequence, because PL/Perl on Windows does not use this environment variable. The bug has been user-visible, as early postmaster failure, on systems with Windows ANSI code page set to CP936 for "Chinese (Simplified, PRC)" and probably on systems using other multibyte code pages. (SetEnvironmentVariable() rejects values containing character data not valid under the Windows ANSI code page.) Back-patch to 9.4, where the faulty commit first appeared. Reported by Didi Hu and 林鹏程. Reviewed by Tom Lane, though this fix strategy was not his first choice.
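The hazard described here is that the pointer returned by one setlocale() call may be invalidated by any later setlocale() call. A minimal sketch of the defensive pattern follows; the helper names `save_ctype_locale` and `restore_ctype_locale` are hypothetical, and this is not the code from the commit.

```c
#include <assert.h>
#include <locale.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a malloc'd copy of the current LC_CTYPE
 * locale name, or NULL on failure.  The copy stays valid even if a
 * later setlocale() call overwrites the library's internal buffer.
 */
static char *
save_ctype_locale(void)
{
    const char *cur = setlocale(LC_CTYPE, NULL);
    char       *copy;

    if (cur == NULL)
        return NULL;
    copy = malloc(strlen(cur) + 1);
    if (copy != NULL)
        strcpy(copy, cur);
    return copy;
}

/*
 * Hypothetical helper: reinstall a locale name saved earlier and free
 * the saved copy.  Returns true if setlocale() accepted the name.
 */
static bool
restore_ctype_locale(char *saved)
{
    bool ok;

    assert(saved != NULL);
    ok = (setlocale(LC_CTYPE, saved) != NULL);
    free(saved);
    return ok;
}
```

The point of the pattern is that the caller never holds the raw setlocale() return value across another setlocale() call.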
2015-06-20	Revert "Detect setlocale(LC_CTYPE, NULL) clobbering previous return values."	Noah Misch
	This reverts commit b76e76be460a240e99c33f6fb470dd1d5fe01a2a. The buildfarm yielded no related failures.
2015-06-20	Fix thinko in comment (launcher -> worker)	Alvaro Herrera

2015-06-19	In immediate shutdown, postmaster should not exit till children are gone.	Tom Lane
	This adjusts commit 82233ce7ea42d6ba519aaec63008aff49da6c7af so that the postmaster does not exit until all its child processes have exited, even if the 5-second timeout elapses and we have to send SIGKILL. There is no great value in having the postmaster process quit sooner, and doing so can mislead onlookers into thinking that the cluster is fully terminated when actually some child processes still survive. This effect might explain recent test failures on buildfarm member hamster, wherein we failed to restart a cluster just after shutting it down with "pg_ctl stop -m immediate". I also did a bit of code review/beautification, including fixing a faulty use of the Max() macro on a volatile expression. Back-patch to 9.4. In older branches, the postmaster never waited for children to exit during immediate shutdowns, and changing that would be too much of a behavioral change.
2015-06-19	Clamp autovacuum launcher sleep time to 5 minutes	Alvaro Herrera
	This avoids the problem that it might go to sleep for an unreasonable amount of time in unusual conditions like the server clock moving backwards an unreasonable amount of time. (Simply moving the server clock forward again doesn't solve the problem unless you wake up the autovacuum launcher manually, say by sending it SIGHUP). Per trouble report from Prakash Itnal in https://www.postgresql.org/message-id/CAHC5u79-UqbapAABH2t4Rh2eYdyge0Zid-X=Xz-ZWZCBK42S0Q@mail.gmail.com Analyzed independently by Haribabu Kommi and Tom Lane.
2015-06-19	Fix bogus range_table_mutator() logic for RangeTblEntry.tablesample.	Tom Lane
	Must make a copy of the TableSampleClause node; the previous coding modified the input data structure in-place. Petr Jelinek
2015-06-19	Fix corner case in autovacuum-forcing logic for multixact wraparound.	Robert Haas
	Since find_multixact_start() relies on SimpleLruDoesPhysicalPageExist(), and that function looks only at the on-disk state, it's possible for it to fail to find a page that exists in the in-memory SLRU that has not been written yet. If that happens, SetOffsetVacuumLimit() will erroneously decide to force emergency autovacuuming immediately. We should probably fix find_multixact_start() to consider the data cached in memory as well as the on-disk state, but that's no excuse for SetOffsetVacuumLimit() to be stupid about the case where it can no longer read the value after having previously succeeded in doing so. Report by Andres Freund.
2015-06-19	Add PASSWORD to tab completions for CREATE/ALTER ROLE/USER/GROUP.	Robert Haas
	Jeevan Chalke
2015-06-19	Change TAP test framework to not rely on having a chmod executable.	Robert Haas
	This might not work at all on Windows, and is not ever efficient. Michael Paquier
2015-06-17	Detect setlocale(LC_CTYPE, NULL) clobbering previous return values.	Noah Misch
	POSIX permits setlocale() calls to invalidate any previous setlocale() return values. Commit 5f538ad004aa00cf0881f179f0cde789aad4f47e neglected to account for that. In advance of fixing that bug, switch to failing hard on affected configurations. This is a planned temporary commit to assay buildfarm-represented configurations.
2015-06-15	Fix comment in fmgr.h to refer to actual function used.	Andrew Dunstan
	FunctionLookup() is long gone if it ever existed, and fmgr_info() is what's now used, so the comments now reflect that.
2015-06-15	Check for out of memory when allocating sqlca.	Michael Meskes
	Patch by Michael Paquier
2015-06-15	Fix memory leak in ecpglib's connect function.	Michael Meskes
	Patch by Michael Paquier
2015-06-12	Fix "path" infrastructure bug affecting jsonb_set()	Andrew Dunstan
	jsonb_set() and other clients of the setPathArray() utility function could get spurious results when an array integer subscript is provided that is not within the range of int. To fix, ensure that the value returned by strtol() within setPathArray() is within the range of int; when it isn't, assume an invalid input in line with existing, similar cases. The path-orientated operators that appeared in PostgreSQL 9.3 and 9.4 do not call setPathArray(), and already independently take this precaution, so no change there. Peter Geoghegan
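The range check described above can be sketched as follows. `parse_int_subscript` is a hypothetical name, and the code is an illustration of the technique rather than the actual setPathArray() logic: strtol() yields a long, so the value must be checked against the range of int before it is narrowed.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse text as a decimal array subscript.
 * Returns true and stores the value in *result only if the whole
 * string is a number that fits in an int; otherwise returns false,
 * treating the input as invalid.
 */
static bool
parse_int_subscript(const char *text, int *result)
{
    char *end;
    long  val;

    assert(text != NULL && result != NULL);

    errno = 0;
    val = strtol(text, &end, 10);

    /* Reject empty input, trailing garbage, and long overflow. */
    if (end == text || *end != '\0' || errno != 0)
        return false;

    /* On platforms where long is wider than int, check the int range too. */
    if (val > INT_MAX || val < INT_MIN)
        return false;

    *result = (int) val;
    return true;
}
```

Without the second check, a subscript such as 4294967296 would be silently narrowed on platforms with a 64-bit long and could address the wrong element.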
2015-06-12	Fix failure to cover scalar-vs-rowtype cases in exec_stmt_return().	Tom Lane
	In commit 9e3ad1aac52454569393a947c06be0d301749362 I modified plpgsql to use exec_stmt_return's simple-variables fast path in more cases. However, I overlooked that there are really two different return conventions in use here, depending on whether estate->retistuple is true, and the existing fast-path code had only bothered to handle one of them. So trying to return a scalar in a function returning composite, or vice versa, could lead to unexpected error messages (typically "cache lookup failed for type 0") or to a null-pointer-dereference crash. In the DTYPE_VAR case, we can just throw error if retistuple is true, corresponding to what happens in the general-expression code path that was being used previously. (Perhaps someday both of these code paths should attempt a coercion, but today is not that day.) In the REC and ROW cases, just hand the problem to exec_eval_datum() when not retistuple. Also clean up the ROW coding slightly so it looks more like exec_eval_datum(). The previous commit also caused exec_stmt_return_next() to be used in more cases, but that code seems to be OK as-is. Per off-list report from Serge Rielau. This bug is new in 9.5 so no need to back-patch.
2015-06-12	Improve error message and hint for ALTER COLUMN TYPE can't-cast failure.	Tom Lane
	We already tried to improve this once, but the "improved" text was rather off-target if you had provided a USING clause. Also, it seems helpful to provide the exact text of a suggested USING clause, so users can just copy-and-paste it when needed. Per complaint from Keith Rarick and a suggestion from Merlin Moncure. Back-patch to 9.2 where the current wording was adopted.
2015-06-12	Make postmaster restart archiver soon after it dies, even during recovery.	Fujii Masao
	After the archiver dies, postmaster tries to start a new one immediately. But previously this could happen only while the server was running normally, even though archiving was always enabled (i.e., archive_mode was set to always). So the archiver running during recovery could not restart soon after it died. This is an oversight in commit ffd3774. This commit changes reaper(), postmaster's signal handler to clean up after a child process dies, so that it tries to start a new archiver even during recovery if necessary. Patch by me. Review by Alvaro Herrera.
2015-06-12	Fixed some memory leaks in ECPG.	Michael Meskes
	Patch by Michael Paquier
2015-06-12	Fix intoasc() in Informix compat lib. This function used to be a noop.	Michael Meskes
	Patch by Michael Paquier
2015-06-12	Fix alphabetization in catalogs.sgml.	Fujii Masao
	System catalogs and views should be listed alphabetically in catalogs.sgml, but the pg_file_settings view was not. This patch also fixes typos in pg_file_settings comments.
2015-06-12	Clean up useless mention of RMGRDESCSOURCES in pg_rewind Makefile.	Fujii Masao
	RMGRDESCSOURCES is defined and used only in pg_xlogdump Makefile, but pg_rewind Makefile mentioned it as extra files to remove in "make clean". This patch removes that useless mention from pg_rewind Makefile. Michael Paquier
2015-06-11	Rename jsonb - text[] operator to #- to avoid ambiguity.	Andrew Dunstan
	Following recent discussion on -hackers. The underlying function is also renamed to jsonb_delete_path. The regression tests now don't need ugly type casts to avoid the ambiguity, so they are also removed. Catalog version bumped.
2015-06-11	Fix some issues in pg_rewind.	Fujii Masao
	* Remove invalid option character "N" from the third argument (valid option string) of getopt_long(). * Use pg_free() or pfree() to free the memory allocated by pg_malloc() or palloc() instead of always using free(). * Assume problem is no disk space if write() fails but doesn't set errno. * Fix several typos. Patch by me. Review by Michael Paquier.
2015-06-10	Fix typo	Peter Eisentraut

2015-06-10	Fix typo in comment.	Kevin Grittner
	Backpatch to 9.4 to minimize possible conflicts.
2015-06-10	Fix typo in comment.	Fujii Masao
	David Rowley
2015-06-09	Report more information if pg_perm_setlocale() fails at startup.	Tom Lane
	We don't know why a few Windows users have seen this fail, but the taciturnity of the error message certainly isn't helping debug it. Let's at least find out which LC category isn't working.
2015-06-08	Fix typos	Alvaro Herrera
	tablesapce -> tablespace; there -> their. These were introduced in 72d422a52, so no need to backpatch.
2015-06-09	Refactor WAL segment copying code.	Fujii Masao
	* Remove unused argument "dstfname" and related code from XLogFileCopy(). * Previously XLogFileCopy() returned a pstrdup'd string so that InstallXLogFileSegment() used it later. Since the pstrdup'd string was never free'd, there could be a risk of memory leak. It was almost harmless because the startup process exited just after calling XLogFileCopy(), but it existed. This commit changes XLogFileCopy() so that it directly calls InstallXLogFileSegment() and doesn't call pstrdup() at all, which fixes that memory leak problem. * Extend InstallXLogFileSegment() so that the caller can specify the log level, which allows us to emit an error when InstallXLogFileSegment() fails a disk file access like link() and rename(). Previously it was always logged with LOG level and additionally needed to be logged with ERROR when we wanted to treat it as an error. Michael Paquier
2015-06-08	Allow HotStandbyActiveInReplay() to be called in single user mode.	Andres Freund
	HotStandbyActiveInReplay, introduced in 061b079f, only allowed WAL replay to happen in the startup process, missing the single user case. This buglet is fairly harmless as it only causes problems when single user mode in an assertion enabled build is used to replay a btree vacuum record. Backpatch to 9.2. 061b079f was backpatched further, but the assertion was not.
2015-06-07	Desupport jsonb subscript deletion on objects	Andrew Dunstan
	Supporting deletion of JSON pairs within jsonb objects using an array-style integer subscript allowed for surprising outcomes. This was mostly due to the implementation-defined ordering of pairs within objects for jsonb. It also seems desirable to make jsonb integer subscript deletion consistent with the 9.4 era general purpose integer subscripting operator for jsonb (although that operator returns NULL when an object is encountered, while we prefer here to throw an error). Peter Geoghegan, following discussion on -hackers.
2015-06-07	Use a safer method for determining whether relcache init file is stale.	Tom Lane
	When we invalidate the relcache entry for a system catalog or index, we must also delete the relcache "init file" if the init file contains a copy of that rel's entry. The old way of doing this relied on a specially maintained list of the OIDs of relations present in the init file: we made the list either when reading the file in, or when writing the file out. The problem is that when writing the file out, we included only rels present in our local relcache, which might have already suffered some deletions due to relcache inval events. In such cases we correctly decided not to overwrite the real init file with incomplete data --- but we still used the incomplete initFileRelationIds list for the rest of the current session. This could result in wrong decisions about whether the session's own actions require deletion of the init file, potentially allowing an init file created by some other concurrent session to be left around even though it's been made stale. Since we don't support changing the schema of a system catalog at runtime, the only likely scenario in which this would cause a problem in the field involves a "vacuum full" on a catalog concurrently with other activity, and even then it's far from easy to provoke. Remarkably, this has been broken since 2002 (in commit 786340441706ac1957a031f11ad1c2e5b6e18314), but we had never seen a reproducible test case until recently. If it did happen in the field, the symptoms would probably involve unexpected "cache lookup failed" errors to begin with, then "could not open file" failures after the next checkpoint, as all accesses to the affected catalog stopped working. Recovery would require manually removing the stale "pg_internal.init" file. To fix, get rid of the initFileRelationIds list, and instead consult syscache.c's list of relations used in catalog caches to decide whether a relation is included in the init file. 
This should be a tad more efficient anyway, since we're replacing linear search of a list with ~100 entries with a binary search. It's a bit ugly that the init file contents are now so directly tied to the catalog caches, but in practice that won't make much difference. Back-patch to all supported branches.
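The efficiency remark above contrasts a linear list scan with a binary search over a sorted array. A minimal sketch using the standard bsearch() follows. The `Oid` typedef and the function names are assumptions made for illustration, and this is not the lookup code in syscache.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef unsigned int Oid;       /* assumed stand-in for PostgreSQL's Oid */

/* Three-way comparison of two Oids, written to avoid unsigned subtraction. */
static int
oid_cmp(const void *a, const void *b)
{
    Oid x = *(const Oid *) a;
    Oid y = *(const Oid *) b;

    return (x > y) - (x < y);
}

/*
 * Hypothetical helper: report whether target occurs in list, which
 * must be sorted in ascending order.  Uses O(log n) comparisons
 * instead of the O(n) needed by a linear scan.
 */
static bool
oid_in_sorted_list(Oid target, const Oid *list, size_t n)
{
    assert(list != NULL);
    return bsearch(&target, list, n, sizeof(Oid), oid_cmp) != NULL;
}
```

For about 100 entries this is at most seven comparisons per lookup, against up to 100 for a linear scan.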
2015-06-05	Get rid of a //-style comment.	Tom Lane
	Not sure how "//XXX" got into a committed patch in the first place, as it's both content-free and against project style. pgindent made a bit of a hash of it, too. Going forward, we should have at least one buildfarm member using "gcc -ansi" to catch such things, at least till such time as we decide the project target language isn't C90 any more. I've turned this option on on dromedary.
2015-06-05	Fix incorrect order of database-locking operations in InitPostgres().	Tom Lane
	We should set MyProc->databaseId after acquiring the per-database lock, not beforehand. The old way risked deadlock against processes trying to copy or delete the target database, since they would first acquire the lock and then wait for processes with matching databaseId to exit; that left a window wherein an incoming process could set its databaseId and then block on the lock, while the other process had the lock and waited in vain for the incoming process to exit. CountOtherDBBackends() would time out and fail after 5 seconds, so this just resulted in an unexpected failure not a permanent lockup, but it's still annoying when it happens. A real-world example of a use-case is that short-duration connections to a template database should not cause CREATE DATABASE to fail. Doing it in the other order should be fine since the contract has always been that processes searching the ProcArray for a database ID must hold the relevant per-database lock while searching. Thus, this actually removes the former race condition that required an assumption that storing to MyProc->databaseId is atomic. It's been like this for a long time, so back-patch to all active branches.
2015-06-05	Cope with possible failure of the oldest MultiXact to exist.	Robert Haas
	Recent commits, mainly b69bf30b9bfacafc733a9ba77c9587cf54d06c0c and 53bb309d2d5a9432d2602c93ed18e58bd2924e15, introduced mechanisms to protect against wraparound of the MultiXact member space: the number of multixacts that can exist at one time is limited to 2^32, but the total number of members in those multixacts is also limited to 2^32, and older code did not take care to enforce the second limit, potentially allowing old data to be overwritten while it was still needed. Unfortunately, these new mechanisms failed to account for the fact that the code paths in which they run might be executed during recovery or while the cluster was in an inconsistent state. Also, they failed to account for the fact that users who used pg_upgrade to upgrade a PostgreSQL version between 9.3.0 and 9.3.4 might have oldestMultiXid = 1 in the control file despite the true value being larger. To fix these problems, first, avoid unnecessarily examining the members of MultiXacts when the cluster is not known to be consistent. TruncateMultiXact has done this for a long time, and this patch does not fix that. But the new calls used to prevent member wraparound are not needed until we reach normal running, so avoid calling them earlier. (SetMultiXactIdLimit is actually called before InRecovery is set, so we can't rely on that; we invent our own multixact-specific flag instead.) Second, make failure to look up the members of a MultiXact a non-fatal error. Instead, if we're unable to determine the member offset at which wraparound would occur, postpone arming the member wraparound defenses until we are able to do so. If we're unable to determine the member offset that should force autovacuum, force it continuously until we are able to do so. If we're unable to determine the member offset at which we should truncate the members SLRU, log a message and skip truncation. 
An important consequence of these changes is that anyone who does have a bogus oldestMultiXid = 1 value in pg_control will experience immediate emergency autovacuuming when upgrading to a release that contains this fix. The release notes should highlight this fact. If a user has no pg_multixact/offsets/0000 file, but has oldestMultiXid = 1 in the control file, they may wish to vacuum any tables with relminmxid = 1 prior to upgrading in order to avoid an immediate emergency autovacuum after the upgrade. This must be done with a PostgreSQL version 9.3.5 or newer and with vacuum_multixact_freeze_min_age and vacuum_multixact_freeze_table_age set to 0. This patch also adds an additional log message at each database server startup, indicating either that protections against member wraparound have been engaged, or that they have not. In the latter case, once autovacuum has advanced oldestMultiXid to a sane value, the message indicating that the guards have been engaged will appear at the next checkpoint. A few additional messages have also been added at the DEBUG1 level so that the correct operation of this code can be properly audited. Along the way, this patch fixes another, related bug in TruncateMultiXact that has existed since PostgreSQL 9.3.0: when no MultiXacts exist at all, the truncation code looks up NextMultiXactId, which doesn't exist yet. This can lead to TruncateMultiXact removing every file in pg_multixact/offsets instead of keeping one around, as it should. This in turn will cause the database server to refuse to start afterwards. Patch by me. Review by Álvaro Herrera, Andres Freund, Noah Misch, and Thomas Munro.
2015-06-04	Second try at stabilizing query plans in rowsecurity regression test.	Tom Lane
	This reverts commit 5cdf25e16843dff33dbc2ddc02941458032e3ad4, which was almost immediately proven insufficient by the buildfarm. On second thought, the tables involved are not large enough that autovacuum or autoanalyze would notice them; what seems far more likely to be the culprit is the database-wide "vacuum analyze" in the concurrent gist test. That thing has given us one headache too many, so get rid of it in favor of targeted vacuuming of that test's own tables only.
2015-06-04	Fix brin regression test so it actually tests cidr.	Tom Lane
	The problem noted in my previous commit was simpler than I thought: we weren't getting an index plan because the column wasn't indexed.
2015-06-04	Tighten the per-operator testing done in brin regression test.	Tom Lane
	Verify that the number of matches is exactly what it should be, not just that it not be zero. This should help us detect any environment-dependent issues. Also, verify that we're getting the expected type of scan plan (either bitmap or seqscan as appropriate). Right now, this is failing on the cidrcol test cases, as shown in the output file. I'll look into that in a bit, but it seems good to commit this as-is temporarily to verify that it behaves as expected on the buildfarm.