user/sven/postgresql.git

Age	Commit message (Collapse)	Author
2014-01-09	Remove unnecessary local variables to work around an icc optimization bug.	Tom Lane
	Buildfarm member dunlin has been crashing since commit 8b49a60, but other machines seem fine with that code. It turns out that removing the local variables in ordered_set_startup() that are copies of fields in "qstate" dodges the problem. This might cost a few cycles on register-rich machines, but it's probably a wash on others, and in any case this code isn't performance-critical. Thanks to Jeremy Drake for off-list investigation.
2014-01-09	Changed regression test to ecpg test suite for alignment problem just with last	Michael Meskes
	commit.
2014-01-09	Fix descriptor output in ECPG.	Michael Meskes
	While working on most platforms the old way sometimes created alignment problems. This should fix it. Also the regresion tests were updated to test for the reported case. Report and fix by MauMau <maumau307@gmail.com>
2014-01-09	Refactor checking whether we've reached the recovery target.	Heikki Linnakangas
	Makes the replay loop slightly more readable, by separating the concerns of whether to stop and whether to delay, and how to extract the timestamp from a record. This has the user-visible change that the timestamp of the last applied record is now updated after actually applying it. Before, it was updated just before applying it. That meant that pg_last_xact_replay_timestamp() could return the timestamp of a commit record that is in process of being replayed, but not yet applied. Normally the difference is small, but if min_recovery_apply_delay is set, there could be a significant delay between reading a record and applying it. Another behavioral change is that if you recover to a restore point, we stop after the restore point record, not before it. It makes no difference as far as running queries on the server is concerned, as applying a restore point record changes nothing, but if examine the timeline history you will see that the new timeline branched off just after the restore point record, not before it. One practical consequence is that if you do PITR to the new timeline, and set recovery target to the same named restore point again, it will find and stop recovery at the same restore point. Conceptually, I think it makes more sense to consider the restore point as part of the new timeline's history than not. In principle, setting the last-replayed timestamp before actually applying the record was a bug all along, but it doesn't seem worth the risk to backpatch, since min_recovery_apply_delay was only added in 9.4.
2014-01-09	pgcrypto: Make header files stand alone	Peter Eisentraut
	pgp.h used to require including mbuf.h and px.h first. Include those in pgp.h, so that it can be used without prerequisites. Remove mbuf.h inclusions in .c files where mbuf.h features are not used directly. (px.h was always used.)
2014-01-08	We don't need to include pg_sema.h in s_lock.h anymore.	Tom Lane
	Minor improvement to commit daa7527afc2274432094ebe7ceb03aa41f916607: s_lock.h no longer has any need to mention PGSemaphoreData, so we can rip out the #include that supplies that. In a non-HAVE_SPINLOCKS build, this doesn't really buy much since we still need the #include in spin.h --- but everywhere else, this reduces #include footprint by some trifle, and helps keep the different locking facilities separate.
2014-01-08	Fix "cannot accept a set" error when only some arms of a CASE return a set.	Tom Lane
	In commit c1352052ef1d4eeb2eb1d822a207ddc2d106cb13, I implemented an optimization that assumed that a function's argument expressions would either always return a set (ie multiple rows), or always not. This is wrong however: we allow CASE expressions in which some arms return a set of some type and others just return a scalar of that type. There may be other examples as well. To fix, replace the run-time test of whether an argument returned a set with a static precheck (expression_returns_set). This adds a little bit of query startup overhead, but it seems barely measurable. Per bug #8228 from David Johnston. This has been broken since 8.0, so patch all supported branches.
2014-01-08	Reduce the number of semaphores used under --disable-spinlocks.	Robert Haas
	Instead of allocating a semaphore from the operating system for every spinlock, allocate a fixed number of semaphores (by default, 1024) from the operating system and multiplex all the spinlocks that get created onto them. This could self-deadlock if a process attempted to acquire more than one spinlock at a time, but since processes aren't supposed to execute anything other than short stretches of straight-line code while holding a spinlock, that shouldn't happen. One motivation for this change is that, with the introduction of dynamic shared memory, it may be desirable to create spinlocks that last for less than the lifetime of the server. Without this change, attempting to use such facilities under --disable-spinlocks would quickly exhaust any supply of available semaphores. Quite apart from that, it's desirable to contain the quantity of semaphores needed to run the server simply on convenience grounds, since using too many may make it harder to get PostgreSQL running on a new platform, which is mostly the point of --disable-spinlocks in the first place. Patch by me; review by Tom Lane.
2014-01-08	Fix pause_at_recovery_target + recovery_target_inclusive combination.	Heikki Linnakangas
	If pause_at_recovery_target is set, recovery pauses before applying the target record, even if recovery_target_inclusive is set. If you then continue with pg_xlog_replay_resume(), it will apply the target record before ending recovery. In other words, if you log in while it's paused and verify that the database looks OK, ending recovery changes its state again, possibly destroying data that you were tring to salvage with PITR. Backpatch to 9.1, this has been broken since pause_at_recovery_target was added.
2014-01-08	If multiple recovery_targets are specified, use the latest one.	Heikki Linnakangas
	The docs say that only one of recovery_target_xid, recovery_target_time, or recovery_target_name can be specified. But the code actually did something different, so that a name overrode time, and xid overrode both time and name. Now the target specified last takes effect, whether it's an xid, time or name. With this patch, we still accept multiple recovery_target settings, even though docs say that only one can be specified. It's a general property of the recovery.conf file parser that you if you specify the same option twice, the last one takes effect, like with postgresql.conf.
2014-01-08	Avoid extra AggCheckCallContext() checks in ordered-set aggregates.	Tom Lane
	In the transition functions, we don't really need to recheck this after the first call. I had been feeling paranoid about possibly getting a non-null argument value in some other context; but it's probably game over anyway if we have a non-null "internal" value that's not what we are expecting. In the final functions, the general convention in pre-existing final functions seems to be that an Assert() is good enough, so do it like that here too. This seems to save a few tenths of a percent of overall query runtime, which isn't much, but still it's just overhead if there's not a plausible case where the checks would fire.
2014-01-08	Save a few cycles in advance_transition_function().	Tom Lane
	Keep a pre-initialized FunctionCallInfoData in AggStatePerAggData, and re-use that at each row instead of doing InitFunctionCallInfoData each time. This saves only half a dozen assignments and maybe some stack manipulation, and yet that seems to be good for a percent or two of the overall query run time for simple aggregates such as count(*). The cost is that the FunctionCallInfoData (which is about a kilobyte, on 64-bit machines) stays allocated for the duration of the query instead of being short-lived stack data. But we're already paying an equivalent space cost for each regular FuncExpr or OpExpr node, so I don't feel bad about paying it for aggregate functions. The code seems a little cleaner this way too, since the number of things passed to advance_transition_function decreases.
2014-01-08	Fix bug in determining when recovery has reached consistency.	Heikki Linnakangas
	When starting WAL replay from an online checkpoint, the last replayed WAL record variable was initialized using the checkpoint record's location, even though the records between the REDO location and the checkpoint record had not been replayed yet. That was noted as "slightly confusing" but harmless in the comment, but in some cases, it fooled CheckRecoveryConsistency to incorrectly conclude that we had already reached a consistent state immediately at the beginning of WAL replay. That caused the system to accept read-only connections in hot standby mode too early, and also PANICs with message "WAL contains references to invalid pages". Fix by initializing the variables to the REDO location instead. In 9.2 and above, change CheckRecoveryConsistency() to use lastReplayedEndRecPtr variable when checking if backup end location has been reached. It was inconsistently using EndRecPtr for that check, but lastReplayedEndRecPtr when checking min recovery point. It made no difference before this patch, because in all the places where CheckRecoveryConsistency was called the two variables were the same, but it was always an accident waiting to happen, and would have been wrong after this patch anyway. Report and analysis by Tomonari Katsumata, bug #8686. Backpatch to 9.0, where hot standby was introduced.
2014-01-08	pg_upgrade: Fix fatal error handling	Peter Eisentraut
	Restore exiting when pg_log(PG_FATAL) is called directly instead of calling pg_fatal(). Fault introduced in 264aa14a2f687eba8c8cc2a5b6cbd6397973da98.
2014-01-07	Update copyright for 2014	Bruce Momjian
	Update all files in head, and files COPYRIGHT and legal.sgml in all back branches.
2014-01-07	Fix LATERAL references to target table of UPDATE/DELETE.	Tom Lane
	I failed to think much about UPDATE/DELETE when implementing LATERAL :-(. The implemented behavior ended up being that subqueries in the FROM or USING clause (respectively) could access the update/delete target table as though it were a lateral reference; which seems fine if they said LATERAL, but certainly ought to draw an error if they didn't. Fix it so you get a suitable error when you omit LATERAL. Per report from Emre Hasegeli.
2014-01-07	Silence compiler warning on MSVC.	Heikki Linnakangas
	MSVC doesn't know that elog(ERROR) doesn't return, and gives a warning about missing return. Silence that. Amit Kapila
2014-01-07	Move permissions check from do_pg_start_backup to pg_start_backup	Magnus Hagander
	And the same for do_pg_stop_backup. The code in do_pg_* is not allowed to access the catalogs. For manual base backups, the permissions check can be handled in the calling function, and for streaming base backups only users with the required permissions can get past the authentication step in the first place. Reported by Antonin Houska, diagnosed by Andres Freund
2014-01-07	Avoid including tablespaces inside PGDATA twice in base backups	Magnus Hagander
	If a tablespace was crated inside PGDATA it was backed up both as part of the PGDATA backup and as the backup of the tablespace. Avoid this by skipping any directory inside PGDATA that contains one of the active tablespaces. Dimitri Fontaine and Magnus Hagander
2014-01-06	Add more use of psprintf()	Peter Eisentraut

2014-01-06	Remove bogus -K option from pg_dump.	Heikki Linnakangas
	I added it to the getopt call by accident in commit 691e595dd9c7786d37d73ccd327f8c2b6f0dace6. Amit Kapila
2014-01-05	Cache catalog lookup data across groups in ordered-set aggregates.	Tom Lane
	The initial commit of ordered-set aggregates just did all the setup work afresh each time the aggregate function is started up. But in a GROUP BY query, the catalog lookups need not be repeated for each group, since the column datatypes and sort information won't change. When there are many small groups, this makes for a useful, though not huge, performance improvement. Per suggestion from Andrew Gierth. Profiling of these cases suggests that it might be profitable to avoid duplicate lookups within tuplesort startup as well; but changing the tuplesort APIs would have much broader impact, so I left that for another day.
2014-01-04	Fix translatability markings in psql, and add defenses against future bugs.	Tom Lane
	Several previous commits have added columns to various \d queries without updating their translate_columns[] arrays, leading to potentially incorrect translations in NLS-enabled builds. Offenders include commit 893686762 (added prosecdef to \df+), c9ac00e6e (added description to \dc+) and 3b17efdfd (added description to \dC+). Fix those cases back to 9.3 or 9.2 as appropriate. Since this is evidently more easily missed than one would like, in HEAD also add an Assert that the supplied array is long enough. This requires an API change for printQuery(), so it seems inappropriate for back branches, but presumably all future changes will be tested in HEAD anyway. In HEAD and 9.3, also clean up a whole lot of sloppiness in the emitted SQL for \dy (event triggers): lack of translatability due to failing to pass words-to-be-translated through gettext_noop(), inadequate schema qualification, and sloppy formatting resulting in unnecessarily ugly -E output. Peter Eisentraut and Tom Lane, per bug #8702 from Sergey Burladyan
2014-01-04	Fix header comment for bitncmp().	Tom Lane
	The result is an int less than, equal to, or greater than zero, in the style of memcmp (and, in fact, exactly the output of memcmp in some cases). This comment previously said -1, 1, or 0, which was an overspecification, as noted by Emre Hasegeli. All of the existing callers appear to be fine with the actual behavior, so just fix the comment. In passing, improve infelicitous formatting of some call sites.
2014-01-04	Fix typo in comment.	Tom Lane
	classifyClauses was renamed to classifyConditions somewhere along the line, but this comment didn't get the memo. Ian Barwick
2014-01-03	Restore some comments lost during 15732b34e8c8	Alvaro Herrera
	Michael Paquier
2014-01-02	Ooops, should use double not single quotes in StaticAssertStmt().	Tom Lane
	That's what I get for testing this on an older compiler.
2014-01-02	Fix calculation of maximum statistics-message size.	Tom Lane
	The PGSTAT_NUM_TABENTRIES macro should have been updated when new fields were added to struct PgStat_MsgTabstat in commit 644828908, but it wasn't. Fix that. Also, add a static assertion that we didn't overrun the intended size limit on stats messages. This will not necessarily catch every mistake in computing the maximum array size for stats messages, but it will catch ones that have practical consequences. (The assertion in fact doesn't complain about the aforementioned error in PGSTAT_NUM_TABENTRIES, because that was not big enough to cause the array length to increase.) No back-patch, as there's no actual bug in existing releases; this is just in the nature of future-proofing. Mark Dilger and Tom Lane
2014-01-02	Handle 5-char filenames in SlruScanDirectory	Alvaro Herrera
	Original users of slru.c were all producing 4-digit filenames, so that was all that that code was prepared to handle. Changes to multixact.c in the course of commit 0ac5ad5134f made pg_multixact/members create 5-digit filenames once a certain threshold was reached, which SlruScanDirectory wasn't prepared to deal with; in particular, 5-digit-name files were not removed during truncation. Change that routine to make it aware of those files, and have it process them just like any others. Right now, some pg_multixact/members directories will contain a mixture of 4-char and 5-char filenames. A future commit is expected fix things so that each slru.c user declares the correct maximum width for the files it produces, to avoid such unsightly mixtures. Noticed while investigating bug #8673 reported by Serge Negodyuck.
2014-01-02	Wrap multixact/members correctly during extension	Alvaro Herrera
	In the 9.2 code for extending multixact/members, the logic was very simple because the number of entries in a members page was a proper divisor of 2^32, and thus at 2^32 wraparound the logic for page switch was identical than at any other page boundary. In commit 0ac5ad5134f I failed to realize this and introduced code that was not able to go over the 2^32 boundary. Fix that by ensuring that when we reach the last page of the last segment we correctly zero the initial page of the initial segment, using correct uint32-wraparound-safe arithmetic. Noticed while investigating bug #8673 reported by Serge Negodyuck, as diagnosed by Andres Freund.
2014-01-02	Handle wraparound during truncation in multixact/members	Alvaro Herrera
	In pg_multixact/members, relying on modulo-2^32 arithmetic for wraparound handling doesn't work all that well. Because we don't explicitely track wraparound of the allocation counter for members, it is possible that the "live" area exceeds 2^31 entries; trying to remove SLRU segments that are "old" according to the original logic might lead to removal of segments still in use. To fix, have the truncation routine use a tailored SlruScanDirectory callback that keeps track of the live area in actual use; that way, when the live range exceeds 2^31 entries, the oldest segments still live will not get removed untimely. This new SlruScanDir callback needs to take care not to remove segments that are "in the future": if new SLRU segments appear while the truncation is ongoing, make sure we don't remove them. This requires examination of shared memory state to recheck for false positives, but testing suggests that this doesn't cause a problem. The original coding didn't suffer from this pitfall because segments created when truncation is running are never considered to be removable. Per Andres Freund's investigation of bug #8673 reported by Serge Negodyuck.
2014-01-02	Aggressively freeze tables when CLUSTER or VACUUM FULL rewrites them.	Robert Haas
	We haven't wanted to do this in the past on the grounds that in rare cases the original xmin value will be needed for forensic purposes, but commit 37484ad2aacef5ec794f4dd3d5cf814475180a78 removes that objection, so now we can. Per extensive discussion, among many people, on pgsql-hackers.
2014-01-02	Fix contrib/pg_upgrade to clean all the cruft made during "make check".	Tom Lane
	Although these files get cleaned up if the test runs to completion, a failure partway through leaves trash all over the floor. The Makefile ought to be bright enough to get rid of it when you say "make clean".
2014-01-01	Rename walLogHints to wal_log_hints for easier grepping.	Robert Haas
	Michael Paquier
2014-01-01	Do not use an empty hostname.	Michael Meskes
	When trying to connect to a given database libecpg should not try using an empty hostname if no hostname was given.
2013-12-30	Fix broken support for event triggers as extension members.	Tom Lane
	CREATE EVENT TRIGGER forgot to mark the event trigger as a member of its extension, and pg_dump didn't pay any attention anyway when deciding whether to dump the event trigger. Per report from Moshe Jacobson. Given the obvious lack of testing here, it's rather astonishing that ALTER EXTENSION ADD/DROP EVENT TRIGGER work, but they seem to.
2013-12-30	Fix alphabetization in catalogs.sgml.	Tom Lane
	Some recent patches seem not to have grasped the concept that the catalogs are described in alphabetical order.
2013-12-30	Remove dead code now that orindxpath.c is history.	Tom Lane
	We don't need make_restrictinfo_from_bitmapqual() anymore at all. generate_bitmap_or_paths() doesn't need to be exported, and we can drop its rather klugy restriction_only flag.
2013-12-30	Extract restriction OR clauses whether or not they are indexable.	Tom Lane
	It's possible to extract a restriction OR clause from a join clause that has the form of an OR-of-ANDs, if each sub-AND includes a clause that mentions only one specific relation. While PG has been aware of that idea for many years, the code previously only did it if it could extract an indexable OR clause. On reflection, though, that seems a silly limitation: adding a restriction clause can be a win by reducing the number of rows that have to be filtered at the join step, even if we have to test the clause as a plain filter clause during the scan. This should be especially useful for foreign tables, where the change can cut the number of rows that have to be retrieved from the foreign server; but testing shows it can win even on local tables. Per a suggestion from Robert Haas. As a heuristic, I made the code accept an extracted restriction clause if its estimated selectivity is less than 0.9, which will probably result in accepting extracted clauses just about always. We might need to tweak that later based on experience. Since the code no longer has even a weak connection to Path creation, remove orindxpath.c and create a new file optimizer/util/orclauses.c. There's some additional janitorial cleanup of now-dead code that needs to happen, but it seems like that's a fit subject for a separate commit.
2013-12-29	Don't attempt to limit target database for pg_restore.	Kevin Grittner
	There was an apparent attempt to limit the target database for pg_restore to version 7.1.0 or later. Due to a leading zero this was interpreted as an octal number, which allowed targets with version numbers down to 2.87.36. The lowest actual release above that was 6.0.0, so that was effectively the limit. Since the success of the restore attempt will depend primarily on on what statements were generated by the dump run, we don't want pg_restore trying to guess whether a given target should be allowed based on version number. Allow a connection to any version. Since it is very unlikely that anyone would be using a recent version of pg_restore to restore to a pre-6.0 database, this has little to no practical impact, but it makes the code less confusing to read. Issue reported and initial patch suggestion from Joel Jacobson based on an article by Andrey Karpov reporting on issues found by PVS-Studio static code analyzer. Final patch based on analysis by Tom Lane. Back-patch to all supported branches.
2013-12-29	Undo autoconf 2.69's attempt to #define _DARWIN_USE_64_BIT_INODE.	Tom Lane
	Defining this symbol causes OS X 10.5 to use a buggy version of readdir(), which can sometimes fail with EINVAL if the previously-fetched directory entry has been deleted or renamed. In later OS X versions that bug has been repaired, but we still don't need the #define because it's on by default. So this is just an all-around bad idea, and we can do without it.
2013-12-28	Update grammar	Peter Eisentraut
	From: Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>
2013-12-27	Fix whitespace	Peter Eisentraut

2013-12-27	Properly detect invalid JSON numbers when generating JSON.	Andrew Dunstan
	Instead of looking for characters that aren't valid in JSON numbers, we simply pass the output string through the JSON number parser, and if it fails the string is quoted. This means among other things that money and domains over money will be quoted correctly and generate valid JSON. Fixes bug #8676 reported by Anderson Cristian da Silva. Backpatched to 9.2 where JSON generation was introduced.
2013-12-27	Fix misplaced right paren bugs in pgstatfuncs.c.	Kevin Grittner
	The bug would only show up if the C sockaddr structure contained zero in the first byte for a valid address; otherwise it would fail to fail, which is probably why it went unnoticed for so long. Patch submitted by Joel Jacobson after seeing an article by Andrey Karpov in which he reports finding this through static code analysis using PVS-Studio. While I was at it I moved a definition of a local variable referenced in the buggy code to a more local context. Backpatch to all supported branches.
2013-12-26	Fix whitespace	Peter Eisentraut

2013-12-24	Fix inadequately-tested code path in tuplesort_skiptuples().	Tom Lane
	Per report from Jeff Davis.
2013-12-23	Fix ANALYZE failure on a column that's a domain over a range.	Tom Lane
	Most other range operations seem to work all right on domains, but this one not so much, at least not since commit 918eee0c. Per bug #8684 from Brett Neumeier.
2013-12-23	Revise documentation for new freezing method.	Robert Haas
	Commit 37484ad2aacef5ec794f4dd3d5cf814475180a78 invalidated a good chunk of documentation, so patch it up to reflect the new state of play. Along the way, patch remaining documentation references to FrozenXID to say instead FrozenTransactionId, so that they match the way we actually spell it in the code.
2013-12-23	Fix portability issue in ordered-set patch.	Tom Lane
	Overly compact coding in makeOrderedSetArgs() led to a platform dependency: if the compiler chose to execute the subexpressions in the wrong order, list_length() might get applied to an already-modified List, giving a value we didn't want. Per buildfarm.