user/sven/postgresql.git

Age	Commit message (Collapse)	Author
2013-01-15	On second thought, use an empty string instead of "none" when not connected.	Heikki Linnakangas
	"none" could mislead to think that you're connected a database with that name. Also, it needs to be translated, which might be hard without some context. So in back-branches, use empty string, so that the message is (currently ""), which is at least unambiguous and doens't require translation. In master, it's no problem to add translatable strings, so use a different fix there.
2013-01-15	Don't pass NULL to fprintf, if not currently connected to a database.	Heikki Linnakangas
	Backpatch all the way to 8.3. Fixes bug #7811, per report and diagnosis by Meng Qingzhong.
2013-01-14	Reject out-of-range dates in to_date().	Tom Lane
	Dates outside the supported range could be entered, but would not print reasonably, and operations such as conversion to timestamp wouldn't behave sanely either. Since this has the potential to result in undumpable table data, it seems worth back-patching. Hitoshi Harada
2013-01-14	Add new timezone abbrevation "FET".	Tom Lane
	This seems to have been invented in 2011 to represent GMT+3, non daylight savings rules, as now used in Europe/Kaliningrad and Europe/Minsk. There are no conflicts so might as well add it to the Default list. Per bug #7804 from Ruslan Izmaylov.
2013-01-12	Extend and improve use of EXTRA_REGRESS_OPTS.	Andrew Dunstan
	This is now used by ecpg tests, and not clobbered by pg_upgrade tests. This change won't affect anything that doesn't set this environment variable, but will enable the buildfarm to control exactly what port regression test installs will be running on, and thus to detect possible rogue postmasters more easily. Backpatch to release 9.2 where EXTRA_REGRESS_OPTS was first used.
2013-01-11	Revert ill-considered change of index-size fudge factor.	Tom Lane
	This partially reverts commit 21a39de5809cd3050a37d2554323cc1d0cbeed9d, restoring the pre-9.2 cost estimates for index usage. That change introduced much too large a bias against larger indexes, as per reports from Jeff Janes and others. The whole thing needs a rewrite, which I've done in HEAD, but the safest thing to do in 9.2 is just to undo this multiplier change.
2013-01-09	Properly install ecpg_compat and pgtypes libraries on msvc	Magnus Hagander
	JiangGuiqing
2013-01-08	Fix potential corruption of lock table in CREATE/DROP INDEX CONCURRENTLY.	Tom Lane
	If VirtualXactLock() has to wait for a transaction that holds its VXID lock as a fast-path lock, it must first convert the fast-path lock to a regular lock. It failed to take the required "partition" lock on the main shared-memory lock table while doing so. This is the direct cause of the assert failure in GetLockStatusData() recently observed in the buildfarm, but more worryingly it could result in arbitrary corruption of the shared lock table if some other process were concurrently engaged in modifying the same partition of the lock table. Fortunately, VirtualXactLock() is only used by CREATE INDEX CONCURRENTLY and DROP INDEX CONCURRENTLY, so the opportunities for failure are fewer than they might have been. In passing, improve some comments and be a bit more consistent about order of operations.
2013-01-04	Invent a "one-shot" variant of CachedPlans for better performance.	Tom Lane
	SPI_execute() and related functions create a CachedPlan, execute it once, and immediately discard it, so that the functionality offered by plancache.c is of no value in this code path. And performance measurements show that the extra data copying and invalidation checking done by plancache.c slows down simple queries by 10% or more compared to 9.1. However, enough of the SPI code is shared with functions that do need plan caching that it seems impractical to bypass plancache.c altogether. Instead, let's invent a variant version of cached plans that preserves 99% of the API but doesn't offer any of the actual functionality, nor the overhead. This puts SPI_execute() performance back on par, or maybe even slightly better, than it was before. This change should resolve recent complaints of performance degradation from Dong Ye, Pavel Stehule, and others. By avoiding data copying, this change also reduces the amount of memory needed to execute many-statement SPI_execute() strings, as for instance in a recent complaint from Tomas Vondra. An additional benefit of this change is that multi-statement SPI_execute() query strings are now processed fully serially, that is we complete execution of earlier statements before running parse analysis and planning on following ones. This eliminates a long-standing POLA violation, in that DDL that affects the behavior of a later statement will now behave as expected. Back-patch to 9.2, since this was a performance regression compared to 9.1. (In 9.2, place the added struct fields so as to avoid changing the offsets of existing fields.) Heikki Linnakangas and Tom Lane
2013-01-03	Tolerate timeline switches while "pg_basebackup -X fetch" is running.	Heikki Linnakangas
	If you take a base backup from a standby server with "pg_basebackup -X fetch", and the timeline switches while the backup is being taken, the backup used to fail with an error "requested WAL segment %s has already been removed". This is because the server-side code that sends over the required WAL files would not construct the WAL filename with the correct timeline after a switch. Fix that by using readdir() to scan pg_xlog for all the WAL segments in the range, regardless of timeline. Also, include all timeline history files in the backup, if taken with "-X fetch". That fixes another related bug: If a timeline switch happened just before the backup was initiated in a standby, the WAL segment containing the initial checkpoint record contains WAL from the older timeline too. Recovery will not accept that without a timeline history file that lists the older timeline. Backpatch to 9.2. Versions prior to that were not affected as you could not take a base backup from a standby before 9.2.
2012-12-30	Keep timeline history files restored from archive in pg_xlog.	Heikki Linnakangas
	The cascading standby patch in 9.2 changed the way WAL files are treated when restored from the archive. Before, they were restored under a temporary filename, and not kept in pg_xlog, but after the patch, they were copied under pg_xlog. This is necessary for a cascading standby to find them, but it also means that if the archive goes offline and a standby is restarted, it can recover back to where it was using the files in pg_xlog. It also means that if you take an offline backup from a standby server, it includes all the required WAL files in pg_xlog. However, the same change was not made to timeline history files, so if the WAL segment containing the checkpoint record contains a timeline switch, you will still get an error if you try to restart recovery without the archive, or recover from an offline backup taken from the standby. With this patch, timeline history files restored from archive are copied into pg_xlog like WAL files are, so that pg_xlog contains all the files required to recover. This is a corner-case pre-existing issue in 9.2, but even more important in master where it's possible for a standby to follow a timeline switch through streaming replication. To make that possible, the timeline history files must be present in pg_xlog.
2012-12-24	Fix some minor issues in view pretty-printing.	Tom Lane
	Code review for commit 2f582f76b1945929ff07116cd4639747ce9bb8a1: don't use a static variable for what ought to be a deparse_context field, fix non-multibyte-safe test for spaces, avoid useless and potentially O(N^2) (though admittedly with a very small constant) calculations of wrap positions when we aren't going to wrap.
2012-12-23	Prevent failure when RowExpr or XmlExpr is parse-analyzed twice.	Tom Lane
	transformExpr() is required to cope with already-transformed expression trees, for various ugly-but-not-quite-worth-cleaning-up reasons. However, some of its newer subroutines hadn't gotten the memo. This accounts for bug #7763 from Norbert Buchmuller: transformRowExpr() was overwriting the previously determined type of a RowExpr during CREATE TABLE LIKE INCLUDING INDEXES. Additional investigation showed that transformXmlExpr had the same kind of problem, but all the other cases seem to be safe. Andres Freund and Tom Lane
2012-12-21	Fix race condition if a file is removed while pg_basebackup is running.	Heikki Linnakangas
	If a relation file was removed when the server-side counterpart of pg_basebackup was just about to open it to send it to the client, you'd get a "could not open file" error. Fix that. Backpatch to 9.1, this goes back to when pg_basebackup was introduced.
2012-12-20	Fix grammatical mistake in error message	Peter Eisentraut

2012-12-20	Fix pg_extension_config_dump() to handle update cases more sanely.	Tom Lane
	If pg_extension_config_dump() is executed again for a table already listed in the extension's extconfig, the code was blindly making a new array entry. This does not seem useful. Fix it to replace the existing array entry instead, so that it's possible for extension update scripts to alter the filter conditions for configuration tables. In addition, teach ALTER EXTENSION DROP TABLE to check for an extconfig entry for the target table, and remove it if present. This is not a 100% solution because it's allowed for an extension update script to just summarily DROP a member table, and that code path doesn't go through ExecAlterExtensionContentsStmt. We could probably make that case clean things up if we had to, but it would involve sticking a very ugly wart somewhere in the guts of dependency.c. Since on the whole it seems quite unlikely that extension updates would want to remove pre-existing configuration tables, making the case possible with an explicit command seems sufficient. Per bug #7756 from Regina Obe. Back-patch to 9.1 where extensions were introduced.
2012-12-20	Fix recycling of WAL segments after changing recovery target timeline.	Heikki Linnakangas
	After the recovery target timeline is changed, we would still recycle and preallocate WAL segments on the old target timeline. Those WAL segments created for the old timeline are a waste of space, although otherwise harmless. The problem is that when installing a recycled WAL segment as a future one, ThisTimeLineID is used to construct the filename. ThisTimeLineID is initialized in the checkpointer process to the recovery target timeline at startup, but it was not updated when the startup process chooses a new target timeline (recovery_target_timeline='latest'). To fix, always update ThisTimeLineID before recycling WAL segments at a restartpoint. This still leaves a small window where we might install WAL segments under wrong timeline ID, if the target timeline is changed just as we're about to start recycling. Also, when we're not on the target timeline yet, but still replaying some older timeline, we'll install WAL segments to the newer timeline anyway and they will still go wasted. We'll just live with the waste in that situation. Commit to 9.2 and 9.1. Older versions didn't change recovery target timeline after startup, and for master, I'll commit a slightly different variant of this.
2012-12-19	Check if we've reached end-of-backup point also if no redo is required.	Heikki Linnakangas
	If you restored from a backup taken from a standby, and the last record in the backup is the checkpoint record, ie. there is no redo required except for the checkpoint record, we would fail to notice that we've reached the end-of-backup point, and the database is consistent. The result was an error "WAL ends before end of online backup". To fix, move the have-we-reached-end-of-backup check into CheckRecoveryConsistency(), which is already responsible for similar checks with minRecoveryPoint, and is called in the right places. Backpatch to 9.2, this check and bug did not exist before that.
2012-12-17	Fix failure to ignore leftover temp tables after a server crash.	Tom Lane
	During crash recovery, we remove disk files belonging to temporary tables, but the system catalog entries for such tables are intentionally not cleaned up right away. Instead, the first backend that uses a temp schema is expected to clean out any leftover objects therein. This approach requires that we be careful to ignore leftover temp tables (since any actual access attempt would fail), even if their BackendId matches our session, if we have not yet established use of the session's corresponding temp schema. That worked fine in the past, but was broken by commit debcec7dc31a992703911a9953e299c8d730c778 which incorrectly removed the rd_islocaltemp relcache flag. Put it back, and undo various changes that substituted tests like "rel->rd_backend == MyBackendId" for use of a state-aware flag. Per trouble report from Heikki Linnakangas. Back-patch to 9.1 where the erroneous change was made. In the back branches, be careful to add rd_islocaltemp in a spot in the struct that was alignment padding before, so as not to break existing add-on code.
2012-12-16	Fix filling of postmaster.pid in bootstrap/standalone mode.	Tom Lane
	We failed to ever fill the sixth line (LISTEN_ADDR), which caused the attempt to fill the seventh line (SHMEM_KEY) to fail, so that the shared memory key never got added to the file in standalone mode. This has been broken since we added more content to our lock files in 9.1. To fix, tweak the logic in CreateLockFile to add an empty LISTEN_ADDR line in standalone mode. This is a tad grotty, but since that function already knows almost everything there is to know about the contents of lock files, it doesn't seem that it's any better to hack it elsewhere. It's not clear how significant this bug really is, since a standalone backend should never have any children and thus it seems not critical to be able to check the nattch count of the shmem segment externally. But I'm going to back-patch the fix anyway. This problem had escaped notice because of an ancient (and in hindsight pretty dubious) decision to suppress LOG-level messages by default in standalone mode; so that the elog(LOG) complaint in AddToDataDirLockFile that should have warned of the problem didn't do anything. Fixing that is material for a separate patch though.
2012-12-16	Properly copy fmgroids.h after clean on Win32	Magnus Hagander
	Craig Ringer
2012-12-12	In multi-insert, don't go into infinite loop on a huge tuple and fillfactor.	Heikki Linnakangas
	If a tuple is larger than page size minus space reserved for fillfactor, heap_multi_insert would never find a page that it fits in and repeatedly ask for a new page from RelationGetBufferForTuple. If a tuple is too large to fit on any page, taking fillfactor into account, RelationGetBufferForTuple will always expand the relation. In a normal insert, heap_insert will accept that and put the tuple on the new page. heap_multi_insert, however, does a fillfactor check of its own, and doesn't accept the newly-extended page RelationGetBufferForTuple returns, even though there is no other choice to make the tuple fit. Fix that by making the logic in heap_multi_insert more like the heap_insert logic. The first tuple is always put on the page RelationGetBufferForTuple gives us, and the fillfactor check is only applied to the subsequent tuples. Report from David Gould, although I didn't use his patch.
2012-12-11	Add defenses against integer overflow in dynahash numbuckets calculations.	Tom Lane
	The dynahash code requires the number of buckets in a hash table to fit in an int; but since we calculate the desired hash table size dynamically, there are various scenarios where we might calculate too large a value. The resulting overflow can lead to infinite loops, division-by-zero crashes, etc. I (tgl) had previously installed some defenses against that in commit 299d1716525c659f0e02840e31fbe4dea3, but that covered only one call path. Moreover it worked by limiting the request size to work_mem, but in a 64-bit machine it's possible to set work_mem high enough that the problem appears anyway. So let's fix the problem at the root by installing limits in the dynahash.c functions themselves. Trouble report and patch by Jeff Davis.
2012-12-11	Consistency check should compare last record replayed, not last record read.	Heikki Linnakangas
	EndRecPtr is the last record that we've read, but not necessarily yet replayed. CheckRecoveryConsistency should compare minRecoveryPoint with the last replayed record instead. This caused recovery to think it's reached consistency too early. Now that we do the check in CheckRecoveryConsistency correctly, we have to move the call of that function to after redoing a record. The current place, after reading a record but before replaying it, is wrong. In particular, if there are no more records after the one ending at minRecoveryPoint, we don't enter hot standby until one extra record is generated and read by the standby, and CheckRecoveryConsistency is called. These two bugs conspired to make the code appear to work correctly, except for the small window between reading the last record that reaches minRecoveryPoint, and replaying it. In the passing, rename recoveryLastRecPtr, which is the last record replayed, to lastReplayedEndRecPtr. This makes it slightly less confusing with replayEndRecPtr, which is the last record read that we're about to replay. Original report from Kyotaro HORIGUCHI, further diagnosis by Fujii Masao. Backpatch to 9.0, where Hot Standby subtly changed the test from "minRecoveryPoint < EndRecPtr" to "minRecoveryPoint <= EndRecPtr". The former works because where the test is performed, we have always read one more record than we've replayed.
2012-12-11	Add mode where contrib installcheck runs each module in a separately named ↵	Andrew Dunstan
	database. Normally each module is tested in a database named contrib_regression, which is dropped and recreated at the beginhning of each pg_regress run. This new mode, enabled by adding USE_MODULE_DB=1 to the make command line, runs most modules in a database with the module name embedded in it. This will make testing pg_upgrade on clusters with the contrib modules a lot easier. Second attempt at this, this time accomodating make versions older than 3.82. Still to be done: adapt to the MSVC build system. Backpatch to 9.0, which is the earliest version it is reasonably possible to test upgrading from.
2012-12-10	Update minimum recovery point on truncation.	Heikki Linnakangas
	If a file is truncated, we must update minRecoveryPoint. Once a file is truncated, there's no going back; it would not be safe to stop recovery at a point earlier than that anymore. Per report from Kyotaro HORIGUCHI. Backpatch to 8.4. Before that, minRecoveryPoint was not updated during recovery at all.
2012-12-09	Fix assorted bugs in privileges-for-types patch.	Tom Lane
	Commit 729205571e81b4767efc42ad7beb53663e08d1ff added privileges on data types, but there were a number of oversights. The implementation of default privileges for types missed a few places, and pg_dump was utterly innocent of the whole concept. Per bug #7741 from Nathan Alden, and subsequent wider investigation.
2012-12-05	Fix intermittent crash in DROP INDEX CONCURRENTLY.	Tom Lane
	When deleteOneObject closes and reopens the pg_depend relation, we must see to it that the relcache pointer held by the calling function (typically performMultipleDeletions) is updated. Usually the relcache entry is retained so that the pointer value doesn't change, which is why the problem had escaped notice ... but after a cache flush event there's no guarantee that the same memory will be reassigned. To fix, change the recursive functions' APIs so that we pass around a "Relation *" not just "Relation". Per investigation of occasional buildfarm failures. This is trivial to reproduce with -DCLOBBER_CACHE_ALWAYS, which points up the sad lack of any buildfarm member running that way on a regular basis.
2012-12-05	Ensure recovery pause feature doesn't pause unless users can connect.	Tom Lane
	If we're not in hot standby mode, then there's no way for users to connect to reset the recoveryPause flag, so we shouldn't pause. The code was aware of this but the test to see if pausing was safe was seriously inadequate: it wasn't paying attention to reachedConsistency, and besides what it was testing was that we could legally enter hot standby, not that we have done so. Get rid of that in favor of checking LocalHotStandbyActive, which because of the coding in CheckRecoveryConsistency is tantamount to checking that we have told the postmaster to enter hot standby. Also, move the recoveryPausesHere() call that reacts to asynchronous recoveryPause requests so that it's not in the middle of application of a WAL record. I put it next to the recoveryStopsHere() call --- in future those are going to need to interact significantly, so this seems like a good waystation. Also, don't bother trying to read another WAL record if we've already decided not to continue recovery. This was no big deal when the code was written originally, but now that reading a record might entail actions like fetching an archive file, it seems a bit silly to do it like that. Per report from Jeff Janes and subsequent discussion. The pause feature needs quite a lot more work, but this gets rid of some indisputable bugs, and seems safe enough to back-patch.
2012-12-05	Must not reach consistency before XLOG_BACKUP_RECORD	Simon Riggs
	When waiting for an XLOG_BACKUP_RECORD the minRecoveryPoint will be incorrect, so we must not declare recovery as consistent before we have seen the record. Major bug allowing recovery to end too early in some cases, allowing people to see inconsistent db. This patch to HEAD and 9.2, other fix required for 9.1 and 9.0 Simon Riggs and Andres Freund, bug report by Jeff Janes
2012-12-04	Include isinf.o in libecpg if isinf() is not available on the system.	Michael Meskes
	Patch done by Jiang Guiqing <jianggq@cn.fujitsu.com>.
2012-12-03	Stamp 9.2.2.REL9_2_2	Tom Lane

2012-12-03	Revert "Add mode where contrib installcheck runs each module in a separately ↵	Andrew Dunstan
	named database." This reverts commit 30248be9635e844918c2873ae7bb598f8a487e49.
2012-12-03	Avoid holding vmbuffer pin after VACUUM.	Simon Riggs
	During VACUUM if we pause to perform a cycle of index cleanup we drop the vmbuffer pin, so we should do the same thing when heap scan completes. This avoids holding vmbuffer pin across the main index cleanup in VACUUM, which could be minutes or hours longer than necessary for correctness. Bug report and suggested fix from Pavan Deolasee
2012-12-03	Translation updates	Peter Eisentraut

2012-12-02	Add mode where contrib installcheck runs each module in a separately named ↵	Andrew Dunstan
	database. Normally each module is tested in aq database named contrib_regression, which is dropped and recreated at the beginhning of each pg_regress run. This mode, enabled by adding USE_MODULE_DB=1 to the make command line, runs most modules in a database with the module name embedded in it. This will make testing pg_upgrade on clusters with the contrib modules a lot easier. Still to be done: adapt to the MSVC build system. Backpatch to 9.0, which is the earliest version it is reasonably possible to test upgrading from.
2012-12-02	Update time zone data files to tzdata release 2012j.	Tom Lane
	DST law changes in Cuba, Israel, Jordan, Libya, Palestine, Western Samoa, and portions of Brazil.
2012-12-02	Don't advance checkPoint.nextXid near the end of a checkpoint sequence.	Tom Lane
	This reverts commit c11130690d6dca64267201a169cfb38c1adec5ef in favor of actually fixing the problem: namely, that we should never have been modifying the checkpoint record's nextXid at this point to begin with. The nextXid should match the state as of the checkpoint's logical WAL position (ie the redo point), not the state as of its physical position. It's especially bogus to advance it in some wal_levels and not others. In any case there is no need for the checkpoint record to carry the same nextXid shown in the XLOG_RUNNING_XACTS record just emitted by LogStandbySnapshot, as any replay operation will already have adopted that value as current. This fixes bug #7710 from Tarvi Pillessaar, and probably also explains bug #6291 from Daniel Farina, in that if a checkpoint were in progress at the instant of XID wraparound, the epoch bump would be lost as reported. (And, of course, these days there's at least a 50-50 chance of a checkpoint being in progress at any given instant.) Diagnosed by me and independently by Andres Freund. Back-patch to all branches supporting hot standby.
2012-12-02	XidEpoch++ if wraparound during checkpoint.	Simon Riggs
	If wal_level = hot_standby we update the checkpoint nextxid, though in the case where a wraparound occurred half-way through a checkpoint we would neglect updating the epoch also. Updating the nextxid is arguably the wrong thing to do, but changing that may introduce subtle bugs into hot standby startup, while updating the value doesn't cause any known bugs yet. Minimal fix now to HEAD and backbranches, wider fix later in HEAD. Bug reported in #6291 by Daniel Farina and slightly differently in Cause analysis and recommended fixes from Tom Lane and Andres Freund. Applied patch is minimal version of Andres Freund's work.
2012-12-02	Fix psql crash while parsing SQL file whose encoding is different from	Tatsuo Ishii
	client encoding and the client encoding is not safe one. Such an example is, file encoding is UTF-8 and client encoding SJIS. Patch contributed by Jiang Guiqing.
2012-12-01	Prevent passing gmake's environment variables down through pg_regress.	Tom Lane
	When we do "make install" to create a temp installation, we don't want that instance of make to try to communicate with any instance of make that might be calling us. This is known to cause problems if the upper make has a -jN flag, and in principle could cause problems even without that. Unset the relevant environment variables to prevent such issues. Andres Freund
2012-12-01	Make sure sharedir/extension/ directory is created when needed.	Tom Lane
	The previous coding worked as long as MODULEDIR wasn't set explicitly, because we create sharedir/$(datamoduledir) and the default value of that is "extension". But if some other value is specified for MODULEDIR then the installation directory needed for the control file wasn't made. Cédric Villemain
2012-11-30	Change test ExceptionalCondition to return void	Alvaro Herrera
	Commit 81107282a changed it in assert.c, but overlooked this other file.
2012-11-30	Add missing buffer lock acquisition in GetTupleForTrigger().	Tom Lane
	If we had not been holding buffer pin continuously since the tuple was initially fetched by the UPDATE or DELETE query, it would be possible for VACUUM or a page-prune operation to move the tuple while we're trying to copy it. This would result in a garbage "old" tuple value being passed to an AFTER ROW UPDATE or AFTER ROW DELETE trigger. The preconditions for this are somewhat improbable, and the timing constraints are very tight; so it's not so surprising that this hasn't been reported from the field, even though the bug has been there a long time. Problem found by Andres Freund. Back-patch to all active branches.
2012-11-29	Produce a more useful error message for over-length Unix socket paths.	Tom Lane
	The length of a socket path name is constrained by the size of struct sockaddr_un, and there's not a lot we can do about it since that is a kernel API. However, it would be a good thing if we produced an intelligible error message when the user specifies a socket path that's too long --- and getaddrinfo's standard API is too impoverished to do this in the natural way. So insert explicit tests at the places where we construct a socket path name. Now you'll get an error that makes sense and even tells you what the limit is, rather than something generic like "Non-recoverable failure in name resolution". Per trouble report from Jeremy Drake and a fix idea from Andrew Dunstan.
2012-11-29	Cleanup VirtualXact at end of Hot Standby	Simon Riggs
	Resolves bug 7572 reported by Daniele Varrazzo
2012-11-29	Correctly init fast path fields on PGPROC	Simon Riggs

2012-11-29	When processing nested structure pointer variables ecpg always expected an	Michael Meskes
	array datatype which of course is wrong. Applied patch by Muhammad Usama <m.usama@gmail.com> to fix this.
2012-11-29	Fix assorted bugs in CREATE/DROP INDEX CONCURRENTLY.	Tom Lane
	Commit 8cb53654dbdb4c386369eb988062d0bbb6de725e, which introduced DROP INDEX CONCURRENTLY, managed to break CREATE INDEX CONCURRENTLY via a poor choice of catalog state representation. The pg_index state for an index that's reached the final pre-drop stage was the same as the state for an index just created by CREATE INDEX CONCURRENTLY. This meant that the (necessary) change to make RelationGetIndexList ignore about-to-die indexes also made it ignore freshly-created indexes; which is catastrophic because the latter do need to be considered in HOT-safety decisions. Failure to do so leads to incorrect index entries and subsequently wrong results from queries depending on the concurrently-created index. To fix, make the final state be indisvalid = true and indisready = false, which is otherwise nonsensical. This is pretty ugly but we can't add another column without forcing initdb, and it's too late for that in 9.2. (There's a cleaner fix in HEAD.) In addition, change CREATE/DROP INDEX CONCURRENTLY so that the pg_index flag changes they make without exclusive lock on the index are made via heap_inplace_update() rather than a normal transactional update. The latter is not very safe because moving the pg_index tuple could result in concurrent SnapshotNow scans finding it twice or not at all, thus possibly resulting in index corruption. This is a pre-existing bug in CREATE INDEX CONCURRENTLY, which was copied into the DROP code. In addition, fix various places in the code that ought to check to make sure that the indexes they are manipulating are valid and/or ready as appropriate. These represent bugs that have existed since 8.2, since a failed CREATE INDEX CONCURRENTLY could leave a corrupt or invalid index behind, and we ought not try to do anything that might fail with such an index. Also fix RelationReloadIndexInfo to ensure it copies all the pg_index columns that are allowed to change after initial creation. Previously we could have been left with stale values of some fields in an index relcache entry. It's not clear whether this actually had any user-visible consequences, but it's at least a bug waiting to happen. In addition, do some code and docs review for DROP INDEX CONCURRENTLY; some cosmetic code cleanup but mostly addition and revision of comments. Portions of this need to be back-patched even further, but I'll work on that separately. Problem reported by Amit Kapila, diagnosis by Pavan Deolasee, fix by Tom Lane and Andres Freund.
2012-11-28	If we don't have a backup-end-location, don't claim we've reached it.	Heikki Linnakangas
	This was apparently a typo, which caused recovery to think that it immediately reached the end of backup, and allowed the database to start up too early. Reported by Jeff Janes. Backpatch to 9.2, where this code was introduced.