user/sven/postgresql.git

Age	Commit message (Collapse)	Author
2011-10-31	Stop btree indexscans upon reaching nulls in either direction.	Tom Lane
	The existing scan-direction-sensitive tests were overly complex, and failed to stop the scan in cases where it's perfectly legitimate to do so. Per bug #6278 from Maksym Boguk. Back-patch to 8.3, which is as far back as the patch applies easily. Doesn't seem worth sweating over a relatively minor performance issue in 8.2 at this late date. (But note that this was a performance regression from 8.1 and before, so 8.2 is being left as an outlier.)
2011-09-16	Avoid unnecessary page-level SSI lock check in heap_insert().	Tom Lane
	As observed by Heikki, we need not conflict on heap page locks during an insert; heap page locks are only aggregated tuple locks, they don't imply locking "gaps" as index page locks do. So we can avoid some unnecessary conflicts, and also do the SSI check while not holding exclusive lock on the target buffer. Kevin Grittner, reviewed by Jeff Davis. Back-patch to 9.1.
2011-09-16	gistendscan() forgot to free so->giststate.	Tom Lane
	This oversight led to a massive memory leak --- upwards of 10KB per tuple --- during creation-time verification of an exclusion constraint based on a GIST index. In most other scenarios it'd just be a leak of 10KB that would be recovered at end of query, so not too significant; though perhaps the leak would be noticeable in a situation where a GIST index was being used in a nestloop inner indexscan. In any case, it's a real leak of long standing, so patch all supported branches. Per report from Harald Fuchs.
2011-09-05	Adjust translator comment format to xgettext expectations	Alvaro Herrera

2011-09-05	Mark some untranslatable messages with errmsg_internal	Alvaro Herrera

2011-08-17	If backup-end record is not seen, and we reach end of recovery from a	Heikki Linnakangas
	streamed backup, throw an error and refuse to start up. The restore has not finished correctly in that case and the data directory is possibly corrupt. We already errored out in case of archive recovery, but could not during crash recovery because we couldn't distinguish between the case that pg_start_backup() was called and the database then crashed (must not error, data is OK), and the case that we're restoring from a backup and not all the needed WAL was replayed (data can be corrupt). To distinguish those cases, add a line to backup_label to indicate whether the backup was taken with pg_start/stop_backup(), or by streaming (ie. pg_basebackup). This is a different implementation than what I committed to 9.2 a week ago. That implementation was not back-patchable because it required re-initdb. Fujii Masao
2011-08-16	Preserve toast value OIDs in toast-swap-by-content for CLUSTER/VACUUM FULL.	Tom Lane
	This works around the problem that a catalog cache entry might contain a toast pointer that we try to dereference just as a VACUUM FULL completes on that catalog. We will see the sinval message on the cache entry when we acquire lock on the toast table, but by that point we've already told tuptoaster.c "here's the pointer to fetch", so it's difficult from a code structural standpoint to update the pointer before we use it. Much less painful to ensure that toast pointers are not invalidated in the first place. We have to add a bit of code to deal with the case that a value that previously wasn't toasted becomes so; but that should be a seldom-exercised corner case, so the inefficiency shouldn't be significant. Back-patch to 9.0. In prior versions, we didn't allow CLUSTER on system catalogs, and VACUUM FULL didn't result in reassignment of toast OIDs, so there was no problem.
2011-08-16	Fix race condition in relcache init file invalidation.	Tom Lane
	The previous code tried to synchronize by unlinking the init file twice, but that doesn't actually work: it leaves a window wherein a third process could read the already-stale init file but miss the SI messages that would tell it the data is stale. The result would be bizarre failures in catalog accesses, typically "could not read block 0 in file ..." later during startup. Instead, hold RelCacheInitLock across both the unlink and the sending of the SI messages. This is more straightforward, and might even be a bit faster since only one unlink call is needed. This has been wrong since it was put in (in 2002!), so back-patch to all supported releases.
2011-08-10	Back-patch assorted latch-related fixes.	Tom Lane
	Fix a whole bunch of signal handlers that had been hacked to do things that might change errno, without adding the necessary save/restore logic for errno. Also make some minor fixes in unix_latch.c, and clean up bizarre and unsafe scheme for disowning the process's latch. While at it, rename the PGPROC latch field to procLatch for consistency with 9.2. Issues noted while reviewing a patch by Peter Geoghegan.
2011-08-09	Measure WaitLatch's timeout parameter in milliseconds, not microseconds.	Tom Lane
	The original definition had the problem that timeouts exceeding about 2100 seconds couldn't be specified on 32-bit machines. Milliseconds seem like sufficient resolution, and finer grain than that would be fantasy anyway on many platforms. Back-patch to 9.1 so that this aspect of the latch API won't change between 9.1 and later releases. Peter Geoghegan
2011-07-15	Fix two ancient bugs in GiST code to re-find a parent after page split:	Heikki Linnakangas
	First, when following a right-link, we incorrectly marked the current page as the parent of the right sibling. In reality, the parent of the right page is the same as the parent of the current page (or some page to the right of it, gistFindCorrectParent() will sort that out). Secondly, when we follow a right-link, we must prepend, not append, the right page to our list of pages to visit. That's because we assume that once we hit a leaf page in the list, all the rest are leaf pages too, and give up. To hit these bugs, you need concurrent actions and several unlucky accidents. Another backend must split the root page, while you're in process of splitting a lower-level page. Furthermore, while you scan the internal nodes to re-find the parent, another backend needs to again split some more internal pages. Even then, the bugs don't necessarily manifest as user-visible errors or index corruption. While we're at it, make the error reporting a bit better if gistFindPath() fails to re-find the parent. It used to be an assertion, but an elog() seems more appropriate. Backpatch to all supported branches.
2011-07-02	Unify spelling of "canceled", "canceling", "cancellation"	Peter Eisentraut
	We had previously (af26857a2775e7ceb0916155e931008c2116632f) established the U.S. spellings as standard.
2011-06-29	Restore correct btree preprocessing of "indexedcol IS NULL" conditions.	Tom Lane
	Such a condition is unsatisfiable in combination with any other type of btree-indexable condition (since we assume btree operators are always strict). 8.3 and 8.4 had an explicit test for this, which I removed in commit 29c4ad98293e3c5cb3fcdd413a3f4904efff8762, mistakenly thinking that the case would be subsumed by the more general handling of IS (NOT) NULL added in that patch. Put it back, and improve the comments about it, and add a regression test case. Per bug #6079 from Renat Nasyrov, and analysis by Dean Rasheed.
2011-06-29	Move the PredicateLockRelation() call from nodeSeqscan.c to heapam.c. It's	Heikki Linnakangas
	more consistent that way, since all the other PredicateLock* calls are made in various heapam.c and index AM functions. The call in nodeSeqscan.c was unnecessarily aggressive anyway, there's no need to try to lock the relation every time a tuple is fetched, it's enough to do it once. This has the user-visible effect that if a seq scan is initialized in the executor, but never executed, we now acquire the predicate lock on the heap relation anyway. We could avoid that by taking the lock on the first heap_getnext() call instead, but it doesn't seem worth the trouble given that it feels more natural to do it in heap_beginscan(). Also, remove the retail PredicateLockTuple() calls from heap_getnext(). In a seqscan, started with heap_begin(), we're holding a whole-relation predicate lock on the heap so there's no need to lock the tuples individually. Kevin Grittner and me
2011-06-27	Reduce impact of btree page reuse on Hot Standby by fixing off-by-1 error.	Simon Riggs
	WAL records of type XLOG_BTREE_REUSE_PAGE were generated using a latestRemovedXid one higher than actually needed because xid used was page opaque->btpo.xact rather than an actually removed xid. Noticed on an otherwise quiet system by Noah Misch. Noah Misch and Simon Riggs
2011-06-22	Message style and spelling improvements	Peter Eisentraut

2011-06-16	pgindent run of recent SSI changes. Also, remove an unnecessary #include.	Heikki Linnakangas
	Kevin Grittner
2011-06-16	Respect Hot Standby controls while recycling btree index pages.	Simon Riggs
	Btree pages were recycled after VACUUM deletes all records on a page and then a subsequent VACUUM occurs after the RecentXmin horizon is reached. Using RecentXmin meant that we did not respond correctly to the user controls provide to avoid Hot Standby conflicts and so spurious conflicts could be generated in some workload combinations. We now reuse pages only when we reach RecentGlobalXmin, which can be much later in the presence of long running queries and is also controlled by vacuum_defer_cleanup_age and hot_standby_feedback. Noah Misch and Simon Riggs
2011-06-15	Make non-MVCC snapshots exempt from predicate locking. Scans with non-MVCC	Heikki Linnakangas
	snapshots, like in REINDEX, are basically non-transactional operations. The DDL operation itself might participate in SSI, but there's separate functions for that. Kevin Grittner and Dan Ports, with some changes by me.
2011-06-14	Oops, forgot to change the order of entries in 2PC callback arrays when I	Heikki Linnakangas
	renumbered the resource managers. This should fix the buildfarm..
2011-06-10	Work around gcc 4.6.0 bug that breaks WAL replay.	Tom Lane
	ReadRecord's habit of using both direct references to tmpRecPtr and references to *RecPtr (which is pointing at tmpRecPtr) triggers an optimization bug in gcc 4.6.0, which apparently has forgotten about aliasing rules. Avoid the compiler bug, and make the code more readable to boot, by getting rid of the direct references. Improve the comments while at it. Back-patch to all supported versions, in case they get built with 4.6.0. Tom Lane, with some cosmetic suggestions from Alex Hunsaker
2011-06-09	Pgindent run before 9.1 beta2.	Bruce Momjian

2011-05-31	Protect GIST logic that assumes penalty values can't be negative.	Tom Lane
	Apparently sane-looking penalty code might return small negative values, for example because of roundoff error. This will confuse places like gistchoose(). Prevent problems by clamping negative penalty values to zero. (Just to be really sure, I also made it force NaNs to zero.) Back-patch to all supported branches. Alexander Korotkov
2011-05-30	The row-version chaining in Serializable Snapshot Isolation was still wrong.	Heikki Linnakangas
	On further analysis, it turns out that it is not needed to duplicate predicate locks to the new row version at update, the lock on the version that the transaction saw as visible is enough. However, there was a different bug in the code that checks for dangerous structures when a new rw-conflict happens. Fix that bug, and remove all the row-version chaining related code. Kevin Grittner & Dan Ports, with some comment editorialization by me.
2011-05-19	Spell checking and markup refinement	Peter Eisentraut

2011-05-12	Fix assorted typos	Alvaro Herrera

2011-05-11	Shut down WAL receiver if it's still running at end of recovery. We used to	Heikki Linnakangas
	just check that it's not running and PANIC if it was, but that can rightfully happen if recovery stops at recovery target.
2011-05-06	Move RegisterPredicateLockingXid() call to a safer place.	Tom Lane
	The SSI patch inserted a call of RegisterPredicateLockingXid into GetNewTransactionId, which was a bad idea on a couple of grounds. First, it's not necessary to hold XidGenLock while manipulating that shared memory, and doing so is bad because XidGenLock is a high-contention lock that should be held for as short a time as possible. (Not to mention that it adds an entirely unnecessary deadlock hazard, since we must take SerializableXactHashLock as well.) Second, the specific place where it was put was between extending CLOG and advancing nextXid, which could result in unpleasant behavior in case of a failure there. Pull the call out to AssignTransactionId, which is much safer and arguably better from a modularity standpoint too. There is more work to do to clean up the failure-before-advancing-nextXid issue, but that is a separate change that will need to be back-patched. So for the moment I just want to make GetNewTransactionId look the same as it did in prior versions.
2011-04-25	Fix SSI-related assertion failure.	Robert Haas
	Bug #5899, reported by Marko Tiikkaja. Heikki Linnakangas, reviewed by Kevin Grittner and Dan Ports.
2011-04-23	Hash indexes had better pass the index collation to support functions, too.	Tom Lane
	Per experimentation with contrib/citext, whose hash function assumes that it'll be passed a collation.
2011-04-22	Make GIN and GIST pass the index collation to all their support functions.	Tom Lane
	Experimentation with contrib/btree_gist shows that the majority of the GIST support functions potentially need collation information. Safest policy seems to be to pass it to all of them, instead of making assumptions about which ones could possibly need it.
2011-04-22	Make a code-cleanup pass over the collations patch.	Tom Lane
	This patch is almost entirely cosmetic --- mostly cleaning up a lot of neglected comments, and fixing code layout problems in places where the patch made lines too long and then pgindent did weird things with that. I did find a bug-of-omission in equalTupleDescs().
2011-04-18	recoveryStopsHere() must check the resource manager ID.	Robert Haas
	Before commit c016ce728139be95bb0dc7c4e5640507334c2339, this wasn't needed, but now that multiple resource manager IDs can percolate down through here, we have to make sure we know which one we've got. Otherwise, we can confuse (for example) an XLOG_XACT_COMMIT record with an XLOG_CHECKPOINT_SHUTDOWN record. Review by Jaime Casanova
2011-04-16	Add an Assert that indexam.c isn't used on an index awaiting reindexing.	Tom Lane
	This might have caught the recent embarrassment over trying to modify pg_index while its indexes were being rebuilt. Noah Misch
2011-04-13	Revert the patch to check if we've reached end-of-backup also when doing	Heikki Linnakangas
	crash recovery, and throw an error if not. hubert depesz lubaczewski pointed out that that situation also happens in the crash recovery following a system crash that happens during an online backup. We might want to do something smarter in 9.1, like put the check back for backups taken with pg_basebackup, but that's for another patch.
2011-04-12	Pass collations to functions in FunctionCallInfoData, not FmgrInfo.	Tom Lane
	Since collation is effectively an argument, not a property of the function, FmgrInfo is really the wrong place for it; and this becomes critical in cases where a cached FmgrInfo is used for varying purposes that might need different collation settings. Fix by passing it in FunctionCallInfoData instead. In particular this allows a clean fix for bug #5970 (record_cmp not working). This requires touching a bit more code than the original method, but nobody ever thought that collations would not be an invasive patch...
2011-04-11	Clean up most -Wunused-but-set-variable warnings from gcc 4.6	Peter Eisentraut
	This warning is new in gcc 4.6 and part of -Wall. This patch cleans up most of the noise, but there are some still warnings that are trickier to remove.
2011-04-10	pgindent run before PG 9.1 beta 1.	Bruce Momjian

2011-04-08	Tweak collation setup for GIN index comparison functions.	Tom Lane
	Honor index column's collation spec if there is one, don't go to the expense of calling get_typcollation when we can reasonably assume that all GIN storage types will use default collation, and be sure to set a collation for the comparePartialFn too.
2011-04-07	Revise the API for GUC variable assign hooks.	Tom Lane
	The previous functions of assign hooks are now split between check hooks and assign hooks, where the former can fail but the latter shouldn't. Aside from being conceptually clearer, this approach exposes the "canonicalized" form of the variable value to guc.c without having to do an actual assignment. And that lets us fix the problem recently noted by Bernd Helmle that the auto-tune patch for wal_buffers resulted in bogus log messages about "parameter "wal_buffers" cannot be changed without restarting the server". There may be some speed advantage too, because this design lets hook functions avoid re-parsing variable values when restoring a previous state after a rollback (they can store a pre-parsed representation of the value instead). This patch also resolves a longstanding annoyance about custom error messages from variable assign hooks: they should modify, not appear separately from, guc.c's own message about "invalid parameter value".
2011-04-04	Avoid assuming there will be only 3 states for synchronous_commit.	Simon Riggs
	Also avoid hardcoding the current default state by giving it the name "on" and replace with a meaningful name that reflects its behaviour. Coding only, no change in behaviour.
2011-04-04	Merge synchronous_replication setting into synchronous_commit.	Robert Haas
	This means one less thing to configure when setting up synchronous replication, and also avoids some ambiguity around what the behavior should be when the settings of these variables conflict. Fujii Masao, with additional hacking by me.
2011-03-31	Improve error message when WAL ends before reaching end of online backup.	Heikki Linnakangas

2011-03-30	Check that we've reached end-of-backup also when we're not performing	Heikki Linnakangas
	archive recovery. It's possible to restore an online backup without recovery.conf, by simply copying all the necessary WAL files to pg_xlog. "pg_basebackup -x" does that too. That's the use case where this cross-check is useful. Backpatch to 9.0. We used to do this in earlier versins, but in 9.0 the code was inadvertently changed so that the check is only performed after archive recovery. Fujii Masao.
2011-03-26	Clean up cruft around collation initialization for tupdescs and scankeys.	Tom Lane
	I found actual bugs in GiST and plpgsql; the rest of this is cosmetic but meant to decrease the odds of future bugs of omission.
2011-03-23	Minor changes to recovery pause behaviour.	Simon Riggs
	Change location LOG message so it works each time we pause, not just for final pause. Ensure that we pause only if we are in Hot Standby and can connect to allow us to run resume function. This change supercedes the code to override parameter recoveryPauseAtTarget to false if not attempting to enter Hot Standby, which is now removed.
2011-03-23	Prevent intermittent hang in recovery from bgwriter interaction.	Simon Riggs
	Startup process waited for cleanup lock but when hot_standby = off the pid was not registered, so that the bgwriter would not wake the waiting process as intended.
2011-03-21	When two base backups are started at the same time with pg_basebackup,	Heikki Linnakangas
	ensure that they use different checkpoints as the starting point. We use the checkpoint redo location as a unique identifier for the base backup in the end-of-backup record, and in the backup history file name. Bug spotted by Fujii Masao.
2011-03-19	Revise collation derivation method and expression-tree representation.	Tom Lane
	All expression nodes now have an explicit output-collation field, unless they are known to only return a noncollatable data type (such as boolean or record). Also, nodes that can invoke collation-aware functions store a separate field that is the collation value to pass to the function. This avoids confusion that arises when a function has collatable inputs and noncollatable output type, or vice versa. Also, replace the parser's on-the-fly collation assignment method with a post-pass over the completed expression tree. This allows us to use a more complex (and hopefully more nearly spec-compliant) assignment rule without paying for it in extra storage in every expression node. Fix assorted bugs in the planner's handling of collations by making collation one of the defining properties of an EquivalenceClass and by converting CollateExprs into discardable RelabelType nodes during expression preprocessing.
2011-03-18	Remove bogus semicolons in recoveryPausesHere.	Robert Haas
	Without this, the startup process goes into a tight loop, consuming 100% of one CPU and failing to respond to interrupts.