| Age | Commit message | Author |
|
When RETURNING is specified, ExecDelete would return a virtual-tuple slot
that could contain pointers into an already-unpinned disk buffer. Another
process could change the buffer contents before we get around to using the
data, resulting in garbage results or even a crash. This seems of fairly
low probability, which may explain why there are no known field reports of
the problem, but it's definitely possible. Fix by forcing the result slot
to be "materialized" before we release pin on the disk buffer.
Back-patch to 9.0; in earlier branches there is no bug because
ExecProcessReturning sent the tuple to the destination immediately. Also,
this is already fixed in HEAD as part of the writable-foreign-tables patch
(where the fix is necessary for DELETE RETURNING to work at all with
postgres_fdw).
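For illustration (table name hypothetical), the class of statement affected is
a DELETE with a RETURNING list; with the fix, the returned row is copied out of
the shared buffer before the pin is dropped:

    DELETE FROM orders WHERE id = 42 RETURNING *;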
|
|
Since 9.0, the count parameter has only limited the number of tuples
actually returned by the executor. It doesn't affect the behavior of
INSERT/UPDATE/DELETE unless RETURNING is specified, because without
RETURNING, the ModifyTable plan node doesn't return control to execMain.c
for each tuple. And we only check the limit at the top level.
While this behavioral change was unintentional at the time, discussion of
bug #6572 led us to the conclusion that we prefer the new behavior anyway,
and so we should just adjust the docs to match rather than change the code.
Accordingly, do that. Back-patch as far as 9.0 so that the docs match the
code in each branch.
|
|
The dynahash code requires the number of buckets in a hash table to fit
in an int; but since we calculate the desired hash table size dynamically,
there are various scenarios where we might calculate too large a value.
The resulting overflow can lead to infinite loops, division-by-zero
crashes, etc. I (tgl) had previously installed some defenses against that
in commit 299d1716525c659f0e02840e31fbe4dea3, but that covered only one
call path. Moreover it worked by limiting the request size to work_mem,
but in a 64-bit machine it's possible to set work_mem high enough that the
problem appears anyway. So let's fix the problem at the root by installing
limits in the dynahash.c functions themselves.
Trouble report and patch by Jeff Davis.
|
|
This patch changes CREATE INDEX CONCURRENTLY so that the pg_index
flag changes it makes without exclusive lock on the index are made via
heap_inplace_update() rather than a normal transactional update. The
latter is not very safe because moving the pg_index tuple could result in
concurrent SnapshotNow scans finding it twice or not at all, thus possibly
resulting in index corruption.
In addition, fix various places in the code that ought to check to make
sure that the indexes they are manipulating are valid and/or ready as
appropriate. These represent bugs that have existed since 8.2, since
a failed CREATE INDEX CONCURRENTLY could leave a corrupt or invalid
index behind, and we ought not try to do anything that might fail with
such an index.
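As a hypothetical illustration (index and table names made up), the valid/ready
state that callers must now check is visible in pg_index after a CREATE INDEX
CONCURRENTLY that failed partway through:

    CREATE INDEX CONCURRENTLY idx_t_col ON t (col);   -- suppose this fails partway
    SELECT indexrelid::regclass, indisvalid, indisready
      FROM pg_index
     WHERE indrelid = 't'::regclass;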
Also fix RelationReloadIndexInfo to ensure it copies all the pg_index
columns that are allowed to change after initial creation. Previously we
could have been left with stale values of some fields in an index relcache
entry. It's not clear whether this actually had any user-visible
consequences, but it's at least a bug waiting to happen.
This is a subset of a patch already applied in 9.2 and HEAD. Back-patch
into all earlier supported branches.
Tom Lane and Andres Freund
|
|
When hashing a subplan like "WHERE (a, b) NOT IN (SELECT x, y FROM ...)",
findPartialMatch() attempted to match rows using the hashtable's internal
equality operators, which of course are for x and y's datatypes. What we
need to use are the potentially cross-type operators for a=x, b=y, etc.
Failure to do that leads to wrong answers or even crashes. The scope for
problems is limited to cases where we have different types with compatible
hash functions (else we'd not be using a hashed subplan), but for example
int4 vs int8 can cause the problem.
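A hypothetical query of the affected shape, where a and b are int4 but the
sub-select's x and y are int8:

    SELECT * FROM t1
     WHERE (a, b) NOT IN (SELECT x, y FROM t2);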
Per bug #7597 from Bo Jensen. This has been wrong since the hashed-subplan
code was written, so patch all the way back.
|
|
The previous coding essentially assumed that nodes would be rescanned in
the same order they were initialized in; or at least that the "leader" of
a group of CTEscans would be rescanned before any others were required to
execute. Unfortunately, that isn't even a little bit true. It's possible
to devise queries in which the leader isn't rescanned until other CTEscans
on the same CTE have run to completion, or even in which the leader never
gets a rescan call at all.
The fix makes the leader specially responsible only for initial creation
and final destruction of the tuplestore; rescan resets are now a
symmetrically shared responsibility. This means that we might reset the
tuplestore multiple times when restarting a plan subtree containing
multiple CTEscans; but resetting an already-empty tuplestore is cheap
enough that that doesn't seem like a problem.
Per report from Adam Mackler; the new regression test cases are based on
his example query.
Back-patch to 8.4 where CTE scans were introduced.
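Mackler's query is not reproduced here; a hypothetical query of the same
general shape, with several scans of one CTE that all get rescanned as part of
a larger subtree, looks like:

    WITH cte AS (SELECT generate_series(1, 3) AS g)
    SELECT * FROM outer_tbl o
     WHERE o.id IN (SELECT g FROM cte)
        OR EXISTS (SELECT 1 FROM cte WHERE cte.g = o.id + 1);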
|
|
When a whole-row Var is reading the result of a subquery, we need it to
ignore any "resjunk" columns that the subquery might have evaluated for
GROUP BY or ORDER BY purposes. We've hacked this area before, in commit
68e40998d058c1f6662800a648ff1e1ce5d99cba, but that fix only covered
whole-row Vars of named composite types, not those of RECORD type; and it
was mighty klugy anyway, since it just assumed without checking that any
extra columns in the result must be resjunk. A proper fix requires getting
hold of the subquery's targetlist so we can actually see which columns are
resjunk (whereupon we can use a JunkFilter to get rid of them). So bite
the bullet and add some infrastructure to make that possible.
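A hypothetical query of the affected form, where the subquery's ORDER BY column
is evaluated as a resjunk column that must not leak into the whole-row value:

    SELECT ss FROM (SELECT a, b FROM t ORDER BY c) AS ss;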
Per report from Andrew Dunstan and additional testing by Merlin Moncure.
Back-patch to all supported branches. In 8.3, also back-patch commit
292176a118da6979e5d368a4baf27f26896c99a5, which for some reason I had
not done at the time, but it's a prerequisite for this change.
|
|
Repeated execution of an uncorrelated ARRAY_SUBLINK sub-select (which
I think can only happen if the sub-select is embedded in a larger,
correlated subquery) would leak memory for the duration of the query,
due to not reclaiming the array generated in the previous execution.
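For illustration (table names hypothetical), an uncorrelated ARRAY() sub-select
nested inside a correlated subquery, so that the array is rebuilt once per
outer row:

    SELECT (SELECT count(*)
              FROM t2
             WHERE t2.id = t1.id
               AND t2.val = ANY (ARRAY(SELECT v FROM t3)))
      FROM t1;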
Per bug #6698 from Armando Miraglia. Diagnosis and fix idea by Heikki,
patch itself by me.
This has been like this all along, so back-patch to all supported versions.
|
|
This was never intended to be allowed, and is blocked for an ordinary
CREATE TABLE, but CREATE TABLE AS slipped through the cracks. This
commit won't do anything to fix existing cases where this loophole
has been exploited, but it still seems prudent to lock it down going
forward.
Back-branch commit only, as this problem has been refactored away
on the master branch.
Andres Freund
|
|
As previously coded, the QueryDesc's dest pointer was left dangling
(pointing at an already-freed receiver object) after ExecutorEnd. It's a
bit astonishing that it took us this long to notice, and I'm not sure that
the known problem case with SQL functions is the only one. Fix it by
saving and restoring the original receiver pointer, which seems the most
bulletproof way of ensuring any related bugs are also covered.
Per bug #6379 from Paul Ramsey. Back-patch to 8.4 where the current
handling of SELECT INTO was introduced.
|
|
Due to tuple-slot mismanagement, evaluation of WHEN conditions for AFTER
ROW UPDATE triggers could crash if there had been a BEFORE ROW trigger
fired for the same update. Fix by not trying to overload the use of
estate->es_trig_tuple_slot. Per report from Yoran Heling.
Back-patch to 9.0, when trigger WHEN conditions were introduced.
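A hypothetical setup that exercises the affected path --- a BEFORE ROW trigger
plus an AFTER ROW trigger with a WHEN condition on the same table (9.0-era
syntax, function names made up):

    CREATE TRIGGER t_before BEFORE UPDATE ON t
        FOR EACH ROW EXECUTE PROCEDURE before_func();
    CREATE TRIGGER t_after AFTER UPDATE ON t
        FOR EACH ROW WHEN (OLD.val IS DISTINCT FROM NEW.val)
        EXECUTE PROCEDURE after_func();
    UPDATE t SET val = val + 1;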
|
|
This fixes bug #6139 reported by Hitoshi Harada.
|
|
We can't allow this because such an operation stores its transaction XID
into the sequence tuple's xmax. Because VACUUM doesn't process sequences
(and we don't want it to start doing so), such an xmax value won't get
frozen, meaning it will eventually refer to nonexistent pg_clog storage,
and even wrap around completely. Since the row lock is ignored by nextval
and setval, the usefulness of the operation is highly debatable anyway.
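The now-disallowed operations, with a hypothetical sequence name:

    SELECT * FROM my_seq FOR UPDATE;   -- now rejected
    SELECT * FROM my_seq FOR SHARE;    -- likewise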
Per reports of trouble with pgpool 3.0, which had ill-advisedly started
using such commands as a form of locking.
In HEAD, also disallow SELECT FOR UPDATE/SHARE on toast tables. Although
this does work safely given the current implementation, there seems no
good reason to allow it. I refrained from changing that behavior in
back branches, however.
|
|
The planner can sometimes compute very large values for numGroups, and in
cases where we have no alternative to building a hashtable, such a value
will get fed directly to BuildTupleHashTable as its nbuckets parameter.
There were two ways in which that could go bad. First, BuildTupleHashTable
declared the parameter as "int" but most callers were passing "long"s,
so on 64-bit machines undetected overflow could occur leading to a bogus
negative value. The obvious fix for that is to change the parameter to
"long", which is what I've done in HEAD. In the back branches that seems a
bit risky, though, since third-party code might be calling this function.
So for them, just put in a kluge to treat negative inputs as INT_MAX.
Second, hash_create can go nuts with extremely large requested table sizes
(notably, my_log2 becomes an infinite loop for inputs larger than
LONG_MAX/2). What seems most appropriate to avoid that is to bound the
initial table size request to work_mem.
This fixes bug #6035 reported by Daniel Schreiber. Although the reported
case only occurs back to 8.4 since it involves WITH RECURSIVE, I think
it's a good idea to install the defenses in all supported branches.
|
|
ExecUpdate checked for whether ExecBRUpdateTriggers had returned a new
tuple value by seeing if the returned tuple was pointer-equal to the old
one. But the "old one" was in estate->es_junkFilter's result slot, which
would be scribbled on if we had done an EvalPlanQual update in response to
a concurrent update of the target tuple; therefore we were comparing a
dangling pointer to a live one. Given the right set of circumstances we
could get a false match, resulting in not forcing the tuple to be stored in
the slot we thought it was stored in. In the case reported by Maxim Boguk
in bug #5798, this led to "cannot extract system attribute from virtual
tuple" failures when trying to do "RETURNING ctid". I believe there is a
very-low-probability chance of more serious errors, such as generating
incorrect index entries based on the original rather than the
trigger-modified version of the row.
In HEAD, change all of ExecBRInsertTriggers, ExecIRInsertTriggers,
ExecBRUpdateTriggers, and ExecIRUpdateTriggers so that they continue to
have similar APIs. In the back branches I just changed
ExecBRUpdateTriggers, since there is no bug in the ExecBRInsertTriggers
case.
|
|
Flattening of subquery range tables during setrefs.c could lead to the
rangetable indexes in PlanRowMark nodes not matching up with the column
names previously assigned to the corresponding resjunk ctid (resp. tableoid
or wholerow) columns. Typical symptom would be either a "cannot extract
system attribute from virtual tuple" error or an Assert failure. This
wasn't a problem before 9.0 because we didn't support FOR UPDATE below the
top query level, and so the final flattening could never renumber an RTE
that was relevant to FOR UPDATE. Fix by using a plan-tree-wide unique
number for each PlanRowMark to label the associated resjunk columns, so
that the number need not change during flattening.
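A hypothetical query of the affected kind, with FOR UPDATE inside a subquery
that can be flattened into the parent during setrefs.c processing:

    SELECT * FROM
        (SELECT * FROM t WHERE id > 0 FOR UPDATE) AS ss
     WHERE ss.id < 100;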
Per report from David Johnston (though I'm darned if I can see how this got
past initial testing of the relevant code). Back-patch to 9.0.
|
|
maximum allowed' messages, which reported one dimension fewer than the actual count.
Alexey Klyukin
|
|
In an inherited UPDATE/DELETE, each target table has its own subplan,
because it might have a column set different from other targets. This
means that the resjunk columns we add to support EvalPlanQual might be
at different physical column numbers in each subplan. The EvalPlanQual
rewrite I did for 9.0 failed to account for this, resulting in possible
misbehavior or even crashes during concurrent updates to the same row,
as seen in a recent report from Gordon Shannon. Revise the data structure
so that we track resjunk column numbers separately for each subplan.
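A hypothetical sketch of the setup: an inheritance child with a different
column layout than its parent, updated through the parent (the concurrent
update itself requires a second session, indicated only in the comment):

    CREATE TABLE parent (id int, val text);
    CREATE TABLE child (extra int) INHERITS (parent);
    -- two sessions concurrently running a statement like this can hit
    -- the EvalPlanQual recheck path:
    UPDATE parent SET val = 'x' WHERE id = 1;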
I also chose to move responsibility for identifying the physical column
numbers back to executor startup, instead of assuming that numbers derived
during preprocess_targetlist would stay valid throughout subsequent
massaging of the plan. That's a bit slower, so we might want to consider
undoing it someday; but it would complicate the patch considerably and
didn't seem justifiable in a bug fix that has to be back-patched to 9.0.
|
|
There were corner cases in which the planner would attempt to inline such
a function, which would result in a failure at runtime due to loss of
information about exactly what the result record type is. Fix by disabling
inlining when the function's recorded result type is RECORD. There might
be some sub-cases where inlining could still be allowed, but this is a
simple and backpatchable fix, so leave refinements for another day.
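A hypothetical example of the kind of function involved: a SQL function whose
declared result type is RECORD, called with a column definition list (such a
function is no longer inlined):

    CREATE FUNCTION getrec(int) RETURNS record
        LANGUAGE sql AS $$ SELECT $1, 'foo'::text $$;

    SELECT * FROM getrec(1) AS r(a int, b text);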
Per bug #5777 from Nate Carson.
Back-patch to all supported branches. 8.1 happens to avoid a core-dump
here, but it still does the wrong thing.
|
|
returning "record" actually do have the same rowtype. This is needed because
the parser can't realistically enforce that they will all have the same typmod,
as seen in a recent example from David Wheeler.
Back-patch to 8.0, which is as far back as we have the notion of RECORD
subtypes being distinguished by typmod. Wheeler's example depends on
8.4-and-up features, but I suspect there may be ways to provoke similar
failures before 8.4.
|
|
ExecModifyTable(). This avoids memory leakage when trigger functions leave
junk behind in that context (as they more or less must). Problem and solution
identified by Dean Rasheed.
I'm a bit concerned about the longevity of this solution --- once a plan can
have multiple ModifyTable nodes, we are very possibly going to have to do
something different. But it should hold up for 9.0.
|
|
list in ExecLockRows() forgot to allow for the possibility that some of the
rowmarks are for child tables that aren't relevant to the current row.
Per report from Kenichiro Tanaka.
|
|
a pass-by-reference datatype with a nontrivial projection step.
We were using the same memory context for the projection operation as for
the temporary context used by the hashtable routines in execGrouping.c.
However, the hashtable routines feel free to reset their temp context at
any time, which'd lead to destroying input data that was still needed.
Report and diagnosis by Tao Ma.
Back-patch to 8.1, where the problem was introduced by the changes that
allowed us to work with "virtual" tuples instead of materializing intermediate
tuple values everywhere. The earlier code looks quite similar, but it doesn't
suffer the problem because the data gets copied into another context as a
result of having to materialize ExecProject's output tuple.
|
|
|
|
if we ever implement '<>' index opclasses.
Jeff Davis
|
|
is treated like end-of-input, if nulls sort last in that column and we are not
doing outer-join filling for that input. In such a case, the tuple cannot
join to anything from the other input (because we assume mergejoinable
operators are strict), and neither can any tuple following it in the sort
order. If we're not interested in doing outer-join filling we can just
pretend the tuple and its successors aren't there at all. This can save a
great deal of time in situations where there are many nulls in the join
column, as in a recent example from Scott Marlowe. Also, since the planner
tends to not count nulls in its mergejoin scan selectivity estimates, this
is an important fix to make the runtime behavior more like the estimate.
I regard this as an omission in the patch I wrote years ago to teach mergejoin
that tuples containing nulls aren't joinable, so I'm back-patching it. But
only to 8.3 --- in older versions, we didn't have a solid notion of whether
nulls sort high or low, so attempting to apply this optimization could break
things.
|
|
archival or hot standby should be WAL-logged, instead of deducing that from
other options like archive_mode. This replaces the recovery_connections GUC in
the primary, where it now has no effect, but it's still used in the standby
to enable/disable hot standby.
Remove the WAL-logging of "unlogged operations", like creating an index
without WAL-logging and fsyncing it at the end. Instead, we keep a copy of
the wal_level setting and the settings that affect how much shared memory a
hot standby server needs to track master transactions (max_connections,
max_prepared_xacts, max_locks_per_xact) in pg_control. Whenever the settings
change, at server restart, write a WAL record noting the new settings and
update pg_control. This allows us to notice the change in those settings in
the standby at the right moment. These settings used to be included in
checkpoint records, but that meant that a changed value was not reflected in
the standby until the first checkpoint after the change.
Bump PG_CONTROL_VERSION and XLOG_PAGE_MAGIC. Whack XLOG_PAGE_MAGIC back to
the sequence it used to follow, before hot standby and subsequent patches
changed it to 0x9003.
|
|
|
|
catalog entries via SearchSysCache and related operations. Although, at the
time that these callbacks are called by elog.c, we have not officially aborted
the current transaction, it still seems rather risky to initiate any new
catalog fetches. In all these cases the needed information is readily
available in the caller and so it's just a matter of a bit of extra notation
to pass it to the callback.
Per crash report from Dennis Koegel. I've concluded that the real fix for
his problem is to clear the error context stack at entry to proc_exit, but
it still seems like a good idea to make the callbacks a bit less fragile
for other cases.
Backpatch to 8.4. We could go further back, but the patch doesn't apply
cleanly. In the absence of proof that this fixes something and isn't just
paranoia, I'm not going to expend the effort.
|
|
|
|
Add some checks that seem logically necessary, in particular let's make
real sure that HS slave sessions cannot create temp tables. (If they did
they would think that temp tables belonging to the master's session with
the same BackendId were theirs. We *must* not allow myTempNamespace to
become set in a slave session.)
Change setval() and nextval() so that they are only allowed on temp sequences
in a read-only transaction. This seems consistent with what we allow for
table modifications in read-only transactions. Since an HS slave can't have a
temp sequence, this also provides a nicer cure for the setval PANIC reported
by Erik Rijkers.
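For illustration (sequence name hypothetical), calling nextval() on a
non-temporary sequence inside a read-only transaction is now rejected:

    BEGIN READ ONLY;
    SELECT nextval('my_seq');   -- fails: not allowed in a read-only transaction
    ROLLBACK;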
Make the error messages more uniform, and have them mention the specific
command being complained of. This seems worth the trifling amount of extra
code, since people are likely to see such messages a lot more than before.
|
|
being assigned to, in case the expression to be assigned is a FieldStore that
would need to modify that value. The need for this was foreseen some time
ago, but not implemented then because we did not have arrays of composites.
Now we do, but the point evidently got overlooked in that patch. Net result
is that updating a field of an array element doesn't work right, as
illustrated if you try the new regression test on an unpatched backend.
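The operation in question looks roughly like this (hypothetical type and table
names; the actual regression test differs):

    CREATE TYPE complex AS (r float8, i float8);
    CREATE TABLE t (id int, arr complex[]);
    INSERT INTO t VALUES (1, ARRAY[ROW(1,2)::complex, ROW(3,4)::complex]);
    UPDATE t SET arr[1].r = 42 WHERE id = 1;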
Noted while experimenting with EXPLAIN VERBOSE, which has also got some issues
in this area.
Backpatch to 8.3, where arrays of composites were introduced.
|
|
The purpose of this change is to eliminate the need for every caller
of SearchSysCache, SearchSysCacheCopy, SearchSysCacheExists,
GetSysCacheOid, and SearchSysCacheList to know the maximum number
of allowable keys for a syscache entry (currently 4). This will
make it far easier to increase the maximum number of keys in a
future release should we choose to do so, and it makes the code
shorter, too.
Design and review by Tom Lane.
|
|
This patch allows the frame to start from CURRENT ROW (in either RANGE or
ROWS mode), and it also adds support for ROWS n PRECEDING and ROWS n FOLLOWING
start and end points. (RANGE value PRECEDING/FOLLOWING isn't there yet ---
the grammar works, but that's all.)
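A hypothetical example using the newly supported frame clauses:

    SELECT x,
           sum(x) OVER (ORDER BY x ROWS BETWEEN 2 PRECEDING AND CURRENT ROW),
           avg(x) OVER (ORDER BY x ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING)
      FROM generate_series(1, 10) AS g(x);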
Hitoshi Harada, reviewed by Pavel Stehule
|
|
Move rd_targblock, rd_fsm_nblocks, and rd_vm_nblocks from relcache to the smgr
relation entries, so that they will get reset to InvalidBlockNumber whenever
an smgr-level flush happens. Because we now send smgr invalidation messages
immediately (not at end of transaction) when a relation truncation occurs,
this ensures that other backends will reset their values before they next
access the relation. We no longer need the unreliable assumption that a
VACUUM that's doing a truncation will hold its AccessExclusive lock until
commit --- in fact, we can intentionally release that lock as soon as we've
completed the truncation. This patch therefore reverts (most of) Alvaro's
patch of 2009-11-10, as well as my marginal hacking on it yesterday. We can
also get rid of assorted no-longer-needed relcache flushes, which are far more
expensive than an smgr flush because they kill a lot more state.
In passing this patch fixes smgr_redo's failure to perform visibility-map
truncation, and cleans up some rather dubious assumptions in freespace.c and
visibilitymap.c about when rd_fsm_nblocks and rd_vm_nblocks can be out of
date.
|
|
being called as aggregates, and to get the aggregate transition state memory
context if needed. Use it instead of poking directly into AggState and
WindowAggState in places that shouldn't know so much.
We should have done this in 8.4, probably, but better late than never.
Revised version of a patch by Hitoshi Harada.
|
|
VACUUM FULL INPLACE), along with a boatload of subsidiary code and complexity.
Per discussion, the use case for this method of vacuuming is no longer large
enough to justify maintaining it; not to mention that we don't wish to invest
the work that would be needed to make it play nicely with Hot Standby.
Aside from the code directly related to old-style VACUUM FULL, this commit
removes support for certain WAL record types that could only be generated
within VACUUM FULL, redirect-pointer removal in heap_page_prune, and
nontransactional generation of cache invalidation sinval messages (the last
being the sticking point for Hot Standby).
We still have to retain all code that copes with finding HEAP_MOVED_OFF and
HEAP_MOVED_IN flag bits on existing tuples. This can't be removed as long
as we want to support in-place update from pre-9.0 databases.
|
|
of shared or nailed system catalogs. This has two key benefits:
* The new CLUSTER-based VACUUM FULL can be applied safely to all catalogs.
* We no longer have to use an unsafe reindex-in-place approach for reindexing
shared catalogs.
CLUSTER on nailed catalogs now works too, although I left it disabled on
shared catalogs because the resulting pg_index.indisclustered update would
only be visible in one database.
Since reindexing shared system catalogs is now fully transactional and
crash-safe, the former special cases in REINDEX behavior have been removed;
shared catalogs are treated the same as non-shared.
This commit does not do anything about the recently-discussed problem of
deadlocks between VACUUM FULL/CLUSTER on a system catalog and other
concurrent queries; will address that in a separate patch. As a stopgap,
parallel_schedule has been tweaked to run vacuum.sql by itself, to avoid
such failures during the regression tests.
|
|
heap_sync() to the callers, because heap_sync() is sometimes called even
if the operation itself is WAL-logged. This eliminates the bogus unlogged
records from CLUSTER that Simon Riggs reported, patch by Fujii Masao.
|
|
We show the number of buckets, the number of batches (and also the original
number if it has changed), and the peak space used by the hash table. Minor
executor changes to track peak space used.
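A rough sketch of the new output (numbers and surrounding plan lines are made
up; exact formatting may differ):

    EXPLAIN ANALYZE SELECT * FROM a JOIN b USING (id);
    ...
      ->  Hash  (cost=... rows=... width=...) (actual time=... rows=... loops=1)
            Buckets: 1024  Batches: 1  Memory Usage: 9kB
    ...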
|
|
We need to free the OID list returned by ExecInsertIndexTuples to avoid
a query-lifespan memory leak. When many rows require rechecking, this
can be a significant leak --- it's even more than the space used for the
queued trigger events.
Dean Rasheed
|
|
This adds the CREATE TABLE name OF type command, per SQL standard.
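A hypothetical example of the new syntax:

    CREATE TYPE employee_type AS (name text, salary numeric);
    CREATE TABLE employees OF employee_type;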
|
|
This includes two new kinds of postmaster processes, walsenders and
walreceiver. Walreceiver is responsible for connecting to the primary server
and streaming WAL to disk, while walsender runs in the primary server and
streams WAL from disk to the client.
Documentation still needs work, but the basics are there. We will probably
pull the replication section to a new chapter later on, as well as the
sections describing file-based replication. But let's do that as a separate
patch, so that it's easier to see what has been added/changed. This patch
also adds a new section to the chapter about FE/BE protocol, documenting the
protocol used by walsender/walreceiver.
Bump catalog version because of two new functions,
pg_last_xlog_receive_location() and pg_last_xlog_replay_location(), for
monitoring the progress of replication.
Fujii Masao, with additional hacking by me
|
|
rowtype contains dropped columns. Sometimes the input tuple will be formed
from a select targetlist in which dropped columns are filled with a NULL
of an arbitrary type (the planner typically uses INT4, since it can't tell
what type the dropped column really was). So we need to relax the rowtype
compatibility check to not insist on physical compatibility if the actual
column value is NULL.
In principle we might need to do this for functions returning composite
types, too (see tupledesc_match()). In practice there doesn't seem to be
a bug there, probably because the function will be using the same cached
rowtype descriptor as the caller. Fixing that code path would require
significant rearrangement, so I left it alone for now.
Per complaint from Filip Rembialkowski.
|
|
extract a system column, and remove a couple of lines that are useless
in light of the fact that we aren't ever going to support this case. There
isn't much point in trying to make this work because a tuple Datum does
not carry many of the system columns. Per experimentation with a case
reported by Dean Rasheed; we'll have to fix his problem somewhere else.
|
|
someone else has just updated it, we have to set priorXmax to that tuple's
xmax (ie, the XID of the other xact that updated it) before looping back to
examine the next tuple. Obviously, the next tuple in the update chain should
have that XID as its xmin, not the same xmin as the preceding tuple that we
had been trying to lock. The mismatch would cause the EvalPlanQual logic to
decide that the tuple chain ended in a deletion, when actually there was a
live tuple that should have been found.
I inserted this error when recently adding logic to EvalPlanQual to make it
lock tuples before returning them (as opposed to the old method in which the
lock would occur much later, causing a great deal of work to be wasted if we
only then discover someone else updated it). Sigh. Per today's report from
Takahiro Itagaki of inconsistent results during pgbench runs.
|
|
Add support to pg_dump --binary-upgrade to preserve all relfilenodes,
for use by pg_migrator.
|
|
peculiar variant of UNION ALL, and so wouldn't likely get written directly
as-is, it's possible for it to arise as a result of simplification of
less-obviously-silly queries. In particular, now that we can do flattening
of subqueries that have constant outputs and are underneath an outer join,
it's possible for the case to result from simplification of queries of the
type exhibited in bug #5263. Back-patch to 8.4 to avoid a functionality
regression for this type of query.
|
|
"column < constant", and the comparison value is in the first or last
histogram bin or outside the histogram entirely, try to fetch the actual
column min or max value using an index scan (if there is an index on the
column). If successful, replace the lower or upper histogram bound with
that value before carrying on with the estimate. This limits the
estimation error caused by moving min/max values when the comparison
value is close to the min or max. Per a complaint from Josh Berkus.
It is tempting to consider using this mechanism for mergejoinscansel as well,
but that would inject index fetches into main-line join estimation, not just
endpoint cases. I'm refraining from that until we can get a better handle
on the costs of doing this type of lookup.
|
|
expressions: FormIndexDatum requires the estate's scantuple to already point
at the tuple the values are supposedly being extracted from. Adjust test
case so that this type of confusion will be exposed.
Per report from hubert depesz lubaczewski.
|