user/sven/postgresql.git

Age	Commit message (Collapse)	Author
2012-01-01	Update copyright notices for year 2012.	Bruce Momjian

2011-10-11	Rearrange the implementation of index-only scans.	Tom Lane
	This commit changes index-only scans so that data is read directly from the index tuple without first generating a faux heap tuple. The only immediate benefit is that indexes on system columns (such as OID) can be used in index-only scans, but this is necessary infrastructure if we are ever to support index-only scans on expression indexes. The executor is now ready for that, though the planner still needs substantial work to recognize the possibility. To do this, Vars in index-only plan nodes have to refer to index columns not heap columns. I introduced a new special varno, INDEX_VAR, to mark such Vars to avoid confusion. (In passing, this commit renames the two existing special varnos to OUTER_VAR and INNER_VAR.) This allows ruleutils.c to handle them with logic similar to what we use for subplan reference Vars. Since index-only scans are now fundamentally different from regular indexscans so far as their expression subtrees are concerned, I also chose to change them to have their own plan node type (and hence, their own executor source file).
2011-09-22	Make EXPLAIN ANALYZE report the numbers of rows rejected by filter steps.	Tom Lane
	This provides information about the numbers of tuples that were visited but not returned by table scans, as well as the numbers of join tuples that were considered and discarded within a join plan node. There is still some discussion going on about the best way to report counts for outer-join situations, but I think most of what's in the patch would not change if we revise that, so I'm going to go ahead and commit it as-is. Documentation changes to follow (they weren't in the submitted patch either). Marko Tiikkaja, reviewed by Marc Cousin, somewhat revised by Tom
2011-09-16	Redesign the plancache mechanism for more flexibility and efficiency.	Tom Lane
	Rewrite plancache.c so that a "cached plan" (which is rather a misnomer at this point) can support generation of custom, parameter-value-dependent plans, and can make an intelligent choice between using custom plans and the traditional generic-plan approach. The specific choice algorithm implemented here can probably be improved in future, but this commit is all about getting the mechanism in place, not the policy. In addition, restructure the API to greatly reduce the amount of extraneous data copying needed. The main compromise needed to make that possible was to split the initial creation of a CachedPlanSource into two steps. It's worth noting in particular that SPI_saveplan is now deprecated in favor of SPI_keepplan, which accomplishes the same end result with zero data copying, and no need to then spend even more cycles throwing away the original SPIPlan. The risk of long-term memory leaks while manipulating SPIPlans has also been greatly reduced. Most of this improvement is based on use of the recently-added MemoryContextSetParent primitive.
2011-09-04	Clean up the #include mess a little.	Tom Lane
	walsender.h should depend on xlog.h, not vice versa. (Actually, the inclusion was circular until a couple hours ago, which was even sillier; but Bruce broke it in the expedient rather than logically correct direction.) Because of that poor decision, plus blind application of pgrminclude, we had a situation where half the system was depending on xlog.h to include such unrelated stuff as array.h and guc.h. Clean up the header inclusion, and manually revert a lot of what pgrminclude had done so things build again. This episode reinforces my feeling that pgrminclude should not be run without adult supervision. Inclusion changes in header files in particular need to be reviewed with great care. More generally, it'd be good if we had a clearer notion of module layering to dictate which headers can sanely include which others ... but that's a big task for another day.
2011-09-01	Remove unnecessary #include references, per pgrminclude script.	Bruce Momjian

2011-05-23	Install defenses against overflow in BuildTupleHashTable().	Tom Lane
	The planner can sometimes compute very large values for numGroups, and in cases where we have no alternative to building a hashtable, such a value will get fed directly to BuildTupleHashTable as its nbuckets parameter. There were two ways in which that could go bad. First, BuildTupleHashTable declared the parameter as "int" but most callers were passing "long"s, so on 64-bit machines undetected overflow could occur leading to a bogus negative value. The obvious fix for that is to change the parameter to "long", which is what I've done in HEAD. In the back branches that seems a bit risky, though, since third-party code might be calling this function. So for them, just put in a kluge to treat negative inputs as INT_MAX. Second, hash_create can go nuts with extremely large requested table sizes (notably, my_log2 becomes an infinite loop for inputs larger than LONG_MAX/2). What seems most appropriate to avoid that is to bound the initial table size request to work_mem. This fixes bug #6035 reported by Daniel Schreiber. Although the reported case only occurs back to 8.4 since it involves WITH RECURSIVE, I think it's a good idea to install the defenses in all supported branches.
2011-04-10	pgindent run before PG 9.1 beta 1.	Bruce Momjian

2011-03-24	Fix handling of collation in SQL-language functions.	Tom Lane
	Ensure that parameter symbols receive collation from the function's resolved input collation, and fix inlining to behave properly. BTW, this commit lays about 90% of the infrastructure needed to support use of argument names in SQL functions. Parsing of parameters is now done via the parser-hook infrastructure ... we'd just need to supply a column-ref hook ...
2011-02-27	Refactor the executor's API to support data-modifying CTEs better.	Tom Lane
	The originally committed patch for modifying CTEs didn't interact well with EXPLAIN, as noted by myself, and also had corner-case problems with triggers, as noted by Dean Rasheed. Those problems show it is really not practical for ExecutorEnd to call any user-defined code; so split the cleanup duties out into a new function ExecutorFinish, which must be called between the last ExecutorRun call and ExecutorEnd. Some Asserts have been added to these functions to help verify correct usage. It is no longer necessary for callers of the executor to call AfterTriggerBeginQuery/AfterTriggerEndQuery for themselves, as this is now done by ExecutorStart/ExecutorFinish respectively. If you really need to suppress that and do it for yourself, pass EXEC_FLAG_SKIP_TRIGGERS to ExecutorStart. Also, refactor portal commit processing to allow for the possibility that PortalDrop will invoke user-defined code. I think this is not actually necessary just yet, since the portal-execution-strategy logic forces any non-pure-SELECT query to be run to completion before we will consider committing. But it seems like good future-proofing.
2011-02-25	Support data-modifying commands (INSERT/UPDATE/DELETE) in WITH.	Tom Lane
	This patch implements data-modifying WITH queries according to the semantics that the updates all happen with the same command counter value, and in an unspecified order. Therefore one WITH clause can't see the effects of another, nor can the outer query see the effects other than through the RETURNING values. And attempts to do conflicting updates will have unpredictable results. We'll need to document all that. This commit just fixes the code; documentation updates are waiting on author. Marko Tiikkaja and Hitoshi Harada
2011-02-21	Remove ExecRemoveJunk(), which is no longer used anywhere.	Tom Lane
	This was a leftover from the pre-8.1 design of junkfilters. It doesn't seem to have any reason to live, since it's merely a combination of two easy function calls, and not a well-designed combination at that (it encourages callers to leak the result tuple).
2011-02-20	Implement an API to let foreign-data wrappers actually be functional.	Tom Lane
	This commit provides the core code and documentation needed. A contrib module test case will follow shortly. Shigeru Hanada, Jan Urbanski, Heikki Linnakangas
2011-01-12	Fix PlanRowMark/ExecRowMark structures to handle inheritance correctly.	Tom Lane
	In an inherited UPDATE/DELETE, each target table has its own subplan, because it might have a column set different from other targets. This means that the resjunk columns we add to support EvalPlanQual might be at different physical column numbers in each subplan. The EvalPlanQual rewrite I did for 9.0 failed to account for this, resulting in possible misbehavior or even crashes during concurrent updates to the same row, as seen in a recent report from Gordon Shannon. Revise the data structure so that we track resjunk column numbers separately for each subplan. I also chose to move responsibility for identifying the physical column numbers back to executor startup, instead of assuming that numbers derived during preprocess_targetlist would stay valid throughout subsequent massaging of the plan. That's a bit slower, so we might want to consider undoing it someday; but it would complicate the patch considerably and didn't seem justifiable in a bug fix that has to be back-patched to 9.0.
2011-01-01	Stamp copyrights for year 2011.	Bruce Momjian

2010-12-30	Move symbols for ExecMergeJoin's state machine into nodeMergejoin.c.	Tom Lane
	There's no reason for these values to be known anywhere else. After doing this, executor/execdefs.h is vestigial and can be removed.
2010-12-30	Support RIGHT and FULL OUTER JOIN in hash joins.	Tom Lane
	This is advantageous first because it allows us to hash the smaller table regardless of the outer-join type, and second because hash join can be more flexible than merge join in dealing with arbitrary join quals in a FULL join. For merge join all the join quals have to be mergejoinable, but hash join will work so long as there's at least one hashjoinable qual --- the others can be any condition. (This is true essentially because we don't keep per-inner-tuple match flags in merge join, while hash join can do so.) To do this, we need a has-it-been-matched flag for each tuple in the hashtable, not just one for the current outer tuple. The key idea that makes this practical is that we can store the match flag in the tuple's infomask, since there are lots of bits there that are of no interest for a MinimalTuple. So we aren't increasing the size of the hashtable at all for the feature. To write this without turning the hash code into even more of a pile of spaghetti than it already was, I rewrote ExecHashJoin in a state-machine style, similar to ExecMergeJoin. Other than that decision, it was pretty straightforward.
2010-12-27	Fix failure of executor/hashjoin.h to compile standalone.	Tom Lane
	Noted while experimenting with cpluspluscheck.
2010-12-02	Create core infrastructure for KNNGIST.	Tom Lane
	This is a heavily revised version of builtin_knngist_core-0.9. The ordering operators are no longer mixed in with actual quals, which would have confused not only humans but significant parts of the planner. Instead, ordering operators are carried separately throughout planning and execution. Since the API for ambeginscan and amrescan functions had to be changed anyway, this commit takes the opportunity to rationalize that a bit. RelationGetIndexScan no longer forces a premature index_rescan call; instead, callers of index_beginscan must call index_rescan too. Aside from making the AM-side initialization logic a bit less peculiar, this has the advantage that we do not make a useless extra am_rescan call when there are runtime key values. AMs formerly could not assume that the key values passed to amrescan were actually valid; now they can. Teodor Sigaev and Tom Lane
2010-10-14	Support MergeAppend plans, to allow sorted output from append relations.	Tom Lane
	This patch eliminates the former need to sort the output of an Append scan when an ordered scan of an inheritance tree is wanted. This should be particularly useful for fast-start cases such as queries with LIMIT. Original patch by Greg Stark, with further hacking by Hans-Jurgen Schonig, Robert Haas, and Tom Lane.
2010-09-20	Remove cvs keywords from all files.	Magnus Hagander

2010-07-22	Centralize DML permissions-checking logic.	Robert Haas
	Remove bespoke code in DoCopy and RI_Initial_Check, which now instead fabricate call ExecCheckRTPerms with a manufactured RangeTblEntry. This is intended to make it feasible for an enhanced security provider to actually make use of ExecutorCheckPerms_hook, but also has the advantage that RI_Initial_Check can allow use of the fast-path when column-level but not table-level permissions are present. KaiGai Kohei. Reviewed (in an earlier version) by Stephen Frost, and by me. Some further changes to the comments by me.
2010-07-12	Make NestLoop plan nodes pass outer-relation variables into their inner	Tom Lane
	relation using the general PARAM_EXEC executor parameter mechanism, rather than the ad-hoc kluge of passing the outer tuple down through ExecReScan. The previous method was hard to understand and could never be extended to handle parameters coming from multiple join levels. This patch doesn't change the set of possible plans nor have any significant performance effect, but it's necessary infrastructure for future generalization of the concept of an inner indexscan plan. ExecReScan's second parameter is now unused, so it's removed.
2010-07-09	Add a hook in ExecCheckRTPerms().	Robert Haas
	This hook allows a loadable module to gain control when table permissions are checked. It is expected to be used by an eventual SE-PostgreSQL implementation, but there are other possible applications as well. A sample contrib module can be found in the archives at: http://archives.postgresql.org/pgsql-hackers/2010-05/msg01095.php Robert Haas and Stephen Frost
2010-02-26	pgindent run for 9.0	Bruce Momjian

2010-02-08	Remove old-style VACUUM FULL (which was known for a little while as	Tom Lane
	VACUUM FULL INPLACE), along with a boatload of subsidiary code and complexity. Per discussion, the use case for this method of vacuuming is no longer large enough to justify maintaining it; not to mention that we don't wish to invest the work that would be needed to make it play nicely with Hot Standby. Aside from the code directly related to old-style VACUUM FULL, this commit removes support for certain WAL record types that could only be generated within VACUUM FULL, redirect-pointer removal in heap_page_prune, and nontransactional generation of cache invalidation sinval messages (the last being the sticking point for Hot Standby). We still have to retain all code that copes with finding HEAP_MOVED_OFF and HEAP_MOVED_IN flag bits on existing tuples. This can't be removed as long as we want to support in-place update from pre-9.0 databases.
2010-02-01	Augment EXPLAIN output with more details on Hash nodes.	Robert Haas
	We show the number of buckets, the number of batches (and also the original number if it has changed), and the peak space used by the hash table. Minor executor changes to track peak space used.
2010-01-08	pgBufferUsage needs PGDLLIMPORT for pg_stat_statements on Windows.	Itagaki Takahiro

2010-01-02	Update copyright for the year 2010.	Bruce Momjian

2009-12-15	Add an EXPLAIN (BUFFERS) option to show buffer-usage statistics.	Robert Haas
	This patch also removes buffer-usage statistics from the track_counts output, since this (or the global server statistics) is deemed to be a better interface to this information. Itagaki Takahiro, reviewed by Euler Taveira de Oliveira.
2009-12-14	Fix a bug introduced when set-returning SQL functions were made inline-able:	Tom Lane
	we have to cope with the possibility that the declared result rowtype contains dropped columns. This fails in 8.4, as per bug #5240. While at it, be more paranoid about inserting binary coercions when inlining. The pre-8.4 code did not really need to worry about that because it could not inline at all in any case where an added coercion could change the behavior of the function's statement. However, when inlining a SRF we allow sorting, grouping, and set-ops such as UNION. In these cases, modifying one of the targetlist entries that the sort/group/setop depends on could conceivably change the behavior of the function's statement --- so don't inline when such a case applies.
2009-12-07	Add exclusion constraints, which generalize the concept of uniqueness to	Tom Lane
	support any indexable commutative operator, not just equality. Two rows violate the exclusion constraint if "row1.col OP row2.col" is TRUE for each of the columns in the constraint. Jeff Davis, reviewed by Robert Haas
2009-11-04	Add support for invoking parser callback hooks via SPI and in cached plans.	Tom Lane
	As proof of concept, modify plpgsql to use the hooks. plpgsql is still inserting $n symbols textually, but the "back end" of the parsing process now goes through the ParamRef hook instead of using a fixed parameter-type array, and then execution only fetches actually-referenced parameters, using a hook added to ParamListInfo. Although there's a lot left to be done in plpgsql, this already cures the "if (TG_OP = 'INSERT' and NEW.foo ...)" problem, as illustrated by the changed regression test.
2009-10-26	Re-implement EvalPlanQual processing to improve its performance and eliminate	Tom Lane
	a lot of strange behaviors that occurred in join cases. We now identify the "current" row for every joined relation in UPDATE, DELETE, and SELECT FOR UPDATE/SHARE queries. If an EvalPlanQual recheck is necessary, we jam the appropriate row into each scan node in the rechecking plan, forcing it to emit only that one row. The former behavior could rescan the whole of each joined relation for each recheck, which was terrible for performance, and what's much worse could result in duplicated output tuples. Also, the original implementation of EvalPlanQual could not re-use the recheck execution tree --- it had to go through a full executor init and shutdown for every row to be tested. To avoid this overhead, I've associated a special runtime Param with each LockRows or ModifyTable plan node, and arranged to make every scan node below such a node depend on that Param. Thus, by signaling a change in that Param, the EPQ machinery can just rescan the already-built test plan. This patch also adds a prohibition on set-returning functions in the targetlist of SELECT FOR UPDATE/SHARE. This is needed to avoid the duplicate-output-tuple problem. It seems fairly reasonable since the other restrictions on SELECT FOR UPDATE are meant to ensure that there is a unique correspondence between source tuples and result tuples, which an output SRF destroys as much as anything else does.
2009-10-12	Move the handling of SELECT FOR UPDATE locking and rechecking out of	Tom Lane
	execMain.c and into a new plan node type LockRows. Like the recent change to put table updating into a ModifyTable plan node, this increases planning flexibility by allowing the operations to occur below the top level of the plan tree. It's necessary in any case to restore the previous behavior of having FOR UPDATE locking occur before ModifyTable does. This partially refactors EvalPlanQual to allow multiple rows-under-test to be inserted into the EPQ machinery before starting an EPQ test query. That isn't sufficient to fix EPQ's general bogosity in the face of plans that return multiple rows per test row, though. Since this patch is mostly about getting some plan node infrastructure in place and not about fixing ten-year-old bugs, I will leave EPQ improvements for another day. Another behavioral change that we could now think about is doing FOR UPDATE before LIMIT, but that too seems like it should be treated as a followon patch.
2009-10-10	Split the processing of INSERT/UPDATE/DELETE operations out of execMain.c.	Tom Lane
	They are now handled by a new plan node type called ModifyTable, which is placed at the top of the plan tree. In itself this change doesn't do much, except perhaps make the handling of RETURNING lists and inherited UPDATEs a tad less klugy. But it is necessary preparation for the intended extension of allowing RETURNING queries inside WITH. Marko Tiikkaja
2009-10-08	Remove very ancient tuple-counting infrastructure (IncrRetrieved() and	Tom Lane
	friends). This code has all been ifdef'd out for many years, and doesn't seem to have any prospect of becoming any more useful in the future. EXPLAIN ANALYZE is what people use in practice, and I think if we did want process-wide counters we'd be more likely to put in dtrace events for that than try to resurrect this code. Get rid of it so as to have one less detail to worry about while refactoring execMain.c.
2009-09-27	Remove no-longer-needed ExecCountSlots infrastructure.	Tom Lane

2009-09-27	Replace the array-style TupleTable data structure with a simple List of	Tom Lane
	TupleTableSlot nodes. This eliminates the need to count in advance how many Slots will be needed, which seems more than worth the small increase in the amount of palloc traffic during executor startup. The ExecCountSlots infrastructure is now all dead code, but I'll remove it in a separate commit for clarity. Per a comment from Robert Haas.
2009-09-12	Rewrite the planner's handling of materialized plan types so that there is	Tom Lane
	an explicit model of rescan costs being different from first-time costs. The costing of Material nodes in particular now has some visible relationship to the actual runtime behavior, where before it was essentially fantasy. This also fixes up a couple of places where different materialized plan types were treated differently for no very good reason (probably just oversights). A couple of the regression tests are affected, because the planner now chooses to put the other relation on the inside of a nestloop-with-materialize. So far as I can see both changes are sane, and the planner is now more consistently following the expectation that it should prefer to materialize the smaller of two relations. Per a recent discussion with Robert Haas.
2009-07-29	Support deferrable uniqueness constraints.	Tom Lane
	The current implementation fires an AFTER ROW trigger for each tuple that looks like it might be non-unique according to the index contents at the time of insertion. This works well as long as there aren't many conflicts, but won't scale to massive unique-key reassignments. Improving that case is a TODO item. Dean Rasheed
2009-07-22	Change do_tup_output() to take Datum/isnull arrays instead of a char * array,	Tom Lane
	so it doesn't go through BuildTupleFromCStrings. This is more or less a wash for current uses, but will avoid inefficiency for planned changes to EXPLAIN. Robert Haas
2009-07-18	Fix error cleanup failure caused by 8.4 changes in plpgsql to try to avoid	Tom Lane
	memory leakage in error recovery. We were calling FreeExprContext, and therefore invoking ExprContextCallback callbacks, in both normal and error exits from subtransactions. However this isn't very safe, as shown in recent trouble report from Frank van Vugt, in which releasing a tupledesc refcount failed. It's also unnecessary, since the resources that callbacks might wish to release should be cleaned up by other error recovery mechanisms (ie the resource owners). We only really want FreeExprContext to release memory attached to the exprcontext in the error-exit case. So, add a bool parameter to FreeExprContext to tell it not to call the callbacks. A more general solution would be to pass the isCommit bool parameter on to the callbacks, so they could do only safe things during error exit. But that would make the patch significantly more invasive and possibly break third-party code that registers ExprContextCallback callbacks. We might want to do that later in HEAD, but for now I'll just do what seems reasonable to back-patch.
2009-06-11	8.4 pgindent run, with new combined Linux/FreeBSD/MinGW typedef list	Bruce Momjian
	provided by Andrew.
2009-03-30	Fix an oversight in the support for storing/retrieving "minimal tuples" in	Tom Lane
	TupleTableSlots. We have functions for retrieving a minimal tuple from a slot after storing a regular tuple in it, or vice versa; but these were implemented by converting the internal storage from one format to the other. The problem with that is it invalidates any pass-by-reference Datums that were already fetched from the slot, since they'll be pointing into the just-freed version of the tuple. The known problem cases involve fetching both a whole-row variable and a pass-by-reference value from a slot that is fed from a tuplestore or tuplesort object. The added regression tests illustrate some simple cases, but there may be other failure scenarios traceable to the same bug. Note that the added tests probably only fail on unpatched code if it's built with --enable-cassert; otherwise the bug leads to fetching from freed memory, which will not have been overwritten without additional conditions. Fix by allowing a slot to contain both formats simultaneously; which turns out not to complicate the logic much at all, if anything it seems less contorted than before. Back-patch to 8.2, where minimal tuples were introduced.
2009-03-21	Optimize multi-batch hash joins when the outer relation has a nonuniform	Tom Lane
	distribution, by creating a special fast path for the (first few) most common values of the outer relation. Tuples having hashvalues matching the MCVs are effectively forced to be in the first batch, so that we never write them out to the batch temp files. Bryce Cutt and Ramon Lawrence, with some editorialization by me.
2009-01-21	Add new SPI_OK_REWRITTEN return code to SPI_execute and friends, for the	Heikki Linnakangas
	case that the command is rewritten into another type of command. The old behavior to return the command tag of the last executed command was pretty surprising. In PL/pgSQL, for example, it meant that if a command was rewritten to a utility statement, FOUND wasn't set at all.
2009-01-07	Insert conditional SPI_push/SPI_pop calls into InputFunctionCall,	Tom Lane
	OutputFunctionCall, and friends. This allows SPI-using functions to invoke datatype I/O without concern for the possibility that a SPI-using function will be called (which could be either the I/O function itself, or a function used in a domain check constraint). It's a tad ugly, but not nearly as ugly as what'd be needed to make this work via retail insertion of push/pop operations in all the PLs. This reverts my patch of 2007-01-30 that inserted some retail SPI_push/pop calls into plpgsql; that approach only fixed plpgsql, and not any other PLs. But the other PLs have the issue too, as illustrated by a recent gripe from Christian Schröder. Back-patch to 8.2, which is as far back as this solution will work. It's also as far back as we need to worry about the domain-constraint case, since earlier versions did not attempt to check domain constraints within datatype input. I'm not aware of any old I/O functions that use SPI themselves, so this should be sufficient for a back-patch.
2009-01-07	Fix executor/spi.h to follow our usual conventions for include files, ie,	Tom Lane
	not include postgres.h nor anything else it doesn't directly need. Add #includes to calling files as needed to compensate. Per my proposal of yesterday. This should be noted as a source code change in the 8.4 release notes, since it's likely to require changes in add-on modules.
2009-01-02	Include a pointer to the query's source text in QueryDesc structs. This is	Tom Lane
	practically free given prior 8.4 changes in plancache and portal management, and it makes it a lot easier for ExecutorStart/Run/End hooks to get at the query text. Extracted from Itagaki Takahiro's pg_stat_statements patch, with minor editorialization.