user/sven/postgresql.git

Age	Commit message (Collapse)	Author
2016-09-22	Be sure to rewind the tuplestore read pointer in non-leader CTEScan nodes.	Tom Lane
	ExecInitCteScan supposed that it didn't have to do anything to the extra tuplestore read pointer it gets from tuplestore_alloc_read_pointer. However, it needs this read pointer to be positioned at the start of the tuplestore, while tuplestore_alloc_read_pointer is actually defined as cloning the current position of read pointer 0. In normal situations that accidentally works because we initialize the whole plan tree at once, before anything gets read. But it fails in an EvalPlanQual recheck, as illustrated in bug #14328 from Dima Pavlov. To fix, just forcibly rewind the pointer after tuplestore_alloc_read_pointer. The cost of doing so is negligible unless the tuplestore is already in TSS_READFILE state, which wouldn't happen in normal cases. We could consider altering tuplestore's API to make that case cheaper, but that would make for a more invasive back-patch and it doesn't seem worth it. This has been broken probably for as long as we've had CTEs, so back-patch to all supported branches. Discussion: <32468.1474548308@sss.pgh.pa.us>
2016-08-24	Fix improper repetition of previous results from a hashed aggregate.	Tom Lane
	ExecReScanAgg's check for whether it could re-use a previously calculated hashtable neglected the possibility that the Agg node might reference PARAM_EXEC Params that are not referenced by its input plan node. That's okay if the Params are in upper tlist or qual expressions; but if one appears in aggregate input expressions, then the hashtable contents need to be recomputed when the Param's value changes. To avoid unnecessary performance degradation in the case of a Param that isn't within an aggregate input, add logic to the planner to determine which Params are within aggregate inputs. This requires a new field in struct Agg, but fortunately we never write plans to disk, so this isn't an initdb-forcing change. Per report from Jeevan Chalke. This has been broken since forever, so back-patch to all supported branches. Andrew Gierth, with minor adjustments by me Report: <CAM2+6=VY8ykfLT5Q8vb9B6EbeBk-NGuLbT6seaQ+Fq4zXvrDcA@mail.gmail.com>
2016-08-11	Fix busted Assert for CREATE MATVIEW ... WITH NO DATA.	Tom Lane
	Commit 874fe3aea changed the command tag returned for CREATE MATVIEW/CREATE TABLE AS ... WITH NO DATA, but missed that there was code in spi.c that expected the command tag to always be "SELECT". Fortunately, the consequence was only an Assert failure, so this oversight should have no impact in production builds. Since this code path was evidently un-exercised, add a regression test. Per report from Shivam Saxena. Back-patch to 9.3, like the previous commit. Michael Paquier Report: <97218716-480B-4527-B5CD-D08D798A0C7B@dresources.com>
2016-08-08	Fix two errors with nested CASE/WHEN constructs.	Tom Lane
	ExecEvalCase() tried to save a cycle or two by passing &econtext->caseValue_isNull as the isNull argument to its sub-evaluation of the CASE value expression. If that subexpression itself contained a CASE, then *isNull was an alias for econtext->caseValue_isNull within the recursive call of ExecEvalCase(), leading to confusion about whether the inner call's caseValue was null or not. In the worst case this could lead to a core dump due to dereferencing a null pointer. Fix by not assigning to the global variable until control comes back from the subexpression. Also, avoid using the passed-in isNull pointer transiently for evaluation of WHEN expressions. (Either one of these changes would have been sufficient to fix the known misbehavior, but it's clear now that each of these choices was in itself dangerous coding practice and best avoided. There do not seem to be any similar hazards elsewhere in execQual.c.) Also, it was possible for inlining of a SQL function that implements the equality operator used for a CASE comparison to result in one CASE expression's CaseTestExpr node being inserted inside another CASE expression. This would certainly result in wrong answers since the improperly nested CaseTestExpr would be caused to return the inner CASE's comparison value not the outer's. If the CASE values were of different data types, a crash might result; moreover such situations could be abused to allow disclosure of portions of server memory. To fix, teach inline_function to check for "bare" CaseTestExpr nodes in the arguments of a function to be inlined, and avoid inlining if there are any. Heikki Linnakangas, Michael Paquier, Tom Lane Report: https://github.com/greenplum-db/gpdb/pull/327 Report: <4DDCEEB8.50602@enterprisedb.com> Security: CVE-2016-5423
2016-07-26	Fix constant-folding of ROW(...) IS [NOT] NULL with composite fields.	Tom Lane
	The SQL standard appears to specify that IS [NOT] NULL's tests of field nullness are non-recursive, ie, we shouldn't consider that a composite field with value ROW(NULL,NULL) is null for this purpose. ExecEvalNullTest got this right, but eval_const_expressions did not, leading to weird inconsistencies depending on whether the expression was such that the planner could apply constant folding. Also, adjust the docs to mention that IS [NOT] DISTINCT FROM NULL can be used as a substitute test if a simple null check is wanted for a rowtype argument. That motivated reordering things so that IS [NOT] DISTINCT FROM is described before IS [NOT] NULL. In HEAD, I went a bit further and added a table showing all the comparison-related predicates. Per bug #14235. Back-patch to all supported branches, since it's certainly undesirable that constant-folding should change the semantics. Report and patch by Andrew Gierth; assorted wordsmithing and revised regression test cases by me. Report: <20160708024746.1410.57282@wrigleys.postgresql.org>
2016-05-23	Fix latent crash in do_text_output_multiline().	Tom Lane
	do_text_output_multiline() would fail (typically with a null pointer dereference crash) if its input string did not end with a newline. Such cases do not arise in our current sources; but it certainly could happen in future, or in extension code's usage of the function, so we should fix it. To fix, replace "eol += len" with "eol = text + len". While at it, make two cosmetic improvements: mark the input string const, and rename the argument from "text" to "txt" to dodge pgindent strangeness (since "text" is a typedef name). Even though this problem is only latent at present, it seems like a good idea to back-patch the fix, since it's a very simple/safe patch and it's not out of the realm of possibility that we might in future back-patch something that expects sane behavior from do_text_output_multiline(). Per report from Hao Lee. Report: <CAGoxFiFPAGyPAJLcFxTB5cGhTW2yOVBDYeqDugYwV4dEd1L_Ag@mail.gmail.com>
2015-10-04	Further twiddling of nodeHash.c hashtable sizing calculation.	Tom Lane
	On reflection, the submitted patch didn't really work to prevent the request size from exceeding MaxAllocSize, because of the fact that we'd happily round nbuckets up to the next power of 2 after we'd limited it to max_pointers. The simplest way to enforce the limit correctly is to round max_pointers down to a power of 2 when it isn't one already. (Note that the constraint to INT_MAX / 2, if it were doing anything useful at all, is properly applied after that.)
2015-10-04	Fix possible "invalid memory alloc request size" failure in nodeHash.c.	Tom Lane
	Limit the size of the hashtable pointer array to not more than MaxAllocSize. We've seen reports of failures due to this in HEAD/9.5, and it seems possible in older branches as well. The change in NTUP_PER_BUCKET in 9.5 may have made the problem more likely, but surely it didn't introduce it. Tomas Vondra, slightly modified by me
2015-08-21	Avoid O(N^2) behavior when enlarging SPI tuple table in spi_printtup().	Tom Lane
	For no obvious reason, spi_printtup() was coded to enlarge the tuple pointer table by just 256 slots at a time, rather than doubling the size at each reallocation, as is our usual habit. For very large SPI results, this makes for O(N^2) time spent in repalloc(), which of course soon comes to dominate the runtime. Use the standard doubling approach instead. This is a longstanding performance bug, so back-patch to all active branches. Neil Conway
2015-07-30	Avoid some zero-divide hazards in the planner.	Tom Lane
	Although I think on all modern machines floating division by zero results in Infinity not SIGFPE, we still don't want infinities running around in the planner's costing estimates; too much risk of that leading to insane behavior. grouping_planner() failed to consider the possibility that final_rel might be known dummy and hence have zero rowcount. (I wonder if it would be better to set a rows estimate of 1 for dummy relations? But at least in the back branches, changing this convention seems like a bad idea, so I'll leave that for another day.) Make certain that get_variable_numdistinct() produces a nonzero result. The case that can be shown to be broken is with stadistinct < 0.0 and small ntuples; we did not prevent the result from rounding to zero. For good luck I applied clamp_row_est() to all the nonconstant return values. In ExecChooseHashTableSize(), Assert that we compute positive nbuckets and nbatch. I know of no reason to think this isn't the case, but it seems like a good safety check. Per reports from Piotr Stefaniak. Back-patch to all active branches.
2015-03-24	Fix ExecOpenScanRelation to take a lock on a ROW_MARK_COPY relation.	Tom Lane
	ExecOpenScanRelation assumed that any relation listed in the ExecRowMark list has been locked by InitPlan; but this is not true if the rel's markType is ROW_MARK_COPY, which is possible if it's a foreign table. In most (possibly all) cases, failure to acquire a lock here isn't really problematic because the parser, planner, or plancache would have taken the appropriate lock already. In principle though it might leave us vulnerable to working with a relation that we hold no lock on, and in any case if the executor isn't depending on previously-taken locks otherwise then it should not do so for ROW_MARK_COPY relations. Noted by Etsuro Fujita. Back-patch to all active versions, since the inconsistency has been there a long time. (It's almost certainly irrelevant in 9.0, since that predates foreign tables, but the code's still wrong on its own terms.)
2015-03-12	Ensure tableoid reads correctly in EvalPlanQual-manufactured tuples.	Tom Lane
	The ROW_MARK_COPY path in EvalPlanQualFetchRowMarks() was just setting tableoid to InvalidOid, I think on the assumption that the referenced RTE must be a subquery or other case without a meaningful OID. However, foreign tables also use this code path, and they do have meaningful table OIDs; so failure to set the tuple field can lead to user-visible misbehavior. Fix that by fetching the appropriate OID from the range table. There's still an issue about whether CTID can ever have a meaningful value in this case; at least with postgres_fdw foreign tables, it does. But that is a different problem that seems to require a significantly different patch --- it's debatable whether postgres_fdw really wants to use this code path at all. Simplified version of a patch by Etsuro Fujita, who also noted the problem to begin with. The issue can be demonstrated in all versions having FDWs, so back-patch to 9.1.
2015-02-15	Fix null-pointer-deref crash while doing COPY IN with check constraints.	Tom Lane
	In commit bf7ca15875988a88e97302e012d7c4808bef3ea9 I introduced an assumption that an RTE referenced by a whole-row Var must have a valid eref field. This is false for RTEs constructed by DoCopy, and there are other places taking similar shortcuts. Perhaps we should make all those places go through addRangeTableEntryForRelation or its siblings instead of having ad-hoc logic, but the most reliable fix seems to be to make the new code in ExecEvalWholeRowVar cope if there's no eref. We can reasonably assume that there's no need to insert column aliases if no aliases were provided. Add a regression test case covering this, and also verifying that a sane column name is in fact available in this situation. Although the known case only crashes in 9.4 and HEAD, it seems prudent to back-patch the code change to 9.2, since all the ingredients for a similar failure exist in the variant patch applied to 9.3 and 9.2. Per report from Jean-Pierre Pelletier.
2015-01-28	Fix column-privilege leak in error-message paths	Stephen Frost
	While building error messages to return to the user, BuildIndexValueDescription, ExecBuildSlotValueDescription and ri_ReportViolation would happily include the entire key or entire row in the result returned to the user, even if the user didn't have access to view all of the columns being included. Instead, include only those columns which the user is providing or which the user has select rights on. If the user does not have any rights to view the table or any of the columns involved then no detail is provided and a NULL value is returned from BuildIndexValueDescription and ExecBuildSlotValueDescription. Note that, for key cases, the user must have access to all of the columns for the key to be shown; a partial key will not be returned. Back-patch all the way, as column-level privileges are now in all supported versions. This has been assigned CVE-2014-8161, but since the issue and the patch have already been publicized on pgsql-hackers, there's no point in trying to hide this commit.
2015-01-15	Fix use-of-already-freed-memory problem in EvalPlanQual processing.	Tom Lane
	Up to now, the "child" executor state trees generated for EvalPlanQual rechecks have simply shared the ResultRelInfo arrays used for the original execution tree. However, this leads to dangling-pointer problems, because ExecInitModifyTable() is all too willing to scribble on some fields of the ResultRelInfo(s) even when it's being run in one of those child trees. This trashes those fields from the perspective of the parent tree, because even if the generated subtree is logically identical to what was in use in the parent, it's in a memory context that will go away when we're done with the child state tree. We do however want to share information in the direction from the parent down to the children; in particular, fields such as es_instrument must be shared or we'll lose the stats arising from execution of the children. So the simplest fix is to make a copy of the parent's ResultRelInfo array, but not copy any fields back at end of child execution. Per report from Manuel Kniep. The added isolation test is based on his example. In an unpatched memory-clobber-enabled build it will reliably fail with "ctid is NULL" errors in all branches back to 9.1, as a consequence of junkfilter->jf_junkAttNo being overwritten with $7f7f. This test cannot be run as-is before that for lack of WITH syntax; but I have no doubt that some variant of this problem can arise in older branches, so apply the code change all the way back.
2014-12-11	Fix corner case where SELECT FOR UPDATE could return a row twice.	Tom Lane
	In READ COMMITTED mode, if a SELECT FOR UPDATE discovers it has to redo WHERE-clause checking on rows that have been updated since the SELECT's snapshot, it invokes EvalPlanQual processing to do that. If this first occurs within a non-first child table of an inheritance tree, the previous coding could accidentally re-return a matching row from an earlier, already-scanned child table. (And, to add insult to injury, I think this could make it miss returning a row that should have been returned, if the updated row that this happens on should still have passed the WHERE qual.) Per report from Kyotaro Horiguchi; the added isolation test is based on his test case. This has been broken for quite awhile, so back-patch to all supported branches.
2014-11-12	Explicitly support the case that a plancache's raw_parse_tree is NULL.	Tom Lane
	This only happens if a client issues a Parse message with an empty query string, which is a bit odd; but since it is explicitly called out as legal by our FE/BE protocol spec, we'd probably better continue to allow it. Fix by adding tests everywhere that the raw_parse_tree field is passed to functions that don't or shouldn't accept NULL. Also make it clear in the relevant comments that NULL is an expected case. This reverts commits a73c9dbab0165b3395dfe8a44a7dfd16166963c4 and 2e9650cbcff8c8fb0d9ef807c73a44f241822eee, which fixed specific crash symptoms by hacking things at what now seems to be the wrong end, ie the callee functions. Making the callees allow NULL is superficially more robust, but it's not always true that there is a defensible thing for the callee to do in such cases. The caller has more context and is better able to decide what the empty-query case ought to do. Per followup discussion of bug #11335. Back-patch to 9.2. The code before that is sufficiently different that it would require development of a separate patch, which doesn't seem worthwhile for what is believed to be an essentially cosmetic change.
2014-11-10	Ensure that whole-row Vars produce nonempty column names.	Tom Lane
	At one time it wasn't terribly important what column names were associated with the fields of a composite Datum, but since the introduction of operations like row_to_json(), it's important that looking up the rowtype ID embedded in the Datum returns the column names that users would expect. However, that doesn't work terribly well: you could get the column names of the underlying table, or column aliases from any level of the query, depending on minor details of the plan tree. You could even get totally empty field names, which is disastrous for cases like row_to_json(). It seems unwise to change this behavior too much in stable branches, however, since users might not have noticed that they weren't getting the least-unintuitive choice of field names. Therefore, in the back branches, only change the results when the child plan has returned an actually-empty field name. (We assume that can't happen with a named rowtype, so this also dodges the issue of possibly producing RECORD-typed output from a Var with a named composite result type.) As in the sister patch for HEAD, we can get a better name to use from the Var's corresponding RTE. There is no need to touch the RowExpr code since it was already using a copy of the RTE's alias list for RECORD cases. Back-patch as far as 9.2. Before that we did not have row_to_json() so there were no core functions potentially affected by bogus field names. While 9.1 and earlier do have contrib's hstore(record) which is also affected, those versions don't seem to produce empty field names (at least not in the known problem cases), so we'll leave them alone.
2014-07-11	Fix bug with whole-row references to append subplans.	Tom Lane
	ExecEvalWholeRowVar incorrectly supposed that it could "bless" the source TupleTableSlot just once per query. But if the input is coming from an Append (or, perhaps, other cases?) more than one slot might be returned over the query run. This led to "record type has not been registered" errors when a composite datum was extracted from a non-blessed slot. This bug has been there a long time; I guess it escaped notice because when dealing with subqueries the planner tends to expand whole-row Vars into RowExprs, which don't have the same problem. It is possible to trigger the problem in all active branches, though, as illustrated by the added regression test.
2014-06-19	Avoid leaking memory while evaluating arguments for a table function.	Tom Lane
	ExecMakeTableFunctionResult evaluated the arguments for a function-in-FROM in the query-lifespan memory context. This is insignificant in simple cases where the function relation is scanned only once; but if the function is in a sub-SELECT or is on the inside of a nested loop, any memory consumed during argument evaluation can add up quickly. (The potential for trouble here had been foreseen long ago, per existing comments; but we'd not previously seen a complaint from the field about it.) To fix, create an additional temporary context just for this purpose. Per an example from MauMau. Back-patch to all active branches.
2014-05-06	Remove tabs after spaces in C comments	Bruce Momjian
	This was not changed in HEAD, but will be done later as part of a pgindent run. Future pgindent runs will also do this. Report by Tom Lane Backpatch through all supported branches, but not HEAD
2014-05-01	Fix failure to detoast fields in composite elements of structured types.	Tom Lane
	If we have an array of records stored on disk, the individual record fields cannot contain out-of-line TOAST pointers: the tuptoaster.c mechanisms are only prepared to deal with TOAST pointers appearing in top-level fields of a stored row. The same applies for ranges over composite types, nested composites, etc. However, the existing code only took care of expanding sub-field TOAST pointers for the case of nested composites, not for other structured types containing composites. For example, given a command such as UPDATE tab SET arraycol = ARRAY[(ROW(x,42)::mycompositetype] ... where x is a direct reference to a field of an on-disk tuple, if that field is long enough to be toasted out-of-line then the TOAST pointer would be inserted as-is into the array column. If the source record for x is later deleted, the array field value would become a dangling pointer, leading to errors along the line of "missing chunk number 0 for toast value ..." when the value is referenced. A reproducible test case for this was provided by Jan Pecek, but it seems likely that some of the "missing chunk number" reports we've heard in the past were caused by similar issues. Code-wise, the problem is that PG_DETOAST_DATUM() is not adequate to produce a self-contained Datum value if the Datum is of composite type. Seen in this light, the problem is not just confined to arrays and ranges, but could also affect some other places where detoasting is done in that way, for example form_index_tuple(). I tried teaching the array code to apply toast_flatten_tuple_attribute() along with PG_DETOAST_DATUM() when the array element type is composite, but this was messy and imposed extra cache lookup costs whether or not any TOAST pointers were present, indeed sometimes when the array element type isn't even composite (since sometimes it takes a typcache lookup to find that out). The idea of extending that approach to all the places that currently use PG_DETOAST_DATUM() wasn't attractive at all. This patch instead solves the problem by decreeing that composite Datum values must not contain any out-of-line TOAST pointers in the first place; that is, we expand out-of-line fields at the point of constructing a composite Datum, not at the point where we're about to insert it into a larger tuple. This rule is applied only to true composite Datums, not to tuples that are being passed around the system as tuples, so it's not as invasive as it might sound at first. With this approach, the amount of code that has to be touched for a full solution is greatly reduced, and added cache lookup costs are avoided except when there actually is a TOAST pointer that needs to be inlined. The main drawback of this approach is that we might sometimes dereference a TOAST pointer that will never actually be used by the query, imposing a rather large cost that wasn't there before. On the other side of the coin, if the field value is used multiple times then we'll come out ahead by avoiding repeat detoastings. Experimentation suggests that common SQL coding patterns are unaffected either way, though. Applications that are very negatively affected could be advised to modify their code to not fetch columns they won't be using. In future, we might consider reverting this solution in favor of detoasting only at the point where data is about to be stored to disk, using some method that can drill down into multiple levels of nested structured types. That will require defining new APIs for structured types, though, so it doesn't seem feasible as a back-patchable fix. Note that this patch changes HeapTupleGetDatum() from a macro to a function call; this means that any third-party code using that macro will not get protection against creating TOAST-pointer-containing Datums until it's recompiled. The same applies to any uses of PG_RETURN_HEAPTUPLEHEADER(). It seems likely that this is not a big problem in practice: most of the tuple-returning functions in core and contrib produce outputs that could not possibly be toasted anyway, and the same probably holds for third-party extensions. This bug has existed since TOAST was invented, so back-patch to all supported branches.
2014-03-06	Don't reject ROW_MARK_REFERENCE rowmarks for materialized views.	Tom Lane
	We should allow this so that matviews can be referenced in UPDATE/DELETE statements in READ COMMITTED isolation level. The requirement for that is that a re-fetch by TID will see the same row version the query saw earlier, which is true of matviews, so there's no reason for the restriction. Per bug #9398. Michael Paquier, after a suggestion by me
2014-02-03	Fix *-qualification of named parameters in SQL-language functions.	Tom Lane
	Given a composite-type parameter named x, "$1." worked fine, but "x." not so much. This has been broken since named parameter references were added in commit 9bff0780cf5be2193a5bad0d3df2dbe143085264, so patch back to 9.2. Per bug #9085 from Hardy Falk.
2014-01-08	Fix "cannot accept a set" error when only some arms of a CASE return a set.	Tom Lane
	In commit c1352052ef1d4eeb2eb1d822a207ddc2d106cb13, I implemented an optimization that assumed that a function's argument expressions would either always return a set (ie multiple rows), or always not. This is wrong however: we allow CASE expressions in which some arms return a set of some type and others just return a scalar of that type. There may be other examples as well. To fix, replace the run-time test of whether an argument returned a set with a static precheck (expression_returns_set). This adds a little bit of query startup overhead, but it seems barely measurable. Per bug #8228 from David Johnston. This has been broken since 8.0, so patch all supported branches.
2013-11-07	Prevent display of dropped columns in row constraint violation messages.	Tom Lane
	ExecBuildSlotValueDescription() printed "null" for each dropped column in a row being complained of by ExecConstraints(). This has some sanity in terms of the underlying implementation, but is of course pretty surprising to users. To fix, we must pass the target relation's descriptor to ExecBuildSlotValueDescription(), because the slot descriptor it had been using doesn't get labeled with attisdropped markers. Per bug #8408 from Maxim Boguk. Back-patch to 9.2 where the feature of printing row values in NOT NULL and CHECK constraint violation messages was introduced. Michael Paquier and Tom Lane
2013-11-03	Prevent memory leaks from accumulating across printtup() calls.	Tom Lane
	Historically, printtup() has assumed that it could prevent memory leakage by pfree'ing the string result of each output function and manually managing detoasting of toasted values. This amounts to assuming that datatype output functions never leak any memory internally; an assumption we've already decided to be bogus elsewhere, for example in COPY OUT. range_out in particular is known to leak multiple kilobytes per call, as noted in bug #8573 from Godfried Vanluffelen. While we could go in and fix that leak, it wouldn't be very notationally convenient, and in any case there have been and undoubtedly will again be other leaks in other output functions. So what seems like the best solution is to run the output functions in a temporary memory context that can be reset after each row, as we're doing in COPY OUT. Some quick experimentation suggests this is actually a tad faster than the retail pfree's anyway. This patch fixes all the variants of printtup, except for debugtup() which is used in standalone mode. It doesn't seem worth worrying about query-lifespan leaks in standalone mode, and fixing that case would be a bit tedious since debugtup() doesn't currently have any startup or shutdown functions. While at it, remove manual detoast management from several other output-function call sites that had copied it from printtup(). This doesn't make a lot of difference right now, but in view of recent discussions about supporting "non-flattened" Datums, we're going to want that code gone eventually anyway. Back-patch to 9.2 where range_out was introduced. We might eventually decide to back-patch this further, but in the absence of known major leaks in older output functions, I'll refrain for now.
2013-11-02	Fix subquery reference to non-populated MV in CMV.	Kevin Grittner
	A subquery reference to a matview should be allowed by CREATE MATERIALIZED VIEW WITH NO DATA, just like a direct reference is. Per bug report from Laurent Sartran. Backpatch to 9.3.
2013-08-30	Reset the binary heap in MergeAppend rescans.	Tom Lane
	Failing to do so can cause queries to return wrong data, error out or crash. This requires adding a new binaryheap_reset() method to binaryheap.c, but that probably should have been there anyway. Per bug #8410 from Terje Elde. Diagnosis and patch by Andres Freund.
2013-08-14	Improve error message when view is not updatable	Peter Eisentraut
	Avoid using the term "updatable" in confusing ways. Suggest a trigger first, before a rule.
2013-07-25	Prevent leakage of SPI tuple tables during subtransaction abort.	Tom Lane
	plpgsql often just remembers SPI-result tuple tables in local variables, and has no mechanism for freeing them if an ereport(ERROR) causes an escape out of the execution function whose local variable it is. In the original coding, that wasn't a problem because the tuple table would be cleaned up when the function's SPI context went away during transaction abort. However, once plpgsql grew the ability to trap exceptions, repeated trapping of errors within a function could result in significant intra-function-call memory leakage, as illustrated in bug #8279 from Chad Wagner. We could fix this locally in plpgsql with a bunch of PG_TRY/PG_CATCH coding, but that would be tedious, probably slow, and prone to bugs of omission; moreover it would do nothing for similar risks elsewhere. What seems like a better plan is to make SPI itself responsible for freeing tuple tables at subtransaction abort. This patch attacks the problem that way, keeping a list of live tuple tables within each SPI function context. Currently, such freeing is automatic for tuple tables made within the failed subtransaction. We might later add a SPI call to mark a tuple table as not to be freed this way, allowing callers to opt out; but until someone exhibits a clear use-case for such behavior, it doesn't seem worth bothering. A very useful side-effect of this change is that SPI_freetuptable() can now defend itself against bad calls, such as duplicate free requests; this should make things more robust in many places. (In particular, this reduces the risks involved if a third-party extension contains now-redundant SPI_freetuptable() calls in error cleanup code.) Even though the leakage problem is of long standing, it seems imprudent to back-patch this into stable branches, since it does represent an API semantics change for SPI users. We'll patch this in 9.3, but live with the leakage in older branches.
2013-06-12	Improve updatability checking for views and foreign tables.	Tom Lane
	Extend the FDW API (which we already changed for 9.3) so that an FDW can report whether specific foreign tables are insertable/updatable/deletable. The default assumption continues to be that they're updatable if the relevant executor callback function is supplied by the FDW, but finer granularity is now possible. As a test case, add an "updatable" option to contrib/postgres_fdw. This patch also fixes the information_schema views, which previously did not think that foreign tables were ever updatable, and fixes view_is_auto_updatable() so that a view on a foreign table can be auto-updatable. initdb forced due to changes in information_schema views and the functions they rely on. This is a bit unfortunate to do post-beta1, but if we don't change this now then we'll have another API break for FDWs when we do change it. Dean Rasheed, somewhat editorialized on by Tom Lane
2013-05-29	pgindent run for release 9.3	Bruce Momjian
	This is the first run of the Perl-based pgindent script. Also update pgindent instructions.
2013-05-12	Fix handling of strict non-set functions with NULLs in set-valued inputs.	Tom Lane
	In a construct like "select plain_function(set_returning_function(...))", the plain function is applied to each output row of the SRF successively. If some of the SRF outputs are NULL, and the plain function is strict, you'd expect to get NULL results for such rows ... but what actually happened was that such rows were omitted entirely from the result set. This was due to confusion of this case with what should happen for nested set-returning functions; a strict SRF is indeed supposed to yield an empty set for null input. Per bug #8150 from Erwin Brandstetter. Although this has been broken forever, we're not back-patching because of the possibility that some apps out there expect the incorrect behavior. This change should be listed as a possible incompatibility in the 9.3 release notes.
2013-04-28	Editorialize a bit on new ProcessUtility() API.	Tom Lane
	Choose a saner ordering of parameters (adding a new input param after the output params seemed a bit random), update the function's header comment to match reality (cmon folks, is this really that hard?), get rid of useless and sloppily-defined distinction between PROCESS_UTILITY_SUBCOMMAND and PROCESS_UTILITY_GENERATED.
2013-04-27	Incidental cleanup of matviews code.	Tom Lane
	Move checking for unscannable matviews into ExecOpenScanRelation, which is a better place for it first because the open relation is already available (saving a relcache lookup cycle), and second because this eliminates the problem of telling the difference between rangetable entries that will or will not be scanned by the query. In particular we can get rid of the not-terribly-well-thought-out-or-implemented isResultRel field that the initial matviews patch added to RangeTblEntry. Also get rid of entirely unnecessary scannability check in the rewriter, and a bogus decision about whether RefreshMatViewStmt requires a parse-time snapshot. catversion bump due to removal of a RangeTblEntry field, which changes stored rules.
2013-04-24	Fix assertion failure for REFRESH MATERIALIZED VIEW in PL.	Kevin Grittner
	This was due to incomplete implementation of rowcount reporting for RMV, which was due to initial waffling on whether it should be provided. It seems unlikely to be a useful or universally available number as more sophisticated techniques for maintaining matviews are added, so remove the partial support rather than completing it. Per report of Jeevan Chalke, but with a different fix
2013-04-20	Clean up references to SQL92	Peter Eisentraut
	In most cases, these were just references to the SQL standard in general. In a few cases, a contrast was made between SQL92 and later standards -- those have been kept unchanged.
2013-04-19	Improve error message when an FDW doesn't support WHERE CURRENT OF.	Tom Lane
	If an FDW fails to take special measures with a CurrentOfExpr, we will end up trying to execute it as an ordinary qual, which was being treated as a purely internal failure condition. Provide a more user-oriented error message for such cases.
2013-04-12	sepgsql: Enforce db_procedure:{execute} permission.	Robert Haas
	To do this, we add an additional object access hook type, OAT_FUNCTION_EXECUTE. KaiGai Kohei
2013-04-09	Create a distinction between a populated matview and a scannable one.	Kevin Grittner
	The intent was that being populated would, long term, be just one of the conditions which could affect whether a matview was scannable; being populated should be necessary but not always sufficient to scan the relation. Since only CREATE and REFRESH currently determine the scannability, names and comments accidentally conflated these concepts, leading to confusion. Also add missing locking for the SQL function which allows a test for scannability, and fix a modularity violatiion. Per complaints from Tom Lane, although its not clear that these will satisfy his concerns. Hopefully this will at least better frame the discussion.
2013-03-10	Support writable foreign tables.	Tom Lane
	This patch adds the core-system infrastructure needed to support updates on foreign tables, and extends contrib/postgres_fdw to allow updates against remote Postgres servers. There's still a great deal of room for improvement in optimization of remote updates, but at least there's basic functionality there now. KaiGai Kohei, reviewed by Alexander Korotkov and Laurenz Albe, and rather heavily revised by Tom Lane.
2013-03-06	Arrange to cache FdwRoutine structs in foreign tables' relcache entries.	Tom Lane
	This saves several catalog lookups per reference. It's not all that exciting right now, because we'd managed to minimize the number of places that need to fetch the data; but the upcoming writable-foreign-tables patch needs this info in a lot more places.
2013-03-03	Add a materialized view relations.	Kevin Grittner
	A materialized view has a rule just like a view and a heap and other physical properties like a table. The rule is only used to populate the table, references in queries refer to the materialized data. This is a minimal implementation, but should still be useful in many cases. Currently data is only populated "on demand" by the CREATE MATERIALIZED VIEW and REFRESH MATERIALIZED VIEW statements. It is expected that future releases will add incremental updates with various timings, and that a more refined concept of defining what is "fresh" data will be developed. At some point it may even be possible to have queries use a materialized in place of references to underlying tables, but that requires the other above-mentioned features to be working first. Much of the documentation work by Robert Haas. Review by Noah Misch, Thom Brown, Robert Haas, Marko Tiikkaja Security review by KaiGai Kohei, with a decision on how best to implement sepgsql still pending.
2013-03-03	Fix SQL function execution to be safe with long-lived FmgrInfos.	Tom Lane
	fmgr_sql had been designed on the assumption that the FmgrInfo it's called with has only query lifespan. This is demonstrably unsafe in connection with range types, as shown in bug #7881 from Andrew Gierth. Fix things so that we re-generate the function's cache data if the (sub)transaction it was made in is no longer active. Back-patch to 9.2. This might be needed further back, but it's not clear whether the case can realistically arise without range types, so for now I'll desist from back-patching further.
2013-02-07	Fix performance issue in EXPLAIN (ANALYZE, TIMING OFF).	Tom Lane
	Commit af7914c6627bcf0b0ca614e9ce95d3f8056602bf, which added the TIMING option to EXPLAIN, had an oversight: if the TIMING option is disabled then control in InstrStartNode() goes through an elog(DEBUG2) call, which typically does nothing but takes a noticeable amount of time to do it. Tweak the logic to avoid that. In HEAD, also change the elog(DEBUG2)'s in instrument.c to elog(ERROR). It's not very clear why they weren't like that to begin with, but this episode shows that not complaining more vociferously about misuse is likely to do little except allow bugs to remain hidden. While at it, adjust some code that was making possibly-dangerous assumptions about flag bits being in the rightmost byte of the instrument_options word. Problem reported by Pavel Stehule (via Tomas Vondra).
2013-01-30	Fix plpgsql's reporting of plan-time errors in possibly-simple expressions.	Tom Lane
	exec_simple_check_plan and exec_eval_simple_expr attempted to call GetCachedPlan directly. This meant that if an error was thrown during planning, the resulting context traceback would not include the line normally contributed by _SPI_error_callback. This is already inconsistent, but just to be really odd, a re-execution of the very same expression would show the additional context line, because we'd already have cached the plan and marked the expression as non-simple. The problem is easy to demonstrate in 9.2 and HEAD because planning of a cached plan doesn't occur at all until GetCachedPlan is done. In earlier versions, it could only be an issue if initial planning had succeeded, then a replan was forced (already somewhat improbable for a simple expression), and the replan attempt failed. Since the issue is mainly cosmetic in older branches anyway, it doesn't seem worth the risk of trying to fix it there. It is worth fixing in 9.2 since the instability of the context printout can affect the results of GET STACKED DIAGNOSTICS, as per a recent discussion on pgsql-novice. To fix, introduce a SPI function that wraps GetCachedPlan while installing the correct callback function. Use this instead of calling GetCachedPlan directly from plpgsql. Also introduce a wrapper function for extracting a SPI plan's CachedPlanSource list. This lets us stop including spi_priv.h in pl_exec.c, which was never a very good idea from a modularity standpoint. In passing, fix a similar inconsistency that could occur in SPI_cursor_open, which was also calling GetCachedPlan without setting up a context callback.
2013-01-29	Provide database object names as separate fields in error messages.	Tom Lane
	This patch addresses the problem that applications currently have to extract object names from possibly-localized textual error messages, if they want to know for example which index caused a UNIQUE_VIOLATION failure. It adds new error message fields to the wire protocol, which can carry the name of a table, table column, data type, or constraint associated with the error. (Since the protocol spec has always instructed clients to ignore unrecognized field types, this should not create any compatibility problem.) Support for providing these new fields has been added to just a limited set of error reports (mainly, those in the "integrity constraint violation" SQLSTATE class), but we will doubtless add them to more calls in future. Pavel Stehule, reviewed and extensively revised by Peter Geoghegan, with additional hacking by Tom Lane.
2013-01-24	Fix SPI documentation for new handling of ExecutorRun's count parameter.	Tom Lane
	Since 9.0, the count parameter has only limited the number of tuples actually returned by the executor. It doesn't affect the behavior of INSERT/UPDATE/DELETE unless RETURNING is specified, because without RETURNING, the ModifyTable plan node doesn't return control to execMain.c for each tuple. And we only check the limit at the top level. While this behavioral change was unintentional at the time, discussion of bug #6572 led us to the conclusion that we prefer the new behavior anyway, and so we should just adjust the docs to match rather than change the code. Accordingly, do that. Back-patch as far as 9.0 so that the docs match the code in each branch.
2013-01-23	Improve concurrency of foreign key locking	Alvaro Herrera
	This patch introduces two additional lock modes for tuples: "SELECT FOR KEY SHARE" and "SELECT FOR NO KEY UPDATE". These don't block each other, in contrast with already existing "SELECT FOR SHARE" and "SELECT FOR UPDATE". UPDATE commands that do not modify the values stored in the columns that are part of the key of the tuple now grab a SELECT FOR NO KEY UPDATE lock on the tuple, allowing them to proceed concurrently with tuple locks of the FOR KEY SHARE variety. Foreign key triggers now use FOR KEY SHARE instead of FOR SHARE; this means the concurrency improvement applies to them, which is the whole point of this patch. The added tuple lock semantics require some rejiggering of the multixact module, so that the locking level that each transaction is holding can be stored alongside its Xid. Also, multixacts now need to persist across server restarts and crashes, because they can now represent not only tuple locks, but also tuple updates. This means we need more careful tracking of lifetime of pg_multixact SLRU files; since they now persist longer, we require more infrastructure to figure out when they can be removed. pg_upgrade also needs to be careful to copy pg_multixact files over from the old server to the new, or at least part of multixact.c state, depending on the versions of the old and new servers. Tuple time qualification rules (HeapTupleSatisfies routines) need to be careful not to consider tuples with the "is multi" infomask bit set as being only locked; they might need to look up MultiXact values (i.e. possibly do pg_multixact I/O) to find out the Xid that updated a tuple, whereas they previously were assured to only use information readily available from the tuple header. This is considered acceptable, because the extra I/O would involve cases that would previously cause some commands to block waiting for concurrent transactions to finish. Another important change is the fact that locking tuples that have previously been updated causes the future versions to be marked as locked, too; this is essential for correctness of foreign key checks. This causes additional WAL-logging, also (there was previously a single WAL record for a locked tuple; now there are as many as updated copies of the tuple there exist.) With all this in place, contention related to tuples being checked by foreign key rules should be much reduced. As a bonus, the old behavior that a subtransaction grabbing a stronger tuple lock than the parent (sub)transaction held on a given tuple and later aborting caused the weaker lock to be lost, has been fixed. Many new spec files were added for isolation tester framework, to ensure overall behavior is sane. There's probably room for several more tests. There were several reviewers of this patch; in particular, Noah Misch and Andres Freund spent considerable time in it. Original idea for the patch came from Simon Riggs, after a problem report by Joel Jacobson. Most code is from me, with contributions from Marti Raudsepp, Alexander Shulgin, Noah Misch and Andres Freund. This patch was discussed in several pgsql-hackers threads; the most important start at the following message-ids: AANLkTimo9XVcEzfiBR-ut3KVNDkjm2Vxh+t8kAmWjPuv@mail.gmail.com 1290721684-sup-3951@alvh.no-ip.org 1294953201-sup-2099@alvh.no-ip.org 1320343602-sup-2290@alvh.no-ip.org 1339690386-sup-8927@alvh.no-ip.org 4FE5FF020200002500048A3D@gw.wicourts.gov 4FEAB90A0200002500048B7D@gw.wicourts.gov